SQL vs NoSQL: shortest possible explanation

First, recall CISC CPUs. They were evolved into very complex ISAa. Even baroque.

CISC instructions gotten more and more complex. Internal microcode was larger and larger.

At some point, CPU developers tried to find idea, how to make busy all CPU parts evenly? When ALU calculates 2+2, let the load unit to read from memory/registers at the same time, some other data for the next operation. And let the store unit to write to memory/registers some data which was calculated at the previous operation. Thus (ideally) CPU can do 3 operations at the same moment. This is what is called 'pipeline' in CPU.

But to achieve this, CPU must be as simple as possible, even primitive. First ARM hadn't microcode at all. Also, recall the 'load-store arch' term -- that means that an arithmetical or logical instruction works only with registers, not with memory. To load from memory, load/store instructions are to be used. This is true for ARM and MIPS. But this is not true for x86, where ADD instruction (for example) can read from memory. x86 ISA inherited from CISC 8086 CPUs, but that doesn't mean modern x86 are still CISCs.

Now SQL databases.

Oracle RDBMS was evolved into a monster with its own PL (PL/SQL).

Even simple MS-DOS DBs of 1980s and 1990s (dBase, Clipper, FoxPro) were developed with the 'everything in DB' philosophy. You could create GUI forms to access DB, etc.

At that time users didn't have such large DBs as today, so this was perfectly OK.

But to scale DB, to get all that replicas and sharding, you need to make it primitive. Simplest possible DB -- just key/value store.

So everything has a price. And the price of scalability is primitiveness of DB. And absence of JOINs and other complex SQL queries, that can be very helpful when you start learning DB and/or SQL. In SQL DB you can offload some burden to SQL queries. You can program less. And a single SQL query with JOINs can be more readable then multiple lines of code for NoSQL.

'Relational' SQL DB means that you can read data from multiple tables with a single query (JOIN). That implies that a table may contain links (IDs) to other tables.

Compare pure C and Java or Python. Pure C code is very fast and also very compact. But to run Java or Python you need interpreter or JIT engine -- and this is the price you pay for high-levelness of PL. But of course, it's harder to program/debug something big in such a low-level PL as pure C. Which PL is better? Of course, they all are good and have their own niches. To squeeze maximum of your hardware, to optimize most critical functions, take pure C (or Rust nowadays).

Bottom line (TL;DR): SQL databases are for smaller DBs, supporting complex queries and thus making life easier (from MySQL/PostgreSQL/SQLite to Oracle/IBM DB2). NoSQL databases are for social network or cloud storage service scale databases.

NoSQL DBs are not 'better'. They are simpler and (thus) scale better.

(the post first published at 20230306.)

List of my other blog posts.

Subscribe to my news feed

Yes, I know about these lousy Disqus ads. Please use adblocker. I would consider to subscribe to 'pro' version of Disqus if the signal/noise ratio in comments would be good enough.