SQLite 3.33 (sqlite.org)
272 points by chmaynard on Aug 16, 2020 | 108 comments


SQLite 3.33 also increases the maximum size of a SQLite database file from 140TB to 281TB: https://www.sqlite.org/releaselog/3_33_0.html

I asked about this on the SQLite forum and D. Richard Hipp said "This change was in response to a customer request. They still have a factor of 4 before reaching the old upper limit, but asked for additional headroom." https://sqlite.org/forum/forumpost/8e40a7f588428077d7f073c00...


It is very good to hear there are others leveraging the full capabilities of this software.

We haven't broken the 100GB barrier for a single SQLite database file yet, but we have strong confidence that everything will simply continue working as expected once we do.


Out of curiosity, I wondered what the biggest SQLite database in my filesystem was:

find / -name '*.sqlite' -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1

The winner is `favicons.sqlite` from my Firefox profile directory, which is 40 MB.


macOS version:

mdfind "kMDItemDisplayName == *.sqlite" -0 | xargs -0 stat "-f%z %N" | sort -nr | head -n 5

should be fairly fast because mdfind uses the Spotlight backend and already has this data cached.


does it include system files?


Yeah -- the top 5 for me includes photos and notes


Make sure to include .etilqs files as well


I found the source. It appears that it is an etilqs_ prefix rather than .etilqs but the story is somewhat humorous:

https://github.com/mackyle/sqlite/blob/3cf493d4018042c70a4db...


Chrome History at ~500MB


I have a ~1TB SQLite database (genetic data). All my queries run really fast, and overall I'm very impressed with the performance.


Yep, that's a great use-case! However, at such data volumes I would recommend splitting it into several databases - backup, replication, and vacuum will be much easier and better. This doesn't apply only to sqlite3 =) . I assume your data is immutable or append-only, with rare writes and lots of reads?


Yep! The data is for the most part append-only, with massive writes once a month (that's how often the genetic repositories update their data dumps) with a few updates scattered in when annotations change.

What exactly do you mean by splitting it into several databases? It seems to me that would make backup and replication and such more difficult, since now I'd have to manage multiple databases. But I don't have experience there, so I'd love to hear if there are easy ways to do that.


If you are doing this for analytical work, a setup with one database is OK. But imagine that you are running something in production with SQLite and the database is really big. It becomes hard to VACUUM, create indexes, and so on. In that case it's great to shard, even if it's just several files on one machine (of course, only if you have data that can be sharded, like different users' data stored in different dbs).


The largest I've worked with until now is a bit over 0.5TB. It's still performing as well as it did when it was a few GB.


For those of us who haven't had a few-GB SQLite db, does it perform differently to one that is a few MB?


A few MB would likely be fully in the OS buffer cache. I think you would see the difference when it no longer fits.


Just a tip if you experience bad SQLite performance once the size reaches the triple-digit GB range: understand how BEGIN/END TRANSACTION and transaction control in general work, learn why and when you need journaling and how to control it via the appropriate pragmas, and be careful about indexing. You can increase SQLite performance by literally 4 orders of magnitude with simple tweaking, to the point that you could easily serve a website with a million hits per day off a single SQLite database.
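A minimal sketch of the transaction point, using Python's bundled sqlite3 (the path, schema, and row count are invented for illustration): batching many INSERTs inside one explicit transaction avoids paying a sync-to-disk per statement.

```python
import os
import sqlite3
import tempfile

# Hypothetical benchmark sketch: path, schema, and row count are
# invented for illustration.
path = os.path.join(tempfile.mkdtemp(), "bench.db")
con = sqlite3.connect(path, isolation_level=None)  # autocommit; we manage transactions
con.execute("PRAGMA journal_mode=WAL")    # readers no longer block writers
con.execute("PRAGMA synchronous=NORMAL")  # fewer fsyncs; safe in WAL mode
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

# Without BEGIN/COMMIT each INSERT would be its own transaction and
# pay its own sync; batched, all 10,000 rows share one.
con.execute("BEGIN")
for i in range(10_000):
    con.execute("INSERT INTO t (v) VALUES (?)", (f"row {i}",))
con.execute("COMMIT")

count, = con.execute("SELECT count(*) FROM t").fetchone()
print(count)  # 10000
con.close()
```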


Still, I see that a mere 60GB SQLite database for browser history, accumulated over maybe 10 years, causes trouble with Firefox, and I had to rename it and make Firefox create a new one in order for it to continue storing my recent history.

I bet it's not SQLite's problem.


I'd bet that the appropriate table is missing an index, since whoever designed it didn't anticipate it growing to 60GB.


I always had to fight with corporate DBAs to add indexes - they never wanted to create them because it made their backups take longer, so instead they made everyday usage slower for everyone.

One poor woman I worked with used to come in every day and kick off a bunch of queries that would do a couple of linear scans of a database and take four to six hours to complete. I added a single one-column index and the same query ran in less than a second. Got a hug on the spot. The DBA was upset the backup took a couple minutes longer to complete, but when he complained he got into trouble for not profiling the queries and adding the index himself (ha!).

After that I started using SQLite for a lot of things - instead of confronting the DBAs I’d just dump the official database into an SQLite database and run all my queries against that. Was kind of amazing that you would have these huge Oracle clusters supporting a database with a few tens or hundreds of megabytes (not even gigabytes) of data but perfect size for slurping into SQLite on my desktop.


What you were dealing with is most likely a sysadmin who grew into the role of DBA. The DBAs I know are highly qualified and actually make operations so much smoother. They help with strategies for effective schema migrations, know the profiler in-depth, and actually detect optimization opportunities before the devs realize it.

My guess is that you were dealing with a person who may not actually enjoy their work and is optimizing to make their work as little as possible — but of course I may be way off course here. :)


You may be right. Now that I think about it, there was one guy responsible for all the databases in my business line, and he was the one always launching a two-ocean war when his backup ran a minute slower than the day before.


Which is just petty stuff, and sends the message “I am more important than the rest of the company”.

It sounds like he doesn’t have the organization’s best interests in mind, but rather just his own.


Just goes to show you that just because somebody managed to get a job with the title DBA doesn't mean they're actually an expert in database management.


Pardon my French, but I would call such a DBA an "anti-DBA"


That's not French at all. :)


administrateur bâtard de l'enfer ("bastard administrator from hell")


It is kind of entertaining to think that someone can store a 281TB SQLite file while I am still waiting for companies to sell 1PB hard drives.

Back in 2010 I thought, yeah, soon we will have 50TB hard drives, then 500TB, and someday I can have the 1PB that will be all I will ever need! Sadly, no such luck. I would be happy to be in the tens-of-TBs range for under 200 dollars.

Games are getting stupidly wasteful. Someone told me Fortnite is almost 90GB or so and I flinched, considering GTA5 is around that range and offers a loooot more rich gameplay and features! What a wasteful game, resource-wise.


Kids these days.

My first hard drive was an 8" 20MB drive. That's about 9.5 x 4.6 x 14.2 inches and weighed about 20 pounds. Only cost me $6000, in 1981. And with that massive capacity, I never did run out of room.

Now I have a 200GB MicroSD in my phone. It cost $75. Haven't run out of room on that one either.



Gameplay and features are cheap; they don't take as much storage as assets. All the dances, outfits and gear in Fortnite are heavy, it seems.


GTA V has seven or so years worth of major bi- or tri-annual DLCs. There are probably a few hundred hours worth of (fully voiced) co-op content in GTA Online.

https://gta.fandom.com/wiki/Updates_in_GTA_Online (37 total)


"I can have 1PB it will be all I will need"

You can have a 1PB file without a 1PB drive. I think md has a limit of 8 ZiB. Several filesystems allow files up to 8EB.


Well yeah, I know that, but still it would be nice to have a single 1PB hard drive.


That’s a lot of eggs for one basket...


Right? Better have several of those, bought from different vendors at different times, to handle the backups!

It depends on the size of the eggs though. With 1 PB drives will probably come 3D 360° wat K videos in which you can take a walk, and video games.

These videos will be downloaded from your crappy rural TB connection by the way. Web pages will still be slow as fuck to load, like today, because we will have figured out that downloading the entire NPM registry several times per page load is easier than taking the time to bundle the dependencies, and optimizing image and video size will be considered as a premature optimization that nobody does anymore.


It would take almost a month to fill up assuming 500MB/s write speed.


When I say hard drive I don't mean just hard drives; I usually mean storage media designed for desktops. So a PCI-E drive would suffice. Besides, my assumption was that we'd have more storage space but files wouldn't endlessly increase in size as they seem to be doing.


Yeah, Call of Duty: Modern Warfare (the latest one). https://i.imgur.com/bsiGKFp.png Their patches are 30-40GB each!


The size is typically from assets, not gameplay and features.


Even in that context GTA5 seems to have a lot more though, a massive city with all types of places like deserts and downtown areas, planes, cars, motorcycles, and so on.


Yeah, visual assets at very high resolutions are a large part of it. I wish I remembered the game, but I do recall one (it was a free-to-play battle royale game, not Fortnite or PUBG) that had a ~15GB base game download, and then an additional 30GB for 4K textures.

In the case of GTA I’d imagine the voice lines are a major source of size as well. They’ve been doing updates for the online with new missions and voice lines for ages.


A SAN with a volume manager, or just a big enough RAID NAS, could do it. You need a mere 36 8TB drives, not even a full Backblaze pod.


36 x 8TB = (roughly) 288TB raw storage.

The overhead (depends on partition/filesystem/etc layout) would probably take that under 281TB.

Adding a few more drives would still probably be needed to hold the file, and add some level of redundancy. :)


Json and markdown output for the CLI also...that's nice.


Those limits seem so... arbitrary. And at that scale, why bother having a limit at all? Genuinely curious for the justification.


https://sqlite.org/limits.html

> SQLite was originally designed with a policy of avoiding arbitrary limits. [...] Unfortunately, the no-limits policy has been shown to create problems. Because the upper bounds were not well defined, they were not tested, and bugs were often found when pushing SQLite to extremes.


Thanks -- that makes total sense, all for guaranteed tests.

Though I shudder to imagine the effort that goes into testing a 281 TB database.

Do they purchase all those hard drives and make one enormous RAID concatenated disk set? Or is there a way to more cheaply virtualize that by combining a bunch of max-sized 16 TB AWS EBS instances? What types of errors are even likely to come up at that point -- errors in SQLite's internal logic, or errors in the operating system or drivers?


The TH3 test harness for SQLite (https://www.sqlite.org/th3.html) supports a virtual filesystem in which we can create test database files that appear to be very large but that don't actually contain much data or use much space.

We also have a simple utility program in the SQLite source tree (https://www.sqlite.org/src/file/tool/enlargedb.c) that lets you create a massive database file using a sparse file (https://en.wikipedia.org/wiki/Sparse_file) on systems that support that kind of thing.


If they have customers who need that kind of database, I'm sure they will be willing to help out with the testing. The setup doesn't have to be too crazy. Storage appliances with 20-30 hard drives are fairly common, actually. Even for individual humans, not just corporations. LVM allows you to easily create a logical volume out of multiple physical volumes, e.g. https://www.redhat.com/sysadmin/creating-logical-volumes.

You might find it interesting to look at the actual SQLite source commits where this change was introduced: https://sqlite.org/src/timeline?r=larger-databases It turns out the number comes from having a max of 2^32 pages per database, with the default page size being 4 kB: https://www.sqlite.org/pgszchng2016.html

Working backwards, they must have raised the max page size to 64 KB: `python -c 'print(1024 * 64 * 2**32)'` produces 281,474,976,710,656.


Maybe upping the limit required some changes in the way some things are handled in the code. Therefore, the limit would not be arbitrary, but a consequence of how SQLite is written.

Or the authors want to provide some guarantees, and upper limits are set according to what is known to work / tested.

I'd be interested in knowing the actual answer to your question though.

edit: the actual answer is in the sibling comment, ha ha.


Decimal type, finally. Now SQLite is 100x easier to use in finance.


That's brilliant! I've been waiting for this for years.


If you appreciate SQL, the example (update inventory from daily sales) is a thing of beauty.


Is “GROUP BY 2” a shorthand for “GROUP BY itemId”, the second column in the FROM SELECT? Not sure I’ve seen that syntax before.


You can refer to columns from the select clause by ordinal in group by and order by clauses, in lieu of restating the full expression. Very helpful with complex expressions, but IMO not the greatest for readability, so I tend to only use it in ad hoc queries. It's part of SQL-92, but I believe it is deprecated.

Many interpreters let you use a column alias from the select clause in the group and order clauses. This has better readability IMO; I'm not sure it's in the SQL-92 standard, but I believe it is now standardized.


> Many interpreters let you use a column alias from the select clause in the group and order clauses. This has better readability IMO; I'm not sure it's in the SQL-92 standard, but I believe it is now standardized.

Not sure if it's standardised, but Postgres doesn't let you do this (rather irritatingly).


No, PostgreSQL definitely lets you do this.

> An expression used inside a grouping_element can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. [0]

> Each expression can be the name or ordinal number of an output column (SELECT list item), or it can be an arbitrary expression formed from input-column values. [1]

Perhaps you are thinking of trying to use an output column name in a WHERE clause?

[0]: https://www.postgresql.org/docs/current/sql-select.html#SQL-...

[1]: https://www.postgresql.org/docs/current/sql-select.html#SQL-...


> Perhaps you are thinking of trying to use an output column name in a WHERE clause?

Yeah, I think I am thinking about this.

It's pretty frustrating, because the actual query engine is perfectly capable of executing fairly complex SQL expressions efficiently (they're not really that complex computationally, but they are syntactically, because of SQL's verbosity), but the code becomes quite unmaintainable if you use too many of them.

For example, the following expression:

    ROUND((
      (EXTRACT(EPOCH FROM (s.end_time - s.start_time)) / 60) -- Shift length in minutes
        - (FLOOR((EXTRACT(EPOCH FROM (s.end_time - s.start_time)) / 60) / 380) * 20) -- Break length in minutes
    )::numeric / 60 , 2) as hours_planned,
It repeats the sub-expression `EXTRACT(EPOCH FROM (s.end_time - s.start_time)) / 60`. If I could name that sub-expression and reference it multiple times, the overall expression would be a lot more readable.
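A common workaround is to name the repeated sub-expression once in a CTE (or subquery) and reference it by alias afterwards. A sketch against SQLite via Python's sqlite3 — SQLite lacks EXTRACT, so a plain minutes column stands in, and the schema and the 380-minute/20-minute break rule are illustrative, not from any real system:

```python
import sqlite3

# Sketch: name the repeated sub-expression once in a CTE, then reuse
# it by alias. SQLite lacks EXTRACT, so a minutes column stands in;
# schema and the 380-minute/20-minute break rule are illustrative.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE shifts (start_min REAL, end_min REAL)")
con.execute("INSERT INTO shifts VALUES (0, 480)")  # one 8-hour shift

row = con.execute("""
    WITH s AS (
        SELECT end_min - start_min AS shift_minutes  -- named once here
        FROM shifts
    )
    SELECT ROUND(
        (shift_minutes
         - CAST(shift_minutes / 380 AS INTEGER) * 20)  -- reused for the break
        / 60.0, 2) AS hours_planned
    FROM s
""").fetchone()
print(row[0])  # 7.67
```

In Postgres the CTE would compute `EXTRACT(EPOCH FROM (s.end_time - s.start_time)) / 60` once under a name, with the same effect.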


Yes, it's not in the standard. IIRC they don't do it only because it's not in the standard, which is somewhat annoying when iterating on queries. In stuff like SQLAlchemy it's not as annoying, since you can write it DRY.


I believe it's in SQL:1999.

It's tedious that all this is hearsay without open access standards. It's "only" $195 for the latest.


Yes. GROUP BY n refers to the nth column of the result set.
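A quick illustration with Python's bundled sqlite3 (table and values are invented): `GROUP BY 2` groups by the second item of the SELECT list, `ORDER BY 1` sorts by the first.

```python
import sqlite3

# Minimal demonstration of ordinal references in GROUP BY / ORDER BY.
# The table and rows are invented for the example.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (quantity INTEGER, itemId TEXT)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [(3, "apple"), (2, "apple"), (7, "pear")])

rows = con.execute(
    "SELECT sum(quantity), itemId FROM sales GROUP BY 2 ORDER BY 1"
).fetchall()
print(rows)  # [(5, 'apple'), (7, 'pear')]
```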


ORDER BY [column number] is also a useful shortcut in MSSQL. Very often my columns are a compilation of functions; it feels clean to not repeat them. I'm not sure of its availability in other SQL dialects.


IMO it looks cleaner but adds cognitive overhead for readability so I'm not sure if it is actually better. Feels kind of like a lookup table that a reader now has to reference. YMMV


In code, I agree. But when you are in a REPL, it is handy to have shortcuts.


It isn't idempotent, though - if you execute the query twice, you're kind of screwed, since your inventory numbers are now bad. A better way to do it would be to create a new table that consists of daily snapshots of inventory, select the latest row, and then use that row's date as a WHERE filter when aggregating the latest transactions.
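A hedged sketch of that snapshot approach in Python's sqlite3 — the schema, dates, and quantities are all invented; the point is that re-running the read yields the same answer no matter how many times you execute it:

```python
import sqlite3

# Snapshot-based inventory: derive the current quantity from the
# latest snapshot plus sales recorded after its date, instead of
# mutating inventory in place. Schema and data are invented.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE snapshots (day TEXT, itemId TEXT, quantity INTEGER);
    CREATE TABLE sales (day TEXT, itemId TEXT, quantity INTEGER);
    INSERT INTO snapshots VALUES ('2020-08-15', 'widget', 100);
    INSERT INTO sales VALUES ('2020-08-16', 'widget', 30);
    INSERT INTO sales VALUES ('2020-08-16', 'widget', 10);
""")

# Running this query any number of times gives the same result,
# which is what makes the approach idempotent.
qty, = con.execute("""
    SELECT s.quantity - COALESCE(SUM(sa.quantity), 0)
    FROM snapshots AS s
    LEFT JOIN sales AS sa
        ON sa.itemId = s.itemId AND sa.day > s.day
    WHERE s.day = (SELECT MAX(day) FROM snapshots)
      AND s.itemId = 'widget'
    GROUP BY s.itemId
""").fetchone()
print(qty)  # 60
```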


> It isn't idempotent, though

Agreed, but the purpose of the example is to demonstrate how to use UPDATE FROM, nothing more. I appreciate how it accomplishes a complex update in a very clear, concise, elegant way.


I'm only passingly familiar with SQL, can anyone explain?


Essentially the FROM part calculates how many of each item has been sold today and then uses this amount to reduce how many are currently in stock (the UPDATE and SET parts).
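For the curious, the release example can be run end to end from Python's sqlite3, provided the linked SQLite is 3.33 or newer (the inventory and sales rows below are invented):

```python
import sqlite3

# UPDATE ... FROM requires SQLite 3.33+ in the linked library.
assert sqlite3.sqlite_version_info >= (3, 33, 0)

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE inventory (itemId TEXT, quantity INTEGER);
    CREATE TABLE sales (itemId TEXT, quantity INTEGER);
    INSERT INTO inventory VALUES ('apple', 50), ('pear', 20);
    INSERT INTO sales VALUES ('apple', 3), ('apple', 2), ('pear', 7);
""")

# The FROM subquery totals today's sales per item; the UPDATE then
# subtracts each total from the matching inventory row.
con.execute("""
    UPDATE inventory
       SET quantity = quantity - daily.amt
      FROM (SELECT sum(quantity) AS amt, itemId FROM sales GROUP BY 2) AS daily
     WHERE inventory.itemId = daily.itemId
""")
rows = con.execute("SELECT itemId, quantity FROM inventory ORDER BY 1").fetchall()
print(rows)  # [('apple', 45), ('pear', 13)]
```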


I see what it's doing but I don't really see what the benefit is. Maybe a comparison to what previously had to be done would help me? Would you have to make the 'daily' table in a separate step and then UPDATE using that?


Previously you would have to SELECT and then UPDATE in separate queries, meaning that the data is round tripping to your application server and back again.

Edit: Or use subqueries apparently.


This is sqlite, you typically don't have a server.


Is the "set" evaluated where it's written or only during "sum()"? If not, what's the sum for?


As someone who uses SQL Server at work, I was surprised to find that this wasn't a standard feature in other DB engines. I use it all the time! Glad to see it spreading.


SQLite isn't a database server so much as a file format with a handy API, so it should be expected that it won't have all of the features you may expect from a full database engine. It doesn't support users and permissions, for example. I don't think a feature's presence in SQLite can be seen as indicative of the feature set of any other database software.


Postgres has this too afaik?

Yeah! [1] With some caveats that also apply to the SQLite implementation, if I'm not mistaken:

> When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_item list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.

https://www.postgresql.org/docs/current/sql-update.html


The release notes directly reference that they copied us / PG: "Support for UPDATE FROM following the PostgreSQL syntax." The sqlite folks have done that a few times, and I think that's good. No need to unnecessarily proliferate more variants of doing the same.


I'm intrigued by the decimal extension. I thought SQLite was IEEE double only, but it handles 64-bit integers, and now decimals. They don't seem to be fully incorporated into the documentation yet, and decimal.c leaves me with questions, so I guess I'll just have to play around with it.


Decimals are represented as strings.

They have add, sub, mul and sum functions for text strings as well as comparison.
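For context on why this matters for money: SQLite's default REAL type is IEEE-754 binary floating point, so cents don't add exactly, which is the problem exact text-based decimal arithmetic avoids. (The decimal extension is loadable and not bundled with Python's sqlite3, so only the underlying problem is shown here.)

```python
import sqlite3

# Binary floating point in plain SQLite: the classic 0.1 + 0.2 case.
con = sqlite3.connect(":memory:")
total, = con.execute("SELECT 0.1 + 0.2").fetchone()
print(total == 0.3)  # False
print(total)         # 0.30000000000000004
```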


Oh, decimal support is landing!

I have a lot of stuff dancing around the lack of decimals in SQLite, so this is very welcome!


This software is a piece of magic...


As a data point, the nightly builds of DB Browser for SQLite now include SQLite 3.33:

https://nightlies.sqlitebrowser.org/latest/


UPDATE FROM is a nice extension. In Oracle I've done similar things with an UPSERT/MERGE statement. The linked example of:

  UPDATE inventory
     SET quantity = quantity - daily.amt
    FROM (SELECT sum(quantity) AS amt, itemId FROM sales GROUP BY 2) AS daily
   WHERE inventory.itemId = daily.itemId;
Could be done with MERGE using:

  MERGE INTO inventory i
  USING (
    SELECT
      SUM(quantity) AS amt
    , itemId
    FROM sales
    GROUP BY 2
  ) daily
  ON (
    i.itemId = daily.itemId
  )
  WHEN MATCHED THEN UPDATE
    SET i.quantity = i.quantity - daily.amt;


Is SQLite capable of supporting multiple writes to the database from different processes at the same time?

This question is purely around INSERTS - no UPDATES.

Anything I should be aware of? Looking to write data from multiple services to an SQLite file.


I think that this applies to all modifying operations - INSERTs, UPDATEs, DELETEs.

It's perfectly OK to write to SQLite from different processes at the same time, but to achieve good results it's better to:

* use WAL mode, so readers and writers do not block each other (you can turn it on with `PRAGMA journal_mode=WAL;` in the CLI, and it's better to also add `PRAGMA main.synchronous=NORMAL;`).

* note that all concurrent writes will be queued by the sqlite3 library and done sequentially, and if any write attempt waits longer than BUSY_TIMEOUT ( see https://www.sqlite.org/pragma.html#pragma_busy_timeout ), it will return an error.

Snippet

  # setting things up...
  itroot@l7490:/tmp$ grep -i pragma ~/.sqliterc
  PRAGMA journal_mode=WAL;
  PRAGMA main.synchronous=NORMAL;
  PRAGMA busy_timeout=1000;
  itroot@l7490:/tmp$ sqlite3 test.sqlite 'CREATE TABLE records (id INTEGER PRIMARY KEY, record TEXT);' > /dev/null 2>&1

  # running 10 parallel processes that inserts numbers from 1 to 1000...
  itroot@l7490:/tmp$ echo {1..1000} | xargs -n1 -d' ' -P 10 -i% sqlite3 test.sqlite 'INSERT INTO records (record) VALUES (%);' >/dev/null 2>&1

  # getting number of records
  itroot@l7490:/tmp$ sqlite3 test.sqlite 'SELECT count(*) FROM records;'
  count(*) = 1000
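The same experiment can be sketched from Python's sqlite3, with the connect-time `timeout` playing the role of the busy timeout (the path and row count are illustrative, and threads stand in for the parallel processes above):

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch: threads stand in for parallel processes;
# path and row count are invented.
path = os.path.join(tempfile.mkdtemp(), "test.sqlite")

con = sqlite3.connect(path)
con.execute("PRAGMA journal_mode=WAL")  # WAL persists in the file for all connections
con.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, record TEXT)")
con.commit()

def insert(n):
    # timeout= is sqlite3's equivalent of PRAGMA busy_timeout (in seconds)
    c = sqlite3.connect(path, timeout=5.0)
    c.execute("PRAGMA synchronous=NORMAL")
    c.execute("INSERT INTO records (record) VALUES (?)", (str(n),))
    c.commit()
    c.close()

with ThreadPoolExecutor(max_workers=10) as pool:
    list(pool.map(insert, range(1, 1001)))

count, = con.execute("SELECT count(*) FROM records").fetchone()
print(count)  # 1000
con.close()
```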


From the FAQ:

> Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however.

Sounds like you can, just not at the same moment as it locks the file during writes. It appears to retry if the file is locked, so not the end of the world.

Faq- https://www.sqlite.org/faq.html#q5

Relevant SO post - https://stackoverflow.com/questions/15383615/multiple-access...


SQLite is great because I don't want to run a database server if I don't need to, and whenever it seems like I might have to give up and run a database server instead, I find a configuration option in SQLite that solves my scaling problem or performance problem or concurrency problem. I don't like to recompile SQLite to set compile-time options, though I'm perfectly capable of doing it.

I wish WAL mode were the default. I wish read only WAL mode were easier to do without putting the database file in a sticky directory. Oh well.


Why do you need read only WAL mode? Or do you just mean some processes are read only, while others can write?


The real reason I need WAL mode is so I can read the database while vacuum is running.

The web server runs unprivileged and only needs read permission. Insert and vacuum are done as a user which has write permission.

The documentation says read only WAL mode is possible if there is write permission on the directory. So I put the database file in its own directory and set the mode of the database file to 0664 and set the mode of the directory to 3777. So an unprivileged process has permission to create WAL files which become group writable.


Thank you for explaining.


My understanding is that this is because a temporary file needs to be created next to the db in WAL mode, so write permissions are necessary on the directory. (https://www.sqlite.org/tempfiles.html, search for 'WAL')


This is useful, but the clunkiness of using FROM/WHERE vs JOIN ... USING has always annoyed me in Postgres, which they've based this on. The MySQL version is more like what my brain expects.


I wonder if they could just make this syntax work in Postgres:

    UPDATE inventory 
    SET quantity = inventory.quantity - daily.amt
    FROM
        (SELECT sum(quantity) AS amt, itemId FROM sales GROUP BY 2) daily USING( itemId );
Basically just allowing an ON/USING clause after the first FROM entry, as if it were joining to the table being updated.

Otherwise it's kind of annoying when you end up joining to several tables in the update, but have to use the WHERE clause to join back to the main table.


Personally I prefer UPDATE JOIN as I've used it in mysql:

  UPDATE inventory
  INNER JOIN (SELECT sum(quantity) AS amt, itemId FROM sales GROUP BY 2) AS daily USING (itemId)
  SET quantity = quantity - daily.amt;
I had thought postgres could do it this way without the extra FROM syntax, but apparently not.


The idea of being able to update multiple tables in a single query this way is intriguing to me! But I like the fact that the PostgreSQL syntax makes it crystal clear which table is being updated.


In fact, MySQL allows update of multiple tables. https://stackoverflow.com/questions/4361774/mysql-update-mul...


Nice feature, I hate having to work around it with subqueries or worse


Agreed. It would be interesting to compare the UPDATE FROM query plan with the equivalent subquery-based update.


Bummer, Python 3.9.0 RC1 just missed it! It bundles SQLite 3.32.3. Ah well, I guess we can look forward to it in 3.9.1 in a few months.


Is UPDATE FROM part of ISO SQL or is it just a very popular extension?


Going by the SQLite documentation it isn't.

https://www.sqlite.org/lang_update.html



Fossil also got a new release today: https://fossil-scm.org/home/uv/download.html


A better link to the changes in this version than what the entry provides: https://www.sqlite.org/releaselog/3_33_0.html


Interesting. Great for Command Query Responsibility Segregation (CQRS), but I could see this leading to a reduction in data normalization for a lot of teams not using CQRS too. CQRS is a great tool - but the complexity is high so you only want to use it where you need it.

CQRS is overused a lot though, like using a $75k surveillance robotic dog from Boston Dynamics [https://spectrum.ieee.org/automaton/robotics/industrial-robo...] (Massive Dynamics?) to see who's at the door, when you could have just looked through the peephole.

I could see this feature becoming the CQRS of the SQL world for a while, used in many places where it should be considered harmful, in addition to the places where it is helpful.


I don’t understand why you focus on CQRS; I don’t think it’s specifically related? At least, not any more related than say, materialized views are useful for CQRS.

I’d expect a CQRS system on top of SQL to be implemented using a single event table and a lot of triggers, each triggering a different aggregate.


Not sure why I got downvoted. Also, Domain Driven Design (which is where I've usually seen aggregates discussed), Event Sourcing, and CQRS are three different things that might be used together in an implementation but don't have to be.

The reason I noted how it's great for CQRS is that I've usually seen CQRS implemented with two data models. One is queried, and one is modified via commands. The one that's queried is updated according to some strategy, sometimes on an event triggered by the command model being modified. This way you could trigger an update on your query model when your command model changes, all within SQL.


I think you're mostly downvoted because:

* It's only about CQRS, which in no way was mentioned in the article;

* You're not properly qualifying why this feature is so important for CQRS;

* In the same breath, you're also claiming it's overused and complex, while in fact you're the one who's "promoting" it.

I agree that in my reply, I was actually thinking about ES + CQRS rather than pure CQRS, but the gist of my comment still stands: I would expect a CQRS implementation to work with triggers (for incremental updates) or materialized views (one table insert-only, the "view" is kept in sync automatically by the database, which is what users query). Could you elaborate on why you think this new feature is better than those two approaches?



