
The innovation here is getting sql.js to use HTTP range requests for file access rather than loading the whole database into memory.
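As a minimal sketch of the idea (the helper name and the 4 KiB page size are my own assumptions, not the project's actual code): each read of a SQLite page gets translated into an HTTP Range header, so the browser only fetches the pages a query actually touches.

```python
def page_range(page_number, page_size=4096):
    """Byte range (inclusive) covering one SQLite page; pages are numbered from 1."""
    start = (page_number - 1) * page_size
    end = start + page_size - 1
    return f"bytes={start}-{end}"

# A read of page 3 of a 4 KiB-page database becomes a request like:
#   GET /db.sqlite3   with header   Range: bytes=8192-12287
# which the server answers with 206 Partial Content and just those bytes.
print(page_range(3))  # → bytes=8192-12287
```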

I wonder when people using Next.js will start using this for faster builds of larger static sites?



See also https://github.com/bittorrent/sqltorrent, same trick but using BitTorrent


Yeah, that was one of the inspirations for this. That one doesn't work in the browser though; it would be a good project to do the same thing with SQLite in WASM, integrated with WebTorrent instead of a native torrent client.

I actually did also implement a similar thing, fetching data on demand from WebTorrent (and in turn helping to host the data yourself by being on the website): https://phiresky.github.io/tv-show-ratings/ That one uses protobufs split into a hashmap instead of SQLite, though.


This looks pretty efficient. Could some blockchains be interacted with this way, without e.g. web3.js? Though LevelDB indexes aren't SQLite.

Datasette is one application for views of read-only SQLite dbs with out-of-band replication. https://github.com/simonw/datasette

There are a bunch of *-to-sqlite utilities in the corresponding Dogsheep project.

Arrow JS for 'paged' browser-client access to DuckDB might be possible, and faster, but without full SQLite SQL compatibility or the SQLite test suite. https://arrow.apache.org/docs/js/

https://duckdb.org/ :

> Direct Parquet & CSV querying

In-browser notebooks like Pyodide and Jyve have local filesystem access with the new "Filesystem Access API", but downloading/copying all data to the browser for every run of a browser-hosted notebook may not be necessary. https://web.dev/file-system-access/


DuckDB can directly & selectively query Parquet files over HTTP/S3 as well. See here for examples: https://github.com/duckdb/duckdb/blob/6c7c9805fdf1604039ebed...


Would also be great to add (efficient) search to a static blog.


yea, SQLite FTS5 has been pretty amazing for quick search solutions (though I only use English)
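For anyone who hasn't tried it, a minimal FTS5 demo (the table and sample rows here are made up; this assumes your SQLite build ships with the FTS5 extension, which most do):

```python
import sqlite3

# In-memory demo database with a full-text index over two columns.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE posts USING fts5(title, body)")
db.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [("Static search", "Full-text search on a static blog"),
     ("Range requests", "Fetching SQLite pages over HTTP")],
)

# MATCH searches all indexed columns; rank orders by relevance.
rows = db.execute(
    "SELECT title FROM posts WHERE posts MATCH 'static' ORDER BY rank"
).fetchall()
print(rows)  # → [('Static search',)]
```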


Definitely. Just need to add a layer to the static site generator for it to populate the SQLite DB, right?
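Something like this, perhaps; a hypothetical build step (schema, helper name, and sample pages are all invented for illustration). One real-world detail worth noting: a smaller SQLite page size means each range request transfers less unused data.

```python
import sqlite3

def build_site_db(path, pages):
    """Hypothetical static-site build step: write rendered pages into a
    SQLite file that the generated site then queries over HTTP."""
    db = sqlite3.connect(path)
    # Smaller pages keep each range request cheap.
    db.execute("PRAGMA page_size = 1024")
    db.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    db.executemany("INSERT OR REPLACE INTO pages VALUES (?, ?, ?)", pages)
    db.commit()
    db.execute("VACUUM")  # applies the page size and compacts the file
    return db

db = build_site_db(":memory:", [
    ("/posts/hello", "Hello", "First post body"),
    ("/posts/sqlite", "SQLite on Pages", "Querying the site DB over HTTP"),
])
count = db.execute("SELECT count(*) FROM pages").fetchone()[0]
```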


I'd also version the DB in the URL; otherwise you could end up changing the file out from under someone who already has the page loaded, with who-knows-what results depending on how different the file is. You could prefix a HEAD request to every range request to check for changes, but that adds overhead and still doesn't completely close the gap, so it would remain possible to read a file different from the one you intended. The cost is more disk usage, depending on how many copies you keep around, but keeping at least the most recent old version seems reasonable unless you're skating really close to quota on whatever system you're using.


Requests could use the If-Unmodified-Since header to guard against DB changes without the overhead of a separate HEAD request.

If the underlying DB has been changed then the server should respond with 412 Precondition Failed.
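One sketch of that in Python's urllib (the URL and date are placeholders): capture the Last-Modified value from the initial response, then send it back on every range request; a 412 means the DB changed and the client should drop its cached pages.

```python
from urllib.request import Request

def conditional_range_request(url, start, end, last_modified):
    """Build a range request the server should refuse with 412 Precondition
    Failed if the database changed since `last_modified` was captured."""
    return Request(url, headers={
        "Range": f"bytes={start}-{end}",
        "If-Unmodified-Since": last_modified,
    })

req = conditional_range_request(
    "https://example.com/data.sqlite3", 0, 4095,
    "Wed, 21 Apr 2021 07:28:00 GMT",  # placeholder Last-Modified value
)
# urlopen(req) raises HTTPError with code 412 if the file changed;
# on that, refetch the DB header and invalidate the local page cache.
```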


Microsoft Access Cloud Edition, basically?


Sort of. Access had a "Forms" feature that let you create basic GUIs on top of your database. Also, the OP's project is (currently) only providing a read-only view of the SQLite database. Adding write support is possible but will be far less impressive to the HN crowd because SQLITE_BUSY will rear its ugly head ;-)


I was mostly referring to the shared file access approach.


I'm curious, in what manner could this method speed up Next.js builds? That's all done locally, which negates the effect of HTTP range requests, right?


I'm guessing they mean rather than build a static Next site that generates 10k+ pages (or whatever large means in the given context), it instead creates one page that just queries the data from the client.

I have one Next static site that has about 20k pages and takes about 20 minutes to build and deploy. I think that's an acceptable build time. But I do know of other people around the net who have mentioned having sites with 20k-ish pages taking an hour+ to build. For them I could see the desire to try this sqlite trick.


You should write a post about this if you implement it. My humble suggestion for blog post title - 'Speed up your Next.js builds with this author's one weird trick! Vercel hates him!'


I already generate my entire blog on Next.js from a SQLite database: https://lsmurray.com/blog/database-first-development

It’s got a ton of rough edges, but the boilerplate is there to get a proof of concept going pretty quickly.


For really large sites Next.js already has Incremental Static Regeneration which is usually the right solution to fast [re]builds: https://www.smashingmagazine.com/2021/04/incremental-static-...


It's not the same, because you have to rebuild all the pages whenever you change your data source. With this implementation you can upload a new data set and it will just work.

It's just a different stack.


Generating 20k pages in 20 mins is impressive: about 16 pages a second on average.

In my experience, it can take a couple of minutes just to deploy 20 pages, but that could just be the overhead of TypeScript and Sass compilation too...


Hugo claims <1 ms a page, which would mean 20k pages in under 20 seconds. 20k pages in 20 mins is not fast!


Confirmed: built a 30k-page site with Hugo in 53 seconds, including grabbing JSON content from Contentful and deploying to AWS S3.

Took some effort to get everything under 60 seconds, speed was the main reason to use Hugo.


Oops, my memory was off. It's 10,925 pages and the last build took 18 minutes.



