Litestream author here. Let me know if you have any questions. It's built to run as a separate process and to be super easy to get up and running. We have a GitHub discussion board and an active Slack group as well if you need any help.
Hey, lead of Debezium here, a change data capture tool for a number of databases (not SQLite, though). Out of curiosity, how are you implementing change ingestion? Is there some interface/API in SQLite which lets you do this, or are you manually parsing its log files?
Hi Gunnar, good to meet you. Litestream works by reading off the SQLite WAL file, which acts as a circular buffer. It takes over the checkpointing process to control when the buffer rolls over so it doesn't miss any frames. Those frames get copied over, and each buffer is recreated as a sequential set of WAL files that can be replayed to reconstruct the state of the database at a given point in time.
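For anyone curious what "reading off the WAL" can look like, here is a minimal Python sketch based on the documented SQLite WAL layout (a 32-byte file header, then frames made of a 24-byte header plus one page). It is not Litestream's code: `read_wal_frames` and `demo` are hypothetical names, and the sketch only compares salts and skips the frame checksum verification a real reader would need.

```python
import os
import sqlite3
import struct
import tempfile

WAL_HEADER_SIZE = 32    # fixed-size header at the start of the -wal file
FRAME_HEADER_SIZE = 24  # each frame is a 24-byte header followed by one page


def read_wal_frames(wal_path):
    """Return (page_size, frames) for the frames belonging to the current WAL.

    Hypothetical helper: checksums are NOT verified here, only the salts.
    """
    with open(wal_path, "rb") as f:
        data = f.read()
    if len(data) < WAL_HEADER_SIZE:
        return 0, []
    # Header fields are big-endian 32-bit integers.
    magic, _version, page_size, _ckpt_seq, salt1, salt2 = struct.unpack(
        ">6I", data[:24]
    )
    if magic not in (0x377F0682, 0x377F0683):
        raise ValueError("not a SQLite WAL file")
    frames = []
    frame_size = FRAME_HEADER_SIZE + page_size
    offset = WAL_HEADER_SIZE
    while offset + frame_size <= len(data):
        pgno, db_size, f_salt1, f_salt2 = struct.unpack(
            ">4I", data[offset:offset + 16]
        )
        if (f_salt1, f_salt2) != (salt1, salt2):
            break  # leftover frame from before the WAL last rolled over
        frames.append({
            "pgno": pgno,
            "commit": db_size != 0,  # non-zero size marks a commit frame
            "page": data[offset + FRAME_HEADER_SIZE:offset + frame_size],
        })
        offset += frame_size
    return page_size, frames


def demo():
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA wal_autocheckpoint=0")  # keep frames in the WAL
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    conn.execute("INSERT INTO t (v) VALUES ('hello')")
    conn.commit()
    # Read the WAL while the connection is open; closing would checkpoint it.
    page_size, frames = read_wal_frames(db_path + "-wal")
    db_page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    conn.close()
    return page_size, db_page_size, frames
```

Calling `demo()` returns the page size recorded in the WAL header, the page size the database reports, and the parsed frames, each tagged with the page number it overwrites and whether it ends a transaction.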
It sounds like Litestream differs from Debezium in that it provides physical replication rather than logical row changes. However, I've been toying with the idea of determining row-level changes from the WAL frames by using ptrmap pages to traverse up the b-tree and determine the owner table. There's a bunch of stuff on the roadmap before that like live read replication though.
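To make the ptrmap idea concrete, here is a rough Python sketch of walking from an arbitrary page up to its root b-tree page using pointer-map entries, then looking the root up in `sqlite_master`. It is only an illustration of the file format, not anything Litestream does today: `ptrmap_entry`, `root_page_of`, and `demo` are hypothetical names, it reads the main database file rather than WAL frames, it only works on auto-vacuum databases (the only ones that have ptrmap pages), and it ignores the pending-byte page that matters for files around 1 GB.

```python
import os
import sqlite3
import struct
import tempfile

PTRMAP_ROOTPAGE = 1  # entry types defined by the SQLite file format
PTRMAP_FREEPAGE = 2


def ptrmap_entry(f, pgno, page_size, usable):
    """Return (type, parent_page) from the 5-byte ptrmap entry for pgno."""
    per_map = usable // 5 + 1  # one ptrmap page plus the pages it describes
    map_pgno = ((pgno - 2) // per_map) * per_map + 2
    if pgno == map_pgno:
        raise ValueError("page %d is itself a ptrmap page" % pgno)
    f.seek((map_pgno - 1) * page_size + 5 * (pgno - map_pgno - 1))
    return struct.unpack(">BI", f.read(5))


def root_page_of(db_path, pgno):
    """Follow parent pointers until a root page is reached (None if free)."""
    with open(db_path, "rb") as f:
        f.seek(16)
        page_size = struct.unpack(">H", f.read(2))[0]
        if page_size == 1:
            page_size = 65536
        f.seek(20)
        usable = page_size - f.read(1)[0]  # minus reserved bytes per page
        f.seek(52)
        if struct.unpack(">I", f.read(4))[0] == 0:
            raise ValueError("no ptrmap pages: database is not auto-vacuum")
        while True:
            kind, parent = ptrmap_entry(f, pgno, page_size, usable)
            if kind == PTRMAP_ROOTPAGE:
                return pgno
            if kind == PTRMAP_FREEPAGE:
                return None
            pgno = parent  # overflow and non-root b-tree pages point upward


def demo():
    db_path = os.path.join(tempfile.mkdtemp(), "ptrmap.db")
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA auto_vacuum=FULL")  # must precede table creation
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    conn.executemany("INSERT INTO t (v) VALUES (?)", [("x" * 200,)] * 2000)
    conn.commit()
    roots = dict(conn.execute(
        "SELECT rootpage, name FROM sqlite_master WHERE rootpage > 0"))
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    conn.close()
    page_count = os.path.getsize(db_path) // page_size
    # Page 1 holds the schema and page 2 is the first ptrmap page.
    return {p: roots.get(root_page_of(db_path, p))
            for p in range(3, page_count + 1)}
```

`demo()` maps every data page in a small single-table database to the name of the table that owns it. Applied to the page number in each WAL frame, the same lookup could in principle attribute a changed page to a table.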
There's some additional info on the site about how Litestream works[1] and I'm planning on making a video similar to this Raft visualization[2] I did a while back.
Thanks for sharing those insights. Indeed Debezium is based on logical replication. I'll definitely keep an eye on Litestream, perhaps there may be some potential for collaboration at some point? SQLite hasn't come up really in our community so far, but personally I find it very interesting.
Yeah, for sure. I'm up for some collaboration where it makes sense. I can understand how SQLite probably wouldn't come up so far. If someone is running CDC on Kafka (Debezium) then they're probably running a client/server database instead of an embedded one.
Hit me up on Twitter[1] if you have any questions, or join our pretty friendly, active Slack[2].
Yes, it reads off the WAL, compresses the frames using LZ4, and uploads them to S3. The SQLite WAL acts as a circular buffer, so Litestream takes over the checkpointing process to control when it rolls over and can recreate that buffer as separate files. There's additional information on the web site: https://litestream.io/how-it-works/
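As a rough illustration of that flow, the Python sketch below turns off automatic checkpoints, copies the WAL out as a compressed segment, and only then checkpoints so the WAL can roll over. Several things here are assumptions rather than Litestream's implementation: `ship_wal_segment` and `demo` are hypothetical names, `zlib` stands in for LZ4 (which is not in the Python standard library), a local directory stands in for S3, and the sketch assumes a single connection, so it does nothing to stop other connections from checkpointing or writing between the copy and the checkpoint.

```python
import os
import sqlite3
import tempfile
import zlib


def ship_wal_segment(conn, db_path, out_dir, index):
    """Copy the current WAL to a compressed segment, then checkpoint.

    Hypothetical helper; zlib and out_dir are stand-ins for LZ4 and S3.
    """
    with open(db_path + "-wal", "rb") as f:
        raw = f.read()
    seg_path = os.path.join(out_dir, "%08d.wal.z" % index)
    with open(seg_path, "wb") as f:
        f.write(zlib.compress(raw))
    # Only after the copy is written do we let the WAL roll over.
    busy, _log, _ckpt = conn.execute(
        "PRAGMA wal_checkpoint(TRUNCATE)").fetchall()[0]
    if busy:
        raise RuntimeError("checkpoint was blocked by another connection")
    return seg_path, len(raw)


def demo():
    tmp = tempfile.mkdtemp()
    db_path = os.path.join(tmp, "app.db")
    out_dir = os.path.join(tmp, "replica")
    os.makedirs(out_dir)
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA wal_autocheckpoint=0")  # we decide when to checkpoint
    conn.execute("CREATE TABLE t (v TEXT)")
    conn.execute("INSERT INTO t VALUES ('a')")
    conn.commit()
    seg_path, raw_len = ship_wal_segment(conn, db_path, out_dir, 0)
    wal_after = os.path.getsize(db_path + "-wal")
    with open(seg_path, "rb") as f:
        restored_len = len(zlib.decompress(f.read()))
    rows = conn.execute("SELECT count(*) FROM t").fetchone()[0]
    conn.close()
    return raw_len, restored_len, wal_after, rows
```

The ordering is the point of the sketch: the segment is persisted before the checkpoint runs, so no frame is lost when the WAL is reset. Numbering the segments sequentially is what would let them be replayed in order later.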