We've been using SQLite in production as our exclusive means for getting bytes to/from disk for going on 6 years now. To this day, not one production incident can be attributed to our choice of database or how we use it.
We aren't using SQLite exactly as intended either. We have databases in the 100-1000 gigabyte range that are utilized by potentially hundreds or thousands of concurrent users. Performance is hardly a concern when you have reasonable hardware (NVMe/SSD) and use an appropriate configuration (PRAGMA journal_mode=WAL).
In our testing, our usage of SQLite vastly outperformed an identical schema on top of SQL Server. It is my understanding that something about not having to take a network hop and being able to directly invoke the database methods makes a huge difference. Are you able to execute queries and reliably receive results within microseconds with your current database setup?
Sure, there is no way we are going to be able to distribute/cluster our product by way of our database provider alone, but this is a constraint we decided was worth it, especially considering all of the other reductions in complexity you get with single-machine business systems. I am aware of things like DQLite/RQLite/et al., but we simply don't have a business case that demands that level of resilience (and complexity) yet.
Some other tricks we employ - We do not use 1 gigantic SQLite database for the entire product. It's more like a collection of microservices that live inside 1 executable with each owning an independent SQLite database copy. So, we would have databases like Users.db, UserSessions.db, Settings.db, etc. We don't have any use cases that would require us to write some complex reporting query across multiple databases.
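For illustration, here is a minimal sketch of that layout using Python's built-in sqlite3 (not our actual stack; the pragmas beyond WAL are common companions rather than necessarily what we run, and the file names just mirror the examples above):

    import sqlite3

    def open_db(path: str) -> sqlite3.Connection:
        # One connection per service-owned database file.
        conn = sqlite3.connect(path)
        conn.execute("PRAGMA journal_mode=WAL")    # readers don't block the single writer
        conn.execute("PRAGMA synchronous=NORMAL")  # common WAL pairing; slight durability tradeoff
        conn.execute("PRAGMA foreign_keys=ON")
        return conn

    # Each "microservice" inside the one executable owns its own file.
    users_db = open_db("Users.db")
    sessions_db = open_db("UserSessions.db")
    settings_db = open_db("Settings.db")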
> It is my understanding that something about not having to take a network hop and being able to directly invoke the database methods makes a huge difference. Are you able to execute queries and reliably receive results within microseconds with your current database setup?
with SQL Server you can get a very fast local connection by specifying "server=(local)" in the connection string - this uses the shared memory protocol, bypassing the network stack.
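For example, from Python via pyodbc it might look like this (the driver name and database are placeholders, and whether shared memory is actually used depends on which protocols the server has enabled):

    import pyodbc

    # "(local)" targets the local default instance; SQL Server will typically prefer
    # the shared memory protocol for local connections when it is enabled.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=(local);DATABASE=MyAppDb;"
        "Trusted_Connection=yes;Encrypt=no"
    )
    cursor = conn.cursor()
    cursor.execute("SELECT @@VERSION")
    print(cursor.fetchone()[0])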
Systems of least privilege prevent access exploitation (whether from a vulnerability or a misconfiguration) from going further than the assessed risk. They also protect against "accidents", like a little utility script nuking the database with root privilege, and let multiple entry points each own their own tables of data. That's the difference between "hackers stole X but not Y and Z" and "hackers dumped everything".
Your "backend" should not be assumed to be the infallible security. Outside of limiting DB access procledges, we have proper selinux and other configs to lock down the software abilities so even in the case of Arbitrary code execution caused by a common exploit it may be very difficult to use. This is important when you're running a box with multiple other applications with their own DB's as well.
In fact, these reasons are why the "backend" software and DB are often hardware isolated, or at least VM isolated, so that those DB access limitations aren't bypassed.
I haven't done anything like this before, but simple FS-level file permissions should resolve the issue, as long as the data with separate privilege requirements is put into different SQLite databases.
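Something like the following is what I have in mind (a sketch only; the service account names are made up, and you'd normally do this from provisioning scripts running as root rather than from application code):

    import os
    import shutil

    # Each service-owned database file is readable/writable only by its own account,
    # so a compromise of one service can't read the others' data.
    shutil.chown("Users.db", user="users_svc", group="users_svc")
    shutil.chown("UserSessions.db", user="sessions_svc", group="sessions_svc")
    for db in ("Users.db", "UserSessions.db"):
        os.chmod(db, 0o600)  # owner read/write only

    # Note: in WAL mode the companion Users.db-wal / Users.db-shm files need the same treatment.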
How is this setup fault tolerant? What happens if there is a hardware failure? How do you partition access in a way that means an extremely active user doesn't impact availability?
The product would suffer a total outage until manual intervention takes place. A restore of the VM from snapshot would be carried out by the customer. Some loss of the most recent business data would occur (i.e. between latest snapshot and time of failure). All of this is understood and agreed to by our customers.
> How do you partition access in a way that means an extremely active user doesn't impact availability?
Partitioning is not applicable. Our entire product fits on 1 machine and dividing the I/O along any dimension does not add much value to the performance equation.
I'm pretty sure you do need FB and Azure level infra for your recipe app. I've read a couple blog posts, watched a video on YouTube and copied the code from Stack Overflow, so I'm pretty much an expert on this; trust me.
My recipe application is going to be written in a purely functional language with CQRS and a message bus that provides "guaranteed once" determinism. I should be able to spin up no more than a half dozen containers on two VMs to handle this architecture. The database will probably be redundant MSSQL and Oracle (just in case one technology is fundamentally broken). Database writes will be proxied by a Redis instance. I will use 3x 3080 GPUs to train an ML model used for scaling the recipe depending on the number of people you wish to serve. Hmmm...I might need a third VM.
You can't do any serious neural stuff on 8GB of VRAM. I'd say save yourself some pain and buy the $3000 professional models with ECC so you can train transformers.
asyncio is probably one of the worst things that happened to programming in the last 2 decades. It's essentially a step backwards, to the technology of the 90s (cooperative multitasking), when a badly written text editor could cause an entire OS to freeze.
Curious why you think that. We have an app that does computation using thread pools, and moving from manually-created threads to asyncio has made it much more efficient and robust.
Some advantages: cancellation and error handling are trivial, there's no need for error-prone thread coordination, threads don't have to waste time waiting on locks (because we use asyncio sync primitives instead, so we don't submit work to a thread pool unless it's doable right now with no wait).
Of course, it depends on the tasks being written correctly. But "don't block the event loop" is a much easier rule to follow than "don't deadlock these 10 different threads that share queues and locks".
We didn't write an entire OS with it, but I don't think that was ever the point of asyncio, was it?
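For what it's worth, here is a stripped-down sketch of the pattern (the names are made up for illustration, not our actual code): coordination lives in the event loop, and work only goes to the thread pool when it can run immediately.

    import asyncio

    async def worker(name: str, queue: asyncio.Queue) -> None:
        loop = asyncio.get_running_loop()
        while True:
            job = await queue.get()  # coordination via asyncio primitives, no thread locks
            try:
                # Only the actual blocking/CPU work is handed to the thread pool.
                result = await loop.run_in_executor(None, job)
                print(f"{name}: {result}")
            finally:
                queue.task_done()

    async def main() -> None:
        queue: asyncio.Queue = asyncio.Queue()
        workers = [asyncio.create_task(worker(f"w{i}", queue)) for i in range(4)]
        for n in range(10):
            await queue.put(lambda n=n: n * n)  # stand-in for real blocking work
        await queue.join()
        for w in workers:
            w.cancel()  # cancellation is just a method call

    asyncio.run(main())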
Interesting. For an extremely specific use case and with users who understand and accept the caveats of this approach I'm sure it would work well enough. The most confusing thing to me is that there is apparently an intersection between users who are OK with an outage and data loss and users who want a product which can
> execute queries and reliably receive results within microseconds
I can think of plenty of services for which an occasional (once a year? less?) outage is okay. Heck, anything relying on AWS us-east-1 is going to have outages that frequently based on the last few months. Meanwhile, almost any service is better off when its response times are cut drastically. I've seen many instances where a service's response times are more than half waiting for a db to respond.
It's not the threat of an outage with data loss that concerns me - I just want to understand the use case that needs fractions of a second shaved off of query times by using SQLite in this way but is also OK with the possibility of data loss.
Interactive interfaces. There's a huge difference between moving a slider and seeing the reaction in real time and moving the slider and seeing the reaction a second later. If you define "real time" as 30 fps, you have 33ms to process each query and show the result. That could involve multiple database queries with computation in between, if your business logic isn't easily expressible in SQL.
Come to think of it, that covers most apps with a UI. Apps where you are exploring data are definitely more impacted, however.
I'd rather have a service that is down one day per year than suffer the death by a thousand papercuts of a slow UI every day.
Imagine some frustratingly slow app like Jira, for example: I would be much happier with a snappy UI there, where one could theoretically still work through a downtime of a week or more without updating task status.
If we are talking about shaving the last milliseconds, it could be something very interactive like autocompletion. Which is something with a graceful fallback of typing in the whole thing manually.
If my webpage loading makes ten database calls, and each database call takes 45 milliseconds, my webpage takes 300 milliseconds longer to load than if each call only took 15 milliseconds. Saving 0.3 seconds on each page load, especially if the rest of the call is less than a second, can significantly improve the usability of a website and increase user retention. This applies to any user-facing website, whether it's a social media platform, an online shop, or just a blog.
Obviously there are other improvements that can be made too, like batching db requests or in-memory caching, but this is one option.
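As a simplistic illustration of the in-memory caching option (the table and function names here are made up), something like:

    import sqlite3
    from functools import lru_cache

    conn = sqlite3.connect("shop.db", check_same_thread=False)
    conn.execute("CREATE TABLE IF NOT EXISTS products (id INTEGER PRIMARY KEY, name TEXT)")  # demo schema

    @lru_cache(maxsize=1024)
    def product_name(product_id: int) -> str:
        # Repeated page loads for the same product skip the database entirely.
        row = conn.execute(
            "SELECT name FROM products WHERE id = ?", (product_id,)
        ).fetchone()
        return row[0] if row else ""

The usual caveat applies: you need some story for invalidating the cache when the underlying row changes.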
All users understand and empathise when you say "Sorry, the system is down right now" once or twice a year.
None of them display any understanding or empathy whatsoever when you say "Your trade will always be executed after a 1 second delay, even if the price has moved"
Users find occasional downtime awful, but they find consistent lethargy worse.
No customer in fintech is going to accept "we lost some transactions" and still buy the software, so your use case is covered in that they are up front with the customer: if the server goes down, any transaction in progress will not complete.
Financial transactions, especially payments, work under the assumption that any node in the path from "payment attempt made" to "Payment attempt successful" can go down and not have an incorrect record of payment.
Authority? Me - I worked in the payment space for EMV for a number of years. There are (maybe) a minimum of five nodes in the path from swiping a card/making a payment to a successful payment; each of these computers/systems is owned by a different company, none of which trust the others not to be a MitM, and all of which are designed not to lose transactions if any of the systems involved suddenly powers off.
It's almost a myth at this point that a system doing financial transactions needs 100% uptime or it will lose transactions.
Trust me, even a system with only 80% uptime is never going to mislay transactions. The worst that happens is that transactions won't be processed during the downtime, payment or transfers are declined and the entity making payment/transfer can use the decline message to try again later.
If you're quadrupling your development and infrastructure cost in an attempt to reduce your downtime from twice a year to once every decade you're simply throwing money away[1].
[1] If you're doing that, it's also possible that you don't know of the many fail-safes built into the protocols for message exchange. Remember that these systems worked fine prior to HA/Always-Available cloud systems. Adding HA cloud tech is not going to make them more resilient than they are now, it just introduces additional points of failure.
There is a difference between losing data, and losing transactions. Transactions can be replayed from the message log (depending on the implementation) and reprocessed if needed.
I have seen plenty of recovery operations in finance, sometimes cases where the entire database server was lost, and even though the transactions were not saved, no actual financial data was ever lost, as this data could always be reloaded.
Of course, certain systems exist that do store data in a way where loss is not accepted, but even in finance these are somewhat rare compared to the vast number of other systems.
It depends on how you couch it. Working with large fleets of cloud servers has jaded us (well, me) into forgetting just how high you can get the uptime of a single server or well-managed VM if you're careful.
Sure, power cords get tripped over and hardware fails sometimes, but I'd imagine a lot of users who do care about reliability are making totally reasonable judgements that single points of failure in their systems are acceptable if the SPOFs are hardened and the oh-shit plan (restoring from a VM snapshot in this example) is something that they have confidence in.
Heck, given the availability issues that arise from the complexity of redundant/HA systems, I think there are probably plenty of seasoned enterprise customers who prefer hardened SPOFs managed by experts over distributed systems, especially once the dimension of cost enters into the discussion.
For starters, how likely is a data loss? SQLite may lose a transaction if the server shuts down or the application crashes, but this doesn't mean the db gets corrupted.
Secondly, even if that happens, the fact that the database is a simple file on the production server makes it easy to back up and restore.
Thirdly, a "traditional" db is no proof against loss either...sure, the production server with the SQLite can go down...so can the container running Postgre.
And lastly, actually most services are doing perfectly fine when a data loss occurs. Take a message board for example. Worst case scenario: A post isn't written and someone has to type a few lines and emojis again.
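On the second point, SQLite even has an online backup API, so "copy the file" doesn't have to mean stopping the service. A minimal sketch with Python's built-in sqlite3 (file names are placeholders):

    import sqlite3

    src = sqlite3.connect("Users.db")
    dst = sqlite3.connect("Users.backup.db")
    with dst:
        src.backup(dst)  # consistent snapshot taken while readers/writers keep running
    dst.close()
    src.close()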
No need to be condescending when I'm trying to learn. The premise of using SQLite because it was easier and to save fractions of a second didn't make sense to me as a tradeoff for potentially losing data.
As long as you're estimating your future costs correctly then you're golden. If you aren't and your application becomes complex through growth or you need resiliency then you'll need to pay that cost and that cost can be big.
Interesting. So I would ordinarily want to put a foreign key constraint on the user_id column of a UserSessions table (or similar). In general, presumably you have relationships across the tables that are in those discrete databases. Do you just enforce these constraints/do joins in code? It seems like splitting related tables across multiple databases loses some (possibly a lot?) of the benefits of relational DBs, so I'm curious how you handle/manage that.
That said, I love the idea of this architecture. Might use it for whatever next dumb little web service I cook up! I love how this simplifies a lot of dev/deployment ops, perfect for a side project.
You'd have to do the join on the client. This is a problem in general for scaling SQL - it's not impossible, but with sharded SQL servers you have to be very deliberate in how you do joins so as to coincide as much as possible with your sharding strategy.
It would be similar here. According to the author, each SQLite DB belongs to a single microservice, which will naturally group together the most common joins. Anything else will indeed have to be joined manually.
That's part of the reason NoSQL became popular for a bit. The trend has reversed, and sharded SQL is pretty common now, but it definitely adds more cognitive load to schema design.
SQLite has ATTACH (https://sqlite.org/lang_attach.html) for cross database operations. I've never tried it for foreign key constraints across databases, but I think it would work?
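A cross-database join via ATTACH does work; here is a small sketch with Python's sqlite3 (the schemas are made up to match the examples upthread). My understanding is that a foreign key can't reference a table in a different attached database, though, so constraint enforcement would still live in application code:

    import sqlite3

    # Demo setup: two separate database files, one table each (made-up schemas).
    users = sqlite3.connect("Users.db")
    users.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")
    users.commit()

    sessions = sqlite3.connect("UserSessions.db")
    sessions.execute("CREATE TABLE IF NOT EXISTS user_sessions (user_id INTEGER, last_seen TEXT)")
    sessions.commit()
    sessions.close()

    # Attach the second file and join across the two databases as if they were one.
    users.execute("ATTACH DATABASE 'UserSessions.db' AS s")
    rows = users.execute(
        "SELECT u.name, ses.last_seen "
        "FROM users AS u JOIN s.user_sessions AS ses ON ses.user_id = u.id"
    ).fetchall()
    print(rows)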
Are you capable of achieving no-downtime deployment?
I mean, on the product I currently work on, we have one Mongo database and a cluster of 4 pods on which our backend is deployed. When we want to deploy some new feature without any downtime, one of the pods is shut down, our product still works with the 3 remaining pods, and we start a new pod with the new code, then do this for all 4 pods.
But with SQLite, if I understand correctly, you have one machine or one VM that is both running your backend and storing your SQLite.db file. If you want to deploy some new features to your backend, can you achieve no-downtime deployment?
This is really interesting. Would you mind sharing a bit about how you keep the data that is shared across services in sync? Or is there a hard separation of concerns, so that services would only communicate with the service that owns that data to obtain it?
To my knowledge, WAL mode still needs to serialize writes for each database file. I'm assuming this is not a setup where there are too many concurrent writers?
Correct - We are using the typical SQLite build, which serializes all writes by default, and we have made no effort to undo this behavior. We actually rely on this to ensure things happen in the right order, and we take advantage of it for performance reasons.
Because SQLite is effectively serializing all the writes for us, we have zero locking in our code. We used to have to lock when inserting new items (to get the LastInsertRowId), but newer versions of SQLite support the RETURNING keyword, so we don't even have to lock on inserts now.
Also, the fact that we have the databases divided across function helps free up some of the lock contention in the provider. We don't really have any mixed-workload databases - it's either "slow" gigantic operations (JSON blob access) or super quick tiny things (updating session timestamps). So, there is some minor isolation of contention issues on a subsystem basis.
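For reference, the RETURNING pattern mentioned above looks roughly like this (illustrated with Python's sqlite3 and a made-up table, not our actual code; it requires SQLite 3.35+):

    import sqlite3

    conn = sqlite3.connect("Users.db")
    conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")

    # No separate last_insert_rowid() round trip, and no lock held around the insert:
    # the new id comes back from the INSERT statement itself.
    new_id = conn.execute(
        "INSERT INTO users (name) VALUES (?) RETURNING id", ("alice",)
    ).fetchone()[0]
    conn.commit()
    print(new_id)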