Around the Opus 4.6 release it got good enough that people tried laundering it all the time, and around 4.8 the group dynamics were such that it was worse to call it AI writing than to launder it as your own.
Regardless, there was still enough of a taboo around it that the way people would try to launder it began to often include an AI-generated "sources" table, which was also aggressively bad, but lord knows no one reads those. So it was just another sigil that misled.
I agree completely and emphatically and vehemently with the idea behind the message.
The slop & aggressively poor argumentation, the kind that I think would have caused me to fail it if I tried it in speech & debate in middle school, leaves me feeling empty.
They keep saying $400M, $400M, $400M, $400M, and the only cost they came up with is $20M. It makes me uncomfortable to support the overall cause if this is how it'll be played, because, setting aside morality of tactics, it's not playing to win. Anyone who is at the margins will see it plainly and be given a reason not to listen.
> Isn't concurrency also limited by your machines disk speed for writes
Yes, in theory: given a large enough database, and a disk that can only do one operation at a time, and a large enough operation that touches enough of the database. In practice, in a SQLite single tenant scenario? No, not at all.
> what difference does it make if you write sequentially vs concurrently. Why does concurrency even matter for databases?
As soon as your codebase involves reacting to events independently of a user taking action it becomes a practical concern. Generally, this is a broad question and has 1,000,000 answers.
EDIT: Originally I had "I think you understand generally, no?" appended but realized that's not helpful at all, if you did, you wouldn't be asking.
Something that may help is imagining what'd happen if a DB wasn't thread safe / didn't allow multiple writers. Ex. in SQLite's case, it allows multiple write operations to take place but there's a one-at-a-time queue. If we didn't have databases that were able to execute multiple writes simultaneously, you'd need a separate database for each concurrent writer you expect, and you'd effectively have a global lock. Orderly scaling would be ~impossible unless you did something crazy like have a single server per user.
I guess I need to dive deeper into this as I do not understand the implications you gave me, but I appreciate the attempt. Generally I understand why concurrency is good in many cases, I just don't get why it's important for database stuff too.
Edit: thanks for clarifying in the edit, makes a lot more sense.
Imagine if every tweet had to go through a one-at-a-time queue before being persisted. There are about 6000 tweets per second, so you would have to be able to save them at <0.17ms per tweet or else you would become backlogged. If you are getting backlogged, you have to buffer those incoming tweets somewhere until they can be written, and eventually that buffer gets full and you start losing tweets.
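To make the arithmetic concrete, here is a rough sketch of that budget. The 6000/s figure is the one quoted above; `backlog_after` is a hypothetical helper name, and the model assumes a perfectly steady arrival rate, which real traffic is not.

```python
# Back-of-the-envelope budget for a strictly one-at-a-time write queue.
TWEETS_PER_SECOND = 6000  # figure quoted in the comment above

# Time available per write before the queue starts to fall behind.
budget_ms = 1000 / TWEETS_PER_SECOND
print(f"per-write budget: {budget_ms:.3f} ms")


def backlog_after(seconds, write_ms, arrivals_per_second=TWEETS_PER_SECOND):
    """Writes still waiting after `seconds` if each write takes `write_ms`.

    Assumes a constant arrival rate and a single writer (illustrative only).
    """
    served_per_second = 1000 / write_ms
    return max(0.0, (arrivals_per_second - served_per_second) * seconds)


# At 0.25 ms per write the queue only drains 4000/s, so it grows by 2000/s.
print(backlog_after(60, write_ms=0.25))
```

Anything slower than the budget means the buffer grows without bound, which is the "eventually you start losing tweets" part.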
Maybe that too is a naive question, but there's a large scale between single user and 6000 tweets per second - most of our apps will never reach anything approaching even one save a second. So where to draw the line? I have so far gone the SQLite route for my hobby apps as it's so easy to handle and doesn't require setting up two docker containers for a single app. Am I painting myself into a corner in case my apps ever do become relevant?
Excellent question, and I spent so many years asking myself this, over and over. You asking it made me realize I just...don't anymore. So allow me to blather a bit / free associate because I won't be sure why myself until I've written it out.
TL;DR: whatever works for you is the right decision. (which isn't helpful, I heard this so many times and as the recipient, I thought "That's nice. Now how do I choose what works for me?")
I finally had to use Postgres a couple years ago after a career of only SQLite - startup founder & iOS app developer using SQLite, turned Googler on Android, turned doing-my-own-thing.
In retrospect, I have made only one bad decision:
I went way out of my way to make SQLite work at my 2009 iOS startup. It was a restaurant point-of-sale system, and to allow a networked system, one of the iOS devices would act as a server. This was a really cool trick, even a marketing advantage that users appreciated: the restaurant could continue to operate if the internet went down. But it eventually became clear owners loved having internet-based access too, ex. to do reporting/financial analysis over the data. And I kept contorting: instead of moving past my fear of getting into things I didn’t know, I did some rudimentary thing over port forwarding. The bad decision here was riding one horse for so long and letting it affect the product. Having a real server database would have allowed for a lot more features: think first-party gift cards, and a hundred others.
After leaving Google I needed server-side storage and fought and fought to avoid it. Then it turned out Postgres is easy and, just like SQLite, 99.999% of the time I don’t even know I’m using it.
In retrospect, there’s ~0 switching cost to these, particularly in the age of LLMs. If you do need something more one day, it’ll be easy to do, and if you have to do it in a rush because you’re successful, you’re in Good Problem territory.
Hope that helped, after writing it out, dunno how convincing it is. Feel free to follow up, I appreciate the curiosity/framing because I had the same thought for so long.
If we imagine 1 tweet = 1 transaction, that's only 6k tps. 6k tps is completely achievable, dare I say even pedestrian for an optimized database. And most systems are operating far below the scale of Twitter/X.
SQLite's round trip time is actually much faster than Postgres's, since there’s no need to touch the network. You can get massive single-threaded throughput. In order to achieve comparable throughput in Postgres you need a large number of concurrent connections, since each conn spends most of its time passing messages, deserializing etc (with a much larger total amount of overhead). There are a surprising number of bottlenecks and misconfigurations that can tank performance of networked systems, particularly DBs.
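If you want to see the single-threaded, in-process side for yourself, a minimal sketch with Python's built-in `sqlite3` module looks something like this. The table name and `insert_rows` helper are made up for illustration, and it uses an in-memory database in one transaction, so it says nothing about durable per-commit write speed; numbers will vary a lot by machine.

```python
import sqlite3
import time


def insert_rows(n):
    """Insert n rows on one connection, in one transaction; return (count, seconds)."""
    conn = sqlite3.connect(":memory:")  # assumption: in-memory, no disk sync involved
    conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, body TEXT)")

    start = time.perf_counter()
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO tweets (body) VALUES (?)",
            ((f"tweet {i}",) for i in range(n)),
        )
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
    conn.close()
    return count, elapsed


count, elapsed = insert_rows(6000)
print(f"{count} rows in {elapsed * 1000:.1f} ms")
```

There is no socket, no wire protocol, and no connection pool in that path, which is the point being made above.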
Like you suggest, the reason for not picking SQLite is not reliability, speed, etc. Networked DBs allow decoupling between app and db servers, which have operationally different characteristics. But most importantly, you can have multiple apps access the same DB at the same time. Eg analytics, one off queries, any 3p app that interacts with your data directly.
In the beginning apps and SQL were co-mingled. Oracle eventually came along and noticed that people wanted SQL on the network so that many different apps, running on different computers, could all access the same data. But then people realized that clients really want rich, 'tree'-like data, not simple rows and columns, so people started sticking networked databases in front of networked databases to serve as a transformation system. And now people are realizing that the second networked database layer is redundant and never used beyond what is required for the client-facing network database, so they are moving the storage back into the first network database layer, just like Oracle did all those years ago. What is old is new again.
What changed is SSDs. SSDs mean that local access is faster than hitting the network. An expensive SAN stopped making sense because of this in specific cases. So for read-heavy, or even read-only, database loads, you copy the SQLite file to the node that's processing the file, and just update that file whenever the data does get changed.
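A rough sketch of that "ship the file to the node" pattern, using only the standard library. `publish_snapshot` and `open_readonly` are hypothetical names, and the plain file copy assumes nothing is writing to the source at that moment (for a live database you would likely want SQLite's backup API instead).

```python
import os
import pathlib
import shutil
import sqlite3


def publish_snapshot(src_path, dest_path):
    """Copy the database file next to the reader, then swap it in atomically.

    Assumes no writer has the source open mid-transaction.
    """
    tmp_path = f"{dest_path}.tmp"
    shutil.copyfile(src_path, tmp_path)
    os.replace(tmp_path, dest_path)  # readers see the old or new file, never half of one


def open_readonly(path):
    """Open a local copy read-only via SQLite's URI syntax (mode=ro)."""
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Each node calls `open_readonly` on its own copy, and whatever owns the data calls `publish_snapshot` again whenever it changes.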
I absolutely 100% do not understand it either. At all. Every time I try to over the last year or two I come away with the conclusion it's something that sounds cool (to me too!) but is guaranteed to cause more problems than more obvious solutions.
That being said I'd kill for someone who used it and benefited to explain it to me in a practical sense. (specifically where syncing is involved, and syncing a subset of the SQLite is necessary. If it's "just" a document store that's treated like a blob for syncing/backup, that's familiar. If it's all in one storage but only local, that's familiar.)
Re: TFA, I guess it would have helped if I knew what Obelisk was, which is on me, and a more in-depth explanation of how this ties into AI/agents, which is on the industry/writer.
It's very likely that you have multiple SQLite databases in your pocket right now. It's one of the most widely deployed pieces of software on the planet. If your conclusion is that it's guaranteed to cause more problems than other solutions, then that's on you.
I use it to keep infra spend low for some systems I built/maintain for a handful of volunteer orgs. These systems have multiple users, dozens to a couple hundred. I just serialize writes in app code. Backup the db files to blob storage every so often and don't think about it much more.
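For anyone wondering what "serialize writes in app code" can look like, here is one possible sketch, not necessarily what the commenter runs. `SerializedDB` is a made-up name; it assumes a single process sharing one connection behind a lock, and it leaves the actual upload to blob storage out entirely.

```python
import sqlite3
import threading


class SerializedDB:
    """One shared connection; every access goes through a single lock."""

    def __init__(self, path):
        # check_same_thread=False lets threads share the connection;
        # the lock below is what actually keeps access one-at-a-time.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()

    def write(self, sql, params=()):
        with self._lock:
            with self._conn:  # commit on success, roll back on error
                self._conn.execute(sql, params)

    def read(self, sql, params=()):
        with self._lock:
            return self._conn.execute(sql, params).fetchall()

    def backup_to(self, dest_path):
        """Write a consistent copy to dest_path; uploading it is up to you."""
        with self._lock:
            dest = sqlite3.connect(dest_path)
            try:
                self._conn.backup(dest)
            finally:
                dest.close()
```

Run `backup_to` on a timer, push the resulting file wherever you keep backups, and that is more or less the whole ops story at this scale.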
Been an app developer since 2009, worked on Android for 6 years at Google. Push notifications suck, users hate them.
Simultaneously, I cannot match the pull quote, an argument summary, to their argumentation. IIUC if the reword patent / Apple’s summarizing disappear there’s 0 reason to say it wasn’t control passing purely to the consumer.
So I’m left a bit empty as the high-minded purpose has little backing, and thus comes across as bloviating.
It’s a bunch of Claude blather, and I love Claude. Just not worth copying over to HN, because the rush to get to a narrow answer to a narrow question elides the meaningful bits, ex. what does happen during sleep deprivation. Has a “not even wrong” air simply because you’re trying to get to true/false on a narrow question then pushing your research assistant to disavow what you’re quote unquote “skeptical” of.
I don't understand this post. I can read it, but I don't understand it.
Why do you think I'm arguing something? Again, smacks of Pauli's not even wrong. You're confusing your headspace with everyone else's and rushed to copy pasta an AI you're browbeating into disavowing things you're skeptical of to...win an argument, I guess? Based on this post? Unclear to me what the argument is, or, if I'm understanding correctly, due to the narrow focus while being self-absorbed.