diath's comments | Hacker News

No benchmarks. No FLOPs. No comparison to commodity hardware. I hate this about cloud servers. "9 is faster than 8 which is faster than 7 which is faster than 6, ..., which is faster than 1, which has unknown performance".

You need to benchmark a new EC2 instance anyway. If it’s out of spec, burn it down and redeploy.

Why is that needed, and how would you know if it’s out of spec?

How: you run the test on a bunch of hosts and create a spec from the observed ranges.

Why: you might be concerned with network connectivity (you don't get to choose which data center you launch in, and it might not be exactly equal), noisy neighbors on shared hosts, etc. If you're measuring for networking, you're probably spinning up separate accounts/using a bank of accounts and launching something in every AZ until you find what you're looking for.


Why: you’re paying for a service level that has no guarantee from the vendor

As soon as they're publicly usable, people benchmark them carefully. All currently available models have clear metrics.

Since it will be a virtual machine, its performance can be arbitrarily reduced.

If you're interested in using them you should just bench them yourself.

I've had terrible luck benchmarking EC2. Measurements are too noisy to be repeatable. The same instance type can swing by double-digit percentages when tested twice, an hour apart.

Who exactly believes manufacturer benchmarks? Just go run your benchmarks yourself and pick. Price/performance is a workload thing.


That didn't stop people from throwing a fit over master-slave terminology in software (having nothing to do with slavery), going so far as to rename long-standing development branch names, as well as put significant effort into removing such terms from the code itself and any documentation.

How does that work? You transfer the money in BTC, you exchange that into real money, you open a new bank account, you deposit that money, the bank's anti-money-laundering detection sees a large deposit to a newly opened account and triggers an alert, the bank locks you out of the account and asks you for proof of income/tax payments, you have no explanation of where that money came from legally, they freeze your account and report you to their local revenue services, you're SOL.

Oh yeah, this isn't about large sums, this is about "I need money to live there for a month".

However, with a lot of BTC trading sites, you get money from real people's accounts, so it's not that crazy as long as the amounts are low.


It's way harder to support Linux than Windows from a developer's perspective: proprietary vs. open source drivers, approach to driver updates (rolling release vs. stable distros), 5 trillion incompatible glibc versions, X11 vs. Wayland, janky sound systems with varied support across distributions (PulseAudio, ALSA, PipeWire), no ABI compatibility guarantees, etc.

What has that got to do with Valve providing a compatibility layer so devs can broadly ignore all that nonsense and just target Proton?

> A picking texture is a very simple idea. As the name says, it’s used to handle picking in the game, when you click somewhere on the screen (e.g. to select an unit), I use this texture to know what you clicked on. Instead of colors, every object instance writes their EntityID to this texture. Then, when you click the mouse, you check what id is in the pixel under the mouse position.

Unrelated, but why? Querying a point in a basic quad tree takes microseconds; is there any benefit to overengineering a solved problem this way? What do you gain from this?


Well, it's significantly easier to implement than an octree. The game is actually 3D under the hood, projected at a careful angle to look like isometric 2D.

The reason the game is 3D is that handling partially visible things is way easier than layering isometric textures in the right order.

Also, now that I just grab a pixel back from the GPU, there's no overhead at all (to construct it or get the data for it).


I assume for the picking system you're rendering each entity/block as a different color (internally) and getting the pixel color under the mouse cursor?

"Color" is pretty much just "integer" to the GPU. It doesn't care if the 32-bit value a shader is writing to its output buffer is representing RGBA or a memory pointer.

It aligns with what appears on the screen accurately and without needing any extra work to make sure there's a representation in a tree that's pixel-accurate. It's also pretty low overhead with the way modern GPU rendering works.
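
To make that concrete, here's a minimal sketch of the pattern (my own illustration, not the author's code), assuming an OpenGL 3.3 core context and a loader such as GLEW; create_pick_fbo, read_picked_id, and u_entity_id are made-up names for the example.

    /* Minimal sketch of GPU picking with an integer texture. Each draw call
     * in the picking pass sets u_entity_id; the depth test keeps the closest
     * ID. Clearing the attachment to 0 beforehand means "nothing picked". */
    #include <GL/glew.h>

    /* Fragment shader for the picking pass: writes the entity ID, not a color. */
    static const char *pick_fs =
        "#version 330 core\n"
        "uniform uint u_entity_id;\n"
        "layout(location = 0) out uint out_id;\n"
        "void main() { out_id = u_entity_id; }\n";

    /* Create a picking framebuffer: R32UI color attachment plus a depth buffer. */
    GLuint create_pick_fbo(int w, int h, GLuint *color_tex, GLuint *depth_rb)
    {
        GLuint fbo;
        glGenFramebuffers(1, &fbo);
        glBindFramebuffer(GL_FRAMEBUFFER, fbo);

        glGenTextures(1, color_tex);
        glBindTexture(GL_TEXTURE_2D, *color_tex);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_R32UI, w, h, 0,
                     GL_RED_INTEGER, GL_UNSIGNED_INT, NULL);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                               GL_TEXTURE_2D, *color_tex, 0);

        glGenRenderbuffers(1, depth_rb);
        glBindRenderbuffer(GL_RENDERBUFFER, *depth_rb);
        glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24, w, h);
        glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                                  GL_RENDERBUFFER, *depth_rb);
        return fbo;
    }

    /* After the picking pass, read the single pixel under the mouse cursor. */
    GLuint read_picked_id(GLuint fbo, int mouse_x, int mouse_y, int fb_height)
    {
        GLuint id = 0;
        glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo);
        glReadBuffer(GL_COLOR_ATTACHMENT0);
        /* Window coordinates are y-down, OpenGL is y-up. */
        glReadPixels(mouse_x, fb_height - 1 - mouse_y, 1, 1,
                     GL_RED_INTEGER, GL_UNSIGNED_INT, &id);
        return id;
    }

Note that glReadPixels is a synchronous round trip to the GPU; engines often read back through a pixel buffer object a frame later to avoid the stall.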

What if you have a collision system where collision filters can exclude collisions based on some condition in such a way that their bounding boxes can overlap? For instance an arrow that pierces through a target to fly through it and onto another target? How do you accurately store the Entity ID information for multiple entities with a limited number of bits per pixel?

Entities that can't be picked don't write to the texture; entities that can be picked write their ID to it. Whatever is closer to the camera will be the ID that stays there (same as a color pixel, but instead of the object's color you can think of the object's ID). So you are limited to one ID per pixel, but for me that works.

Right, it's the same z-buffer problem of deciding what pixel color is visible, with a non-blending buffer update mode.

To be totally coherent, you have to draw the entity ID in the same order you would draw the visible color, in cases where entities could "tie" at the same depth.


A point in screen space is a line in world space after inverse camera projection, so this way you get the line-to-closest-geometry test in O(1), after the overhead of needing to render the lookup texture first.

> Hopefully, this post helps illustrate the unreasonable effectiveness of SQLite as well as the challenges you can run in with Amdahl's law and network databases like postgres.

No, it does not. This article first says that normally you would run an application and the database on separate servers, and then starts measuring the performance of a locally embedded database. If you have to keep the initial requirement for your software, then SQLite is completely out of the equation. If you can change the requirement, then you can achieve similar performance by tuning the local PGSQL instance -- and then it also becomes an evaluation of features and not just raw throughput. I'm not saying SQLite is not an option either, but this article seems confusing in that it compares two different problems/solutions.


Right - but SQLite handily beats the case where postgres is on the same box as well. And it's completely reasonable to test technology in the configuration in which it would actually run.

As an industry, we seem to have settled on patterns that actually are quite inefficient. There's no problem that requires the solution of doing things inefficiently just because someone said databases should run on a different host.


If you're going to run on more than one piece of hardware, something is going to be remote to your single writer database.

As an industry, we've generally decided against "one big box", for reasons that aren't necessarily performance related.


I sometimes dream of a local-first world in which all software works with a local DB and only writes to the cloud as an afterthought, maybe as a backup or a way to pick up work on another machine. It just boggles my mind that so much software nowadays relies on an always-on internet connection for no good reason other than the design itself.

I think people's reaction to cloud vendors is to go local-first. But there's a middle ground: a VPS, a rented server, even self-hosting.

My problem with local-first is that it's fine for solo apps with the occasional sync, but it doesn't work for medium-to-large datasets, and the stuff I work on is generally real-time and collaborative. To me, multiplayer is one of the strengths of the web.


> If you have to keep the initial requirement for your software, then SQLite is completely out of the equation.

No it isn't? You can run a thin SQLite-wrapping process on another server just fine. Ultimately, any DB service, PostgreSQL included, is just a request handler plus a storage handler. SQLite is just a storage handler, but you can easily put it behind a request handler too.

Putting access to sqlite behind a serial request queue used to be the standard way of implementing multi-threaded writes. That's only spitting distance away from also putting it behind TCP.
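
A minimal sketch of that serial-queue idea (my own illustration; serialized_exec and the kv table are made-up names), assuming the sqlite3 C API and pthreads. A thin networked wrapper would put the same function behind a TCP or HTTP request handler:

    /* Serialize all writes to one SQLite connection behind a mutex: the
     * classic "serial request queue" pattern for multi-threaded writers.
     * Build with: cc demo.c -lsqlite3 -lpthread */
    #include <pthread.h>
    #include <sqlite3.h>
    #include <stdio.h>

    static sqlite3 *db;                    /* single shared connection */
    static pthread_mutex_t write_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Every thread funnels its writes through here, one at a time. */
    static int serialized_exec(const char *sql)
    {
        char *err = NULL;
        pthread_mutex_lock(&write_lock);
        int rc = sqlite3_exec(db, sql, NULL, NULL, &err);
        pthread_mutex_unlock(&write_lock);
        if (rc != SQLITE_OK) {
            fprintf(stderr, "sqlite error: %s\n", err);
            sqlite3_free(err);
        }
        return rc;
    }

    int main(void)
    {
        if (sqlite3_open("app.db", &db) != SQLITE_OK)
            return 1;
        sqlite3_exec(db, "PRAGMA journal_mode=WAL;", NULL, NULL, NULL);
        serialized_exec("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT);");
        serialized_exec("INSERT OR REPLACE INTO kv VALUES ('hello', 'world');");
        sqlite3_close(db);
        return 0;
    }

In WAL mode, readers on their own connections don't block the writer, so only the writes need to go through the queue.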


You could do that, but you'd run into exactly the same bottleneck the author describes with a remote Postgres instance. The workload exposes high contention on hot rows. If transactions are kept open for several milliseconds because of a remote network call between client and DB server, throughput will be just as limited when using SQLite.

Yeah, this is a very good point. Any sort of network drive will have similar issues with SQLite. You very much want locally attached NVMe.

Exactly. People forget that “SQLite can’t do X” often really means “SQLite doesn’t ship with X built in.” If you wrap it with a lightweight request handler or a queue, you essentially recreate the same pattern every other DB uses. The fact that PostgreSQL bundles its own coordinator doesn’t make SQLite fundamentally incapable. It just means you choose whether you want that layer integrated or external.

As long as WAL mode is not enabled, connections over NFS/SMB or other file sharing protocols will work.

I'm not saying that this is a good idea, and it could fail in a spectacular manner, but it can be done. DML over this is just asking for trouble.


Well that's just dqlite/rqlite.

> Well that's just dqlite.

Far from it, as now you're not just dealing with the network but also with Raft consensus... So each write is not just a network trip, it's also 2x acknowledging. And your reads go through the leader, which can mean that if somebody accessed the app on node 1 but node 2 is the leader, well, ...

It's slower on reads and writes than just the replication that PostgreSQL does. And I don't mean only async; even sync PostgreSQL replication will be faster.

The reason dqlite exists is that Canonical needed something to synchronize their virtualization cluster (LXD), and they needed a DB with Raft consensus that is a library (not a full-blown server install like Postgres). Performance was not the focus, and its usage is totally different from what most people need here.


Dqlite and rqlite are primarily for building fault-tolerant clusters. But if you just take the network access part, then OK, sure, but also so what?

rqlite[1] creator here.

Nit: dqlite is a library, it is not a network-exposed database like rqlite is. Sure, it requires connecting to other nodes over the network, but local access is in-process. In contrast, one connects to rqlite over the network, HTTP specifically.

[1] https://rqlite.io


Massive fan of rqlite. Awesome work!

Paradoxically, raw throughput matters a lot more if you are going to scale on a single box. SQLite is 10x PG on a single box in this example. Considering databases tend to be the bottleneck, that can take you an order of magnitude further. PG on the same server will also be slower the more complex the transaction, as Unix sockets are still going to be considerably slower than a function call.

The other thing to point out in this article is that the PG network example CANNOT scale horizontally due to the power law. You can throw a super cluster at the problem and still fundamentally do around 1000 TPS.
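
Rough back-of-the-envelope for where a number like that can come from (my own arithmetic; the 1 ms figure is an assumption, not from the article): if every transaction updates the same hot row, writes to that row are serialized, and throughput is capped by how long each transaction holds the row:

    % Hot-row writes are serialized, so throughput is bounded by the lock
    % hold time, which for a remote database includes the network round trip.
    \[
      \mathrm{TPS}_{\max} \approx \frac{1}{t_{\mathrm{hold}}}, \qquad
      t_{\mathrm{hold}} \approx 1\,\mathrm{ms} \;\Rightarrow\; \mathrm{TPS}_{\max} \approx 1000.
    \]

Adding app servers or replicas doesn't shrink the hold time, which is why horizontal scaling doesn't move that ceiling; keeping the database in-process does.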


Also important is just how fast cheap hardware has gotten, which means vertical scaling is extremely effective. People could get a lot farther than they imagine with SQLite in WAL mode on a single box with an NVMe drive. It feels like our intuition has not caught up with the material reality of current hardware.

And now that there are solid streaming backup systems, the only real issue is redundancy not scaling.


> Paradoxically, raw throughput matters a lot more if you are going to scale on a single box.

There’s absolutely nothing paradoxical about any of this.


> If you have to keep the initial requirement for your software, then SQLite is completely out of the equation.

It'd be a very short article if so, don't you think? Full article would be something like: "Normally you'd have a remote connection to the database, and since we're supposed to test SQLite's performance, and SQLite is embedded, it doesn't compare. Fin"


The table of data at the end of the article has 7 rows, and only one has data for both DBs. What was the point of setting up the comparison if no comparison is made?

Because it shows that a networked RDBMS cannot get you out of this predicament.

What it says first is: "SQLite is for phones and mobile apps (and the occasional airliner)! For web servers use a proper database like Postgres!"

Though I'd say it's for a broader set of applications than that (embedded apps, desktop apps, low-concurrency server apps etc).

Phone and mobile app installations of course outnumber web app deployments, and it doesn't say what you paraphrased about servers.


Anyone else just not use any DM? The user/password prompt is sufficient for me and then my shell calls startx when initialized from the tty. It was mostly due to laziness that I did not feel like experimenting with any when I first migrated to Linux but then I just never felt the need to install one. My PC boots, asks me for user/password, then boots the X session and Cinnamon DE.

Never bothered to set one up either! I type `startx` and get dumped into dwm. (Udiskie is a nice addition for automounting drives too.)

I also type "startx". Never saw the point of a display manager (which might be my own shortcoming!).

Yup, but I still have graphics start on a tty ;) I just don't have a login because I use an ephemeral OS (custom Alpine base) with runit. It's like Windows 95 all over again: you start the computer and that's it, use it. Highly recommended; no passwords, just YubiKeys.

Switch console, ^z, and you've got a login shell. Handy for anyone who happens to find your device.

Yep, running sway as non-root.

> Never scan QR codes: There is no evidence of widespread crime originating from QR-code scanning itself.

> The true risk is social engineering scams...

Exactly. My grandma is very susceptible to phishing and social engineering; I don't want her scanning random QR codes that lead to an almost identical copy of the service she thinks she is on and ending up with identity theft or the like.

> Regularly change passwords: Frequent password changes were once common advice, but there is no evidence it reduces crime, and it often leads to weaker passwords and reuse across accounts.

Database leaks happen all the time.


Forced password changes are one of those security theater exercises that drive me absolutely nuts. It's a huge inconvenience long-term, and drives people to apply tricks (write it on a post-it note, or just keep adding dots, or +1 every time).

Plus, if your password gets stolen, there's a good chance most of the damage has already been done by the time you change the password based on a schedule, so any security benefit is only for preventing long-term access by account hijackers.


> Database leaks happen all the time

The point is to use unique passwords. If there is a leak, hopefully it is detected and then it is appropriate to change the password.


Sure, if you use unique passwords, then changing passwords isn't as useful. Yet we shouldn't judge a security policy based on the existence or not of other policies.

What you are judging then is a whole set of policies, which is a bit too controlling. You will most often not have absolute control over the users' policy set; all you can do is suggest policies, which may or may not be adopted. You can't rely on their strict adoption.

A similar case is the empirical efficacy of birth control. The effectiveness of abstinence-based methods is lower than that of condoms in practice. Theoretically, abstinence-based birth control would be better, but who cares what the rates are in theory? The actual success rates are what matter.


If databases contain your password, you have a problem that regular password changes won't fix.

Isn't that pretty much GitLab? But then most people still prefer GitHub anyway.

GitLab is too heavyweight for many projects. It’s great for corporations or big organizations like GNOME, but it’s slow and difficult to administer. It has an important place in the ecosystem, but I doubt many small projects will choose it over simpler alternatives like Codeberg.

Gitlab is worse than GitHub in every way.

At least GitHub adds new features over time.

Gitlab has been removing features in favor of more expensive plans even after explicitly saying they wouldn’t do so.


Gitlab works fine for me. Been using it at work for a few years and recently moved all my personal repos there

> At least GitHub adds new features over time.

Not as quickly as they add anti-features, imho.


Personally, I prefer the CI/CD setup on GitLab over GitHub Actions.

Horses for courses I guess ¯\_(ツ)_/¯


GitLab is part of the reason I'm thinking along these lines: it has been around for a while as a known, reasonably popular alternative to GitHub. So I expected the announcement to be "We moved to GitLab", yet what I observe is "We moved to CodeHouse" or "We moved to Source-Base". The self-hosting here, with mirrors to two hosts I'm not familiar with, is another direction.

I think people are wary of moving to GitLab because it's a similarly large platform and they don't want to repeat their mistakes.

gitlab has also gone full slop

> unscalable

Instagram, which is significantly bigger than Reddit, disagrees.

