matharmin's comments | Hacker News

In my opinion neither hooks nor CI should ever make changes to code automatically. When I commit changes, I want to see exactly what I commit, and not have some system change it at the last minute.

Instead, have tooling to do that before committing (vscode format-on-save, or manually run a task), then have a pre-commit hook just do a sanity-check on that. It only needs to check modified files, so usually very fast.

Then, have an additional check on CI to verify formatting on all files. That should rarely be triggered, but helps to catch cases where the hooks were not run, for example from external contributors. That also makes it completely fine for this CI step to take a couple of minutes - you don't need that feedback immediately.


This doesn't work if you have branch protection rules blocking pushes to main, which in my opinion should be standard on any repo


Oh I forgot about that. I mostly work on private repos (free account), so that feature isn't available to me.


Nope. Pushing to main is only disallowed by default if you change history. A normal squash or rebase merge is allowed and should be allowed.

UI-only workflows are for dummies; they won't fly for larger projects


Define large projects. While you can use the CLI, Google has UI-only workflows. I'm fairly certain Google probably has the largest project out there.


Didn't know that Google is for dummies. Matches my interview experience there


I view jQuery as similar to C in some ways: It's been around forever, it's mature, and it works. It gives a good experience to get something up and running quickly: it's lightweight and simple.

But if you're working on bigger projects: it is possible, but you have to be very principled in how you use it, otherwise you're going to end up with a massive spaghetti codebase and lots of edge cases in your app that break.

Alternatives like React and Rust may add more complexity upfront, but the improved structure and safety they give have big benefits in all but the smallest projects.


Not so sure about that. You can easily write horrible code in React: Too complex, inefficient, and/or resource-intensive. If you don’t know the tools and don’t have good theoretical programming knowledge, all code will be spaghetti code in the long run.


I'm no fan of React, but these aren’t equivalent. If you follow the rules, React (or any of its alternatives) will manage stateful changes like adding and removing components and event listeners. jQuery is more similar to doing manual memory management in C. It’s extremely easy to get it wrong and introduce leaks and other non-local errors.


React is also extremely easy to get wrong.


Can you give an example? I mean I know you can shoot yourself in the foot with any UI framework, but jQuery has no way of managing state, everything just leaks by default. Unless they’ve added something.


I don't disagree, but this is not relevant for the vanilla JS vs. jQuery discussion, since vanilla JS has exactly the same problems you mention as jQuery.


I think I'm missing something here - what is specific about Docker in the exploit? Nowhere is it mentioned what the actual exploit was, and whether for example a non-containerized postgres would have avoided it.

Should the recommendation rather be "don't expose anything from your home network publicly unless it's properly secured"?


From TFA:

> This was somewhat releiving, as the latest change I made was spinning up a postgres_alpine container in Docker right before the holidays. Spinning it up was done in a hurry, as I wanted to have it available remotely for a personal project while I was away from home. This also meant that it was exposed to the internet, with open ports in the router firewall and everything. Considering the process had been running for 8 days, this means that the infection occured just a day after creating the database. None of the database guides I followed had warned me about the dangers of exposing a docker containerized database to the internet. Ofcourse I password protected it, but seeing as it was meant to be temporary, I didn't dive into securing it properly.

Seems like they opened up a postgres container to the Internet (IIRC docker does this whether you want to or not, it punches holes in iptables without asking you). Possibly misconfigured authentication or left a default postgres password?


Docker would punch through the host firewall by default, but the database wouldn’t be accessible to the internet unless the user opened the ports on their router firewall as well, which based on the article, it sounds like they did. Making the assumption they’re using a router firewall…

In this case, seems like Docker provided a bit of security in keeping the malware sandboxed in the container, as opposed to infecting the host (which would have been the case had the user just run the DB on bare metal and opened the same ports)


That's a bit of a stretch here... Had the attackers' target been to escape from the docker container, they would have done it. They may even have done it; we can't know, as OP does not seem to have investigated thoroughly beyond seeing some errors and then stopping the container...

Also, had it been a part of the host distro, postgres may have had SELinux or AppArmor restrictions applied that could have prevented further damage beyond a dump of the DB...


> Seems like they opened up a postgres container to the Internet

Yes, but so what? Getting access to a postgres instance shouldn't allow arbitrary execution on the host.

> IIRC docker does this whether you want to or not, it punches holes in iptables without asking you

Which is only relevant if you run your computer directly connected to the internet. That's a dumb thing to do regardless. The author probably also opened their firewall or forwarded a port to the host, which Docker cannot do.


Also from TFA:

> it was exposed to the internet, with open ports in the router firewall

Upvoted because you're right that the comments in this thread have nothing to do with what happened here.

The story would have been no different if OP had created an Alpine Linux container and exposed SSH to the internet with SSH password authentication enabled and a weak password.

It's nothing to do with Docker's firewalling.


>The story would have been no different if OP had created an Alpine Linux container and exposed SSH to the internet with SSH password authentication enabled and a weak password.

What? The story would have been VERY different, obviously that's asking for trouble. Opening a port to your database running in a docker container is not a remote execution vulnerability, or if it is, the article is failing to explain how.


I feel like you and grandparent are the only people who read the article, because I'm wondering the same thing.

The article never properly explains how the attack happened. Having a port exposed to the internet on any container is a remote execution vulnerability? What? How? Nobody would be using docker in that case.

The article links to a blog post as a source on the vulnerability, but that post is a general "how to secure" article; there is nothing about remote code execution.


Are you sure about that? Last I checked pg admins had command execution on the DB host, as well as FS r/w and traversal.

See https://www.postgresql.org/docs/current/sql-copy.html#id-1.9...

Specifically the `filename` and `PROGRAM` parameters.

And that is documented, expected out-of-the-box behaviour, without even looking for an exploit...
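
For example, something along these lines (table name is just for illustration; PROGRAM requires superuser or membership in pg_execute_server_program):

    -- illustrative only: runs `id` on the database server and captures its stdout
    CREATE TABLE cmd_output (line text);
    COPY cmd_output FROM PROGRAM 'id';
    SELECT * FROM cmd_output;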


It's funny that you said TFA a few comments earlier, because you seem to have not read the article either, or are making some great leaps here.

If the break-in happened as you explain, the article would also mention that:

* the attacker gained access to the postgres user or an equally privileged user

* they used specific SQL commands to execute code

* it would not have claimed the vulnerability was about docker containers and exposed ports

And the takeaway would not be "be careful with exposing your home server to the internet", but would be "anyone with admin privileges to postgres is able to execute arbitrary code".


The article would only say that if OP was competent enough to determine exactly what went wrong. I did read the article; however, I do not agree with its conclusions, as simply opening a postgres port to the Internet while having set up authentication correctly is not fatal (though admittedly inadvisable).


Docker doesn’t expose ports by default. It only bypasses your firewall if you choose to explicitly publish a port.

OP explicitly forwarded a port in Docker to their home network.

OP explicitly forwarded their port on their router to the Internet.

OP may have run Postgres as root.

OP may have used a default password.

OP got hacked.

Imagine having done these same steps on a bare metal server.


I do imagine:

1. postgres would have a sane default pg_hba disallowing remote superuser access.

2. postgres would not be running as root.

3. postgres would not have a default superuser password, as it uses peer authentication by default.

4. If run on a Red Hat-derived distro, postgres would be subject to SELinux restrictions.

And yes, all of these can be circumvented by an incompetent admin.


This is one that can sneak up on you even when you're not intentionally exposing a port to the internet. Docker manages iptables directly by default (you can disable it, but the networking between compose services will be messed up). Another common case this can bite you is when using an iptables front-end like ufw and thinking you're only exposing the application; then, unless you bind to localhost, Postgres in this case will be exposed. My recommendation is to review iptables -L directly and, where possible, use firewalls closer to the perimeter (e.g. the one from your VPS provider) instead of relying solely on iptables on the same node.


All this talk of iptables etc. is really confusing. People don't use iptables rules on servers, do they? Ubuntu Server has the option to enable ufw, but it's disabled by default because it would be a really annoying default for a server, which is by definition supposed to have services. I couldn't imagine trying to wrangle firewall rules across every box on the network vs using network segregation and firewall appliances at the edges. Is there some confusion here between running docker on your dev box vs running it on a server to intentionally run network services?


Yes, they do. At least back when I was at ZEIT, docker definitely used iptables directly. I know this because I was patching them as part of our infra that managed Docker at the time.


With your own native client: Yes, you can send arbitrary headers in the Upgrade request.

In a browser, however, you can't. It typically sets very few headers itself, and doesn't allow you to add custom headers.


The auth headers (Authorization, Cookie) are all passed along, and that's what I need to establish a secure connection from the browser.

For more customized needs there's always the "ticket"-based flow[0][1], which shouldn't be hard to implement. I might be a bit naive, but what needed metadata and custom headers are we talking about?

[0]: https://devcenter.heroku.com/articles/websocket-security#aut...

[1]: https://lucumr.pocoo.org/2012/9/24/websockets-101/#authoriza...


It feels a little misleading to say you're using Google Sheets as the backend database, when you need an actual database in conjunction with it.


Extremely misleading; in this case Google Sheets is working more like a frontend


In my experience, the "outbox" approach means a lot more manual work for developers. It requires developers to create a message for every type of change they want to sync to the client, and then also interpret that on the client. ElectricSQL's Shapes does a lot more work to keep the shape in sync between the client and the server, reducing the need for the developer to do that work.
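
Roughly, it tends to look something like this (table and column names are made up for illustration, not any particular library's schema):

    -- every write path also has to remember to emit a message for the client
    CREATE TABLE outbox (
      id         bigserial PRIMARY KEY,
      client_id  text NOT NULL,
      event_type text NOT NULL,        -- e.g. 'todo_created'
      payload    jsonb NOT NULL,
      created_at timestamptz NOT NULL DEFAULT now()
    );

    INSERT INTO outbox (client_id, event_type, payload)
    VALUES ('client-123', 'todo_created', '{"id": 1, "title": "Buy milk"}');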

You're right that "single-table sync" does have its limitations. At PowerSync we effectively support one level of "joins", and even then it's often not enough for more complex schemas. An older version of ElectricSQL did also actually have multi-table shape sync support, but I believe doing that at scale proved to be difficult.

One solution to this is often denormalizing data - either adding more denormalized columns in the existing table, or creating new tables dedicated to sync data. Conceptually, keeping these tables up to date is not that different from writing updates to an outbox table.

I'm also interested in seeing what Zero comes up with in the space. They seem to have solved doing multi-table query sync, but it remains to be seen how well that works in practice.


They did not say it was dementia, and they did not offer care suggestions - they merely shared their own related experience.


Unlike other opinions here I do think it is technically feasible to stream a copy of the WAL - it just has to be implemented in the VFS. "Shared memory" could be a SharedArrayBuffer, or just a normal buffer if you only have one database connection open at a time (for example in a SharedWorker, which is already common). It may not be simple to implement, but definitely possible.

The biggest actual issue is that it will capture block-level changes, not row-level changes. This means it can work to replicate a complete database, but partial sync (e.g. sharing some rows with other users) won't be feasible.

To get row-level changes for partial sync, you need to use something like triggers or the SQLite session extension [1]. For PowerSync we just use triggers on the client side. I find that works really well, and I haven't found any real downsides to that except perhaps for the work of maintaining the triggers.
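
As a rough illustration of the idea (made-up table names, not PowerSync's actual schema), the triggers just copy row-level changes into a local change-log table:

    CREATE TABLE IF NOT EXISTS crud_log (
      id     INTEGER PRIMARY KEY AUTOINCREMENT,
      op     TEXT NOT NULL,    -- 'INSERT', 'UPDATE' or 'DELETE'
      tbl    TEXT NOT NULL,
      row_id TEXT NOT NULL,
      data   TEXT              -- JSON snapshot of the changed row
    );

    CREATE TRIGGER IF NOT EXISTS todos_update AFTER UPDATE ON todos
    BEGIN
      INSERT INTO crud_log (op, tbl, row_id, data)
      VALUES ('UPDATE', 'todos', NEW.id, json_object('title', NEW.title, 'done', NEW.done));
    END;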

[1]: https://sqlite.org/sessionintro.html


Just wondering, do you have a specific use case for read transactions implemented on the database level here?

In SQLite in general, read transactions are useful since you can access the same database from multiple processes at a time. Here, only a single process can access the database. So you can get the same effect as read transactions either by doing all reads in one synchronous function, or by implementing your own process-level locking.


E.g. if you have many websocket connections and they each have a snapshot at a point in time (that spans over many different await function calls/ws messages).

SQLite can have many readers and a single writer with WAL, so many read transactions can exist whilst the writer moves the db state forward.


We (Cloudflare) have considered adding an API to create multiple "database connections", especially to be able to stream a response from a long-running cursor while representing a consistent snapshot of the data.

It's a bit tricky since if you hold open that old connection, the WAL could grow without bound and cannot be checkpointed back into the main database. What do we do when the WAL gets unreasonably large (e.g. bigger than the database)? Cancel old cursors so we can finally checkpoint? Will that be annoying for app developers to deal with, e.g. causing errors when traffic is high?

SQLite itself calls an open database a "connection" even though there's no actual network involved.


I did guess it might be harder to do than with vanilla SQLite, as vanilla SQLite just has the WAL and main db on the same hard drive, so it has more space to grow the WAL and it is not an issue when the machine/instance reboots (as it just starts where it left off, even if the WAL is large and has not been checkpointed back to the main db).

To be honest this is an edge case. But I often start a read transaction on a SQLite connection just so I know multiple queries are reading from the same state (and to ensure state has not been changed between queries).
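
i.e. something like this (table names just for illustration):

    BEGIN;                           -- deferred, i.e. a read transaction
    SELECT count(*) FROM orders;     -- the snapshot is taken at this first read
    SELECT sum(total) FROM orders;   -- guaranteed consistent with the count above
    COMMIT;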


Ugh didn't notice until too late to edit, but apparently HN interpreted my asterisk as an instruction to italicize everything between it and the footnote it referred to.

