glenjamin's comments

This pitch probably sounds fine to people using simple log aggregation tools, or metrics tools that have to be wary of tag cardinality.

But how does it compare to an actual modern observability stack built on a columnar datastore like Honeycomb?


Any advice on how to learn modern Swift?

When I tried to learn some to put together a little app, every search result for my questions was a quick blog post seemingly aimed at iOS devs who didn’t want to learn and just wanted to copy-paste the answer - usually in the form of an extension method


A failure mode of ULIDs and similar is that they're too random to be easily compared or recognized by eye.

Recognizability is especially useful when you're using them for customer or user IDs - being able to easily spot your important or troublesome customers in logs is very helpful

Personally I'd go with a ULID-like scheme similar to the one in the OP - but I'd aim to use the smallest number of bits I could get away with, and pick a compact encoding scheme
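
A rough Python sketch of the sort of thing I mean - the timestamp resolution, number of random bits and alphabet are all arbitrary choices you'd tune to your own collision/sortability needs:

    import os
    import time

    # Crockford-style base32: compact, case-insensitive, no ambiguous characters
    ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"

    def compact_id(random_bits: int = 48) -> str:
        # A second-resolution timestamp prefix keeps IDs roughly sortable;
        # the random suffix is only as wide as your collision tolerance requires.
        value = (int(time.time()) << random_bits) | int.from_bytes(os.urandom(random_bits // 8), "big")
        out = ""
        while value:
            value, rem = divmod(value, 32)
            out = ALPHABET[rem] + out
        return out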


I’m amazed that this comment is so low down

Stacked diffs seems like a solution to managing high WIP - but the best solution to high WIP is always to lower WIP

Absolutely everything gets easier when you lower your work in progress.


This seems idealistic. It's very normal to be working on a feature that depends on a not-yet-merged feature.


> It's very normal to be working on a feature that depends on a not-yet-merged feature.

Oh sure, many bad ideas and poor practices such as that one are quite "normal". It's not a recommendation.


I invite you to look into feature flagging.

It is entirely viable to never have more than 1 or 2 open pull requests on any particular code repository, and to use continuous delivery practices to keep deploying small changes to production 1 at a time.

That's exactly how I've worked for the past decade or so.
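
As a sketch of what that looks like in practice - the flag name and helper here are made up, and a real lookup would hit whatever flagging service you use:

    import os

    def flag_enabled(name: str) -> bool:
        # Stand-in for a real flag service (LaunchDarkly, Unleash, a config table...);
        # an environment variable is enough to show the shape of it.
        return os.getenv(f"FLAG_{name.upper()}", "off") == "on"

    def checkout(order):
        if flag_enabled("new_checkout"):
            return new_checkout(order)  # merged and deployed, but dark until the flag flips
        return legacy_checkout(order)

    def new_checkout(order):
        ...  # the in-progress feature: safe to merge because nothing reaches it yet

    def legacy_checkout(order):
        ...  # current behaviour, untouched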


Does pglite in memory outperform “normal” postgres?

If so then supporting the network protocol so it could be run in CI for non-JS languages could be really cool


Look into the libeatmydata LD_PRELOAD library. It disables fsync and other durability syscalls, which is fabulous for CI. Materialize.com uses it for their CI - that's where I learned about it.


For CI you can already use PostgreSQL with the "eat-my-data" library. I don't know if there's a more official image, but at my company we're using https://github.com/allan-simon/postgres-eatmydata


You can just set fsync=off if you don't want to flush to disk and are ok with corruption in case of an OS/hardware-level crash.
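
If you're starting Postgres from Python in CI, a rough sketch using the testcontainers package (the image tag is just illustrative; fsync, synchronous_commit and full_page_writes are the relevant Postgres knobs):

    from testcontainers.postgres import PostgresContainer

    throwaway_pg = PostgresContainer("postgres:16").with_command(
        "postgres -c fsync=off -c synchronous_commit=off -c full_page_writes=off"
    )

    with throwaway_pg as pg:
        url = pg.get_connection_url()  # hand this to the test suite
        print(url)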


Huh, I always just mounted the data directory as tmpfs/ramdisk. Worked nicely too


There are a couple of passing mentions of Download Monitor, but also the timeline strongly implies that a specific source was simply guessing the URL of the PDF long before it was uploaded

I'm not clear from the doc which of these scenarios is what they're calling the "leak"


> but also the timeline strongly implies that a specific source was simply guessing the URL of the PDF long before it was uploaded

A bunch of people were scraping commonly used URLs based on previous OBR reports, in order to report as soon as it was live, as is common with all things of this kind

The mistake was that the URL should have been obfuscated, and only changed to the "clear" URL at publish time, but a plugin was bypassing that and aliasing the "clear" URL to the obfuscated one


> in order to report as soon as it was live

We don't actually know that, it's just that the report did hit Reuters pretty swiftly.


https://obr.uk/docs/dlm_uploads/OBR_Economic_and_fiscal_outl... 5.pdf

Not hard to guess really. Wouldn't they know this was likely and simply choose a less obvious file name?


Turns out, no. No, they would not.



It sounds like a combination of the Download Monitor plugin plus a misconfiguration at the web server level resulted in the file being publicly accessible at that URL when the developers thought it would remain private until deliberately published.


Other than MotherDuck, is anyone aware of any good models for running multi-user, cloud-based DuckDB?

i.e. running it like a normal database, and getting to take advantage of all of its goodies


For pure duckdb, you can put an Arrow Flight server in front of duckdb[0] or use the httpserver extension[1].

Where you store the .duckdb file will make a big difference in performance (e.g. S3 vs. Elastic File System).

But I'd take a good look at ducklake as a better multiplayer option. If you store `.parquet` files in blob storage, it will be slower than `.duckdb` on EFS, but if you have largish data, EFS gets expensive.

We[2] use DuckLake in our product and we've found a few ways to mitigate the performance hit. For example, we write all data into ducklake in blob storage, then create analytics tables and store them on faster storage (e.g. GCP Filestore). You can have multiple storage methods in the same DuckLake catalog, so this works nicely.

0 - https://www.definite.app/blog/duck-takes-flight

1 - https://github.com/Query-farm/httpserver

2 - https://www.definite.app/
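
For a flavour of what the ducklake setup looks like from Python - all names and paths here are hypothetical, for blob storage you'd point DATA_PATH at s3://... with the httpfs extension and credentials loaded, and the exact ATTACH syntax is worth checking against the current ducklake docs:

    import duckdb

    con = duckdb.connect()
    con.sql("INSTALL ducklake")
    con.sql("LOAD ducklake")

    # Catalog metadata in a local DuckDB file, table data written as parquet under DATA_PATH
    con.sql("ATTACH 'ducklake:my_catalog.ducklake' AS lake (DATA_PATH 'lake_data/')")

    con.sql("CREATE TABLE lake.events AS SELECT 1 AS id, 'signup' AS kind")
    con.sql("SELECT * FROM lake.events").show()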


I wonder if anyone has experimented with "Mountpoint for S3" + DuckDB yet

https://docs.aws.amazon.com/AmazonS3/latest/userguide/mountp...
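
In principle the mount should just look like local files to DuckDB - something like this, where the mount path and object layout are hypothetical (with the bucket mounted beforehand via mount-s3 my-bucket /mnt/my-bucket):

    import duckdb

    # DuckDB just sees ordinary files under the Mountpoint path
    con = duckdb.connect()
    con.sql("""
        SELECT count(*) AS row_count
        FROM read_parquet('/mnt/my-bucket/events/*.parquet')
    """).show()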


The duckdb httpfs extension reads from S3-compatible stores directly.


That looks neat - but how do you handle failover/restarts?


In which one? Restarts are no problem on ducklake (ACID transactions in the catalog).

For the others, I haven't tried handling it.


GizmoSQL is definitely a good option. I work at GizmoData and maintain GizmoSQL. It is an Arrow Flight SQL server with DuckDB as a back-end SQL execution engine. It can support independent thread-safe concurrent sessions, has robust security, logging, token-based authentication, and more.

It also has a growing list of adapters - including: ODBC, JDBC, ADBC, dbt, SQLAlchemy, Metabase, Apache Superset and more.

We also just introduced a PySpark drop-in adapter - letting you run your Python Spark Dataframe workloads with GizmoSQL - for dramatic savings compared to Databricks for sub-5TB workloads.

Check it out at: https://gizmodata.com/gizmosql

Repo: https://github.com/gizmodata/gizmosql
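
For anyone wanting to kick the tyres from Python, a connection sketch via the adbc-driver-flightsql package looks roughly like this - host, port and credentials are placeholders, so check the README for the exact connection options:

    import adbc_driver_flightsql.dbapi as flight_sql

    conn = flight_sql.connect(
        uri="grpc+tls://localhost:31337",
        db_kwargs={"username": "gizmosql_username", "password": "gizmosql_password"},
    )
    cur = conn.cursor()
    cur.execute("SELECT 42 AS answer")
    print(cur.fetchall())
    cur.close()
    conn.close()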


Oh, and GizmoData Cloud (SaaS option) is coming soon - to make it easier than ever to provision GizmoSQL instances...


Feels like I keep seeing "Duckdb in your postgres" posts here. Likely that is what you want.



This reminded me of a slide from a Dan North talk - perhaps this one https://dannorth.net/talks/#software-faster? One of those anyway.

The key quote was something like "You want your software to be like surgery - as little of it as possible to fix your problem".

Anyway, it doesn't seem like this blog post is following that vibe.


I like this quote.

Unfortunately, my predecessor at work followed a different principle - "copy paste a whole file if it saves you 5 minutes today".

Well, I am still a surgeon, I just do a lot of amputations.


This doesn't seem accurate to me - gambling sites legally operating in the UK already have strict KYC requirements applied to them via the gambling regulator.

Visiting a gambling site isn't restricted, but signing up and gambling is.


You <-------> The point

If age restriction technology is now being introduced to prevent kids *viewing* "inappropriate" websites, then why are gambling websites being given a free pass?

The answer is to follow the money:

https://www.google.co.uk/search?q=gambling%20industry%20lobb...


They’ve already found a loophole for that: If you gamble with fake money (acquired through real money and a confusing set of currency conversions) and the prizes are jpegs of boat-girls (or horse-girls, as I hear are popular lately) or football players, you can sell to all the children you want.


The only mention I can see in this document of compression is

> Significantly smaller than JSON without complex compression

Although compressing JSON could be considered complex, in practice it's very simple: it's usually done as a distinct step, often transparently to the user, and gzip (and increasingly zstd) is widely supported.

I'd be interested to see a comparison between compressed JSON and CBOR - I'm quite surprised that this hasn't been included.
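
It's a quick experiment to run yourself - for example with the third-party cbor2 package (the sample records are made up, and real payloads will shift the numbers):

    import gzip
    import json

    import cbor2  # pip install cbor2

    # Made-up records, repeated so the compressor has something to chew on
    records = [{"id": i, "name": "user-%d" % i, "active": i % 2 == 0, "score": i * 0.5}
               for i in range(1000)]

    raw_json = json.dumps(records).encode()
    raw_cbor = cbor2.dumps(records)

    print("json:        ", len(raw_json))
    print("json + gzip: ", len(gzip.compress(raw_json)))
    print("cbor:        ", len(raw_cbor))
    print("cbor + gzip: ", len(gzip.compress(raw_cbor)))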


> I'm quite surprised that this hasn't been included.

Why? That goes against the narrative of promoting one over the other. Nissan doesn't advertise that a Toyota has something they don't. They just pretend it doesn't exist.

