Hacker News | leobuskin's comments

Yes, it's mature, but you (and your potential audience) basically need to learn a new language, with a lot of quirks and "weird" (I'd even say counter-intuitive) nuances, and it's also significantly less readable than strict, typed Python. Even its modern syntax doesn't click immediately (also, performance-wise, the new syntax is somehow a bit slower in my tests).


I am by no means a Cython fanboy but I think you are exaggerating the syntactic differences and readability differences.

Apart from type annotations, the differences are very minor and well worth the speed and type-error benefits. Given that we are discussing it in the context of SPy: SPy is not fully compatible with Python either, which is quite understandable and, in my opinion, a good trade-off.

The benchmarking features are great, interactivity with C libraries is great.

One annoyance I have with Cython though is debuggability. But it's an annoyance, not a show-stopper.

I have used it in production without problems.


It does seem very much like the "worst of both worlds", in that these nuances make it harder to write than a language like Go while still being drastically less performant.


It does mean. The switch from writing “applicable” software to creating cutting-edge AI is almost impossible. The parent comment gives great examples; we can add JetBrains to that list (amazing IDEs, zero ability to catch up with ML). It’s a very different, fast-paced, science-driven domain.


What about a specialized dict for FASTA? Shouldn't it increase ZSTD compression significantly?


Yes I'd expect a dict-based approach to do better here. That's probably how it should be done. But --long is compelling for me because using it requires almost no effort, it's still very fast, and yet it can dramatically improve compression ratio.


From what I've read (although I haven't tested it and I can't find my source), dictionaries aren't very useful when the dataset is big, and just by using '--long' you can cover that improvement.

Have any of you tested it?


I don’t think the size of the content matters; it’s all about the patterns (and their repetitiveness) within it, and FASTA is a great target, if I understand the format correctly.


Size of content does matter, because the start of your content effectively builds up a dictionary. Once you have built a custom dictionary from the content you are compressing, the initial dictionary is no longer relevant. So a dictionary effectively only helps at the start of the content. Exactly what "start" means will depend on your data and the algorithm, but as the size of the data you compress grows, the relative benefit of the dictionary drops.

Or another way to look at this is that the total bytes saved of the dictionary will plateau. So your dictionary may save 50% of the first MB, 10% of the next MB and 5% of the rest of the first 10MB. It matters a lot if you are compressing 2MB of data (7% savings!) but not so much if you are compressing 1GB (<1%).
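A rough way to see this plateau for yourself is sketched below. Note the assumptions: it uses zlib's preset dictionary (`zdict`) from the Python standard library as a stand-in for a zstd dictionary, and the FASTA-like records and the shared header are made up for illustration. The absolute numbers will differ for zstd and for real data; only the trend is the point.

```python
import random
import zlib

# Hypothetical header text shared by every record; it doubles as the preset dictionary.
SHARED_HEADER = b"|organism=Homo sapiens|assembly=GRCh38|source=example-lab|type=genomic DNA\n"


def make_records(count, seed=0):
    """Build `count` FASTA-like records with random 60-base sequences."""
    rng = random.Random(seed)
    records = []
    for i in range(count):
        seq = "".join(rng.choice("ACGT") for _ in range(60)).encode()
        records.append(b">seq%d" % i + SHARED_HEADER + seq + b"\n")
    return b"".join(records)


def compressed_size(data, zdict=None):
    """Compressed size in bytes, optionally primed with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return len(comp.compress(data) + comp.flush())


def relative_saving(data, zdict):
    """Fraction of the compressed output that the dictionary saves."""
    plain = compressed_size(data)
    with_dict = compressed_size(data, zdict)
    return (plain - with_dict) / plain


small = make_records(2)      # a few hundred bytes
large = make_records(2000)   # a few hundred kilobytes

small_saving = relative_saving(small, SHARED_HEADER)
large_saving = relative_saving(large, SHARED_HEADER)

print(f"small input: {small_saving:.1%} saved by the dictionary")
print(f"large input: {large_saving:.1%} saved by the dictionary")
```

On the small input the dictionary replaces the first header with a back-reference, which is a large share of the output. On the large input the compressor has already seen the header after the first record, so the dictionary's contribution is nearly invisible.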


I tried building a zstd dictionary for something (compressing many instances of the same Mono (.NET) binary-serialized class, mostly identical), and in this case it provided no real advantage. Honestly, I didn't dig into it too much, but I will give --long a try shortly.

PS: what the author implicitly suggests cannot be replaced with zstd tweaks. It'll be interesting to look at the file in ImHex, especially if I can find an existing Pattern File.


I’m also surprised to see nginx and Hetzner in this project. Why not go entirely Cloudflare: Workers, R2, and cache?


You can get a cheap dedicated server on Hetzner with unlimited bandwidth. Would the cost be similar with CF?


I have a small/insane project of mine: I wrote a compiler from Python (strict and static subset only) to WebAssembly (bc-to-bc approach, 1:1 CPython compat due to walking internals), then I run wasm2c to sandbox it, add pledge, and compile with Cosmopolitan into a miniature standalone thing (fast as hell). Just because you have zero dependencies and it's pure Python and properly typed, lemme try it next weekend as a PoC. No promises, but this message clicked in my heart.


<subscribe>


It seems like a pretty simple rule in 2025: if your AI-related devtool project is not open source, doesn't allow self-hosting, and is not tier-1 (your own models, or a similar level of "secret sauce"), it will be replicated within a week or so. And I like this new realm.


We are thinking of open sourcing it. The current codebase requires Cloudflare Workers, so it will take some changes to make it more generic. Thank you for the feedback!


They are hijacking the entire Python ecosystem in a very smart way, that's all. At some point we will probably find ourselves vendor-locked, just because the initial offer was so appealing. Take a closer look at it: package manager, formatter/linter, types, LSP. What's left before it pokes at CPython one way or another? Maybe a cloud-based IDE, or some interesting WASM relationship (but RustPython is not there yet, they just don't have enough money). Otherwise, Astral is on a pretty straightforward path to `touchdown` in a few years. It's both a blessing and a curse.

Let's be honest, all attempts to build a CPython alternative have failed (niche boosters like PyPy are a separate story, but PyPy is not up to date and not entirely exact). For some reason, people think that 1:1 compatibility is not critical and too costly to pursue (hello, all LLVM-based compilers). I think it's doable and there's a solid way to solve it. What if Astral thinks so too?


Honestly... I don't really care. If in 5 years they turn around and try to charge for uv we'll still be in a much better place than if we'd all stuck with the catastrofuck that is pip.


It just uses osmtools/osmupdate [0] to update the official weekly release.

[0] https://github.com/openplanetdata/osm/blob/924d680ff8df6263f...


Apologies, this wasn't clear to me from the documentation: the router's IPv6-like address is a fingerprint of the public key, but does it also encode the geo-prefix and distance? (I mean, it's hypothetically doable; I'm curious what the approach is if it does.) Or does the router's address have no metadata encoded, and only end-user addresses are encoded this way?


Mycoria brute forces a public key/IP pair until it matches the desired geo-prefix.
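For readers unfamiliar with this kind of search, here is a minimal sketch of the general idea: keep generating candidates until the derived address lands in the wanted prefix. Everything concrete in it is an assumption for illustration and not Mycoria's actual scheme: random 32-byte strings stand in for real public keys, the address is taken as the first 16 bytes of a SHA-256 digest, and `fd10::/12` is a made-up target prefix.

```python
import hashlib
import ipaddress
import random


def address_from_key(public_key: bytes) -> ipaddress.IPv6Address:
    """Hypothetical derivation: first 16 bytes of SHA-256(public key)."""
    digest = hashlib.sha256(public_key).digest()
    return ipaddress.IPv6Address(digest[:16])


def find_key_for_prefix(prefix: ipaddress.IPv6Network, rng: random.Random):
    """Try candidate keys until the derived address falls inside `prefix`."""
    attempts = 0
    while True:
        attempts += 1
        # Stand-in for generating a real key pair (assumption, not a real key).
        candidate = rng.getrandbits(256).to_bytes(32, "big")
        address = address_from_key(candidate)
        if address in prefix:
            return candidate, address, attempts


# Made-up 12-bit target prefix; on average this takes about 2**12 attempts.
target = ipaddress.IPv6Network("fd10::/12")
key, address, attempts = find_key_for_prefix(target, random.Random(0))
print(f"found {address} after {attempts} attempts")
```

The cost doubles with every extra prefix bit, which is why a search like this is only practical for fairly short prefixes.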


I am a little confused about how the geo-encoded addresses and private addresses work. It seems like the network will be overwhelmed with keeping track of switch labels?


Switch labels are effectively interface IDs on the servers, so there is no data to be stored.

Geo encoding simply improves routing to unknown routers, kind of as a baseline structure to the whole network.


I don't like the "vibe" term nowadays, but when you mix two pretty abstract domains (AI and development), it's all about vibes and aura. Some model/agent works perfectly for one of us (let's keep in mind, we have a bunch of factors, from language to the complexity of the implementation), and does everything wrong for others.

You just can't measure it properly, outside of experiments and building your own assessment within your context. All the recommendations here just don't work. "Try all of them, stick with one for a while, don't forget to retry others on a regular basis" - that's my motto today.

Cursor (as an agent/orchestrator) didn't work for me at all (Python, low-level, no frameworks, not webdev). I fell in love with Windsurf ($60 tier initially). I switched entirely to JetBrains AI a few days ago (VS Code is not friendly for me, PyCharm rocks), so I'm happy about the price drop.

