Hacker News | minusf's comments

It is a useful skill. But regardless of the theme at hand there is also

"You either die a hero or you live long enough to see yourself become the villain."

People change all the time, and things need to be reevaluated from time to time.

So another skill is to disengage from our heroes when the values start to misalign.


while it's mentioned in the post, it seems to me a bit buried:

isn't this more like a port of `html5ever` from rust to python using an LLM, as opposed to creating something "new" based on the test suite alone?

if yes, wouldn't the distinction be rather important?


Depending on your perspective, you can take away either of the two points.

The first iteration of the project created a library from scratch, from the tests all the way to 100% test coverage. So even without the second iteration, it's still possible to create something new.

In an attempt to speed it up, I (with a coding agent) rewrote it again based on html5ever's code structure. It's far from a clean port, because html5ever is heavily optimized Rust code, parts of which can't be ported to Python (Rust macros). And it still depended on a lot of iteration and rerunning tests to get it anywhere.

I'm not pushing any agenda here, you're free to take what you want from it!


I just had Codex CLI figure out where that first version ended and the new one began.

It looks to me like this is the last commit before the rewrite: https://github.com/EmilStenstrom/justhtml/tree/989b70818874d...

The commit after that is https://github.com/EmilStenstrom/justhtml/commit/7bab3d2 "radical: replace legacy TurboHTML tree/handler stack with new tokenizer + treebuilder scaffold"

It also adds this document called html5ever_port_plan.md: https://github.com/EmilStenstrom/justhtml/blob/7bab3d22c0da0...

Here's the Codex CLI transcript I used to figure this out: https://gistpreview.github.io/?53202706d137c82dce87d729263df...


Thank you for the clarification, that was not entirely clear to me from the post.

You also mention that the current "optimised" version is "good enough" for everyday use (I use `bs4` for working with html). Was the first iteration also usable in that way? Did you look at `html5ever` because the LLM hit a wall trying to speed it up?


It was usable! Yeah, the handler-based architecture that I had built on was very dependent on object lookups and method calls, and my hunch was that I had hit a wall trying to optimize the speed. I was still slower than html5lib, so I decided to go with another "code architecture" (html5ever) that was closer to the metal. That worked out, getting me ~60% faster than html5lib.

As for bs4, if you don't change the default, you get the stdlib html.parser, which doesn't implement html5. Only works for valid HTML.
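To make the point about the stdlib parser concrete, here is a minimal sketch using only `html.parser` from the standard library (no bs4 involved). The `TagLogger` class and `tag_events` function are hypothetical helpers written for this illustration. It shows that the stdlib parser reports tags as they appear in the input and does not add the implied tags that HTML5 tree construction would.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: records the tag events html.parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup):
    """Feed markup to the stdlib parser and return the tag events it saw."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# Under the HTML5 rules the second <p> implicitly closes the first one,
# but html.parser only reports the tags that literally appear in the input.
print(tag_events("<p>one<p>two"))
```

An HTML5-compliant parser given the same input would build an html/head/body skeleton with two sibling p elements, which is the behaviour the html5lib test suite checks for.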


Made possible in turn by giving safe haven to user content on the big social networks. Turned out to be a double-edged sword.

When Rupert tried to lie about voting machines, he was fined a couple of hundred million. All the social networks' mouthpiece accounts spouting nonsense suffer no repercussions whatsoever.


Will you blame the telephone companies and the mailman too?


This is the old dichotomy: either you don't censor and are just a medium (like electricity) or you do censor some things and then you are responsible for what is published. Social media seems to want to censor while not being responsible.


Section 230 of the Communications Decency Act explicitly gave these companies this power, on purpose. Unmoderated online spaces are mostly useful to scammers and spammers.


And thus now they are responsible for all content published there.


If somebody kept using the same phone line to trigger bombs, do you think that the phone company doesn't have an obligation to shut that line down? Let's say the police came to the phone company and said "we know that if you shut this phone line down, so and so won't be able to trigger the bomb they have planted in XYZ space." Do you think the phone company should do nothing?

What about a courier that knows it is delivering bombs? We should look past that too?

Which principles are you invoking exactly?


Traditional telephone is currently at risk of being so full of scams that it isn't sensible to keep a number.


I think that when GP stated "All the social networks mouthpiece accounts spouting nonsense suffer no repercussions whatsoever." they were referring to the people lying and not the social networks themselves.


The Hong Kong police had to release this statement recently:

HKPF reminds the public that the following websites are not the official Hong Kong Police Force website.

1. 96o.1ss5623.com 2. joshhyoung.icu 3. amylfraser.xyz 4. tiaflowe.icu


these examples look ridiculous but you have to remember that people are used to chinese characters and can't easily recognize if a url written in latin characters is right or wrong. this is made even worse by the fact that even official websites are not always hosted on an official domain, and even when they are they use ridiculous hostnames, because again whoever is setting up the site just sees a sequence of letters that they are not closely familiar with.


There is that and there is the fact that 50-60 years ago China was coming out of a Cultural Revolution that had shut down the education system, and places like Shenzhen were fishing villages with dirt roads well within living memory.

It is not exactly surprising that, at such a breakneck pace of development, some people did not get up to speed as quickly as the rest.

———

I will also say I think that China’s embrace of super apps and the quasi-app-internet is not helping with online literacy.


nobody is gonna mention TeX? so i will.

yes, i have TAOCP on the shelf and yes guilty of "one day i'll read it".

but every now and then i open them up and just flip thru that magnificent typography.

Knuth has not just written up all these things, he has developed an entire typesetting system (complete with fonts) to bring technical publishing kicking and screaming into the 20th century (when other software thought kerning and hyphenation were creatures from space). it's the only program deserving a version number approximating PI.


Does anybody know how Amazon is making this happen regarding law enforcement, US jurisdiction, etc?


> 1.1.1.1 has a WebAssembly app called static_zone running on top of the main DNS logic that serves those new versions when they are available.

webassembly? what is that word even doing in a post mortem about DNSSEC failures?


CF Workers (which run WebAssembly) are all over the place. They may not run the main logic (not the actual Nginx or DNSSEC code) but they are used for several maintenance tasks.


Wasm runtimes are great for stuff like plugins! Seems like this static_zone thing was something like that (but they call them apps)



none of the news anchors should get the kind of dramatic powers they acquired over time. obligatory reference to Network (1976)

here is a fantastic analysis of Network and the rise of the news anchors, framed through the film: https://www.youtube.com/watch?v=WlCLZIn38ao


not a heavy :! user, what i used there worked, but afaik neovim recommends :te . that's one of the bigger differences. neovim was very proud to have a fully integrated terminal


apple maps doesn't have cycling maps for _amsterdam_. none of the eu countries i have been to have them either


Ah--looks like you're right. I hope it will arrive soon--Apple is still building out features around the world.

There's a table of feature availability for the top 60 metro areas here:

https://www.justinobeirne.com/apple-maps-feature-availabilit...

