
Since “The reckless, infinite scope of web browsers” is depicted at the start of the article, I think it’s worth pointing out that its claim of W3C having 1,217 specs totalling 114 million words is wildly wrong, probably by 2–3 orders of magnitude in the total. The considerable majority of the documents considered were not specs or not web-relevant, and dozens of versions of the same thing were often counted. Source: https://news.ycombinator.com/item?id=22617721.


> probably by 2–3 orders of magnitude

So the web is maybe only 1.2 specs with 114,000 words? I think it's considerably more than that. If that estimate is off, it's by no more than a factor of 10, IMO. No need to exaggerate.


In the source comment he links, it's more explicit: the "2-3 orders of magnitude" refers to "the total", i.e. the 114 million words, not the number of specs.


Worth noting the discussion at the link.

Given that he omitted huge specs like WebGL etc. I wouldn't say it's wildly wrong. But I'd love to somehow arrive at a better estimate.


Given that someone who needs to develop a browser probably needs to hunt and peck through all that trash to find the relevant bits of information, is it not actually more damning that such a vast quantity of irrelevant cruft exists?

Is this not the corporate equivalent of creating a walled garden (perhaps not the right phrase here; gastric moat sounds more apt), by exhausting the resources of all who choose to attempt to scale this mountain of junk?

That being said, I can't make any suggestions as to how you could shortcut through that other than just having decades of experience in the field.


Isn't that true for any system that's been around for a few decades? Try implementing XMPP; which XEPs do you pick? It's a long list.[1] Try implementing email: there are probably more RFCs to exclude than to include at this point, and what do you need and what is optional?

[1]: https://xmpp.org/extensions/


This is in fact one of the big issues with XMPP. Everything is sorta-kinda compatible but not really. And email is getting so complicated that many people are scared of running their own server, let alone programming one.


The Compliance Suites (also linked on the top of the page you linked) are intended to provide some guidance about what's important.

The current edition of those can be found in XEP-0459: https://xmpp.org/extensions/xep-0459.html


One thing the web specs do incredibly well is cross-linking. I've found it quite easy to start with a high-level spec (e.g. flexbox) and drill down into the bits I need because anywhere another spec is referenced it's linked to directly.


I believe it’s still wildly wrong. Had arp242 not spoken up at that time, I’d have been saying something similar, because the numbers were to me blindingly obviously extremely unrealistic. The entire HTML Standard (the name is somewhat of a misnomer now: it covers much more than just HTML, including quite a bit of CSS interaction, other web platform functionality, JavaScript APIs, and the like) is now about half a million words by the most generous of counting methods (it’s written in a fairly verbose style, which is a really, really good thing when you compare it to the average IETF RFC), and I suspect it’s bigger than everything else put together, apart from ECMAScript (around 270,000 words¹ and probably growing at a faster rate than all other specs: it’s written in an even more verbose style, most of which is effectively straight code in prose form, whereas in the HTML Standard “straight code” is only a decent chunk of it).

As for WebGL, the WebGL parts are actually quite small. https://registry.khronos.org/webgl/specs/latest/1.0/ is only about 20,000 words. I gather it defers significantly to GLES20 (PDF, 204 pages, ~60,000 words) and GLES20GLSL (PDF, 119 pages, ~30,000 words), and it has GL32CORE in its references (PDF, 404 pages, ~125,000 words), but it doesn’t actually cite it in the text, and I don’t know if it’s relevant. There doesn’t look to be anything else significant that wouldn’t already be included.

But really, WebGL is a fairly thin layer atop OpenGL ES 2.0, just removing some functionality and applying some restrictions. I believe you would reasonably expect a browser to use an existing OpenGL ES 2.0 implementation, so I’d be quite content to exclude the 90,000 (or perhaps it’s ~215,000?) words of that, just like it’s common to reuse an existing JavaScript engine (though you also don’t have to). Yet note this: it seems that even if we include it all (and presuming I haven’t missed anything, which I admit I could easily have done, I’m not conversant with these specs like I am with HTML/CSS/JS specs), it’s still under 0.2% of Drew’s massively-inflated figure.

—⁂—

¹ Whew, https://262.ecma-international.org/ took me several minutes to download, despite being only 7MB. Sigh; the trials of being in Australia, where things hosted in the USA are often inexplicably painfully slow—like, sub-256kbps. When already downloaded, it renders in under four seconds, which is really fairly impressive when it’s doing all that layout on a document a million pixels tall—this ain’t a PDF where you can only render one page at a time. The HTML Standard is almost two million pixels tall, and also loads completely in under four seconds—simpler styles, perhaps? I refer to it often enough that I build it locally so I don’t have to download its 13MB all the time, or compromise with the multipage version that you can’t search through as easily.


To underscore something that might get lost in the wall:

The entire premise given in Reckless, Infinite Scope is that the number of words in the specification is positively correlated with the intractability of implementing a given thing. From this foregone conclusion, it tries to quantify how much worse the task of implementing a Web browser is. The problem is that the premise is a bad one; even if it takes more time to read a wordier spec, it is easier to implement one that describes well-defined behavior than a terse one that glosses over things and leaves huge gaps of undefined behavior. This is not just conjecture—it tracks with the development and progress of implementing, say, the HTML parsing algorithm; it is easier to implement a correct and acceptable HTML parser in 2023 armed with only the spec than it was to try to do the same thing in 2003, which involved reading the spec and also reverse engineering how other (esp. proprietary) browsers dealt with the pages that authors were actually publishing in the wild. This is a task that was made easier because the standard got bigger.
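This spec-defined error handling can be seen even from Python's standard library: its html.parser module loosely follows the WHATWG tokenizer rules, so malformed markup produces a deterministic event stream rather than implementation-defined guesswork. A small illustration (the class here is my own, not from the thread):

```python
# Even malformed HTML -- unclosed tags and all -- yields a deterministic
# token stream, because the spec defines what every input must do.
from html.parser import HTMLParser

class EventCollector(HTMLParser):
    """Records the events the tokenizer emits, in order."""
    def __init__(self):
        super().__init__()
        self.events = []
    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))
    def handle_data(self, data):
        self.events.append(("data", data))

collector = EventCollector()
collector.feed("<p>one <b>two")   # <b> never closed; still well-defined
collector.close()
print(collector.events)
# [('start', 'p'), ('data', 'one '), ('start', 'b'), ('data', 'two')]
```

Every conforming implementation agrees on that stream; in 2003 you had to discover it by reverse engineering.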

The point is that its broken methodology doesn't even matter; we don't have to try to come up with better ways of evaluating whether a spec should be included or not because its whole premise is flawed to begin with. Any attempt to produce an input set that you can then use to run a word count analysis is a moot academic exercise at best that will only tell you how many words it contains.


And yet, it's a decent proxy. That, and the number of specs required (many of which are locked behind IEEE, IEC, etc. paywalls).


Uh, no. It's a bad proxy, for the reasons just stated. As the spec gets more detailed, it gets easier to implement, not harder.


A more detailed spec might have more concrete definitions, but it also means more actual code for someone to write. With an under-detailed spec you might have a switch with a couple of defined values and an undefined catch-all; a super-detailed spec just adds case statements and requires more code to handle them. The detail in the spec makes for lower cognitive load, but the code still needs to be written and, ideally, tests written too.
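To make the claim being argued here concrete (the property and keyword values below are invented for illustration, not from any real spec):

```python
# Hypothetical example: a terse spec defines two values and leaves the
# rest undefined; a detailed spec enumerates every value plus a
# mandated fallback. More cases to write, but no guesswork.
def parse_terse(value):
    # Terse spec: two defined values, everything else undefined.
    if value == "block":
        return "block"
    if value == "inline":
        return "inline"
    return "undefined"   # implementation free to do anything here

def parse_detailed(value):
    # Detailed spec: every keyword has prescribed behaviour, and the
    # fallback itself is prescribed rather than left to the implementer.
    defined = {"block", "inline", "inline-block", "flex", "grid",
               "table", "none"}
    return value if value in defined else "inline"  # spec-mandated default
```

Whether the extra cases are extra *work* or merely extra *typing* is exactly what the replies below dispute.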


> A more detailed spec [...] means more actual code for someone to write.

No, it doesn't. A detailed spec has the same amount of code to write as a spec for the same thing with less detail; for the types of specs relevant to this discussion, the primary requirement of "does what the other browsers do" exists whether the details are made explicit in the spec or not. More code is a consequence of an increase in requirements, not detail.

In any case, neither circumstance is I/O bound to begin with.


No. If a spec doesn't define a behavior code can jump to some "undefined" handler which could be anything from a no-op to some quirks mode. Unless you're Microsoft writing specs "do what Word 97 does", copying the behavior of existing browsers is not a specification.


Please don't ignore the context. We are talking about Web browsers.

You don't, in reality, have the latitude to do "anything from a no-op to some quirks mode" of your choice. The requirement is absolutely the one stated: to be compatible with what other browsers are doing. If your browser doesn't satisfy that requirement, then you break the Web, regardless of whether the spec is a hundred words or a hundred million. No amount of pointing at a standard and arguing that it doesn't specify clearly defined behavior in some area will ever be enough to teach a site to be able to say, "Oh, I'll just unbreak myself then so you can go ahead and view/use this page on your computer."

Besides that, even if you were right—and to be clear, you aren't—that doesn't change the fact that, again, arguing for underspecification because "a couple defined values" isn't as much "actual code" that "still needs to be written" is an argument that approaches a problem that isn't I/O bound as if it is.


> and I suspect it’s bigger than everything else put together, apart from ECMAScript

The thing is, it's not just the HTML standard. It's also all the standards it references. And all the standards they reference, and all the standards those standards reference, ad infinitum.

For example, HTML 5 references SVG 2, which references CSS 2, which references Unicode and XML 1.1. Or, down another branch of the same route, HTML 5 references SVG 2, which references CSS 2, which references ICC.1:2004-10 (Profile version 4.2.0.0, "Image technology colour management"), which normatively references ISO/IEC 646:1991 ("Information technology — ISO 7-bit coded character set for information interchange"), IEC 61966-2-1 (1999-10) ("Multimedia systems and equipment — Colour measurement and management — Part 2-1: Colour management — Default RGB colour space — sRGB"), and the TIFF 6.0 Specification (Adobe Systems Incorporated), among other things.

Yes, some of those overlap (as many standards reference the same standards), but the number of those standards is definitely non-trivial. Some of them you can probably pull in as system libraries or external libraries. The question is, how many?

Edit: and some of them are definitely not relevant to the web, but how would you know until you read through the spec that references it, and through the referenced spec to find and understand the relevant bits?


Essentially any specification that includes any kind of image support will include this kind of chain of specifications; just as any system that does networking will eventually end up with TCP, any system that does text ends up with Unicode, etc. Even the simplest possible 1995-esque browser will have to deal with that (support for images was added in 1993, and text and networking were always central).


> Even the simplest possible 1995-esque browser will have to deal with that (support for images was added in 1993, and text and networking were always central).

Indeed they did. Here's what the author of KHTML said: https://twitter.com/LarsKnoll/status/1421121639845187585

--- start quote ---

Implementing a browser engine from scratch was a lot of work in 1999/2000, it’s close to impossible today.

--- end quote ---


Making a web browser from scratch is like making a hamburger from scratch: the problem is not the first part, but what you truly mean by "from scratch".


ISO 646 and 61966? I won’t disagree with your annoyance with ISO water torture[1], but ASCII and sRGB are not the examples of needlessly sprawling web of references I would’ve chosen. Even if sRGB is an utter mess[2], it’s a mess you essentially have to use if you’re doing colour on computers.

[1] https://www.cs.auckland.ac.nz/~pgut001/pubs/x509guide.txt

[2] https://photosauce.net/blog/post/what-makes-srgb-a-special-c...


I just randomly selected some without going too deep into details. But yes, sRGB is also referenced from CSS because, you guessed it, CSS deals with color :)


How do you even start implementing such a spec? I know it's probably a dumb question, but how would one structure their code to check all those boxes? Does it usually involve reading the entire spec, then figuring out the foundational parts and building from there? Does that work when you need multiple specs to "fit" together? How do you make sure, or even check, that some code you built for one part of the spec does not interfere with something else?

I implemented a few specs in my short career but nothing even close to that. It's actually mind boggling that we manage to have all those moving parts fit together.


Other than layout and rendering, implementing HTML, ECMAScript and CSS is genuinely easy. There’s a lot of it so that it’ll take you a long time, but it’s very much not hard, because the HTML and ECMAScript specs fully spell out the algorithm, telling you exactly what you must do (or, more precisely, what you must be equivalent to doing: e.g. “implementations must act as if they used the following state machine to tokenize HTML”), so it’s largely mechanical. This is very unusual in specs. I wish it were less so.

Take a look through https://html.spec.whatwg.org/multipage/parsing.html. It’s verbose but very approachable, very implementable.
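To give a flavour of that style without quoting the spec: the tokenizer is an explicit state machine where every state prescribes an action for each input character. A toy two-state sketch (the real spec defines roughly 80 states; this is illustrative only, not spec-accurate):

```python
# Toy spec-style tokenizer: two states, each consuming one character at
# a time with a prescribed action -- the same mechanical shape as the
# WHATWG tokenization algorithm, radically simplified.
def tokenize(html):
    state = "data"
    tokens, buf = [], ""
    for ch in html:
        if state == "data":
            if ch == "<":
                if buf:
                    tokens.append(("character", buf))
                    buf = ""
                state = "tag open"
            else:
                buf += ch
        elif state == "tag open":
            if ch == ">":
                tokens.append(("tag", buf))
                buf = ""
                state = "data"
            else:
                buf += ch
    if buf and state == "data":
        tokens.append(("character", buf))
    return tokens

print(tokenize("<p>hi<br>"))
# [('tag', 'p'), ('character', 'hi'), ('tag', 'br')]
```

Implementing the real thing is mostly a matter of transcribing each state's rules faithfully, which is exactly why it's long but not hard.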


The question is what exactly is needed for a useful and functional browser. You certainly don't need all features from Chrome, but you do need more than, say, Lynx or Dillo.

Is WebGL needed? I've browsed the web for years with it disabled and have not suffered any inconvenience. I'd probably say it's not needed, but I'm a bit on the fence about it and can understand if people would disagree. All browsers implement XSLT, but is that actually needed for a functional modern browser? Maybe not? I can't remember the last time I've seen it used, but perhaps it is. And do you include HTTP? Or is that too low-level? Do you include PNG and SVG, or just PNG? If you include SVG, then why not WebGL?

There are some obvious "we need this", some obvious "we don't need this", and a lot of unclear and somewhat subjective area. I do know that you can't really say "yes there's bad data, but it probably cancels out against stuff omitted"; if anything, that only underscores my point that the list is not good.

An uncurated or minimally curated document dump is not the correct approach in the first place. If you did that for SMTP you'd end up with a lot of irrelevant documents too, simply because the specification is a few decades old: stuff gets superseded, some things never see real-world implementations, some things no one uses any more, etc.

I started making a better list when the article was originally posted, starting from "okay, let's just check what you need for a useful browser normal people can use every day" and ended up with a few dozen things, but I never really posted it as I wasn't quite sure that was fully correct either and because I never really figured out some of the questions above.

I think most of the complexity stems not just from the word count, but rather from the fact that everything interacts with everything else. Consider the relatively new "position: sticky" in CSS. Okay, great. But it doesn't work well with flexboxes, or RTL, or negative margins, or z-index, etc. etc. [1] Adding what seems like a fairly simple feature is quite complex because it interacts with so many things. It's not hard to imagine a fresh new HTML and CSS which allows all the features the current one does but does so in a much simpler and orthogonal way, which would of course break backward compatibility and every website.

[1]: In 2020 anyway; I'm not sure of the current state. Here are some of the links from my 2020 post which, like most of my posts, I never finished:

https://bugzilla.mozilla.org/show_bug.cgi?id=1488080
https://bugzilla.mozilla.org/show_bug.cgi?id=1498772
https://bugzilla.mozilla.org/show_bug.cgi?id=1519600
https://bugzilla.mozilla.org/show_bug.cgi?id=1490487
https://bugzilla.mozilla.org/show_bug.cgi?id=1488950
https://bugzilla.mozilla.org/show_bug.cgi?id=1514291
https://bugzilla.mozilla.org/show_bug.cgi?id=1528957
https://bugzilla.mozilla.org/show_bug.cgi?id=1472602
https://bugzilla.mozilla.org/show_bug.cgi?id=1455660
https://bugzilla.mozilla.org/show_bug.cgi?id=1450601
https://bugzilla.mozilla.org/show_bug.cgi?id=1424384
https://bugzilla.mozilla.org/show_bug.cgi?id=1341643
https://bugzilla.mozilla.org/show_bug.cgi?id=1526342
https://bugzilla.mozilla.org/show_bug.cgi?id=1519073
https://bugzilla.mozilla.org/show_bug.cgi?id=1414874


> I think most of the complexity stems not just from the word count, but rather that everything interacts with everything else.

And most specifically in layout and rendering. HTML, JavaScript and the parts of CSS that aren’t, y’know, doing anything, are all very straightforward, despite having the significant majority of the word count. If anything, I’d say that in web matters implementation difficulty is inversely proportional to word count, because its verbosity pretty consistently comes from precision (which makes implementation easy). Layout stuff would be much harder to define exhaustively in that fashion, nor is it done so in most places.


> I think most of the complexity stems not just from the word count, but rather that everything interacts with everything else.

That is definitely the main issue.

And you're completely correct on the needed/non-needed/subjective front. Many of the standards reference (in a recursive manner) a lot of other standards. I listed some here: https://news.ycombinator.com/item?id=35524018. As an outsider it's impossible to know whether the TIFF spec or the ISO 7-bit coded character set for information interchange are relevant and need to be studied, or are there just because they define some minor values referenced in some higher-level spec.


It's not hard. Start with the WHATWG's spec, then incorporate the other specs it references using a reasonable heuristic to determine if a given item should be included or not.

If you don't think the estimate from Reckless, Infinite Scope is wildly off, then either you didn't read the methodology and do a spot-check of the dataset, or you really don't understand the scope of what gets published by the W3C, how little much of it has to do with Web browsers, or how many revisions of those documents there are.


It may not be possible to come up with a "reasonable heuristic": https://news.ycombinator.com/item?id=35524018


It is possible, and that sentence is verging on nonsense. A heuristic is not by definition perfect or optimal.


> It is possible, and that sentence is verging on nonsense.

Define "reasonable" then, when talking about the web.


The only bar that the heuristic has to pass here is "delivers a result that doesn't suck as bad as the analysis in Reckless, Infinite Scope". The analysis in that article is so bad, however, that your heuristic can literally be, "if you encounter an item that was also in Drew DeVault's input set, then assign an arbitrary probability 0.9 (or whatever) of whether the item should be counted", and it would still give you a more realistic result than what the article says (and that people are actually relying on in their arguments—and that you are defending) here.

Aside from that, given how many logical errors and weird counterconclusions[1] you've managed to stuff into this discussion (and to have done so economically[2]), I'm going to go ahead and say this is my last response to you that I spend more than 10 seconds writing out.

1. e.g. <https://news.ycombinator.com/item?id=35521704#35524952>

2. wrt number of words, fittingly


Mozilla's Windows binaries weigh 226MB ... does that mean an encoding rate of 2 bytes per word of specification? Pretty good packing, I'd say.


I'm not surprised, it usually takes many words of English to describe very little code.

For example, changing "greater than" to "greater than or equal to" might require only a single bit of change in the machine code.


I think the opposite. Sure, in your example that is true, but that's assuming we spec the code line by line, in English.

But we don't, really. Example:

Write a function that keeps track of the name and weight of each person added, then prints the list, sorted by weight, lightest to heaviest.

I asked GPT this. My 140-character, unoptimized query resulted in 816 characters of C++ code.


Your example is heavily underspecified. In what form are people’s details added? How is the list printed? A spec that’s actually implementable will be a good deal longer. One that defines behaviour completely (what ordering should you use for equal weights?) will be longer still.

The HTML and ECMAScript specs that comprise most of what we’re talking about are very much closer to line-by-line, because they’re designed to be both implementable and completely specified.


If I change my query to be more specific

Write a program that keeps track of the name and weight of each person added, sorted by weight, lightest to heaviest. The input should be a command line prompt asking for input in 3 fields - first name, last name, weight. If two people have the same weight, order them alphabetically by last name. At the end, when a blank line is entered, print the list with headings first name, last name, weight. Check the input, if it's not 3 sections or empty, print an error explaining the input format.

497 input characters, 1317 output characters.

In the case of a detailed or verbose spec, you're probably right. I'm just replying to the assertion that it generally takes many words of English to describe very little code. If that were true, nobody would be using ChatGPT to scaffold.

Now, if you're going to be detailed about -how- each line should look, I'd agree that English would be more verbose than code.
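For what it's worth, the spec in the parent comment maps to roughly this much Python (a sketch, skipping the interactive prompt loop; the function name and error wording are mine):

```python
# Implements the parent comment's spec as a pure function over input
# lines: 3 whitespace-separated fields per line, blank line ends input,
# sort lightest to heaviest, ties broken alphabetically by last name,
# malformed lines collect an error.
def build_report(lines):
    rows, errors = [], []
    for line in lines:
        if line.strip() == "":            # blank line ends input
            break
        parts = line.split()
        if len(parts) != 3:
            errors.append("expected 3 fields: first name, last name, weight")
            continue
        first, last, weight = parts
        rows.append((first, last, float(weight)))
    # Lightest to heaviest; equal weights ordered by last name.
    rows.sort(key=lambda r: (r[2], r[1]))
    return rows, errors

rows, errors = build_report(["Ann Smith 60", "Bob Adams 60", "Cy Tall 50", ""])
print(rows)  # [('Cy', 'Tall', 50.0), ('Bob', 'Adams', 60.0), ('Ann', 'Smith', 60.0)]
```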


Still seriously underspecified for an interoperable spec. You haven’t defined the input or output forms anywhere near precisely enough, or how to order alphabetically by last name (sorting depends on locale: e.g. is æ equivalent to ae—though that still raises stability questions—a letter after a, a letter after z, something else? I think there are languages that treat it as each of these. Or are you just rejecting anything beyond ASCII letters, which will cause different trouble?), or what to do about two people with the same weight and last name.

Web specs need to consider all of these sorts of things. That’s why they’re verbose—they’re designed to be implementable and complete.
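A concrete instance of the collation problem raised above: Python's default sort compares Unicode code points, so "æble" lands after "zebra", which happens to match Danish collation but not what an English speaker expects. "Sort alphabetically" alone doesn't decide this:

```python
# Code-point order puts "æ" (U+00E6) after "z" (U+007A). That matches
# Danish collation by coincidence, but an English speaker would expect
# "æble" near "apple". An interoperable spec must pick one behaviour.
words = ["æble", "apple", "zebra"]
print(sorted(words))  # ['apple', 'zebra', 'æble']
```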



