> we also have a hard time judging Ladybird’s progress for the reasons stated above

There are plenty of tests available for the web browser stack, which make it possible to compare browsers, Ladybird included, and their implementations of standards.

  * Acid3: http://wpt.live/acid/acid3/test.html
      * All browsers score 100/100
  * HTML5test: https://html5test.com/index.html
      * Chrome: 528
      * Firefox: 491
      * Safari: 471
      * Ladybird: 266 (as of December 2022; I couldn't find more recent figures, but I wouldn't be surprised if it were significantly higher today)
  * Test262 (JS engines): https://test262.report
      * Chrome (V8): 86%
      * Firefox: 85%
      * Safari: 85%
      * Ladybird: 87% (behind on language syntax, significantly ahead on built-ins, internationalization, and AnnexB)
  * Web Platform Tests: https://web-platform-tests.org/
      * I don't have figures for this, but the suite is integrated into Ladybird's CI pipeline
  * Probably others

The point is: it is quite possible to judge and measure overall progress, and the vertical-slice approach they use leads to constant overall improvement.