
Neat. I can't help but wonder if this will fall victim to the "last 10% is the hardest" rule? Will going from tech demo to production ready remove the performance gain?

It sounds like you're claiming that FreeType is slower because the parsing/accumulation implementations are slower. It's far from my area of expertise, but wouldn't 20 y/o open source software as prevalent as FreeType have optimized those code paths?

Edit: Author is a heavyweight in font rendering circles. Excuse my ignorance. Just wary of "10x faster with 90% of functionality!" benchmarks.




Author:

> The current state of the code is quite rough. The code isn't well organized, and it's basically not ready for prime time.

From what I can see in https://github.com/google/font-rs/blob/master/src/font.rs, this is what is missing:

* support for CFF-based fonts (that is, OTF files with PostScript-flavored, i.e. cubic, outlines)

* "Advanced Typographic Tables" (see the OpenType spec: https://www.microsoft.com/typography/otspec/otff.htm), which are needed to render complex non-Latin scripts like Arabic, because they define context-specific substitutions and positioning, as well as OpenType-level kerning

* support for the kerning table

* support for slightly more exotic TTF variations like EOT and WOFF for webfonts

* hinting support for smaller rendering sizes

Most of these things are supported by FreeType, and are probably a considerable amount of work to add. Once you add them into the rendering calculations, the abstractions in the code would have to be refactored and the code would become more complex and probably slower.

Having said that, it's still a nice implementation and easy to read, something I wouldn't necessarily say about FreeType ;)


> Once you add them into the rendering calculations, the abstractions in the code would have to be refactored and the code would become more complex and probably slower.

These features don't have much to do with the core rasterization algorithm, which is where the vast majority of the time is typically spent. So I wouldn't expect things to go slower.
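For the curious, the heart of this style of rasterizer can be sketched roughly like this (a simplified illustration, not font-rs's actual code: an earlier edge-walking pass writes signed coverage deltas into a buffer, and a single accumulation pass converts them into per-pixel alpha):

```rust
// Simplified sketch of the accumulation pass. Each entry in `deltas`
// is the signed change in coverage at that pixel; a running prefix
// sum yields the coverage, which is clamped and quantized to 8 bits.
fn accumulate(deltas: &[f32]) -> Vec<u8> {
    let mut acc = 0.0f32;
    deltas
        .iter()
        .map(|&d| {
            acc += d;
            // Clamp the running coverage to [0, 1] and scale to alpha.
            (acc.abs().min(1.0) * 255.0).round() as u8
        })
        .collect()
}
```

A tight, branch-light loop like this is exactly the kind of code that dominates rasterization time, and it is largely independent of shaping, kerning, or font-format parsing.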


The hinting is definitely rasterization related, isn't it?


Well, sure, but it's increasingly common to just not do hinting these days. For example, Mac, iOS, and Android (from what I can gather) don't.

In any case, hinting just changes point positions. It doesn't affect the way the rasterizer works at a fundamental level, I don't believe.


In the case of TrueType hinting with the interpreter, it can. The SCANTYPE instruction [1] allows the hint program to request different scan-conversion settings. It's mainly used to change the dropout-control mode. Without dropout control, the rule is strictly that pixels whose centers are inside the glyph are drawn. At small sizes this can make features smaller than a pixel disappear, so dropout control adds additional rasterization rules to draw some pixels even if their centers fall slightly outside the glyph. I've seen that FreeType supports these modes, and they must carry some complexity and performance cost. Granted, dropout control is less useful with antialiasing, but it's still a case where hinting affects the rasterizer's operation.

[1] https://developer.apple.com/fonts/TrueType-Reference-Manual/...
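To make the two rules concrete, here's a hypothetical sketch (function names are mine, and the dropout rule is heavily simplified) of which pixels a thin horizontal span lights up under plain center sampling versus with dropout control:

```rust
// Center-sampling rule: pixel x is drawn iff its center (x + 0.5)
// lies inside the half-open span [x0, x1).
fn pixels_center_rule(x0: f64, x1: f64) -> Vec<i32> {
    let first = (x0 - 0.5).ceil() as i32;
    let last = (x1 - 0.5).ceil() as i32; // exclusive
    (first..last).collect()
}

// Simplified dropout rule: a non-empty span that covers no pixel
// centers still lights one pixel, so sub-pixel features survive.
fn pixels_with_dropout(x0: f64, x1: f64) -> Vec<i32> {
    let px = pixels_center_rule(x0, x1);
    if px.is_empty() && x1 > x0 {
        vec![x0.floor() as i32]
    } else {
        px
    }
}
```

A span like [2.1, 2.4) covers no pixel center, so the plain rule draws nothing while the dropout rule still lights pixel 2; that extra case analysis is the cost referred to above.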


For CFF at least, hinting instructions are integrated tightly with the rendering instructions. I'm not sure about TrueType outlines though.


FreeType does not support EOT.

FreeType also doesn't handle Advanced Typographic Tables. For that you need HarfBuzz. Note that rendering and shaping are usually considered separate processes - one doesn't expect a font rendering library to handle shaping.

Other than hinting, nothing you've mentioned should affect rasterization performance.


> wouldn't 20 y/o open source software as prevalent as FreeType have optimized those code paths

You'd be surprised how often SIMD goes unused. libpng, for example, doesn't use it on x86…

Outside of games, scientific computing, and a few other fields, it's notable how little of the hardware in our devices actually gets put to use.


> libpng, for example, doesn't use it on x86…

Do you mean doesn't use it explicitly? If so, GCC is still happy to find quite a few spots to automatically inject it. On my Arch system:

    $ objdump -d /usr/lib/libpng.so | grep -c xmm
    316
Then again, gif doesn't seem to get even that, so maybe there is something in libpng?

    $ objdump -d /usr/lib/libgif.so | grep -c xmm
    0


gcc and clang generate SSE2 instructions for scalar (non-SIMD) arithmetic, depending on the target architecture and compiler flags. For x86_64 targets, the calling convention puts floating-point arguments in xmm registers, so the compilers must use SSE2 (and it's faster than x87 anyway). That's what you're seeing. If you look at the instructions in libpng.so that mention xmm registers, you'll see that they aren't SIMD instructions - they're scalar ops with ss/sd suffixes (e.g. addsd, mulss) rather than packed ps/pd ones (e.g. addps, mulpd).


Explicitly, for filtering.


I wonder whose fault that is... if you build a feature that almost no one will use, is the 'API' too arcane, is everyone too lazy, or are there no tools available to ease the pain?


No tools available, or at least none that are easy to use. It's the classic parallelism problem: tools are either too low-level (e.g. CUDA/OpenCL in the GPU space) for day-to-day developers to use effectively, or too high-level to deliver performance in the rest of the codebase, wiping out your parallelism gains. For SIMD, the low-level option is hand-writing x86 SIMD instructions, or using a similarly low-level C intrinsics library to the same effect. The high-level alternative is switching to something like Haskell and using one of its array-manipulation DSLs. In the first case you have to manage the parallelism explicitly, and there _will_ be bugs; in the second case, you probably lose more performance in the other 90% of your application by switching to (say) Haskell than you gain from the parallelism.

In terms of automation, SIMD is hard for a compiler to apply automatically (i.e. auto-vectorization) in a lot of traditional languages (e.g. C), and hard to add to dynamic languages, because you end up adding extra code paths/JIT passes for each new kind of SIMD/parallelism hardware construct available.


I believe that C is in part to blame, because the ANSI C abstract machine describes a single-threaded processor.


SIMD is not about threading. Of course you can multithread it too, just like any other code.


My comment was responding to the concept that "it's notable how little of the hardware in our devices actually gets put to use".


It's funny, anecdotally I've seen developers say that it should be the compiler's job to insert SIMD instructions and do optimizations. But many don't understand SIMD and think it's something as simple as replacing a few instructions.


Interesting, especially seeing how other architectures do it. I don't know if libpng participates in Google Summer of Code, but that would be a great GSoC suggestion.


> wouldn't 20 y/o open source software as prevalent as FreeType have optimized those code paths?

The tradeoffs that made the most sense 20 years ago are not those that would lead to the fastest implementation on current hardware. It's not so much that FreeType is unoptimized; it's that old, possibly now-wrong optimizations are pretty much baked in...

Modern CPUs have massively more cache and have vector instructions, which makes the optimal solution very different from the one optimal for the Pentium II that was top of the line when FreeType was first conceived... It's also acceptable to use vastly more memory; FreeType dates from a time when a beefy desktop machine had maybe 32 MB of RAM...


Raph really, really knows his stuff. His interview on the New Rustacean podcast starts with a summary of his background: Gimp, Ghostscript, Android font rendering.


Your 10x-speed, 90%-functionality caution is definitely warranted. I've lost count of how many times I've seen "we run Python/Ruby/Perl/whatever 3x faster; we just haven't implemented exceptions and monkey-patching yet." :-)


Yup, I still remember being blown away by his dissertation. That he is using Rust in earnest is enough reason for me to take another look at the language.


Any chance we can get a link?

Edit: Whoops, see below. And for the lazy: http://www.newrustacean.com/show_notes/interview/_2/index.ht... (credit to @wscott)


As a font neophyte I can only say that he sounds like he breathes fonts. His interview on Android Developers Backstage was very interesting: http://androidbackstage.blogspot.fr/2015/01/episode-20-fonts...

(And the whole podcast is an easy recommendation: Googlers talking about Android dev stuff from an insider perspective.)


For what it's worth: Gimp, Ghostscript, and Android all use FreeType for text rendering. At best, they would require him to know how to use FreeType bitmaps correctly (i.e. gamma-correct blending, when to multiply the colours in). Though he could easily go beyond that; it is very interesting stuff, and I find it incredibly hard to stop myself from reading into FreeType.
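Gamma-correct blending, for what it's worth, means compositing the coverage value in linear light rather than directly on gamma-encoded pixel values. A minimal single-channel sketch (assuming a simple power-law gamma rather than the exact sRGB transfer function, and a hypothetical function name):

```rust
// Blend `fg` over `bg` at the given coverage (0..=1), converting both
// gamma-encoded values to linear light before mixing, then encoding back.
fn blend_gamma_correct(bg: f64, fg: f64, coverage: f64, gamma: f64) -> f64 {
    let bg_lin = bg.powf(gamma);
    let fg_lin = fg.powf(gamma);
    let out_lin = bg_lin * (1.0 - coverage) + fg_lin * coverage;
    out_lin.powf(1.0 / gamma)
}
```

With gamma = 1.0 this degenerates to naive blending; with a realistic gamma, 50% coverage of white over black comes out noticeably lighter than 0.5, which is why naive blending makes black-on-white text look heavier than white-on-black.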


"It sounds like you're claiming that FreeType is slower because the parsing/accumulation implementations are slower."

This is one of those performance drains, common across many languages, that we often don't notice because our weaker languages give us no alternative. Manifesting an array for a consumer that only wants to iterate over it once, in order, with no backtracking, is a common antipattern. Just as I'd look askance at any putative "next big language" without closure support, I look askance at "next big languages" that don't have iterators.
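In Rust terms, the antipattern versus the iterator version looks something like this (an illustrative sketch, not code from font-rs):

```rust
// Antipattern: manifest an intermediate Vec just so the caller can
// walk it once. Every element is allocated and written to memory.
fn scaled_points_vec(points: &[f32], scale: f32) -> Vec<f32> {
    points.iter().map(|p| p * scale).collect()
}

// Iterator version: the same transformation, but values are produced
// lazily and fused into whatever loop consumes them; no allocation.
fn scaled_points_iter<'a>(
    points: &'a [f32],
    scale: f32,
) -> impl Iterator<Item = f32> + 'a {
    points.iter().map(move |p| p * scale)
}
```

A consumer like `.sum::<f32>()` over the iterator version compiles down to a single fused loop, whereas the Vec version pays for an allocation and a full write/read round-trip through memory first.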


> It's far from my area of expertise, but wouldn't 20 y/o open source software as prevalent as FreeType have optimized those code paths?

No, 20 y/o software had Heartbleed. Just because software is old doesn't mean it's battle-tested.


And just because software is mature, open-source, and actively developed does not imply it is optimal for today's hardware. See WebRender versus GDI/GTK2, for instance.


Being open source is no guarantee of being fast, or good, or anything. The only thing you really know you're getting is the ability to inspect the code to figure out if it suits your needs instead of having to assume.



