
The audio "loss" example sounds plausible in passing (and the diagram looks plausible) but is actually incorrect. The frequency and timing dimensions of analog audio below the Nyquist frequency is preserved perfectly by digital quantization, which in practice for CD/DVD usecase are the full spectrum of the human ear. This counterintuitive result is explored in some detail in [1].

It is true that the amplitude dimension (only) is quantized to (typically, 16 or 24) bits, which you could detect with a very good oscilloscope. However, a 24-bit quantization step is far smaller than anything a human ear can discern. Visually, it is like looking at two stacks of dollar bills that are 6,000 feet high and trying to discern which one has one extra bill.
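As a quick sanity check of that analogy, here is the arithmetic (a sketch, assuming the commonly quoted bill thickness of about 0.0043 inches):

    # 24-bit quantization gives 2**24 distinct amplitude levels; stack that
    # many dollar bills (assumed ~0.0043 in thick) and see how tall it gets.
    levels = 2 ** 24
    bill_thickness_in = 0.0043
    stack_feet = levels * bill_thickness_in / 12
    print(f"{levels:,} levels -> a stack roughly {stack_feet:,.0f} feet tall")
    # ~6,000 feet; one quantization step is one bill out of that stack.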

I suppose that is technically "lossy", but the only thing we are "forgetting" is something no human could perceive, or remember.

[1] https://people.xiph.org/~xiphmont/demo/neil-young.html



It's still lossy! Maybe it isn't lossy to the human ear, but what is the data going to be used for? There are assumptions you're building in here. What if I'm now interested in using the data for a bat ear model? All of a sudden, that information is now of limited use.

She's saying "be aware of your assumptions." Her entire screed is a call to take a step back, and recognize that programming is reifying assumptions and systems of control. Be aware of what those happen to be. Make those choices conscious, rather than unconscious.

When I'm creating a UI, am I assuming the viewer has 20/20 corrected vision on a 1920x1080 monitor with the full set of rods and cones? Am I considering folks that are colorblind, might need to have different zoom levels, or might use screen readers?

When I'm creating a tool for data analysis, am I making it for other programmers? Or can I maybe widen its usage to the business analysis side, thereby making the tool more useful to more people? When I change a tool, is it breaking someone else's workflow?


> and the diagram looks plausible

I've seen this and similar diagrams used over and over again by a lot of people who should know better. This kind of diagram can be interpreted two ways:

(a) the diagram maker has no idea how sampling works, or

(b) the diagram maker made it to illustrate a pathological case to make quantization noise obvious, that is, the diagram shows fs=infty with a four-level / two-bit ADC and no dithering.

In any case, it has nothing to do with how digitizing audio works.
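For anyone who wants to see the gap between the diagram's pathological case and real digitization, here is a rough sketch (my own, using NumPy and an assumed 48 kHz sample rate):

    import numpy as np

    fs = 48000                        # sample rate, Hz (assumed for illustration)
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 1000 * t)  # 1 kHz tone, well below Nyquist (24 kHz)

    def quantize(signal, bits):
        """Uniform mid-rise quantizer with 2**bits levels, no dithering."""
        levels = 2 ** bits
        q = np.floor(signal * levels / 2) + 0.5
        q = np.clip(q, -levels / 2 + 0.5, levels / 2 - 0.5)
        return q / (levels / 2)

    for bits in (2, 16):
        err = x - quantize(x, bits)
        snr_db = 10 * np.log10(np.mean(x ** 2) / np.mean(err ** 2))
        print(f"{bits:2d}-bit quantization: SNR ~ {snr_db:.1f} dB")
    # 2 bits reproduces the gross staircase the diagram shows; 16 bits puts
    # the error roughly 98 dB below the signal, i.e. nothing like the picture.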

> It is true that the amplitude dimension (only) is quantized to (typically, 16 or 24) bits, which you could detect with a very good oscilloscope.

Nope, normal scopes work with 8-bit ADCs and manage 10 to 12 bits in ERES and similar modes (... they don't really get to 10- or 12-bit SFDR though ...). Some special scopes have 16-bit ADCs, but you'd still be hard pressed to detect the difference.

Also, while again a "counterintuitive result" in a certain sense, the dynamic range of a 16-bit audio signal is greater than 96 dB, that is, you can encode and actually discern sounds below -96 dBFS. This is because the 96 dB figure assumes white dither, which is not what is actually used (noise-shaped dither is).
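For reference, the textbook figures being argued about come straight out of two standard formulas (a sketch; noise shaping is the part that moves the perceived floor):

    import math

    bits = 16
    dynamic_range_db = 20 * math.log10(2 ** bits)   # ~96.3 dB: full scale over
                                                    # one quantization step
    snr_sine_db = 6.02 * bits + 1.76                # ~98.1 dB: full-scale sine
                                                    # over flat quantization noise
    print(f"20*log10(2^16) = {dynamic_range_db:.1f} dB, "
          f"6.02*N + 1.76 = {snr_sine_db:.1f} dB")
    # Noise-shaped dither pushes that noise out of the ear's most sensitive
    # band, which is why material below -96 dBFS can still be encoded and heard.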

Of course, all that doesn't really matter with pop music. Who needs >100 dB SNR and DNR if the piece you are encoding only has 10 dB DNR anyway?!


> The audio "loss" example sounds plausible in passing

Not really. Sure, your debunking is sound, but we really don't need to appeal to Nyquist for this one. Just consider the alternatives; analog media? It begins rotting the moment it's recorded. Some abstract representation? Fine, until you forget how to interpret it. Digitizing is the most robust means we've yet invented to forestall "forgetting": a technique that enables precise and efficient replication of audio on myriad forms of media now and in the future.

The Unicode case is also naive. Every language suffers change as new speakers/writers and new representations appear; that isn't a feature specific to programming or computing at all. On the other hand, thousands of symbols from hundreds of obscure languages are being permanently preserved for posterity in Unicode; how is this "forgetting?"


The argument is not that "forgetting" is fundamentally bad - it's that the choice of what to leave out can be meaningful. Han unification removes some amount of distinction between things that are meaningfully different and instead relies on additional metadata to reconstruct that. What are the wider social consequences of that? I don't know, but it's not clear that those involved in making the decision do either.
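To make the "additional metadata" point concrete, here is a small illustration (直 / U+76F4 is one of the usual examples whose preferred glyph shape differs between Japanese and Chinese typography; the character choice is mine, not the speaker's):

    # 直 is a single unified code point shared by Japanese and Chinese text.
    ch = "\u76f4"
    print(f"U+{ord(ch):04X}")  # -> U+76F4, regardless of language
    # The code point itself carries no regional information; whether it is
    # drawn with the Japanese or the Chinese glyph shape has to come from
    # out-of-band metadata: a lang attribute, a locale, or the chosen font.

In HTML, for instance, you would tag the text with lang="ja" or lang="zh-Hans" so the rendering engine can pick an appropriate font, which is exactly the kind of reconstruction-by-metadata being described.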

The fundamental point here is that hacker culture has often made decisions without considering the effect they have on non-hackers (or even hackers of different backgrounds), and as a result those decisions may result in abstractions that "forget" meaningful data. Uncompressed digitisation of audio is a case where it's unlikely that the difference is important in any way, but there are plenty of examples given where it is. The suggestion that having more information can help us make better decisions shouldn't be controversial.


By the way, people seem to think Han unification was forced on CJK users by evil white people from Unicode (there was an article like this in modelviewculture once), but it was contributed by the relevant Asian governments.

And of course China already made much larger changes in real life by creating Simplified Chinese.


> Just consider the alternatives; analog media?

Now you're off topic altogether, talking about different possible types of potential loss...

> Every language suffers change as new speakers/writers and new representations appear

That was the point (to wit, the whole discussion about transcription was illustrative).


The idea of Han unification doesn't seem right to me. The Cyrillic alphabet, for example, got its own set of code points, even though some letters resemble Latin letters. But even though those letters look the same, they sometimes sound different and mean different things.

I suppose somebody came up with this idea back when they were trying to fit everything into 16 bits.
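For comparison, the Cyrillic/Latin situation described above looks like this (a small sketch):

    latin_a = "A"          # U+0041 LATIN CAPITAL LETTER A
    cyrillic_a = "\u0410"  # U+0410 CYRILLIC CAPITAL LETTER A
    print(latin_a == cyrillic_a, hex(ord(latin_a)), hex(ord(cyrillic_a)))
    # -> False 0x41 0x410 : the two letters render identically but remain
    # distinct code points, so they sort, case-fold and search differently,
    # which is exactly the distinction unified Han ideographs gave up.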


I think the speaker knows that.

The point is that something is still lost. Maybe this is trivial, maybe not; in making that decision you are making assumptions about why recordings are made and how they will be used. Those assumptions may be correct, but they remain assumptions.


The quantization of amplitude isn't quite that straightforward though.

Usually some form of dithering is involved to transform any quantization errors into smooth noise.

See http://www.audiocheck.net/audiotests_dithering.php

And I'm pretty sure a DAC isn't supposed to output a jagged waveform, no matter how high the resolution.
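A minimal sketch of what the dithering buys you (my own code, not the linked page's): a tone quieter than one 16-bit step simply vanishes without dither, but survives, buried in benign noise, once TPDF dither is added before rounding.

    import numpy as np

    rng = np.random.default_rng(0)
    fs = 48000
    t = np.arange(fs) / fs
    step = 1 / 32767                                # one 16-bit quantization step
    x = 0.4 * step * np.sin(2 * np.pi * 1000 * t)   # 1 kHz tone below the LSB

    tpdf = (rng.random(x.size) - rng.random(x.size)) * step  # triangular-PDF dither

    plain = np.round(x / step) * step               # rounds to digital silence
    dithered = np.round((x + tpdf) / step) * step

    for name, y in (("no dither", plain), ("TPDF dither", dithered)):
        level_1khz = np.abs(np.fft.rfft(y))[1000]   # 1 Hz per bin for a 1 s signal
        print(f"{name:12s} 1 kHz component: {level_1khz:.3e}")
    # Without dither the component is exactly zero (the tone is gone); with
    # dither it is still there, smeared into smooth, signal-independent noise.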


Find me an example of a band-limited, finite-time signal in the real world, please. _Technically_ the example is correct (although the diagram is not).

Any time you digitise a signal, what information you are willing to lose is the first question you ask. The answer is different depending on the signal and what you need it for. That's what is being brought to attention. Not the fact that all the information in the signal anyone would care to represent is representable.
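One concrete way to see that "choose what to lose" step (a sketch with illustrative numbers): content above Nyquist doesn't vanish quietly, it aliases into the band you kept, so the anti-alias filter is an explicit decision to throw the ultrasonics away.

    import numpy as np

    fs = 44100
    t = np.arange(fs) / fs
    ultrasonic = np.sin(2 * np.pi * 30000 * t)   # 30 kHz, above Nyquist (22.05 kHz)

    spectrum = np.abs(np.fft.rfft(ultrasonic))
    alias_hz = int(np.argmax(spectrum))          # 1 Hz per bin for a 1 s signal
    print(f"30 kHz sampled at {fs} Hz shows up at ~{alias_hz} Hz")
    # -> ~14100 Hz: the "lost" band corrupts the audible band instead of
    # disappearing, so the loss has to be chosen (filtered) up front.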




