It is extremely hard to justify 32-bit audio from a technical standpoint, since it requires that every component in your audio chain has better than 144 dB dynamic range. In practice, just about none of the analog components in your audio chain will have that kind of dynamic range, and any analog components in the recording chain are unlikely to have that range either.
The kind of equipment I'd expect to see in order to record that kind of range is where you'd have something like multiple microphones, and switch back and forth between them based on the level of the material you're recording (which would have to be something like actual gunfire or explosions).
I don't know what kind of equipment you'd need to reproduce this dynamic range, but I don't think it's audio equipment.
My guess is that the main real application of this is going to be compression of non-audio data that happens to compress well with FLAC.
In audio production, the relevance of 32 bit audio is that it's using the same 32 bit float representation of the audio that most of the signal processing stages are also using. In contrast, when people talk about 16 or 24 bit audio, they are customarily referring to an integer representation.
In principle, every time you convert from the 32 bit float back to 24 bit integer, there's an opportunity for a careless human to screw up the scaling, and throw away some of the available integer range. Rinse and repeat until you have an audible problem.
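As a rough illustration of that scaling mistake, here is a minimal sketch assuming float samples normalized to [-1.0, 1.0]. The `to_int24` and `bits_used` helpers and the 0.25 gain error are hypothetical, chosen only to show how a careless export throws away integer range.

```python
import math

FULL_SCALE_24 = 2 ** 23 - 1  # largest positive signed 24-bit value


def to_int24(sample: float, gain: float = 1.0) -> int:
    """Quantize a float sample in [-1.0, 1.0] to signed 24-bit, clipping on overflow."""
    scaled = round(sample * gain * FULL_SCALE_24)
    return max(-FULL_SCALE_24 - 1, min(FULL_SCALE_24, scaled))


def bits_used(peak: int) -> float:
    """Approximate number of bits (including sign) that a peak value occupies."""
    return math.log2(abs(peak)) + 1


peak = 1.0
correct = to_int24(peak)              # export at full scale
careless = to_int24(peak, gain=0.25)  # hypothetical mistake: about 12 dB too quiet

print(f"correct export uses  ~{bits_used(correct):.1f} bits")
print(f"careless export uses ~{bits_used(careless):.1f} bits")
```

Each such mistake costs roughly one bit per 6 dB of unused range, and the loss is permanent once the audio has been quantized.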
So this forms the basis for an argument in favour of keeping the audio in a float representation for as much of the production process as can reasonably be achieved.
I don't see any benefit in 32 bit representation for delivery of the finished content, but I suppose that if, as part of the production process, you're transferring the audio online between different sites/people/whatever, then having lossless compression that works without having to convert back to integer might be useful.
EDIT: Just read the thing more carefully, and realised that it's specifically talking about 32 bit int, NOT float. So right now, I can't see much practical use for this.
You're absolutely right insofar as you're speaking about static audio that's already been produced and finished - there is almost no point in storing anything above 24-bit integer as far as dynamic range even for archival purposes.
However, there is a legitimate purpose behind having higher dynamic range for production purposes and sample sources. There are some recording sources that can actually produce 32-bit audio. Plus, you might want to do some processing on the sound that would end up affecting the dynamic range, or otherwise benefit from the increased resolution. One example is nonlinear processing that generates new musical information from the original signal - you can of course just reduce the gain after processing, but you are then sacrificing some of the resolution of the new combined signal, which itself could otherwise be used by further downstream processes. This all happens post-recording, but can still be musically important before getting to the finished product.
This is why DAWs work in 32-bit or 64-bit processing internally, and why many high-quality sample libraries come in 32-bit, especially smaller one-shots. I often convert samples to .flac for space reasons, and have to either skip 32-bit .wavs or reduce them to 24-bit.
> However, there is a legitimate purpose behind having higher dynamic range for production purposes and sample sources.
This is the justification for 24-bit audio... is there a reason why 24 bits is not enough here?
If you're capturing audio sources directly, you'd use something like a 24-bit ADC, which you can find easily enough. The "raw" output of the ADC is 24 bits.
If you're doing intermediate processing in your DAW, then the DAW is using single-precision floats (or possibly double), which cannot be losslessly converted either to 32 bit or 24 bit integers, so how would you choose the right format to store? It seems to me that you'd either store the original floating-point data, or you'd perform some kind of lossy conversion to a high-quality archival format... but if you do that, isn't 24 bits good enough? You're quantizing either way, and at 24 bits, you can have plenty of headroom and noise floor at the same time. Loads, even.
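To make the "cannot be losslessly converted" point concrete, here is a small sketch assuming samples normalized to [-1.0, 1.0). The `as_float32` and `round_trip` helpers are illustrative names, not part of any audio library. A very quiet float32 sample falls below one integer step and comes back as zero, while a loud, "round" value survives.

```python
import struct


def as_float32(x: float) -> float:
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_trip(x: float, bits: int) -> float:
    """Quantize x to a signed integer of the given width, then convert back to float."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale


quiet = as_float32(1e-12)  # tiny but nonzero float32 sample
loud = as_float32(0.75)    # exactly representable in float32 and in integer formats

for bits in (24, 32):
    print(bits, round_trip(quiet, bits) == quiet, round_trip(loud, bits) == loud)
# prints:
# 24 False True
# 32 False True
```

Float samples can also exceed 1.0, which an integer format would have to clip, so the conversion is lossy at both ends of the range.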
There are two justifications. One is recording: 24-bit is the standard in the studio, yes (and more than you need in that context), but 32-bit is increasingly the standard for field recording, where the hardware is capable of it and it provides genuine utility. There you often capture extremely soft and subtle sounds that you want to raise in gain to a more useful level.
Also, once inside the digital world, there are many processes you can perform that add new musical information to the original sound that might be higher in gain but that you want to preserve for downstream processing until you're ready to actually "print" and quantize the final product, at which point, yes, 24-bit will be more than enough.
The latest Zoom field recorders support 32-bit float recording and achieve a wider dynamic range than that (upwards of 210 dB) by having a circuit with two different ADCs.
Also, it's less about the absolute resolution, and more about the ability to boost the gain, often by a lot, while still having a wide and useful dynamic range after the fact.
32-bit float is 24 bits of audio data, and has the unique advantage of being the typical internal processing format of plugins and audio workstations, due to the desire for headroom while processing to avoid artifacts from overflows of various kinds, among other things. (Upsampling is also common before performing calculations that may alias or mirror.)
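The "24 bits of audio data" figure comes from the single-precision significand (23 stored bits plus one implicit bit). A minimal sketch, using a hypothetical `as_float32` helper built on the standard `struct` module, shows that a 24-bit sample keeps its full precision in float32 even after heavy attenuation, while a value needing 25 bits does not fit.

```python
import struct


def as_float32(x: float) -> float:
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Largest positive signed 24-bit sample, normalized to just under 1.0.
sample = (2 ** 23 - 1) / 2 ** 23
print(as_float32(sample) == sample)          # True: fits in the significand

# Attenuate by a power of two (roughly -241 dB): only the exponent changes.
attenuated = sample * 2.0 ** -40
print(as_float32(attenuated) == attenuated)  # True: no precision lost

# An integer that needs 25 bits is rounded when stored as float32.
too_wide = float(2 ** 24 + 1)
print(as_float32(too_wide) == too_wide)      # False
```

This is the sense in which float gives headroom rather than extra resolution: the precision stays at about 24 bits, but it follows the signal level up and down.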
The stacked ADC approach does allow for a wider overall dynamic range, though I would be skeptical that the physical transducers are capable of using it all. The 32-bit float format here appears to be largely about lossless transfer to a DAW, so that further processing is lossless, as we are not converting from int to float and back.
This format could then be said to be useful as a master media format.
The article in the OP is about a FLAC containing a 32-bit int audio stream, however, which appears less useful and not at all related to 32-bit float.
Note that 144 dB of range is what you get from 24 bits per sample. 32 bits gets you another 48 dB.
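Those figures follow from the usual rule of thumb of about 6.02 dB per bit for integer samples. A quick sketch (the `dynamic_range_db` name is just for illustration, and the formula ignores dither and the small extra term sometimes quoted for sine-wave SNR):

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of an N-bit integer sample: 20 * log10(2**N)."""
    return 20 * math.log10(2 ** bits)


for bits in (16, 24, 32):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
# prints:
# 16-bit: 96.3 dB
# 24-bit: 144.5 dB
# 32-bit: 192.7 dB
```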
Some of your ADC/DAC chain can reasonably claim 125dB range. Some amplifiers can claim 17 or even 18 bits above their noise floor - 108dB.
No full-spectrum microphones, headphones or speakers can reasonably claim 125dB without distortion, but if they did you would still want to limit your exposure to "never". Long-term damage begins with long-term exposure under 96dB.
"Unprecedented fidelity and detail, 130 dB dynamic range"
I've just realized that the product page I linked is extremely unusual for stating a dynamic range value.
However this makes sense if you read the papers covering Neumann's dual ADC and pre design of which they were justifiably proud. System D was a mid nineties introduction.
One would only have the SNR's worth of content inside those 130 dB; I think that was the point that person was trying to make. Real world physical transducers will have physical limitations of some kind, not all of which must be simultaneously surpassed to encounter said limitation.
People will use microphones to record audio like gunshots and explosions for use as sound effects. The humble snare drum will produce loads of dBs too, and is usually close-miked. You then take the same microphone and use it to record something much quieter. You end up with microphones that definitely do reasonably claim >125 dB range. The TLM102 claims something like 130 dB. How much you care about distortion will depend on the situation.
My thought is that it would be very hard to get that 130 dB range all the way from an audio source to your ADC, and it would be very hard to get it back out all the way to speakers again.
>The TLM102 claims something like 130 dB. How much you care about distortion will depend on the situation.
Neumann makes no such claims. They do claim a max SPL of 144 dB. But the TLM is 10 dB less, and this is prior to an ADC of variable ability. Distortion isn't a ceteris paribus value for signal to noise calculations.
> Due to its enormous dynamic range of 132 dB [...]
Not really interested in dissecting what this means, or how to reconcile it with the SNR numbers. Just saying that Neumann does, in fact, make this claim.
Not sure what point you're making about distortion.
I could see it as part of a pipeline/workflow where having that dynamic range lets you not worry about losing meaningful information through everything, even if you're going to compress it and take it back to 16-bit CD quality at the end. Being able to losslessly store the results of each step in a FLAC file should be better than raw data, since it'll be compressed and easier to manage. That said, at the edges of that I totally agree even 24-bit can be questionable, since a lot of the analog side doesn't have a noise floor that would let it be meaningfully used.
> That said, at the edges of that I totally agree even 24-bit can be questionable there since a lot of the analog side doesn't have a noise floor that would let it be meaningfully used.
It's not that hard to beat 16-bit, which is 96 dB, using easy-to-find, off-the-shelf equipment. One example scenario is that you are recording something but you don't have a precise idea of how loud it will be ahead of time, so you record at low levels and rely on 24-bit capture to give you headroom above and noise floor below. Trying to capture at 16-bit can, in practice, be annoying and difficult because it is more likely that you will ruin takes by setting the gain wrong.
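To put rough numbers on that scenario, here is a sketch assuming takes are recorded with peaks around -30 dBFS (an illustrative figure, not one from the thread). It only considers the theoretical quantization floor and ignores analog noise and dither; `range_below_peak_db` is a hypothetical helper.

```python
import math


def range_below_peak_db(bits: int, peak_dbfs: float) -> float:
    """Theoretical distance from the signal peak down to the quantization floor.

    peak_dbfs is the peak level relative to full scale (0 or negative).
    """
    return 20 * math.log10(2 ** bits) + peak_dbfs


PEAK_DBFS = -30.0  # assumed conservative recording level

for bits in (16, 24):
    print(f"{bits}-bit at {PEAK_DBFS} dBFS: {range_below_peak_db(bits, PEAK_DBFS):.1f} dB below peak")
# prints:
# 16-bit at -30.0 dBFS: 66.3 dB below peak
# 24-bit at -30.0 dBFS: 114.5 dB below peak
```

Under these assumptions, a cautious 24-bit take still has more usable range than a perfectly gained 16-bit one, which is why guessing the gain wrong is far less costly at 24 bits.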
The 55% compression ratio (or whatever the case may be) seems much more useful at the end (where it nearly doubles how long a consumer can listen before swapping media) than along the way (where it nearly doubles how much raw material a studio can capture before swapping media).