I am 100% abundantly positive that signal processing code could do this, and in ...

crazygringo · on May 29, 2024

> Now, scaling it down to where you can't hear it anymore may make it harder to believe, but the same code will pick it up.

That's where you're wrong. FFT frequency bands are surprisingly wide. You can make them narrower but with the tradeoff of losing temporal resolution. And it gets worse the lower the frequency gets.
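A rough illustration of that tradeoff, as a sketch only: it assumes a plain DFT with no window function and a 44.1 kHz sample rate, neither of which is stated in the thread, and `fft_bin_width_hz` is a hypothetical helper name. The bin spacing is the sample rate divided by the number of samples, which is the reciprocal of the window duration.

```python
def fft_bin_width_hz(sample_rate_hz: float, n_samples: int) -> float:
    """Spacing between adjacent DFT bins: sample_rate / N = 1 / window duration."""
    return sample_rate_hz / n_samples


sample_rate = 44_100  # assumed sample rate, not taken from the thread

for seconds in (0.1, 1.0, 5.0):
    n = int(sample_rate * seconds)
    width = fft_bin_width_hz(sample_rate, n)
    # A longer window gives narrower bins but a coarser view of when things happen.
    print(f"{seconds} s window: {width:.3f} Hz per bin")
```

By this measure a 5-second window gives bins 0.2 Hz apart, so a component near 0.5 Hz sits only a couple of bins above DC.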

There is absolutely no way you're going to detect a near-0.555 Hz effect from a few seconds of audio and determine whether it's off-frequency by 0.1% or even 1%.

Like I said, sure if you're dealing with a pure sine wave. But not a complex signal using FFT.

Or to put it another way -- a 1,000 Hz signal? Absolutely. But a 0.5 Hz signal? Absolutely not.

rerdavies · on May 29, 2024

There are various DFT-based algorithms for high-precision pitch detection.

Two common algorithms are cepstrum analysis and autocorrelation, which involve taking the DFT or inverse DFT of a function of the magnitude of the signal's DFT (the log magnitude for the cepstrum, the squared magnitude for autocorrelation).

Find the peaks of the result, then fit a cubic polynomial to the peak and the bins on either side, and find the maximum of the polynomial. The position at which the maximum occurs determines the inverse frequency, which can then be converted to pitch.

Both algorithms produce results that are accurate to within 0.1 cents. You do have to tweak buffer sizes and windowing depending on what pitch ranges you are interested in, and do some post-filtering to skip over transients.

The temporal resolution problem is solved by calculating the result on overlapping frames.
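The peak-picking and interpolation steps above might look roughly like the following. This is a simplified sketch, not the commenter's implementation: it computes the autocorrelation directly in the time domain rather than via the DFT, it fits a parabola through three points in place of the cubic fit, and it leaves out windowing, overlapping frames and transient filtering. The function name, the search range and the test tone are all assumptions made for illustration.

```python
import math


def estimate_pitch_hz(samples, sample_rate, f_min=50.0, f_max=1000.0):
    """Estimate a fundamental from the autocorrelation peak, refined by a parabola."""
    n = len(samples)
    lag_min = max(int(sample_rate / f_max), 2)
    lag_max = min(int(sample_rate / f_min), n - 2)

    def acf(lag):
        # Biased (unnormalised) autocorrelation: fewer terms at longer lags,
        # which favours the first period over its multiples.
        return sum(samples[i] * samples[i + lag] for i in range(n - lag))

    values = {lag: acf(lag) for lag in range(lag_min - 1, lag_max + 2)}
    best = max(range(lag_min, lag_max + 1), key=lambda lag: values[lag])

    # Parabola through the peak and its two neighbours; the vertex gives a
    # fractional-lag correction.
    y0, y1, y2 = values[best - 1], values[best], values[best + 1]
    denom = y0 - 2.0 * y1 + y2
    offset = 0.5 * (y0 - y2) / denom if denom != 0 else 0.0

    # The refined lag is the period in samples (the "inverse frequency").
    return sample_rate / (best + offset)


# Hypothetical test tone: a clean 220 Hz sine sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 220 * i / sr) for i in range(2048)]
print(f"estimated pitch: {estimate_pitch_hz(tone, sr):.2f} Hz")
```

On a clean sine this lands close to the true pitch; it makes no claim about the 0.1-cent figure, which would depend on the buffer sizes, windowing and filtering mentioned above.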

crazygringo · on May 30, 2024

Sure but the problem remains: you can't do that with only a few oscillations of a weak signal against a loud noisy complex signal.

You simply can't detect an inaudible-to-human-ears 0.5 Hz signal from 3 or 5 seconds of complex normal-volume audio, down to the accuracy of cents, much less 0.1 cents.

As I said above: a 1,000 Hz signal? Absolutely. But a 0.5 Hz signal? Absolutely not. There just isn't enough signal for that level of precision. No matter what tool you're using.

rerdavies · on June 3, 2024

But you could easily detect frequency modulation of a 220 Hz signal by a 0.5 Hz sine wave, which would have sidebands about 4 cents away from the carrier. This is conceptually similar to heterodyning. Wow in the source material ends up creating sidebands of the source material in a frequency range that is more amenable to signal analysis. Whether this works or not depends on how much wow an actual record player has. But a back-of-the-envelope calculation seems to suggest that very tiny amounts of wow should create detectable sidebands.
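The 4-cent figure can be checked with the standard cents formula, 1200 times the base-2 logarithm of the frequency ratio. A minimal sketch, with `cents_between` as a hypothetical helper name:

```python
import math


def cents_between(f_ref_hz: float, f_hz: float) -> float:
    """Interval from f_ref_hz to f_hz in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(f_hz / f_ref_hz)


carrier_hz = 220.0  # the 220 Hz signal from the comment
wow_hz = 0.5        # the 0.5 Hz modulating sine wave

upper = cents_between(carrier_hz, carrier_hz + wow_hz)
lower = cents_between(carrier_hz, carrier_hz - wow_hz)
print(f"upper sideband: {upper:+.2f} cents, lower sideband: {lower:+.2f} cents")
```

Each first-order sideband comes out a little under 4 cents from the carrier.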

My suspicion is that OP assumed that the source material was accurately tuned to A=440, which is not a safe assumption, but is probably true for any source material that has a keyboard instrument, which will almost always be tuned to A=440. Calculate the reference pitch for the source material, and you can tell how much the speed of the turntable is off. (And, as others have pointed out, this may be completely buggered by common mastering practices, and by Original Instrument recordings of classical music using pitch references other than A=440.)
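Under that A=440 assumption, the speed error is just the ratio of the measured reference pitch to 440 Hz, since playing a record faster raises every pitch by the same factor. A sketch, where the function name and the example reading of 444.4 Hz are made up for illustration:

```python
def speed_error_percent(measured_a4_hz: float, reference_a4_hz: float = 440.0) -> float:
    """Turntable speed error implied by a measured reference pitch.

    Assumes the recording really was tuned to reference_a4_hz, which the
    comment above notes is not always a safe assumption.
    """
    return (measured_a4_hz / reference_a4_hz - 1.0) * 100.0


# Hypothetical measurement: the recording's A4 is detected at 444.4 Hz.
print(f"turntable speed error: {speed_error_percent(444.4):+.2f}%")
```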

But it doesn't seem implausible that you could use analysis of wow in the source signal too.

mlyle · on May 29, 2024

I agree that FFT is not the easiest tool to use for the job. If I were trying to solve this problem, I'd use autocorrelation.

But it still sounds very challenging: there are multiple sources of periodic frequency change both in recordings and in playback mechanisms.