Actually, there are two things that stick out to me in the paper.
1. The "low" FAR (False Accept Rate) is unbelievably high at 0.01%.
2. The "partial prints" are described as single or "mixed" minutiae.
The FAR is 1-2 orders of magnitude off of even cheap mobile device authentication.
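To put rough numbers on that gap, here is a quick sketch. The 0.01% figure is the one quoted above; the "one to two orders of magnitude" targets are just that claim taken literally, not measured values for any particular device, and the five-attempt limit is an assumption chosen only for illustration.

```python
# FAR quoted above, as a fraction (0.01% = 1e-4).
paper_far = 0.01 / 100

# "1-2 orders of magnitude" stricter, taken literally. Assumption: these are
# illustrative targets, not figures for any specific phone.
stricter_fars = [paper_far / 10, paper_far / 100]


def chance_of_false_accept(far, attempts):
    """Probability that at least one of `attempts` impostor tries is accepted,
    assuming the tries are independent (a simplification)."""
    return 1 - (1 - far) ** attempts


ATTEMPTS = 5  # hypothetical lockout limit, for illustration only

for far in [paper_far] + stricter_fars:
    p = chance_of_false_accept(far, ATTEMPTS)
    print(f"FAR {far:.0e}: about 1 in {round(1 / far):,} per attempt, "
          f"{p:.4%} over {ATTEMPTS} attempts")
```

The point is only the scale: at 0.01% an impostor gets in roughly once per ten thousand tries, while a matcher one or two orders of magnitude stricter pushes that to once per hundred thousand or per million.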
The described size of the partial prints implies that the relative location of partials is not extractable.
Since most fingerprint matchers rely on multiple (3-4) minutiae at a minimum, along with their relative locations and ridge orientation and pitch correlations, it seems like this doesn't provide the necessary information. More importantly, even with more information it can't reconstruct those relative locations, because it can't resolve the symmetries (certainly not without knowing the direction of motion for the swipes and the orientation of the finger that go with those sounds). That requires other out-of-band information.
It's interesting work, but there's probably a reason that you don't see fingerprint matchers with decent FAR/FRR using only the microphone on a mobile device and some software. There are a $B reasons to develop that every year, and yet there hasn't been one developed for 15-20 years.
Why wouldn't it know the direction of motion for the swipes and the orientation of the finger? Any mobile app has access to the touchscreen input which provides exactly that information.
Unless you're playing fruit ninja there's really quite little control over the speed and direction of swipes... many people swipe very little at all or with a different finger than they enroll in biometrics or type on the screen. Some even swipe with multiple fingers.
Then there's the fact that there are more than 20 minutiae per finger, so "orientation" here means the angles and which portion of the finger is in contact with the touch surface... and thus which of those minutiae (swirl, end, fork, etc.) are involved in generating any sound.
What I hear in your comment is that if you want to steal people's fingerprints, you'd have to target-advertise them a clone of fruit ninja, possibly designed so that certain specific angles of fingers are clearly optimal for doing well at the game.
If you want to steal a single print, you would need a moderately long game and a pliable player, along with wide-angle synchronized video that imaged hand/finger orientation. If the video isn't good, or they use different fingers intermittently, or only one orientation, or they have a callus on their finger, or they have very fine/coarse ridges, too many forks/ends, an irregular accidental whorl or arch, or there's noise, etc... you won't get a single print with high enough resolution to match. And that still may not be a print they use for authentication.
Most people use their forefinger or a thumb to swipe, and their thumbs to match/type, and multiple fingers or both thumbs to scroll/zoom.
More interesting is that if one had full knowledge of a person's prints, then perhaps with microphone, touch tracing, and wide-angle video, one could compare expected vs measured sounds. This is a one-to-one confirmation rather than a one-to-many match. Perhaps it could prevent long-term usage after one person had unlocked their device and another was using it.
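The difference between one-to-one confirmation and a one-to-many match can be made concrete with a little arithmetic. This is a sketch, assuming every comparison is an independent trial with the same FAR (a simplification), and the gallery sizes are hypothetical:

```python
def false_match_probability(far, gallery_size):
    """Chance that a probe falsely matches at least one entry in a gallery of
    `gallery_size` prints, assuming independent comparisons at a fixed FAR."""
    return 1 - (1 - far) ** gallery_size


FAR = 1e-4  # the 0.01% figure discussed earlier in the thread

# gallery_size == 1 is the one-to-one case; larger values are one-to-many
# searches against hypothetical database sizes.
for n in (1, 1_000, 1_000_000):
    p = false_match_probability(FAR, n)
    print(f"gallery of {n:>9,}: chance of at least one false match = {p:.4f}")
```

Under those assumptions, a FAR that is tolerable when confirming one known person quickly becomes useless when searching a large database, which is why the one-to-one use looks like the more plausible one.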
As far as I can tell, they don't actually draw a fingerprint from the point cloud they form, yet generally I agree.
Some seem to be saying it's not that bad from the perspective of a thief looking to use a personal credit card or unlock a phone. However, my main concern is with large nation-state groups that have access to pre-existing fingerprint database files.
There's something here that feels like the NSA, FBI, FSO/Spetssvyaz, 3/4PLA, Unit 8200, GCHQ, BfV, DGSE, CSE, TERM, and ISI would probably all have their "figurative" ears perk up.