It's not guessing if the form is known and you can read the information directly.
This is a common scenario at many banks. You can expect nearly perfect metadata for anything pushed into their document storage system within the last decade.
Oh yeah, if the form is known and standardized, everything is a lot easier.
But we work with banks on our side, and one of the most common scenarios is customers uploading financials, bills, and statements from thousands of different providers. In that case, it's impossible to know every format in advance.