You asked "why too don't we scoff at a data file without units". You just answered your own question: because it's an "agreed-upon format."
All of my response was to point out that an Excel spreadsheet, CSV file, etc. can equally be considered an "agreed-upon format" by those who use it, so don't need explicit units.
My original question was a simple one. Why should units be a requirement for most datatypes?
I know all of the reasons for why it's useful. I don't understand why it should be a requirement.
The PDB format is not an easy format to understand. Unit conversion is one of the least of the problems in using it outside of its field. Determining bond assignments is much harder, and bond type assignment harder still. In fact, I have a hard time figuring out an example where an explicit "this is in unit X" would make things appreciably easier, as compared to near useless data taking up space.
Could you give an example of how someone could start to use it in another field, easily, where they couldn't now? I can only see it occurring by completely replacing the format, since adding an "A" after each coordinate, or a comment at the top that the coordinates are in angstroms, can't be what you mean. (Nor would including the PDB spec as a comment in each record be what you mean either - though it would be self-documenting!)
For that matter, the X-ray resolution field in a PDB record contains significant digits, so "2.0Å" and "2.00Å" mean different things. The ontology of units is not easy. 2.00Å is 2.00E-10m, not 2E-10m.
In any case, I deal with a lot of unitless numbers as well: pH, molarity, number of atoms and bonds, number of rotatable bonds, ratio between elongation and fixed elongation, etc. The ontology of values is not easy, and a required SI-unit system looks much more like it would get in the way than be useful.
All of my response was to point out that an Excel spreadsheet, CSV file, etc. can equally be considered an "agreed-upon format" by those who use it, so don't need explicit units.
My original question was a simple one. Why should units be a requirement for most datatypes?
I know all of the reasons for why it's useful. I don't understand why it should be a requirement.
The PDB format is not an easy format to understand. Unit conversion is one of the least of the problems in using it outside of its field. Determining bond assignments is much harder, and bond type assignment harder still. In fact, I have a hard time figuring out an example where an explicit "this is in unit X" would make things appreciably easier, as compared to near useless data taking up space.
Could you give an example of how someone could start to use it in another field, easily, where they couldn't now? I can only see it occurring by completely replacing the format, since adding an "A" after each coordinate, or a comment at the top that the coordinates are in angstroms, can't be what you mean. (Nor would including the PDB spec as a comment in each record be what you mean either - though it would be self-documenting!)
For that matter, the X-ray resolution field in a PDB record contains significant digits, so "2.0Å" and "2.00Å" mean different things. The ontology of units is not easy. 2.00Å is 2.00E-10m, not 2E-10m.
In any case, I deal with a lot of unitless numbers as well: pH, molarity, number of atoms and bonds, number of rotatable bonds, ratio between elongation and fixed elongation, etc. The ontology of values is not easy, and a required SI-unit system looks much more like it would get in the way than be useful.