Many of those formats are made by archival institutions, and also follow standards such as ISO certifications.
Even librarians (my husband being head librarian for a city library) shake their heads at this, and they're trained in databases, archives, and related materials as a matter of profession, some to a higher degree than some CS students.
Are you (or your husband) really trying to tell me something like this[1] is an archive or a library?
That is one of the more, if not most, flagrant demonstrations of the Internet Archive not understanding what an archive is. There's more where that came from, including stuff like their one:many digital book lending program.
The law stipulates what an archive/library is and how they are protected from certain copyright prosecutions. The Internet Archive does not abide by those stipulations, and thus I would absolutely not want my tax dollars sent their way. If they want to engage in flagrant piracy and political activism, they can do so on their own dime.
A proper archive deals solely in storing and preserving works.
Note that redistribution isn't among those activities, and as such copyright doesn't apply, because there is no copying for copyright to be concerned with. The exception, of course, is the copying that must occur as a practical matter of archiving, which is protected by the exceptions I cited.
This is why I said archives may not be directly available to the public: once you start redistributing, you are beholden to copyright regulations and to the demands of rightsholders. The very fact we can access that page freely is proof that the Internet Archive is not an archive.
Libraries that aren't private (private libraries being a form of archive), such as public libraries and institutions like the Library of Congress, operate within copyright law, either by signing contracts with rightsholders permitting such redistribution or by adhering to the exceptions that copyright law provides.
In other words: No, the Internet Archive is not an archive. Either they don't know what an archive is or, more likely, they are disingenuously calling themselves an archive to try to skirt the law, which is unacceptable, particularly if someone is proposing funding them with public monies.
For the record, I have no personal qualms with the Internet Archive dealing in pirated goods if they are honest about it. They will still reap the books thrown at them anyway, but honesty is a virtue. I do have a problem with them claiming to be an archive or even a library and serving a public good: Hell no they are not, fucking liars they are.