Here’s an issue I ran into recently where digitization fell flat.
I wanted images of a particular issue of a regional newspaper from the 1940s. I am in the US state where it was published and it was one of the top three newspapers at the time, so (I thought) no problem. The State library even sent me a copy of its digital archive of that issue.
But the scanning had been done from a bound copy of the newspaper, using a low-resolution scanner (state-of-the-art at the time). Two or three words of each line (right side of right page, left side of the left-facing page) were missing…not just distorted, missing. I contacted them to get the original and they said they didn’t have the original physical copies of those issues from the 1940s, just those scans. (Library of Congress says the State library has the only complete physical collection.)
So I am dealing with lacunae (you know, the stuff archeologists wrestle with when text is missing from 2,000-year-old burnt papyrus) in newspapers that are less than 100 years old. With 2,000-year-old burnt papyrus, new tech can maybe fix the problem. But there is no original source material to work with for this 1940s newspaper.
In addition, any early scanned photos from that issue run the gamut from bad to totally unusable. No future AI would be able to reliably reproduce anything from many of the photo scans…there is simply too little signal left. A trivial rescan issue with today’s scanning tech, but what can you do when the original is unavailable? The content in question relates to art and artists, so the published graphic images are important.
Early digitizations often don’t meet the needs of researchers. If the physical source material was tossed and unavailable for rescan, no matter how recently published, some text fragments and images are more “lost” than the burnt papyri of Egypt.
Physical archived copies of core documents are still important, even/especially bulky “ephemera”.