
For archival, you need to worry about the long-term ability to read the format. Of the three, JPEG wins that fight easily.


RAW is a very old format. Apple's ProRAW is backwards compatible and has some additional metadata attached (IIRC ProRAW attaches the image pipeline the iPhone would have used, so that image software can recover it and produce the same image the iPhone would have produced after running that pipeline). RAW is older (1988) than JPEG (1992), because RAW is largely based on TIFF in its mid-1980s state (and any vendor-specific variation is usually TIFF-like too). The latest RAW standard (TIFF/EP) is from 2001.

So in terms of long-term ability to read... RAW wins. Various versions aren't as old, but JPEG has had its fair share of extensions too.

For long-term readability, I don't think JPEG wins from another standpoint either: bitrot. It'll happen eventually. Even if you use ZFS, you will eventually lose some bits, maybe a sector of data. JPEG doesn't like losing parts of the file.

On the other hand, a TIFF file can be recovered from bitrot, if you don't mind losing a part of the image. Because there is no compression, losing a sector of data amounts to losing only the bits in that sector. The only sensitive part of the file would be the header, which isn't terribly complex and can be retyped in Notepad if need be.
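To make the sector-loss argument concrete, here is a rough sketch, not a real TIFF or JPEG test. It assumes a 512-byte sector, a synthetic 64x64 raster, and uses zlib as a stand-in for "some compressed stream"; `zero_sector` and `matching_bytes` are made-up helper names. A real JPEG decoder may behave differently (it can sometimes resync and show a partly garbled picture).

```python
import random
import zlib

SECTOR = 512  # assumed sector size in bytes


def zero_sector(data: bytes, index: int, sector: int = SECTOR) -> bytes:
    """Return a copy of `data` with one whole sector overwritten by zeros."""
    start = index * sector
    end = min(start + sector, len(data))
    return data[:start] + bytes(end - start) + data[end:]


def matching_bytes(a: bytes, b: bytes) -> int:
    """Count positions where two equal-length byte strings agree."""
    return sum(x == y for x, y in zip(a, b))


# Synthetic 64x64 8-bit "image"; values 1..16 so a zeroed byte is always wrong.
rng = random.Random(0)
pixels = bytes(rng.randint(1, 16) for _ in range(64 * 64))

# Uncompressed raster: only the bytes in the lost sector are gone.
damaged_raster = zero_sector(pixels, 2)
intact = matching_bytes(pixels, damaged_raster)

# Compressed stream: the same kind of damage in the middle of the stream.
compressed = zlib.compress(pixels)
damaged_stream = zero_sector(compressed, 1)
try:
    zlib.decompress(damaged_stream)
    stream_readable = True
except zlib.error:
    stream_readable = False

print(f"raster: {intact} of {len(pixels)} bytes intact")
print(f"compressed stream readable after damage: {stream_readable}")
```

The raster loses exactly one sector's worth of pixels, while the one-shot decompressor rejects the damaged stream outright.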


You've obviously had a very different experience with TIFF or RAW than I have.

Every vendor and application has weird extensions or behaviors with the format, which means that only custom-built support for the software that produced the file actually works well.

Every piece of software on the planet seems to support JPEG out of the box.


“raw” is a generic term that refers to many different raw formats, most of them compressed.


If you care about bit rot then add some parity.

Or even store multiple copies; it's still smaller!


Multiple copies of a smaller file are still susceptible to bit rot. In fact, RAW is still more resistant because it can still be read even if multiple sectors are corrupted, while many small files might each individually be toast.

(And btw, parity doesn't protect you from bitrot forever, only for like a decade or two)


If you have three copies then you can automatically heal from any single sector going bad, and semi-automatically heal from a sector going bad in the same spot in two copies at once. Or you could make five copies even.
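As a minimal sketch of what "automatically heal" could look like, here is a per-sector majority vote over equal-length copies. The `heal` function and the 512-byte sector size are assumptions for illustration, not any particular tool's behavior.

```python
from collections import Counter


def heal(copies: list[bytes], sector: int = 512) -> bytes:
    """Rebuild a file by per-sector majority vote across equal-length copies."""
    if not copies or len({len(c) for c in copies}) != 1:
        raise ValueError("need at least one copy, all of equal length")
    out = bytearray()
    for start in range(0, len(copies[0]), sector):
        chunks = [c[start:start + sector] for c in copies]
        winner, votes = Counter(chunks).most_common(1)[0]
        if votes <= len(copies) // 2:
            # No strict majority: this is the "semi-automatic" case.
            raise ValueError(f"no majority for sector at offset {start}")
        out += winner
    return bytes(out)


def corrupt(data: bytes, index: int, fill: int, sector: int = 512) -> bytes:
    """Overwrite one sector of `data` with the byte value `fill`."""
    start = index * sector
    return data[:start] + bytes([fill]) * sector + data[start + sector:]


original = bytes(range(256)) * 8  # 2048 bytes, i.e. four 512-byte sectors

# Each copy has a different bad sector, so two good votes always remain.
copies = [corrupt(original, 0, 0xAA), corrupt(original, 1, 0xBB), corrupt(original, 3, 0xCC)]
healed = heal(copies)
```

If two copies go bad in the same spot with different garbage, the vote has no majority and `heal` raises, which is where a human (or a checksum) has to step in. If they go bad identically, a plain vote would pick the wrong data, so per-sector checksums would be a sensible addition.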

> (And btw, parity doesn't protect you from bitrot forever, only for like a decade or two)

Based on what settings and what environment?

By the time you're losing a large percentage of your sectors, you're probably losing everything regardless of format. You don't use file formats to protect from entire disks or tapes failing.

Also, if you set up paranoid levels of parity you can recover a perfect image even when a RAW file would be covered in gaps and noise, while still being a lot smaller.


And at what paranoia level would you rather lose data to lossy compression than keep the uncompressed original and just spend a dollar more on storage?


Let's assume I'm willing to spend the dollar more on storage for both options.

I can either store one copy of a RAW, or I can store an unholy ball of parity that's exactly the same size.

The unholy ball of parity can lose up to 90% of the data and still be completely recovered, giving you a very high quality image, but if you lose more than 90% you get nothing.

A RAW image degrades more and more as you lose data, and if you lose 90% it's going to be useless anyway.

I'll definitely pick the compression+parity option.
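The arithmetic behind the 90% figure, as a back-of-the-envelope sketch: it assumes an ideal erasure code (any `data_shards` of the `total_shards` are enough to rebuild everything) and a compressed file about one tenth the size of the RAW. Both the function name and the 10-of-100 split are illustrative assumptions, not measurements.

```python
def loss_tolerance(total_shards: int, data_shards: int) -> float:
    """Fraction of shards that may be lost with full recovery still possible.

    Assumes an ideal erasure code: any `data_shards` surviving shards
    out of `total_shards` are enough to rebuild the original data.
    """
    if not 0 < data_shards <= total_shards:
        raise ValueError("need 0 < data_shards <= total_shards")
    return (total_shards - data_shards) / total_shards


# Compressed image fills 10 shards; parity pads the blob out to the
# RAW's 100 shards, so the total storage cost is the same.
tolerance = loss_tolerance(total_shards=100, data_shards=10)
print(f"recoverable with up to {tolerance:.0%} of shards lost")
```

Past that threshold recovery fails completely, which is the cliff described above; the RAW instead loses pixels roughly in proportion to the damage.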



