Fujifilm Created a Magnetic Tape That Can Store 580 Terabytes (petapixel.com)
425 points by elorant on Dec 28, 2020 | 326 comments



Aw, c'mon chaps! Can we just admire the tech for a second?

> Data on the tape is stored at a record-breaking density of 317 gigabytes per square inch...

> When tape is being read it is streamed over the head at a speed of about 15 km/h and with our new servo technologies we are still able to position the tape head with an accuracy that is about 1.5 times the width of a DNA molecule.


Loveliest part was the part right before that:

“Just let me geek out for a second”

Absolutely IBM storage person, the more geeking out the better!


[flagged]


Data storage is already so cheap thanks to HDDs that a poor African country can afford to spy on its population. This will hardly change anything.

If we want to do something about it we have to revolt in the streets.


The tape may well be the better scenario.

With a pile of random-access devices, the time and cost of a single record retrieval is low, though the cost of bulk retrieval can get high.

All the oppressors need is a half-decade window, maybe a decade, to be pretty much golden as far as totalitarianism goes.

The tape will definitely hold the data, and then some apparently.

But access costs go up.

The drives seem to encourage indiscriminate targeting.

Tapes would at least favor some priority.

The smarter bears will use both!

You are not wrong about what it will take.


In a world where we hold people accountable for stupid things they did or said when they were teenagers, I can see a lot of blackmail value in retaining data for a much longer period of time.


It is all bad, agreed. Was just musing over kinds of evil more than anything else.


HDDs are fine for short term storage, but they are too unreliable when you want to keep the data for many years, possibly for a lifetime.

Unfortunately, currently there is no other commercially available method of archival storage, except magnetic tapes. Optical storage has a too low density to be able to compete with magnetic tapes.


That presumes you're putting the data in cold storage somewhere. For data that's being kept accessible, the reliability of a hard drive doesn't matter. It's transferred from RAID to RAID over time. And spy data is probably in warm storage.


Nope.


Help me out here; which part are you saying nope to?


Archival...

Imagine saving BILLIONS of data points (faces, interactions, videos, text, etc.) per person on hot or warm storage? NOPE

Life histories will be on tape in the archives of such systems...


Except for videos, that doesn't take up a lot of space. The oppressive part is tracking everywhere you go and everything you say, which fits easily into warm storage.

For example, storing your position every 20 seconds might take 10KB a day. You'll collect 15 million data points in a decade, but each one is only a few bytes.
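A back-of-the-envelope check of those numbers (the ~2.4 bytes per sample is an assumed compact encoding, not a figure from the comment):

```python
# Back-of-the-envelope check of the parent's numbers. The ~2.4 bytes
# per sample is an assumed compact (delta-encoded) position record.
SECONDS_PER_DAY = 24 * 60 * 60
samples_per_day = SECONDS_PER_DAY // 20           # one sample every 20 s
bytes_per_sample = 2.4                            # assumption
daily_bytes = samples_per_day * bytes_per_sample  # ~10 KB/day
decade_samples = samples_per_day * 365 * 10       # ~15.8 million points

print(f"{samples_per_day} samples/day, ~{daily_bytes / 1024:.1f} KB/day")
print(f"~{decade_samples / 1e6:.1f} million samples per decade")
```

So the comment's figures hold up: roughly 10 KB a day and about 16 million samples over a decade.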


The problem I see: some entity will develop an algorithm that extrapolates between frames using correlated data from yet other users' info.

So you have subject A at frame 0, then subject A at frame 100, but no data on subject A in between. You do, however, have data from subjects B through J, and can track subject A through them...

Extrapolate, and you have the "zoom and enhance" of surveillance...


Think of the scale of the infrastructure cost... (the lifetime of a single person costs N×backups worth of tapes...)

Edit: but i like your discourse


Can someone explain the Utah reference?


From Wikipedia[1]:

> The Utah Data Center (UDC), also known as the Intelligence Community Comprehensive National Cybersecurity Initiative Data Center, is a data storage facility for the United States Intelligence Community that is designed to store data estimated to be on the order of exabytes or larger. Its purpose is to support the Comprehensive National Cybersecurity Initiative (CNCI), though its precise mission is classified. The National Security Agency (NSA) leads operations at the facility as the executive agent for the Director of National Intelligence.

[1] https://en.wikipedia.org/wiki/Utah_Data_Center


Absolutely. I had no idea tape was being pushed to these levels!

Crazy good tech accomplishment right there.

In a way, I am somehow comforted. Tape remaining relevant kind of dropped off my radar. Good to see.

317 Gb per square inch AND a 15 km/h transport speed? That is nuts fast.

Wonder how robust it really is? Of course at those densities and potential transfer rates there is ample room for error recovery.


It has never really entered the public consciousness, since it's not a consumer-facing technology, but tape development has continued apace. LTO-9 should be hitting the market very soon and supports up to 45TB per tape (compressed capacity, 18TB raw).

Not quite sure where IBM's numbers here come from; their previous numbers don't match up to the progression of the LTO tape series' capacity. Maybe they are citing "research numbers" that they can achieve in a lab but aren't production-ready yet. I would certainly assume they are citing "compressed" data figures there.

But certainly tape has continued to progress much faster than most people would have imagined. Big tape libraries are still a thing in certain environments and they work very well, there is no better solution for bulk cold storage.


> LTO-9 should be hitting the market very soon and supports up to 45TB per tape (compressed capacity, 18TB raw).

LTO is cool and all but is the "compressed capacity" number really something to repeat with a straight face? The tape holds 18TB, we don't need to pretend it's anything else.

> But certainly tape has continued to progress much faster than most people would have imagined.

Mostly it has. But I'm somewhat worried about the future after the sudden late-game announcement that LTO-9 would have a 50% capacity improvement instead of the usual doubling.


> LTO is cool and all but is the "compressed capacity" number really something to repeat with a straight face?

No, but it's been the standard in the tape industry for decades. It probably dates back to the first tape controllers that had built-in compression (so compression didn't tax the main CPU).


I am trying to figure out what would make compression done by the tape controller favorable. Maybe it somehow makes recovery more fault-tolerant in the long run, because the controller knows about the intricacies of the medium? Just guessing; I know nothing about tape storage.


My first LTO drive installation was on a multi user SGI CAD application server. This system did all the compute and data management for roughly 30 users. I/O streaming was easy and efficient.

IRIX allowed for live file system backups and the drive doing compression meant all that happening with negligible user performance impact.

Was literally set it and forget it, aside from tape rotation into off site storage.

Doing that compression on the host would have had an impact.

We don't do multi user app serving much today, so maybe a smart drive has less benefit. But it mattered then. 2000's era.


The announcement [1] is of lab results, so yes "research numbers". A commercial product might be years out (if ever developed). This is not meant to belittle the achievement (which is awesome), but to clarify what has been done and what to expect.

[1] https://www.fujifilm.com/news/n201216_01.html


Amazon Glacier is tape-based, so anyone certainly can get access to tape for backup purposes.


That transport speed has to just be for rewinding and fast-forwarding. If the terabyte you want is the 580th terabyte, you need a quick way to skip past terabytes 1 through 579.

The hardware is not going to read at 317 Gb/in² density at bicycle speeds. :)
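A rough end-to-end seek estimate, assuming the ~1255 m tape length cited elsewhere in the thread:

```python
# Rough end-to-end traversal time at the quoted 15 km/h transport
# speed, assuming the ~1255 m tape length cited elsewhere in the
# thread.
tape_length_m = 1255
speed_m_per_s = 15_000 / 3600        # 15 km/h ~= 4.17 m/s
seek_seconds = tape_length_m / speed_m_per_s

print(f"full-length pass: ~{seek_seconds:.0f} s (~{seek_seconds / 60:.1f} min)")
```

So skipping from one end of the tape to the other is roughly a five-minute affair at that speed.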


> Data on the tape is stored at a record-breaking density of 317 gigabytes per square inch

Me wondering about those micro SD cards that can store 1TB.


Maybe around an order of magnitude more. The big difference is that square inch of tape might be 3-5 orders of magnitude cheaper to manufacture.


Those SD cards aren't close to a single layer, so it's not really equivalent.

Ad absurdum, stacking two Micro SD cards on top of one another hasn't just doubled your density even though surface area is unchanged.


Fair enough but chances are the tapes will have a much longer lifetime for storing data, which is the primary use case for this sort of thing.


The density of flash memory is competitive with magnetic tapes, but the retention time is too low, making flash memory completely unusable for archival storage, even if it would have been as cheap as magnetic tape.

In theory, write-once memory cards, using some kind of antifuses, could be designed to have a lifetime good enough for archival storage, but nobody has attempted to develop such a technology, because it is not clear if there would be a market for them.

Most people do not think far ahead in the future, so they do not care much about archival data storage, until it is too late and the information has already been lost.


> The density of flash memory is competitive with magnetic tapes, but the retention time is too low, making flash memory completely unusable for archival storage, even if it would have been as cheap as magnetic tape.

I disagree that it's unusable. You'd end up with a puck the size of a data tape that can archive a petabyte of data and needs to be plugged in to a 5 watt power supply for long term storage. That's not super onerous. Then consider that tapes need to be stored at exactly room temperature with 20-50 percent humidity, while this puck would barely care about environment at all. And you could plug it directly into a computer without a $5k drive. Honestly it sounds pretty good to me. We just need to drop the price of flash by a factor of 20 to make the scenario happen.


It probably references the density record within the data-tape world, which is significant, as there could be other ways to achieve a higher total storage; but density is one of the major components here, it seems.


> 15 km/h

The tape stores 580TB for a length of 1255m, does that mean the read speed is 580TB/1255m*15km/h = 1.9TB/s ? Seems too high.
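Reproducing that arithmetic makes the puzzle concrete (and the serpentine-track replies resolve it: no single pass covers the whole capacity):

```python
# Reproducing the parent's arithmetic: if a single pass of the head
# covered the whole 580 TB, the implied streaming rate would be
# implausibly high.
capacity_bytes = 580e12              # 580 TB
tape_length_m = 1255
speed_m_per_s = 15_000 / 3600        # 15 km/h
rate = capacity_bytes / tape_length_m * speed_m_per_s

print(f"~{rate / 1e12:.2f} TB/s")
```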


Tapes are often structured in bands, and those bands are divided into wraps, and those wraps are divided into tracks. There are many tracks on a tape, and they snake back and forth from end to end on the wrap (so you don't need to "rewind" when you get to the end of the tape—you just start reading back in the other direction). In newer tape drives, you physically can't read all of the data at once because the tape head is only a fraction of the width of a single band: it physically moves (laterally) to position itself over the right data.


Since I had to look up more info to understand this explanation, I'll try to give my own, using the numbers for LTO-8.

The drive has 32 heads, and reads/writes 32 tracks at a time. It goes from the start of the tape to the end, then aligns with the next 32 tracks and reverses direction.

Each group of 32 tracks is called a wrap. There are 208 wraps, so 6656 total data tracks. Even wraps go one direction, and odd wraps go the other direction.

That's the important part.

But also the tape is divided into 4 "bands", each one holding a quarter of the wraps/tracks. Between the bands, and at the edges of the tape, are special servo tracks that are used for alignment.

So when a source talks about "wraps per band", it's a pointless abstraction. Unless you're really in the weeds, the only thing you want to know is the total number of wraps.
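The serpentine layout above, as arithmetic (these are the comment's LTO-8 figures, not re-checked against the spec):

```python
# LTO-8 serpentine geometry from the figures above (taken from the
# parent comment, not re-checked against the spec).
heads = 32                       # tracks read/written per pass
bands = 4
wraps_per_band = 52
wraps = bands * wraps_per_band   # end-to-end passes over the tape
total_tracks = wraps * heads     # total data tracks

print(f"{wraps} wraps, {total_tracks} data tracks")
```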


I'm guessing that is the fastest it can traverse the tape; it didn't say it could read at that speed.


> When tape is being read it is streamed over the head at a speed of about 15 km/h...

It did say that. I wonder if there's some multi-track/double-layer/double-sided stuff going on that needs multiple passes to read, that does sound awful fast!


Most recent tape standards have multiple "bands" and "wraps" placed in parallel. The head reads only one wrap at a time, so it takes many passes to read the whole tape. For example, an LTO-8 tape has 4 bands of 52 wraps each, requiring 208 passes to read completely.


Why not just have multiple heads then?


There are. The tradeoff of speed vs. complexity is currently sitting at 32 heads.


It would be like downloading a file using GetRight back in the day, where you'd have one thread downloading the 0-25% chunk, one downloading the 25-50% chunk, and so on.

You could hypothetically do that, for sure, but the software basically is derived from the tape era where you'd just have one logical stream coming out of the tape.


Positioning the tape should become harder, I suppose.

Also, I suppose they limit the read bandwidth to something like what a single InfiniBand connection would support; few disk arrays support much higher speeds.


Yeah I've been wondering about the numbers too.

The numbers they cite for "previous generations" don't match up to the progression of the LTO tape series' capacity. Maybe they are citing "research numbers" that they can achieve in a lab but aren't production-ready yet. I would certainly assume they are citing "compressed" data figures there.

Also bear in mind that tapes typically store data striped across the tape in multiple tracks and multiple bands. There are four bands per tape and 12-52 wraps per band, so reading the whole tape requires up to 208 passes across the tape.

https://en.wikipedia.org/wiki/Linear_Tape-Open#Physical_stru...

But yes, to agree with another parallel comment, tape data rates are quite high sequentially (abysmal at random access of course, but that's not how tapes are used). LTO-8 does 750 megabytes per second compressed / 360 megabytes per second raw.
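To put those sequential rates in perspective, here is what streaming a whole tape takes (the 12 TB native capacity for LTO-8 is an assumption on my part, not from the comment):

```python
# What those sequential rates mean in practice: streaming an entire
# tape at the raw rate, assuming LTO-8's 12 TB native capacity.
raw_capacity_bytes = 12e12     # LTO-8 native capacity (assumed)
raw_rate = 360e6               # bytes/s, from the parent comment
hours = raw_capacity_bytes / raw_rate / 3600

print(f"~{hours:.1f} hours to stream a full tape end to end")
```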


That is too high but tape really does have surprisingly high transfer rates. You need a real computer to keep them streaming.


Wow, it's like a segment of a floppy disk of infinite radius!


> Data on the tape is stored at a record-breaking density of 317 gigabytes per square inch...

The table from IBM[1] in the middle of the page says this number is in gigabits (not gigabytes) per square inch. This is obviously impressive nonetheless, but I wonder what else this article got wrong.

[1] https://petapixel.com/assets/uploads/2020/12/ibmtale.jpg


> record-breaking density of 317 gigabytes per square inch...

Yawn on the raw density figure, though.

The chip inside a 128 gigabyte micro SD card is a small fraction of a square inch and is cunning enough to provide random access.

You just can't easily and cheaply have a long tape of them.

It used to be the case once upon a time that mass storage media like magnetic tapes and discs based on writing on surfaces with a head had better density than memory chips.


317 Gb per square inch

That seems perfectly reasonable. You can get a 256GB USB flash drive about half a square inch in size, and that is random-access data. (e.g.: https://www.walmart.com/ip/SAMSUNG-256GB-Bar-Plus-Titan-Gray... )


That is roughly 500 bits per square micrometer or 2000 square nanometers per bit or an average (assuming square lattice) distance of 50 nanometers per bit.
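The unit conversion checks out:

```python
import math

# Checking the parent's unit conversion for 317 Gb per square inch.
bits_per_in2 = 317e9
um2_per_in2 = 25_400 ** 2                    # 1 inch = 25,400 micrometers
bits_per_um2 = bits_per_in2 / um2_per_in2    # ~500 bits per square micrometer
nm2_per_bit = 1e6 / bits_per_um2             # ~2000 square nanometers per bit
pitch_nm = math.sqrt(nm2_per_bit)            # ~45-50 nm (square lattice)

print(f"{bits_per_um2:.0f} bits/um^2, {nm2_per_bit:.0f} nm^2/bit, "
      f"~{pitch_nm:.0f} nm average spacing")
```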


I inherently want to know how this works, but I suck at physics, so I'm just going to stare in awe from the outside.


With that kind of precision I would like to know what kind of system they use to reduce outside mechanical vibrations?


It isn't mentioned, and I don't expect much other than the inertia of the device itself. There is a servo mechanism ensuring the head follows the tracks on the tape; improvements to it are mentioned. I don't think it matters whether the tape (necessarily) or the environment vibrates, or whether that distinction is meaningful.


Time for the ever golden quote: Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.

https://en.wikiquote.org/wiki/Andrew_S._Tanenbaum


When I was working in vfx we employed a guy whose sole job was to fly back and forth from LA to Melbourne with Pelican cases full of hard drives.

It was faster and more secure since we were working on air gapped networks.


That feels like a dream job, for a while at least. And very effective too. I assume the VFX studio is in Melbourne?


Maybe if they’re flying first class? Otherwise that sounds like a nightmare that would require an insanely high paycheck.


If you're flying back and forth on that route that often, you're going to get upgraded quickly, I would imagine. I've done 6 flights a month domestic and have been upgraded to first class for free fairly often.


Depends on the rewards program I suppose. A US airline I’m familiar with does not upgrade frequent fliers on international flights.


You're allowed to say their name here.


>Otherwise that sounds like a nightmare that would require an insanely high paycheck.

If the courier in question is 6'2" that might be the case, but I suspect if the person was much smaller and is a heavy sleeper, that it wouldn't be too bad.


He was a writer, and seemed to enjoy the job


As a person of 6’0” I found the flight from LAX to SYD, and return, miserable in coach. I slept a lot and watched some movies but any way you cut it that is a long time to be sitting in a very uncomfortable seat.


In the before times I flew Auckland to Doha over 20ish hours. I can’t believe anyone likes flying, every bit of it is awful.

I’m 1.97m or so and it’s really not designed for me. I suspect it’s optimised for those at 1.6-1.7m.


1.75m here and coach sucks for me too. For that reason and social distancing I just bought a car that can drive me to places I’d usually fly, in much more comfort.


It's an ideal job if all you need is hours of concentration, reading, and writing or drawing. A writer, a PhD student with a laptop, a comic book author, etc.

Or, maybe, even a Zen monk who spends time meditating.


Dream job to spend hours and hours in a plane? Then more hours clearing customs? Thanks but no thanks.


But Americans going to Australia just pass through a computerized system. You don't even have to talk to anyone.

I agree with OP, if this was a first class flight, it wouldn't be a bad gig.


Yeah one location in LA and another in Melbourne so that we had 24 hour coverage.


[flagged]


Frequent fliers are still a thing.

Your post is ill-informed FUD.


Still sounds like an awful job to me flying that much.

Not sure why people said it was a good thing to do. You'd be bored of the process within a couple of months and hating planes and airports... unless you're a certain type of person.

A lot of traveling sales guys have talked about this.


I wonder if tapes are more prone to some form of drive-by attack, whereby instead of requiring physical or remote access to a location, a strong enough magnetic field within a certain distance of a datacentre could penetrate bricks and mortar and render them useless.

I'm envisioning a huge device in the back of a van which pulses a powerful beam, similar to typical movies (Ocean's 11?) where they cut the power to a bank/casino prior to a raid.


Are you familiar with the story of the force field generated at the 3M plant apparently in 1980?

It would probably take more power than that to do what is being suggested from the distances that are being suggested.

I'm not sure it's feasible.



Hopefully this was cabin baggage and not something left unattended for minutes on the baggage belt


Unattended for 14+ hours you mean. A baggage handler on the departing side could be bribed to steal it and you would have half a day of unfettered access before the courier on the plane even knew it was missing.

Of course you don’t check it if it’s that critical.


Transport security wasn't as big a concern as consistent delivery.

Network volatility is a huge problem when transferring huge files.


Is the internet in Australia really less reliable than airline on-time performance?


Time is one factor, it's also surprisingly easy to corrupt vfx master files with just a few missing bits.

Checksums will identify something went wrong, and then you need to redownload the file to a quarantine network and scan it. Takes time.

Much easier to go from trusted source to trusted source and verify the files on drive prior to shipping.

Amazon just released a device (forget the name) for exactly the same use case. We developed it in house.

Not to mention most major studios will contractually prevent you from exposing anything to the web.


Apologies for all the questions, I'm just curious about this.

>Checksums will identify something went wrong, and then you need to redownload the file to a quarantine network and scan it. Takes time.

Surely a sensible file transfer algorithm would compute checksums on small and easy-to-retransmit chunks? Does rsync not do this? Isn't it already happening in TCP?
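A minimal sketch of the per-chunk checksum idea (the 4 MiB chunk size and the function names are arbitrary choices for illustration, not rsync's actual algorithm):

```python
import hashlib

# Minimal sketch of the per-chunk checksum idea behind rsync-style
# transfers: only chunks whose digests differ need retransmitting.
# The 4 MiB chunk size is an arbitrary choice for illustration.
CHUNK = 4 * 1024 * 1024

def chunk_digests(data: bytes, chunk: int = CHUNK) -> list:
    """SHA-256 digest of each fixed-size chunk of `data`."""
    return [hashlib.sha256(data[i:i + chunk]).hexdigest()
            for i in range(0, len(data), chunk)]

def stale_chunks(local: bytes, remote: bytes, chunk: int = CHUNK) -> list:
    """Indices of chunks that differ and would need re-sending."""
    return [i for i, (a, b) in enumerate(zip(chunk_digests(local, chunk),
                                             chunk_digests(remote, chunk)))
            if a != b]
```

With this scheme, a single flipped byte invalidates only its own chunk, so a corrupted file costs one chunk of retransmission rather than a full redownload.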

>Not to mention most major studios will contractually prevent you from exposing anything to the web.

I understand that workstations with media on them are not going to have internet access, but do they really prohibit site to site VPNs?

>Amazon just released a device (forget the name) for exactly the same use case. We developed it in house.

Snowball and Snowmobile? IIRC these are primarily meant for one-time migration from on-prem storage to S3. Do people really use them on an ongoing basis?


FYI, Amazon has the AWS Snow Family, the largest member being a 45 ft shipping container, while Microsoft does Azure Data Boxes that are somewhat smaller.


Or you encrypt it.


If it could be checked, then it could have been shipped. I would assume hand carry.


Doing some guesstimation, it seems like you could fit an exabyte into an AWS Snowmobile. That's insane.


Why guess? https://aws.amazon.com/snowmobile/ says 100PB.


I'm more interested in the lifetime of the medium. Durable backup media for consumers is still a holy grail, as I understand it. M-DISC didn't hold up to its 1000-year lifetime promise. Archival-grade DVDs are also not good enough, as I understand. Syylex went bankrupt. I want a consumer-grade backup medium that can provide at least 100 years of lifetime.

That said, I was able to recover 90% of the voice recordings my father made between 1959-1963 on reel-to-reel tapes, 60 years later. Tape can be very durable, but what I recovered was analog voice, very tolerant of errors. I'm not so sure about gigabytes packed into a square inch.


Brings back memories of a great remix contest in 2006 where the digital copies of analog tape tracks from Peter Gabriel’s “Shock the Monkey” were made available to remixers.

To my surprise, the pitch of the samples was a little lower (and varied ever so slightly over the duration of the song) than what you'd have expected with A440 tuning. It baffled me, since I expected some of the early digital synths used in the original sessions should have been rock-solid 440 tuning.

And that's how I learned about "tape stretch", where analog audio tape stretches just enough to make the pitch of everything a few cents lower over a long period of time.

p.s. I ended up applying digital pitch correction, so I could “jam along” with my own synths :-)


A different problem happened to me when digitizing my dad's tapes. Dad bought the player in the USA, brought it to Turkey, and made the recordings there. When I digitized them back in the USA, everything sounded higher-pitched. It turned out that the difference in AC line frequency (60 Hz vs 50 Hz) caused the motor's rotation speed to change proportionally, so I slowed the recordings down to 5/6 speed, and they were perfect afterwards.
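That 5/6 correction, in numbers (assuming the playback motor tracks the mains frequency directly):

```python
import math

# A motor built for 60 Hz mains runs at 5/6 speed on 50 Hz, so tapes
# recorded that way play back 6/5 too fast on 60 Hz mains --
# roughly 3.2 semitones sharp.
speed_ratio = 6 / 5
semitones = 12 * math.log2(speed_ratio)

print(f"~{semitones:.2f} semitones sharp; slow playback to 5/6 to undo")
```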


Ask HN: How can I create the most reliable and durable NAS today? I have a lot of very sentimental, very-important files, such as family photos and videos. And I simply like to hoard data.

I currently have 8TB of data stored on a Synology DS218+ with RAID1, and monthly data scrubs (verifying checksums). It is backed up remotely to Google Drive (in encrypted form), and I also maintain an infrequently-updated, once-per-quarter disk clone with an external HDD.

My biggest concern with my current setup is that the memory is non-ECC. Even though the files are checksummed, I am concerned that memory corruption / bit-flips could propagate into the checksums, and hence result in data corruption.

I am considering:

* Building my own FreeNAS box using AMD Ryzen (which semi-officially supports ECC memory). My concern here is the semi-official nature of the support: how do I know the ECC works, before a rare cosmic-ray bit-flip?

* Purchasing a Synology DS1621+. This is AUD$1400 which is a tough pill to swallow, for the equivalent of a first-gen Ryzen quad core and 4GB of memory.

Any options or recommendations are appreciated!!


You'll know ECC works when it matters: when you encounter a flipped bit. With an 8TB RAID, that is likely to happen within the next 24 months.

Go with option 1, and RAID-Z2 or better. With RAID-Z2, you'll be able to not only detect but correct a flipped bit, even if that flip happens when writing out the data.

Pay attention to the counters. Your ZFS scrubs will report how many repairs they made. You're likely to encounter 1 in a scrub; you're unlikely to encounter more than 2. If you see that, that's when you check for memory errors. A single bad sector is likely the hard drive. Even a single flipped bit is likely a transient error; it could be your memory, or your disk, or anything in between. It happens at scale, and 8TB, read repeatedly, is a lot of bits.

Look into rasdaemon and memtest86 - they're the tools you use to debug what's happening when errors do appear.

The other advice I can give you: Don’t be paranoid. Your photos are likely to acquire bit rot. You will have dozens, even hundreds of bit flips that will happen in your lifetime. Of the many thousands of photos you will take, the chances that you will ever notice the discoloration or bad line that a bit flip will have in a photo are pretty small. Bit rot happens. Your photos are important to you, and you should treasure them, but treasure them for what they are: Things that you protect and are under your care, not things that must be twenty nines of correct. You can realistically achieve 10, even 12 nines of correct reads on your data. You don’t need more.


But what if a bit flips in a header or in the OS's file length field?


ZFS checksums metadata, too.


Regarding ECC and ZFS: https://jrs-s.net/2015/02/03/will-zfs-and-non-ecc-ram-kill-y...

"Next, you read a copy of the same block – this copy might be a redundant copy, or it might be reconstructed from parity, depending on your topology. The redundant copy is easy to visualize – you literally stored another copy of the block on another disk. Now, if your evil RAM leaves this block alone, ZFS will see that the second copy matches its checksum, and so it will overwrite the first block with the same data it had originally – no data was lost here, just a few wasted disk cycles. OK. But what if your evil RAM flips a bit in the second copy? Since it doesn’t match the checksum either, ZFS doesn’t overwrite anything. It logs an unrecoverable data error for that block, and leaves both copies untouched on disk. No data has been corrupted. A later scrub will attempt to read all copies of that block and validate them just as though the error had never happened, and if this time either copy passes, the error will be cleared and the block will be marked valid again (with any copies that don’t pass validation being overwritten from the one that did)."

(don't just read the quote, read the link)

I have been using ZFS since 2007/2008 and I have never had any issues (except with those damn Seagate 3TB "DeathStar" HDDs, where I was barely able to replace them fast enough - 3 were gone in 3 months - I will never buy Seagate again).

Co-creator of ZFS: https://arstechnica.com/civis/viewtopic.php?f=2&t=1235679&p=...

I have an ASUS microATX board with 16GB of non-ECC RAM, an additional SAS HBA, 3×3TB Toshiba drives in RAID-Z, plus 10TB HGST He3 disks and an Ultrium 3000 drive for backups (LTO-5; they are quite cheap today and are designed for 24/7 operation, which is far more than I will ever subject mine to; the tapes can be restored on the latest LTO drives if needed, and you can get cartridges on sale for peanuts). There is no way in hell I'll trust my important data (like images) to disk only, and tape is nice: you take the cartridge and store it at your parents'/gf's/workspace drawer/...

Anyway, if I remember correctly, Google lost 150k user accounts in 201x and restored them from tapes. So even for cloud-minded people, it still makes sense to shovel important data onto tape if you don't use it in everyday processing (and just so you know: even shelved disks die).


Write the really important data for long-term storage to M-DISC BD-XL media (100GB each). If it's pictures you care about, I'm sure it's far less than 8TB that needs the VIP treatment.


See my other replies about reliability (or lack thereof) of M-DISC.


I just built my Truenas box using Ryzen. Let me know if you want a part list for inspiration or any tips.


I would love your part list! Did you go with ECC memory? Did you have any way of verifying the ECC is working and actually detecting/correcting bitflips?

The other thing I am interested in is minimizing idle power consumption. Just to be more environmentally friendly.


I don't know if you want to build the same kind of system but at least you can get a list of parts that work together.

I use my Truenas box for storage using ZFS, VMs and NFS server for different PCs.

I bought ECC memory as I understand this is more or less a requirement for ZFS.

I found out that FreeBSD, which TrueNAS is based on, can give you info about what type of RAM is present.

The command is: # dmidecode -t memory

According to this I have ECC RAM. :)

I did a build with the following parts:

Case: SST-CS380 V2 (space for 8 3.5" drives, 2 x )
Mainboard: ASRock X470D4U2-2T
Power supply: Seasonic Focus Plus 550W Gold 80 Plus Fully Modular
RAM: Kingston Server Premier DDR4 16 GB, KSM26ED8/5 M
CPU: AMD Ryzen 5 3600X with Wraith Spire
CPU cooler: Arctic Liquid Freezer II
NVMe to PCIe bridge: ASUS Hyper M.2 x16 Gen 4 (PCIe 4.0/3.0), supports 4x NVMe M.2 devices (2242/2260/2280/22110) up to 256 Gbps, for AMD TRX40 / X570 PCIe 4.0 NVMe RAID and Intel RAID platforms

I bought all this from Amazon.de. The mainboard is a server board with 10 GbE Ethernet and a remote console for flashing the BIOS and remoting into the machine - no graphics card needed. This was expensive and you can most likely save a lot using a consumer-grade mainboard.

The RAM was taken from the list provided at https://www.asrockrack.com/general/productdetail.asp?Model=X...

The CPU models supported are here: https://www.asrockrack.com/general/productdetail.asp?Model=X...

Be careful not to buy an AMD APU - these don't support ECC RAM for some insane reason. An APU has graphics built into the CPU.

I use both SATA drives (long-term storage) and SSDs (for speed).

I created two ZFS volumes (I don't remember the proper terms): one for rotating discs (which can sleep most of the time) and one using SSDs for fast storage, which doesn't use much power.

I have 6 kW of solar cells with a battery, so I don't really care if the box uses a lot of power. During daylight it's more or less free when the sun is shining. I get next to nothing when selling the electricity I generate, and I'd like to use as much of it as possible locally.


There really isn't much of a market for it. You can pay Google or Apple or one of the large cloud providers a very reasonable and decreasing rate for a literal guarantee that your data is accessible. The only risk is that the company goes under, which is extraordinarily unlikely for someone like Google/Apple, and the shutdown would come with advance notice.

I realize for the hacker news audience there are multitudes of reasons the solution above doesn't fit your needs, but realize the consumer market is near nonexistent.


Wouldn’t the other risk be that after you’re dead and stop paying the data is deleted or inaccessible? Maybe if you could prepay 100 years it’d work?


If nobody is "inheriting" your data (or rather—nobody cares enough to keep your data around), it seems kind of moot to ensure it hangs around. That is, if I put stuff in an S3 bucket and pre-pay for 100 years, if nobody is around to download it in 100 years then why bother?

If you wanted to make a sort of digital time capsule and didn't care who discovered it, your next best bet would probably be the Internet Archive or some other archival community.

If your data isn't appropriate for archival (i.e., can't be publicly consumed) and isn't interesting enough for your friends/family/etc. to keep around on your behalf, keeping the data is purposeless.


I absolutely take inheritance into account when keeping backups meant to last a hundred years, but regardless of how uninteresting my data looks, we don't know if today's boring data would be invaluable for science in the future. We show slippers from 5000 years ago in museums today and they're invaluable. Consider the person who owned them, walking around on a national treasure, unaware. Maybe they didn't even like the slippers, found them boring. :)


I was thinking that DNA is a pretty robust storage medium. Perhaps we could use it in coming years to store data for long term survival.

Though considering these comments and the advent of mRNA/CRISPR, perhaps we could store data for future generations in our own DNA. That'd be fascinating if you could read journals or even audio/video of your ancestors from your biological inheritance from them. What if we could engineer an extra chromosome to do just that, then let them remix and recombine segments of memories so everyone's would be unique.


Just store your diaries in a line of yeast that produces tasty beer or wine. That could work. I wonder what the oldest yeast lines in use today are, and how stable their genomes are.


Or if you really want your data to survive, engineer it into a virus for your local species of cockroaches! Getting the data back could be gross, but it'll survive nuclear holocaust. ;)


The importance of those slippers is tied strongly to their rarity. So little survived from 5000 years ago that almost anything from that time is valuable.

By comparison, we'll create more data in the next ten minutes than entire centuries from our relatively recent history. Lots of stuff is getting preserved in lots of places with substantive redundancy for virtually nothing. Your slippers today are likely to be more valuable than the near-infinite troves of documents and photos and whatever else.


I agree the consumer market doesn't exist because everyone is seduced by the cloud nowadays. Having said that, the supposed guarantees provided by these companies should not translate into blind and complete trust.

A single bad bug or security issue can make data inaccessible or corrupt; tapes, on the other hand, don't have that issue. IMHO, trusting all your important data to a single vendor or technology is a recipe for disaster.


The lifetime of the medium is half the equation. The biggest problem imho is the lifetime of the device. Say I give you an old floppy drive for 8" floppies from the seventies. Where would you connect it?


You're right but not all media are the same. Arctic Vault used an optical film with QR codes. You can theoretically even take its photo and decode it by hand if you want. They even added a Rosetta Stone to the entrance so, even if all the knowledge is lost, one can hypothetically decode the data stored there. For magnetic media, you need some more specialized equipment for sure.


Floppies are an interesting case because the protocols and physical specifications are all documented publicly, which means that one could literally build a drive from scratch today --- the trickiest part being the heads, but considering that they are many orders of magnitude lower density than HDDs, it would not be a big obstacle in the future.

(I believe 8" floppy drives have the same interface as 5.25" ones --- and there's no shortage of adapters from the retrocomputing community for those, some of them even open-source.)

Tape is far more closed, AFAIK most of the common formats are proprietary and the specs are behind NDAs and other walls.


I have hobby books from 30 years ago that teach you how to build magnetic heads for cassette tapes. No pictures or anything, but with enough patience you could definitely build one today, at home (even then). Mind you, the size of the gap is not that big of a problem if you put your mind to polishing.


Tapes are uniquely terrible at this. I'd argue it's the Achilles' heel of the medium, even more so than the actual limitations of linear tape.

First off, the drives themselves are going to be expensive brand-new. You're going to be paying thousands of dollars for a drive, and it's probably going to have some weird interconnect, so you'll have to spend even more money and waste a PCIe slot on an adapter in order to use it. The most common are SAS and Fibre Channel, although there's at least one company selling drives in Thunderbolt enclosures for the Mac market.

(Aside from all that, SAS is actually pretty cool for things like hard drive enclosures, since it has SATA compatibility. I have a jury-rigged disk shelf built out of an old HP SAS expander, a slightly-modified CD duplicator case, and some 5 1/4" hard drive bays.)

Second, tape formats are constantly moving. LTO and IBM 3592 come out with a new format every 2-3 years and backwards compatibility is limited. Generally speaking you can only write one generation behind on LTO and read two generations back. So, if you want a format that's got drives still being made for it, you'll need to migrate every 5-7 years. Sure, the actual tape is shelf-stable for longer, but you're going to be buying spare drives or jumping on eBay if you want to keep old tapes around that long.

(eBay is actually not a bad place to buy used tape drives, but the pricing varies wildly. It's perfect for hobbyists and small-fry IT outfits looking for cheap backup media. Absolutely terrible if you're a large outfit with reliability guarantees and support agreements to maintain.)

Third, actually using a tape drive is a nightmare. For one, Windows hasn't shipped tape software since 2003 (I think?), so you'll be in the market for proprietary backup solutions. For another, if you're writing data directly to the tape, you will shoe-shine for days. Common IT practice is to put a second disk array in front of the tapes as a write cache, with custom software that copies data to the tapes at full speed once all the slow nonsequential IO is done. Reading from tape doesn't have to worry about this as much, but the fact that you had to use custom software just to archive your files means you now have proprietary archive formats to deal with. So you can end up with tapes that depend on both access to working drives and licenses for proprietary backup utilities.

(Of course, if you had decently fast SSDs and a parallel archival utility, you could sustain decent write speeds on tape. I actually wrote this myself as an experiment: https://github.com/kmeisthax/rapidtar and it can saturate the LTO-5 drive I tested this with.)
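The "disk cache in front of the tape" pattern described above is simple enough to sketch. This is a minimal, hypothetical Python version (the block size is a placeholder, and a real tool like rapidtar does far more); a reader thread does the slow, nonsequential I/O while the writer drains a bounded buffer in large sequential writes:

```python
import queue
import threading

BLOCK_SIZE = 64 * 1024  # placeholder; real tape drives want a fixed block size


def stream_to_tape(file_paths, tape):
    """Copy many small files to a tape-like stream without stalling the writer.

    A reader thread fills a bounded in-memory buffer (the "write cache");
    the main thread drains it to the device in sequential writes, which is
    what tape is good at.
    """
    q = queue.Queue(maxsize=128)  # ~8 MB of buffered chunks

    def reader():
        for path in file_paths:
            with open(path, "rb") as f:
                while chunk := f.read(BLOCK_SIZE):
                    q.put(chunk)
        q.put(None)  # sentinel: no more data

    t = threading.Thread(target=reader)
    t.start()
    while (chunk := q.get()) is not None:
        tape.write(chunk)  # sequential, full-speed writes
    t.join()
```

Here `tape` is just any writable binary stream; on Linux it could plausibly be `open("/dev/nst0", "wb")`, though the device path depends on your setup.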


That's probably not going to happen. Tapes have high longevity, and you can buy used LTO drives on eBay for a few hundred bucks, but the biggest issue in 100 years is going to be finding a device to read the tape, and finding an adapter to hook it to the USB-H 12 quantum optical port on your motherboard.

A better way would be to use something like that to store it for a decade or two, then copy the data onto whatever the newer version of archive medium is (LTO drives can typically read the last one or two versions as well). Rinse and repeat every decade, and it also lets you test if there has been any bitrot.


Is there a source regarding M-DISC being unable to hold up to their lifetime promise?

A quick Google search brings up amateur-ish blog posts, and even those co-sign the medium.


Yes, French national metrology and testing laboratory LNE: https://www.lne.fr/sites/default/files/inline-files/syylex-g...


> The objective of this study was to investigate the behavior of the GlassMasterDisc of Syylex under extreme climatic conditions (90°C and 85% relative humidity) and to demonstrate the potential of this technology for digital archiving.

> The result of this study is that the GlassMasterDisc has a much longer lifetime in accelerated aging than other available DVD±R

I wouldn't draw any other conclusions on normal ageing of other tested media. They did an accelerated aging test at 90°C and 85% RH, where most discs didn't last a single test cycle (of 10 days), two discs lasted a single cycle, and only syylex lasted all 4 cycles.

Quote on a brand-name DVD

> This DVD model had the longest lifetime (i.e. 1500h) at 80°C and 85% RH. At 90°C, it is destroyed after the first cycle of 250 hours.

For an idea of what it does to the substrate:

> [for measurement] DVDs have to be taken [out] .. To prevent the formation of water droplets in the polycarbonate, it is necessary to "purify" the polycarbonate from the water that was absorbed at high temperature.

OTOH, I had CDs (Verbatim, upper-middle grade), of which about 1-2 out of 50 had issues after 20 years of storage (in dark, mostly room-temperature conditions).


Yes, they did an accelerated test, and M-DISC performed only as well as archival-grade DVDs. Syylex, which promises the same lifetime as M-DISC, performed significantly better. That clearly shows either that M-DISC didn't live up to its promise, or that Syylex and archival-grade DVDs surpassed expectations. Either is bad news for M-DISC, isn't it? What am I missing?


The "accelerated test" may not be in any way indicative of true lifetime in moderate conditions. Their own conclusion does not draw any such implications; the only other test they reference was done at 80°C (10°C lower), and the only writing on how or why this test could be indicative of archival lifetime is a generic two-sentence "harsher conditions -> faster degradation" (in part 4).

It was a purpose-built test to see how much of X Syylex would take. It took X better than the others, none of which took X well. Tests like these are very good if you want to go with Syylex, to make sure it's not worse in some way (X, or Y, Z), which would then suggest a need for further examination. In real aging, factor X may be completely meaningless while Y and Z are crucial, so you cannot conclude which one will last longer.

Why test 90°C and 85% RH, and not 80°C, 50°C or 110°C, or bending, UV light, scratching, a drop in acid... whatever? For a proper accelerated lifetime test, you would need to identify (all) the relevant degradation modes, model their behaviour (and interaction) in target vs. accelerated test conditions, and then extrapolate the behaviour under target conditions. They didn't even write what type of degradation they were testing.
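For reference, when a single thermally driven degradation mode can be assumed, the usual extrapolation tool is an Arrhenius model. A sketch, where the activation energy is exactly the unknown being objected to here:

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def acceleration_factor(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between use and stress temperatures.

    ea_ev is the activation energy (eV) of the single assumed degradation
    mode; picking it correctly is precisely the hard part.
    """
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV) * (1 / t_use - 1 / t_stress))
```

With an entirely assumed 0.8 eV activation energy, 250 hours at 90°C maps to roughly 250 x 260 ≈ 65,000 hours at 25°C; change the assumption to 0.5 eV and the factor collapses to about 30 — which is the objection above, in numbers.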


I'm not convinced. I'm no expert on aging simulation however. But heating a DVD to 90°C seems like it would do different things to the disc than normal aging at recommended temperatures, wouldn't it?


Thank you!


Given the pace at which storage capacity increases, what is the rationale for not copying your data over every 5-10 years onto the next cheapest consumer mass storage of the moment? You get all your data in one place and you don't have to deal with standards disappearing (I read that 2020's game consoles can't read CDs anymore; people should rip their CD collections right now).


The bookkeeping it requires, for one: since you don't usually buy all your backup media at once but acquire it over time, things get unnecessarily complicated. It's also riskier to copy the media periodically, as you might increase the chances of data corruption due to faults in the copying process (faulty RAM, faulty software, not concentrating well enough, etc.). You periodically introduce the possibility of user/hardware/software errors into the longevity of your backups.

Also, when others inherit the media they may not have the proper equipment or skill to do it themselves, as the goal of preservation is to get the data 100 years ahead, not to keep it always in a usable state per se. For example, I'd like my children to keep my backups until my grandchildren can access them 60 years later.


For data corruption and mis-manipulation, I would be more concerned about the long-term decay of any medium than about some bit flipping in RAM (even for tapes, as their endurance relies on certain storage conditions; and it is more likely to be a hard drive, writable DVD/Blu-ray or something flash-based, none of which age particularly well).

For bookkeeping, my point is that storage media are becoming so big that you can always consolidate into a single device every time you carry the data over (you may still want to duplicate for reliability). You can buy an 18TB hard drive today; a consumer isn't going to need more than one, or perhaps two, of those for anything to be preserved long term. And in 5-10 years, you will likely have 25-30TB hard drives.

The equipment problem is precisely what this addresses. You are always using the latest hardware, and the previous hardware is still supported if you stick to a 5-10 year cycle. For instance, you would have moved away from IDE drives while you could still find motherboards with both IDE and SATA ports. But if your data is stored on an IDE drive, good luck connecting it to a computer in 2030 (unless we've gone full Apple: "you can't customize your hardware and we deprecate everything very frequently").

Skills (and I would say mostly dedication) are still a problem. But we are talking about copy-pasting files between two media; it's not rocket science even if you don't script it.


The time needed to actually do the copying.


It seems like there may be a market there.

Every 10 to 15 years, you send in your archived (and new/interim) personal data and get it back on the current top-tech storage medium. That way it's not stored in the cloud and you can keep moving the stored data forward without having to deal with it all yourself.


Not really. You can buy an 18TB hard drive now. Even if your data is humongous and needs several of those, it will likely fit on a single drive in 5-10 years. So it takes an increasingly small amount of your time to replicate (excluding the copying time, which keeps the machine busy but not you).


Use https://en.wikipedia.org/wiki/Parchive and add as much redundancy as you like. It's a lot cheaper to over-provision than to create the uber-archive medium.
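PAR2 itself uses Reed-Solomon codes; as a toy illustration of the parity idea behind it, a single XOR parity block (over equal-length data blocks) is enough to survive the loss of any one block:

```python
from functools import reduce


def xor_parity(blocks):
    """One parity block over equal-length data blocks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)


def recover_missing(blocks, parity):
    """Rebuild the single block marked None by XOR-ing the survivors + parity."""
    i = blocks.index(None)
    survivors = [b for b in blocks if b is not None]
    return blocks[:i] + [xor_parity(survivors + [parity])] + blocks[i + 1:]
```

Real PAR2 generalizes this to any number of recovery blocks, which is why you can dial the redundancy percentage up as high as you like.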


That's a good idea regardless of the medium, but if your flash disk becomes completely unreadable in 15 years, PAR doesn't help you much, does it?


Then don't use flash (or at least not MLC).

Use an M-Disc with 3x PAR redundancy.

Or RAID 1 enterprise magnetic HD with 3x PAR.

Or all of the above :-)

Are you trying to archive TBs or MBs? Likely the next most important consideration is a stable place for storage.


Hell, tell me it will last 20 years and I'm okay with that; if you can guarantee at least 15 years, I can buy replacements every decade and transfer the files over...


You're right. That reminded me that I really don't know the state of my cold backups.


> M-DISC didn't hold up

Since I have a lot of stuff on M-DISC, you’ve now got me worried. Link? Any other info? I don’t need 1000 years but what is the lifespan?


Here it is; it basically performed the same as archival-grade DVDs: https://www.lne.fr/sites/default/files/inline-files/syylex-g...


Yep, but the reality is that "we don't know"; an accelerated, simulated aging test may (or may not) have the same results as "real" aging.

250 hours @ 90 C°/85% humidity, how does that compare to - say - 100,000 hours at a "normal" 10-40 C°/30-50% humidity?


That's right, but regardless of how "extreme" the testing is, archival-grade DVDs performed as well as M-DISC, and Syylex surpassed the rest by a huge margin. Syylex promised the same lifetime as M-DISC (unlike archival-grade DVDs). I think the results are good enough to see that M-DISC either doesn't live up to expectations or archival-grade DVDs exceed them. Either way, bad news for M-DISC. If Syylex hadn't gone bankrupt, it would have been the best option, of course.


Yes, what I mean is that - setting aside the "glass" disc from Syylex - we don't know whether both M-DISCs and archival-grade DVDs suck or excel, let alone how long they actually last (readable) in the real world.

IF my last guess in the comparison is correct, 100,000 hours at "normal" temperature/humidity is roughly 11 years, but it may well be that without "cooking" them at 90°C, the duration for both is 200,000 hours (or whatever)...

Single point anecdata, I had to dig i


piql is the company that did the GitHub Arctic Vault, and they offer it as a service. Maybe not for the general public, but still.

https://www.piql.com/arctic-world-archive/


Arctic vault is VERY expensive, about $100 per gigabyte.


I remember watching the pilot episode of Star Trek (the original series) and chuckling when Spock reports that “the tapes are badly damaged” from the capsule they recovered near the barrier at the edge of the galaxy.

Turns out it might not be that outdated after all.


In fairness, you can open any box of tapes and assume the media will be badly damaged XD


About a month ago I found a stack of MiniDV tapes from about 15 years ago.

It was my own home videos. I wanted to preserve them, so wanted to upload them to Google Photos.

It took me some looking around on eBay to find a camcorder to play these back. When it arrived, I realized I needed a FireWire 400 port to capture full resolution, so more digging around for a FireWire PCIe card. I was finally able to transcode and upload some 15GB worth of video. The upload took 3 minutes on my gigabit internet; the whole process, acquiring the hardware etc., took about a month.

When I think about these tapes some 50 years from now, it might as well be completely unrecoverable, not because the tapes went bad, but because we have nothing to read them. Makes you wonder about galactic time scales like in Star Trek.


That happens already; there were problems getting hold of https://en.wikipedia.org/wiki/Lunar_Orbiter_Image_Recovery_P... data only 20-30 years after the missions


And in 2 years those tapes might be the only thing you got.


Could you elaborate?


Maybe that he uploaded them to an unpaid service which might make a policy shift to remove the data at some unknown time in the future?



> found a stack of MiniDV tapes from about 15 years ago ... I wanted to preserve them

Here's where I raise my hand and ask "Why?"

They've been sitting in a box for 15 years. You never watched them, you never even thought about them. Why preserve them?

It's why I stopped taking pictures and videos of things. I never watched them again. It's all just a lot of waste motion over some dream that someday we'll find value in sitting and looking at this old stuff again.


For me it’s personal history. My kids LOVE watching videos of me and my wife when we were younger. Luckily for them, my wife’s family converted all their videos to digital and so it’s easy to watch.

I’m itching to do the same to my parents’ collection so my kids can see more.

I suspect their kids will enjoy the same.


Some things only get interesting with time. How young we looked. How different the city looks now. Or people who disappeared.


Just because you don't doesn't mean other people don't. May as well condemn someone for following a sport you don't like.


For me the biggest thing was that the tapes had been sitting for 15 years, then were discovered and treated as though they were something valuable. If they were so important, why were they forgotten on a shelf for 15 years?

Wait until you have to clean out a parent's house with rooms full of stuff saved because they thought the grandkids might find it interesting. Here's the reality: they don't. My mother saved boxes and boxes and boxes of photos. None of them ever sorted, or put in albums. Just thrown in boxes. Saved for decades. Was it hard to throw them away? Yes, a little. There were a lot of moments of my childhood there. But if I asked myself honestly, was I going to do anything with them other than put them on a shelf at my house? The answer was no. I'd advise sparing your kids that burden.

A friend of my mother's had saved every canceled check she ever wrote. Boxes of them, because she thought her kids or grandkids might be interested in them some day. Their reaction was likely: What's a check?

If you really cherish something, and it regularly brings happiness to your life, by all means save it. But do it for yourself, not for what you think your descendants will find interesting. And if it's been in a closet for 5 years, ask yourself why you are keeping it.


Asking "why" doesn't imply condemnation, does it?

At HN we assume best intention, asking why is how we learn about other people and their reasoning.

Of course it may be different than ours. But isn't that how we learn and are exposed to new ideas?

IMHO we should encourage more respectful "why?" questions.


It has nothing to do with intent. It’s tone deaf to ask why someone would want to preserve memories, especially if it’s based on your one anecdote of yourself not wanting them.

It’s like asking why someone would want to spread ashes or preserve an old piece of rickety furniture a long lost relative built.


Reread the post, especially this sentence, and ask yourself if it was "respectful"

> It's all just a lot of waste motion over some dream that someday we'll find value in sitting and looking at this old stuff again.


Yes, but when you read it in context, it's clear that the author is referring to themself:

>It's why I stopped taking pictures and videos of things. I never watched them again. It's all just a lot of waste motion over some dream that someday we'll find value in sitting and looking at this old stuff again.

The author uses "I" over and over again, explaining that they don't see the point, so asking about a different perspective seems genuine to me, or at least there's a plausible interpretation that it is genuine.

From HN guidelines:

> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.


Great, so when can I buy one and how much will the tape drive cost?

The largest issue with tape is that the drives themselves cost absurd amounts, and you better not cheap out because failure is both time consuming and scary.

Swapping tapes continues to be human-intensive, and restore times are long. But the tapes themselves are so cheap that at this scale it becomes worthwhile again.


This is geared for enterprise settings, not home use. I believe someone mentioned these drives costing $25,000. I do agree reasonably priced tape drives with TB of space for home users would be great.


The prices are much cheaper now. You can get a single tape drive for under $300 and enterprise grade single tape drives for under $3k.

https://www.newegg.com/p/2BM-000A-000M5

https://www.provantage.com/quantum-tc-l72an-br~7QUAT0JW.htm


> The prices are much cheaper now. You can get a single tape drive for under $300 and enterprise grade single tape drives for under $3k.

Those tapes only hold 200 GB, and you can get a 10 TB hard drive today for less than $300:

https://www.newegg.com/red-wd101efax-10tb/p/N82E16822234407

While tapes are theoretically cool, the drives are just too rare for them to be of any practical value to a home user. Even if the media has a better archival life than a hard disk (say 50 years), it won't do you a damn bit of good if there are no drives available in 50 years to read it.

IMHO, hard drives are better for backup (even offline backup, just get a hot-swap bay and imagine the drive is tape [1]). Archival is a harder problem, but I've settled on using high-grade optical media, burned slow for fewer errors, making a bunch of redundant copies. Even though the media might not last as long, I'm pretty much guaranteed to be able to find a drive for the next several decades [2].

[1] "pretend this hard drive is a tape" is even an enterprise product: https://en.wikipedia.org/wiki/RDX_Technology

[2] For instance, you can still attach a 5.25 floppy drive to a modern computer: https://www.ebay.com/sch/i.html?_nkw=5.25+floppy+drive + http://www.deviceside.com/fc5025.html


> Those tapes only hold 200 GB, and you can get a 10 TB hard drive today for less than $300:

That was a link to an obsolete LTO-2 drive. If you’re using LTO-8, which is the current generation, you’d get a 10-pack of 12TB tapes for around $500. Noticeably cheaper per byte than hard drives.

I don’t recommend LTO-2 just because I don’t think that the drives are well-supported.

> While tapes are theoretically cool, the drives are just too rare for them to be of any practical value to a home user. Even if the media has a better archival life than a hard disk (say 50 years), it won't do you a damn bit of good if there are no drives available in 50 years to read it.

If you’re evaluating just on the basis of the price of drives / media, then there’s a cutoff where tape becomes cheaper than hard drives. The easy way to calculate this cutoff is to divide the overhead cost (tape drive price) by the difference in the cost per TB of tapes and hard drives, which in the example here, is around $3k divided by $25/TB, or 120 TB.

In other words, if you have more than about 120 TB of data, then it is cheaper to buy a tape drive. I think any comments about whether tape drives are suitable for home use are really comments about whether you are interested in the use case of people who need to store >120 TB at home.
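That cutoff arithmetic as a sketch (the per-TB prices below are illustrative, taken roughly from the figures above):

```python
def breakeven_tb(drive_cost, hdd_per_tb, tape_per_tb):
    """Data volume at which a tape drive plus media undercuts hard drives."""
    return drive_cost / (hdd_per_tb - tape_per_tb)


# e.g. a $3,000 drive, with HDDs at an assumed $30/TB and tape at $5/TB:
# the $25/TB difference amortizes the drive at 120 TB.
```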

If you are running a YouTube channel as a hobby, or you run a side business doing videography for weddings, the tape drive starts to sound a lot better. The 120 TB cutoff might be around 300 hours of video, which might be only 50 events.

There are a lot of other reasons why you might NOT want to use tape, but it’s easy to have enough data that tape is the cheapest storage option. At “enterprise” scale the cost calculus is completely different and involves things like support contracts with Oracle (not necessary for hard drives), power/cooling in your DC (tape is very low-power), etc.

And let’s not forget that if you have 120TB of hard drives, you’re in the regime where you start having to buy multiple machines.

As for longevity—I don't have the data handy. If you are storing data on tape, you need to migrate to newer generations of tape, as old generations become obsolete and unavailable. If you are storing on hard disks, you also need to migrate because hard disks eventually fail (even if they are not powered on).


> That was a link to an obsolete LTO-2 drive. If you’re using LTO-8, which is the current generation, you’d get a 10-pack of 12TB tapes for around $500. Noticeably cheaper per byte than hard drives.

But it looks like the cheapest drive on Newegg for that is $3,299.00: https://www.newegg.com/p/2BM-0046-00001

> In other words, if you have more than about 120 TB of data, then it is cheaper to buy a tape drive. I think any comments about whether tape drives are suitable for home use are really comments about whether you are interested in the use case of people who need to store >120 TB at home.

> If you are running a YouTube channel as a hobby, or you run a side business doing videography for weddings, the tape drive starts to sound a lot better. The 120 TB cutoff might be around 300 hours of video, which might be only 50 events.

No dispute there, but I think most >120 TB home use cases run into the question of whether it's even worth keeping the data (raw/uncompressed). For most people, the answer is probably no, and curation/compression makes far more sense. For instance, I don't think it's typical for wedding photographers/videographers to store their final product indefinitely, let alone the raw footage for a re-edit. You can keep a lot more events in a lot less space if you're only keeping the raw footage for a few recent events and the 15-30 minute final edit for a year after it's finalized.

> And let’s not forget that if you have 120TB of hard drives, you’re in the regime where you start having to buy multiple machines.

Not if you swap disks as offline storage.

> As for longevity—I don't have the data handy. If you are storing data on tape, you need to migrate to newer generations of tape, as old generations become obsolete and unavailable. If you are storing on hard disks, you also need to migrate because hard disks eventually fail (even if they are not powered on).

Honestly, the theoretical longevity is the only aspect of tape that appeals to me, but like you said enterprise tapes will end up being like these enterprise optical disks in a fairly short period of time, so you're regularly going to have migration legwork to do or your data's effectively toast.

https://twitter.com/foone/status/1236524992709316608


You don't even need the absolute latest drives to get reasonable storage volumes. Some careful shopping can net you an LTO-5 (1.5TB native capacity) library unit (one that can be upgraded to newer generations) for under $500, with new tapes $10-15 each.

The nice thing about LTO as a format is its relative predictability and ease of acquiring the parts, even the very obsolete ones. It's all SCSI or SAS, most of the interesting stuff happens at the hardware level, with a bog-standard API. Your average backup app, whether it be Backup Exec or mtx/tar/etc. on Linux doesn't need to care about the media format. Unlike actual "enterprise" shops with datacenters and support contracts and such cruft, where the primary concern is "does it work", it is fine to buy older units second-hand. They are plentiful and cheap.


> You don't even need the absolute latest drives to get reasonable storage volumes. Some careful shopping can net you an LTO-5 (1.5TB native capacity) library unit (one that can be upgraded to newer generations) for under $500, with new tapes $10-15 each.

That's still... pretty terrible. If we compare the cost of a tape drive against a 14TB easystore @ $190 [1], the break-even point is around 72TB. I don't know about you, but that's a lot of data to have to store for it to be worth it. Even at 150TB you're only looking at around ~25% (~$500) in savings compared to hard drives, which I don't think is much when you factor in how much of a hassle tape drives are to work with.

[1] https://slickdeals.net/f/14587381-14tb-wd-easystore-external...

[2] https://www.wolframalpha.com/input/?i=solve+y+%3D+500+%2B+%2...


It's not a fair comparison to put hard drives (integrated mechanics) up against tape drives (separated mechanics). They do not solve the same problem and have different longevity profiles. If I'm spending $500 on tape storage, it's because I want something that will last a long time, something that portable hard drives tend to have issues with.

LTO5 onward supports LTFS, which exposes the tape to the OS as if it were any other removable storage, with the one proviso that deleting files doesn't reclaim space unless the entire tape is wiped.


LTFS is OK; a multilevel tar file with a file list followed by an FEC-encoded block store would be easier and more portable.

It has been 8+ years since I played with LTFS, so maybe it has improved.


LTFS as of today is supported in FUSE, which realistically means it works on every OS that matters.


True, on hard drives the controller board will eventually fail. The solder might decompose or the circuitry might fail. Then you're left with a platter full of randomized bits.

But on tape, the controller mechanism is separated from the storage medium. The controller mechanism is inside the drive readers itself, which will eventually fail.

So for both, it’s a trade off. They’re both going to eventually fail.


That $300 drive is an LTO-2 drive. It's so old that it's parallel SCSI.

I'd be surprised if you could even get tapes for it that aren't sketch grey market stuff that you wouldn't want to use for backups anyway.


One day I found a really cheap LTO-4 drive that cost $150. Interesting, but I found it was not practical despite the price and decided against it. First, an 800 GB LTO-4 tape is no longer high density by today's standards; it couldn't even hold a 1 TB HDD image. Also, I still had to pay for the necessary SAS peripherals to get it working. Finally, the mechanical assembly inside a 10-year-old tape drive was not something that inspires confidence for data backup... Last time I looked, the cheapest decommissioned LTO-5 drive still cost $1000.


The first tape drive is a 200GB (native / uncompressed) internal SCSI drive. Not so easy for a consumer with a laptop to use.

The tapes are pretty cheap at $8 on Amazon if you buy 20.


Well they don't really make these things for laptops unfortunately but they also don't cost $25k+ like they used to. It's still more than the average consumer is willing to spend but they are affordable enough now to be a viable option for professionals and small businesses.


What makes you think I meant home use?


The first person pronoun you used ("I") strongly implied personal use.


No, it just means the person has the ability to purchase. They could be purchasing it for themselves or on the behalf of another organization.


I accept that, in a literal sense.

I was attempting only to answer the question and explain the source of confusion.

In native English speech, it would usually be expected that a business-case purchaser would speak of a more general case, perhaps by using "we".


Ah - I took the question as saying “I did nothing to indicate the sort and could in fact be purchasing for a business”, so I was trying to explain that side!


In English, when purchasing on behalf of a corporation, “we” would almost always be used.

“we” is even frequently employed by authors writing papers to depersonalize.


I guess it’s just me then. I wouldn’t write “we” in a comment online without explaining who “we” is, it seems like an unnecessary detail to explain


That’s how I meant it. Seems odd to say “we” out of context like that. And anyway I would be the person purchasing so including the whole company is.. odd.

Anyway, I guess this is the pedantry we (!) should expect from HN


> Swapping tapes continues to be human intensive and restore times long.

Restore times are definitely long, but you can mostly avoid human labor by putting the tapes in a tape library, which will load tapes in the drives using robots. You still need technicians around because tapes / drives / robots will break, but individual restore operations can be completely automated.


Doesn't really sound like it's aimed at you. Tape is for serious long-term, high-volume storage. If you've got a limited budget for the tape device it's probably not designed for you in the first place.


I understood the exact opposite that you have understood, and I find your comment quite uncomfortably presumptuous, as a result?

The parent is clearly seeking to confirm serious long-term use, as you say, in considering the costs of failure cases.


> I understood the exact opposite that you have understood, and I find your comment quite uncomfortably presumptuous, as a result?

Is this a question? I can't tell you if that's how you find my comment or not, sorry.

They want a cheap tape device because they just want to use it in the home. Tape devices aren't cheap... because they aren't aimed at use just in the home.

They aren't aiming at home users - I think that's a simple fact not presuming anything.


The parent clarifies above that they are not a home user.

Even if you believed they were a home user, they clearly demonstrated in their original comment that they did know that it was for long-term, high volume storage, and they did know the cost, and you appeared still to belittle them for not knowing.

It's a question in the sense of - why did you choose to say it in this way, which seemed impolite? What did you intend your comment to add?


Is it possible to buy tape drives in the 2nd hand market for a reasonable price?


It is - the kicker is (and always has been) the media.


We reached an inflection 'amazing!' point when we were able to put '100 songs in your pocket'. That was a really shocking thing given the limitations of CDs/tape, etc.

But the real inflection point may come when 'all relevant information is on your local disk'. The tapes in question here can maybe store every book ever written!

We may be able to put the entirety of Wikipedia, every film and TV show ever made, every book and every lecture, on a little disk.

The only thing we'd need to access in realtime would be contemporary data like traffic flows, weather situation etc..

The ability to store 'that much data' quickly and easily locally may quite fundamentally change the equilibrium we have right now with the cloud and lead to a more natural decentralization.

"All of YouTube from 2008 until the present, 99 cents at the local gas station, on usb-like contraption"


This will never, ever happen. English Wikipedia (w/o pictures) compressed is a measly 20 GB. It is hard to quantify "all books ever written", but I have kept copies of some online libraries large enough that they surely contain pretty much every book you can remember, plus 10,000 you've never heard of for every single book you can remember. It's not that much; you can fit it on 1 or 2 regular HDDs.

Now, I did it because I'm that type of guy. There aren't that many people who actually do this bullshit, even though it's perfectly doable.

So why don't they? Because it doesn't make much sense if you aren't afraid of an upcoming nuclear winter. Wikipedia is updated and improved every day. You only sometimes want to refer to something old, but you nearly always want to check out something new. Petabytes of video are uploaded to YouTube every year; TB/day probably wouldn't be an overestimate for audio on Spotify. All data is being updated constantly.

Also, the above is valid for pretty aggressive data compression. Is aggressively compressed data what we want? No. A 2h video compressed into about 500 MB was totally fine 15 years ago. If I download a 2h movie today, it's normally around 20 GB. And it's by no means uncompressed.

Seriously, by now you should know for a fact that if someone believes there's such a thing as "too much storage space", they're stuck in the 80s.

And even if there were such a thing — realistically, a cluster of nodes in Google's datacenters can find you a book or a video you are looking for way faster than the most perfect HDD you could theoretically have locally. So, again, normal people wouldn't want to have all this stuff even if they could.


So I was just talking smack, but I think it might be possible.

Remember when we could do 64Kb/s reliably over the web? Then you could do 'voice'. And after that threshold was crossed, you could basically do unlimited voice very quickly.

So Wikipedia - text only - is ballpark 50G - which is to say it would fit on a single mobile phone.

That is bigger than the first Google index!

I don't know how old you are, but the notion that you could walk around with this massive database, literally the size of all of Google, right in your pocket would have blown people's minds in 1999. It was basically unthinkable.

The rate of growth of storage has slowed down a little in the last decade but there are still jumps to be had, and it's not inconceivable that we get 100-500TB of storage in the next 10-15 years in regular devices, meaning 10-100x that in a slightly larger storage device.

While video data is expanding (4K is much bigger than original HD), it can't go on forever; just like voice and text, once we cross a certain threshold it becomes irrelevantly small relative to storage as well.

So I think there's some value in my point:

In 10-20 years from now, as video storage becomes 'trivial' just as text is today (aka all of wiki text on your phone), huge amounts of data become available, instantly.

Though some data sources change a lot - others do not.

It's not inconceivable that we put the 'entire western canon' in everyone's homes.

The other thing not so evident in my comment is that there's only so much use for all of this data.

We are getting massively smaller marginal returns for all this 'big data' we store, frankly, I question in many cases if it's worth it at all. I think a lot of companies have been duped into saving every mouse click or whatever concerning every customer. The world is just not that complicated.

What this means is that as computers miniaturize - and storage as well - we may see regular data centers shift away from the cloud back to 'on premise'.

If you can fit 'infinite computing power' in a little closet, and the parts are easily replaced ... then I can see companies doing that.

The promise of cloud computing today largely rests on the economies of scale of physicality: parts take up space; cables, power, heating/cooling; you want a lot of flex/headspace in configurations.

I don't see why, 20 years from now, you couldn't buy 'off the shelf' a 'box' that has the equivalent of 100 EC2s, 500TB of storage and multiple Gb/s networking cards.

You could run an entire corporate office of 1000 people from just a single box.

The 'physicality' of it all would be mundane and irrelevant. Obviously, it would be 'super complex' and still need 200 IT people to admin all of the software, but physically it could be small and cheap.

'Big maybe' of course, but there are some possibilities in there I think.


The entire libgen archive is roughly 50 terabytes (most of the books ever written). It will be a very long time before we reach that level of storage in a phone.


Only because we moved the goalposts. My 2007 iPod has more hard drive space than my current phone, but it was slow, noisy disk storage.

Meanwhile my phone's storage is basically now just a high speed cache for the internet.

Whether it will go back the other way again in the future, I don't know if anyone can know but tech does tend to be cyclical.


To be fair, libgen has a lot of PDFs. Libraries in fb2 or such are orders of magnitude leaner.


I'm not sure what you mean by a very long time. We have 1TB microSDXC now, and it is about 1000 times the capacity of SD cards available in 2005.


What's more ... although it would be ugly and awkward, one can imagine that 50 microSD cards could easily be mounted on a single USB-like device that sits in your pocket and attaches to a phone. The reason it doesn't exist is primarily that nobody needs it, rather than "it can't be done".


I share your childlike optimism. But another thing (that neither of the sibling comments mention) is write speed. Having a ton of capacity to store massive amounts of data at rest is not the same thing as being able to easily make near-instantaneous (or even slow) copies.

Getting all that data on there is going to be a problem, and so it would only be economical for data sources in high demand, not unlike the way that optical media get the pits and lands stamped in at the factory based on an expensive master.

"All of YouTube" might be (probably is) in high enough demand, but it would also require cooperation from the gatekeeper of that content, who at this time has adverse incentives because it's making (many times) more money selling ad impressions for basically every view that it's able to. Even aside from that, what other large datasets are in the same demand category such that the work could or would be subsidized in that way?


I love this thought, but have some doubts.

So far, storage is still following something resembling Moores law, but it will probably hit physical limits way before a year of youtube (probably 30 of these new tapes or so) fits into your hand.


HDDs are hitting physical limits, SSDs still have growing room. You can already buy 100 TB SSDs.

https://www.newegg.com/nimbus-data-dc-100tb/p/2U3-002M-00004

Following Moore's law of 2x density per 18 months, in 10 years we will see single drives of:

100 tb * 2^(10 years/1.5) = 10,159 TB
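Hedged, of course, on density actually continuing to double every 18 months, the same projection can be tabulated year by year:

```shell
# Capacity doubling every 18 months from a 100 TB SSD baseline.
awk 'BEGIN {
  for (y = 0; y <= 10; y += 2)
    printf "year %2d: %6.0f TB\n", y, 100 * 2^(y / 1.5)
}'
```

The final line reproduces the ~10,159 TB figure above.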


It's why I pirate at 720p x265; the files are tiny. I'd rather have an extremely large and varied Emby instance than 200 movies in Ultra Blu-ray Remux.


Love that last line! I wonder what will happen DRM-wise in the future if we get to this point.


Buying something would just transfer the encryption key


Not going to happen. Once you think you've got everything locally, exponentially more will be generated anew by everyone.


Copyright is a barrier to that.



> when not in use, tape requires no energy unlike hard disks and flash, IBM writes

IBM marketing smort


You mean you haven't been keeping your hard disks plugged into a power source at all times? Say goodbye to your data! /s


That's good. But when will it ever move out of the enterprise market?

What I really want is an affordable Tape storage for us common folks / home users to archive data long term. It's really a pain to keep transferring your old backups every 2-5 years to a new CD / DVD / portable HDD. There is a market for home users that really don't want to put their data on the "cloud". And this generation is really creating so much data, some of which, I am sure they would like to store long-term.


So having worked in the tape library space, they don't really make sense for home users.

They say that they have a 50 year lifespan, but everyone in the industry knows that's a blatant lie these days. The tape customers transfer everything over to the new generation every 2 to 3 years, so it's never tested and none of the customers really care.

You also only get a couple hundred writes to a tape in its lifetime. That includes bulk appends, as the stress from seeking is a major factor in what kills them. They really need to be treated as WORO media: bulk, the whole 12TB or what have you at once. Not a great fit for home use.


That's precisely what you need if you want to upload some backups from NAS systems or similar.


$100 each starts making way less sense when they have the same semantics as burned optical discs for home users.

These days the main benefit to tape is the physical safety of having the media be inside a cartridge, and the streaming read speeds are stupid fast so you can reconstruct a backup faster.

Home use, even with a data hoarder sized NAS, is better served with a bluray burner with multiple copies.


> $100 each starts making way less sense when they have the same semantics as burned optical discs for home users.

These tape-drives have 300MB/s+ of read/write speed. Sequential yes, but almost all backup tasks are sequential.

Optical has 10MB/s or so. You get more out of a 100Mbps connection to (insert cloud storage provider here), let alone Gbps.

At a minimum, a modern, reasonable mechanism for backups needs to be faster than cloud (100Mbps or Gbps), otherwise it's basically worthless to the consumer. Hard drives and tape get there, but there was no real way to improve optical's read/write performance (outside of overengineered "jukebox" robots available to Facebook and a few other select groups), so it went fully obsolete.


16x bluray is 72MB/s (576Mb/s).


Hard Drives are going 220MB/s and people still think they're too slow.

I just gave a look at the first Blu Ray media on Newegg.com. Here's what is search #1: https://www.newegg.com/verbatim-6x-25gb-bd-r/p/N82E168171301...

That's 6x speed, or roughly 25MB/s. You can store 25GB per disc, and it is going to take 1000% longer than a $150 5TB hard drive AND you're going to have to sit there and remove / add a new disc every few minutes.

And for what? The hard drive is cheaper after all of that.
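Putting rough numbers on that comparison (decimal units; the 25MB/s and 220MB/s figures are the ones above):

```shell
# Time to move 5 TB at Blu-ray 6x speed vs a modern hard drive,
# plus how many 25 GB discs it takes.
awk 'BEGIN {
  mb = 5 * 1000 * 1000                       # 5 TB expressed in MB
  printf "Blu-ray @ 25 MB/s: %.0f hours\n", mb / 25 / 3600
  printf "HDD @ 220 MB/s:    %.1f hours\n", mb / 220 / 3600
  printf "discs needed:      %d x 25 GB\n",  5 * 1000 / 25
}'
```

That's roughly 56 hours of burning across 200 discs versus about six hours to a single drive: nearly 9x slower, in the ballpark of the "1000% longer" claim.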


Hard drives have a bad habit of not turning on after being off for a year or two is why.

And yeah, you'd be buying 16x 128GB media in this use case. Not the bottom of the barrel stuff for giving home movies to grandma.


> Hard drives have a bad habit of not turning on after being off for a year or two is why.

So push the "ZFS Scrub" button every 6 months.

Don't store hard drives. Store a NAS. The entire freaking computer is stored as a unit. Every few months, turn it on, push "ZFS Scrub" to double-check the data, and you're set.

BluRays also degrade over time unless stored in proper UV-sealed, temperature-controlled conditions. Everything requires a degree of maintenance and checking. The question is how to best automate that checking process.

Hard drives are read/written to at 200+ MB/s, making these maintenance checks much faster. They're also bigger, which means no need to manually insert / remove disks from drives. This entire process could be automated from a "Wake on LAN" packet and a few clicks from any terminal (phone, computer, whatever).

-------

I dunno. I'm looking up these BD-R XL drives you're talking about: it's like $30 per 100GB disc or something. That's an obscene price for so little storage. I guess if all your data fits on one of those discs it's fine, but... I've got some archived movies and stuff on my NAS (50GB per Blu-ray). Needed to store some of my video editing files (so I need the original data that matches with my video editing files).

The real data of importance of mine probably fits in a BluRay. But then I won't have all the other stuff I've saved up "because I can" on my 2x 5TB NAS (Mirrored, so only 5TB of storage).


That violates the 3-2-1 backup rule.

3 copies, 2 different media, 1 offsite.

I've seen large orgs do what you're saying and lose nearly everything because one bug took them out completely.


> That violates the 3-2-1 backup rule.

Ehh? Not really. All of these things we're talking about are components of our backup strategy.

If I wanted an offsite backup, it's just rsync to some cloud provider (like rsync.net) or something. I don't do that because I don't think my data is worth the recurring cost, but it's an option.

-------

My point is that when my solution is 200MB/s and on the order of $500 to $1000 for the component ($500 if you build your own NAS, $1000 if you buy premade parts; assuming 2x 5TB hard drives at $300 for 5TB of _mirrored_ storage)...

While your proposed component costs $30 per 0.1TB and read/writes at a lousy 70MB/s... to get the equivalent mirrored setup you need to buy 100 Blu-Rays or roughly $3000 in Blu Ray XLs alone (2 copies of your data across 2-different Blu Rays, for the same redundancy factor as the 2x Hard Drive solution).

So I have to raise my eyebrows a little bit. How are you checking that the data doesn't degrade? Are you manually checking all of those BluRays you've created for reliability? That's a lot of sitting around and inserting / removing drives.

Its literally cheaper to build a 2nd NAS and stick more hard drives into it, and keep that 2nd NAS offsite somewhere.

-------

If you're gonna hold my feet to the fire over this "2-media" thing, then my 2nd media of choice would be flash storage before optical. Because SPEED is king. Speed means you can checksum your data and ensure that your backups are still good. I think flash is a bit expensive compared to HDD, but based off of these BD-XL discs, I'm thinking that flash is actually cheaper than Blu-ray and something like 10,000% faster. (Tapes would be a more ideal 2nd media... but I'm not "big enough" to make tapes cost-effective.)

Yes, checksums and scrubbing. If you want to protect against bitrot (in ANY medium), you MUST double-check your backups over time.

TEST your backups. Any backup strategy that doesn't have a regular testing schedule is null-and-void in my opinion.

I seem to think that HDD, then Flash, and MAYBE Tape (if you're going really, really big) are the mediums of choice of the modern computer user. I'm not really seeing where Optical fits in today's world. Maybe a future Disk format (with maybe TB-level disks) can make Optical a thing again... but 100GB Blu Rays aren't really in the works for the modern user.

70MB/s is slow. And 100GB is too small per storage unit.


Storing even 5 TB of data onto bluray disks requires 100 disks of 50 GB dual layer capacity. That's a lot of disks.


Yeah. And that too if you are willing to ignore the really slow speed to write them all!


If you're doing archival storage, does the write speed really matter? If you have enough volume of incoming data where the write speed does matter, you've already eliminated most archival storage methods anyways.


True. But slow transfer speed is still a negative when we are comparing it to other archival methods (in this case, tape drives). And that is why there's demand for tape drives among consumers.


I'm having a very difficult time understanding why they are a negative at all for most use cases. It's not like running a backup monopolizes the machine you're running it on.

Again, tapes are for archival storage. Not nearline, not live, archival. As in, you write the data and likely don't come back to it for months to years.


There might be a benefit to it. If 1 disc corrupts you only lose 50GB. If your hard drive fails, you lose all 5TB at once.


I built such a cloud service; you can archive the data uploaded to our cloud storage to tape on demand, and we'll send you the tape back when requested (you basically buy the tapes once, and they're yours, no strings attached). It failed to gain any traction, though.


First off, mad props for building out the service. That was an ambitious idea and I’m sorry it didn’t take off. Second, if you ever write a blog or article about it someday please mention it on hacker news.


OK, maybe we'll try to relaunch it then :)


Sounds like a good idea. But I am guessing it failed as it wouldn't be attractive for common folks because, while the tapes might be affordable, tape drives are really expensive and uncommon.


In fact the service is tentatively targeted at content producers. Most post-houses do have LTO drives (archiving to LTO is a requirement to work with Netflix, among others). The idea is that any content owner can bring their tapes to the post house when needed (where they would actually use them) instead of USB disk drives.


Having given it some thought, your potential business success depends on convincing clients not to buy a tape drive, and on addressing the above concerns. Smart pricing can address the former, but the latter needs more brainstorming.

Some of the issues you need to address:

- Trust. How can someone trust you with potentially proprietary / copyrighted data?

- Speed. Is it faster to copy the data to a tape locally than to transfer it to you?

- Reliability. How can the client be sure that you have made a successful backup?

- Advertising. Did you reach your target base to make them aware of your product and address all their concern to make them consider it?


Yes, that's the point. When you're a content creator and you're making at most a couple of TB of data per month (typical figures), hardly enough to fill a handful of tapes a year, it's hard to justify the expense for a tape drive. OTOH cloud storage is either too complex (S3, Glacier) or too expensive (Dropbox, Box) for this volume of data.

Trust: yep, hard to say for this one. At least I can provide actual guarantees (for instance, none of our storage is out of the country).

Speed: it's possible to send us tapes directly. However the main selling point of our solution is to allow people to upload relatively small volumes of data continuously, then archive it all to tape in large batches when you have enough to fill an LTO.

Reliability: the interface allows the user to see where their files are (on which tape), to run checksum controls on tapes, and restore from tape. Every tape is sold with 3 free operations a year (archive, checksum verification, restoration).

Advertising: that's clearly not our strong point :)


Totally! I have a friend who does sound engineering for Hollywood, and he has two fire safes full of HDDs, SSDs, and tape with hundreds and hundreds of terabytes of data. I couldn't even finish the sentence "Have you tried the cl..." before he cut me off and was like: way too much data. Didn't have the heart to tell him two of those three media types need to be powered on periodically...


Ow! Telling him would be a kindness.


HDDs can't require powering on all that frequently. I just found my hard drive from college, and plugging it into my computer just worked. It was great finding all my old documents and music that I'd forgotten about.


The lubricants might settle, causing sticking of various components when you try to power it on after a long time.


If your drive uses ZFS you apparently need to power it on more regularly than that to avoid bit rot.


Why would the file system affect bit rot rate?


It doesn't. What matters is that you have a filesystem that can detect and repair bitrot. For that to work, it needs to check everything occasionally, which means they need to be powered up occasionally.

If you don't do that, eventually you'll get to a point where it can't repair anything, and then you gain nothing over using a filesystem that doesn't do this, is the point.


A 12 TB HDD requires 10 GB/day for three years to be filled. This is not home market, it's professional market or hoarding (by today's standards).

Objections on the failure rate of HDDs are absolutely valid, but then one should also need to consider the bigger picture (e.g. storage loss), in which case, having a remote copy is also important.
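As a sanity check on the figure above (decimal units), spreading 12 TB over three years of daily writes does come out at roughly that rate:

```shell
awk 'BEGIN { printf "%.1f GB/day\n", 12 * 1000 / (3 * 365) }'
```

which prints just under 11 GB/day, consistent with the round 10 GB/day quoted.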


> A 12 TB HDD requires 10 GB/day for three years to be filled. [...] hoarding (by today's standards).

You have an interesting argument - by the same reasoning, when one has eventually filled up a 12 TB HDD, it would no longer be hoarding by tomorrow's standard. In other words, at this point, one should be able to get the next generation of HDDs for cheap, and it's this fact that makes all tape drives unnecessary.

Now I wonder whether it was a mistake for me to buy a spare 14 TB HDD to 1:1 mirror my new file server for cold backup. Perhaps a smaller one should be good for 5 years anyway...


> A 12 TB HDD requires 10 GB/day for three years to be filled. This is not home market (...)

You're basing your argument on extreme and unrepresentative assumptions.

First of all, you do not need to fill a storage device to the brim to justify its use. You can use the exact same rationale to claim no one needs 500GB HDs even though they are pretty much standard these days.

Additionally, you falsely assume that data storage needs start the very moment someone buys a drive, and that up till then they have no data lying around. That's not the case at all. People buy high-density storage devices because they already have the data lying around, and they don't want to lose it. You're ignoring that people already have piles of CDRs/DVDs/Blu-ray discs lying around.

Additionally, you are somehow assuming that people would buy storage devices for their density, ignoring their primary use case: long term data preservation.


> A 12 TB HDD requires 10 GB/day for three years to be filled

I cheerfully quantify just about everything in my life, yet somehow I missed that one. I am a little bit of a data hoarder, or maybe just a little paranoid. Seeing those numbers is actually quite helpful. Thank you.

PS I get downvoted a lot for comments like this, so in case it sounds sarcastic or facetious, it is not. I mean it sincerely.


10Mbps (average bitrate) video fills 10GB in 2.2 hours. I don't think such usage is hoarding today.


> 12 TB HDD ... This is not home market,

Indeed. The home directory on my PC at home has 80GB of files stored. Most of it is just there out of laziness. About the only thing I try to keep backups of are my tax returns.

What do people have on a home machine that needs enterprise-grade backups?

You might say photos and videos. I guess that's a personal thing; I realized a long time ago that I never spent any time actually looking at any of the pictures I took, so I stopped taking them.


> I realized a long time ago that I never spent any time actually looking at any of the pictures I took, so I stopped taking them.

That is a completely logical take. However, I sort of think of it as a posterity move. Although my kids might not particularly want to see pictures of themselves, their great grand children—or beyond—might love it. I would pay a pretty penny to see home movies of my grandparents, whom I never knew.


100% agree. I have a NAS for photos and media, plus a big USB HDD for ZFS snapshots of my machines - but all of this is homelab tinkering, not an actual backup strategy.

I realized in the last year or so that the only digital media I have which I would be genuinely sad to lose were wedding photos - so I saved those to 3 different cloud providers, made a Blu-ray copy, and a copy on the NAS. If I lose all of that in some tragedy, chances are I've got bigger things to worry about.


Photos, videos, music, but also current laptop backups, server/VM backups, current and past phone backups, HDD images of past computers' hard drives (to be done), and digitized family videos and photos (also to be done).

It adds up quite fast. And being able to put everything on a tape every year with a label on it would give me some peace of mind, even more if I placed 2 of them in different locations.


Except for a brief period in the nineties, tape has never been aimed at consumer storage.

I think the market for tape will always be the same few giant companies for whom it will be priced just a bit less than the next cheapest option.


Home users have had tape backups since the '80s.

The problem was, nobody actually backs up, so not enough people bought tapes/drives, so they quit making them.


Well, nowadays digital documents, such as videos and photos, are much more important and widespread and mundane than they were a couple of decades ago.

During the 80s there was no digital consumer camera market. Nowadays every person can easily generate hundreds of megabytes of photos and videos per day. Each hour of a 4K video can be close to 7GB, and we're already seeing cellphones which are able to record 4K video at 60fps and 1080p videos at 240fps.


Shucks - it was ahead of its time. Did they even have RAM in MBs during the 80's? :)


Just barely. It was possible to buy a PC with 2 or 4MB of RAM in the late 1980s, but you couldn't do much with it. On Windows/DOS, I think applications had to be written specifically with "extended memory" or "expanded memory" in mind. There were 2 incompatible "standards" for how to organize memory beyond 1MB.

On the Mac side, in 1989 you could apparently buy a Mac IIci for $6200 with 1MB or 4MB, "expandable to 128MB".


Let's have someone invent an ultra-high-def head to repurpose all the old VCRs and VHS tapes :)


That would be wonderful; unfortunately the storage density of tape also depends on the magnetic particles in the tape: smaller particles which can be magnetised enable smaller heads, which in turn enable higher storage density.

I know your comment was a throwaway idea but I got a little bit too involved looking up information around this answer - so I've committed further and looked a little deeper to avoid looking too stupid: Currently with BaFe (Barium Ferrite) as their magnetic particle, LTO-7+ tapes have particle sizes less than 100nm in size (https://indico.cern.ch/event/637013/contributions/2669089/ - check out slides 14 and 15 in the powerpoint.).

VHS, on the other hand, is a little more tricky to find information on; I suspect, due to the age of the technologies involved, the particle size is going to be significantly bigger. The closest I've found to a technical document is a student paper at NYU which refers to this IEEE paper (https://ieeexplore.ieee.org/document/50474), saying that the average particle size for VHS is ~300nm and for S-VHS ~150nm.

But that first presentation also mentions smoothness increases over time with LTO, which suggests improvements in the coating process.

I suspect improvements in the coating, combined with the shrinking size of the magnetic particles, enabled an increase in the density of particles on the tape. This, together with thinner tape stock allowing large increases in tape length for the same cartridge size (nearly double the length from LTO-1 to LTO-8), has in turn led to the enormous jumps in capacity we've seen in LTO.

Meaning unfortunately I don't think it'll be possible to do super interesting things with VHS.


Thanks for the research. I naively assumed VHS tape material was heavily underused by analog signal recording techniques.


D-VHS was a thing; tape capacity started at 25GB.


Such a shame it didn't take off


Oh I only remembered philips DCC .. fun


I store on an external HD, and every year I buy a new HD and copy across to it and verify checksums against my records. Costs basically zero and low-effort. What am I missing?
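If it helps anyone, the whole generate-and-verify cycle fits in a few shell commands; here's a minimal sketch, with temp directories standing in for the mountpoints of the old and new drives:

```shell
set -e
old=$(mktemp -d)          # stands in for the old drive's mountpoint
new=$(mktemp -d)          # stands in for this year's new drive
manifest=$(mktemp)

echo "family photos" > "$old/photos.tar"

# year N: record a checksum manifest for everything on the drive
(cd "$old" && find . -type f -print0 | xargs -0 sha256sum) > "$manifest"

# year N+1: copy everything across, then verify against last year's manifest
cp -a "$old/." "$new/"
(cd "$new" && sha256sum -c "$manifest")
```

`sha256sum -c` prints one OK line per file and exits non-zero on any mismatch, which is what keeps the yearly verify step scriptable.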


>... every year I buy a new HD...

OK, but OP said:

>It's really a pain to keep transferring your old backups every 2-5 years...


The OP is mixing mediums with different characteristics. A DVD has a fixed size (say, 5.x GB), while hard disks are relatively open-ended. One can buy 10+ TB magnetic disks cheaply (less than $200).

If the OP really needs dozens of TBs of capacity every few years, they definitely don't fit in the home-user market they are talking about.


Yes, I do use a mix of medium in case one medium fails.


Because they're considering options like DVDs.

With a new HD you can just copy it across in one step.


OP explicitly mentioned HDDs in their post; they do not want to be transferring data to a new HDD every year, or two years, or five years.


Well that's why I asked what I'm missing! It's just one command to copy from an old HD to the new HD? Takes me all of 30 mins once a year.


Not trying to be rude or sarcastic here: what is that command? Certainly in my experience on both macOS and windows, the default operating system GUI mass copy invariably poops out halfway through, with no explanation as to why. Maybe something like this?

    rsync --archive --acls --xattrs --checksum -vvv /src /dest


You have to save your important data on multiple mediums to keep it safe, in case one medium fails. Obviously, cost being a factor, optical disks are one of the mediums to consider despite all their other downsides (slow transfer speed).


Low cost, high density and a low failure rate compared to HDs. That's why it's still used in the enterprise for backup solutions.


Actually, nearly every enterprise backup system my company has deployed in the last 5 years is tapeless. Most involve snapshot-based backup management replicated across multiple disk/SSD systems. (I'm the network guy, so not my focus area.)


Tape is moving from just 'enterprise' to cloud provider enterprise. The periodic cost of having a tape storage/management system installed is a hard pill to swallow. Most companies still use tape but they just access it via AWS Glacier.


I don't think these are home-use concerns.


Every year you buy a new HD and yet the cost is basically zero?


Yeah what does a new HD cost? Like £100 max for a big high quality one? Compared to buying a £5k tape drive and expensive tapes... yeah that's basically zero.


$100 is definitely not "basically zero". Not only that, but a $5k tape drive and tapes are designed to last a significantly longer period of time, essentially bringing the cost of long-term storage closer to $0/year than your option of spending $100/year.


> a $5k tape drive and tapes are designed to last a significantly longer period of time

You're going to use the drive for 50 years? Half a century? And you think it'll still be working, and supporting a format with a useful capacity, half a century from now?


5k amortized at 100/year is 50 years to get your money's worth. It seems 100/year is a much better deal. I don't see your math working out...


That's assuming a consumer tape drive would cost that much. They'll succeed only if they cost less than $500. Enterprisey stuff is always costly, just like consumer-grade HDDs vs enterprise HDDs.


It's 50 years before $100/year catches up with a one-time cost of $5k. On top of that, the total capacity of the medium goes up steadily over time, while you will need to drop another $5k every 10 years or so, or span your backups across multiple tapes.


In reality I am sure he could get away with buying a new drive every 5 years.


You miss the point - consumer tape drives won't cost 5k, just like consumer HDDs are cheaper than enterprise HDDs.


About consumer tape drive reliability though...


Normal people don't know what a checksum is... or they think that's the amount they're paid every week by the bank.


The checksum isn't the relevant bit. That doesn't change whether you use a hard drive, a tape, or a DVD. Forget about it as part of the discussion if you want.


Why do you do this manual checksumming thing instead of e.g. setting up a ZFS mirror once every n years?


Because simple and manually inspectable is better for backups.


What do you do if you find a checksum error? ZFS mirror will fix it for you.


What do you think I do? Copy from the last good copy.


ZFS does that, except with error-correction codes applied against bitrot. All you gotta do is turn on a ZFS system and run the "scrub" command. That automatically verifies every checksum and error-corrects any data that fails one.

If anything, ZFS makes that whole system you just described easier and more automatic. It's not really hard for me to push the "scrub" button on my Nas4Free box every few months.
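For anyone who hasn't set one up: the whole mirror-plus-scrub arrangement is only a few admin commands on a Linux box with OpenZFS. The pool and device names below are made up, and `zpool` needs root, so treat this as an illustrative sketch rather than something to paste in:

```shell
# two-disk mirror: every block is stored twice, each copy checksummed
zpool create tank mirror /dev/sda /dev/sdb

# walk all data, verify checksums, repair any failing copy from its twin
zpool scrub tank

# report what the scrub found (and fixed)
zpool status tank
```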


What do you do with errors within the latest increment? A shelf of annual HDDs doesn't sound too bad otherwise.


> What am I missing?

Money - The cost factor of buying the new HDD. And to be safe, you need to have more than one medium. That's again more cost.

Time - Ensuring that your existing machine is compatible with both your old and new disks (IDE vs SATA or working DVD drives for example).


AWS Glacier?


The post you were replying to specifically didn't want to use "the cloud." But for single-digit TB, which covers consumers who don't have a huge number of video files and aren't serious data hoarders, the answer is to use local disks. And, if they're OK with cloud storage (which they probably should be), something like Backblaze is probably a better choice.


The solution is generally not to do full duplicative backups but to compare hashes between already-backed-up files and new files, backing up only what you need.

I don't recommend backing up to HDDs. They are prone to early failure from portable use because of vibration and drops.

What we need are more affordable SATA SSDs. Currently NVMe SSDs are very close to the same price.

If you could sell a 1TB portable USB 3 / USB-C backup device that came with good software, the vast majority of people would be set.


> I don't recommend backing up to HDDs. They are prone to early failure from portable use because of vibration and drops.

Your backup device should stay at home, on the desk or in the closet where you hide your networking equipment. There's very little benefit to trying to carry around your backup device with your laptop on a regular basis, and if you do, you still need another backup device that isn't going to get stolen at the same time as your laptop.


SSDs and flash storage have their own issues. For long-term unpowered storage there can be data-retention problems - not so much with magnetic media. And with bad drives there's a risk of ruining parts of the disk by writing too many times...


>affordable sata SSD

This. Even a 100MB/s SSD would do. We need a type of NAND and SSD that offers high capacity at low speed and low cost. A current 2TB SSD is ~$170, compared to a 2TB portable HDD at ~$60.

And I am wondering: are silent file corruption, bitrot, etc. things of the past on SSDs? Does BTRFS / ZFS even make sense on an SSD?

[1] https://www.amazon.com/Green-1TB-Internal-SSD-WDS100T2G0A/dp...


Current LTO-9 tapes can store 18TB and the LTO roadmap doubles capacity about every 3 years. So this tape tech would be on the same scale as we might expect LTO-14 to offer in 2035.

So unless this is an incredibly radical breakthrough, that’s the timeframe I’d expect for the headline to become a real product.

Note that the article shows a table from IBM claiming they achieved 35TB-per-cartridge capacity in 2010, and that still isn’t something you can buy.
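The arithmetic behind that timeframe, for anyone checking: five doublings from LTO-9's 18TB native capacity lands almost exactly on this demo's headline number.

```shell
# LTO-9 = 18 TB native; the roadmap doubles roughly every 3 years,
# so LTO-14 is five doublings (2^5 = 32x) and ~15 years away
echo "$(( 18 * 32 )) TB"   # prints "576 TB", in the ballpark of the 580 TB demo
```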


Why do LTO manufacturers always quote compressed sizes, like we're in the days of Stacker?


IMHO, partly for marketing bluster, but partly because hardware compression in the tape drive is a useful feature if you don’t want to handle 1000MB/sec of compression workload on the host during backups.


Mercifully, there was no requisite simpleton analogy like "that is the equivalent of stacking VHS tapes to the moon" or some idiotic thing like that.


>To put 580 terabytes in perspective, it’s roughly the equivalent of 120,000 DVDs or 786,977 CDs — IBM notes that stacking that many CDs would result in a tower 3,097 feet (944m) tall, or taller than the Burj Khalifa, the world’s tallest building.


How many CDs can you stack on top each other before the one at the bottom fails?


Is that just the media or the media + protective jewel cases?


Just the media


Last I looked into this, if you had a lot of data to back up (Say 1 Gbyte per second, continuous, with a retention time of 1 year), it was still far cheaper to simply use hard drives. One employee can keep up with all drive replacements, hardware setup, etc with time to spare. Drives aren't super power hungry, so any old office building is suitable. Encrypt the drive contents on another site so you don't need 24/7 security. Total system cost was sub $1M with a running cost of $500k/year and storage of 50PB. Bargain.

Now that GDPR applies, most companies need to rewrite backups every 30 days anyway to remove data covered by GDPR deletion requests. That tips the scale further in the direction of always-spinning hard drives. Just hook up 64 drives to each machine, make sure you only do streaming writes of 1GB+ files, do some ZFS RAID-like scheme, and away you go.


You don't have to do it this way; you can just use encryption at rest with a different key for each user and throw away the key when a user asks for deletion of their data. No need to pull all the backups back and scrub them one by one.
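A toy version of that scheme with `openssl`, to make the idea concrete (file names and key handling here are made up for illustration): each user's data is encrypted under its own random key, and "deletion" is just destroying that key.

```shell
set -e
cd "$(mktemp -d)"
echo "user 42's records" > plain.txt

# one random key and IV per user, stored separately from the backup sets
key=$(openssl rand -hex 32)
iv=$(openssl rand -hex 16)
openssl enc -aes-256-cbc -K "$key" -iv "$iv" -in plain.txt -out backup.enc

# sanity check: with the key in hand, the backup round-trips
openssl enc -d -aes-256-cbc -K "$key" -iv "$iv" -in backup.enc -out roundtrip.txt

# "deletion": destroy the key; every copy of backup.enc, on every backup
# tape or disk, is now unreadable without rewriting anything
unset key
```

The catch, as the reply below notes, is shared data: records referenced by more than one user don't map cleanly onto one key per user.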


Our lawyers didn't think that sufficient to cover ourselves. Upon finding out it wasn't going to cost many millions to simply scrub the actual data rather than the keys, they came back that it was money well spent to delete the actual data.

It was mostly because a user might share data with another user, for example two users at the same postal address. Our fraud team needs to be able to look up stuff like that, so there need to be database keys on it, both in the backups and in production. If one of the users at a specific address has a GDPR deletion, we need to delete that user's data, but if another user has the same address, we still need to keep the address itself. Yet if GDPR deletions apply to both users, then as well as deleting the address, we need to delete the fact that both deleted users shared the same address (even if we no longer know the address, because the pattern of which deleted users shared info with other deleted users could identify them).

See... it's complex! The simplest solution: properly delete the data and rewrite the backups!


IANAL but you do not need to rewrite backups to brute force compliance. You need to inform your customers what your retention policy is though. I've seen large enterprises communicating backup lifetime up to 6 months after a deletion request.


That didn’t add up. 1GB per second for a month is 2.6PB. Do you have a source for 40TB disk drives?


1GB/second for a year... Plus spare drives for redundancy, hotspotting, a test/training setup, and unplanned for increases in storage requirements.
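Spelled out, the parent's sizing roughly checks out:

```shell
# 1 GB/s sustained, retained for a full year, in decimal petabytes
echo "$(( 365 * 24 * 3600 / 1000000 )) PB"   # prints "31 PB"; redundancy and spares push it toward 50 PB
```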


"to each machine" implies that there's more than one machine with 64 drives attached.


And I'm pretty sure "any old office building" means buying a large office or small office building and filling it up with them.


cryptoshredding


If the tape was twice as long, it could hold 1.16 petabytes. That would have been a more interesting headline.


My reaction was that any kind of tape can hold 580 Terabytes if it's long enough.


How long would the tape from an audio (compact) cassette need to be, to hold 580TB?

How thick would that tape need to be to allow the motorized spool to pull the tape from the other (passive) spool?

Imagine you were to make an assembly that could hold two spools of this tape, how large would it be? It might be hard to fit it through the door of your server room. And it might not even fit in the van that carries away your off-site backups :)


Yeah! Somehow my brain started to channel Shakespeare and say "Why, so can I, and so can any man, be the tape but long enough!"

(Then I had to look it up. Henry IV Part I, Act III, Glendower: I can call spirits from the vasty deep; Hotspur: Why so can I, and so can any man, but will they come when you do call them?)


Most likely there are standard sizes of containers.


Finally a way I can store all those family photos I swear I'm going to sort through someday!


That's a great effort in technology and it is remarkable. I have some doubts about the real benefits beyond the technology, though.

First of all, the security of this system: yes, it is very avant-garde, but being a physical medium, I will always worry that it can be damaged.

If I imagine using it to store sensitive information, this worry grows. It can easily be accessed, stolen or damaged.

And what about the possible cost of a solution like this?

Apart from that aspect, it is incredible what Fujifilm has been able to achieve in terms of tech.


580 terabytes is really impressive. Although, I don't know why, I am still waiting for 5D crystal tech (maybe for its durability) and hoping it will be available on the market some time soon. For now Microsoft has taken over the project and its status is quite uncertain...


Question for those familiar with the latest tape backup performance: does it make sense for me to take tape backups of my data (e.g. using a second-hand tape drive)? Or is it a bad idea for home use?


I wonder about using these for science data archival, although these days I suppose that most projects can just provision archival from a commercial vendor.


A vendor like what AWS does with Glacier


> In addition, when stored properly, data recorded on tape today will still be readable in 30 years.

I do not find this to be all that impressive, frankly.


How does tape reading/writing wear out the physical medium? Any practical limits?


Memories of Windows Disk Defragmenter come to mind. Might take a while.


magnetic film!


Less than a day's new TikToks :-)



