It can't be that simple; Pixar has to have massive distributed systems, right? I mean, it can't be a single Linux system with a single backup. If that's the case, you would be safer storing it in Dropbox (not that you would), since at least then you would have copies on all of your nodes.
Hi, I'm the "Oren Jacob" from the video. Rendering at that time was distributed across hundreds of CPUs in hundreds of Sun boxes in a renderfarm. The authoritative version of the film's data (models, animation, lighting, sets, shaders, textures, etc.), which was where all rendering pulled data from or verified that it had the most up-to-date versions, was stored on a single machine.
There were several backup strategies for that machine, but they failed. A few reasons are outlined elsewhere in this thread (running past the 4 GB limit on tapes), but, if memory serves, other errors were also occurring: drives were running out of space, so error logging wasn't working properly.
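The failure mode described above (a backup job quietly running past its media limit while the error logging that would have flagged it was itself broken) can be sketched in a few lines. This is purely illustrative; the function names and the simulated tape are my assumptions, not Pixar's actual tooling:

```python
TAPE_CAPACITY = 4 * 1024**3  # hypothetical 4 GiB media limit, as in the story


def write_to_tape(data: bytes, capacity: int = TAPE_CAPACITY) -> int:
    """Simulate a tape that silently truncates anything past its capacity."""
    return min(len(data), capacity)  # bytes actually stored


def backup(data: bytes, capacity: int = TAPE_CAPACITY) -> bool:
    """A backup is only good if every byte reached the media.

    The trap is assuming success: compare bytes written against bytes
    offered, and surface a mismatch somewhere that is itself monitored.
    A log file on a full disk swallows the evidence.
    """
    written = write_to_tape(data, capacity)
    return written == len(data)
```

Run it with a tiny capacity to see the silent truncation: `backup(b"x" * 10, capacity=4)` returns `False`, and that return value has to be checked and reported, not just logged.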
Making matters worse, after our initial restoral, the crew came back to start working on the show and then, a few days later, we discovered that the restoral we thought was good was, in fact, not. So we had, in effect, split the source tree and now had a bad branch with a week of the crew's work on it, including animation.
I imagine they're referring not to the rendered frames, but to the models, textures, etc. that the animators were using.
And honestly, the larger the system, the more things can go wrong. Lots of people back up files to RAID arrays, only to find out that their parity drive(s) are hosed at the worst possible moment. The price of a good backup system is eternal diligence.
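That diligence mostly means verifying backups before you need them, not just writing them. A minimal sketch of that idea, comparing file hashes between a source tree and its backup copy (names and layout are assumptions for illustration):

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large assets are fine."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_backup(source: Path, backup: Path) -> list[str]:
    """Return relative paths that are missing or corrupt in the backup."""
    bad = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        copy = backup / rel
        if not copy.is_file() or file_digest(copy) != file_digest(src_file):
            bad.append(str(rel))
    return bad
```

Run on a schedule, a non-empty return list is the early warning that a restore would fail, which is exactly the signal that was missing in the story above.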
And even then, the entire film is created in storyboard form, and then the voice work is done, and then the final rendering is done. Sure, re-doing parts of the work would be a pain, but they wouldn't have had to get Tim Allen back into the studio.
If you are suggesting that they could store Toy Story II content in Dropbox, you (a) forget how long ago Toy Story II was made (1999) and (b) underestimate the size and throughput of all the data that goes into making such a movie.
edit: added "could" to make it clear that I'm saying such a thing was an impossibility.
The tech of 1999 is certainly underestimated today.
Still, in 1999 I had a PII 333 MHz and a 10 GB HD (and Windows 98, sigh).
But reading about the history of Pixar, you learn that the difficulties on the first Toy Story involved the workflow of creating a full feature film (apart from all the other problems they had).
So I'm guessing, since Toy Story 2 was their 3rd feature film, things were still a bit raw (especially given their fast pace).
How is that underestimated? It was a world of 10 GB hard drives, which is much less than the base Dropbox plan these days. Managing the terabytes of data required for the movie back then was a much more difficult task.
Did you just decide not to read the part where he said "Not that you would"? He was making a point, not actually suggesting that they use Dropbox before it was invented to back up their multi-million-dollar movie production.
"Not that you would" implies that you could. I was pointing out that they could not, for two rather pertinent reasons. I didn't mean to be offensive, but the point is that syncing and managing the huge amount of data they have would be difficult and impractical over the internet even today, and this was also 13 years ago. Especially then, maintaining a single "master" copy on a single Linux machine was in fact a pretty decent solution to the problem.
^ This. Though wow, has it really been that long?
My point was that a single Linux box seems "low tech" considering you need a render farm just to produce the movie. Also, they were clearly referring to the assets and not the rendered film (hats, sets, etc. getting deleted), which is 1000x worse.
The infrastructure to maintain an authoritative copy of terabytes of data on late-'90s hardware in a distributed fashion would have taken a pretty significant amount of development effort.
It'd be much easier to just get a giant RAID NAS and manage it all there, which should in theory be quite sufficient, assuming you are keeping proper backups. Many of the render nodes will have copies of the data they need, but the authoritative copy needs to live somewhere.
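The "authoritative copy lives somewhere" idea can be sketched as a manifest diff: the master publishes a mapping of asset paths to versions, and each render node compares its local manifest against it to decide what to re-fetch. All names here are hypothetical; this is a sketch of the concept, not how Pixar's pipeline actually worked:

```python
# Asset path -> content hash or version id, as published by the master.
Manifest = dict[str, str]


def stale_assets(master: Manifest, local: Manifest) -> list[str]:
    """Assets a render node must re-fetch: missing locally or out of date."""
    return sorted(
        path for path, version in master.items()
        if local.get(path) != version
    )


# Illustrative usage with made-up asset names:
master = {"sets/andys_room.mdl": "v2", "chars/woody/hat.mdl": "v5"}
local = {"sets/andys_room.mdl": "v2", "chars/woody/hat.mdl": "v4"}
# stale_assets(master, local) -> ["chars/woody/hat.mdl"]
```

The design point is that the nodes never arbitrate versions among themselves; they only ever reconcile against the single master, which is what makes one well-backed-up machine a reasonable architecture.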