1) It was mostly manual, though AIs were useful in certain filtering tasks.
2) Depends on the quality you're after. If you only want one run of a similar, off-the-shelf model, somewhere in the 1000s is enough. But given the number of iterations you have to run to build your own model and improve the results, you probably need about 100k.
To tackle this problem, we built our own supercomputer from parts we bought on eBay, though I can't say I recommend that route, because it now lives in our living room.
Not the OP, but my guess would be threadrippers (or similar w/lots of PCIe lanes), each with a great number of GPUs. That's usually what you'd do for training AI in a home lab.
Server processors get you more bang for the buck... iff you're planning to run the hardware flat out for literally years. You save on power, but the up-front cost eats up most of that saving, so for a system that's mostly idle you wouldn't use them. On the other hand, any CPU with fewer PCIe lanes than a TR can't run multiple GPUs at full bandwidth, and TRs are cheap enough that saving on extra PSUs and chassis makes them worth it.
Not to mention that some approaches to training only work if you have multiple GPUs on the same motherboard, namely sharding a single model across GPUs without communication overhead killing any benefit of doing so.
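To make the sharding point concrete, here's a toy sketch (my own illustration, not anything from the posts above): a two-layer network split pipeline-style across two "devices". The weight names and shapes are made up for the example; the key point is the activation hand-off between stages, which is a cheap PCIe copy when both GPUs sit on one motherboard but a slow network transfer otherwise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer weights, one per device.
w0 = rng.standard_normal((8, 16))   # imagine this lives on "GPU 0"
w1 = rng.standard_normal((16, 4))   # imagine this lives on "GPU 1"

def forward(x):
    h = np.maximum(x @ w0, 0.0)     # stage 1 on GPU 0 (linear + ReLU)
    # <-- here the activation crosses devices: cheap over PCIe on one
    #     board, expensive over a network between separate machines
    return h @ w1                   # stage 2 on GPU 1

out = forward(rng.standard_normal((32, 8)))
print(out.shape)
```

Frameworks like PyTorch do the real version of this (tensor/pipeline parallelism), but the cost structure is the same: every stage boundary is one activation transfer per step, so interconnect latency sets the floor on throughput.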