I’m curious whether anyone’s employer has set up their own LLM. My employer has a couple of A100s sitting around, which could easily host a couple of instances of 65B LLaMA or Alpaca. Convincing upper management to let me do it is the hard part.
The 65B quantized model fits in 64GB of RAM, which I already had.
That said, RDIMMs on eBay are even cheaper than UDIMMs (just over $1/GB), and Broadwell-era Xeon workstations aren't that expensive if you want to run the unquantized version.
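
For anyone sizing RAM for this, here is a rough back-of-the-envelope sketch of the weight footprint at different precisions. The function name, the nominal 65e9 parameter count, and the bit widths are my own assumptions for illustration; real quantized files usually take a bit more than the nominal bits per weight because they also store scaling factors, and the context cache and activations need memory on top of the weights.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Nominal "65B"; the true parameter count differs slightly (assumption).
N_PARAMS = 65e9

# Illustrative precisions: 4-bit and 8-bit quantized, and unquantized fp16.
for label, bits in [("4-bit", 4), ("8-bit", 8), ("fp16", 16)]:
    print(f"{label}: ~{weight_footprint_gb(N_PARAMS, bits):.1f} GB")
```

Under those assumptions, 4-bit weights come to about 32.5 GB, which leaves headroom in a 64GB machine, while fp16 weights come to about 130 GB, which is why the unquantized version pushes you toward a workstation with a lot more RAM.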