There was an article on here a week or two ago about batch inference.
Do you not think batch inference gives at least a bit of a moat, in that unit costs fall as more prompts are served per unit of time, especially if models get larger and more complex in the future?
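To make the intuition concrete, here is a back-of-the-envelope sketch: if a GPU-hour has a roughly fixed price and a batched forward pass takes roughly the same wall-clock time regardless of how many prompts share it (up to memory/bandwidth limits), then cost per prompt falls roughly linearly with batch size. All the numbers below are made up purely for illustration.

```python
# Toy model of batch-inference unit economics. GPU_HOUR_COST and
# SECONDS_PER_BATCH_STEP are hypothetical placeholder values, not
# real provider pricing or benchmarked latencies.

GPU_HOUR_COST = 2.00          # assumed $/GPU-hour
SECONDS_PER_BATCH_STEP = 0.5  # assumed wall-clock time per batched forward pass

def cost_per_prompt(batch_size: int) -> float:
    """Cost of serving one prompt when batch_size prompts share a step."""
    cost_per_step = GPU_HOUR_COST * SECONDS_PER_BATCH_STEP / 3600
    return cost_per_step / batch_size

for bs in (1, 8, 64, 256):
    print(f"batch={bs:>3}  cost/prompt=${cost_per_prompt(bs):.6f}")
```

Under these toy assumptions, a provider with enough traffic to keep batches full pays a small fraction per prompt of what a low-volume provider pays, which is the scale effect the question is pointing at.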