Not small but… smaller than you would expect.
Most companies aren’t, and shouldn’t be, training their own models. Especially with stuff like RAG, where you can point a heavily pretrained model at your proprietary offline data with only a minimal performance hit.
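To make the RAG point concrete, here is a minimal sketch of the pattern: embed your proprietary docs locally, retrieve the closest ones at query time, and stuff them into the prompt of a model somebody else already paid to train. The model name, documents, and query are all just placeholders.

```python
# Minimal RAG sketch: local retrieval over proprietary docs, prompt
# assembly for a rented off-the-shelf model. Everything here is a
# placeholder, not anyone's production setup.
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

docs = [
    "Internal memo: Q3 widget output rose 12%.",
    "Policy: all returns require a ticket number.",
    "Spec: the v2 flange tolerates 40 PSI.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # small, runs fine on CPU
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k docs most similar to the query (cosine similarity)."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # normalized vectors, so dot product == cosine
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

query = "What pressure can the flange handle?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# `prompt` then goes to whatever big hosted model you rent; your data
# never touches a training run.
print(prompt)
```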
What matters is inference and accuracy/validity. Inference is ridiculously cheap (a big part of why AI/ML got so popular), and the latter is a whole different can of worms that industry and researchers don’t want you to think about (in part because a “correct” answer might still be blatant lies, because it is based on human data, which is often blatant lies, but…).
And for the companies that ARE going to train their own models? They make enough bank that ordering the latest Box from Jensen is a drop in the bucket.
That said, this DOES open the door back up for tiered training and the like, where someone might use a cheaper commodity GPU to enhance an off-the-shelf model with local data or preferences. But it is unclear how much industry cares about that.
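For anyone wondering what that commodity-GPU path usually looks like in practice, the common trick is adapter-style fine-tuning (e.g. LoRA via Hugging Face’s peft), where the base model stays frozen and you only train a tiny fraction of the weights. A hedged sketch, using gpt2 purely as a stand-in base model:

```python
# LoRA sketch: freeze the pretrained base, train a few hundred thousand
# adapter weights instead of billions. That delta is what makes
# consumer-card training plausible. Model choice here is illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model  # pip install peft transformers

base = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in base model

config = LoraConfig(
    r=8,                        # adapter rank: the quality-vs-memory knob
    lora_alpha=16,
    target_modules=["c_attn"],  # gpt2's attention projection; varies by model
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()
# For gpt2 this reports roughly 0.3M trainable out of ~124M total (~0.24%).
# You train that sliver on local data; the frozen 99.7% is exactly the
# expensive part you never had to pay to train.
```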
Yes, there are corner cases (many of which no longer exist because of software/compiler enhancements but…). But there is always the argument of “Okay. So we run at 40% efficiency but our GPU is 500% faster so…”
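That argument is just arithmetic. With made-up numbers (and reading “500% faster” as 5x raw throughput):

```python
# Back-of-envelope version of the "we waste cycles but still win" argument.
# All numbers are hypothetical.
old_card = 1.0 * 0.90   # baseline throughput at 90% efficiency
new_card = 5.0 * 0.40   # 5x the raw throughput at only 40% efficiency
print(new_card / old_card)  # ~2.2x net speedup despite the "waste"
```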
You are thinking of this like a consumer, where those thoughts are completely valid (just look at how often I pack my hatchback dangerously full on the way to and from Lowe’s…). But also… everyone should have that one friend with a pickup truck for when they need to move or take a load of stuff down to the dump or whatever. Owning a truck yourself is stupid, but knowing someone who does…
Which gets to the idea of having a fleet of work vehicles versus a personal vehicle. There is a reason so many companies have pickup trucks (maybe not an F-150 but something actually practical). Because, yeah, the gas consumption when you are just driving to the office is expensive. But when you don’t have to drive back to headquarters to swap vehicles after you realize you need to go buy some pipe and grab all the fun tools? It pays off pretty fast, and the question stops being “Are we wasting gas money?” and becomes “Why do we have a car that we just use for giving quotes on jobs once a month?”
Which gets back to the data center issue. The vast majority DO have a good range of cards, either from outright buying AMD/Intel or just from having older generations of cards still in use. And, as a customer, you can save a lot of money by renting a cheaper node. But… they are still going to need the big chonky boys, which means they are still going to be paying for Jensen’s new jacket. At which point… how many of the older cards do they REALLY need to keep in service?
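To put made-up numbers on that question:

```python
# Hypothetical numbers for the "how many old cards do we keep?" question.
# Assume the new card does the same job on a quarter of the energy.
ELEC_RATE = 0.10          # $/kWh, assumed
OLD_KWH_PER_JOB = 0.004   # made-up energy per job on the old card
NEW_KWH_PER_JOB = 0.001   # made-up: new card is 4x as efficient

print(f"old card: ${OLD_KWH_PER_JOB * ELEC_RATE:.6f}/job")
print(f"new card: ${NEW_KWH_PER_JOB * ELEC_RATE:.6f}/job")
# Even a fully depreciated card pays 4x the opex per job, before you
# count rack space, cooling, and the ops overhead of a mixed fleet.
```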
Which gets back down to “is it actually cost effective?” when you likely need the big expensive cards anyway.