• just_another_person@lemmy.world
    6 days ago

    Your assessment is missing the simple fact that an FPGA can do some things faster and more cost-efficiently than a GPU can. Nvidia is the Ford F-150 of the data center world, sure. It’s stupidly huge, ridiculously expensive, and generally not needed unless it’s being used at full utilization all the time. That’s like the only time it makes sense.

    If you want to run your own models that have a specific purpose, say, for scientific work folding proteins, and you might have several custom extensible layers that do different things, Nvidia hardware and software don’t even support this because of the nature of TensorRT. They JUST announced future support for such things, and it will take quite some time and some vendor lock-in for models to appropriately support it… OR

    Just use FPGAs to do the same work faster now for most of those things. The GenAI bullshit bandwagon finally has a wheel off, and it’s obvious people don’t care about the OpenAI approach to having one model doing everything. Compute work on this is already transitioning to single purpose workloads, which AMD saw coming and is prepared for. Nvidia is still out there selling these F-150s to idiots who just want to piss away money.

    • NuXCOM_90Percent@lemmy.zip
      6 days ago

      > Your assessment is missing the simple fact that FPGA can do things a GPU cannot faster

      Yes, there are corner cases (many of which no longer exist because of software/compiler enhancements but…). But there is always the argument of “Okay. So we run at 40% efficiency but our GPU is 500% faster so…”
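To put rough numbers on that "40% efficiency but 500% faster" argument, here is a minimal sketch. It assumes "500% faster" means six times the baseline's peak speed (baseline plus 500%) and that the baseline device runs at full efficiency; both are assumptions for illustration, and the function name is made up.

```python
def effective_speedup(utilization: float, percent_faster: float) -> float:
    """Throughput of the faster device relative to a fully used baseline.

    utilization: fraction of the faster device's peak actually achieved (0..1).
    percent_faster: peak advantage in percent; 500 is read as 6x the baseline
                    (assumption: "500% faster" = baseline + 500%).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    peak_ratio = 1.0 + percent_faster / 100.0
    return utilization * peak_ratio


if __name__ == "__main__":
    # The figures quoted in the comment: 40% efficiency, 500% faster.
    print(f"{effective_speedup(0.40, 500):.2f}x the baseline")
```

Under those assumptions the poorly utilized GPU still comes out ahead on raw throughput, which is the point of the argument; whether it comes out ahead on cost is a separate question.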

      > Nvidia is the Ford F-150 of the data center world, sure. It’s stupidly huge, ridiculously expensive, and generally not needed unless it’s being used at full utilization all the time. That’s like the only time it makes sense.

      You are thinking of this like a consumer, where those thoughts are completely valid (just look at how often I pack my hatchback dangerously full on the way to and from Lowe’s…). But also… everyone should have that one friend with a pickup truck for when they need to move or take a load of stuff down to the dump or whatever. Owning a truck yourself is stupid but knowing someone who does…

      Which gets to the idea of having a fleet of work vehicles versus a personal vehicle. There is a reason so many companies have pickup trucks (maybe not an F-150 but something actually practical). Because, yeah, the gas consumption when you are just driving to the office is expensive. But when you don’t have to drive back to headquarters to swap out vehicles when you realize you need to go buy some pipe and get all the fun tools? It pays off pretty fast, and the question stops being “Are we wasting gas money?” and becomes “Why do we have a car that we just use for giving quotes on jobs once a month?”

      Which gets back to the data center issue. The vast majority DO have a good range of cards either due to outright buying AMD/Intel or just having older generations of cards that are still in use. And, as a consumer, you can save a lot of money by using a cheaper node. But… they are going to still need the big chonky boys which means they are still going to be paying for Jensen’s new jacket. At which point… how many of the older cards do they REALLY need to keep in service?

      Which gets back down to “is it actually cost effective?” when you likely need

      • just_another_person@lemmy.world
        6 days ago

        I’m thinking of this as someone who works in the space, and has for a long time.

        An hour of time on a g4dn instance in AWS is 4x the cost of an FPGA that can do the same work faster in MOST cases. These aren’t edge cases, they are MOST cases. Look at SageMaker, AML, GMT pricing for the real cost sinks here as well.
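The arithmetic behind that claim can be sketched as cost per job = hourly price × hours per job. Only the 4x hourly price ratio below comes from the comment; the runtimes are made-up placeholders, not measurements, and the function name is hypothetical.

```python
def gpu_to_fpga_cost_ratio(price_ratio: float,
                           gpu_hours_per_job: float,
                           fpga_hours_per_job: float) -> float:
    """How many times more one job costs on the GPU than on the FPGA.

    price_ratio: GPU hourly price divided by FPGA hourly price.
    gpu_hours_per_job / fpga_hours_per_job: runtime of the same job on each.
    """
    if fpga_hours_per_job <= 0 or gpu_hours_per_job <= 0:
        raise ValueError("runtimes must be positive")
    return price_ratio * gpu_hours_per_job / fpga_hours_per_job


if __name__ == "__main__":
    PRICE_RATIO = 4.0  # from the comment: GPU instance hour costs 4x the FPGA hour

    # Hypothetical runtimes: same speed, then the FPGA finishing in 80% of the time.
    for gpu_h, fpga_h in [(1.0, 1.0), (1.0, 0.8)]:
        ratio = gpu_to_fpga_cost_ratio(PRICE_RATIO, gpu_h, fpga_h)
        print(f"GPU {gpu_h} h vs FPGA {fpga_h} h -> GPU job costs {ratio:.1f}x")
```

Read the other way around, at a 4x hourly price the GPU would have to finish the job more than four times faster than the FPGA just to break even on cost.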

        The raw power and cooling costs contribute to that pricing. At the end of the day, every company will choose to do it faster and cheaper, and nothing about Nvidia hardware fits into either of those categories unless you’re talking about milliseconds of timing, which THEN only fits into a mold of OpenAI’s definition.

        None of this bullshit will be a web-based service in a few years, because it’s absolutely unnecessary.