DeepSeek launched a free, open-source large language model in late December, claiming it was developed in just two months at a cost of under $6 million.
I remember Xilinx from way back in the 90s when I was taking my EE degree, so they were hardly a fledgling in 2019.
Not disputing your overall point, just that detail because it stood out for me since Xilinx is a name I remember well, mostly because it’s unusual.
They were kind of pioneering the space, but about to collapse. AMD did good by scooping them up.
FPGAs have been a thing for ages.
If I remember correctly (I learned this stuff 3 decades ago), they were basically an improvement on logic circuits without clocks (think stuff like NAND and XOR gates: digital signals just go in and the result comes out on the other side with no delay beyond that caused by analog effects such as parasitic inductances and capacitances, so without waiting for a clock transition).
The thing is, back then clocking of digital circuits really took off. It’s WAY simpler to have things done one stage at a time, with a clock synchronizing when results are read from one stage and sent to the next: different gates have different delays, so making sure results are only read after the slowest path is done is complicated without one. So all CPU and GPU architectures nowadays are based on having a clock, with clock transitions dictating things like when each step of processing a CPU/GPU instruction starts.
Circuits without clocks can be way faster than circuits with clocks, if you can manage the problem of different digital elements having different delays in producing results. I think what we’re seeing here is a revival of circuits without clocks (or at least of blocks of logic done between clock transitions that are much longer and more complex than the processing of a single GPU instruction).
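To make the "slowest path" point concrete, here is a rough Python sketch. Everything in it is made up for illustration: the gate delays, the little circuit and the function names are assumptions, not measurements of any real chip. It works out when each signal settles, and shows that a clocked design has to wait for the slowest output before reading anything.

```python
# Hypothetical gate delays in nanoseconds (illustrative numbers only).
GATE_DELAY_NS = {"NAND": 1.0, "XOR": 2.5}

# Hypothetical circuit: each signal maps to (gate type, input signals).
# Signals not listed here ("a", "b", "c") are primary inputs, valid at t = 0.
CIRCUIT = {
    "n1": ("NAND", ["a", "b"]),
    "x1": ("XOR", ["n1", "c"]),
    "x2": ("XOR", ["x1", "n1"]),
    "n2": ("NAND", ["a", "c"]),
}


def settle_time(signal, circuit=CIRCUIT, delays=GATE_DELAY_NS):
    """Time at which `signal` becomes valid after the inputs change."""
    if signal not in circuit:
        return 0.0  # primary input, available immediately
    gate, inputs = circuit[signal]
    # A gate's output is valid one gate delay after its slowest input settles.
    return delays[gate] + max(settle_time(i, circuit, delays) for i in inputs)


def min_clock_period(outputs, circuit=CIRCUIT, delays=GATE_DELAY_NS):
    """Shortest clock period that lets every output settle before it is read."""
    return max(settle_time(o, circuit, delays) for o in outputs)


if __name__ == "__main__":
    outputs = ["x2", "n2"]
    for out in outputs:
        print(f"{out} settles after {settle_time(out)} ns")
    print(f"clocked design must wait {min_clock_period(outputs)} ns for all of them")
```

In this toy circuit `n2` is ready long before `x2`, but with a clock both get read at the same transition, so the fast path sits idle. A clockless design could in principle use `n2` as soon as it settles, which is the speed advantage described above, at the price of having to track each path's delay.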