• 0 Posts
  • 6 Comments
Joined 11 months ago
Cake day: March 22nd, 2024

  • Qwen 2.5 is already amazing for a 14B, so I don’t see how DeepSeek can improve on it that much with a new base model, even if they continue training it.

    Perhaps we need to meet in the middle and have quad-channel APUs like Strix Halo become more common, and maybe release 40-80GB MoE models. Perhaps bitnet ones?
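
    The appeal of that combo is mostly bandwidth arithmetic: token generation is memory-bound, so what matters is how many bytes of active weights get read per token, not the total model size. Here’s a minimal sketch of that estimate, assuming a ~256 GB/s quad-channel APU and hypothetical model shapes; none of these numbers are measurements of Strix Halo or any real model:

    ```python
    # Rough decode-speed math for a memory-bandwidth-bound model on a
    # quad-channel APU. All numbers are illustrative assumptions, not
    # measurements: ~256 GB/s is roughly what a 256-bit LPDDR5X-8000
    # configuration provides, and the model shapes are hypothetical.

    def tokens_per_second(active_params_b, bytes_per_param, bandwidth_gb_s=256.0):
        """Decode roughly reads every active weight once per generated token."""
        bytes_per_token = active_params_b * 1e9 * bytes_per_param
        return bandwidth_gb_s * 1e9 / bytes_per_token

    # Dense 70B at ~4-bit (0.5 bytes/param): all weights touched every token.
    print(f"dense 70B, 4-bit:       {tokens_per_second(70, 0.5):.1f} tok/s")

    # Hypothetical ~60GB-total MoE with ~8B active params at ~4-bit.
    print(f"MoE, 8B active, 4-bit:  {tokens_per_second(8, 0.5):.1f} tok/s")

    # Same MoE with bitnet-style ~1.6-bit weights (~0.2 bytes/param).
    print(f"MoE, 8B active, bitnet: {tokens_per_second(8, 0.2):.1f} tok/s")
    ```

    The dense 70B lands in single-digit tok/s on that bandwidth, while the sparse model stays interactive even though its total weights wouldn’t fit in a typical GPU, which is the whole point of pairing big-memory APUs with MoE.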

    Or design them for asynchronous inference.

    I just don’t see how 20B-ish models can perform like ones an order of magnitude bigger without a paradigm shift.