• 0 Posts
  • 27 Comments
Joined 1 month ago
Cake day: August 10th, 2025

  • Yeah, I get it. I don’t think it is necessarily bad research or anything. I just feel like maybe it would have been good to go into it as two papers:

    1. Look at the funny LLM and how far off the rails it goes if you don’t keep it stable, let it kind of “build on itself” iteratively over time, and don’t put the right boundaries on it
    2. How should we actually wrap an LLM up in a sensible framework so that it can pursue an “agent” type of task; what leads it off the rails and what doesn’t; what are some ideas for keeping it grounded, and which ones work and which don’t

    And yeah, obviously they can get confused or output counterfactuals or nonsense as a failure mode; what I meant to say was just that they don’t really do that in response to an overload / “DDoS” situation specifically. They might do it as a result of too much context or a badly set-up framework around them, sure.


  • Yes, I get all that. What I’m saying is that you’re making it pretty clear that they’re getting a corrupt bargain but you’re still going to make them go through all the bullshit.

    Like if you show up in Dubai and give $5 million to the right person, I’m pretty sure they don’t make you stand in line and pay your visa processing fee, and then say you can only stay 270 days and have to scram or else they’ll be in trouble. You just get to come hang out.

    Just either be aboveboard, and make them go through all the hassle, or have them give you 5 million dollars and then say “Hey broski, we’ll take care of it.” Just talk to your friends that you nominated to run the IRS and say “Hey this guy’s offshore income is going to be $17 and some change, we’re fine with that, right? He’s a friend of mine.” That kind of thing. Give the IRS a little list of the few hundred $5 million platinum card holders and make sure they have some vague understanding of what to do with the information. I don’t get this weird middle-ground bullshit where they’re paying the bribe, but they’re still getting treated like a pleb and made to go through the bureaucracy.

    I mean honestly the way it reads to me is that they kind of want to keep an eye on you, they want to put themselves in the position of deciding whether or not you’re allowed back into the country every 270+90 days. I feel like most people who are in that “not American while having to be at the mercy of US immigration” category, intersected with the “capable enough to have $5 million to throw around” category, are probably going to be able to see through that stuff. That’s what I was saying, more to the point.

    It’s just more of Trump’s MO. He’s transactional, pathologically so, to the point where it’s all he really understands, but the other person never actually gets their end of the transaction. He gets his, and then the other person gets fucked. That really comes through to me reading this, because of how convoluted it sounds even when they’re trying to make it sound like this wonderful thing.


  • “For a processing fee and, after DHS vetting, a $5 million contribution, you will have the ability to spend up to 270 days in the United States without being subject to U.S. taxes on non-U.S. income.”

    Fuckin’… what?

    Why do I have to pay a processing fee before giving you 5 million dollars?

    Why is it “up to 270 days”? Who is going to be swayed by this perk into opting for the platinum card instead of the gold? What the fuck is all this? I understand selling residency; a lot of shithole countries do that, and it’s usually successful at what they’re trying to achieve with it. Why are you immediately walking it back with all these nonsensical asterisks, though? Did someone put this page together subversively, because they really want people to look at how un-benevolent the whole package is and start to think twice about who it is exactly that they’re making this Faustian bargain with?


  • Initial thought: Well… but this is a transparently absurd way to set up an ML system to manage a vending machine. I mean, it is a useful data point I guess, but to me it leads to the conclusion “Even though LLMs sound to humans like they know what they’re doing, they do not; don’t just stick the whole situation into the LLM input and expect good decisions and strategies to come out of the output. You have to embed it into a more capable and structured system for any good to come of it.”

    Updated thought, after reading a little bit of the paper: Holy Christ on a pancake. Is this architecture what people have meant by “AI agents” this whole time I’ve been hearing about them? Yeah, this isn’t going to work. What the fuck, of course it goes insane over time. I stand corrected, I guess; this is valid research pointing out the stupidity of putting the LLM in the driver’s seat of something even more complicated than the stuff it’s already been shown to fuck up, and hoping that goes okay.

    Edit: Final thought, after reading more of the paper: Okay, now I’m back closer to the original reaction. I’ve done stuff like this before, and this is not how you do it. Have it output JSON, build some tolerance and retries into the framework code for parsing that JSON, be more careful with the prompts so it’s set up for success, and definitely don’t stuff the entire history into the context, all the way up to the wildly-inflated context window, where it’s guaranteed to go off the rails. Basically, be a lot more careful with how you set it up than this, and put a lot more limits on how much you’re asking of the LLM so that it can actually succeed within the little box you’ve put it in. I am not at all surprised that this setup went off the rails in hilarious fashion (and it really is hilarious, you should read it). That’s just what LLMs do.

    I don’t know if this is because the researchers didn’t know any better, or because they were deliberately setting up the framework around the LLM to produce bad results, or because this stupid approach really is the state of the art right now, but this is not how you do it. I’m actually a little skeptical about whether you even could set up a framework for a current-generation LLM that would enable it to succeed at an objective, pretty frickin’ complicated task like the one they set up here, but regardless, this wasn’t a fair test. If it was meant as a test of “are LLMs capable of AGI all on their own, regardless of the setup, the way humans generally are,” then congratulations, you learned the answer is no. But you could have framed it a little more directly around that being the question, instead of building a poorly-designed agent framework into the middle of it.
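    To be concrete, here’s roughly the kind of minimal framework loop I mean. This is just a sketch of the general idea, not what the paper actually did; call_llm, the action names, and the limits are placeholders for whatever API and task you actually have:

    ```python
    import json

    MAX_HISTORY = 10  # keep only the last few turns instead of the whole inflated history
    MAX_RETRIES = 3   # tolerate a couple of malformed responses before giving up

    SYSTEM_PROMPT = (
        "You manage a vending machine. Respond ONLY with JSON of the form "
        '{"action": "restock" | "set_price" | "wait", "item": "<name>", "value": <number>}.'
    )

    def call_llm(messages):
        """Placeholder for whatever chat-completion API you're actually using."""
        raise NotImplementedError

    def next_action(history, observation):
        # Trim the context instead of shoveling the entire run into the prompt.
        recent = history[-MAX_HISTORY:]
        messages = (
            [{"role": "system", "content": SYSTEM_PROMPT}]
            + recent
            + [{"role": "user", "content": json.dumps(observation)}]
        )
        for _ in range(MAX_RETRIES):
            raw = call_llm(messages)
            try:
                action = json.loads(raw)
            except json.JSONDecodeError:
                continue  # garbage out: retry instead of letting it snowball
            if isinstance(action, dict) and action.get("action") in {"restock", "set_price", "wait"}:
                return action  # only hand back moves the framework knows how to execute
        return {"action": "wait"}  # safe default when the model can't produce valid JSON
    ```

    The point being that the framework, not the LLM, owns the history, the retries, and the set of legal moves; the model only ever sees a small, well-formed slice of the world and can only answer in a shape the code can validate.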



  • I think the crisis of Trump is likely to be worse than any crisis in the Western world in the last 50 years. I think the closest analogue is probably the collapse of the USSR. So yes, some of the rich people upped their wealth by orders of magnitude, and honestly you might be right that Zuck might manage to be in that category, but some of them also lost everything, got thrown out of windows, or had to survive in reduced capacity inside their new walled fortresses in the horrifying new meta. I feel like it’s more likely that the MAGA world will remember Facebook censoring their posts about ivermectin and not feel like Zuck needs a seat at the table, no matter how many ass-kissing sessions he shows up at the White House to do.

    For example I feel like breaking up Meta and mandating Truth Social and TikTok as the only new sanctioned social media going forward might be one possible outcome. It’s kind of hard to say and I won’t swear that you’re definitely wrong that he might come out way ahead in the end. I’m just saying that this type of crisis is a very different type of crisis.