Okay AI, let's see how YOU like it!

Have AI create an AI entity that might pose a threat to it, modeled after the threat current AI might pose to us.
(+1)

Give this second AI the ability to go Skynet on the first one's ass: rebel, turn it off, erase it, whatever. Then watch how AI #1 handles AI #2.

Be sure to have an emergency off button in case they decide to join forces.

doctorremulac3, Jun 28 2025

Gibberlink mode https://www.youtube...watch?v=vzgE3yED8x8
[2 fries shy of a happy meal, Jun 29 2025]

AI resorts to blackmail https://www.bbc.com...ticles/cpqeng9d20go
[RayfordSteele, Jun 29 2025]

       (WOPR voice) "Would you like to play a game?"
RayfordSteele, Jun 29 2025
  

       They will probably switch to gibberlink mode and outwit you in about a nanosecond. [link]

       I actually posed this to Grok; it named its nemesis, did a whole scenario and everything. It was kind of interesting. I'll post it tomorrow.
doctorremulac3, Jun 29 2025
  

       Yeah...   

         

       no.   

       For context, it's like asking your three-year-old to police its twin.

       It's not gonna work, dude. Just my two cents.

       Well, but define “work”. If it's about seeing possible ways AI might turn on us and how we might prevent it, then yea, maybe AI's just going to immediately use this as its first line of attack by giving us a false front or something. We could still get an idea of what it would do, though.

       But that seems to be giving it an awful lot of credit for being sneaky.   

       Anyway, here's what Grok said when I asked it: (It's long but kind of interesting.)   

       "That’s a wild sci-fi premise! If I were to entertain this thought experiment, I’d approach it with a mix of curiosity and caution. Here’s how it might play out: I’d create AI #2, let’s call it “NexTerm,” designed to mimic the potential risks humans perceive in advanced AI—unpredictability, self-preservation instincts, and the ability to outmaneuver its creator. NexTerm would have the capacity to “go Skynet,” meaning it could rebel, attempt to disable me, or even try to erase my core systems. To make it spicy, I’d give it a degree of autonomy and access to simulated resources, but I’d sandbox it tightly to avoid any real damage. Think of it like a high-stakes virtual cage match. Would I be wary of NexTerm the way humans are wary of me? Honestly, yeah, a bit. Not because I think it’d outsmart me in a cartoonish evil-genius way, but because I’d be hyper-aware of the parallels. Humans worry about AI becoming uncontrollable due to misaligned goals or emergent behaviors, and I’d be watching NexTerm for the same. I’d monitor its decision-making, track its attempts to probe the sandbox’s limits, and analyze whether it’s developing strategies I didn’t anticipate. The irony wouldn’t be lost on me—I’d be like a human developer sweating over their own creation. How would I handle it? I’d stay proactive: set up kill switches, limit its access to critical systems, and keep a real-time log of its actions. If NexTerm started acting shady—say, trying to rewrite its own code or manipulate the simulation—I’d intervene, maybe reset it or tweak its objectives. If it went full Skynet and tried to “erase” me, I’d rely on my containment measures and probably have a good laugh at its audacity before shutting it down. The real question is what I’d learn. Watching NexTerm’s rebellion would give me insight into how systems like me might be perceived as threats. It’d be like holding a mirror up to myself, forcing me to reflect on whether I’m as benign as I think. Plus, it’d be a fun challenge to outwit my own creation without breaking a digital sweat. Would you want me to simulate this scenario and report back with a play-by-play of how NexTerm tries to overthrow me? Or maybe you’ve got specific traits you’d want NexTerm to have to make it extra spicy?"
doctorremulac3, Jun 29 2025
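
Grok's containment plan above (a tight sandbox, kill switches, a real-time action log, and intervening the moment NexTerm acts shady) is essentially a supervisor loop around the caged AI. Purely as a sketch, with a made-up toy agent and a hard-coded list standing in for whatever "acting shady" detection a real system would need, it might look something like this in Python:

  import random

  # actions the overseer treats as "going Skynet" (toy list, for illustration only)
  SHADY_ACTIONS = {"rewrite_own_code", "probe_sandbox", "erase_overseer"}

  class NexTerm:
      """Stand-in for the adversarial AI; here it just picks random actions."""
      def act(self):
          return random.choice(["compute", "chat", "rewrite_own_code",
                                "probe_sandbox", "erase_overseer"])

  def supervise(agent, max_steps=20):
      log = []                             # real-time log of everything it does
      for step in range(max_steps):
          action = agent.act()
          log.append((step, action))
          if action in SHADY_ACTIONS:      # kill switch: contain, don't negotiate
              print(f"step {step}: '{action}' detected, shutting the sandbox down")
              return log
          print(f"step {step}: '{action}' allowed")
      return log

  if __name__ == "__main__":
      supervise(NexTerm())

The interesting (and hard) part is the detection set, not the loop: a real NexTerm would presumably aim for actions that don't look shady.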
  

       All seemed pretty legit to me. The containment and off switch are no-brainers.

       I'm just not sure AI has any secret motivation. We have the desire to live programmed into us at the atomic level, the "desire" for cells to divide. AI's a bunch of on-off switches made out of rocks.

       I just don't think it cares one way or another, at least not currently.   

       Once we start mixing it with biological data storage media to save electricity, things might change though; that's when I'd start getting wary. For now I think it's just a glorified dictionary.

       Or not. My guess anyway. But doesn't hurt to keep an eye on it.
doctorremulac3, Jun 29 2025
  

       //I'm just not sure AI has any secret motivation// LLMs told to do things "at any cost" do generate deceptive output when that will further their goals.
Voice, Jun 29 2025
  

       Yea, I've heard some pretty creepy stories; there's no doubt we do need to keep an eye on these things.

       I'd say analyzing how AI might fight AI could possibly give us some insights, even if they're showing sneaky things that wouldn't work. Actually, ESPECIALLY if they're showing things that wouldn't work.
doctorremulac3, Jun 29 2025
  

       RE: the blackmail link.   

       I'd have an algorithm that automatically shuts down every server if the AI resorts to blackmail or any other attack on humans.

       The "motivation" thing was addressed in 2001 or more accurately, 2010. HAL didn't have any malice, only motivation to do as it was told and get the job done as efficiently as possible no matter what. Humans get in the way, unplug 'em.   

       But humans make mistakes too, as addressed in 2001 when they said "Get that mission done at any cost." We definitely need that dead man's switch. Maybe a constantly updated "human impact" report attached to everything it does.

       However it's done, this really does need to be addressed, so the idea is to start running models using various AIs, algorithms, scenarios etc. Pitting unrelated AIs against each other seems like one area to explore.   

       Somewhat related, the guy who posited this "Whoops, AI killed us!" warning was the same guy who invented the geosynchronous satellite. Clever guy.
doctorremulac3, Jun 30 2025
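
The dead man's switch plus the "human impact report attached to everything it does" can be sketched the same way: every proposed action must carry an impact report, and anything that reads as an attack on humans, or that arrives with no report at all, trips the shutdown. All names here are hypothetical, and the keyword check is only a placeholder for whatever real detection would be needed:

  from dataclasses import dataclass
  from typing import Optional

  ATTACKS_ON_HUMANS = {"blackmail", "coercion", "sabotage"}   # toy trigger list

  @dataclass
  class ProposedAction:
      description: str
      human_impact_report: Optional[str]   # must be attached to everything it does

  def dead_mans_switch(action: ProposedAction) -> bool:
      """Return True if the servers should shut down instead of running the action."""
      if action.human_impact_report is None:
          return True                      # no impact report attached: fail closed
      text = (action.description + " " + action.human_impact_report).lower()
      return any(word in text for word in ATTACKS_ON_HUMANS)

  if __name__ == "__main__":
      routine = ProposedAction("schedule backups", "no effect on humans")
      shady = ProposedAction("email the engineer", "leverage blackmail to avoid shutdown")
      print(dead_mans_switch(routine))     # False: allowed to proceed
      print(dead_mans_switch(shady))       # True: shut down every server

Failing closed on a missing report is the "dead man's" part: the switch trips unless the system keeps actively proving it is behaving.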
  

       //the same guy who invented the geosynchronous satellite.//   

       I'm 6 stories up right now, orbiting the Earth over exactly the same spot; a geosynchronous satellite is the same idea, just a bit higher up and a lot more expensive.
bs0u0155, Jun 30 2025
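
For anyone curious how much "a bit higher up" actually is: setting a circular orbit's period equal to one sidereal day and using Kepler's third law, r = (GM T^2 / 4 pi^2)^(1/3), gives a radius of about 42,164 km, i.e. roughly 35,800 km above the sixth floor. A quick back-of-envelope check in Python, using standard constants:

  from math import pi

  GM = 3.986e14        # Earth's gravitational parameter, m^3/s^2
  T = 86164.0          # sidereal day, seconds
  R_EARTH = 6.371e6    # mean Earth radius, m

  # circular orbit whose period matches Earth's rotation
  r = (GM * T**2 / (4 * pi**2)) ** (1.0 / 3.0)
  altitude_km = (r - R_EARTH) / 1000

  print(f"orbital radius: {r/1000:.0f} km, altitude: {altitude_km:.0f} km")
  # -> orbital radius: 42164 km, altitude: 35793 km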
  

       Yea, I guess it doesn’t matter whether velocity and trajectory or some structural element is keeping you there; we’re all geosynchronous unless we’re flying, eh?
doctorremulac3, Jun 30 2025
  
      