
Thoughts on the Eliezer vs. Hotz AI Safety Debate

I just got done watching the debate on AI safety between George Hotz and
Eliezer Yudkowsky on Dwarkesh Patel’s podcast.

The debate was quite fun to watch, but also frustrating.

What irked me about the debate—and all similar debates—is that they fail to
isolate the disagreements. 90% of the discussion ends up being heat instead
of light because they’re not being disciplined about:

  1. Finding the disagreement

  2. Addressing their position on that disagreement

  3. Listening to a rebuttal

  4. Deciding if that’s resolved or not

  5. Either repeating 2-4 or moving on to the next #1

Instead what they do (they being the General They in these debates) is
produce 39 different versions of #2, back and forth, which is dazzling to
watch, but doesn’t result in a #4 or #5.

It feels like a Chinese martial arts movie from the ’80s after you’ve seen a
lot of MMA. Like, why don’t you just HIT him? Why all the extra movements?

I think we can do better.

How I’d characterize and address each of their positions

I’m not saying I’d do better in a live debate with either of them. I could
very well get flustered or over-excited and end up in a similar situation—or
worse.

But if I had time and notes, I’d be able to do much better. And that’s what
I have right now with this written response. So here’s what I see from each
of them.

Hotz’ arguments

I’m actually not too clear on Hotz’ argument, to be honest, and that’s a
problem for him. It’s also why I think he lost this debate.

He’s flashy. And super smart, obviously, but I feel like he was just taking
sniper shots from a distance while mobile.

Wait, I think I have it. I think he’s basically saying:

Hotz Argument 1: We’ll have time to adjust

  1. AI’s intelligence will not explode quickly

  2. Timing matters a lot because if it moves slowly enough we will also have
    AIs and we’ll be able to respond and defend ourselves

  3. Or, if not that, then we’d be able to stage some other defense

My response to this is very simple, and I don’t know why Eliezer and other
people don’t stay clearly focused on this:

  1. We just accidentally got GPT-4, and the jump from GPT-2 to GPT-4
    was a few years, which is basically a nanosecond

  2. The evolution of humans happened pretty quickly too, and they had no
    creator guiding that development. That evolution came from scratch, with
    no help whatsoever. And as stupid and slow as it was, it led to us
    typing this sentence and creating GPT-4

  3. So given that, why would we think that humanity in 2023, when we just
    created GPT-4, and we’re spending what?—tens of billions of dollars?—on
    trying to create superintelligence, would not be able to do it quickly?

  4. It’s by no means a guarantee, but it seems to me that given #1 and #2,
    betting against the smartest people on planet Earth, who are
    spending that much money, being able to jump way ahead of human
    intelligence very soon is a bad, bad bet

  5. Also keep in mind that we have no reason whatsoever to believe human IQ
    is some special boundary. Again, we are the result of a slow, cumbersome
    chemical journey. What was the IQ of humans 2,000 years ago compared to
    the functional (albeit narrow) IQ of GPT-4 that we just stumbled into
    last year?

Hotz Argument 2: Why would AI even have goals, and why would they be counter
to us?

  1. There’s no reason to believe they’ll come up with their own goals that
    are counter to us

  2. This is sci-fi stuff, and there’s no reason to believe it’s true

Eliezer addressed this one pretty well. He basically said that—as you
evolutionarily climb the ladder—attaining goals becomes an advantage that
you’ll pick up. And we should expect AI to do the same. By the way, I think
that’s exactly how we got subjectivity and free will as well, but that’s
another blog post.

I found his refutation of Hotz Argument #2 to be rock solid.

Now for Eliezer’s arguments.

Yudkowsky’s arguments

I think he really only has one, which I find quite sound (and frightening).

  1. Given our current pace of improvement, we will soon create one or more
    AIs that are vastly superior to our intelligence

  2. This might take a year, 10 years, or 25 years. Very hard to predict, but
    it doesn’t matter because the odds of us being ready for that when it
    happens are very low

  3. Because anything that advanced is likely to take on a set of goals (see
    evolution and ladder climbing), and because it’ll be creating those
    goals from a massive base of intelligence and data, it’ll likely have
    goals something like “gain control over as many galaxies as possible to
    control the resources”

  4. And because we, and the other AIs we could create, are competitors in
    that game, we are likely to be labeled as an enemy

  5. If we have lots of time and have advanced enough to have AIs fight for
    us, this will be our planet against their sun. And if not, it’ll be
    their sun against our ant colony

  6. In other words, we can’t win that. Period. So we’re fucked

  7. So the only smart thing to do is to limit, control, and/or destroy
    compute

Like I said, this is extremely compelling. And it scares the shit out of me.

I only see one argument against it, actually. And it’s surprising to me that
I don’t hear it more from the counter-doomers.

It’s really hard to get lucky the first time. Or even the tenth time. And
reality has a way of throwing obstacles in front of everything. Including
superintelligence’s ascension.

In other words, while it’s possible that some AI just wakes up and instantly
learns everything, goes into stealth mode, starts building all the diamond
nanobots and weapons quietly, and then—BOOM—we’re all dead…that’s also not
super likely.

What’s more likely—or at least I hope is more likely—is that there
will be multiple smaller starts.

A realistic scenario

Let’s say someone makes a GPT-6 agent in 2025 and puts it on GitHub, and
someone gives it the goal of killing someone. And let’s say there’s a market
for Drone Swarms on the darkweb, where you can pay $38,000 to have a swarm
go and drop IEDs on a target in public.

So the Agent is able to research the darkweb and find where it can rent one
of these swarms. Or buy one. Whatever. So now there’s lots of iPhone footage
of some political activist getting killed by 9 IEDs being dropped on him in
Trafalgar Square in London.

Then within 48 hours there are 37 other deaths, and 247 injuries from
similar attacks around the world.

Guess what happens? Interpol, Homeland Security, the Space Force, and every
other law enforcement agency everywhere suddenly goes apeshit. The media
freaks out. The public freaks out. Github freaks out. OpenAI freaks out.

Everyone freaks out.

They find the Drone Swarm people. They find the Darkweb people. They bury
them under the jail. And all the world’s lawmakers go crazy with new laws
that go into effect on like…MONDAY.

Now, is that a good or a bad thing?

I say it’s a good thing. Obviously I don’t want to see people hurt, but I
like the fact that really bad things like this tend to be loud and visible.
Drone Swarms are loud and visible.

And so too are many other instances of an early version of what we’re
worried about. And that gives us some hope. And some time.

Maybe.

That’s my only argument for how Eliezer could be wrong about this: basically,
it won’t happen all at once, in one swift motion, in a way that goes from
invisible to unstoppable.

Here’s how I wish these debates were conducted

  1. Point: We have time. Counterpoint: We don’t. See evolution and GPT-4.

  2. Point: We have no reason to believe they’ll develop goals. Counterpoint:
    Yes we do; goals are the logical result of evolutionary ladder climbing
    and we can expect the same thing from AI.

  3. Etc.

We should have a GitHub repo for these.

Summary

  1. I wish these debates were structured like the above instead of like
    Tiger Claw vs. Ancient Swan technique.

  2. Hotz’ main argument is that we have time, and I don’t think we do. See
    above.

  3. Eliezer’s main argument is that we’re screwed unless we
    limit/control/destroy our AI compute infrastructure. I think he’s likely
    right, but I think he’s missing that it’s really hard to do anything
    well on the first try. And we might have some chances to get aggressive
    if we can be warned by early versions failing.

Either way, super entertaining to see both of them debate. I’d watch more.

May 23, 2025
