Episode Transcript
2:00
have turned many people into much more extreme
2:02
versions of themselves. So
2:04
that's one example. But in the
2:06
long run, if you just ask anyone,
2:08
even a person, a die-hard skeptic
2:10
as we call them, or a
2:13
denialist as you might also call
2:15
them, okay, so we're investing
2:17
hundreds of billions of dollars with the
2:19
goal of creating general purpose intelligence
2:22
that's more intelligent than human beings,
2:25
and therefore more powerful than human beings. How
2:27
do you propose to retain power over
2:30
more powerful entities than yourself forever?
2:33
So that's the question. And
2:36
usually when you put it like
2:38
that, people say, oh yeah, I see what you mean. Okay,
2:41
I haven't thought about that. And
2:43
that's the issue, right? We are spending hundreds
2:46
of billions of dollars to achieve something and
2:48
we haven't thought about it yet. So
2:51
my colleague or
2:54
former student, Andrew Ng, is one
2:56
of the skeptics and he says, well, you know, I
2:58
don't worry about this any more than I worry about overpopulation
3:01
on Mars. But
3:04
if we had a plan to move the
3:06
entire human race to Mars and
3:09
no one had thought about what we were going to
3:11
breathe when we got there, you
3:13
would say that's an unwise plan. But
3:17
that's the situation that we're in. No one has thought
3:19
about what happens if we succeed. So
3:21
the book is partly about how
3:24
to convince people that this matters
3:27
and then what is my
3:30
proposal for doing something about it. Right.
3:32
And the social media thing is interesting because I was thinking
3:34
about if you could go back in time 10 or
3:37
even just five years and
3:39
you tried to be the Paul
3:42
Revere of this system,
3:44
it would be really difficult to even convince people
3:47
of what 2019 would look like in
3:49
that way. Like, I don't think people
3:52
would have believed we would have entire governments fundamentally
3:54
altered as an unintended consequence
3:56
of optimizing for click through. And yet
3:58
it already happened this quickly. Yes,
4:02
so for the non-American listeners,
4:04
Paul Revere is someone who warned that the
4:06
British are coming, the British are coming, so
4:08
he was on the side of the American
4:10
revolutionaries. And
4:13
I guess my recommendation
4:15
would have been, first
4:17
of all, change the way you
4:19
think about the problem. So don't
4:22
just think, okay, what
4:24
is my objective, my in this case
4:26
being the social media platforms, what
4:28
is my immediate short-term objective is to make
4:30
money, and then set
4:32
up some optimizing machinery with that
4:34
as the objective, and then completely
4:36
ignore the effects that that's
4:38
going to have on things other
4:41
than your bottom line. So
4:45
with, you
4:47
know, chemical companies that used to just dump poisonous
4:50
chemicals into the river while
4:53
they were making money, we said, okay, you
4:55
have to stop doing that or you have
4:57
to pay enormous fines or taxes or whatever.
5:00
We're trying to tax the
5:02
oil companies and coal companies for the carbon dioxide,
5:05
but that doesn't seem to be working. So
5:10
we can't really do that with turning
5:14
people into neo-fascists. It's not clear, you
5:16
know, what should the penalty be per
5:18
neo-fascist created. But
5:24
basically, if you're
5:26
going to build a system that messes
5:28
with stuff whose
5:31
value you're not sure
5:33
about, then you should
5:35
try to avoid messing with that stuff.
5:38
So in this case, the stuff is the
5:40
human mind, you know, our opinions,
5:42
our positions, our perceptions of the
5:44
world. So to the extent possible,
5:48
don't build systems that mess with that.
5:51
Since you don't know whether that messing is
5:53
a good idea or a bad idea. And
5:57
so you can design algorithms that... are
6:01
much less likely to manipulate people. So
6:04
the basic difference for the geeks
6:07
is between a supervised learning algorithm
6:09
that learns what people
6:11
want and a reinforcement learning algorithm that
6:14
changes what people want so that it's
6:16
easier to supply. And the
6:18
reinforcement learning algorithm doesn't know you have a brain.
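For the geeks following along, here is a toy sketch of the distinction being drawn. It is a hypothetical illustration, not code from the book or the episode, and the model of users, items, and clicks is entirely invented.

```python
# Hypothetical toy sketch (not the episode's or the book's): the supervised
# recommender only estimates what the user currently likes; the RL-style policy is
# scored on clicks over time, so it profits from dragging the user's state along.
# All names and numbers below are invented for illustration.

def click_probability(user_extremeness, item_extremeness):
    # Toy world: users are most likely to click items slightly more extreme than they are.
    return max(0.0, 1.0 - abs(item_extremeness - (user_extremeness + 0.1)))

def supervised_pick(user_extremeness, items):
    # Uses an estimate of current preferences and serves the best match;
    # it never reasons about how the user will change afterwards.
    return max(items, key=lambda item: click_probability(user_extremeness, item))

def rl_style_policy(user_extremeness, horizon=20):
    # Stand-in for a long-horizon policy: because consuming extreme items drags the
    # user's state toward them, total clicks are maximized by ratcheting upward.
    total_clicks, user = 0.0, user_extremeness
    for _ in range(horizon):
        item = min(1.0, user + 0.1)         # push a bit past the current state
        total_clicks += click_probability(user, item)
        user = 0.9 * user + 0.1 * item      # the user drifts toward what they consume
    return total_clicks, user

if __name__ == "__main__":
    items = [i / 10 for i in range(11)]     # item "extremeness" from 0.0 to 1.0
    print("supervised pick for a 0.2 user:", supervised_pick(0.2, items))
    clicks, final_state = rl_style_policy(0.2)
    print(f"RL-style policy: {clicks:.1f} clicks, user drifts to extremeness {final_state:.2f}")
```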
6:20
It doesn't know you have political opinions. You're
6:22
just a clickstream history. And
6:27
they learn that given a certain
6:29
type of clickstream history, if
6:32
you subsequently feed certain articles
6:34
to that clickstream history, it
6:37
starts to generate more money. And
6:40
that's it. So it
6:42
turns out from our side that you're
6:45
gradually feeding people more and more extreme
6:47
violent videos or more and more extreme
6:49
pornographic content or more and more extreme
6:51
political content. Whereas
6:55
a supervised learning algorithm is not trying to change
6:57
the world. It's just trying to learn what the
6:59
world is like. In this case, learn what your
7:01
opinions are. So you might still get a bit
7:03
of an echo chamber effect, but
7:05
you wouldn't get this manipulation of people
7:07
to the extremes, which is
7:09
what seems to have happened. So
7:12
this is a general principle. And
7:15
this is one of the consequences
7:17
of the new way of doing AI that the
7:20
book proposes is that when
7:23
the algorithm knows that it doesn't know
7:26
the value of everything, it
7:29
will naturally avoid messing with the parts of
7:31
the world whose value it's not sure about.
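A rough sketch of that behavior, under invented assumptions rather than the book's actual formulation: the agent keeps an explicit set of world attributes whose value it is unsure about, skips any plan that would disturb them, and falls back to asking a human when every plan does.

```python
# Hypothetical illustration (not the book's algorithm): an agent that is explicitly
# uncertain about the value of some world attributes prefers plans that leave those
# attributes alone, and asks before touching them.

def choose_action(actions, known_values, uncertain_attrs):
    """actions maps a plan name to the attribute changes that plan would cause."""
    best, best_score = None, float("-inf")
    for name, effects in actions.items():
        if any(attr in uncertain_attrs for attr in effects):
            continue                           # skip plans that disturb attributes of unknown value
        score = sum(known_values.get(attr, 0.0) * delta for attr, delta in effects.items())
        if score > best_score:
            best, best_score = name, score
    return best if best is not None else "ask_human_for_permission"

if __name__ == "__main__":
    actions = {
        "capture_co2_with_acid": {"co2": -10.0, "ocean_ph": -5.0},
        "plant_trees":           {"co2": -2.0},
    }
    known_values = {"co2": -1.0}               # less CO2 is known to be good
    uncertain = {"ocean_ph"}                   # value of changing ocean chemistry: unknown
    print(choose_action(actions, known_values, uncertain))   # -> plant_trees
```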
7:34
And if it does have to mess with that, it
7:37
will ask permission. So
7:40
if it was a climate
7:42
control system before turning
7:44
the ocean into sulfuric acid in order to
7:46
reduce the amount of carbon dioxide, it
7:50
would ask permission because it's not sure if we want the
7:52
ocean to be made of sulfuric acid. And
7:55
so you get the
7:57
kind of deferential behavior that you would hope
7:59
for. by
12:00
just playing the right moves. Well, in theory, you
12:02
could just play the right moves, but in practice,
12:04
you can't because it's smarter than you
12:07
are. So
12:09
it will always anticipate and
12:11
frustrate your attempts. And it'll
12:13
take preemptive steps and possibly
12:15
even deceptive steps. So
12:18
it might pretend to
12:20
be innocuous, harmless, and
12:22
stupid long enough to prepare
12:24
all of its defenses so that it can
12:27
carry out the objective that you gave it.
12:30
It's not deceptive because it's
12:32
evil or because it wants to do something different
12:34
from what you told it. It's
12:36
just afraid that something it might do
12:39
would cause you to switch it off.
12:41
And so since it needs
12:43
to achieve the objective that's been programmed
12:45
into it, it develops
12:47
a subterfuge of appearing helpless so
12:50
that it can prepare all its
12:52
defenses and
12:55
then come out with the real plan to
12:57
achieve the objective. Whereas
13:01
in the new approach to AI, you
13:04
get the exact opposite effect that the
13:06
smarter the machine is, the
13:08
better it is for you. Because the
13:10
better it learns what your
13:12
true preferences are, the better
13:15
it avoids messing with
13:17
parts of the world that it's not sure about. And
13:21
just in general, it's going to be more useful to you. So
13:26
really, partly the
13:28
book is aimed at everybody, saying, look,
13:31
here is AI. Here is how it's
13:33
done. Here is why doing more of that leads
13:35
you off the cliff. And
13:38
then this is other approach. But it's also
13:40
a little bit aimed at the AI community
13:43
to say, listen, I think
13:46
I want everyone to stop
13:49
and think about how they're building their systems.
13:53
And I'm not wagging my finger and saying, you're bad
13:55
people. I'm just saying
13:57
the method of engineering that we've
13:59
developed. And
14:02
it was developed back in the middle of the 20th century,
14:06
the basic paradigm. And
14:08
it's the same paradigm as we have
14:10
for control theory, control engineering, where
14:13
you have a fixed cost function
14:15
that the controller has
14:17
to minimize. In
14:20
economics, you have
14:22
a fixed target like a GDP or the corporate
14:24
profit. In
14:27
statistics, you try to minimize the
14:29
loss function, so basically
14:31
the cost of prediction errors. And
14:34
in all these cases, we assume that the
14:36
objective is fixed and known to
14:38
the machinery that's supposed to be optimizing it. And
14:42
that's just an extreme and
14:45
extremely unrealistic special case
14:48
of what is generally true, which is
14:50
that the machinery that's supposed to be
14:52
optimizing doesn't have access to the
14:54
objective. Right. And you
14:56
spend a fair bit of time in the
14:58
book talking about the definition of intelligence, even
15:00
in humans. And we
15:02
don't always know what our true reasoning
15:05
behind things are. We're not even
15:07
anywhere close. You said that we're as
15:09
far away from being rational as,
15:11
what was the analogy? I
15:14
think it's as far as a slug is from overtaking the
15:16
Starship Enterprise at warp nine. That's the
15:18
way you put it. Yeah. So
15:22
there's a number of things about
15:24
human intelligence that are not
15:27
ideal. So
15:29
one of them is clearly that the
15:32
world is much, much, much too complicated
15:34
to actually behave rationally, i.e.
15:37
for our actions to be the ones
15:39
that best satisfy our own preferences about
15:41
the future. So you
15:43
can see that very simply if you look at chess,
15:46
right? You're
15:48
standing there in front of a chess board.
15:51
That chess board is a tiny little piece of
15:53
the real world, and it's very, very well behaved.
15:57
We know exactly how the pieces move and what
15:59
the rules are. And yet we
16:01
still can't make the right decision in
16:04
that situation. And the real world
16:06
is so much more complicated: the
16:09
horizons are so much longer than
16:11
they are in chess, and there
16:13
are so many more moving parts.
16:15
The rules are so much less
16:17
well known. The world is much
16:19
less predictable. So that means that,
16:21
as a practical matter,
16:24
in fact, no computer is ever going
16:26
to be rational either.
16:28
Even if it was the size of the universe,
16:30
it still couldn't calculate what
16:32
the right course of action is.
16:34
So
16:37
that's one thing sometimes called bounded
16:39
rationality. But
16:42
another thing is you say that we
16:44
don't even know our own preferences about
16:46
the future, and so that
16:49
makes it doubly hard to write
16:51
them down completely and correctly and provide
16:53
them to the machine. A sort of
16:56
example I use in the book, which
16:58
apparently has already been adopted
17:00
by some philosophers, is this
17:02
fruit called the durian, which
17:05
I have never tried, and I
17:07
deliberately didn't try it while I
17:09
was writing the book. Because
17:13
the durian fruit is something
17:15
that some people think is completely sublime,
17:17
and writers going back
17:19
to the nineteenth century have described it as the
17:22
most sublime food provided on
17:24
this earth. And
17:26
then other people say, well, it reminds
17:28
me of skunk spray, vomit,
17:30
you know, used wound swabs. And
17:33
isn't it banned on the airlines
17:35
and in some countries? Yeah, yeah, so
17:37
it's common in
17:39
Southeast Asia, Indonesia and so on, and
17:42
every so often you hear
17:44
one of these durian emergencies where you
17:46
know they've packed them into a
17:48
crate on an airplane and they didn't
17:51
seal it properly and the passengers
17:53
revolt and force the pilot to
17:55
turn around and land the plane, or
17:58
an entire building gets evacuated and
18:01
so on and so forth. So for
18:03
some people, it's absolutely unbearable. And I
18:06
don't know which of those two kinds of people I am.
18:09
So that's a clear case where
18:11
I don't know my preferences about a future
18:13
that involves eating durian, right? Is that a
18:15
future I want or a future I don't
18:18
want? And I don't know. And in
18:23
fact, when you think about it, that's
18:25
pretty much the universal situation
18:28
we find ourselves in. You know, if you're
18:31
finishing high school and you go to the
18:33
career counselor and they say, well, you know,
18:35
there's a job in the coal mine or
18:37
there's a job open in the library, right?
18:40
So do you want to be a librarian or a coal miner? You
18:43
don't know. You haven't the faintest idea. You
18:45
don't know how you're going to enjoy
18:48
being underground or being surrounded by dusty
18:50
books and have no one to talk
18:52
to for hours on end. And
18:56
so I think this is actually pretty
18:59
much the normal condition that there's large
19:01
parts of our
19:04
own preferences, meaning
19:06
how much we will like any
19:08
given life that
19:11
we just don't know until we
19:14
see it. You know, someone who's good at
19:16
introspection probably
19:18
has a better idea of how they're going
19:20
to feel about a given situation,
19:22
but you still don't know until you're in
19:25
it. And how do you
19:27
program an AI to take into account
19:29
those preference changes or personal
19:31
growth, right? That's the issue. Well,
19:35
there's two issues, right? So
19:37
it isn't necessarily preference changes
19:39
in the sense that my
19:42
preferences are sort of in me.
19:45
They're there, but
19:47
I don't know what they are, right?
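A small illustrative sketch, with made-up numbers, of what "fixed but unknown" preferences can mean computationally: the machine treats "likes durian" as a latent fact about you and only ever holds a probability over it, updated from observed behavior.

```python
# A toy sketch (not from the book): preferences as fixed-but-unknown quantities
# that a machine infers from behavior. "likes_durian" is a latent fact about you;
# the machine only ever holds a probability over it and updates on what you do.

def update_belief(p_likes, observation):
    # Likelihoods are made-up numbers: someone who likes durian is assumed to
    # accept it 80% of the time; someone who dislikes it accepts 10% of the time.
    likelihood = {"accepts": (0.8, 0.1), "declines": (0.2, 0.9)}
    l_like, l_dislike = likelihood[observation]
    numer = l_like * p_likes
    return numer / (numer + l_dislike * (1.0 - p_likes))

belief = 0.5                                  # no idea to start with
for obs in ["declines", "declines", "accepts"]:
    belief = update_belief(belief, obs)
    print(f"after '{obs}': P(likes durian) = {belief:.2f}")
# Ambivalent behavior keeps the belief away from 0 or 1, so a machine acting on it
# would neither force durian on you nor hide it from you.
```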
19:49
So whether or not I like durian, it's
19:52
not a decision I make, right? I
19:54
taste the durian and I find out
19:57
what my preferences are, but they were
19:59
there, there in me as a latent part
20:02
of my neurological
20:04
structure, I guess, or something about my
20:06
DNA as to whether or
20:08
not I like the durian taste. And
20:11
so that part where
20:13
your preference for durian is something
20:15
that's fixed but
20:17
unknown, that's relatively easy
20:19
for us to deal with. We're
20:23
already working under the assumption
20:25
that the machine is learning about your
20:27
preferences from
20:29
choices that you make and if
20:33
you don't know whether or not you like
20:35
durian, then you're not either going to run
20:37
away from it or drool at
20:40
the prospect of eating some durian. You're
20:42
going to exhibit sort of
20:44
indecisive behavior, not sure if I
20:47
really want to try this and kind
20:50
of like if you read Dr. Seuss's Green Eggs
20:52
and Ham. I
20:54
don't want to try it. No, I don't want to try it. I definitely
20:56
don't want to try it. So
20:59
that kind of behavior clearly shows
21:01
that you actually are not really
21:03
sure whether you like the durian or the green
21:06
eggs and ham. And
21:09
so that's fine. And the machine wouldn't
21:11
force you to eat durian because it's
21:13
convinced that you like it and
21:16
it wouldn't deprive you of it because it's convinced that
21:18
you hate it. It would maybe
21:20
suggest that you try a little bit at some point
21:22
whenever you're ready. And
21:25
that's what you'd want. The difficult part actually
21:28
is the plasticity of preferences that
21:31
are obviously we're not born with
21:33
a whole complicated set of
21:37
preferences about politics, about religion,
21:39
about how much we value
21:43
wealth generation versus
21:45
family time versus this versus that.
21:49
We're not born knowing what it's like to have children. Many
21:52
people think they really want to have children
21:54
and change their minds and
21:56
so on. So we're acquiring and
22:00
solidifying preferences all the time
22:02
through experiences that
22:04
may not be the experience that
22:06
the preference is directly about.
22:09
For example, I think a lot of
22:11
our culture convinces
22:14
us that having
22:17
children is a desirable thing,
22:19
that it's a wonderful experience. I
22:21
think that contributes to the
22:25
formation of our preferences. The
22:28
question is how do you avoid the AI system
22:32
manipulating human preferences so that they're
22:34
easier to satisfy? The
22:38
loophole theory that you talked about, like if this
22:41
thing is smart enough, it's going to find a way to
22:44
shortcut to get the goal that you gave it. The
22:49
problem has to be formulated very
22:52
carefully. You
22:54
might say, okay, the goal
22:57
is not just in English,
22:59
if we were talking to each other, we would say,
23:01
okay, we want the underlying
23:04
constitutional objective of the machine
23:06
to be satisfying
23:09
human preferences, to be beneficial to us.
23:12
When you get into setting
23:15
up the mathematical problem, if
23:17
you're operating under
23:20
the assumption that human preferences can be changed,
23:22
then you need to be more precise. Do
23:24
you mean the preferences the human had at
23:27
the beginning? The human, we know what they
23:29
had at the end. The
23:32
preferences that they would have if you
23:34
weren't interfering, it becomes
23:37
a little bit more complicated. The
23:39
simplest answer would be the preferences
23:41
that they had at the beginning. That's
23:46
a little bit problematic because if, let's
23:49
say you have a domestic robot that's with
23:51
you for most of your life, well,
23:54
obviously, by the time you're 50, you
23:56
don't want it to be satisfying the preferences you had when
23:58
you were five. So,
24:02
but at the same time, you don't
24:04
want it to be molding your preferences
24:06
actively. It cannot really have
24:09
no effect on your preferences because,
24:11
you know, just having a domestic
24:14
robot serving you is going to change
24:16
the kind of person you are. Probably
24:20
you're going to be a little bit more spoiled than
24:23
you would be otherwise. And
24:27
so, I don't think you can argue that the
24:30
machine cannot touch human preferences
24:32
or have any effect on them because I think
24:34
that's just infeasible. So, I would say this is
24:36
one of the areas where we
24:38
need a lot more philosophical
24:42
help actually to
24:45
get these kinds of refinements
24:47
done correctly. And speaking
24:49
of philosophy, we didn't actually define intelligence
24:52
to start this conversation. Obviously,
24:55
we already have machines that are hyper competent and
24:57
more competent than humans in a lot of different
24:59
fields, but like what is the definition of
25:02
intelligence and what is
25:05
this? If everyone
25:07
succeeds in what they're doing right now, what will AI
25:10
look like? Do these AIs have to have
25:12
their own intrinsic goals to be intelligent as
25:15
opposed to just ones we gave them? Do they have to have wants
25:17
like humans do? So,
25:20
no, they certainly don't have
25:22
to have any
25:25
of their own internally generated
25:27
desires. So
25:32
the standard model is where we build machinery
25:34
that optimizes objectives that we put
25:36
in. And
25:39
that can be done in many different ways. There
25:41
are many different kinds of AI frameworks
25:44
and algorithms. So
25:47
for example, reinforcement learning is one where
25:49
you don't put in, in some sense,
25:51
you don't put in the entire objective upfront.
25:54
You kind of feed it to the
25:56
learning algorithm in dribs and drabs depending
25:59
on its behavior. You
26:01
give positive or negative rewards, a negative
26:03
reward being a punishment in some sense.
26:07
And so its goal is
26:10
to maximize the stream of
26:12
positive rewards that it receives.
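A bare-bones sketch of the setup being described, with invented names and numbers: the agent never sees an objective written down anywhere; it only receives a stream of rewards from outside and shifts its behavior toward whatever earns more of them.

```python
# A generic sketch (invented, not from the episode): the agent never sees an
# "objective"; it only receives rewards from outside and adjusts its behavior
# to make that stream of rewards as large as possible.

import random

random.seed(0)
actions = ["left", "right"]
estimates = {a: 0.0 for a in actions}          # the agent's running value estimates

def reward_from_environment(action):
    # The designer's objective lives out here, implicit in how rewards are handed out.
    return 1.0 if action == "right" else -1.0

for step in range(200):
    # Explore occasionally, otherwise pick the action with the best estimate so far.
    action = random.choice(actions) if random.random() < 0.1 else max(estimates, key=estimates.get)
    reward = reward_from_environment(action)   # drip-fed feedback, positive or negative
    estimates[action] += 0.1 * (reward - estimates[action])

print(estimates)   # the agent ends up preferring whatever the reward supplier rewards
```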
26:16
And so the
26:18
precise objective is implicit
26:21
in the part that is supplying the
26:24
rewards. That's
26:26
what we would call the objective. So
26:30
it doesn't make sense for them to derive
26:33
their own separate goals
26:36
and objectives because for one
26:38
thing adding
26:41
its own goals and objectives would mean that
26:43
it wouldn't be achieving the ones
26:45
that we set for it. And
26:48
also we don't really have any good
26:50
idea for how to generate goals
26:53
out of nothing. Yeah, yeah,
26:55
when you start to think about that it kind
26:57
of blows your mind. Like what is a goal,
26:59
right? Right. I
27:02
mean we
27:04
have a very complicated system. There's
27:07
a biological system based
27:09
around our dopamine system which
27:13
evolution built into us to
27:15
give us a kind of a guidepost for how
27:19
not to die immediately. So the
27:21
dopamine system is
27:24
positively stimulated by nice
27:26
sweet calorie rich foods
27:30
and sex and other things like that.
27:32
So basically this is evolution saying look
27:34
if you eat a lot of
27:37
edible food and have lots of sex
27:39
then you'll probably end up having a
27:42
high degree of evolutionary fitness.
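Since the dopamine system is being described as a built-in proxy for evolutionary fitness, here is a tiny made-up illustration of how a proxy signal and the objective it stands in for can come apart, which is the same gap that bites hand-written objectives in machines.

```python
# An invented sketch (not from the episode): a proxy reward that usually tracks the
# real objective, but can be gamed. The numbers are arbitrary; the point is only
# that optimizing the proxy and optimizing the true objective can come apart.

options = {
    # option: (dopamine_proxy_reward, actual_evolutionary_fitness)
    "eat_rich_food":      (5.0,  3.0),
    "find_a_mate":        (6.0,  8.0),
    "take_dopamine_drug": (9.0, -5.0),   # hijacks the signal, ruins the real objective
}

best_by_proxy = max(options, key=lambda o: options[o][0])
best_by_fitness = max(options, key=lambda o: options[o][1])
print("proxy-maximizer chooses:  ", best_by_proxy)     # take_dopamine_drug
print("fitness-maximizer chooses:", best_by_fitness)   # find_a_mate
```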
27:47
But it doesn't work perfectly, right? So
27:50
you can also take a whole bunch of
27:52
drugs to stimulate your dopamine system and
27:55
then you don't reproduce and you die fairly quickly.
28:00
And so the dopamine system is not a
28:02
perfect signpost to how to behave
28:04
in order to have evolutionary success,
28:06
but it's so much
28:08
better than nothing that many,
28:13
many successful species have
28:16
dopamine systems or
28:18
something equivalent. So
28:20
that, and that dopamine system is what
28:22
allows you to learn during your lifetime.
28:24
It gives you a signal saying, yeah,
28:27
this is probably good, this is probably
28:29
good. So become better at finding this
28:31
kind of sweet food or finding mates
28:33
or whatever it might be. And
28:36
learning during your lifetime turns
28:39
out to actually
28:41
accelerate evolution. So it's sort
28:43
of a doubly beneficial process
28:46
from evolution's point of view. So
28:49
that's one part of our own internal motivation
28:53
system or preference structure, if you like. And
28:55
then another part and
28:58
possibly much more important is
29:00
what we soak in from our
29:03
culture, from family, friends,
29:05
peers, and these days
29:08
from media. And
29:12
there, you know, that I
29:14
think departs often very
29:17
strongly from the basic
29:19
biological urges that
29:21
the dopamine system provides. So
29:24
by setting up, for example, in some
29:27
cultures, let's say, in
29:29
Tibetan Buddhism, the goal to
29:32
be a monk is set up as
29:35
one of the most desirable objectives. And
29:37
that was also true in medieval
29:39
Europe, you know, with
29:41
the Catholic monasteries, they were wealthy, they
29:44
were relatively safe compared to ordinary life,
29:49
privileged, powerful. So that
29:51
was a very desirable cultural goal that
29:53
was built in to individuals
29:56
through the culture. But
29:59
it's a... non-reproducing role. So
30:03
clearly it's not something
30:05
that evolution would
30:08
advocate, at least for individuals.
30:11
Maybe there's some wise evolutionary
30:13
plan to
30:15
have a large number of people
30:18
being in monasteries to keep the species safe
30:20
and on the right track. But I doubt it. I
30:23
think it's just this is what happens
30:25
with cultural processes as opposed
30:27
to biological processes. So
30:30
these days we
30:32
have all kinds of different role models, all
30:35
kinds of different pressures
30:39
to consume, whether it's
30:41
food or clothes
30:43
or fashion, media content, sport,
30:45
etc., etc., etc. It's a
30:47
very, very complicated landscape
30:50
and that
30:52
interacts with our
30:55
emerging, maturing consciousness
30:58
and internal
31:00
mental processes in ways that are
31:03
wonderfully varied and
31:06
produce individuals with all
31:09
kinds of vocations and
31:12
desires for their own future and the future of
31:14
other people. So
31:17
all of that is going
31:20
on in humans. And basically, to
31:22
sum it all up, you're
31:24
intelligent to the extent that your
31:27
actions can be expected
31:29
to achieve your objectives. And
31:34
this is a notion that goes
31:36
back in economics and philosophy for hundreds
31:38
or thousands of years of
31:41
rational behavior. And
31:46
it's often caricatured as sort
31:48
of homo economicus, just
31:50
greed, acquisition
31:53
of wealth is the only objective. Of course, that's not
31:55
what it means. Your objectives can
31:57
be anything at all. You
32:02
can be Mother Teresa and have the
32:04
objective of saving the lives
32:06
of destitute children. And
32:09
that's completely fine. You don't have to
32:11
be selfish, you don't have to be greedy, you don't
32:13
have to care about money. It can
32:15
be anything at all. So rational behavior
32:18
is the ideal
32:21
for what we mean by human intelligence,
32:23
and then we basically just copied that
32:25
into machines. And
32:28
that became the basis for AI
32:30
back in the forties and fifties when
32:32
the whole field was getting going. And
32:37
I think this was a mistake. With
32:41
having it just be modeled after human
32:43
goals in and of itself as a
32:45
mistake? Or
32:47
having the idea
32:49
be: the machines are intelligent to the
32:52
extent that their actions can be
32:54
expected to achieve their objectives?
32:56
By copying this notion, saying, well, that's what
32:58
it means for humans to be intelligent, then that's
33:00
what it means for a machine to be intelligent. And
33:04
then of course the machine doesn't have its own
33:06
objectives. It doesn't have all the biology and the culture. So
33:10
we just put those in. And for
33:12
the simple kind of toy
33:15
worlds, like the chessboard, or
33:17
rather the virtual chessboard, it
33:21
seems quite natural that you'll just have the
33:23
goal of winning the game.
33:27
Or if you want to,
33:30
you know, find
33:32
routes on a map, the goal is just,
33:34
okay, you want to get to the destination as
33:36
quickly as possible. And so it seemed like,
33:39
in the toy examples that
33:41
people were beginning to work on,
33:43
specifying the
33:45
objective wasn't a problem. And
33:48
in fact, in many cases what they
33:50
were working on in AI was
33:54
artificial problems that had already been
33:56
set up with well-defined objectives.
33:58
Chess is one of those: the goal
34:00
of checkmate is just part of
34:02
the definition of chess. So it kind
34:04
of comes with a perfectly defined
34:06
objective. Which
34:09
is not like the real world. Right. Exactly.
34:12
So that's the problem. And
34:14
funnily enough, in the early
34:18
part of the history of AI,
34:21
we also made an assumption that the
34:24
rules were known and
34:27
that the state of the world was known. And
34:29
that again is true in chess. We know the
34:31
rules of chess. We can see
34:33
where the pieces are on the board. And
34:36
so uncertainty simply doesn't come into it.
34:40
And so for most of the first 30
34:43
years or so of AI research,
34:47
it was assumed that you would know the rule and
34:49
you would know the state of the world. And
34:54
sometime around 1980,
34:57
the main
35:00
leading researchers in the field sort of
35:02
fessed up and they said, okay, fine.
35:05
You're right. We admit that in
35:08
fact, we won't always have perfect knowledge of
35:10
how the world works. And we won't always have
35:12
perfect knowledge of the state of the world. I
35:15
mean, this is sort of blatantly obvious to everybody now.
35:17
It was surprisingly
35:20
difficult for people to admit it because
35:23
it meant that the technology they
35:25
had developed, which was mainly this sort
35:27
of symbolic logic technology, was
35:29
limited in its application that you couldn't
35:31
solve a lot of real world problems
35:34
using symbolic logic because you didn't have
35:36
definite knowledge of the state
35:38
or of the rules, the dynamics, the
35:40
physics of the world. So
35:42
we accepted uncertainty wholeheartedly
35:46
by the end of the 1980s
35:48
and the beginning of the 1990s. But
35:51
we continued to assume,
35:53
completely incorrectly, that the objective was known,
35:55
that we had perfect knowledge of
35:58
the objective and the machine would be able to... have
36:00
that perfect knowledge. And I
36:02
can't really explain why it's taken
36:04
another 25 or 30 years to
36:12
see. And I'm one of them.
36:14
It took me a while to see that, in fact,
36:17
in the real world, you'd almost never have perfect
36:19
knowledge of the objective that
36:21
the machine was supposed to be pursuing.
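One conventional way to write down the contrast being described; the notation is an illustration, not the book's. In the standard model the machine is handed a fixed reward function; in the uncertain-objective model the true reward depends on human preferences that the machine never observes directly, so it can only maintain a belief over them, updated by watching human behavior.

```latex
% Notation is illustrative, not the book's.
% Standard model: the reward function R is fixed and given to the machine.
\pi^{\star}_{\text{standard}}
  = \arg\max_{\pi} \; \mathbb{E}\!\left[ \sum_t R(s_t, a_t) \right]

% Uncertain-objective model: the true reward R_\theta depends on human preferences
% \theta, which the machine never observes; it only holds a belief P(\theta \mid H)
% updated from observed human behavior H.
\pi^{\star}_{\text{uncertain}}
  = \arg\max_{\pi} \; \mathbb{E}_{\theta \sim P(\theta \mid H)}\!\left[ \sum_t R_{\theta}(s_t, a_t) \right]
```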
36:25
It's surprising you talk about how people
36:27
who are raising the alarm about possible
36:29
negative outcomes of AI are seen as
36:32
anti-AI or Luddites, when in fact you're
36:34
just saying, no, we just have to
36:36
take into account these possible problems. And
36:39
that people who are developing the technology are some of
36:41
the ones who are saying, don't worry, we'll never even
36:43
get there. So there's no need for concern. Well, then
36:45
why are you working on it if you think you
36:47
won't actually achieve it? Yeah, I mean,
36:49
it's bizarre. And I think we
36:51
just have to assume that it's
36:54
a kind of defensive denialism.
36:58
It would be uncomfortable and awkward
37:00
to admit that what you're working
37:02
on might be
37:05
sort of on the wrong path and also a
37:07
threat to the human race. So would
37:09
one of the biggest events that would happen in the
37:11
course of human civilization be
37:14
inventing superhuman AI? Would that be up there with
37:16
an asteroid wiping out civilization or things like
37:19
that? Yeah, I think so. And this was
37:21
actually at the beginning of the book,
37:24
I'm recounting a talk that I
37:26
gave at an art
37:28
museum in London to a completely non-technical audience. And
37:30
it was the first time that I
37:32
was sort of publicly declaring
37:35
this position. So
37:38
the phrase, the biggest
37:40
event in human history comes from that
37:42
lecture. And
37:45
it was basically, I formulated
37:48
it as kind of like the Oscars. Here
37:50
are the five candidates for biggest
37:52
event in human history, you know,
37:54
an asteroid wipes us out, or we all die
37:57
in a climate disaster; you
38:00
know, we develop faster-than-light
38:02
travel and conquer the universe, we
38:04
solve the problem of aging and
38:07
we all become immortal.
38:11
Or a superior
38:14
alien civilization lands on the earth.
38:16
And then the last one was
38:19
that we developed super intelligent AI. And
38:22
so, you
38:25
know, I chose that one as
38:28
the winner, the biggest event, because basically
38:32
our whole civilization is just
38:34
built on our intelligence. And if
38:37
we have a lot more of it,
38:40
that would be an entirely
38:43
new civilization, and
38:46
possibly a much better one if we can
38:48
actually keep it. If
38:51
we can control the
38:54
potentially much more powerful entities
38:56
that we're creating, then
39:00
we
39:03
can direct that power to
39:05
the benefit of everybody. So it could be
39:07
a golden age. It
39:09
could in fact give us the immortality and the
39:11
faster-than-light travel if those things are possible,
39:13
then they're going to be much more possible if
39:17
we have access to such tools. And
39:20
it's a little bit like the arrival of
39:24
superior alien civilization, except
39:27
that it's not
39:29
a black box. At least it's not a black
39:32
box if we do it the right way. You
39:35
know, if it was really a black box, if an
39:38
alien entity landed on earth that was much
39:40
more intelligent than humans, you know,
39:42
how would you control it? You couldn't. Yeah, right,
39:45
you lose, you're toast. So forget it.
39:48
The only route to
39:52
getting this right is to design
39:54
the AI system in such a way that
39:57
we can provably control
40:00
it. It is not good enough
40:02
to say, well, I think we've done a good
40:04
job, and, you know, we've given all
40:06
the programmers some pretty
40:08
good guidelines. You
40:10
know, and we have panels, you know,
40:13
of experts just in case something goes wrong.
40:15
That is not gonna cut it.
40:17
Look at what happened with nuclear
40:20
power, right? The risks of
40:22
nuclear power were pretty apparent, because people had seen
40:24
what a nuclear explosion looked like and what
40:26
it could do. And
40:28
there was a lot of
40:30
regulation.
40:33
Some people estimated that for
40:35
every pound of nuclear power station
40:37
there are seven pounds of paper.
40:41
It's hard to imagine, but that's what
40:43
I've been told by nuclear engineers. So
40:47
the amount of
40:49
regulation around the construction and
40:51
testing and checking of nuclear power
40:54
stations was immense, much bigger,
40:56
I think, than anything ever before
40:58
in the history of mankind. But
41:00
that wasn't enough, right? We still
41:03
had Chernobyl and
41:05
Fukushima. And that wiped out
41:07
the nuclear industry, as well as a
41:10
fair number of people and a large
41:12
chunk of land. And
41:15
so we didn't get any of
41:17
the benefits of nuclear power,
41:20
because we stopped building nuclear power stations,
41:22
and a lot of countries have actually
41:24
decided to phase it out altogether. So
41:27
Germany, for example, is in the process
41:29
of getting rid of all its nuclear
41:31
power stations. All the potential benefits of
41:33
carbon-free energy and cheap electricity and
41:35
so on, we lost,
41:38
because we didn't pay attention to
41:40
the risks. And nobody would say, you
41:42
know, that a nuclear engineer who's proposing
41:46
an improved design of nuclear power
41:49
station that's less likely to suffer a
41:51
meltdown... no one would call him a
41:53
Luddite, right?
41:55
So why, why
41:58
is it... So, it's
42:00
the Information Technology Innovation Foundation
42:03
that awards the Luddite award.
42:07
And they've awarded that prize to people
42:09
who are pointing to potential risks from AI.
42:14
And this seems weird, right? And
42:19
at the same time, I guess
42:21
they're applauding people who
42:23
say, you know, people within the field
42:25
of AI who are now saying
42:27
for the first time ever, oh, by
42:30
the way, you know, the reason we don't have
42:32
to worry is because in fact, we're guaranteed to
42:34
fail. Now, if you
42:36
ask me, that's anti AI. To
42:40
say that this
42:43
problem is beyond the capabilities of
42:46
the assembled AI researchers
42:49
of the world, you know, who
42:51
are growing rapidly, and, you know,
42:54
now have access to hundreds of billions of dollars
42:56
in funding, to say
42:59
that that all of those incredibly smart
43:01
people were too stupid to
43:04
solve the remaining problems between here
43:06
and human level AI. First
43:11
of all, I think it's completely
43:14
groundless. Right? There
43:16
is no argument being made
43:18
as to why the problem
43:20
can't be solved other than, well, if
43:23
it isn't, if it isn't solved,
43:25
then we don't have to worry. So it
43:28
basically means it's a way of washing
43:31
your hands of the problem. Yeah.
43:33
Other than that, there's no justification being
43:35
given whatsoever. The other thing
43:37
is that, you know, history tells
43:39
us that that's a pretty foolish
43:43
attitude. And
43:47
in fact, coming back to nuclear power again, right, that
43:49
was the position of many
43:52
leading nuclear physicists in the early part of
43:54
the 20th century that, yes,
43:56
there is a massive amount of energy locked in
43:58
the atom. And
46:01
he says, you know, don't worry.
46:04
I know we're heading for a cliff, but I guarantee
46:06
we're gonna run out of gas before we get there.
46:09
Right? It's like, well, come on, guys.
46:12
That's not how you manage the fate of
46:14
the human race when the stakes are so
46:16
high. Yeah. So overall, are
46:19
you optimistic that if people
46:21
heed this warning now that we could put
46:23
in place these rules for
46:25
what the future of AI would look like, and we
46:27
could be in this golden era version
46:29
of the future and not one
46:31
of these various dystopias brought about by the
46:34
King Midas problem and things like that? So
46:38
I'm reasonably optimistic. There's certainly a lot of
46:41
work to do because we've
46:43
got 70 years of technological development
46:46
under the old model. And
46:48
it's not easy to replace that
46:50
overnight with
46:52
technology that operates under the new model. We're
46:56
just at the early stages of developing the
46:58
algorithms and the various subcases
47:00
and how you solve them for
47:03
that. So there's still a lot of work to do. But
47:05
even before then, I think just
47:08
the advice to think
47:11
not what is the objective
47:13
that I want the system to optimize, but
47:15
what are the potential effects of the system?
47:18
Do I know whether those effects are desirable
47:20
or undesirable? And if I
47:23
don't know, then I design
47:25
the system not to have those effects,
47:29
not to change the world in
47:31
ways that the system and I
47:33
don't know whether that's a good idea or not.
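As a loose sketch of how that guideline might look in code, with an arbitrary penalty weight and invented attributes: score each candidate action by its known benefits minus a charge for any change it makes to parts of the world whose value is unknown.

```python
# A loose, invented sketch of the guideline above: known benefits count for an action,
# and any change to an attribute of unknown value counts against it. The weight
# `caution` is arbitrary; nothing here comes from the book's actual algorithms.

def score(effects, known_values, caution=10.0):
    known = sum(known_values[a] * d for a, d in effects.items() if a in known_values)
    unknown_change = sum(abs(d) for a, d in effects.items() if a not in known_values)
    return known - caution * unknown_change

def choose(actions, known_values):
    return max(actions, key=lambda name: score(actions[name], known_values))

if __name__ == "__main__":
    actions = {
        "optimize_engagement_only": {"revenue": 8.0, "user_opinions": 4.0},
        "serve_what_users_ask_for": {"revenue": 5.0},
    }
    known_values = {"revenue": 1.0}        # we know we value revenue...
    print(choose(actions, known_values))   # ...but not what shifting opinions is worth
```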
47:37
That's a better approach. So
47:39
that's sort of like a
47:42
best practice guideline for the time being. But
47:44
yeah, in the long run, the
47:46
goal would be to have
47:49
technological templates designed for software
47:51
that are provably safe
47:53
and beneficial. And
47:55
then, there are two other basic
47:59
problems that are
48:01
much less technological, but I still worry about them. And
48:03
at the end of the book, I
48:07
discuss these. And I would
48:09
say I'm a
48:12
little bit less optimistic about
48:14
these, because I don't see
48:17
technological solutions for them, because
48:19
they're not really technological problems.
48:21
One is something
48:23
that probably is apparent to many people,
48:26
is if we develop this
48:28
incredibly powerful technology, what
48:30
about people who want to use it
48:33
for evil purposes? They're
48:36
not going to use the safe and
48:38
beneficial version, which would actually prevent
48:41
them from doing bad things to people. Because
48:44
it will be designed to have the
48:47
preferences of everyone in mind. So
48:49
if you try to destroy
48:52
the world, or take over the world, or do whatever it is you
48:54
want to do, it would have
48:56
to resist. But what's
48:58
to stop them developing the unsafe version,
49:02
perhaps under the old model, and putting in the
49:04
objective of, I'm the ruler of the universe. And
49:07
the system finds some way
49:10
of satisfying that, that
49:12
maybe is not even what the bad
49:14
guy intended. So it's not that he
49:16
might succeed, it's that the bad person
49:19
might fail by
49:22
losing control over the AI
49:24
system that is unleashed. And
49:27
so that's one set of worries. And if you
49:29
think about how well we're doing
49:31
with cybercrime right now, not well,
49:34
then this
49:36
would be much, much more of
49:39
a risk and a threat. And so
49:42
we're going to need to develop not
49:45
just policing, but also, I think
49:47
we've got to somehow build this
49:49
into the moral fabric of
49:52
our whole society, that this is
49:54
a suicidal direction
49:57
to take. And there are interesting
49:59
precedents. in science
50:02
fiction. For example,
50:04
in Dune, which is Frank
50:06
Herbert's novel set in the
50:08
far future, where humanity
50:10
has gone through a near-death
50:12
experience in the form of a
50:15
catastrophic conflict between humanity and
50:17
machines, which, as
50:19
we're told, we
50:22
only just survived to
50:24
tell the tale. And so
50:26
as a result, there's basically an 11th
50:28
commandment to not make a machine in
50:30
the likeness of man. So
50:32
there are no computers in
50:35
that future. So
50:39
that gives you a sense that
50:41
this is not something you want to mess around
50:43
with, that you would need pretty
50:45
rigorous regulations and
50:48
enforcement, but also a kind
50:51
of a moral code and understanding that
50:53
everyone understands. Just as I
50:55
think creating a
50:57
pandemic organism,
51:00
some engineered virus that would destroy the human race,
51:03
I think everyone
51:05
understands that's a bad idea. Yeah.
51:07
You have to hope your evil supervillains at
51:09
least have some self-preservation instincts on top of
51:11
their evil. Even they are not proposing that. I
51:13
think, well, maybe there are some groups
51:18
who really think that we should cleanse
51:20
the earth of human beings altogether, but
51:23
fortunately, they're not too bright. The
51:27
second issue is sort
51:30
of the other half, or the other 99.99% of the human
51:33
race, not the bad actors, but
51:35
all of the rest of us who are
51:41
lazy and short-sighted, even the
51:43
best of us are lazy and short-sighted. And
51:49
by creating machines that
51:51
have the capacity to run our
51:54
civilization for us, we
51:56
create a disincentive
51:58
to run it ourselves. And
52:03
when you think about it, right, we've spent over
52:06
the whole human history, it
52:08
adds up to about a trillion person years
52:11
of teaching and learning just
52:14
to keep our civilization moving forward, right, to pass
52:16
it to the next generation so that it doesn't
52:18
collapse. And
52:23
now, or at least at some point in
52:25
the visible future, we may not have to
52:27
do that because
52:30
we can pass the knowledge into machines instead
52:32
of into the next generation of humans. And
52:35
once that happens, right, it's in
52:38
some way sort of irreversible. Like
52:43
once there are no humans left who even knew
52:45
how these machines were designed, who is going to
52:47
have an incentive to figure it out in a
52:49
retro- Right. And it's just
52:51
very, very complicated to sort of pull
52:53
yourself up by the knowledge bootstraps. You
52:58
know, perhaps the machines could sort of
53:00
reteach us if we
53:02
decide that this is in fact, you know, we made a
53:04
huge mistake. But if you
53:06
look at, so if you see Wall-E, in
53:10
Wall-E, right, the humans
53:13
have been taken off the earth on
53:15
sort of giant intergalactic cruise ships, and
53:18
they just become passengers. They no longer know
53:21
how it works. They become
53:23
obese and stupid and lazy, totally
53:26
unable to look after themselves. And
53:29
this is another, you know, another story that
53:31
goes back thousands of years
53:33
to, you know, the Lotus Eaters and
53:37
other mythological temptations
53:40
that when life makes it
53:42
possible to do
53:46
nothing, to not learn,
53:49
to not face up to challenges, to
53:51
not solve problems, we have
53:53
a tendency to take advantage of
53:56
that in ways, you know, that are not healthy for us. Well,
54:01
one thing, if your listeners haven't read the
54:03
story, The Machine Stops
54:06
by E.M. Forster, I
54:10
highly recommend that story. E.M. Forster
54:12
mostly wrote, you know,
54:14
acute social observation novels of
54:18
early Edwardian England or, you know... But
54:21
yeah, those are the Merchant Ivory movies. But
54:23
this is a story that is
54:27
really a science fiction story. You know, in
54:29
1909, he basically described
54:31
the internet, iPad,
54:34
video conferencing, MOOCs.
54:36
So most people are
54:38
spending their time either, you know, consuming
54:40
or producing MOOC content. And
54:46
The Machine looks after everything.
54:48
It makes sure you get fed,
54:51
it pipes in music, keeps
54:53
you comfortable. So
54:55
The Machine is looking after everyone and we
54:57
pursue these increasingly
55:00
effete activities
55:04
and have less and less understanding of how
55:06
everything really works. And
55:09
so that was a warning sign from 110 years ago of one
55:11
direction things could go. It
55:19
seems like a slippery slope that's
55:21
pretty hard to avoid.
55:24
Yeah. And, you
55:26
know, some people have argued that it's already happening.
55:28
I think people have been arguing this for
55:31
a long time. I
55:33
mean, it makes sense when you have everything offloaded on your
55:35
phone. Why would you waste your
55:38
own brain cycles doing things you don't
55:40
have to? Yeah. Yeah. Yeah. So I
55:42
think my, you know,
55:44
my ability to navigate, even
55:46
in the Bay Area where I live
55:48
has probably decreased because it's much
55:51
easier just to have the phone navigate for me.
55:55
And so you don't exercise that part of your
55:57
brain, you don't refresh
56:00
those memories of how all the streets connect to
56:02
each other and where they are. I
56:06
think there are trade-offs.
56:09
You offload some parts, but because
56:11
you have access to much more
56:13
knowledge through the internet, rather
56:16
than just saying, it's too hard to go,
56:19
you know, trundle down to the library,
56:21
wait for the library to open, find the book
56:23
if they happen to have it, open the book,
56:25
read the page. It used to take a
56:27
whole day to find out a fact, and
56:30
now it takes a second or less to
56:32
find out that fact. So we actually find
56:35
out more stuff than we used to as
56:37
a result. So there are pluses
56:39
and minuses to the way things work right
56:41
now, but we're talking about something much more
56:43
general, a general potentially
56:47
debilitating enfeeblement of human
56:49
civilization. And the
56:51
solution to that, again, it's not a technical solution, right?
56:53
This is a cultural problem. It's
56:57
the economic incentive to
57:00
learn, and let's face it, that's
57:02
one of the primary drivers
57:06
of our education system. You know, the system
57:08
of training and industry
57:11
is economic. Basically, our
57:13
civilization would collapse without it. And
57:17
when that goes away, you know, what replaces
57:20
it? How do we ensure
57:22
that we don't slide
57:25
into dependency? And
57:27
it seems to me it has to be a cultural imperative
57:30
that this part of what it means
57:32
to be a
57:34
good self-actualized
57:36
human being is not just
57:39
that we get
57:41
to enjoy life and have
57:43
aromatherapy massages and all
57:45
that kind of stuff. But that we know
57:48
things, that we are able to do things,
57:50
that if we want to build
57:52
a deck, we can build a deck. If
57:54
we want to design new
57:57
kinds of radio telescopes, we can design new kinds
57:59
of radio telescopes. I
1:00:01
hate to end on a pessimistic note, but
1:00:03
again, it's not to say this couldn't all
1:00:05
end very well. It
1:00:07
certainly can if everybody starts thinking about
1:00:09
these problems now as opposed to when it's too late.
1:00:13
Exactly. And I can't emphasize enough,
1:00:15
the book has so much more than what
1:00:17
we've already delved into and it's a great
1:00:20
read. Everyone should check out Human Compatible, Artificial
1:00:22
Intelligence and the Problem of Control. Stuart
1:00:24
Russell, thank you so much for joining me. Thank you. It
1:00:27
was a pleasure.