Episode Transcript
0:01
NPR. A
0:12
few years ago, Matti Wiegmann got a new
0:14
laptop. Matti is working on
0:16
his PhD at Bauhaus University Weimar in
0:18
Germany, and he wanted a new messenger
0:20
bag so he could walk around with
0:22
it. Matti searched for bags
0:25
on Google, but he wasn't happy with
0:27
the results. All of the pages
0:29
I stumbled on, they listed lots of
0:31
bag packs and they were describing them,
0:35
but I felt very dissatisfied
0:37
and disinformed after visiting them.
0:39
These were the kinds of spammy websites that
0:41
had a ton of ads and links, but
0:43
not really any insights. A somewhat
0:46
new and frustrating experience because
0:48
we were used to just going on Google,
0:50
clicking the first link, and it would work.
0:52
And that got Matti asking, is
0:54
Google getting worse? Ooh,
0:57
provocative question. Of
0:59
course, there's certainly a lot of
1:01
clutter now, a ton of sponsored
1:03
links, that new AI-generated box for
1:06
some searches, which sometimes works, sometimes
1:08
doesn't. Sometimes helpful, sometimes
1:10
wrong, always confident. That's
1:13
how I approach life, Darian. I
1:17
often find myself just scrolling past that stuff to
1:19
get to what I consider the real search results
1:21
at the bottom. And we
1:23
too want to know about the real stuff,
1:25
the Google search results. This
1:28
is The Indicator from Planet Money. I'm Darian Woods.
1:31
And I'm Wailin Wong. Today on the
1:34
show, Testing Google Search. There are many
1:36
anecdotal complaints about Google not being what
1:38
it used to be. Today, we test
1:40
that claim and we bring it to
1:42
Google itself. One
1:47
way of figuring out the usefulness of
1:49
Google over time is to look at
1:51
product reviews. When you search for
1:53
reviews for a new messenger bag, you could
1:55
categorize the types of websites you might get.
1:57
And on one side, there are websites that
1:59
are... to careful reviews.
2:02
Maybe they even have a bag specialist going
2:04
through the pros and cons. On
2:06
the other side is a category that
2:09
has less useful types of websites. A
2:11
content farm you could call it. Not very
2:13
useful content and also not very trustworthy if
2:16
you looked at this into detail. You know
2:18
the types, Wailin. Oh yes, I
2:20
do. And on that content farm
2:22
we had some spam. E-I-E-I-O. S-E-O.
2:27
Our messenger bag shopper Matti Wiegmann
2:29
is a PhD candidate in computer
2:32
science and for his research he
2:34
decided to investigate. Matti
2:36
and his colleagues collected the top 20 search results
2:39
for more than 7,000 product
2:41
review searches and tracked these every two weeks
2:43
for over a year between 2022 and 2023.
2:45
They did this for Google but also
2:50
for other search engines like Bing
2:52
and DuckDuckGo. So we ended up with
2:55
hundreds of thousands of search results
2:57
and then measured properties
2:59
of these pages. How many words are on
3:02
the page? How many images are on this
3:04
page? How many links? How long are the
3:06
links? And so on. How many headlines are
3:08
there? Now, in Google's favour, Matti found that
3:11
out of the commercial search engines they looked
3:13
at, Google performed the best at filtering out
3:15
spammy websites without much content. That
3:17
said, Matti found that certain legacy
3:20
magazine websites have sections that churn
3:22
out a lot of product review
3:24
pages every day just to get
3:27
advertising and link commission money and
3:30
these websites tend to get ranked highly on
3:32
Google. They put out so much content over
3:35
such a broad area and
3:37
all of them is what
3:39
we would consider reasonably low quality.
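
The page properties the researchers describe measuring (words, images, links, link length, headlines) can be counted with a short script. This is a minimal sketch using only Python's standard library, not the study's actual tooling. The class name `PageStats`, the function `measure_page`, and the sample HTML are hypothetical, and "how long are the links" is assumed here to mean the character length of each link's URL.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Counts simple surface features of an HTML page (hypothetical helper)."""

    HEADLINE_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.words = 0
        self.images = 0
        self.links = 0
        self.link_chars = 0
        self.headlines = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            self.images += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:  # count only anchors that actually point somewhere
                self.links += 1
                self.link_chars += len(href)
        elif tag in self.HEADLINE_TAGS:
            self.headlines += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count visible words only; script and style bodies are ignored.
        if not self._skip_depth:
            self.words += len(data.split())


def measure_page(html):
    """Return a dict of the counted features for one page."""
    parser = PageStats()
    parser.feed(html)
    parser.close()
    mean_link_length = parser.link_chars / parser.links if parser.links else 0.0
    return {
        "words": parser.words,
        "images": parser.images,
        "links": parser.links,
        "mean_link_length": mean_link_length,
        "headlines": parser.headlines,
    }


# Made-up sample page, for illustration only.
sample = """<html><body>
<h1>Best Messenger Bags</h1>
<p>We tested five bags.</p>
<img src="bag.jpg">
<a href="https://example.com/buy">Buy now</a>
<script>var x = 1;</script>
</body></html>"""

stats = measure_page(sample)
```

Features like these say nothing about quality on their own; in the episode they are described as properties measured across hundreds of thousands of results and compared over time.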
3:41
They are very strongly
3:44
designed so that Google ranks
3:46
them highly. They are also
3:48
very shallow and they're very dissatisfying
3:50
if you really want to ever
3:52
get an informed decision. Matti
3:54
and his colleagues found that Google would issue
3:56
periodic updates to their search algorithms and that
3:58
would push low-quality review websites
4:01
off the results page. But
4:03
then, as these websites learned the new ways to
4:05
game the search engine, they would creep back up
4:07
again. This pattern will repeat in
4:09
some way or another. It sounds like a
4:11
constant game of cat and mouse. Yes,
4:14
and this is exactly what we call
4:16
it in our paper, too. It seems
4:18
there is an ongoing struggle between the
4:20
content publishers that try to get ranked
4:22
very highly and
4:24
Google themselves who try to update the
4:26
search engine. So what does
4:28
Google itself say about this cat and
4:31
mouse game? Pandu Nayak is
4:33
the chief scientist for search at Google.
4:35
There's certainly an adversarial
4:37
component to it. So yes.
4:40
Pandu confirms there is this constant
4:42
updating needed to keep search results
4:45
relevant. This has been
4:47
sort of challenging since the
4:49
beginning of search. And when
4:51
Google started, they introduced a
4:54
specific algorithm called PageRank. It's
4:56
about using links as
4:58
sort of votes of confidence in
5:00
a particular site and then aggregating
5:03
it in a particular way that
5:05
gave a very nice signal of
5:08
reliability. Is Google getting worse
5:10
over time? Actually, all
5:12
our measurements say that
5:14
it's not getting worse over time.
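
The idea of links as votes of confidence, described a moment ago, can be sketched in a few lines. This is the textbook form of PageRank computed by repeated iteration, not Google's current ranking system. The function name `pagerank`, the damping value of 0.85, the iteration count, and the four-page toy web are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally trusted.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: page names are made up for illustration.
toy_web = {
    "indie-review": ["magazine"],
    "magazine": ["indie-review", "content-farm"],
    "content-farm": ["magazine"],
    "blog": ["magazine", "indie-review"],
}

ranks = pagerank(toy_web)
top_page = max(ranks, key=ranks.get)
```

In this toy web the page that collects the most incoming links ends up with the highest score, which also hints at why link-based signals invite the kind of gaming discussed in the episode.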
5:17
We launch thousands
5:19
of changes every year to
5:22
ensure that the changes we are making
5:24
are actually to the benefit of users.
5:26
Since Matti's paper, Pandu says, Google
5:28
has added all kinds of additional
5:30
measures of whether a page is
5:32
high or low quality. We
5:34
complement this kind of algorithmic
5:37
work with a set of spam
5:39
policies that allows us to take
5:42
action against sites that are trying
5:44
to manipulate us. So
5:46
there's sort of this comprehensive effort.
5:48
It's a fairly significant effort that
5:50
allows us to do these things.
5:53
Google had a big update in March aimed
5:55
at improving search quality. By the
5:57
way, if you're super interested in the inner
5:59
workings of search, internal documents
6:01
about Google leaked recently.
6:04
And this has been
6:06
one piece in all this scrutiny that
6:08
Google has been under. And
6:11
among all this, they've actually read
6:13
Matti's paper. And Pandu
6:15
points out that Matti's paper is
6:17
just focused on product reviews, not
6:19
all types of search. There's just
6:22
this tremendous variety of queries we
6:24
get, and product reviews is just
6:26
one of them. And
6:28
Pandu says every day, 15% of
6:31
search queries are something Google's never
6:33
seen before. And so Google
6:35
is always updating its search tools,
6:37
even using artificial intelligence. Most
6:39
noticeable to users these days is Google's
6:42
AI overview. It's this box at the
6:44
top of some searches which generates answers
6:46
to users' queries using a large language
6:49
model, kind of like ChatGPT. Now
6:52
like all large language models, it's
6:54
not always accurate. And this has
6:57
occasionally spawned some unintentionally hilarious answers.
6:59
Yeah, my favorite was an AI overview, suggesting
7:02
users add glue to a pizza to stop
7:04
the cheese from falling off. How do
7:06
you know until you've tried it? It
7:08
probably works. It may have some side effects there.
7:11
Now to be fair to Google, this
7:14
AI overview box is marked as experimental,
7:16
but it hasn't been great PR. Pandu
7:19
says the algorithm of Google's core search
7:21
engine has been using AI, like deep learning,
7:23
since at least 2015. So
7:26
that sort of generative AI makes it easier to
7:28
create a kind of junk website. Well,
7:31
I mean, I think that's
7:33
certainly true for any
7:36
technology, that it can have
7:38
good and bad uses. Right
7:40
now it's possible that with
7:42
generative AI, the scale of
7:44
the problem might
7:47
go up in the future.
7:50
But once again, I think we are sort
7:52
of ready and willing to engage
7:54
with that problem. It's like
7:56
AI has heightened this cat and mouse
7:58
problem. So it's a
8:01
jaguar versus capybara
8:03
problem. Big cats and the world's
8:05
biggest rodent. But
8:08
that said, Matti Wiegmann, the computer
8:10
scientist, is feeling optimistic about the
8:12
future of search, if only because
8:14
of all this extra attention from
8:16
academics and governments into how big
8:18
tech operates. Plus, for all
8:21
his scrutiny of Google, the search
8:23
engine did eventually help him find
8:25
his messenger bag. I found a
8:27
very interesting indie site about two
8:29
pages down on Google that
8:31
actually had like lots and lots of test
8:33
videos. For all that fuss, he
8:36
ended up settling on a plain black
8:38
messenger bag. It was the world's greatest
8:40
deep dive just to buy a backpack. You
8:43
know what? I respect that research. That's
8:45
what I would have done.
8:47
Google, we should mention, is a sponsor of NPR.
8:51
This episode was produced by Julia Ritchey with
8:53
engineering by Neal Rauch. It was fact-checked by Sierra
8:55
Juarez. Kate Concannon edits the show, and
8:57
The Indicator is a production of NPR.