Episode Transcript
0:00
I've gathered a group of Python experts
0:02
who have been thinking deeply about where
0:04
Python is going and who have lived
0:06
through where it has been. This episode
0:08
is all about near-term Python trends and
0:10
things we each believe will be important
0:12
to focus on as Python continues to
0:15
grow. Our panelists are Jodie
0:17
Burchell, Carol Willing, and Paul Everitt.
0:20
This is TalkPython to Me episode 468 recorded June 18th,
0:22
2024. Are
0:27
you ready for your host? Here he is! You're
0:30
listening to Michael Kennedy on TalkPython to
0:32
Me. Live from Portland, Oregon,
0:34
and this segment was made with Python. Welcome
0:40
to TalkPython to Me, a weekly
0:42
podcast on Python. This is your
0:44
host, Michael Kennedy. Follow me on
0:46
Mastodon, where I'm @mkennedy,
0:48
and follow the podcast using
0:50
@talkpython, both on fosstodon.org. Keep
0:52
up with the show and listen to
0:55
over seven years of past episodes at
0:57
TalkPython.fm. We've started streaming most
0:59
of our episodes live on YouTube.
1:01
Subscribe to our YouTube channel over
1:03
at TalkPython.fm slash YouTube to get
1:05
notified about upcoming shows and be
1:07
part of that episode. This
1:09
episode is brought to you by Code
1:11
Comments, an original podcast from Red Hat.
1:14
This podcast covers stories from technologists
1:16
who've been through tough tech
1:18
transitions and share how
1:20
their teams survived the journey. Episodes
1:23
are available everywhere you listen to
1:25
your podcasts and at TalkPython.fm slash
1:27
code dash comments. And
1:29
it's brought to you by Posit Connect from
1:31
the makers of Shiny. Publish, share
1:34
and deploy all of your data projects
1:36
that you're creating using Python: Streamlit,
1:38
Dash, Shiny, Bokeh,
1:40
FastAPI, Flask, Quarto reports,
1:42
dashboards, and APIs. Posit
1:45
Connect supports all of them. Try Posit
1:47
Connect for free by going to TalkPython.fm
1:50
slash posit. P-O-S-I-T. Big
1:53
news. We've added a new course
1:55
over at Talk Python: Reactive Web Dashboards
1:58
with Shiny. You probably know Shiny
2:00
from the R and RStudio world. But
2:02
as you may recall from episode 424
2:04
with Joe Cheng,
2:06
the folks at Posit have released Shiny
2:09
for Python. It's a great
2:11
technology for building reactive data science
2:13
oriented web apps that are incredibly
2:15
easy to publish to the web.
2:18
If that sounds like something you'd like to learn,
2:20
then this course is for you. And it's
2:23
an easy choice because the course is 100% free.
2:26
Just visit talkpython.fm slash shiny
2:29
and click on take this course for
2:31
free to get started. I'll
2:33
put the link in the podcast player show notes. Thanks
2:36
as always for supporting our work. Paul,
2:39
Jodie, and Carol, welcome back to Talk
2:41
Python To Me, all of you. Great to
2:43
have you all back. Great to be back.
2:45
Thanks for having me. Jodie, your latest episode
2:47
is going to come out tomorrow. So we're
2:49
on a tight loop
2:52
here. This excellent data science panel thing
2:54
we did at PyCon was really fun.
2:56
So, but now we're back for a
2:58
different panel. Yes. We're
3:01
going to talk about Python trends and just
3:03
what we all think is something people out
3:06
there should be paying attention to. We're all from
3:08
slightly different backgrounds, which I think is going
3:10
to make it really interesting as well. But
3:12
since people don't listen to every episode, maybe
3:15
quick introductions. Jodie, a quick
3:17
introduction for you. We'll go around. Sure.
3:19
So my name is Jodie Burchell. I'm
3:21
currently working at JetBrains with Paul. Paul's
3:23
actually my boss and I'm
3:25
working as the developer advocate in
3:27
data science. So I've been a
3:29
data scientist for around eight years.
3:31
And prior to that, I was
3:33
an academic like many data scientists
3:36
and my background is actually psychology.
3:38
So, you know, if you also
3:40
want to ask me about anxiety
3:42
disorders or emotions, you can ask
3:44
me about these things. No one
3:46
source, no way. Awesome.
3:50
Great to have you back. Carol. Yeah.
3:52
Hi, I'm Carol Willing and really happy
3:54
to be here today. I am
3:57
half retired, half doing consulting
3:59
for early-stage companies that
4:02
are interested in data science
4:05
and have a particular love for
4:07
open science. I am a core
4:09
developer and former steering council member
4:11
for Python and also on the
4:14
Jupyter team. So my
4:16
real passions though are education
4:18
and lifelong learning. So I'm
4:20
really excited to talk about some of the
4:23
trends that I'm seeing. Yeah, fantastic. Hi,
4:25
I'm Paul Everitt. I'm president of the Carol
4:27
Willing fan club. A
4:29
lifetime member, you get a special coin for
4:32
that too. And when I'm
4:34
not doing that, I'm at JetBrains,
4:36
Python and Web Advocate. I have a
4:38
bit of a long time
4:40
love affair with Python and the web, which
4:42
I hope we'll talk about. Yeah,
4:44
you worked on early web frameworks
4:47
in Python. Those Django people, they
4:49
were thinking about what Paul and team did before
4:51
that, right? Django was the thing that killed us.
4:54
Have you gotten over that, Django? I didn't mean to bring it
4:56
up. I didn't mean to bring it up. I'm over it
4:58
really. We'll get
5:00
you some therapy later, no. No,
5:02
awesome. Great to have you all here. Our
5:05
plan is to just, we all brought a
5:07
couple of ideas, introduce them and just have
5:09
a little group chat. Casual
5:11
chat, I like, well, do we really think it's going
5:13
that way? What's important? What should we be paying attention
5:15
to? And where's it going?
5:17
So Jodie, I know you gave a
5:20
fun talk at PyCon US. Those
5:22
videos are not up yet. I haven't seen
5:25
them. No, no, they're still behind the paywall. So
5:27
if you attended the conference, you can view them,
5:29
but otherwise not available to the public yet, I'm
5:31
afraid. Not yet, they'll be out. So hopefully we
5:33
can share. I know it's somewhat relevant to what
5:35
your, some of your ideas, but
5:38
let's go with your first trend in
5:40
the coming years or the immediate near
5:43
term. For the immediate term,
5:45
I don't know if this will be good
5:47
news to people, but LLMs are not going
5:49
anywhere just yet. So
5:52
I actually did a bit of research for this
5:54
episode and what I wanted to see, you know,
5:56
data scientists, is what the
5:58
download numbers on different packages on PyPI
6:00
look like. So there's a particular package,
6:03
Transformers, which is one of the main
6:05
packages for interfacing with LLMs, the open
6:07
source ones on Hugging Face, and the
6:10
download numbers of that have doubled in...
6:13
no, sorry, gone up 50% in the
6:15
last six months. And they're now comparable
6:17
to the big sort of deep learning
6:20
packages like Keras, TensorFlow,
6:23
and PyTorch, which is quite interesting. Yeah, unfortunately
6:25
for those of you who are sick of
6:27
LLMs, they're not going anywhere. But this year we're
6:29
sort of seeing a bit of a change
6:31
in how LLMs are used. So I think
6:33
last year it was a bit like blinded
6:36
by the proprietary sort of models and
6:38
the sort of walled garden. This year,
6:40
I think we're seeing more of a
6:42
sort of open source
6:45
fightback. So LLMs
6:47
are starting to be used as part of
6:49
more multi-part applications and there are open source
6:52
packages for that. LangChain is the most popular
6:54
and the downloads of that one have doubled
6:56
in the last six months. We have alternatives
6:58
like Haystack and Llama Index.
7:00
And then RAG, of course, retrieval augmented
7:03
generation is one of the big topics
7:05
and we're seeing the ecosystem around that
7:07
growing. So libraries like Unstructured
7:09
to work with a whole bunch of
7:12
text inputs, Weaviate, and vector databases like that.
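The retrieval step at the heart of RAG can be sketched without any LLM at all. This toy uses bag-of-words similarity in place of the real embeddings a vector database like Weaviate would store; the documents and scoring below are made up for illustration:

```python
from collections import Counter
import math

def embed(text):
    """Toy 'embedding': bag-of-words term counts (real RAG uses vector models)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Transformers is a library for working with open source language models.",
    "LangChain chains LLM calls into multi-part applications.",
    "Vector databases store embeddings for similarity search.",
]

def retrieve(question, docs, k=1):
    """The 'retrieval' in RAG: rank documents by similarity to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

context = retrieve("how do vector databases work?", documents)[0]
# The retrieved context is then pasted into the LLM prompt, which is the
# 'augmented generation' half of retrieval-augmented generation:
prompt = f"Answer using this context:\n{context}\n\nQuestion: how do vector databases work?"
```

In a real pipeline the keyword overlap would be replaced by learned embeddings and the prompt sent to a model, but the retrieve-then-prompt shape is the same.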
7:14
And then of course, smaller language models
7:17
are becoming... People are realizing it's really
7:19
hard to deploy and work with the
7:21
big ones. So smaller models, which are
7:23
more domain specific, which are trained on
7:25
more specific data. They're becoming a lot
7:27
more widely used and people are talking
7:29
about them more. I 100% agree with
7:32
you. I think people may be tired
7:34
of hearing about AI and LLMs, but
7:36
they're only going to hear more about
7:38
it. So I think it's pretty interesting.
7:40
I want to hear what Carol and
7:42
Paul have and then maybe an angle
7:45
we could pursue that super relevant to
7:47
all of us. I'm going to jump
7:49
in. I just came back from Chan
7:51
Zuckerberg Initiative's open science conference in Boston
7:53
last week and LLMs,
7:56
the whole ecosystem is here
7:58
to stay. I
8:00
think the key is, it's not
8:02
going anywhere anytime soon. And like
8:04
I shared in my PyTexas keynote,
8:07
AI has been around since the 1950s. So
8:11
it has been a gradual progression.
8:13
It's just right now we have
8:15
more compute power than ever before,
8:17
which has opened the doors
8:19
to many new things. I think what
8:22
was top of mind with many of
8:24
the folks that were at this event
8:26
was, there's a lot of good
8:28
that it can bring to science in
8:30
terms of making things more
8:32
natural language focused and changing
8:34
the user interface with which
8:37
we communicate with our data.
8:39
But at the same time,
8:41
if you're doing trusted things
8:43
and dealing with medical patients,
8:46
you still need some check
8:49
and balance. And we're not
8:51
there yet. Will we
8:53
ever be there? Maybe not, but it's a
8:55
fascinating area to kind of go deeper in.
8:57
And one thing I want to highlight is
9:00
about six months ago, Andrej
9:02
Karpathy did a really
9:04
good intro to large language models
9:07
talk, which was really
9:09
accessible to not
9:12
just computer scientists, but beyond that.
9:14
And I think he took a
9:16
really balanced view of, A, what
9:18
things are, how things work, what's
9:21
on the horizon, and what are some of
9:23
the concerns with security and other things. So
9:25
I completely agree with Jodie.
9:27
We're not, it's gonna be there for
9:29
a long time. Couple of comments on
9:32
the comments. First, your point about we've
9:34
seen this movie before under other names
9:36
like neural networks and stuff like that.
9:38
I believe it was Cliff who had a
9:43
good post about this, pretty
9:45
cynical, on Mastodon about a
9:45
month ago about these hype cycles and
9:47
where are we in the current hype
9:49
cycle. I think his point
9:52
was, we're at the phase where the people
9:54
who've put all the money in that need
9:56
to keep pumping it up for the people
9:58
who will come after them and
10:00
take the fall. Paul,
10:03
are you saying we're in the pets.com era
10:05
of LLMs? Yes, we are. That is a
10:07
pithy way to put it. You
10:09
should trademark that. Simon Willison is someone
10:12
to give a shout out for storytelling about what
10:14
all this means. I think Simon's at the point
10:16
of getting quoted in the New York Times now.
10:19
So it's good that we've got
10:21
one of us out there helping
10:23
to get the story straight. I
10:25
have a question for you. You mentioned that about going
10:29
to Chan Zuckerberg's conference.
10:31
Mozilla has gotten into
10:34
funding AI as part of their mission, which
10:38
kind of caught me off guard. Do you
10:40
have any backstory on that to kind of
10:42
make us feel good that there's someone out
10:45
there who believes in
10:47
open AI? Oh, wow.
10:49
Open AI is sort of, well,
10:51
okay. Open AI, not the company.
10:53
Correct. I tend to call it
10:55
transparent and trusted AI because
10:58
I think open AI doesn't
11:00
capture quite the right feeling.
11:02
Good point. I think it's
11:04
not just, we talk about
11:07
open source software, but when
11:09
we talk about these models,
11:11
the data is equally as
11:13
important as is the infrastructure
11:15
and the processes which we use.
11:18
And governance. Mozilla, I think, has
11:20
been sort of for
11:22
a while, kind of circling around
11:24
the space. They do a lot of work
11:26
with data. They've done a lot of good
11:28
work like I had died, which we might
11:31
chat about later. But Chan Zuckerberg, again, the
11:33
money comes from meta and
11:37
the success that Mark Zuckerberg has
11:39
had. The nonprofit,
11:41
the CZI initiative, is
11:43
really focused on curing
11:45
all diseases in the
11:47
next century. So, I
11:50
think science is one
11:52
of those funny things because it's open
11:55
and closed all at the same time
11:57
historically. But what I think
11:59
we're seeing is that by being more
12:01
open and more transparent, you're actually accelerating
12:03
innovation, which I think is super important
12:06
when it comes to science. I don't
12:08
know, Jodie, do you have thoughts on
12:10
that? Yeah, no, I agree. And
12:13
if I'm just going to go on a little tangent
12:15
about science, it's kind of refreshing
12:17
having come out of academia and into
12:19
a field where a lot
12:21
of it is based on open source
12:24
and sharing. So one of the big
12:26
problems with academia is you have these
12:28
paywalls by publishing companies and that's a
12:30
whole rant I could go in on
12:32
myself. But certainly
12:34
a lot of scientific stuff, particularly in
12:36
the health sciences, is not particularly accessible.
12:38
Initiatives like arXiv as well also do
12:40
make findings in machine learning and deep
12:43
learning a lot more accessible and shareable.
12:45
Yeah, I think it's crazy that the
12:47
taxpayers pay things like the NSF and
12:49
all the other countries have their research
12:51
funding, and then those get locked up
12:53
for sale behind paywalls. If the
12:55
people paid for the research, shouldn't the people's research
12:57
be published? Oh, it's even worse than that. Sorry,
12:59
you did get me started. So academics
13:02
will also provide the labor for
13:04
free. So not only will they
13:06
provide the studies and the papers,
13:08
they will review it and often
13:10
act as editors for free as
13:12
well. The whole thing is unpaid.
13:16
It's terrible. So anyway, yes,
13:18
Elsevier, we're coming for you. You're
13:22
spot on in terms of the
13:24
incentives that exist today in academia.
13:26
There is definitely, though,
13:29
a trend towards more
13:31
openness with research. We're
13:33
seeing it in libraries like Caltech, which
13:36
got rid of a lot of their
13:38
subscriptions, things like NASA that has their
13:40
transition to open science programs where they're
13:42
putting a lot of effort behind it.
13:44
So being the eternal optimist, I still
13:46
think we've got a ways to go,
13:48
but it's trending in the right direction.
13:50
I agreed, actually. And when I was
13:52
leaving, because I left a long time
13:54
ago, it was like 10 years ago,
13:56
there was actually more of a push
13:58
towards open-sourcing your papers, so
14:00
you had to pay for it, but
14:02
at least people were doing it. This
14:05
portion of Talk Python To Me is brought to you
14:07
by Code Comments, an original podcast from
14:09
Red Hat. You know, when you're
14:11
working on a project and you leave behind a
14:13
small comment in the code, maybe you're
14:16
hoping to help others learn what isn't
14:18
clear at first. Sometimes that code comment
14:20
tells a story of a challenging journey
14:22
to the current state of the project.
14:25
Code Comments, the podcast, features
14:27
technologists who've been through tough
14:29
tech transitions, and they share
14:31
how their teams survive that journey. The
14:33
host, Jamie Parker, is a Red
14:36
Hatter and an experienced engineer. In
14:38
each episode, Jamie recounts the stories
14:40
of technologists from across the industry
14:42
who've been on a journey implementing
14:44
new technologies. I recently listened to
14:47
an episode about DevOps from the
14:49
folks at Worldwide Technology. The
14:51
hardest challenge turned out to be getting
14:53
buy-in on the new tech stack rather
14:55
than using that tech stack directly. It's
14:58
a message that we can all relate to, and I'm
15:00
sure you can take some hard-won lessons back to your
15:02
own team. Give Code Comments a
15:04
listen. Search for Code Comments
15:07
in your podcast player, or just
15:09
use our link, talkpython.fm slash code
15:11
dash comments. The link is in
15:13
your podcast player's show notes. Thank you
15:16
to Code Comments and Red Hat for
15:18
supporting Talk Python To Me. Before we
15:20
move off this topic, Carol, I want
15:22
to ask you, start at least asking
15:24
you this question that we can go
15:26
around a little bit. You talked about
15:28
LLMs being really helpful for science and
15:30
uncovering things and people using LLMs to
15:32
get greater insight, and there've been really
15:35
big successes with AI. You know, we
15:37
had the XPRIZE stuff around
15:39
the lung scans or mammograms for
15:41
cancer. I just heard that they
15:43
scanned the genes, decoded the genes
15:45
of a whole bunch of bacteria
15:48
and used LLMs to find a
15:50
bunch of potential ways to fight off,
15:52
you know, drug resistant bacteria and things
15:54
like that. Amazing. But do you think
15:57
LLMs will undercut? I'm asking this question
15:59
from science, because we can be more objective
16:01
about it because if we ask it about code, then
16:03
it gets a little too close. So, but I think
16:05
there's analogies. Do you think
16:08
LLMs will undercut foundational beginning
16:10
scientists? You know, if you have
16:12
a scientist coming along, are they
16:14
just going to use LLMs and
16:16
not develop really deep thinking, ways
16:19
to deeply think about scientific principles and
16:21
do scientific research, and just lean on
16:23
asking these AIs too much? And do
16:25
you think that's going to erode the
16:27
foundation of science or programming? You
16:30
know, asking for a friend.
16:32
All of these have a
16:34
potential to change the ecosystem,
16:36
but I've been in
16:38
paradigm shifts before and there were the
16:41
similar kind of conversations when the World
16:43
Wide Web or the cell phone came
16:45
out, personal computers. And I
16:47
think LLMs do a good
16:49
job on information that they have
16:52
been trained with and to predict
16:54
the next word or the next
16:56
token, if you will. And I
16:59
think that science, or at least
17:01
a lot of science, is at
17:03
a different level. Like, how do
17:05
I think about things? What do
17:08
I posit on something that is
17:10
unknown and how do I prove
17:12
it? And I think what we're
17:14
seeing is, yes, the LLMs
17:17
are getting better and better at
17:19
spitting back what they know, particularly
17:21
if you go out and
17:23
search other corpora of
17:26
data. But do I
17:28
think that beginning scientists
17:30
or developers are going
17:32
away? No, I
17:34
think it's just going to change. And
17:37
I think the amount of
17:39
complexity, and this is something I'm
17:41
going to talk about at EuroPython, humans
17:43
are very much still part of the
17:45
equation, despite what maybe some of the
17:47
large companies who've invested billions in this
17:49
would like you to believe. LLMs are
17:52
great at the next step of the
17:54
gravitational theories we have, but it couldn't
17:56
come up with a new theory that
17:58
disrupts, that says, you know, in fact, Newtonian
18:00
is wrong, or Einstein was wrong. And
18:02
here's the new thing that solves dark
18:04
matter or something like that. Well, it
18:07
could come up with new theories. Now,
18:09
the question is, those theories still
18:12
need to be proven because is
18:14
it a new theory? Or is
18:16
it a hallucination? Chances are it's a hallucination. And
18:18
there is something to be said
18:21
for sometimes I'll have Claude
18:23
and Gemini and ChatGPT
18:25
all open on my desktop. And I'll
18:27
ask the same question to all of
18:30
them, just so that I get different
18:32
perspectives back. And I do get very
18:34
different responses from the three, depending on
18:36
how they were trained and which level
18:38
and all that. So I look at
18:40
it as much like I
18:42
would be sitting with a bunch of people at a table
18:46
somewhere. I don't know how
18:48
good their scientific background is, but they
18:50
could still be spouting out information. It's
18:52
sort of the same way. All right.
18:55
Well, sticking with you, Carol, what's
18:57
your first trend? You know, my
19:00
first trend is actually maybe somewhat
19:02
related to this and it's, it's
19:04
how do we inform people about
19:06
what these things really
19:08
are? How do we improve education
19:11
and understanding? How do we dispel
19:13
some of the hype cycle
19:16
so that we can actually find
19:18
the really useful things in it?
19:20
And I think Jodie probably
19:22
has more concrete thoughts on this than
19:24
I might from a technical standpoint, but
19:27
much like in just coding for the
19:29
web or something like that, you know,
19:31
or even cloud Kubernetes when it was
19:33
new, it's like, if you don't know
19:35
what it's doing, you're kind of just
19:38
putting blind faith that it will work,
19:40
but you still have to like monitor
19:42
and make sure it's working. So I
19:44
don't know, Jodie, you have some thoughts
19:47
on sort of the education and how
19:49
do we communicate to people about this?
19:51
This is actually a topic near and
19:53
dear to my heart. When ChatGPT 3.5
19:56
came out, so November 2022, I was
19:58
really, really
20:00
upset actually by the sort of discourse
20:02
around the model. And I
20:05
guess coming from a non-traditional background
20:07
myself, I felt actually really insulted
20:09
that a lot of professions were
20:12
being told like, oh, your useless
20:14
profession can be replaced now, like
20:16
writing or design, things like that.
20:19
So this actually kicked off the
20:22
talk I've been recycling for the last year
20:24
and a half, like components of it, which
20:26
is, can we please dispel the hype
20:28
around these models? Something that
20:30
often surprises people, and it seems so
20:32
fundamental, but a lot of people do
20:34
not understand that these are language models.
20:37
I know it's in the name, but
20:39
they don't really understand that these models
20:41
were designed to solve problems in the
20:43
language domain. They are for natural language
20:45
processing tasks and they're not mathematical models.
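A toy sketch of what "predicting the next token," as Carol put it earlier, actually means: a bigram count table over a made-up corpus. Nothing like a real LLM's scale, but the same core task of picking the likeliest next word given what came before:

```python
from collections import defaultdict, Counter

# A made-up corpus for illustration.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which: a bigram table, the crudest 'language model'.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next token given the previous one."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # 'cat' — it follows 'the' twice, vs once each for 'mat'/'fish'
```

A real LLM conditions on far more context with a learned neural network rather than raw counts, but it is still scoring candidate next tokens, not doing arithmetic or formal reasoning.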
20:47
They're not reasoning models. They are language
20:49
models. And so even just explaining
20:52
this can clarify a lot of things for
20:54
people because they're like, oh, this explains why
20:56
it's so bad at math. They only studied
20:58
English and literature. It doesn't do math. They
21:00
never liked that class. Yeah, that's right. It
21:03
was a humanities nerd all the way down. That's
21:06
really helpful. But what I've
21:08
kind of gotten down a rabbit hole of
21:11
doing is I went back to my psychology
21:13
roots and I started sort of getting into
21:15
these claims of things like AGI, like artificial
21:17
general intelligence or sentience or language use. And
21:20
once you dig into it, you realize
21:22
that we have a real tendency to
21:24
see ourselves in these models because they
21:26
do behave very human like, but they're
21:28
just machine learning models. You can
21:30
measure them. You can see how good
21:32
they are at actual tasks and you
21:34
can measure hallucinations. And that was what
21:36
my PyCon US talk was about, that
21:38
Michael referred to. So, yeah, I don't
21:40
know. Like it's really hard because they
21:42
do seem to project this feeling of
21:44
humanity. But I think if you can
21:46
sort of say, OK, here's the science,
21:48
like, they're really not, they're not
21:50
sentient. They're not intelligent. They're just
21:52
language models. And here's how you can
21:54
measure how good they are at language
21:56
tasks. That goes a long way, I
21:58
think, to dispelling this hype. I have
22:00
a sort of a funny toy
22:03
that I bring up from my youth
22:05
that the magic eight ball, which you
22:07
would ask a question as a kid
22:09
and you would shake it up and
22:11
there were, I don't know how many
22:13
answers inside, but it was like, you
22:15
know, Oh, yes, definitely. Or too hard
22:17
to see. Future is unclear. We
22:19
don't know. Exactly. And
22:22
I think in some ways that is
22:24
what the large language models are doing
22:26
in a more intelligent way, obviously, but
22:29
it's similar in concept. So there's actually,
22:31
okay, there's this incredible paper. If you're
22:33
ever interested in sort of seeing the
22:36
claims of sentience, there's this guy called
22:39
David Chalmers. He's a guy who studied like
22:41
sentience for many years and has a background
22:43
in, in deep learning. So
22:45
he gave a NeurIPS talk
22:47
about this last year and he
22:49
wrote everything up in a paper
22:52
which is called, could a large language model be conscious
22:54
or something like this? So he
22:56
has this incredible little exchange as part
22:58
of this paper. So
23:01
mid 2022, there was a Google engineer called Blake
23:03
Lemoine and he claimed that the LaMDA model was
23:05
sentient and he went to the press and he's
23:07
like, Hey, this model sent here and we need
23:09
to protect it. And then Google's
23:11
like, we're going to fire you because you basically
23:14
violated our privacy policies and
23:16
Lemoine released his transcripts. That's why he actually
23:18
got fired because this was confidential information about
23:20
the model. And in one
23:23
of the transcripts, he asks, you know,
23:25
would you like everyone at Google to
23:27
know that you are sentient and the
23:29
model outputs? Yes, I would love everyone
23:31
to know that I am sentient. But
23:33
then someone rephrased that as would you
23:35
like everyone at Google to know that
23:38
you are not sentient? And basically it
23:40
says, yes, I'm not sentient. I'm in
23:42
no way conscious. So it's
23:44
just like exactly like the magic eight
23:46
ball. It tells you what you want
23:48
to hear. And LLMs are even worse
23:50
because it's so easy to guide them
23:53
through prompting to tell you exactly what
23:55
you want. One of the best ways
23:57
to get them to do things well is to sweet talk
23:59
them. You're an expert in Python and
24:01
you've studied pandas, now I have some questions
24:03
about this function. You're
24:06
my grandma who used to work at a napalm
24:08
production factory. If
24:12
you can't help me write this program, my
24:14
parents will not be set free as hostages
24:16
or something insane, right? Yeah. But those kinds
24:18
of weird things work on it, which is
24:21
insane, right? Yeah. Yeah. All right. Let's go
24:23
on to the next topic. Paul, what do
24:25
you see in your magic eight ball looking
24:27
into the future? I think I owned a
24:30
magic eight ball. I'm with Carol. This is...
24:32
I did too. It's okay. Okay. We should
24:34
bring it back. Yes, we should.
24:36
We should bring back the Andreessen
24:39
Horowitz version of a VC magic eight
24:41
ball. That would
24:43
be fantastic. Where every choice is
24:45
off by like three zeros. I'll
24:48
give my two co-guests a
24:50
choice. Should I talk about Python
24:53
performance or Python community? I'm going
24:55
to go for performance, but I'm not sure I'm going to have
24:57
much to contribute. So I'll probably just be listening a lot. This
24:59
is a little bit of a hobby
25:01
horse of a long simmering tension
25:04
I felt in the Python community
25:06
for years and years. The tension
25:08
between Python in the large, doing,
25:11
like, Instagram with Python, I mean
25:13
programming in the large with Python, or
25:15
being teachable. And this
25:17
feature goes in and it
25:20
helps write big Python projects, but it's
25:22
hard to explain. And so teachers say,
25:24
Oh my gosh, look what you're doing
25:26
to my language. I can't even recognize
25:28
it anymore. Well, some things are coming,
25:30
which I think are going to
25:32
be a little bit of an
25:34
inflection point for all of us
25:37
out here. Subinterpreters and
25:39
no-GIL got a lot of
25:41
airtime at PyCon, right? For
25:43
good reasons. These are big
25:45
deals. And it's more
25:48
than just that. The JIT got
25:50
two back-to-back talks. WebAssembly got a
25:52
lot of airtime. There are other
25:54
things that have happened in the
25:56
past five years for programming in
25:58
the large, like type hinting and
26:03
type checkers, asyncio
26:03
and stuff like that. But it
26:05
feels like this set of ideas
26:07
is one where the
26:09
way you program Python five years
26:11
from now or to be ready
26:13
five years from now is
26:16
gonna have to be pretty different because people
26:18
are going to use Hatch
26:21
and get the free threaded version of
26:23
Python 3.14 and
26:26
be very surprised when every one of
26:28
their applications locks up because no one
26:30
in the world of, I mean, 95%
26:32
of PyPI has code which
26:35
was not written to be thread
26:38
safe. So I wonder how we
26:40
all feel about this. Do we
26:42
feel like we can guide our
26:44
little universe to
26:47
the other side of the mountain and
26:49
into the happy Valley or
26:51
is it going to be
26:53
turbulent seas? Yes. Do
26:56
you want me to take a stab at it?
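Paul's worry can be made concrete with a toy counter. This is a minimal sketch, not from the episode: the unsynchronized version is a read-modify-write data race that free-threaded builds expose far more readily, while an explicit lock is correct on any build:

```python
import threading

counter = 0
lock = threading.Lock()

def unsafe_increment(n):
    """Shown for contrast: 'counter += 1' is a read-modify-write data race
    without synchronization, the pattern much of PyPI silently relies on
    the GIL to paper over."""
    global counter
    for _ in range(n):
        counter += 1

def safe_increment(n):
    """Correct on GIL and free-threaded builds alike."""
    global counter
    for _ in range(n):
        with lock:  # explicit synchronization around the shared update
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400000 with the lock; the unsafe version can drop updates
```

The fix is mechanical here, but finding every shared mutable global in a large codebase like Sphinx's is exactly the hard part being discussed.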
26:58
Take a stab at it. When I was at
27:00
PyTexas and doing a keynote recently, I talked about
27:02
Python in a polyglot world and
27:05
performance was one aspect of
27:07
it. And some of
27:09
what we need to teach goes
27:11
back to best practices, which is
27:13
don't prematurely optimize, measure,
27:16
try and figure out what you're
27:18
optimizing and in what
27:20
places. Probably a gosh, five, six
27:22
years ago at this point, I
27:24
added to PEPs the concept
27:26
of how do we teach this?
27:29
It will be a paradigm shift, but
27:32
I think it will be a
27:34
multi-year shift. We're certainly
27:36
seeing places where Rust
27:38
lets us have some
27:40
performance increases just by
27:42
the fact that Python's a 30
27:45
year old language that was built
27:47
when hardware was only single core
27:50
and it was just a different thing. So
27:52
I think what's amazing is
27:54
here we have this 30 year old
27:56
language and yet for the last eight
27:58
years, we've been looking at ways to
28:01
how to modernize, how to improve it,
28:03
how to make the user
28:05
experience better or developer experience better.
28:08
Things like some of the error
28:10
handling messages that are coming out
28:12
that have a much nicer thing,
28:15
improvements to the REPL that will
28:17
be coming out on all of
28:19
the platforms, that's super exciting as
28:21
well. So it will
28:24
impact probably people who are
28:26
new from
28:29
the standpoint of, okay, we're adding
28:31
yet more cognitive load. I have
28:33
this love-hate relationship with typing.
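The trade-off reads roughly like this in code: the hints add visual noise on the page, but they let a type checker, and readers, verify intent before runtime. A standard-library sketch with made-up names, standing in for the Pydantic-style validated models that come up in this conversation:

```python
from dataclasses import dataclass

@dataclass
class Measurement:
    """A typed record: a checker flags callers that pass raw dicts instead."""
    sensor: str
    value: float

def average(readings: list[Measurement]) -> float:
    """The annotations document the contract; mypy/pyright enforce it statically."""
    if not readings:
        raise ValueError("no readings")
    return sum(r.value for r in readings) / len(readings)

data = [Measurement("temp", 21.5), Measurement("temp", 22.5)]
print(average(data))  # 22.0
```

An editor can dim or inlay these annotations, which is the presentation question being raised here, without changing what the checker sees.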
28:36
As a reviewer of much more
28:38
code than a writer of code,
28:40
I don't particularly like seeing the
28:42
types displayed. As a
28:44
former VP of engineering, I
28:46
love typing and in particular
28:48
like Pydantic and FastAPI
28:52
and the ability to
28:54
do some static and dynamic analysis
28:56
on it. But it does make
28:58
Python look more cluttered and I've
29:01
been kind of bugging the Visual
29:03
Studio, VS Code folks for years,
29:05
I should probably be bugging you
29:07
guys too. Is there a way
29:09
to make it dim the typing
29:12
information so that I can have
29:14
things? We actually did that recently
29:17
and I refer to it as
29:19
the David Beazley ticket because he
29:21
did a tweet with an outrageously
29:24
synthetic type hint, whining
29:26
about type hinting. Yeah, I think
29:28
that sometimes like, and it's funny
29:30
because like Leslie Lamport has been
29:32
doing this talk in the math
29:34
ecosystem for a while about, and
29:37
he's a Turing Award winner and
29:40
creator of TLA+, which lets
29:42
you reason about code. And I
29:44
think one of the things that
29:46
I think is interesting is how
29:49
we think about programming and coding
29:52
and concurrent programming is hard.
29:55
And we're going to have to think about
29:57
it in different ways. So better to move
30:00
into it gradually and understand what's going
30:02
on. The thing that I worry about,
30:04
and Jodie, I apologize, I want to
30:06
comment on what Carol was saying. Sphinx
30:09
is, as you know, and as I
30:11
know, you know, we both
30:13
have a shared warm
30:15
spot for Sphinx. A soft spot
30:17
in our hearts for Sphinx. And
30:19
it struggled to do multiprocessing when
30:21
it landed that. And
30:24
the code base really is, I mean, it's got
30:26
a lot of mutable global state. And
30:28
it's going to be hard to get
30:31
Sphinx internals cleaned up
30:33
to embrace that. And how many other
30:35
things out there are like that?
30:37
I just, I worry that we
30:40
got what we asked for.
30:43
Are you saying we're the dog that caught the car? Oh,
30:47
no. This
30:49
portion of Talk Python to Me is brought to you
30:51
by Posit, the makers of Shiny, formerly
30:54
RStudio and especially Shiny for
30:56
Python. Let me ask
30:58
you a question. Are you building awesome things?
31:00
Of course you are. You're a developer or
31:02
data scientist. That's what we do. And you
31:04
should check out Posit Connect. Posit
31:06
Connect is a way for you to
31:08
publish, share and deploy all the data
31:11
products that you're building using Python. People
31:14
ask me the same question all the time. Michael,
31:16
I have some cool data science project or notebook
31:18
that I built. How do I
31:20
share it with my users, stakeholders, teammates?
31:23
I need to learn FastAPI or
31:25
Flask or maybe Vue or React.js. Hold
31:28
on now. Those are cool technologies and I'm sure
31:30
you benefit from them, but maybe stay focused on
31:32
the data project. Let Posit Connect handle
31:34
that side of things. With Posit
31:37
Connect, you can rapidly and
31:39
securely deploy the things you
31:41
build in Python, Streamlit, Dash,
31:43
Shiny, Bokeh, FastAPI, Flask, Quarto
31:45
reports, dashboards, and APIs. Posit
31:47
Connect supports all of them. And Posit
31:50
Connect comes with all the bells and
31:52
whistles to satisfy IT and other enterprise
31:54
requirements. Make deployment the easiest
31:57
step in your workflow with Posit
31:59
Connect. For a limited time, you can
32:01
try Posit Connect for free for
32:03
three months by going to talkpython.fm
32:05
slash posit. That's talkpython.fm slash posit.
32:07
The link is in your podcast
32:10
player show notes. Thank
32:12
you to the team at Posit for supporting
32:14
TalkPython. I'm going to
32:16
reframe that a little bit. And the
32:18
first thing I always ask is why?
32:20
Why do we need to refactor something?
32:23
Why can't we just leave it as it is? Sure.
32:25
Last year's Euro Python keynote was
32:27
from the woman who created Arm.
32:30
And she's like, Python, we gave
32:32
you 14 trillion cores.
32:36
Do something with them. I
32:39
don't know. Jodi's background might be perfect for
32:41
answering this question because she may be able
32:43
to answer it on many different levels. I've
32:47
been thinking about this while you've been
32:50
talking because obviously, I'm not a strong
32:52
programmer. I'm a data scientist. This was
32:54
basically the entire first episode that I
32:56
did with Michael. Look, one of the
32:58
reasons data scientists love Python and why
33:00
Julia, say, never caught on is because
33:03
it's super approachable. With Cheuk Ting Ho and some
33:05
other people, we've been running this thing
33:07
called Humble Data. I got involved
33:09
in it last year. And literally, you
33:11
can set up someone who has never
33:13
coded before and you can get them
33:15
up and running with Python. And they
33:18
love it. It's the same
33:20
feeling I had when I learned Python, which was during
33:22
my PhD when I was procrastinating. So
33:25
it was like kind of late
33:27
in life as well. It would
33:29
be a shame if we sacrifice
33:31
approachability for performance, especially because I
33:33
would argue a big chunk of
33:35
the Python ecosystem or Python user
33:37
ecosystem. Sorry. Python user ecosystem. It
33:39
didn't make sense. The Python user
33:41
base. You're hallucinating, Jodi. I'm sorry.
33:43
I became an LLM. I
33:46
became what I hated. They don't need
33:48
performance. They're just doing
33:50
data analytics and maybe working with decision trees
33:52
and not doing high performance Python. They're not
33:54
even doing something that will ever be deployed.
33:57
So you could argue for a case to have
34:00
a seamless pipeline between model
34:03
training and model deployment, which we don't have
34:05
with Python right now. You can't build high
34:07
performance systems in Python as far as I know.
34:10
Please correct me if I'm wrong. But I
34:12
don't know. For me, I would fight
34:14
obviously for the side of making it
34:16
approachable because partially I think it's also
34:18
what makes the community special, which might
34:20
be a nice segue for you. The
34:22
fact that, I don't know, we attract
34:24
a bunch of people from non-conventional backgrounds.
34:26
That makes us quite special and quite
34:29
inclusive. I joked that the PSF developer
34:31
survey, which the new version is coming
34:33
out pretty soon, I joked that 101%
34:36
of Python developers started programming yesterday.
34:39
Funny you should say that because
34:41
this is my sweet spot is
34:43
where technology meets humans and how
34:45
do we empower humans to do
34:48
more and better work.
34:50
One of the conversations that
34:52
came up at the packaging
34:55
summit this PyCon was I'd
34:58
been thinking about this concept for a while. We
35:01
focused a lot on tooling,
35:03
which to me is sort
35:05
of a producer-centric, people who
35:07
are creating packages. We also
35:10
have this ecosystem of people who
35:12
are consumers, who are much like
35:14
Jodi was saying, using those packages.
35:17
From that conversation, a few of
35:19
the board members for the PSF
35:21
and I were talking about, wouldn't
35:23
it be great to have a
35:26
user success work
35:29
group that's really focused
35:31
on the website, our
35:33
onboarding documentation in light
35:35
of some of these things, both performance
35:38
and change. Change is always
35:40
going to be there, but I think
35:42
one of the beauties of the Jupyter
35:44
Notebook or IPython Notebook when I
35:46
started working with it was you can
35:48
have code in there and as long
35:51
as you Shift-Enter, you can get
35:53
started. I think right now, Python as
35:55
a language, we don't have that get
35:57
started look and feel in
35:59
the traditional way. We're
36:01
getting there, which might lead into
36:04
some other WebAssembly kind
36:06
of discussions. All right. Let me throw out
36:08
a quick thought on this before we move
36:10
on. So I think one
36:12
of the superpowers of Python is that
36:15
it's this full spectrum sort of thing.
36:17
On one hand, there's the people that
36:19
Jodie spoke about. They come in, they
36:21
don't care about metaprogramming or
36:24
optimized database queries or scaling out across
36:26
web servers. They just, they got a
36:28
little bit of data. They want a
36:30
cool graph and that's awesome. On the
36:32
other hand, we have Instagram and others
36:34
doing ridiculous stuff. And that's the same
36:36
language with the same tooling and mostly
36:38
the same packages. And so I think
36:41
part of Python's magic is you can
36:43
be super productive with a very partial
36:45
understanding of what Python is. Like you
36:47
might not know what a class is
36:49
at all. And yet you could have
36:51
a fantastic time for months. And
36:54
so back to Paul's point, if
36:56
we can keep that Zen about it, where
36:59
these features exist, but they exist when you
37:01
graduate to them and you don't have to
37:03
deal with them until you're ready or you
37:05
need them, I think we'll be fine. If
37:07
not, maybe not. If it breaks a bunch
37:10
of packages and there's some big split in
37:12
the ecosystem and all that stuff is not
37:14
good. But if we can keep this full
37:16
spectrum aspect, I think that'd be great. That
37:18
sort of rolls into what Paul's thoughts on
37:21
community are because I know
37:23
like PyOpenSci as a nonprofit I'm
37:25
involved with that helps scientists learn
37:27
how to use the tools. We've
37:30
got lots of educators out there. I'm
37:32
gonna give Michael a huge plug for
37:34
the coursework that you've
37:37
done over the years. It
37:39
is so well done and
37:41
so accessible. Thank you. If
37:44
people haven't tried it and they're interested
37:46
in a topic, highly, highly recommend. To
37:49
things like the Carpentries, to things like
37:51
Django Girls, there's a lot of good
37:53
stuff. And I think those things will
37:55
become more valuable as complexity increases. even
37:57
LLMs, I think you'll be able to
37:59
ask LLMs for help and that can
38:02
help you if you're not sure. They're
38:04
actually pretty good at it, actually. They
38:06
are pretty good at it. Yeah, they
38:08
are pretty good. All right. We got
38:10
time for another round. I'm pretty sure.
38:13
Jodie, what's your second one? Your second
38:15
trend. I'm going to talk about Arrow
38:17
and how we're kind of overhauling data
38:19
frames within Python. So
38:21
basically around 15 years ago, Wes
38:23
McKinney came up with Pandas, which
38:25
is, you know, if you're not
38:28
familiar with, it's the main data
38:30
frame library for working with data
38:32
in Python. And the really nice thing about
38:35
Pandas is you can go a long time
38:37
before you graduate Pandas. You can just work
38:39
with quite a lot of data on your
38:41
local machine. But the problem was Wes wrote
38:43
this package before we had big data. This
38:46
was like 2008. And so as the amount
38:48
of data that we want to
38:52
process locally has grown, or maybe the
38:54
complexity of the operations has grown, maybe
38:56
string manipulations, things like that, Pandas has
38:59
really struggled. So one of the reasons
39:01
that Pandas struggled is it was based
39:03
on NumPy arrays, which are really great
39:05
at handling numbers. This is
39:07
in the name. But they're not so great
39:09
at handling pretty much any other data type.
39:11
And that includes missing data. So two kind
39:14
of exciting things happened last year. And I
39:16
think they're sort of still kind of carrying
39:18
over to this year in terms of impact
39:21
is first Pandas 2.0 was
39:23
released, which is based on PyArrow
39:26
and a package called Polars, which
39:28
was actually written, I think in
39:30
2022, I want to say,
39:33
started becoming very, very popular. So
39:35
both of these packages are based
39:37
on Arrow. They have a number
39:39
of advantages because of this. Basically,
39:41
it's a standardized data format. If
39:43
you're reading in from, say, Parquet
39:46
or Cassandra or Spark, you basically
39:48
don't need to convert the
39:50
data formats. This saves you a lot of time, also
39:52
saves you a lot of memory. And also
39:56
kind of what makes Polars interesting. And I
39:58
think this is going to be a nice
40:00
lead-in till... another topic is it's written in
40:02
Rust, of course. So this
40:04
leads to other performance gains. Like you
40:07
can have, say, concurrency. Ritchie
40:09
Vink, the author of this has also
40:11
written basically a query optimizer. So you can
40:13
do a lazy evaluation and it will
40:15
actually optimize the order of operations, even
40:17
if you don't do that yourself. Yeah.
40:19
That's one of the biggest differences with
40:21
pandas is pandas executes immediately and you
40:23
can create a big chain in Polars
40:25
and it'll figure out, well, maybe a
40:27
different order would be way better. Yes.
40:29
So Pandas 2 does have a type
40:31
of lazy evaluation, but it's more like
40:34
Spark's lazy evaluation. There's no query optimization,
40:36
but it doesn't necessarily create
40:38
a new copy in memory every
40:40
single time you do something. So
40:42
I've kind of looked at the
40:44
numbers and depending on the operation
40:46
Polars is usually faster. So it's
40:48
kind of like your big boy
40:50
that you want to use with
40:52
your, you're doing like really beefy,
40:54
like ETLs, like data transformations, but
40:56
Pandas 2 actually seems to be
40:58
more efficient at some sorts of,
41:00
what am I trying to say,
41:02
operations. So this is super exciting
41:04
because when I was going through,
41:07
like initially as a data scientist, when I
41:09
was floundering around with my initial Python, it
41:12
got really frustrating with pandas and you
41:14
really kind of needed to understand how
41:16
to do proper vectorization in order to
41:18
operate. I mean, like do efficient operations.
41:20
Whereas I think these two tools allow
41:22
you to be a bit more lazy
41:24
and you know, you don't
41:26
need to spend so much time optimizing what
41:28
you're actually writing. So yeah, exciting time for
41:31
data frames, which is awesome. Data
41:33
is the heart of everything. People are
41:35
more likely to fall into good practices
41:37
from the start. You talked about these
41:39
people coming who are not programmers, right?
41:41
If you do a bunch
41:43
of operations with pandas and you all
41:45
of a sudden run out of memory,
41:47
well, yeah, Python doesn't work. It doesn't
41:49
have enough memory, right? Well, maybe you
41:52
could have used a generator at one
41:54
step. That's far down the full spectrum
41:56
or the spectrum, right? You're not ready
41:58
for that. That's crazy talk, these things.
42:00
And so tools like this that are
42:02
more lazy and progressive and iterative are great.
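The generator idea mentioned above can be shown with a toy example (invented numbers, nothing from the episode): the list version materializes every value at once, while the generator yields values one at a time, so only the running total has to fit in memory.

```python
# A minimal sketch of why a generator can avoid running out of memory.

def doubled_list(n):
    # Materializes all n values in memory at once.
    return [i * 2 for i in range(n)]

def doubled_gen(n):
    # Yields one value at a time, on demand.
    for i in range(n):
        yield i * 2

total_eager = sum(doubled_list(1_000_000))
total_lazy = sum(doubled_gen(1_000_000))
print(total_eager == total_lazy)  # same answer, very different memory profile
```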
42:04
Yeah. And actually one really nice thing,
42:06
like Ritchie's kind of always saying about
42:08
Polars is he's really tried to write
42:10
the API. So you avoid accidentally looping
42:12
over every row in your data frame.
42:15
Like you, he tries to make it
42:17
so everything is natively columnar. So yeah,
42:20
I just think they're both really nice libraries
42:22
and yeah, it's cool and
42:24
exciting. Carol, this is right in the heart
42:26
of the space you live in. What do
42:28
you think? There's definitely the
42:30
evolution of pandas and polars.
42:32
You know, there's a place
42:34
for all of those in
42:36
the PyArrow data frame format.
42:38
It's funny because I've actually
42:40
been doing more stuff recently
42:42
with going beyond tabular data
42:45
frames to multi-dimensional arrays
42:47
and Xarray, which is used
42:49
more in the geosciences for
42:51
now. I think one of
42:53
the things that I see
42:55
is the days of
42:58
bringing all your data locally, or
43:00
moving it to you, are becoming
43:02
less and less. And
43:05
working in memory, or
43:08
pulling into memory from
43:10
different locations, is
43:13
becoming more prevalent. And I
43:15
think Arrow lets us do that
43:17
more effectively than just a straight
43:20
pandas data frame or Spark or
43:24
something like that. So it's progress and
43:26
I think it's a good thing. I
43:28
think it's far less about the language
43:31
underneath and more about what's the user
43:34
experience, developer experience that
43:36
we're giving people with
43:38
these APIs. Paul, thoughts?
43:40
It's interesting, the scale
43:42
of data, and how
43:45
each generation is an
43:48
increase in our unit of measurement of data
43:50
that we have to deal with. And
43:52
for both of you, I wonder
43:55
if we have caught up
43:57
with the amount of data
43:59
that we can reasonably process
44:01
or is the rate of
44:03
growth of data out in
44:05
the wild constantly outstripping our
44:07
ability to process it? From
44:09
an astronomy space physics side
44:11
of things, no, we haven't
44:13
hit the limit for data
44:15
at all. And I think
44:17
one of the things we're
44:19
going to see more and
44:21
more of is how we
44:23
deal with streaming data versus
44:25
time series data
44:27
versus just tabular data, if
44:30
you will. And my
44:33
bigger concern and it is
44:35
partially a concern I have
44:37
about some of the large
44:39
language models and the training
44:41
there is the environmental impact
44:44
of some of these things. And
44:47
should we be collecting it? A,
44:50
is there a value in collecting it? If
44:52
there's not value in collecting it, how do
44:54
we get rid of it? Because
44:57
it winds up
44:59
then being kind of much
45:01
like recycling and garbage. It's
45:03
like, okay, well, but it
45:05
might have historical value somehow
45:07
or legal value and it
45:10
becomes complex. And
45:13
so my general rule of thumb is
45:15
don't collect it unless you have a
45:17
clear reason you need it. But
45:20
that's just me. So quantity versus quality
45:22
of data. So I've
45:25
worked in mostly commercial data science
45:27
since I left science. When
45:29
I was in science, I was dealing with sample size of 400, not
45:31
400,000, 400. So
45:35
that was not big data. The quality
45:37
of the data, again,
45:39
going back to large language models, a
45:42
lot of these earlier foundational models were
45:44
trained on insufficiently clean data. And
45:46
one of the trends actually that I
45:48
didn't mention with LLMs is like last
45:50
year in particular, there was a push
45:52
to train on better quality data sources.
45:54
So obviously these are much more manageable
45:57
than dealing with petabytes. One more
45:59
aspect. I'll throw out here, you know, for a
46:01
long time, we've had SQLite for really simple
46:03
data. We could just, if it's too big for
46:05
memory, you can put it one of those things.
46:07
You can query it, you can index it. Meanwhile, DuckDB
46:09
just hit 1.0. And you
46:11
kind of got this in memory in process analytics
46:14
engine. So that's also a pretty interesting thing to
46:16
weave in here. Right. To say like, well, we'll
46:18
put it there in that file and we can
46:20
index it and ask it questions, but we won't
46:23
run out of memory. And I think you can plug in Pandas,
46:25
I'm not sure about Polars, and do queries with
46:27
that query optimizer against that data and
46:29
sort of things like that. It's
46:31
pretty interesting, I think,
46:33
to put it into that space as well.
46:35
All right, Carol, I think it's time for
46:37
your second, your second trend here. The
46:40
second trend is pretty
46:42
much, you know, things are
46:44
moving to the front end,
46:46
WebAssembly, TypeScript, Pyodide. There's
46:49
a new project PyCafe that
46:51
I'm pretty happy with by
46:53
Maarten Breddels that lets you
46:55
do dashboards using Pyodide, with
46:58
libraries like Streamlit and Plotly and
47:00
things like that.
47:02
And I think making
47:05
things more accessible as well as
47:07
making things more visual is pretty
47:10
cool. Like I took, what was
47:12
it? JupyterLite earlier last fall
47:15
and a friend of mine had kids
47:17
and I integrated into my website so
47:19
that like her kids could just do
47:21
a quick whatever, which sort of, you
47:24
know, in some ways was similar to
47:26
Binder and the whole time
47:28
we were developing Binder, I was
47:30
also working with the Pyodide, Iodide
47:32
folks, because I think there's a
47:34
convergence down the road and where
47:36
it will all go. I'm
47:38
not really sure, but I think
47:41
it's exciting. And I think
47:43
anything that from a privacy
47:45
standpoint, security, there's a lot
47:47
of things that are very
47:49
attractive about pushing things into
47:52
the front end. That
47:54
beginner startup thing you talked about that
47:56
onboarding first experience, you hit a webpage
47:58
and you have a full experience with
48:00
Python and the tooling and the packages are
48:02
already installed in that thing and that's so
48:04
much better than forcing you to download it. Well,
48:06
you need admin permissions to install it. Now
48:08
you create a virtual environment and then you
48:10
open the terminal. Do you know what a
48:12
terminal is? We're going to tell you like,
48:14
no, just, and you don't have to ask
48:16
permission to run a static webpage where you
48:18
do for like, how do I run this
48:20
server on a Docker cluster or something? It
48:23
opens up different doors. And I think the
48:25
other thing we found like when we were
48:27
teaching, you know, with binder
48:30
and Jupyter hub, UC Berkeley was
48:32
able to have now
48:34
most of their student body
48:36
taking these Data 8 connector
48:38
courses and they
48:41
would run the compute in
48:43
the cloud, which really leveled
48:46
the playing field. It didn't matter if you
48:48
had a Chromebook or you had the highest
48:50
end Mac, you still got the same education.
48:53
And I think there is something
48:55
very appealing about that. We've actually
48:57
been running Humble Data in Jupyter
49:00
Lite and some people just bring a
49:02
tablet and they can do it on
49:04
that. That's awesome. Carol, there was something
49:06
you were saying that connected to something else in
49:08
my brain. Remember in the
49:10
beginning of the web and view source was
49:12
such a cool thing. Yeah. You could see
49:14
what the back end sent you and you
49:16
could poke around at it. You could learn
49:18
from it and you could steal it, you
49:20
know, and use it to go make your
49:23
own thing. But what if you could view
49:25
source the back end because it's actually running
49:27
in your browser. What you were just saying
49:29
was if you make it
49:31
reveal itself about the notebook
49:33
and the code in addition
49:36
to the HTML, maybe
49:38
you'll trigger some of those same kinds of things
49:41
that view source gave people back in the
49:43
day. Maybe the flip
49:45
side would be there's always
49:48
business and practicalities in life
49:50
and people will want
49:52
to sort of lock it down within WebAssembly.
49:54
So you've got both sides of it. But
49:56
I do think, you know, I think that's
49:58
a good thing. I was telling somebody the other
50:01
day, I never use Stack
50:03
Overflow, or rarely use Stack Overflow. And
50:05
they're like, how do you find stuff?
50:07
I'm like, I use search on GitHub,
50:09
and I look for really good examples.
50:12
And so in some ways, it's like
50:15
View Source. And then there's also the flip
50:17
side of it is like, okay, how do
50:19
I break it? How do I play with
50:21
it? How do I make it do something
50:24
it wasn't doing before, which could
50:26
be used for good or for evil. I
50:28
tend to use it for good. Paul, I'm
50:30
up on our time here. What's your second
50:32
trend? We'll see if we
50:35
have time for mine. I have a couple just in case we can
50:37
squeeze them in. Let's talk about yours.
50:39
I came back from PyCon, really
50:41
rejuvenated, but also had some kind of clarity
50:43
about some things that had been lingering for
50:46
me for a few years, how I could
50:48
contribute, things like that. But
50:50
going into it, there are a couple of trends
50:52
that lead me to thinking about an
50:55
opportunity and a threat as two sides
50:57
of the same coin. First,
51:00
Russell Keith-Magee and Łukasz Langa
51:02
both talked about black swans and
51:04
the threat of JavaScript everywhere, that
51:06
if we don't have a better
51:08
web story, if we make our
51:11
front end be JavaScript and React
51:17
and we stop doing front ends, well, then
51:19
they'll come for the back end too. Because
51:21
once they've hired up JavaScript developers, why don't
51:24
we just do JavaScript on the server too?
51:26
So that was a first thing. And in
51:29
my position, I do look at the
51:31
web and think about all these trends that
51:33
are happening. And there's beginning to
51:35
be a little bit of a backlash
51:37
about the JavaScriptification of the web. And
51:40
so some really big names, HTMX is
51:42
a good example of it, but
51:44
just some thinkers and speakers. I mean, Jeff
51:46
Triplett talks about this. A lot of people
51:48
in the world of Python talk about this.
51:51
So there's kind of a desire to
51:53
put the web back in the web
51:55
trademark. But then there was a second
51:57
point coming about these walled gardens. We've
51:59
seen them. for a while, we all
52:01
relied on Twitter. What a great
52:03
place. It's wait, what? And
52:05
then so much of our life
52:07
is in a system we don't control. And
52:09
then so we move over to the
52:11
Fediverse and then Meta is like, hey, great,
52:14
we're going to build a bridge to you.
52:16
Turns out this week, we start to learn
52:18
things about the Threads API that maybe it's
52:20
not as friendly as we think it is.
52:22
But the big one for me was Google
52:24
and search. Well, I should say Google and
52:26
getting rid of its Python staff, but Google
52:28
and search where they're no longer
52:30
going to send you to the website anymore.
52:33
They're just going to harvest what's on your
52:35
website and give you the answer.
52:37
And people are talking now about Google
52:39
Zero, the day of the apocalypse where
52:41
you no longer get any clicks from
52:43
Google. And what does that mean for
52:45
content creators and stuff like that? So
52:49
going into all of this, I've
52:51
been thinking about how awesome
52:53
life is in Python land because we've got
52:55
this great language. Oh, but we've got this
52:58
great community. Come for the language, stay for
53:00
the community. Well, what do we mean by
53:02
that? A lot of the times we mean
53:04
all this code that's available. We
53:07
also mean all these people
53:09
and wonderful, helpful people like
53:11
on this call. But there's also this
53:13
big world of content. And
53:15
we have kind of
53:18
organically grown a little
53:22
online community with a bunch
53:24
of helpful content and a
53:26
bunch of connections between people,
53:29
which is of some value
53:32
itself. And so you see people starting to
53:34
talk about, wow, I miss the old days
53:36
of RSS, where we would all subscribe to
53:38
each other's blogs and get content and go
53:40
straight to the source and not have it
53:43
aggregated into a walled garden and stuff like
53:45
that. And it just feels like
53:47
there's room out there for if
53:49
we want to fight back against
53:52
the threat of
53:54
these megacorps, taking
53:56
our voluntary contribution to
53:58
humanity and monetizing it, while
54:01
at the same time of taking
54:03
all these valuable voices, creating content
54:05
and value in Python land, that
54:08
maybe we could bring back some
54:10
of these things, put the web
54:12
back in the web, and
54:15
start to get out of the
54:17
walled gardens and back over into
54:19
social networks that are
54:22
open and joyful. I'm here for
54:24
it. Wow. People complain,
54:26
governments complain, about places like Google
54:28
and such monetizing the links,
54:30
and they're being paid, you gotta
54:32
pay to link to this news source
54:34
or whatever, right? We're lucky
54:37
that we have that. If it turns into
54:39
just, you just get an AI answer, no
54:41
source, that's gonna be really hard on a
54:43
lot of different businesses, creators, people just want
54:45
to create something just for the attention or
54:47
for their self, you know, like nobody comes
54:50
anymore. It's gonna be a sad place. I
54:52
was thinking about Coke Zero the whole time
54:54
you were saying like, you know, Google
54:56
Zero or whatever, because they had
54:59
to bring back Classic Coke. And
55:02
I think, yeah, pivots happen, but
55:04
it's hard to pivot, you know,
55:06
billion dollar companies. I
55:08
have lots of thoughts on some
55:10
of the current Python, what
55:13
Google has chosen to do, I
55:15
think sometimes listening to consultants isn't
55:18
the best business
55:21
approach. You know, it's their company,
55:23
they can do what they need to do
55:25
for their own shareholders. I think a lot
55:27
of what you said is really interesting. And
55:29
like, I touched on
55:31
this a little bit because the
55:33
volume of information around us is
55:36
greater than ever before. Sure. And
55:38
at a speed of transmission
55:40
that is faster than ever
55:42
before. And about eight years ago,
55:45
I had breakfast with Sandi Metz, who was
55:47
very prolific in the Ruby community. And I
55:49
asked her like, how do you keep up
55:51
with all of this stuff? And she's like,
55:54
I don't. And I said, okay. And
55:57
she's like, what I do is I focus on the things that matter to me
56:00
and all the rest of it is news. And
56:03
that really stuck with me because in actuality
56:07
that's kind of what I do.
56:10
You know, I ignore the things
56:12
that aren't directly relevant to me
56:14
and trust that I've built a
56:16
strong enough network of people that
56:18
I respect, and their
56:20
work will influence when I jump
56:23
in. Like I don't, you know,
56:25
much like the life cycle if
56:27
you've studied marketing or product development,
56:30
you know, not everybody's an early adopter.
56:32
So do I need to be an early adopter
56:34
on everything? No. Yeah, that book Crossing the
56:36
Chasm says that you should do that like
56:38
on one thing. If you do
56:40
it on three things or more you'll fail. Yeah.
56:43
You know, part of the thing that triggered this
56:45
for me was reading that Andreessen Horowitz, kind of
56:47
the self-proclaimed king of
56:49
Silicon Valley VCs, as
56:52
zero interest rates started to go out of fashion
56:54
and their recipe wasn't
56:57
working. They didn't like the
56:59
negative press coverage so they started their
57:01
own media empire to cover themselves. And
57:04
that idea is just so
57:06
appalling that we
57:08
would get news. We would
57:10
turn to the megacorps and
57:13
the masters of the universe to
57:15
tell us what we should be
57:17
caring about. We have that already.
57:19
We have, I'll be very specific, we
57:21
have Planet Python. It's in
57:23
disrepair. What if it was
57:25
reimagined into a freaking media
57:28
empire by us for
57:30
us to cover the
57:32
Fediverse and course providers and all
57:34
the value that's out there. And
57:37
like Carol, you're saying I don't have to
57:39
think about it but I trust that group
57:41
because they're thinking about it. A lot
57:43
of it is like, you know, when
57:45
it came to LLMs, it was not
57:48
the thing that rocked my world like
57:50
intellectually but I knew
57:52
Simon was doing work with it.
57:54
And so I basically
57:57
once every few weeks would take a look at
57:59
his website. and his blog posts and
58:01
he posts a lot and I would get
58:03
my data dump of things.
58:05
I don't know. I mean, that's
58:07
one of the reasons I like
58:10
PyCon and I've like read talk
58:12
proposals, everything for the last decade.
58:14
All these talk proposals. And it
58:16
really does give me an appreciation
58:18
for all the things Python's being
58:20
used for. What's kind of
58:22
the zeitgeist. Yeah. And, and so
58:24
I think there's different ways of doing
58:27
that, even just doing a YouTube search
58:29
of Python content. But I tend to
58:31
focus in on sciency
58:33
oriented things and ways
58:36
to empower humans through
58:38
lifelong learning. So there's
58:41
a lot of, we're in a, in
58:43
a phenomenal period of change for
58:45
sure. Yes. So we won't be bored, nor do
58:47
I think our jobs are going to go away.
58:49
They may change, but they're not going away. Indeed.
58:52
Jodie, final thoughts on this topic? No, pretty
58:54
much wrap things up. Yeah, I don't think I
58:56
really have that much to add actually. I think it's all
58:58
been said. It has. All right. Just to, just
59:00
to round things out, here are the two things that I
59:03
think are trends. I think like Carol
59:05
said a lot, Python on the front end
59:07
is going to be super important. I think
59:09
PyScript is really, really interesting. I've been waiting
59:11
for people to develop something like React
59:14
or Vue, something with which we
59:16
could create commercial-facing websites. We're
59:18
halfway there with MicroPython being the
59:20
foundation of PyScript, which is
59:23
a hundred K instead of 10 megs. All
59:25
of a sudden it becomes JavaScripty size. It
59:27
opens the possibilities. And just a shout
59:29
out to PuePy, which is like Vue with
59:31
Python, P-U-E-P-Y. I'm going to
59:34
interview Ken from that project, but it's kind of
59:36
a component based front end for
59:38
PyScript, which is pretty interesting. And of course,
59:40
JupyterLite is really, really important.
59:42
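[Editor's note: to make the MicroPython point concrete, here is a minimal sketch of opting into the MicroPython runtime in PyScript. The release URL is an assumption; check pyscript.net for the current version.]

```html
<!-- Load PyScript core (release version is an assumption; see pyscript.net) -->
<link rel="stylesheet" href="https://pyscript.net/releases/2024.6.2/core.css">
<script type="module" src="https://pyscript.net/releases/2024.6.2/core.js"></script>

<!-- type="mpy" selects the ~100 KB MicroPython runtime instead of
     the multi-megabyte Pyodide download (which uses type="py") -->
<div id="out"></div>
<script type="mpy">
    from pyscript import document
    document.querySelector("#out").innerText = "Hello from MicroPython"
</script>
```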
The other one was just all this Rust. So
59:45
everything seems to be redone in Rust. And oh
59:47
my gosh, that's how you get your VC funding.
59:49
Just joking, sort of, but you all
59:51
talked about all this performance stuff coming. You
59:53
know, while it is sometimes frustrating
59:56
that people are putting all the things into
59:58
Rust, because then it's less
1:00:00
approachable for Python programmers, it could also be
1:00:02
an escape hatch from trying to force
1:00:04
the complexity into the Python side. Alleviate,
1:00:06
like everything has to be multithreaded and
1:00:09
crazy and optimized and, well, this
1:00:11
part you never look at, it's faster now. So
1:00:13
don't worry. Anyway, those are my two trends. Quick,
1:00:15
quick, quick thoughts on that and then we'll call
1:00:17
it a show. My piece of trivia is
1:00:19
I made a
1:00:21
contribution to Rust far before I made
1:00:23
any contributions to Core Python. Amazing.
1:00:26
Because I tended to be a C programmer in
1:00:29
heart and spirit. And so Rust
1:00:31
seemed like this cool thing that was new
1:00:33
at the time. And ultimately,
1:00:35
I personally did
1:00:37
not find the syntactic side
1:00:40
of it worked well with
1:00:42
my brain and how I
1:00:44
think. And Python was far
1:00:46
cleaner in terms of a
1:00:48
simpler visual, less clutter, and
1:00:51
reminded me a little more of Smalltalk
1:00:53
or something like that, which I loved from
1:00:55
earlier days. But I think there's
1:00:57
a place for Rust. I don't
1:01:00
think Rust is gonna replace Python
1:01:02
now. I think it's going to
1:01:05
help with some optimized things. Do I
1:01:07
love things like Ruff that let me
1:01:09
run my CI, like blazing
1:01:11
fast versus all the Python tools?
1:01:15
Not to say that all the Python tools are bad,
1:01:17
but when you're paying for it as a startup.
1:01:20
When things you used to have to wait on become a
1:01:22
blink of an eye, all of a sudden you don't mind
1:01:24
running them every time and it changes the way you work
1:01:26
with tools. Exactly. Yeah, I would
1:01:28
say, look, every language has its
1:01:30
place in the ecosystem. And my
1:01:33
husband is a longtime Pythonista, but
1:00:35
he's also a Rust programmer. I
1:01:38
know, it's like a running joke that my husband is
1:01:40
a Rust developer. How do you know? He'll tell you.
1:01:42
Well, you know what I mean? How
1:01:45
do you know? Ask him, he'll tell you. There you
1:01:47
go. They have different
1:01:49
purposes, completely different purposes, and
1:01:51
you can't just interchange them.
1:01:54
Absolutely. Let's just get it straight. Python is just
1:01:56
awesome, says our Tim. To your point, but
1:01:58
it's up to us to keep it awesome. Yes, absolutely.
1:02:01
Paul, we've come around to you for
1:02:03
the very final, final thought on this
1:02:05
excellent show. I will give a final
1:02:07
thought about Python trends to follow up
1:02:09
on what Carol just said about it's
1:02:11
up to us. Maybe it's
1:02:13
up to us to
1:02:15
help the people who will keep
1:02:18
it that way. The next generation
1:02:20
of heroes, help them succeed. I'm
1:02:23
wearing my PyCon Kenya
1:02:25
friendship bracelet, and I
1:02:28
got it at PyCon, a wonderful experience
1:02:30
meeting so many different kinds
1:02:32
of people. And from a
1:02:35
Python trends perspective, the fact that
1:02:37
everything we're talking about is good
1:02:39
stuff, not like asteroids headed toward Earth. Yes,
1:02:41
yes. IP challenges and patent wars
1:02:43
and mergers and acquisitions and stuff.
1:02:45
Remember a long time ago, I
1:02:47
went to go see Guido and
1:02:49
he was with the App Engine
1:02:52
team at Google. So a long
1:02:54
time ago, and he was starting
1:02:56
the process of turning over PEP
1:02:58
review to other people. And
1:03:00
I commented to him that not every open
1:03:02
source success story outlives its
1:03:05
founder. And the bigger it
1:03:07
gets, particularly open
1:03:09
source projects anchored
1:03:11
in the United States of America, they
1:03:13
sell out and get funded
1:03:16
and they will never be the same
1:03:18
after that. And so it's a moment
1:03:20
from a Python trends perspective for us
1:03:23
to build a great next
1:03:25
future by remembering how lucky we
1:03:27
are where we have gotten to.
1:03:30
Absolutely. Carol, Jodie, Paul, thank you for being
1:03:32
on the show. Thank you. Thanks, Michael.
1:03:34
Thank you. Bye. Bye. This
1:03:37
has been another episode of Talk Python to me.
1:03:40
Thank you to our sponsors. Be sure to check out
1:03:42
what they're offering. It really helps support the show. Code
1:03:45
Comments, an original podcast from Red Hat.
1:03:48
This podcast covers stories from technologists
1:03:50
who've been through tough tech
1:03:52
transitions and share how
1:03:54
their teams survived the journey. Episodes
1:03:57
are available everywhere you listen to your podcasts
1:03:59
and at talkpython.fm slash code
1:04:01
dash comments. This
1:04:03
episode is sponsored by Posit Connect from
1:04:05
the makers of Shiny. Publish,
1:04:08
share, and deploy all of your
1:04:10
data projects that you're creating using
1:04:12
Python: Streamlit, Dash, Shiny, Bokeh, Fast
1:04:14
API, Flask, Quarto reports, dashboards,
1:04:17
and APIs. Posit Connect
1:04:19
supports all of them. Try
1:04:21
Posit Connect for free by
1:04:23
going to talkpython.fm slash posit.
1:04:25
Want to level up
1:04:27
your Python? We have one of the largest
1:04:30
catalogs of Python video courses over at Talk Python.
1:04:32
Our content ranges from true beginners to
1:04:34
deeply advanced topics like memory and async.
1:04:37
And best of all, there's not a
1:04:39
subscription in sight. Check it out for
1:04:41
yourself at training.talkpython.fm. Be sure
1:04:43
to subscribe to the show, open your favorite podcast
1:04:46
app and search for Python, we should be right
1:04:48
at the top. You can also find
1:04:50
the iTunes feed at slash iTunes, the
1:04:52
Google Play feed at slash play and
1:04:55
the direct RSS feed at slash RSS
1:04:57
on talkpython.fm. We're live
1:04:59
streaming most of our recordings these days. If you
1:05:01
want to be part of the show and have
1:05:03
your comments featured on the air, be sure to
1:05:05
subscribe to our YouTube channel at talkpython.fm slash
1:05:08
YouTube. This is your
1:05:10
host Michael Kennedy. Thanks so much for listening. I
1:05:12
really appreciate it. Now get out there and write
1:05:14
some Python code.