Episode Transcript
0:02
Two and a Half Admins, episode 183. I'm Joe. I'm
0:05
Jim. And I'm Allan. And here we are
0:07
again. Windows
0:09
11 24H2 goes from unsupported to
0:11
unbootable on some older PCs. Now,
0:13
to be fair, this is some
0:16
very old PCs, but
0:18
it is a potential sign of things to come
0:20
for people who've hacked Windows 11 onto
0:23
machines that Microsoft doesn't want you to have.
0:25
Yeah, it's as if the universe is telling
0:27
you, you should have listened to Jim when
0:29
he mentioned this a couple weeks ago. Yes.
0:32
Yeah, this is just confirmation again of
0:34
what had to be coming down the
0:37
pike. It's also confirmation that Microsoft actually
0:39
had some concrete purpose in mind when
0:41
they started issuing these hardware
0:43
requirements. They may have started out theoretical,
0:46
but they are rapidly becoming real, and
0:48
that's going to continue to happen. So
0:51
we've kind of already said it all. If you're just
0:53
into this stuff and you think it's really cool
0:56
to wedge Windows 11 onto some hardware that Microsoft
0:58
didn't want you to, rock on.
1:00
But if you're looking for a reliable,
1:03
supportable thing that you can depend on, well,
1:05
then maybe don't install operating systems on hardware
1:07
that they specifically say not to install them
1:10
on. Yeah, during the break between episodes,
1:12
me and Allan were just talking about old ThinkPads.
1:15
And you don't really want to do that.
1:17
You'll be able to pick them up so
1:19
cheap after the death of
1:21
Windows 10 or as we get closer to that.
1:24
But if you're going to give
1:26
it to someone else, don't be putting
1:28
Windows 11 on there because you're going
1:31
to potentially run into stuff like this.
1:33
I mean, this particular one is the
1:35
population count feature of the CPUs which
1:37
only started in some of the early
1:40
i5s. But what's the next thing going
1:42
to be? It's a part of the
1:44
SSE4.2 instruction set. Yeah, it counts how many of
1:46
the bits are set to one.
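As a rough illustration of what the POPCNT instruction computes, here is a minimal Python sketch; the loop is the kind of software fallback used when the CPU lacks the instruction, and int.bit_count() (Python 3.10 and later) exposes the same operation natively:

    def popcount(x: int) -> int:
        # Count set bits the slow way, one bit per iteration.
        count = 0
        while x:
            count += x & 1
            x >>= 1
        return count

    assert popcount(0b10110100) == 4
    assert (0b10110100).bit_count() == 4  # native equivalent, Python 3.10+

A CPU with POPCNT does this for a whole register in a single instruction instead of looping.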
1:49
It turns out I'm very familiar with that feature, not because of
1:51
the feature itself, but because all the Intel chips
1:53
that had it were also the first ones
1:55
that completely supported bhyve on FreeBSD. And
1:57
so knowing that, that's like whatever was
1:59
added after Nehalem, so this is like second
2:02
gen. It's very old CPUs that
2:04
still support that. But as
2:07
Microsoft wants to take advantage of these newer CPU
2:09
features, they're just setting a floor
2:11
to what hardware is supported. And to
2:13
a certain degree, I think it makes sense when you're
2:15
writing the software. It's like instead of trying to have
2:17
to do kind of ifuncs or
2:20
something to deal with it, just knowing that, hey, we
2:23
only support CPUs that have this
2:25
set of instructions. And
2:27
we are going to be able to get more performance out
2:29
of everything by not having to dynamically
2:31
decide to have acceleration or not. Presumably,
2:34
FreeBSD is quite similar to Linux in
2:36
that there's quite a long time before
2:38
you start having hardware cut off like
2:41
this. Yeah, and FreeBSD does specifically use
2:43
a feature called ifuncs to be able
2:45
to say, oh, hey, if
2:47
your CPU supports it, we will replace this
2:49
set of code with the fancy CPU instruction
2:51
that will do it more efficiently.
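Real ifuncs are an ELF/C mechanism resolved by the dynamic linker, but the dispatch pattern can be sketched in Python: probe the CPU once at startup, then bind the name to the best implementation so callers never re-check. The /proc/cpuinfo probe and the function names below are illustrative assumptions, not FreeBSD's actual implementation:

    def _cpu_has(flag: str) -> bool:
        # Linux-style CPU feature probe; purely illustrative.
        try:
            with open("/proc/cpuinfo") as f:
                return any(flag in line for line in f if line.startswith("flags"))
        except OSError:
            return False

    def _popcount_fast(x: int) -> int:
        return x.bit_count()  # stand-in for a POPCNT-accelerated routine

    def _popcount_portable(x: int) -> int:
        return bin(x).count("1")  # stand-in for the generic fallback

    # The "resolver" runs exactly once, at load time; every later call
    # goes straight to whichever implementation was chosen.
    popcount = _popcount_fast if _cpu_has("popcnt") else _popcount_portable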
2:54
But at the same time, FreeBSD is finally starting
2:56
to be like, well, if your computer is
2:59
that old, maybe you should use an old version
3:01
of FreeBSD, not try to run
3:03
the latest one. Yeah, I mean, we've seen
3:05
it with 32-bit, for example. That has slowly
3:07
been just abandoned. Well, in FreeBSD, it is
3:10
now what we call tier 2, meaning
3:12
that while we will still make some
3:14
effort to keep it working, we don't
3:17
guarantee that if there's another Spectre/Meltdown
3:19
type thing, that we're going to spend
3:21
all the extra effort to fix that
3:23
for 32-bit, especially since with the smaller
3:25
address space, the fix may be a lot
3:27
more complicated. And we try
3:30
to make packages, but nobody tests them
3:32
because who still has a computer that's
3:34
only 32-bit? But really,
3:36
this story is the canary in the
3:38
coal mine for don't hack Windows 11
3:40
onto machines that it's not supported on.
3:43
Unless, again, you're just into that. If you want
3:45
to do that because you're into that, that's cool.
3:47
And I will not say the first bad thing
3:49
about it. That's awesome. Just don't
3:52
think that you're doing a reliable thing
3:54
to give other people or to put
3:57
your own really important work on, only for it to really
3:59
bite you in the butt when it stops
4:01
working one Saturday night before, you know, the
4:04
presentation you've got to give on Monday morning.
4:06
This is just not something that you
4:08
want to do on mission critical equipment. And
4:10
you do not want to get a call
4:12
from your niece or your grandma or whatever
4:15
saying, hey, what's up? I've
4:17
done this update and it's just broken it. That's
4:20
what we call mission critical squared because when
4:22
it becomes mission critical for grandma it is
4:24
also mission critical for you and that's multiplicative
4:26
not additive. Investors threw
4:29
50% less money at the
4:31
quantum sector last year. This is
4:33
a piece in The Register about
4:35
how VC money has gone away
4:38
from quantum computing and towards
4:40
AI. Now we knew it was going towards
4:42
AI but it's interesting that the
4:44
quantum hype seems to have died down
4:46
at last. Well the AI hype proved
4:49
to be a lot more directly and
4:51
immediately profitable. The quantum hype you essentially
4:53
you're just hoping somebody will give
4:55
you money until you can finally
4:57
create a product but there's not
5:00
really any way to get like
5:02
end-user money out of it. It's
5:04
it's all business to business stuff. Well
5:06
the AI speculation has long since reached the
5:08
stage where you can actually extract things of
5:10
value from the hoi polloi. You
5:13
can put ChatGPT in your Bing and you can
5:15
you know throw an assistant here and a model
5:17
there and you get a lot back out of
5:19
it. So I can see where
5:21
that's a much more attractive investment for VC
5:23
looking for a place to park some dollars.
5:26
Whereas quantum computing the
5:28
big potential payoff for quantum
5:31
computing essentially boils down to
5:33
breaking public-private key pair encryption.
5:36
There will eventually become a point
5:38
where quantum computing breaks most of
5:41
the encryption techniques that we're using
5:43
today. One interpretation
5:45
you could have of investors throwing
5:47
less money at quantum is
5:50
that breakthrough is looking further away than they thought. However
5:52
I think that would be giving the investors too much
5:54
credit. I think it's what I said to begin with.
5:57
AI just looked a lot more immediately
5:59
sexy. And immediately sexy, let's
6:01
face it, is exactly what VCs are
6:03
looking for. Well, also, AI is
6:06
something you can sell a recurring subscription
6:08
for. Lots of people are willing to pay $15
6:11
or $20 a month to have access to
6:13
ChatGPT or an image generator or
6:16
a video generator and things like that. And
6:18
being able to build that annual recurring
6:20
revenue is the more
6:23
popular business model right now than anything involving
6:25
hardware. Well, and again, those are killer apps
6:27
that already exist. The killer apps for quantum
6:29
computing are largely theoretical. Like, we know what
6:32
they'll be, but we haven't really
6:34
hit them yet. There are some niche uses
6:36
for quantum computing in its current state, but
6:39
I mean, they are niche.
6:41
If you needed a quantum computer, you already
6:43
knew exactly why you needed it. You don't
6:45
need me to tell you about it. It's
6:47
that niche. Whereas again, like you said,
6:49
AI, I mean, you can come up with a million things. And even
6:52
the things that don't make sense, they seem
6:54
to make enough sense that you can get
6:56
idiots to buy off on it immediately. Whereas,
6:58
again, what's the comparable
7:01
function of quantum computing that you can sell a
7:03
whole bunch of people on the idea that they
7:05
need to pay you money for it right now?
7:08
This is the part where you cue in
7:10
the crickets chirping from the soundboard, Joe. Yeah,
7:15
whereas AI, we've seen people building plugins
7:17
to help you write better emails or
7:19
make more presentations and make images and
7:21
all kinds of, yeah, like you're saying,
7:24
much more immediately applicable and mass market
7:26
appealing compared to quantum. Is
7:29
quantum going to actually get
7:31
there, though? Probably. I
7:33
don't see any reason to expect that it
7:35
wouldn't eventually. It's just – it's one
7:37
of those things that it's going to require – from
7:40
my outsider's perspective, it looks like it
7:42
needs something
7:44
along the lines of an Einstein-level breakthrough or
7:46
two. But, again, completely
7:49
wild-ass guessing here, it doesn't look
7:51
to me like that's a thing
7:53
that won't happen. I
7:55
think quantum computing to the level that it
7:57
breaks our last 20 years' worth of –
8:00
public-private key pair encryption schemes.
8:03
I do think it's going to happen. I do think it'll be
8:05
here probably within a decade or two. I also
8:07
think that by the time it arrives, we'll
8:10
have moved largely enough to quantum-resistant
8:12
computing schemes, like the Noise framework,
8:14
that it won't be the world-ending
8:16
deal that would be if, like,
8:18
somebody were to announce tomorrow, oh, hey,
8:21
you know, we got a 2000-qubit quantum
8:23
CPU over here, and it
8:25
turns out every SSH algorithm you use
8:27
is dead. Essentially,
8:29
I think quantum computing is going to end
8:31
up being Y2K as it actually was, as
8:34
opposed to Y2K, you know, the way that
8:36
it was often dreamed of in
8:38
a panic before the fact. Do you think that
8:40
we will be ready for it? Yeah, I think
8:42
we'll be ready. There must be
8:44
other uses for it, apart from breaking encryption, though.
8:47
There are. I mean, I said there are a
8:49
few other use cases. They're quite niche. I'd
8:52
have to look them up. I always forget because they're
8:54
so weird they don't stick in my head. They're
8:56
so boring. Surely, we'll find new
8:59
uses for it once it advances to a
9:01
point where it's usable. Right. Whereas to get
9:03
VC money, you have to know the uses
9:05
ahead of time and know that people are
9:07
going to want it. VCs don't
9:09
do, like, R&D funding, where we're just going to –
9:11
we're going to see what we can invent. You
9:13
have to invent something, and you just want
9:16
to have the money to scale it. That's
9:18
where the VCs come in. Because I'm reminded
9:20
of the moon missions. Tons of technology came
9:22
out of that, and tons of patents and
9:24
things like Tesla. But that was undirected
9:26
R&D money. Well, maybe not undirected,
9:28
but that was research and development. That was
9:31
very different than the kind
9:33
of investing that VCs do. To your
9:35
earlier question, Joe, one of the obvious
9:38
use cases for quantum computing, apart from
9:40
breaking public-private key pairs, is
9:42
– you remember protein folding back in the day,
9:44
distributed computing stuff that everybody would load up on
9:46
their computer? Oh, yeah. And quantum
9:48
computers are very good at analyzing
9:51
enzymes in ways that classical
9:53
computers are just not.
9:56
So we're expecting that
9:58
a sufficiently powerful quantum computer could
10:00
predict in a matter of hours
10:03
the properties, structure, and reactivity of
10:05
enzymes. Things that can
10:07
be literally impossible right now, or otherwise
10:09
take months or years. I mean there's
10:11
a reason why the protein
10:13
folding project was trying to distribute that
10:15
workload across millions of computers. Okay,
10:19
this episode is sponsored by Kolide.
10:21
When you go through airport security, there's one line
10:24
where the TSA agent checks your ID and
10:26
another where a machine scans your bag. The
10:28
same thing happens in enterprise security, but
10:31
instead of passengers and luggage, it's end users and
10:33
their devices. These days,
10:35
most companies are pretty good at the first part
10:37
of the equation where they check user identity. But
10:40
user devices can roll right through authentication
10:42
without getting inspected at all. In
10:45
fact, a huge percentage of companies allow
10:47
unmanaged, untrusted devices to access their data.
10:50
That means an employee can log in from a
10:52
laptop that has its firewall turned off and hasn't
10:54
been updated in six months. Or
10:57
worse, that laptop might belong to a
10:59
bad actor using employee credentials. Kolide
11:02
finally solves the device trust problem. Kolide
11:04
ensures that no device can log into
11:06
your Octa-protected apps unless it passes your
11:08
security checks. Plus you can use
11:11
Kolide on devices without MDM, like your
11:13
Linux fleet, contractor devices and every BYOD
11:15
phone and laptop in your company. So
11:18
support the show and go to kolide.com/25a
11:20
to watch a demo and see how
11:23
it works. That's
11:25
kolide.com/25a. Viral
11:30
news story of botnet with 3 million
11:32
toothbrushes was too good to be true.
11:35
This all played out over the last few weeks and
11:38
it originally came from a
11:40
report published by the Aargauer Zeitung.
11:43
And it seems that some of the details
11:46
got lost in translation. It was kind of
11:48
a hypothetical that then some publications ran with
11:50
it. And this has definitely happened. But
11:53
like the Ars Technica headline says,
11:56
it was too good to be true. These toothbrushes
11:58
are not capable of... being a
12:00
botnet. They haven't got enough computing power in them.
12:03
No, and they don't have a direct connection to the
12:05
internet either. When you have a smart toothbrush
12:07
and God help you, why do you have a smart
12:09
toothbrush? But when you have one, generally
12:11
it's going to pair to your Android
12:13
device or your iPhone over Bluetooth. And
12:16
in order for it to get updates, essentially
12:19
the app that's running on your phone gets
12:21
those updates over the internet and
12:23
then can push it over the Bluetooth connection
12:25
to your toothbrush, and again, I'm judging you for
12:27
owning this smart toothbrush. So
12:29
how the toothbrush itself would
12:32
be part of a botnet kind of beggars
12:35
understanding or belief. And
12:37
if you were to say that, oh, well,
12:39
it's the app on the phone. Well, in
12:41
that case, you don't have a toothbrush botnet.
12:43
You've got a bog standard phone botnet. It
12:46
seems what actually happened here is during
12:48
some discussions, an analyst talked about a
12:50
hypothetical situation of what a denial-of-service
12:53
attack on a website could look like.
12:56
And to do something different than the one
12:58
we saw a couple of years ago where
13:00
it was all the network DVRs, the security
13:02
camera recorders that were
13:04
botnetted and attacking places, he
13:07
made up this hypothetical example of
13:10
the smart toothbrushes being compromised and
13:12
specifically talking about a supply chain compromise. So
13:14
like all the toothbrushes got to the
13:17
users already compromised, or the firmware update was
13:19
compromised. But it turns
13:21
out that's not actually what
13:23
was happening. And it
13:25
was just someone in a meeting misunderstood
13:28
the theoretical example as something that actually
13:30
happened. It actually looks like it was
13:32
probably just a straight translation issue because
13:35
Fortinet did say that it was a
13:37
hypothetical. And when the
13:39
Swiss paper quoted them, it
13:42
seems as though the Swiss paper did
13:45
still quote it as a hypothetical, but
13:47
the error was introduced by Tom's Hardware
13:49
using automatic translation from German to English.
13:52
And the reason I say that is because the German
13:54
language paper, it's actually a Swiss paper, when
13:56
it issued a follow-up clarifying again that the
13:59
Fortinet situation... was hypothetical, Google
14:02
Translate still seems to show them
14:04
as saying, no, no, it's absolutely
14:06
real. So it looks
14:08
like Google Translate is just missing
14:11
a beat when it comes to translating
14:13
from German to English. Or the AI
14:15
that powers the translation was hallucinating. It
14:17
might be specifically a Swiss German to
14:19
English error, because Swiss German is a
14:21
little bit different. But I'd have to
14:23
check with a native speaker on
14:25
that one. And then, as is the nature
14:27
of the internet, ZDNet based their story on
14:29
the Tom's Hardware story and stated that
14:32
3 million smart toothbrushes were just used in
14:34
a denial-of-service attack. Really. And it
14:36
just went from there. And honestly, both
14:38
Tom's Hardware and ZDNet really should
14:41
have known better. Translation error
14:43
or no. In
14:46
my opinion, anybody who's going to be covering
14:48
that kind of story should understand that kind
14:50
of story well enough to realize, no, there
14:53
is not a 3 million strong botnet of
14:55
freaking toothbrushes out there. Because again, if you
14:58
know anything at all about the
15:00
ecosystem, it just it doesn't make
15:02
sense. You can't really make a botnet
15:04
out of devices that don't have a
15:07
direct internet connection. Yeah. And the
15:09
fact that ZDNet's story specifically said, what's next, malware
15:11
infected dental floss seems like they kind of knew
15:13
it was too good to be true, but it
15:15
was too good of a headline to pass up.
15:17
I don't think it means that they knew it
15:19
was too good to be true. I think it
15:21
just means that they don't know enough to be
15:23
covering those topics. Probably true.
15:26
Google has killed cached results in
15:28
search. So it used to be
15:30
that you'd search for a website and if it
15:32
didn't work, didn't load for whatever reason, you could
15:34
just click through into the Google cache of it
15:37
and see a sort of archive.org
15:39
style cached version. Or in some
15:42
cases, the results, the abstract, the
15:44
excerpt that you would get on
15:46
Google itself would not
15:48
match the content in the actual page because it
15:50
was stuff either hidden in meta tags or
15:53
specifically delivered in one way to the
15:55
index crawler. And when you went to
15:57
the actual site, you'd see something completely
15:59
different. However, if
16:01
you load the cache, then you will
16:04
actually see the contents of the site
16:06
as they were at the time it was indexed even.
16:08
Yeah, and sometimes that was malicious, but sometimes it was
16:11
literally just that they had changed the story or the
16:13
website had broken since then, and the cache version was
16:15
quite useful. Or your only link
16:17
to the story wasn't a deep link direct
16:19
to the story or article. Sometimes it would
16:22
have been at the top of a feed
16:24
when Google indexed it, and so that's what
16:26
you get is the feed itself. But
16:29
it's been six years since it was indexed that
16:31
way, and now God only knows where it is,
16:34
so it would be really helpful to actually be able
16:36
to load that Google cached result because it
16:38
would have the thing that you were looking
16:40
for where the new version might not. So
16:43
quite a lot of people are a little
16:45
bit up in arms about this and really lamenting
16:47
the loss of the cached
16:50
results. But Jim, you are
16:52
not. You don't think these have been very
16:54
useful for quite some time. It's
16:57
not that I'm not up in arms over the loss of
16:59
it. It's that I was up in arms about that probably
17:01
10 years ago. It used to be
17:03
that every single result you would see on
17:05
Google had a very obvious link right there
17:07
to go to the cached version of that
17:10
page, and they
17:12
deprecated that a long time ago, and
17:14
it has increasingly gotten more
17:16
difficult to find the Google cached version and
17:18
less likely that there would be a cached
17:20
version available to the point that I just
17:22
stopped relying on that feature at all some
17:26
years ago. I would love to have
17:28
it back the way that it used to be,
17:30
where you always knew that for every single result,
17:32
there was a link right there for at least
17:34
one cached version, possibly several. That
17:37
was fantastic. I want that back.
17:39
I just don't think this is really
17:42
news in that those results have
17:44
not been reliably there and useful
17:46
for a long time already. Yeah,
17:48
and that's really the headline kind
17:50
of brought that into focus. I
17:52
hadn't really noticed that they slowly
17:54
made it get harder and harder and less and
17:56
less having them, and then it just wasn't there
17:58
anymore, until it was pointed out and
18:00
it was like, yeah, I really missed that feature.
18:02
But I also realized I haven't used
18:05
it usefully in years because it hasn't
18:08
been around. Not because it wouldn't be useful to
18:10
me, it's just that it hasn't been available most
18:12
of the times when I did want it anymore
18:14
and I had to rely on some other site
18:16
to go look at an old version
18:19
of a page or catching a page that was
18:21
down because there was too much traffic or it
18:23
was just gone or whatever. And now if you
18:25
can't find the thing that you're looking for in
18:27
the actual link that looked like it was going
18:29
to be what you wanted from the Google excerpt, now
18:32
you're just going to look for it on web.archive.org, which
18:35
is, to be clear, considerably less
18:37
convenient. But it's what
18:39
we've been doing for years. And it's
18:41
funny that Danny Sullivan, who is a Google
18:43
search liaison, at least he posts
18:45
from that Twitter account, he said
18:48
that, personally, I hope that maybe
18:50
we'll add links to the Internet Archive from
18:52
where we had the cache link before. It's
18:54
such an amazing resource. And then later
18:57
on in this Twitter thread, he says that hopefully
18:59
they can kind of cut a deal with the
19:01
Internet Archive, which feels a bit
19:03
sort of wishful
19:06
thinking from him. Scummy. Well,
19:08
no, I don't think it's scummy because cutting
19:11
a deal suggests that they'll give them some money. And that's
19:13
why I think it's wishful thinking. I don't think Google's going
19:15
to give them any money. But if
19:17
they did, that would be great if they were
19:19
willing to pay someone else to do this job
19:21
for them. I think it's adorable that you think
19:23
all they're looking for is the ability for Google
19:26
to give the Internet Archive money in
19:28
terms of cutting a deal. I don't think
19:30
that's what cut a deal means. Because
19:32
the other big problem there is a
19:35
cache link is significantly less useful if
19:37
it's not a cache link from exactly
19:39
what Google saw when they
19:41
decided that this was a match for your search result.
19:43
So while having an easy link to
19:46
the Internet Archive would
19:48
be nice and somebody can probably write a
19:50
browser plugin to make it inject a link,
19:53
it's just not quite the same thing if it's
19:55
not benefiting from the fact that Google went to
19:57
all these websites. Depending
20:00
on how obscure it is, maybe the Internet
20:02
Archive just didn't go to that site the month
20:04
when that search result came from, or at all.
20:07
And that's a big chunk of what
20:09
is missing: Google just no longer
20:11
sharing near the same volume of the
20:13
results from them crawling all
20:16
over the internet, and people being
20:18
willing to accept their bots indexing
20:20
all their sites, as a free resource.
20:23
Which begs the question of why they
20:25
got rid of this, because clearly they
20:27
are still holding on to that data
20:29
somewhere, otherwise the search wouldn't
20:31
work. They need it in the database,
20:33
don't they? Well, it depends. If their
20:36
search is more AI-based now,
20:38
you learn from the information, you don't have to
20:40
keep copies of it around forever. But
20:43
I don't know that the answer is that Google
20:45
decided that they didn't want to have that much
20:47
storage, so I don't know the rationale behind it.
20:49
There's less to support, and it's
20:51
not making them any money directly so it goes
20:53
on the chopping block. Yeah, because
20:55
I'm sure they still have the data but
20:57
I think the same thing, like
20:59
Jim just said, that it's just
21:02
extra features that aren't making money
21:04
and also maybe causing trouble
21:06
because you can imagine somebody using Google
21:08
cache results like that, to
21:10
do like the attack we looked at
21:12
with hiding the command and control for
21:14
your botnet in your Ars Technica profile
21:17
picture. I'm sure you could have done
21:19
the same thing with Google cache page
21:21
results. Or just some page
21:23
is taken down for DMCA or whatever, and
21:25
then the Google cache is still there. There
21:27
is, yeah. If it attracted lawsuits
21:30
to Google, who is like, we'll just turn
21:32
the feature off, we can't be bothered.
21:34
Yeah. That's what I was waiting
21:36
for: everything from DMCA to revenge porn, you
21:38
name it. If Google is making the
21:40
cache copies available, then they also have to
21:42
be managing issues with
21:45
those cache copies. For
21:45
that matter, even, you know, what happens if they
21:47
serve up some child porn? It's
21:49
one thing as a search engine
21:51
to have accidentally indexed and linked
21:54
to something truly
21:56
horrible and truly, incredibly illegal
22:01
like child porn. So it's one thing
22:03
to accidentally index and link to
22:05
that, it's another entirely to archive
22:07
the content and serve it directly
22:09
from your own servers. It's
22:12
a surprise that they did it for so long
22:14
then thinking about all that. Oh, well, I think
22:16
back in the day, they genuinely meant it when
22:19
they said, you know, do no evil is our
22:21
corporate motto, but you know, they dropped
22:23
it and I think they meant that too. Okay.
22:27
This episode is sponsored by Automox.
22:30
Are you prepared for whatever shit storm may hit
22:32
your desk during the workday? Automox
22:34
has your back. Check out the
22:36
brand new Autonomous IT podcast. Listen
22:39
in as various IT experts discuss
22:41
the latest Patch Tuesday releases, mitigation
22:44
tips and customer automations to help
22:46
with CVE remediation. Make
22:48
new work friends. Listen now to
22:50
the Autonomous IT podcast on Spotify,
22:52
Apple, or wherever you tune into
22:54
your podcasts. Let's do
22:56
some free consulting, man. But first, just a quick thank
22:58
you to everyone who supports us with PayPal and Patreon.
23:01
We really do appreciate that. If you
23:03
want to join those people, you can go
23:05
to 2.5admins.com/support and
23:07
remember that for various amounts on Patreon, you can get
23:09
an advert-free RSS feed of either just this show
23:11
or all the shows in the late night Linux family,
23:14
and you also get some episodes early. If
23:16
you want to send any questions for Jimin on a
23:18
feedback, you can email show at two dot five admins
23:20
dot com. Jonathan writes
23:23
over the past year, I've been backing up
23:25
my personal computer, both in the cloud, Duplicati
23:27
GUI to Backblaze, and
23:29
an external hard drive, KDE backups to
23:32
hard drive. Once every two
23:34
weeks, I do a manual restore of a singular
23:36
file from each location. I usually pick
23:38
a different file from each source with the choice
23:40
of file coming from folders that it would really
23:43
suck to lose files from. I
23:45
was wondering if you guys had any thoughts on
23:47
automating this process or if my once every two
23:49
weeks pick a random file is good enough. I
23:52
actually really like your once every two weeks, pick
23:54
a random file. As long as you've got the
23:56
discipline to actually keep doing that yourself, given
23:59
the limit. of the technology you're
24:01
working with, that's a pretty ideal approach in
24:03
my opinion. There are obvious
24:05
big loopholes, you know, compared to Allan's
24:07
and my preferred, well, we'll use ZFS
24:10
replication to move everything because if
24:12
everything's on ZFS and you're replicating, well,
24:14
every single block of that data is
24:16
automatically checksummed and, you know, you can
24:18
just run a scrub and know everything
24:20
is perfect block for block. You
24:23
don't really have great options for that
24:25
level of attention with, you know, what
24:27
you're doing with your traditional file system
24:30
and, you know, just running Duplicati to
24:32
Backblaze. But what you're
24:35
doing with picking a random file to check
24:37
yourself personally every couple of weeks is fantastic
24:39
because you change it up, it's not always
24:41
the same file, it means you have a
24:44
chance to detect problems that are occurring in
24:46
different places and because you're actually
24:48
looking at it, you can
24:50
detect in some ways better than
24:52
any program can when something is
24:54
wrong because ultimately that's the
24:56
thing that data really has to do is satisfy
24:59
you. You're making sure
25:01
that you're still satisfied with that data and
25:03
that's great. Yeah, especially with remote backups where
25:05
it's probably impractical to do a full restore
25:07
of everything on a very frequent basis, this
25:10
kind of spot check is really the
25:12
low cost version of what you can
25:14
reasonably do on a regular basis. Scripting
25:17
it to some degree helps but then
25:19
you're risking that, you know, the script thinks things
25:21
are fine and they're not, right? No
25:24
computer can check, as exactly as you
25:26
can on some of these types
25:28
of files, that not only is the file
25:30
matching a checksum or something but like the
25:32
file works in the program the way you
25:34
expect it to. It is probably worth doing
25:36
a SHA-256 sum though as well.
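One way to automate part of that, as a minimal sketch: keep a manifest of SHA-256 sums written at backup time, restore one randomly chosen file through your backup tool, then verify it against the manifest. The manifest and restore paths below are hypothetical examples, not anything Jonathan's tools produce:

    import hashlib, json, random
    from pathlib import Path

    def sha256_of(path: Path) -> str:
        # Hash the file in chunks so large files don't need to fit in memory.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    # Hypothetical manifest written at backup time: {relative path: sha256}
    manifest = json.loads(Path("backup-manifest.json").read_text())
    relpath = random.choice(list(manifest))
    restored = Path("/tmp/restore-test") / relpath  # wherever the test restore landed
    status = "OK" if sha256_of(restored) == manifest[relpath] else "MISMATCH"
    print(f"{relpath}: {status}")

That still only checks that the bytes round-tripped; actually opening the file and seeing that it works in its program remains on you.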
25:38
Yeah, and you can definitely automate some of that, but
25:41
this is not a bad thing to do,
25:44
it's maybe not the best thing you could do
25:46
but like Jim said, realize the limitations of
25:46
it. It would really suck to restore everything
25:48
from the cloud every
25:50
two weeks if you have a lot
25:55
of stuff that's going to take a long time and
25:57
possibly run up extra costs from the
25:59
backup software. With that said, one thing
26:01
that I might recommend in order to strengthen this type
26:03
of routine is when
26:05
it comes time for a new computer or a new
26:07
hard drive or what have you, maybe
26:10
don't clone what's on your existing
26:12
computer or your existing drive. Maybe
26:15
do your final backup, offline
26:17
that system or that drive, and do a
26:19
full restore from the cloud. You
26:22
haven't lost anything because you've still got that old
26:24
system or at least that old drive. So if
26:26
everything goes wrong with your restore, it's okay. You
26:28
can just put it back together the way it
26:31
was. But this means that, at a time when
26:33
it actually makes sense to do it and you're
26:35
already kind of prepped to hunker in and sit
26:38
through something that's going to take a
26:40
while, now you can really test those
26:42
backups and you can really learn whether
26:44
you know how to restore from them as well as
26:46
you think you do or not. Yeah, I think the
26:48
biggest problem with this setup is, like
26:51
Jim mentioned, if you have the discipline to do
26:53
it. If you're doing this every two
26:55
weeks and it's fine, but then you're like, oh, I got
26:57
busy and then it turns out it's been three months since
26:59
we checked the backup and something went
27:01
wrong in that three months, then you'll be a very
27:03
sad panda. Whereas with ZFS, if you've
27:05
got scrubs set up to automatically happen once a
27:07
month, then it's doing it for
27:10
you. As long as you remember to look at
27:12
your zpool status, sure. But if you're not remembering
27:14
to do that, then you're just in the same
27:16
boat again. So you can't get
27:18
away from the need to pay attention to things
27:20
to one degree or another. You can
27:23
set up automated systems to change
27:25
what you have to pay attention to,
27:27
to make the thing that you're paying
27:29
attention to more or less detailed or
27:33
better or worse tailored to what you need
27:35
to know. But
27:37
ultimately, you do still have to
27:39
pay attention, and you do have to have an
27:41
actual schedule for that. Yeah, because
27:43
you need to monitor the scrub
27:45
on the source, monitor the replication
27:47
that's actually happening so the backup
27:49
actually is getting updated, and
27:52
then the scrub on the backup. Just
27:54
as you have to monitor the backup you're doing
27:56
with Duplicati or whatever and make sure the files are
27:58
making it to the cloud, and the files
28:00
can make it back from the cloud. So
28:02
the big advantages to doing things with ZFS are
28:05
being able to run a scrub and check the zpool
28:07
status or being able to do a
28:09
zfs list, you know, -t snap, and look for
28:11
the newest snapshots that are replicated in on your target
28:13
system. You're doing the same
28:16
thing. You're verifying your backups. You're verifying the
28:18
content. What you're really getting out of it
28:20
there, you're not getting out of the need
28:22
to pay attention to yourself. You're
28:24
getting a much more comprehensive sense of
28:27
what's going on when you do go in there
28:29
and pay attention because when you look at the
28:31
output of that zpool status and you see that
28:33
your scrub ran and whether or not it had
28:35
to correct any blocks along the way and whether
28:37
or not there are any corrupt blocks left after
28:39
it was done, well, you not only
28:42
know this one file opened and it looks
28:44
good, you know every single
28:46
block of storage on this system is
28:48
fine. And when you look at the
28:50
list of snapshots, you know that not
28:52
only did my backup run on that
28:54
date, but because I see the snapshot,
28:57
it ran start to finish and everything
28:59
went fine. Because the ZFS receive is
29:02
checking the checksums on all those blocks
29:04
as it stores them as they come in. And
29:06
if that process crashes before you get all of
29:08
it, you don't get half a snapshot. You
29:11
can actually resume that replication later, but from the
29:13
purposes of just listing the snapshots on your system,
29:16
it's like it's not even there at all. So
29:18
if you see the snapshot, you know you're good. That's
29:21
what ZFS is really giving you. Now the other thing
29:23
that it can give you is it gives
29:26
you a scriptable way to check all
29:28
that even further so that now you
29:31
can potentially even say, well, rather
29:33
than know that I need to like look
29:35
at this every two weeks, I can set
29:37
up Nagios and I can set up an
29:39
app like Anag on my phone to connect
29:41
to Nagios and I can have my phone look
29:44
at my Nagios and my phone will literally just
29:46
blow up in my hip pocket whenever something goes
29:48
wrong rather than me having to
29:50
remember to go look. At that point, the
29:53
only thing you have to remember is just
29:55
make sure aNag is actually running on your
29:57
phone. Yeah, for my example here, my monitoring
30:00
tells me how old the newest snapshot that's
30:02
been replicated is. And because
30:04
somebody wrote 10 terabytes of new data, the
30:06
replication is a bit behind and it's catching
30:08
up. And so, you know, I'm a day
30:11
and a half behind on replication because my
30:13
internet straw is too small. Right. And when that kind
30:16
of thing happens on, you know, one of my systems,
30:18
if it was a day and a half behind on
30:20
what would normally be an offsite backup, I
30:22
will see that, but it's going to be a
30:24
warn, not a crit, because that's the way I've
30:26
got mine tailored. Because if you're a day behind,
30:28
you know, that very frequently will just get resolved if
30:30
you wait another day. So it's worth
30:32
seeing the warn to get your attention, but it's not
30:35
worth having a crit that says, Oh my God, you
30:37
got to fix me right now.
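A minimal sketch of that kind of check, emitting Nagios-style exit codes so a warn and a crit fire at different snapshot ages; the dataset name and thresholds are made-up examples to tune to your own replication schedule:

    import subprocess, sys, time

    DATASET = "backup/tank"        # hypothetical replication target
    WARN, CRIT = 86400, 3 * 86400  # one day warns, three days goes critical

    # -H: no headers, -p: epoch timestamps, -s creation: oldest first
    out = subprocess.run(
        ["zfs", "list", "-H", "-p", "-t", "snapshot", "-o", "creation",
         "-s", "creation", "-r", DATASET],
        capture_output=True, text=True, check=True,
    ).stdout.split()

    if not out:
        print("CRITICAL: no snapshots found"); sys.exit(2)
    age = time.time() - int(out[-1])  # newest snapshot is listed last
    if age >= CRIT:
        print(f"CRITICAL: newest snapshot {age/3600:.1f}h old"); sys.exit(2)
    if age >= WARN:
        print(f"WARNING: newest snapshot {age/3600:.1f}h old"); sys.exit(1)
    print(f"OK: newest snapshot {age/3600:.1f}h old"); sys.exit(0)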
30:39
And again, that level of detail is part of what
30:41
you get from going through the extra donkey work
30:43
to set up automated monitoring and you know, all
30:46
this stuff that Allan and I have and do
30:48
and talk about all the time is just again,
30:51
for the level of attention that you actually
30:53
have to devote, you can do more with
30:55
it. It's a force multiplier is what it
30:57
is. Right. Well, we'd
30:59
better get out of here then. Remember show at
31:01
2.5admins.com if you want to send any questions
31:03
or feedback, you can find
31:06
me at jrs.com/mastodon. You can
31:08
find me at jrs-s.net/social. And
31:10
I'm @AllanJude. We'll see you
31:12
next week.