Episode Transcript
0:00
Two and a Half
0:02
Admins, Episode 189. I'm
0:04
Joe. I'm Jim. And I'm Alan. And
0:06
here we are again. Users
0:09
ditch Glassdoor. Stunned by
0:11
site adding real names without consent.
0:14
So on the off chance any of you out there don't know
0:16
what glassdoor.com is, it is a notorious
0:18
site where people who have
0:21
been employed typically at larger companies that lots
0:23
of people have heard of, they
0:25
go to the site and they give anonymous
0:27
accounts of what their time working for that
0:30
company was like. Pro tip, go and have
0:32
a look at the Canonical one. It's quite
0:34
something. So
0:37
yeah, if you're thinking about working for a
0:40
large corporation, you can go look at reviews
0:42
on Glassdoor and see if the people who
0:44
have already worked there largely think
0:46
it was a good experience or a terrible one. Now
0:49
to be clear here, when we say the
0:51
site started using users' real names without their
0:53
consent, we're not actually saying that it just
0:55
pasted them directly on the public reviews where
0:58
anybody can see them. That's not
1:00
what's going on. What we're talking about is privately
1:03
de-anonymizing data, which
1:05
in one sense doesn't sound like it's as bad
1:08
because, well, it's just Glassdoor that knows the horrible
1:10
things you said about the last company you worked
1:12
for, right? Not anybody else. Well,
1:16
maybe, maybe not, because you
1:19
don't know for certain that Glassdoor will never
1:21
sell that information and you certainly don't know
1:23
that Glassdoor will never be breached. We
1:26
actually had a fair amount of off-air discussion about this
1:28
before deciding the angle on how to cover it. And
1:30
Alan and I were both pointing out to Joe, name
1:33
one major site that all of our listeners will
1:35
be familiar with which hasn't been breached in the
1:37
last 10 years. And spoiler alert, nobody
1:40
can really think of one. Let's talk about
1:42
the timing and acquisitions that led to this. Glassdoor
1:45
acquired Fishbowl. It's a networking site
1:47
for professionals. So the whole goal of a site
1:50
like Fishbowl is to make sure everybody knows who
1:52
everybody else is and promote them. Whereas
1:54
the goal of Glassdoor was to
1:56
leave safely anonymized reviews where you
1:58
could recount your... experiences at a company
2:01
without worrying about that being held against
2:03
you at future employers or potentially even
2:06
being sued by your last employer. And
2:09
when you de-anonymize that data, you strip
2:11
those protections and those safeguards. I'm not
2:13
sure that most people are worried about
2:15
their next door neighbor knowing it was
2:17
them that said their last company sucked.
2:20
Their neighbor already knows. Probably most of the people
2:22
who know them personally already know how they felt
2:25
about their last job. What
2:27
they're going to be worried about, I would think,
2:29
would be the companies that are doing
2:32
business with Fishbowl and with Glassdoor paying
2:34
a little extra to find out. Like,
2:37
well, you got anything on this guy before
2:39
we hire him? Has he left anything in your
2:41
reviews anywhere? And that's not what
2:43
the site was supposed to be for.
2:46
Yeah, well, and just the fact that
2:48
Glassdoor was for anonymously leaving information about companies
2:50
you used to work at and
2:52
salary ranges and so on really
2:55
seems to be at cross purposes with Fishbowl,
2:57
which is a professional networking app that wants
2:59
to require all users to verify their identity. And
3:02
then they decided to connect the back ends of these
3:05
together and make every Glassdoor
3:07
user automatically a Fishbowl user. We
3:09
should also be clear here that
3:11
Glassdoor is not only asking users
3:13
to provide their personally identifiable information
3:15
and saying, pinky swear, we
3:17
won't tell anybody else, unless we
3:19
do. In some cases where
3:21
users have declined to provide that information, Glassdoor
3:24
has decided to make their own best
3:27
guess as to who that person is
3:29
using what metadata they do have and trying to
3:31
match it to various recon data they can find
3:34
on the web. And
3:36
then tie that person's personally
3:38
identifiable information to the originally
3:40
anonymized reviews. And will they
3:43
always get that right? Maybe,
3:46
but probably not. So now you not only
3:48
have a site that said, hey, come leave
3:50
your anonymous review, and then said, well, you
3:52
left your anonymous review, but now tell us
3:54
exactly who you are. Don't worry. You'll
3:56
still be anonymous. But also a
3:58
site that says, oh, you won't tell us who you are? But we'll just take
4:00
our best guess and we'll slap it in there anyway. Yeah,
4:03
and here's another example: Josh
4:05
Simmons from the matrix.org Foundation discovered that
4:07
Glassdoor had not only messed up and
4:10
said he worked at a different employer, but claimed
4:12
that he lived in London when he actually is
4:14
based in California. It
4:17
was bizarre because I had never provided
4:19
that information and it was a somewhat incoherent
4:21
mix of random details. And
4:24
so he requested his account be deleted rather
4:26
than remain on a site
4:28
that might randomly update his profile
4:30
without notifying him of potentially inaccurate
4:32
changes. I know that Matrix do have
4:34
an office in London, so that would
4:36
be where that came from. Well, that doesn't
4:38
make it any better, does it? Most likely
4:41
where it came from is the same place that it came from
4:43
when you know that one kooky
4:45
friend or relative you have that fancies
4:47
themselves great at cyberstalking just goes to
4:49
like a million WhitePages and people
4:51
finder and Spokeo sites and gets
4:54
together everything they can find on a particular
4:56
name of somebody and just assumes it's all
4:58
about the person that they think it is.
5:01
And it's not true. No, it's whatever
5:03
crap you could find on the internet. It's
5:05
interesting that in the Ars piece, they
5:07
refer to a lady called Monica, who
5:10
I don't think that's her real name,
5:12
but she had a real battle with
5:14
Glassdoor to delete her data. It
5:16
was a lot harder than it ought to have been. And
5:18
even then, I think they said, well, we'll
5:21
delete it in 30 days, after they finally
5:23
acquiesced. What you're telling
5:25
me is the company that took anonymized
5:27
data demanded it be de-anonymized and then
5:29
did their own half baked internet research
5:32
to de-anonymize it when the actual user
5:34
refused to. You're saying
5:36
that that company acted weird and proprietary
5:38
about personal data and didn't want to
5:40
let users have control of it. Telegram's
5:44
peer-to-peer login system is a risky way to save
5:47
$5 a month. This
5:49
one is weird. So basically, in
5:52
order to send the two-factor authentication
5:54
messages, when somebody tries to log
5:56
in, they're asking other people to
5:56
consent to having their phone
6:01
send the 2FA token instead via
6:03
SMS. Yeah, it's unclear at this
6:05
point if this is Telegram
6:07
trying to save money by
6:09
not having to pay the carriers
6:11
to send the text messages, or
6:13
if it's, as they claim in the article
6:16
about the messages not getting
6:18
delivered to certain people, whereas a direct text
6:20
message from a person rather than from a
6:22
short code might get better delivery rates. Seems
6:25
a bit weird. I imagine most phone
6:27
carriers are gonna try harder to deliver
6:29
2FA tokens than they are to deliver
6:31
random messages from people. Yeah, but it's
6:33
fun. You get free Telegram Premium. You
6:35
get loads of extra reaction emojis. You
6:37
get the voice message
6:40
AI voice to text business.
6:42
Well, only if they use your phone to
6:44
send at least 150 text messages. And
6:48
also they're like, yes, sorry, this
6:50
is gonna show your phone number to all
6:52
the people that we send messages to.
6:55
And we have no way of stopping them from
6:57
texting you other than we ask them
6:59
not to. Or calling you. No, they tell them
7:02
not to. They say, don't
7:04
do this. But they're absolutely going to
7:06
do that. I mean, I
7:08
assume both of you have at some point been on
7:10
the wrong end of a Joe job, where
7:12
your phone number got used as the spoofed
7:15
number that somebody was sending spam texts from.
7:17
I got calls where they did actual calls.
7:19
I got a bunch of voicemails in Spanish
7:21
yelling at me for telemarketing them or something.
7:24
Yep, I've been on the wrong end of
7:26
Joe jobs from text message campaigns and
7:28
also robocall campaigns. And you get people
7:31
calling you like just
7:33
incandescently angry
7:35
because they think you're a spammer and there's
7:38
finally a spammer that they can yell at.
7:40
And maybe one time out
7:42
of five, you get somebody relatively reasonable and
7:44
you can explain to them what a Joe
7:46
job is and what spoofed numbers are. And
7:49
yes, I understand that you called the number that
7:51
you saw in the from on the text message
7:53
that you got. But that wasn't me. In
7:56
my experience, about one person out of five will listen
7:58
to that and realize that you're
8:01
speaking to them and not at them and
8:03
you sound like a reasonable human being. That's
8:05
how they should interact with you. The other
8:07
four out of five? No, man. They
8:10
figure they've got their teeth in the neck of a spammer and they ain't letting
8:12
go. That's what you're opening yourself
8:14
up to. I think the worst part
8:16
of this is that if you opt into
8:18
it and Telegram doesn't send enough text messages
8:20
with your phone number, you don't get
8:22
the gift code. Yeah, and a cynical person would say that
8:24
they might send 149 and then move on to the next
8:28
person. Well, it says 150 is the maximum. They
8:31
don't specify the minimum to get the
8:33
gift code. Yeah, basically. One
8:35
less than the minimum on each phone to not
8:38
have to give out $5. I
8:40
think we got to be honest here. I
8:43
don't doubt that in some cases
8:45
Telegram may be able to more
8:47
reliably deliver a short code via
8:49
a relay to SMS from a
8:51
nearby private person's telephone than
8:54
via a service. They may be
8:56
operating in countries where it's hard to get set up
8:58
with those services or those services
9:00
aren't reliable or whatever. I'm willing to
9:03
accept those are reasons
9:05
that you can look at and say, yes, I can
9:07
understand why you would say that. But
9:09
I also don't think any reasonable person would look
9:11
at this and say, I don't
9:14
think this is a cash grab. Let's be
9:16
honest. This is a cash grab. It's a
9:18
way to save some money. Well, I think
9:20
it's also a way to get people to
9:22
try premium and then miss
9:25
it and actually end up subscribing. I think
9:27
you just came up with an alternate spelling
9:29
of cash grab, Alan. Well,
9:33
as someone who was gifted Telegram Premium
9:35
for a year and then that friend
9:38
wouldn't gift it to him again and
9:41
whose wife said, no, you're not buying
9:43
Telegram Premium. That's ridiculous. I
9:45
mean, I'm not going to say I'm going to do it, but when
9:48
I heard about it, I thought, hmm, for
9:50
a second before going, no, it's not worth it
9:53
because it was pretty cool to have Telegram Premium.
9:55
Honestly, I would say to me, this is more
9:57
an example of why I am so against having
10:00
a mobile app for everything because it just
10:02
broadens the scope of what the damn thing
10:04
might be doing behind your back. The
10:07
fact that Telegram can technically
10:09
speaking do this should
10:12
be a little bit of a wake up call
10:14
in my opinion to the folks who are normally
10:16
completely okay. Oh, I've got an app for this
10:18
website, an app for that website, an app for
10:21
the other website. Well, you're starting to see why
10:23
because having an app means that whoever runs that
10:25
app can do an awful lot more. And
10:28
in some cases, it can be a little
10:30
disturbing, the breadth of things that can be
10:32
done via smartphone app. Like
10:35
make calls and send text messages on your
10:37
behalf without you knowing about it when it
10:40
happens. Yeah. My first thought there
10:42
was, why are they not doing app-based authentication?
10:44
But I guess in the case of Telegram,
10:46
it's because they're trying to verify your phone
10:49
number, not actually a
10:51
login 2FA. Because why don't
10:53
they just do the 2FA in the app that they
10:55
obviously already have on your phone? Like you said, because
10:57
then they wouldn't be sure that that really was your
11:00
phone number. And yet every
11:02
other alternate technical method you can think
11:04
of to verify possession of a particular
11:06
phone number costs money. This one
11:08
doesn't. Okay. This
11:10
episode is sponsored by Tailscale. Go
11:13
to tailscale.com/25a. Tailscale
11:16
is an easy to deploy, zero config,
11:18
no fuss VPN that allows you to
11:21
build simple networks across complex infrastructure. It's
11:24
super simple to set up and you can have
11:26
machines and devices all over the world connected to
11:28
each other as if they're on the same LAN
11:30
in minutes. It's WireGuard on easy
11:32
mode. Jim tried Tailscale and found it
11:34
incredibly easy to pick up and use for the first
11:37
time. He liked that most
11:39
of the time it immediately negotiates a direct
11:41
point-to-point connection that's almost as fast
11:43
as an unencrypted connection. And it's got
11:45
fallback modes to keep you functioning, even if
11:48
something changes the network topology and breaks the
11:50
point to point introduction. So
11:52
support the show and check out Tailscale for
11:54
yourself. Go to
11:56
tailscale.com/25a and
11:59
try Tailscale out for free for up
12:01
to 100 devices and three users with no
12:03
credit card required. That's
12:05
tailscale.com/25a.
12:09
Feds ordered Google to unmask certain
12:11
YouTube users. Yeah, this one's a
12:13
kind of weird one. So
12:15
it turns out that undercover police
12:18
sent certain people a
12:21
link to a video to
12:23
try to get them to come watch it, and
12:25
then went to Google and said, we want
12:28
the IP address and name and everything you
12:30
know about all the people that watch this
12:32
video between January 1st and January 8th. Of
12:35
course, the police doing this kind
12:37
of sting operation thing, weren't
12:39
smart enough to make the video unlisted. So because
12:41
it showed up in the
12:43
algorithm, it was watched over 30,000 times. So
12:46
it's not just the people they sent the
12:48
link to that they were trying to target
12:50
to unmask, it's just 30,000 people. Alan,
12:54
you're being both too unkind and too kind
12:56
in assuming that the police just didn't think
12:58
of making it unlisted, because the first thing
13:00
I thought is they probably knew exactly
13:03
what they were doing and did not send that
13:05
video to their targets until it had a
13:07
believable amount of traffic on it to make it
13:09
look like a legit video. Yeah, although even then
13:11
you would switch it to unlisted or something so
13:13
that the algorithm would stop sending new people to
13:15
it to contaminate your list
13:18
of people who watch this video on the first
13:20
week of the year. But then your target might
13:22
notice that it was unlisted. You can
13:24
figure that out. Yes, but if they've already gone to the
13:26
video, it's too late for them. Yeah,
13:29
but you don't want them to know, do you? Surely
13:31
they could just get a video that's less
13:34
popular, that's related to what you were talking
13:36
about, and pick a video that you think
13:38
is not something that's going to be
13:40
hugely popular. It might not be a very
13:42
good video, for example. It sounds to me
13:45
like you're, you're advocating for a considerably greater
13:47
standard of professionalism and care than we normally
13:49
expect from law enforcement. Yeah, and I don't
13:51
know if Google's more or less willing to
13:53
provide this information in bulk. I know with
13:56
even Google Analytics and so on, if the
13:58
traffic isn't high enough, where you may be able
14:00
to identify individual people, they won't give you
14:02
the data. But in this case, it seems
14:05
the police are like, we want the names
14:07
and all the information you have, any IP
14:09
addresses of these people, which I think is
14:11
kind of over the top. Yeah. And
14:14
we're burying the lede a little bit because
14:16
we're talking a lot about mechanics and not
14:18
much about the legal,
14:20
political implications and personal implications of
14:23
this for yourself. Even starting with
14:25
why they're going after these particular people,
14:27
and it's literally because they were selling
14:30
bitcoins for cash. No, let's not start
14:32
with that because honestly, that's a little
14:34
bit irrelevant. The real issue here is
14:36
that you know what the cops are
14:39
trying to do is they're trying to
14:41
make it a potential crime for you
14:43
to watch whatever the YouTube algorithm
14:45
threw at you. Nobody wants to think
14:48
that just because the YouTube algorithm
14:50
threw this particular video at me in
14:52
my playlist while sitting on my couch,
14:54
come on, it's eleven o'clock last night and I
14:56
let it play, that now cops are
14:58
going to come knock on my door and
15:00
ask me, why were you watching this video
15:03
last night, citizen? That's not cool.
15:05
That's not the way things are supposed to work.
15:07
That sort of is why I wonder
15:09
even more, like, why did they
15:12
use a video that was still getting
15:14
new traffic and so on to try
15:16
to do their dragnet, on top
15:18
of the fact that their dragnet is
15:20
probably wrong to begin with, because the
15:22
things that the cops care about and the things
15:24
that Alan's YouTube algorithm cares
15:27
about are very different subsets. There's not
15:29
a lot of overlap in the Venn
15:31
diagram. I imagine there's none. There's absolutely
15:33
some, but not as much as
15:35
I would like there to be. Yeah,
15:37
and the cops are here in quotes: there
15:39
is reason to believe that these records
15:41
would be relevant and material to an ongoing
15:44
criminal investigation, including by providing identification information
15:46
about the perpetrators. Which is the vaguest
15:48
thing ever. What it actually kinda
15:50
reminds me of is, so,
15:52
the anti-drug legislation we have
15:54
in my country, in the United States,
15:56
we get these ridiculous court cases where
15:58
the cops will, instead of suing a
16:00
person who they pulled
16:02
over who had $10,000 in cash
16:05
in the glove box, they actually
16:07
sue the $10,000 in
16:09
order to obtain it for the police department. In
16:13
a lot of ways, this reminds me
16:15
of that because it's yet another way
16:17
that cops attempt to avoid having a
16:19
defendant on the other side of something
16:22
iffy that they're doing. I'd heard of
16:24
companies being a person, but money is
16:26
a bit of a stretch. Yeah. Money
16:29
can be objects, locations, you name it.
16:32
This is how they go after alleged
16:34
property of drug dealers and whatever. The
16:36
idea is that, oh, it's a tool
16:38
for them to get the money off
16:41
the street so they can't just keep
16:43
selling those drugs. But
16:45
if you follow the news stories on this, it keeps
16:48
on ending up nailing
16:51
people who are clearly
16:53
like migrant families,
16:56
just like working class folks that
16:58
are moving from one town to
17:01
another. And
17:03
it's one of these deals where like, okay, yeah, sure.
17:06
Drug dealers who carry around large amounts of
17:08
cash are absolutely a thing. So
17:11
are working class people who are just trying to
17:13
get from A to B and like, yeah, I
17:15
get the idea that you might argue, oh, but
17:17
what if a drug dealer just
17:19
pretends that they are a family with
17:21
a station wagon loaded up with luggage
17:24
and the family dog or whatever? That
17:27
shouldn't be a defense, which just brings
17:29
us right back again, ultimately, to what this
17:32
has all really been about, which is the cops
17:34
have very different priorities and are looking at the
17:36
world through a very different lens. Their thought is
17:38
do not let bad guy get away. That's
17:41
the most charitable interpretation I can
17:43
possibly have, is that job number
17:45
one is catch bad guy. Whereas
17:49
we tend to be a little bit more
17:51
concerned with maybe don't wreck my life while
17:53
you're catching the bad guys. Yeah, they have
17:55
a couple other examples here where
17:58
apparently a bomb threat was called
18:00
in to a trash can that happened
18:02
to be in the sight line
18:04
of some business's webcam that had
18:06
a live stream, and so they
18:08
suspect the perpetrator who called in
18:10
the fake bomb scare was watching
18:13
the live stream, so they wanted
18:15
to subpoena records for the live
18:17
stream. Which is on a channel that
18:19
has a hundred and thirty thousand subscribers. So
18:22
let's unmask all of those
18:24
people to maybe try to find
18:26
one guy. Like, does that make
18:28
sense? What court is accepting
18:30
that as probable cause? You say a
18:32
hundred and thirty thousand subscribers. That doesn't
18:34
mean that that many people watched the
18:36
live stream, but they're still
18:39
unmasking a very large number of people to
18:41
try to find one person. And
18:43
that seems unreasonable. Yes, the fact that a
18:45
hundred and thirty thousand people were subscribed to
18:48
it tells you that it is a pretty mainstream
18:50
thing to be doing, and what you don't
18:52
want to do is to be targeted as
18:54
a criminal because you're doing something mainstream and
18:56
normal. Which is exactly what we're
18:58
talking about when we have these wide
19:01
dragnets get cast. Interestingly, just
19:03
earlier this month, the Supreme Court
19:05
of Canada ruled that an Internet
19:07
Protocol address attracts a reasonable expectation
19:09
of privacy protected under Section 8
19:11
of the Canadian Charter of Rights
19:14
and Freedoms. This decision has significant
19:16
implications for law enforcement authorities,
19:18
who will now need prior judicial
19:20
authorization before requesting IP addresses from
19:22
private organizations or Internet service providers.
19:24
And a Canadian ruling isn't a lot
19:27
of good to the rest of us. And
19:29
to be fair, we only
19:31
got it like a couple of
19:33
weeks ago. And only by a five
19:35
to four decision of the Supreme Court. Oh.
19:38
Well, let me put this into perspective for you. You
19:40
know what's less private than your IP address?
19:43
My Social Security number. Okay,
19:45
this episode is sponsored by Kolide.
19:48
When you go through airport security, there's one line
19:50
where the TSA agent checks your ID
19:52
and another where a machine scans your bag.
19:55
The same thing happens in enterprise security. But
19:57
instead of passengers and luggage, it's end
19:59
users and their devices. These
20:01
days, most companies are pretty good at the first
20:04
part of the equation where they check user identity.
20:06
But user devices can roll right through
20:09
authentication without getting inspected at all. In
20:11
fact, a huge percentage of companies allow
20:14
unmanaged, untrusted devices to access their data.
20:17
That means an employee can log in from a
20:19
laptop that has its firewall turned off and
20:21
hasn't been updated in six months. Or
20:23
worse, that laptop might belong to a
20:26
bad actor using employee credentials. Kolide
20:28
finally solves the device trust problem. Kolide
20:31
ensures that no device can log into your
20:33
Okta-protected app unless it passes your security
20:35
checks. Plus you can use
20:37
Kolide on devices without MDM like your
20:40
Linux fleet, contractor devices and every BYOD
20:42
phone and laptop in your company. So
20:45
support the show and go to kolide.com. Let's
20:47
do some free consulting then. But first
20:49
just a quick thank you
20:51
to everyone who
20:59
supports us with PayPal and Patreon. We really do
21:02
appreciate that. If you want to join those
21:04
people you can go to 2.5admins.com. And remember that for various
21:08
amounts on Patreon you can get an advert
21:10
free RSS feed of either just
21:12
this show or all the shows in the Late Night
21:14
Linux family. And you even get some episodes
21:16
early. And if you want to send
21:18
in your questions for Jim and Alan or your
21:20
feedback you can email show at 2.5admins.com. Adam
21:23
says I have the cheapest of
21:25
cheap Linux servers, 1 gigabyte of RAM etc
21:27
which I used to host my personal next
21:30
cloud instance. One of the apps
21:32
installed is Pico CMS which is a lightweight
21:34
CMS for making a personal site or blog.
21:37
All has been good for many years but
21:39
then I recently started doing a daily blog
21:41
on some coding I was undertaking. I linked
21:43
to my blog entries from Mastodon but
21:45
this caused an issue. Every time I
21:48
toot, my Linode gets slashdotted and
21:51
goes unresponsive for some time. What
21:53
Apache config should I use to protect
21:55
against this? So essentially this is about
21:57
setting limits properly. By default
21:59
you are very likely using
22:02
an Apache and PHP manager
22:05
that is configured with the expectation of far more
22:07
resources than you have available on your one gig
22:09
linode. So the first thing is going to be
22:11
make sure that Apache is not using the pre-fork
22:13
worker model. That will likely
22:15
require a clean reinstallation. If you're already using
22:18
Apache right now, it can be very difficult
22:20
to get it to change its mind once
22:22
it's already been set up and configured. When
22:25
you're doing that, you want to make
22:27
sure not to install mod PHP because
22:29
that will forcibly reconfigure your Apache again
22:31
into that pre-fork model that you don't
22:33
want. Instead, you want to
22:35
use PHP FPM, which is going to
22:37
provide a CGI gateway for PHP processes,
22:40
and you will tune its limits individually from
22:42
Apaches. You need to limit both sides of
22:44
that, but you're going to have much, much
22:46
harsher limits on PHP FPM than you do
22:48
for Apache. Once you've
22:50
successfully gotten PHP out of Apache's
22:53
clutches and into PHP FPM, it's
22:56
probably okay to have a fairly large
22:58
number of simultaneous connections to Apache, something
23:00
along the lines of like 50 to
23:02
100. Even on a one
23:04
gig VM that's usually fine, although, again,
23:07
you're going to need some tuning when you're
23:09
working with this light a level of resources.
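As a rough illustration of the Apache-side limits being described here, an mpm_event configuration on a one-gig VM might look something like the following. The numbers are illustrative assumptions for this discussion, not settings given on the show; tune them against your actual workload.

```apacheconf
# Hypothetical mpm_event settings for a 1 GB VM (e.g. mods-available/mpm_event.conf).
# Keeps total simultaneous connections in the 50-100 range discussed above.
<IfModule mpm_event_module>
    StartServers              2
    MinSpareThreads          10
    MaxSpareThreads          25
    ThreadsPerChild          25
    MaxRequestWorkers        75     # hard cap on simultaneous connections
    MaxConnectionsPerChild 1000     # recycle workers to keep memory creep in check
</IfModule>
```

With mpm_event, each connection is a thread rather than a whole forked process, which is why Apache itself can afford a far more generous limit than PHP-FPM can.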
23:12
The more important one is going to be limiting
23:14
the number of PHP sessions. You want to limit
23:16
that very radically, and RAM is usually going to
23:18
be the bottleneck there. You need
23:20
to make sure that when somebody opens however
23:23
many PHP sessions you have allowed in your
23:25
PHP FPM, that the total amount of
23:27
RAM used is not going to be enough to send your
23:29
Linode into SWAP. Once you
23:31
manage all of that, you'll be amazed
23:33
at how much traffic you actually can
23:35
serve well. A lot of
23:37
people think that if they detune their
23:39
Apache or their PHP FPM that way,
23:41
that it's going to make their website
23:44
slower. It's actually usually the opposite case,
23:46
because what happens is, rather than
23:48
a really badly overworked server that's having to
23:50
do a lot of expensive context switching in
23:52
CPU, and potentially seeks in
23:54
storage, trying to service a lot more
23:56
users at once, and it's really capable
23:58
of doing well. it's going to serve
24:01
each one of them individually, very rapidly,
24:03
and then go on to the next.
24:06
So instead of having 100 users all
24:08
connecting at once and essentially getting tarpitted,
24:10
just getting a few characters a second,
24:13
you have five connecting immediately at the web
24:15
page and then the next five immediately get
24:17
theirs and so forth, with
24:19
the net effect that you're not only
24:21
keeping your site up, but you're actually
24:23
providing a better experience for your visitors.
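The "much harsher" PHP-FPM limits described above might be sketched like this in a pool config. These values are hypothetical, not from the show; measure your real per-worker memory with ps or top before picking numbers.

```ini
; Hypothetical PHP-FPM pool limits for a 1 GB VM (e.g. pool.d/www.conf).
; If each PHP worker peaks around 100 MB, five workers stay well clear of swap.
pm = static                ; fixed worker count, no dynamic spawning surprises
pm.max_children = 5        ; hard cap on simultaneous PHP sessions
pm.max_requests = 500      ; recycle each worker periodically to contain leaks
php_admin_value[memory_limit] = 128M   ; per-request ceiling for PHP itself
```

The point is exactly the one made above: pm.max_children times worst-case worker memory must fit in RAM alongside Apache and the OS, or the box swaps and everything stalls.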
24:26
Interesting thing with PicoCMS is while
24:28
it's effectively a lightweight CMS, almost
24:31
a static site generator, it is statically
24:33
generating your site every page load. That's
24:35
not a static site then, is it?
24:37
It's not a static site generator. It's
24:39
a lightweight CMS where you write markdown
24:41
files and it auto-builds the site. It's
24:44
just on demand. It's not a copy-on-write
24:46
file system. It's a write-on render website.
24:48
Yeah. So it works fairly
24:50
well. I used it for a long time; I
24:52
used it for the BSD Now website for a while. Although
24:54
once we had a good
24:57
collection of tutorials and 300
24:59
episodes, it started taking longer and
25:01
longer to render the site. Part
25:03
of that was we had a bit of extra logic to show
25:06
the most recent episodes and stuff, but
25:08
it did eventually kind of have scaling
25:10
problems. And especially looking at
25:12
the fact that you're at like
25:14
one gigabyte of RAM, you might
25:16
consider moving to something like Hugo.
25:18
I think it's gohugo.io, where
25:21
it can take basically the same markdown files
25:23
you've been writing in PicoCMS, but
25:25
it will generate a static site, actual
25:27
HTML files that don't need PHP at
25:29
all and put them on disk. And
25:31
then your Apache can serve those with
25:34
even less memory. And you
25:36
only have to run it every time you edit
25:38
the website. You just have it
25:40
kick off Hugo and it will recompile the
25:43
site in no time at all. Then
25:46
you don't need the PHP part and you
25:48
can get that many more visitors
25:50
served in the same amount of
25:52
resources. So what you're saying is
25:54
we shouldn't expect PicoCMS to serve at a nano
25:56
scale? I went back
26:00
to Adam's email and I must
26:02
have just skimmed it because I missed
26:04
this PicoCMS part and I thought it
26:07
was WordPress, because I suggested using memcached.
26:09
Is that something that you can do
26:11
with PicoCMS I wonder? Not directly
26:14
with PicoCMS although you can use
26:16
memcached as the back end for
26:18
things like APC, the alternative PHP
26:20
cache, which can do
26:23
bytecode caching and object caching to speed
26:25
up PHP in general. And you definitely need
26:28
every last bit of that if you're wanting
26:30
to serve any significant scale from one gig
26:32
linode. You can definitely do it. I have
26:35
been doing it for you know more than
26:37
10 years. Most of my sites are actually
26:39
on the one gig linodes
26:41
because I've just I've learned how to do
26:43
a really good job of tuning even WordPress
26:45
down so that it won't exceed the limits
26:48
of the hardware. Obviously that only
26:50
scales so far. There's still only so
26:52
much that the hardware itself can accomplish
26:54
in a given time frame but
26:57
I mean for reference I've run
26:59
sites that get anywhere from 15
27:01
to 50,000 unique visitors a month
27:04
on one gig linodes without any problems. In
27:07
particular because this use case is
27:09
PicoCMS statically generating HTML might be
27:12
the way to get that
27:14
much more performance without having to do a lot
27:16
of configuration and tuning. Jim's
27:18
advice is much more applicable to any other
27:21
case where you're actually you know trying to do
27:23
something dynamic in PHP and you can
27:26
still make that work with one gigabyte of RAM.
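As a rough illustration of the kind of memory budgeting that tuning involves, here is a back-of-the-envelope calculation. All of the numbers are made-up assumptions for a hypothetical 1 GB VPS, not measurements from any real deployment:

```python
# Rough memory budget for a prefork-style Apache + PHP setup on a
# 1 GB VPS. Every number here is an illustrative assumption.
total_mb = 1024        # total RAM on the box
reserved_mb = 256      # assumed headroom for the OS, database, memcached, etc.
per_child_mb = 48      # assumed resident size of one Apache+PHP worker

# The most workers you can safely allow before swapping:
max_workers = (total_mb - reserved_mb) // per_child_mb
print(max_workers)  # -> 16
```

The point of the exercise is that the stock limits on many distributions assume far more RAM than a small VPS has, which is why untuned defaults fall over under load.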
27:28
To be clear, even with no PHP whatsoever, if you're working from a one gig Linode and you expect to handle any kind of scale, you're still probably going to need to tune Apache's own limits, because the default limits that Apache itself ships with, in my experience, are almost never overridden by the package managers in the repositories, and they're aimed at some pretty heavy iron.

Indeed.

Or you could just put Cloudflare in front of it, right?

It
27:54
depends. You can literally get into the case where the CDN you're putting in front of your website will still send enough traffic to knock over your website, and then everybody just gets an error. When I used to consult for a big newspaper here, they had that problem. They used Akamai, one of the biggest CDNs, and because this was a major newspaper, they got traffic from all over the world. That meant every Akamai node was constantly asking "where's that page?" and hounding them. And their back end was all Microsoft ASP.NET on IIS, so luckily there was a layer of Varnish and Nginx on FreeBSD in front of it, saving it from itself. This was shortly after Michael Jackson died, when the CNN website had so much traffic that it couldn't serve the CSS with the website, so you just got the HTML with no style on it, and so on. So we specifically tuned the newspaper's website so that, in addition to a bunch of caching and all the other stuff we'd done in front, we saved off an HTML version of the page whenever we delivered it successfully to somebody, so that if we couldn't deliver it successfully to the next person, we could give them that stale copy.

Just another datum for anybody who's having trouble believing that just Cloudflare accessing your back end can be enough to kill your back end.
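The stale-copy fallback described above can be sketched in a few lines. The backend functions and the in-memory cache here are hypothetical stand-ins for the newspaper's real stack, just to show the shape of the idea:

```python
# Sketch of "serve the last good copy when the backend fails".
# The backend callables and dict cache are illustrative stand-ins.
cache = {}

def render_page(path, backend):
    try:
        html = backend(path)        # may raise when the backend is overloaded
        cache[path] = html          # save the last successfully rendered copy
        return html, "fresh"
    except Exception:
        if path in cache:
            return cache[path], "stale"  # degrade gracefully with old content
        raise                            # nothing saved yet; error propagates

def good_backend(path):
    return f"<html>{path}</html>"

def broken_backend(path):
    raise RuntimeError("backend overloaded")

html1, state1 = render_page("/news", good_backend)    # primes the cache
html2, state2 = render_page("/news", broken_backend)  # falls back to stale copy
print(state1, state2)  # fresh stale
```

Serving slightly stale HTML is almost always better than serving an error page, which is exactly the trade the newspaper made.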
29:13
One of the last web optimization projects I got involved in that used Varnish, since Alan brought Varnish up, was actually optimizing a Magento site. For those of you who aren't familiar, Magento is a shopping platform; it's what people used to put up on their own websites, back before everybody stopped running their own sites and just tried to sell everything on Amazon. But anyway, point being, this was a fairly large company with a very heavily trafficked Magento store, and they had Cloudflare in front of it and everything. But a typical page render was something like 5,500 to 6,500 milliseconds before I came in and put Varnish in front of it. After I got done tuning Varnish and putting that Varnish front end in between Magento and Cloudflare, those 5,500 millisecond page render times went down to about 50 milliseconds.
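Taking those quoted render times at face value, the throughput difference for a single worker rendering pages sequentially works out like this:

```python
# Back-of-the-envelope throughput from the render times quoted above,
# for one worker serving page renders one after another.
slow_ms, fast_ms = 5500, 50

slow_rps = 1000 / slow_ms   # roughly 0.18 pages per second before Varnish
fast_rps = 1000 / fast_ms   # 20 pages per second after Varnish
speedup = slow_ms / fast_ms # a 110x reduction in render time

print(round(fast_rps), round(speedup))  # 20 110
```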
29:58
That's very extreme.

But that's kind of the point: it can get that extreme.

Yeah, and Varnish can do all kinds of cool things, including server-side includes, and it was actually designed specifically for a Norwegian newspaper, I think. So it had support for cases like: this user is logged in, and the content of the story is going to be the same as what we deliver to everybody else, but this widget over on the side that shows the weather, localized to their zip code, is going to be different. Being able to say "I can cache the guts of this page but not the weather" makes a big difference compared to just saying "this user is logged in, so I can't cache anything I show them", which is the default with something like WordPress.

That was the biggest difference when I put Varnish in front of that Magento site: I did exactly what you're talking about. I separated out the cacheable from the non-cacheable elements and made sure they were all handled appropriately. The net impact was that, rather than having to render every single page as though it were for a logged-in user, even when the request was just from Cloudflare, now only the logged-in users are actually triggering the dynamically generated content that has to be unique to them. As a result, instead of 100% of the site traffic being treated by the back end as though it had to be rendered on the fly, on demand, now only the logged-in users with something in their shopping cart are touching the dynamic part at all, and the rest of it can just come from Cloudflare.

But that's all probably quite a bit of overkill for a PicoCMS site.
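The cacheable/non-cacheable split described in that exchange can be sketched as a toy: the article body is cached once for everyone, while the per-user widget is regenerated on every request. The function and variable names are illustrative, not any real Varnish or ESI API:

```python
# Toy version of the ESI-style split: cache the shared article body,
# always re-render the personalized widget. Names are illustrative.
page_cache = {}
renders = {"body": 0, "widget": 0}

def article_body(slug):
    renders["body"] += 1
    return f"<article>{slug}</article>"

def weather_widget(zip_code):
    renders["widget"] += 1
    return f"<aside>weather for {zip_code}</aside>"

def serve(slug, zip_code):
    if slug not in page_cache:              # body rendered once, then cached...
        page_cache[slug] = article_body(slug)
    # ...but the widget is regenerated for every request
    return page_cache[slug] + weather_widget(zip_code)

serve("story-1", "90210")
serve("story-1", "10001")
print(renders)  # {'body': 1, 'widget': 2}
```

Two visitors, one body render: the expensive shared content is cached while the cheap personalized fragment stays dynamic, which is the whole win of server-side includes.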
31:28
Yes, I'm not recommending that anybody put a Varnish cache in front of their PicoCMS. For one thing, I can't really in good conscience recommend Varnish to anybody who needs a recommendation these days, because what Varnish does not support, and it looks like probably never will, is HTTPS. So if you were going to use Varnish as a front-end cache, you're also going to need an SSL accelerator and figure out how to tie them together. At this point, it's easier to just start off with something like Redis and say Varnish was a wonderful tool for its day, but in my opinion, at least with the refusal to integrate any SSL support, that day has passed.

Right, well, we'd better get out of here then. Remember, show@2.5admins.com if you want to send any questions or feedback. You can find me at joerest.com/mastodon. You can find me at mercenarysysadmin.com. And I'm at Alan Jude. We'll see you next week.