My list of magic numbers for your turkey day enjoyment

Hello, world! A couple of days ago, I noticed someone remarking on a line from an older post of mine that had been making the rounds. In it, I say something about adding a number to "my list of magic numbers". That person wanted to see that list.

It turns out that I've never written this down in any one coherent list... until now. I know that Thanksgiving here in the US usually brings a dull and dreary day online, in which nobody's posted anything interesting, and you just have to look at cat pictures until everyone gets over their turkey comas on Thursday and shopping burnout on Friday. This year will be even stranger since we're all trying to not get each other sick and might be doing it in isolation.

And so, I have put together this list of everything I can think of at the moment, along with a whole bunch of commentary that should be able to send you on dozens of romps through Wikipedia. Hopefully this will both satisfy the request for "the list" and provide a lot of amusement on a day which normally is lacking in content.

I might edit this page a bunch of times to add more items or links, so if it keeps popping up as "updated" in your feed reader, that's why.

Be safe out there, and enjoy.

...

"HTTP" is 0x48545450 (1213486160) or 0x50545448 (1347703880). If it shows up in your error logs, you're probably calling out to a poor web server instead of someone who actually speaks its binary protocol. See also: malloc("HTTP").

"GET " is 0x47455420 (1195725856) or 0x20544547 (542393671). This might show up in your server's logs if someone points a HTTP client at it - web browser, curl, that kind of thing.

"SSH-" is 0x5353482d (1397966893) or 0x2d485353 (759714643). If this shows up in your error logs, you're probably calling out to some poor SSH daemon instead of someone speaking your binary language.

1048576 is 2^20. It's a megabyte if you're talking about memory.

86400 is the number of seconds in a UTC day, assuming you aren't dealing with a leap second.

86401 would be the number when you *are* dealing with a (positive) leap second.

86399 is actually possible, if the planet speeds up and we need to *skip* a second - the still-mythical "negative leap second".

82800 (23 hours) is what happens in the spring in time zones which skip from 1:59:59 to 3:00:00 in the early morning: you lose an hour. Any jobs scheduled in local time during that hour... never run!

90000 (25 hours) is what you get in the fall in time zones which "fall back" from 1:59:59 summer time to 1:00:00 standard time and repeat the hour. Any jobs scheduled in local time during that hour... probably run twice! Hope they're idempotent, and the last one finished already! See also: Windows 95 doing it over and over and over.

10080 is the number of minutes in a regular week. You run into this one a lot if you write a scheduler program with minute-level granularity.

10140 is what you get in the week with the fall DST transition, thanks to that 25-hour day.

10020 is what you get in the week with the spring transition and its 23-hour day.
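If you want to see both of those week lengths fall out of real time zone data, here's a quick check with Python's zoneinfo (3.9+), using the 2021 US transitions as the example:

from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+; may need the tzdata package

tz = ZoneInfo("America/Los_Angeles")

# Week containing the 2021 spring-forward transition (March 14): a 23-hour day.
spring = datetime(2021, 3, 15, tzinfo=tz) - datetime(2021, 3, 8, tzinfo=tz)
# Week containing the 2021 fall-back transition (November 7): a 25-hour day.
fall = datetime(2021, 11, 8, tzinfo=tz) - datetime(2021, 11, 1, tzinfo=tz)

print(spring.total_seconds() / 60, fall.total_seconds() / 60)  # 10020.0 10140.0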

40 is the number of milliseconds of latency you might find that you can't eliminate in your network if you don't use TCP_NODELAY. See also: Nagle.

18 is the number of seconds that GPS and UTC are currently spread apart, because that's how many leap seconds have happened since GPS started. UTC includes them while GPS does not (but it does supply a correction factor). See also: bad times with certain appliances.

168 is one week in hours. Certain devices that do self-tests operate on this kind of timeframe.

336 is two weeks in hours and is another common value.

2000 is how many hours you work in a year if you work 40 hours a week for 50 weeks and the other two are handled as "vacation". This is why you can approximate a full-time yearly wage from an hourly wage by doubling it and reading the result in thousands: $20/hr -> $40K/year.

08 00 is the hex representation of what you get in an Ethernet packet for IPv4 traffic right after the destination and source hardware addresses. See also: EtherType.

45 00 is what comes after that, assuming a relatively boring IPv4 packet.

08 06 is what shows up in an Ethernet packet after the destination and source hardware addresses when some host is trying to do ARP for another host. You'll probably see it right before the aforementioned "08 00 45 00" when two hosts haven't talked to each other recently on the same network.

86 DD is what you'll get for IPv6, and you should see more of that in your future. Also, you won't see ARP.
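Pulling the EtherType out of a raw frame is just a two-byte read at offset 12. A made-up frame, for illustration only:

import struct

# Hypothetical frame: dst MAC, src MAC, EtherType, then the start of the payload.
frame = bytes.fromhex("ffffffffffff" "001122334455" "0800" "4500")
ethertype, = struct.unpack("!H", frame[12:14])
print(hex(ethertype))  # 0x800 -> IPv4; 0x806 would be ARP, 0x86dd IPv6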

33434 is the first UDP port used by the traditional (i.e., Van Jacobson style) traceroute implementation. See also: 32768 + 666.

49152 is $C000, or a block of open memory on a Commodore 64 frequently used for non-BASIC funtimes. SYS 49152 is how you jump there.

64738 is the reset vector on those same Commodore 64 machines.

64802 is the reset vector for people claiming to be even more OG than the C-64 crowd - it's from the VIC-20.

828 was the beginning of the cassette buffer on a Commodore 64, and was a place you could store a few bytes if you could be sure nobody would be using the cassette storage system while your program was running.

9090909090 might be a string of NOPs if you are looking at x86 assembly code. Someone might have rubbed out some sequence they didn't like. See also: bypassing copy protection for "software piracy".

31337 might mean you are elite, or if you like, eleet.

f0 0f c7 c8 is a sequence made famous in 1997 when people discovered they could lock up Intel machines while running as an unprivileged user on an unpatched system. See also: the Pentium f00f bug.

-1 is what you get back from fork() when it fails. Normally, it returns a pid_t which is positive if you're the parent, or zero if you're the child.

-1, when handed to kill() on Linux, means "target every process but myself and init". Therefore, when you take the return value from fork(), fail to check for an error, and later hand that value to kill(), you might just kill everything else on the machine. See also: fork() can fail - this is important.

15750 Hz is the high-pitched whine you'd get from some old analog NTSC televisions. See also: colorburst.

10.7 MHz is an intermediate frequency commonly used in FM radios. If you have two radios and at least one has an analog dial, try tuning one to a lower frequency and another to 10.7 above that (so 94.3 and 105.0 - hence the need for analog tuning). You may find that one of them squashes the signal from the other at a decent distance, even. See also: superheterodyne conversion.

455 kHz is the same idea but for AM receivers.

64000 is the signaling rate of a DS0 channel. You get it by having 8000 samples per second which are 8 bits each.

4000 Hz is the absolute max theoretical frequency of a sound you could convey during a phone call going across one of those circuits. In practice, it rolls off significantly a few hundred Hz before that point. See also: Nyquist.

1536000 therefore is the rate of a whole T1/DS1, since it's 24 DS0s. This is where we get the idea of a T1 being 1.5 Mbps.

/16 in the land of IPv4 is 2^16, or 65536 addresses. In the old days, we'd call this a "class B".

/17 in IPv4 is half of that /16, so 32768.

/15 in IPv4 would be double the /16, so 131072.

/64 is a typical IPv6 customer allocation. Some ISPs go even bigger. Just a /64 gives the customer 2^64 addresses, or four billion whole IPv4 Internets worth of space to work with. See also: IPv6 is no longer magic.

1023 is the highest file descriptor number you can monitor with select() on a Unix machine where FD_SETSIZE defaults to 1024. Any file descriptor past that point, when used with select(), will fail to be noticed *at best*, will cause a segmentation fault when you try to flip a bit in memory you don't own in the middle case, and will silently corrupt other data in the worst case. See also: poll, epoll, and friends, and broken web mailers.
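One quick way to watch that limit bite is through Python's select wrapper, which checks FD_SETSIZE up front and refuses anything past it (the descriptor number below is arbitrary and doesn't even need to be open):

import select

try:
    select.select([5000], [], [], 0)  # anything >= FD_SETSIZE (usually 1024) is rejected
except ValueError as err:
    print(err)  # filedescriptor out of range in select()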

497 is the approximate number of days a counter will last if it is 32 bits, unsigned, starts from zero, and ticks at 100 Hz. See also: old Linux kernels and their 'uptime' display wrapping around to 0.

(2^32) - (100 * 60 * 60 * 24 * 497)
887296
(2^32) - (100 * 60 * 60 * 24 * 498)
-7752704

49.7 is the approximate number of days that same counter will last if it instead ticks at 1000 Hz. See also: Windows 95 crashing every month and a half (if you could keep it up that long).

(2^32) - (1000 * 60 * 60 * 24 * 49)
61367296
(2^32) - (1000 * 60 * 60 * 24 * 50)
-25032704

208 is the approximate number of days a counter will last if it is 64 bits, unsigned, starts from zero, ticks every nanosecond, and is scaled by a factor of 2^10. See also: Linux boxes having issues after that many days of CPU (TSC) uptime (not necessarily kernel uptime, think kexec).

(2^64) - (1000000000 * 60 * 60 * 24 * 208 * 1024)
44235273709551616
(2^64) - (1000000000 * 60 * 60 * 24 * 209 * 1024)
-44238326290448384

248 is the approximate number of days a counter will last if it is 32 bits, signed, starts from zero, and ticks at 100 Hz. See also: the 787 GCUs that have to be power-cycled at least that often lest they reboot in flight.

(2^31) - (100 * 60 * 60 * 24 * 248)
4763648
(2^31) - (100 * 60 * 60 * 24 * 249)
-3876352

128 is 2^7, and you'll approach this (but never hit it) when you fill up a tiny signed counter on an old system. See also: our cherished 8 bit machines of the 80s, the number of lives you could get in Super Mario Bros, and countless other places of similar vintage.

256 is 2^8, in the event that same counter was unsigned.

32768 is 2^15, and if you used a default 16 bit signed value in your SQL database schema (perhaps as "smallint"), this is the first number you can't jam into that field. See also: a certain company's "employee ID" column in a database that broke stuff for me as a new employee once upon a time.

65536 is 2^16, in the event that same counter was unsigned instead.

16777... is the first five digits of 2^24, a number you might see when talking about colors if you have 3 channels (R, G, B) and 8 bits per channel. See also: "16.7 million colors" in marketing stuff from back in the day.

2147... is the first four digits of 2^31, a number you will probably see right around the time you wrap a 32 bit signed counter (or run out of room). See also: Unix filesystems without large file support.

4294... is the first four digits of 2^32, and you'll see that one right around the time you wrap a 32 bit unsigned counter (or run out of room).

9223... is the first four digits of 2^63... 64 bit signed counter...

1844... 2^64... 64 bit unsigned counter...

"1969", "December 31, 1969", "1969-12-31" or similar, optionally with a time in the afternoon, is what you see if you live in a time zone with a negative offset of UTC and someone decodes a missing (Unix) time as 0. See also: zero is not null.

"1970", "January 1, 1970", "1970-01-01" or similar, optionally with a time in the morning... is what you get with a positive offset of UTC in the same situation. Zero is not null, null is not zero!

A spot in the middle of the ocean that shows up on a map and, once zoomed WAY out, shows Africa to the north and east of it is 0 degrees north, 0 degrees west, and it's what happens when someone treats zero as null or vice-versa in a mapping/location system. See also: Null Island.

5 PM on the west coast of US/Canada in the summertime is equivalent to midnight UTC. If things suddenly break at that point, there's probably something that kicked off using midnight UTC as the reference point. The rest of the year (like right now, in November), it's 4 PM.

2022 is when 3G service will disappear in the US for at least one cellular provider and millions of cars will no longer be able to automatically phone home when they are wrecked, stolen, or lost. A bunch of home security systems with cellular connectivity will also lose that connectivity (some as backup, some as primary) at that time. Auto and home insurance rates will go up for some of those people once things are no longer being monitored. Other "IoT" products with that flavor of baked-in cellular connectivity which have not been upgraded by then will mysteriously fall offline, too - parking meters, random telemetry loggers, you name it.

February 2036 is when the NTP era will roll over, a good two years before the Unix clock apocalypse for whoever still hasn't gotten a 64 bit time_t by then. At that point, some broken systems will go back to 1900. That'll be fun. Everyone will probably be watching for the next one and will totally miss this one. See also: ntpd and the year 2153.

January 19, 2038 UTC (or January 18, 2038 in some local timezones like those of the US/Canada) is the last day of the old-style 32 bit signed time_t "Unix time" epoch. It's what you get when you count every non-leap second since 1970. It's also probably when I will be called out of retirement for yet another Y2K style hair on fire "fix it fix it fix it" consulting situation.

1024 is how many weeks you get from the original GPS (Navstar, if you like) week number before it rolls over. It's happened twice already, and stuff broke each time, even though the most recent one was in 2019. The next one will be... in November 2038. Yes, that's another 2038 rollover event. Clearly, the years toward the end of the next decade should be rather lucrative for time nuts with debugging skills! See also: certain LTE chipsets and their offset calculations.
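All three of those dates fall straight out of the arithmetic; a quick sanity check in Python (leap seconds ignored, as usual):

from datetime import datetime, timedelta, timezone

ntp_epoch = datetime(1900, 1, 1, tzinfo=timezone.utc)
gps_epoch = datetime(1980, 1, 6, tzinfo=timezone.utc)

print(ntp_epoch + timedelta(seconds=2**32))                # 2036-02-07 06:28:16 - NTP era 0 ends
print(datetime.fromtimestamp(2**31 - 1, tz=timezone.utc))  # 2038-01-19 03:14:07 - last 32 bit signed time_t
print(gps_epoch + timedelta(weeks=3 * 1024))               # 2038-11-21 - third GPS week rollover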

52874 is the current year if someone takes a numeric date that's actually using milliseconds as the unit and treats it like it's seconds, like a time_t. This changes quickly, so it won't be that for much longer, and by the time you read this, it'll be even higher. See also: Android taking me to the year 42479.

$ date -d @$(date +%s)000
Sun Mar 25 21:50:00 PDT 52874

Any of these numbers might have an "evil twin" when viewed incorrectly, like assuming it's signed when it's actually unsigned (or vice versa), so you might see it as a "very negative" number instead of a VERY BIG number. Example: -128 instead of 128, -127 instead of 129, -126 instead of 130, and so on up to -1 instead of 255. I didn't do all of them here because it's just too much stuff. See also: two's complement representation, "signed" vs. "unsigned" (everywhere - SQL schemas, programming languages, IDLs, you name it), *printf format strings, and epic failure by underflow.
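The struct module makes those twins easy to see: same bits, different interpretation.

import struct

raw = bytes([0x80])                  # one byte, value 128
print(struct.unpack("B", raw)[0])    # 128  (read as unsigned)
print(struct.unpack("b", raw)[0])    # -128 (same bits, read as signed)

raw = struct.pack("<I", 4294967295)  # 32 bit unsigned max
print(struct.unpack("<i", raw)[0])   # -1   (same bits, read as signed)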

These numbers ALSO have another "evil twin" in that they might be represented in different byte orders depending on what kind of machine you're on. Are you on an Intel machine? That's one way. Are you looking at a network byte dump? That's another. Did you just get a new ARM-based Macbook and now the numbers are all backwards in your raw structs? Well, there you go. I listed both orders for the "HTTP", "GET " and "SSH-" entries up front because I know someone will search for them and hit them some day. The rest, well, you can do yourself. See also: htons(), htonl(), ntohs(), ntohl(), swab() and a bunch more depending on what system you're on.
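For a tiny taste of that on a little-endian box (htonl is a no-op on big-endian hardware, so this prints the value unchanged there):

import socket

print(hex(socket.htonl(0x48545450)))  # 0x50545448 on a little-endian host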

Fannie Mae: Mortgage Serious Delinquency Rate Increased in July

Fannie Mae reported that the Single-Family Serious Delinquency rate increased to 3.24% in July, from 2.65% in June. The serious delinquency rate is up from 0.67% in July 2019.

This is the highest serious delinquency rate since December 2012.

These are mortgage loans that are "three monthly payments or more past due or in foreclosure".

The Fannie Mae serious delinquency rate peaked in February 2010 at 5.59%.

[Graph: Fannie and Freddie Seriously Delinquent Rate]

By vintage, for loans made in 2004 or earlier (2% of portfolio), 5.57% are seriously delinquent (up from 5.00% in June). For loans made in 2005 through 2008 (3% of portfolio), 9.36% are seriously delinquent (up from 8.37%). For recent loans, originated in 2009 through 2018 (95% of portfolio), 2.79% are seriously delinquent (up from 2.21%). So Fannie is still working through a few poor performing loans from the bubble years.

Mortgages in forbearance are counted as delinquent in this monthly report, but they will not be reported to the credit bureaus.

This is very different from the increase in delinquencies following the housing bubble.   Lending standards have been fairly solid over the last decade, and most of these homeowners have equity in their homes - and they will be able to restructure their loans once they are employed.

Note: Freddie Mac reported earlier.

Leanpath is now a Certified B Corporation

We’re proud to announce that Leanpath is now a Certified B Corporation® - joining only 3,400 other companies around the globe that are committed to balancing purpose and profit.

I Love MDN, or the cult of the free in action

Yesterday or so a new initiative I Love MDN was unveiled. People can show their appreciation for the MDN staff and volunteers by leaving a comment.

I have a difficult message about this initiative. For almost a day I’ve been trying to find a way to bring that message across in an understanding, uplifting sort of way, but I failed.

Before I continue I’d like to remind you that I ran the precursor to MDN, all by myself, for 15 years, mostly for free. I was a community volunteer. I know exactly what goes into that, and what you get back from it. I also burned out on it, and that probably colours my judgement.

So here is my message, warts and all.

I find I Love MDN demeaning to technical writers. It reminds me of breaking into spontaneous applause for our courageous health workers instead of funding them properly so they can do their jobs.

It pretends technical writing is something that can be done by 'the community', i.e. random people, instead of being a job that requires very specialised skills. If you deny these skills exist by pretending anyone can do it, you’re demeaning the people who have actually taken the time and trouble to build up those skills.

In addition, I see the I Love MDN initiative as an example of the cult of the free, of everything that’s wrong with the web development community today. The co-signers unthinkingly assume they are entitled to free content.

Unthinking is the keyword here. I do not doubt that the intentions of the organisers and co-signers are good, and that they did not mean to bring across any of the nasty things I said above and will say below. They just want to show MDN contributors that their work is being valued.

That’s nice. But it’s not enough. Far from it.

Take a look here. It is my old browser compatibility site after four to six years of lying fallow. Would you use this as a resource for your daily work? There are still some useful bits, but it’s clear that the majority of these pages are sadly outdated.

That will be MDN’s fate under a volunteer-only regime.

What we need is money to retain a few core technical writers permanently. I Love MDN ignores that angle completely.

Did you sign I Love MDN? Great! Are you willing to pay 50-100 euros/dollars per year to keep MDN afloat? If not, this is all about making you feel better, not the technical writers. You’re part of the problem, not the solution.

Here’s our life blood — for free

MDN Web Docs is the life blood, the home, the source of truth for millions of web developers everyday. [...] As a community of developers we have access to all of this information for free ♥️

That’s not wonderful. It’s terrifying.

We get everything for free hurray hurray, also, too, community community community, and, hey! with that statement out of the way we’re done. Now let’s congratulate ourselves with our profound profundity and dance the glad dance of joy. Unicorn-shitting rainbows will be ours forever!

I Love MDN hinges on the expectation on the part of web developers that this sort of information ought to come for free — the expectation we’re entitled to this sort of free ride.

(That’s also the reason I never contributed to MDN. I feel I’ve done my duty, and although I don’t mind writing a few more articles I very much mind doing it for free.)

This is all made possible by a passionate community, inspirational technical writers, and a small, but determined team of developers.

Hogwash. The passionate community has nothing to do with anything, unless they’re willing to pay. A profoundly unscientific poll indicates that only about two-thirds of my responding followers are willing to do so. The rest, apparently, is too passionate to pay. It’s just along for the free ride. That isn’t very comforting.

Working in the long run

Rachel Andrew puts it better than I can:

The number of people who have told me that MDN is a wiki, therefore the community will keep it up to date tells me two things. People do not get the value of professional tech writers. Folk are incredibly optimistic about what "the community" will do for free.

So you once wrote an MDN page. Great! Thanks!

But will you do the boring but necessary browser testing to figure out if what you’re describing is always true, or just most of the time? And will you repeat that testing once new versions have come out? Will you go through related pages and update any references that need to be updated? Will you follow advances in what you described and update the page? If someone points out an error six months from now, will you return to the page to revise it and do the necessary research?

If the answer to any of these questions is No, you did a quarter of your job and then walked away. Not very useful.

And if the answer to all of these questions is Yes, hey, great, you’ve got what it takes! You’re really into technical writing! We need you! Now, quick, tell me, how long will you keep it up without any form of payment? Quite a while, you say? Great! Try beating my record of 15 years.

The problem with expecting volunteers to do this sort of work is that they burn out. Been there, done that. And what happens when all volunteers burn out?

Yes, new volunteers will likely step up. But they have to be introduced to the documentation system, not only the technical bits, but also the editorial requirements. Their first contributions will have to be checked for factual errors and stylistic problems, for proper linking to related pages, for enough browser compatibility information. Who’s going to do that? Also volunteers? But they just burned out.

It doesn’t work in the long run.

Money

What ought to happen is MDN (or its successor) securing the funding to retain a few core technical writers on a permanent basis. Without that, it’s doomed to fail.

Now there are two ways of securing funding. The first one is appealing to big companies, particularly browser vendors. I can see Google, Microsoft, and Samsung chipping in a bit, maybe even quite a lot, to keep MDN running. (Apple won’t, of course. They’re on their own cloud.) This could work, especially in the short run.

But will we be well served by that in the long run? You might have noticed that all companies I named use a Chromium-based browser. What about Firefox? Or WebKit?

I have no doubt that the Chrome, Edge, and Samsung Internet developer relations teams are totally serious about keeping other browsers on board and will not bend MDN new-style to their own browsers in any way. They’ve shown their commitment to browser diversity time and again.

What I doubt is that the final decision rests with them. Once MDN new-style funded by the browser vendors has been running for a while, managers will poke their heads around the corner to ask what we, as in Google, Microsoft, or Samsung, get in return for all the money we’re spending. More attention for our browser, that’s what. Make it so!

That’s why I prefer the second option in the long run: funding by the web community itself. Create an independent entity like Fronteers, but then international, get members to pay 50-100 euros/dollars per year, and use that money to fund MDN or its successor.

Now this is a lot of work. But I still feel it needs to be done.

But who will do it? Volunteers? We’ll run into the same problem that I sketched above, just one step removed. I briefly considered starting such an initiative myself, but I found that I am unwilling to do it for free.

And I know exactly what it takes. I founded Fronteers for free, and it took me half a year of mind-numbing work, including fending off random idiots (sorry: community members) who also had an opinion. Even though others stepped up and helped, my first burn-out was mostly caused by Fronteers’s founding, and I am unwilling to do it all over again for free.

So there we are. On balance, it’s more likely we go with the big-company solution that will work in the short run but will give problems in the long run.

Unless the web development community stops expecting a free ride, and starts to pay up. Initiatives such as I Love MDN don’t give me a lot of hope, though.

IPv4, IPv6, and a sudden change in attitude

A few years ago I wrote The World in Which IPv6 was a Good Design. I'm still proud of that article, but I thought I should update it a bit.

No, I'm not switching sides. IPv6 is just as far away from universal adoption, or being a "good design" for our world, as it was three years ago. But since then I co-founded a company that turned out to be accidentally based on the principles I outlined in that article. Or rather, from turning those principles upside-down.

In that article, I explored the overall history of networking and the considerations that led to IPv6. I'm not going to cover that ground again. Instead, I want to talk about attitude.

Internets, Interoperability, and Postel's Law

Did you ever wonder why "Internet" is capitalized?

When I first joined the Internet in the 1990s, I found some now-long-lost introductory tutorial. It talked about the difference between an internet (lowercase i) and the Internet (capital I). An internet is "any network that connects smaller networks together." The Internet is... well... it turns out that you don't need more than one internet. If you have two internets, it is nearly unavoidable that someone will soon figure out how to connect them together. All you need is one person to build that one link, and your two internets become one. By induction then, the Internet is the end result when you make it easy enough for a single motivated individual to join one internet to another, however badly.

Internets are fundamentally sloppy. No matter how many committees you might form, ultimately connections are made by individuals plugging things together. Those things might follow the specs, or not. They might follow those specs well, or badly. They might violate the specs because everybody else is also violating the specs and that's the only way to make anything work. The connections themselves might be fast or slow, or flakey, or only functional for a few minutes each day, or subject to amateur radio regulations, or worse. The endpoints might be high-powered servers, vending machines, toasters, or satellites, running any imaginable operating system. Only one thing's for sure: they all have bugs.

Which brings us to Postel's Law, which I always bring up when I write about networks. When I do, invariably there's a slew of responses trying to debate whether Postel's Law is "right," or "a good idea," as if it were just an idea and not a force of nature.

Postel's Law says simply this: be conservative in what you send, and liberal in what you accept. Try your best to correctly handle the bugs produced by the other end. The most successful network node is one that plans for every "impossible" corruption there might be in the input and does something sensible when it happens. (Sometimes, yes, "something sensible" is to throw an error.)

[Side note: Postel's Law doesn't apply in every situation. You probably don't want your compiler to auto-fix your syntax errors, unless your compiler is javascript or HTML, which, kidding aside, actually were designed to do this sort of auto-correction for Postel's Law reasons. But the law does apply in virtually every complex situation where you need to communicate effectively, including human conversations. The way I like to say it is, "It takes two to miscommunicate." A great listener, or a skilled speaker, can resolve a lot of conflicts.]

Postel's Law is the principle the Internet is based on. Not because Jon Postel was such a great salesperson and talked everyone into it, but because that is the only winning evolutionary strategy when internets are competing. Nature doesn't care what you think about Postel's Law, because the only Internet that happens will be the one that follows Postel's Law. Every other internet will, without exception, eventually be joined to The Internet by some goofball who does it wrong, but just well enough that it adds value, so that eventually nobody will be willing to break the connection. And then to maintain that connection will require further application of Postel's Law.

IPv6: a different attitude

If you've followed my writing, you might have seen me refer to IPv6 as "a second internet that not everyone is connected to." There's a lot wrapped up in that claim. Let's back up a bit.

In The World in Which IPv6 was a Good Design, I talked about the lofty design goals leading to IPv6: eliminate bus networks, get rid of MAC addresses, no more switches and hubs, no NATs, and so on. What I didn't realize at the time, which I now think is essential, is that these goals were a fundamental attitude shift compared to what went into IPv4 (and the earlier protocols that led to v4).

IPv4 evolved as a pragmatic way to build an internet out of a bunch of networks and machines that existed already. Postel's Law says you'd best deal with reality as it is, not as you wish it were, and so they did. When something didn't connect, someone hacked on it until it worked. Sloppy. Fits and starts, twine and duct tape. But most importantly, nobody really thought this whole mess would work as well as it turned out to work, or last as long as it turned out to last. Nobody knew, at the time, that whenever you start building internets, they always lead inexorably to The Internet.

These (mostly) same people, when they started to realize the monster they had created, got worried. They realized that 32-bit addresses, which they had originally thought would easily last for the lifetime of their little internet, were not even enough for one address per person in the world. They found out, not really to anyone's surprise, that Postel's Law, unyielding as it may be, is absolutely a maintenance nightmare. They thought they'd better hurry up and fix it all, before this very popular Internet they had created, which had become a valuable, global, essential service, suddenly came crashing down and it would all be their fault.

[Spoiler: it never did come crashing down. Well, not permanently. There were and are still short-lived flare-ups every now and then, but a few dedicated souls hack it back together, and so it goes.]

IPv6 was created in a new environment of fear, scalability concerns, and Second System Effect. As we covered last time, its goal was to replace The Internet with a New Internet — one that wouldn't make all the same mistakes. It would have fewer hacks. And we'd upgrade to it incrementally over a few years, just as we did when upgrading to newer versions of IP and TCP back in the old days.

We can hardly blame people for believing this would work. Even the term "Second System Effect" was only about 20 years old at the time, and not universally known. Every previous Internet upgrade had gone fine. Nobody had built such a big internet before, with so much Postel's Law, with such a variety of users, vendors, and systems, so nobody knew it would be different.

Well, here we are 25 years later, and not much has changed. If we were feeling snarky, we could perhaps describe IPv6 as "the String Theory of networking": a decades-long boondoggle that attracts True Believers, gets you flamed intensely if you question the doctrine, and which is notable mainly for how much progress it has held back.

Luckily we are not feeling snarky.

Two Internets?

There are, of course, still no exceptions to the rule that if you build any internet, it will inevitably (and usually quickly) become connected to The Internet.

I wasn't sitting there when it happened, but it's likely the very first IPv6 node ran on a machine that was also connected to IPv4, if only so someone could telnet to it for debugging. Today, even "pure IPv6" nodes are almost certainly connected to a network that, if configured correctly, can find a way to any IPv4 node, and vice versa. It might not be pretty, it might involve a lot of proxies, NATs, bridges, and firewalls. But it's all connected.

In that sense, there is still just one Internet. It's the big one. Since day 1, The Internet has never spoken just one protocol; it has always been a hairy mess of routers, bridges, and gateways, running many protocols at many layers. IPv6 is one of them.

What makes IPv6 special is that its proponents are not content for it to be an internet that connects to The Internet. No! It's the chosen one. Its destiny is to be The Internet. As a result, we don't only have bridges and gateways to join the IPv6 internets and the IPv4 internet (although we do).

Instead, IPv6 wants to eventually run directly on every node. End users have been, uh, rather unwilling to give up IPv4, so for now, every node has that too. As a result, machines are often joined directly to what I call "two competing internets" --- the IPv4 one and the IPv6 one.

Okay, at this point our terminology has become very confusing. Sorry. But all this leads to the question I know you want me to answer: Which internet is better!?

Combinatorics

I'll get to that, but first we need to revisit what I bravely called Avery's Laws of Wifi Reliability, which are not laws, were surely invented by someone else (since they're mostly a paraphrasing of a trivial subset of CAP theorem), and as it turns out, apply to more than just wifi. Oops. I guess the name is wrong in almost every possible way. Still, they're pretty good guidelines.

Let's refresh:

  • Rule #1: if you have two wifi router brands that work with 90% of client devices, and your device has a problem with one of them, replacing the wifi router brand will fix the problem 90% of the time. Thus, an ISP offering both wifi routers has a [1 - (10% x 10%)] = 99% chance of eventual success.

  • Rule #2: if you're running two wifi routers at once (say, a primary router and an extender), and both of them work "correctly" for about 90% of the time each day, the chance that your network has no problems all day is 81%.

In Rule #1, which I call "a OR b", success compounds and failure rates drop.

In Rule #2, which I call "a AND b", failure compounds and success drops.
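Spelled out as arithmetic, assuming the two components fail independently:

p = 0.9                  # each router (or internet) behaves on any given day
print(1 - (1 - p) ** 2)  # "a OR b": either one will do -> about 0.99
print(p ** 2)            # "a AND b": you need both     -> about 0.81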

But wait, didn't we add redundancy in both cases?

Depending on how many distributed systems you've had to build, this is either really obvious or really mind blowing. Why did the success rate jump to 99% in the first scenario but drop to 81% in the second? What's the difference? And... which one of those cases is like IPv6?

Failover

Or we can ask that question another way. Why are there so many web pages that advise you to solve your connectivity problem by disabling IPv6?

Because automatic failover is a very hard problem.

Let's keep things simple. IPv4 is one way to connect client A to server X, and IPv6 is a second way. It's similar to buying redundant home IPv4 connections from, say, a cable and a DSL provider and plugging them into the same computer. Either way, you have two independent connections to The Internet.

When you have two connections, you must choose between them. Here are some factors you can consider:

  • Which one even offers a path from A to X? (If X doesn't have an IPv6 address, for example, then IPv6 won't be an option.)

  • Which one gives the shortest paths from A to X and from X to A? (You could evaluate this using hopcount or latency, for example, like in my old netselect program.)

  • Which path has the most bandwidth?

  • Which path is most expensive?

  • Which path is most congested right now?

  • Which path drops out least often? (A rebooted NAT will drop a TCP connection on IPv4. But IPv6 routes change more frequently.)

  • Which one has buggy firewalls or NATs in the way? Do they completely block it (easy) or just act strangely (hard)?

  • Which one blocks certain UDP or TCP ports, intentionally or unintentionally?

  • Which one is misconfigured to block certain ICMP packets so that PMTU discovery (always or sometimes) doesn't work with some or all hosts?

  • Which one blocks certain kinds of packet fragmentation?

A common heuristic called "Happy Eyeballs" is one way to choose between routes, but it covers only a few of those criteria.
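To make that concrete: Python's asyncio grew an RFC 8305-style knob (Python 3.8+), and a minimal sketch looks like this. It only answers the "which connection completes first" question, none of the other criteria above:

import asyncio

async def connect(host, port):
    # With happy_eyeballs_delay set, asyncio staggers the connection attempts
    # 250 ms apart instead of trying them strictly one after another, and it
    # keeps whichever attempt completes first.
    reader, writer = await asyncio.open_connection(
        host, port, happy_eyeballs_delay=0.25)
    print("connected via", writer.get_extra_info("peername"))
    writer.close()
    await writer.wait_closed()

asyncio.run(connect("example.com", 443))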

The truth is, it's extremely hard to answer all those questions, and even if you can, the answers are different for every combination of A and X, and they change over time. Operating systems, web browsers, and apps, even if they implement Happy Eyeballs or something equivalent, tend to be pretty bad at detecting all these edge cases. And every app has to do it separately!

My claim is that the "choose between two internets" problem is the same as the "choose between two flakey wifi routers on the same SSID" problem (Rule #2). All is well as long as both internets (or both wifi routers) are working perfectly. As soon as one is acting weird, your overall results are going to be weird.

...and the Internet always acts weird, because of the tyranny of Postel's Law. Debugging the Internet is a full time job.

...and now there are two internets, with a surprisingly low level of overlap, so your ISP has to build and debug both.

...and every OS vendor has to debug both protocol implementations, which is more than twice as much code.

...and every app vendor has to test with both IPv4 and IPv6, which of course they don't.

We should not be surprised that the combined system is less reliable.

The dream

IPv6 proponents know all this, whether rationally or intuitively or at least empirically. The failure rate of two wonky internets joined together is higher than the failure rate of either wonky internet alone.

This leads them to the same conclusion you've heard so many times: we should just kill one of the internets, so we can spend our time making the one remaining internet less wonky, instead of dividing our effort between the two. Oh, and, obviously the one we kill will be IPv4, thanks.

They're not wrong! It would be a lot easier to debug with just one internet, and you know, if we all had to agree on one, IPv6 is probably the better choice.

But... we don't all have to agree on one, because of the awesome unstoppable terribleness that is Postel's Law. Nobody can declare one internet or the other to be officially dead, because the only thing we know for sure about internets is that they always combine to make The Internet. Someone might try to unplug IPv4 or IPv6, but some other jerk will plug it right back in.

Purity cannot ever be achieved at this kind of scale. If you need purity for your network to be reliable, then you have an unsolvable problem.

The workaround

One thing we can do, though, is build better heuristics.

Ok, actually we have to do better than that, because it turns out that correctly choosing between the two internets for each connection, at the start of that connection, is not possible or good enough. Problems like PMTU, fragmentation, NAT resets, and routing changes can interrupt a connection partway through and cause poor performance or dropouts.

I want to go back to a side note I left near the end of The World in Which IPv6 was a Good Design: mobile IP. That is, the ability for your connections to keep going even if you hop between IP addresses. If you had IP mobility, then you could migrate connections between your two internets in real time, based on live quality feedback. You could send the same packets over both links and see which ones work better. If you picked one link and it suddenly stopped, you could retransmit packets on the other link and pick up where you left off. Your precise heuristic wouldn't even matter that much, as long as it tries both ways eventually.

If you had IP mobility, then you could convert the "a AND b" scenario (failure compounds) into the "a OR b" scenario (success compounds).

And you know what, forget about IPv4 and IPv6. The same tricks would work with that redundant cable + DSL setup we mentioned above. Or a phone with both wifi and LTE. Or, given a fancy enough wifi client chipset, smoothly switching between multiple unrelated wifi routers.

This is what we do, in a small way, with Tailscale's VPN connections. We try all your Internet links, IPv4 and IPv6, UDP and TCP, relayed and peer-to-peer. We made mobile IP a real thing, if only on your private network for now. And what do you know, the math works. Tailscale with two networks is more reliable than Tailscale with one network.

Now, can it work for the whole Internet?

This article was originally posted to the Tailscale blog

On Liberating My Smartwatch From Cloud Services

I’ve often said that if we convince ourselves that technology is magic, we risk becoming hostages to it. Just recently, I had a brush with this fate, but happily, I was saved by open source.

At the time of writing, Garmin is suffering from a massive ransomware attack. I also happen to be a user of the Garmin Instinct watch. I’m very happy with it, and in many ways, it’s magical how much capability is packed into such a tiny package.

I also happen to have a hobby of paddling the outrigger canoe:

I consider the GPS watch to be an indispensable piece of safety gear, especially for the boat’s steer, because it’s hard to judge your water speed when you’re more than a few hundred meters from land. If you get stuck in a bad current, without situational awareness you could end up swept out to sea or worse.

The water currents around Singapore can be extreme. When the tides change, the South China Sea eventually finds its way to the Andaman Sea through the Singapore Strait, causing treacherous flows of current that shift over time. Thus, after every paddle, I upload my GPS data to the Garmin Connect cloud and review the route, in part to note dangerous changes in the ebb-and-flow patterns of currents.

While it’s a clear and present privacy risk to upload such data to the Garmin cloud, we’re all familiar with the trade-off: there’s only 24 hours in the day to worry about things, and the service just worked so well.

Until yesterday.

We had just wrapped up a paddle with particularly unusual currents, and my paddling partner wanted to know our speeds at a few of the tricky spots. I went to retrieve the data and…well, I found out that Garmin was under attack.

Garmin was being held hostage, and transitively, so was access to my paddling data: a small facet of my life had become a hostage to technology.

A bunch of my paddling friends recommended I try Strava. The good news is Garmin allows data files to be retrieved off of the Instinct watch, for upload to third-party services. All you have to do is plug the watch into a regular USB port, and it shows up as a mass storage device.

The bad news is that as I tried to create an account on Strava, all sorts of warning bells went off. The website is full of dark patterns, and when I clicked to deny Strava access to my health-related data, I was met with this tricky series of dialog boxes:

Click “Decline”…

Click “Deny Permission”…

Click “OK”…

Three clicks to opt out, and if I wasn’t paying attention and just kept clicking the bottom box, I would have opted-in by accident. After this, I was greeted by a creepy list of people to follow (how do they know so much about me from just an email?), and then there’s a tricky dialog box that, if answered incorrectly, routes you to a spot to enter credit card information as part of your “free trial”.

Since Garmin at least made money by selling me a $200+ piece of hardware, collecting my health data is just icing on the cake; for Strava, my health data is the cake. It’s pretty clear to me that Strava made a pitch to its investors that they’ll make fat returns by monetizing my private data, including my health information.

This is a hard no for me. Instead of liberating myself from a hostage situation, going from Garmin to Strava would be like stepping out of the frying pan and directly into the fire.

So, even though this was a busy afternoon … I’m scheduled to paddle again the day after tomorrow, and it would be great to have my boat speed analytics before then. Plus, I was sufficiently miffed by the Strava experience that I couldn’t help but start searching around to see if I couldn’t cobble together my own privacy-protecting alternative.

I was very pleased to discover an open-source utility called gpsbabel (thank you gpsbabel! I donated!) that can unpack Garmin’s semi-(?)proprietary “.FIT” file format into the interoperable “.GPX” format. From there, I was able to cobble together bits and pieces of XML parsing code and merge it with OpenStreetMaps via the Folium API to create custom maps of my data.

Even with getting “lost” on a detour of trying to use the Google Maps API that left an awful “for development only” watermark on all my map tiles, this only took an evening — it wasn’t the best possible use of my time all things considered, but it was mostly a matter of finding the right open-source pieces and gluing them together with Python (fwiw, Python is a great glue, but a terrible structural material. Do not build skyscrapers out of Python). The code quality is pretty crap, but Python allows that, and it gets the job done. Given those caveats, one could use it as a starting point for something better.

Now that I have full control over my data, I’m able to visualize it in ways that make sense to me. For example, I’ve plotted my speed as a heat map over the course, with circles proportional to the speed at that moment, and a hover-text that shows my instantaneous speed and heart rate:

It’s exactly the data I need, in the format that I want; no more, and no less. Plus, the output is a single html file that I can share directly with nothing more than a simple link. No analytics, no cookies. Just the data I’ve chosen to share with you.

Here’s a snippet of the code that I use to plot the map data:
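(The snippet in the original post was shared as an image and isn't reproduced here. As a rough stand-in for the kind of glue described above - gpsbabel conversion, GPX parsing, Folium circles sized by speed - here is a hypothetical sketch; the file names are placeholders, heart rate lives in GPX extensions and is omitted, and none of this is the author's actual code.)

import math

import folium  # pip install folium
import gpxpy   # pip install gpxpy

# First convert the watch's file, e.g.:
#   gpsbabel -i garmin_fit -f activity.fit -o gpx -F activity.gpx

def haversine_m(p1, p2):
    """Rough great-circle distance in metres between two GPX points."""
    r = 6371000.0
    lat1, lat2 = math.radians(p1.latitude), math.radians(p2.latitude)
    dlat, dlon = lat2 - lat1, math.radians(p2.longitude - p1.longitude)
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

with open("activity.gpx") as fh:
    points = gpxpy.parse(fh).tracks[0].segments[0].points

m = folium.Map(location=[points[0].latitude, points[0].longitude],
               zoom_start=15, tiles="OpenStreetMap")

for prev, cur in zip(points, points[1:]):
    dt = (cur.time - prev.time).total_seconds() or 1.0
    speed = haversine_m(prev, cur) / dt  # metres per second
    folium.CircleMarker(
        location=[cur.latitude, cur.longitude],
        radius=2 + 2 * speed,            # circle size tracks boat speed
        tooltip=f"{speed:.2f} m/s",
        fill=True,
    ).add_to(m)

m.save("paddle.html")  # one self-contained HTML file, shareable as a simple link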

Like I said, not the best quality code, but it works, and it was quick to write.

Better yet, I’m no longer uploading my position or fitness data to the cloud — there is a certain intangible satisfaction in “going dark” for yet another surveillance leakage point in my life, without any compromise in quality or convenience.

It’s also an interesting meta-story about how healthy and vibrant the open-source ecosystem is today. When the Garmin cloud fell, I was able to replace the most important functions of it in just an afternoon by cutting and pasting together various open source frameworks.

The point of open source is not to ritualistically compile our stuff from source. It’s the awareness that technology is not magic: that there is a trail of breadcrumbs any of us could follow to liberate our digital lives in case of a potential hostage situation. Should we so desire, open source empowers us to create and run our own essential tools and services.

Edits: added details on how to take data off the watch, and noted the watch’s price.
