Monthly Archives: March 2014

Two Factor Security

My friend Dr24Hours tweeted something disparaging about two factor security this morning.

I’m a huge fan of two factor, but I’ve definitely noticed that it’s not well understood by a lot of people, and there are a lot of misunderstandings of what it is, how it works, and why it’s good.

Two factor is a mode of authentication. It’s a way of proving that you are who you say you are, in order to be granted access to some secured resource. You don’t want random people to be able to access your email, so your email provider has a way of ensuring that only you can access it. Before you can access it, you have to prove that you’re someone who’s supposed to be able to see it.

So how can you prove who you are? There are fancy terms for different strategies, but fundamentally, it comes down to one of two things: either something you know, or something you have.

If you’re using the internet, you’re guaranteed to be familiar with the “something you know” method: passwords are something you know. For any password protected service, you have a secret that you share with the service. By telling them the secret – something that no one but you should know – you’re proving to them that you’re you.

If you work in an office that uses ID badges, you’re familiar with a “something you have” strategy. The people at the security desk on the ground floor of my office have no idea who I am. They don’t recognize me. Someone could walk into the building, and claim that they work for Twitter, and the security people would have no idea if they were telling the truth. But we all have security cards for the building. That card proves that I’m someone who’s supposed to be allowed in.

Passwords – something you know – are used all over the internet. Each of us has a ridiculous number of passwords. Just quickly trying to come up with a list, I’ve got at least 35 different passwords! And I’m probably missing at least another dozen.

“Something you know” has two big related problems: it’s only a good way of proving who you are if it’s really something that only you know, and only if it’s something that no one else can guess. If anyone else knows it, or if anyone else can guess it, then it’s no longer useful as a way of proving that you’re you.

Both of those weaknesses have major practical effects.

Lots of people choose really stupid passwords. There’s a huge number of people who use “password” or “1234” as their password. In fact, people who’ve tried to crack passwords to test the security of systems have found that the five most common passwords are “password”, “123456”, “12345678”, “12345”, and “qwerty”. If someone wants to get access to your email, or your Amazon account, or anything else, those will be the first thing they try. A frightening amount of the time, that will work.

Even among people who use good passwords, the sheer number of passwords they need to remember has gotten out of hand, and so they reuse passwords. If you reuse your password, then that means that if any of the places you used your password gets cracked, or if any of the places where you used your password are dishonest, your password is no longer a secret that only you know.

What two-factor authentication does is mitigate the risk of bad passwords and password reuse by adding a layer of “something you have” to authentication. First, it checks the shared secret – the thing that you know. If that’s successful, then it also checks for the thing you have.

For example, my work email account uses two-factor authentication. What that means is that when I log in, it first asks me for my password. Once I give it the password, then it asks for a four digit code. I’ve got an app on my phone that generates that code. Only my phone contains the data that’s needed to generate that four digit code – so if I get it right, that means that I have my phone.
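Code-generator apps of this kind typically work along the lines of a time-based one-time password (TOTP, RFC 6238): the phone and the server share a second secret, and both derive a short code from that secret plus the current time. Here’s a sketch of the standard TOTP construction – not necessarily exactly what my provider uses, and most apps emit six digits rather than four:

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, digits: int = 6, step: int = 30) -> str:
    """Derive a one-time code from a shared secret plus the current time.

    Both the phone and the server compute this; if the codes match, the
    server knows the client holds the secret (i.e., has the phone).
    """
    counter = unix_time // step                     # which 30-second bucket we're in
    msg = struct.pack(">Q", counter)                # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # RFC 4226 "dynamic truncation"
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# The RFC 6238 test secret, 59 seconds after the epoch:
code = totp(b"12345678901234567890", 59, digits=8)
# code is "94287082" (the published RFC 6238 SHA-1 test vector)
```

A real server also tolerates a small window of adjacent time buckets (to allow for clock skew) and rejects replayed codes; none of that bookkeeping is shown here.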

Two factor doesn’t tell anyone additional information about you, beyond the fact that you have whatever device or artifact that you’re using. Using two factor doesn’t tell Google what I’m buying at Amazon, or what I’m tweeting, or who I talk to. Even if I’m using Google 2factor, Google gets no more information about me than it would get by checking my password.

But two factor has a huge benefit for me. It means that it’s much, much harder for someone else to get access to my accounts. I don’t reuse my passwords, but I do use 1Password for password management, so if someone were able to crack my 1Password data file, they’d have access to my passwords. But with two factor, that’s not enough for them to get access to my account. I wish that I could turn on two factor in more places!

Yet Another Cantor Crank: Size vs Cardinality

Over the weekend, a reader sent me links to not one but two new Cantor cranks!

Sadly, one of them is the incoherent sort – the kind of nutjob who strings together words in meaningless ways. Without a certain minimal rationality, there’s nothing I can say. What I try to do on this blog isn’t just make fun of crackpots – it’s explain what they get wrong! If a crackpot strings together words randomly, and no one can make any sense of just what the heck they’re saying, there’s no way to do that.

On the other hand, the second guy is a whole different matter. He’s making a very common mistake, and he’s making it very clearly. So for him, it’s well worth taking a moment and looking at what he gets wrong.

My mantra on this blog has always been: “the worst math is no math”. This is a perfect example.

First, I believe that Cantor derived a false conclusion from the diagonal method.

I believe that the primary error in the theory is not with the assertion that the set of Real Numbers is a “different size” than the set of Integers. The problem lies with the assertion that the set of Rational Numbers is the “same size” as the set of Integers. Our finite notion of size just doesn’t extend to infinite sets. Putting numbers in a list (i.e., creating a one-to-one correspondence between infinite sets) does not show that they are the same “size.”

This becomes clear if we do a two step version of the diagonal method.

Step One: Lets start with the claim: “Putting an infinite set in a list shows that it is the same size as the set of Integers.”

Step Two: Claiming to have a complete list of reals, Cantor uses the diagonal method to create a real number not yet in the list.

Please, think about this two step model. The diagonal method does not show that the rational numbers are denumerable while the real numbers are not. The diagonal method shows that the assertion in step one is false. The assertion in step one is as false for rational numbers as it is for real numbers.

The diagonal method calls into question the cross-section proof used to show that the rational numbers are the same size as the integers.

Cantor didn’t talk about size. He never asserted anything about size. He asserted something about cardinality.

That might sound like a silly nitpick: it’s just terminology, right?

Wrong. What does size mean? Size is an informal term. It’s got lots of different potential meanings. There’s a reasonable definition of “size” where the set of natural numbers is larger than the set of even natural numbers. It’s a very simple definition: given two sets of objects A and B, the size of B is larger than the size of A if A is a proper subset of B.

When you say the word “size”, what do you mean? Which definition?

Cantor defined a new way of defining size. It’s not the only valid measure, but it is a valid measure which is widely useful when you’re doing math. The measure he defined is called cardinality. And cardinality, by definition, says that two sets have the same cardinality if and only if it’s possible to create a one-to-one correspondence between the two sets.
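To make “one-to-one correspondence” concrete, here’s a quick sketch (my illustration, in Python) of the classic zig-zag enumeration that pairs every positive rational with a natural number – the construction behind the claim that the rationals have the same cardinality as the integers:

```python
from fractions import Fraction
from math import gcd


def rationals():
    """Yield every positive rational exactly once, by walking the diagonals
    num + den = 2, 3, 4, ... and skipping non-lowest-terms duplicates."""
    total = 2
    while True:
        for num in range(1, total):
            den = total - num
            if gcd(num, den) == 1:   # skip e.g. 2/2, which equals 1/1
                yield Fraction(num, den)
        total += 1


gen = rationals()
first_six = [next(gen) for _ in range(6)]
# first_six is [1, 1/2, 2, 1/3, 3, 1/4]: every positive rational shows up
# at some finite position, so each one is paired with a natural number.
```

That pairing is, by definition, what “same cardinality” means – and diagonalization shows that no such pairing can exist for the reals, which is exactly the distinction the crank is missing.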

When our writer said “Our finite notion of size just doesn’t extend to infinite sets”, he was absolutely correct. The problem is that he’s not doing math! The whole point of Cantor’s work on cardinality was precisely that our finite notion of size doesn’t extend to infinite sets. So he didn’t use our finite notion of size. He defined a new mathematical construct that allows us to meaningfully and consistently talk about the size of infinite sets.

Throughout his website, he builds a complex edifice of reasoning on this basis. It’s fascinating, but it’s ultimately all worthless. He’s trying to do math, only without actually doing math. If you want to try to refute something like Cantor’s diagonalization, you can’t do it with informal reasoning using words. You can only do it using math.

This gets back to a very common error that people make, over and over. Math doesn’t use fancy words and weird notations because mathematicians are trying to confuse non-mathematicians. It’s not done out of spite, or out of some desire to exclude non-mathematicians from the club. It’s about precision.

Cantor didn’t talk about the cardinality of infinite sets because he thought “cardinality” sounded cooler and more impressive than “size”. He did it because “size” is an informal concept that doesn’t work when you scale to infinite sets. He created a new concept because the informal concept doesn’t work. If your argument against Cantor is that his concept of cardinality is different from your informal concept of size, you’re completely missing the point.

Recipe: Real Ramen!

Yesterday, my son wasn’t feeling good, and asked for soup. (Poor kid inherited my stomach troubles.) I’ve been dying to try my hand at a real, serious ramen, so I dived in and did this. It turned out amazingly good.

If you’re American, odds are that when you hear “ramen”, you think of those little packets of noodles with a powdered MSG-heavy soup base that you can buy 5 for a dollar. To be honest, I do like those. But they’re a sad imitation of what ramen is supposed to be.

Ramen is, in my opinion, one of the very best soup dishes in the world. A real ramen is a bowl of chewy noodles, served in a rich hearty broth, with some delicious roasted meat, some veggies. Ramen broth isn’t a wimpy soup like American chicken noodle soup – it’s an intense soup. When you eat a bowl of ramen, you’re eating a meal, and you finish it feeling full, and warm, and happy with the world.

This isn’t a simple recipe. It’s a lot of work, but it’s worth it! And most of the components can be prepared in large batches and frozen.

So, here we go. Ramen with Chicken and Shrimp Tare, Watercress, and Roast Pork Tenderloin!


In ramen, you make the broth relatively unseasoned. Separately, you prepare a tare, which is a seasoning liquid. When you serve the ramen, you start by putting tare in the bottom of the bowl. It’s one of the tricks of ramen – it’s a big part of what makes the broth special. Every ramen cook has their own tare recipe.

The Tare


  • Shells from 1lb shrimp
  • 8 chicken wings, cut into three pieces each. (Do not throw out the wingtips – for this, they’re the best part!)
  • 1 cup mirin
  • 1 cup sake
  • 1/2 cup soy sauce
  • 1 cup water.


  1. Heat some oil in a pan, and saute the shrimp shells until they’re cooked through and pink.
  2. Transfer the cooked shells to a cold saucepan.
  3. Add a bit more oil to the hot pan, and add the wings into the pan where you cooked the shells. Brown them really well on both sides. (I also took the neck from the chicken I used to make the broth, and put it in here.)
  4. Move them into the saucepan with the shells.
  5. Add the mirin, sake, soy, and water into the pan where you sauteed the wings, and scrape up all of the brown bits. Then pour it over the wings and shells.
  6. Simmer for at least 30 minutes, until it reduces by roughly half. Skim out all of the solids.

You should give this a taste. It should be very salty, but also sweet, and intensely flavored from the chicken and shrimp shells.

The Broth


  • 1 whole chicken, cut into parts.
  • A bunch of miscellaneous bones – chicken backs are the best, pork bones will be good too – as long as they aren’t smoked. Even beef soup bones will work.
  • 1 whole onion, cut into large chunks.
  • 1 head of garlic, cut in half.
  • 3 whole star anise


  1. Heat up a large stockpot on high heat. Add a little bit of oil.
  2. Throw in the bones, and stir them until they’re browned on all sides.
  3. Add in the chicken parts. No salt, no browning – just throw the chicken in.
  4. Add enough water to cover everything in the pot.
  5. Add the onion, garlic, and anise to the pot.
  6. When it comes to a boil, reduce the heat to low, and let it simmer. Periodically skim the scum that rises to the top.
  7. Simmer for at least 2 hours. You can simmer it overnight in a slow-cooker, and it’ll taste even better, but you’ll need extra water.
  8. Take out the chicken, bones, and spices. Add some salt – but you want to leave the broth a little underseasoned, because you’re going to mix in some tare later!

Roast Pork Tenderloin


  • 1/2 pork tenderloin, cut into two pieces (to make it easier to fit into the pan.)
  • 2 cloves garlic, finely minced.
  • 3 tablespoons soy sauce
  • 3 tablespoons sake
  • 1 teaspoon sugar


  1. Take the tenderloin. Remove any silverskin. Poke all over, on all sides, with a fork. (This will help the marinade penetrate.)
  2. Mix together the garlic, soy, sake, and sugar to make a marinade.
  3. Put the pork in, and get it coated. Let it marinate for about an hour, turning it a couple of times.
  4. Heat a cast iron pan on high heat until it’s smoking hot.
  5. Put the tenderloin pieces in the pan. Turn it to brown on all sides.
  6. Remove the pork from the pan, and transfer to a 350 degree oven. Cook until it’s about 140 degrees inside. (This took about 15 minutes in my oven.) This is a bit underdone, but it’s going to cook more in the soup, and you don’t want it to be tough!
  7. Slice the pork into half-inch rounds.
  8. Dip the rounds in the hot tare.

Putting it all together


  • Eggs – one per person.
  • The green parts of a couple of scallions, cut finely.
  • A bunch of watercress.
  • Shichimi togarashi (a prepackaged Japanese spice blend.)
  • Sesame oil.
  • Ramen noodles. (If you go to an Asian grocery, you should be able to find fresh ramen noodles, or at least decent quality dried.)


  1. Boil the eggs for about 5-6 minutes. The whites should be set, the yolks still a bit runny.
  2. In each soup bowl, put:
    • A couple of tablespoons of tare (the exact quantity depends on your taste, and how much you reduced your tare)
    • a bunch of watercress
    • some of the minced scallions
    • A drop of sesame oil
  3. Boil the ramen.
  4. To each bowl, add a big bunch of ramen noodles.
  5. Cover the noodles with the broth.
  6. Add a couple of slices of the roast pork.
  7. Crack the egg, and scoop it out of its shell into the soup.
  8. Sprinkle some shichimi on top.
  9. Eat!

Big Numbers and the Lost Plane

(In the original version of this post, there were two places where I wrote “cubic” instead of “square”. It was a stupid mistake, but just a stupid wording mistake. The calculations are unchanged: I was calculating square units, not cubic – I just typed the wrong thing.)

I haven’t written about this in a while, but for a long time, I’ve been fascinated by just how bad humans are at actually understanding big numbers. When it comes down to it, we’re comfortable with numbers we can count on our fingers. The further you get from our fingers, the worse we get at it.

We see this over and over. I’ve been noticing a new variation on this lately, in discussions about the missing Malaysian airliner. For example, I was at the dentist Tuesday morning. They’ve installed TVs in the exam rooms, to give you something to do while you wait for the novocaine to kick in, and they had it turned to CNN. The technician who was setting things up was talking to me about what they were saying, and she kept repeating: it’s a plane! It’s huge! Why can’t they find it?

If you think intuitively, it’s a good question!

In fact, if you look at it with basic intuition, it seems insane. A Boeing 777 is a huge plane! The thing is nearly as long as a football field, and its wingspan is just short of 200 feet. Fully loaded with fuel at takeoff, it weighs more than 300 tons! How can you lose something that large? Even if it broke into pieces, the debris field from it would need to be huge, and it must be easy to find!

But that’s relying on intuition, and as I said in the introduction, our intuitions scale very poorly. To show what I mean, I’m going to work through a process that will hopefully help make it clear just how badly our intuition fails us here.

We’ll start by coming up with an estimate of the quantity of debris.

Assume that the fuselage is a continuous tube. The official stats say that the diameter of the fuselage is 20 feet, and it’s 242 feet long. So a deliberately overlarge estimate of the total surface area of the fuselage is 242 × 20 × π – or about 15,000 square feet. Assume that the surface of the wings is about the same, and you get a total surface area of about 30,000 square feet. That means that if the entire body of the plane was peeled and laid flat, and it all floated on the water, it would cover 30,000 square feet of the surface. (In reality, much of the plane would sink, but the seats would float; on the whole, it’s probably a wash; the 30,000 square feet is probably still order-of-magnitude correct for the amount of debris visible on the surface.) Sounds like a whole lot, right?

To get an idea of how easy that is to find, we need to consider how much area we need to search. From what we know about how long the plane could have been in the air, and what direction it could have been moving, we’ve got a search area that’s at least a couple of thousand square miles.

One square mile is 27,878,400 square feet. So assuming that the plane crashed, and its wreckage is distributed over one square mile, that would mean that in the crash area, if every bit of the plane floated, just over one tenth of one percent of that square mile is covered in wreckage. That’s piling together very unrealistic assumptions – a highly constrained debris field, all of it floating – in order to stack the odds of finding it!
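To sanity-check that arithmetic, here’s the same back-of-the-envelope calculation as a few lines of Python (just a restatement of the numbers above, nothing new):

```python
import math

# Deliberately overlarge debris estimate: a 242ft-long, 20ft-diameter tube, unrolled,
# with the wings assumed to contribute about as much surface again.
fuselage_area = 242 * 20 * math.pi      # about 15,000 square feet
debris_area = 2 * fuselage_area         # about 30,000 square feet

sq_ft_per_sq_mile = 5280 ** 2           # 27,878,400

percent_covered = debris_area / sq_ft_per_sq_mile * 100
# percent_covered comes out around 0.11 – just over a tenth of one percent
# of a single square mile, even under these generous assumptions.
```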

We’re using some optimistic assumptions here – but even so, even with a debris field that’s highly (unrealistically) constrained, it’s still very sparse. (In fact, it’s likely that if the plane crashed in the water, the debris field is spread over more than one square mile.) That’s what we’re looking for: an area where less than one tenth of one percent of the surface is covered, in an area covering several thousand square miles.

A thousand square miles is pretty squarely in the zone of stuff that we don’t really understand very well. Getting an understanding of just how big an area a thousand square miles is… it’s not something that our intuition will help with. But statistically, we’re looking for a patch of debris where the visible artifacts cover roughly one ten-thousandth of one percent of the area being searched. If the search area were a football field, you’d be looking for a scrap of tinfoil about three inches on a side.

Only even that is making it too easy: it’s not one three-inch scrap of tinfoil. It’s a scrap of foil which was shredded into a thousand even tinier pieces.

That’s assuming a search area of only one thousand square miles. But the search area is, potentially, quite a lot larger than that. Searching for the tinfoil on a football field is easy in comparison.

Once you understand how small an airplane is in comparison to a small patch of ocean, and you understand just how big a small patch of ocean is, then it’s not at all surprising that we haven’t found it yet.

Big Science News! Inflation, Gravity, and Gravitational Waves


So, big announcement yesterday. Lots of people have asked if I could try to explain it! People have been asking since yesterday morning, but folks, I’ve got a job! I’ve been writing when I have time while running slow tests in another window, so it’s taken more than a day to get to it.

The announcement is really, really fascinating. A group has been able to observe gravity wave fluctuations in the cosmic microwave background. This is a huge deal! For example, Sean Carroll (no relation) wrote:

other than finding life on other planets or directly detecting dark matter, I can’t think of any other plausible near-term astrophysical discovery more important than this one for improving our understanding of the universe.

Why is this such a big deal?

This is not an easy thing to explain, but I’ll do my best.

We believe that the universe started with the big bang – all of the matter and energy, all of the space in the universe, expanding outwards from a point. There’s all sorts of amazing evidence for the big bang – not least the cosmic microwave background.

But the big-bang theory has some problems. In particular, why is everything the same everywhere?

That sounds like a strange question. Why wouldn’t it be the same everywhere?

Here’s why: because for changes to occur at the same time in different places, we expect there to be some causal connection between those places. If there is no plausible causal connection, then there’s a problem: how could things happen at the same time, in the same way?

That causal connection is a problem. To explain why, first I need to explain the idea of the observable universe.

Right now, there is some part of the universe that we can observe – because light from it has reached us. There’s also some part of the universe that we can’t observe, because light from it hasn’t reached us yet. Every day, every moment, the observable universe gets larger – not because the universe is expanding (it is, but we’re not talking about the size of the universe, but rather of the part of the universe that we can observe). It’s literally getting larger, because there are parts of the universe that are so far away from us, that the first light they emitted after the universe started didn’t reach us until right now. That threshold, of the stuff that couldn’t possibly have gotten here yet, is constantly expanding, getting farther and farther away.

There are parts of the universe that are so far away, that the light from them couldn’t reach us until now. But when we look at that light, and use it to see what’s there, it looks exactly like what we see around us.

The problem is, it shouldn’t. If you just take the big bang, and you don’t have a period of inflation, what you would expect is a highly non-uniform universe with a very high spatial curvature. Places very far away shouldn’t be exactly the same as here, because there is no mechanism which can make them evolve in exactly the same way that they did here! As energy levels from the big bang decrease, local fluctuations should have produced very different outcomes. They shouldn’t have ended up the same as here – because there’s many different ways things could have turned out, and they can’t be causally connected, because there’s no way that information could have gotten from there to here in time for it to have any effect.

Light is the fastest thing in the universe – but light from these places just got here. That means that until now, there couldn’t possibly be any connection between here and there. How could all of the fundamental properties of space – its curvature, the density of matter and energy – be exactly the same as here, if there was never any possible causal contact between us?

The answer to that is an idea called inflation. At some time in the earliest part of the universe – during a tiny fraction of the first second – the universe itself expanded at a speed faster than light. (Note: this doesn’t mean that stuff moved faster than light – it means that space itself expanded, creating space between things so that the distance between them expanded faster than light. Subtle distinction, but important!) So the matter and energy all got “stretched” out, at the same time, in the same way, giving the universe the basic shape and structure that it has now.

Inflation is the process that created the uniform universe. This process, which happened to the entire universe, had tiny but uniform fluctuations because of the basic quantum structure of the universe. Those fluctuations were the same everywhere – because when they happened, they were causally connected! Inflation expanded space, but those fluctuations provided the basic structure on which the stuff we observe in the universe developed. Since that basic underlying structure is the same everywhere, everything built on top it is the same as well.

We’ve seen lots of evidence for inflation, but it hasn’t quite been a universally accepted idea.

The next piece of the puzzle is gravity. Gravity at least appears to be very strange. All of the other forces in our universe behave in a consistent way. In fact, we’ve been able to show that they’re ultimately different aspects of the same underlying phenomena. All of the other forces can be described quantum mechanically, and they operate through exchange particles that transmit force/energy – for example, electromagnetic forces are transmitted by photons. But not gravity: we have no working quantum theory for how gravity works! We strongly suspect that it must, but we don’t know how, and up to now, we’d never found any actual proof that it does behave quantumly. But if it did, and if inflation happened, that means that those quantum fluctuations during expansion, the things that provided the basic lattice on which matter and energy hang, should have created an echo in gravity!

Unfortunately, we can’t see gravity. The combination of inflation and quantum mechanics means that there should be gravitational fluctuations in the universe – waves in the basic field of gravity. We’ve predicted those waves for a long time. But we haven’t been able to actually test that prediction, because we didn’t have a way to see gravitational waves.

So now, I can finally get to this new result.

They believe that they found gravity waves in the cosmic microwave background. They used a brilliant scheme to observe them: if we look at the cosmic microwave background – not at any specific celestial object, but just at the background – gravitational waves would create a very subtle tensor polarization effect. So they created a system that could observe polarization. Then they removed all of the kinds of polarization that could be explained by anything other than gravitational waves. What they were left with was a very clear wave pattern in the polarization – exactly what was predicted by quantum inflation! You can see one of their images of this wave pattern at the top of this post.

If these new observations are confirmed, that means that we have new evidence for two things:

  1. Inflation happened. These gravity waves are an expected residue of inflation. They’re exactly what we would have expected if inflation happened, and we don’t have any other explanation that’s compatible with them.
  2. Gravity is quantum! If gravity wasn’t quantum, then expansion would have completely smoothed out the gravitational effects, and we wouldn’t see gravitational waves. Since we do see waves, it’s strong evidence that gravity really does have a quantum aspect. We still don’t know how it works, but now we have some really compelling evidence that it must!

Monte Carlo!

I was browsing around my crackpottery archives, looking for something fun to write about. I noticed a link from one of them to an old subject of one of my posts, the inimitable Miles Mathis. And following that, I noticed an interesting comment from Mr. Mathis about Monte Carlo methods: “Every mathematician knows that ‘tools’ like Monte Carlo are used only when you’ve got nothing else to go on and you are flying by the seat of your pants.” I find Monte Carlo computations absolutely fascinating. So instead of wasting time making fun of more of Mathis’ rubbish, I decided to talk about Monte Carlo methods.

It’s a little hard to talk about Monte Carlo methods, because there’s a lot of disagreement about exactly what they are. I’m going to use the broadest definition: a Monte Carlo method is a way of generating a computational result using repeated computations and random sampling.

In other words, Monte Carlo methods are a way of using random sampling to solve problems.

I’ll start with a really simple example. Suppose you want to know the value of \pi. (Pretend that you don’t know the analytical solution.) One thing that you could do is try to measure the circumference of a rod, and then divide it by its diameter. That would work, but it would be really hard to get much accuracy. You could, instead, get a great big square sheet of paper, draw the largest circle that fits inside it, and cover the whole thing in a single layer of grains of sand. Then, very carefully, you could remove the grains of sand that weren’t in the circle, and compare their number to the number of grains that were inside the circle. By doing that, you could get a very, very accurate measurement of the area of the circle relative to the square, and using that, you could get a much more accurate estimate of \pi.

The problem with that is: it’s really hard to get a perfect single-grain layer of sand all over the paper. And it would be a lot of very, very tedious work to get all of the grains that weren’t in the circle. And it would be very tedious to count them. It’s too much trouble.

Instead, you could just take 1,000 grains of sand, and drop them randomly all over the circle and the square. Then you could count how many landed in the circle. Or even easier, you could just go to a place where lots of drunk people play darts! Draw a square around the dartboard, and count how many holes there are in the square wall around it, versus how many in the dartboard!

You’re not going to get a super-precise value for \pi – but you might be surprised just how good you can get!

That’s the basic idea of Monte Carlo simulation: you’ve got a problem that’s hard to compute, or one where you don’t know a closed-form solution to make it easy to compute. Getting the answer some other way is intractable, because it requires more work than you can reasonably do. But you’ve got an easy way to do a test – like the “is it in the circle or not” test. So you generate a ton of random numbers, and use those, together with the test, to do a sequence of trials. Then using the information from the trials, you can get a reasonable estimate of the value you wanted. The more trials you do, the better your estimate will be.
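Here’s what that looks like in practice – a minimal Python sketch of the dart-throwing estimate of \pi, sampling points in the unit square and testing membership in the quarter circle:

```python
import random


def estimate_pi(trials: int, seed: int = 42) -> float:
    """Monte Carlo estimate of pi: the fraction of random points in the
    unit square that land inside the quarter circle approaches pi/4."""
    rng = random.Random(seed)   # seeded, so the "experiment" is reproducible
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:        # the "is it in the circle?" test
            hits += 1
    return 4.0 * hits / trials


approx = estimate_pi(100_000)
# with 100,000 trials, approx typically lands within a few hundredths of pi
```

The error shrinks like one over the square root of the number of trials, which is exactly the “more trials, better estimate” behavior described above.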

The more you understand the probability distribution of the space you’re sampling, the better your estimate will be. For example, in the \pi example above, we assumed that the grains of sand/darts would be distributed equally all over the space. If you were using the dartboard in a bar, the odds are that the distribution wouldn’t be uniform – there’d be many more darts hitting the dartboard than hitting the wall (unless I was playing). If you assumed a uniform distribution, your estimate would be off!

That’s obviously a trivial example. But in reality, the Monte Carlo method is incredibly useful for a wide range of purposes. It was used during World War II by the Manhattan project to help design the first atom bomb! They needed to figure out how to create a critical mass that would sustain a nuclear chain reaction; to do that, they needed to be able to compute neutron diffusion through a mass of radioactive uranium. But that’s a very hard problem: there are so many degrees of freedom – so many places where things could proceed in several seemingly (or actually!) random directions. With the computers they had available to them at the time, there was absolutely no way that they could write a precise numerical simulation!

But, luckily for them, they had some amazing mathematicians working on the problem! One of them, Stanislaw Ulam, had been sick, and while he was recovering, fooled around with some mathematical problems. One of them involved a solitaire game called Canfield. Ulam wanted to figure out how often the game of Canfield was winnable. He couldn’t figure out how to do it analytically, but he realized that since the deals of a shuffled deck are uniformly distributed, if you were to take a computer and make it play through 1000 games, the number of times that it won would be a pretty good estimate of how often the game was winnable in general.

In that case, it’s obvious that a complete solution isn’t feasible: there are 52! possible deals – roughly 8\times 10^{67}! But with just a couple of hundred trials, you can get a really good estimate.

Ulam figured that out for the card game. He explained it to John von Neumann, and von Neumann realized that the same basic method could be used for the neutron diffusion process!

Since then, it’s been used as the basis for a widely applicable approach to numeric integration. It’s used for numerous physics simulations, where there is no tractable exact solution – for example, weather prediction. (We’ve been able to get reasonably accurate weather predictions up to six days in advance, using very sparse input data, by using Monte Carlo methods!) It’s an incredibly useful, valuable technique – and anyone who suggests that using Monte Carlo is in any way a half-assed solution is an utter jackass.

I’ll finish up with a beautiful example – my favorite example of combining analytical methods with Monte Carlo. It’s another way of computing an estimate of \pi, but it gets a more accurate result with fewer trials than the sand/darts method.

It’s based on a problem called Buffon’s needle, first posed by the Comte de Buffon during the 1700s. He asked: suppose I drop a needle onto a panelled wood floor. What’s the probability that the needle will fall so that it crosses one of the joints between different boards?

Using some very nice analytical work, you can show that if the panels have uniform width t, and the needle has length l, then the probability of a needle crossing a line is: \frac{2l}{\pi t}. That gives us the nice property that if we let l = \frac{t}{2}, then the probability of crossing a line is \frac{1}{\pi}.

Using that, you can do a Monte Carlo computation: take a sheet of paper, and a couple of hundred matchsticks. Draw lines on the paper, separated by twice the length of a matchstick. Then scatter the matchsticks all over the paper. Divide the total number of matchsticks by the number that crossed a line. The result will be roughly \pi.

For example – with 200 trials, I got 63 crossing a line. That gives me roughly 3.17 as an estimate of \pi. That’s not half-bad for a five minute experimental estimate!

Squishy Equivalence with Homotopy

In topology, we always talk about the idea of continuous deformation. For example, we say that two spaces are equivalent if you can squish one into the other – if your space was made of clay, you could reshape it into the other just by squishing and molding, without ever tearing or gluing edges.

That’s a really nice intuition. But it’s a very informal intuition. And it suffers from the usual problem with informal intuition: it’s imprecise. There’s a reason why math is formal: because it needs to be! Intuition is great, as far as it goes, but if you really want to be able to understand what a concept means, you need to go beyond just intuition. That’s what math is all about!

We did already talk about what topological equivalence really is, using homeomorphism. But homeomorphism is not the easiest idea, and it’s really hard to see just how it connects back to the idea of continuous deformation.

What we’re going to do in this post is look at a related concept, called homotopy. Homotopy captures the idea of continuous deformation in a formal way, and using it, we can define a form of homotopic equivalence. It’s not quite equivalent to homeomorphism: if two spaces are homeomorphic, they’re always homotopy equivalent; but there are homotopy equivalent spaces that aren’t homeomorphic.

How can we capture the idea of continuous transformation? We’ll start by looking at it in functions: suppose I’ve got two functions, f and g. Both f and g map from points in a topological space A to a topological space B. What does it mean to say that the function f can be continuously transformed to g?

We can do it using a really neat trick. We’ll take the unit interval space – the topological space over the interval from 0 to 1, with the usual absolute-difference metric d(x, y) = |x - y|. Call it U = [0, 1].

f can be continuously deformed into g if, and only if, there is a continuous function t: A \times U \rightarrow B, where \forall a \in A: t(a, 0) = f(a) \land t(a, 1) = g(a).

If that’s true, then we say t is a homotopy between f and g, and that f and g are homotopic.

That’s just the first step. Homotopy, the way we just defined it, doesn’t say anything about topological spaces. We’ve got two spaces, but we’re not looking at how to transform one space into the other; we’re just looking at functions that map between the spaces. Homotopy says when two functions between two spaces are loosely equivalent, because one can be continuously deformed into the other.

To get from there to the idea of transformability of spaces, we need to think about what we’re trying to say. We want to say that a space A can be transformed into a space B. What does that really mean?

One way to say it would be that if I’ve got A, I can mush it into a shape B, and then mush it back to A, without ever tearing or gluing anything. Putting that in terms of functions instead of squishies, that means that there’s a continuous function f from A to B, and then a continuous function g back from B to A. It’s not enough just to have that pair of functions: if you apply f to map A to B, and then apply g to map back, you need to get back something that’s indistinguishable from what you started with.

Formally, if A and B are topological spaces, and f: A \rightarrow B and g: B \rightarrow A are continuous functions, then the spaces A and B are homotopically equivalent – equivalent over squishing and remolding, but not tearing or gluing – if g \circ f is homotopic with the identity function on A, and f \circ g is homotopic with the identity function on B.

That captures exactly the notion of continuous transformation that we tried to get with the intuition at the start. Only now it’s complete and precise – we’ve gotten rid of the fuzziness of intuition.

Multiplying Spaces

When people talk informally about topology, we always say that the basic idea of equivalence is that two spaces are equivalent if they can be bent, stretched, smushed, or twisted into each other, without tearing or gluing. A mug is the same shape as a donut, because you can make a donut out of clay, and then shape that donut into a mug without tearing, punching holes, or gluing pieces together. A sphere is the same shape as a cube, because if you’ve got a clay sphere, you can easily reshape it into a cube, and vice-versa.

Homeomorphism is the actual formal definition of that sense of equivalence. The intuition is fantastic – it’s one of the best informal descriptions of a difficult formal concept that I know of in math! But it’s not ideal. When you take a formal idea and make it informal, you always lose some details.

What we’re going to do here is try to work our way gradually through the idea of transformability and topological equivalence, so that we can really understand it. Before we can do that, we need to be able to talk about what a continuous transformation is. To talk about continuous transformations, we need to be able to talk about some topological ideas called homotopy and isotopy. And to be able to define those, we need to be able to use topological products. (Whew! Nothing is ever easy, is it?) So today’s post is really about topological products!

The easiest way that I can think of to explain the product of two topological spaces is to say that it’s a way of combining the structures of the spaces by adding dimensions. For example, if you start with two spaces each of which is a line segment, the product of those two spaces is a square (or a circle, or an octagon, or …) You started with two one-dimensional spaces, and used them to create a new two-dimensional space. If you start with a circle and a line, the product is a cylinder.

In more formal terms, topological products are a direct extension of cartesian set products. As the mantra goes, topological spaces are just sets with structure, which means that the product of two topological spaces is just the cartesian product of their point-sets, plus a combined structure that preserves attributes of the original structure of the spaces.

Let’s start with a reminder of what the cartesian product of two sets is. Given a set A and a set B, the cartesian product A \times B is defined as the set of all possible pairs (a, b), where a \in A and b \in B. If A=\{1, 2, 3\} and B=\{4, 5\}, then A\times B = \{ (1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5)  \}.

In category theory, we take the basic idea of the cartesian product, and extend it to a general product of different mathematical objects. We do this by using the idea of projections. In this model, instead of saying that the product of sets A and B is a set of pairs (a, b), we say that the product is a set S of objects, together with two functions P_A : S \rightarrow A and P_B : S \rightarrow B. (To be complete, we’d need to add some conditions, but the idea should be clear from this much.) Given any object in the product set S, P_A will give us the projection of that object onto A. This becomes more interesting when we consider sets of objects: the A-projection of a collection of points from the product set S is the shadow that those points cast onto the set A.

A topological product is easiest to understand with that categorical approach. The set of points in a product space A \times B is the cartesian product of the set of points in A and the set of points in B. The trick, with topologies, is that you need to describe the topological structure of the product set: you need to be able to say what the neighborhoods are. There are lots of ways that you could define the neighborhoods of the product, but we define it as the topology with the smallest collection of open sets that still makes both projections continuous. To understand how we get that, the projections of the category-theoretical approach make it much easier.

Informally, the neighborhoods in the product A \times B are the things whose shadows – their projections onto the topological spaces A and B – are neighborhoods in A and B.

Suppose we have topological spaces A and B. If S is the product topology A \times B, then it has projection functions P_A: S \rightarrow A and P_B: S \rightarrow B.

The projection functions from the product need to maintain the topological structure of the original topologies. That means that each projection function must be continuous. And that, in turn, means that the inverse image of every open set under a projection function is an open set. So: for each open set O in A, P_A^{-1}(O) is an open set in S.

Let’s look at an example. We’ll start with two simple topological spaces – a cartesian plane (2d), and a line (1d). In the plane, the neighborhoods are open circles; in the line, the neighborhoods are open intervals. I’ve illustrated those below.


The product of those two is a three dimensional space. The neighborhoods in this space are cylinders. If you use the projection from the product to the plane, you get open circles – the neighborhood structure of the plane. If you use the projection from the product to the line, you get open intervals – the neighborhood structure of the line.


One interesting side-point here. One thing that we come across constantly in this kind of formal math is the axiom of choice. The AoC is an annoying bugger, because it varies from seeming obviously true to seeming ridiculously false. Topological products are one of the places where it seems obviously true. The axiom of choice is equivalent to the statement that given a collection of non-empty topological spaces, the product space is not empty. Obvious, right? But then look at the Banach-Tarski paradox.