Monthly Archives: March 2009

I Get Mail: Iterative Compression

Like a lot of other bloggers, I often get annoying email from people. This week, I’ve been dealing with a particularly annoying jerk, who’s been bothering me for multiple reasons. First, he wants me to “lay off” the Christians (because if I don’t, God’s gonna get me). Second, he wants to convince me to become a Christian. And third, he wants to sell me on his brilliant new compression scheme.

See, aside from the religious stuff, he’s a technical visionary. He’s invented a method where he can take a source document, and repeatedly compress it, making it smaller each time.

This is a stupid idea that I’ve seen entirely too many times. But instead of just making fun of it, I thought it would be interesting to explain in detail why it doesn’t work. It touches on a bunch of basic facts about how data compression works, and provides a nice excuse for me to write a bit about compression.

[Figure: basic-compression.png, a diagram of the basic compression system]

The basic idea of data compression is that you’re eliminating redundancies in the original text. You can’t discard information. Mathematically, a compression function is an invertible function C from an array of characters to an array of characters (or you could use bits if you prefer), such that if y=C(x), then on the average input, the length of y is smaller than the length of x.

An ideal compression system is one where for all possible values of x, C(x) is shorter than x. C is your compressor; and since C is reversible, it has a unique inverse function C⁻¹ such that C⁻¹(C(x)) = x. An illustration of this basic compression system is in the diagram to the side.
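To make the definition concrete, here's a minimal sketch in Python. It uses zlib purely as a stand-in for C (my choice for illustration, not the emailer's scheme), checks that decompressing gives back the original, and then counts bit strings: there are 2^n strings of exactly n bits, but only 2^n - 1 strings that are shorter. The helper names are made up for this example.

```python
import zlib


def compress(x: bytes) -> bytes:
    """C: a stand-in compressor (zlib, chosen only for illustration)."""
    return zlib.compress(x)


def decompress(y: bytes) -> bytes:
    """The inverse of C: recovers exactly the bytes that were compressed."""
    return zlib.decompress(y)


def count_strings_of_length(n: int) -> int:
    """Number of distinct bit strings that are exactly n bits long."""
    return 2 ** n


def count_shorter_strings(n: int) -> int:
    """Number of distinct bit strings with fewer than n bits (lengths 0 to n-1)."""
    return sum(2 ** k for k in range(n))


# Highly redundant input: the kind of data a compressor does well on.
redundant = b"abc" * 1000
packed = compress(redundant)

print(len(redundant), len(packed))          # original size vs. compressed size
print(decompress(packed) == redundant)      # invertibility: nothing was discarded
print(count_strings_of_length(8), count_shorter_strings(8))
```

Since there are always fewer shorter strings than strings of length n, an invertible function can't map every input of length n to something shorter. Two inputs would have to share an output, and then you couldn't invert it.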


Commenting Problems

Just a quick status notice: a bunch of commenters have been having problems with the system demanding authentication to be able to comment. I’m trying to fix it with the help of the SB tech folks. My first attempt made things worse, and made it impossible for anyone to comment. I’m trying to re-enable comments now, but since I’m not sure what disabled them, I’m not sure what will work. Commenting using TypeKey authentication will be re-enabled ASAP, and commenting without authentication will be re-enabled as soon as the SB techs can figure out what’s causing the authentication requirement.

Tax Thresholds: Why the horror stories about the Obama tax plan are lies

Watching news reports about President Obama’s proposed tax changes,
I’ve seen a number of variations on a very annoying theme, which involves
a very stupid math error.

A typical example is this story on ABC News, which contains a non-correction
correction:

President Barack Obama’s tax proposal — which promises to increase taxes for those families with incomes of $250,000 or more — has some Americans brainstorming ways to decrease their pay in an attempt to avoid paying higher taxes on every dollar they earn over the quarter million dollar mark.

A 63-year-old attorney based in Lafayette, La., who asked not to be named, told ABCNews.com that she plans to cut back on her business to get her annual income under the quarter million mark should the Obama tax plan be passed by Congress and become law.

“We are going to try to figure out how to make our income $249,999.00,”
she said.

“We have to find a way out where we can make just what we need to just under the line so we can benefit from Obama’s tax plan,” she added. “Why kill yourself working if you’re going to give it all away to people who aren’t working as hard?”

The original version of this article continued to follow this basic theme. The
updated article pretends to correct it, while still basically maintaining the
same focus.

The idea behind this, and similar stories, is that raising the income tax
rate on people earning over $250,000 per year creates a threshold, where earning
more than that threshold will result in your taking home less
after-tax pay than if you earned less.
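To see why that's a math error, here's a rough sketch of how a marginal rate works. The threshold is the $250,000 from the story, but the two rates are made-up numbers for illustration, not the actual rates in the proposal, and the function names are mine. The point is only that the higher rate applies to the dollars over the line, not to every dollar you earn.

```python
# Hypothetical numbers for illustration only; these are NOT the real rates
# from any tax plan. Only the $250,000 threshold comes from the story.
THRESHOLD = 250_000
RATE_BELOW = 0.33   # assumed rate on income up to the threshold
RATE_ABOVE = 0.36   # assumed rate on income above the threshold


def tax_owed(income: float) -> float:
    """Tax under a two-rate marginal scheme: each rate applies to its own slice."""
    below = min(income, THRESHOLD)
    above = max(income - THRESHOLD, 0)
    return below * RATE_BELOW + above * RATE_ABOVE


def take_home(income: float) -> float:
    """After-tax pay."""
    return income - tax_owed(income)


for income in (249_999, 250_000, 250_001, 300_000):
    print(f"{income:>9,} -> take-home {take_home(income):,.2f}")
```

With these assumed rates, the dollar that takes you from $250,000 to $250,001 costs 36 cents in tax and leaves you 64 cents richer. Take-home pay keeps going up as income goes up; there's no cliff at the threshold.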


Basics: Significant Figures

After my post the other day about rounding errors, I got a ton of
requests to explain the idea of significant figures. That’s
actually a very interesting topic.

The idea of significant figures is that when you’re doing
experimental work, you’re taking measurements – and measurements
always have a limited precision. The fact that your measurements – the
inputs to any calculation or analysis that you do – have limited
precision, means that the results of your calculations likewise have
limited precision. Significant figures (or significant digits, or just “sigfigs” for short) are a method of tracking measurement
precision, in a way that allows you to propagate your precision limits
throughout your calculation.

Before getting to the rules for sigfigs, it’s helpful to show why
they matter. Suppose that you’re measuring the radius of a circle, in
order to compute its area. You take a ruler, and eyeball it, and end
up with the circle’s radius as about 6.2 centimeters. Now you go to
compute the area: π=3.141592653589793… So what’s the area of the
circle? If you do it the straightforward way, you’ll end up with a
result of 120.76282160399165 cm².

The problem is, your original measurement of the radius was
far too crude to produce a result of that precision. The real
area of the circle could easily be as high as 128, or as low as
113, assuming typical measurement errors. So claiming that your
measurements produced an area calculated to 17 digits of precision is
just ridiculous.
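Here's a small sketch of that circle example in Python. The ±0.2 cm error is my assumption for what an eyeballed ruler reading might be off by (it roughly reproduces the 113 to 128 range above), and `round_to_sig_figs` is a helper name I made up for the example.

```python
import math


def circle_area(radius_cm: float) -> float:
    """Area of a circle in square centimeters."""
    return math.pi * radius_cm ** 2


def round_to_sig_figs(value: float, sig_figs: int) -> float:
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig_figs - 1 - exponent)


measured_radius = 6.2   # cm, eyeballed with a ruler
assumed_error = 0.2     # cm, an assumed measurement error for illustration

naive = circle_area(measured_radius)
low = circle_area(measured_radius - assumed_error)
high = circle_area(measured_radius + assumed_error)

print(naive)                        # the "straightforward" result, far too many digits
print(low, high)                    # range of areas consistent with the measurement
print(round_to_sig_figs(naive, 2))  # two significant figures, matching the radius
```

The radius was measured to two significant figures, so the honest way to report the area is to about two significant figures as well: roughly 120 cm², not a 17-digit number.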
