Showing posts with label Mathematics. Show all posts

Thursday, January 12, 2012


No mean feat

Physicist Stephen Hawking turned 70 last weekend, and has been living with ALS — amyotrophic lateral sclerosis — for nearly 50 years. Usually, the disease is diagnosed in patients over 50, and they die within a few years. I was reading an article in Scientific American about Dr Hawking’s longevity. The article contains an edited interview with Dr Leo McCluskey, an ALS expert at the University of Pennsylvania.

One answer, in particular, struck me:

Sci Am: What has Stephen Hawking’s case shown about the disease?

Dr McCluskey: One thing that is highlighted by this man’s course is that this is an incredibly variable disorder in many ways. On average people live two to three years after diagnosis. But that means that half the people live longer, and there are people who live for a long, long time.

The mathematician in me rose up at that: no, on average does not mean that half the samples are on each side of the average. Average refers to the arithmetic mean — take a bunch of numbers, add them, and divide by the count (how many numbers you added) — and it’s easy to show, by example, how that’s wrong.

Suppose we had five patients with ALS. Suppose four of those patients lived for one year following diagnosis, and one lived for eleven years. 1 + 1 + 1 + 1 + 11 = 15, and 15 / 5 = 3. So on average, people in this sample lived for three years... and only one of the five (20%) survived beyond the first year. Given Dr Hawking’s experience of on the order of 50 years, he could offset about 25 patients who succumbed after one year, and still give us a three-year average.

The problem with the arithmetic mean is that it’s easily skewed by outliers. To take an extreme example: if 96% of the samples are 1 and 4% are 50, we get an average of about 3 — three times the normal value. That means that with such a situation, the average is useless in giving us any reasonable prediction of what to expect. More generally, if the numbers are widely variable, the average doesn’t tell us anything useful. If we have nine patients who made it through 1, 2, 3, 4, 5, 6, 7, 8, and 9 years, respectively, what do we tell the tenth patient who shows up? 5 years, on average, sure, but, really, we might as well tell him to take a wild guess.

Averages are useful when the values tend to cluster around the arithmetic mean, particularly when the number of samples is large. They’re also helpful in analyzing trends, when we look at the change in the average over time... but, again, we have to be careful that a new outlier hasn’t skewed the average. Sometimes we adjust averages to try to compensate for the outliers — for example, we might eliminate the top and bottom 5% of the samples before taking the average.

Another common error is to confuse the mean with the median. The latter is often used in financial reporting: median income, median purchase price for houses, and so on. The median is a completely different animal from the mean. It’s, quite simply, the middle value. List all the sample values in increasing order, and pick the one in the middle (or one of the two in the middle, if the number of values is even).

In the first example above,[1] if we write the values as 1, 1, 1, 1, 11, the median is the middle value: 1. In the second example, we have 1, 2, 3, 4, 5, 6, 7, 8, 9, for a median of 5. You can see that in the first case, the median is not related to the mean, while in the second case it’s the same as the mean. It’s also the case that the mean (or average) is an artificial value that might not appear in the samples, whereas the median is, by definition, one of the sample values.
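If you’d rather let a computer do the arithmetic, a few lines of Python (using the standard statistics module) check both examples:

from statistics import mean, median

skewed = [1, 1, 1, 1, 11]             # four short survivors, one outlier
spread = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # widely spread values

print(mean(skewed), median(skewed))   # 3 and 1: the outlier drags the mean up
print(mean(spread), median(spread))   # 5 and 5: here the two coincide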

Also by definition, at least half the sample values are greater than or equal to the median (and at least half are less than or equal to it). In other words, Dr McCluskey’s statement would have been true (at least close enough) had he been talking about the median survival period, rather than the average. Medians are also less susceptible to skewing by outliers, as you can see from the first example.

But as the second example shows, when the numbers are all over the place, neither is of much use in predicting anything.


[1] My examples use small numbers of values for convenience. In reality, both mean and median require a fairly large sample size to be useful at all.

Monday, March 21, 2011


The ultimate Hamantash

Long-time readers are well aware of three things, at least, about me:

  1. I’m flamingly atheist; I think religion is silly, at best, but
  2. I grew up in a Jewish family; also,
  3. I’m a math geek.

The combination of the first two can sometimes be a little odd. There are many ways in which the Jewish culture stuck, though the belief system never took hold at all. I love Passover seders, for instance, especially if I can be irreverent about them (more about that in a few weeks). I look forward to certain traditional foods (while at the same time relishing shellfish and anything related to pork). That sort of thing.

One favourite food has always been Hamantashen: triangular pastries associated with the Jewish festival of Purim, filled with poppy-seed paste or fruit filling (prune, cherry, apricot, or raspberry, usually). They’re little hand-held, individual fruit pies, and well-made ones are true delights.

It’s Purim now (well, this past weekend), and the Hamantashen are in the air. And Seattle food blogger Deborah Gardner has tied it all in with the math-geek bit to make the ultimate Hamantash (that’s the correct singular; Hamantashen is plural (I’m a language geek, too, remember)): the Sierpinski Hamantash, modelled on the Sierpinski triangle.

The Sierpinski Hamantash

Doesn’t it look grand? חג שמח! (Happy holiday!)

Monday, March 14, 2011


Happy Pi Day


[Thanks to commenter "D." for the image.]

Friday, March 11, 2011


Mathematics and advertising

I occasionally post here about abuse of mathematics (such as this post, and this one, and this one). I occasionally post about silly advertising (try here and here). And sometimes I get to combine them in an item about abuse of mathematics in advertising (this is a good example).

The other day, the excellent web comic XKCD covered the combo very nicely, and included some of the things I often whine and get huffy about:

Mathematically Annoying Advertising

Hold the mouse over the comic to see Randall Munroe’s extra text, present in most of his drawings. Click the comic to visit his web page. And while you’re at it, you might check out my other favourite XKCD comics: Sandwich (it’s a Unix joke), Snopes, Correlation, Can’t Sleep (warning: only extreme geeks will get this), and Spinal Tap Amps.

Thursday, October 07, 2010


Least common denominator

For another comment about something a recent speaker said: yesterday, a presenter made a reference to least common denominator, and included a graphic that showed the fraction 9 / 12, rewrote it as (3 × 3) / (4 × 3), and concluded with 3 / 4. There are two problems with the graphic.

One is that it’s gratuitous. It has nothing to do with the colloquial meaning of least common denominator, which doesn’t relate to fractions or mathematics at all. In English rhetoric, it refers to a common kernel that can serve or satisfy everyone involved. Alternatively, it can be used disparagingly to refer to someone or something from which every distinguishing and distinguished characteristic has been removed, leaving only a common bit that’s dull and useless.

Some presenters seem to like sticking graphics on every PowerPoint slide they show — sometimes several per slide — whether or not the graphics add anything to the understanding of the slides. Presenters who do that think the graphics make their presentations snazzier.

They don’t.

But the other problem with the graphic is from a mathematical point of view: it’s not illustrating the concept of least common denominator at all. It’s an illustration of greatest common factor. When we reduce a fraction, as in the graphic, we find the greatest common factor of the numerator (the top of the fraction) and the denominator (the bottom) — the largest number we can find that goes evenly into both numbers, that divides both numbers with a remainder of zero. When we cancel that greatest common factor out, what’s left is the fully reduced fraction.

We use the least common denominator to compare (or add or subtract) two or more fractions.

Which is greater?: 5 / 12 ... or ... 9 / 20 ?

To answer that using fractions, we need to convert them into fractions with a common denominator, and we customarily use the least common denominator — the smallest number that is a multiple of both denominators. In this case, 12 = 4 * 3, and 20 = 4 * 5, so the least common denominator would be 4 * 3 * 5 = 60. Multiply both the numerator and denominator by the same amount, and we get 5 / 12 = 25 / 60, and 9 / 20 = 27 / 60. And, so, because 25 is less than 27, 5 / 12 is less than 9 / 20. And the difference between the two is (27 - 25) / 60 = 2 / 60 = 1 / 30 (which we reduced by finding the greatest common factor of 2 and 60).
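For those who like to check such things in code, here’s a little Python sketch of both ideas — the greatest common factor for reducing, and the least common denominator for comparing:

from math import gcd

def lcd(a, b):
    # least common denominator = least common multiple of the denominators
    return a * b // gcd(a, b)

print(gcd(9, 12))                    # 3: cancel it to reduce 9/12 to 3/4

d = lcd(12, 20)                      # 60
print(5 * (d // 12), 9 * (d // 20))  # 25 and 27, so 5/12 < 9/20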

I have no quibble with the colloquial use of least common denominator as a language idiom, with a meaning that doesn’t relate to the mathematical one (though I do think the usage is trite). But when you bring mathematics into it, please get the maths right.

Saturday, May 22, 2010


An unbelievable deal!

Sale price for almond milk

I took the photo on the right (click to enlarge) with my BlackBerry’s camera (so please forgive its being somewhat blurry) at the local Whole Foods store. It depicts an incredibly good[1] sale price on some one-quart packages of almond milk:

Sale!   3 for $6.00
Regularly   $1.99 [each]

The unit pricing helpfully tells us that 3 one-quart boxes for 6 dollars makes it $2.00 per quart. We don’t need the unit pricing to compute that the regular, non-“Sale!” price comes to $1.99 per quart.

One wonders whether it’s possible to decline the sale price at the register.


[1] Yes, that’s “unbelievable” as in not believable, and “incredible” as in not credible.

Monday, April 19, 2010


Marginal tax brackets

As we’ve just had income tax time in the U.S., I’ve been hearing a bunch of silliness associated with it. I thought I’d talk about two of my favourite (well, for some value of “favourite”) tax fallacies.

It doesn’t pay for me to get a raise! It’ll just put me in a higher tax bracket, I’ll pay more taxes, and in the end I’ll wind up making less than I did before.

I haven’t worked out every possibility, and maybe there’s really a way that can happen, but I can’t imagine what it could be. The “tax brackets” in the U.S. are marginal — you only pay the higher tax on the portion of your income that put you into the next bracket.

Let’s suppose you’re single and your taxable income is $33,000 in 2009. You’re in the 15% tax bracket. A 3% increase in your taxable income would make it $33,990, and that would put you into the 25% tax bracket. 15% tax on $33,000 would be $4,950. 25% tax on $33,990 would be $8,497.50. The difference is more than $3,500, well more than your rise in pay!

Only, that’s not the way the tax works. In fact, for 2009 you pay 10% on the first $8,350 of taxable income, and only pay the 15% on the amount between that and $33,950. And the 25% is only levied on any amount between $33,951 and $82,250.

So, in the example above, the tax bill at $33,000 of taxable income would be 10% of $8,350 ($835) plus 15% of $24,650 ($3,697.50), totalling $4,532.50. And the tax on $33,990 would be 10% of $8,350 ($835) plus 15% of $25,600 ($3,840), plus 25% of only $40 ($10), totalling $4,685. The guy’s taxes went up by $152.50. [Corrected; thanks, Ray, for pointing out privately that I put the wrong number here.]

I’m sure he wasn’t thrilled to pay that extra money, but it didn’t come anywhere close to eating up the whole $990 raise: he still came out with more than $800 more in his pocket.
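If you want to play with the numbers yourself, here’s a small Python sketch of the 2009 single-filer brackets quoted above (just the first three; the higher brackets are omitted):

brackets = [(8350, 0.10), (33950, 0.15), (82250, 0.25)]

def tax(income):
    owed, floor = 0.0, 0
    for ceiling, rate in brackets:
        owed += rate * (min(income, ceiling) - floor)
        if income <= ceiling:
            return owed
        floor = ceiling
    raise ValueError("income above the brackets listed here")

print(tax(33000))                # 4532.50
print(tax(33990) - tax(33000))   # 152.50: far less than the $990 raise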

Variant 1: Yay! I’m getting a big refund this year. That’s great!

Variant 2: Oh, no! I have to pay the feds $1000. That’s horrible! I’d rather get money back.

When you get a tax refund, you have given the government a free loan of your money. On the other hand, if you get more in each paycheck and wind up paying a bit of money at tax-filing time, then you have had the use of more of your money in the interim. Doesn’t the latter seem to make more sense, when you look at it that way?

À chacun son goût (to each his own), of course, but I’d much rather pay a little bit in April, knowing that it means that my money was mine that much longer.

Some people worry that if they set it up so that they owe money, when it comes time to pay it they won’t have it to hand. An easy way to fix that is to open a bank account just for that purpose, and “withhold” your own taxes — put a certain amount into that account with each paycheck. You still have the money, and it’s earning interest for you. You can invest it any way you like, as long as you can turn it into cash by 15 April.

Wednesday, March 24, 2010


Exponential

I hear talk all the time about exponential increases. Consider three very recent examples from the New York Times, here:

“Since the 1960s there has been an exponential increase in artists working with maps,” she adds, using Jasper Johns’s “Map,” from 1963, by way of example.

here:

[...] historically black colleges and universities actually serve more students today than they did 50 years ago because there has been an exponential growth in students pursuing higher education [...]

and here:

But he said first responders generally did not have enough training to deal with diversions that could be “almost exponential” compared with those faced by most drivers.

I’ll note that two of those are quoting someone, and the third is in a blog, for which the Times has slightly more relaxed standards concerning informal writing. But the point isn’t to pick on the Times, but to note how often people refer to “exponential” growth or increase.

Here’s the thing: “exponential” isn’t just a fancy or intensive way to say “large”. It has a particular meaning.

In grade school, we were given a problem to work out: Suppose your parents gave you a penny on the first day of January, two pennies on the second day, four on the third, eight on the fourth, and so on, doubling the number of pennies each day. On January 31st, how many pennies would you have, all told?

When you’re in the sixth grade, that’s not an easy problem. Of course, any computer geek can solve that instantly now: on day n you get 2ⁿ⁻¹ pennies, and you have a total of 2ⁿ − 1 pennies collected. So your final total at the end of January is 2³¹ − 1, which is 2,147,483,647 — more than two billion pennies.

We were astounded by that answer, once we got it. And that is exponential growth.
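Any modern language makes the check trivial. In Python, say:

total = 0
for day in range(1, 32):        # January 1st through 31st
    total += 2 ** (day - 1)     # pennies received on that day

print(total)                    # 2147483647, which is 2³¹ − 1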

Strictly speaking, exponential growth requires an interval, some repetitions of that interval, and growth over each of those intervals by a common factor. It can be a doubling every day for a month, as in the problem with the pennies. It might be a five-fold increase each year for several years. The point is that the thing being measured is increasing in the exponent, a quantity of n in the first interval, n² in the next, n³ in the next, and so on.

In fact, exponential growth doesn’t even have to imply fast or large growth. Note that if a population increases by 1% every decade... that’s exponential growth, though no one would really think of it that way. In this case, the interval is ten years, and the common factor is 1.01.

In practice, of course, it’s often not that straightforward, and we allow for variations and approximations. But we must insist on the interval thing, and of a pattern of applicable growth over some reasonable number of intervals (it won’t do to just say that something doubled, and to call that “exponential growth”; one interval is not enough to establish a pattern).

The first New York Times example, the quote about artists working with maps, has a starting point, but no interval. There may have been a large increase in the number of art works involving maps, but without more information we can’t call it “exponential”. The same is true for the blog entry about black university students. If we had data to back it up, it would be valid to say that over the last 50 years there’s been an exponential increase each decade. But as it stands, he just means that it’s gone up by a lot.

And I can’t even decide what the third example means, to say that “diversions” are “almost exponential”. I suppose we might say something like that if the number of diversions that first responders have to deal with is the square or the cube of those bedeviling others — for every four diversions you and I have, a first responder has to cope with sixteen, or maybe 64. But that seems awkward, as well as unlikely.

The guy just means that they have a great deal more diversions (I would say “distractions”) than the average driver.

When that’s what we mean, that’s what we should say.

Wednesday, February 17, 2010


It increased by how much?

In yesterday’s meetings, someone presented some data, showing us rates of sending spam, month by month, broken down by country. The summary page showed the change from six months ago to last month. Here’s an example of the summary table, with made-up numbers:

Spam by country, as % of total

Country     Jul 09   Jan 10   Change
Parador     39%      17%      -22%
Slobovia    4%       8%       +4%
Gumana      2%       < 1%     -1%

Maybe you already see the problem with this.

A change from 4% to 8% does not represent a 4% increase. It’s a 100% increase — the spam from Slobovia has not gone up by 4% in six months, it’s doubled. And similarly for the other numbers: Parador showed a 56.4% decrease.

What is true is that Slobovia’s contribution to the total has increased by 4 percentage points (which is not the same as saying that it’s increased by 4 percent). And if it’s clear that that’s what you’re saying, it’s a fine thing to say... we can then debate which number is more useful, and the answer to that will depend upon what we’re using the numbers for.
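A few lines of Python make the distinction concrete (the 0.9 below stands in for Gumana’s “< 1%” cell, just to have a number to compute with):

shares = {"Parador": (39, 17), "Slobovia": (4, 8), "Gumana": (2, 0.9)}

for country, (jul, jan) in shares.items():
    points = jan - jul                   # change in percentage points
    relative = 100 * (jan - jul) / jul   # percent change
    print(f"{country}: {points:+.1f} points, {relative:+.1f}%")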

But it’s very misleading to list those numbers in a table like that, and entirely wrong to say that “the amount of spam from Slobovia has increased by 4% since last July.”

Gotta be careful with percentages.

Wednesday, December 23, 2009


Grace periods and other allowances

I heard on the radio Monday that the New York City Council had just overridden Mayor Bloomberg’s veto on the proposed 5-minute grace period for parking enforcement. The mayor thinks it will be hard to enforce and confusing. The citizens and the City Council have this to say:

Residents have long groused about what they say are overzealous traffic agents who rigidly hand out tickets without any chance for a reprieve. Some believe that the tickets are part of a push by city officials to squeeze as much revenue from any and all sources, at a time when the city is staring at a multi-billion dollar deficit.

The traffic agents (New York City uses a civilian force to write parking tickets) may be overzealous, indeed, but a “grace period” will change nothing in that regard. If you think you have 45 minutes on the meter, and come back in 47 minutes to see an agent placing a ticket under your wiper... will it really be any different if you think you have 50 minutes and come back in 52 to the same scene?

Limits of any sort — deadlines, tolerances, room capacities, and such — are somewhat arbitrary, but if they’re to be enforced there must be a definition of the enforcement point. If we codify any sort of “grace”, then the inclusion of that grace defines the new limit, subject to the same complaints of zealotry as before. We may say that a room can hold up to 110 people, and stop the 111th from joining, and you may say, “But it’s only one more; surely that’s not going to hurt anything.” OK, so let’s suppose we don’t enforce it until there are more than 115 in the room. Well, now, 116 is only one more. Surely that will be OK as well.

We’re only fooling ourselves to define a grace period, and it will actually turn out to benefit no one — not the city, and not the residents. Once people are used to it, they’ll overstay their deadlines just as before, and they’ll complain about the enforcement, just as before.

Monday, July 06, 2009


Scientists, versus “normal” people

At Educated Guesswork, my IETF colleague Eric Rescorla notes how his ears “pop” when his car is moving fast and he closes the sun-roof:

I noticed the other day that if I’m driving my car on the freeway and close the sunroof my ears pop.
Now, Eric’s a scientist, so what’s the first thing he does about this?

He develops a hypothesis, of course:

After a bit of thinking, I concluded that what was going on was the Bernoulli effect: the air flowing over the sunroof lowers the pressure of the interior of the car. Then when you close it you get a sudden pressure change back to ambient pressure.

And next? Well, one has to test one’s hypothesis, right?

Initial experiments confirm this: my Polar 625SX [heart-rate monitor] has a built-in barometric altimeter. I repeatedly opened and closed the sunroof and watched the altimeter and readings seemed to consistently differ by about 75 feet. Obviously, there’s some uncertainty here because the road isn’t totally flat; if you wanted to be really sure you’d go over the same sections of the road again and again with the sunroof open and closed and measure the difference. Still, since I’m not exactly publishing this in Nature, it seems good enough for now.

Tom Lehrer, in his preamble to his song about Lobachevsky, notes that, “some of you may have had occasion to run into mathematicians and to wonder, therefore, how they got that way.” The same is true of any sorts of scientists. We’re a strange lot, to those who don’t see a need to contemplate an explanation for every little oddity we notice.

Some years ago, our work group at the office was planning to order some pizza to be delivered for a working lunch. The co-worker who organized the order came to me and said that she was going to order some set of large and small pizzas, depending upon how much folks wanted to eat. She asked me how many slices I wanted.

“I don’t know,” I replied. “It depends whether they’re slices from a large pizza, or from a small one.”

“I’ll figure that out when I have the count of slices. How many do you want?”

“But,” I persisted, “they’re not the same size. I don’t know, until I know the size of the slices.”

She rolled her eyes. “Just tell me how many slices you want.”

I said two, but I thought (and probably said aloud) that I might want more if one or both of them were small.

When my colleague left, I went to the white board. Let’s see... a large pizza was 14 inches across (7-inch radius), and is cut into eight slices. A small pie was 12 inches (6-inch radius), cut into six slices. So, for the large:

π × 7² / 8 = 19.24 square inches per slice

...and for the small:

π × 6² / 6 = 18.85 square inches per slice

A slice from a small pizza is only 2% smaller than a slice from a large pie. In other words: they’re about the same, close enough as not to matter.
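In Python, rather than on a white board, the computation looks like this:

from math import pi

large = pi * 7**2 / 8   # 14-inch pie, 8 slices: 19.24 sq in per slice
small = pi * 6**2 / 6   # 12-inch pie, 6 slices: 18.85 sq in per slice

print(100 * (large - small) / large)   # about 2% smaller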

Of course, my co-worker knew that intuitively. I had to be the mathematician, and work it out with a dry-erase marker.

Tuesday, June 23, 2009


Binary multiplication, the computer (and Ethiopian?) way

In the 360 blog (sub-heading, “12 tables, 24 chairs, and plenty of chalk”), blogger Ξ (Xi) recently wrote about “Ethiopian Multiplication” (and followed it up with a series of interesting posts on different ways to multiply, here, here, here, and here, so far).

Here’s the “Ethiopian” version, as Xi tells it:

Here’s the basic idea: Suppose you want to multiply two numbers like 14 and 12. You could use your fingers, of course, but here’s another way:

Start with the two numbers on top. Halve one, ignoring any remainders or fractions, and double the other, stopping when you get to 1.

14 & 12
7 & 24
3 & 48 [See how I ignored the fact that halving 7 leaves 1 left over?]
1 & 96

Now look at the numbers on the right. Some are across from an even number: in this case, 12 is across from the original 14. Ignore those, and add the rest. So we’ll add 24, 48, and 96, which were across from odd numbers, and get 168. And that’s the product! Isn’t that cool?

It’s cool, because it’s binary... but it’s not at all surprising that it works. Look at what we’ve really done: In the left column, we’ve represented the multiplier (14, here) in binary, where even numbers represent 0 and odds represent 1. In the right column, we’ve multiplied the multiplicand (12, here) by the corresponding powers of 2. Thus:
Bit   Halving   Doubling
0     14        12   (12 × 2⁰)
1     7         24   (12 × 2¹)
1     3         48   (12 × 2²)
1     1         96   (12 × 2³)

By “ignoring” the numbers on the right that match even numbers on the left, we’re omitting the entries that correspond to a binary zero.

Think about how we learn to do long multiplication in school. If we multiply 12 by 14 using that method, it looks like this:

      1 2
    × 1 4
    -----
      4 8     (12 × 4 × 10⁰)
    1 2       (12 × 1 × 10¹)
    -------
    1 6 8

We use the visual left-shift on the second line of the answer to represent multiplication by another 10 (the shifted “12” really means “120”).

And that’s exactly what the Ethiopians were doing, except instead of doing it in decimal (powers of ten), they were doing it in binary (powers of 2):

          1 1 0 0     (binary representation of 12)
        × 1 1 1 0     (binary representation of 14)
        ---------
          0 0 0 0     (12 × 0 × 2⁰)
        1 1 0 0       (12 × 1 × 2¹)
      1 1 0 0         (12 × 1 × 2²)
    1 1 0 0           (12 × 1 × 2³)
  -----------------
  1 0 1 0 1 0 0 0     (binary representation of 168)

Now, why the Ethiopians did their multiplication in binary isn’t clear, but one can surmise that it’s easier to learn to double and halve arbitrary numbers than it is to learn the units multiplication table (quick!: What’s 8 times 7?).

But here’s the thing: now that we’re using digital computers, we can readily see that this is essentially how computers do multiplication internally. Basic logic circuits to take binary numbers and add them or shift them are easy, and that’s really all this needs: a one-bit shift to the left doubles, while a one-bit shift to the right halves, and then we add things up at the end (or, really, as we go). So to look at the algorithm a computer would use, let’s assume we have a computer with at least three “registers”, and a “condition” indicator that’s set if we shift a “1” bit off the end of a register.

01 Load first number into R1
02 Load second number into R2
03 Set R3 to zero
04 Shift R1 right one bit
05 If condition is not set, jump to 07
06 Add R2 into R3
07 If R1 is zero, jump to 10
08 Shift R2 left one bit
09 Jump to 04
10 Store R3 as answer

If we’re robust about it, we’ll also check for an overflow after step 06 (the answer is too large for our registers), but that’s basically it. It’s a simple, fast method, requiring the simplest computer hardware instructions.

To see how it works with the 14 * 12 example, we’ll run through it:

Instruction                              R1  R2  R3   cond
01 Load first number into R1             14
02 Load second number into R2            14  12
03 Set R3 to zero                        14  12   0
04 Shift R1 right one bit                 7  12   0   no
05 If condition is not set, jump to 07                no    (jump)
07 If R1 is zero, jump to 10                                (no jump)
08 Shift R2 left one bit                  7  24   0   no
09 Jump to 04
04 Shift R1 right one bit                 3  24   0   yes
05 If condition is not set, jump to 07                yes   (no jump)
06 Add R2 into R3                         3  24  24
07 If R1 is zero, jump to 10                                (no jump)
08 Shift R2 left one bit                  3  48  24
09 Jump to 04
04 Shift R1 right one bit                 1  48  24   yes
05 If condition is not set, jump to 07                yes   (no jump)
06 Add R2 into R3                         1  48  72
07 If R1 is zero, jump to 10                                (no jump)
08 Shift R2 left one bit                  1  96  72
09 Jump to 04
04 Shift R1 right one bit                 0  96  72   yes
05 If condition is not set, jump to 07                yes   (no jump)
06 Add R2 into R3                         0  96 168
07 If R1 is zero, jump to 10                                (jump)
10 Store R3 as answer                     0  96 168

The interesting thing is that if you ask beginning computer students to write a low-level program for multiplication, most of them will use the old lesson that “multiplication is repeated addition”, and do something like this:

01 Load first number into R1
02 Load second number into R2
03 Set R3 to zero
04 If R1 is zero, jump to 08
05 Add R2 into R3
06 Decrement R1
07 Jump to 04
08 Store R3 as answer

The program is two steps shorter... but it will take much longer to run, because the loop is dependent on the value of the number, rather than on the number of bits (that is, the problem is now order n, as opposed to order log(n)).
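For the curious, here’s the shift-and-add method in a few lines of Python — the same algorithm as the register program above, just without the registers:

def shift_and_add(a, b):
    total = 0
    while a:
        if a & 1:        # the bit we’re about to shift off is 1
            total += b   # ...so this power-of-2 multiple contributes
        a >>= 1          # halve (shift right)
        b <<= 1          # double (shift left)
    return total

print(shift_and_add(14, 12))   # 168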

Thursday, June 04, 2009


Hash algorithm strength

In his “Practical Security” column in the May/June issue of IEEE Internet Computing magazine, Stephen Farrell talks about strength of cipher-suites, and the fact that the overall strength is not simply measured by the key length of which the user is aware. It’s a useful thing to understand if you’re responsible for implementing encryption, and it’s something that we often get wrong, using strong asymmetric encryption up front, but weaker symmetric encryption behind it, or the other way around.

As often happens, though, when we try to state things simply, he says something that’s a little confusing, which I’d like to look at more closely here. In talking about the strength of the hash (digest) algorithm, Stephen says this:

Similarly, if a digest operation’s output were small, say only 32 bits long, then an adversary could simply guess possible inputs until it found a matching output, probably only after roughly 65,000 attempts.

The number, 65,000, comes from the fact that 16 bits represents about 65,000 possible values (65,536, to be exact), and that 16 bits are half as many as 32 bits. On the surface, though, this looks like an error. If we were picking random values from a 32-bit range (which represents a bit more than 4 billion possible values), we’d expect the average number of attempts to crack it to be half that value, not half the number of bits: 31 bits, not 16 bits, or about 2 billion attempts.

The trick, though, is that the hash algorithms we’re using have a theoretical collision-resistance strength equal to (at most) half the number of bits in their output. So, theoretically, the strength of a hash algorithm that’s similar to SHA but has a 32-bit output would be 16 bits. And that means that a brute-force attack would crack it in an average of about 65,000 attempts.

But why is the strength only equal to half the number of bits in the first place?

The fault in what we expect is in the intuitive — but incorrect — expectation of the rate at which random values will collide. We’re mapping all plain-text messages into the set of numbers represented by 160 bits for SHA-1, or, in Stephen’s hypothetical example, 32 bits. Let’s simplify it a little, and consider a similar situation. Let’s map all people into the set of numbers from 1 to 366.

In other words, let’s consider people and their birthdays.

The “birthday paradox” — which isn’t really a paradox — is an unexpected result that I talked about in my first “Paradox” post. It hinges on the idea that we expect to need a large number of people before it’s likely that two have the same birthday — that we get, in hash terms, a collision — but that, in fact, we need relatively few.

Comparing it to our hash expectation, we’d figure we’d need 366 / 2, or 183 people, to have a 50% chance of a collision. In fact, though, we only need 23. Thinking of that in terms of bits, it’s a difference between an expected 7.5 bits and an actual 4.5 bits. (You can get lots more information about the birthday paradox at the Wikipedia entry.)
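The number 23 is easy to verify with a little Python, multiplying out the probability that everyone’s birthday is distinct:

def collision_probability(people, days=366):
    p_distinct = 1.0
    for k in range(people):
        p_distinct *= (days - k) / days
    return 1 - p_distinct

print(collision_probability(23))    # about 0.506: just past a 50% chance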

The effect here is the same. Because the probability of two randomly chosen source strings hashing to the same value goes up rapidly, not in the linear way we expect, it turns out that the strength of a hash algorithm with an output of n bits is not 2ⁿ / 2, but, rather, 2^(n/2) — that is, not half the number of values, but half the number of bits.

Wednesday, May 13, 2009


Trigonometry of car doors

Recently, I was discussing the relative virtues of four-door and two-door cars with a friend. I prefer four-door cars, because they make it much easier for back-seat passengers to get in and out (they make it easier to access the back seat, in general). My friend prefers two-door cars, because he seldom has back-seat passengers, and the larger doors of two-door cars make it easier for the front-seat occupants to get in and out.

“But,” I say, “on a two-door car, the doors are larger and heavier. Also, if you’re in a limited space, the doors can’t open as far, and the smaller opening actually makes it harder for the people in the front seat.”

I also recently read Kate Nowak’s post “Introducing Right Triangle Trig”, from her blog f(t) [Function of Time].[1]

Putting the two together, I thought it would be fun[2] to turn the car-door discussion into a trig problem.

Suppose I park my car in a garage. The side of the car is some fixed distance from the garage wall, limiting the amount that the door can open. Figure 1 illustrates that: the car is parked at a distance of a from the wall (because mathematicians like to confuse things by representing everything with incomprehensible letters), the door has a length of d, and when the edge of the door hits the wall, the angle the door makes with the car is t. (Click on the figures to enlarge them.)

Figure 1

Figure 1

What we want to know is how big the opening (o, in figure 1) is, so we know how easy it will be for a person to get in when the door is as far open as it can be. Noting that the closed-door position, the open door, and the opening form a triangle, we bet we can sort this out with trigonometry.

The first step is to drop a line from the tip of the door to the car, perpendicular to the car. The dashed line labelled a in figure 2 forms a right triangle.

Figure 2

Figure 2

With respect to angle t, the opposite side has length a and the hypotenuse has length d. The sine of the angle, sin(t), is the opposite side divided by the hypotenuse, or a/d. So we have our first formula, which we can solve for t:
sin(t) = a / d

t = arcsin(a / d)

Now we’ll note that the two sides of length d form an isosceles triangle with base o. That means that if we bisect angle t we’ll get two congruent right triangles (figure 3).

Figure 3

Figure 3

We now have an angle t/2, a hypotenuse d, and an opposite side o/2, giving us our second formula, which we can solve for o:
sin(t / 2) = (o / 2) / d

o = 2 * d * sin(t / 2)

It’s time to plug in the numbers.

Let’s assume that a four-door car’s front door is 4 feet long, and a two-door car’s door is 6 feet long. And let’s say we park the car 3 feet from the wall. So, for a 4-door car:

a = 36 in, d = 48 in

sin(t) = 36 in / 48 in

t = arcsin(36 / 48) = arcsin(.75) = 48.59 degrees

sin(48.59 / 2) = (o / 2) / 48 in

o = 2 * 48 in * sin(24.295) = 2 * 48 in * .41143 = 39.5 in

And for a 2-door car:
a = 36 in, d = 72 in

sin(t) = 36 in / 72 in

t = arcsin(36 / 72) = arcsin(.5) = 30.00 degrees

sin(30 / 2) = (o / 2) / 72 in

o = 2 * 72 in * sin(15) = 2 * 72 in * .25882 = 37.27 in
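Here’s the whole calculation as a few lines of Python, if you’d like to try other doors and distances:

from math import asin, sin

def opening(a, d):
    # width of the opening when a door of length d hits a wall
    # at distance a from the side of the car
    t = asin(a / d)            # angle (radians) the door makes with the car
    return 2 * d * sin(t / 2)

print(opening(36, 48))   # 39.5 in for the 4-door
print(opening(36, 72))   # 37.3 in for the 2-door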

So, though the door on the two-door car is longer, with the wall restricting it the door will create an opening that’s about two and a quarter inches (5.6%) smaller. That’s actually less of a difference than I’d expected. It was good to do the exercise, which showed that I was right... but not by enough to matter.
 


[1] I love that blog name!

[2] Demonstrating, as it does, why we mathematicians are in such demand at parties.

Wednesday, April 22, 2009


Infinity plus one

John Tierney has included a series of amusing puzzles, recently, in his TierneyLab blog in the New York Times. This week’s has an aspect that I’d like to talk about a bit. Here’s a shortened version of the puzzle, removing the cutesy “Russell Crowe” stuff (go look at the original for the full version):

You arrive at a hotel with an infinite number of rooms, and no vacancy. Yes, despite the infinite number of rooms, they’re all full. Still, the desk clerk does some wangling, and finds a room for you, without throwing any existing guests out or making them share rooms. How did he do it?

I like this one in part because it touches on the sort of stuff I talked about in October, when I covered countable and uncountable sets and the concepts of cardinality and infinity.

The answer that many readers got depends upon the axiom of choice and the assumption that the number of rooms is countably infinite, but that works for the puzzle — and one can make a more rigorous solution for a more general case. So, given those additional assumptions, here’s one way to free up a room:

  1. Because the rooms are countable and we have the axiom of choice, we can number the rooms with consecutive natural numbers starting at 1: 1, 2, 3, 4, ...
  2. Ask each guest to change rooms, moving to the room with the next higher number. The guest in room n will move to room n+1.
  3. Put the new guest in the now-available room 1.

It’s an amusing little demonstration of an aspect of infinite sets. It should be clear that we can repeat this a countably infinite number of times, thus fitting a countably infinite number of new guests into our already-full hotel.

Unfortunately, some people will understand it imperfectly and turn it into some attempt to do finite arithmetic on “infinity”, as commenter PolarBearNJ does (errors as in the original, but emphasis mine):

Well, this time I did some reading before answering...the puzzle is really a question of proving the following: n = n+1. Where N = infinity. How do we prove infinity plus one?

Some solutions suggest it means doubling up the guests temporarily, adding one guest to the first room, and the 2nd person leaves for the 2nd room, repeating this infinite room shifting...

Another solutions suggests it means moving the guest in the first room to the Hallway, and then moving into the 2nd room, forcing the 2nd guest into the Hallway, repeated to infinity...

Apparently, there is no answer that adequately proves n = n+1...but I would like to hear from the Math Wizards.

The problem with PolarBearNJ’s characterization is that “infinity”, as he’s thinking about it, is not a number; it’s a concept. It doesn’t make sense to talk about “infinity plus one” in the same way as “two plus three is five”. And not all “infinities” are the same, as we saw in my October series of posts.

But when we talk about cardinal numbers and the cardinality of sets, it becomes very clear, and it’s easy to see that the mapping f(n) := n + 1 defines a one-to-one correspondence between the set of natural numbers and the set of natural numbers except 1. Both sets have cardinality aleph-null.

And we can define “addition” on cardinal numbers:
Let a be the cardinality of set A, and b be the cardinality of set B, where A and B are disjoint. Define a + b to be the cardinality of the set A ∪ B.

So if N is the set of natural numbers and M is the set of natural numbers except 1, our mapping above (f(n) := n + 1) shows that |M| = |N|. But since N = M ∪ {1}, that means that |M| = |M| + |{1}|, or aleph-null = aleph-null + 1.

Which is sort of what PolarBearNJ was getting at, but not strictly the same thing. We should avoid talking about “infinity” as though it were a number.

Wednesday, April 15, 2009


On customer loyalty programs

In this post, Liz at Everyday Goddess notes that she prefers American Airlines to Southwest for a number of reasons, and is willing to pay more for a ticket on the former. One of her reasons is that she’s accumulating miles on AA’s frequent flyer program and plans to use those miles to get a free trip to Florida this August.

I’ve heard people say that sort of thing often, and it is, of course, why the loyalty programs are there: the airlines (and hotel chains, and so on) hope you’ll do business with them, even to the point of paying more, because you want the benefits the loyalty program provides.

But, while that sounds reasonable on the surface, it doesn’t make much sense when you look at it more closely. I said this in a comment on Liz’s blog, but I wanted to bring it here, too, because it’s so common.

In general, you should not pay more for a flight because you want the miles. Here’s why, using Liz’s example:

LAX (Los Angeles) to ORD (Chicago O’Hare) gives you about 3500 miles, round trip. Let’s say that the AA flight cost $100 more. That means those miles cost you around 3 cents per mile extra (2.857 cents, more precisely) — you didn’t get them for free. You need at least 25,000 miles for a “free” ticket (or 50,000, depending upon when you want to go and what restrictions you have). At $0.03/mile, it will cost you $750 for 25,000 miles (and $1500 for 50,000).

I just priced an AA ticket from LAX to FLL (Fort Lauderdale, FL) in mid-August, and I got a fare of $265. Spending miles that cost you $750 extra to “earn”, to buy a ticket that would cost under $300 over the counter, is not a good deal.
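The arithmetic, in a few lines of Python (the $750 above rounds the per-mile rate up to a flat 3 cents):

extra_fare = 100            # dollars more per LAX-ORD round trip on AA
miles_earned = 3500         # miles for that round trip

cost_per_mile = extra_fare / miles_earned   # about 2.9 cents
print(25000 * cost_per_mile)                # ~$714 "earned" for a $265 ticket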

And this is almost always the case. If you select your airline based on a desire to accumulate miles there, on the whole you will pay far more than the miles are worth. In general, you should select your airline for other reasons (price, flight schedule, on-time record, seat comfort, like or dislike of the way they do business, or whatever matters to you), sign up for every airline’s loyalty program, and let the miles accumulate all around. When one program has enough miles for you to redeem them, then you really will get a free flight out of it.

To be sure, there are times when it’s worth paying a little extra, but it’s important to think about it and do it in an informed way. For example, suppose you’ve amassed 48,000 miles on AA and were careful not to pay extra for them. And you want to redeem miles for a trip soon, but you need 2,000 more. Then it could make sense to pay an extra $100 or $200 to buy your next ticket on AA, so that you have enough for, say, a $500 ticket somewhere. Similarly, you might be willing to pay a little more occasionally to keep your account active, so your existing miles won’t expire.

So if you have other reasons to prefer one airline over another, by all means, let those reasons decide for you. But be wary of doing it because of the frequent-flyer program.

Tuesday, April 07, 2009


Science Talent Search

About a month ago, the winners were announced for the 2009 Intel Science Talent Search. As is usual, several of the top-40 finalists (four, this year, 10%) came from a dueling pair of New York City schools, Stuyvesant High School and Bronx High School of Science. This year, they tied with two each.

1974 Westinghouse Science Talent Search Honors Group booklet cover

I have a personal interest in this, because when I was a high-school senior — 35 years ago, at age 16 — I submitted an abstract-mathematics paper for the 1974 competition, then the Westinghouse Science Talent Search. I was thrilled that it made the semifinals (top 300 papers), and only somewhat disappointed that it didn’t get into the finals (top 40). Two of my classmates at Nova High School in Davie, Florida, also made it into the top 300 “Honors Group”: Jim Azar (now deceased), “Unique Quadrant Graphs and Theoretical Postulates Concerning Psychological Mathematization”; and Steve Peretz, “Aerodynamic Stability Induced by Gyration”. There were at least three other papers submitted that year from Nova, a school that strongly emphasized math and science.

As it turns out, in a search last week for something else, through some old things, I unearthed a photocopy of my paper — perhaps the only extant copy — along with related memorabilia (congratulatory letters from my senators and congressman, and such). I scanned the copy and turned it into a PDF: A Characterization of Divisible Abelian Groups.

The scan brings us back to the technology of the time. The paper was typed on an old, manual typewriter, complete with irregular lines of letters (the hammers didn’t line up perfectly; a higher quality machine would have done better). I had to leave space to hand-write the mathematical symbols. At the top of page 3, I left in a hand-written correction, not wanting to re-type the whole page. And, of course, the idea that one might easily archive this stuff in a PDF file (which I could download to my BlackBerry and read, if I should want to) was unimaginable science fiction in 1974.

The paper itself is something I’m still proud of (obviously; I’m writing about it now), but in retrospect I’m a little surprised that it made the semifinals. It’s not new work. It’s original work of mine, but I’d call it my own survey of existing work. I didn’t prove any theorems that hadn’t already been proved. I didn’t have any new ideas that hadn’t already been published.

On the other hand, I was a 16-year-old doing graduate-level mathematics. And the Science Talent Search people were probably looking for a few mathematics papers to include in what’s mostly slanted toward the hard sciences.

I also have to thank my high school math teacher, Larry Insel, who taught me so excellently, who pushed me to get my paper done and entered, and who helped steer my education in many ways. He always humbly said that it was all me, but I wouldn’t have gone where I’d gone without him. I was very pleased when, last summer, Mr Insel found me online and got in touch with me again.

I hope my readers will forgive this self-indulgent post. Now go read the stuff about this year’s finalists and winners, some of whom will be making scientific breakthroughs 15 or 20 or 30 years from now. And thank their Mr Insels for helping them along.

Tuesday, November 25, 2008


Random is as random does

When I did that “random things meme” recently, I took issue with its misuse of “random”, preferring the word “arbitrary”. That’s because I’m a mathematician and computer scientist (though far more the latter, these days, not having done real mathematics since college), and “random” has a specific meaning in those fields.

New Scientist has an article about recent work by engineers in Japan that fits into the “random” conversation, and calls for more explanation. The engineers have used lasers to generate random numbers:

Now a new method that uses lasers to produce streams of truly random numbers faster than ever before could help improve security at a time when digital traffic and cybercrime are both growing.

Strings of random numbers are used to make secret keys and other parts of encryption protocols. But software that generates random numbers can generally only manage a close approximation to random. Statistical analysis reveals underlying if near-invisible patterns that mean an attacker could predict the sequence and break the code.

At first glance, “truly random” looks like “truly unique”, a meaningless phrase using an inappropriate qualifier. Something is unique, or it isn’t. A number sequence is random, or it isn’t. Isn’t it?

Well, no. To understand that, we have to talk for a moment about what the properties of a random-number sequence are, and how random numbers — properly called “pseudo-random numbers” — are generated by computer programs.

First: we don’t talk about “a random number” when we’re being rigorous; we talk about a number taken from a random sequence (or a pseudo-random sequence). That’s an important distinction when we consider primary properties of a (truly) random sequence:

  1. The sequence is unpredictable — knowing the sequence so far gives no clue to the next number in the sequence.
  2. The sequence is not repeatable — we can’t run the process again and get the same sequence.
  3. The sequence exhibits a uniform distribution — for any position in the sequence, any number is equally likely to appear there.

Computer programs generally simulate this selection of a random sequence by using a pseudo-random-number generator (PRNG) algorithm and a random seed. The algorithm — and there are good ones out there — takes the seed and generates a sequence that

  1. is unpredictable and unrepeatable, if you don’t know the seed, and
  2. exhibits, to close examination of the results, what appears to be a uniform distribution.

Given a good algorithm, then, the key point is seeding it properly, because these algorithms also have the property that if you do know the seed, the sequence is entirely repeatable.
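Python’s built-in random module (a Mersenne Twister, a PRNG of exactly this sort, and not one meant for cryptography) demonstrates the point nicely:

import random

random.seed(12345)
first = [random.randrange(256) for _ in range(5)]

random.seed(12345)            # same seed...
second = [random.randrange(256) for _ in range(5)]

print(first == second)        # True: ...same "random" sequence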

And since cryptographic systems use PRNGs in critical places, such as key generation, there have been many attacks on such systems by finding weaknesses in their choices of random seeds. A number representing the current time is often used as part of the seed, for example. If you know with significant accuracy when the program is run, you can reliably grab many bits of the seed. If you can get enough of the seed so that you only have, say, 10,000 seeds to try, you may be able to crack the encryption system.

Some systems, therefore, try to incorporate aspects of the outside world into their choice of a seed. When the BlackBerry Desktop Manager needs to create a new encryption key, for example, it will tell the user to move the mouse around and will use samples of the mouse position as it moves as a means to create a random seed. Such data is very hard to replicate — perhaps impossible — and will generate a very good seed.

That gets us back to the work with the lasers. Here, the combination of the signals from two lasers can be used as the external input to a seed generator. More than that, the signals can actually be used as part of the algorithm to generate the numbers — the random sequence is generated by the lasers themselves, not by an algorithm manipulating a seed.

I have not yet read the paper, so I haven’t seen the proof of the unpredictability nor of the uniform distribution. But assuming that these aspects of the laser system are true and reliable, this technique certainly could be used to build a random-number generator in hardware that would work in highly secure applications.

A final note: We don’t always want truly random numbers. There are lots of applications (especially in development and testing) for which it’s useful to be able to reproduce the sequence if we have the original random seed. As I mentioned in passing above, this laser system can still be used to create the seed for conventional PRNG algorithms, providing the robustness needed there, while also allowing us to repeat the sequence when the application calls for it.

Monday, November 17, 2008


Möbius strip sesquicentennial

The 19th-century mathematician August Möbius was born on this day in 1790. Möbius was responsible for a number of things in geometry, topology, and number theory, but he’s best known for the Möbius strip, a puzzling and endlessly fascinating entity that happens to be 150 years old this year.[1]

A Möbius strip is a two-dimensional object that, when viewed in three-dimensional space, has only one side. What’s that mean? Let’s make one, using paper to approximate a two-dimensional object (we’ll ignore the thickness of the paper).

Take a standard sheet of paper and cut a 3 cm (or 1 inch) strip off of a long edge. Of course, it has two sides. On one side, which we’ll call side A, write the letter “A” on each end, like this:

Flat, marked strip

Turn it over to what we’ll call side B, and write the letter “B” on each end, similarly. We have a flat strip with two sides, and we’ve labelled the sides.

Plain band

Pick up the strip and put the two short ends together so that the two letters “A” meet and the strip forms a ring, a band. Tape it together, as you see to the right.

Plain band with a line around the middle

We still have a two-dimensional strip, now formed into a circular band, with two sides. Let’s check that out. Get a pen and put it on one of the letters “A”. Keeping the pen on the paper, draw a line away from the other “A”, and keep drawing until you get to the other “A” and then return to your starting point (left).

The pen will have traced a line around side A of the band. Side B will be blank. That’s all as we’d expect.

Möbius band

Make another flat strip and mark it as before, with “A”s and “B”s. But this time, when you pick it up to form it into a band, give one end a half twist so that you join an “A” with a “B”, taping that together into a ring (right).

Möbius band with a line across the middle

Take the pen and do as before: put it on an “A”, and, keeping it on the paper, draw a line away from the adjacent “B”. Keep drawing until the pen returns to where you started. Look at the result (left). You’ll see that there is no part of the band that’s blank, without the line traversing it. Unlike the un-twisted band, this one, a Möbius strip, has one side, not two. That is, we can’t define a side A that’s distinct from side B on this band.

That can be a little hard to get one’s head around, so think about it a while.

A fun thing to do with a Möbius strip is to cut it down the middle, a handy thing now that we have a line to cut along.

Cutting a band

Normal band cut in two

Take first the “normal” band, the two-sided one that we made first. Get a pair of scissors and cut along the line you drew. The band will, of course, fall into two separate bands (right), each the same as the original, topologically. No surprise there.

Now do the same with the Möbius strip. What happens?

A Möbius band cut in two

The Möbius strip does not separate into two bands, but instead winds up as a single, larger band — because it had only one edge, and the cut created a second. As a result, the cut band is not only twice as large, but is also no longer a Möbius strip (left). It’s a two-sided band with a full twist in it (two half-twists). It’d be very cool if you could keep cutting a Möbius strip and get a larger and larger one out of the cut... but that’s not how it works.

A reminder: the paper band is a model of a Möbius strip, using the paper to approximate a two-dimensional object. A true Möbius strip has no thickness, and exists only theoretically.

A similar concept is the Klein bottle: a two-dimensional object that can be thought of as a tube without distinct “inside” and “outside” surfaces — the inside and outside are one, much as side A and side B of the Möbius strip become one. A Klein bottle, too, exists only theoretically, and can’t even be modelled in three-dimensional space (modelling it requires four dimensions).

Topology is such fun!
 


[1] Sesquicentennial, from sesqui, Latin for “one and a half”, and centennial, “100th anniversary”.

Monday, October 27, 2008


Verrrrry interesting...

I’d like to finish up the paradox series with a bit of mathematical humour, using proof by contradiction to show that all natural numbers are interesting.

Let U be the set of all uninteresting natural numbers:

U := { n ∈ N | n is not interesting }

Because the set of natural numbers is well ordered, any non-empty subset of the natural numbers must have a smallest element. But the “smallest uninteresting natural number” is certainly interesting! So there can be no smallest element of U, and, thus, U must be the empty set:
U = {}

∴   ∀ n ∈ N, n is interesting

A bit more mathematical humour to go along with that:

There are 3 kinds of people in the world: those who can count, and those who can’t.

There are 10 kinds of people in the world: those who understand the binary system, and those who don’t.

Aleph-null bottles of beer on the wall,
Aleph-null bottles of beer,
Take one down, pass it around,
Aleph-null bottles of beer on the wall...

And you thought mathematicians couldn’t be funny. Well!