What Race is Your Last Name?

Malcolm Gladwell's new book Outliers has a chapter on how certain names are more common in certain ethnic groups. Gladwell's writing is, of course, as terrible as ever, but he does his homework and dutifully cites all his sources. One of them was an article by Steven Levitt (the author of Freakonomics) and Roland Fryer (a Harvard professor who studies race differences in society). Levitt and Fryer had used data from birth certificates and the US Census Bureau to look at the distribution of names in different ethnic groups.

I couldn't get data on first-name ethnicity, for privacy reasons. But the Census data on last-name ethnicity is available online. I whipped it up into a CGI program that shows how the people who share your last name are distributed among different ethnic groups.

A couple of observations:
1. There are very few names where the majority of people with that name are "black". Usually the names with many black holders are among the most common "white" names, with black holders as a large minority. This is probably because it's not common for two people with rare surnames to encounter each other and intermarry.

2. There are many more distinct last names from Germany than from any other nation, although the number of people with German last names is not unusually large. This could mean that German immigrants are more outbred, that they regularly came up with new names, or that their names were just easier to misspell or mishear while coming through Ellis Island.

3. Indian and Middle Eastern names are often split between "asians" and "whites". This might be because people with these names self-identify as one or the other when filling out their Census form.

Update 12/28/10
I ported the script to PHP. Previously it was written in Perl + CGI, and before that, JavaScript. I added the "most recent names" feature and changed the script to read the data line by line instead of loading it all at once, which uses far less memory per request.
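
For the curious, here is roughly what a line-by-line lookup looks like. This is a minimal sketch in Python, not the actual PHP script. It assumes the Census data has been saved as a CSV file with a header row, a "name" column, and one percentage column per group. The column names in GROUP_COLUMNS and the function name lookup_surname are placeholders I made up for illustration, so adjust them to match whatever file you download.

```python
import csv
from typing import Dict, Optional

# Assumed column names, one per group. These are placeholders;
# rename them to match the header row of the file you actually have.
GROUP_COLUMNS = ["pctwhite", "pctblack", "pctapi",
                 "pctaian", "pct2prace", "pcthispanic"]


def lookup_surname(path: str, surname: str) -> Optional[Dict[str, float]]:
    """Scan the CSV one row at a time and return the group percentages
    for the first row whose "name" column matches, or None if no row does."""
    target = surname.strip().upper()
    with open(path, newline="", encoding="utf-8") as handle:
        # DictReader pulls rows lazily from the open file, so only the
        # current row is held in memory, never the whole data set.
        for row in csv.DictReader(handle):
            if (row.get("name") or "").strip().upper() != target:
                continue
            shares: Dict[str, float] = {}
            for column in GROUP_COLUMNS:
                try:
                    shares[column] = float(row.get(column))
                except (TypeError, ValueError):
                    # Skip cells that are missing or not numeric.
                    pass
            return shares  # stop reading as soon as the name is found
    return None
```

Calling lookup_surname("surnames.csv", "Smith") (the file name is also just an example) returns a dictionary of percentages keyed by column name. Because the function returns as soon as it finds a match, a lookup for a name near the top of the file never reads the rest of it.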

