Martijn Van Exel’s Battlegrid has been a fun resource for me to fix TIGER errors. The one problem, however, is that it has been hard for me to identify regions that really need some love, because Battlegrid does not let you zoom out and get a visual overview of which cities might have the most potential issues with the TIGER data.
For example, Chicago seems to have had a lot of errors, but these errors seem to have been mostly resolved:
Meanwhile, there are still regions, like this part of Charleston, SC, that need a lot of work:
But Martijn has kindly agreed to share the raw data behind Battlegrid, and using this data, I’ve produced some analyses that can hopefully be useful both for understanding the current state of TIGER and for guiding future fixup work.
Here is a view of Battlegrid for all of the US. Each dot represents something that is bad, from small errors to big ones.
And this is the same picture counting only tiles that have more than 200 misaligned nodes.
The counties in pink are the ones that originally got the old TIGER data (the white counties are likely to have more aligned and up-to-date data) – notice how the Battlegrid picks up this variation. But it’s not perfect: you can see the clear lines near Alabama (because all the states around Alabama got bad data), but you can also see that large cities are changing quickly, so all the dots in Birmingham are likely to be from new roads.
Anyway, I wanted to systematically analyze which regions still need fixing. Using the Battlegrid data, I came up with a quick ranking of MSAs that still have a lot of disagreements with TIGER 2012. I counted all the tiles that have at least 200 corrections in them, and then considered only those tiles in the top 80 MSAs in the country. Then for each MSA, I simply summed up the total number of errors and calculated a “density” (number of errors divided by the area of the MSA).
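The ranking steps above can be sketched roughly as follows. This is only an illustration, not the actual script I ran: the function name `rank_msas`, the `(msa_name, error_count)` tile format, and the sample numbers are all made up for the example, and the area units (and so the scale of the score) are an assumption.

```python
from collections import defaultdict

# Threshold from the text: only tiles with at least 200 corrections count.
MIN_ERRORS_PER_TILE = 200


def rank_msas(tiles, msa_areas, min_errors=MIN_ERRORS_PER_TILE):
    """Rank MSAs by error density (total errors / MSA area).

    tiles:     iterable of (msa_name, error_count) pairs, one per tile
               (hypothetical format, not the real Battlegrid schema).
    msa_areas: dict mapping each MSA of interest to its area; tiles in
               MSAs missing from this dict are ignored.
    """
    totals = defaultdict(int)
    for msa, errors in tiles:
        # Keep only heavily affected tiles inside the MSAs we care about.
        if msa in msa_areas and errors >= min_errors:
            totals[msa] += errors

    # Density = summed errors divided by the area of the MSA.
    ranking = [(msa, totals[msa] / area) for msa, area in msa_areas.items()]
    return sorted(ranking, key=lambda row: row[1], reverse=True)


# Made-up sample data, purely to show the shape of the computation.
sample_tiles = [
    ("MSA A", 250),
    ("MSA A", 150),  # below the threshold, so it is dropped
    ("MSA A", 300),
    ("MSA B", 400),
    ("MSA C", 500),  # not in the area table, so it is dropped
]
sample_areas = {"MSA A": 100.0, "MSA B": 50.0}

ranking = rank_msas(sample_tiles, sample_areas)
for name, density in ranking:
    print(f"{name} {density:.1f}")
```

Dividing by area keeps a physically large MSA from topping the list just because it contains more tiles.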
These are the places that are all fairly large MSAs and where error density is high, so your Battlegrid work is likely to bring you some joy. Eventually I want to put up a live map and link to the relevant place on Battlegrid, so that you can get started correcting these places. For now, here is the list of the top MSAs that currently have a high density of poor TIGER data. (A high score indicates a high density of bad TIGER data.)
- Asheville, NC 8.5
- Charleston–North Charleston, SC 6.9
- Baton Rouge, LA 4.6
- Knoxville, TN 3.6
- Columbia, SC 3.5
- Winston-Salem, NC 3.1
- Atlanta, GA 2.6
- Birmingham, AL 2
- Phoenix–Mesa, AZ 1.8
- Orlando, FL 1.6
- Dallas–Fort Worth–Arlington, TX 1.6
- Pittsburgh, PA 1.6
- Tampa–St. Petersburg, FL 1.5
- Charlotte, NC–SC 1.5
- Las Vegas–Henderson, NV 1.5
- Bridgeport–Stamford, CT–NY 1.4
- McAllen, TX 1.4
- Riverside–San Bernardino, CA 1.4
- Tucson, AZ 1.4
- New Haven, CT 1.4
(you can find a downloadable version of all the top 78 MSAs here)
And here is the same data as a map: