Thursday, November 29, 2012

Inaccuracies in the Ancestry SSDI Database

Michael John Neill at RootDig has been discussing some issues he has with various Ancestry databases. He’s had some success getting Ancestry to resolve some of them.

He asks if anyone else has issues with some of the databases. Why, yes, I have an issue with the Social Security Death Index, and in particular, Ancestry’s data for “Last Residence.” In a large number of instances, it’s simply wrong. Not ‘sort of right’ or ‘incomplete’ – wrong.  I blogged about this back in 2008, but it hasn't been fixed.

The problem arises because the original data provided to the various websites (Ancestry, GenealogyBank, etc.) contains a couple of different fields for “Last Residence.” One is the zip code. That piece of data Ancestry handles perfectly. However, it mangles the City and County information. One zip code often covers multiple cities, and I assume the original data source provided to all the companies lists all of them. In its records, Ancestry provides only the first city in the list. This gives the incorrect appearance of a definitive answer, as opposed to multiple options. And the odds of Ancestry being correct go down as the number of cities the zip code covers goes up.
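To make the difference concrete, here is a minimal sketch in Python. It is not how either site actually stores or processes its data; the `ZIP_TO_CITIES` table, the function names, and the order of the cities are all assumptions made for illustration, loosely based on the examples in this post.

```python
# Hypothetical sample of the underlying data: one zip code, several candidate cities.
# The city lists and their order are illustrative assumptions, not real SSDI data.
ZIP_TO_CITIES = {
    "63141": ["Saint Louis", "Creve Coeur"],
    "63105": ["Saint Louis", "Clayton", "University City"],
}


def first_city_only(zip_code):
    """Mimic the behavior described above: report only the first city listed."""
    return ZIP_TO_CITIES[zip_code][0]


def all_cities(zip_code):
    """Report every candidate city, so the ambiguity stays visible."""
    return " or ".join(ZIP_TO_CITIES[zip_code])


def naive_chance_first_is_right(zip_code):
    """Chance the first city is correct IF every listed city were equally likely.

    That equal-likelihood assumption is a simplification; it only shows why
    the odds shrink as the list of cities grows.
    """
    return 1 / len(ZIP_TO_CITIES[zip_code])


if __name__ == "__main__":
    for zip_code in ZIP_TO_CITIES:
        print(zip_code, "| first only:", first_city_only(zip_code),
              "| all options:", all_cities(zip_code))
```

The first function hands the researcher a single answer that looks definitive. The second hands over a list, which at least signals that more research is needed.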

Additionally, possibly due to a unique event in St. Louis history in 1877, the first option the data provides in the list of cities for zip codes in much of St. Louis County is "St. Louis," which implies the City of St. Louis, which is wrong. The City of St. Louis isn’t part of the zip code. While this is a data error for which Ancestry isn’t responsible, if they listed all the options, it would be clear to researchers that they had to conduct some further research to figure out which one is accurate. As is, it’s likely many researchers write down the wrong bit of information, thinking it is correct. And it’s not.

Let’s take a look at some examples and some maps, comparing Ancestry’s record to the parallel record at GenealogyBank, and using Google Maps to show what I am talking about.

Here are the records for my grandmother, Belle (Feinstein) Newmark. Her last residence was in Creve Coeur, Missouri. The record on the left is from GenealogyBank. The record on the right is from Ancestry.

As you can see, Ancestry claims her last residence was in Saint Louis, Saint Louis. This actually is a non-existent location. In 1877 St. Louis City 'divorced' itself from St. Louis County, and they are separate entities. However, I'm not concerned about a database not being designed to facilitate a rare exception. GenealogyBank is able to display in its record that the city might either be St. Louis, or it might be Creve Coeur. Ancestry states that the city is St. Louis.  Ancestry is wrong.

Here's a record for my grandmother's uncle, Henry Blatt. I'm actually unsure which of the three cities for the zip code is the correct one, as I don't have his death certificate, and don't know his last known address. A small part of the 63105 zip code is in the city of St. Louis, though I suspect he resided in either Clayton or University City.  Once again, GenealogyBank lists all the possibilities. Ancestry picks the first one, whether accurate or not.

Here's a map of a portion of St. Louis County, courtesy of Google Maps.  The red polygon is the 63141 zip code. As you can see, it is a fair distance from the city of St. Louis, which is on the far right of the map. Several towns lie in between.  One border of the city is marked by where "Forest Park" meets Clayton.

Back in 2008 I theorized that the reason St. Louis City is listed as an option in the data was that part of 63141 is unincorporated. However, I'm fairly certain now that that isn't the reason. Last Residences in most zip codes of St. Louis County seem to include St. Louis City as the first option. And thus, Ancestry lists it as the only option. By conducting a search at Ancestry on just the year of death and a zip code, I've also checked Kirkwood (63122), Brentwood (63144), Sunset Hills (63127), and Maryland Heights (63043). [Sunset Hills and Maryland Heights are off the above map. Sunset Hills is where 270 and 44 meet; you can almost see the label for Maryland Heights on the northwestern part of the map.] Of those four, St. Louis is absent from the options in only 63043. One possibility is that St. Louis City is provided as an option for all zip codes beginning with the digits '631.'
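The '631' idea can be checked against the handful of zip codes mentioned in this post. The sketch below is only that, a sketch: the `OBSERVED` table is typed in by hand from what I saw in the search results, the function name is made up, and six zip codes are far too few to prove anything.

```python
# Hand-entered observations from this post: does the list of cities
# for the zip code include St. Louis? (True = yes, False = no)
OBSERVED = {
    "63141": True,   # Creve Coeur
    "63105": True,   # Clayton / University City
    "63122": True,   # Kirkwood
    "63144": True,   # Brentwood
    "63127": True,   # Sunset Hills
    "63043": False,  # Maryland Heights
}


def prefix_rule_holds(observations, prefix="631"):
    """Return True if St. Louis is listed exactly for zip codes starting with prefix."""
    return all(
        zip_code.startswith(prefix) == has_st_louis
        for zip_code, has_st_louis in observations.items()
    )


if __name__ == "__main__":
    print("Consistent with the '631' theory:", prefix_rule_holds(OBSERVED))
```

The observations are consistent with the theory, but a single '631' zip code without St. Louis in its list would be enough to sink it.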

To recap:

There is something funky going on with the data that the Social Security Administration provides for St. Louis County. Neither Ancestry nor GenealogyBank can do anything about that.

However, Ancestry's practice of providing only one option for the city of Last Residence creates a lot of inaccurate records. And this will happen whenever a zip code covers more than one city, which I suspect is often.
