Monday, June 1, 2009

The Size of my GEDCOM

There has been some discussion of how large is large with respect to genealogy databases (GEDCOMs). Tamura Jones began the discussion by comparing different software vendors' definitions of 'large', which apparently vary from 400,000 down to 2,500 individuals. (One vendor considers seven generations large, and Tamura notes that can be as few as 7 individuals.) As a former computer programmer, I completely agree with the idea that software vendors should strive to understand, and address, the actual 'needs of the user.' However, the marketing department obviously has other goals, and when the product can't meet those needs, it is still marketing's job to find as many buyers as they can.

Valerie at Begin with Craft says her GEDCOM is extra small by Tamura's standards. And while Randy Seaver at Geneamusing considers his database of 25,000 large, Tamura feels "a well-researched medium-size ancestry can easily contain 25,000 individuals or more." (He explains this in his article Medium Size Genealogy.)

I have only been doing research for 2 years, so where does my database fit in this playground yardstick competition?

My "master" database has 1,458 individuals. That is even smaller than Valerie's, but then she has been researching for five years longer than I have. However, I probably have sufficient unentered data to increase that to 2,000. My paternal lines go back 5 generations, while some of my maternal lines go back over 10. I also downloaded a 6,795-person GEDCOM from WorldConnect, where all the individuals are related to me, as we are all descendants of Henry Rosenberger. However, I don't have a desire to merge it with my master database, as the relatives I 'care' about would be dwarfed by those that are of less interest to me. So my current total of entered and unentered data is somewhere around 8,000 individuals, which I feel is rather good for 2 years, though I have no delusions about its size and hope it will grow.

The thing is, with maternal ancestry that arrived on the shores of this continent in the 1600s, I could probably go bonkers downloading and merging publicly available GEDCOMs of varying degrees of reliability. The reliability of the data, though, is important to me. And while I have interest in my Colonial ancestors, having all of their descendants in a database isn't one of my current goals. So I don't expect my database to grow exorbitantly. [I downloaded the Rosenberger database mainly because it is entered from a descendancy published in 1906, so I at least know the source.] Over time my goals might change, but right now they are much more modest.


Anonymous said...


Software vendors should strive to meet the needs of the user?
I disagree. I think they should strive to meet and exceed the needs of the user,
so that today's product is good enough for tomorrow.

- Tamura

John said...

I changed 'meet' to 'address'. How they address the needs is up to them - meeting the needs is the bare minimum.

Anonymous said...

Ah, we are actually in complete agreement :-)

- Tamura