Showing posts with label Statistics. Show all posts

Thursday, April 18, 2019

Twelve Years of Genealogy Research

I've been blogging about something since May 2002.

  • In March of 2007 I wrote about my great grandfather Barney and his claim he was born in Dublin on March 17th.
  • A friend read my post, and sent me a census record with my great grandfather on it.
  • I had no previous idea what was online.
  • On April 16, 2007 I wrote my first two blog posts concerning research.

In 2007 I wrote 131 blog posts.
2008: 263
2009: 323 (almost, but not quite, one per day)
2010: 293 (I began dating the woman of my dreams in May)
2011: 165 (We became engaged)
2012: 114 (We were married)
2013: 90 (We bought a home)
2014: 50 (We adopted twin 1-year-old boys)
2015: 54
2016: 86
2017: 59
2018: 23

My research continues; I've just been blogging less.

  • Six years ago, the last time I looked at database statistics, my database had slightly over 2800 individuals, and my wife's had 340.
  • Today: There are over 4700 individuals in my database, and over 1700 in my wife's database.
I'm very pleased with the discoveries I have made, and I'm confident I will continue to make more. I missed my annual St. Patrick's Day post this year, but you can read all my past ones here.


Wednesday, June 19, 2013

Lifespan Statistics Revisited, Again

Using my genealogy software, I first generated a lifespan report from six different family databases back in 2008, in Age is Relative. The report contains a lot of different data, as can be seen in the original post; for this post, however, I am going to focus on average lifespan.

2008 Statistics

Note: The numbers of individuals given are for the entire database, and include those for whom the birth and death dates aren't both known. The average lifespans are calculated on a smaller number of individuals.
Full Database
695 males: Avg lifespan: 61
603 females: Avg lifespan: 62

"Direct Ancestors" Only
71 males: Avg lifespan: 67
58 females: Avg lifespan: 68

Descendants of Israel David Neimark  (died approx 1890)
55 males: Avg lifespan: 66
53 females: Avg lifespan: 74

Descendants of Me’er Kruvant (born before 1795)
116 males: Avg lifespan: 68
110 females: Avg lifespan: 70

Descendants of William Denyer (1794-1848)
121 males: Avg lifespan: 53
103 females: Avg lifespan: 61

Descendants of Andrew Van Every (1798-1873)
93 males: Avg lifespan: 49
82 females: Avg lifespan: 48
For each of the last four databases, the forebear is a 3rd great grandfather, so they are roughly equivalent in generation. I wondered at the causes of the apparently lower lifespans for my Denyer and Van Every lines. One possible cause was rural versus urban living. Another was sample size. Did I just not have enough data? Perhaps, over time, as the database grew, the average lifespans for the different family branches would even out.
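As the note at the top of this section says, the head counts cover everyone in a database, while the averages only use people with both a birth and a death year. Here is a minimal Python sketch of that distinction. The records and the `lifespan_summary` function are hypothetical, made up for illustration; this is not how my genealogy software computes its report.

```python
from statistics import mean

# Hypothetical records: (sex, birth_year, death_year); None marks an unknown year.
people = [
    ("M", 1794, 1848),
    ("F", 1800, 1870),
    ("M", 1850, None),   # death unknown: counted in the total, not in the average
    ("F", None, 1901),   # birth unknown: counted in the total, not in the average
    ("M", 1900, 1960),
]

def lifespan_summary(records, sex):
    """Return (total count, count used in the average, average lifespan in years)."""
    group = [r for r in records if r[0] == sex]
    # Year subtraction ignores month and day, so each span is approximate.
    spans = [died - born for _, born, died in group
             if born is not None and died is not None]
    average = round(mean(spans)) if spans else None
    return len(group), len(spans), average

print(lifespan_summary(people, "M"))  # (3, 2, 57)
print(lifespan_summary(people, "F"))  # (2, 1, 70)
```

With a small branch, the second number in each result can be much smaller than the first, which is one reason the averages for small branches may shift as data is added.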

2009 Statistics

In 2009, my database had grown, so I generated some updated statistics in Database Size and Lifespan Revisited for the full database, and for one family in particular.
Full Database
1030 males: Avg lifespan: 60
921 females: Avg lifespan: 62

Kruvant Descendants (Me'er and his two brothers)
368 males: Avg lifespan: 58
367 females: Avg lifespan: 62
I discussed the dramatic drop in lifespan for the Kruvant family. It wasn't unexpected, as I had just entered a large Register I had received from a cousin researcher, including many Lithuanian cousins who 'died' in rather graphic ways between 1939 and 1942. Still, one might point out that while the lifespan for that family dropped significantly, the result was pretty close to that of the full database. Was this really due entirely to the Holocaust, or did the simple growth of data play a role?

Current Statistics

I didn't consider this possibility back in 2009, but as my database has grown even more since then, I thought I would return to this topic, and I realized I needed to address it.

The only way to do that was to generate a separate Kruvant database, run the statistics report, remove from that database every victim of the Holocaust, and then run the statistics report again. While I was just deleting records, and in a database I had generated only for this purpose, the process was a little disturbing, as I clicked on names, and the software asked if I was sure I wanted to delete the individuals.
Full Kruvant Line
380 males: Avg lifespan: 58
374 females: Avg lifespan: 63

Edited Kruvant Line
355 males: Avg lifespan: 63
351 females: Avg lifespan: 69
It appears the Holocaust accounted for almost all of the drop in the lifespan for Kruvant females, but it was only part of the equation for the males.
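The comparison can be put into rough numbers. The sketch below uses the averages quoted in this post (the 2008 Kruvant figures, the current full line, and the current edited line); the function name `share_of_drop_recovered` is my own, for illustration. It is only approximate: the averages are rounded to whole years, and the 2008 database covered Me'er's descendants alone while the later ones include his brothers' lines.

```python
def share_of_drop_recovered(before, after, edited):
    """Fraction of the drop (before -> after) that disappears in the edited data."""
    drop = before - after
    return (edited - after) / drop

# Figures from the post: 2008 average, current full line, current edited line.
males = share_of_drop_recovered(before=68, after=58, edited=63)
females = share_of_drop_recovered(before=70, after=63, edited=69)

print(f"males: {males:.0%}, females: {females:.0%}")  # males: 50%, females: 86%
```

By this measure, removing the victims restores about half of the drop for males and about six sevenths of it for females.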

Below are the current results for several other databases.
Full Database:
1471 males: Avg lifespan: 61
1341 females: Avg lifespan: 63

Ancestors:
79 males: Avg lifespan: 67
78 females: Avg lifespan: 67

Deutsch Line:
95 males: Avg lifespan: 66
80 females: Avg lifespan: 75

Denyer Line:
188 males: Avg lifespan: 57
180 females: Avg lifespan: 62

Van Every Line:
158 males: Avg lifespan: 53
134 females: Avg lifespan: 53

Newmark Line
114 males: Avg lifespan: 69
129 females: Avg lifespan: 73

Dudelczak Line
152 males: Avg lifespan: 63
146 females: Avg lifespan: 73
The average male lifespan for the Denyer descendants has improved with added data, whereas the average female lifespan hasn't by much. The averages for my Van Every line have also improved, but only slightly. While the data so far is sparse, my mother's paternal Deutsch family appears to be among the longer-lived in my tree.

Some people might wonder why my full database hasn't grown more in six years of research. My database could be huge, especially on my maternal line, if I added everything I found from Ancestry's Public Trees, WorldConnect, etc. Early on in my research I found at WorldConnect a 6795-individual database containing every individual from the genealogy A Genealogical Record of the Descendants of Henry Rosenberger of Franconia, Montgomery Co., Pa, Rev. A.J. Fretz, 1906. I downloaded the database, but I didn't merge it with my own. (Even if I trust Fretz's research, I have no way to know if the person who entered the data copied it correctly from the book.) It would certainly have impacted my statistics greatly if I had:

Rosenberger Descent
3435 males: Avg lifespan: 33
3360 females: Avg lifespan: 32

I believe the low figures are due mostly to a very high infant mortality rate.

For some perspective I looked at two more databases.

First, I looked at my wife's database. I've only begun researching, and haven't entered all the data from a Register someone else produced. However, so far, it seems the women have a significantly higher lifespan than the men in her family.

Jen’s family
175 males: Avg lifespan: 59
165 females: Avg lifespan: 67

For another perspective, I looked at the Royal database provided with my genealogical software program, iFamily. The data goes back to the Middle Ages on some lines, so one would expect a lower average lifespan, which is the case.

Royal Database
1686 males: Avg lifespan: 49
1322 females: Avg lifespan: 52

In summary

Regardless of the reasons, whether they may be based on geography, time, genetics, or something else, different families have different average lifespans. The women in my separate family databases seem to have longer lifespans, which is what I have been told is the case in general. However, in my full database, the gender difference is much less. This makes me wonder if there are some branches of my family where the stats are reversed.

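Database | Year | Size (M/F) | Lifespan (M/F)
Full | 2008 | 695/603 | 61/62
Full | 2009 | 1030/921 | 60/62
Full | 2013 | 1471/1341 | 61/63
Ancestors | 2008 | 71/58 | 67/68
Ancestors | 2013 | 79/78 | 67/67
Newmark | 2008 | 55/53 | 66/74
Newmark | 2013 | 114/129 | 69/73
Kruvant | 2008 | 116/110 | 68/70
Kruvant | 2009 | 368/367 | 58/62
Kruvant | 2013 | 380/374 | 58/63
Kruvant | 2013 | 355/351* | 63/69*
Dudelczak | 2013 | 152/146 | 63/73
Deutsch | 2013 | 95/80 | 66/75
Denyer | 2008 | 121/103 | 53/61
Denyer | 2013 | 188/180 | 57/62
Van Every | 2008 | 93/82 | 49/48
Van Every | 2013 | 158/134 | 53/53
Jen’s Family | 2013 | 175/165 | 59/67
Rosenberger | 2013 | 3435/3360 | 33/32
Royal | 2013 | 1686/1322 | 49/52

* Edited Kruvant line, with Holocaust victims removed.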

Saturday, June 13, 2009

Database Size and Lifespan Revisited

On the first of this month I mentioned that my primary database had 1458 individuals in it. I also said I probably had enough unentered data to make that 2000.

Most of this unentered data came from a 694 individual register a cousin gave me from over 20 years of research on my paternal Cruvant branch of the tree. (One of the downsides of the Register numbering system is that it doesn't give numbers to spouses, so there are actually even more individuals in this document.)

Over the past two weeks I have entered all the data for descendants of my third great grandfather David Aron Kruvond. He did have two siblings whose descendants I haven't entered, but the information on those lines is sparse anyway. The size of the database is now 1951. So it was a pretty good estimate.

Of course, my cousin didn't type the Register by hand. She has her own database. I could have had her send me her GEDCOM, which I could have uploaded, and I would have had all her information in my database in a minute. But entering the vital statistics by hand helped me catch typographical errors I wouldn't have caught if I just imported all the data instantaneously. [I'm now going to ask her for the file, so I can have a separate database containing her extensive notes I didn't copy.]

For reasons that will become clear, I then began to wonder...

With the growth of my database in the past year, from 700 to almost 2000 individuals...are there any differences in the Lifespan Statistics I calculated a year ago?

(Numbers in parentheses below indicate value last year)

In my overall database, with now 1030 males and 921 females

Average Male Lifespan 60 (61)
Average Female Lifespan 62 (62)

This is nearly identical to what it was a year ago. It's interesting that there has been very little change here.

In my Cruvant branch, with 368 males and 367 females (an increase of 509 individuals from a year ago)

Average Male Lifespan 58 (68)
Average Female Lifespan 62 (70)

This is a significant drop, but not unexpected from what I have entered over the past two weeks. A large amount of this drop can be attributed to a high mortality in Lithuania between 1941 and 1943. I'm quite impressed we have all this information, but it is depressing to read repeatedly for cause of death 'townspeople with axes.' Part of me thinks I could have handled seeing the names of any of the death camps more easily than that repeated phrase.

Sunday, July 13, 2008

Age is Relative

Take some time to look over the data that you have collected on members of your family tree, and share a story of age with us … With the understanding that “age is often a state of mind”, share your family story about someone whose story stands out because of their age, either young or old.
For the 52nd Carnival of Genealogy, instead of focusing on one individual, I thought I would start by taking a look at some of the lifespan statistics that I was able to generate with my genealogical software. (see previous entry)

Ave Male Lifespan 61
Ave Female Lifespan 62

At first this feels a little low, but several lines go back a few centuries, when life expectancy was shorter, and infant mortality was higher.

Earliest recorded person with birth/death year
William HORTON, born 1550 AD

When I look at the record, it says “abt 1550” and since I didn’t do the research, I don’t know how the year was derived. William Horton’s grandson, Barnabas Horton, may be the earliest ancestor for whom I have an exact birth date: July 13, 1600. (Happy 408th Birthday!) Barnabas’s fifth-great granddaughter, Abigail Stuart, married Samuel Van Every, who had 22 children, one of whom was my mother’s grandfather.

Age At Death < 1 = 3.4%
Age At Death 01 to 10 = 4.8%
Age At Death 11 to 20 = 2.8%
Age At Death 21 to 30 = 2.0%
Age At Death 31 to 40 = 5.3%
Age At Death 41 to 50 = 5.6%
Age At Death 51 to 60 = 12.6%
Age At Death 61 to 70 = 17.4%
Age At Death 71 to 80 = 20.2%
Age At Death 81 to 90 = 19.1%
Age At Death 91 to 100 = 6.2%
Age At Death 101 to 110 = 0.6%

I have no idea how this curve compares to that of other families. My gut instinct says the 11% for age 31-50 is probably higher than it is for the 'average family'. But what is the average family? I will look at this further a little later in the entry.
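For anyone who wants to check the arithmetic, here is a small Python sketch that adds up the bands listed above. The percentages are the ones from my report; the `share_between` helper is hypothetical, written only for this illustration.

```python
# Age-at-death bands (low, high) and percentages as listed in the post.
bands = {
    (0, 0): 3.4,      # under 1
    (1, 10): 4.8,
    (11, 20): 2.8,
    (21, 30): 2.0,
    (31, 40): 5.3,
    (41, 50): 5.6,
    (51, 60): 12.6,
    (61, 70): 17.4,
    (71, 80): 20.2,
    (81, 90): 19.1,
    (91, 100): 6.2,
    (101, 110): 0.6,
}

def share_between(low, high):
    """Sum the percentages of every band lying entirely within [low, high]."""
    total = sum(pct for (lo, hi), pct in bands.items() if lo >= low and hi <= high)
    return round(total, 1)

print(share_between(0, 110))  # 100.0
print(share_between(31, 50))  # 10.9
print(share_between(0, 20))   # 11.0
```

The bands add up to 100%, and the 31 to 50 share is 10.9%, which is the roughly 11% mentioned above.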

My database is small compared to some – at the moment slightly over 700 people, only half of them with known death dates. So the percentages might very well change as I enter more records. Here are a few of the longer lived in my family:

People who lived over 100 years (only 2)
Israel David NEWMARK (1903 - 2004) 101
Joe WYMAN (1904 - 2007) 102

Israel David was the youngest brother of my great grandfather, Barney Newmark. However, because he was the same age as the children of his oldest siblings, he was given the nickname “Uncle Buddy.”

Joe was a first cousin of my paternal grandmother. I'm not sure I ever met Joe, but I've met his younger sister several times, and she is still alive, and we've sent emails back and forth. She's already earned a place in at least the next category; hopefully she makes it to this one.

People who lived over 90 years
(there are 26, I'm going to mention a handful.)

Paternal grandfather's lines

Ida Adele KESSLER (1907 - 2003) 96

Ida married an “Israel David Newmark”, who happened to be the nephew of the one above, born 4 years later.

Cruvant William ALTMAN (1914 - 2008) 93

Cruvant passed away on March 15th of this year. He was a first cousin to my grandfather, and they were in a law practice together prior to WWII.

Bertha CRUVANT (1887 - 1978) 90

My great grandmother, she is the second longest-lived direct ancestor for whom I am certain about age at death. I was 9 when she died and I remember her well, but she was in a retirement home, and in a wheelchair, by then. I’ve enjoyed discovering photos of her from her younger years recently.

Paternal grandmother's lines

Robert Seymour Selig FEINSTEIN (1915 - 2008) 93

Seymour also passed away on March 15th of this year. I started to pen an entry about the Ides of March back then, but it didn't get finished. Seymour was another first cousin of my paternal grandmother.

Maternal Grandfather's lines

Berta DEUTSCH (1911 - 2003) 91

My maternal grandfather’s sister, she is the longest living member of my maternal grandfather’s lines. However, this is also the family I have the least records on, mostly due to it also being the family which, generationally, came to America most recently, with my grandfather (and Berta) having been born in Hungary.

Maternal Grandmother’s lines

Sarah SHOWERS (1762 - 1860) 98

She is a direct ancestor; however, her record says she died before 1860. There is no indication how the year was derived. It could have been significantly prior to 1860.

Elizabeth ROSENBERGER (1752 - 1847) 94

I have complete birth and death dates for Elizabeth, which come from a Fretz Family History compiled in the 1890s and are probably trustworthy. Her granddaughter Elizabeth Sliver married William Denyer; they were my mother’s second great grandparents.

***

I've posted before about several early deaths in my family tree, particularly in the Denyer line. While several of the early deaths were in the 1800s, when lifespans were shorter, I still thought the statistics for the Denyers seemed significantly lower than they should have been. I decided it was time to crunch the numbers and see just how right I was.

My software doesn’t separate the lifespan statistics by families, but you can export Ancestors/Descendants into a separate GEDCOM, and then open the GEDCOM and run the stats. So that's what I did, and I had some numbers to compare.
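For readers without lifespan statistics in their software, the same idea can be sketched in Python directly against an exported GEDCOM. This is a rough illustration, not how my software does it. It assumes the usual GEDCOM layout (level-0 INDI records with SEX, BIRT and DEAT tags, and a DATE line under each event), it only keeps the first four-digit year it finds in a date, and the three people in the sample are made up.

```python
import re
from collections import defaultdict

YEAR = re.compile(r"\b(\d{4})\b")  # first 4-digit year in a DATE value

def parse_gedcom(lines):
    """Return a list of (sex, birth_year, death_year) for each INDI record."""
    people, person, event = [], None, None
    for raw in lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == "0":
            # A level-0 line starts a new record; only individuals are kept.
            person, event = None, None
            if value == "INDI":
                person = {"sex": None, "BIRT": None, "DEAT": None}
                people.append(person)
        elif person is not None and level == "1":
            event = tag if tag in ("BIRT", "DEAT") else None
            if tag == "SEX":
                person["sex"] = value
        elif person is not None and level == "2" and tag == "DATE" and event:
            match = YEAR.search(value)
            if match and person[event] is None:
                person[event] = int(match.group(1))
    return [(p["sex"], p["BIRT"], p["DEAT"]) for p in people]

def average_lifespans(records):
    """Average (death year - birth year) per sex, skipping incomplete records."""
    spans = defaultdict(list)
    for sex, born, died in records:
        if born is not None and died is not None:
            spans[sex].append(died - born)
    return {sex: round(sum(v) / len(v)) for sex, v in spans.items()}

# Hypothetical three-person GEDCOM, for illustration only.
sample = """0 HEAD
0 @I1@ INDI
1 NAME Adam /Example/
1 SEX M
1 BIRT
2 DATE 4 MAR 1800
1 DEAT
2 DATE 9 SEP 1870
0 @I2@ INDI
1 NAME Beth /Example/
1 SEX F
1 BIRT
2 DATE ABT 1805
1 DEAT
2 DATE 1885
0 @I3@ INDI
1 NAME Carl /Example/
1 SEX M
1 BIRT
2 DATE 1830
0 TRLR""".splitlines()

records = parse_gedcom(sample)
print(average_lifespans(records))  # {'M': 70, 'F': 80}
```

Carl has no death date, so he is left out of the male average, just as individuals with missing dates are left out of the averages in this post.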

"Direct Ancestors"
Starting with me, this GEDCOM had my direct ancestors, with no siblings.

71 males: Avg lifespan: 67
58 females: Avg lifespan: 68

I looked at this as a second control in the study, in addition to the 61/62 in the whole database.

Descendants of Israel David Neimark (his son Samuel was born in 1862, that's all I know)
55 males: Avg lifespan: 66
53 females: Avg lifespan: 74

Descendants of Me’er Kruvant (born before 1795)
116 males: Avg lifespan: 68
110 females: Avg lifespan: 70

These are the only two families I looked at for my father's side, as the other branches have fewer entries, making the significance of the data questionable. I question the significance of the data for the Newmark family as well, and wonder if the stats will look any different once I enter more of the information.

Descendants of William Denyer (1794-1848)
121 males: Avg lifespan: 53
103 females: Avg lifespan: 61

Descendants of Andrew Van Every (1798-1873)
93 males: Avg lifespan: 49
82 females: Avg lifespan: 48

Quite a significant drop.

The argument that it is due to pre-20th century data is possibly negated by comparison to the Kruvant family, as I have information that goes back to the late 1700s for them too, in Lithuania. However, when they immigrated to America, they immigrated to urban St. Louis, not rural Texas. Life expectancy can change based on geography.

Whatever the causes, I definitely confirmed that the life expectancy for the Denyers and Van Everys was sadly lower than for the rest of my family tree.