Previously, I posted a free census comparison (or correlation) form you could use in Evernote. In that post I said I assumed you had identified the correct family.
This is the follow-up about adjusting the form if you want to use census comparison to determine if several census records are the same family.
If you are unsure if you've found the right person, you can always "keep" a record and use correlation to see how well that person matches with the people in other records you are sure about.
RELATED POST: Automated Searches: Dealing with the Wrong Person
How "Questionable" Comparisons Differ
If you don't know if you have the right person, you will need to use other "data points." I briefly used this term in my previous post. It's not a common genealogical term but to me it is a universal term (not specific to one industry) that highlights how you are going to use the information from the census.
Data vs. Facts
In my blog post about using the census instructions, I mentioned you could learn if your family owned a radio. That's a great piece of information to flesh out your family and learn about them as people.
When you are correlating information, that's not how you're using the information. This is more dry, but that is also why it's fairly simple.
Correlating the data points is looking at them as flat pieces of information, "data." Data can easily be compared. Data points aren't clues when you use them in correlation; clues come from analysis, which is what you'll do after correlating.
The census was created for this "data point" use. Remember, its purpose is to gather statistics. A data point from a census record is usually the data in one column and one row. To determine if you have the right person/family, you need to go beyond just one row.
Expanding the Data Points
In the birth date/place correlation in my previous post, you looked at the information for one person only. That means data from one row in each census.
You will need to include information from other rows to determine if you are looking at census records for one family or several. You also need to look at the other household members and some of their data, which will involve other rows. In this case, the other household members are additional data points.
You also need to treat given names and surname as two different data points. Given name is one data point, regardless of how many names or initials are used.
Everything is now set up to begin the actual correlation.
- You have gathered census records.
- You have identified the "data points" from each census record.
Choosing Data to Compare
You will need to make personal choices about which data points to use based on your family and which census records you are using. Everyone can start their correlation with birth information but also include given names.
It's important to record the full given name exactly as it appears in the census. Remember, these are data points, not "facts." They may be wrong or you may have the wrong person. If you don't record them exactly as shown, you're skewing your data and limiting its use for correlating.
Also, include this information for the other members of the household, including their relationships. Only include relationship if their relationship is explicitly given. Don't add something, "because I know it's true."
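If you keep your correlation somewhere other than the Evernote form, the recording rules above can be sketched as a small record type. This is only a rough illustration in Python, not part of the form itself. The class name HouseholdEntry, its fields, and every sample value below are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HouseholdEntry:
    """One person on one census page, copied exactly as written."""
    census_year: int
    given_name: str                     # one data point, however many names or initials
    surname: str                        # a separate data point from the given name
    age: Optional[str] = None           # kept as text so odd entries like "6/12" survive
    birthplace: Optional[str] = None
    relationship: Optional[str] = None  # filled in only when the census states it


# Invented rows for illustration; these are not from a real census page.
earlier_census = [
    HouseholdEntry(1870, "Jno. W.", "Smyth", "34", "Ga"),
    HouseholdEntry(1870, "M. E.", "Smyth", "30", "Ga"),
]
later_census = [
    HouseholdEntry(1880, "John", "Smith", "45", "Georgia", "Head"),
    HouseholdEntry(1880, "Mary E.", "Smith", "39", "Georgia", "Wife"),
]

# The earlier rows have no stated relationship, so the field stays empty
# instead of being filled in "because I know it's true."
for entry in earlier_census + later_census:
    print(entry.census_year, entry.given_name, entry.surname, entry.relationship)
```

Notice the names are not "cleaned up" to match each other. "Jno. W." and "John" stay as two different recorded values, and deciding whether they are the same man is left for the analysis stage.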
Here's why you need to only include the exact information given...
The longer I do genealogy, the more examples of "amazing" coincidences I come across. I can think of three cases (just since I've been a professional) where I've thought I found the right family in additional census records.
After more research I discovered in each case, it was not the same family. The household or family had an amazing resemblance to the family I was researching, though. They were similar families when comparing multiple data points including first and last name, age, residence, occupation, and names and ages of family/household members.
In two of the cases, the households were quite large and matched multiple names, ages, and general description (so the combination and order of boys and girls in the household was acceptable). In neither case did the "wrong" family turn out to be cousins, who would be expected to have similar naming patterns.
If you are researching a family that has cousins in the area, it is quite common to find a household that appears reasonably similar AND includes that "unusual" given name.
How could I have gotten the wrong families? One of the problems with census records is they are ten years apart. Children come and go as they are born, move out, or die. Some families take boarders or have farm hands or other employees enumerated with them. Collateral family members may live with the family but not consistently.
On top of this, people can have multiple given names and even more nicknames. Finish it off with the errors in census records. With this combination, you can have the correct family in two census records with none of the same given names listed, the surname mauled, and ages skewed. It doesn't take much for a different family to resemble just one of these records.
To see an example of how a census record can be wrong without it being the enumerator's fault, check out this census entry.
The solution is comparing multiple data points from each record (in our example, each census year) and comparing as many records (census years) as you have available.
Capture other data points that appear on multiple census records for your family. This could be immigration year or naturalization status but it could also be the birthplace of parents or even a street address.
Occupation is consistently asked. You may need to do some research when you get to the analysis stage as some people work in the same industry doing different jobs or the same job can have different census descriptions. There are so many occupations, the enumerator instructions may not provide enough information to be sure two different occupations are actually the same.
You can include more data points; decide based on the project you are working on.
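For readers who like to see the layout spelled out, here is a minimal sketch of lining up several data points across several census years. It assumes each census record has been typed in as a simple Python dictionary. The function names build_grid and format_grid, the data point names, and all the values are invented for illustration.

```python
def build_grid(records):
    """Turn per-census records into {data_point: {year: value}}."""
    grid = {}
    for record in records:
        year = record["year"]
        for point, value in record.items():
            if point == "year":
                continue
            grid.setdefault(point, {})[year] = value
    return grid


def format_grid(grid, years):
    """Lay the grid out with one row per data point and one column per census year."""
    lines = [" | ".join(["data point"] + [str(y) for y in years])]
    for point, by_year in grid.items():
        cells = [by_year.get(y, "") for y in years]  # blank when not asked or not found
        lines.append(" | ".join([point] + cells))
    return "\n".join(lines)


# Invented values for illustration only.
records = [
    {"year": 1900, "occupation": "Farmer", "father_birthplace": "Ireland",
     "immigration_year": "1882"},
    {"year": 1910, "occupation": "Farm laborer", "father_birthplace": "Ireland",
     "immigration_year": "1881", "street": "Mill Road"},
    {"year": 1920, "occupation": "Farmer", "father_birthplace": "Ire.",
     "street": "Mill Rd"},
]

grid = build_grid(records)
print(format_grid(grid, [1900, 1910, 1920]))
```

A data point that only appears in some years (like the street here) simply leaves a blank cell, which is the same thing you would see in a table-based form.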
Patterns
Once you've recorded the selected data points in the form, you are looking for patterns. There isn't a hard and fast rule about this.
In my research (personal and professional) I've found children are fairly consistently recorded in the correct birth order, regardless of whether their ages are correct. However, I've also seen certain children consistently switched back and forth, with no apparent birth order.
One of the strengths of correlation is it reveals a common pattern for the family you are researching.
There are all kinds of patterns that can appear when you correlate records. This even includes a pattern of inconsistency.
When you correlate all census records for a person, and even more if you include his/her siblings and all his/her children, you have a chance to discover the pattern that applies to that family.
This could be complete inconsistency, a high level of accuracy, or consistency in certain information but not other information.
If you find only the one person is inconsistent but other family members are consistent, you've learned something you can use as a clue when researching that person.
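Spotting the one inconsistent person can be sketched in a few lines, assuming you have already collected one data point (birthplace, in this made-up example) for each family member across the census years. The names and values are invented, and the exact-text comparison is a deliberately crude assumption: "Ga" and "Georgia" would count as different values here, and judging whether they really differ belongs to analysis.

```python
# Invented values: birthplace as recorded for each person, by census year.
birthplaces = {
    "father": {1900: "Ohio", 1910: "Ohio", 1920: "Ohio"},
    "mother": {1900: "Ohio", 1910: "Kentucky", 1920: "Indiana"},
    "eldest child": {1900: "Ohio", 1910: "Ohio"},
}


def distinct_values(by_year):
    """Return the different values recorded for one person, ignoring the year."""
    return sorted(set(by_year.values()))


# A person is flagged when more than one distinct value was recorded.
inconsistent = [
    person for person, by_year in birthplaces.items()
    if len(distinct_values(by_year)) > 1
]

for person, by_year in birthplaces.items():
    print(person, distinct_values(by_year))
print("Inconsistent:", inconsistent)
```

In this example only the mother is flagged, which is the kind of pattern you would then treat as a clue when researching her.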
Equality Among Data
One error I've found less experienced genealogists make is giving more weight to a name than the other data points.
Remember, I'm giving you information about trying to determine if you've identified the right person/family. That means you have a problem.
There can be a problem with a name just as well as with other data. Sometimes it is just the name that is wrong and the other details are right. This is part of treating the information as data points. The names (given and surname) are data points, too. If you are including them in your correlation, you should treat them equally with the other information.
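Here is one way to picture "equal treatment" in practice, as a rough sketch only. The compare function and the two records are made up. The given name is checked exactly like every other data point and gets no extra weight.

```python
def compare(record_a, record_b):
    """Report agreement for every data point the two records share."""
    shared = sorted(set(record_a) & set(record_b))
    return {point: record_a[point] == record_b[point] for point in shared}


# Invented values for illustration only.
record_1900 = {
    "given_name": "Wm.", "surname": "Carter", "birthplace": "Ohio",
    "father_birthplace": "Virginia", "occupation": "Blacksmith",
}
record_1910 = {
    "given_name": "Bill", "surname": "Carter", "birthplace": "Ohio",
    "father_birthplace": "Virginia", "occupation": "Blacksmith",
}

result = compare(record_1900, record_1910)
agreeing = sum(result.values())
print(f"{agreeing} of {len(result)} shared data points agree")
for point, same in result.items():
    print(point, "agrees" if same else "differs")
```

The given name is the only data point that differs here. Counted as one data point among five, that mismatch doesn't automatically rule the record out. Whether "Wm." and "Bill" are the same man is a question for the analysis that follows.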
Learning More
Once you've correlated the data, you can start to do an analysis. The purpose of this post is not to go into details about how to do that.
With census records it is usually pretty obvious what you can do next or why you are having problems. You'll start to see clues and get ideas of how else correlation could help you.
If you need to learn more, a great blog dedicated to helping genealogists with self education is Adventures in Genealogy Education. One post with suggestions related directly to correlation and analysis is "Studying Evidence Analysis, Part 2." Although the post itself isn't about correlation, correlation is a topic included in several of the suggestions.
One of the suggestions is webinars from the Board for Certification of Genealogists (BCG; you can see my post about why these aren't just for the certifiable, here). Additionally, the BCG website includes some articles and additional suggestions for furthering your own education.
Taking It Further
Census correlation is fairly simple to do and a great way for an Occasional Genealogist to get more from the census without needing a lot of time.
It's a skill every genealogist needs, and you can improve your skills and revisit a correlation later. You need to write down what you've found and keep track of it, not just use the form and try to "remember" what it revealed to you.
A great thing about doing this in Evernote is it makes it easy to come back and review the same correlation when you have new skills or a new perspective. You can also copy the note and adjust the correlation to include additional (non-census) records and slightly different data points so you can get even more out of this technique while saving time (not needing to re-enter information in the table).
If you are a fan of spreadsheets, you can do this in a spreadsheet. To deal with the issue of "seeing" everything, you can simply filter records. If you like spreadsheets, it can be a great time saver to have one spreadsheet for a person/family and include all the data points in it. If you don't like spreadsheets or aren't familiar with all the tools for using them, it will be easier to use separate tables or another layout that works for you.
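If you'd rather build that single spreadsheet with a little code, the idea might look something like the sketch below. It assumes a "long" layout with one row per recorded data point. The column names, the helper functions filter_rows and to_csv, and all the values are invented. Any spreadsheet program's built-in filter does the same job as filter_rows.

```python
import csv
import io

COLUMNS = ["person", "census_year", "data_point", "value"]

# Invented values for illustration only.
rows = [
    {"person": "John", "census_year": "1900", "data_point": "birthplace", "value": "Ohio"},
    {"person": "John", "census_year": "1910", "data_point": "birthplace", "value": "Ohio"},
    {"person": "John", "census_year": "1900", "data_point": "occupation", "value": "Farmer"},
    {"person": "Mary", "census_year": "1900", "data_point": "birthplace", "value": "Kentucky"},
]


def filter_rows(all_rows, **criteria):
    """Keep only the rows whose columns match every given value."""
    return [
        row for row in all_rows
        if all(row.get(column) == wanted for column, wanted in criteria.items())
    ]


def to_csv(all_rows):
    """Return the rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(all_rows)
    return buffer.getvalue()


# "Filtering" so you only see one person's birthplaces across the years.
johns_birthplaces = filter_rows(rows, person="John", data_point="birthplace")
print(to_csv(johns_birthplaces))
```

Keeping everything in one long table is what makes filtering useful: you enter each data point once, then narrow the view to whatever comparison you are working on.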
Whether you're an Occasional Genealogist or not, correlating genealogical information can reveal clues you might otherwise miss. Starting with census records is a great, and fairly easy, introduction to this powerful technique.
If you're excited to add correlation and analysis to your toolbox and really want to learn more, I HIGHLY recommend purchasing Mastering Genealogical Proof. I do mean "purchasing" as this is a book you will want to come back to if you're serious about improving your skills (if you're not sure you're ready, go check it out at your local library). It includes exercises with answers in the back so you can test your skills. For this reason, I recommend purchasing a paper copy as it's a pain to digitally flip back and forth from the provided articles to the questions (and then the answers) in the Kindle edition. The Kindle edition is handy to read wherever you are, though.
Do you know of some great online examples of correlation and analysis? Share them in the comments.