Artificial Intelligence and Genealogy
Elevenses with Lisa Episode 32
In this episode we tackle a few small geeky tech questions about artificial intelligence, better known as AI, that may have a pretty big impact on your genealogy life. Questions like:
- Is artificial intelligence the same thing as machine learning? And if not, how are they related?
- And am I using AI, maybe without even being aware of it?
- And what impact is AI really having on our lives? Is it all good, or are there some pitfalls we need to know about?
We’re going to approach these with a focus on family history, but pretty quickly I think we’ll discover it’s a much more far-reaching subject. And that means this episode is for everyone.
While I’ve done my own homework on this subject and written about it in my book The Genealogist’s Google Toolbox, I’m smart enough to call in an expert in the field. So, my special guest is Benjamin Lee. He is the developer of the Newspaper Navigator, the new free tool that uses artificial intelligence to help you find and extract images from the free historical newspaper collection at The Library of Congress’ Chronicling America. I covered Newspaper Navigator extensively in Elevenses with Lisa episode 26.
Ben is a 2020 Innovator-in-Residence at the Library of Congress, as well as a third-year Ph.D. student in the Paul G. Allen School for Computer Science & Engineering at the University of Washington, where he studies human-AI interaction with his advisor, Professor Daniel Weld.
He graduated from Harvard College in 2017 and has served as the inaugural Digital Humanities Associate Fellow at the United States Holocaust Memorial Museum, as well as a Visiting Fellow in Harvard’s History Department. And currently he’s a National Science Foundation Graduate Research Fellow.
Thank you so much to Ben Lee for a really interesting discussion and for making Newspaper Navigator available to researchers. I am really looking forward to hearing from him about his future updates and improvements.
Artificial Intelligence and Genealogy
Covering technology and its application to genealogy is always a bit of a double-edged sword. It can be exciting and helpful, and also problematic in its invasiveness.
Tools like family tree hints, the Newspaper Navigator and Google Lens (learn more about that in Elevenses with Lisa episode 27) all have a lot to offer our genealogy research. But on a personal level, you may be concerned about the far-reaching effects of artificial intelligence on the future, and most importantly your descendants. In today’s deeply concerning cancel culture and online censorship, AI can seriously impact our privacy, security and even our freedom.
As I did my research for this episode I discovered a few things. Artificial intelligence and machine learning are having the same kind of massive and disruptive impact that DNA has had on genealogy, with almost none of the same publicity. (For background on DNA data usage, listen to Genealogy Gems Podcast episode 217. That episode covers the use of DNA in criminal cases and how our data potentially has wide-reaching appeal to many other entities and industries.)
A quick search for artificial intelligence ancestry.com in Google Patents reveals that work continues on ways to apply AI to DNA and genealogy.
AI now makes our genealogical research and family tree data just as valuable to others outside of genealogy.
This raises the question: who else might be interested in our family tree research and data?
Who Is Interested in Your Genealogy Data
One answer to this question is academic researchers. During my research on this subject, the Record Linking Lab at Brigham Young University surfaced as just one example. It’s run by a BYU Economics Professor who published a research paper on the lab’s work called Combining Family History and Machine Learning to Link Historical Records. The paper was co-authored with a Notre Dame Economics and Women’s Studies professor.
In this example, their goals are driven by economic, social, and political issues rather than genealogy. Their published paper does offer an eye-opening look at the value that those outside the genealogy community place on all of the personal data we’re collecting and the genealogical records we are linking. Our work is about our ancestors, and therefore it is about ourselves. Even if living people are not named on our tree, they are named in the records we are linking to it. We are making it all publicly available.
In the past, historical records like birth and death, military and the census have been available to these researchers, but on an individual basis. This made them difficult to work with. Academic (and industry) researchers couldn’t easily follow these records for individual people, families, and generations of families through time in order to draw meaningful conclusions. But for the first time, machine learning is being applied to online genealogy research data, making it possible to link these records to living and deceased individuals and their families.
It’s a lot to think about, but it’s important because it is our family history data. We need to understand how our data is being used inside and outside the genealogy sandbox.
Answers to Your Live Chat Questions About AI
One of the advantages of tuning into the live broadcast of each Elevenses with Lisa show is participating in the Live Chat and asking your questions.
From Linda J: What about all the “people search” sites (not genealogy) that have all, or a lot of, our personal data?
Lisa’s Answer: My understanding is that much of the information provided on many of the “people search” websites comes from public information. So while the information is much easier to access these days, it’s been publicly available for years. That information isn’t as accessible to projects like the one discussed in this episode because those websites don’t make their Application Programming Interface (known as API) publicly available like FamilySearch does.
From Doug H: Wouldn’t that potentially find errors in our trees?
Lisa’s Answer: Yes.
From Sheryl T: Do these academic researchers have access to the living people on the trees? Or are those protected from them as it is to the public?
Lisa’s Answer: They have access to all information attached to people marked as “Living Person.” Therefore, if the attached record names them, their identity would then be known. Click a hint on your tree at Ancestry for example, and the found records clearly spell out the name of the person they believe is your “Living” person.
From Nancy M: How long do the show notes stay available? am looking for Google Books two weeks ago and last week’s Allen Co Library.
Lisa’s Answer: The show notes remain available until the episode is archived in Premium Membership. You can find all of the currently available free Elevenses with Lisa episodes on our website: in the menu under VIDEOS, click Elevenses with Lisa.
From Nannie A: I heard a rumor that Ancestry.com has been sold. Do you know if that’s true?
Lisa’s Answer: Yes, they were sold again this year. Read:
Private equity firm Blackstone Group Inc. buying Ancestry.com for $4.7 billion
Private equity wants to own your DNA by CBS News.
My genealogy research looks a lot like yours. Some family tree lines go back to pre-Revolutionary War. Other lines are richly researched well into the early 19th century.
And then there’s THAT family line. You know the one I mean. The one where the courthouse containing the records we need has burned down, or the records were microfilmed ages ago but are still sitting in the FamilySearch granite vault due to copyright issues. Or worst of all, it appears the needed records just don’t exist.
Don’t let these obstacles allow you to give up hope.
Every day, new records are being discovered and digitized. Records that have been languishing in a copyright stalemate might suddenly be cleared for publication. Or a cousin could contact you out of the blue with the letters your grandmother sent hers. We never know when the records we’ve been waiting for, searching for, and yearning for, will bubble up to the surface.
Today I’m happy to share my story of a recent breakthrough that I never saw coming. Follow along with me as I take newly unearthed rocks and use tools to turn them into sparkling gems.
This is Almost Embarrassing
My one, agonizing family line that stops short in its tracks ends with my great grandfather Gustave Sporowski.
It’s almost embarrassing to admit. I’ve been at this nearly my whole life, and genealogy is my career for goodness sake! But there it is, a family tree with lovely far-reaching limbs except for this little stub of a branch sticking out on my maternal grandmother’s side.
I was about eight years old the first time I asked my grandma about her parents and their families. (Yes, this genealogy obsession goes back that far with me!) I still have the original page of cryptic notes she scratched out for me during that conversation.
She had several nuggets of information about her mother’s family. However, when it came to her father Gustave, she only recalled that he was the youngest of seven brothers. No names came to mind. I’ve always felt that if I could just identify some of the brothers, one of them may have records that provide more details about their parents.
According to his Petition for Naturalization, Gustave Sporowski and Louise Nikolowski were married in LutgenDortmund, Germany. This indicated that both moved west from East Prussia before emigrating. While I knew Louise’s immediate family were in the LutgenDortmund area as well, I had no idea whether Gustave moved there on his own or with his family.
Gus (as he was later known) emigrated from Germany in 1910, landing at Ellis Island. He toiled in the coal mines of Gillespie, Illinois, and eventually earned enough money to move his wife and children west to California in 1918.
After filing his papers and years of waiting, he proudly became a U.S. citizen in 1940.
On that paperwork, he clearly states his birthplace as Kotten, Germany. You won’t find this location on a map today. In 1881, the year he was born, the area was East Prussia. I remember the hours I spent with gazetteers many years ago trying to locate that little village nestled just within the border of Kreis Johannisburg. Being so close to the border meant that he could have attended church there or in a neighboring district.
The records in the area are scarce, and today the entire area is in Poland.
Surprisingly, the records situation is quite the opposite with his wife Louise, also from East Prussia. She lived not far away in Kreis Ortelsburg, and the records for the church her family attended in Gruenwald are plentiful. I’ve managed to go many more generations back with her family.
And so, poor Gus alone sits in my family tree.
I periodically search to see if there’s anything new that has surfaced, but to no avail. I even hired a professional genealogical firm to review my work and suggest new avenues. I guess it is good news to hear you’ve pursued all known available leads, but it’s not very rewarding.
Over time, we tend to revisit tough cases like this less frequently. They become quiet. Digital dust begins to settle on the computer files.
And then it all changes.
German Address Books at Ancestry.com
I regularly make the rounds of the various genealogy websites, making note of new additions to their online collections. I typically publish the updates on a weekly basis here on the Genealogy Gems blog. It makes my day when readers like you comment or email, bursting with excitement about how one of the collections I mentioned busted their brick wall. I love my job.
This week I’m the one who is bursting!
It started simply enough. My third stop on my regular records round-up tour was Ancestry.com. The list of new records was particularly robust this week. The word “Germany” always catches my eye, and the second item on the list jumped out at me:
I should have had a healthy dose of skepticism that I would be fortunate enough to find anything. But to be perfectly honest, I felt instinctively that I would! Have you ever just had that feeling that your ancestors are sitting right there ready to be found? If you’ve been researching your family history for a while, then I’m guessing you have. Such a nice feeling, isn’t it?
So, I clicked, and I simply entered Sporowski in the last name field and clicked Search.
Experience has taught me that there haven’t been a lot of folks through history with this surname, so I’m interested in taking a look at anyone who pops up in the results. And yippie aye oh, did they ever pop up!
The results list included 31 people with the surname of Sporowski!
These names came from the pages of address books much like the city directories so common in the U.S. Since this collection was new to me, I took a moment to read up on the history.
GENEALOGY RESEARCH TIP: Learning the History of the Genealogy Record Collection
To truly understand what you are looking at when reviewing search results, you need to acquaint yourself with the history of the collection.
- Why was it created?
- What does it include?
- What does it not include?
Look to the left of the search results and click Learn more about this database.
It’s definitely worth clicking this link because the next page may also include a listing of Related Data Collections, some of which you might not be aware of. These could prove very useful, picking up the pace to finding more records.
In the case of foreign language records, look for a link to the Resource Center for that country. There you may find translation help and tips for interpreting handwriting and difficult-to-read script.
On the Learn more about this database page, I learned some important things about these search results.
First, not every citizen was listed. Only heads of households were included. This means that wives and children would not appear. I did find some widows, though, because they were the head of their household.
Second, Optical Character Recognition (OCR) was used on this collection. Ancestry suggests looking for errors and providing corrections. But this information about OCR also implies something even more important to the genealogist. We must keep in mind that OCR is not perfect, so a search can miss a name that was misread. In this case, I planned on browsing the collection after reviewing the search results to ensure I didn’t miss anyone. This would include targeting people listed in the “S” section of directories for towns where I might expect the family to be.
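If you have typed up (or copied) the names from a directory page, a few lines of Python can flag likely OCR misreadings of a surname. This is only a minimal sketch using Python’s built-in difflib module; the list of names and the 0.8 similarity cutoff are made-up assumptions for illustration, not anything Ancestry provides.

```python
import difflib


def find_ocr_variants(surname, candidates, cutoff=0.8):
    """Return candidate names that closely resemble the surname, best match first."""
    # Compare in lowercase, but remember the original spelling of each candidate.
    originals = {}
    for candidate in candidates:
        originals.setdefault(candidate.lower(), candidate)
    matches = difflib.get_close_matches(
        surname.lower(),
        list(originals),
        n=max(len(originals), 1),  # allow every candidate to be returned
        cutoff=cutoff,             # assumed threshold; raise it to be stricter
    )
    return [originals[match] for match in matches]


# Hypothetical names as they might appear in an OCR'd "S" section.
ocr_names = ["Sporowski", "Sporowsky", "Spor0wski", "Sperowski", "Schmidt", "Sobotka"]
print(find_ocr_variants("Sporowski", ocr_names))
```

A lower cutoff catches more garbled spellings but also more unrelated names, so it is worth experimenting with the value on your own lists.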
I was particularly thrilled to see the name “Emil Sporowski” on the list.
Several months ago I found a World War I Casualty list from a newspaper published in 1918.
On it was listed Emil Sporowski and he was from the village of Kotten. This was the first mention of Gustave’s birthplace in the record of another Sporowski that I had ever found. So, you can imagine my delight as I stared at his name in the address book search results.
The icing on that cake was that he was listed in the address book of Bochum. That town name was very familiar to me because I had seen it on a few old family photos in Louise Sporowski’s photo album. Although the photos did not have names written on them, I could easily identify the folks who had the facial characteristics of Louise Nikolowski’s clan, and those sporting the large eyes with heavy lids like Gus.
Spreading the German Addresses Out with Spreadsheets
With one and a half pages delivering a total of 31 Sporowski names, I knew I had some work ahead of me to tease them apart. This got me thinking of a Genealogy Gems Podcast episode that I’m currently working on, which features a conversation with professional genealogist Cari Taplin. When I asked Cari how she organizes her data, she told me that she uses spreadsheets. I’m not typically a spreadsheet kind of gal, but in this case, I could see the benefits. Spreadsheets offer a way to get everybody on one page. And with the power of Filters and Sorting you can slice and dice the data with ease. My first sort was by town.
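For readers who prefer a script to a spreadsheet program, the same sorting and filtering idea can be sketched in Python with the built-in csv module. The column names and every row below are placeholders invented for illustration; they are not the layout of my spreadsheet or actual entries from the address books.

```python
import csv
import io

# Made-up sample data: the columns and values are assumptions for illustration.
SAMPLE_CSV = """name,occupation,town,year
Emil Sporowski,Schlosser,Bochum,1926
Lina Sporowski,Wwe.,Bochum,1961
August Sporowski,Bergmann,Gelsenkirchen,1930
Emil Sporowski,Schmied,Bochum,1920
"""


def load_rows(text):
    """Read CSV text into a list of dictionaries, one per address book entry."""
    return list(csv.DictReader(io.StringIO(text)))


def sort_by_town(rows):
    """Sort entries by town, then by year, then by name."""
    return sorted(rows, key=lambda row: (row["town"], int(row["year"]), row["name"]))


def filter_by_town(rows, town):
    """Keep only the entries recorded in the given town."""
    return [row for row in rows if row["town"] == town]


rows = load_rows(SAMPLE_CSV)
for row in sort_by_town(rows):
    print(row["town"], row["year"], row["name"], row["occupation"])
```

Sorting on town first and year second keeps each town’s entries together in time order, which makes it easier to spot one person appearing across several editions.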
GENEALOGY RESEARCH TIP: Free Genealogy Gems Download
Click here to download the simple yet effective spreadsheet I used for this research project. If you find your German ancestors in this collection, it’s ready to use. Otherwise, feel free to modify to suit your needs in a similar situation.
As you can see in the spreadsheet, these address books include occupations. For example, Emil was listed both as a Schmied and a Schlosser. A simple way to add the English translation to my spreadsheet was to go to Google.com and search Google Translate. Words and phrases can be translated right from the results page.
You can also find several websites listing German occupations by Googling old german occupations.
I quickly ran into abbreviations that were representing German words. For example, Lina Sporowski is listed as Wwe.
A Google search of german occupations abbreviations didn’t bring a website to the top of the list that actually included abbreviations. However, by adding one of the abbreviations to the search such as “Wwe.” it easily retrieved web pages actually featuring abbreviations.
GENEALOGY RESEARCH TIP: Use Search Operators when Googling
Notice that I placed the abbreviation in quotation marks when adding it to my Google search query. Quotation marks serve as search operators, and they tell Google some very important information about the word or phrase they surround.
- The quotation marks tell Google that this word or phrase must appear in every search result. (If you’ve ever googled several words only to find that some results include some of the words, and other results include others, this will solve your problem.)
- They also tell Google that the word(s) MUST be spelled exactly the way you typed them in each search result. This is particularly helpful when searching an abbreviation like Wwe. which isn’t actually a word. Without the quotation marks, you will likely get a response from Google at the top of the search results page asking you if you meant something else.
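If you run the same kind of search often, the query described above can be assembled in a short Python sketch. The helper names here are hypothetical, and the code only builds the text you would type into Google plus an ordinary search link; it does not contact Google.

```python
from urllib.parse import urlencode


def build_google_query(words, exact_phrases=()):
    """Join plain words with quoted phrases that must appear exactly as typed."""
    quoted = [f'"{phrase}"' for phrase in exact_phrases]
    return " ".join(list(words) + quoted)


def build_search_url(query):
    """Turn a query string into a Google search link (special characters are escaped)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_google_query(["german", "occupations", "abbreviations"], ["Wwe."])
print(query)
print(build_search_url(query))
```

The first line printed is exactly what goes in the search box; the second is the same search as a link you could bookmark.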
Click here to receive my free ebook including all the most common Google search operators when you sign up for my free newsletter (which is always chock full of goodies).
Katherine was my guest on Genealogy Gems Premium Podcast Episode #151 available exclusively to our Premium eLearning Members. She’s also written a couple of articles for Genealogy Gems on German translation.
I’ve written an article you may find helpful not only for translation but also to help you with pronunciation called How to Pronounce Names: Google Translate and Name Pronunciation Tools.
As it turns out, Wwe. stands for Witwe, the German word for widow. This tells me that Lina’s husband was deceased by 1961.
Finding the German Addresses in Google Earth
The most glorious things found in these old address books are the addresses themselves!
Google Earth is the perfect tool to not only find the locations but clarify the addresses. Many were abbreviated, but Google Earth made quick work of the task.
Unlike other free Google Tools, Google Earth is available in a variety of forms:
- Free downloadable software
- Google Earth in the Chrome Web browser
- A mobile app
Each has powerful geographic features, but I always recommend using the software. The web version and app don’t have all the tools available in the software. All versions require an internet connection. You can download the software here.
In the Google Earth search box I typed in the address. Don’t worry if you don’t have the full address or if you think it may be spelled incorrectly. Google Earth will deliver a results list of all the best options that most closely match.
In my case, reliable Google Earth not only gave me complete addresses, but also the correct German letters.
Soon I found myself virtually standing outside their homes thanks to Google Earth’s Street View feature!
Here’s how to use Street View in Google Earth:
- Zoom in close to the location
- Click on the Street View icon in the upper right corner (near the zoom tool)
- Drag the icon over the map and blue lines will appear where Street View is available
- Drop the icon directly on the line right next to the house
- Use the arrow keys on your keyboard to navigate in Street View or simply use your mouse to drag the screen
I went through the entire list. As I found each location in Google Earth, I checked it off on the spreadsheet.
GENEALOGY RESEARCH TIP: Create a Folder in Google Earth
When you have several locations like this to plot, I recommend creating a folder in the Places panel in Google Earth. It’s super easy to do and will help you stay organized. Here’s how:
- Right-click (PC) on the My Places icon at the top of the Places panel (left side of the Google Earth screen)
- Select Add > Folder in the pop-up menu
- A New Folder dialog box will appear
- Type the name of your folder
- Click OK to close the dialog box
- You can drag and drop the folder wherever you want it in the Places panel
- Click to select the folder before placing your Placemarks. That way each placemark will go in that folder. But don’t worry, if you get a placemark in the wrong spot, just drag and drop it into the folder.
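Google Earth can also open KML files, so a folder of placemarks can be prepared outside the program. The following is a minimal sketch, assuming you already know each location’s longitude and latitude; the folder name, the placemark label, and the coordinates are placeholders, not real Sporowski addresses.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml_folder(folder_name, places):
    """Build KML text with one Folder holding a Placemark per (name, lon, lat)."""
    ET.register_namespace("", KML_NS)  # write plain <kml> tags without a prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    folder = ET.SubElement(document, f"{{{KML_NS}}}Folder")
    ET.SubElement(folder, f"{{{KML_NS}}}name").text = folder_name
    for name, lon, lat in places:
        placemark = ET.SubElement(folder, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML lists longitude first, then latitude, then altitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


# Placeholder entry: the label and coordinates are illustrative only.
sample_places = [("Example household, Bochum", 7.2162, 51.4818)]
print(build_kml_folder("Sporowski Homes", sample_places))
```

Save the printed text to a file ending in .kml and open it in Google Earth; the folder should then appear in the Places panel, where it can be dragged and dropped like any folder created by hand.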
It didn’t take long to build quite a nice collection of Sporowski homes in Germany!
The beauty of Google Earth is that you can start to visualize your data in a whole new way. Zooming out reveals these new findings within the context of previous location-based research I had done on related families. All the Sporowskis that I found in the German Address books at Ancestry.com are clustered just five miles from where photos were taken that appear in Louise Sporowski’s photo album.
I’ve Only Just Begun to Discover my German Ancestors at Ancestry.com
We’ve covered a lot of ground today, but this is just the beginning. There are additional sources to track down, timelines to create, photos to match up with locations, and so much more. In many ways, I’ve only scratched the surface of possibilities. But I need to stop writing so I can keep searching! 😊
I hope you’ve enjoyed taking this journey with me. Did you pick up some gems along the way that you are excited to use? Please leave a comment below! Let us all know which tips and tools jumped out at you, and any gems that you found.