Artificial Intelligence and Genealogy Elevenses with Lisa Episode 32
In this episode we tackle a few small geeky tech questions about artificial intelligence, better known as AI, that may have a pretty big impact on your genealogy life. Questions like:
Is artificial intelligence the same thing as machine learning?
And if not how are they related?
And am I using AI, maybe without even being aware of it?
And what impact is AI really having on our lives? Is it all good, or are there some pitfalls we need to know about?
We’re going to approach these with a focus on family history, but pretty quickly I think we’ll discover it’s a much more far-reaching subject. And that means this episode is for everyone.
Watch the free video below.
While I’ve done my own homework on this subject and written about it in my book The Genealogist’s Google Toolbox, I’m smart enough to call in an expert in the field. So, my special guest is Benjamin Lee. He is the developer of the Newspaper Navigator, the new free tool that uses artificial intelligence to help you find and extract images from the free historical newspaper collection at The Library of Congress’ Chronicling America. I covered Newspaper Navigator extensively in Elevenses with Lisa episode 26.
Ben is a 2020 Innovator-in-Residence at the Library of Congress, as well as a third-year Ph.D. student in the Paul G. Allen School for Computer Science & Engineering at the University of Washington, where he studies human-AI interaction with his advisor, Professor Daniel Weld.
He graduated from Harvard College in 2017 and has served as the inaugural Digital Humanities Associate Fellow at the United States Holocaust Memorial Museum, as well as a Visiting Fellow in Harvard’s History Department. And currently he’s a National Science Foundation Graduate Research Fellow.
Thank you so much to Ben Lee for a really interesting discussion and for making Newspaper Navigator available to researchers. I am really looking forward to hearing from him about his future updates and improvements.
Artificial Intelligence and Genealogy
Covering technology and its application to genealogy is always a bit of a double-edged sword. It can be exciting and helpful, and also problematic in its invasiveness.
Tools like family tree hints, the Newspaper Navigator and Google Lens (learn more about that in Elevenses with Lisa episode 27) all have a lot to offer our genealogy research. But on a personal level, you may be concerned about the far-reaching effects of artificial intelligence on the future and, most importantly, on your descendants. In today’s deeply concerning cancel culture and online censorship, AI can seriously impact our privacy, security and even our freedom.
As I did my research for this episode I discovered a few things. Artificial intelligence and machine learning are having the same kind of massive and disruptive impact that DNA has had on genealogy, with almost none of the same publicity. (For background on DNA data usage, listen to Genealogy Gems Podcast episode 217. That episode covers the use of DNA in criminal cases and how our data potentially has wide-reaching appeal to many other entities and industries.)
A quick search of artificial intelligence ancestry.com in Google Patents reveals that work continues on ways to apply AI to DNA and genealogy. (See image below)
Patent search result: a pending patent involving AI and DNA by Regeneron Pharmaceuticals, Inc.
AI now makes our genealogical research and family tree data just as valuable to others outside of genealogy.
This raises the question: who else might be interested in our family tree research and data?
Who Is Interested in Your Genealogy Data
One answer to this question is academic researchers. During my research on this subject, the Record Linking Lab at Brigham Young University surfaced as just one example. It’s run by a BYU Economics Professor who published a research paper on the lab’s work called Combining Family History and Machine Learning to Link Historical Records. The paper was co-authored with a Notre Dame Economics and Women’s Studies professor.
In this example, their goals are driven by economic, social, and political issues rather than genealogy. Their published paper does offer an eye-opening look at the value that those outside the genealogy community place on all of the personal data we’re collecting and the genealogical records we are linking. Our work is about our ancestors, and therefore it is about ourselves. Even if living people are not named on our tree, they are named in the records we are linking to it. We are making it all publicly available.
In the past, historical records like birth and death, military and the census have been available to these researchers, but on an individual basis. This made them difficult to work with. Academic (and industry) researchers couldn’t easily follow these records for individual people, families, and generations of families through time in order to draw meaningful conclusions. But for the first time, machine learning is being applied to online genealogy research data, making it possible to link these records to living and deceased individuals and their families.
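For the geeks among us, here is a rough idea of what "linking records" can look like in code. This is only a minimal sketch, not the method used by the BYU lab or anyone else mentioned here. The record fields (id, name, birth_year, birthplace), the sample people, the weights and the threshold are all made up for illustration. It also uses a simple hand-weighted similarity score rather than true machine learning; a trained model would learn those weights from links that genealogists have already made in their trees.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity between two names, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a: dict, rec_b: dict, year_tolerance: int = 2) -> float:
    """Combine name similarity, birth-year closeness and birthplace agreement."""
    name = name_similarity(rec_a["name"], rec_b["name"])
    year_gap = abs(rec_a["birth_year"] - rec_b["birth_year"])
    year = 1.0 if year_gap <= year_tolerance else 0.0
    place = 1.0 if rec_a["birthplace"].lower() == rec_b["birthplace"].lower() else 0.0
    # Hand-picked weights (an assumption); a trained model would learn these.
    return 0.6 * name + 0.25 * year + 0.15 * place


def link_records(source: list, candidates: list, threshold: float = 0.85) -> list:
    """Pair each source record with its best-scoring candidate, if it clears the threshold."""
    links = []
    for a in source:
        best = max(candidates, key=lambda b: match_score(a, b), default=None)
        if best is not None and match_score(a, best) >= threshold:
            links.append((a["id"], best["id"]))
    return links


# Hypothetical sample records: the same man with his surname spelled two ways.
census_1900 = [
    {"id": "c1", "name": "John Smyth", "birth_year": 1872, "birthplace": "Ohio"},
]
census_1910 = [
    {"id": "d1", "name": "John Smith", "birth_year": 1873, "birthplace": "Ohio"},
    {"id": "d2", "name": "Mary Jones", "birth_year": 1880, "birthplace": "Iowa"},
]

print(link_records(census_1900, census_1910))  # prints [('c1', 'd1')]
```

The takeaway is that a spelling variation and a one-year difference in birth year are no obstacle: the two census entries are still linked to the same person, which is exactly what makes it possible to follow individuals and families through time.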
It’s a lot to think about, but it’s important because it is our family history data. We need to understand how our data is being used inside and outside the genealogy sandbox.
Answers to Your Live Chat Questions About AI
One of the advantages of tuning into the live broadcast of each Elevenses with Lisa show is participating in the Live Chat and asking your questions.
www.GenealogyGems.com/Elevenses
From Linda J: What about all the “people search” sites (not genealogy) that have all, or a lot of, our personal data? Lisa’s Answer: My understanding is that much of the information provided on many of the “people search” websites comes from public information. So while the information is much easier to access these days, it’s been publicly available for years. That information isn’t as accessible to projects like the one discussed in this episode because those websites don’t make their Application Programming Interface (known as API) publicly available like FamilySearch does.
From Doug H: Wouldn’t that potentially find errors in our trees? Lisa’s Answer: Yes.
From Sheryl T: Do these academic researchers have access to the living people on the trees? Or are those protected from them as it is to the public? Lisa’s Answer: They have access to all information attached to people marked as “Living Person.” Therefore, if the attached record names them, their identity would then be known. Click a hint on your tree at Ancestry for example, and the found records clearly spell out the name of the person they believe is your “Living” person.
From Nancy M: How long do the show notes stay available? am looking for Google Books two weeks ago and last week’s Allen Co Library. Lisa’s Answer: The show notes remain available until the episode is archived in Premium Membership. You can find all of the currently available free Elevenses with Lisa episodes on our website in the menu under VIDEOS click Elevenses with Lisa.
Podcasts have always faced an obstacle: it just hasn’t been that easy to find them or listen.
After I launched The Genealogy Gems Podcast in early 2007, I spent most of my time trying to explain to potential listeners how to “subscribe” to the show. Along came the smartphone, and eventually podcast apps, and things got a little easier. In 2010 we launched our own Genealogy Gems Podcast app in hopes of improving the listener experience even more. That’s great for those tenacious enough to find us in the first place, but what about everybody else? Although podcasts have experienced a huge surge in popularity thanks to the viral Serial podcast, 83% of Americans still aren’t listening on a weekly basis.
Pandora, the largest streaming music provider, entered the game today and plans to change all that. And thanks to you, our loyal listeners, The Genealogy Gems Podcast has been selected by Pandora as part of their initial offering of podcasts!
Read below how this music giant is going to tap technology and human curation to recommend podcasts to those who are sure to love them. I’m sure that once Americans discover through Pandora that their family history is just waiting to be discovered, and that The Genealogy Gems Podcast is here to help them do just that, we’ll be welcoming many new listeners. Keep reading for all the details from Pandora. And, be sure to sign up for the early access offering here. You can expect to start seeing our show on Pandora sometime in December.
Thanks for listening friend! Lisa Louise Cooke
PRESS RELEASE
OAKLAND, Calif.–(BUSINESS WIRE)–Pandora (NYSE:P), the largest streaming music provider in the U.S., today unveiled its podcast offering, powered by the Podcast Genome Project, a cataloging system and discovery algorithm that uses a combination of technology and human curation to deliver personalized content recommendations. Beginning today, Pandora will roll out beta access to select listeners on mobile devices. Those interested in early access to the offering can sign-up here, with general availability in the coming weeks.
“It might feel like podcasts are ubiquitous, but, eighty-three percent of Americans aren’t yet listening to podcasts on a weekly basis, and a majority of them report that’s because they simply don’t know where to start,” said Roger Lynch, Chief Executive Officer, Pandora. “Making podcasts – both individual episodes and series – easy to discover and simple to experience is how we plan to greatly grow podcast listening while simultaneously creating new and more sustainable ways to monetize them.”
Similar to how its namesake the Music Genome Project has helped Pandora become the best and easiest way to discover music online since 2005, the Podcast Genome Project recommends the right podcasts to the right listeners at the right time, solving the questions, “is there a podcast that’s right for me?” and “what should I listen to next?” It evaluates content based on more than 1500 attributes – spanning MPAA ratings, timely and evergreen topics, production style, content type, host profile, etc – and listener signals including thumbs, skips and replays. It also utilizes machine learning algorithms, natural language processing, and collaborative filtering methods for listener preferences. And, similar to the Music Genome Project, the Podcast Genome Project combines these techniques with our expert in-house curation team to offer episode-level podcast recommendations that reflect who you are today and evolve with you tomorrow.
“With the introduction of podcasts, listeners can now easily enjoy all of their audio interests – music, comedy, news, sports, or politics – on Pandora, the streaming service that knows their individual listening habits the best,” said Chris Phillips, Chief Product Officer, Pandora. “The Podcast Genome Project’s unique episode-level understanding of content knows exactly what podcast you’ll want to discover next, and will serve it up through a seamless in-product experience that is uniquely personalized to each listener and will continue to grow with their tastes over time.”
At launch, Pandora has partnered with top-tier publishers including APM, Gimlet, HeadGum, Libsyn, Maximum Fun, NPR, Parcast, PRX+PRI, reVolver, Slate, The New York Times, The Ramsey Network, The Ringer, WNYC Studios, and Wondery, and will continue to feature existing podcast content including Serial, This American Life and Pandora’s original Questlove Supreme, with many more to come in the future. These partnerships introduce hundreds of popular podcasts across a wide variety of genres including News, Sports, Comedy, Music, Business, Technology, Entertainment, True Crime, Kids, Health and Science, offering inspiring audio experiences for a variety of diverse interests.
ABOUT PANDORA
Pandora is the world’s most powerful music discovery platform – a place where artists find their fans and listeners find music they love. We are driven by a single purpose: unleashing the infinite power of music by connecting artists and fans, whether through earbuds, car speakers, live on stage or anywhere fans want to experience it. Our team of highly trained musicologists analyze hundreds of attributes for each recording which powers our proprietary Music Genome Project®, delivering billions of hours of personalized music tailored to the tastes of each music listener, full of discovery, making artist/fan connections at unprecedented scale. Founded by musicians, Pandora empowers artists with valuable data and tools to help grow their careers and connect with their fans.
The US National Archives has signed agreements with FamilySearch and Ancestry to put more of the Archives’ unique genealogical treasures online. We think that’s worth shouting about!
The National Archives has been working with FamilySearch and Ancestry for years to digitize genealogical treasures from its vaults. Contracts have been signed to continue efforts with both partners to digitize even MORE genealogy records at the National Archives: MORE birth, marriage, death, immigration and military service records! Here are some highlights from the contract:
1. Partners will now “be able to post segments of large collections immediately, rather than waiting for the entire collection to be completed.” This sounds familiar to users of FamilySearch, which regularly dumps un-indexed chunks of digitized content onto its site just to make it available faster.
2. The updated agreement contains provisions to protect “personally identifying information.”
3. Ancestry will have a shorter time period (by 12-24 months) during which they have exclusive rights to publish the images together with the index. After that, the National Archives can put the material on its site and/or share it with other partners.
4. The National Archives “will continue to receive copies of the digital images and metadata for inclusion in its online catalog…. The public will be able to access these materials free of charge from National Archives research facilities nationwide [not online]. Ancestry.com makes the digitized materials available via subscription.”
What kind of data is already online from The National Archives?
FamilySearch and Ancestry already host digital images of millions of National Archives documents: U.S. federal censuses. Passenger lists. Border crossings. Naturalization records. Compiled military service records. Freedman’s Bank and Freedmen’s Bureau records (the latter are currently being indexed). Federal taxation records. And the list goes on! According to the press release, before these partnerships began, “many of these records were only available by request in original form in the research rooms of the National Archives.”
Click here to search all the National Archives content on Ancestry (more than 170 million images; subscription required to view).
Just in case you’re wondering (and I was wondering), The National Archives isn’t playing favorites with their partnerships. This list shows that a National Archives partnership is pending with Findmypast. They’re already working with Fold3. I wasn’t surprised to see the John F. Kennedy Library on their list, but I wouldn’t have guessed the Royal Commission on the Ancient and Historical Monuments of Scotland!