Takeoffs are accelerating. The past few months notwithstanding, we’re seeing a dramatic shift in the demographics of air travel, and it presents new challenges for the governing bodies in charge of traveler security. The International Air Transport Association forecasts 8.2 billion air passengers in 2037, double the 2018 figure. About 44% of the growth will come from China and India. And while yearly trips per person in developed countries will grow 1-2%, trips will grow 4-8% in developing countries. Of course, rapid air travel growth in any country affects the security of an entire region, or the whole world.
In normal times, but especially now with the need to trace where people have been and who they have been in contact with, nations and border entities need to understand who the air travelers crossing their borders are. They want to match traveler data against known smugglers, human traffickers, terrorists, or disease carriers, and they also want to reconstruct each traveler’s flight history. Given how much information airlines request, and how much identification travelers bring to the airport, one might hope a few matching rules would be enough. But when you study the data, you start to see the problem that traveler entity resolution presents.
The Data Volume, Variety, and Velocity Problem of Passenger Data
The richness of airline passenger data is part of the problem. Are names and dates of birth enough for matching? Passport numbers and citizenship? How similar must two names be to be considered a match? What if a traveler presents a visa to security for one flight and a passport on the next? Flight reservation data has the same challenges and more, because it is older, often by months, and can pass through a family member or travel agent. How should matching rules handle nicknames, aliases, typos, or phonetic approximations of names written in a different language?
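To make the "how similar must two names be" question concrete, here is a minimal Python sketch of the kind of hand-written rule discussed here. It is an illustration only, not how GTAS or Tamr match records: the record fields (`name`, `dob`), the function names, and the 0.85 threshold are all assumptions chosen for the example.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, replace non-letters with spaces, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    letters_only = "".join(ch if ch.isalpha() else " " for ch in no_accents)
    return " ".join(letters_only.lower().split())


def name_similarity(a: str, b: str) -> float:
    """Character-level similarity between 0.0 and 1.0 after normalization."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def rule_match(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Hypothetical rule: identical date of birth and sufficiently similar names."""
    same_dob = rec_a["dob"] == rec_b["dob"]
    return same_dob and name_similarity(rec_a["name"], rec_b["name"]) >= threshold


if __name__ == "__main__":
    watchlist = {"name": "Muhammad Ali", "dob": "1980-01-01"}
    booking = {"name": "Mohammed Ali", "dob": "1980-01-01"}
    print(name_similarity(watchlist["name"], booking["name"]))
    print(rule_match(watchlist, booking))
```

Any single threshold is a compromise. Set it high and common transliteration variants slip through; set it low and unrelated travelers with similar names and the same birth date start to collide.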
Targeting rules struggle to capture the variety and complexity of this data. Simple rules may miss derogatory matches, or they may return so many hits that they overwhelm targeting teams. Furthermore, without accurate matching of passenger information, the trends in flight history analyses will be meaningless or hard to trust.
Data Mastering Solves the Biggest Entity Resolution Problems
To improve entity resolution in air traveler security, Tamr built an enhancement to the open source Global Travel Assessment System (GTAS) developed by U.S. Customs and Border Protection. Quoting its GitHub repository, “[GTAS] enables government agencies to automate the identification of high-risk air travelers in advance of their intended travel….The World Customs Organization (WCO) has partnered with U.S. Customs and Border Protection (US-CBP) because of the shared belief that every border security agency should have access to the latest tools. US-CBP has made this repository available to the WCO to facilitate deployment for its member states.”
The Tamr Enhancement to GTAS improves entity resolution in GTAS by applying machine learning matching models trained on demographically representative, global air traveler data. This data included thousands of examples of matching and non-matching pairs of traveler records for Tamr to study.
In tests on international air traveler data, Tamr found 98% of possible derogatory matches. That is, when travelers on a derogatory list made reservations or boarded their flights, GTAS with Tamr flagged 98% of them. GTAS presents these flagged travelers for human review. According to testing, targeters using Tamr with GTAS can expect to reject only one false positive for every correct match.
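In standard terms, the figures above correspond to a recall of 98% and, at one false positive per correct match, a precision of about 50%. The short Python sketch below turns those two ratios into a review workload. The count of 100 true derogatory travelers is a made-up input for illustration, not a test result, and the function name is hypothetical.

```python
def review_workload(true_matches: int, recall: float, false_pos_per_true_pos: float) -> dict:
    """Estimate how many flagged travelers a targeting team reviews.

    true_matches: derogatory travelers actually present (hypothetical input).
    recall: fraction of them that get flagged (0.98 in the reported tests).
    false_pos_per_true_pos: false positives per correct match (1.0 as reported).
    """
    found = true_matches * recall
    false_positives = found * false_pos_per_true_pos
    flagged = found + false_positives
    precision = found / flagged if flagged else 0.0
    return {
        "found": found,
        "missed": true_matches - found,
        "false_positives": false_positives,
        "flagged_for_review": flagged,
        "precision": precision,
    }


if __name__ == "__main__":
    # 100 is an assumed number of true derogatory travelers, for illustration only.
    for key, value in review_workload(100, 0.98, 1.0).items():
        print(f"{key}: {value:.2f}")
```

Under these assumptions, reviewers would look at roughly two flagged travelers for every true match they confirm.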
Tamr also examines travelers’ personal information (name, date of birth, citizenship, travel documents) to construct more complete flight histories and better understand the behavior of derogatory travelers. When targeters can determine all the aliases, citizenships, and documents for a traveler, they can more reliably find all of that traveler’s flights, and no one else’s. In tests, 99.7% of Tamr’s traveler history records are correctly grouped, and Tamr finds a traveler’s complete history 95% of the time.
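The grouping idea can be sketched with a simple union-find pass that links records sharing a document number or an exact name and date of birth. This is a deliberately naive stand-in for illustration, not Tamr's machine learning models; the record layout (`name`, `dob`, `documents`) and the linking keys are assumptions.

```python
from collections import defaultdict


def cluster_records(records: list) -> list:
    """Group record indices that are linked, directly or transitively, by a shared key."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[max(root_i, root_j)] = min(root_i, root_j)

    first_seen = {}  # linking key -> index of the first record carrying it
    for idx, rec in enumerate(records):
        keys = [("doc", doc) for doc in rec.get("documents", [])]
        keys.append(("name_dob", rec["name"].lower(), rec["dob"]))
        for key in keys:
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


if __name__ == "__main__":
    flights = [
        {"name": "Ana Silva", "dob": "1990-05-05", "documents": ["P123"]},
        {"name": "Ana Silva", "dob": "1990-05-05", "documents": ["V999"]},
        {"name": "A. Silva", "dob": "1990-05-05", "documents": ["V999"]},
        {"name": "Bo Chen", "dob": "1985-02-02", "documents": ["P777"]},
    ]
    print(cluster_records(flights))
```

In the sample data, the first and third records share no key directly, but both link to the second, so all three land in one history. The same transitivity is why exact-key linking is risky on its own: a single mistyped document number can merge two different travelers.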
Tamr performs this derogatory matching and history clustering in real time. As soon as GTAS receives a new flight reservation or flight status update, it queries Tamr for the travelers’ derogatory matches and flight history. Tamr responds in seconds.
Deploying Tamr with GTAS
The Tamr software runs on a dedicated server, separate from GTAS, on premises or in the cloud. It has modest system requirements; more powerful hardware improves query response times, the runtime of backend processing jobs, and the amount of traveler history you can store.
Tamr deploys a messaging service to communicate with GTAS, with minimal configuration. Once started, Tamr requires no interaction. It is pre-trained and configured for air traveler data. It automatically synchronizes the derogatory list and traveler entity groups with GTAS.
To get started with Tamr for GTAS, reach out to us at firstname.lastname@example.org. We’ll talk about feature/cost options, server hardware, and deployment.