Matching Imperfect Data: String Similarity
Any Big Data effort that brings together data from multiple sources must determine when records in different sources refer to the same real-world entity. This process is called record matching, and it needs to be accomplished even when the…