I'm trying to match names between two databases. Essentially, I need to find all cases where two names match at 80% or better. Does anyone know how to do this? If not, can you please tell me where I should post this question?

Regards,
Sumanth

Replies to This Discussion

I used the SPEDIS function in SAS to compare two names. The function returns a spelling-distance score between the two words (0 for an exact match, larger values for less similar words), and you can then apply a threshold to that score to keep only the name matches. It is not obvious what an 80% match means, so I think it's better to use something like SPEDIS.
Have a look at http://support.sas.com/documentation/cdl/en/lrdict/61724/HTML/defau...
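For anyone without SAS, the same score-then-threshold idea can be sketched in Python. This is only a rough illustration, not SPEDIS: it assumes "80% match" means a `difflib.SequenceMatcher` ratio of at least 0.8, and the helper names `similarity` and `match_names` are made up for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] after light normalization."""
    # Lowercase and collapse whitespace so "John  Smith" equals "john smith".
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def match_names(left, right, threshold=0.8):
    """Return (left_name, right_name, score) for every pair at or above threshold."""
    matches = []
    for l_name in left:
        for r_name in right:
            score = similarity(l_name, r_name)
            if score >= threshold:
                matches.append((l_name, r_name, round(score, 3)))
    return matches


if __name__ == "__main__":
    base_a = ["John Smith", "Sumanth Rao"]
    base_b = ["Jon Smith", "Mary Jones", "sumanth rao"]
    for pair in match_names(base_a, base_b):
        print(pair)
```

This compares every name in one list against every name in the other, so it is only practical for small lists; for large bases you would want to block on something (first letter, postcode, etc.) before scoring.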

tomas
String comparators are old news in statistics. RegEx-like operators such as Soundex are specialized, but they all do essentially the same thing, each according to your "model" of what makes two names similar. Relying on a default function like SPEDIS without having such a model is computing, not analytics.
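To make the "model" point concrete, here is a small Python sketch of American Soundex, which treats names as similar when they sound alike rather than when they share characters. It is a simplified illustration written for this thread (the function name `soundex` and the table `CODES` are just names chosen here), not the implementation used by SAS or any particular library.

```python
# Map each coded consonant to its Soundex digit; vowels, h, w, y are uncoded.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code, or "" if no letters."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


if __name__ == "__main__":
    # "Smith" and "Smyth" differ in spelling but share the code S530.
    print(soundex("Smith"), soundex("Smyth"))
```

Under this model "Robert" and "Rupert" are the same name (both R163), which a character-based score would not say. Which behaviour is right depends on why the names differ in your data: typos, transcription by ear, abbreviations, and so on.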

"sam" has a nice site on similarity metrics in general, including string comparators and some discussion of their computing cost versus benefit.

http://www.dcs.shef.ac.uk/~sam/stringmetrics.html

TV


© 2013   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
