AnalyticBridge

Social Network For Analytic Professionals

I'm trying to match names between two databases. I essentially need to find all cases where there is at least an 80% match between names. Does anyone know how to do this? If not, can you please tell me where I should post this to get this information?

Regards,
Sumanth


Replies to This Discussion

I used the SPEDIS function in SAS to compare two names. The function returns a score that depends on the similarity between the words, and you can then apply a threshold to that score to filter the name matches. It is not obvious what an 80% match means, so I think it's better to use something like SPEDIS.
Have a look at http://support.sas.com/documentation/cdl/en/lrdict/61724/HTML/defau...
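
For readers without SAS, the score-then-threshold idea can be sketched in Python. This is not SPEDIS: it swaps in the ratio from the standard library's difflib.SequenceMatcher, which runs from 0 to 1 with higher meaning more similar. The function names, the sample names, and the 0.8 cutoff are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_names(left, right, threshold=0.8):
    """Return (left_name, right_name, score) for pairs scoring >= threshold."""
    matches = []
    for a in left:
        for b in right:
            score = similarity(a, b)
            if score >= threshold:
                matches.append((a, b, round(score, 3)))
    return matches


# Hypothetical sample data standing in for the two sources of names.
base_a = ["Jon Smith", "Maria Garcia"]
base_b = ["John Smith", "Mario Garcia", "Peter Jones"]

for row in match_names(base_a, base_b):
    print(row)
```

This compares every name against every other name, which is fine for small lists but grows quickly with size. Whether a ratio of 0.8 corresponds to the intended "80% match" depends on the comparator chosen.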

tomas


String comparators are old news in statistics. Regex-like and Soundex-style operators are specialized, but they all do essentially the same thing, each according to your "model". Relying on a default routine like SPEDIS without having such a model is only computing, not analytics.

"sam" has a nice site on similarity metrics in general, including string comparators and some discussion of their computing cost versus benefit.

http://www.dcs.shef.ac.uk/~sam/stringmetrics.html
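
As one concrete example of a string comparator, here is a minimal sketch of Levenshtein edit distance turned into a 0..1 similarity. The choice of metric and the normalization (dividing by the longer string's length) are assumptions for illustration, not something taken from the site above.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character inserts, deletes and substitutions needed
    to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Normalize edit distance to 0..1 using the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


print(edit_similarity("john", "jon"))  # 0.75
```

The cost is proportional to the product of the two string lengths, which is the kind of computing-cost trade-off mentioned above. Under this particular "model", "john" and "jon" score 0.75 and would miss an 80% cutoff, which shows how much the threshold depends on the comparator.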

TV



© 2010   Created by Vincent Granville
