Hi Iam trying to join two tables with respect to their name and address. But the problem is that there is a mismatch in the spellings. like if its 1 E st in table A then the same is written as One E st in table B.,so its unable to return all possbile outcomes. Also there are problems with extra spaces,abbrevations etc, It is a file of almost 7000 records. Anyone can suggest me the best way to solve this porblem. Thank you very much.
I've had some success with SSIS's Fuzzy Lookup functionality. It's not perfect and you'll probably have to play around with the cutoff range, but it will eliminate a large percentage of the problem records. Unfortunately, you'll probably need to check the results afterward, but it will eliminate a large chunk of just-off-by-a-little results.
The first question is can you you change the schema? If you can, then it may make sense to create a table C that has an ID number and the addresses, and then have both table A and table B link to table C. When you need to find where the address in table A and table B matches, then you can do it on the key number from table C. If you can't do that, can you create and import a CLR assembly? If you can, then you can write the matching function in C# where you can invoke regex and invoke conditionals to do things like try the spelled out version of numbers when present (or vice versa). It would liekly be slow, but it could be more effective then anything you could do in SQL. If neight of those are options, then I think Hakan's suggestions are the way to go.