Thanks for all your help
What sort of loop?
Collapse
X
-
-
I also would prefer to just use a select query and not update the original data. The query itself should run relatively quickly and it's never a bad idea to keep the original data.
As far as the fuzzy name matching goes, we have various articles on this forum that go over different methods of approaching that. I've found soundex to be wildly inaccurate.
Double Metaphone
Levenshtein
N-grams
As far as the birthdates go, you can calculate the difference between the two. Combine that with a name matching score and you can come up with a confidence score of how closely 2 entities match.Comment
-
Thanks Rabbit - I've had a quick look at the first article - vely interlesting. With regard to my database, as I explained, someone who fastened on a name from the original source would search on that. That is why it is important that all versions of the name are retained. I understand that, in principle, it's better to use a temporary field but I don't think that much is gained in time terms - not on a database of this size. Perhaps I will try both approaches and see what happens.Comment
-
someone who fastened on a name from the original source would search on that. That is why it is important that all versions of the name are retained.
I understand that, in principle, it's better to use a temporary field but I don't think that much is gained in time terms - not on a database of this size.Comment
-
Hello Rabbit
I'm conscious that this might sound ungrateful - Far from it. I get it that one wouldn't want to affect the integrity of data by removing or altering some part - but I don't see the harm in adding to it; after all data rarely comes ready set with a unique id.Comment
-
Adding to it is fine, no one is saying you can't add to the data. It's your choice whether you add to it using a query or add to it by inserting related entities into an "also known as" table like TwinnyFo suggests. All we are saying is you should avoid updating the original data if possible. Augmenting the original data is perfectly fine and expected, we are not saying you shouldn't do that. It's just that your earlier posts say you want to update the original data from one value to another value.Last edited by Rabbit; Dec 4 '18, 11:53 PM.Comment
Comment