Hi,
We have a table of users with about 10000 rows. We have to import a
large text file into this table. While doing so we would like to
ensure that if a row already exists, it is not imported. Rather
than true/false logic, we need to apply fuzzy logic here, so that if we
find similarities in the name, address, and a few other columns, we
treat the row as a duplicate.
Such possible duplicates should be identified and logged to a separate
table.
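To make the requirement concrete, here is a rough sketch in Python of the kind of matching described above. It is not tied to any particular ETL tool. The column names, the 0.8 threshold, and the function names are all assumptions made up for illustration, and it uses the standard library's difflib.SequenceMatcher as a stand-in for whatever fuzzy comparison the real transformation would apply.

```python
from difflib import SequenceMatcher

# Hypothetical settings: the columns to compare and the score at or above
# which an incoming row is treated as a possible duplicate.
COLUMNS = ("name", "address")
THRESHOLD = 0.8


def similarity(a, b):
    """Return a 0..1 similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def row_score(incoming, existing):
    """Average the per-column similarity over the compared columns."""
    return sum(similarity(incoming[c], existing[c]) for c in COLUMNS) / len(COLUMNS)


def split_rows(incoming_rows, existing_rows):
    """Split incoming rows into rows to import and possible duplicates to log."""
    to_import, possible_duplicates = [], []
    for row in incoming_rows:
        # Find the existing row that resembles this incoming row the most.
        best = max(existing_rows, key=lambda e: row_score(row, e), default=None)
        if best is not None and row_score(row, best) >= THRESHOLD:
            possible_duplicates.append((row, best))  # log instead of importing
        else:
            to_import.append(row)
    return to_import, possible_duplicates


existing = [{"name": "John Smith", "address": "12 Main Street"}]
incoming = [
    {"name": "Jon Smith", "address": "12 Main St."},
    {"name": "Alice Brown", "address": "99 Oak Avenue"},
]
to_import, possible_duplicates = split_rows(incoming, existing)
```

This compares every incoming row against every existing row, which may be acceptable for about 10000 existing rows but would get slow for a very large file; a real solution would probably narrow the candidates first.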
Which transformation should I use for this?
Thanks,
Yash