Tricky SQL question

**Erland Sommarskog** · Feb 22 '06, 11:15 PM

Re: Tricky SQL question

Erich93063 (erich93063@gma il.com) writes:[color=blue]
> I have a field for "First Name", a field for "Last Name" and a field
> (looked up by ID) for "Company Name". Now here is the problem. In the
> usual case there will be a company name, a first name and a last name
> for example:
>
> Company Name: ACME Corporation
> First Name: John
> Last Name: Smith
>
> HOWEVER, sometimes the vendor is not part of a company and the users of
> the database have just been entering in the vendors name as the name of
> the company. So it would end up looking like this:
>
> Company Name: John Smith
> First Name: John
> Last Name: Smith
>
> So the problem is that the accounting program that I am importing these
> contacts into uses a weird schema. Instead of separate fields for first
> name, last name and company, they have a combined field for "Company
> Name/Last Name" and then a separate field for first name. So if the
> first name field is blank then we know this contact is a company and
> not an individual. As soon as we provide a first name, the record
> becomes an individual as opposed to a company.
>...
> SELECT CASE WHEN cu.vchrFirstNam e LIKE '%' + co.vchrCompanyN ame + '%' AND
> cu.vchrLastName LIKE '%' + co.vchrCompanyN ame + '%'
> THEN cu.vchrLastName
> ELSE co.vchrCompanyN ame
> END AS [Co./Last Name],
> CASE WHEN cu.vchrFirstNam e LIKE '%' + co.vchrCompanyN ame + '%' AND
> cu.vchrLastName LIKE '%' + co.vchrCompanyN ame + '%' THEN
> cu.vchrFirstNam e
> ELSE ''
> END AS [First Name],[/color]

You have it backwards: You are trying to see if "John" is like
%John Smith% which it isn't. But "John Smith" is like "%John%".

Really how you should do this, I don't know, because I suspect that
the quality of the data is not perfect. It it was, you should
just say:

CASE WHEN FirstName + ' ' + LastName = CompanyName

But users may have entered extra spaces between the first and last
name - or no space at all.

You could try

CASE WHEN CompanyName LIKE '%' + FirstName '%' AND
CompanyName LIKE '%' + LastName '%'

But there is entry where the users entered X first and last name, and
the company has an X in the name...

It would have been a lot simpler, if you had permitted CompanyName
to be NULL for private vendros.

By the way, in the output, shouldn't {First Name] be NULL rather than
'' for companies?

--
Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

Books Online for SQL Server 2005 at

http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx

Books Online for SQL Server 2000 at

http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

**Erich93063** · Feb 22 '06, 11:45 PM

Re: Tricky SQL question

OMG Im stupid. Ok yeah so now it works perfectly! THANKS

As far as the first name being NULL or not, I am generated a CSV file
so I just need an empty string. Unless you're saying it is more optimal
to use NULL instead of an empty string. As far as what would get
outputed to the CSV file it would be the same.

THANK YOU SO MUCH!!!!!!!!!!! !

**Andrew** · Feb 22 '06, 11:55 PM

Re: Tricky SQL question

"Erich93063 " <erich93063@gma il.com> wrote in message
news:1140647157 .967448.102260@ g47g2000cwa.goo glegroups.com.. .[color=blue]
> Ok so I'm gonna attempt to explain my problem as easy as I can. I have
> a database of contacts (they are actually vendors). I am writing a
> procedure that will query the vendors and generate a tab delimited file
> that will be used an accounting program (MYOB). In the vendors database
> I am exporting from, I have a field for "First Name", a field for "Last
> Name" and a field (looked up by ID) for "Company Name". Now here is the
> problem. In the usual case there will be a company name, a first name
> and a last name for example:
>
> Company Name: ACME Corporation
> First Name: John
> Last Name: Smith
>
> HOWEVER, sometimes the vendor is not part of a company and the users of
> the database have just been entering in the vendors name as the name of
> the company. So it would end up looking like this:
>
> Company Name: John Smith
> First Name: John
> Last Name: Smith
>[/color]

Unless I've misundestood what you're describing - the thing you absolutely
have to address is the data issues. It's a hoary old truism - but, GIGO.
In situations like this - I have always found that you will get a
significant number of entries which are duplicates. Eg - If you have John
Smith, Dave Jones and Min Patel from the same company - you may well have at
least 3 entries. If you have John Smith from several - perhaps competing
companies, or as individuals, you have a serious mess.
If you are trying to report on this data - you're feeding your decision
makers crap.
In the first instance - you have to try and get high level buy in to put
significant pressure on those entering the data to clean it and keep it
clean. This may involve a change to the interface at the front end - for
example, to force the user to enter a postcode/zip code for any new entry,
then next option is to choose from an entry already in the database - and
then - if it's not there - to allow them to select an option from a
commercial address lookup such as the PAF software available in the UK -
where you have to do a search linked to zip/post code.

All of this - obviously - has recource issues. You have to make it clear
that no matter how clever you try to be - you can't address shite data with
increasingly complex workrounds. A chaotic contacts database loses the
company money in a load of ways. If you claim that you can guesstimate what
people "should" have put in - the first time you fail on a juicy contract -
you're in the crap.

Your only professional and sensible option is to go to the people who are
living by this information to feed their families - get buy in and get the
people putting crap in to put it right, get agreement to implement a
strategy for ensuring that the data is useful, getting agreement that this
is stuck to.

Contacts are silver - leads are gold - you have to make sure they're usable
and tracable. Getting in touch with the wrong Mr Smith or Mr Singh is not
acceptable. Therefore, trying to address the issue by any other means than
a data cleaning exercise involving those putting in crap is not a good
option. Thwe ymay be putting in rubbish - but I bet they know who 90% + of
the people are. End users think computers can make heuristic decisions
based on their phone contacts with the customer - ot that yo can program
them. If you persist this myth - it'll get worse.

I'll be honest - an attempt at a workaround is easier to sell - but in this
case you really have to take the hard road and sell other people who know
who is who putting it right and knowing they have to keep it right. It's
not going to make you popular and by the time your company is making money
on it - you'll be forgotten. You'll be entitled to sit back and feel smug
though (;->

Summary;
There is *no* work around for unusable data. The better the data quality -
the more money your company makes. Mistakes on your on your contacts
database can be "very" embarassing and lose money - if you sell a
workaround, *you* are in the firing line. Get the data right. This helps
people using the data make money. The more they make - the more chance of
keeping your job - or getting the pay rise you deserve. This is a
management issue - *not* a technical one.

IMHO

All the best
Andrew

**Erland Sommarskog** · Feb 23 '06, 08:45 AM

Re: Tricky SQL question

Erich93063 (erich93063@gma il.com) writes:[color=blue]
> As far as the first name being NULL or not, I am generated a CSV file
> so I just need an empty string. Unless you're saying it is more optimal
> to use NULL instead of an empty string. As far as what would get
> outputed to the CSV file it would be the same.[/color]

OK, for a CSV that would be the right thing. I got the impression that
you were inserting data into a table.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

Books Online for SQL Server 2005 at

http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx

Books Online for SQL Server 2000 at

http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

**Erland Sommarskog** · Feb 23 '06, 11:05 PM

Re: Tricky SQL question

Andrew (andrewgothard@ Nospamthanks.co m) writes:[color=blue]
> Contacts are silver - leads are gold[/color]

Lead into gold? Alchemy?

--
Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

Books Online for SQL Server 2005 at

http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx

Books Online for SQL Server 2000 at

http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Tricky SQL question

Tricky SQL question

Comment

Comment

Comment

Comment

Comment