Tricky SQL question

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Erich93063

    Tricky SQL question

    Ok so I'm gonna attempt to explain my problem as easy as I can. I have
    a database of contacts (they are actually vendors). I am writing a
    procedure that will query the vendors and generate a tab delimited file
    that will be used an accounting program (MYOB). In the vendors database
    I am exporting from, I have a field for "First Name", a field for "Last
    Name" and a field (looked up by ID) for "Company Name". Now here is the
    problem. In the usual case there will be a company name, a first name
    and a last name for example:

    Company Name: ACME Corporation
    First Name: John
    Last Name: Smith

    HOWEVER, sometimes the vendor is not part of a company and the users of
    the database have just been entering in the vendors name as the name of
    the company. So it would end up looking like this:

    Company Name: John Smith
    First Name: John
    Last Name: Smith

    So the problem is that the accounting program that I am importing these
    contacts into uses a weird schema. Instead of separate fields for first
    name, last name and company, they have a combined field for "Company
    Name/Last Name" and then a separate field for first name. So if the
    first name field is blank then we know this contact is a company and
    not an individual. As soon as we provide a first name, the record
    becomes an individual as opposed to a company.

    SO the MYOB import procedure is expecting two fields. One called
    "[Co./Last Name]" and one called "[First Name]". So when I query my
    current vendor database, I need to figure out if the company name
    contains both the first and last name of the vendor and if so then
    return only the vendors "last name" for the "[Co./Last Name]" field.
    However if the company name is completely different then the first and
    last name of the vendor, then I need to return the company name for
    the "[Co./Last Name]" field and make the "[First Name]" field blank.

    Here is what I thought would work but it doesn't:

    SELECT
    CASE
    WHEN cu.vchrFirstNam e LIKE '%' + co.vchrCompanyN ame
    + '%'
    AND cu.vchrLastName LIKE '%' + co.vchrCompanyN ame
    + '%'
    THEN cu.vchrLastName
    ELSE co.vchrCompanyN ame
    END AS [Co./Last Name],
    CASE
    WHEN cu.vchrFirstNam e LIKE '%' + co.vchrCompanyN ame
    + '%'
    AND cu.vchrLastName LIKE '%' + co.vchrCompanyN ame
    + '%'
    THEN cu.vchrFirstNam e
    ELSE ''
    END AS [First Name],


    I don't get an error when I use this code, but it just always fails and
    ends up using the company name no matter what even though I know for
    sure that the first and last names ARE in the company name.

    Sorry for my long winded explanation.

    THANKS

  • Erland Sommarskog

    #2
    Re: Tricky SQL question

    Erich93063 (erich93063@gma il.com) writes:[color=blue]
    > I have a field for "First Name", a field for "Last Name" and a field
    > (looked up by ID) for "Company Name". Now here is the problem. In the
    > usual case there will be a company name, a first name and a last name
    > for example:
    >
    > Company Name: ACME Corporation
    > First Name: John
    > Last Name: Smith
    >
    > HOWEVER, sometimes the vendor is not part of a company and the users of
    > the database have just been entering in the vendors name as the name of
    > the company. So it would end up looking like this:
    >
    > Company Name: John Smith
    > First Name: John
    > Last Name: Smith
    >
    > So the problem is that the accounting program that I am importing these
    > contacts into uses a weird schema. Instead of separate fields for first
    > name, last name and company, they have a combined field for "Company
    > Name/Last Name" and then a separate field for first name. So if the
    > first name field is blank then we know this contact is a company and
    > not an individual. As soon as we provide a first name, the record
    > becomes an individual as opposed to a company.
    >...
    > SELECT CASE WHEN cu.vchrFirstNam e LIKE '%' + co.vchrCompanyN ame + '%' AND
    > cu.vchrLastName LIKE '%' + co.vchrCompanyN ame + '%'
    > THEN cu.vchrLastName
    > ELSE co.vchrCompanyN ame
    > END AS [Co./Last Name],
    > CASE WHEN cu.vchrFirstNam e LIKE '%' + co.vchrCompanyN ame + '%' AND
    > cu.vchrLastName LIKE '%' + co.vchrCompanyN ame + '%' THEN
    > cu.vchrFirstNam e
    > ELSE ''
    > END AS [First Name],[/color]

    You have it backwards: You are trying to see if "John" is like
    %John Smith% which it isn't. But "John Smith" is like "%John%".

    Really how you should do this, I don't know, because I suspect that
    the quality of the data is not perfect. It it was, you should
    just say:

    CASE WHEN FirstName + ' ' + LastName = CompanyName

    But users may have entered extra spaces between the first and last
    name - or no space at all.

    You could try

    CASE WHEN CompanyName LIKE '%' + FirstName '%' AND
    CompanyName LIKE '%' + LastName '%'

    But there is entry where the users entered X first and last name, and
    the company has an X in the name...

    It would have been a lot simpler, if you had permitted CompanyName
    to be NULL for private vendros.

    By the way, in the output, shouldn't {First Name] be NULL rather than
    '' for companies?

    --
    Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

    Books Online for SQL Server 2005 at

    Books Online for SQL Server 2000 at

    Comment

    • Erich93063

      #3
      Re: Tricky SQL question

      OMG Im stupid. Ok yeah so now it works perfectly! THANKS

      As far as the first name being NULL or not, I am generated a CSV file
      so I just need an empty string. Unless you're saying it is more optimal
      to use NULL instead of an empty string. As far as what would get
      outputed to the CSV file it would be the same.

      THANK YOU SO MUCH!!!!!!!!!!! !

      Comment

      • Andrew

        #4
        Re: Tricky SQL question


        "Erich93063 " <erich93063@gma il.com> wrote in message
        news:1140647157 .967448.102260@ g47g2000cwa.goo glegroups.com.. .[color=blue]
        > Ok so I'm gonna attempt to explain my problem as easy as I can. I have
        > a database of contacts (they are actually vendors). I am writing a
        > procedure that will query the vendors and generate a tab delimited file
        > that will be used an accounting program (MYOB). In the vendors database
        > I am exporting from, I have a field for "First Name", a field for "Last
        > Name" and a field (looked up by ID) for "Company Name". Now here is the
        > problem. In the usual case there will be a company name, a first name
        > and a last name for example:
        >
        > Company Name: ACME Corporation
        > First Name: John
        > Last Name: Smith
        >
        > HOWEVER, sometimes the vendor is not part of a company and the users of
        > the database have just been entering in the vendors name as the name of
        > the company. So it would end up looking like this:
        >
        > Company Name: John Smith
        > First Name: John
        > Last Name: Smith
        >[/color]

        Unless I've misundestood what you're describing - the thing you absolutely
        have to address is the data issues. It's a hoary old truism - but, GIGO.
        In situations like this - I have always found that you will get a
        significant number of entries which are duplicates. Eg - If you have John
        Smith, Dave Jones and Min Patel from the same company - you may well have at
        least 3 entries. If you have John Smith from several - perhaps competing
        companies, or as individuals, you have a serious mess.
        If you are trying to report on this data - you're feeding your decision
        makers crap.
        In the first instance - you have to try and get high level buy in to put
        significant pressure on those entering the data to clean it and keep it
        clean. This may involve a change to the interface at the front end - for
        example, to force the user to enter a postcode/zip code for any new entry,
        then next option is to choose from an entry already in the database - and
        then - if it's not there - to allow them to select an option from a
        commercial address lookup such as the PAF software available in the UK -
        where you have to do a search linked to zip/post code.

        All of this - obviously - has recource issues. You have to make it clear
        that no matter how clever you try to be - you can't address shite data with
        increasingly complex workrounds. A chaotic contacts database loses the
        company money in a load of ways. If you claim that you can guesstimate what
        people "should" have put in - the first time you fail on a juicy contract -
        you're in the crap.

        Your only professional and sensible option is to go to the people who are
        living by this information to feed their families - get buy in and get the
        people putting crap in to put it right, get agreement to implement a
        strategy for ensuring that the data is useful, getting agreement that this
        is stuck to.

        Contacts are silver - leads are gold - you have to make sure they're usable
        and tracable. Getting in touch with the wrong Mr Smith or Mr Singh is not
        acceptable. Therefore, trying to address the issue by any other means than
        a data cleaning exercise involving those putting in crap is not a good
        option. Thwe ymay be putting in rubbish - but I bet they know who 90% + of
        the people are. End users think computers can make heuristic decisions
        based on their phone contacts with the customer - ot that yo can program
        them. If you persist this myth - it'll get worse.

        I'll be honest - an attempt at a workaround is easier to sell - but in this
        case you really have to take the hard road and sell other people who know
        who is who putting it right and knowing they have to keep it right. It's
        not going to make you popular and by the time your company is making money
        on it - you'll be forgotten. You'll be entitled to sit back and feel smug
        though (;->

        Summary;
        There is *no* work around for unusable data. The better the data quality -
        the more money your company makes. Mistakes on your on your contacts
        database can be "very" embarassing and lose money - if you sell a
        workaround, *you* are in the firing line. Get the data right. This helps
        people using the data make money. The more they make - the more chance of
        keeping your job - or getting the pay rise you deserve. This is a
        management issue - *not* a technical one.

        IMHO

        All the best
        Andrew


        Comment

        • Erland Sommarskog

          #5
          Re: Tricky SQL question

          Erich93063 (erich93063@gma il.com) writes:[color=blue]
          > As far as the first name being NULL or not, I am generated a CSV file
          > so I just need an empty string. Unless you're saying it is more optimal
          > to use NULL instead of an empty string. As far as what would get
          > outputed to the CSV file it would be the same.[/color]

          OK, for a CSV that would be the right thing. I got the impression that
          you were inserting data into a table.


          --
          Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

          Books Online for SQL Server 2005 at

          Books Online for SQL Server 2000 at

          Comment

          • Erland Sommarskog

            #6
            Re: Tricky SQL question

            Andrew (andrewgothard@ Nospamthanks.co m) writes:[color=blue]
            > Contacts are silver - leads are gold[/color]

            Lead into gold? Alchemy?


            --
            Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

            Books Online for SQL Server 2005 at

            Books Online for SQL Server 2000 at

            Comment

            Working...