Using OR vs IIF to limit records

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • webbeacon
    New Member
    • Dec 2015
    • 30

    Using OR vs IIF to limit records

    Hi,
    This script is returning true for more than one OR statement, resulting in too many records being returned. I want it to stop if the first one is a match. I'm thinking that should be an IIF statement (instead of using OR... AND as I've done here) but not sure of how to fit that into a WHERE clause.

    Code:
    SELECT  Employees.[LAST_NAME], Employees.[FIRST_NAME], Employees.[Pay_Method], Employees.[EMAIL_ADR], Employees.[Dept], Employees.[Leader], Members.[Last], Members.[First]
    FROM Employees, Members
    WHERE (InStr(basAlphNum(Employees.[LAST_NAME]),Mid(basAlphNum(Members.Last),1,5)) 
    AND InStr(basAlphNum(Employees.[FIRST_NAME]),Mid(basAlphNum(Members.First),1,5)))
    
    OR (InStr(basAlphNum(Members.[Last]),Mid(basAlphNum(Employees.[LAST_NAME]),1,5))
    AND InStr(basAlphNum(Members.[First]),Mid(basAlphNum(Employees.[FIRST_NAME]),1,5)))
    
    OR (InStr(basAlphNum(Employees.[LAST_NAME]),Mid(basAlphNum(Members.Last),1,5)) 
    AND InStr(basAlphNum(Members.[First]),Mid(basAlphNum(Employees.[FIRST_NAME]),1,5)))
    
    OR (InStr(basAlphNum(Members.[Last]),Mid(basAlphNum(Employees.[LAST_NAME]),1,5))
    AND InStr(basAlphNum(Employees.[FIRST_NAME]),Mid(basAlphNum(Members.First),1,5)))
    
    OR InStr(basAlphNum(Members.[Last]),Mid(basAlphNum(Employees.[LAST_NAME]),1,5))
    
    OR InStr(basAlphNum(Employees.[LAST_NAME]),Mid(basAlphNum(Members.Last),1,5)) 
    
    GROUP BY Employees.[LAST_NAME], Employees.[FIRST_NAME], Employees.[Pay_Method], Employees.[EMAIL_ADR], Employees.[Dept], Employees.[Leader], Members.[Last], Members.[First]
    ORDER BY (Employees.[LAST_NAME]);
    Last edited by Rabbit; Dec 16 '15, 05:20 PM. Reason: Please use [code] and [/code] tags when posting code or formatted data.
  • hvsummer
    New Member
    • Aug 2015
    • 215

    #2
    @webbeacon: please give more information bout your data structure, Table's fields, and what are you planning to do ?

    We can understand the code, but can't guess what you want to do, so you should provide more detail before we can start to help you, thanks.

    Comment

    • webbeacon
      New Member
      • Dec 2015
      • 30

      #3
      Thanks hvsummer - basically I'm trying to do an approximate match of first and last names in two lists. I think the table structure is explained in the SELECT statement so I'm not sure what you're looking for in that regard... sorry I'm new here!

      The basAlphNum function is a public function I found that strips out all spaces and punctuation. I'm using MID to restrict the matching portion to 5 characters. These two things exist to capture differences between the two lists (ie. "Marie Louise" vs. "Marie-Louise" vs. "Marie")

      With the addition of lines 15 and 17, I was hoping to widen my net by including matches for just first or last names, after having gone through the four combinations first... but then I'm getting a lot of duplicate matches.

      My "pie in the sky" is to get this to a point where I don't need to do any manual modifications to either list for it to work, or at least keep them to a minimum.

      Hope that helps
      Thank you!

      Comment

      • webbeacon
        New Member
        • Dec 2015
        • 30

        #4
        Here's an example of the problem. I've hidden part of the name fields for privacy but you can see the highlighted record is a correct match; all the records below that are incorrect matches.

        My hope is that I can modify this in such a way that when it makes a match, it stops looking through the rest of the OR statements, thereby avoiding duplicates. Hope that makes sense.

        Comment

        • Rabbit
          Recognized Expert MVP
          • Jan 2007
          • 12517

          #5
          The fields in a SELECT statement tell us almost nothing about the table structure. They may as well be called Field1, Field2, etc. It doesn't tell us the data type, default values, range of data, primary keys, relationships between tables, and so on. All of this information is important when trying to write a query.

          Also, some of us are behind firewalls that block third party storage sites like imgur. It would be better if you typed in the sample data and results in the thread itself.

          Comment

          • webbeacon
            New Member
            • Dec 2015
            • 30

            #6
            Thanks for the feedback Rabbit. The fields are all Short Text. They're populated with a list of employees. So an example of a record would be (Sorry I'm not sure how to do this in a better way)


            Code:
            Table: Employees
            LAST_NAME     FIRST_NAME     Pay_Method     EMAIL_ADR                 Dept     Leader
            Anderson      Marie-Louse    Salary         mlanderson@company.com    OPS      Joe Leader
            
            Table: Members
            Last         First
            Anderson     Marie Louise
            There is no primary key and I'm not sure what you mean by the relationship between the tables.

            The example I posted at Imgur shows a record that's being returned several times based on the last two OR statements, where it only tries to match the last name and ignores the first name. I understand why that's happening, just don't know an alternative and am hoping someone can point me in the right direction.

            Example:
            Last name "Anderson" is matching once correctly, but then also matching on any other last name containing the string "ander". I want it to try to match last name only if it fails to find last name AND first name.

            I hope that answers your questions.

            Comment

            • Rabbit
              Recognized Expert MVP
              • Jan 2007
              • 12517

              #7
              Have one query that doesn't have the last 2 filters.

              Then create a new query that has the last 2 filters but also outer joins to the first query to filter those that found matches out of the result set.

              Union those 2 queries together to get a single result set.

              But let me float a different approach. What you really want is to try to find the best match possible between the two tables. What you want is fuzzy string matching. There are different algorithms you can look at rather than relying on strict substring matching. For example, there are phonetic algorithms that tell you how a string "sounds" and you can see if the "sound" of the two strings are a match. For that, you can look into the Soundex or Metaphone algorithms. There are also edit distance and substring content matching algorithms that produce a score based on how similar the content of each string is compared to another. For that, you can look into the Levenshtein or NGram algorithms.

              Comment

              • webbeacon
                New Member
                • Dec 2015
                • 30

                #8
                Thank you! You're absolutely right, I'm trying to do fuzzy matching. I tried Soundex, but it matched a lot of records incorrectly. I'll look at the other two... really appreciate the tip!

                Comment

                • Rabbit
                  Recognized Expert MVP
                  • Jan 2007
                  • 12517

                  #9
                  I suggested soundex because of how easy it is to implement. But you're right, it's not very accurate. Also, I suggested 3 additional approaches other than soundex in my post, not 2. The other 3 are much more difficult to implement.

                  Comment

                  • Rabbit
                    Recognized Expert MVP
                    • Jan 2007
                    • 12517

                    #10
                    Hi webbeacon, I don't know where you are in this process now, but if you're interested, I created a new article on this forum yesterday that has an implementation of the double metaphone phonetic algorithm.

                    Comment

                    • webbeacon
                      New Member
                      • Dec 2015
                      • 30

                      #11
                      Thanks Rabbit! I'm thinking for my purposes, the ngram technique is probably best. I haven't had a chance to play with it yet, need to figure out how to apply it in Access VB. I really appreciate your help though and thinking of me after your post!

                      Comment

                      • Rabbit
                        Recognized Expert MVP
                        • Jan 2007
                        • 12517

                        #12
                        If you run into any issues, feel free to create a new thread with what you've attempted so far in terms of implementation and we can help you through it.

                        Comment

                        Working...