How do I get a partial match?

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • vorlonfear
    New Member
    • Aug 2009
    • 4

    How do I get a partial match?

    I have been working on this for a while now and I wanted to see if someone could assist me. I have 2 tables each with 5 fields. 4 of the fields are 2 character strings, then the final field is the full value.
    [EX 1:
    Table1......... .......Table2
    Seg1: 10............. Seg1: 10
    Seg2: 11............. Seg2: 11
    Seg3: 15............. Seg3: 15
    Seg4: 01............. Seg4: Null
    Final: 10111501...Fina l: 101115]
    [Ex 2:
    Table1......... .......Table2
    Seg1: 10............. Seg1: 10
    Seg2: 11............. Seg2: 11
    Seg3: 15............. Seg3: 15
    Seg4: 01............. Seg4: 01
    Final: 10111501...Fina l: 10111501]


    I know how to match the finals where they are equal but I need to match where there is a partial match with the greatest number of Segments. So in the above I would need to exclude the exact match EX 2 and match 10111501 to 101115 like in EX 1, however I have not been able to get this to work.

    Any assistance would be greatly appreciated.
  • ChipR
    Recognized Expert Top Contributor
    • Jul 2008
    • 1289

    #2
    This sounds very much like an assignment, but are you matching one record from Table1 with a specific record in Table2, or all/any of them? Are you trying to do this in a query?

    Comment

    • ajalwaysus
      Recognized Expert Contributor
      • Jul 2009
      • 266

      #3
      I'm not sure I know exactly what you are asking, you may need to clarify more. But if I were to guess, you could compare the fields, using a IIF function and a LEN function.

      example:
      Code:
      SELECT Table1.Seg1, Table1.Seg2, Table1.Seg3, Table1.Seg4, Table1.Seg5
      FROM Table1, Table2
      WHERE (((IIf(Len([Table1].[Seg5])>Len([Table2].[Seg5]),[Table2].[Seg5]=Left([Table1].[Seg5],Len([Table2].[Seg5])),[Table1].[Seg5]=Left([Table2].[Seg5],Len([Table1].[Seg5]))))<>False));
      The code above, compares the length of both fields, then based on which field is longer, it will cut the longer field down to the same length of the smaller field, and then do an equal join on the 2 fields.

      Let me know if this makes sense,
      -AJ

      Comment

      • vorlonfear
        New Member
        • Aug 2009
        • 4

        #4
        Re:

        AJ,

        Thanks for the information. I am not sure this will work for what I need though. Basically I have 2 tables 1 may have 10111501 and the other may have 101115 without the last segment. I am trying to figure out a way to do this that is efficient and quick. I was able to do it with Sub queries but they take forever to run.

        Let e know if this makes sense to you.

        -James

        Comment

        • ADezii
          Recognized Expert Expert
          • Apr 2006
          • 8834

          #5
          Originally posted by vorlonfear
          AJ,

          Thanks for the information. I am not sure this will work for what I need though. Basically I have 2 tables 1 may have 10111501 and the other may have 101115 without the last segment. I am trying to figure out a way to do this that is efficient and quick. I was able to do it with Sub queries but they take forever to run.

          Let e know if this makes sense to you.

          -James
          You may wish to try the Instr() Function which will specifying the position of the first occurrence of one string within another.
          Code:
          Debug.print InStr(10111501,101115)
          will equal 1 indicating that 101115 is contained withing 10111501 starting at Position 1.
          Code:
          Debug.print InStr(10111501,102115
          will evaluate to 0 since 102115 is not contained in 10111501.

          Comment

          • vorlonfear
            New Member
            • Aug 2009
            • 4

            #6
            Re:

            ADezii,

            I thought about using Instr() but There may be multiple hits for a single match. Let me give another example.

            Table 1: Has 10111501
            Table 2: Has 10, 1011, 101115

            I would need to match 10111501 to 101115 but not 10 or 1011.

            Thanks for the help.

            Comment

            • ajalwaysus
              Recognized Expert Contributor
              • Jul 2009
              • 266

              #7
              How about trying the InStr() under an if that checks the length of the string first

              Example:
              Code:
              If len(101115) > 4 then
                 Debug.print InStr(10111501,101115)
              End If
              -AJ

              Comment

              • vorlonfear
                New Member
                • Aug 2009
                • 4

                #8
                Re:

                The only problem is that there are thousands of codes that need to be matched up.

                Comment

                • ajalwaysus
                  Recognized Expert Contributor
                  • Jul 2009
                  • 266

                  #9
                  One way or another you will need a function to handle each value as it is passed through the function, have you tested this function and see how long it takes? Because you would be surprised how fast some functions work, it all depends on the intensity of the function.
                  I once wrote a strip string function that i swore would take forever on 500,000 records but once I ran it, it was faster than I had expected it to be.
                  I welcome anyone with a better idea, but I'm not sure if there is a much better one.

                  -AJ

                  Comment

                  • NeoPa
                    Recognized Expert Moderator MVP
                    • Oct 2006
                    • 32636

                    #10
                    If, as seems clear, you will need to determine which of the pairs (of records from the two tables) has the best match, then you will need to produce a cartesian product of matches before grouping these and finding the greatest number of sub-field matches.

                    This will certainly take an enormous amount of processing. The larger the individual tables grow, the longer this will take (not even gradually, but exponentially).

                    I'm sorry - I see no good news here.

                    Unless you can change your specification somehow, so that there are some givens to work with and set the indices to, I see no viable solution.

                    Comment

                    • ADezii
                      Recognized Expert Expert
                      • Apr 2006
                      • 8834

                      #11
                      Originally posted by vorlonfear
                      The only problem is that there are thousands of codes that need to be matched up.
                      Before I attempt an Algorithm to solve this problem efficiently, I want to be absolutely sure that I am interpreting it correctly. For each Record in each Table, you have to concatenate each of the 5 Fields and compare the Final Results to each other to see if there is a Match. Is this correct?

                      Comment

                      • ChipR
                        Recognized Expert Top Contributor
                        • Jul 2008
                        • 1289

                        #12
                        If I'm understanding the problem, I would take the cross product of the 2 tables, add a calculated field for
                        Code:
                        Matches: iif(table1.seg1 = table2.seg1, 1, 0) + 
                                 iif(table1.seg2 = table2.seg2, 1, 0) + 
                                 iif(table1.seg3 = table2.seg3, 1, 0) + 
                                 iif(table1.seg4 = table2.seg4, 1, 0)
                        Then take the record with Max(Matches). Hopefully you can just take one record from the first table and match it against every record in the second, rather than the expensive full cross product.
                        Last edited by ChipR; Aug 19 '09, 06:07 PM. Reason: Typo, had table2.seg2 instead of table2.seg1.

                        Comment

                        • NeoPa
                          Recognized Expert Moderator MVP
                          • Oct 2006
                          • 32636

                          #13
                          I think you have it Chip (although Seg1 = Seg1 rather than Seg1 = Seg2 - Just a typo I'm sure).

                          Your last paragraph doesn't make too much sense though (As far as I can see). For each record from Table1 there should be a reference to each record in Table2. If only 1 record were processed from Table1, then only one record would have a match found. May make sense when processing in code, but I don't see it for a query.

                          To restrict the links from processing a full cartesian product though, each of the sub-fields could be set as an index (one each) and selection criteria set such that only records where at least one of the indices found a match would be processed. This is easy enough to specify with ORed criteria.

                          Comment

                          • ChipR
                            Recognized Expert Top Contributor
                            • Jul 2008
                            • 1289

                            #14
                            I was thinking something like
                            WHERE table1.recordID = Forms![InputForm]!cmbRecordID

                            Comment

                            • NeoPa
                              Recognized Expert Moderator MVP
                              • Oct 2006
                              • 32636

                              #15
                              Originally posted by ChipR
                              I was thinking something like
                              WHERE table1.recordID = Forms![InputForm]!cmbRecordID
                              That makes some sense. Actually, depending on the OP's requirements of course, that could be it.

                              PS. Please see paragraph I added to my earlier post. You're obviously very sharp tonight with your responses ;)

                              Comment

                              Working...