Find the lowest value in an array of results within a SELECT statement... possible?

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • webbeacon
    New Member
    • Dec 2015
    • 30

    Find the lowest value in an array of results within a SELECT statement... possible?

    I'd like to make a modification to this SELECT statement so that it only returns a single result in an array of matches; namely, the lowest Damerau-Levenshtein value that it finds. Currently I'm getting multiple pairs presented depending on how high I set the threshold in the WHERE clause (currently <6):

    Code:
    SELECT TCSDBOWNER_EMP.ID AS EMP_ID, TCSDBOWNER_EMP.LAST_NAME AS EMP_LNAME, TCSDBOWNER_EMP.FIRST_NAME AS EMP_FNAME, TCSDBOWNER_EMP.EMAIL_ADR AS EMP_EMAIL, TCSDBOWNER_EMP.ACTIVE_FLAG AS ACTIVE, TICO.TICO_lName, TICO.TICO_fName, TICO.TICO_Lic, TICO.TICO_SupLic, TICO.TICO_Notes, DamerauLevenshtein(TCSDBOWNER_EMP.LAST_NAME & TCSDBOWNER_EMP.FIRST_NAME,TICO.[TICO_lName] & TICO.[TICO_fName]) AS DamLev
    FROM TICO, TCSDBOWNER_EMP
    WHERE (((DamerauLevenshtein([TCSDBOWNER_EMP].[LAST_NAME] & [TCSDBOWNER_EMP].[FIRST_NAME],[TICO].[TICO_lName] & [TICO].[TICO_fName]))<6))
    ORDER BY TCSDBOWNER_EMP.LAST_NAME, TCSDBOWNER_EMP.FIRST_NAME;
    will result in:
    Code:
    EMP_ID  EMP_LNAME  EMP_FNAME  EMP_EMAIL  ACTIVE  TICO_lName  TICO_fName  TICO_Lic  TICO_SupLic  TICO_Notes  DamLev
    111111  REYNOLD      MONA        x         T      REYNOLD      SANA      T123456       x           x          2
    111111  REYNOLD      MONA        x         T      REYNOLD      MONA      T654321       x           x          0
    222222  REYNOLD      SANA        x         T      REYNOLD      MONA      T654321       x           x          2
    222222  REYNOLD      SANA        x         T      REYNOLD      SANA      T123456       x           x          0
    That didn't line up very well... the numbers on the far right are the Damereau-Levenshtein results (DamLev). "0" is a perfect match, where "2" means it's 2 characters off. I want "<6" to be an acceptable match if there are no better matches available, but otherwise it should be discarded.

    I'm pretty sure that it would involve using an array for the results and then just picking the lowest result, but I'm not sure how to implement that in the SQL. I see other threads here about arrays in SQL so I believe it's possible, but I'm not sure how to just isolate the matches and not create an array of the entire table of records.

    Thanks for your thoughts...

    Oh and the tables share no relationship. TCSDBOWNER_EMP. ID is a primary key and there's no key in TICO.
    Last edited by NeoPa; Jan 13 '16, 02:33 AM. Reason: Reformatted table.
  • webbeacon
    New Member
    • Dec 2015
    • 30

    #2
    I just tried using this function but it bogged down the computer ridiculously:

    Code:
    Function MinOfList(ParamArray varValues()) As Variant
        Dim i As Integer        'Loop controller.
        Dim varMin As Variant   'Smallest value found so far.
    
        varMin = Null           'Initialize to null
    
        For i = LBound(varValues) To UBound(varValues)
            If IsNumeric(varValues(i)) Or IsDate(varValues(i)) Then
                If varMin <= varValues(i) Then
                    'do nothing
                Else
                    varMin = varValues(i)
                End If
            End If
        Next
    
        MinOfList = varMin
    End Function

    Comment

    • webbeacon
      New Member
      • Dec 2015
      • 30

      #3
      Thinking now MIN() might be the answer but I get a message about EMP_ID not being an aggregate something....?

      applied here:
      Code:
      SELECT TCSDBOWNER_EMP.ID AS EMP_ID, TCSDBOWNER_EMP.LAST_NAME AS EMP_LNAME, TCSDBOWNER_EMP.FIRST_NAME AS EMP_FNAME, TCSDBOWNER_EMP.EMAIL_ADR AS EMP_EMAIL, TCSDBOWNER_EMP.ACTIVE_FLAG AS ACTIVE, TICO.TICO_lName, TICO.TICO_fName, TICO.TICO_Lic, TICO.TICO_SupLic, TICO.TICO_Notes, MIN(DamerauLevenshtein(TCSDBOWNER_EMP.LAST_NAME & TCSDBOWNER_EMP.FIRST_NAME,TICO.[TICO_lName] & TICO.[TICO_fName])) AS DamLev
      FROM TICO, TCSDBOWNER_EMP
      WHERE (((DamerauLevenshtein([TCSDBOWNER_EMP].[LAST_NAME] & [TCSDBOWNER_EMP].[FIRST_NAME],[TICO].[TICO_lName] & [TICO].[TICO_fName]))<6))
      ORDER BY TCSDBOWNER_EMP.LAST_NAME, TCSDBOWNER_EMP.FIRST_NAME;
      Last edited by webbeacon; Jan 12 '16, 09:30 PM. Reason: forgot CODE tags

      Comment

      • Rabbit
        Recognized Expert MVP
        • Jan 2007
        • 12517

        #4
        Something like this should work.

        Code:
        SELECT t1.*
        
        FROM (
             levenshtein query
        ) AS t1
        
        INNER JOIN (
             SELECT IDField, MIN(LevDistance) AS MinLevDistance
             FROM (levenshtein query) AS x
             GROUP BY IDField
        ) AS t2
        
        ON t1.IDField = t2.IDField AND
           t1.LevDistance = t2.MinLevDistance

        Comment

        • NeoPa
          Recognized Expert Moderator MVP
          • Oct 2006
          • 32655

          #5
          Hi Jen.

          I assume it's the lowest value of [DamLev] you're after, but matched up with the data from your other tables?

          BTW.
          To match columns within the [CODE] tags just make sure to use all spaces and no TABs ;-)
          Also, to show a [CODE] block without an extraneous line at the end simply put the [/CODE] tag at the end of the last line and not on a new one.
          Last edited by NeoPa; Jan 13 '16, 02:34 AM.

          Comment

          • mbizup
            New Member
            • Jun 2015
            • 80

            #6
            webbeacon but I get a message about EMP_ID not being an aggregate something....?
            Just an aside... 'Aggregate Functions' (SUM, MIN, MAX, AVG, etc) are calculations that span multiple rows in your query. The GROUP BY clause defines the range of records that these functions span - for example a SUM per employee, etc. The error message you received, in plain English means that for these types of queries, ALL fields selected must either be in Aggregate Functions or in a GROUP BY clause.
            Last edited by zmbd; Jan 13 '16, 10:30 PM. Reason: [z{ :) }]

            Comment

            • webbeacon
              New Member
              • Dec 2015
              • 30

              #7
              NeoPa:
              I assume it's the lowest value of [DamLev] you're after
              Yes that's right

              but matched up with the data from your other tables?
              DamLev has already performed a match, and the number represents the quality of the match.

              And thank you for cleaning up my table :)
              Last edited by webbeacon; Jan 13 '16, 01:26 PM. Reason: added thank you

              Comment

              • webbeacon
                New Member
                • Dec 2015
                • 30

                #8
                mbizup:
                The error message you received, in plain English means that for these types of queries, ALL fields selected must either be in Aggregate Functions or in a GROUP BY clause.
                [strike]So does that mean that if I use a GROUP BY at the end of the statement, without any other changes, it would probably work?[/strike]

                This did not work. LOL
                Last edited by webbeacon; Jan 13 '16, 02:16 PM. Reason: adding update

                Comment

                • webbeacon
                  New Member
                  • Dec 2015
                  • 30

                  #9
                  Rabbit: This works beautifully, thank you very much! Here's the adapted version of your code:

                  Code:
                  SELECT t1.*
                  FROM (
                  
                  SELECT TCSDBOWNER_EMP.ID AS EMP_ID, TCSDBOWNER_EMP.LAST_NAME AS EMP_LNAME, TCSDBOWNER_EMP.FIRST_NAME AS EMP_FNAME, TCSDBOWNER_EMP.EMAIL_ADR AS EMP_EMAIL, TCSDBOWNER_EMP.ACTIVE_FLAG AS ACTIVE, TICO.TICO_lName, TICO.TICO_fName, TICO.TICO_Lic, TICO.TICO_SupLic, TICO.TICO_Notes, DamerauLevenshtein(TCSDBOWNER_EMP.LAST_NAME & TCSDBOWNER_EMP.FIRST_NAME,TICO.[TICO_lName] & TICO.[TICO_fName]) AS DamLev FROM TICO, TCSDBOWNER_EMP WHERE DamerauLevenshtein([TCSDBOWNER_EMP].[LAST_NAME] & [TCSDBOWNER_EMP].[FIRST_NAME],[TICO].[TICO_lName] & [TICO].[TICO_fName])<3 AND  TCSDBOWNER_EMP.ACTIVE_FLAG="T" ORDER BY TCSDBOWNER_EMP.LAST_NAME, TCSDBOWNER_EMP.FIRST_NAME
                  
                  )  AS t1 
                  
                  INNER JOIN (
                  
                  SELECT EMP_ID, MIN(DamLev) AS MinDamLev FROM (
                  
                  SELECT TCSDBOWNER_EMP.ID AS EMP_ID, TCSDBOWNER_EMP.LAST_NAME AS EMP_LNAME, TCSDBOWNER_EMP.FIRST_NAME AS EMP_FNAME, TCSDBOWNER_EMP.EMAIL_ADR AS EMP_EMAIL, TCSDBOWNER_EMP.ACTIVE_FLAG AS ACTIVE, TICO.TICO_lName, TICO.TICO_fName, TICO.TICO_Lic, TICO.TICO_SupLic, TICO.TICO_Notes, DamerauLevenshtein(TCSDBOWNER_EMP.LAST_NAME & TCSDBOWNER_EMP.FIRST_NAME,TICO.[TICO_lName] & TICO.[TICO_fName]) AS DamLev FROM TICO, TCSDBOWNER_EMP WHERE DamerauLevenshtein([TCSDBOWNER_EMP].[LAST_NAME] & [TCSDBOWNER_EMP].[FIRST_NAME],[TICO].[TICO_lName] & [TICO].[TICO_fName])<3 AND  TCSDBOWNER_EMP.ACTIVE_FLAG="T" ORDER BY TCSDBOWNER_EMP.LAST_NAME, TCSDBOWNER_EMP.FIRST_NAME
                  
                  )  AS x 
                  
                  GROUP BY EMP_ID
                  
                  )  AS t2 ON (t1.EMP_ID = t2.EMP_ID) AND (t1.DamLev = t2.MinDamLev);
                  I have one follow-up question for you: I see the 2nd Levenshtein query is being pulled "AS x"... what's being done with x? I don't see it being called anywhere. Is that just a holder so it can be grouped and passed to t2?

                  Comment

                  • Rabbit
                    Recognized Expert MVP
                    • Jan 2007
                    • 12517

                    #10
                    Subqueries need a name. X is the name I gave the subquery.

                    Comment

                    • NeoPa
                      Recognized Expert Moderator MVP
                      • Oct 2006
                      • 32655

                      #11
                      Originally posted by WebBeacon
                      WebBeacon:
                      I see the 2nd Levenshtein query is being pulled "AS x"... what's being done with x? I don't see it being called anywhere. Is that just a holder so it can be grouped and passed to t2?
                      Essentially, yes it is just a holder or marker in this case. It's probably not necessary here. However, many people find it easier to work with subqueries when every one is ALIASed (AS is an alias for ALIAS, and both are optional. IE. FROM (X) AS Y could equally be written FROM (X) Y or FROM (X) ALIAS Y).

                      Comment

                      • NeoPa
                        Recognized Expert Moderator MVP
                        • Oct 2006
                        • 32655

                        #12
                        Originally posted by Rabbit
                        Rabbit:
                        Subqueries need a name. X is the name I gave the subquery.
                        Sorry Rabbit. We cross-posted.

                        Although my testing showed that this isn't absolutely necessary, at least in Access, I would say it may well be a requirement in SQL Server, and maybe even in the SQL standard. It's probably a good idea to follow this practice in all circumstances anyway. I know I do. I meant to include that in my earlier response.

                        Comment

                        • webbeacon
                          New Member
                          • Dec 2015
                          • 30

                          #13
                          You guys are great. Thank you so much :)

                          Comment

                          • NeoPa
                            Recognized Expert Moderator MVP
                            • Oct 2006
                            • 32655

                            #14
                            I've moved a new question, related to these details, into a separate thread (Help Linking Tables in SQL).

                            Comment

                            Working...