Last 3 records by group

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • moishy101
    New Member
    • Feb 2012
    • 46

    #46
    Rabbit:

    I just noticed your post now, it works great! Thank you. :)

    NeoPa:

    I tried copying your new query but I get an error when I try to open it: Invalid use of '.', '!' or '()' in query expression 'qTH1.'.

    the exact query is (as posted):
    Code:
    SELECT qTH1.*
    FROM qryTransactionHebrew AS qTH1 INNER JOIN qryTransactionHebrew AS qTH2 ON (qTH1.ClientID=qTH2.ClientID) AND (qTH1.MonthOrder>=qTH2.MonthOrder)
    GROUP BY qTH1.*
    HAVING Count(*)<4
    ORDER BY qTH1.ClientID, qTH1.MonthOrder DESC;
    ADezii:
    Special conditions such as these should have been stated earlier, especially since it effects the gentlemen working on the SQL approach. The changes to the Code are relatively easy, not so for the SQL involved.
    True indeed. As it happens though, I did anticipate this when looking at the problem earlier.
    Does that mean that it is accounted for in your solution? If not can be incorporated?

    ADezii:

    Thank you very much for all your efforts in helping me. :)

    I've tested the performance on 100 rows and then multiplied the result by 1500 (total of 150000). Here are my findings:

    based on QueryPerformanc eCounter takes approx.
    433.75252378422 3 seconds
    or
    7.2292087297370 5 minutes

    and based on winmm.dll it takes approx.
    436.5 seconds
    or
    7.275 minutes

    Comment

    • NeoPa
      Recognized Expert Moderator MVP
      • Oct 2006
      • 32633

      #47
      Originally posted by Moishy101
      Moishy101:
      I tried copying your new query but I get an error when I try to open it: Invalid use of '.', '!' or '()' in query expression 'qTH1.'.
      So, did you read the PS, as it seems to me that was a possibility I had anticipated, so I included how to fix it there. I didn't feel it would be necessary to include both versions in the post as the change to the other version was so trivial and straightforward , but if you need it :
      Code:
      SELECT   qTH1.*
      FROM     [qryTransactionHebrew] AS [qTH1]
               INNER JOIN
               [qryTransactionHebrew] AS [qTH2]
        ON     (qTH1.ClientID = qTH2.ClientID)
       AND     (qTH1.MonthOrder >= qTH2.MonthOrder)
      GROUP BY qTH1.ClientID
             , qTH1.TransactionNumber
             , qTH1.MonthOrder
      HAVING   Count(*) < 4
      ORDER BY qTH1.ClientID ASC
             , qTH1.MonthOrder DESC

      Comment

      • Rabbit
        Recognized Expert MVP
        • Jan 2007
        • 12517

        #48
        Could you run the same statistics using the pure SQL? I would be interested in knowing which one is faster.

        Comment

        • moishy101
          New Member
          • Feb 2012
          • 46

          #49
          NeoPa:

          Then I get a Cannot group on fields selected with '*' (qTH1). error.

          Rabbit:

          Using pure SQL based on QueryPerformanc eCounter takes approx.
          396.94153041072 3 seconds
          or
          6.6156921735120 5 minutes

          and based on winmm.dll it takes approx.
          415.5 seconds
          or
          6.925 minutes

          Comment

          • ADezii
            Recognized Expert Expert
            • Apr 2006
            • 8834

            #50
            @Rabbit:
            I would have thought that the SQL approach would have been 'significantly' faster, what is your opinion?

            Comment

            • Rabbit
              Recognized Expert MVP
              • Jan 2007
              • 12517

              #51
              I didn't think that it would be a large difference. The problem is that the 150,000 records are joined to itself. So it's practically comparing 150000 ^ 2. And that's a lot of records to process. That would be the largest contributing factor to the speed of and would far outweigh the overhead of VBA.

              However, this may be because of a lack of proper indexes. Perhaps an index on the client and month would improve the speed of the SQL.

              Now, if we were talking about SQL Server 2005+, it has a row numbering function that would significantly speed up the processing because you don't have to join the table to itself. I'm talking seconds rather than minutes.

              Comment

              • ADezii
                Recognized Expert Expert
                • Apr 2006
                • 8834

                #52
                @Rabbit:
                I also do not think that
                testing the performance on 100 rows and then multiplying the result by 1500 (total of 150000)
                is a reliable Benchmark. The Testing needs to be performed on the entire Data set of 150,000 Records. Comments?

                Comment

                • Rabbit
                  Recognized Expert MVP
                  • Jan 2007
                  • 12517

                  #53
                  Hard to say, because the numbers seem to be in line with what I got. I tested the SQL on 170,000 rows and it took a little under 6 minutes.

                  A full test couldn't hurt, if it's only 6-7 minutes it would be beneficial to just do a full test.

                  Comment

                  • Rabbit
                    Recognized Expert MVP
                    • Jan 2007
                    • 12517

                    #54
                    I whipped this up. It's a simplified replication of the ROW_NUMBER() function in SQL Server. It should be many magnitudes faster.

                    Code:
                    Option Compare Database
                    Option Explicit
                    
                    Dim row As Double
                    Dim strPartition As String
                    
                    Public Function RowNum(partition As String) As Double
                        If partition <> strPartition Then
                            row = 0
                            strPartition = partition
                        End If
                        
                        row = row + 1
                        RowNum = row
                    End Function
                    Code:
                    SELECT partitionField, 
                       orderField,
                       otherField
                    FROM someTable
                    WHERE RowNumber(partitionField) < 4
                    ORDER BY partitionField,
                       orderField
                    PS: It's not as quick as the SQL Server version but it takes half the time as the previous solutions. I ran it on 170000 records and it took a little under 3 minutes.
                    Last edited by Rabbit; Feb 14 '12, 11:59 PM.

                    Comment

                    • NeoPa
                      Recognized Expert Moderator MVP
                      • Oct 2006
                      • 32633

                      #55
                      Originally posted by Moishy101
                      Moishy101:
                      Then I get a Cannot group on fields selected with '*' (qTH1). error.
                      OK. That means Jet SQL needs it spelled out long-hand. Unfortunate, as I was trying to minimise unnecessary verbage :-(

                      Code:
                      SELECT   qTH1.ClientID
                             , qTH1.TransactionNumber
                             , qTH1.MonthOrder
                      FROM     [qryTransactionHebrew] AS [qTH1]
                               INNER JOIN
                               [qryTransactionHebrew] AS [qTH2]
                        ON     (qTH1.ClientID = qTH2.ClientID)
                       AND     (qTH1.MonthOrder >= qTH2.MonthOrder)
                      GROUP BY qTH1.ClientID
                             , qTH1.TransactionNumber
                             , qTH1.MonthOrder
                      HAVING   Count(*) < 4
                      ORDER BY qTH1.ClientID ASC
                             , qTH1.MonthOrder DESC
                      Essentially the same as Rabbit's version but with more meaningful names (Well, you couldn't expect him to read my mind now, could you).

                      Comment

                      • NeoPa
                        Recognized Expert Moderator MVP
                        • Oct 2006
                        • 32633

                        #56
                        On the performance issue, it would be interesting (think critical) to know whether or not [ClientID] is indexed. I would normally expect the SQL only version to be significantly faster, all else being equal. I didn't study the VBA suggested closely enough to determine if it was implementing the same logic as the SQL. Nor can I honestly say what I mean by significantly faster. A little more than was shown to be sure, but not necessarily much more.

                        I believe the killer issues are :
                        1. The number of records involved.
                        2. The fact that the data of each record needs to be run through [qryTransactionH ebrew] before it's sorted.


                        PS. Did the OP ever comment on the problem related to months of different years, raised in post #19?

                        Comment

                        • moishy101
                          New Member
                          • Feb 2012
                          • 46

                          #57
                          Rabbit:
                          I'm not sure what your pseudo code means, can you please elaborate?

                          NeoPa:

                          Thank you very much, it works well.

                          Comment

                          • Rabbit
                            Recognized Expert MVP
                            • Jan 2007
                            • 12517

                            #58
                            It wasn't pseudo code. Well, the SQL query won't run with your table set up but it's syntactically correct. It basically numbers the rows breaking by the grouping field and then uses that to filter the rows. It should run at least twice as fast as the other methods.

                            NeoPa, I don't think that they did.

                            Comment

                            • moishy101
                              New Member
                              • Feb 2012
                              • 46

                              #59
                              NeoPa:

                              Regarding the problem related to months of different years, raised in post #19, there is another problem, the Hebrew calendar has 6 leap years (13 months instead of 12) in a 19 year cycle, so the order will not be accurate.
                              but both of those issues can be solved quite simply albeit not in a smart or professional manner, in tblHebrewMonthO rder I can add the values for the coming years (expected lifetime of the mdb).

                              Comment

                              • moishy101
                                New Member
                                • Feb 2012
                                • 46

                                #60
                                Rabbit:

                                What table setup do I need for your new query to work?

                                Comment

                                Working...