Query to pull top n% of records that include specific criteria

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • jonnycakes
    New Member
    • Jan 2012
    • 24

    Query to pull top n% of records that include specific criteria

    I've been trying to do this for a long time, and i'm at the point where my head may explode.. Please take caution when reviewing my code as it may cause more confusion than it is intended to.

    I've got a table named Test that includes multiple fields:

    [Management]|[Group]|[Employee]|[Amount]|[Variance]|[DateTime]

    I'm trying to pull 25% of the records for each of the Managers listed in the "Management " field, but not just any 25%..
    • I'm needing to ensure that I dont exclude any given employee in the "Employee" field. I need a sample of all employees.
    • I want the 25% to be based off of the TOP of "Variance" in DESC order.
    • I also need to include all records that have "0" in the "Amount" field.


    This set of code was my sad attempt to pull 25% of each of the employee's records not including any records that contain "0" in the "Amount" field.

    Code:
    SELECT Managment, Employee, Max(DateTime) AS MAXDateTime, Amount, Variance
    FROM Test
    GROUP BY Employee, Variance, Management
    HAVING Amount IN
    (SELECT TOP 25 PERCENT Variance
    FROM Test AS DUPE
    WHERE Test.Employee=DUPE.Employee
    AND Amount<>0
    ORDER BY DUPE.Variance DESC, DUPE.DateTime DESC)
    ORDER BY Employee, Variance DESC
    Then my thought was that I would use a UNION ALL to tie in all of the records that have "0" in the "Amount" field. I didnt have any success with this thought and even if i did, it wouldnt have given me the target 25%.

    ::Edit:: I have Access 2000.

    Any suggestions? Thank you in advance for any help you may be able to offer!
    Last edited by jonnycakes; Jul 2 '12, 10:08 PM. Reason: Version of Access
  • Rabbit
    Recognized Expert MVP
    • Jan 2007
    • 12517

    #2
    As far as your bullet points are concerned:

    1) If you're not explicitly excluding any employees, then they'll be included. So I don't know what you mean by that bullet point.

    2) No notes on that. What you have is fine.

    3) You say you want to include 0's and yet you exclude them in your code. I don't know what's going on there.

    As for your code:

    1) You have an amount field in your aggregate query and yet you don't group by it or put it in an aggregate function.

    2) You're using the HAVING clause when you're not using it on an aggregation. That should be moved to the WHERE clause.

    3) In your filter, why are you filtering on Amount when the subquery is returning variance? They're not the same field. They're not going to match each other.

    Comment

    • jonnycakes
      New Member
      • Jan 2012
      • 24

      #3
      Originally posted by Rabbit
      As far as your bullet points are concerned:

      1) If you're not explicitly excluding any employees, then they'll be included. So I don't know what you mean by that bullet point.

      2) No notes on that. What you have is fine.

      3) You say you want to include 0's and yet you exclude them in your code. I don't know what's going on there.

      As for your code:

      1) You have an amount field in your aggregate query and yet you don't group by it or put it in an aggregate function.

      2) You're using the HAVING clause when you're not using it on an aggregation. That should be moved to the WHERE clause.

      3) In your filter, why are you filtering on Amount when the subquery is returning variance? They're not the same field. They're not going to match each other.
      As you can see i've struggled with this immensely.. I am terribly sorry for taking up your time with this haggered code. I hope that after I review this and respond back, that you have it in you to provide further assistance.

      Thank you!

      btw, you are an absolute beast(same goes for NeoPa) and I can only hope to gain half of the knoweldge that you have.

      Comment

      • NeoPa
        Recognized Expert Moderator MVP
        • Oct 2006
        • 32633

        #4
        Originally posted by jonnycakes
        jonnycakes:
        btw, you are an absolute beast
        I'm reading that as a compliment - however it was intended :-D

        I'm sure that, between us, we can help you forward here Jonny. Have a look at the points Rabbit has brought up first though, then you may like to peruse Aggregate Query Woes for some further insight.

        Comment

        • jonnycakes
          New Member
          • Jan 2012
          • 24

          #5
          Originally posted by Rabbit
          As far as your bullet points are concerned:

          1) If you're not explicitly excluding any employees, then they'll be included. So I don't know what you mean by that bullet point.

          2) No notes on that. What you have is fine.

          3) You say you want to include 0's and yet you exclude them in your code. I don't know what's going on there.

          As for your code:

          1) You have an amount field in your aggregate query and yet you don't group by it or put it in an aggregate function.

          2) You're using the HAVING clause when you're not using it on an aggregation. That should be moved to the WHERE clause.

          3) In your filter, why are you filtering on Amount when the subquery is returning variance? They're not the same field. They're not going to match each other.
          Bullet 1: Your comment makes sense.

          Bullet 3: Yep, I failed to explain my thoughts completely.. it didn't make sense for me to filter out the zeros. I originally thought it would be better to filter out the zeros and build a query to capture only records that contained zeros in that column, and then use a union all to tie them togther.

          If i'm not mistaken, I believe my code has been corrected to comply with points 1 & 3, but when I change the HAVING clause to a WHERE, I revieve a "missing syntax error. Here's my code:

          Code:
          SELECT Max(DateTime) AS MAXDATETIME, Employee, Management, Account, Variance, Amount
          FROM ADI
          GROUP BY Employee, Variance, Management, Amount, Account
          HAVING Variance IN
          (SELECT TOP 60 PERCENT Variance
          FROM ADI AS DUPE
          WHERE ADI.Employee=DUPE.Employee
          ORDER BY DUPE.Variance DESC, DUPE.DATETIME DESC, Amount ASC)
          ORDER BY Employee, Variance DESC;
          With this code, I'm getting more than 60% of the records per Employee(IE Employee one may have 4 unique records, this query will return 3)


          Originally posted by NeoPa
          I'm reading that as a compliment - however it was intended :-D

          I'm sure that, between us, we can help you forward here Jonny. Have a look at the points Rabbit has brought up first though, then you may like to peruse Aggregate Query Woes for some further insight.
          :) It was intended to be a compliment. I'm very greatful for the topic you shared with me! It made a lot of sense to me the first time through, but i'm going to continue to study it to ensure I fully understand. Thanks again.

          Comment

          • Rabbit
            Recognized Expert MVP
            • Jan 2007
            • 12517

            #6
            I thought you wanted 25%? And if you have 4 records, then yes, you will get 3 records. Two records is only 50%. If it can't get the exact percent, it will round up.

            If you absolutely need it to not round up, you will need to use a ranking query and calculate your percentages there.

            Comment

            • jonnycakes
              New Member
              • Jan 2012
              • 24

              #7
              Well the % may change, but it's very helpful for me to know that the top function will round up. Given that this function acts with such behavior, my assumption is that the result of this query will yield a much larger % of management's records.

              My goal was to have each manager to review n%(25, 60, etc) of records. It's important that each manager reviews at least 1 from every employee. Equally as important these records should include all of the records that contain 0 in the amount field and the highest variance records.

              Any suggestions on how I can ensure to include all records that have 0 in the amount field? My query currently only focuses on top variance. I'll be sure to research ranking; thank you for your guidance.

              Comment

              • Rabbit
                Recognized Expert MVP
                • Jan 2007
                • 12517

                #8
                It seems I've misunderstood one of your requirements. If you need to include all 0 records, then you will use the method you laid out before. Union them to the variance query.

                Comment

                Working...