Taking one additional record in percentile query

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • jdfergus
    New Member
    • Jun 2010
    • 14

    Taking one additional record in percentile query

    Access 2003/XP

    Hello, I have a query that shows a running total and percentile of the number of claims our company gets for each type of claim. The field expressions in the design of the query look like this:

    Type: [Type]
    Claims: [Claims]
    RunTotal: DSum("Claims"," ClaimsType","[Claims] >= " & [Claims])
    Percentile: [RunTotal]/80952144
    Included: IIf([Percentile]<0.75,"Yes","No ")

    "RunTotal" is sorted in descending order, "ClaimsType " is the table the query is based off of, 80952144 is the total sum of all claims (entering Sum(Claims) only gives the sum for each "Type").

    My question here is with the "Included" field, which takes the cutoff point at everything within the top 75th percentile. What I would like it to do, is if it could somehow take one additional "Type" over .75 to ensure that at least 75% of "claims" is covered. For example, what it shows now is:

    Type Claims RunTotal Percentile Included
    A 10 10 50% Yes
    B 4 14 70% Yes
    C 3 17 85% No
    D 2 19 95% No
    E 1 20 100% No

    But I would need it to show "Yes" for Type C, since A and B alone would not cover 75% of the total claims. Also, sometimes there is a "tie" at the cutoff point, which might complicate things further. Then it would look like:

    Type Claims RunTotal Percentile Included
    A 10 10 50% Yes
    B 4 14 70% Yes
    C 2 18 90% No
    D 2 18 90% No
    E 1 20 100% No

    In this case I would need both C and D to have "Included" as "Yes". Hopefully this doesn't complicate things too much, and any help would be greatly appreciated.
  • jdfergus
    New Member
    • Jun 2010
    • 14

    #2
    Is this even possible in Access or is it too complex?

    Comment

    • ADezii
      Recognized Expert Expert
      • Apr 2006
      • 8834

      #3
      Subscribing for now, gotta think it over.

      Comment

      • NeoPa
        Recognized Expert Moderator MVP
        • Oct 2006
        • 32661

        #4
        Originally posted by JDFergus
        JDFergus: 80952144 is the total sum of all claims (entering Sum(Claims) only gives the sum for each "Type").
        Let's start with this. You may find another DSum() call would work for you here (Sum() would behave exactly as you describe) :
        Code:
        Percentile: [RunTotal]/DSum('[Claims]', '[ClaimsType]')

        Comment

        • NeoPa
          Recognized Expert Moderator MVP
          • Oct 2006
          • 32661

          #5
          What does the data look like in [ClaimsType]? Is it already grouped by [Type]? Or is that done within your query?

          Comment

          • ADezii
            Recognized Expert Expert
            • Apr 2006
            • 8834

            #6
            Kindly Post the actual SQL for the Query.

            Comment

            • jdfergus
              New Member
              • Jun 2010
              • 14

              #7
              Code:
              SELECT ClaimsType.ClaimsType, ClaimsType.Claims, DSum("Claims","ClaimsType","[Claims] >= " & [Claims]) AS RunTotal, [RunTotal]/DSum('Claims','ClaimsType') AS Percentile, IIf([Percentile]<0.75,"Yes","No") AS Included
              FROM ClaimsType
              GROUP BY ClaimsType.ClaimsType, ClaimsType.Claims
              ORDER BY ClaimsType.Claims DESC;
              Thank you for fixing my DSum problem, NeoPa. I knew that wasn't correct but wasn't exactly sure how to fix it. I've corrected it in the above SQL. Also, "ClaimsType " is just the text for the primary key of claim, such as 'A010'. It's already grouped by that so there's only one "ClaimsType " for each set of claims.

              Comment

              • NeoPa
                Recognized Expert Moderator MVP
                • Oct 2006
                • 32661

                #8
                I'd really like to see some example data from [ClaimsType] if you don't mind :)

                I'm pleased to have helped with the DSum() thing.

                PS. Don't mind me. I'm just going to repost your SQL in such a way as to make it a little easier to work with :
                Code:
                SELECT   [ClaimsType]
                       , [Claims]
                       , DSum('Claims','ClaimsType','[Claims]>=' & [Claims]) AS RunTotal
                       , [RunTotal]/DSum('Claims','ClaimsType') AS Percentile
                       , IIf([Percentile]<0.75,'Yes','No') AS Included
                FROM     [ClaimsType]
                GROUP BY [ClaimsType]
                       , [Claims]
                ORDER BY [Claims] DESC
                Last edited by NeoPa; Aug 6 '10, 03:46 PM. Reason: Added JD's SQL

                Comment

                • jdfergus
                  New Member
                  • Jun 2010
                  • 14

                  #9
                  ClaimsType Claims RunTotal Percentile Included
                  25000 1582232 1582232 1.9545% Yes
                  4019 1168548 2750780 3.3963% Yes
                  2724 1115988 3866768 4.7766% Yes
                  5856 1099347 4966115 6.1346% Yes
                  78650 1063707 6029822 7.4486% Yes

                  Here are exactly the first 5 records of what the query would produce now. "ClaimsType " is text. If you're worried about the naming, I changed it from what it actually is before posting on here for security/simplicity reasons.

                  Comment

                  • NeoPa
                    Recognized Expert Moderator MVP
                    • Oct 2006
                    • 32661

                    #10
                    All information is likely to be helpful JD, but I asked for some example data from the [ClaimsType] table. I need to get a clearer understanding of what we are working with.

                    Comment

                    • ADezii
                      Recognized Expert Expert
                      • Apr 2006
                      • 8834

                      #11
                      I do believe that I have a solution for at least your first dilemma, namely have the Included Field show 'Yes' for the 1st additional 'Type' over .75 Percentile.
                      1. Declare a Public Variable as follows in a Standard Code Module:
                        Code:
                        Public intNumOfTypeOvers As Integer
                      2. Copy-N-Paste the following Public Function into a Standard Code Module:
                        Code:
                        Public Function fCalcIncluded(sngPercentile As Single) As String
                        'intNumOfTypeOvers must retain Value between successive Calls
                        If sngPercentile < 0.75 Then
                          fCalcIncluded = "Yes"
                        Else
                          intNumOfTypeOvers = intNumOfTypeOvers + 1
                            If intNumOfTypeOvers = 1 Then   'and only 1
                              fCalcIncluded = "Yes"
                            Else
                              fCalcIncluded = "No"
                            End If
                        End If
                        End Function
                      3. Change the manner in which the Calculated Field [Included] obtains its Value, namely:
                        Code:
                        'Outta here! ==> Included: IIf([Percentile]<0.75,"Yes","No")
                        Code:
                        'New Statement for the [Included] Field ==> Included: fCalcIncluded([Percentile])
                      4. Here is what is happening. For each Grouping Level, the Value of the [Percentile] Field is passed to the fCalcIncluded() Function. The Percentile is analyzed, and if the Value is <.75 the Function returns 'Yes'. If Percentile is >=.75 and it is the 1st occurrence where Percentile >=.75, then the Functions returns 'Yes', otherwise 'No'.
                      5. You 'must' Reset the Value of the Public Variable (intNumOfTypeOv ers) to 0 before you execute the Query, as in:
                        Code:
                        intNumOfTypeOvers = 0
                        DoCmd.OpenQuery "qryClaims", acViewNormal
                      6. Amazingly, this all seems to work quite well. (LOL).
                      7. I won't even attempt to tackle the 2nd dilemma (ties at the Cutoff Point) until I am assured that I am on the right track. Only you can tell me that.

                      Comment

                      • jdfergus
                        New Member
                        • Jun 2010
                        • 14

                        #12
                        Thanks for the help so far, guys.

                        Sorry NeoPa, I thought you meant you were asking for example data from the field "ClaimsType ", not the table. But the first two fields in my example query (ClaimsType and Claims) are the only two fields in the table. Unfortunately there's no other information relating to this table yet.

                        And I tried your code, ADezii, and I think it worked the first time, but since it works the same as mine. I'm pretty sure it has to do with step 5 though, where exactly does that code supposed to go? I get errors if I try to include it in the other module or if I give it its own. Forgive me for my complete grasp (or lack thereof) of VBA and code modules.

                        Comment

                        • ADezii
                          Recognized Expert Expert
                          • Apr 2006
                          • 8834

                          #13
                          The two critical points to mention are:
                          1. The Function must be 'Public' and reside in a Standard Code Module.
                          2. The Value of the Public Variable 'must' be Reset prior to Opening the Query after its initial execution.
                          3. The following code can reside in various places, but the most obvious location would be the Click() Event of a Command Button:
                            Code:
                            intNumOfTypeOvers = 0       'RESET 
                            DoCmd.OpenQuery "qryClaims", acViewNormal
                          4. The fact that it worked only the 'first' time is due to the Public Variable (intNumOfTypeOv ers) probably has not been Reset as indicated above.

                          Comment

                          • jdfergus
                            New Member
                            • Jun 2010
                            • 14

                            #14
                            I figured how why it only worked once. The first time I went straight to the record I knew where the cutoff point was (by entering the row number at the bottom of the query), and it showed "Yes" for it then. The other times I scrolled down to the record and it wasn't. And actually, the first time it was right when I moved there, but as soon as I scrolled up or down the record changed.

                            I think this is because the formula has to recalculate everytime the records scroll down, but not when the exact row is entered. (You'll notice the field is empty for a breif second each time before it has to repopulate as you scroll down.)

                            So unfortunately this would mean that the query would still be changing even after I click a command button, and would not be realistic to use.

                            Comment

                            • NeoPa
                              Recognized Expert Moderator MVP
                              • Oct 2006
                              • 32661

                              #15
                              Originally posted by jdfergus
                              jdfergus: Sorry NeoPa, I thought you meant you were asking for example data from the field "ClaimsType ", not the table. But the first two fields in my example query (ClaimsType and Claims) are the only two fields in the table. Unfortunately there's no other information relating to this table yet.
                              While I don't argue with anything you say, I still need to see the data for myself. I cannot ask you to tell me the important bit of the puzzle you've missed out, because you'll certainly not know what that is (otherwise I'm sure you would have mentioned it already). Therefore I ask for data that will help me to ascertain that without relying on you to describe it or interfere with it in any way.

                              I just want to see what the data is (or a representative subset at least if the whole is too voluminous) that you desire to build your results from.

                              As for ADezii's solution...
                              I was going to comment on this earlier, but I was waiting for some hard data so that I could be sure of my facts first. It's actually a pretty clever way of producing the result you want. If it's possible to do in SQL (still unclear until data available, but I'm guessing not) then it is almost certainly a better option for two reasons :
                              1. Performance. Calling VBA functions which need to be re-interpreted at every use can severely impact on the performance of queries running on large data sets.
                                BTW. From what I could see ADezii got the algorithm spot on.
                              2. Retreadability. This is what you are falling over at the moment. The fundamental logic of the algorithm depends on (assumes) the fact that the data will be processed forwards, in order, once only. Moving backwards and forwards through the data is not allowed for.
                                There are circumstances where this can be relied upon, but it seems you are not running under those circumstances.
                                This is actually quite unfortunate, as this may well be your best option. When I have the data I will certainly apply some thought to the matter, but I'm frankly not hopeful at this stage, as this problem is fundamentally about relativeness of records. A concept that SQL is fundamentally blind to - hence the need for VBA processing of the SQL returned records.
                              Last edited by NeoPa; Aug 6 '10, 11:24 PM.

                              Comment

                              Working...