Problem calculating moving average

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Stephen Miller

    Problem calculating moving average

    Firstly, sorry for the long post, but I've included a fair bit of
    sample data.

    Im doing a comparision of 10yr Bond prices and CPI adjustments, with
    an 18 week moving average of the CPI. I'm using a nested sub-query to
    calculate the moving average, but I'm having difficulty selecting
    exactly 18 data points (ie When I include the 'HAVING COUNT(C1.Closes )
    = 18' line, I get no results).

    Can anyone help?


    -- Some sample data:
    CREATE TABLE Bond10 (
    Closes [datetime] NOT NULL ,
    Prices [smallmoney] NOT NULL ,
    )
    INSERT INTO Bond10
    SELECT '1994-01-14', 6.57 UNION
    SELECT '1994-01-21', 6.53 UNION
    SELECT '1994-01-28', 6.44 UNION
    SELECT '1994-02-04', 6.51 UNION
    SELECT '1994-02-11', 6.54 UNION
    SELECT '1994-02-18', 6.89 UNION
    SELECT '1994-02-25', 7.18 UNION
    SELECT '1994-03-04', 7.43 UNION
    SELECT '1994-03-11', 7.43 UNION
    SELECT '1994-03-18', 7.44 UNION
    SELECT '1994-03-25', 7.66 UNION
    SELECT '1994-04-01', 7.96 UNION
    SELECT '1994-04-08', 8.07 UNION
    SELECT '1994-04-15', 8.24 UNION
    SELECT '1994-04-22', 8.23 UNION
    SELECT '1994-04-29', 8.45 UNION
    SELECT '1994-05-06', 8.82 UNION
    SELECT '1994-05-13', 8.86 UNION
    SELECT '1994-05-20', 8.44 UNION
    SELECT '1994-05-27', 8.75 UNION
    SELECT '1994-06-03', 8.79 UNION
    SELECT '1994-06-10', 8.77 UNION
    SELECT '1994-06-17', 9.24 UNION
    SELECT '1994-06-24', 9.63 UNION
    SELECT '1994-07-01', 9.66 UNION
    SELECT '1994-07-08', 9.59 UNION
    SELECT '1994-07-15', 9.41 UNION
    SELECT '1994-07-22', 9.56 UNION
    SELECT '1994-07-29', 9.58 UNION
    SELECT '1994-08-05', 9.31


    CREATE TABLE AvgCPI (
    Closes [datetime] NOT NULL ,
    AvgCPI [smallmoney] NOT NULL ,
    )
    INSERT INTO AvgCPI
    SELECT '1994-01-14', 2.04 UNION
    SELECT '1994-01-21', 2.04 UNION
    SELECT '1994-01-28', 2.04 UNION
    SELECT '1994-02-04', 2.04 UNION
    SELECT '1994-02-11', 2.04 UNION
    SELECT '1994-02-18', 2.04 UNION
    SELECT '1994-02-25', 2.04 UNION
    SELECT '1994-03-04', 1.51 UNION
    SELECT '1994-03-11', 1.51 UNION
    SELECT '1994-03-18', 1.51 UNION
    SELECT '1994-03-25', 1.51 UNION
    SELECT '1994-04-01', 1.51 UNION
    SELECT '1994-04-08', 1.51 UNION
    SELECT '1994-04-15', 1.51 UNION
    SELECT '1994-04-22', 1.51 UNION
    SELECT '1994-04-29', 1.51 UNION
    SELECT '1994-05-06', 1.51 UNION
    SELECT '1994-05-13', 1.51 UNION
    SELECT '1994-05-20', 1.51 UNION
    SELECT '1994-05-27', 1.51 UNION
    SELECT '1994-06-03', 1.80 UNION
    SELECT '1994-06-10', 1.80 UNION
    SELECT '1994-06-17', 1.80 UNION
    SELECT '1994-06-24', 1.80 UNION
    SELECT '1994-07-01', 1.80 UNION
    SELECT '1994-07-08', 1.80 UNION
    SELECT '1994-07-15', 1.80 UNION
    SELECT '1994-07-22', 1.80 UNION
    SELECT '1994-07-29', 1.80 UNION
    SELECT '1994-08-05', 1.80

    -- My query so far:
    SELECT A1.Closes, A1.Prices, B1.AvgCPI, SUM(C1.AvgCPI) AS MovSumCPI,
    AVG(C1.AvgCPI) AS MovAvgCPI, COUNT(C1.AvgCPI ) AS Counter
    FROM (
    SELECT Closes, Prices FROM Bond10
    ) A1
    LEFT JOIN (
    SELECT Closes, AvgCPI FROM AvgCPI
    ) B1 ON A1.Closes = B1.Closes
    LEFT JOIN (
    SELECT Closes, AvgCPI FROM AvgCPI
    ) C1 ON C1.Closes >= A1.Closes AND DATEADD(Week,-18,C1.Closes) <
    A1.Closes
    GROUP BY A1.Closes, A1.Prices, B1.AvgCPI, C1.AvgCPI
    -- HAVING COUNT(C1.Closes ) = 18
    ORDER BY A1.Closes

    DROP TABLE Bond10
    DROP TABLE AvgCPI


    Expected Results

    Closes Bon10 AvCPI MovAvg
    ========== ==== ==== ====
    14-Jan-94 6.57 2.04
    14-Jan-94 6.57 2.04
    21-Jan-94 6.53 2.04
    21-Jan-94 6.53 2.04
    28-Jan-94 6.44 2.04
    28-Jan-94 6.44 2.04
    4-Feb-94 6.51 2.04
    4-Feb-94 6.51 2.04
    4-Feb-94 6.51 2.04
    11-Feb-94 6.54 2.04
    11-Feb-94 6.54 2.04
    11-Feb-94 6.54 2.04
    18-Feb-94 6.89 2.04
    18-Feb-94 6.89 2.04
    18-Feb-94 6.89 2.04
    25-Feb-94 7.18 2.04
    25-Feb-94 7.18 2.04
    25-Feb-94 7.18 2.04 2.04
    4-Mar-94 7.43 1.51 2.01
    4-Mar-94 7.43 1.51 1.98
    11-Mar-94 7.43 1.51 1.95
    11-Mar-94 7.43 1.51 1.92
    18-Mar-94 7.44 1.51 1.89
    18-Mar-94 7.44 1.51 1.86
    25-Mar-94 7.66 1.51 1.83
    25-Mar-94 7.66 1.51 1.80
    1-Apr-94 7.96 1.51 1.78
    1-Apr-94 7.96 1.51 1.75
    8-Apr-94 8.07 1.51 1.72
    8-Apr-94 8.07 1.51 1.69
    15-Apr-94 8.24 1.51 1.66
    15-Apr-94 8.24 1.51 1.63
    22-Apr-94 8.23 1.51 1.60
    22-Apr-94 8.23 1.51 1.57
    29-Apr-94 8.45 1.51 1.54
    29-Apr-94 8.45 1.51 1.51
    6-May-94 8.82 1.51 1.51
    6-May-94 8.82 1.51 1.51
    13-May-94 8.86 1.51 1.51
    13-May-94 8.86 1.51 1.51
    20-May-94 8.44 1.51 1.51
    20-May-94 8.44 1.51 1.51
    27-May-94 8.75 1.51 1.51
    27-May-94 8.75 1.51 1.51
    3-Jun-94 8.79 1.8 1.53
    10-Jun-94 8.77 1.8 1.54
    17-Jun-94 9.24 1.8 1.56
    24-Jun-94 9.63 1.8 1.57
    1-Jul-94 9.66 1.8 1.59
    8-Jul-94 9.59 1.8 1.61
    15-Jul-94 9.41 1.8 1.62
    22-Jul-94 9.56 1.8 1.64
    29-Jul-94 9.58 1.8 1.66
    5-Aug-94 9.31 1.8 1.67

    Thanks,

    Stephen
  • David Portas

    #2
    Re: Problem calculating moving average

    I'm not sure I understand your expected result. Why do you want to see
    multiple rows for each week? How are you calculating the MovAvg column? How
    can you have a 18 week MovAvg when you don't have 18 weeks of data until
    1994-05-13?

    Here's how I would have expected to do it:

    SELECT B.Closes, B.Prices,
    MIN(CASE A.closes WHEN B.closes THEN A.AvgCPI END) AS AvgCPI,
    CASE MIN(A.closes) WHEN DATEADD(DAY,-119,B.closes)
    THEN AVG(A.AvgCPI) END AS MovAvgCPI,
    COUNT(A.AvgCPI) AS Counter
    FROM Bond10 AS B
    LEFT JOIN AvgCPI AS A
    ON A.Closes BETWEEN DATEADD(DAY,-119,B.closes) AND B.Closes
    GROUP BY B.Closes, B.Prices
    ORDER BY B.Closes

    Rather different to what you asked for but maybe it helps.

    --
    David Portas
    ------------
    Please reply only to the newsgroup
    --


    Comment

    • Erland Sommarskog

      #3
      Re: Problem calculating moving average

      [posted and mailed, please reply in news]

      Stephen Miller (jsausten@hotma il.com) writes:[color=blue]
      > Im doing a comparision of 10yr Bond prices and CPI adjustments, with
      > an 18 week moving average of the CPI. I'm using a nested sub-query to
      > calculate the moving average, but I'm having difficulty selecting
      > exactly 18 data points (ie When I include the 'HAVING COUNT(C1.Closes )
      >= 18' line, I get no results).[/color]

      I would not say that completly understand the logic, as it is a bit
      complex. But one thing caught my eye in the query:

      GROUP BY A1.Closes, A1.Prices, B1.AvgCPI, C1.AvgCPI

      That C1.AvgCPI seems out of place. When I remove it from the GROUP
      BY, the Counter gets the value 18 for all the early rows. It does
      not really look like your expected results. Then again, the
      expected results looked very similar to the results of the query
      you posted.


      --
      Erland Sommarskog, SQL Server MVP, sommar@algonet. se

      Books Online for SQL Server SP3 at
      SQL Server 2025 redefines what's possible for enterprise data. With developer-first features and integration with analytics and AI models, SQL Server 2025 accelerates AI innovation using the data you already have.

      Comment

      • Mystery Man

        #4
        Re: Problem calculating moving average

        Like the other reviewers, I was unsure why you had multiple rows for a
        given date and I was uncertain as to the what data was to be included
        in the average.

        The code below will return the average CPI for the previous 18 weeks
        (including the current week). This handles cases where there are less
        than 18 weeks of data and will be handle missing weeks and weeks that
        have more than one entry.


        select
        b.closes
        ,b.prices
        ,a.AvgCPI
        ,(select sum(a2.AvgCPI)/count(a2.avgCPI ) from AvgCPI a2 where
        a2.closes <= a.closes AND A2.closes >= dateadd(week, -18,
        a.closes))
        from
        bond10 b
        ,AvgCPI a
        where
        a.closes = b.closes


        I note that this gives different results to the other posting.

        Comment

        • Stephen Miller

          #5
          Re: Problem calculating moving average

          "David Portas" <REMOVE_BEFORE_ REPLYING_dporta s@acm.org> wrote in message news:<IOOdnYzPK 9agGxeiRVn-hA@giganews.com >...[color=blue]
          > I'm not sure I understand your expected result. Why do you want to see
          > multiple rows for each week? How are you calculating the MovAvg column? How
          > can you have a 18 week MovAvg when you don't have 18 weeks of data until
          > 1994-05-13?
          >
          > Here's how I would have expected to do it:
          >
          > SELECT B.Closes, B.Prices,
          > MIN(CASE A.closes WHEN B.closes THEN A.AvgCPI END) AS AvgCPI,
          > CASE MIN(A.closes) WHEN DATEADD(DAY,-119,B.closes)
          > THEN AVG(A.AvgCPI) END AS MovAvgCPI,
          > COUNT(A.AvgCPI) AS Counter
          > FROM Bond10 AS B
          > LEFT JOIN AvgCPI AS A
          > ON A.Closes BETWEEN DATEADD(DAY,-119,B.closes) AND B.Closes
          > GROUP BY B.Closes, B.Prices
          > ORDER BY B.Closes
          >
          > Rather different to what you asked for but maybe it helps.[/color]

          David,

          I must apologise! I'm not sure how it happened (although I do recall
          having a dozen spreadsheets open at the time), but I posted the wrong
          result set.

          Your analysis however was spot on! The results I was trying to archive
          (and your query correctly returns) are:


          Date 10BOND CPI MovAvg
          ========== ====== ====== ======
          1994-01-14 6.5700 2.0400 NULL
          1994-01-21 6.5300 2.0400 NULL
          1994-01-28 6.4400 2.0400 NULL
          1994-02-04 6.5100 2.0400 NULL
          1994-02-11 6.5400 2.0400 NULL
          1994-02-18 6.8900 2.0400 NULL
          1994-02-25 7.1800 2.0400 NULL
          1994-03-04 7.4300 1.5100 NULL
          1994-03-11 7.4300 1.5100 NULL
          1994-03-18 7.4400 1.5100 NULL
          1994-03-25 7.6600 1.5100 NULL
          1994-04-01 7.9600 1.5100 NULL
          1994-04-08 8.0700 1.5100 NULL
          1994-04-15 8.2400 1.5100 NULL
          1994-04-22 8.2300 1.5100 NULL
          1994-04-29 8.4500 1.5100 NULL
          1994-05-06 8.8200 1.5100 NULL
          1994-05-13 8.8600 1.5100 1.7161
          1994-05-20 8.4400 1.5100 1.6866
          1994-05-27 8.7500 1.5100 1.6572
          1994-06-03 8.7900 1.8000 1.6438
          1994-06-10 8.7700 1.8000 1.6305
          1994-06-17 9.2400 1.8000 1.6172
          1994-06-24 9.6300 1.8000 1.6038
          1994-07-01 9.6600 1.8000 1.5905
          1994-07-08 9.5900 1.8000 1.6066
          1994-07-15 9.4100 1.8000 1.6227
          1994-07-22 9.5600 1.8000 1.6388
          1994-07-29 9.5800 1.8000 1.6550
          1994-08-05 9.3100 1.8000 1.6711

          Thanks and again, I apologise for posting the wrong information.

          Regards,

          Stephen

          Comment

          • Christian Maslen

            #6
            Re: Problem calculating moving average

            Stephen,

            This isn't going to be of immediate help because SQL Server currently
            doesn't support the syntax, but I'm posting it anyway to ask if anyone
            knows if it will be available in the Yukon release. I ran this in DB2.
            The syntax is described as relational OLAP or Windowing functionality
            and is specified in the SQL-2003 draft. The following is a link:



            You can achieve a similar result by using a select statement within
            the select clause (as has already been posted), but over large result
            sets this performs very poorly.

            I got slightly different rounding results, but it's basically as you
            posted...

            Displaying result for:
            ---------------------
            select "Date"
            ,"10Bond"
            ,"CPI"
            , CASE
            WHEN "RowNum" < 18 then
            NULL
            else
            "MovAvg"
            end as "MovAvg"
            from (
            select ROW_NUMBER() OVER (ORDER BY b.closes) AS "RowNum"
            ,b.CLOSES as "Date"
            ,b.PRICES as "10Bond"
            ,a.AVGCPI as "CPI"
            ,DECIMAL (ROUND(avg(a.Av gCPI) over (order by b.Closes
            rows between 17 preceding
            and current row
            ), 4), 7, 4) as "MovAvg"
            from bond10 b
            , avgCPI a
            where b.closes = a.closes
            ) as t

            Date 10Bond CPI MovAvg
            ---------- --------- --------- ---------
            1994-01-14 6.5700 2.0400 NULL
            1994-01-21 6.5300 2.0400 NULL
            1994-01-28 6.4400 2.0400 NULL
            1994-02-04 6.5100 2.0400 NULL
            1994-02-11 6.5400 2.0400 NULL
            1994-02-18 6.8900 2.0400 NULL
            1994-02-25 7.1800 2.0400 NULL
            1994-03-04 7.4300 1.5100 NULL
            1994-03-11 7.4300 1.5100 NULL
            1994-03-18 7.4400 1.5100 NULL
            1994-03-25 7.6600 1.5100 NULL
            1994-04-01 7.9600 1.5100 NULL
            1994-04-08 8.0700 1.5100 NULL
            1994-04-15 8.2400 1.5100 NULL
            1994-04-22 8.2300 1.5100 NULL
            1994-04-29 8.4500 1.5100 NULL
            1994-05-06 8.8200 1.5100 NULL
            1994-05-13 8.8600 1.5100 1.7161
            1994-05-20 8.4400 1.5100 1.6867
            1994-05-27 8.7500 1.5100 1.6572
            1994-06-03 8.7900 1.8000 1.6439
            1994-06-10 8.7700 1.8000 1.6306
            1994-06-17 9.2400 1.8000 1.6172
            1994-06-24 9.6300 1.8000 1.6039
            1994-07-01 9.6600 1.8000 1.5906
            1994-07-08 9.5900 1.8000 1.6067
            1994-07-15 9.4100 1.8000 1.6228
            1994-07-22 9.5600 1.8000 1.6389
            1994-07-29 9.5800 1.8000 1.6550
            1994-08-05 9.3100 1.8000 1.6711

            30 Row(s) affected

            Christian.

            Comment

            • Zeppo

              #7
              Re: Problem calculating moving average

              Christian Maslen wrote:
              [color=blue]
              > I ran this in DB2.
              > The syntax is described as relational OLAP or Windowing
              > functionality and is specified in the SQL-2003 draft.
              > .
              > You can achieve a similar result by using a select
              > statement within the select clause (as has already been
              > posted), but over large result sets this performs very poorly[/color]

              Very nice query and it's ashame sql99 functionality isn't
              available in server.
              But you can avoid the performance of hit due to the
              number of comparisons based on the inequality operator
              in either a subquery or join that is used as a general
              solution to running sums in <=sql92.The RAC utility
              computes running sums without resort to multiple comparisons.
              Once runs are computed you can obtain moving runs/averages
              by subtracting prior runs from the current run and thus
              again eliminate the comparisons based on inequality operators.
              This example using RAC duplicates your query.Unions are used
              to more clearly show what is going on.Of course a single
              Select with a case statement could also be used:).

              Exec Rac
              @transform='sum (AvgCPI) as CPI',
              @rows='a.Closes (date) as [Date] & PRICES as [10Bond] ',
              @pvtcol='Report Mode',
              @from='##Bond10 as a inner join ##AvgCPI as b
              on a.closes=b.clos es',
              @grand_totals=' n',
              @racheck='y',
              -- Compute the runs of CPI over the dates.
              @rowruns='(CPI) ',
              -- At this point CPI runs over dates have been computed
              -- in the work table 'rac'.The 'rd' column is a Rac by product
              -- that is a counter of the date rows.A union query produces the
              -- moving averages.
              @select=
              'select a.rd,a.[Date],a.[10Bond],a.CPI,null as MovAvg
              from rac as a
              where rd<18
              union
              select a.rd,a.[Date],a.[10Bond],a.CPI,
              cast( cast(a.runs as decimal(7,4))/18.0 as decimal(7,4))
              from rac as a
              where rd=18
              union
              /* Here a join is used to get a prior run value that is subtracted from
              the current run to obtain the correct moving average for N (18) rows. */
              select a.rd,a.[Date],a.[10Bond],a.CPI,
              cast( (cast(a.runs as decimal(7,4)) - cast(b.runs as decimal(7,4)))/18.0
              as decimal(7,4))
              from rac as a inner join rac as b
              on a.rd-18=b.rd
              order by a.rd'

              RAC v2.2 and QALite @





              *** Sent via Developersdex http://www.developersdex.com ***
              Don't just participate in USENET...get rewarded for it!

              Comment

              Working...