Working with date ranges

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Thomas R. Hummel

    Working with date ranges

    Hello,

    I am importing data that lists rates for particular coverages for a
    particular period of time. Unfortunately, the data source isn't very
    clean. I've come up with some rules that I think will work to clean the
    data, but I'm having trouble putting those rules into efficient SQL.
    The table that I'm dealing with has just under 9M rows and I may need
    to use similar logic on an even larger table, so I'd like something
    that can be made efficient to some degree using indexes if necessary.

    Here is some sample (simplified) code:

    CREATE TABLE Coverage_Rates (
    rate_id INT IDENTITY NOT NULL,
    coverage_id INT NOT NULL,
    start_date SMALLDATETIME NOT NULL,
    end_date SMALLDATETIME NOT NULL,
    rate MONEY NOT NULL )
    GO

    INSERT INTO Coverage_Rates VALUES (1, '2004-01-01', '2004-06-01',
    40.00)
    INSERT INTO Coverage_Rates VALUES (1, '2004-03-01', '2004-08-01',
    20.00)
    INSERT INTO Coverage_Rates VALUES (1, '2004-06-01', '2004-08-01',
    30.00)
    INSERT INTO Coverage_Rates VALUES (2, '2004-01-01', '9999-12-31',
    90.00)
    INSERT INTO Coverage_Rates VALUES (2, '2004-03-01', '2004-08-01',
    20.00)
    INSERT INTO Coverage_Rates VALUES (2, '2004-08-01', '2004-08-01',
    30.00)
    GO


    The rule is basically this... for any given period of time, for a
    particular coverage, always use the coverage with the highest rate. So,
    given the rows above, I would want the results to be:

    coverage_id start_dt end_dt rate
    ----------- ---------- ---------- --------
    1 2004-01-01 2004-06-01 40.00
    1 2004-06-01 2004-08-01 30.00
    2 2004-01-01 9999-12-31 90.00

    There can be any combination of start and end dates in the source, but
    in my final results I would like to be able to have only one distinct
    row for any given time and coverage ID. So, given any date @my_date,

    SELECT coverage_id, COUNT(*)
    FROM <results>
    WHERE @my_date >= start_dt
    AND @my_date < end_dt
    GROUP BY coverage_id
    HAVING COUNT(*) > 1

    the above query should return 0 rows.

    Thanks for any help!

    -Tom.

  • Thomas R. Hummel

    #2
    Re: Working with date ranges

    Please change the start_date and end_date to DATETIME instead of
    SMALLDATETIME. Sorry about that, I made the change after testing my
    script but before doing the cut and paste.

    -Tom.

    Comment

    • --CELKO--

      #3
      Re: Working with date ranges

      If you had avoided proprietary data types, pulled out that silly
      non-relational IDENTITY, followed ISO-11179 naming rules, used a key on
      the table and had some constraints, would the table look like this?

      CREATE TABLE CoverageRates
      (coverage_id CHAR(15) NOT NULL,
      coverage_start_ date DATETIME NOT NULL,
      coverage_end_da te DATETIME NOT NULL,
      CHECK (coverage_start _date < coverage_end_da te)
      coverage_rate DECIMAL(5,3) NOT NULL,
      PRIMARY KEY (coverage_id, coverage_start_ date));
      [color=blue][color=green]
      >> I am importing data that lists rates for particular coverages for a[/color][/color]
      particular period of time. Unfortunately, the data source isn't very
      clean .. The table that I'm dealing with has just under 9M rows and I
      may need to use similar logic on an even larger table ..<<

      Let's start with a Calendar table -- just a simple list of dates. The
      first shot at this is to build a daily rate card for all the coverages,
      thus:

      CREATE VIEW DailyCoverageRa tes (coverage_id, cal_date, coverage_rate)
      AS
      SELECT R.coverage_id, C.cal_date, MAX(coverage_ra te)
      FROM Calendar AS C, CoverageRates AS R
      WHERE C.cal_date BETWEEN R.coverage_star t_date
      AND C.coverage_end_ date
      GROUP BY R.coverage_id, C.cal_date;

      You could just use this and keep your stinky data.

      I would add a MIN(coverage_ra te) and scrub with this query:

      SELECT R.coverage_id, C.cal_date,
      MIN(coverage_ra te) AS low_rate,
      MAX(coverage_ra te) AS high_rate
      FROM Calendar AS C, CoverageRates AS R
      WHERE C.cal_date BETWEEN R.coverage_star t_date
      AND C.coverage_end_ date
      GROUP BY R.coverage_id, C.cal_date
      HAVING MIN(coverage_ra te) < MAX(coverage_ra te);

      This will tell you where you have conflicts. Hey, if you got lucky,
      there might be only a few conflicts that can be fixed by hand. Hey, I
      might win the lottery!

      Here is the query to group the daily rates into runs. Clustered index
      on dates and coverage_id is a BIG help, if you can do it.

      SELECT X.coverage_id, X.coverage_star t_date,
      MAX(X.coverage_ end_date) AS coverage_end_da te,
      X.coverage_rate
      FROM (SELECT S.coverage_id, MIN(S.cal_date) ,
      E.cal_date, E.coverage_rate
      FROM DailyCoverageRa tes AS S, DailyCoverageRa tes AS E
      WHERE S.coverage_id = E.coverage_id
      AND S.coverage_rate = E.coverage_rate
      GROUP BY S.coverage_id, E.cal_date, E.coverage_rate )
      AS X(coverage_id, coverage_start_ date, coverage_end_da te,
      coverage_rate)
      GROUP BY X.coverage_id, X.coverage_star t_date, X.coverage_rate ;

      If you have the SQL-2005 OLAP stuff look at the OVER() operator from
      SQL-99.

      I do not say this often, but you might also consider a cursor to make a
      single pass thru the data and build the ranges that way, since it is a
      one-shot deal (I hope). .

      Comment

      • Thomas R. Hummel

        #4
        Re: Working with date ranges

        --CELKO-- wrote:[color=blue]
        > If you had avoided proprietary data types, pulled out that silly
        > non-relational IDENTITY, followed ISO-11179 naming rules, used a key[/color]
        on[color=blue]
        > the table and had some constraints, would the table look like this?[/color]

        It might, but if I were posting a sample to an MS SQL Server specific
        newsgroup and my only concern was getting across the idea of my problem
        then it would look exactly as I have it.
        [color=blue]
        > Let's start with a Calendar table -- just a simple list of dates.[/color]
        The[color=blue]
        > first shot at this is to build a daily rate card for all the[/color]
        coverages,[color=blue]
        > thus:
        >
        > CREATE VIEW DailyCoverageRa tes (coverage_id, cal_date, coverage_rate)
        > AS
        > SELECT R.coverage_id, C.cal_date, MAX(coverage_ra te)
        > FROM Calendar AS C, CoverageRates AS R
        > WHERE C.cal_date BETWEEN R.coverage_star t_date
        > AND C.coverage_end_ date
        > GROUP BY R.coverage_id, C.cal_date;
        >
        > You could just use this and keep your stinky data.[/color]

        While this is an elegant and useful solution for a lot of problems like
        this, unfortunately it turns my 9M rows into 13.5B rows (even if they
        are virtual rows). Since I am writing a load process I don't actually
        constrain the results, so I would be dealing with the full 13.5B rows,
        which is too unwieldly. I've never seen a query with an estimate cost
        of over 1M before though. :-)
        [color=blue]
        > I would add a MIN(coverage_ra te) and scrub with this query:
        >
        > SELECT R.coverage_id, C.cal_date,
        > MIN(coverage_ra te) AS low_rate,
        > MAX(coverage_ra te) AS high_rate
        > FROM Calendar AS C, CoverageRates AS R
        > WHERE C.cal_date BETWEEN R.coverage_star t_date
        > AND C.coverage_end_ date
        > GROUP BY R.coverage_id, C.cal_date
        > HAVING MIN(coverage_ra te) < MAX(coverage_ra te);
        >
        > This will tell you where you have conflicts. Hey, if you got lucky,
        > there might be only a few conflicts that can be fixed by hand. Hey,[/color]
        I[color=blue]
        > might win the lottery![/color]

        Using another method I've found that there are already far too many
        conflicts for me to fix them manually.
        [color=blue]
        > Here is the query to group the daily rates into runs. Clustered[/color]
        index[color=blue]
        > on dates and coverage_id is a BIG help, if you can do it.
        >
        > SELECT X.coverage_id, X.coverage_star t_date,
        > MAX(X.coverage_ end_date) AS coverage_end_da te,
        > X.coverage_rate
        > FROM (SELECT S.coverage_id, MIN(S.cal_date) ,
        > E.cal_date, E.coverage_rate
        > FROM DailyCoverageRa tes AS S, DailyCoverageRa tes AS E
        > WHERE S.coverage_id = E.coverage_id
        > AND S.coverage_rate = E.coverage_rate
        > GROUP BY S.coverage_id, E.cal_date, E.coverage_rate )
        > AS X(coverage_id, coverage_start_ date, coverage_end_da te,
        > coverage_rate)
        > GROUP BY X.coverage_id, X.coverage_star t_date, X.coverage_rate ;
        >
        > If you have the SQL-2005 OLAP stuff look at the OVER() operator from
        > SQL-99.[/color]

        I don't have this, but I'll look into OVER() anyway, thanks.
        [color=blue]
        > I do not say this often, but you might also consider a cursor to make[/color]
        a[color=blue]
        > single pass thru the data and build the ranges that way, since it is[/color]
        a[color=blue]
        > one-shot deal (I hope). .[/color]

        Unfortunately, this is for a monthly load. This is one of those dreaded
        situations where I have no control over the source data, but it will be
        constantly changing and I need to import it into my database on a
        monthly basis.

        I'm currently looking into a method of using multiple passes over the
        data - an insert followed by a series of updates/deletes. I'll compare
        the performance of that with a cursor and see which works best until I
        can convince the people in charge of my data source to fix their data.

        Thanks,
        -Tom.

        Comment

        • Thomas R. Hummel

          #5
          Re: Working with date ranges

          --CELKO-- wrote:[color=blue]
          > Here is the query to group the daily rates into runs. Clustered[/color]
          index[color=blue]
          > on dates and coverage_id is a BIG help, if you can do it.
          >
          > SELECT X.coverage_id, X.coverage_star t_date,
          > MAX(X.coverage_ end_date) AS coverage_end_da te,
          > X.coverage_rate
          > FROM (SELECT S.coverage_id, MIN(S.cal_date) ,
          > E.cal_date, E.coverage_rate
          > FROM DailyCoverageRa tes AS S, DailyCoverageRa tes AS E
          > WHERE S.coverage_id = E.coverage_id
          > AND S.coverage_rate = E.coverage_rate
          > GROUP BY S.coverage_id, E.cal_date, E.coverage_rate )
          > AS X(coverage_id, coverage_start_ date, coverage_end_da te,
          > coverage_rate)
          > GROUP BY X.coverage_id, X.coverage_star t_date, X.coverage_rate ;[/color]

          After looking at this more closely, I don't see how this gets ranges.
          For example, using a series of numbers instead of sequential dates, if
          I had: 1, 2, 4, 5, 6, 8, 9 then I would expect to have returned 1-2,
          4-6, 8-9. Modifying the query above to fit the simpler model:

          CREATE TABLE Series (my_int INT NOT NULL)
          GO

          INSERT INTO Series VALUES (1)
          INSERT INTO Series VALUES (2)
          INSERT INTO Series VALUES (4)
          INSERT INTO Series VALUES (5)
          INSERT INTO Series VALUES (6)
          INSERT INTO Series VALUES (8)
          INSERT INTO Series VALUES (9)
          GO

          SELECT SQ.start_my_int , MAX(end_my_int) AS end_my_int
          FROM (SELECT MIN(S.my_int) AS start_my_int, E.my_int AS end_my_int
          FROM Series S, Series E
          GROUP BY E.my_int) AS SQ
          GROUP BY SQ.start_my_int
          GO

          This returns only 1-9, which is clearly wrong. Did I miss something in
          simplifying the query?

          Thanks,
          -Tom.

          Comment

          • --CELKO--

            #6
            Re: Working with date ranges

            You have nothing to tell you that you have anything missing. In the
            case of numbers, it the formula that

            (end -start +1) = COUNT of numbers between start and end.

            I assume that every day had at least one rate and the problem was
            finding more than one. Gaps make life harder and you probably want to
            have a rule to fill them in.

            Comment

            • Thomas R. Hummel

              #7
              Re: Working with date ranges

              --CELKO-- wrote:[color=blue]
              > You have nothing to tell you that you have anything missing. In the
              > case of numbers, it the formula that
              >
              > (end -start +1) = COUNT of numbers between start and end.
              >
              > I assume that every day had at least one rate and the problem was
              > finding more than one. Gaps make life harder and you probably want[/color]
              to[color=blue]
              > have a rule to fill them in.[/color]

              Every day may have a rate, but the rate might go up and down, so that
              the periods for one particular rate might not be contiguous. I don't
              have access to a SQL Server at the moment to write and test a test case
              here, but to expand on my example with numbers, let's add rates into
              it:

              CREATE TABLE Series (my_int INT NOT NULL, rate MONEY NOT NULL)
              GO

              INSERT INTO Series VALUES (1, 10.0)
              INSERT INTO Series VALUES (2, 10.0)
              INSERT INTO Series VALUES (3, 20.0)
              INSERT INTO Series VALUES (4, 10.0)
              INSERT INTO Series VALUES (5, 10.0)
              INSERT INTO Series VALUES (6, 10.0)
              INSERT INTO Series VALUES (7, 20.0)
              INSERT INTO Series VALUES (8, 10.0)
              INSERT INTO Series VALUES (9, 10.0)
              GO

              SELECT SQ.rate, SQ.start_num, MAX(SQ.end_num)
              FROM (SELECT MIN(S.my_int) AS start_num, E.my_int AS end_num, E.rate
              FROM Series AS S, Series AS E
              WHERE S.rate = E.rate
              GROUP BY E.my_int, E.rate) AS SQ
              GROUP BY SQ.rate, SQ.start_num
              GO

              Now every number has a row and there are no gaps, but the same problem
              arises since I will now see:

              start_num end_num rate
              --------- ------- -------
              1 9 10.0000
              3 7 20.0000

              instead of:

              start_num end_num rate
              --------- ------- -------
              1 2 10.0000
              3 3 20.0000
              4 6 10.0000
              7 7 20.0000
              8 9 10.0000


              -Tom.

              Comment

              • Erland Sommarskog

                #8
                Re: Working with date ranges

                [posted and mailed]

                Thomas R. Hummel (tom_hummel@hot mail.com) writes:[color=blue]
                > The rule is basically this... for any given period of time, for a
                > particular coverage, always use the coverage with the highest rate. So,
                > given the rows above, I would want the results to be:
                >
                > coverage_id start_dt end_dt rate
                > ----------- ---------- ---------- --------
                > 1 2004-01-01 2004-06-01 40.00
                > 1 2004-06-01 2004-08-01 30.00
                > 2 2004-01-01 9999-12-31 90.00[/color]

                Below are two variants. Since you objected to Celko's use of a calendar
                table, I thought a little about and realised that you don't need a full
                calendar table. You only need one with the actual dates in the table.
                This is why there are two alternatives. The first builds a calendar by
                coverage_id; the second builds a global calendar. The latter is probably
                the better bet, if start and end dates are typically at the start of
                months, and thus shared by many coverage ids. The latter solution
                supplements the input with sentinel rows with a zero rate from 1900-01-01
                to 9999-1231 for each coverage id. (The eager student who want to know
                why, can get a hint by looking at the code for the first alternive.)

                The calendar itself appears only as a derived table. However, there is a
                temp table to hold intermediate results. Just to tease Celko :-) there is
                an IDENTITY column in it, but it serves to make the next DELETE a little
                easier. In SQL 2005, I would have used the ROW_NUMBER() function instead.
                I've also added indexes on the table, with my idea of which will give
                the best performance.

                Note also that the final SELECT where we return the data, comes in two
                different variations. They are not related to the choice of whether
                the calendar is global or by coverage_id.

                How well this scales, I have no idea. Thomas who have the data will
                have to test that part. :-) There are a few more possible variations
                on queries, that I hope Thomas will be able to work out, using the
                queries as a starting point.

                It goes without saying that there may be ugly cases I have overlooked.
                I did check one thing: if there is a gap for a coverage_id, this is
                handled well. Such a case is added to the repro below.

                Finally, looking at Thomas's desired result, I realise that I may
                be one day off with regards to end_date. I believe Thomas will be
                able to handle that as well.



                CREATE TABLE Coverage_Rates (
                rate_id INT IDENTITY NOT NULL,
                coverage_id INT NOT NULL,
                start_date DATETIME NOT NULL,
                end_date DATETIME NOT NULL,
                rate MONEY NOT NULL )
                GO
                INSERT INTO Coverage_Rates VALUES (1, '2004-01-01', '2004-06-01', 40.00)
                INSERT INTO Coverage_Rates VALUES (1, '2004-03-01', '2004-08-01', 20.00)
                INSERT INTO Coverage_Rates VALUES (1, '2004-06-01', '2004-08-01', 30.00)
                INSERT INTO Coverage_Rates VALUES (1, '2004-09-01', '2004-10-01', 25.00)
                INSERT INTO Coverage_Rates VALUES (2, '2004-01-01', '9999-12-31', 90.00)
                INSERT INTO Coverage_Rates VALUES (2, '2004-03-01', '2004-08-01', 20.00)
                INSERT INTO Coverage_Rates VALUES (2, '2004-08-01', '2004-08-01', 30.00)
                GO
                CREATE TABLE #max_rates(iden t int IDENTITY,
                coverage_id int NOT NULL,
                date datetime NOT NULL,
                rate money NOT NULL,
                PRIMARY KEY CLUSTERED (ident),
                UNIQUE NONCLUSTERED (coverage_id, date))
                go
                -- First get max rate on all days where this a change, and the days
                -- before and after the changes. Note that we must include the day
                -- before a period as a day with a zero rate, in case there were no
                -- rates at all this day.
                INSERT #max_rates (coverage_id, date, rate)
                SELECT C.coverage_id, C.date, coalesce(MAX(R. rate), 0)
                FROM Coverage_Rates R
                RIGHT JOIN
                (SELECT coverage_id, date = start_date
                FROM Coverage_Rates
                UNION
                SELECT coverage_id, date = dateadd(DAY, -1, start_date)
                FROM Coverage_Rates
                UNION
                SELECT coverage_id, end_date
                FROM Coverage_Rates
                UNION
                SELECT coverage_id, dateadd(DAY, +1, end_date)
                FROM Coverage_Rates
                WHERE end_date < '99991231') C
                ON C.coverage_id = R.coverage_id
                AND C.date BETWEEN R.start_date AND R.end_date
                GROUP BY C.coverage_id, C.date
                ORDER BY C.coverage_id, C.date
                go
                SELECT * FROM #max_rates ORDER BY ident
                go
                -- Delete adjacent rows with the same rate.
                DELETE #max_rates
                FROM #max_rates a
                WHERE EXISTS (SELECT *
                FROM #max_rates b
                WHERE a.coverage_id = b.coverage_id
                AND a.ident - 1 = b.ident
                AND a.rate = b.rate)
                go
                SELECT * FROM #max_rates ORDER BY ident
                go
                -- And then transform #max_rates back to the Coverage_rates format.
                SELECT a.coverage_id, start_date = a.date,
                end_date = coalesce(datead d(DAY, -1, b.date), '99991231'),
                a.rate
                FROM #max_rates a
                LEFT JOIN #max_rates b
                ON a.coverage_id = b.coverage_id
                AND b.date = (SELECT MIN(c.date)
                FROM #max_rates c
                WHERE a.coverage_id = c.coverage_id
                AND c.date > a.date)
                WHERE a.rate > 0
                ORDER BY a.coverage_id, a.date
                go
                DROP TABLE Coverage_Rates, #max_rates
                ------------------------------------------------------------------------
                go
                CREATE TABLE Coverage_Rates (
                rate_id INT IDENTITY NOT NULL,
                coverage_id INT NOT NULL,
                start_date DATETIME NOT NULL,
                end_date DATETIME NOT NULL,
                rate MONEY NOT NULL )
                GO
                INSERT INTO Coverage_Rates VALUES (1, '2004-01-01', '2004-06-01', 40.00)
                INSERT INTO Coverage_Rates VALUES (1, '2004-03-01', '2004-08-01', 20.00)
                INSERT INTO Coverage_Rates VALUES (1, '2004-06-01', '2004-08-01', 30.00)
                INSERT INTO Coverage_Rates VALUES (1, '2004-09-01', '2004-10-01', 25.00)
                INSERT INTO Coverage_Rates VALUES (2, '2004-01-01', '9999-12-31', 90.00)
                INSERT INTO Coverage_Rates VALUES (2, '2004-03-01', '2004-08-01', 20.00)
                INSERT INTO Coverage_Rates VALUES (2, '2004-08-01', '2004-08-01', 30.00)
                GO
                CREATE TABLE #max_rates(iden t int IDENTITY,
                coverage_id int NOT NULL,
                date datetime NOT NULL,
                rate money NOT NULL,
                PRIMARY KEY CLUSTERED (ident),
                UNIQUE NONCLUSTERED (coverage_id, date))
                go
                INSERT #max_rates (coverage_id, date, rate)
                SELECT R.coverage_id, C.date, MAX(R.rate)
                FROM (SELECT coverage_id, start_date, end_date, rate
                FROM Coverage_Rates
                UNION
                SELECT coverage_id, '19000101', '99991231', 0
                FROM Coverage_Rates) AS R
                JOIN (SELECT date = start_date
                FROM Coverage_Rates
                UNION
                SELECT date = dateadd(DAY, -1, start_date)
                FROM Coverage_Rates
                UNION
                SELECT end_date
                FROM Coverage_Rates
                UNION
                SELECT dateadd(DAY, +1, end_date)
                FROM Coverage_Rates
                WHERE end_date < '99991231') C
                ON C.date BETWEEN R.start_date AND R.end_date
                GROUP BY R.coverage_id, C.date
                ORDER BY R.coverage_id, C.date
                go
                SELECT * FROM #max_rates ORDER BY ident
                go
                DELETE #max_rates
                FROM #max_rates a
                WHERE EXISTS (SELECT *
                FROM #max_rates b
                WHERE a.coverage_id = b.coverage_id
                AND a.ident - 1 = b.ident
                AND a.rate = b.rate)
                go
                SELECT * FROM #max_rates ORDER BY ident
                go
                SELECT a.coverage_id, start_date = a.date,
                end_date = coalesce(datead d(DAY, -1,
                (SELECT MIN(b.date) FROM #max_rates b
                WHERE a.coverage_id = b.coverage_id
                AND b.date > a.date)), '99991231'),
                a.rate
                FROM #max_rates a
                WHERE a.rate > 0
                ORDER BY a.coverage_id, a.date
                go
                DROP TABLE Coverage_Rates, #max_rates



                --
                Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

                Books Online for SQL Server SP3 at
                SQL Server 2025 redefines what's possible for enterprise data. With developer-first features and integration with analytics and AI models, SQL Server 2025 accelerates AI innovation using the data you already have.

                Comment

                • Thomas R. Hummel

                  #9
                  Re: Working with date ranges

                  Thanks for the reply Erland!

                  I had tried something similar at the end of last week, using only the
                  necessary dates for my calendar, although I actually created a calendar
                  table and selected the dates into it rather than using a derived table.
                  Performance with that method was still pretty bad, but then late on
                  Friday I was told that the business rules were slightly different from
                  what I was originally told.

                  Now they have decided that the coverage rate with the later start date
                  always has precedence when there is an overlap. If two rates have the
                  same exact start date then I have to use the higher rate. With these
                  new business rules I was able to quickly come up with an acceptable
                  solution using a view and insert:

                  -- This view will narrow down any coverage rates with identical start
                  dates
                  -- Priority is given to the higher rate, but if those are the same then
                  -- priority is given to the longer-running rate. DISTINCT eliminates
                  exact
                  -- duplicates
                  CREATE VIEW Coverage_Rates_ Deduped
                  AS
                  SELECT DISTINCT
                  T1.coverage_id,
                  T1.start_date,
                  T1.end_date,
                  T1.rate
                  FROM Coverage_Rates T1
                  LEFT OUTER JOIN Coverage_Rates T2 ON T2.coverage_id = T1.coverage_id
                  AND T2.start_date = T1.start_date
                  AND (T2.rate > T1.rate
                  OR (T2.rate = T1.rate AND
                  T2.end_date > T1.end_date))
                  WHERE T2.coverage_id IS NULL
                  GO

                  SELECT T1.coverage_id,
                  T1.start_date,
                  ISNULL((SELECT DATEADD(dy, -1, MIN(T2.start_da te))
                  FROM Coverage_Rates_ Deduped T2
                  WHERE T2.coverage_id = T1.coverage_id
                  AND T2.start_date > T1.start_date
                  AND T2.start_date < T1.end_date), T1.end_date),
                  T1.rate
                  FROM Coverage_Rates_ Deduped T1
                  GO

                  Of course, the results will no longer match what was in my original
                  post because the business rules have been changed.

                  Thanks again though for the time spent on your response Erland, and I
                  will file that trick away for the future.

                  Thanks,
                  -Tom.

                  Comment

                  • Erland Sommarskog

                    #10
                    Re: Working with date ranges

                    Thomas R. Hummel (tom_hummel@hot mail.com) writes:[color=blue]
                    > I had tried something similar at the end of last week, using only the
                    > necessary dates for my calendar, although I actually created a calendar
                    > table and selected the dates into it rather than using a derived table.
                    > Performance with that method was still pretty bad, but then late on
                    > Friday I was told that the business rules were slightly different from
                    > what I was originally told.
                    >
                    > Now they have decided that the coverage rate with the later start date
                    > always has precedence when there is an overlap. If two rates have the
                    > same exact start date then I have to use the higher rate. With these
                    > new business rules I was able to quickly come up with an acceptable
                    > solution using a view and insert:[/color]

                    Ain't life great! First you put a lot of effort into a tricky problem -
                    and then you tell that you've been working from the wrong presumptions.
                    It was fortunate that the new rules made the problem simpler this time.


                    --
                    Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

                    Books Online for SQL Server SP3 at
                    SQL Server 2025 redefines what's possible for enterprise data. With developer-first features and integration with analytics and AI models, SQL Server 2025 accelerates AI innovation using the data you already have.

                    Comment

                    Working...