fastest way to insert range/sequence

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • ratu

    fastest way to insert range/sequence

    I'd like to use a stored procedure to insert large amounts of records
    into a table. My field A should be filled with a given range of
    numbers. I do the following ... but I'm sure there is a better
    (faster) way:

    select @start = max(A) from tbl where B = 'test1' and C = 'test2'
    while @start <= 500000
    begin
    insert into tbl (A, B, C)
    values (@start, 'test1', test2')
    set @start = @start +1
    end

    another question is, how to prevent that another user inserts the same
    numbers into the field A?

    Thanks a lot for any help!
    ratu
  • Simon Hayes

    #2
    Re: fastest way to insert range/sequence


    "ratu" <postit@hispeed .ch> wrote in message
    news:e6e93102.0 407070830.67d76 3c5@posting.goo gle.com...[color=blue]
    > I'd like to use a stored procedure to insert large amounts of records
    > into a table. My field A should be filled with a given range of
    > numbers. I do the following ... but I'm sure there is a better
    > (faster) way:
    >
    > select @start = max(A) from tbl where B = 'test1' and C = 'test2'
    > while @start <= 500000
    > begin
    > insert into tbl (A, B, C)
    > values (@start, 'test1', test2')
    > set @start = @start +1
    > end
    >
    > another question is, how to prevent that another user inserts the same
    > numbers into the field A?
    >
    > Thanks a lot for any help!
    > ratu[/color]

    One possible solution is an auxiliary table of numbers, or a UDF as
    described in this post:



    You could then do something like this:

    insert into tbl (A, B, C)
    select n, 'test1', 'test2'
    from dbo.fn_nums(@st art, 500000)

    As for preventing duplicate entries, you can use a primary key or unique
    constraint to prevent duplicates, depending on your data model. If
    necessary, you can also use a check constraint to ensure that the table will
    only accept a certain range of numbers.

    Simon


    Comment

    • Ross Presser

      #3
      Re: fastest way to insert range/sequence

      On 7 Jul 2004 09:30:03 -0700, ratu wrote:
      [color=blue]
      > I'd like to use a stored procedure to insert large amounts of records
      > into a table. My field A should be filled with a given range of
      > numbers. I do the following ... but I'm sure there is a better
      > (faster) way:
      >
      > select @start = max(A) from tbl where B = 'test1' and C = 'test2'
      > while @start <= 500000
      > begin
      > insert into tbl (A, B, C)
      > values (@start, 'test1', test2')
      > set @start = @start +1
      > end[/color]

      Here's the DIGITS view-based version:

      CREATE VIEW DIGITS (D) AS
      SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
      UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;

      INSERT INTO tbl (A,B,C)
      SELECT D3.D * 1000 + D2.D * 100 + D1.D as [ValA],
      'test1' AS [ValB], 'test2' AS [ValC])
      FROM DIGITS AS [D1], DIGITS AS [D2], DIGITS AS [D3]
      WHERE D3.D * 1000 + D2.D * 100 + D1.D between @LowVal AND @HiVal
      [color=blue]
      >
      > another question is, how to prevent that another user inserts the same
      > numbers into the field A?[/color]

      Create a unique index on field A. The other user will receive an error.

      Comment

      • John Gilson

        #4
        Re: fastest way to insert range/sequence

        "Ross Presser" <rpresser@imtek .com> wrote in message
        news:jaaqazxjdy er.1b96bwdbr1qo d.dlg@40tude.ne t...[color=blue]
        > On 7 Jul 2004 09:30:03 -0700, ratu wrote:
        >[color=green]
        > > I'd like to use a stored procedure to insert large amounts of records
        > > into a table. My field A should be filled with a given range of
        > > numbers. I do the following ... but I'm sure there is a better
        > > (faster) way:
        > >
        > > select @start = max(A) from tbl where B = 'test1' and C = 'test2'
        > > while @start <= 500000
        > > begin
        > > insert into tbl (A, B, C)
        > > values (@start, 'test1', test2')
        > > set @start = @start +1
        > > end[/color]
        >
        > Here's the DIGITS view-based version:
        >
        > CREATE VIEW DIGITS (D) AS
        > SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
        > UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;[/color]

        If there's no need to eliminate duplicates in the union operation, as here,
        use UNION ALL instead. This particular case is trivial but in other
        cases the performance improvement can be noticeable.
        [color=blue]
        > INSERT INTO tbl (A,B,C)
        > SELECT D3.D * 1000 + D2.D * 100 + D1.D as [ValA],
        > 'test1' AS [ValB], 'test2' AS [ValC])
        > FROM DIGITS AS [D1], DIGITS AS [D2], DIGITS AS [D3]
        > WHERE D3.D * 1000 + D2.D * 100 + D1.D between @LowVal AND @HiVal[/color]

        You're missing the 10s place here.

        Of course, one can fill a table with all integers possibly needed and query
        for ranges. Alternatively, one can define a view to calculate the range by
        applying constraints to each place value in turn, that is, the ones, tens, hundreds,
        etc., as opposed to applying a constraint to each final candidate integer. The
        former being more of a branch and bound approach while the latter is
        generate and test.

        CREATE VIEW Digits (d)
        AS
        SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
        SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
        SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
        SELECT 9

        -- Return a range [@lower, @upper]
        CREATE FUNCTION NonnegativeInte gerRange
        (@lower INT, @upper INT)
        RETURNS TABLE
        AS
        RETURN(
        SELECT Ones.d + 10 * Tens.d + 100 * Hundreds.d + 1000 * Thousands.d +
        10000 * TenThousands.d AS n
        FROM Digits AS TenThousands
        INNER JOIN
        Digits AS Thousands
        ON TenThousands.d BETWEEN @lower/10000 AND @upper/10000 AND
        (TenThousands.d <> @upper/10000 OR
        Thousands.d <= (@upper%10000)/1000) AND
        (TenThousands.d <> @lower/10000 OR
        Thousands.d >= (@lower%10000)/1000)
        INNER JOIN
        Digits AS Hundreds
        ON (TenThousands.d <> @upper/10000 OR
        Thousands.d <> (@upper%10000)/1000 OR
        Hundreds.d <= (@upper%1000)/100) AND
        (TenThousands.d <> @lower/10000 OR
        Thousands.d <> (@lower%10000)/1000 OR
        Hundreds.d >= (@lower%1000)/100)
        INNER JOIN
        Digits AS Tens
        ON (TenThousands.d <> @upper/10000 OR
        Thousands.d <> (@upper%10000)/1000 OR
        Hundreds.d <> (@upper%1000)/100 OR
        Tens.d <= (@upper%100)/10) AND
        (TenThousands.d <> @lower/10000 OR
        Thousands.d <> (@lower%10000)/1000 OR
        Hundreds.d <> (@lower%1000)/100 OR
        Tens.d >= (@lower%100)/10)
        INNER JOIN
        Digits AS Ones
        ON (TenThousands.d <> @upper/10000 OR
        Thousands.d <> (@upper%10000)/1000 OR
        Hundreds.d <> (@upper%1000)/100 OR
        Tens.d <> (@upper%100)/10 OR
        Ones.d <= @upper%10) AND
        (TenThousands.d <> @lower/10000 OR
        Thousands.d <> (@lower%10000)/1000 OR
        Hundreds.d <> (@lower%1000)/100 OR
        Tens.d <> (@lower%100)/10 OR
        Ones.d >= @lower%10)
        )

        SELECT n
        FROM NonnegativeInte gerRange(20000, 20009)
        ORDER BY n

        n
        20000
        20001
        20002
        20003
        20004
        20005
        20006
        20007
        20008
        20009

        --
        JAG


        Comment

        • Ross Presser

          #5
          Re: fastest way to insert range/sequence

          On Fri, 09 Jul 2004 03:18:31 GMT, John Gilson wrote:
          [color=blue]
          > If there's no need to eliminate duplicates in the union operation, as here,
          > use UNION ALL instead. This particular case is trivial but in other
          > cases the performance improvement can be noticeable.[/color]

          Thanks. I noticed you using UNION ALL before, but never understood it,
          because I never looked up the explanation of UNION ALL.
          [color=blue]
          > You're missing the 10s place here.[/color]

          D'oh!
          [color=blue]
          > Of course, one can fill a table with all integers possibly needed and query
          > for ranges. Alternatively, one can define a view to calculate the range by
          > applying constraints to each place value in turn, that is, the ones, tens, hundreds,
          > etc., as opposed to applying a constraint to each final candidate integer. The
          > former being more of a branch and bound approach while the latter is
          > generate and test.[/color]

          Thanks for your clearly expressed improvement of my half-baked ideas. :)

          I still have two very active SQL 6.5 servers, so I tend not to think of UDF
          solutions.

          Comment

          • John Gilson

            #6
            Re: fastest way to insert range/sequence

            "Ross Presser" <rpresser@imtek .com> wrote in message
            news:1x4qrf3kzc 0t0$.1wjptvh6fh q50.dlg@40tude. net...[color=blue]
            > On Fri, 09 Jul 2004 03:18:31 GMT, John Gilson wrote:
            >[color=green]
            > > If there's no need to eliminate duplicates in the union operation, as here,
            > > use UNION ALL instead. This particular case is trivial but in other
            > > cases the performance improvement can be noticeable.[/color]
            >
            > Thanks. I noticed you using UNION ALL before, but never understood it,
            > because I never looked up the explanation of UNION ALL.[/color]

            It's worth knowing when you're taking the union of significant result sets.
            It's like knowing when and when not to use DISTINCT. In this case
            it obviously isn't an issue other than reinforcing good practice.
            [color=blue][color=green]
            > > You're missing the 10s place here.[/color]
            >
            > D'oh![/color]

            It's even easier to do when you go into the millions and beyond.
            [color=blue][color=green]
            > > Of course, one can fill a table with all integers possibly needed and query
            > > for ranges. Alternatively, one can define a view to calculate the range by
            > > applying constraints to each place value in turn, that is, the ones, tens, hundreds,
            > > etc., as opposed to applying a constraint to each final candidate integer. The
            > > former being more of a branch and bound approach while the latter is
            > > generate and test.[/color]
            >
            > Thanks for your clearly expressed improvement of my half-baked ideas. :)[/color]

            Your generate-and-test approach is an obvious and completely reasonable
            solution. The branch-and-bound approach, run on SQL Server 2000, does
            seem, in cursory testing, to be over twice as fast. Though it is more verbose
            and perhaps less immediately clear.
            [color=blue]
            > I still have two very active SQL 6.5 servers, so I tend not to think of UDF
            > solutions.[/color]

            The UDF here is just for named packaging. Could've simply provided the
            SELECT using variables for the range bounds.

            --
            JAG


            Comment

            Working...