Indexes slowing down BULK INSERT

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • David Sharp

    Indexes slowing down BULK INSERT

    I've been doing some experiments with speeding up copying tables of
    approximately 1 million rows between databases using BCP and BULK INSERT.

    I noticed that the total time for removing the indexes (non-clustered) and
    then recreating them after the BULK INSERT was significantly less than just
    doing the BULK INSERT with the indexes left there, even though I specified
    TABLOCK.

    I would have expected SQL Server not to update the index until the insert
    completed (given the table lock) and so removing the indexes would have no
    effect. Can anyone explain why removing the indexes should speed it up?

    This is on SQL Server 7.

    Cheers
    Dave


  • Erland Sommarskog

    #2
    Re: Indexes slowing down BULK INSERT

    David Sharp (no email address supplied) writes:[color=blue]
    > I've been doing some experiments with speeding up copying tables of
    > approximately 1 million rows between databases using BCP and BULK INSERT.
    >
    > I noticed that the total time for removing the indexes (non-clustered)
    > and then recreating them after the BULK INSERT was significantly less
    > than just doing the BULK INSERT with the indexes left there, even though
    > I specified TABLOCK.
    >
    > I would have expected SQL Server not to update the index until the insert
    > completed (given the table lock) and so removing the indexes would have no
    > effect. Can anyone explain why removing the indexes should speed it up?[/color]

    I have not studied this case very closely. But a few observations: if
    you supplied a batch size with /b, SQL Server had no choice but to
    maintain the indexes while loading, since each batch is committed
    separately.

    When running some bulk-loading recently, I notice that when loading on
    a completely unindexed table, BCP reported the copied rows swiftly, and
    then completely directly, whereas on indexed tables there was a delay
    from when all rows had been loaded until the command had completed,
    which I supposed was spent on rebuilding indexes. I did not use TABLOCK.

    I should add that I was working on SQL 2000, and the behaviour I saw
    may reflect an improvement from SQL7.

    --
    Erland Sommarskog, SQL Server MVP, sommar@algonet. se

    Books Online for SQL Server SP3 at
    SQL Server 2025 redefines what's possible for enterprise data. With developer-first features and integration with analytics and AI models, SQL Server 2025 accelerates AI innovation using the data you already have.

    Comment

    • Greg D. Moore \(Strider\)

      #3
      Re: Indexes slowing down BULK INSERT


      "Erland Sommarskog" <sommar@algonet .se> wrote in message
      news:Xns941B107 9D8D3Yazorman@1 27.0.0.1...[color=blue]
      > David Sharp (no email address supplied) writes:[color=green]
      > > I've been doing some experiments with speeding up copying tables of
      > > approximately 1 million rows between databases using BCP and BULK[/color][/color]
      INSERT.[color=blue][color=green]
      > >
      > > I noticed that the total time for removing the indexes (non-clustered)
      > > and then recreating them after the BULK INSERT was significantly less
      > > than just doing the BULK INSERT with the indexes left there, even though
      > > I specified TABLOCK.
      > >
      > > I would have expected SQL Server not to update the index until the[/color][/color]
      insert[color=blue][color=green]
      > > completed (given the table lock) and so removing the indexes would have[/color][/color]
      no[color=blue][color=green]
      > > effect. Can anyone explain why removing the indexes should speed it up?[/color]
      >
      > I have not studied this case very closely. But a few observations: if
      > you supplied a batch size with /b, SQL Server had no choice but to
      > maintain the indexes while loading, since each batch is committed
      > separately.[/color]

      I have studied this somewhat closely. :-)

      And what Erland says about the /b is a critical part of it. We on a
      quarterly basis have to load a multimillion set of rows. As an experiment
      recently (to reconfirm my thoughts) I did a test load on a backup server.
      It took I believe 3 days to do the load. This was without removing the
      non-clustered indices first.

      For the actual load I I removed the non-clustered indices (but kept the
      clustered index since the data is BCP'd out of another table already in
      order). This took well under 12 hours. (Actually not 100% sure how long it
      took since I started it around midnight and the scripts finished sometime
      before 9:00 AM). This included re-applying the indices to the table.

      I don't know how much of a difference there would be if the data had been
      completely unordered.

      [color=blue]
      >
      > When running some bulk-loading recently, I notice that when loading on
      > a completely unindexed table, BCP reported the copied rows swiftly, and
      > then completely directly, whereas on indexed tables there was a delay
      > from when all rows had been loaded until the command had completed,
      > which I supposed was spent on rebuilding indexes. I did not use TABLOCK.
      >
      > I should add that I was working on SQL 2000, and the behaviour I saw
      > may reflect an improvement from SQL7.
      >
      > --
      > Erland Sommarskog, SQL Server MVP, sommar@algonet. se
      >
      > Books Online for SQL Server SP3 at
      > http://www.microsoft.com/sql/techinf...2000/books.asp[/color]


      Comment

      Working...