Bulk Insert

  • pk

    Bulk Insert

    Sorry for the piece-by-piece nature of this post, I moved it from a
    dormant group to this one and it was 3 separate posts in the other
    group. Anyway...

    I'm trying to bulk insert a text file of 10 columns into a table with
    12. How can I specify which columns to insert to? I think format
    files are what I'm supposed to use, but I can't figure them out. I've
    also tried using a view, as was suggested on one of the many websites
    I've searched, but I clearly did that incorrectly as well.

    --------------------------------
    Update:


    I'm working with the view, and I've got a view that contains the exact
    columns from the table I want. I ran my bulk insert command,


    BULK INSERT Test..IV10401 FROM 'c:\bulkInsertFile2.txt'


    and it returned the error:


    Server: Msg 2601, Level 14, State 3, Line 1
    Cannot insert duplicate key row in object 'IV10401' with unique index
    'AK2IV10401'.
    Note: Bulk Insert through a view may result in base table default
    values being ignored for NULL columns in the data file.
    The statement has been terminated.


    The AK2IV10401 key is comprised of 3 columns and I'm sure that each of
    my rows in the insert file is unique according to those three. What
    should I be checking for?

    -----------------------
    Update 2:


    I can only successfully insert 1 row. It seems to be treating each row
    as an individual primary key when it should be treating them as
    composite keys. I cannot alter the table, since it was created by
    Great Plains Dynamics. Is there some sort of switch that I'm missing
    in my bulk insert statement or can I suppress the errors?

  • pk

    #2
    Re: Bulk Insert

    Ughh, bulk insert is going to be the end of me. I just need to insert
    two separate .txt files into two separate tables, but I can't do it. I
    did finally get one to go through by not demanding that the index
    AK2IV10401 be unique. I don't know what problems that will cause for
    me in the future, but I would at least like to get to the future to see
    SOMETHING happen. As for the second table, there is a Primary Key that
    is blocking all my progress and I don't know how to get around this.
    Here is the error I get.

    Violation of PRIMARY KEY constraint 'PKIV10402'. Cannot insert
    duplicate key in object 'IV10402'.
    The statement has been terminated.

    I REALLY don't think I'm violating anything, so why is it kicking and
    screaming at me?

    -pk


    • Erland Sommarskog

      #3
      Re: Bulk Insert

      pk (philip.kluss@gmail.com) writes:
      > Sorry for the piece-by-piece nature of this post, I moved it from a
      > dormant group to this one and it was 3 separate posts in the other
      > group. Anyway...
      >
      > I'm trying to bulk insert a text file of 10 columns into a table with
      > 12. How can I specify which columns to insert to? I think format
      > files are what I'm supposed to use, but I can't figure them out. I've
      > also tried using a view, as was suggested on one of the many websites
      > I've searched, but I clearly did that incorrectly as well.

      Format files are a bit tedious, but for 10 columns it's not that
      bad. Here is an example:

      8.0
      10
      1 SQLCHAR 0 0 "\t" 1 X ""
      2 SQLCHAR 0 0 "\t" 2 X ""
      ...
      10 SQLCHAR 0 0 "\r\n" 12 X ""

      The first row is the version of the file format. The next row gives the
      number of fields in the bulk file. The next ten rows detail the fields.

      The first column is the field number. The second column is the data type.
      This is always SQLCHAR for an ANSI file, and SQLNCHAR for a Unicode
      file. Other data types apply only to binary data files.

      The third column is the prefix length; this is always 0 for a text file.
      The fourth column is the column length; use it for fixed-length columns,
      or leave it 0. The fifth column is the field terminator. In the example
      I'm assuming tab, save for the last field, which is terminated by
      carriage return + line feed.

      The sixth column is the column number for the table column in SQL Server.
      This does not have to follow the numbers in the file. If the number is 0,
      that field in the text file is not imported.

      The seventh column is the column name, but this column is informational
      only.

      The eighth column specifies the collation. This is useful if you need to
      convert data between character sets when importing.
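
      Once you have a format file, you point BULK INSERT at it. A minimal
      sketch, assuming the file above is saved as c:\iv10401.fmt (the .fmt
      path is made up):

      BULK INSERT Test..IV10401 FROM 'c:\bulkInsertFile2.txt'
      WITH (FORMATFILE = 'c:\iv10401.fmt')

      The sixth column in the format file then handles the mapping into the
      12-column table, so no separate view is needed.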

      > --------------------------------
      > Update:
      >
      > I'm working with the view, and I've got a view that contains the exact
      > columns from the table I want. I ran my bulk insert command,
      >
      > BULK INSERT Test..IV10401 FROM 'c:\bulkInsertFile2.txt'
      >
      > and it returned the error:
      >
      > Server: Msg 2601, Level 14, State 3, Line 1
      > Cannot insert duplicate key row in object 'IV10401' with unique index
      > 'AK2IV10401'.
      > Note: Bulk Insert through a view may result in base table default
      > values being ignored for NULL columns in the data file.
      > The statement has been terminated.
      >
      > The AK2IV10401 key is comprised of 3 columns and I'm sure that each of
      > my rows in the insert file is unique according to those three. What
      > should I be checking for?

      Maybe the keys are already in the table?
      > -----------------------
      > Update 2:
      >
      > I can only successfully insert 1 row. It seems to be treating each row
      > as an individual primary key when it should be treating them as
      > composite keys. I cannot alter the table, since it was created by
      > Great Plains Dynamics.

      Without access to the table definition, the data file and the BCP
      command, it's hard to tell what is going on.

      A common technique is to bulk load into a staging table, and then
      clean up data there, before moving to the target table.
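
      A minimal sketch of that pattern (the staging-table name and the key
      columns are placeholders):

      -- empty, keyless copy of the target table
      SELECT * INTO IV10401_staging FROM Test..IV10401 WHERE 1 = 0

      -- load the file; no unique index to trip over
      BULK INSERT IV10401_staging FROM 'c:\bulkInsertFile2.txt'
      WITH (FORMATFILE = 'c:\iv10401.fmt')

      -- after cleaning up, move over only the rows whose keys are not
      -- already in the target
      INSERT Test..IV10401 (col1, col2, ...)
      SELECT s.col1, s.col2, ...
      FROM IV10401_staging s
      WHERE NOT EXISTS (SELECT * FROM Test..IV10401 t
                        WHERE t.keycol1 = s.keycol1
                          AND t.keycol2 = s.keycol2
                          AND t.keycol3 = s.keycol3)
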
      > Is there some sort of switch that I'm missing
      > in my bulk insert statement or can I suppress the errors?

      Well, with bcp you can use -b and -m to set the batch size and to
      increase the number of errors permitted. See Books Online for further
      details.
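
      With BULK INSERT itself, the corresponding options are BATCHSIZE and
      MAXERRORS; a minimal sketch (the values are only examples):

      BULK INSERT Test..IV10401 FROM 'c:\bulkInsertFile2.txt'
      WITH (BATCHSIZE = 1000, MAXERRORS = 50)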

      --
      Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

      Books Online for SQL Server SP3


      • pk

        #4
        Re: Bulk Insert

        I'll describe the table the best I can. Then I'm headed home for the
        day in hopes that someone can explain my error. I appreciate your
        response, Erland, but I feel I haven't given enough info. So here
        goes.

        I've got an empty table IV10402. I didn't create the table, Great
        Plains Dynamics did, but I need to import to it. There are several
        indexes and primary keys defined on it that I can't, with good
        confidence, alter. It has 11 columns and I have a txt file that
        consists of 10, so I've created a format file which I'm fairly certain
        is correct. It appears as follows.

        ----------------------------------

        8.0
        10
        1   SQLCHAR  0  15  "\t"  1   PRCSHID   SQL_Latin1_General_CP1_CI_AS
        2   SQLCHAR  0  1   "\t"  2   EPITMTYP  SQL_Latin1_General_CP1_CI_AS
        3   SQLCHAR  0  31  "\t"  3   ITEMNMBR  SQL_Latin1_General_CP1_CI_AS
        4   SQLCHAR  0  9   "\t"  4   UOFM      SQL_Latin1_General_CP1_CI_AS
        5   SQLCHAR  0  41  "\t"  5   QTYFROM   ""
        6   SQLCHAR  0  41  "\t"  6   QTYTO     ""
        7   SQLCHAR  0  41  "\t"  7   PSITMVAL  ""
        8   SQLCHAR  0  41  "\t"  8   EQUOMQTY  ""
        9   SQLCHAR  0  41  "\t"  9   QTYBSUOM  ""
        10  SQLCHAR  0  12  "\n"  10  SEQNUMBR  ""

        -------------------------

        So when I run my bulk insert command, which appears as follows,

        BULK INSERT Test..IV10402 FROM 'c:\bulkInsertFile.txt'
        WITH (DATAFILETYPE='char',
        ROWTERMINATOR='\n',
        FORMATFILE='c:\iv10402.fmt')

        I get this error,

        Server: Msg 2627, Level 14, State 1, Line 1
        Violation of PRIMARY KEY constraint 'PKIV10402'. Cannot insert
        duplicate key in object 'IV10402'.
        The statement has been terminated.

        I then go to check what the Primary Key constraint PKIV10402 is and
        here is the info that I can offer up.

        It is not clustered. It is based on 6 columns: PRCSHID, EPITMTYP,
        ITEMNMBR, UOFM, QTYFROM, and QTYTO. The Create UNIQUE checkbox is
        checked and greyed out, so I can't use my previous workaround of
        checking the "Ignore Duplicate Key" box. Index Filegroup is PRIMARY.
        Fill Factor is 90%.

        One last thing is that the Table Identity Column for IV10402 is set to
        DEX_ROW_ID, which happens to be the one column that I'm not inserting.
        Is this a problem?

        Again, this table is empty when I run this insert. I'm almost positive
        that there aren't actually duplicate primary keys. Did Microsoft
        really offer no way to find out which rows it feels are duplicates?
        That seems very shortsighted in my opinion. Thanks for reading. I'll
        see you all tomorrow.

        -pk

        Comment

        • Erland Sommarskog

          #5
          Re: Bulk Insert

          pk (philip.kluss@gmail.com) writes:
          > Ughh, bulk insert is going to be the end of me. I just need to insert
          > two separate .txt files into two separate tables, but I can't do it. I
          > did finally get one to go through by not demanding that the index
          > AK2IV10401 be unique. I don't know what problems that will cause for
          > me in the future, but I would at least like to get to the future to see
          > SOMETHING happen. As for the second table, there is a Primary Key that
          > is blocking all my progress and I don't know how to get around this.
          > Here is the error I get.
          >
          > Violation of PRIMARY KEY constraint 'PKIV10402'. Cannot insert
          > duplicate key in object 'IV10402'.
          > The statement has been terminated.
          >
          > I REALLY don't think I'm violating anything, so why is it kicking and
          > screaming at me?

          Because you are violating something.

          Get the data into a keyless staging table, then do a SELECT WHERE
          EXISTS to find clashes with existing data, and "SELECT keycol,
          COUNT(*) FROM tbl GROUP BY keycol HAVING COUNT(*) > 1" to find
          the dups in the file.
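
          The clash check would have roughly this shape (the staging-table
          and key-column names are placeholders):

          SELECT s.*
          FROM staging s
          WHERE EXISTS (SELECT * FROM tbl t
                        WHERE t.keycol1 = s.keycol1
                          AND t.keycol2 = s.keycol2
                          AND t.keycol3 = s.keycol3)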


          --
          Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

          Books Online for SQL Server SP3


          • Erland Sommarskog

            #6
            Re: Bulk Insert

            pk (philip.kluss@gmail.com) writes:
            > It is not clustered. It is based on 6 columns: PRCSHID, EPITMTYP,
            > ITEMNMBR, UOFM, QTYFROM, and QTYTO. The Create UNIQUE checkbox is
            > checked and greyed out, so I can't use my previous workaround of
            > checking the "Ignore Duplicate Key" box. Index Filegroup is PRIMARY.
            > Fill Factor is 90%.
            >
            > One last thing is that the Table Identity Column for IV10402 is set to
            > DEX_ROW_ID, which happens to be the one column that I'm not inserting.
            > Is this a problem?
            >
            > Again, this table is empty when I run this insert. I'm almost positive
            > that there aren't actually duplicate primary keys. Did Microsoft
            > really offer no way to find out which rows it feels are duplicates?

            Either there are duplicates in the file, or the format file is incorrect
            somehow, so that data ends up in the wrong columns.

            Create a copy of the table, but put no indexes or constraints on
            the table. Bulk load the data into that table. Check for
            duplicates with

            SELECT col1, col2, ..., COUNT(*)
            FROM tbl
            GROUP BY col1, col2, ...
            HAVING COUNT(*) > 1

            Also, do something like "SELECT TOP 100 * FROM tbl" to see whether the
            data makes any sense.

            You can use the Object Browser in Query Analyzer to create a
            script for the table. Find the table; the scripting options are
            on the context menu. Create the table in tempdb.
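
            Once the data is in that index-free copy (say
            tempdb..IV10402_staging; the name is made up), the duplicate
            check with your six key columns becomes:

            SELECT PRCSHID, EPITMTYP, ITEMNMBR, UOFM, QTYFROM, QTYTO,
                   COUNT(*)
            FROM tempdb..IV10402_staging
            GROUP BY PRCSHID, EPITMTYP, ITEMNMBR, UOFM, QTYFROM, QTYTO
            HAVING COUNT(*) > 1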

            As for finding directly which rows are problematic, BULK INSERT
            does not seem to offer that option. BCP does, but I think the
            error file covers only format errors, not insertion errors.

            --
            Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

            Books Online for SQL Server SP3


            • pk

              #7
              Re: Bulk Insert

              Erland,

              You are a lifesaver. It took me a while to figure out what that SQL
              statement you were telling me to use was supposed to do, but as soon as
              I did, it found the 2 lines out of 15000 that managed to trip the
              duplicate key error. I've since corrected it and am feeling much more
              confident in my troubleshooting skills for the future. Thank you very
              much.

              -pk
