Importing CSV files

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • tgru

    Importing CSV files

    Hello,

    I need to import a bunch of .csv files. The problem I am having is the
    "non data" information in the files creating bogus rows and column
    definitions. Here is an example of the csv file.

    CBOT - End-of-Day Futures Bulk Download 2001.
    2 Year U.S. Treasury Notes Futures

    Date Symbol Month Code Year Code Open
    20010103 ZT H 2001 102.09375
    20010104 ZT H 2001 102.03125
    20010105 ZT H 2001 102.28125


    In this case, there are bogues rows created with the text at the
    beginning of the file, and also the column names get placed into a row
    as well. My question is; how do you import the file and strip out the
    "non-data" data? so that only the actual data gets inserted into the db?

    Thanks,

    TGru

    *** Sent via Developersdex http://www.developersdex.com ***
    Don't just participate in USENET...get rewarded for it!
  • hharry

    #2
    Re: Importing CSV files

    Write a dotnet or vb exe to loop thru your csv file and place valid
    rows in a separate file, then run bcp using this new file.

    Comment

    • David Portas

      #3
      Re: Importing CSV files

      Use BCP or BULK INSERT with the FIRSTROW (-F) option.

      --
      David Portas
      SQL Server MVP
      --

      Comment

      • Erland Sommarskog

        #4
        Re: Importing CSV files

        tgru (tgru@devdex.co m) writes:[color=blue]
        > I need to import a bunch of .csv files. The problem I am having is the
        > "non data" information in the files creating bogus rows and column
        > definitions. Here is an example of the csv file.
        >
        > CBOT - End-of-Day Futures Bulk Download 2001.
        > 2 Year U.S. Treasury Notes Futures
        >
        > Date Symbol Month Code Year Code Open
        > 20010103 ZT H 2001 102.09375
        > 20010104 ZT H 2001 102.03125
        > 20010105 ZT H 2001 102.28125[/color]

        That doesn't look like a CSV file to me...
        [color=blue]
        > In this case, there are bogues rows created with the text at the
        > beginning of the file, and also the column names get placed into a row
        > as well. My question is; how do you import the file and strip out the
        > "non-data" data? so that only the actual data gets inserted into the db?[/color]

        Was it only the column headers, you could use -F 2 with BCP to specify
        that the BCP is to start with the second record. (I believe -F is the
        option. Double-check with Books Online.) But the introducing text is
        more difficult to handle. Assuming that there are no tabs in the text,
        BCP may consider the row as an error. Then again, you can control how
        many errors BCP may accept, so if you only can make sure that you get
        in sync, it may be possible.

        However, hharry's suggestion that you write a program that strips the
        header info, is probably a safer way.


        --
        Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

        Books Online for SQL Server SP3 at
        Accelerate your AI application's time to market by harnessing the power of your own data and the built-in AI capabilities of SQL Server 2025, the enterprise database with best-in-class security, performance and availability.

        Comment

        • David Rawheiser

          #5
          Re: Importing CSV files


          "tgru" <tgru@devdex.co m> wrote in message news:421f97a6$1 _2@127.0.0.1...[color=blue]
          > Hello,
          >
          > I need to import a bunch of .csv files. The problem I am having is the
          > "non data" information in the files creating bogus rows and column
          > definitions. Here is an example of the csv file.
          >
          > CBOT - End-of-Day Futures Bulk Download 2001.
          > 2 Year U.S. Treasury Notes Futures
          >
          > Date Symbol Month Code Year Code Open
          > 20010103 ZT H 2001 102.09375
          > 20010104 ZT H 2001 102.03125
          > 20010105 ZT H 2001 102.28125
          >
          >
          > In this case, there are bogues rows created with the text at the
          > beginning of the file, and also the column names get placed into a row
          > as well. My question is; how do you import the file and strip out the
          > "non-data" data? so that only the actual data gets inserted into the db?
          >
          > Thanks,
          >
          > TGru
          >
          > *** Sent via Developersdex http://www.developersdex.com ***
          > Don't just participate in USENET...get rewarded for it![/color]


          Take a look at AWK (I use the GNU version GAWK).

          This is a old old text stream (file) processing utility that uses regular
          expressions to let you filter and/or reformat data.

          You could have it create a new import file that ignores blank lines and
          headers (ie. starts with "CBOT').



          Comment

          Working...