Fastest Way to search for a string in a large text file (75 to 100mb)

This topic is closed.
  • Clinto

    Fastest Way to search for a string in a large text file (75 to 100mb)

    Hi,
    I am trying to find the fastest way to search a text file for a
    particular string and return the line that contains it. So far I have
    used only the most basic method: I initialize a variable as an
    IO.StreamReader, read each line, and use an If-Then to check whether
    var.Contains(my string) is true. If it is, I have my line; if not, I
    read the next line. This takes forever. Is there anything I
    can do to speed this up?
    Thanks.
  • Chris K.

    #2
    Re: Fastest Way to search for a string in a large text file (75 to 100mb)

    How big are these files?

    • Tom Shelton

      #3
      Re: Fastest Way to search for a string in a large text file (75 to 100mb)



      Clinto wrote:
      > I am trying to find the fastest way to search a txt file for a
      > particular string and return the line that contains the string.
      > [...] Is there anything I can do to speed this up?

      If the file is only 100 MB, then seriously, I would just read
      the entire file at once and process it in memory. If you read
      line by line, you are going to hit the disk a lot, and that will really
      slow you down.

      If you don't want to do it that way, you might want to read the
      file in chunks as binary data, then convert your bytes to strings and
      do your compares. Of course, that is going to make it a little
      tricky, because a chunk might end in the middle of a line.
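
A minimal sketch of the chunked approach, written in Python for brevity rather than VB.NET. It assumes the search string contains no line break and that the file is UTF-8, where a byte-level match is safe. The name `find_line_chunked` and the 1 MB default chunk size are illustrative choices, not anything from the thread. The mid-line problem is handled by holding back whatever follows the last newline in each chunk and prepending it to the next chunk.

```python
import io


def find_line_chunked(stream, needle, chunk_size=1 << 20, encoding="utf-8"):
    """Return the first line containing needle, or None if there is no match.

    stream is any binary file-like object (hypothetical helper, not a library API).
    """
    needle_bytes = needle.encode(encoding)
    carry = b""  # unfinished last line left over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = carry + chunk
        cut = buf.rfind(b"\n")
        if cut == -1:
            # No complete line yet: keep accumulating.
            carry = buf
            continue
        # Search only complete lines; hold back the partial tail.
        complete, carry = buf[:cut + 1], buf[cut + 1:]
        pos = complete.find(needle_bytes)
        if pos != -1:
            start = complete.rfind(b"\n", 0, pos) + 1
            end = complete.find(b"\n", pos)
            return complete[start:end].rstrip(b"\r").decode(encoding)
    # The file may not end with a newline, so check the leftover too.
    if needle_bytes in carry:
        return carry.rstrip(b"\r").decode(encoding)
    return None
```

With a real file this would be called as `find_line_chunked(open(path, "rb"), "text to find")`. Only the matching line is decoded, so the cost of converting bytes to text is paid once rather than for every chunk.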

      --
      Tom Shelton


      • kimiraikkonen

        #4
        Re: Fastest Way to search for a string in a large text file (75 to 100mb)

        On Feb 28, 7:49 am, Tom Shelton <tom_shel...@comcast.net> wrote:
        > If you don't want to do it that way, then you might want to read the
        > file in chunks as binary data - then convert your bytes to strings and
        > do your compares...

        I agree, Tom. About 100 MB is a huge size for a text file. Reading it as
        raw bytes and then converting them to strings is a good idea, but the key
        point is how to do it programmatically :-)


        • (O)enone

          #5
          Re: Fastest Way to search for a string in a large text file (75 to 100mb)

          kimiraikkonen wrote:
          > [...] reading it as raw then converting each byte to string is a
          > good idea, but the key point is how to do it programmatically :-)

          The most efficient way would presumably be to read the entire file into a
          single string using IO.File.ReadAllText and see whether your search string
          is contained within the file at all (which you can do with a single
          call to .Contains). If it's not there, then there's no point trying to work
          out which line it's on, and you can stop looking straight away.

          If you do find the search string, you can count the line breaks that appear
          before it to work out which line it's on.
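
A rough sketch of the read-everything-then-search idea, in Python for brevity rather than VB.NET. Counting the preceding line breaks needs the position of the match, so the sketch uses a search that returns an index (in .NET that would be IndexOf rather than Contains, which only answers yes or no). The function name `find_in_text` is hypothetical, and the sketch assumes the search string itself contains no line break.

```python
def find_in_text(text, needle):
    """Return (line_number, line) for the first match, or None if absent."""
    pos = text.find(needle)
    if pos == -1:
        return None  # not in the file at all, so there is nothing more to do
    # Line breaks before the match give the 1-based line number.
    line_number = text.count("\n", 0, pos) + 1
    # Expand from the match position to the surrounding line boundaries.
    start = text.rfind("\n", 0, pos) + 1
    end = text.find("\n", pos)
    if end == -1:
        end = len(text)  # match is on the last line, which has no trailing newline
    return line_number, text[start:end].rstrip("\r")
```

To try it on a real file, the text can be loaded with `open(path, newline="").read()`, which keeps the original line endings intact.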

          HTH,

          --

          (O)enone



          • Family Tree Mike

            #6
            Re: Fastest Way to search for a string in a large text file (75 to



            "(O)enone" wrote:
            > The most efficient way would presumably be to read the entire file into a
            > single string using IO.File.ReadAllText and see whether your search string
            > is contained within the file at all [...]

            I would use System.IO.File.ReadAllLines(Filename), because this returns the
            lines already split out for you. You just loop through the array of
            individual lines.
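
The line-array variant can be sketched the same way, again in Python rather than VB.NET. Here any sequence of strings stands in for the array that ReadAllLines returns; `find_in_lines` is a hypothetical name.

```python
def find_in_lines(lines, needle):
    """Return (line_number, line) for the first line containing needle, or None."""
    for number, line in enumerate(lines, start=1):
        if needle in line:
            return number, line  # stop at the first hit
    return None
```

For a string already in memory, `find_in_lines(text.splitlines(), "text to find")` produces the list of lines. The trade-off is that every line becomes its own string object before the search starts, whether or not the search string is present.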

            • (O)enone

              #7
              Re: Fastest Way to search for a string in a large text file (75 to

              Family Tree Mike wrote:
              > I would use System.IO.File.ReadAllLines(Filename), because this
              > returns the lines split out for you. You just loop through the array
              > of individual lines in the array.

              I did originally write the same thing in my message but then chose to remove
              it before I posted. I think the ReadAllText approach may be quicker
              because you can check whether the string exists at all without having to
              loop. You could then possibly determine the line by calling
              Replace() on the part of the string before the match position, changing each
              two-character line break to a one-character replacement string, and then
              seeing how much smaller the string has become; the number of characters it
              shrinks by is the number of lines that precede the match.
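
The length-difference trick can be sketched as below, in Python rather than VB.NET. It assumes the text still contains two-character CRLF line breaks (in Python, a file opened with `newline=""`), and `line_breaks_before` is a hypothetical name. The result is the number of line breaks before the match, so the 1-based line number is that value plus one.

```python
def line_breaks_before(text, pos):
    """Count CRLF line breaks in text[:pos] by measuring how much Replace shrinks it."""
    prefix = text[:pos]
    shrunk = prefix.replace("\r\n", "\n")  # each CRLF loses exactly one character
    return len(prefix) - len(shrunk)
```

This builds a modified copy of everything before the match just to measure it. A direct count such as `text.count("\r\n", 0, pos)` gives the same number without the copy, so it is worth timing both before settling on one.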

              Maybe it needs someone to try it to see which is more efficient.

              --

              (O)enone



              • Clinto

                #8
                Re: Fastest Way to search for a string in a large text file (75 to 100mb)

                On Feb 27, 10:53 pm, "Chris K." <ckoeber[Do Not Spam]@googlesemailservice.figureitout> wrote:
                > How big are these files?

                Usually anywhere from 75 to 100mb.


                • Clinto

                  #9
                  Re: Fastest Way to search for a string in a large text file (75 to

                  On Feb 28, 6:58 am, "(O)enone" <oen...@nowhere.com> wrote:
                  > I think the ReadAllText approach may be quicker
                  > because you can check whether the string exists at all without having to
                  > loop... [...]
                  > Maybe needs someone to try it to see which is more efficient.

                  Thanks everyone, I appreciate the responses. I tried several methods
                  (ReadAllText, IO.FileStream, ReadAllLines) and all seem about the same.
                  It became apparent that I am also fighting a slow server connection,
                  which increases the time to open the files.


                  • Cor Ligthert[MVP]

                    #10
                    Re: Fastest Way to search for a string in a large text file (75 to 100mb)

                    Clinto,

                    Use the Visual Basic Find, as that is optimized for strings. Any other
                    method will be slower, because those are optimized for characters.

                    Cor

