datamining .txt-files, library?

This topic is closed.
  • globalrev

    datamining .txt-files, library?

    I have a big collection of .txt files that I want to open and parse to
    extract information.

    Is there a library for this, or maybe even something built in?
  • Chris

    #2
    Re: datamining .txt-files, library?

    On May 15, 2:27 pm, globalrev <skanem...@yahoo.se> wrote:
    > I have a big collection of .txt files that I want to open and parse to
    > extract information.
    >
    > Is there a library for this, or maybe even something built in?

    Use os.open to open the files and iterate through them, and the built-in
    string functions to parse them.
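
A minimal sketch of the "open, iterate, use string methods" idea. It uses the built-in open rather than os.open, because the built-in returns a file object that can be iterated line by line. The "key: value" layout and the names parse_line and parse_file are assumptions for illustration; the thread never says what the files actually contain.

```python
import tempfile
from pathlib import Path


def parse_line(line):
    """Split a 'key: value' line into a (key, value) pair, or return None."""
    if ":" not in line:
        return None
    key, _, value = line.partition(":")
    return key.strip(), value.strip()


def parse_file(path):
    """Read one text file line by line and collect its key/value pairs."""
    record = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            pair = parse_line(line)
            if pair is not None:
                key, value = pair
                record[key] = value
    return record


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("name: Ada\ncity: London\nfree text line\n", encoding="utf-8")
    result = parse_file(sample)

print(result)
```

Lines without a colon are simply skipped here; real data may need stricter handling.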


    • George Sakkis

      #3
      Re: datamining .txt-files, library?

      On May 15, 8:27 am, globalrev <skanem...@yahoo.se> wrote:
      > I have a big collection of .txt files that I want to open and parse to
      > extract information.
      >
      > Is there a library for this, or maybe even something built in?

      This has a lot to do with how well-structured your files are and what
      kind of information you hope to extract (shallow or deep). NLTK might
      be a good starting point: http://nltk.sourceforge.net/

      George


      • globalrev

        #4
        Re: datamining .txt-files, library?

        On 15 May, 14:40, George Sakkis <george.sak...@gmail.com> wrote:
        > On May 15, 8:27 am, globalrev <skanem...@yahoo.se> wrote:
        >
        > > I have a big collection of .txt files that I want to open and parse to
        > > extract information.
        > >
        > > Is there a library for this, or maybe even something built in?
        >
        > This has a lot to do with how well-structured your files are and what
        > kind of information you hope to extract (shallow or deep). NLTK might
        > be a good starting point: http://nltk.sourceforge.net/
        >
        > George

        Super-structured: 10K+ files, and they all look the same.
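
When every file has the same layout, one parser function can be applied to the whole directory and the results collected as rows. The sketch below assumes a purely hypothetical layout (one field per line: title, author, year) and writes the rows out as CSV text; FIELDS, parse_uniform_file and parse_directory are illustrative names, not anything from the thread.

```python
import csv
import io
import tempfile
from pathlib import Path

FIELDS = ["title", "author", "year"]  # hypothetical layout: one field per line


def parse_uniform_file(path):
    """Map the first len(FIELDS) lines of a file onto the field names."""
    with open(path, encoding="utf-8") as handle:
        lines = [line.strip() for line in handle]
    return dict(zip(FIELDS, lines))


def parse_directory(directory):
    """Apply the same parser to every .txt file, in sorted name order."""
    return [parse_uniform_file(p) for p in sorted(Path(directory).glob("*.txt"))]


# Build three small sample files so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        text = f"Title {i}\nAuthor {i}\n{2000 + i}\n"
        (Path(tmp) / f"doc{i}.txt").write_text(text, encoding="utf-8")
    rows = parse_directory(tmp)

# Collect the records as CSV text; a real run would write to a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

With 10K+ files this reads one file at a time, so memory use is driven by the collected rows rather than by the raw text.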


        • Ville M. Vainio

          #5
          Re: datamining .txt-files, library?

          Chris <cwitts@gmail.com> writes:
          > On May 15, 2:27 pm, globalrev <skanem...@yahoo.se> wrote:
          > > I have a big collection of .txt files that I want to open and parse to
          > > extract information.
          > >
          > > Is there a library for this, or maybe even something built in?
          >
          > Use os.open to open the files and iterate through them, and the built-in
          > string functions to parse them.

          Or, more probably, the regular expression library "re".
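
A small sketch of what parsing with "re" could look like. The line format ("date LEVEL message") is invented for the example, as are the names LINE_PATTERN and extract_records; only the re calls themselves are standard library.

```python
import re

# Hypothetical record format: "2008-05-15 ERROR disk full"
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<level>[A-Z]+)\s+(?P<message>.+)$"
)


def extract_records(text):
    """Return one dict per line that matches LINE_PATTERN; skip the rest."""
    records = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records


sample = """2008-05-15 ERROR disk full
not a record
2008-05-16 INFO backup finished"""

records = extract_records(sample)
for record in records:
    print(record)
```

Compiling the pattern once matters when the same expression is applied to thousands of files.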


          • Bruno Desthuilliers

            #6
            Re: datamining .txt-files, library?

            Chris wrote:
            > On May 15, 2:27 pm, globalrev <skanem...@yahoo.se> wrote:
            > > I have a big collection of .txt files that I want to open and parse to
            > > extract information.
            > >
            > > Is there a library for this, or maybe even something built in?
            >
            > Use os.open to open the files

            What's wrong with the (builtin) open?
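
To illustrate the difference being pointed at: os.open is the low-level call and returns an integer file descriptor that has to be read with os.read and closed with os.close, while the built-in open returns a file object that decodes text and can be iterated line by line. A short sketch, using a temporary file created only for the demonstration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("first line\nsecond line\n")

    # Low-level route: an integer descriptor, raw bytes, manual cleanup.
    fd = os.open(path, os.O_RDONLY)
    try:
        raw = os.read(fd, 1024)
    finally:
        os.close(fd)

    # Built-in route: a file object that yields decoded lines.
    with open(path, encoding="utf-8") as handle:
        lines = [line.rstrip("\n") for line in handle]

print(type(fd).__name__, type(raw).__name__)
print(lines)
```

For plain text parsing the built-in open is usually the more convenient of the two.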


            • martindesalinas@gmail.com

              #7
              Re: datamining .txt-files, library?

              Look at the re module (regular expressions) or pyparser.

              See http://nedbatchelder.com/text/python-parsers.html


              • castironpi

                #8
                Re: datamining .txt-files, library?

                On May 16, 3:22 am, martindesali...@gmail.com wrote:
                > Look at the re module (regular expressions) or pyparser.
                >
                > See http://nedbatchelder.com/text/python-parsers.html

                This ties in to 'call tree tool?' from yesterday. Do we have any
                visualization modules? My two examples were 're' and 'call trees'.
