buffering choking sys.stdin.readlines() ?

  • cshirky

    buffering choking sys.stdin.readlines() ?

    Newbie question:

    I'm trying to turn a large XML file (~7G compressed) into a YAML file,
    and my program seems to be buffering the input.

    IOtest.py is just

    import sys
    for line in sys.stdin.readlines():
        print line

    but when I run

    $ gzcat bigXMLfile.gz | IOtest.py

    it hangs, then dies.

    The goal of the program is to build a YAML file with print statements,
    rather than building a gigantic nested dictionary, but I am obviously
    doing something wrong in passing input through without buffering. Any
    advice gratefully fielded.

    -clay
  • Diez B. Roggisch

    #2
    Re: buffering choking sys.stdin.readlines() ?

    cshirky wrote:
    > Newbie question:
    >
    > I'm trying to turn a large XML file (~7G compressed) into a YAML file,
    > and my program seems to be buffering the input.
    >
    > IOtest.py is just
    >
    > import sys
    > for line in sys.stdin.readlines():
    >     print line
    >
    > but when I run
    >
    > $ gzcat bigXMLfile.gz | IOtest.py
    >
    > it hangs, then dies.
    >
    > The goal of the program is to build a YAML file with print statements,
    > rather than building a gigantic nested dictionary, but I am obviously
    > doing something wrong in passing input through without buffering. Any
    > advice gratefully fielded.

    readlines() reads the whole file into memory before the loop even starts.
    Try using xreadlines(), the lazy version, instead. And I'm not 100% sure,
    but I *think* doing

    for line in sys.stdin:
    ...

    does exactly that.
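
    A minimal sketch of that idea, under a couple of assumptions: the helper
    name copy_lines is made up for illustration, and write() is used in place
    of the print statement so the same code should run on both Python 2 and
    Python 3. It only passes lines through unchanged; the XML-to-YAML step is
    not shown.

```python
import sys


def copy_lines(src, dst):
    """Copy src to dst one line at a time; return the number of lines copied."""
    count = 0
    # Iterating over the file object fetches lines lazily, so only the
    # current line (plus the file's read buffer) is held in memory,
    # unlike readlines(), which builds a list of every line first.
    for line in src:
        dst.write(line)  # each line keeps its own trailing newline
        count += 1
    return count


# In IOtest.py this would be called as:
#     copy_lines(sys.stdin, sys.stdout)
```

    Note that "print line" in the original script adds a second newline after
    each line, since every line read from the file already ends with one;
    write() avoids that.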

    Diez


    • cshirky

      #3
      Re: buffering choking sys.stdin.readlines() ?

      > readlines() reads all of the file into the memory. Try using xreadlines,
      > the generator-version, instead. And I'm not 100% sure, but I *think* doing
      >
      > for line in sys.stdin:

      Both work -- many thanks.

      -clay
