Looking for Form Feeds

  • Greg Lindstrom

    Looking for Form Feeds

    Hello-
    I have a file generated by an HP-9000 running Unix containing form feeds
    signified by ^M^L. I am trying to scan for the linefeed to signal
    certain processing to be performed but can not get the regex to "see"
    it. Suppose I read my input line into a variable named "input"

    The following does not seem to work...
    input = input_file.readline()
    if re.match('\f', input): print 'Found a formfeed!'
    else: print 'No linefeed!'

    I also tried to create a ^M^L (typed in as <ctrl>Q M <ctrl>Q L) but that
    gives me a syntax error when I try to run the program (re does not like
    the control characters, I guess). Is it possible for me to pull out the
    formfeeds in a straightforward manner?

    Thanks!
    --greg

    --
    Greg Lindstrom 501 975.4859
    Computer Programmer greg.lindstrom@novasyshealth.com
    NovaSys Health
    Little Rock, Arkansas

    "We are the music makers, and we are the dreamers of dreams." W.W.

    Confidentiality Notice
    ----------------------
    This email and any attachments to it are privileged and confidential and are intended solely for use of the individual or entity to which they are addressed. If the reader of this message is not the intended recipient, any use, distribution, or copying of this communication, or disclosure of all or any part of its content to any other person, is strictly prohibited. If you have received this communication in error, please notify the sender by replying to this message and destroy this message and delete any copies held in your electronic files. Thank you.

  • Erik Max Francis

    #2
    Re: Looking for Form Feeds

    Greg Lindstrom wrote:
    > I have a file generated by an HP-9000 running Unix containing form feeds
    > signified by ^M^L. I am trying to scan for the linefeed to signal
    > certain processing to be performed but can not get the regex to "see"
    > it. Suppose I read my input line into a variable named "input"
    >
    > The following does not seem to work...
    > input = input_file.readline()
    > if re.match('\f', input): print 'Found a formfeed!'
    > else: print 'No linefeed!'
    >
    > I also tried to create a ^M^L (typed in as <ctrl>Q M <ctrl>Q L) but that
    > gives me a syntax error when I try to run the program (re does not like
    > the control characters, I guess). Is it possible for me to pull out the
    > formfeeds in a straightforward manner?

    What's happening is that you're using .match, so you're only checking
    for matches at the _start_ of the string, not anywhere within it.
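
A minimal sketch of the difference, using a made-up sample line (the string below is an assumption, not taken from the original file):

```python
import re

# Hypothetical line that ends with CR + FF, as the original post describes.
line = "end of page 1\r\f"

# re.match only tries the pattern at position 0 of the string.
anchored = re.match("\f", line)

# re.search scans the whole string for the first occurrence.
anywhere = re.search("\f", line)

print(anchored)          # None, because the line does not start with a form feed
print(anywhere.start())  # index of the form feed within the line
```

So if a regex is wanted at all, re.search is the call that looks inside the line.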

    It's easier than you think actually; you're just looking for substrings,
    so searching with .find on strings is probably sufficient:

    if line.find('\f') >= 0: ...

    If you want to look for ^M^L, that'd be '\r\f':

    if line.find('\r\f') >= 0: ...

    If you want to keep a running count, you can use .count, which will
    count the number of substrings in the line.
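
As a rough illustration of the plain string methods mentioned above, here is a short sketch on an invented sample string (the contents of `page` are an assumption):

```python
# Hypothetical text containing two CR + FF page breaks.
page = "header\r\fbody line\r\ffooter\n"

# Membership test: is there a form feed anywhere in the text?
has_ff = "\f" in page

# str.find returns the index of the first match, or -1 if there is none.
first_crff = page.find("\r\f")

# str.count returns the number of non-overlapping occurrences.
ff_count = page.count("\f")

print(has_ff, first_crff, ff_count)
```

The `in` operator is usually the most readable choice when only a yes/no answer is needed.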

    --
    Erik Max Francis && max@alcyone.com && http://www.alcyone.com/max/
    San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
    I would have liked to have seen Montana.
    -- Capt. Vasily Borodin


    • John Machin

      #3
      Re: Looking for Form Feeds

      Greg Lindstrom wrote:
      > Hello-
      > I have a file generated by an HP-9000 running Unix containing form feeds
      > signified by ^M^L. I am trying to scan for the linefeed to signal
      > certain processing to be performed but can not get the regex to "see"
      > it. Suppose I read my input line into a variable named "input"
      >
      > The following does not seem to work...
      > input = input_file.readline()

      You are shadowing a builtin (input).
      > if re.match('\f', input): print 'Found a formfeed!'
      > else: print 'No linefeed!'

      formfeed == not not linefeed????
      > I also tried to create a ^M^L (typed in as <ctrl>Q M <ctrl>Q L) but that
      > gives me a syntax error when I try to run the program (re does not like
      > the control characters, I guess). Is it possible for me to pull out the
      > formfeeds in a straightforward manner?

      For a start, resolve your confusion between formfeed and linefeed.

      Formfeed makes your printer skip to the top of a new page (form),
      without changing the column position. FF, '\f', ctrl-L, 0x0C.
      Linefeed makes the printer skip to a new line, without changing the
      column position. LF, '\n', ctrl-J, 0x0A.
      There is also carriage return, which makes your typewriter return to
      column 1, without moving to the next line. CR, '\r', ctrl-M, 0x0D.
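
The code points for these three characters can be checked from the interpreter. A small sketch (the `controls` dictionary is only an illustrative name):

```python
# Map each control character's abbreviation to its Python escape.
controls = {"FF": "\f", "LF": "\n", "CR": "\r"}

# Print the escape and its code point in hexadecimal.
for name, ch in controls.items():
    print(f"{name}: {ch!r} = 0x{ord(ch):02X}")

# Collect the numeric values for later comparison.
codes = {name: ord(ch) for name, ch in controls.items()}
```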

      Now you can probably guess why the writer of your report file is
      emitting "\r\f". What we can't guess for you is where in your file
      these "\r\f" occurrences are in relation to the newlines (i.e. '\n')
      which Python is interpreting as line breaks. As others have pointed
      out, (1) re.match works on the start of the string and (2) you probably
      don't need to use re anyway. The solution may be as simple as: if
      input_line[:2] == "\r\f":
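
A hedged sketch of that start-of-line check in current Python. It assumes the "\r\f" pair sits at the beginning of a line, which the thread cannot confirm, and it reads bytes so that the '\r' is not translated by text-mode newline handling. The sample data and the helper name `page_break_lines` are invented for illustration:

```python
import io

# Hypothetical report contents standing in for the real file.
sample = b"page 1 line\n\r\fpage 2 line\nlast line\n"

def page_break_lines(stream):
    """Return the 1-based numbers of lines that begin with CR + FF."""
    hits = []
    for number, raw_line in enumerate(stream, start=1):
        # Binary streams split lines on b"\n" only, so "\r\f" stays intact.
        if raw_line.startswith(b"\r\f"):
            hits.append(number)
    return hits

breaks = page_break_lines(io.BytesIO(sample))
print(breaks)
```

With a real file, the same function could be given the result of open(path, "rb"), where path is whatever the report file is called.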

      BTW, have you checked that there are no other control characters
      embedded in the file, e.g. ESC (introducing an escape sequence), SI/SO
      (change character set), BEL * 100 (Hey, Fred, the printout's finished),
      HT, VT, BS (yeah, probably lots of that, but I mean BackSpace)?
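
One way to check is to tally every control byte in the file. This is a sketch only: the sample bytes are made up, and treating LF, CR and FF as "expected" is an assumption about this particular report.

```python
from collections import Counter

# Assumption: these three are the control characters we already know about.
EXPECTED = {0x0A, 0x0D, 0x0C}  # LF, CR, FF

def control_char_counts(data):
    """Count control bytes (below 0x20, or DEL) other than the expected ones."""
    return Counter(
        b for b in data
        if (b < 0x20 or b == 0x7F) and b not in EXPECTED
    )

# Hypothetical data with backspaces, an escape sequence and a bell.
sample = b"Total\x08\x08\x08\x08\x08Total\r\f\x1b[1mBold\x07\n"

counts = control_char_counts(sample)
for byte, n in sorted(counts.items()):
    print(f"0x{byte:02X}: {n}")
```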
      HTH,
      John

