Re: Parse each line by character location

This topic is closed.
  • Lie

    Re: Parse each line by character location

    On Nov 5, 2:29 pm, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
    [snip]
    >         So you have a classic (especially for COBOL and older FORTRAN) fixed
    > field record layout, no?
    >
    >         I presume the entire file is of a single layout? That would mean
    > only one splitting format is needed...
    >
    [snip]
    >
    >         Note that all fields are still in character format. As has been
    > noted, I'm sure, if you try to turn the third field into an integer, you
    > may have problems, depending upon how you do the conversion -- leading 0
    > implies octal if it were a literal, though it seems int() handles it
    > correctly (Python 2.5)
    from help(int)
    | ... If base is zero, the proper base is guessed based on the
    | string content. ...

    int(x) always converts x in base 10.
    int(x, 0) converts according to integer-literal notation: the prefix
    '0x' means base 16 (hex), the prefix '0' means base 8 (octal) in
    Python 2, and everything else is base 10. (Python 3 spells the octal
    prefix '0o' and rejects strings such as '010' when the base is 0.)
    int(x, n) converts in base n, where 2 <= n <= 36.

    If you're still in doubt, just pass the base explicitly:
    int(x, 10) (superfluous, though).
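To make the base rules above concrete, here is a small sketch written for Python 3 (an assumption; the thread itself was about Python 2.5, where a bare leading '0' with base 0 meant octal). The helper name parse_int_field is made up for illustration.

```python
def parse_int_field(text, base=10):
    """Convert a fixed-width numeric field such as ' 0042' to an int.

    Hypothetical helper: strips padding, then defers to the built-in int().
    """
    return int(text.strip(), base)


# Base 10 (the default): leading zeros are simply ignored.
print(parse_int_field("0042"))

# Base 0: the prefix decides, following integer-literal notation.
print(int("0x1F", 0))   # hexadecimal
print(int("0o17", 0))   # octal, Python 3 spelling

# Explicit base between 2 and 36.
print(int("z", 36))

# On Python 3, a bare leading zero with base 0 is rejected.
try:
    int("010", 0)
except ValueError as exc:
    print("rejected:", exc)
```

For zero-padded fields read from a file, plain int(x) or int(x, 10) is the safe choice, since neither ever treats the padding as an octal marker.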
  • Shawn Milochik

    #2
    Re: Parse each line by character location

    I work with tab-delimited files for a living. Because of the same need
    you have, I created a Python script to do this. It has usage
    information that is easy to follow (just run it without any
    arguments).

    I hope someone else finds this useful. I have, and use it every month.
    It can be easily modified to create comma-delimited files, but that's
    something I never use, so it does tabs.





    Usage:
    fwconvert -r rulesFile fileName [-t|-f]
    or
    cat filename | fwconvert -r rulesFile (-t|-f)

    -t (to tab) or -f (to fixed-width) required when piping input to
    script. Otherwise, it will be auto-determined.


    Rules file format:
    fieldStart:fieldLength,fieldStart:fieldLength ...
    Example:
    1:3,4:20,24:5
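For anyone who wants to see how a rules string like that could drive the splitting, here is a rough sketch. It is not the fwconvert script itself; the function names parse_rules and fixed_to_tab are invented here, and it assumes the start positions count from 1 (which matches the example, since 1+3=4 and 4+20=24).

```python
def parse_rules(spec):
    """Turn 'start:length,start:length,...' into (begin, end) slice bounds.

    Assumption: 'start' is 1-based, so it is shifted down by one for slicing.
    """
    bounds = []
    for rule in spec.strip().split(","):
        start, length = (int(part) for part in rule.split(":"))
        begin = start - 1
        bounds.append((begin, begin + length))
    return bounds


def fixed_to_tab(line, bounds):
    """Slice one fixed-width line by character position and join with tabs.

    Padding is stripped from each field; the fields stay as strings.
    """
    fields = [line[begin:end].strip() for begin, end in bounds]
    return "\t".join(fields)


rules = parse_rules("1:3,4:20,24:5")

# Made-up sample record: 3-char code, 20-char name, 5-char quantity.
record = "007" + "Widget".ljust(20) + "00042"
print(repr(fixed_to_tab(record, rules)))
```

Stripping the padding is a design choice for tab-delimited output; drop the .strip() call if the exact column contents matter.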
