Want to split a file into columns

  • SL_McManus

    Want to split a file into columns

    Hi All;

    I am fairly new to Perl. I have a file with close to 3000 lines
    that I would like to split out in a certain way. I would like to put
    the record type starting in column 1 for 2 spaces, the employer code
    in column 23 for 29 spaces and employer description in column 53 for
    30 spaces. I have tried modifying an existing file with no real
    success. I haven't found anything that specifically answers my
    question. Any guidance would be appreciated.

    My input file would look like this

    09 A/B A & B Construction Co.

    I'm working on a script so that I don't have to move all of these
    manually. The code I am working on is the following. I didn't want to
    delete anything, so that is why a good bit of it is commented out.
    Thanks in Advance




    my ($mday,$mon,$year) = (localtime(time))[3,4,5];
    $year -= 100 ;
    $mon++;
    my $date = sprintf("%02d%02d%02d%02d","20",$year,$mon,$mday);
    my @records = ( );
    $infile .= '.' . $date;
    open(INPUT, $infile) or die "Unable to open $infile: $!\n";
    #@lines = <INPUT>; # reads in all lines when called like this
    my $inrecord = 0;
    my $i = -1; # $i++ the first time will make this 0

    while(<INPUT>) {
    #($a,$b,$c)=split(" ",$string);
    $str=split(" ",split,3);

    #my $a = substr(1, 2);
    #my $b = substr(23, 29);
    #my $c = substr(53, 29);

    #print $str[0];
    #print " ";
    #print $str[1];
    #print " ";
    #print $str[2];
    #print "\n";
    }
    close(INPUT);


    my $record;
    open(OFH, '>' . $outfile) or die $!;
    for $record (@records) {
    # split it up per-line
    #for $line (split(/\r?\n/, $record)) {
    #my $a = substr($line, 1, 2);
    #my $b = substr($line, 23, 29);
    #my $c = substr($line, 53, 29);

    #print @records;
    #print "TESTING!!! \n";
    #print $a;
    #print " ";
    #print $string;
    #print " ";
    #print $c;
    #print $line;
    #print "\n";
    #print $record;
    #print "\n";

    #print OFH $record;

    #open(OFH, '>' . $outfile) or die $!;

    # my $a = substr($a, 1, 2);
    # my $b = substr($b, 23, 29);
    # my $c = substr($c, 53, 29);

    # print the whole $record to the outfile
    # print OFH $a,$b,$c;
    #}
    }
    # }
    #}
    close(OFH);
  • David Trudgett

    #2
    Re: Want to split a file into columns

    SL_McManus wrote:
    > Hi All;
    >
    > I am fairly new to Perl. I have a file with close to 3000 lines
    > that I would like to split out in a certain way. I would like to put
    > the record type starting in column 1 for 2 spaces, the employer code
    > in column 23 for 29 spaces and employer description in column 53 for
    > 30 spaces. I have tried modifying an existing file with no real
    > success. I haven't found anything that specifically answers my
    > question. Any guidance would be appreciated.
    >
    > My input file would look like this
    >
    > 09 A/B A & B Construction Co.


    Well, there's more than one way to do it (hey, this is Perl, right? :-))
    I usually wimp out with a regular expression something like this

    while (<>) {
        my ($first, $second, $third) = /^(.{2})(.{29})(.{30})/;
        print "$first\n$second\n$third\n\n";
    }

    HTH

    David



    --

    We come here upon what, in a large proportion of cases, forms the
    source of the grossest errors of mankind. Men on a lower level of
    understanding, when brought into contact with phenomena of a higher
    order, instead of making efforts to understand them, to raise
    themselves up to the point of view from which they must look at the
    subject, judge it from their lower standpoint, and the less they
    understand what they are talking about, the more confidently and
    unhesitatingly they pass judgment on it.

    -- Leo Tolstoy, "The Kingdom of God is Within You"
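
A quick way to try the regular-expression approach from this reply is to feed it a line you have padded yourself. The sketch below is only an illustration: it assumes the three fields sit back to back with widths 2, 29 and 30 (as in the pattern), and the sample values and the `$line` variable are made up for the test.

```perl
use strict;
use warnings;

# Hypothetical input: build one fixed-width line with fields of
# width 2, 29 and 30, left-justified and padded with spaces.
my $line = sprintf("%-2s%-29s%-30s", '09', 'A/B', 'A & B Construction Co.');

# Same pattern as in the post: capture 2, then 29, then 30 characters.
my ($first, $second, $third) = $line =~ /^(.{2})(.{29})(.{30})/
    or die "line is shorter than 61 characters\n";

# Strip the padding from each captured field before using it.
s/\s+$// for ($first, $second, $third);

print "[$first] [$second] [$third]\n";
```

If the real file has gaps between the fields, the pattern would need extra uncaptured pieces (for example `.{20}`) between the groups.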



    • Jim Gibson

      #3
      Re: Want to split a file into columns

      In article <b2ecd45e.0310010958.18904f69@posting.google.com>,
      SL_McManus <slmcmanus67@netzero.net> wrote:
      > Hi All;
      >
      > I am fairly new to Perl. I have a file with close to 3000 lines
      > that I would like to split out in a certain way. I would like to put
      > the record type starting in column 1 for 2 spaces, the employer code
      > in column 23 for 29 spaces and employer description in column 53 for
      > 30 spaces. I have tried modifying an existing file with no real
      > success. I haven't found anything that specifically answers my
      > question. Any guidance would be appreciated.
      >
      > My input file would look like this
      >
      > 09 A/B A & B Construction Co.
      >
      > I'm working on a script so that I don't have to move all of these
      > manually. The code I am working on is the following. I didn't want to
      > delete anything, so that is why a good bit of it is commented out.
      > Thanks in Advance

      In fact, most of it is commented out. You really should post only a
      working (syntax-error-free) program so people can answer specific
      questions. It is hard to tell whether you want guidance on the
      uncommented code or the commented-out code. I will assume the former.
      Add the following to get help from perl:

      use strict;
      use warnings;
      use diagnostics;

      >
      > my ($mday,$mon,$year) = (localtime(time))[3,4,5];
      > $year -= 100 ;
      > $mon++;
      > my $date = sprintf("%02d%02d%02d%02d","20",$year,$mon,$mday);
      > my @records = ( );
      > $infile .= '.' . $date;

      You need a 'my' in front of $infile now that you have 'use strict;' in
      your program. You haven't initialized $infile previously, so you are
      adding ".$date" to an empty string. Is the file you want to read really
      named something like '.20031001'?
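
For what it's worth, here is one way the date-stamped file name could be built with `$infile` properly declared. This is a sketch, not the original poster's code: the base name `employers` is invented for illustration, and `POSIX::strftime` is swapped in for the manual `localtime` arithmetic.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical base name; replace with the real prefix of your file.
my $base = 'employers';

# strftime handles the year and month offsets that localtime returns,
# so no manual "$year -= 100" or "$mon++" is needed.
my $date   = strftime("%Y%m%d", localtime);
my $infile = "$base.$date";

print "$infile\n";   # for example employers.20031001
```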
      > open(INPUT, $infile) or die "Unable to open $infile: $!\n";
      > #@lines = <INPUT>; # reads in all lines when called like this
      > my $inrecord = 0;
      > my $i = -1; # $i++ the first time will make this 0

      You don't seem to use $inrecord or $i below, so these could be removed.
      >
      > while(<INPUT>) {
      > #($a,$b,$c)=split(" ",$string);
      > $str=split(" ",split,3);

      This won't work. You are using the function split twice here. The line
      you just read is in $_, so you should at least be using
      split(" ",$_,3). Of course, 'split(" ",...' won't work if you have
      spaces in your field, which your example does. Also, you are assigning
      the return from the split function to a scalar, so you will get only
      the number of fields split and not the fields themselves. The split
      data actually ends up in the @_ array, but you probably didn't know
      that (I didn't until I looked it up).
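
The difference between the two contexts can be seen in a small sketch. It uses the sample line from the original post as input (an assumption about the real data). Note the limitation: with a limit of 3, only the last field may contain spaces, so this works only when the record type and employer code never do. Also, current perldoc says that perls before 5.11 overwrote @_ in scalar context, while newer ones just return the count.

```perl
use strict;
use warnings;

my $line = "09 A/B A & B Construction Co.\n";
chomp $line;

# List context: the fields themselves. The limit of 3 keeps everything
# after the second field, spaces included, in the last element.
my ($type, $code, $name) = split(" ", $line, 3);

# Scalar context: only the number of fields.
my $count = split(" ", $line, 3);

print "$type|$code|$name\n";   # 09|A/B|A & B Construction Co.
print "$count\n";              # 3
```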

      Look at the unpack function instead, or use the substr calls you have
      commented out below. Something like this might work, if your fields are
      fixed width in the columns implied by your substr calls, although the
      example data line you gave above doesn't match them (untested):

      ($type,$code,$name) = unpack("A2x19A29A29",$_);

      Of course, these seem to be your desired output format, so your likely
      input format is something else. According to the sample data, this
      should be

      ($type,$code,$name) = unpack("A2x2A3x4A*",$_);
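
To see what that second template does, here is a small sketch. The input line is an assumption: it is the sample record with two spaces after the type and four after the code, which is the spacing the template implies (the posted sample may have had its whitespace collapsed).

```perl
use strict;
use warnings;

# Hypothetical input line: type, two spaces, code, four spaces, name.
my $line = "09  A/B    A & B Construction Co.";

# A2 = 2 characters, x2 = skip 2, A3 = 3 characters, x4 = skip 4,
# A* = everything that is left. "A" strips trailing spaces.
my ($type, $code, $name) = unpack("A2 x2 A3 x4 A*", $line);

print "[$type] [$code] [$name]\n";   # [09] [A/B] [A & B Construction Co.]
```

Whitespace inside the template is ignored by unpack, so it is used here only to make the pieces easier to read.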
      >
      > #my $a = substr(1, 2);
      > #my $b = substr(23, 29);
      > #my $c = substr(53, 29);
      >
      > #print $str[0];
      > #print " ";
      > #print $str[1];
      > #print " ";
      > #print $str[2];
      > #print "\n";
      > }
      > close(INPUT);
      >
      >
      > my $record;
      > open(OFH, '>' . $outfile) or die $!;

      You don't have anything in $outfile, so this will not succeed.
      > for $record (@records) {

      You don't have anything in @records, so this will do nothing, but then
      the rest of your loop is commented out so will do nothing in any case.
      > # split it up per-line
      > #for $line (split(/\r?\n/, $record)) {
      > #my $a = substr($line, 1, 2);
      > #my $b = substr($line, 23, 29);
      > #my $c = substr($line, 53, 29);
      >
      > #print @records;
      > #print "TESTING!!! \n";
      > #print $a;
      > #print " ";
      > #print $string;
      > #print " ";
      > #print $c;
      > #print $line;
      > #print "\n";
      > #print $record;
      > #print "\n";
      >
      > #print OFH $record;
      >
      > #open(OFH, '>' . $outfile) or die $!;
      >
      > # my $a = substr($a, 1, 2);
      > # my $b = substr($b, 23, 29);
      > # my $c = substr($c, 53, 29);
      >
      > # print the whole $record to the outfile
      > # print OFH $a,$b,$c;
      > #}
      > }
      > # }
      > #}
      > close(OFH);

      I suggest you look at the unpack and printf functions:

      perldoc -f unpack
      perldoc -f printf
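
As a rough illustration of how printf-style formats can produce the layout asked for, the sketch below uses `sprintf` so the result can be inspected. It assumes the column numbers in the question are 1-based (type in 1-2, code in 23-51, description in 53-82); the field values are taken from the sample line.

```perl
use strict;
use warnings;

my ($type, $code, $name) = ('09', 'A/B', 'A & B Construction Co.');

# Columns (1-based): type in 1-2, blanks in 3-22, code in 23-51,
# a blank in 52, description in 53-82. "%-Ns" left-justifies in N columns.
my $out = sprintf("%-2s%20s%-29s%1s%-30s", $type, '', $code, '', $name);

print "$out\n";
print length($out), "\n";   # 82
```

Each format specifier needs its own argument, which is why the two blank gaps are passed as empty strings.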

      I suggest you open the output file at the same time as the input file,
      read each record, extract the fields, and write them to the new file.

      Something like this (untested):

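      #!/opt/perl/bin/perl

      use strict;
      use warnings;
      use diagnostics;

      my $infile = 'old.dat';
      my $outfile = 'new.dat';
      open(INPUT, '<', $infile) or die "Unable to open $infile: $!\n";
      # '>' opens the output file for writing; without it, open() reads.
      open(OUTPUT, '>', $outfile) or die "Unable to open $outfile: $!\n";
      while(defined(my $line=<INPUT>)) {
          chomp $line;
          # The template must match the real input layout; for the sample
          # line above it would be "A2x2A3x4A*" instead.
          my ($type,$code,$name) = unpack("A2x19A29A29",$line);
          # Type in columns 1-2, code in 23-51, description in 53-82.
          printf OUTPUT "%-2s%20s%-29s%1s%-30s\n", $type, '', $code, '', $name;
      }
      close(INPUT);
      close(OUTPUT);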
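
The substr route mentioned earlier in this reply would look roughly like the sketch below. One thing to watch: the commented-out calls in the original post pass the column numbers straight to substr, but substr offsets count from 0, so if the columns are 1-based (an assumption here) each offset is one less than the column. The test line is built by hand for illustration.

```perl
use strict;
use warnings;

# Hypothetical line already in the target layout (columns are 1-based):
# type in 1-2, code in 23-51, description in 53-82.
my $line = sprintf("%-2s%20s%-29s%1s%-30s",
                   '09', '', 'A/B', '', 'A & B Construction Co.');

# substr offsets are 0-based, so column N is offset N - 1.
my $type = substr($line, 0, 2);
my $code = substr($line, 22, 29);
my $name = substr($line, 52, 30);

# Remove the trailing padding.
s/\s+$// for ($type, $code, $name);

print "[$type] [$code] [$name]\n";
```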
