Single Flat File - Join Records?

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • PaulH13
    New Member
    • Jul 2007
    • 3

    Single Flat File - Join Records?

    Hi,

    I have a pipe delimited flat file database of students that I mail out certificates to.

    I wrote a label preparation script which basically passes through the file and prepares a mailing label from each record with a reference for which certificate to enclose (also from the database). (Happy to share this if anyone wants it).

    Thing is that I get a label for each certificate when I could put up to 4 certificates in a single mailing - at the moment I'm just not using three out of four labels.

    I am therefore looking to add a subroutine that loads the database and appends up to four certificate references into each student's field (and remove the records that it appends), just before the labels are printed. There can be up to 3,000 records in the database on heavy days but this is unusual, there are normally more like 500.

    The database is like this:

    ID|Name|Address |CertRef|Date
    345|Paul|13 Somwhere|CG6129 |100707
    387|Steve|15 Somwhere|CG6129 |100707
    345|Paul|13 Somwhere|CG6089 |100707

    After the routine I need a database (can be a new file) to look like this:

    ID|Name|Address |CertRef|Date
    345|Paul|13 Somwhere|CG6129 CG6089|100707
    387|Steve|15 Somwhere|CG6129 |100707

    I am assuming that I need to slurp the contents of the database into an array but I would appreciate help on the code to look for matches, append and delete.

    Please ask if you need any more info. Many thanks.

    Paul
  • KevinADC
    Recognized Expert Specialist
    • Jan 2007
    • 4092

    #2
    You will want to use a hash, where the hash keys are the ID and the rest of the data can be stored in a hash, with the CertRef being an array of the values. See if this helps get you started on how to work with complex dta such as a hash of hashes where the values of the hash of hashes is mixed data types:

    perldoc Complex Data Tutorial

    Comment

    • miller
      Recognized Expert Top Contributor
      • Oct 2006
      • 1086

      #3
      Observe the following script:

      [CODE=perl]
      # Constants
      use Readonly;
      Readonly my $ID => 0;
      Readonly my $NAME => 1;
      Readonly my $ADDR => 2;
      Readonly my $CERT => 3;
      Readonly my $DATE => 4;

      Readonly my $MERGE_MAX => 4;

      use strict;

      # Load Data
      my @records = ();
      while (<DATA>) {
      chomp;
      push @records, [split '\|'];
      }
      my $header = shift @records;

      # Sort Data
      @records = sort {$a->[$ID] <=> $b->[$ID] || $a->[$CERT] cmp $b->[$CERT]} @records;

      # Merge Data
      my @merged = ();
      foreach my $rec (@records) {
      if (@merged && $merged[-1][$ID] eq $rec->[$ID] && @{$merged[-1][$CERT]} < $MERGE_MAX) {
      push @{$merged[-1][$CERT]}, $rec->[$CERT];

      } else {
      $rec->[$CERT] = [ $rec->[$CERT] ]; # Turn cert field into an array
      push @merged, $rec;
      }
      }

      # Output Data
      foreach my $rec (@merged) {
      # Translate Cert field into a string.
      $rec->[$CERT] = join(' ', @{$rec->[$CERT]});

      print join('|', @$rec) . "\n";
      }

      __DATA__
      ID|Name|Address |CertRef|Date
      345|Paul|13 Somwhere|CG6129 |100707
      387|Steve|15 Somwhere|CG6129 |100707
      1|Foo|1 Foo St.|CG6089|1007 07
      345|Paul|13 Somwhere|CG6089 |100707
      387|Steve|15 Somwhere|CG6089 |100707
      2|Bar|1 Bar St.|CG6089|1007 07
      345|Paul|13 Somwhere|CG6001 |100707
      3|Baz|1 Baz St.|CG6089|1007 07
      345|Paul|13 Somwhere|CG6002 |100707
      4|Bit|1 Bit St.|CG6089|1007 07
      345|Paul|13 Somwhere|CG6003 |100707
      387|Steve|15 Somwhere|CG6003 |100707
      5|Bot|1 Baz St.|CG6089|1007 07
      345|Paul|13 Somwhere|CG6004 |100707
      [/CODE]

      Outputs

      Code:
      >perl scratch.pl
      1|Foo|1 Foo St.|CG6089|100707
      2|Bar|1 Bar St.|CG6089|100707
      3|Baz|1 Baz St.|CG6089|100707
      4|Bit|1 Bit St.|CG6089|100707
      5|Bot|1 Baz St.|CG6089|100707
      345|Paul|13 Somwhere|CG6001 CG6002 CG6003 CG6004|100707
      345|Paul|13 Somwhere|CG6089 CG6129|100707
      387|Steve|15 Somwhere|CG6003 CG6089 CG6129|100707
      There are other ways to go about this. And I definitely suggest that you read up Kevin's tutorial link as it provides a lot of tools useful for processing data.

      - Miller
      Attached Files

      Comment

      • PaulH13
        New Member
        • Jul 2007
        • 3

        #4
        Wow thanks, I'll give this a go and will report back.

        Comment

        Working...