Splitting string into variables

  • Alan Bak
    New Member
    • Dec 2006
    • 12

    I have a text file that I am reading in with Perl, line by line, and breaking each line into variables for processing. The lines are split on whitespace using split.

    Here are a couple of lines from the text file:

    bl-render -render /renders/HOE/HD_RAW/REEL_5/reel_5.%.6F.dpx -profile HD_bake -format "Super 16mm" -mask "Full" -range 0-30408 max:hoe:reel_5
    bl-render -render /renders/HOE/HD_RAW/REEL_6/reel_6.%.6F.dpx -profile HD_bake -format "Super 16mm" -mask "Full" -range 0-13472 max:hoe:reel_6

    Question: is there a way to break each line into variables while keeping each double-quoted string as a single variable? Simply splitting on whitespace breaks "Super 16mm" into two variables, "Super and 16mm", whereas I would like it to be one variable, "Super 16mm".

    Any ideas?


    Thanks
  • KevinADC
    Recognized Expert Specialist
    • Jan 2007
    • 4092

    #2
    What you have is the same as a comma-delimited file, but with spaces instead of commas. So we can take the code from the Perl Cookbook for parsing comma-separated data and substitute a space wherever the commas would be. (Not well tested):

    Code:
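    use strict;
    use warnings;

    my @data;
    while (my $line = <DATA>) {
        chomp $line;    # otherwise the newline ends up inside the last field
        push @data, parse_data($line);
    }
    print "$_\n" for @data;

    # Split a line on single spaces, keeping "double-quoted strings" together.
    # Adapted from the Perl Cookbook's comma-separated parser.
    sub parse_data {
        my $text = shift;
        my @new  = ();
        # $+ holds the last capture group that matched in the alternation
        push(@new, $+) while $text =~ m{
            "([^\"\\]*(?:\\.[^\"\\]*)*)"\s?   # quoted field, backslash escapes allowed
          | ([^ ]+)\s?                        # unquoted field
          | \s                                # lone separator: empty field
        }gx;
        push(@new, undef) if substr($text, -1, 1) eq ' ';
        return @new;
    }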
    __DATA__
    bl-render -render /renders/HOE/HD_RAW/REEL_5/reel_5.%.6F.dpx -profile HD_bake -format "Super 16mm" -mask "Full" -range 0-30408 max:hoe:reel_5
    bl-render -render /renders/HOE/HD_RAW/REEL_6/reel_6.%.6F.dpx -profile HD_bake -format "Super 16mm" -mask "Full" -range 0-13472 max:hoe:reel_6

    It does assume that a single space delimits the data fields; if there can be more than one space, the code would need to be changed a little. I have no idea how it will behave if a field is blank.
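
    If runs of several spaces are possible, one alternative (not the Cookbook recipe above) is the core module Text::ParseWords. This is only a sketch under assumptions: quotewords() splits on a delimiter pattern, here '\s+', and keeps quoted strings together, and the 0 asks it to drop the quotes. The helper name split_fields is made up for illustration.

    ```perl
    use strict;
    use warnings;
    use Text::ParseWords qw(quotewords);

    # Hypothetical helper: split a line on runs of whitespace, keeping
    # double-quoted strings together as one field (the quotes are removed).
    sub split_fields {
        my ($line) = @_;
        # Trim both ends first; leading or trailing whitespace would
        # otherwise produce an extra empty or undefined field.
        $line =~ s/^\s+|\s+$//g;
        return quotewords('\s+', 0, $line);
    }

    my @fields = split_fields('-format "Super 16mm"   -mask "Full" -range 0-30408');
    print "[$_]\n" for @fields;
    ```

    An empty quoted field ("") should come back as an empty string with this approach. Note that quotewords() also treats single quotes and backslashes specially, which may or may not suit your data.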

    Apply the parse_data() subroutine however is appropriate for your purposes.
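
    As one way of applying it, here is a rough sketch that turns a line into named values. It assumes the layout seen in the sample lines: a command first, then "-option value" pairs, then a job name last. The names tokenize, parse_render_line and the option hash are invented for this example, and the small regex tokenizer stands in for parse_data() so the snippet runs on its own.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: split on whitespace, keeping "quoted strings" whole.
    sub tokenize {
        my ($line) = @_;
        my @tokens;
        while ($line =~ /\G\s*(?:"([^"]*)"|(\S+))/g) {
            push @tokens, defined($1) ? $1 : $2;
        }
        return @tokens;
    }

    # Assumed layout: command, then "-option value" pairs, then a job name.
    sub parse_render_line {
        my ($line) = @_;
        my @tokens  = tokenize($line);
        my $command = shift @tokens;
        my $job     = pop @tokens;
        my %opt;
        while (@tokens >= 2) {
            my ($name, $value) = splice(@tokens, 0, 2);
            $name =~ s/^-//;    # "-format" becomes the key "format"
            $opt{$name} = $value;
        }
        return ($command, $job, \%opt);
    }

    my ($command, $job, $opt) = parse_render_line(
        'bl-render -render /renders/HOE/HD_RAW/REEL_5/reel_5.%.6F.dpx -profile HD_bake -format "Super 16mm" -mask "Full" -range 0-30408 max:hoe:reel_5'
    );
    print "$command / $job / format=$opt->{format} / range=$opt->{range}\n";
    ```

    A hash keeps the option names next to their values, so the rest of the script does not depend on the order of the options in the line.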

    • Alan Bak
      New Member
      • Dec 2006
      • 12

      #3
      Thanks Kevin!

      I will give this a try. I have access to the Perl Cookbook so I will look it up as well.

      Cheers

      Alan




