Group and print data separately

  • techtween
    New Member
    • May 2011
    • 7

    Group and print data separately

    I have an XML file of the form,

    Code:
    <?xml version="1.0" encoding="UTF-8"?>
    <testResults version="1.2">
    <httpSample t="704" lt="704" ts="1306146504248" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-1" dt="text" by="411"/>
    <httpSample t="525" lt="525" ts="1306146505234" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-2" dt="text" by="411"/>
    
    
    <httpSample t="586" lt="586" ts="1306146611316" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-1" dt="text" by="411"/>
    <httpSample t="523" lt="523" ts="1306146612307" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-2" dt="text" by="411"/>
    <httpSample t="507" lt="507" ts="1306146613306" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-3" dt="text" by="411"/>
    
    <httpSample t="535" lt="535" ts="1306146615306" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-5" dt="text" by="411"/>
    <httpSample t="526" lt="526" ts="1306146506234" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-3" dt="text" by="411"/>
    <httpSample t="499" lt="498" ts="1306146507234" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-4" dt="text" by="411"/>
    <httpSample t="505" lt="505" ts="1306146508234" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-5" dt="text" by="411"/>
    <httpSample t="536" lt="536" ts="1306146509249" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-6" dt="text" by="411"/>
    </testResults>
    In Perl, I need output for the above in the following form:

    lt = 704
    lt = 525

    lt max value = 704
    lt min value = 525
    lt average value =

    lt = 586
    lt = 523
    lt = 507

    lt max value = 586
    lt min value = 507
    lt average value =

    lt = 535
    lt = 526
    lt = 498
    lt = 505
    lt = 536

    lt max value = 536
    lt min value = 498
    lt average value =

    Kindly help.
    Last edited by Niheel; May 24 '11, 11:23 PM. Reason: clarification
  • miller
    Recognized Expert Top Contributor
    • Oct 2006
    • 1086

    #2
    Use XML::Twig to process the XML, and List::Util to find the min/max/sum/average.

    Code:
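    use strict;
    use warnings;
    
    use List::Util qw(min max sum);   # these functions must be imported explicitly
    use XML::Twig;
    
    # Slurp the XML that follows __DATA__
    my $xml = do {local $/; <DATA>};
    
    my $twig = XML::Twig->new;
    $twig->parse($xml);
    
    # Collect the lt attribute of every httpSample element that has one
    my @lts = map {$_->att("lt")} $twig->findnodes(q{//httpSample[@lt]});
    
    print join(',', @lts), "\n";
    
    print min(@lts), "\n";
    print max(@lts), "\n";
    print sum(@lts), "\n";
    print sum(@lts) / @lts, "\n";   # average
    # ... pretty easy from here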
    
    __DATA__
    <?xml version="1.0" encoding="UTF-8"?>
    <testResults version="1.2">
    <httpSample t="704" lt="704" ts="1306146504248" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-1" dt="text" by="411"/>
    <httpSample t="525" lt="525" ts="1306146505234" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-2" dt="text" by="411"/>
    <httpSample t="586" lt="586" ts="1306146611316" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-1" dt="text" by="411"/>
    <httpSample t="523" lt="523" ts="1306146612307" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-2" dt="text" by="411"/>
    <httpSample t="507" lt="507" ts="1306146613306" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-3" dt="text" by="411"/>
    <httpSample t="535" lt="535" ts="1306146615306" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-5" dt="text" by="411"/>
    <httpSample t="526" lt="526" ts="1306146506234" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-3" dt="text" by="411"/>
    <httpSample t="499" lt="498" ts="1306146507234" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-4" dt="text" by="411"/>
    <httpSample t="505" lt="505" ts="1306146508234" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-5" dt="text" by="411"/>
    <httpSample t="536" lt="536" ts="1306146509249" s="true" lb="HTTP Request" rc="200" rm="OK" tn="Thread Group 2-6" dt="text" by="411"/>
    </testResults>
    As already posted on perlmonks: http://www.perlmonks.com/?node_id=906521

    • techtween
      New Member
      • May 2011
      • 7

      #3
      Thanks a ton, miller. But what should I do if there is an empty line between groups of httpSample tags, and I need to collect the tags above the empty line into one array and the tags below it into another, then calculate the same values (min, max and average) for each array separately?

      • miller
        Recognized Expert Top Contributor
        • Oct 2006
        • 1086

        #4
        I understand what you want now, mate, but no, there is no easy way to do that. Using blank lines to group elements is not part of standard XML or any such format.

        Obviously anything is possible, but when your data doesn't follow accepted formats, then there is no guaranteed way to process it.

        - M

        • mirod
          New Member
          • May 2011
          • 1

          #5
          It is a very bad idea to rely on non-significant whitespace (i.e. empty lines) when processing XML. This will prevent you from using a good number of XML tools that assume that the structure of the data is tagged explicitly, with tags, not implicitly through formatting. You also run the risk of an XML tool removing those empty lines. The elements for each series of tests should be wrapped in a composite element to mark the structure, just like in HTML a ul element wraps around the li's.

          So if you can, change the structure of the XML to
          Code:
          <testResults version="1.2">
            <test>
              <httpSample>...</httpSample>
              <httpSample>...</httpSample>
            </test>
            <test>...</test>
          </testResults>
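
With the wrapped structure, each group can be handled as one complete element. Here is a minimal sketch, assuming the wrapper is called test as in the example above and that each httpSample carries an lt attribute. The inline sample data and the @report variable are only there for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;
use List::Util qw(max min sum);

# Sample input; the <test> wrapper is the hypothetical grouping element
# suggested above, not something the original file contains.
my $xml = <<'XML';
<testResults version="1.2">
  <test>
    <httpSample t="704" lt="704"/>
    <httpSample t="525" lt="525"/>
  </test>
  <test>
    <httpSample t="586" lt="586"/>
    <httpSample t="523" lt="523"/>
    <httpSample t="507" lt="507"/>
  </test>
</testResults>
XML

my @report;   # one summary hash per <test> group (illustrative name)

XML::Twig->new(
    twig_handlers => {
        # Called once for each completely parsed <test> element;
        # the handler receives the twig and the element.
        test => sub {
            my ($twig, $test) = @_;
            my @lt = map { $_->att('lt') } $test->children('httpSample');
            if (@lt) {
                push @report, {
                    lt  => \@lt,
                    max => max(@lt),
                    min => min(@lt),
                    avg => sum(@lt) / @lt,
                };
            }
            $twig->purge;   # free the part of the tree already handled
            return 1;
        },
    },
)->parse($xml);

# Print each group's values followed by its statistics
for my $group (@report) {
    print "lt = $_\n" for @{ $group->{lt} };
    printf "\nlt max value = %d\nlt min value = %d\nlt average value = %.2f\n\n",
        @{$group}{qw(max min avg)};
}
```

The grouping then comes from the markup itself, so it survives any tool that reformats or strips whitespace.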
          If you can't, then the following code, using XML::Twig, will do: the lt values are stored in an array, and if the text before an httpSample element includes two line feeds (i.e. an empty line), the array collected so far is processed and then emptied.

          If you are not familiar with XML::Twig, what you have to know for this is that the twig_handlers option declares handlers that are called for each httpSample element. Each handler is called with $_ set to an object representing the element.

          Code:
          #!/usr/bin/perl
          
          use strict;
          use warnings;
          
          use XML::Twig;
          use List::Util qw( max min sum);
          
          my $lt=[]; # an array with the list of lt values
          
          XML::Twig->new( keep_spaces => 1, # so empty lines are not discarded
                          twig_handlers => { httpSample => sub { process_test( $lt) if @$lt && $_->prev_sibling_text=~ m{\n.*\n};
                                                                 push @$lt, $_->att( 'lt'); },
                                           },
                        )
                   ->parsefile( "test_data.xml");
          
          process_test( $lt) if @$lt; # to process the last batch 
          
          
          # prints the statistics and the values of the current batch, then empties it
          sub process_test
            {  print "lt max value: ", max( @$lt), "\nlt min value: ", min( @$lt), "\nlt average value: ", sum( @$lt)/@$lt,
                     "\n\n", join( "\n", map { "lt: $_" } @$lt), "\n\n";
               $lt=[];
            }
          I hope that helps

          • techtween
            New Member
            • May 2011
            • 7

            #6
            This was exactly the result I was hoping for. Mirod, loads of thanks to you, not only for the solution but also for the concept (XML::Twig, which I have only just learnt because of you). Thanks again to miller as well for your timely assistance :)
