Beautification Script - Regular Expressions

  • Dr Fuzzy
    New Member
    • Mar 2008
    • 6

    Beautification Script - Regular Expressions

    Hi all,

    I am working on a VHDL code beautifier in Perl. I've reached this part of the beautification process and got really stuck. Take, for example, the following piece of VHDL code:

    Code:
    entity JK_FF is
      port( clock : in std_logic;
             J, K : in std_logic;
            reset : in std_logic;
            Q, Qbar : out std_logic);
    end JK_FF;
    Well, I'm trying to figure out the regular expressions to transform it into this:

    Code:
    entity JK_FF is
      port( clock : in  std_logic;
            J     : in  std_logic;
            K     : in  std_logic;
            reset : in  std_logic;
            Q     : out std_logic;
            Qbar  : out std_logic);
    end JK_FF;
    In short:
    i. Align all words between 'port(' and ');' in columns.
    ii. Split

    <signal_name_1>, <signal_name_2>, ..., <signal_name_n> : <direction> <type>;

    into

    <signal_name_1> : <direction> <type>;
    <signal_name_2> : <direction> <type>;
    ...
    <signal_name_n> : <direction> <type>;

    Any help or suggestions are more than welcome.

    Thanks in advance!
  • nithinpes
    Recognized Expert Contributor
    • Dec 2007
    • 410

    #2
    This can be done using a split() function to split on commas for each line in the file, then splitting the last element of the resulting array on colon to get the last two fields.
    It would be good to know what you have tried so far!
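A minimal sketch of that suggestion, assuming each declaration sits on a single line and contains no commas after the colon. The sub name split_declaration is made up for illustration. It splits on commas first, then splits the last piece on the colon to recover the direction and type:

```perl
use strict;
use warnings;

# Hypothetical helper: turn "J, K : in std_logic;" into one declaration per signal.
sub split_declaration {
    my ($line) = @_;
    return $line unless $line =~ /:/;                    # not a declaration: leave untouched
    my @parts = split /,/, $line;                        # "J", " K : in std_logic;"
    my ($last_name, $rest) = split /:/, pop(@parts), 2;  # " K ", " in std_logic;"
    return $line unless defined $rest;                   # comma after the colon: leave untouched
    my @names = (@parts, $last_name);
    s/^\s+|\s+$//g for @names, $rest;                    # trim surrounding whitespace
    return map { "$_ : $rest" } @names;
}

print "$_\n" for split_declaration("J, K : in std_logic;");
```

A line without commas comes back as a single declaration, so the same sub can be applied to every line of the port list.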


    • Dr Fuzzy
      New Member
      • Mar 2008
      • 6

      #3
      Yes you're right! Sorry I forgot to mention, but I've tried this lot already:

      Code:
      use strict;
      #use warnings;
      
      my @data;
      #push @data, [split (/\s+/, $_)] for <DATA>;
      push @data, [split (' ', $_)] for <DATA>;
      
      foreach my $row(0..8) {
          foreach my $col(0..(@data-1)) {
              printf("%-15s", $data[$row][$col]);
          }
          print "\n";
      }
      
      __DATA__
      clk         : in std_logic;
      areset        : in std_logic;
      busy : out std_logic;
      writeEnable : in std_logic;
      readEnable : in std_logic;
      write    : in std_logic_vector(wordSize-1 downto 0);
      read    : out std_logic_vector(wordSize-1 downto 0);
      addr : in std_logic_vector(maxAddrBit downto minAddrBit));
      Even though <wordSize-1 downto> has a space separator, for some reason I get this:

      write : in std_logic_vector(wordSize-1downto 0);

      Any ideas?


      • nithinpes
        Recognized Expert Contributor
        • Dec 2007
        • 410

        #4
        I feel the problem lies in this line:
        Code:
        foreach my $col(0..(@data-1))
        In your script, @data holds the rows, so (@data-1) is the index of the last row, not of the last column. The column range should be taken from the current row instead, i.e. 0..(@{$data[$row]}-1).
        The missing space itself comes from printf: "std_logic_vector(wordSize-1" is longer than the 15 characters reserved by %-15s, so no padding is added and "downto" follows it directly.
        Also, you are splitting on every space, although you need exactly 4 fields to be aligned. For this purpose, you can use the third argument of split(). This limit is the maximum number of fields to return; whatever follows the third delimiter is kept intact as the last element of the array.
        Use:
        Code:
        push @data, [split (/\s+/, $_,4)] for <DATA>;
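To make the effect of the limit concrete, here is a small sketch using one of the lines from the sample data. It uses split ' ' rather than /\s+/ because the ' ' form also skips leading whitespace, whereas /\s+/ yields an empty first field on an indented line:

```perl
use strict;
use warnings;

my $line = "write : in std_logic_vector(wordSize-1 downto 0);";

my @unlimited = split ' ', $line;      # splits on every run of whitespace
my @limited   = split ' ', $line, 4;   # at most 4 fields; the remainder stays in the last one

print scalar(@unlimited), " fields: @unlimited\n";
print scalar(@limited),   " fields, last one is '$limited[3]'\n";
```

With the limit, the type keeps its internal spaces and is padded as one column.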


        • Dr Fuzzy
          New Member
          • Mar 2008
          • 6

          #5
          Bingo! That worked exactly the way I want it! Thanks a lot!

          Now the tricky part (for me, at least) is how to split

          <signal_name_1>, <signal_name_2>, ..., <signal_name_n> : <direction> <type>;

          into

          <signal_name_1> : <direction> <type>;
          <signal_name_2> : <direction> <type>;
          ...
          <signal_name_n> : <direction> <type>;

          This special case may or may not be present, so some sort of detection is required. I can roughly think of a way using split() (switching rows to columns, etc.), re-appending the trailing ';', and multiple ifs, but it looks quite dodgy. I'd prefer a neater way of doing it. Could you suggest anything, especially for the detection part?


          • KevinADC
            Recognized Expert Specialist
            • Jan 2007
            • 4092

            #6
            And it will be fairly tricky. You could use a hash of arrays.

            <direction> will be (it appears) one of two values, "in" or "out". <type> looks like it could be just about anything, but I assume it's everything after the <direction> indicator. You would use those two pieces of information as hash keys. Then you would push the <signal_name> indicator onto the appropriate array. One possible drawback is the loss of the original order of the data, but if that order is not important then it is not a problem.
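A rough sketch of the hash-of-arrays idea, under the assumption that every declaration ends in a plain ';' (the closing ');' of the port list is left out of the sample data). The variable name %signals_for and the choice to join direction and type into one key are illustrative only:

```perl
use strict;
use warnings;

my @lines = (
    "clock : in std_logic;",
    "J, K : in std_logic;",
    "reset : in std_logic;",
    "Q, Qbar : out std_logic;",
);

# Key: "<direction> <type>", value: array of the signal names declared with that key.
my %signals_for;
for my $line (@lines) {
    my ($names, $direction, $type) = $line =~ /^\s*(.*?)\s*:\s*(\w+)\s+(.*?)\s*;\s*$/
        or next;    # skip lines that do not look like a declaration
    push @{ $signals_for{"$direction $type"} }, split /\s*,\s*/, $names;
}

for my $key (sort keys %signals_for) {
    print "$_ : $key;\n" for @{ $signals_for{$key} };
}
```

Signals keep their order within one key, but declarations with different keys are regrouped, which is the loss of order mentioned above.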


            • Dr Fuzzy
              New Member
              • Mar 2008
              • 6

              #7
              Could you spare me an example? A little something to start fiddling with! I hope I'm not asking too much.


              • nithinpes
                Recognized Expert Contributor
                • Dec 2007
                • 410

                #8
                Using a hash of arrays is a good approach. But from your initial description, I assume you need to retain the order. This can be done with an array of arrays, though it is a bit lengthy.
                The following code would do the job:
                [CODE=perl]
                use strict;

                my @data;
                for (<DATA>) {
                    ###checking for commas. Otherwise even these signals can be split
                    ##into separate elements if there is space before/after comma
                    unless (/,/) {
                        push @data, [split(/\s+/, $_, 4)];
                    } else {
                        $_ =~ s/\s*,\s*/,/g;  ##/g so that lists of three or more signals work
                        push @data, [split(/\s+/, $_, 4)];
                    }
                }

                foreach my $row (0..8) {
                    my @signals; my @other;
                    my $multi;
                    foreach my $col (0..(@data-1)) {
                        if ($data[$row][$col] =~ /,/) {
                            $multi = 1;
                            @signals = split(/,/, $data[$row][$col]); ##separate out signals
                            until ($col == (@data-1)) {
                                $col++;
                                ##take out corresponding type and direction
                                push @other, $data[$row][$col];
                            }
                            foreach (@signals) {
                                print "\n";
                                printf("%-15s", $_);
                                printf("%-15s", $_) foreach (@other);
                            }
                            last;
                        }
                        else {
                            printf("%-15s", $data[$row][$col]);
                        }
                    }
                    print "\n";
                }

                __DATA__
                clk : in std_logic;
                areset,reset : in std_logic;
                busy : out std_logic;
                writeEnable : in std_logic;
                readEnable, modifyEnable : in std_logic;
                write, copy : in std_logic_vector(wordSize-1 downto 0);
                read ,clock : out std_logic_vector(wordSize-1 downto 0);
                addr : in std_logic_vector(maxAddrBit downto minAddrBit));
                [/CODE]
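The fixed %-15s width pads every column to 15 characters, while the layout asked for in the first post pads each column only to its longest entry. A small sketch of that variant, assuming the declarations have already been split into one record per signal; the record layout and variable names here are illustrative, not taken from the code above:

```perl
use strict;
use warnings;
use List::Util qw(max);

# Each record: [ signal name, direction, type ], one signal per record.
my @records = (
    [ 'clock', 'in',  'std_logic' ],
    [ 'J',     'in',  'std_logic' ],
    [ 'Qbar',  'out', 'std_logic' ],
);

# Pad each column to its longest entry rather than to a fixed width.
my $name_width = max map { length $_->[0] } @records;
my $dir_width  = max map { length $_->[1] } @records;

# "%-*s" takes its field width from the argument list.
my @aligned = map {
    sprintf "%-*s : %-*s %s;", $name_width, $_->[0], $dir_width, $_->[1], $_->[2]
} @records;

print "$_\n" for @aligned;
```

The widths are computed in a first pass over all records, so the whole port list has to be read before any line is formatted.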


                • Dr Fuzzy
                  New Member
                  • Mar 2008
                  • 6

                  #9
                  Thanks a lot, yes, that's the idea more or less. I probably should have made this clear from the start, but I don't really want to print the formatted text; I want to collect it in a buffer so that I can replace the original part with the formatted one. Any ideas how to achieve that with your existing code?


                  • nithinpes
                    Recognized Expert Contributor
                    • Dec 2007
                    • 410

                    #10
                    If you want to modify the file containing the data according to the format, all you need to do is read from that file, write into a temporary file and then rename the temporary file to the data file. You can use this example:
                    [CODE=perl]
                    use strict;

                    my @data;
                    open(DATA, "data.txt") or die "read failed:$!";
                    open(TEMP, ">temp.txt") or die "write failed:$!"; ##open temporary file
                    for (<DATA>) {
                        chomp;     ## drop the newline so it does not end up in the last field
                        s/^\s*//;  ## trim out spaces from beginning of line
                        ###remove spaces around commas. Otherwise the signals of one list can be
                        ##split into separate elements if there is space before/after a comma
                        s/\s*,\s*/,/g;
                        push @data, [split(/\s+/, $_, 4)];
                    }

                    foreach my $row (0..(@data-1)) { ### upto last row
                        my @signals; my @other;
                        foreach my $col (0..(@{$data[$row]}-1)) { ### upto last element in the row
                            if ($data[$row][$col] =~ /,/) {
                                @signals = split(/,/, $data[$row][$col]); ##separate out signals
                                until ($col == (@{$data[$row]}-1)) { ### upto last element in the row
                                    $col++;
                                    ##take out corresponding type and direction
                                    push @other, $data[$row][$col];
                                }
                                foreach my $i (0..(@signals-1)) {
                                    print TEMP "\n" if $i; ##new line for every signal after the first
                                    printf TEMP ("%-15s", $_) foreach ($signals[$i], @other);
                                }
                                last;
                            }
                            else {
                                printf TEMP ("%-15s", $data[$row][$col]);
                            }
                        }
                        print TEMP "\n";
                    }
                    close(DATA);
                    close(TEMP);
                    ##change temp.txt to data.txt
                    rename("temp.txt", "data.txt") or die "rename failed:$!";
                    [/CODE]

                    Also, note the change in the ranges used for $row and $col. These are the ranges you should ideally use to go through all rows and all columns in each row.
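If the goal is a buffer in memory rather than a rewritten file, one possible variant is to build the text with sprintf and substitute it into the original string. This is only a sketch: the sub name format_rows and the PORTS placeholder are made up for illustration, and the rows are assumed to be already split into [name, ":", direction, type]:

```perl
use strict;
use warnings;

# Hypothetical helper: format rows of [name, ":", direction, type] into one string.
sub format_rows {
    my (@rows) = @_;
    my $buffer = '';
    for my $row (@rows) {
        my $line = join '', map { sprintf "%-15s", $_ } @$row;  # same layout as printf("%-15s", ...)
        $line =~ s/\s+$//;                                      # drop padding after the last field
        $buffer .= "$line\n";
    }
    return $buffer;
}

my $original  = "entity JK_FF is\n  PORTS\nend JK_FF;\n";   # PORTS is a made-up placeholder
my $formatted = format_rows([ 'J', ':', 'in', 'std_logic;' ],
                            [ 'K', ':', 'in', 'std_logic;' ]);

# Replace the placeholder with the formatted block, keeping the rest of the text.
(my $result = $original) =~ s/PORTS\n/$formatted/;
print $result;
```

In a real beautifier the placeholder would be the matched port list itself, and the indentation of the continuation lines would still have to be added.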


                    • KevinADC
                      Recognized Expert Specialist
                      • Jan 2007
                      • 4092

                      #11
                      "rharsh" on Tek-Tips has already written you a 99% working solution. The code by "nithinpes" is largely a duplication of that code. You seem a bit disengenuous to me by not informing either forum you are also getting help from another forum.


                      • Dr Fuzzy
                        New Member
                        • Mar 2008
                        • 6

                        #12
                        You probably mean 'disingenuous'... Well, it never crossed my mind that querying multiple sources is a form of disingenuousness! By that logic, I should read one and only one Perl book, not two or more, because that would make me disingenuous to the author of the first book! Nevertheless, I truly apologize if that insulted you in any way!
