Why is preg_replace replacing only the last occurrence of the target in a string?

  • danellison
    New Member
    • Apr 2007
    • 3

    I am developing an authoring system that includes footnote functionality. The system stores data as htmlspecialchars-encoded HTML. Footnotes are numbered in the source without regard to where they appear in the text, so footnote 5 could precede footnote 1 in the HTML. When it's time to process the document for presentation, I re-assign the footnote numbers so that they appear sequentially in the text. I am trying to do this using preg_replace.

    Each footnote is embedded in a <span> that has an ID attribute with a unique identifier that maps the corresponding footnote text to the given marker location. The following code will reproduce the problem:

    Code:
    error_reporting(E_ALL);
    $regexp = '/^(.*)<span (.*)id="fn([^"]*)"(.*)>\[(.*)\](.*)$/';
    $target = <<<__EOS
    <b>MANAGEMENT'S DISCUSSION AND ANALYSIS</b> <p>This text is shorter.  <sup><span style="color: blue; font-size: 10px;" id="fn6" class="footnote">[6]</span></sup>.&nbsp; There has to be at least two sentences presented <sup><span style="color: blue; font-size: 10px;" id="fn4" class="footnote">[4]</span></sup>) discussion and analysis is designed to identify the significant <sup><span style="color: blue; font-size: 10px;" id="fn5" class="footnote">[5]</span></sup> in the fiscal year ending April 30, 2004. </p>
    __EOS;
    $fnotenum = 0;
    if(preg_match($regexp, $target, $matches)){
            $fnotenum = 0;
            $replacements = "$1<span class='LOOKHERE' $2 $4>[".++$fnotenum."]$6";
            $target = preg_replace($regexp, $replacements, $target);
    }
    echo $target;
    exit();
    The output I am expecting from this code and sample input should look like this:

    <b>MANAGEMENT'S DISCUSSION AND ANALYSIS</b> <p>This text is shorter. <sup><span style="color: blue; font-size: 10px;" class="footnote">[1]</span></sup>.&nbsp; There has to be at least two sentences presented <sup><span style="color: blue; font-size: 10px;" class="footnote">[2]</span></sup>) discussion and analysis is designed to identify the significant <sup><span style="color: blue; font-size: 10px;" class="footnote">[3]</span></sup> in the fiscal year ending April 30, 2004. </p>

    Each instance of the id attribute has been removed so that subsequent calls to preg_match will not find the nodes that have already been processed, and the arbitrary footnote numbers assigned in the authoring system have been replaced by sequential values.

    What I get is this:

    <b>MANAGEMENT'S DISCUSSION AND ANALYSIS</b> <p>This text is shorter. <sup><span style="color: blue; font-size: 10px;" id="fn6" class="footnote">[6]</span></sup>.&nbsp; There has to be at least two sentences presented <sup><span style="color: blue; font-size: 10px;" id="fn4" class="footnote">[4]</span></sup>) discussion and analysis is designed to identify the significant <sup><span class='LOOKHERE' style="color: blue; font-size: 10px;" class="footnote">[1]</span></sup> in the fiscal year ending April 30, 2004. </p>

    Notice that only the very last instance of the footnote span has been modified, even though each individual span does match the regular expression. I don't understand this behavior. Am I misunderstanding the preg_replace function? Is my logic flawed? Has anybody experienced and overcome this issue? Any suggestions?

    Your time and thoughts are greatly appreciated,
    Dan Ellison
  • JKing
    Recognized Expert Top Contributor
    • Jun 2007
    • 1206

    #2
    Hey there,

    I'm not too great with regular expressions myself, but I can shed a little light on things for you.

    I added this line to your code.
    Code:
    echo preg_match_all($regexp, $target, $matches);
    preg_match_all returns only 1 match, so preg_replace has only one place to make a replacement. You may need to rework your regular expression.
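
A likely reason for the single match: the pattern is anchored with ^ and $ and every (.*) is greedy, so the whole string is consumed as one match and the first (.*) swallows everything up to the last <span. The replacement string is also built once before the call, so ++$fnotenum would only ever produce 1. The sketch below is one possible rework, not necessarily how the original system should do it: it drops the anchors, matches one span tag at a time, and uses preg_replace_callback so the counter advances per match. The function name renumber_footnotes and the shortened sample string are made up for illustration, and the input is assumed to be already-decoded HTML.

```php
<?php
// Hypothetical helper (name not from the original post): renumbers
// footnote markers in document order and strips the id="fn..." attribute.
function renumber_footnotes(string $html): string
{
    $counter = 0;

    // No ^/$ anchors, and [^>]* cannot run past the end of a tag,
    // so each footnote span is a separate match.
    //   group 1: attributes before the id, group 2: attributes after it
    $pattern = '/<span\b([^>]*?)\s+id="fn[^"]*"([^>]*)>\[[^\]]*\]/';

    return preg_replace_callback(
        $pattern,
        function (array $m) use (&$counter): string {
            $counter++; // evaluated once per match, unlike a prebuilt string
            return '<span' . $m[1] . $m[2] . '>[' . $counter . ']';
        },
        $html
    );
}

// Shortened sample (assumption: same markup shape as in the question).
$sample = 'A <sup><span style="color: blue;" id="fn6" class="footnote">[6]</span></sup>'
        . ' B <sup><span style="color: blue;" id="fn4" class="footnote">[4]</span></sup>'
        . ' C <sup><span style="color: blue;" id="fn5" class="footnote">[5]</span></sup>';

// The original anchored, greedy pattern sees the whole string as one match.
$originalPattern = '/^(.*)<span (.*)id="fn([^"]*)"(.*)>\[(.*)\](.*)$/';
echo preg_match_all($originalPattern, $sample), "\n";

// The reworked pattern visits every footnote span in order.
echo renumber_footnotes($sample), "\n";
```

A regular expression is still a fragile way to edit HTML; if the markup varies much beyond the sample, a DOM parser may be the safer route.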
