preg_replace: stripping backslashes

  • Margaret MacDonald

    preg_replace: stripping backslashes

    I've been going mad trying to figure out how to do this--it should be
    easy!

    Allow the user to enter '\_sometext\_', i.e., literal backslash,
    underscore, some text, literal backslash, underscore and, after
    submitting via POST to a preg_replace filter, get back

    '_sometext_' (i.e., the same thing with the literal backslashes
    stripped)

    Unless I'm misunderstanding something (I don't know Perl at all), this
    should work:

    preg_replace( '/\\\\\\\\_(.*?)\\\\\\\\_/i', '_$1_', $thepostvar )

    but it doesn't, and I don't know why. The filter apparently leaves
    the string unchanged, since it comes across in the POST array with the
    backslash doubled, and it comes out of the filter with the backslash
    still doubled. It doesn't seem to matter how many backslashes I use
    in the filter--I've tried between 4 and 10--the result is the same.

    Any insights?

    thanks in advance,
    Margaret
    --
    (To mail me, please change .not.invalid to .net, first.
    Apologies for the inconvenience.)
  • Ian.H

    #2
    Re: preg_replace: stripping backslashes

    On Tue, 20 Jul 2004 15:14:55 +0000, Margaret MacDonald wrote:
    > I've been going mad trying to figure out how to do this--it should be
    > easy!
    >
    > Allow the user to enter '\_sometext\_', i.e., literal backslash,
    > underscore, some text, literal backslash, underscore and, after
    > submitting via POST to a preg_replace filter, get back
    >
    > '_sometext_' (i.e., the same thing with the literal backslashes
    > stripped)
    >
    > Unless I'm misunderstanding something (I don't know Perl at all), this
    > should work:
    >
    > preg_replace( '/\\\\\\\\_(.*?)\\\\\\\\_/i', '_$1_', $thepostvar )
    >
    > but it doesn't, and I don't know why. The filter apparently leaves
    > the string unchanged, since it comes across in the POST array with the
    > backslash doubled, and it comes out of the filter with the backslash
    > still doubled. It doesn't seem to matter how many backslashes I use
    > in the filter--I've tried between 4 and 10--the result is the same.
    >
    > Any insights?
    >
    > thanks in advance,
    > Margaret


    str_replace() doesn't work?


    $thepostvar = str_replace("\\", '', $thepostvar);
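
One thing worth checking before relying on the line above: it removes every backslash in the string, not only the ones in front of an underscore. A minimal sketch of the difference, assuming a made-up input value (the `C:\temp` part is purely illustrative and not from the thread):

```php
<?php
// Hypothetical input, standing in for $thepostvar from the thread.
$thepostvar = 'path C:\temp and \_sometext\_';

// In a PHP string literal, '\\' (or "\\") is ONE backslash character.
$allGone  = str_replace('\\', '', $thepostvar);    // strips every backslash
$targeted = str_replace('\\_', '_', $thepostvar);  // strips only those before "_"

echo $allGone, "\n";
echo $targeted, "\n";
```

Note that neither call checks that the escaped underscores come in pairs around some text, which the regular expression in the original post does.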



    Regards,

    Ian

    --
    Ian.H
    digiServ Network
    London, UK


    • Margaret MacDonald

      #3
      Re: preg_replace: stripping backslashes

      Ian.H wrote:
      >On Tue, 20 Jul 2004 15:14:55 +0000, Margaret MacDonald wrote:
      >
      >> I've been going mad trying to figure out how to do this--it should be
      >> easy!
      >>
      >> Allow the user to enter '\_sometext\_', i.e., literal backslash,
      >> underscore, some text, literal backslash, underscore and, after
      >> submitting via POST to a preg_replace filter, get back
      >>
      >> '_sometext_' (i.e., the same thing with the literal backslashes
      >> stripped)
      >>
      >> Unless I'm misunderstanding something (I don't know Perl at all), this
      >> should work:
      >>
      >> preg_replace( '/\\\\\\\\_(.*?)\\\\\\\\_/i', '_$1_', $thepostvar )
      >>
      >> but it doesn't, and I don't know why. The filter apparently leaves
      >> the string unchanged, since it comes across in the POST array with the
      >> backslash doubled, and it comes out of the filter with the backslash
      >> still doubled. It doesn't seem to matter how many backslashes I use
      >> in the filter--I've tried between 4 and 10--the result is the same.
      >>
      >> Any insights?
      >>
      >> thanks in advance,
      >> Margaret
      >
      >
      >str_replace() doesn't work?
      >
      >
      > $thepostvar = str_replace("\\", '', $thepostvar);

      Thanks for the quick response, Ian!

      I stripped out some unessentials :-) This filter is part of a larger
      filtering process, though it fails when run alone too. I could jump
      out of that larger process and call str_replace() to handle this as a
      special case, but there's no particular reason I can think of why I
      should have to kludge it in that way (and I'm sure it would nag me to
      death if I did it :-D).

      Margaret
      --
      (To mail me, please change .not.invalid to .net, first.
      Apologies for the inconvenience.)

      • Pedro Graca

        #4
        Re: preg_replace: stripping backslashes

        Margaret MacDonald wrote:
        > I've been going mad trying to figure out how to do this--it should be
        > easy!
        >
        > Allow the user to enter '\_sometext\_', i.e., literal backslash,
        > underscore, some text, literal backslash, underscore and, after
        > submitting via POST to a preg_replace filter, get back
        >
        > '_sometext_' (i.e., the same thing with the literal backslashes
        > stripped)
        >
        > Unless I'm misunderstanding something (I don't know Perl at all), this
        > should work:
        >
        > preg_replace( '/\\\\\\\\_(.*?)\\\\\\\\_/i', '_$1_', $thepostvar )
        >
        > but it doesn't, and I don't know why. The filter apparently leaves
        > the string unchanged, since it comes across in the POST array with the
        > backslash doubled, and it comes out of the filter with the backslash
        > still doubled. It doesn't seem to matter how many backslashes I use
        > in the filter--I've tried between 4 and 10--the result is the same.
        >
        > Any insights?

        The first thing that happens here is PHP dealing with the backslashes.
        Your four pairs of backslashes are sent to preg_replace() as four
        single backslashes.

        Now it is preg_replace()'s turn to deal with them.
        It converts the 4 backslashes (two pairs) to two literal backslashes and
        can't find them in the value of $thepostvar.



        If you remove one backslash from your regular expression, PHP will
        convert three pairs of backslashes to three single backslashes and
        leave the remaining "\_" as "\_"; so preg_replace() will see the same
        thing as above.


        If you remove two pairs of backslashes from your regular expression, PHP
        will convert the two remaining pairs to two single backslashes;
        preg_replace() will then reduce that pair to one literal backslash,
        which is what the input contains.


        Try this:
        <?php
        $input = 'This is \_sometext\_ embedded.';

        $output1 = preg_replace('@\_(.*)\_@X', '_$1_', $input);
        $output2 = preg_replace('@\\_(.*)\\_@X', '_$1_', $input);
        $output3 = preg_replace('@\\\_(.*)\\\_@X', '_$1_', $input);
        $output4 = preg_replace('@\\\\_(.*)\\\\_@X', '_$1_', $input);
        $output5 = preg_replace('@\\\\\_(.*)\\\\\_@X', '_$1_', $input);
        $output6 = preg_replace('@\\\\\\_(.*)\\\\\\_@X', '_$1_', $input);
        $output7 = preg_replace('@\\\\\\\_(.*)\\\\\\\_@X', '_$1_', $input);
        $output8 = preg_replace('@\\\\\\\\_(.*)\\\\\\\\_@X', '_$1_', $input);

        echo '1: ', $output1, "\n";
        echo '2: ', $output2, "\n";
        echo '3: ', $output3, "\n";
        echo '4: ', $output4, "\n";
        echo '5: ', $output5, "\n";
        echo '6: ', $output6, "\n";
        echo '7: ', $output7, "\n";
        echo '8: ', $output8, "\n";
        ?>

        The output is:

        1: This is \_sometext\_ embedded.
        2: This is \_sometext\_ embedded.
        3: This is _sometext_ embedded.
        4: This is _sometext_ embedded.
        5: This is _sometext_ embedded.
        6: This is _sometext_ embedded.
        7: This is \_sometext\_ embedded.
        8: This is \_sometext\_ embedded.
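
Counting backslashes by hand is error-prone, so it may help to look at the string the regex engine actually receives, and to let preg_quote() do the regex-level escaping. A minimal sketch along those lines, reusing $input from the test above; the variable names $needle and $pattern are mine, and the non-greedy (.*?) is taken from the original question rather than from the test:

```php
<?php
$input = 'This is \_sometext\_ embedded.';

// Print two of the patterns as PCRE sees them, i.e. after PHP has
// already parsed the single-quoted literals.
echo '@\\\_(.*)\\\_@', "\n";            // the 3-backslash variant
echo '@\\\\\\\\_(.*)\\\\\\\\_@', "\n";  // the 8-backslash variant

// Let preg_quote() escape the literal text "\_" for use inside a pattern.
// '\_' is two characters here: a backslash and an underscore.
$needle  = preg_quote('\_', '@');
$pattern = '@' . $needle . '(.*?)' . $needle . '@';

$output = preg_replace($pattern, '_$1_', $input);
echo $output, "\n";
```

With this approach only one level of escaping (PHP's own string syntax) is written by hand; the second level is generated.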




        --
        USENET would be a better place if everybody read: | to email me: use |
        http://www.catb.org/~esr/faqs/smart-questions.html | my name in "To:" |
        http://www.netmeister.org/news/learn2quote2.html | header, textonly |
        http://www.expita.com/nomime.html | no attachments. |

        • steve

          #5
          Re: Re: preg_replace: stripping backslashes

          "Margaret MacDonald" wrote:
          > Ian.H wrote:
          >
          > >On Tue, 20 Jul 2004 15:14:55 +0000, Margaret MacDonald wrote:
          > >
          > >> I've been going mad trying to figure out how to do this--it should be
          > >> easy!
          > >>
          > >> Allow the user to enter '\_sometext\_', i.e., literal backslash,
          > >> underscore, some text, literal backslash, underscore and, after
          > >> submitting via POST to a preg_replace filter, get back
          > >>
          > >> '_sometext_' (i.e., the same thing with the literal backslashes
          > >> stripped)
          > >>
          > >> Unless I'm misunderstanding something (I don't know Perl at all), this
          > >> should work:
          > >>
          > >> preg_replace( '/\\\\\\\\_(.*?)\\\\\\\\_/i', '_$1_', $thepostvar )
          > >>
          > >> but it doesn't, and I don't know why. The filter apparently leaves
          > >> the string unchanged, since it comes across in the POST array with the
          > >> backslash doubled, and it comes out of the filter with the backslash
          > >> still doubled. It doesn't seem to matter how many backslashes I use
          > >> in the filter--I've tried between 4 and 10--the result is the same.
          > >>
          > >> Any insights?
          > >>
          > >> thanks in advance,
          > >> Margaret
          > >
          > >
          > >str_replace() doesn't work?
          > >
          > >
          > > $thepostvar = str_replace("\\", '', $thepostvar);
          >
          > Thanks for the quick response, Ian!
          >
          > I stripped out some unessentials :-) This filter is part of a larger
          > filtering process, though it fails when run alone too. I could jump
          > out of that larger process and call str_replace() to handle this as a
          > special case, but there's no particular reason I can think of why I
          > should have to kludge it in that way (and I'm sure it would nag me to
          > death if I did it :-D).
          >
          > Margaret

          Margaret, you don’t need to "jump out" of your regular process.
          Simply write a new function called str_replace_new and call that
          instead. This function checks for "\" and replaces it on the fly
          before checking.

          --
          http://www.dbForumz.com/ This article was posted by author's request
          Articles individually checked for conformance to usenet standards
          Topic URL: http://www.dbForumz.com/PHP-preg_rep...ict131345.html
          Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbForumz.com/eform.php?p=438214

          • Margaret MacDonald

            #6
            Re: preg_replace: stripping backslashes

            Pedro Graca wrote:
            >Margaret MacDonald wrote:
            >> I've been going mad trying to figure out how to do this--it should be
            >> easy!
            >>
            >> Allow the user to enter '\_sometext\_', i.e., literal backslash,
            >> underscore, some text, literal backslash, underscore and, after
            >> submitting via POST to a preg_replace filter, get back
            >>
            >> '_sometext_' (i.e., the same thing with the literal backslashes
            >> stripped)
            >>
            >> Unless I'm misunderstanding something (I don't know Perl at all), this
            >> should work:
            >>
            >> preg_replace( '/\\\\\\\\_(.*?)\\\\\\\\_/i', '_$1_', $thepostvar )
            >>
            >> but it doesn't, and I don't know why. The filter apparently leaves
            >> the string unchanged, since it comes across in the POST array with the
            >> backslash doubled, and it comes out of the filter with the backslash
            >> still doubled. It doesn't seem to matter how many backslashes I use
            >> in the filter--I've tried between 4 and 10--the result is the same.
            >>
            >> Any insights?
            >
            >The first thing that happens here is PHP dealing with the backslashes.
            >Your four pairs of backslashes are sent to preg_replace() as four
            >single backslashes.
            >
            >Now it is preg_replace()'s turn to deal with them.
            >It converts the 4 backslashes (two pairs) to two literal backslashes and
            >can't find them in the value of $thepostvar.
            >
            >
            >
            >If you remove one backslash from your regular expression, PHP will
            >convert three pairs of backslashes to three single backslashes and
            >leave the remaining "\_" as "\_"; so preg_replace() will see the same
            >thing as above.
            >
            >
            >If you remove two pairs of backslashes from your regular expression, PHP
            >will convert the two remaining pairs to two single backslashes;
            >preg_replace() will then reduce that pair to one literal backslash,
            >which is what the input contains.
            >
            >
            >Try this:
            ><?php
            >$input = 'This is \_sometext\_ embedded.';
            >
            >$output1 = preg_replace('@\_(.*)\_@X', '_$1_', $input);
            >$output2 = preg_replace('@\\_(.*)\\_@X', '_$1_', $input);
            >$output3 = preg_replace('@\\\_(.*)\\\_@X', '_$1_', $input);
            >$output4 = preg_replace('@\\\\_(.*)\\\\_@X', '_$1_', $input);
            >$output5 = preg_replace('@\\\\\_(.*)\\\\\_@X', '_$1_', $input);
            >$output6 = preg_replace('@\\\\\\_(.*)\\\\\\_@X', '_$1_', $input);
            >$output7 = preg_replace('@\\\\\\\_(.*)\\\\\\\_@X', '_$1_', $input);
            >$output8 = preg_replace('@\\\\\\\\_(.*)\\\\\\\\_@X', '_$1_', $input);
            >
            >echo '1: ', $output1, "\n";
            >echo '2: ', $output2, "\n";
            >echo '3: ', $output3, "\n";
            >echo '4: ', $output4, "\n";
            >echo '5: ', $output5, "\n";
            >echo '6: ', $output6, "\n";
            >echo '7: ', $output7, "\n";
            >echo '8: ', $output8, "\n";
            >?>
            >
            >The output is:
            >
            >1: This is \_sometext\_ embedded.
            >2: This is \_sometext\_ embedded.
            >3: This is _sometext_ embedded.
            >4: This is _sometext_ embedded.
            >5: This is _sometext_ embedded.
            >6: This is _sometext_ embedded.
            >7: This is \_sometext\_ embedded.
            >8: This is \_sometext\_ embedded.

            Thanks for your quick response, Pedro!

            Interestingly, it turns out that PHP loses track of the fact that the
            string started out with two single backslash literals. So running
            your test (and very neat it is, too...I can't imagine why I didn't
            think of doing that; I must have been having an 'Einstein moment' :-)
            works fine as long as I set the var locally before running the filter.
            Both \_test\_ and \\_test\\_ evaluate as strings of length 8 when
            set locally...but as strings of length 12 and 14 respectively after
            being passed via the POST array! (which feels like a bug, to me).

            So to properly filter on '\_test\_' after passing it through the POST
            array, I've to use 8(!) backslashes, not 4.

            I still don't know why it wasn't working to begin with, since I did
            try it with 8. [mutter mutter]

            Margaret
            --
            (To mail me, please change .not.invalid to .net, first.
            Apologies for the inconvenience.)

            • Margaret MacDonald

              #7
              Re: preg_replace: stripping backslashes

              er, that should have been '10 and 12 respectively', of course. It's
              amazing what one can type without noticing, when falling-down tired.
              *sigh*

              I wrote:
              >set locally...but as strings of length 12 and 14 respectively after

              --
              (To mail me, please change .not.invalid to .net, first.
              Apologies for the inconvenience.)

              • Pedro Graca

                #8
                Re: preg_replace: stripping backslashes

                Margaret MacDonald wrote:
                > that should have been '10 and 12 respectively', of course.

                magic_quotes on?

                I don't like them!
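
For anyone reading along: the extra two characters are consistent with magic_quotes_gpc, which escaped incoming POST values the way addslashes() does (the setting was removed in PHP 5.4). A minimal sketch that simulates the effect, under the assumption that the POST value was escaped exactly like addslashes() would; $typed, $posted and $clean are illustrative names, not from the thread:

```php
<?php
// What the user typed into the form (8 characters).
$typed = '\_test\_';

// Assumption: magic quotes escaped each backslash like addslashes() does.
$posted = addslashes($typed);
echo strlen($typed), ' -> ', strlen($posted), "\n";

// Undo the added escaping first; after that a four-backslash pattern
// (one literal backslash at the regex level) is enough.
$clean  = stripslashes($posted);
$output = preg_replace('/\\\\_(.*?)\\\\_/', '_$1_', $clean);
echo $output, "\n";
```

Stripping the added slashes once, before any filtering, keeps the later patterns the same whether the value came from POST or was set locally.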

                --
                USENET would be a better place if everybody read: | to email me: use |
                http://www.catb.org/~esr/faqs/smart-questions.html | my name in "To:" |
                http://www.netmeister.org/news/learn2quote2.html | header, textonly |
                http://www.expita.com/nomime.html | no attachments. |
