Regex Question

  • JEB

    Regex Question

    I am trying to use Perl to rescue some legacy word processor files.
    The files are ASCII, except that some control codes use
    bytes in the $80-$ff range. I slurp the file into a string for editing.

    The regex can handle bytes below \x80, but fails to recognize bytes that are \x80
    or above.

    e.g.,

    s/\x03//; works
    s/\x81//; doesn't
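
    For reference, a minimal sketch of the kind of script in question. It
    assumes the input handle is opened with the ':raw' layer so that no
    I/O layer decodes the bytes before the regex sees them; that layer is
    an assumption, not something tried above. The sample data and the
    variable names are hypothetical stand-ins for the real files.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data: ASCII text with one low control byte (\x03)
# and one high control byte (\x81), standing in for a legacy file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
binmode($tmp_fh);                         # write raw bytes, no encoding layer
print {$tmp_fh} "Hello\x03 wor\x81ld\n";
close($tmp_fh) or die "close failed: $!";

# Slurp the file as raw bytes. The ':raw' layer is the assumption here:
# it keeps any default UTF-8 layer from decoding the input.
open(my $in, '<:raw', $path) or die "Cannot open $path: $!";
my $text = do { local $/; <$in> };        # undef $/ reads the whole file
close($in);

# Remove the control codes, low and high alike.
my $clean = $text;
$clean =~ s/\x03//g;
$clean =~ s/\x81//g;

print $clean;
```

    If the high-byte substitution works here but not on the real files,
    the difference is probably in how the file is being read rather than
    in the regex itself.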

    Since I thought the problem might be related to the adoption of Unicode, I've
    tried various things like:

    no encoding;
    use bytes;
    and various forms of encoding;
    etc.

    Nothing helps.

    I'm using Perl 5.8+ (whatever the latest revision is) with Red Hat Linux
    8.0.

    Is this something a Perl regex just can't handle?

    JEB