size of binary string on multi-byte system?

  • Mark

    size of binary string on multi-byte system?


    hello!

    normally, if you are given binary data in a string (such as from a call to
    fread), you can call the strlen function on this string to get its size:

    $buffer = fread($file, 100000);
    if (strlen($buffer) < 100000)
    {
        echo "read less than 100000 bytes";
    }
    etc

    now, the problem is, on systems where you have the mbstring extension
    turned on, and are using function substitution to replace strlen with
    mb_strlen, you are pretty much guaranteed to get a WRONG value back from
    strlen, since it will find some multi-byte lead characters in that binary
    file eventually.

    so the question is:

    are there any other methods to find the length of a string besides strlen?
    the only way i can see to not screw myself right now is to not use function
    substitution. any other options?

    thanks,
    mark.


    --
    I am not an ANGRY man. Remove the rage from my email to reply.
  • Daniel Tryba

    #2
    Re: size of binary string on multi-byte system?

    Mark <mw@angrylanfear.com> wrote:
    > so the question is:
    >
    > are there any other methods to find the length of a string besides strlen?
    > the only way i can see to not screw myself right now is to not use function
    > substitution. any other options?

    I'd guess that "mb_strlen($str, '8bit')" would give you the actual
    number of bytes in the string.

    Checking... http://nl3.php.net/mb_strlen has only one comment:
    If you wish to find the byte length of a multi-byte string when you are
    using mbstring.func_overload 2 and UTF-8 strings, then you can use the
    following:

    mb_strlen($utf8_string, 'latin1');

    latin1 is also an 8-bit encoding, only there is a range that is not
    defined: 0x7f < z < 0xa0

    8bit sounds better IMHO.
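
A minimal sketch of the '8bit' suggestion above, assuming PHP 7 or later with the mbstring extension loaded. The helper name byte_length() is hypothetical and only used for illustration. Because the encoding is passed explicitly, the result should not depend on whether strlen has been overloaded.

```php
<?php
// byte_length() is a hypothetical helper name, not a PHP built-in.
function byte_length(string $data): int
{
    // With the '8bit' encoding every byte counts as one character,
    // so the character count equals the byte count.
    return mb_strlen($data, '8bit');
}

// Euro sign encoded in UTF-8 (3 bytes) followed by 3 ASCII bytes.
$buffer = "\xE2\x82\xAC" . "abc";

echo byte_length($buffer), "\n";         // byte count: 6
echo mb_strlen($buffer, 'UTF-8'), "\n";  // character count: 4
```

The second call shows the kind of value an overloaded strlen would report for the same data when the internal encoding is UTF-8.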


    • Chung Leong

      #3
      Re: size of binary string on multi-byte system?


      "Mark" <mw@ANGRYLanfear.com> wrote in message
      news:rpCdnfkrdqi5FZ_fRVn-1A@nventure.com...
      >
      > hello!
      >
      > normally, if you are given binary data in a string (such as from a call to
      > fread), you can call the strlen function on this string to get its size:
      >
      > $buffer = fread($file, 100000);
      > if (strlen($buffer) < 100000)
      > {
      > echo "read less than 100000 bytes";
      > }
      > etc
      >
      > now, the problem is, on systems where you have the mbstring extension
      > turned on, and are using function substitution to replace strlen with
      > mb_strlen, you are pretty much guaranteed to get a WRONG value back from
      > strlen, since it will find some multi-byte lead characters in that binary
      > file eventually.
      >
      > so the question is:
      >
      > are there any other methods to find the length of a string besides strlen?
      > the only way i can see to not screw myself right now is to not use function
      > substitution. any other options?
      >
      > thanks,
      > mark.

      array_sum(count_chars($s)), maybe?
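
For reference, the count_chars idea could be sketched as follows, assuming PHP 7 or later. The function name byte_length_via_count_chars() is made up for this example. In its default mode count_chars returns an array keyed by byte value (0 to 255) with the number of occurrences of each byte, so the sum of that array is the byte length.

```php
<?php
// byte_length_via_count_chars() is a hypothetical name used for illustration.
function byte_length_via_count_chars(string $data): int
{
    // count_chars() in mode 0 yields one frequency per possible byte value;
    // adding the frequencies gives the total number of bytes.
    return array_sum(count_chars($data));
}

// Two bytes that form one UTF-8 character, plus one NUL byte.
$buffer = "\xC3\xA9\x00";

echo byte_length_via_count_chars($buffer), "\n"; // 3
```

This works on raw bytes, but it scans the whole string and builds a 256-entry array on every call, so it does more work than a plain length lookup.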



      • Mark

        #4
        Re: size of binary string on multi-byte system?

        Chung Leong wrote:

        >> are there any other methods to find the length of a string besides strlen?
        >> the only way i can see to not screw myself right now is to not use function
        >> substitution. any other options?
        >>
        >> thanks,
        >> mark.
        >
        > array_sum(count_chars($s)), maybe?


        the best solution so far appears to be mb_strlen($buf, '8bit')

        thanks to all.
        mark.
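
Applied to the fread check from the first post, the '8bit' approach might look like the sketch below. It assumes PHP 7 or later with mbstring loaded, and it uses a temporary file as a stand-in for the original $file handle, which the thread never shows being opened.

```php
<?php
// Assumption: a temporary file replaces the poster's $file handle.
$file = tmpfile();
fwrite($file, str_repeat("\xC3\xA9", 50)); // 100 bytes, 50 UTF-8 characters
rewind($file);

$wanted = 100000;
$buffer = fread($file, $wanted);
fclose($file);

if ($buffer === false) {
    // fread() signals failure with false, which has no meaningful length.
    echo "read failed\n";
} elseif (mb_strlen($buffer, '8bit') < $wanted) {
    // Compare bytes to bytes: '8bit' counts one character per byte.
    echo "read less than $wanted bytes\n";
}
```

The comparison stays in bytes on both sides, which is what the original strlen check intended.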



        --
        I am not an ANGRY man. Remove the rage from my email to reply.
