Can you improve this code : Search Byte[] backwards for byte Pattern

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Russell Mangel

    Can you improve this code : Search Byte[] backwards for byte Pattern

    Hi,

    The code I have posted searches for a pattern of bytes starting from the end
    of
    Byte[] array *backwords*, if a match is found, return the starting index of
    those
    found bytes. Since one of the parameters is starting index you can start the
    search
    anywhere within the Byte[] you are searching.

    Example:

    Suppose you have a Byte[] that contains the 4 bytes 0x01, 0x01, 0xBA, 0xC8.
    Find the starting index for the two bytes 0x01, 0xBA, searching from the
    end.
    The method would return 0x01, because that is the starting index of the
    sequence
    of bytes 0x01, 0xBA.

    The code I have posted below works, but can you simplfy / optimize the
    FindBackward() method?

    Thank you.

    Russell Mangel
    Las Vegas, NV

    PS
    The Byte[] array in production, will have average size of 3MB - 10MB, they
    are .mp3 files.

    // Start code
    using System;

    class Program
    {
    private Byte[] m_Bytes = { 0x01, 0x01, 0xBA, 0xC8 };
    private const UInt32 NOT_FOUND = 0xFFFFFFFF;
    private readonly UInt32 END_OF_FILE;

    public Program( )
    {
    END_OF_FILE = (UInt32)m_Bytes .Length - 1;
    }

    static void Main( string[] args )
    {
    Program p = new Program( );

    Byte[] searchFor = { 0x01, 0xBA };
    UInt32 startSearchInde x = 0x03;

    UInt32 x = p.FindBackward( searchFor, startSearchInde x );

    Console.WriteLi ne( "{0:X8}", x );

    Console.ReadLin e( );
    }
    public UInt32 FindBackward( Byte[] bytes, UInt32 index )
    {
    if (bytes == null || index END_OF_FILE )
    {
    return NOT_FOUND;
    }

    Boolean isMatch = true;
    UInt32 i = index;
    UInt32 size = ( UInt32 )bytes.Length - 1;
    UInt32 j = size;
    UInt32 t = 0;

    do
    {
    isMatch = true;
    j = size;
    t = 0;
    do
    {
    if ( m_Bytes[ i - t ] != bytes[ j ] )
    {
    isMatch = false;
    break;
    }
    t++;
    }
    while ( j-- 0 );

    if ( isMatch )
    {
    return ( i - size ); // Success
    }
    }
    while ( i-- 0 );

    return NOT_FOUND;
    }
    }
    // End Code


  • Peter Duniho

    #2
    Re: Can you improve this code : Search Byte[] backwards for bytePattern

    On Thu, 25 Sep 2008 14:23:52 -0700, Russell Mangel <russell@tymer. net>
    wrote:
    [...]
    The code I have posted below works, but can you simplfy / optimize the
    FindBackward() method?
    Why? Are you having some problem with it?

    Assuming you always just want to search for a single, pre-determined
    string of bytes, I don't think you're going to get much better than what
    you've posted. There are some techniques that would allow you to only
    have to visit each byte in the searched array once, but I think they would
    be overkill if you have only the one string of bytes you're looking for.
    They have considerable overhead, both in terms of setup and in terms of
    cost during iteration, providing very good improvements in the overall
    efficiency of the algorithm as measured by "big-O notation", but
    increasing the cost of each loop iteration.

    You could certainly clean up the code you posted a bit, and if you wanted
    to go nuts you could even work the Array.FindLast< T>() method into it.
    But I'm not sure what the point would be, if what you have works and you
    don't have any specific complaints about it.

    Pete

    Comment

    • Russell Mangel

      #3
      Re: Can you improve this code : Search Byte[] backwards for byte Pattern

      I just wanted to make it as efficient as possible, because that method will
      be called a lot.
      I am parsing mp3 files, 1.5 Terabytes, hundreds of thousands of files. The
      routine
      parse's every frame and every tag of the binary mp3 file. So I wanted to
      test using
      pure byte arrays first and see how performance is. Then I'd like to try an
      unsafe version if it makes sense. I avoided using any of the stream and
      binaryreader objects, for now cuz I doubt they would increase performance
      because
      we are forced to read every Frame inside Mp3 file from start to end. However
      the stream/binary reader objects would use much less memory than loading
      3-10MB into one Byte[].


      Thanks for your reply.

      Russ

      "Peter Duniho" <NpOeStPeAdM@nn owslpianmk.comw rote in message
      news:op.uh139ts 58jd0ej@petes-computer.local. ..
      On Thu, 25 Sep 2008 14:23:52 -0700, Russell Mangel <russell@tymer. net>
      wrote:
      >
      >[...]
      >The code I have posted below works, but can you simplfy / optimize the
      >FindBackward () method?
      >
      Why? Are you having some problem with it?
      >
      Assuming you always just want to search for a single, pre-determined
      string of bytes, I don't think you're going to get much better than what
      you've posted. There are some techniques that would allow you to only
      have to visit each byte in the searched array once, but I think they would
      be overkill if you have only the one string of bytes you're looking for.
      They have considerable overhead, both in terms of setup and in terms of
      cost during iteration, providing very good improvements in the overall
      efficiency of the algorithm as measured by "big-O notation", but
      increasing the cost of each loop iteration.
      >
      You could certainly clean up the code you posted a bit, and if you wanted
      to go nuts you could even work the Array.FindLast< T>() method into it.
      But I'm not sure what the point would be, if what you have works and you
      don't have any specific complaints about it.
      >
      Pete

      Comment

      • Family Tree Mike

        #4
        Re: Can you improve this code : Search Byte[] backwards for byte Pattern

        I'm currious why you made the mp3 byte array a member of the class
        containing the function, but pass the search string to the function. I
        doubt it
        matters much though.


        "Russell Mangel" <russell@tymer. netwrote in message
        news:%23SAYSW1H JHA.2580@TK2MSF TNGP05.phx.gbl. ..
        Hi,
        >
        The code I have posted searches for a pattern of bytes starting from the
        end of
        Byte[] array *backwords*, if a match is found, return the starting index
        of those
        found bytes. Since one of the parameters is starting index you can start
        the search
        anywhere within the Byte[] you are searching.
        >
        Example:
        >
        Suppose you have a Byte[] that contains the 4 bytes 0x01, 0x01, 0xBA,
        0xC8.
        Find the starting index for the two bytes 0x01, 0xBA, searching from the
        end.
        The method would return 0x01, because that is the starting index of the
        sequence
        of bytes 0x01, 0xBA.
        >
        The code I have posted below works, but can you simplfy / optimize the
        FindBackward() method?
        >
        Thank you.
        >
        Russell Mangel
        Las Vegas, NV
        >
        PS
        The Byte[] array in production, will have average size of 3MB - 10MB,
        they are .mp3 files.
        >
        // Start code
        using System;
        >
        class Program
        {
        private Byte[] m_Bytes = { 0x01, 0x01, 0xBA, 0xC8 };
        private const UInt32 NOT_FOUND = 0xFFFFFFFF;
        private readonly UInt32 END_OF_FILE;
        >
        public Program( )
        {
        END_OF_FILE = (UInt32)m_Bytes .Length - 1;
        }
        >
        static void Main( string[] args )
        {
        Program p = new Program( );
        >
        Byte[] searchFor = { 0x01, 0xBA };
        UInt32 startSearchInde x = 0x03;
        >
        UInt32 x = p.FindBackward( searchFor, startSearchInde x );
        >
        Console.WriteLi ne( "{0:X8}", x );
        >
        Console.ReadLin e( );
        }
        public UInt32 FindBackward( Byte[] bytes, UInt32 index )
        {
        if (bytes == null || index END_OF_FILE )
        {
        return NOT_FOUND;
        }
        >
        Boolean isMatch = true;
        UInt32 i = index;
        UInt32 size = ( UInt32 )bytes.Length - 1;
        UInt32 j = size;
        UInt32 t = 0;
        >
        do
        {
        isMatch = true;
        j = size;
        t = 0;
        do
        {
        if ( m_Bytes[ i - t ] != bytes[ j ] )
        {
        isMatch = false;
        break;
        }
        t++;
        }
        while ( j-- 0 );
        >
        if ( isMatch )
        {
        return ( i - size ); // Success
        }
        }
        while ( i-- 0 );
        >
        return NOT_FOUND;
        }
        }
        // End Code
        >

        Comment

        • Russell Mangel

          #5
          Re: Can you improve this code : Search Byte[] backwards for byte Pattern


          "Family Tree Mike" <FamilyTreeMike @ThisOldHouse.c omwrote in message
          news:e5$30c2HJH A.4060@TK2MSFTN GP03.phx.gbl...
          I'm currious why you made the mp3 byte array a member of the class
          containing the function, but pass the search string to the function. I
          doubt it
          matters much though.

          I just threw that class together, without any design, I am only interested
          in improving the method.
          The method will be used in another class in a .dll assembly.

          Thanks for noticing though.

          Russell Mangel
          Las Vegas, NV


          Comment

          Working...