working with a hex byte array (which represents a string)

  • HaLo2FrEeEk
    Contributor
    • Feb 2007
    • 404

    Hi, I'm working on a project wherein I need to pull out 256 bytes from a specified position in a file. The 256 bytes represent the title of the file. The thing is, it's null-padded, so there are 2 null hex bytes before the string starts, a single null byte between each letter in the title, and a variable number of nulls after the title ends. For example, if my title was "some random title", I would have 256 bytes with the first 2 being "00 00", then the title with a single "00" byte between each letter, and 221 "00" bytes after.

    I need to know how to get rid of those null bytes. I've tried using:

    ASCIIEncoding.ASCII.GetString(byte[] bytes);

    And that would return a string "some random title" but when I do string.Length I get 256, because null bytes are non-printable when converted to ASCII, but they're still there.

    Is there anything I can do? I can't really do any more work on my project until this gets resolved.
  • RedSon
    Recognized Expert Expert
    • Jan 2007
    • 4980

    #2
    Your string is still 256 bytes in size. It's just a null terminated string. The text is valid but the size is the same. This is normal.

    You could try trimming the string if you have a method that will do that.


    • tlhintoq
      Recognized Expert Specialist
      • Mar 2008
      • 3532

      #3
      A null between each word? Not a 32 (space character)? That seems odd.
      Are you sure the file contains just bytes and was not written as text? In UTF-16 (what .NET calls Unicode), these characters take two bytes each, so a space would be the two bytes 0x00 0x20 (hex), i.e. 0 and 32 in decimal.

      I think I would replace all the nulls with spaces.
      Then do a TrimStart() and a TrimEnd() to clear all the leading and trailing spaces, leaving you with just the middle.


      • HaLo2FrEeEk
        Contributor
        • Feb 2007
        • 404

        #4
        RedSon, I tried using the trim() method, but it didn't do anything at all. I don't want the string to be 256 bytes, I want it to be the length of the actual filename, without the null bytes. I need to use string.Length to get the string length so I can truncate it down to fit the area it's being printed in or add an ellipsis (...) to the end.

        And tlhintoq, since the actual filename has the null bytes between each character, using trim would be useless anyway since I would still have a string that is (string.Length + (string.Length - 1)) long. A 15 character string will have 14 null bytes if there is 1 between each character. I need to remove those null bytes. It's not a space, it's a "00" hex byte. Here is an example, the title in this example is "Prepare to Drop Premium Theme", the Hex for that is:

        Code:
        00 00 50 00 72 00 65 00 70 00 61 00 72 00 65 00 20 00 74 00
        6F 00 20 00 44 00 72 00 6F 00 70 00 20 00 50 00 72 00 65 00
        6D 00 69 00 75 00 6D 00 20 00 54 00 68 00 65 00 6D 00 65 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        and the ASCII is:

        Code:
        ..P.r.e.p.a.r.e. .t.o. .D.r.o.p. .P.r.e.
        m.i.u.m. .T.h.e.m.e.....................
        ........................................
        ........................................
        ........................................
        ........................................
        ................
        The dots represent non-printable ASCII characters, the 00 bytes.

        I did manage to throw something together, a method that works, but I feel like there's got to be a better way. Here's my method:

        Code:
        private string getThemeName()
        {
            string themeName;
            // Jump to the start of the 256-byte title field.
            br.BaseStream.Position = 0x410;
            byte[] nameBytesNull = br.ReadBytes(256);
            // Collect every byte that is not a null.
            List<byte> nameBytesList = new List<byte>();
            foreach (byte single in nameBytesNull)
            {
                if (single.ToString() != "0")
                {
                    nameBytesList.Add(single);
                }
            }
            // Copy the surviving bytes into an array and decode them as ASCII.
            byte[] nameBytes = new byte[nameBytesList.Count];
            nameBytesList.CopyTo(nameBytes);
            themeName = ASCIIEncoding.ASCII.GetString(nameBytes);
            return themeName;
        }
        Basically I'm reading the 256 bytes that I need into a byte array, then I iterate through each byte in that array and check whether its ToString() value is "0"; if it isn't, I add it to the byte List. Then I create another byte array based on the length of the byte List, and copy the byte List to that byte array. THEN I use the ASCIIEncoding GetString() method to convert the resulting byte array to its ASCII string value and return it.
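
For comparison, here is a minimal sketch of the same filter-then-decode idea in a shorter form. It compares each byte to 0 directly instead of going through ToString(), and uses LINQ in place of the List and the copy. The class and method names (NullStripSketch, StripNulls) are made up for illustration, and the approach only holds up if every character in the title is in the ASCII range, since it throws away the high byte of each two-byte character.

```csharp
using System;
using System.Linq;
using System.Text;

static class NullStripSketch
{
    // Hypothetical helper: drops every 0x00 byte, then decodes the rest as ASCII.
    // Assumption: the title only contains ASCII-range characters.
    public static string StripNulls(byte[] raw)
    {
        byte[] nameBytes = raw.Where(b => b != 0).ToArray();
        return Encoding.ASCII.GetString(nameBytes);
    }

    static void Main()
    {
        // A 256-byte field starting with the first bytes of the dump above.
        byte[] raw = new byte[256];
        byte[] prefix = { 0x00, 0x00, 0x50, 0x00, 0x72, 0x00, 0x65, 0x00 };
        Array.Copy(prefix, raw, prefix.Length);

        string name = StripNulls(raw);
        Console.WriteLine(name);        // Pre
        Console.WriteLine(name.Length); // 3
    }
}
```

In getThemeName() this would replace the foreach loop and the two copy lines, with br.ReadBytes(256) passed in as raw.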

        This works, but like I said, I feel like there must be a simpler way. Any ideas?


        • tlhintoq
          Recognized Expert Specialist
          • Mar 2008
          • 3532

          #5
          Code:
          00 00 50 00 72 00 65 00 70 00 61 00 72 00 65 00 20 00 74 00
          This is exactly as I mentioned: Unicode (UTF-16), where two bytes represent a single character.

          This is 1 null at the beginning
          0050 is the two byte unicode of 'P'
          0072 is the two byte unicode of 'r'
          0065 is ... e
          0070 is p
          0061 is a
          0072 is r
          0065 is e
          0020 is a space
          0074 is a t

          Code:
           byte[] nameBytesNull = br.ReadBytes(256);
          Instead of reading bytes into a byte[] you should try reading lines into a string[]
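
Another option that keeps the byte[] read is to decode the field as two-byte text directly. The sketch below assumes, based only on the dump in post #4, that the title is big-endian UTF-16 starting one byte into the 256-byte field (one null, then 00 50 for 'P', and so on). The byte order and the one-byte offset are assumptions, not something the thread confirms, and the names TitleDecodeSketch and DecodeTitle are hypothetical.

```csharp
using System;
using System.Text;

static class TitleDecodeSketch
{
    // Hypothetical helper. Assumption: big-endian UTF-16 text that starts
    // at byte 1 of the field and is padded with nulls to the end.
    public static string DecodeTitle(byte[] field)
    {
        if (field.Length < 3) return string.Empty;

        // Skip the single leading 0x00 and decode an even number of bytes.
        int count = (field.Length - 1) & ~1;
        string padded = Encoding.BigEndianUnicode.GetString(field, 1, count);

        // The padding decodes to '\0' characters; drop them from the end.
        return padded.TrimEnd('\0');
    }

    static void Main()
    {
        // Rebuild a field laid out like the dump: one null, then the title.
        byte[] field = new byte[256];
        byte[] text = Encoding.BigEndianUnicode.GetBytes("Prepare to Drop Premium Theme");
        Array.Copy(text, 0, field, 1, text.Length);

        string title = DecodeTitle(field);
        Console.WriteLine(title);        // Prepare to Drop Premium Theme
        Console.WriteLine(title.Length); // 29
    }
}
```

If the data turns out to be little-endian text starting at offset 0 instead, Encoding.Unicode.GetString(field).Trim('\0') would give the same result for this dump. Either way the string's Length is the real title length, and titles with non-ASCII characters survive, which a plain null filter would not guarantee.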

