Sending both binary data and strings over the same stream

  • tomPee
    New Member
    • Jun 2007
    • 21

    Hi,

    I've bumped into a slight problem, and I just don't seem to know how to fix it. What I want to do is the following:
    Send over a socket:
    1. Number of files to be sent (not as an integer, just as a string)
    then for each file to be sent:
    2. Length of the filename (again as a string)
    3. Filename
    4. File as binary data.

    I grabbed my Core Java book and figured it would be easily doable using a buffered data input/output stream, as I can treat it as a buffered stream for the binary data in the file and as a data stream for sending/receiving strings (is that even a valid way of thinking?)

    The whole 'plan' started crumbling when I tried to implement it and was told that the DataInputStream method is deprecated. And I can't use a Reader, as I must be able to receive binary data...

    Now, I think it would be a grave mistake (please do correct me if I'm wrong) to hand the same input/output stream over to a Reader and also construct a BufferedInputStream/BufferedOutputStream on the same input/output stream the Reader is using. And on second thought, the same problem probably occurs when wrapping the stream as new DataInputStream(new BufferedInputStream(socket.getInputStream())), so I'll just scratch that plan too...

    So... I'm kind of stuck here. How can I send both binary data and strings over the same socket, without making it so complicated that it's really difficult to build a correctly working C++ counterpart that speaks the same protocol as the Java implementation?

    Thanks in advance.

    Tom.
  • Nepomuk
    Recognized Expert Specialist
    • Aug 2007
    • 3111

    #2
    Hi Tom!

    I checked the DataInputStream API and it seems that only the readLine() method is deprecated, not the whole class.

    Also, I think it shouldn't be a problem to reuse a stream in the way you suggested.

    Oh, and for sending Strings, you could also use an ObjectOutputStream / ObjectInputStream (although that might be problematic when trying to create a C++ counterpart; I really don't know).

    Greetings,
    Nepomuk

    • tomPee
      New Member
      • Jun 2007
      • 21

      #3
      Hi,

      Thanks for the reply :).

      What I'm not sure about when reusing a stream is both the closing of the stream and the buffered data.
      If data arrives on the original stream, is it copied to both the buffer of the BufferedInputStream and the buffer of a BufferedReader, for example, or is it present in only one of the buffers?
      The latter case would be quite problematic, as I would never know where my data is. In the former case I would have to know exactly how many bytes to skip in each stream as I read from the other stream, but that shouldn't be too hard.
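
On the buffering question: bytes read from the underlying stream end up only in the buffer of the wrapper that performed the read, so a second wrapper on the same stream never sees them. The following is a minimal sketch of that effect, assuming a ByteArrayInputStream as a stand-in for the socket's input stream; the class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SharedStreamDemo {

    /**
     * Reads one line through a BufferedReader, then reports how many bytes
     * are still available on the underlying stream.
     */
    static int bytesLeftAfterReadLine(byte[] data) {
        // Stand-in for socket.getInputStream() (an assumption for this demo).
        ByteArrayInputStream raw = new ByteArrayInputStream(data);
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(raw, StandardCharsets.US_ASCII));
        try {
            reader.readLine(); // the reader fills its buffer well past the first line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return raw.available();
    }

    public static void main(String[] args) {
        byte[] data = "2\nbinary payload follows".getBytes(StandardCharsets.US_ASCII);
        System.out.println("bytes in stream:       " + data.length);
        System.out.println("left after readLine(): " + bytesLeftAfterReadLine(data));
    }
}
```

With an input this small the reader swallows everything, including the "binary" part, which is why mixing a Reader and a raw stream on one socket tends to lose data.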

      Another thing I was thinking about: what if I take a pure byte stream and just cast the bytes to chars for everything I know to be characters (for example, until casting 2 bytes results in a newline), and then interpret the next part as whatever it is supposed to be?
      To be honest, I could do that in C++, but I have no idea whatsoever how to do it in Java.

      Do you have any ideas about that?

      Thanks in advance,
      Tom

      • r035198x
        MVP
        • Sep 2006
        • 13225

        #4
        If you just send the binary file you should be able to get both its name and length after sending the file, right?

        • tomPee
          New Member
          • Jun 2007
          • 21

          #5
          But I can't send the Java File object, since we have to 'interface' with C++. Or isn't that what you meant?
          The thing is also that we want to be able to send, for example, 10 files one after another without having to open new connections.

          So far, what I'm thinking might work is just sending bytes over and casting everything to chars, but I think it will be a bit inefficient. Still, I think it might work...

          So: just use a raw BufferedOutputStream/BufferedInputStream and get the bytes from the strings, then on the receiving end cast every 1 or 2 bytes (according to the Java standard) to a character (if that's possible; sigh, C++ is so much easier :P) and separate on newlines that way.

          When the protocol then says that the next part should be interpreted as a binary file, e.g. after reading the file length using the method described above, I don't interpret any of the bytes and just write them to a file until the whole file has been received. Then I start casting again for the next filename length.

          Would that be a doable approach ?

          Thanks in advance
          Tom

          • JosAH
            Recognized Expert MVP
            • Mar 2007
            • 11453

            #6
            No need to over-complicate things: when Strings are written to or read from a stream, they are encoded/decoded. ASCII Strings (each char <= 0x7f) encode to a single byte per char in UTF-8. UTF-8 decoding (on the C++ side) isn't much trouble either; I suggest a simple protocol:

            0, 1: file name length (in bytes) in big-endian format
            2 ... n: file name, UTF-8 encoded
            n+1 ... n+4: length of the file in big-endian format
            n+5 ...: binary content of the file

            The C++ end shouldn't have any trouble with this data format. All you need is a simple OutputStream on the Java sending side. The String class itself can take care of the encoding (UTF-8).
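
The layout above could be produced on the Java side roughly as follows. This is only a sketch under assumptions: the class and method names are invented, the frame is built in memory for clarity, and the name length is taken to count encoded bytes. DataOutputStream.writeShort and writeInt write their values high byte first, which matches the big-endian fields in the layout.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class FileFrameWriter {

    /** Builds one frame: 2-byte name length, UTF-8 name, 4-byte file length, content. */
    static byte[] frame(String fileName, byte[] content) {
        byte[] name = fileName.getBytes(StandardCharsets.UTF_8);
        if (name.length > 0xFFFF) {
            throw new IllegalArgumentException("file name too long for a 2-byte length");
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeShort(name.length);   // bytes 0,1: name length, big-endian
            out.write(name);               // bytes 2..n: UTF-8 encoded name
            out.writeInt(content.length);  // bytes n+1..n+4: file length, big-endian
            out.write(content);            // bytes n+5..: raw file content
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] frame = frame("a.txt", new byte[] {1, 2, 3});
        System.out.println("frame length: " + frame.length);
    }
}
```

For a real socket the same four write calls would go to a DataOutputStream wrapped around socket.getOutputStream(), with the content streamed from the file instead of held in an array.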

            kind regards,

            Jos

            • tomPee
              New Member
              • Jun 2007
              • 21

              #7
              Oh, the standard encoded size of a char is 1 byte ? I thought it was 2 bytes... If it's 1 byte that indeed simplifies the matter a bit. And I didn't know the String class itself took care of the encoding.
              Thanks a lot Jos, I'll try and get it fixed that way. I'll post here if another corpse jumps out of the closet on my line of thought.

              Thanks a lot already !

              Greets,
              Tom

              • JosAH
                Recognized Expert MVP
                • Mar 2007
                • 11453

                #8
                Originally posted by tomPee
                Oh, the standard encoded size of a char is 1 byte ? I thought it was 2 bytes... If it's 1 byte that indeed simplifies the matter a bit. And I didn't know the String class itself took care of the encoding.
                Thanks a lot Jos, I'll try and get it fixed that way. I'll post here if another corpse jumps out of the closet on my line of thought.

                Thanks a lot already !

                Greets,
                Tom
                The UTF-8 encoding scheme encodes the code points 0x00 - 0x7f to the single bytes 0x00 - 0x7f. All the ASCII characters are in that range, so they encode to themselves.

                Internally a char takes up two bytes in Java; all chars do. When you write them to an OutputStream they are encoded because streams write bytes, not chars.
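
A small sketch of the difference between chars in memory and bytes on the wire; the class name is invented for illustration. String.getBytes with an explicit charset does the encoding, and only ASCII characters come out as one byte each.

```java
import java.nio.charset.StandardCharsets;

class Utf8Sizes {

    /** Number of bytes the string occupies once encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only.
        String[] samples = {"A", "\u00e9", "\u20ac", "file.txt"};
        for (String s : samples) {
            System.out.println(s.length() + " char(s) -> " + utf8Length(s) + " byte(s)");
        }
    }
}
```

"A" encodes to 1 byte, U+00E9 (e with acute accent) to 2 bytes and U+20AC (euro sign) to 3 bytes, so a length prefix has to count encoded bytes, not chars, as soon as non-ASCII text is possible.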

                kind regards,

                Jos

                • tomPee
                  New Member
                  • Jun 2007
                  • 21

                  #9
                  Hey,

                  As it seems, the people working on the C++ counterpart are not sending the lengths of the filenames and the lengths of the files are being sent as integers (4 bytes long).
                  Now, keeping in mind that characters are encoded in UTF-8 by default, I've thought up the following 'draft' implementation:

                  Both the fReader and fWriter are actually DataInputStream/DataOutputStream instances.

                  Code:
                  /*
                  	 * (non-Javadoc)
                  	 * 
                  	 * @see firefile.shared.net.ISocket#readString()
                  	 */
                  	public String readString() throws IOException {
                  		int length = this.fReader.readInt();
                  		
                  		System.out.print("Length received: ");
                  		System.out.println(length);
                  		
                  		// One byte per character: only correct while every character is ASCII.
                  		final char[] ch = new char[length];
                  		for (int i = 0; i < length; i++) {
                  			final int tmp = this.fReader.read();
                  			if (tmp == -1) {
                  				throw new IOException("Stream ended prematurely.");
                  			} else {
                  				ch[i] = (char) tmp;
                  			}
                  		}
                  		final String in = new String(ch);
                  		
                  		System.out.print("Readstring returned: ");
                  		System.out.println(in);
                  		
                  		return in;
                  	}
                  
                  	/*
                  	 * (non-Javadoc)
                  	 * 
                  	 * @see firefile.shared.net.ISocket#sendString(java.lang.String)
                  	 */
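                  	public void sendString(final String msg) throws IOException {
                  		// msg.length() equals the number of bytes sent only for ASCII text;
                  		// writeBytes() writes just the low-order byte of each char.
                  		this.fWriter.writeInt(msg.length());
                  		this.fWriter.writeBytes(msg);
                  		this.flush();
                  	}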
                  So, I'm thinking this will do what I want it to do.
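
The draft above works as long as every character is ASCII, because DataOutputStream.writeBytes discards the high eight bits of each char and msg.length() counts chars rather than encoded bytes. A possible UTF-8-safe variant is sketched below. It assumes the peer agrees that the 4-byte prefix counts bytes; the class and the roundTrip helper are invented for illustration and are not part of the original ISocket code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class LengthPrefixedStrings {

    /** Writes a 4-byte big-endian byte count followed by the UTF-8 bytes. */
    static void sendString(DataOutputStream out, String msg) throws IOException {
        byte[] bytes = msg.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
        out.flush();
    }

    /** Reads the byte count, then exactly that many bytes, and decodes them as UTF-8. */
    static String readString(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("Negative string length: " + length);
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes); // throws EOFException if the stream ends early
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Demo helper: sends a string into memory and reads it back. */
    static String roundTrip(String msg) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            sendString(new DataOutputStream(buffer), msg);
            return readString(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("na\u00efve.txt"));
    }
}
```

readFully also replaces the manual byte-by-byte loop and its end-of-stream check.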
                  Last edited by JosAH; Feb 10 '09, 11:13 AM. Reason: fixed the [code] ... [/code] tags

                  • tomPee
                    New Member
                    • Jun 2007
                    • 21

                    #10
                    Hi,

                    I just wanted to let you know that I've found it :). The 'final' version (except for the debug output) is much like the above. I've used DataInputStreams and DataOutputStreams to be able to easily send and receive integers.
                    To receive characters, I just read the bytes one by one until all have been read (as advertised by the length) and cast them to characters, which works nicely.
                    For the binary data I also read bytes one at a time, but I immediately write them to a file using a BufferedOutputStream. So all problems have been solved, and it's working perfectly with the C++ counterparts.

                    The following code is the implementation for receiving a single file (with debug output, though).

                    [code=java]
                    public void receiveFile(String uri) throws NetworkException, FileException {
                        System.out.println("receiveFile() start");
                        try {
                            // length of filename + filename
                            final int fileNameLength = in.readInt();
                            System.out.print("Filename length: ");
                            System.out.println(fileNameLength);
                            final char[] ch = new char[fileNameLength];
                            for (int i = 0; i < fileNameLength; i++) {
                                final int tmp = in.read();
                                if (tmp == -1) {
                                    throw new IOException("Stream ended prematurely.");
                                } else {
                                    // one byte per character: correct for ASCII file names only
                                    ch[i] = (char) tmp;
                                }
                            }
                            String fileName = new String(ch);
                            fileName = resolveNameCollisions(uri, fileName);
                            System.out.print("Filename - after collision resolving: ");
                            System.out.println(fileName);
                            BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(uri + pathSeparator + fileName));

                            // length of file + file
                            final int fileSize = in.readInt();
                            for (int i = 0; i < fileSize; ++i) {
                                final byte tmp = in.readByte();
                                bos.write(tmp);
                            }
                            bos.flush();
                            bos.close();
                            System.out.println("receiveFile() - end");
                        } catch (IOException e) {
                            throw new NetworkException(e);
                        }
                    }
                    [/code]
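
Reading the file one byte at a time works, but copying in blocks is usually cheaper. Below is a hedged sketch of a block copy that stops after exactly the announced number of bytes, so the next file's header stays in the stream. The class, method names and the 8192-byte buffer size are assumptions, and in-memory streams stand in for the socket and the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class BulkCopy {

    /** Copies exactly 'count' bytes from in to out, failing if the stream ends early. */
    static void copyExactly(InputStream in, OutputStream out, long count) throws IOException {
        byte[] buffer = new byte[8192];
        long remaining = count;
        while (remaining > 0) {
            // Never ask for more than what belongs to the current file.
            int wanted = (int) Math.min(buffer.length, remaining);
            int n = in.read(buffer, 0, wanted);
            if (n == -1) {
                throw new EOFException("Stream ended with " + remaining + " bytes missing");
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
    }

    /** Demo helper: copies the first 'count' bytes of an array through copyExactly. */
    static byte[] copyPrefix(byte[] source, int count) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copyExactly(new ByteArrayInputStream(source), out, count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] source = new byte[20005];
        System.out.println("copied: " + copyPrefix(source, 20000).length);
    }
}
```

In receiveFile the call would take the place of the inner for loop, with the DataInputStream as source and the BufferedOutputStream as target.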

                    I'd like to thank you a lot for all the help! You've really enlightened me on the whole character encoding problem, which was more or less the largest 'black hole' for me, so thanks for shedding some light on that.
                    And thanks to r035198x and Nepomuk too !

                    greets !
                    Tom
                    Last edited by Nepomuk; Feb 11 '09, 12:15 AM. Reason: Fixed [code] tags

                    • umbr
                      New Member
                      • Feb 2009
                      • 9

                      #11
                      Hi Tom!
                      Why not consider an existing encoding method such as ASN.1 or XML?

                      • tomPee
                        New Member
                        • Jun 2007
                        • 21

                        #12
                        Hi,

                        As a matter of fact, we do use XML for message passing. I don't know what ASN.1 is, though.
                        It's just that we can't send files in XML messages...
                        We needed to send mixed data types (lengths, characters and binary data) over one socket,
                        where the lengths also include the length of the XML message being sent.

                        • umbr
                          New Member
                          • Feb 2009
                          • 9

                          #13
                          Hi again!
                          Perhaps you could use TLV (Tag/Length/Value or Type/Length/Value) notation.
                          Typical usage: a tag (1 byte) for each data type, a length as a 2-byte sequence (usually big-endian), and then the data itself. To exchange strings, use String's getBytes() method. See example.txt for details.
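
A minimal sketch of such a TLV record in Java, not taken from example.txt: the tag values, class and method names are hypothetical, and the record is built in memory so the byte layout is easy to inspect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Tlv {
    // Hypothetical tag values; a real protocol would define its own.
    static final int TAG_STRING = 0x01;
    static final int TAG_BINARY = 0x02;

    /** Encodes one record: 1-byte tag, 2-byte big-endian length, then the value. */
    static byte[] encode(int tag, byte[] value) {
        if (tag < 0 || tag > 0xFF || value.length > 0xFFFF) {
            throw new IllegalArgumentException("tag or value does not fit the TLV header");
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(tag);
            out.writeShort(value.length);
            out.write(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Decodes one record, checks its tag, and returns the value bytes. */
    static byte[] decode(byte[] record, int expectedTag) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            int tag = in.readUnsignedByte();
            if (tag != expectedTag) {
                throw new IllegalArgumentException("unexpected tag: " + tag);
            }
            byte[] value = new byte[in.readUnsignedShort()];
            in.readFully(value);
            return value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] record = encode(TAG_STRING, "hi".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decode(record, TAG_STRING), StandardCharsets.UTF_8));
    }
}
```

A 2-byte length caps each value at 65535 bytes, so larger files would need a wider length field or several records.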

                          • tomPee
                            New Member
                            • Jun 2007
                            • 21

                            #14
                            Hi,

                            Hm, that does indeed look nice :). The only difference from what we're doing now is the type field; our protocol instead specifies that fields are sent in a fixed order. Otherwise it looks quite the same as what we are doing :).

                            Only, for sending integers we use a DataOutputStream, which sends 4-byte big-endian integers. But the rest is quite the same.
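
For anyone writing the non-Java end, it may help to see exactly which bytes a 4-byte big-endian integer consists of. The sketch below (invented class and method names) builds them by hand and compares the result with what DataOutputStream.writeInt produces.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class BigEndianInt {

    /** Most significant byte first, as DataOutputStream.writeInt does. */
    static byte[] toBytes(int value) {
        return new byte[] {
            (byte) (value >>> 24), (byte) (value >>> 16), (byte) (value >>> 8), (byte) value
        };
    }

    /** Rebuilds the int; each byte is masked so it is treated as unsigned. */
    static int fromBytes(byte[] b) {
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16) | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    /** Reference encoding produced by the JDK itself. */
    static byte[] viaDataOutputStream(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toBytes(258)));              // [0, 0, 1, 2]
        System.out.println(Arrays.equals(toBytes(258), viaDataOutputStream(258)));
    }
}
```

The same shifts and masks translate directly to C++, which avoids depending on the host machine's byte order.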

                            I don't think we'll be changing it anymore, though, as it is working correctly as is and we're all satisfied with how it works. It's going perfectly between C++ and Java now, so if we keep it like that we can move on to the next steps: coding more business logic. I appreciate the help, though; if only I had known that sooner :)

                            Thanks a lot.
                            Kind Regards,

                            Tom

                            • umbr
                              New Member
                              • Feb 2009
                              • 9

                              #15
                              Hi.
                              Thanks for the kind words :)
                              But be careful with encodings. If filenames contain non-ASCII characters, on the C++ side you'll need special libraries to encode/decode these strings properly; such libraries should exist for every platform.
