Problem with writing fast UDP server

  • Krzysztof Retel

    Problem with writing fast UDP server

    Hi guys,

    I am struggling to write a fast UDP server. It has to handle around 10000
    UDP packets per second. I started building it with non-blocking
    sockets and threads. Unfortunately my approach does not work at all.
    I wrote a simple test case: client and server. The client sends 2200
    packets within 0.137447118759 secs. tcpdump received 2189 packets,
    which is not bad at all.
    But the server only handles 700-870 packets when it is non-blocking,
    and only 670-700 are received with blocking sockets.
    The client and the server run on the same local network,
    and tcpdump shows a pretty correct number of packets received.

    I included a bit of the code of the UDP server.

    import threading

    class PacketReceive(threading.Thread):
        def __init__(self, tname, socket, queue):
            self._tname = tname
            self._socket = socket
            self._queue = queue
            threading.Thread.__init__(self, name=self._tname)

        def run(self):
            print 'Started thread: ', self.getName()
            cnt = 1
            cnt_msgs = 0
            total = 0   # must be initialised, or the bare except below silently swallows a NameError
            while True:
                try:
                    data = self._socket.recv(512)
                    msg = data
                    cnt_msgs += 1
                    total += 1
                    # self._queue.put(msg)
                    print 'thread: %s, cnt_msgs: %d' % (self.getName(), cnt_msgs)
                except:
                    pass


    I also tried using a Queue, but this didn't help either.
    Any idea what I am doing wrong?

    I was reading that the Python socket module was causing some delays with a
    TCP server. They recommended setting a socket option to disable delays:
    "sock.setsockopt(SOL_TCP, TCP_NODELAY, 1)". I couldn't find any
    similar option for UDP sockets.
    Is there anything I have to change in the socket options to make it
    work faster?
    Why can't the server process all incoming packets? Is there a bug in
    the socket layer? btw. I am using Python 2.5 on Ubuntu 8.10.
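
    (For reference, the TCP option quoted above is set roughly as in the sketch
    below. UDP has no TCP_NODELAY analogue; the SO_RCVBUF line is only an
    illustration of an option that does apply to UDP sockets, not something
    suggested in this thread, and the buffer size is an arbitrary example.)

    import socket

    tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # the TCP-only "no delay" option quoted above
    tcp_sock.setsockopt(socket.SOL_TCP, socket.TCP_NODELAY, 1)

    udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # enlarging the kernel receive buffer is one option that applies to UDP
    udp_sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 20)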

    Cheers
    K
  • Hrvoje Niksic

    #2
    Re: Problem with writing fast UDP server

    Krzysztof Retel <Krzysztof.Retel@googlemail.com> writes:
    > But the server only handles 700-870 packets when it is non-blocking,
    > and only 670-700 are received with blocking sockets.
    What are your other threads doing? Have you tried the same code
    without any threading?
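
    (For comparison, a minimal single-threaded blocking receiver might look like
    the sketch below; the port number and buffer size are made-up examples.)

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(('', 12345))        # example port
    received = 0
    while True:
        data = sock.recv(512)    # blocks until a datagram arrives
        received += 1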


    • Krzysztof Retel

      #3
      Re: Problem with writing fast UDP server

      On Nov 20, 3:34 pm, Hrvoje Niksic <hnik...@xemacs.org> wrote:
      > Krzysztof Retel <Krzysztof.Re...@googlemail.com> writes:
      > > But the server only handles 700-870 packets when it is non-blocking,
      > > and only 670-700 are received with blocking sockets.
      >
      > What are your other threads doing?  Have you tried the same code
      > without any threading?
      I have only this one thread, which I can run a couple of times.
      I also tried it without threading and got the same result: not all
      packets were processed.


      • bieffe62@gmail.com

        #4
        Re: Problem with writing fast UDP server

        On 20 Nov, 16:03, Krzysztof Retel <Krzysztof.Re...@googlemail.com>
        wrote:
        > [original post quoted in full; snipped]
        Stupid question: did you try removing the print (e.g. printing once
        every 100 messages) ?

        Ciao
        ----
        FB


        • Krzysztof Retel

          #5
          Re: Problem with writing fast UDP server

          On Nov 20, 4:00 pm, bieff...@gmail.com wrote:
          > [original post quoted in full; snipped]
          >
          > Stupid question: did you try removing the print (e.g. printing once
          > every 100 messages) ?
          :) Of course I did. Nothing has changed.

          I wonder if there is some kind of socket setting to avoid delays?


          • Gabriel Genellina

            #6
            Re: Problem with writing fast UDP server

            On Thu, 20 Nov 2008 14:24:20 -0200, Krzysztof Retel
            <Krzysztof.Retel@googlemail.com> wrote:
            > [earlier discussion snipped]
            >
            > I wonder if there is some kind of socket setting to avoid delays?
            I've used this script to test sending UDP packets. I've not seen any
            delays.

            <code>
            """a very simple UDP test

            Usage:

            %(name)s client <remotehost> <message to send|length of message>
                to continuously send messages to <remotehost> until Ctrl-C

            %(name)s server
                to listen for messages until Ctrl-C

            Uses port %(port)d. Once stopped, shows some statistics.
            Creates udpstress-client.csv or udpstress-server.csv with
            pairs (size,time)
            """

            import os, sys
            import socket
            import time

            PORT = 21758
            BUFSIZE = 4096
            socket.setdefaulttimeout(10.0)

            def server(port):
                sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                sock.bind(('', port))
                print "Receiving at port %d" % (port)
                history = []
                print "Waiting for first packet to arrive...",
                sock.recvfrom(BUFSIZE)
                print "ok"
                t0 = time.clock()
                while 1:
                    try:
                        try:
                            data, remoteaddr = sock.recvfrom(BUFSIZE)
                        except socket.timeout:
                            print "Timed out"
                            break
                        except KeyboardInterrupt: # #1755388 #926423
                            raise
                        t1 = time.clock()
                        if not data:
                            break
                        history.append((len(data), t1-t0))
                        t0 = t1
                    except KeyboardInterrupt:
                        print "Stopped"
                        break
                sock.close()
                return history

            def client(remotehost, port, data):
                sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                history = []
                print "Sending %d-bytes packets to %s:%d" % (len(data), remotehost, port)
                t0 = time.clock()
                while 1:
                    try:
                        nbytes = sock.sendto(data, (remotehost, port))
                        t1 = time.clock()
                        if not nbytes:
                            break
                        history.append((nbytes, t1-t0))
                        t0 = t1
                    except KeyboardInterrupt:
                        print "Stopped"
                        break
                sock.close()
                return history

            def show_stats(history, which):
                npackets = len(history)
                bytes_total = sum([item[0] for item in history])
                bytes_avg = float(bytes_total) / npackets
                bytes_max = max([item[0] for item in history])
                time_total = sum([item[1] for item in history])
                time_max = max([item[1] for item in history])
                time_min = min([item[1] for item in history])
                time_avg = float(time_total) / npackets
                speed_max = max([item[0]/item[1] for item in history if item[1]>0])
                speed_min = min([item[0]/item[1] for item in history if item[1]>0])
                speed_avg = float(bytes_total) / time_total
                print "Packet count      %8d" % npackets
                print "Total bytes       %8d bytes" % bytes_total
                print "Total time        %8.1f secs" % time_total
                print "Avg size / packet %8d bytes" % bytes_avg
                print "Max size / packet %8d bytes" % bytes_max
                print "Max time / packet %8.1f us" % (time_max*1e6)
                print "Min time / packet %8.1f us" % (time_min*1e6)
                print "Avg time / packet %8.1f us" % (time_avg*1e6)
                print "Max speed         %8.1f Kbytes/sec" % (speed_max/1024)
                print "Min speed         %8.1f Kbytes/sec" % (speed_min/1024)
                print "Avg speed         %8.1f Kbytes/sec" % (speed_avg/1024)
                print
                open("udpstress-%s.csv" % which, "w").writelines(
                    ["%d,%f\n" % item for item in history])

            if len(sys.argv) > 1:
                if "client".startswith(sys.argv[1].lower()):
                    remotehost = sys.argv[2]
                    data = sys.argv[3]
                    if data.isdigit(): # means length of message
                        data = "x" * int(data)
                    history = client(remotehost, PORT, data)
                    show_stats(history, "client")
                    sys.exit(0)
                elif "server".startswith(sys.argv[1].lower()):
                    history = server(PORT)
                    show_stats(history, "server")
                    sys.exit(0)

            print >>sys.stderr, __doc__ % {
                "name": os.path.basename(sys.argv[0]),
                "port": PORT}
            </code>

            Start the server before the client.
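
            For example, assuming the script above is saved as udpstress.py (the
            filename and the address are just placeholders), a run might look like:

            python udpstress.py server                       # on the receiving machine
            python udpstress.py client 192.168.0.10 90       # send 90-byte messages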

            --
            Gabriel Genellina


            • John Nagle

              #7
              Re: Problem with writing fast UDP server

              Gabriel Genellina wrote:
              > On Thu, 20 Nov 2008 14:24:20 -0200, Krzysztof Retel
              > <Krzysztof.Retel@googlemail.com> wrote:
              > > [original post snipped]
              > >
              > > I wonder if there is some kind of socket setting to avoid delays?
              Is the program CPU-bound? If so, CPython is too slow for what you want
              to do.

              John Nagle


              • Greg Copeland

                #8
                Re: Problem with writing fast UDP server

                On Nov 20, 9:03 am, Krzysztof Retel <Krzysztof.Re...@googlemail.com>
                wrote:
                > [original post quoted in full; snipped]
                First and foremost, you are not being realistic here. Attempting to
                squeeze 10,000 packets per second out of 10Mb/s (assumed) Ethernet is
                not realistic. The maximum theoretical limit is 14,880 frames per
                second, and that assumes each frame is only 84 bytes, making it
                useless for data transport. Using your numbers, each frame requires
                174B (90B of data plus 84B of overhead), which works out to a
                theoretical maximum of ~7200 frames per second. These are obviously
                rough numbers, but I believe you get the point. It's late here, so
                I'll double check my numbers tomorrow.
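
                (A quick back-of-the-envelope check of those figures, taking the
                assumed 10 Mb/s link and the per-frame sizes quoted above at face
                value:)

                LINK_BPS = 10 * 10**6      # assumed 10 Mb/s Ethernet
                MIN_FRAME = 84             # minimum on-wire frame size, in bytes
                FRAME_90B = 90 + 84        # 90-byte datagram plus quoted overhead

                print LINK_BPS / 8 / MIN_FRAME   # ~14880 frames per second
                print LINK_BPS / 8 / FRAME_90B   # ~7183 frames per second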

                In your case, you would not want to use TCP_NODELAY, even if you were
                to use TCP, as it would actually limit your throughput. UDP does not
                have such an option because each datagram is an ethernet frame - which
                is not true for TCP as TCP is a stream. In this case, use of TCP may
                significantly reduce the number of frames required for transport -
                assuming TCP_NODELAY is NOT used. If you want to increase your
                throughput, use larger datagrams. If you are on a reliable connection,
                which we can safely assume since you are currently using UDP, use of
                TCP without the use of TCP_NODELAY may yield better performance
                because of its buffering strategy.

                Assuming you are using 10Mb ethernet, you are nearing its frame-
                saturation limits. If you are using 100Mb ethernet, you'll obviously
                have a lot more elbow room, but not nearly as much as one would hope,
                because 100Mb is only achievable when frames are completely filled.
                It's been a while since I last looked at 100Mb numbers, but it's not
                likely most people will see numbers near its theoretical limits,
                simply because that number has so many caveats associated with it -
                and small frames are its nemesis. Since you are using very small
                datagrams, you are wasting a lot of potential throughput. And if you
                have other computers on your network, the situation is made yet more
                difficult. Additionally, many switches and/or routers also have
                bandwidth limits which may or may not pose a wall for your
                application. And to make matters worse, you are allocating large
                buffers (4K) to send/receive 90 bytes of data, creating yet more work
                for your computer.

                Options to try:
                - See how TCP measures up for you.
                - Attempt to place multiple data objects within a single datagram,
                  thereby optimizing available ethernet bandwidth (see the sketch
                  after this list).
                - You didn't say if you are CPU-bound, but you are creating a tuple
                  and appending it to a list on every datagram. You may find that
                  allocating smaller buffers and optimizing your history accounting
                  helps if you're CPU-bound.
                - Don't forget, localhost does not suffer from frame limits - it's
                  basically testing your memory/bus speed.
                - If this is for local use only, consider using a different IPC
                  mechanism - unix domain sockets or memory mapped files.
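
                (A rough illustration of the batching idea: packing several 90-byte
                records into one datagram. The record contents, address, port and
                per-datagram budget are all made-up examples, not from this thread.)

                import socket

                MTU_PAYLOAD = 1400               # conservative per-datagram budget
                RECORD = 'x' * 90                # stand-in for one 90-byte data object
                DEST = ('192.168.0.10', 21758)   # example address and port

                sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                batch = []
                for _ in range(1000):            # pretend we have 1000 records to ship
                    batch.append(RECORD)
                    if (len(batch) + 1) * len(RECORD) > MTU_PAYLOAD:
                        sock.sendto(''.join(batch), DEST)   # one datagram, many records
                        batch = []
                if batch:
                    sock.sendto(''.join(batch), DEST)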


                • Krzysztof Retel

                  #9
                  Re: Problem with writing fast UDP server

                  On Nov 21, 5:49 am, Greg Copeland <gtcopel...@gmail.com> wrote:
                  > [previous post quoted in full; snipped]
                  Greg, thanks very much for your reply.
                  I am not sure what you mean by CPU-bound. How can I find out
                  whether it is CPU-bound?

                  May I also ask you for a list of references about sockets and
                  networking? I just want to develop my knowledge of networking.

                  Cheers
                  K


                  • Peter Pearson

                    #10
                    Re: Problem with writing fast UDP server

                    On Fri, 21 Nov 2008 08:14:19 -0800 (PST), Krzysztof Retel wrote:
                    > I am not sure what you mean by CPU-bound. How can I find out
                    > whether it is CPU-bound?
                    CPU-bound is the state in which performance is limited by the
                    availability of processor cycles. On a Unix box, you might
                    run the "top" utility and look to see whether the "%CPU" figure
                    indicates 100% CPU use. Alternatively, you might have a
                    tool for plotting use of system resources.
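
                    (For instance, something along these lines on Linux, where the
                    script name is only an example:)

                    top -p $(pgrep -f udp_server.py)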

                    --
                    To email me, substitute nowhere->spamcop, invalid->net.


                    • Krzysztof Retel

                      #11
                      Re: Problem with writing fast UDP server

                      On Nov 21, 4:48 pm, Peter Pearson <ppear...@nowhere.invalid> wrote:
                      > CPU-bound is the state in which performance is limited by the
                      > availability of processor cycles.
                      > [rest of explanation snipped]
                      Thanks. I ran it, and it is not CPU-bound.


                      • Greg Copeland

                        #12
                        Re: Problem with writing fast UDP server

                        On Nov 21, 11:05 am, Krzysztof Retel <Krzysztof.Re...@googlemail.com>
                        wrote:
                        > [explanation of CPU-bound snipped]
                        >
                        > Thanks. I ran it, and it is not CPU-bound.
                        With clearer eyes, I did confirm my math above is correct. I don't
                        have a networking reference to provide. You'll likely have some good
                        results via Google. :)

                        If you are not CPU-bound, you are likely IO-bound. That means your
                        computer is waiting for IO to complete - likely on the sending side.
                        In this case, it likely means you have reached the ethernet bandwidth
                        limits available to your computer. Since you didn't correct me when I
                        assumed you're running 10Mb ethernet, I'll continue to treat that as a
                        safe assumption. So, assuming you are running on 10Mb ethernet, try
                        converting your application to use TCP. I'd bet, unless you have
                        requirements which prevent its use, you'll suddenly have enough
                        bandwidth (in this case, frames) to achieve your desired results.

                        This is untested and off the top of my head but it should get you
                        pointed in the right direction pretty quickly. Make the following
                        changes to the server:

                        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                        to
                        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

                        Make this:
                        print "Waiting for first packet to arrive...",
                        sock.recvfrom(BUFSIZE)

                        look like:
                        print "Waiting for first packet to arrive...",
                        cliSock, cliAddr = sock.accept()   # accept() returns (socket, address)

                        Change your calls to sock.recvfrom(BUFSIZE) to cliSock.recv(BUFSIZE).
                        Notice the change to "cliSock".

                        Keep in mind TCP is stream based, not datagram based, so you may need
                        to add additional logic to determine data boundaries for re-assembly
                        of your data on the receiving end. There are several strategies to
                        address that, but for now I'll gloss over it.
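
                        (One common framing strategy, sketched purely as an illustration and
                        with made-up helper names, is to prefix each message with its length:)

                        import struct

                        def send_msg(conn, data):
                            # prefix each message with a 4-byte big-endian length
                            conn.sendall(struct.pack("!I", len(data)) + data)

                        def recv_msg(conn):
                            # read the 4-byte length, then exactly that many payload bytes
                            (length,) = struct.unpack("!I", recv_exactly(conn, 4))
                            return recv_exactly(conn, length)

                        def recv_exactly(conn, n):
                            chunks = []
                            while n > 0:
                                chunk = conn.recv(n)
                                if not chunk:
                                    raise EOFError("connection closed mid-message")
                                chunks.append(chunk)
                                n -= len(chunk)
                            return "".join(chunks)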

                        As someone else pointed out above, change your calls to time.clock()
                        to time.time().

                        On your client, make the following changes.
                        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                        to
                        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                        sock.connect((remotehost, port))

                        nbytes = sock.sendto(data, (remotehost, port))
                        to
                        nbytes = sock.send(data)
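
                        (Putting the server-side changes together, a rough sketch of the
                        resulting receive loop is shown below. The bind/listen lines and the
                        port number are additions for completeness, not from this thread.)

                        import socket
                        import time

                        BUFSIZE = 4096
                        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                        sock.bind(('', 21758))
                        sock.listen(1)                  # required before accept()
                        print "Waiting for a client to connect...",
                        cliSock, cliAddr = sock.accept()
                        print "ok"
                        history = []
                        t0 = time.time()
                        while 1:
                            data = cliSock.recv(BUFSIZE)
                            t1 = time.time()
                            if not data:                # client closed the connection
                                break
                            history.append((len(data), t1 - t0))
                            t0 = t1
                        cliSock.close()
                        sock.close()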

                        Now, rerun your tests on your network. I expect you'll be faster now
                        because TCP can be pretty smart about buffering. Let's say you write
                        16 90-byte blocks to the socket. If they are timely enough, it is
                        possible all of them will be shipped across ethernet as a single
                        frame. So what took 16 frames via UDP can now *potentially* be done in
                        a single ethernet frame (assuming a 1500-byte MTU). I say potentially
                        because the exact behaviour is OS/stack and NIC-driver specific, and
                        is often tunable to boot. Likewise, on the receiving end, what
                        previously required 16 calls to recvfrom, each returning 90B, can
                        *potentially* be completed in a single call to recv, returning 1440B.
                        Remember, fewer frames means less protocol overhead, which makes more
                        bandwidth available to your applications. When sending 90B datagrams,
                        you're wasting over 48% of your available bandwidth because of
                        protocol overhead (actually a lot more, because I'm not accounting for
                        UDP headers).

                        Because of the differences between UDP and TCP, unlike your original
                        UDP implementation which can receive from multiple clients, the TCP
                        implementation can only receive from a single client. If you need to
                        receive from multiple clients concurrently, look at python's select
                        module to take up the slack.
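
                        (A bare-bones select-based loop that handles several clients, with an
                        example port, might look roughly like this:)

                        import select
                        import socket

                        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                        srv.bind(('', 21758))               # example port
                        srv.listen(5)
                        sockets = [srv]
                        while True:
                            readable, _, _ = select.select(sockets, [], [])
                            for s in readable:
                                if s is srv:
                                    conn, addr = srv.accept()   # new client
                                    sockets.append(conn)
                                else:
                                    data = s.recv(4096)
                                    if not data:                # client disconnected
                                        sockets.remove(s)
                                        s.close()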

                        Hopefully you'll be up and running. Please report back your findings.
                        I'm curious as to your results.

