Determine the best buffer sizes when using socket.send() and socket.recv()

This topic is closed.
  • Giampaolo Rodola'

    Determine the best buffer sizes when using socket.send() and socket.recv()

    Hi,
    I'd like to know whether there is a way to determine the best
    buffer size to use when you send() and recv() data over
    the network.
    I have an FTP server application which, on the data channel, uses an
    8192-byte buffer for both incoming and outgoing data.
    Some time ago I received a report from a user [1] who stated that
    changing the buffers from 8192 to 4096 bytes resulted in a drastic speed
    improvement.
    I ran some tests with different buffer sizes, from 4 KB
    to 256 KB, but I'm not sure which one to use as the default in my
    application, since I noticed the results vary across OSes.
    Is there a recommended way to determine the best buffer size to use?

    Thanks in advance

    [1] http://groups.google.com/group/pyftp...thread/f13a82b...


    --- Giampaolo


  • Greg Copeland

    #2
    Re: Determine the best buffer sizes when using socket.send() and socket.recv()

    As you stated, the answer is OS/stack dependent. Regardless,
    I believe you'll likely find the best answer is between 16K and 64K. Once
    you consider the various TCP stack improvements which are now
    available and the rapid increase in available bandwidth, you'll likely
    want to use the largest buffers which do not impose scalability issues
    for your system/application. Unless you have a reason to use a smaller
    buffer, use 64K buffers and be done with it. This helps minimize the
    number of context switches and helps ensure the stack always has data
    to keep pumping.
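
    A minimal sketch of what reading with a 64K buffer could look like. The
    names BUFFER_SIZE, receive_all and demo are made up for illustration and
    are not taken from pyftpdlib; the demo uses a local socket.socketpair()
    in place of a real FTP data channel.

```python
import socket
import threading

BUFFER_SIZE = 64 * 1024  # 64 KiB, the size suggested above (illustrative default)


def receive_all(sock, bufsize=BUFFER_SIZE):
    """Read from sock until the peer closes; return (data, number of recv calls)."""
    chunks = []
    calls = 0
    while True:
        chunk = sock.recv(bufsize)  # returns at most bufsize bytes, possibly fewer
        if not chunk:               # empty bytes: the peer closed the connection
            break
        chunks.append(chunk)
        calls += 1
    return b"".join(chunks), calls


def demo(payload_size=1024 * 1024):
    """Push payload_size bytes through a local socket pair and read them back."""
    a, b = socket.socketpair()
    payload = b"x" * payload_size

    def sender():
        a.sendall(payload)  # sendall() loops until everything is written
        a.close()           # closing signals end-of-stream to the reader

    t = threading.Thread(target=sender)
    t.start()
    data, calls = receive_all(b)
    t.join()
    b.close()
    return len(data), calls
```

    Note that recv(bufsize) only sets an upper bound: the call may return
    fewer bytes, so the number of calls actually made depends on how the
    stack delivers the data.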

    To look at it another way, using 64K buffers instead of 8K buffers
    requires 1/8 the number of system calls and less time actually spent
    in Python code.
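
    The arithmetic behind that ratio, as a small sketch. The 100 MiB
    transfer size is a hypothetical figure, and the counts are a lower
    bound that assumes every call moves a full buffer.

```python
import math


def calls_needed(total_bytes, bufsize):
    """Lower bound on send()/recv() calls, assuming each call moves a full buffer."""
    return math.ceil(total_bytes / bufsize)


FILE_SIZE = 100 * 1024 * 1024  # hypothetical 100 MiB transfer

for size in (4096, 8192, 65536):
    print(size, calls_needed(FILE_SIZE, size))
# 4096 25600
# 8192 12800
# 65536 1600
```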

    If, as you say, someone actually observed a performance improvement when
    changing from 8K buffers to 4K buffers, it likely has something to do
    with Python's buffer allocation overhead, but even that seems contrary
    to my expectation. The referenced article was not available to me, so I
    was not able to read it.

    Another possibility is that 4K buffers require less fragmentation and are
    likely to perform better on lossy connections. Is it possible he/she
    was testing on a highly lossy connection? In short, performance-wise,
    TCP stinks on lossy connections.


    • Giampaolo Rodola'

      #3
      Re: Determine the best buffer sizes when using socket.send() and socket.recv()

      Thanks for the valuable advice.
      The discussion I was talking about is this one (sorry for the broken
      link, I didn't notice that): http://groups.google.com/group/pyftp...thread/f13a82b...


      --- Giampaolo


      • Greg Copeland

        #4
        Re: Determine the best buffer sizes when using socket.send() and socket.recv()

        I read the provided link. There really isn't enough information to
        explain what he observed. It is safe to say that his report is contrary
        to common performance expectations and to my own experience. Since he
        also reported large swings in bandwidth far below his potential maximum,
        I'm inclined to say he was suffering from some type of network
        abnormality. To be clear, that's just a guess. For all we know, some
        script kiddie was attempting to scan/hack his system at that
        time, or any number of other variables were in play. One can only make
        wild assumptions about his operating environment, and it's not even
        clear whether his results are reproducible. Lastly, keep in mind that
        many people do not know how to properly benchmark simple applications,
        let alone accurately measure bandwidth.

        Keep in mind that Python can typically saturate a 10Mb link even on
        fairly low-end systems, so it's not likely your application was his
        problem. For now, use large buffers unless you can prove otherwise.
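
        One way to "prove otherwise" is to measure. Below is a rough sketch
        that compares buffer sizes over a local socket.socketpair(). The
        function names and the 32 MiB payload are assumptions made for this
        example, and a loopback test says nothing about lossy or
        high-latency links, so treat the numbers as a starting point only.

```python
import socket
import threading
import time


def measure_throughput(bufsize, total_bytes=32 * 1024 * 1024):
    """Send total_bytes through a local socket pair in bufsize chunks; return MiB/s."""
    sender, receiver = socket.socketpair()
    chunk = b"\0" * bufsize

    def pump():
        sent = 0
        while sent < total_bytes:
            n = min(bufsize, total_bytes - sent)
            sender.sendall(chunk[:n])
            sent += n
        sender.close()  # lets the reader see end-of-stream

    thread = threading.Thread(target=pump)
    start = time.perf_counter()
    thread.start()
    received = 0
    while True:
        data = receiver.recv(bufsize)
        if not data:
            break
        received += len(data)
    elapsed = time.perf_counter() - start
    thread.join()
    receiver.close()
    return received / (1024 * 1024) / elapsed


def compare(sizes=(4096, 8192, 16384, 65536), repeats=3):
    """Return {bufsize: best MiB/s over `repeats` runs} to damp run-to-run noise."""
    return {size: max(measure_throughput(size) for _ in range(repeats))
            for size in sizes}
```

        Calling compare() on each target OS and repeating the runs gives a
        per-platform picture; results will differ between machines and
        between runs.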
