The Cephes Mathematical Library

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • jacob navia

    The Cephes Mathematical Library

    The cephes math library is a legend in the internet. It was
    written by Stephen L. Moshier in 1984, but there was apparently
    an even earlier version in assembler, before he rewrote that
    in C. It is used in many systems, for instance in the python
    language.

    I started working with this library around 1997 or whereabouts.
    I rewrote the core of it in assembler, fixed some (minor)
    bugs, and added several functions like the Lambert's W
    function and others. I rewrote all the functions needed for
    replicating all math.h

    This library is at the center of the qfloat data type in lcc-win
    that was one of the first concrete applications of operator
    overloading and convinced me that I was in the right track.

    Basically, the library works around

    #define _NQ_ 12
    typedef struct qfloatstruct {
    int sign;
    int exponent;
    unsigned int mantissa[_NQ_];
    }qfloat;

    For a 32 bit system, this structure can be seen as an array
    of 14 32 bit integers, with array[0] the sign, array[1] the
    exponent, and array[2..13] the mantissa.

    Within the mantissa, the first array position is always zero
    and is used to hold bits that overflow from the higher
    positions during the calculations.

    When doing the four operations, the software expands this
    structure by adding a zero to the end, to hold overflows in the
    other direction.

    I maintained this structure within the assembler core, and
    essentially the 32 bit version uses the above description as is.

    When preparing the 64 bit version however, I decided to
    optimize the data layout.

    1) The extra zero will no longer be stored in the number
    but only stored in the temporary forms when actually
    doing the calculations. Numbers will be expanded before
    any of the four operations, then packed afterwards.

    2) The size of the numbers must be a multiple of 2 to better
    use the cache lines. A size of 64 bytes is a sensible choice.

    All this leads naturally to the following structure
    #define MANTISSA_LENGTH 7
    typedef struct tagQfloat {
    int sign;
    unsigned int exponent;
    unsigned long long mantissa[MANTISSA_LENGTH];
    } Qfloat;

    This allows for a precision of 448 bits, with 132 decimal
    digits.

    Obviously it would be too expensive to pass a 64 byte
    structure by value for each function call, so I decided
    to use references everywhere. The trick I used is that
    instead of declaring variables like this:
    Qfloat a,b,c;
    I declare them like this:
    Qfloat a[1],b[1],c[1];

    This has the highly beneficial side effect that when
    you call some function like
    qadd(a,b,c);
    to add a+b and put the result in c, the numbers are
    automatically passed by reference without having to use the
    ugly
    qadd(&a,&b,&c);
    notation and maintaining the bulk of the library
    code intact.

    This 64 bit version has more or less the same speed as the 32 bit
    version, even if it has around 25% more precision (132 digits
    instead of 105). Obviously reducing the number of loops by
    half (mantissa is not 14 but 7 elements wide) offset the
    extra precision work.

    This new library will be shipped with the 64 bit version of lcc-win.

    The purpose of this message is to discuss the data structures and the
    choices I did. Obviously this is a serious implementation
    of floating point, not just an academic exercise where the numbers
    are represented just by "unsigned char" or similar solutions.


    --
    jacob navia
    jacob at jacob point remcomp point fr
    logiciels/informatique

  • jacob navia

    #2
    Re: The Cephes Mathematical Library

    jacob navia wrote:
    The cephes math library is a legend in the internet. It was
    written by Stephen L. Moshier in 1984, but there was apparently
    an even earlier version in assembler, before he rewrote that
    in C. It is used in many systems, for instance in the python
    language.
    >
    Sorry. After I posted that, I found following comment in
    qflti.c
    /*
    * Revision history:
    *
    * SLM, 5 Jan 84 PDP-11 assembly language version
    * SLM, 2 Mar 86 fixed bug in asctoq()
    * SLM, 6 Dec 86 C language version
    *
    */

    SLM is Stephen L. Moshier

    The URL for the version of Mr Moshier is:




    --
    jacob navia
    jacob at jacob point remcomp point fr
    logiciels/informatique

    Comment

    • Eric Sosman

      #3
      Re: The Cephes Mathematical Library

      jacob navia wrote:
      [...]
      The purpose of this message is to discuss the data structures and the
      choices I did. Obviously this is a serious implementation
      of floating point, not just an academic exercise where the numbers
      are represented just by "unsigned char" or similar solutions.
      Okay, I'll bite: Does this discussion have anything to do
      with C, or is it about implementation strategies for a library
      of assembly code?

      There seemed to be three C-ish things in your message:

      1) Some assumptions about how a compiler pads (or rather
      doesn't pad) struct objects. I guess the assumptions
      are correct for your compiler, but they also seem
      gratuitous: The operations are described in terms of
      arrays, so why not just use arrays?

      2) A trick for avoiding an address-of operator in function
      calls, by declaring the variables as one-element arrays.
      The price paid is that simple assignment now needs extra
      syntax: `x = y' becomes `x[0] = y[0]'. TANSTAAFL.

      3) I thought there was a third thing, but now I can't find it.

      So: You've done something of interest to lcc users and maybe
      to people who need in high-precision floating-point, and you
      should feel good about having done so. But I fail to understand
      why the implementation details of a library of assembly code are
      a fit topic for a forum devoted to a different language.

      Follow-ups set to comp.compilers. lcc.

      --
      Eric.Sosman@sun .com

      Comment

      • jacob navia

        #4
        Re: The Cephes Mathematical Library

        Eric Sosman wrote:
        jacob navia wrote:
        >[...]
        >The purpose of this message is to discuss the data structures and the
        >choices I did. Obviously this is a serious implementation
        >of floating point, not just an academic exercise where the numbers
        >are represented just by "unsigned char" or similar solutions.
        >
        Okay, I'll bite: Does this discussion have anything to do
        with C, or is it about implementation strategies for a library
        of assembly code?
        >
        The core of the library is written in C, with the four operations
        (add, subtract and multiply, + shift in assembler). I do not
        see why you think the library is in assembler.
        There seemed to be three C-ish things in your message:
        >
        1) Some assumptions about how a compiler pads (or rather
        doesn't pad) struct objects. I guess the assumptions
        are correct for your compiler, but they also seem
        gratuitous: The operations are described in terms of
        arrays, so why not just use arrays?
        >
        The problem is that arrays must be homogeneous. The exponent and the
        sign are 32 bit, the mantissa is 64 bit. If you find a C array that
        will do that I would take it immediately of course.

        Nowhere in the code a specific layout is assumed. Precisely this
        is the object of using a structure. I do not see where do you see
        anything that assumes a specific layout...

        Obviously, if we have
        struct qfloat{ int32_t sign;int32_t exponent;
        long long mantissa[7]};
        and the compiler aligns each 32 bit integer into a 64 bit slot
        we will have a waste of memory locations but the library will work
        Obviously the assembler routines would need to be modified to
        fit the compiler layout but I see no problem with this.

        2) A trick for avoiding an address-of operator in function
        calls, by declaring the variables as one-element arrays.
        The price paid is that simple assignment now needs extra
        syntax: `x = y' becomes `x[0] = y[0]'. TANSTAAFL.
        >
        There isn't a single assignment
        qfloat a,b;
        a=b

        in the whole library. Why?

        Precisely because the library declared the numbers as arrays. For
        unknown hysterical reasons arrays can't be assigned to.

        NOW that I have rewritten this library to use structures, the assignment
        as you say is possible in the source code. Within the library source
        code the function
        qmov(a,b); //Move a into b
        is used.
        3) I thought there was a third thing, but now I can't find it.
        >
        So: You've done something of interest to lcc users and maybe
        to people who need in high-precision floating-point, and you
        should feel good about having done so. But I fail to understand
        why the implementation details of a library of assembly code are
        a fit topic for a forum devoted to a different language.
        The cephes library is written in C. I made that very clear in my
        message but maybe not clear enough for you.

        Besides the extra precision, the library offers many special functions
        in an open implementation. The list of functions is very long, with
        elliptical integrals, statistics, physics, and many other functions like
        Riemann zeta, Bessel functions, gamma, etc.

        All that is written in C, for 32 bit, 64 bit and 448 bit precision.
        This is a well known library, and I thought that we could discuss it in
        this group.

        >
        Follow-ups set to comp.compilers. lcc.
        >

        --
        jacob navia
        jacob at jacob point remcomp point fr
        logiciels/informatique

        Comment

        • Dann Corbit

          #5
          Re: The Cephes Mathematical Library

          "jacob navia" <jacob@nospam.c omwrote in message
          news:g3rpfb$orm $1@aioe.org...
          Eric Sosman wrote:
          >jacob navia wrote:
          >>[...]
          >>The purpose of this message is to discuss the data structures and the
          >>choices I did. Obviously this is a serious implementation
          >>of floating point, not just an academic exercise where the numbers
          >>are represented just by "unsigned char" or similar solutions.
          >>
          > Okay, I'll bite: Does this discussion have anything to do
          >with C, or is it about implementation strategies for a library
          >of assembly code?
          >>
          >
          The core of the library is written in C, with the four operations
          (add, subtract and multiply, + shift in assembler). I do not
          see why you think the library is in assembler.
          >
          > There seemed to be three C-ish things in your message:
          >>
          > 1) Some assumptions about how a compiler pads (or rather
          > doesn't pad) struct objects. I guess the assumptions
          > are correct for your compiler, but they also seem
          > gratuitous: The operations are described in terms of
          > arrays, so why not just use arrays?
          >>
          >
          The problem is that arrays must be homogeneous. The exponent and the
          sign are 32 bit, the mantissa is 64 bit. If you find a C array that
          will do that I would take it immediately of course.
          >
          Nowhere in the code a specific layout is assumed. Precisely this
          is the object of using a structure. I do not see where do you see
          anything that assumes a specific layout...
          >
          Obviously, if we have
          struct qfloat{ int32_t sign;int32_t exponent;
          long long mantissa[7]};
          and the compiler aligns each 32 bit integer into a 64 bit slot
          we will have a waste of memory locations but the library will work
          Obviously the assembler routines would need to be modified to
          fit the compiler layout but I see no problem with this.
          >
          >
          > 2) A trick for avoiding an address-of operator in function
          > calls, by declaring the variables as one-element arrays.
          > The price paid is that simple assignment now needs extra
          > syntax: `x = y' becomes `x[0] = y[0]'. TANSTAAFL.
          >>
          >
          There isn't a single assignment
          qfloat a,b;
          a=b
          >
          in the whole library. Why?
          >
          Precisely because the library declared the numbers as arrays. For
          unknown hysterical reasons arrays can't be assigned to.
          >
          NOW that I have rewritten this library to use structures, the assignment
          as you say is possible in the source code. Within the library source
          code the function
          qmov(a,b); //Move a into b
          is used.
          So are you suggesting to change to struct assignment instead? I guess that
          there will be no practical difference in speed or clarity.
          > 3) I thought there was a third thing, but now I can't find it.
          >>
          > So: You've done something of interest to lcc users and maybe
          >to people who need in high-precision floating-point, and you
          >should feel good about having done so. But I fail to understand
          >why the implementation details of a library of assembly code are
          >a fit topic for a forum devoted to a different language.
          >
          The cephes library is written in C. I made that very clear in my
          message but maybe not clear enough for you.
          >
          Besides the extra precision, the library offers many special functions
          in an open implementation. The list of functions is very long, with
          elliptical integrals, statistics, physics, and many other functions like
          Riemann zeta, Bessel functions, gamma, etc.
          >
          All that is written in C, for 32 bit, 64 bit and 448 bit precision.
          This is a well known library, and I thought that we could discuss it in
          this group.
          It is an interesting project. I guess that it would be better discussed on
          news:comp.progr amming or the lcc group or even news:sci.math.n um-analysis,
          but certain aspects would no doubt be topical here.



          ** Posted from http://www.teranews.com **

          Comment

          • Eric Sosman

            #6
            Re: The Cephes Mathematical Library

            jacob navia wrote:
            Eric Sosman wrote:
            >jacob navia wrote:
            >>[...]
            >>The purpose of this message is to discuss the data structures and the
            >>choices I did. Obviously this is a serious implementation
            >>of floating point, not just an academic exercise where the numbers
            >>are represented just by "unsigned char" or similar solutions.
            >>
            > Okay, I'll bite: Does this discussion have anything to do
            >with C, or is it about implementation strategies for a library
            >of assembly code?
            >>
            The core of the library is written in C, with the four operations
            (add, subtract and multiply, + shift in assembler). I do not
            see why you think the library is in assembler.
            Sorry; I was misled by "I rewrote the core of it in
            assembler." Obviously, that means it's written in C.
            > There seemed to be three C-ish things in your message:
            >>
            > 1) Some assumptions about how a compiler pads (or rather
            > doesn't pad) struct objects. I guess the assumptions
            > are correct for your compiler, but they also seem
            > gratuitous: The operations are described in terms of
            > arrays, so why not just use arrays?
            >>
            The problem is that arrays must be homogeneous. The exponent and the
            sign are 32 bit, the mantissa is 64 bit. If you find a C array that
            will do that I would take it immediately of course.
            >
            Nowhere in the code a specific layout is assumed. Precisely this
            is the object of using a structure. I do not see where do you see
            anything that assumes a specific layout...
            You're probably right, since you've seen the code and
            no one else has. I was relying on the description: "this
            structure can be seen as an array of 14 32 bit integers, with
            array[0] the sign, array[1] the exponent, and array[2..13]
            the mantissa." Silly me.
            > 2) A trick for avoiding an address-of operator in function
            > calls, by declaring the variables as one-element arrays.
            > The price paid is that simple assignment now needs extra
            > syntax: `x = y' becomes `x[0] = y[0]'. TANSTAAFL.
            >>
            There isn't a single assignment
            qfloat a,b;
            a=b
            >
            in the whole library. Why?
            >
            Precisely because the library declared the numbers as arrays.
            ... thus assuming the "specific layout" earlier denied.
            NOW that I have rewritten this library to use structures, the assignment
            as you say is possible in the source code. Within the library source
            code the function
            qmov(a,b); //Move a into b
            is used.
            If simple assignment works, why call a function? Seriously,
            Jacob, this baffles me. You introduce the trick of declaring
            variables as `Qfloat a[1], b[1], c[1];' (your own example) so
            you can write `qadd(a,b,c);' instead of `qadd(&a,&b,&c) ;' (again,
            your example), which requires you to use `qmov(a,b);' instead of
            `b = a;'. I'd have suggested `b[0] = a[0];' but if you prefer
            qmov() it's all one to me.

            Also, I protest this "dribbling out" of detail: Your original
            message made no mention at all of qmov(), nor of anything like it.
            You can hardly expect the rest of us to be privy to unrevealed
            implementation details known to you alone.
            The cephes library is written in C. I made that very clear in my
            message but maybe not clear enough for you.
            I guess the fellow who said he'd translated it into assembly
            was wrong.
            > Follow-ups set to comp.compilers. lcc.
            Follow-ups once again set to comp.compilers. lcc.

            --
            Eric Sosman
            esosman@ieee-dot-org.invalid

            Comment

            Working...