strong/weak typing and pointers

  • Alex Martelli

    #61
    Re: strong/weak typing and pointers

    Steven Bethard <steven.bethard@gmail.com> wrote:
    ...
    > I'm obviously upsetting you, and I can see that we're still not quite
    > understanding each other. I have to assume that you're not the only one I'm
    > upsetting through these misunderstandings, so for the sake of the list, I'll
    > stop responding to this thread. Thanks everyone for a good discussion!

    I apologize if I have given the impression of being upset. I am, in a
    way, I guess -- astonished and nonplussed, as if somebody asked me to
    justify the existence of bread -- not of some exotic food, mind you, but
    of the most obvious, elementary, fundamental substance of earthly
    sustenance (in my culture, and many others around it).

    > P.S. If anyone would like to know my response to the float representation
    > example, please contact me directly instead.

    I promise not to ACT upset if you explain it here. So, we have an area
    of 8 bytes in memory which we need to be able to treat as:
    8 bytes, for I/O purposes, say;
    a float, to feed it to some specialized register, say;
    a bit indicating sign plus 15 for exponent plus 48 for significand,
    or the like, to perform masking and shifting thereof in SW -- a
    structure of three odd-bit-sized integers juxtaposed;
    and this is ONE example -- the specific one you had asked for.
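
A minimal sketch of the three views in Python, where the reinterpretation has to go through a packed copy. It assumes the common IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 significand bits) rather than the illustrative widths above, and `decompose_double` is a name invented for this example.

```python
import struct

def decompose_double(x):
    """Split a float into (sign, biased exponent, fraction), assuming IEEE 754 binary64."""
    # View 1: the raw 8 bytes, as one would use for I/O (big-endian here).
    raw = struct.pack(">d", x)
    # View 2: the same 8 bytes read back as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", raw)
    # View 3: three odd-sized integer fields, extracted by shifting and masking.
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)    # 52 bits
    return sign, exponent, fraction

sign, exponent, fraction = decompose_double(-0.15625)
```

Since -0.15625 is -1.25 * 2**-3, the biased exponent comes out as 1020 and the fraction as 2**50.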

    Another example: we're going to send a controlblock of 64 bytes to some
    HW peripheral, and get it back perhaps with some mods -- a typical
    control/status arrangement. Depending on the top 2 (or in some case 4)
    bytes' value, the structure may need to be interpreted in several
    possible ways, in terms of juxtaposition of characters, halfwords and
    longwords. Again, the driver responsible for talking with this
    peripheral needs to be able to superimpose on the 64 bytes any of
    several possible C-level struct's -- the cleanest way to do this would
    appear to be pointer-casting, though unions would (as usual, of course)
    be essentially equivalent. In Python, or another language that lets me
    pack and unpack a struct to/from bytes in a controlled way (in Python's
    case via the struct module) I can do that through a _copy_ -- I need to
    go through a 'raw bytes' stage, cannot do the overlay directly; but
    that's little more than a figleaf arrangement -- spending real CPU and
    RAM operations because I can't be lowlevel/weakly-typed enough.
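
For comparison, a hedged sketch of both routes in Python. The `StatusBlock` layout (a 2-byte type code, a 2-byte status, 60 payload bytes) is invented for illustration and does not describe any real peripheral. `struct.unpack_from` is the copying route; `ctypes.Structure.from_buffer` gets close to a direct overlay because it shares memory with the buffer.

```python
import ctypes
import struct

class StatusBlock(ctypes.Structure):
    """Hypothetical 64-byte control block layout (native byte order)."""
    _fields_ = [
        ("kind", ctypes.c_uint16),
        ("status", ctypes.c_uint16),
        ("payload", ctypes.c_uint8 * 60),
    ]

block = bytearray(64)                    # stands in for the buffer the device fills
struct.pack_into("=HH", block, 0, 1, 7)  # pretend the hardware wrote kind=1, status=7

# Copying route: new Python objects are built from the bytes.
kind, status = struct.unpack_from("=HH", block, 0)

# Overlay route: the structure shares memory with `block`, nothing is copied.
view = StatusBlock.from_buffer(block)
view.status = 0x0102                     # writes straight through to the bytearray
```

After the last line, re-reading `block` with `struct.unpack_from` shows the new status, which is what distinguishes an overlay from a copy.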


    Alex


    • Alex Martelli

      #62
      Re: Summary: strong/weak typing and pointers

      Steven Bethard <steven.bethard@gmail.com> wrote:
      ...
      > I wonder what people think about Ruby, which, I understand, does allow you to
      > modify builtins. Can anyone tell me if you could make Ruby strings do the
      > horrible coercion that PHP strings do?

      Yes, you could. Reliable Ruby friends tell me that's not DONE in the
      real world of Ruby, any more than pythonistas call their methods' first
      argument 'foo' rather than 'self' or pepper their code with 'exec'
      statements or code 200-chars nested-lambda oneliners. But though
      culturally frowned on, it _is_ technically possible.

      The one real example I saw, which was enough to turn me off my quest to
      explore Ruby for production purposes, was making (builtin) string
      comparisons case-insensitive -- apparently that _IS_ the kind of thing
      _SOME_ perhaps-inexperienced Rubystas _DO_ perpetrate (breaking library
      modules left, right, and center, of course). Maybe it's similar to
      rather inexperienced Pythonistas dead keen on "exec myname+'='+value"; I
      _have_ seen that horror perpetrated in real Python code (doesn't break
      any library, but slows function execution down by 10 times w/o any real
      advantage wrt dicts or bunch usage, and is a bug-prone piece too...).
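
To make the contrast concrete, here is a small sketch in present-day Python 3 syntax (where `exec` is a function, not a statement). The names `name`, `value`, `scope` and `settings` are made up for the example.

```python
name, value = "threshold", 42

# The anti-pattern: build source text at run time and exec it to bind a name.
scope = {}
exec(name + "=" + repr(value), scope)

# The plain alternative: a name computed at run time is just a dict key.
settings = {}
settings[name] = value
```

Both routes end up with 42 stored under "threshold", but the dict version needs no code generation and cannot be tripped up by a value whose text is not valid source.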


      Alex


      • Michael Hobbs

        #63
        Re: strong/weak typing and pointers

        Steven Bethard <steven.bethard@gmail.com> wrote:
        > My point here is that I think in most code, even when people do a bunch of
        > bit-twiddling, they have a single underlying structure in mind, and therefore
        > you see them treat the bits as one of two things: (1) The sequence of bits, i.e.
        > the untyped memory block, or (2) the intended structure. IMHO, an example of
        > taking advantage of weak-typing would be a case where you treat the bits as
        > three different things: the sequence of bits, and two (mutually exclusive)
        > intended structures.

        One word: union


        • Steven Bethard

          #64
          Re: strong/weak typing and pointers

          Alex Martelli <aleaxit <at> yahoo.com> writes:
          >
          > I apologize if I have given the impression of being upset.

          No problem -- my mistake for misinterpreting you. I'm just sensitive to these
          kinds of things because I know I've previously miscommunicated, and
          unintentionally got people upset before (you being one of them). ;)

          > I am, in a
          > way, I guess -- astonished and nonplussed, as if somebody asked me to
          > justify the existence of bread -- not of some exotic food, mind you, but
          > of the most obvious, elementary, fundamental substance of earthly
          > sustenance (in my culture, and many others around it).

          Yeah, this goes to the heart of the misunderstanding. I'm not asking anyone to
          justify the _existence_ of weak-typing. Weak-typing is a direct result of a
          language's support for untyped (bit/byte) data. I agree 100% that this sort of
          data is not only useful, but often essential in any low-level (e.g. OS, hardware
          driver, etc.) code.

          > So, we have an area
          > of 8 bytes in memory which we need to be able to treat as:
          > 8 bytes, for I/O purposes, say;
          > a float, to feed it to some specialized register, say;
          > a bit indicating sign plus 15 for exponent plus 48 for significand,
          > or the like, to perform masking and shifting thereof in SW -- a
          > structure of three odd-bit-sized integers juxtaposed;

          As a quick refresher, I quote myself in what I was looking for:
          "taking advantage of weak-typing would be a case where you treat the bits as
          three different things: the sequence of bits, and two (mutually exclusive)
          intended structures."

          My response to this example is that your two intended structures are not
          mutually exclusive. Yes, you have to do some bit-twiddling, but only because
          your float struct doesn't have get_sign, get_exponent and get_significand
          methods. ;) You're still dealing with the same representation, not converting
          to a different type. You're just addressing a lower level part of the
          representation.

          I can see the point though: at least in most of the languages I'm familiar with,
          float is declared as a type while there's no subtype of float that specifies the
          sign, exponent and significand.

          (Oh, and by the way, in case you really were wondering, they still do teach
          float representations, even in computer science (as opposed to computer
          engineering), or at least they did through 1999.)

          > Another example: we're going to send a controlblock of 64 bytes to some
          > HW peripheral, and get it back perhaps with some mods -- a typical
          > control/status arrangement. Depending on the top 2 (or in some case 4)
          > bytes' value, the structure may need to be interpreted in several
          > possible ways, in terms of juxtaposition of characters, halfwords and
          > longwords. Again, the driver responsible for talking with this
          > peripheral needs to be able to superimpose on the 64 bytes any of
          > several possible C-level struct's -- the cleanest way to do this would
          > appear to be pointer-casting, though unions would (as usual, of course)
          > be essentially equivalent.

          Is the interpretation of the controlblock uniquely defined by the top 2 or 4
          bytes, or are there some values for the top 2 or 4 bytes for which I have to
          apply two different interpretations (C-level structs) to the same sequence of
          bits?

          If the top 2 or 4 bytes uniquely define the structs, then I would just say
          you're just going back and forth between a typed structure and its untyped
          representation. If the top 2 or 4 bytes can specify multiple interpretations
          for the same sequence of bits, then this is the example I was looking for. =)

          Steve


          • Steven Bethard

            #65
            Re: strong/weak typing and pointers

            Michael Hobbs <mike <at> hobbshouse.org> writes:
            >
            > One word: union
            >

            Interestingly, unions can be well-defined even in a strongly-typed language,
            e.g. OCaml:

            # type int_or_list = Int of int | List of int list;;
            type int_or_list = Int of int | List of int list
            # Int 1;;
            - : int_or_list = Int 1
            # List [1; 2];;
            - : int_or_list = List [1; 2]

            The reason for this is that at any given time in OCaml, the sequence of bits is
            only interpretable as *one* of the two types, never both. If you have a good
            example of using a union (in C probably, since OCaml wouldn't let you do this I
            don't think) where you want to treat a given sequence of bytes as both types *at
            once*, that would be great!

            Thanks,

            Steve


            • Alex Martelli

              #66
              Re: strong/weak typing and pointers

              Steven Bethard <steven.bethard@gmail.com> wrote:
              ...
              > Yeah, this goes to the heart of the misunderstanding. I'm not asking
              > anyone to justify the _existence_ of weak-typing. Weak-typing is a direct
              > result of a language's support for untyped (bit/byte) data. I agree 100%
              > that this sort of data is not only useful, but often essential in any
              > low-level (e.g. OS, hardware driver, etc.) code.

              But so is the ability to get at the same bits/bytes in structured ways.

              > > So, we have an area
              > > of 8 bytes in memory which we need to be able to treat as:
              > > 8 bytes, for I/O purposes, say;
              > > a float, to feed it to some specialized register, say;
              > > a bit indicating sign plus 15 for exponent plus 48 for significand,
              > > or the like, to perform masking and shifting thereof in SW -- a
              > > structure of three odd-bit-sized integers juxtaposed;
              >
              > As a quick refresher, I quote myself in what I was looking for: "taking
              > advantage of weak-typing would be a case where you treat the bits as three
              > different things: the sequence of bits, and two (mutually exclusive)
              > intended structures."
              >
              > My response to this example is that your two intended structures are not
              > mutually exclusive. Yes, you have to do some bit-twiddling, but only
              > because your float struct doesn't have get_sign, get_exponent and
              > get_significand methods. ;) You're still dealing with the same
              > representation, not converting to a different type. You're just
              > addressing a lower level part of the representation.

              What do you mean by "mutually exclusive"? "Never useful at the same
              time"? You're asking for an example of things never useful at the same
              time that are useful at the same time?!

              The struct type with so many bits being signs, exponent, significands,
              IS a distinct type from double-precision float -- it's the
              representation of the latter according to some standard. To multiply by
              0.1 I have to have a float, to 'get the N-bit integer that gives the
              exponent shifted right by 3' I have to have that struct type. They're
              totally distinct (not "mutually exclusive" because they ARE useful as
              ways to look at the same bitbunch at the same time, of course) types,
              ways to analyze or interpret the same bunch of bits (apart from the
              untyped representation where I can do binary I/O with them, too).

              > I can see the point though: at least in most of the languages I'm familiar
              > with, float is declared as a type while there's no subtype of float that
              > specifies the sign, exponent and significand.

              Right. To get at the bitfields, you use weaktyping instead.

              > > Another example: we're going to send a controlblock of 64 bytes to some
              > > HW peripheral, and get it back perhaps with some mods -- a typical
              > > control/status arrangement. Depending on the top 2 (or in some case 4)
              > > bytes' value, the structure may need to be interpreted in several
              > > possible ways, in terms of juxtaposition of characters, halfwords and
              > > longwords. Again, the driver responsible for talking with this
              > > peripheral needs to be able to superimpose on the 64 bytes any of
              > > several possible C-level struct's -- the cleanest way to do this would
              > > appear to be pointer-casting, though unions would (as usual, of course)
              > > be essentially equivalent.
              >
              > Is the interpretation of the controlblock uniquely defined by the top 2 or 4
              > bytes, or are there some values for the top 2 or 4 bytes for which I have to
              > apply two different interpretations (C-level structs) to the same sequence of
              > bits?

              In the HW I was thinking of, the former is the case.

              > If the top 2 or 4 bytes uniquely define the structs, then I would just say
              > you're just going back and forth between a typed structure and its untyped
              > representation. If the top 2 or 4 bytes can specify multiple interpretations
              > for the same sequence of bits, then this is the example I was looking for. =)

              I need to examine the top bytes of the block as the HW returned it, in
              some cases, to know what struct type is most useful to interpret the
              bunch of bits. There is typically only one type (besides 'just a bunch
              of 64 bytes') that is useful at _one_ given time. But weak typing does
              not require parallel processing without locks -- only if two independent
              threads of controls were looking at the same bits concurrently from two
              separate processors would saying "at ONE time" make sense... true and
              unfettered concurrent access...

              As for two different interpretations of the same bits being useful (not
              "at the same time"), consider a 16-bit field that can be seen as one
              16-bit word or two 8-bit bytes. In the former case, '0' means the whole
              operation concluded successfully, any non-0 means problems were
              encountered. So, a piece of code that just needs a pass/nonpass filter
              on the operation is best advised to treat that field as a 16-bit word,
              so it can test it for == or != 0 atomically.

              At a deeper level, one byte indicates possible problems of one kind (say
              ones "intrinsic" to the procedure/operation in question), another
              indicates possible problems of a different kind (say ones "extrinsic" to
              the procedure per se, but caused by preemption, power failures, etc).
              Unix return-status values aren't too far away from this. If you need
              accurate diagnosis of what went wrong, seeing the same field as two
              8-bit bytes is handier (assuming you can get some kind of lock in that
              case, since you are then dealing with nonatomic testing).

              You could see a test such as "if x->field16 == 0:" as a weird shorthand
              for "if x->field8_a == 0 and x->field8_b == 0:", but depending on
              considerations of atomicity it might not even be.
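
A rough Python sketch of the two views, using `ctypes.Union` as a stand-in for the C union. The field names follow the pseudo-code above; the class names `TwoBytes` and `Status` are invented here, and nothing in the sketch addresses atomicity.

```python
import ctypes

class TwoBytes(ctypes.Structure):
    _fields_ = [("field8_a", ctypes.c_uint8), ("field8_b", ctypes.c_uint8)]

class Status(ctypes.Union):
    # Both members occupy the same 2 bytes of storage.
    _anonymous_ = ("parts",)
    _fields_ = [("field16", ctypes.c_uint16), ("parts", TwoBytes)]

st = Status()
st.field8_b = 3              # report one kind of problem in one byte

# Coarse pass/fail filter: a single comparison on the 16-bit view.
passed = st.field16 == 0

# Detailed diagnosis: look at the two 8-bit views separately.
problem_a, problem_b = st.field8_a, st.field8_b
```

Which of the two bytes ends up in the low half of `field16` depends on the machine's byte order, but the zero test gives the same answer either way.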


              Another example where the same sequence of bits may be usefully
              interpreted in more ways at the same time: given a string of bytes which
              encodes some unicode text in utf-8 it's clearly useful to consider it as
              such, parsing it left to right byte by byte to find the unicode chars
              being encoded and display the proper glyphs, etc. But I may also want
              to walk the same area of memory as a sequence of 64-bit words to compute
              a simple checksum to ensure data integrity (as well as the usual need
              for 'untyped' bytescan for I/O). Or, say I don't know whether the
              incoming data were utf-8 or utf-16; by walking over them in both 1-byte
              (utf-8) and 2-byte units I may well be able to get strong heuristic
              indications of which of the two encodings was in use. Similar
              heuristics are sometimes very useful even in determining whether a bunch
              of 4-byte words from a record are floats or ints -- as long, of course,
              as you CAN walk them both ways and compare strangeness-indicators. If
              you ever need to recover old data from datasets whose details were lost,
              you'll find that out for yourself.
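
A sketch of both ideas in Python, under loudly stated assumptions: the text is mostly ASCII, little-endian UTF-16 is the only 2-byte candidate, and the "strangeness" score is a crude one invented for this example (the share of control or non-ASCII characters). `guess_encoding` and `checksum64` are hypothetical helper names.

```python
def guess_encoding(data):
    """Decode the same bytes both ways and keep the less strange result."""
    scored = []
    for enc in ("utf-8", "utf-16-le"):
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue  # this interpretation is impossible for these bytes
        strange = sum(
            1 for ch in text
            if ord(ch) > 126 or (ord(ch) < 32 and ch not in "\t\r\n")
        )
        scored.append((strange / max(len(text), 1), enc))
    return min(scored)[1] if scored else "unknown"

def checksum64(data):
    """XOR the same buffer as native 64-bit words (length must be a multiple of 8)."""
    acc = 0
    for word in memoryview(data).cast("Q"):
        acc ^= word
    return acc

sample = "plain ascii text"
guess_8 = guess_encoding(sample.encode("utf-8"))
guess_16 = guess_encoding(sample.encode("utf-16-le"))
```

The `memoryview.cast` call walks the existing bytes as 8-byte units without copying them, which is about as close as pure Python gets to the pointer cast.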


              Alex


              • Christophe Cavalaria

                #67
                Re: strong/weak typing and pointers

                Michael Hobbs wrote:
                > Steven Bethard <steven.bethard@gmail.com> wrote:
                >> My point here is that I think in most code, even when people do a bunch
                >> of bit-twiddling, they have a single underlying structure in mind, and
                >> therefore you see them treat the bits as one of two things: (1) The
                >> sequence of bits, i.e.
                >> the untyped memory block, or (2) the intended structure. IMHO, an
                >> example of taking advantage of weak-typing would be a case where you
                >> treat the bits as three different things: the sequence of bits, and two
                >> (mutually exclusive) intended structures.
                >
                > One word: union
                Note that the C standard does not make this portable: after writing to
                member A of a union, reading from member B gives an implementation-defined
                result in C89 (and is undefined behavior in C++), so it should *not* be
                relied on.


                • Steven Bethard

                  #68
                  Re: strong/weak typing and pointers

                  Alex Martelli <aleaxit <at> yahoo.com> writes:
                  >
                  [snip example decomposing float representation into mantissa, etc.]
                  >
                  [snip example determining struct type from first few bytes]
                  >
                  [snip example decomposing 16 bit error code into two 8 bit error codes]
                  >
                  [snip example determining utf-8 or utf-16 by trying byte stream as both]

                  Thanks for the examples!

                  I'm not quite convinced by the decomposition examples or the struct type
                  example, but the UTF example is definitely convincing. I can imagine that you
                  could extend this type of example to any case where you didn't know the actual
                  type of a struct. Given this situation, you could try treating the bytes as
                  each of the possible struct types, and see (heuristically or perhaps with a
                  machine learning approach) which struct type is most appropriate.

                  This definitely meets my criterion of treating the same set of bytes as two
                  different structures, and it's even useful! =) Thanks!

                  Steve


                  • Michael Hobbs

                    #69
                    Re: strong/weak typing and pointers

                    Steven Bethard <steven.bethard@gmail.com> wrote:
                    > The reason for this is that at any given time in OCaml, the sequence of bits is
                    > only interpretable as *one* of the two types, never both. If you have a good
                    > example of using a union (in C probably, since OCaml wouldn't let you do this I
                    > don't think) where you want to treat a given sequence of bytes as both types *at
                    > once*, that would be great!

                    This example is a little weak, but may be sufficient. The in_addr
                    structure used for sockets usually uses a union to provide different
                    views to the underlying 32-bit address. You can access the address
                    as 4 8-bit values, 2 16-bit values, or 1 32-bit value. Most code
                    these days only uses the 4 8-bit representation, but the interface is
                    there.
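
The three views can be imitated in Python by unpacking the same four bytes with different formats. This is only a sketch of the idea, using an arbitrary example address; it goes through copies rather than a true union.

```python
import socket
import struct

packed = socket.inet_aton("192.168.1.10")   # the 4 raw address bytes

as_bytes = struct.unpack("!4B", packed)     # four 8-bit values
as_halves = struct.unpack("!2H", packed)    # two 16-bit values
(as_word,) = struct.unpack("!I", packed)    # one 32-bit value
```

The `!` prefix fixes network (big-endian) byte order, so the 16-bit and 32-bit views come out the same on any host.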

                    Another possible example comes from the Windows API. Some of the
                    functions take an arbitrary length structure. If you want to make a
                    simple call to the function, you pass a small structure. If you
                    want to make a more complex call to the function, you pass a larger
                    structure that has more fields tacked on to the end. Usually, the
                    first field in the structure is an int that specifies how large the
                    structure is. It is used as sort of a crude version of OO in C.
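
A hedged sketch of that size-prefix convention with `ctypes`. The structures `BasicInfo` and `ExtendedInfo`, their fields, and the default of 1000 are all invented for illustration and do not correspond to any actual Windows API structure.

```python
import ctypes

class BasicInfo(ctypes.Structure):       # hypothetical "small" form
    _fields_ = [("size", ctypes.c_uint32), ("flags", ctypes.c_uint32)]

class ExtendedInfo(ctypes.Structure):    # hypothetical "large" form, extra field at the end
    _fields_ = [("size", ctypes.c_uint32), ("flags", ctypes.c_uint32),
                ("timeout_ms", ctypes.c_uint32)]

def read_timeout(buf):
    """Use the leading size field to decide which struct to overlay on buf."""
    header = BasicInfo.from_buffer(buf)
    if header.size >= ctypes.sizeof(ExtendedInfo):
        return ExtendedInfo.from_buffer(buf).timeout_ms
    return 1000  # arbitrary default when the caller passed the small form

small = bytearray(BasicInfo(ctypes.sizeof(BasicInfo), 0))
large = bytearray(ExtendedInfo(ctypes.sizeof(ExtendedInfo), 0, 250))
```

Here `read_timeout(small)` falls back to the default, while `read_timeout(large)` reads the extra field.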

                    I'm not sure if these are the kinds of examples you're looking for.
                    I don't know how anyone would be able to use a sequence of bytes as
                    two types of data at once. There is almost always some sort of
                    indicator that specifies how to interpret the bytes; otherwise, it
                    is just garbage.

                    -- Mike


                    • Diez B. Roggisch

                      #70
                      Re: strong/weak typing and pointers

                      Steven Bethard wrote:
                      > Michael Hobbs <mike <at> hobbshouse.org> writes:
                      >>
                      >> One word: union
                      >>
                      >
                      > Interestingly, unions can be well-defined even in a strongly-typed
                      > language, e.g. OCaml:
                      >
                      > # type int_or_list = Int of int | List of int list;;
                      > type int_or_list = Int of int | List of int list
                      > # Int 1;;
                      > - : int_or_list = Int 1
                      > # List [1; 2];;
                      > - : int_or_list = List [1; 2]

                      Unions in functional languages are also known as direct sums of types (as
                      opposed to products, which form tuples). And trying to access a union that
                      holds an int as list will yield an error - runtime, most probably. So there
                      is no way of reinterpreting an int as a list, which still satisfies the
                      paradigms of a strongly typed language.
                      --
                      Regards,

                      Diez B. Roggisch


                      • Greg Ewing

                        #71
                        Re: strong/weak typing and pointers

                        Diez B. Roggisch wrote:
                        > I can remember abusing 32bit pointers in 68k processors by
                        > altering the most-significant byte.

                        Apple did this in early versions of the Memory Manager
                        of classic MacOS, using the upper 8 bits of a Handle
                        for various flags. You weren't supposed to make any
                        assumptions about what the upper byte contained, but
                        of course some people did... and their applications
                        broke when 32-bit addressing came in...
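
The trick and its failure mode can be modelled with plain integers. This is only an illustration: the flag value `LOCKED` and the helper names are invented, and nothing here reflects the actual Memory Manager layout.

```python
ADDRESS_MASK = 0x00FFFFFF   # the 24 address bits actually decoded
LOCKED = 0x80               # hypothetical flag kept in the top byte

def pack_handle(address, flags):
    """Store flags in the upper 8 bits of a 32-bit pointer value."""
    return ((flags & 0xFF) << 24) | (address & ADDRESS_MASK)

def address_of(handle):
    return handle & ADDRESS_MASK

def flags_of(handle):
    return (handle >> 24) & 0xFF

h = pack_handle(0x123456, LOCKED)

# Once all 32 bits carry address information, masking silently corrupts it.
real_32bit_address = 0x01123456
mangled = address_of(real_32bit_address)
```

With 24-bit addresses the round trip is lossless; with a genuine 32-bit address the mask drops the top byte, which is the breakage described above.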

                        --
                        Greg Ewing, Computer Science Dept,
                        University of Canterbury,
                        Christchurch, New Zealand



                        • Mike Meyer

                          #72
                          Re: Summary: strong/weak typing and pointers

                          Steven Bethard <steven.bethard@gmail.com> writes:

                          > JCM <joshway_without_spam <at> myway.com> writes:
                          >>
                          >> > Definition 1 is the definition most commonly used in Programming
                          >> > Languages literature.... However, for
                          >> > all intents and purposes, it is only applicable to statically typed
                          >> > languages; no one on the list could come up with a dynamically typed
                          >> > language that allowed bit-reinterpretation.
                          >>
                          >> Assembly language. The types of values are implied by what
                          >> instructions you use.
                          >
                          > I'm sure some people would argue that assembly language is untyped (not
                          > statically or dynamically typed) and that the operations are defined on bits,
                          > but this is definitely the best example I've seen. Thanks!

                          The previously mentioned BCPL has the exact same property. For that
                          matter, early versions of C used to allow it to a large degree. I've
                          actually compiled programs written as "char *main = { ... }".

                          To me, a dynamically typed language is one where objects - rather than
                          variables - have a type attached.

                          <mike
                          --
                          Mike Meyer <mwm@mired.org> http://www.mired.org/home/mwm/
                          Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


                          • Mike Meyer

                            #73
                            Re: Summary: strong/weak typing and pointers

                            Steven Bethard <steven.bethard@gmail.com> writes:

                            > Gabriel Zachmann writes:
                            > In summary, there are basically three interpretations of "weak-typing" discussed
                            > in this thread:
                            >
                            > (1) A language is "weakly-typed" if it allows code to take a block of memory
                            > that was originally defined as one type and reinterpret the bits of this block
                            > as another type.
                            >
                            > (2) A language is "weakly-typed" if it has a large number of implicit coercions.
                            >
                            > (3) A language is "weakly-typed" if it often treats objects of one type as other
                            > types.
                            >
                            > Definition 1 is the definition most commonly used in Programming Languages
                            > literature, and allows a language to be called "weakly-typed" based only on the
                            > language definition. However, for all intents and purposes, it is only
                            > applicable to statically typed languages; no one on the list could come up with
                            > a dynamically typed language that allowed bit-reinterpretation.

                            Definition 1 is a black/white proposition instead of being a
                            continuum. Once you allow the simple case needed for real-world work
                            of allowing an object to be treated as whatever it is or a sequence of
                            bytes, you can treat any type as any other type.
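
As an illustration of that point, once values can round-trip through raw bytes, a two-step helper is enough to read any fixed-size type as any other of the same size. A sketch in Python, where `reinterpret` is a name made up for this example and the formats are little-endian 32-bit float and unsigned int:

```python
import struct

def reinterpret(value, src_fmt, dst_fmt):
    """Treat value as bytes (src_fmt), then read those bytes as dst_fmt."""
    raw = struct.pack(src_fmt, value)        # typed object -> sequence of bytes
    (result,) = struct.unpack(dst_fmt, raw)  # sequence of bytes -> other type
    return result

bits = reinterpret(1.0, "<f", "<I")          # float32 bit pattern as an integer
back = reinterpret(bits, "<I", "<f")         # and back again
```

The float 1.0 comes out as the integer 0x3F800000, its IEEE 754 single-precision bit pattern.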

                            > Definition 2 seemed to be the definition most commonly used on the list, most
                            > likely because it is actually applicable to a dynamically typed language like
                            > Python. It has the problem that in a language that supports operator
                            > overloading (like Python), programmers can make their language more
                            > "weakly-typed" by simply providing additional coercions, thus whether or not a
                            > language is called "weakly-typed" depends both on the language definition and
                            > any code written in the language.

                            This problem can largely be made to go away by limiting it to builtin
                            types. Likewise for definition 3.

                            I'd call Ruby's allowing builtin types to be changed a
                            misfeature. Builtin types should be subclassed.
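
A sketch of the subclassing route in Python, using the case-insensitive comparison mentioned earlier in the thread as the example behavior. `CaselessStr` is a hypothetical class written for this illustration.

```python
class CaselessStr(str):
    """A str subclass that compares case-insensitively; plain str is untouched."""

    def __eq__(self, other):
        if isinstance(other, str):
            return self.casefold() == other.casefold()
        return NotImplemented

    def __ne__(self, other):
        # Defined explicitly, otherwise str's own case-sensitive __ne__ is inherited.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Keep hashing consistent with the new notion of equality.
        return hash(self.casefold())

relaxed = CaselessStr("Hello") == "hello"
strict = "Hello" == "hello"
```

Only code that opts in to `CaselessStr` sees the relaxed comparison, so library modules that rely on ordinary string semantics keep working.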

                            <mike
                            --
                            Mike Meyer <mwm@mired.org> http://www.mired.org/home/mwm/
                            Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


                            • Michael Hobbs

                              #74
                              Re: strong/weak typing and pointers

                              Steven Bethard <steven.bethard@gmail.com> wrote:
                              > The reason for this is that at any given time in OCaml, the sequence of bits is
                              > only interpretable as *one* of the two types, never both. If you have a good
                              > example of using a union (in C probably, since OCaml wouldn't let you do this I
                              > don't think) where you want to treat a given sequence of bytes as both types *at
                              > once*, that would be great!

                              I've come up with the perfect example for you. However, it is from
                              the days when memory was scarce and programmers were allowed to use
                              any programming language they wanted, so long as it was assembly.

                              To conserve as much memory as possible, some programmers would use
                              machine code that was loaded into memory as their integer constants.
                              Here is an excerpt from The Story of Mel:
                              (http://www.catb.org/~esr/jargon/html/story-of-mel.html)

                              Since Mel knew the numerical value
                              of every operation code,
                              and assigned his own drum addresses,
                              every instruction he wrote could also be considered
                              a numerical constant.
                              He could pick up an earlier "add" instruction, say,
                              and multiply by it,
                              if it had the right numeric value.
                              His code was not easy for someone else to modify.


                              • Alex Martelli

                                #75
                                Re: strong/weak typing and pointers

                                Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:

                                > Diez B. Roggisch wrote:
                                > > I can remember abusing 32bit pointers in 68k processors by
                                > > altering the most-significant byte.
                                >
                                > Apple did this in early versions of the Memory Manager
                                > of classic MacOS, using the upper 8 bits of a Handle
                                > for various flags. You weren't supposed to make any
                                > assumptions about what the upper byte contained, but
                                > of course some people did... and their applications
                                > broke when 32-bit addressing came in...

                                I believe many implementations of high-level languages on machines where
                                addresses had to be aligned used LOW bits similarly. Say addresses of
                                integers need to be even or else a bus error will occur. Then, a word
                                that is used to hold an integer address has its low bit 'available' as a
                                flag -- it needs to be cleared before it's dereferenced, anyway.

                                This seems reasonably sound because, even if a later model of the CPU
                                should be extended to allow misaligned addresses, the OS need not
                                support that. Misaligned addresses can pay substantial performance
                                prices for little gain -- not sure about the state of play these days,
                                but just a few years ago you could boost the performance of some C codes
                                on Intel CPUs (which always allowed address misalignment) quite a bit by
                                recompiling with flags telling the compiler to ensure address alignment.

                                So, the low bit, when set, could indicate we're pointing to a Bignum
                                (like a Python long), when clear, that we're pointing to an ordinary
                                small integer, for example -- or other such dichotomous distinctions.

                                Of course, such an address-plus-flag must be handled as a bitmask (to
                                examine and clear the flag) or a pointer, interchangeably.
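
A minimal sketch of that low-bit flag in C, assuming pointers round-trip through `uintptr_t` unchanged and that the pointed-to objects are at least 2-byte aligned (both hold on mainstream platforms, neither is promised by the C standard). The names `tagged_ref`, `TAG_BIGNUM`, `make_ref`, `ref_is_bignum` and `ref_pointer` are hypothetical, not taken from any real language implementation:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* An address with a flag stored in its (otherwise always zero) low bit. */
typedef uintptr_t tagged_ref;

#define TAG_BIGNUM ((uintptr_t)1)

/* Combine an aligned pointer and a one-bit flag into a single word. */
static tagged_ref make_ref(void *p, int is_bignum) {
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_BIGNUM) == 0);   /* alignment keeps the low bit free */
    return is_bignum ? (bits | TAG_BIGNUM) : bits;
}

/* Treat the word as a bitmask: examine the flag. */
static int ref_is_bignum(tagged_ref r) {
    return (int)(r & TAG_BIGNUM);
}

/* Treat the word as a pointer: clear the flag before dereferencing. */
static void *ref_pointer(tagged_ref r) {
    return (void *)(r & ~TAG_BIGNUM);
}
```

The same word is handled as an integer in `ref_is_bignum` and as an address in `ref_pointer`, which is exactly the interchangeable treatment described above.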


                                Alex
