Structures

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Marc 'BlackJack' Rintsch

    #16
    Re: Structures

    On Mon, 03 Nov 2008 23:32:25 +0000, Paulo J. Matos wrote:
    What's then the reason for adding named tuples if they are not
    mutable...???
    Names are more descriptive than "magic numbers" as indices. See for
    example the "named tuple" returned by `os.stat()`.

    Ciao,
    Marc 'BlackJack' Rintsch

    Comment

    • Paulo J. Matos

      #17
      Re: Structures

      On Tue, Nov 4, 2008 at 12:17 AM, George Sakkis <george.sakkis@ gmail.comwrote:
      >
      Technically there are no private attributes in (pure) Python so the
      answer is still classes.
      >
      OK, so this is now messing with my lack of knowledge regarding Python.
      What's (pure) Python? Is there any impure Python?


      --
      Paulo Jorge Matos - pocmatos at gmail.com
      Webpage: http://www.personal.soton.ac.uk/pocm

      Comment

      • Paul Rubin

        #18
        Re: Structures

        "Paulo J. Matos" <pocmatos@gmail .comwrites:
        OK, so this is now messing with my lack of knowledge regarding Python.
        What's (pure) Python? Is there any impure Python?
        impure Python = Python with extensions written in C.

        Comment

        • George Sakkis

          #19
          Re: Structures

          On Nov 3, 6:32 pm, "Paulo J. Matos" <pocma...@gmail .comwrote:
          Even though I can use dicts where the keys are strings (as if it were
          the name of the field), it seems to heavy, since a structure doesn't
          need to be resizable (and dicts are) and it has constant time access
          (which depending on the implementation I would guess dicts don't
          have).
          >
          Can someone please clarify?
          For all practical purposes, dicts have almost constant access time (at
          least with any half-decent __hash__ method). If you're paranoid about
          micro-optimizations, you can define a class with __slots__:
          >>class MyStruct(object ):
          .... __slots__ = ('x', 'y')
          ....
          >>s = MyStruct()
          >>s.x = 3
          >>s.y = 4
          >>s.z= 5
          Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          AttributeError: 'MyStruct' object has no attribute 'z'

          More often than not, that's premature optimization.

          George

          Comment

          • Aaron Brady

            #20
            Re: Structures

            On Nov 3, 6:33 pm, George Sakkis <george.sak...@ gmail.comwrote:
            On Nov 3, 6:32 pm, "Paulo J. Matos" <pocma...@gmail .comwrote:
            >
            Even though I can use dicts where the keys are strings (as if it were
            the name of the field), it seems to heavy, since a structure doesn't
            need to be resizable (and dicts are) and it has constant time access
            (which depending on the implementation I would guess dicts don't
            have).
            >
            Can someone please clarify?
            >
            For all practical purposes, dicts have almost constant access time (at
            least with any half-decent __hash__  method).
            Hash tables are slick, but their hash function is their weakest link.
            >>[ hash( 2**x ) for x in range( 0, 256, 32 ) ]
            [1, 1, 1, 1, 1, 1, 1, 1]

            Such an index gives you linear-time performance in finding elements.
            Ha! Plus, your worst-case insertion will cause a copy of the entire
            table. There aren't any mappings based on balanced trees,
            unfortunately.

            Comment

            • Aaron Brady

              #21
              Re: Structures

              On Nov 3, 5:38 pm, "Paulo J. Matos" <pocma...@gmail .comwrote:
              On Mon, Nov 3, 2008 at 10:19 PM, Aaron Brady <castiro...@gma il.comwrote:
              On Nov 3, 3:45 pm, Ben Finney <bignose+hate s-s...@benfinney. id.au>
              wrote:
              "Paulo J. Matos" <pocma...@gmail .comwrites:
              Take care with broad sweeping statements about "every other language",
              or even "most other languages". They are usually flat-out wrong:
              there is a stunning variety of different approaches and concepts in
              programming languages, with very little common to even a majority of
              them.
              >
              Yea, verily.  How many languages do you think that is?  Feel free to
              count C and C++ as different ones.
              >
              Well, I wouldn't dare to say I know a lot of languages but the ones I
              do provide mechanisms to define structures / records: C, C++, Scheme,
              Common Lisp, Haskell, SML, Ocaml.
              I don't know even half of those. What about Perl? Does anyone know
              that?
              My question was more directed to : if
              there aren't structures in Python, what do Pythonists use instead?
              Just a blank object, I imagine.

              Comment

              • Glenn Linderman

                #22
                Re: Structures

                On approximately 11/3/2008 3:38 PM, came the following characters from
                the keyboard of Paulo J. Matos:
                On Mon, Nov 3, 2008 at 10:19 PM, Aaron Brady <castironpi@gma il.comwrote:
                >
                >On Nov 3, 3:45 pm, Ben Finney <bignose+hate s-s...@benfinney. id.au>
                >wrote:
                >>
                >>"Paulo J. Matos" <pocma...@gmail .comwrites:
                >>>
                >>>
                >>>On Mon, Nov 3, 2008 at 12:32 PM, Ben Finney
                >>><bignose+hat es-s...@benfinney. id.auwrote:
                >>>>
                >>>>I'm wondering a more fundamental question: What are structures?
                >>>>That is, what do *you* mean by that term; without knowing that, an
                >>>>answer isn't likely to be meaningful.
                >>>>>
                >>>Well, I guess that everyone pretty much gets since it exists in
                >>>every other language as struct, or define-structure, or whatever is
                >>>the syntax.
                >>>>
                >>Take care with broad sweeping statements about "every other language",
                >>or even "most other languages". They are usually flat-out wrong:
                >>there is a stunning variety of different approaches and concepts in
                >>programming languages, with very little common to even a majority of
                >>them.
                >>>
                >Yea, verily. How many languages do you think that is? Feel free to
                >count C and C++ as different ones.
                >>
                >"Four shalt thou not count, neither count thou two...."
                >http://en.wikipedia.org/wiki/Holy_Ha...e_instructions
                >>
                >
                Well, I wouldn't dare to say I know a lot of languages but the ones I
                do provide mechanisms to define structures / records: C, C++, Scheme,
                Common Lisp, Haskell, SML, Ocaml.
                This is obviously a minority if you count all available programming
                languages in the world, but I would dare to say these cover a lot of
                ground.
                >
                There are languages that do not have structs; but usually there is some
                way to obtain something similar.
                However, I wouldn't dare to say Python needs structures to be a good
                language, or anything similar. My question was more directed to : if
                there aren't structures in Python, what do Pythonists use instead?
                (I have seen dicts might be an alternative, but as I said in previous
                post, they seem to flexible [making them a canon to shoot a fly, and
                they probably lack constant-time access, right?]
                >
                Dicts have constant time access. On the other hand, the constant is
                significantly larger than the constant for accessing a C struct.

                Note that classes, by default, are based on a contained dict! There are
                games to be played with slots that can apparently improve that. I am
                not yet experienced enough with Python to know if a slot is as fast as a
                C struct, but perhaps it is. You can have both slots and a dict, to get
                both speed and flexibility, or you can eliminate the dict and use slots
                only, but that limits your flexibility. But structs aren't flexible,
                except at compile time, so that might not be a problem for you.

                Another thing I don't know is if slots are as fast as tuples. Perhaps a
                slots-only class and a tuple might be speed rivals? But the former is
                mutable, and the latter not.

                Perhaps a more experience Python user can answer that question, at least
                for some particular implementation.

                --
                Glenn -- http://nevcal.com/
                =============== ============
                A protocol is complete when there is nothing left to remove.
                -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

                Comment

                • Steven D'Aprano

                  #23
                  Re: Structures

                  On Mon, 03 Nov 2008 17:06:07 -0800, Aaron Brady wrote:
                  >For all practical purposes, dicts have almost constant access time (at
                  >least with any half-decent __hash__  method).
                  >
                  Hash tables are slick, but their hash function is their weakest link.
                  >
                  >>>[ hash( 2**x ) for x in range( 0, 256, 32 ) ]
                  [1, 1, 1, 1, 1, 1, 1, 1]
                  That's not the hash function of a dict. Dicts don't have a hash function:
                  >>hash({})
                  Traceback (most recent call last):
                  File "<stdin>", line 1, in <module>
                  TypeError: dict objects are unhashable


                  What you're showing is the hash function of ints, and you've discovered
                  that there are some collisions. This is hardly surprising. There are an
                  infinite number of ints, and only a finite number of values that they can
                  hash to, so there have to be some collisions. It's worth looking at the
                  values that collide:
                  >>[ 2**x for x in range( 0, 256, 32 ) ]
                  [1, 4294967296L, 184467440737095 51616L, 792281625142643 37593543950336L ,
                  340282366920938 463463374607431 768211456L,
                  146150163733090 291820368483271 628301965593254 2976L,
                  627710173538668 076383578942320 766641610235544 4464034512896L,
                  269599466671506 397946670150870 196306736371444 225405724811036 10249216L]

                  So if you have a dict with keys equal to those numbers, and do a lookup
                  on one of them, you'll get O(N) performance instead of O(1) *for those
                  keys that collide*. I think I can live with that.

                  If you think you can come up with a better hash function, feel free to
                  subclass int.

                  Such an index gives you linear-time performance in finding elements. Ha!
                  In any real application, all your keys are highly unlikely to collide in
                  that fashion.


                  Plus, your worst-case insertion will cause a copy of the entire table.
                  Explain please.

                  There aren't any mappings based on balanced trees, unfortunately.
                  I'm not sure why you think O(log N) is better than O(1). The major
                  advantage of tree structures is that, unlike hash tables, they are
                  ordered, not that they don't have collisions.



                  --
                  Steven

                  Comment

                  • Steven D'Aprano

                    #24
                    Re: Structures

                    On Tue, 04 Nov 2008 00:19:16 +0000, Marc 'BlackJack' Rintsch wrote:
                    On Mon, 03 Nov 2008 23:32:25 +0000, Paulo J. Matos wrote:
                    >
                    >What's then the reason for adding named tuples if they are not
                    >mutable...?? ?
                    >
                    Names are more descriptive than "magic numbers" as indices. See for
                    example the "named tuple" returned by `os.stat()`.

                    I have no objection to named tuples, but I've sometimes missed having an
                    equivalent to the Pascal record or C struct: essentially a named mutable
                    tuple. Here's a quick way to get one:

                    def record(**kwargs ):
                    """Lightwei ght named mutable tuple equivalent."""
                    class Record(object):
                    __slots__ = kwargs.keys()
                    x = Record()
                    for key, value in kwargs.items():
                    setattr(x, key, value)
                    return x


                    It needs more work to be a full-fledged record, e.g. __eq__ and __str__,
                    but it's a start.


                    --
                    Steven

                    Comment

                    • Glenn Linderman

                      #25
                      Re: Structures

                      On approximately 11/3/2008 5:28 PM, came the following characters from
                      the keyboard of Aaron Brady:
                      On Nov 3, 5:38 pm, "Paulo J. Matos" <pocma...@gmail .comwrote:
                      >
                      >On Mon, Nov 3, 2008 at 10:19 PM, Aaron Brady <castiro...@gma il.comwrote:
                      >>
                      >>On Nov 3, 3:45 pm, Ben Finney <bignose+hate s-s...@benfinney. id.au>
                      >>wrote:
                      >>>
                      >>>"Paulo J. Matos" <pocma...@gmail .comwrites:
                      >>>Take care with broad sweeping statements about "every other language",
                      >>>or even "most other languages". They are usually flat-out wrong:
                      >>>there is a stunning variety of different approaches and concepts in
                      >>>programmin g languages, with very little common to even a majority of
                      >>>them.
                      >>>>
                      >>Yea, verily. How many languages do you think that is? Feel free to
                      >>count C and C++ as different ones.
                      >>>
                      >Well, I wouldn't dare to say I know a lot of languages but the ones I
                      >do provide mechanisms to define structures / records: C, C++, Scheme,
                      >Common Lisp, Haskell, SML, Ocaml.
                      >>
                      >
                      I don't know even half of those. What about Perl? Does anyone know
                      that?
                      >
                      structs in Perl are generally implemented as hashes, which is similar to
                      a Python dict.

                      --
                      Glenn -- http://nevcal.com/
                      =============== ============
                      A protocol is complete when there is nothing left to remove.
                      -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

                      Comment

                      • Michele Simionato

                        #26
                        Re: Structures

                        On Nov 4, 2:57 am, Glenn Linderman <v+pyt...@g.nev cal.comwrote:
                        Note that classes, by default, are based on a contained dict!  There are
                        games to be played with slots that can apparently improve that.  I am
                        not yet experienced enough with Python to know if a slot is as fast as a
                        C struct, but perhaps it is.  
                        No, slots have nothing to do with speed, they are a memory
                        optimization. IMO slots fall in the category of premature optimization
                        and it was a mistake to include them in the language
                        (the functionality should have been made available only at the C-
                        level). As of now, lots of people abuse slots without realizing their
                        disadvantages (for instance, they break multiple inheritance).

                        Comment

                        • Bruno Desthuilliers

                          #27
                          Re: Structures

                          Joe Strout a écrit :
                          On Nov 3, 2008, at 4:38 PM, Paulo J. Matos wrote:
                          >
                          >However, I wouldn't dare to say Python needs structures to be a good
                          >language, or anything similar. My question was more directed to : if
                          >there aren't structures in Python, what do Pythonists use instead?
                          >
                          Classes.
                          or objects, or dicts, depending on concrete use case, personnal
                          preferences and current phase of the moon.

                          Comment

                          • Bruno Desthuilliers

                            #28
                            Re: Structures

                            Paulo J. Matos a écrit :
                            (snip)
                            However, I wouldn't dare to say Python needs structures to be a good
                            language, or anything similar. My question was more directed to : if
                            there aren't structures in Python, what do Pythonists use instead?
                            (I have seen dicts might be an alternative,
                            Yes, and the most obvious one - at least when all you need is a kind of
                            data transfert object. Else, well, C++ structs are just a survival of a
                            C feature kept for compatibility reasons - technically speaking, you
                            just don't need structs when you have classes.
                            but as I said in previous
                            post, they seem to flexible [making them a canon to shoot a fly,
                            Err... I'm afraid you'll find Python way to flexible and dangerous,
                            then. You may not be aware of the fact that you can add / remove /
                            replace objects attributes (and methods) at runtime ? FWIW, and a couple
                            corner cases set asides, Python objects are mostly glorified dicts.
                            and
                            they probably lack constant-time access, right?]
                            AFAIK, it's almost constant-time access. But you won't get anything
                            better using classes, since attributes are stored in a dict anyway.

                            Comment

                            • bearophileHUGS@lycos.com

                              #29
                              Re: Structures

                              Michele Simionato:
                              No, slots have nothing to do with speed, they are a memory optimization.
                              In many languages, often in Python too, the less memory you use/
                              allocate the faster you go.

                              In fact slots allow a speed increase too (in new style classes):

                              from timeit import default_timer as clock

                              class C1(object):
                              __slots__ = ["a", "b"]
                              def __init__(self, a, b):
                              self.a = a
                              self.a = b

                              class C2(object):
                              def __init__(self, a, b):
                              self.a = a
                              self.a = b

                              def main(N, test):
                              t0 = clock()

                              if test == 1:
                              [C1(ab, ab) for ab in xrange(N)]
                              elif test == 2:
                              [C2(ab, ab) for ab in xrange(N)]

                              t1 = clock()
                              print round(t1 - t0, 2)

                              main(N=700*1000 , test=1)

                              Core duo 2 GHz:
                              test=1 ==1.06 s
                              test=2 ==3.0 s

                              (700*1000 is the way I have found to write the 700_000 I was talking
                              about, until we'll have a syntax for it.)

                              Bye,
                              bearophile

                              Comment

                              • Michele Simionato

                                #30
                                Re: Structures

                                On Nov 4, 11:20 am, bearophileH...@ lycos.com wrote:
                                Michele Simionato:
                                >
                                No, slots have nothing to do with speed, they are a memory optimization..
                                >
                                In many languages, often in Python too, the less memory you use/
                                allocate the faster you go.
                                >
                                In fact slots allow a speed increase too (in new style classes):
                                >
                                from timeit import default_timer as clock
                                >
                                class C1(object):
                                    __slots__ = ["a", "b"]
                                    def __init__(self, a, b):
                                        self.a = a
                                        self.a = b
                                >
                                class C2(object):
                                    def __init__(self, a, b):
                                        self.a = a
                                        self.a = b
                                >
                                def main(N, test):
                                    t0 = clock()
                                >
                                    if test == 1:
                                        [C1(ab, ab) for ab in xrange(N)]
                                    elif test == 2:
                                        [C2(ab, ab) for ab in xrange(N)]
                                >
                                    t1 = clock()
                                    print round(t1 - t0, 2)
                                >
                                main(N=700*1000 , test=1)
                                >
                                Core duo 2 GHz:
                                test=1 ==1.06 s
                                test=2 ==3.0 s
                                >
                                (700*1000 is the way I have found to write the 700_000 I was talking
                                about, until we'll have a syntax for it.)
                                >
                                Bye,
                                bearophile
                                I did not expect such a large difference in instantiation time.
                                However I was thinking about
                                access time, and then the difference is not impressive (~20-25%):

                                from timeit import default_timer as clock

                                class C1(object):
                                __slots__ = ["a", "b"]
                                def __init__(self, a, b):
                                self.a = a
                                self.b = b

                                class C2(object):
                                def __init__(self, a, b):
                                self.a = a
                                self.b = b

                                def main(N, test):
                                t0 = clock()
                                if test == 'with slots':
                                c = C1(1, 2)
                                for _ in xrange(N):
                                (c.a, c.b)
                                elif test == 'without slots':
                                c = C2(1, 2)
                                for _ in xrange(N):
                                (c.a, c.b)
                                t1 = clock()
                                print test, round(t1 - t0, 3)

                                main(N=700*1000 , test='with slots') # 0.152s
                                main(N=700*1000 , test='without slots') # 0.195s

                                I mantain that one should use slots only as last resort, if the
                                speedup really justify having nonstandard classes.

                                Comment

                                Working...