Confounded by Python objects

  • boblatest@googlemail.com

    Confounded by Python objects

    Hello group,

    take a look at the code snippet below. What I want to do is initialize
    two separate Channel objects and put some data in them. However,
    although two different objects are in fact created (as can be seen
    from the different names they spit out with the "diag()" method), the
    data in the "sample" member is the same although I stick different
    values in.

    Thanks,
    robert
    (Sorry for Google Groups, but I don't have NNTP at work)

    Here's the code:

    #!/usr/bin/python

    class Channel:
        name = ''
        sample = []

        def __init__(self, name):
            self.name = name

        def append(self, time, value):
            self.sample.append((time, value))
            self.diag()

        def diag(self):
            print (self.name, self.sample)


    chA = Channel('A')
    chB = Channel('B')

    chA.append(1, 1.1)
    chB.append(2, 2.1)
    chA.append(3, 1.2)
    chB.append(4, 2.2)

    print 'Result:'

    chA.diag()
    chB.diag()

    ------------------------------------
    and here's the output:

    ('A', [(1, 1.1000000000000001)])
    ('B', [(1, 1.1000000000000001), (2, 2.1000000000000001)])
    ('A', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2)])
    ('B', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2), (4, 2.2000000000000002)])
    Result:
    ('A', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2), (4, 2.2000000000000002)])
    ('B', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2), (4, 2.2000000000000002)])

    What I'd like to see, however, is 2 tuples per Channel object, like
    this:
    ('A', [(1, 1.1000000000000001)])
    ('B', [(2, 2.1000000000000001)])
    ('A', [(1, 1.1000000000000001), (3, 1.2)])
    ('B', [(2, 2.1000000000000001), (4, 2.2000000000000002)])
    Result:
    ('A', [(1, 1.1000000000000001), (3, 1.2)])
    ('B', [(2, 2.1000000000000001), (4, 2.2000000000000002)])

  • Fredrik Lundh

    #2
    Re: Confounded by Python objects

    boblatest@googlemail.com wrote:
    > take a look at the code snippet below. What I want to do is initialize
    > two separate Channel objects and put some data in them. However,
    > although two different objects are in fact created (as can be seen
    > from the different names they spit out with the "diag()" method), the
    > data in the "sample" member is the same although I stick different
    > values in.

    that's because you only have one sample object -- the one owned by
    the class object.

    since you're modifying that object in place (via the append method),
    your changes will be shared by all instances. python never copies
    attributes when it creates an instance; if you want a fresh object,
    you have to create it yourself.

    > class Channel:

    tip: if you're not 100% sure why you would want to put an attribute
    on the class level, don't do it. instead, just create all attributes
    inside the __init__ method:

    >     def __init__(self, name):
    >         self.name = name
              self.sample = [] # create fresh container for instance
    >
    >     def append(self, time, value):
    >         self.sample.append((time, value))
    >         self.diag()
    >
    >     def diag(self):
    >         print (self.name, self.sample)

    hope this helps!

    </F>
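
The difference described above can be sketched as follows. This is a minimal illustration written for Python 3 (the thread itself uses Python 2), and the class names `SharedChannel` and `FixedChannel` are made up for the example; they do not appear in the original post.

```python
class SharedChannel:
    sample = []  # one list, owned by the class and seen by every instance

    def __init__(self, name):
        self.name = name

    def append(self, time, value):
        # Looks up "sample" on the class and mutates that single list in place
        self.sample.append((time, value))


class FixedChannel:
    def __init__(self, name):
        self.name = name
        self.sample = []  # fresh list created for each instance

    def append(self, time, value):
        self.sample.append((time, value))


shared_a, shared_b = SharedChannel('A'), SharedChannel('B')
shared_a.append(1, 1.1)
shared_b.append(2, 2.1)

fixed_a, fixed_b = FixedChannel('A'), FixedChannel('B')
fixed_a.append(1, 1.1)
fixed_b.append(2, 2.1)

print(shared_a.sample is shared_b.sample)  # True: both refer to the class's list
print(fixed_a.sample is fixed_b.sample)    # False: each instance has its own list
print(fixed_a.sample, fixed_b.sample)
```

The `is` checks are probably the quickest way to confirm whether two instances really hold separate containers.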


    • alex23

      #3
      Re: Confounded by Python objects

      On Jul 24, 7:45 pm, "boblat...@googlemail.com"
      <boblat...@googlemail.com> wrote:
      > class Channel:
      >     name = ''
      >     sample = []
      >
      >     def __init__(self, name):
      >         self.name = name
      >
      >     def append(self, time, value):
      >         self.sample.append((time, value))
      >         self.diag()
      >
      >     def diag(self):
      >         print (self.name, self.sample)

      Okay, the problem is you're appending to a _class_ attribute, not to
      an instance attribute.

      If you change your class definition to this, it should work:

      class Channel:
          def __init__(self, name):
              self.name = name
              self.sample = []

      That will provide each instance with its own version of the sample
      attribute.

      The 'self.name = name' in the __init__ for your original code binds a
      new attribute to the instance, whereas 'self.sample.append(...' in the
      class's append was appending to the class attribute instead.

      Hope this helps.

      - alex23
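
One way to see the binding-versus-mutation distinction above is to inspect the instance's own namespace with `vars()`. This is a small Python 3 sketch that reuses the class layout from the original post, trimmed to the parts that matter here.

```python
class Channel:
    name = ''
    sample = []

    def __init__(self, name):
        self.name = name  # assignment: creates an entry in the instance's namespace


ch = Channel('A')

print(vars(ch))                     # {'name': 'A'}
print('sample' in vars(ch))         # False: lookup falls through to the class
print(ch.sample is Channel.sample)  # True
print(repr(Channel.name))           # '' (the class attribute is untouched)

ch.sample.append((1, 1.1))          # attribute lookup plus in-place mutation, no rebinding
print(Channel.sample)               # [(1, 1.1)]
```

Assignment through `self` always writes to the instance, while a method call such as `append` only reads the attribute, which is why the two names behave so differently.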


      • Lawrence D'Oliveiro

        #4
        Re: Confounded by Python objects

        In message
        <33c9309a-9679-41f6-a777-2874aad1903a@z66g2000hsc.googlegroups.com>,
        boblatest@googlemail.com wrote:
        > class Channel:
        >     name = ''
        >     sample = []

        These are class variables, not instance variables. Take them out, and ...

        >     def __init__(self, name):
        >         self.name = name

        ... add this line to the above function

                  self.sample = []


        • boblatest@googlemail.com

          #5
          Re: Confounded by Python objects

          On Jul 24, 11:59 am, Fredrik Lundh <fred...@pythonware.com> wrote:
          > tip: if you're not 100% sure why you would want to put an attribute
          > on the class level, don't do it.

          The reason I did it was sort of C++ish (that's where I come from): I
          somehow wanted a list of attributes on the class level. More for
          readability than anything else, really.

          > hope this helps!

          Yup, did the trick. Thanks!
          robert
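
For readers who, like the poster, want the attributes listed at class level for readability: Python 3.7 and later (so well after this thread) provide `dataclasses`, which allow class-level field declarations while still giving each instance its own list. The following is a sketch of that alternative, not something proposed in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str
    # default_factory calls list() once per instance, so nothing is shared.
    # Writing "sample: list = []" instead is rejected by dataclass with a ValueError.
    sample: list = field(default_factory=list)

    def append(self, time, value):
        self.sample.append((time, value))


ch_a = Channel('A')
ch_b = Channel('B')
ch_a.append(1, 1.1)
ch_b.append(2, 2.1)

print(ch_a)
print(ch_b)
```

The generated `__init__` assigns both fields on the instance, so the declarations read like a C++ member list without creating shared state.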


          • satoru

            #6
            Re: Confounded by Python objects

            On Jul 24, 6:10 pm, "boblat...@googlemail.com"
            <boblat...@googlemail.com> wrote:
            > On Jul 24, 11:59 am, Fredrik Lundh <fred...@pythonware.com> wrote:
            >
            > > tip: if you're not 100% sure why you would want to put an attribute
            > > on the class level, don't do it.
            >
            > The reason I did it was sort of C++ish (that's where I come from): I
            > somehow wanted a list of attributes on the class level. More for
            > readability than anything else, really.
            >
            > > hope this helps!
            >
            > Yup, did the trick. Thanks!
            > robert

            yes, i thought your code was written in a kind of static style, which
            doesn't carry over to a dynamic language like python.
            in python, you don't have to say "static" to make a variable a class
            variable, so the "name" and "sample" you kind of "declared" are indeed
            class variables.
            you may wonder why the two instances of "Channel" then have different
            names: that's because you assign to name in "__init__", which creates an
            instance variable that shares the name "name" with the class variable.
            As to "sample", it never gets assigned to, so when you say "append" the
            class variable is changed in place.
            hope my explanation helps.


            • Robert Latest

              #7
              Re: Confounded by Python objects

              satoru wrote:
              > As to "sample", it never gets assigned to, so when you say "append" the
              > class variable is changed in place.
              > hope my explanation helps.

              Sure does, thanks a lot.

              Here's an interesting side note: After fixing my "Channel" thingy the
              whole project behaved as expected. But there was an interesting hitch.
              The main part revolves around another class, "Sequence", which has a
              list of Channels as attribute. I was curious about the performance of my
              script, because eventually this construct is supposed to handle
              megabytes of data. So I wrote a simple loop that creates a new Sequence,
              fills all the Channels with data, and repeats.

              Interestingly, the first couple of dozen iterations went satisfactorily
              quickly (took about 1 second total), but after a hundred or so times it
              got really slow -- like a couple of seconds per iteration.

              Playing around with the code, not really knowing what to do, I found
              that in the "Sequence" class I had again erroneously declared a class-level
              attribute -- rather harmlessly, just a string, that got assigned to once in each
              iteration on object creation.

              After I had deleted that, the loop went blindingly fast without slowing
              down.

              What are the mechanics behind this behavior?

              robert


              • Steven D'Aprano

                #8
                Re: Confounded by Python objects

                On Sat, 26 Jul 2008 18:54:22 +0000, Robert Latest wrote:
                > Here's an interesting side note: After fixing my "Channel" thingy the
                > whole project behaved as expected. But there was an interesting hitch.
                > The main part revolves around another class, "Sequence", which has a
                > list of Channels as attribute. I was curious about the performance of my
                > script, because eventually this construct is supposed to handle
                > megabytes of data. So I wrote a simple loop that creates a new Sequence,
                > fills all the Channels with data, and repeats.
                >
                > Interestingly, the first couple of dozen iterations went satisfactorily
                > quickly (took about 1 second total), but after a hundred or so times it
                > got really slow -- like a couple of seconds per iteration.
                >
                > Playing around with the code, not really knowing what to do, I found
                > that in the "Sequence" class I had again erroneously declared a
                > class-level attribute -- rather harmlessly, just a string, that got
                > assigned to once in each iteration on object creation.
                >
                > After I had deleted that, the loop went blindingly fast without slowing
                > down.
                >
                > What are the mechanics behind this behavior?

                Without actually seeing the code, it's difficult to be sure, but my guess
                is that you were accidentally doing repeated string concatenation. This
                can be very slow.

                In general, anything that looks like this:

                s = ''
                for i in range(10000): # or any big number
                    s = s + 'another string'

                can be slow. Very slow. The preferred way is to build a list of
                substrings, then put them together in one go.

                L = []
                for i in range(10000):
                    L.append('another string')
                s = ''.join(L)


                It's harder to stumble across the slow behaviour these days, as Python
                2.4 introduced an optimization that, under some circumstances, makes
                string concatenation almost as fast as using join(). But be warned: join()
                is still the recommended approach. Don't count on this optimization to
                save you from slow code.

                If you want to see just how slow repeated concatenation is compared to
                joining, try this:

                >>> import timeit
                >>> t1 = timeit.Timer('for i in xrange(1000): x=x+str(i)+"a"', 'x=""')
                >>> t2 = timeit.Timer('"".join(str(i)+"a" for i in xrange(1000))', '')
                >>>
                >>> t1.repeat(number=30)
                [0.8506159782409668, 0.80239105224609375, 0.73254203796386719]
                >>> t2.repeat(number=30)
                [0.052678108215332031, 0.052067995071411133, 0.052803993225097656]

                Concatenation is more than ten times slower in the example above, but it
                gets worse:

                >>> t1.repeat(number=40)
                [1.5138671398162842, 1.5060651302337646, 1.5035550594329834]
                >>> t2.repeat(number=40)
                [0.072292804718017578, 0.070636987686157227, 0.070624113082885742]

                And even worse:

                >>> t1.repeat(number=50)
                [2.7190279960632324, 2.6910948753356934, 2.7089321613311768]
                >>> t2.repeat(number=50)
                [0.087616920471191406, 0.088094949722290039, 0.087819099426269531]



                --
                Steven
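
The session above is Python 2 (`xrange`). A rough Python 3 equivalent might look like the sketch below. The function names `concat` and `joined` are made up for illustration. Unlike the original statement, where `x` keeps growing across timed runs, `x` starts empty on every call here, so the figures are not directly comparable with the ones quoted. Timings depend on the interpreter and machine, so no expected numbers are given.

```python
import timeit


def concat(n=1000):
    # Repeated concatenation: may copy the accumulated string on each step
    x = ''
    for i in range(n):
        x = x + str(i) + 'a'
    return x


def joined(n=1000):
    # Build all the pieces first, then combine them in a single pass
    return ''.join(str(i) + 'a' for i in range(n))


# Both strategies must produce the same string
assert concat() == joined()

concat_times = timeit.repeat(concat, number=30, repeat=3)
join_times = timeit.repeat(joined, number=30, repeat=3)
print('concat:', concat_times)
print('join:  ', join_times)
```

On a recent CPython the gap may be much smaller than in the 2008 figures, because of the concatenation optimization mentioned above.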


                • Steven D'Aprano

                  #9
                  Re: Confounded by Python objects

                  On Sun, 27 Jul 2008 20:04:27 +0200, Bruno Desthuilliers wrote:
                  > > In general, anything that looks like this:
                  > >
                  > > s = ''
                  > > for i in range(10000): # or any big number
                  > >     s = s + 'another string'
                  > >
                  > > can be slow. Very slow.
                  >
                  > But this is way faster:
                  >
                  > s = ''
                  > for i in range(10000): # or any big number
                  >     s += 'another string'

                  Actually, no, for two reasons:

                  (1) The optimizer works with both s = s+t and s += t, so your version is
                  no faster than mine.

                  (2) The optimization isn't part of the language. It only happens if you
                  are using CPython 2.4 or later, and even then it is not guaranteed.

                  People forget that CPython isn't the language, it's just one
                  implementation of the language, like Jython and IronPython. Relying on
                  the optimization is relying on an implementation-specific trick.

                  > yeps : using augmented assignment (s += some_string) instead of
                  > concatenation and rebinding (s = s + some_string).

                  Both are equally optimized.

                  >>> timeit.Timer('s+=t', 's,t="xy"').repeat(number=100000)
                  [0.027187108993530273, 0.026471138000488281, 0.027689933776855469]
                  >>> timeit.Timer('s=s+t', 's,t="xy"').repeat(number=100000)
                  [0.026300907135009766, 0.02638697624206543, 0.02637791633605957]

                  But here's a version without it:

                  >>> timeit.Timer('s=t+s', 's,t="xy"').repeat(number=100000)
                  [2.1038830280303955, 2.1027638912200928, 2.1031770706176758]



                  --
                  Steven
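
If prepending is really what is needed, one way to avoid depending on the CPython-specific optimization is to collect the pieces and join them once. The sketch below is in Python 3 and uses `collections.deque` for cheap insertion at the left end. The function names are hypothetical, and this is only one possible approach; it is not taken from the thread.

```python
from collections import deque


def prepend_concat(pieces):
    s = ''
    for t in pieces:
        s = t + s  # builds a new string from scratch on every step
    return s


def prepend_join(pieces):
    parts = deque()
    for t in pieces:
        parts.appendleft(t)  # constant-time insertion at the left end
    return ''.join(parts)    # single pass to build the final string


pieces = [str(i) for i in range(5)]
print(prepend_concat(pieces))  # 43210
print(prepend_join(pieces))    # 43210
```

Both functions return the same result; the second does not rely on how a particular interpreter implements string concatenation.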
