Deprecating reload() ???

  • Skip Montanaro

    #31
    Re: Deprecating reload() ???

    >> I wrote something and threw it up on my Python Bits page:
    >>
    >> http://www.musi-cal.com/~skip/python/

    Dave> I get AttributeErrors when I try the super_reload function. Looks
    Dave> like sys.modules has a bunch of items with no '__dict__'.

    You can put objects in sys.modules which are not module objects. I updated
    the code to use getattr() and setattr() during the rebinding step. I think
    that will help, though of course this entire exercise is obviously only an
    approximation to a solution.

    Skip
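A minimal sketch of the kind of guard described in this post, assuming Python 3. The function `rebind_names` and its arguments are hypothetical names chosen for illustration; this is not the code from the Python Bits page. It skips `sys.modules` entries that are not real module objects and uses getattr() and setattr() for the rebinding step.

```python
import sys
import types


def rebind_names(old_obj, new_obj, modules=None):
    """Rebind every module-level name bound to old_obj so it points at new_obj.

    Entries that are not real module objects (None, proxies, and so on)
    are skipped rather than assumed to have a __dict__.
    """
    if modules is None:
        modules = sys.modules
    rebound = []
    # Copy the items first: imports triggered elsewhere may mutate the dict.
    for mod_name, mod in list(modules.items()):
        if not isinstance(mod, types.ModuleType):
            continue
        for attr in list(vars(mod)):
            if getattr(mod, attr, None) is old_obj:
                setattr(mod, attr, new_obj)
                rebound.append((mod_name, attr))
    return rebound
```

Passing an explicit `modules` mapping keeps the sketch easy to try on a small dictionary before pointing it at the real `sys.modules`. As the post says, this remains only an approximation: references held in containers, closures, or instances are not touched.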



    • David MacQuigg

      #32
      Re: Deprecating reload() ???

      On Sun, 14 Mar 2004 02:51:13 -0500, "Terry Reedy" <tjreedy@udel.edu>
      wrote:

      >
      >"David MacQuigg" <dmq@gain.com> wrote in message
      >news:6kk6501shlve52ds2rjdopa73jhpchprg3@4ax.com...
      >> Just to make sure I understand this, I think what you are saying is
      >> that if I have a module M1 that defines a value x = 3.1, it will be
      >> impossible to keep track of the number of references to M1.x because
      >> the object '3.1' may have other references to it from other modules
      >> which use the same constant 3.1. This really does make it impossible
      >> to do a complete reload.
      >
      >Currently, this is possible but not actual for floats, but it is actual, in
      >CPython, for some ints and strings. For a fresh 2.2.1 interpreter
      >
      >>>> sys.getrefcount(0)
      >52
      >>>> sys.getrefcount(1)
      >50
      >>>> sys.getrefcount('a')
      >7

      I'm amazed how many of these shared references there are.
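The sharing can be observed directly. The following is a small sketch assuming a current CPython 3; the variable names are illustrative. Small-integer caching is a CPython implementation detail, and the reference counts printed will not match the 2.2.1 numbers quoted above (recent releases report a very large value for some built-in objects).

```python
import sys

# Two independently computed small integers; CPython hands back a cached object.
a = int("7")
b = int("3") + int("4")
same_small_int = a is b

# Explicit interning maps equal strings onto one shared object.
s1 = sys.intern("".join(["re", "load"]))
s2 = sys.intern("".join(["rel", "oad"]))
same_interned_str = s1 is s2

# Counts vary between versions and builds, so print them rather than assume.
print(sys.getrefcount(0), sys.getrefcount(1), sys.getrefcount("a"))
print(same_small_int, same_interned_str)
```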

      [snip]
      >
      >However, there is still the problem of instances and their __class__
      >attribute. One could, I believe (without trying it) give each class in a
      >module an __instances__ list that is updated by each call to __init__.
      >Then super_reload() could grab the instances lists, do a normal reload, and
      >then update the __instances__ attributes of the reloaded classes and the
      >__class__ attributes of the instances on the lists. In other words,
      >manually rebind instances to new classes and vice versa.

      We need to draw a clean line between what gets updated and what
      doesn't. I would not update instances, because in general, that will
      be impossible. Here is a section from my update on Reload Basics at
      http://ece.arizona.edu/~edatools/Python/Reload.htm
      I need to provide my students with a clear explanation, hopefully with
      sensible motivation, for what gets updated and what does not. Comments
      are welcome.

      Background on Reload
      Users often ask why reload doesn't just "do what we expect" and update
      everything. The fundamental problem is that the current state of
      objects in a running program can be dependent on the conditions which
      existed when the object was created, and those conditions may have
      changed. Say you have in your reloaded module:

      class C1:
          def __init__(self, x, y):
              ...

      Say you have an object x1 created from an earlier version of class C1.
      The current state of x1 depends on the values of x and y at the time
      x1 was created. Asking reload to "do what we expect" in this case, is
      asking to put the object x1 into the state it would be now, had we
      made the changes in C1 earlier.

      If you are designing a multi-module program, *and* users may need to
      reload certain modules, *and* re-starting everything may be
      impractical, then you should avoid any direct references to objects
      within the modules to be reloaded. Direct references are created by
      statements like 'x = M1.x' or 'from M1 import x'. Always access these
      variables via the fully-qualified names, like M1.x, and you will avoid
      leftover references to old objects after a reload. This won't solve
      the object creation problem, but at least it will avoid some surprises
      when you re-use the variable x.
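The difference between a direct reference and a fully-qualified name can be shown without any files. This is a sketch, assuming Python 3: the module object is built with `types.ModuleType`, and a second `exec` into its namespace stands in for editing the source and calling reload. It is not the real reload machinery.

```python
import types

# Build a stand-in for module M1 without touching the filesystem.
M1 = types.ModuleType("M1")
exec("x = 3.1", M1.__dict__)

x = M1.x  # direct reference, same effect as "from M1 import x"

# Simulate editing the source and reloading: the new source is executed
# in the namespace of the existing module object.
exec("x = 2.7", M1.__dict__)

stale = x      # still the object bound before the simulated reload
fresh = M1.x   # looked up through the module, so it sees the new value
print(stale, fresh)
```

The name `x` keeps pointing at the old float because rebinding `M1.x` never touches other namespaces; only lookups that go through `M1` see the change.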

      --- end of section ---

      I *would* like to do something about numbers and strings and other
      shared objects not getting updated, because that is going to be hard
      to explain. Maybe we could somehow switch off the generation of
      shared objects for modules in a 'debug' mode.

      -- Dave


      • John Roth

        #33
        Re: Deprecating reload() ???

        "David MacQuigg" <dmq@gain.com> wrote in message
        news:rhj950ts4fbrbfadp0s5fmp3vn6bhh7ppc@4ax.com...
        >
        > I *would* like to do something about numbers and strings and other
        > shared objects not getting updated, because that is going to be hard
        > to explain. Maybe we could somehow switch off the generation of
        > shared objects for modules in a 'debug' mode.

        It doesn't matter if numbers and strings get updated. They're
        immutable objects, so one copy of a number is as good as
        another. In fact, that poses a bit of a problem since quite
        a few of them are singletons. There's only one object that
        is an integer 1 in the system, so if the new version changes
        it to, say 2, and you go around and rebind all references to
        1 to become references to 2, you might have a real mess
        on your hands.

        On the other hand, if you don't rebind the ones that came out
        of the original version of the module, you've got a different
        mess on your hands.

        John Roth
        >
        > -- Dave
        >



        • Greg Ewing (using news.cis.dfn.de)

          #34
          Re: Deprecating reload() ???

          Skip Montanaro wrote:
          > Not so. del sys.modules['mod']/import mod is effectively what reload()
          > does.

          Not quite -- reload() keeps the existing module object and changes
          its contents, whereas the above sequence creates a new module
          object.

          The difference will be apparent if any other modules have done
          'import mod' before the reload.
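The distinction can be checked with a throwaway module. This is a sketch assuming Python 3, where reload lives in `importlib`; the module name `scratch_mod` and the helper `compare_reload_strategies` are made up for the demonstration.

```python
import importlib
import pathlib
import sys
import tempfile


def compare_reload_strategies():
    """Return (reload_keeps_object, reimport_keeps_object) for a scratch module."""
    with tempfile.TemporaryDirectory() as tmp:
        pathlib.Path(tmp, "scratch_mod.py").write_text("x = 1\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            first = importlib.import_module("scratch_mod")
            # reload() re-executes the source inside the same module object.
            reloaded = importlib.reload(first)
            # Dropping the sys.modules entry forces a brand-new module object.
            del sys.modules["scratch_mod"]
            second = importlib.import_module("scratch_mod")
            return reloaded is first, second is first
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("scratch_mod", None)


print(compare_reload_strategies())
```

Any module that ran `import scratch_mod` earlier holds a reference to the first object, so it sees changes made by reload but keeps the stale object after the delete-and-import sequence.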

          --
          Greg Ewing, Computer Science Dept,
          University of Canterbury,
          Christchurch, New Zealand



          • Hung Jung Lu

            #35
            Re: Deprecating reload() ???

            > >> On Fri, 12 Mar 2004 08:45:24 -0500, Peter Hansen <peter@engcorp.com>
            > >> wrote:
            > >>
            > >> It's worse than just a surprise. It's a serious problem when what you
            > >> need to do is what most people are expecting -- replace every
            > >> reference to objects in the old module with references to the new
            > >> objects. The problem becomes a near impossibility when those
            > >> references are scattered throughout a multi-module program.

            You could use a class instead of a module. I have done that kind of
            thing with classes and weakrefs. By the way, it kind of surprises me
            that no one has mentioned weakref in this thread. It's not too hard to
            keep a list of weakrefs, and every time an object is created, you
            register it with that list. Now, when the new class comes in (e.g. via
            reload()), you get the list of weakrefs of the existing objects, and
            re-assign their __class__, and voila, dynamic change of class
            behavior. Of course, if you spend some time and push this feature into
            the metaclass, everything becomes even easier.
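A rough sketch of the weakref registry idea, assuming Python 3. The names `Tracked`, `rebind_instances`, and `Greeter` are hypothetical, and the second `class Greeter` statement stands in for the definition a reload would produce. It uses a base class rather than a metaclass to keep the example short.

```python
import weakref


class Tracked:
    """Base class that records live instances of each class in a WeakSet."""

    _registries = {}  # maps class -> WeakSet of its live instances

    def __init__(self):
        registry = Tracked._registries.setdefault(type(self), weakref.WeakSet())
        registry.add(self)


def rebind_instances(old_cls, new_cls):
    """Point every live instance of old_cls at new_cls; return how many moved."""
    moved = 0
    for obj in list(Tracked._registries.pop(old_cls, ())):
        obj.__class__ = new_cls
        Tracked._registries.setdefault(new_cls, weakref.WeakSet()).add(obj)
        moved += 1
    return moved


class Greeter(Tracked):  # "old" version of the class
    def greet(self):
        return "hello"


old_greeter_cls = Greeter
g = Greeter()


class Greeter(Tracked):  # "new" version, as if defined by a reload
    def greet(self):
        return "hello again"


count = rebind_instances(old_greeter_cls, Greeter)
print(count, g.greet())
```

Because the registry holds only weak references, it does not keep instances alive after the rest of the program drops them. Subclasses that define their own `__init__` would need to call `super().__init__()` to stay registered.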

            But it is true that in Python you have to implement dynamic refreshing
            of behavior (module or class) explicitly, whereas in Ruby, as I
            understand, class behavior refreshing is automatic.

            David MacQuigg <dmq@gain.com> wrote in message news:<ba1450tvgdttth3ddu88qjevsv9k5b1tpb@4ax.com>...
            >
            > I agree, most programs should not have 'reload()' designed in, and
            > those that do, should be well aware of its limitations. I'm concerned
            > more about interactive use, specifically of programs which cannot be
            > conveniently restarted from the beginning. I guess I'm spoiled by HP
            > BASIC, where you can change the program statements while the program
            > is running! (half wink)

            Edit-and-continue. Which is kind of important. For instance, I often
            have to load in tons of data from the database, do some initial
            processing, and then do the actual calculations. Or in game
            programming, where you have to load up a lot of things, play quite a
            few initial steps, before you arrive at the point of your interest.
            Now, in these kinds of programs, where the initial state preparation
            takes a long time, you really would like some "edit-and-continue"
            feature while developing the program.

            For GUI programs, edit-and-continue is also very helpful during
            development.

            And for Web applications, actually most CGI-like programs (EJB in Java
            jargon, or external methods in Zope) are all reloadable while the
            web/app server is running. Very often you can do open-heart surgery on
            web/app servers, while the website is running live.

            > Here is a use-case for classes. I've got hundreds of variables in a
            > huge hierarchy of "statefiles". In my program, that hierarchy is
            > handled as a hierarchy of classes. If I want to access a particular
            > variable, I say something like:
            > wavescan.window1.plot2.xaxis.label.font.size = 12
            > These classes have no methods, just names and values and other
            > classes.

            "State file" reminds me of a programming paradigm based on REQUEST and
            RESPONSE. (Sometimes REQUEST alone.) Basically, your program's
            information is stored in a single "workspace" object. The advantage of
            this approach is: (1) All function/method calls at higher level have
            unique "header", something like f(REQUEST, RESPONSE), or f(REQUEST),
            and you will never need to worry about header changes. (2) The REQUEST
            and/or RESPONSE object could be serialized and stored on disk, or
            passed via remote calls to other computers. Since they can be
            serialized, you can also intercept/modify the content and do unit
            testing. This is very important in programs that take a long time to
            build up initial states. Basically, once you are able to serialize and
            cache the state on disk (and even modify the states offline), then you
            can unit test various parts of your program WITHOUT having to start
            from scratch. Some people use XML for serialization to make state
            modification even easier, but any other serialization format is just
            as good. This approach is also good when you want/need some
            parallel/distributed computing in the future, since the serialized
            states could potentially be dispatched independently.

            Today's file access is so fast that disk operations are often
            under-utilized. In heavy numerical crunching, having a "workspace"
            serialization can make development and debugging a lot less painful.
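A minimal sketch of the cached "workspace" idea, assuming Python 3 and the standard `pickle` module. The functions `expensive_setup`, `load_workspace`, and `compute`, and the contents of the workspace dictionary, are invented for illustration. Pickle files should only be loaded when they come from a trusted source.

```python
import pickle
import tempfile
from pathlib import Path


def expensive_setup():
    """Stand-in for a slow load-and-preprocess step."""
    return {"rows": list(range(5)), "scale": 2.0}


def load_workspace(cache_path):
    """Return the cached workspace if present, otherwise build and cache it."""
    if cache_path.exists():
        return pickle.loads(cache_path.read_bytes())
    workspace = expensive_setup()
    cache_path.write_bytes(pickle.dumps(workspace))
    return workspace


def compute(request):
    """The step under development; it only sees the serialized state."""
    return [value * request["scale"] for value in request["rows"]]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "workspace.pkl"
    first = load_workspace(cache)    # builds the state and writes the cache
    second = load_workspace(cache)   # restored from disk, no rebuild
    result = compute(second)
```

With the state on disk, `compute` can be edited and rerun in a fresh interpreter without repeating the setup, which sidesteps reload altogether for this kind of workflow.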

            > If I reload a module that changes some of those variables, I would
            > like to not have to hunt down every reference in the running program
            > and change it manually.

            In Python the appropriate tool is weakref. For each class that
            matters, keep a weakref list of the instances. This way you can
            automate the refreshing. I've done that before.

            regards,

            Hung Jung


            • Skip Montanaro

              #36
              Re: Deprecating reload() ???


              Hung Jung> But it is true that in Python you have to implement dynamic
              Hung Jung> refreshing of behavior (module or class) explicitly, whereas
              Hung Jung> in Ruby, as I understand, class behavior refreshing is
              Hung Jung> automatic.

              That has its own attendant set of problems. If an instance's state is
              created with an old version of a class definition, then updated later to
              refer to a new version, who's to say that the current state of the instance
              is what you would have obtained had the instance been created using the new
              class from the start?
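The concern can be made concrete with two versions of a class. This is a sketch with invented names (`PointV1`, `PointV2`), assuming Python 3: the instance is created under the old definition and then rebound to the new one, whose methods expect state that was never set.

```python
class PointV1:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointV2:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.norm = (x * x + y * y) ** 0.5  # state added in the new version

    def describe(self):
        return f"({self.x}, {self.y}) with norm {self.norm}"


p = PointV1(3, 4)
p.__class__ = PointV2  # what an automatic refresh would have to do

try:
    p.describe()
    outcome = "ok"
except AttributeError:
    outcome = "missing attribute"
```

The rebound instance gets the new methods but not the attributes the new `__init__` would have created, so behavior and state disagree.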

              Skip


              • Skip Montanaro

                #37
                Re: Deprecating reload() ???


                Dave> Maybe we could somehow switch off the generation of shared objects
                Dave> for modules in a 'debug' mode.

                You'd have to disable the integer free list. There's also code in
                tupleobject.c to recognize and share the empty tuple. String interning
                could be disabled as well. Everybody's ignored the gorilla in the room:

                >>> sys.getrefcount(None)
                1559

                In general, I don't think that disabling immutable object sharing would be
                worth the effort. Consider the meaning of module level integers. In my
                experience they are generally constants and are infrequently changed once
                set. Probably the only thing worth tracking down during a super reload
                would be function, class and method definitions.

                Skip


                • Michael Hudson

                  #38
                  Re: Deprecating reload() ???

                  Skip Montanaro <skip@pobox.com> writes:

                  > Not so. del sys.modules['mod']/import mod is effectively what
                  > reload() does.

                  It's more like 'exec mod.__file__[:-1] in mod.__dict__', actually.

                  Cheers,
                  mwh

                  --
                  I don't have any special knowledge of all this. In fact, I made all
                  the above up, in the hope that it corresponds to reality.
                  -- Mark Carroll, ucam.chat


                  • Michael Hudson

                    #39
                    Re: Deprecating reload() ???

                    David MacQuigg <dmq@gain.com> writes:

                    > On Sat, 13 Mar 2004 14:27:00 -0600, Skip Montanaro <skip@pobox.com>
                    > wrote:
                    >
                    > > David> I'm not sure at this point if an improved reload() is worth
                    > > David> pursuing, ...
                    > >
                    > >I wrote something and threw it up on my Python Bits page:
                    > >
                    > > http://www.musi-cal.com/~skip/python/
                    >
                    > I get AttributeErrors when I try the super_reload function. Looks like
                    > sys.modules has a bunch of items with no '__dict__'.

                    They'll be None, mostly.

                    Cheers,
                    mwh

                    --
                    C++ is a siren song. It *looks* like a HLL in which you ought to
                    be able to write an application, but it really isn't.
                    -- Alain Picard, comp.lang.lisp


                    • Skip Montanaro

                      #40
                      Re: Deprecating reload() ???

                      >> >I wrote something and threw it up on my Python Bits page:
                      >> >
                      >> > http://www.musi-cal.com/~skip/python/
                      >>
                      >> I get AttributeErrors when I try the super_reload function. Looks
                      >> like sys.modules has a bunch of items with no '__dict__'.

                      Michael> They'll be None, mostly.

                      What's the significance of an entry in sys.modules with a value of None?
                      That is, how did they get there and why are they there?

                      Skip


                      • Michael Hudson

                        #41
                        Re: Deprecating reload() ???

                        Skip Montanaro <skip@pobox.com> writes:

                        > >> >I wrote something and threw it up on my Python Bits page:
                        > >> >
                        > >> > http://www.musi-cal.com/~skip/python/
                        > >>
                        > >> I get AttributeErrors when I try the super_reload function. Looks
                        > >> like sys.modules has a bunch of items with no '__dict__'.
                        >
                        > Michael> They'll be None, mostly.
                        >
                        > What's the significance of an entry in sys.modules with a value of None?
                        > That is, how did they get there and why are they there?

                        Something to do with packages and things that could have been but
                        weren't relative imports, I think...

                        >>> from distutils.core import setup
                        >>> import sys
                        >>> for k,v in sys.modules.items():
                        ...     if v is None:
                        ...         print k
                        ...
                        distutils.distutils
                        distutils.getopt
                        encodings.encodings
                        distutils.warnings
                        distutils.string
                        encodings.codecs
                        encodings.exceptions
                        distutils.types
                        encodings.types
                        distutils.os
                        distutils.re
                        distutils.sys
                        distutils.copy
                        Cheers,
                        mwh

                        --
                        It's actually a corruption of "starling". They used to be carried.
                        Since they weighed a full pound (hence the name), they had to be
                        carried by two starlings in tandem, with a line between them.
                        -- Alan J Rosenthal explains "Pounds Sterling" on asr


                        • David MacQuigg

                          #42
                          Re: Deprecating reload() ???

                          On Sun, 14 Mar 2004 19:49:08 -0500, "John Roth"
                          <newsgroups@jhrothjr.com> wrote:

                          >"David MacQuigg" <dmq@gain.com> wrote in message
                          >news:rhj950ts4fbrbfadp0s5fmp3vn6bhh7ppc@4ax.com...
                          >>
                          >> I *would* like to do something about numbers and strings and other
                          >> shared objects not getting updated, because that is going to be hard
                          >> to explain. Maybe we could somehow switch off the generation of
                          >> shared objects for modules in a 'debug' mode.
                          >
                          >It doesn't matter if numbers and strings get updated. They're
                          >immutable objects, so one copy of a number is as good as
                          >another. In fact, that poses a bit of a problem since quite
                          >a few of them are singletons. There's only one object that
                          >is an integer 1 in the system, so if the new version changes
                          >it to, say 2, and you go around and rebind all references to
                          >1 to become references to 2, you might have a real mess
                          >on your hands.

                          The immutability of numbers and strings refers only to what you
                          can do via executable statements. If you use a text editor on the
                          original source code, clearly you can change any "immutable".

                          You do raise a good point, however, about the need to avoid changing
                          *all* references to a shared object. The ones that need to change are
                          those that were created via a reference to an earlier version of the
                          reloaded module.

                          >On the other hand, if you don't rebind the ones that came out
                          >of the original version of the module, you've got a different
                          >mess on your hands.

                          True.

                          -- Dave


                          • David MacQuigg

                            #43
                            Re: Deprecating reload() ???

                            On Mon, 15 Mar 2004 05:49:58 -0600, Skip Montanaro <skip@pobox.com>
                            wrote:

                            > Dave> Maybe we could somehow switch off the generation of shared objects
                            > Dave> for modules in a 'debug' mode.
                            >
                            >You'd have to disable the integer free list. There's also code in
                            >tupleobject.c to recognize and share the empty tuple. String interning
                            >could be disabled as well. Everybody's ignored the gorilla in the room:
                            >
                            > >>> sys.getrefcount(None)
                            > 1559

                            Implementation detail. ( half wink )

                            >In general, I don't think that disabling immutable object sharing would be
                            >worth the effort. Consider the meaning of module level integers. In my
                            >experience they are generally constants and are infrequently changed once
                            >set. Probably the only thing worth tracking down during a super reload
                            >would be function, class and method definitions.

                            If you reload a module M1, and it has an attribute M1.x, which was
                            changed from '1' to '2', we want to change also any references that
                            may have been created with statements like 'x = M1.x', or 'from M1
                            import *'. If we don't do this, reload() will continue to baffle and
                            frustrate new users. Typically, they think they have just one
                            variable 'x'.

                            It's interesting to see how Ruby handles this problem:
                            http://userlinux.com/cgi-bin/wiki.pl?RubyPython
                            I'm no expert on Ruby, but it is my understanding that there *are*
                            no types which are implicitly immutable (no need for tuples vs
                            lists, etc.). If you *want* to make an object (any object)
                            immutable, you do that explicitly with its freeze method.

                            I'm having trouble understanding the benefit of using shared objects
                            for simple numbers and strings. Maybe you can save a significant
                            amount of memory by having all the *system* modules share a common
                            'None' object, but when a user explicitly says 'M1.x = None', surely
                            we can afford a few bytes to provide a special None for that
                            reference. The benefit is that when you change None to 'something' by
                            editing and reloading M1, all references that were created via a
                            reference to M1.x will change automatically.

                            We should at least have a special 'debug' mode in which the hidden
                            sharing of objects is disabled for selected modules. You can always
                            explicitly share an object by simply referencing it, rather than
                            typing in a fresh copy.

                            x = "Here is a long string I want to share."
                            y = x
                            z = "Here is a long string I want to share."

                            In any mode, x and y will be the same object. In debug mode, we
                            allocate a little extra memory to make z a separate object from x, as
                            the user apparently intended.
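For comparison, here is how identity and equality behave today. This is a sketch assuming CPython 3 with illustrative variable names. The strings are built at run time with `join` because whether two identical literals end up as one object is left to the implementation, and CPython's compiler often merges equal constants within one code unit.

```python
import sys

parts = ["Here is a long string ", "I want to share."]
x = "".join(parts)
y = x               # explicit sharing: always the same object
z = "".join(parts)  # equal value, built separately

print(y is x)                          # identity through assignment
print(z == x)                          # equality of values
print(sys.intern(z) is sys.intern(x))  # interning maps both to one object
```

Code that compares strings with `==` is unaffected by how much sharing the interpreter does; only code that uses `is` on strings or numbers depends on it.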

                            If we do the updates for just certain types of objects, we will have a
                            non-intuitive set of rules that will be difficult for users to
                            understand. I would like to make things really simple and say:
                            """
                            If you have a direct reference to an object in a reloaded module, that
                            reference will be updated. If the reference is created by some other
                            process (e.g. copying a string, or instantiation of a new object based
                            on a class in the reloaded module) then that reference will not be
                            updated. Only references to objects from the old module are updated.
                            The old objects are then garbage collected.
                            """

                            We may have to pay a price in implementation cost and a little extra
                            storage to make things simple for the user.

                            -- Dave


                            • John Roth

                              #44
                              Re: Deprecating reload() ???

                              "David MacQuigg" <dmq@gain.com> wrote in message
                              news:tuob50liqn5mcrbvhu7qpq12dsa9im73un@4ax.com...
                              > On Mon, 15 Mar 2004 05:49:58 -0600, Skip Montanaro <skip@pobox.com>
                              > wrote:
                              >

                              >
                              > I'm having trouble understanding the benefit of using shared objects
                              > for simple numbers and strings. Maybe you can save a significant
                              > amount of memory by having all the *system* modules share a common
                              > 'None' object, but when a user explicitly says 'M1.x = None', surely
                              > we can afford a few bytes to provide a special None for that
                              > reference. The benefit is that when you change None to 'something' by
                              > editing and reloading M1, all references that were created via a
                              > reference to M1.x will change automatically.

                              I believe it's a performance optimization; the memory savings
                              are secondary.

                              > We should at least have a special 'debug' mode in which the hidden
                              > sharing of objects is disabled for selected modules. You can always
                              > explicitly share an object by simply referencing it, rather than
                              > typing in a fresh copy.

                              That would have rather disastrous consequences, since
                              some forms of comparison depend on there only being
                              one copy of the object.

                              >
                              > -- Dave
                              >



                              • Jeff Epler

                                #45
                                Re: Deprecating reload() ???

                                On Mon, 15 Mar 2004 05:49:58 -0600, Skip Montanaro <skip@pobox.com>
                                wrote:
                                > >You'd have to disable the integer free list. There's also code in
                                > >tupleobject.c to recognize and share the empty tuple. String interning
                                > >could be disabled as well. Everybody's ignored the gorilla in the room:
                                > >
                                > > >>> sys.getrefcount(None)
                                > > 1559

                                On Mon, Mar 15, 2004 at 10:15:33AM -0700, David MacQuigg wrote:
                                > Implementation detail. ( half wink )

                                I'd round that down from half to None, personally.

                                This is guaranteed to work:

                                    x = None
                                    y = None
                                    assert x is y

                                by the following text in the language manual:

                                    None
                                        This type has a single value. There is a single object with
                                        this value. This object is accessed through the built-in
                                        name None. It is used to signify the absence of a value in
                                        many situations, e.g., it is returned from functions that
                                        don't explicitly return anything. Its truth value is false.

                                There are reams of code that rely on the object identity of None, so a
                                special debug mode where "x = <some literal>" makes x refer to something
                                that has a refcount of 1 will break code.

                                The 'is' guarantee applies to at least these built-in values:
                                None Ellipsis NotImplemented True False
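A small sketch of the kind of code that relies on this guarantee, assuming Python 3; the function `find` and its arguments are invented for illustration. The caller has to test identity against None because a valid result of 0 is also falsy.

```python
def find(items, target, default=None):
    """Return the index of target in items, or default if absent."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return default


hit = find(["a", "b"], "a")
miss = find(["a", "b"], "z")

# Index 0 is falsy, so only an identity test against None tells
# "found at position 0" apart from "not found".
found_first = hit is not None
found_missing = miss is not None
```

If each `None` in the source produced a distinct object, the `default=None` in the signature and the `None` in the caller's test would no longer be the same object, and checks like these would silently give the wrong answer.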

                                The only problem I can see with reload() is that it doesn't do what you
                                want. But on the other hand, what reload() does is perfectly well
                                defined, and at least the avenues I've seen explored for "enhancing" it
                                look, well, like a train wreck.

                                Jeff
