Beginner question - How to effectively pass a large list

  • Carl Banks

    #16
    Re: Default parameters

    Terry Reedy wrote:
    > "Stian Søiland" <stain@stud.ntnu.no> wrote in message
    > news:slrnbtvte6.tnk.stain@ozelot.stud.ntnu.no...
    >> When is this issue going to be resolved? Enough newbie-pythoners have
    >> made this mistake now.
    >
    > I am puzzled as to why. When I learned Python, I
    > read something to the effect that default value
    > expressions are evaluated at definition time. I
    > understood that the resulting objects were saved
    > for later use (parameter binding) when needed (as
    > default for value not given). I believed this and
    > that was that.


    I am puzzled as to why you're puzzled. Not everyone who reads the
    manual pays attention to the time of evaluation explanation, if the
    manual they're using even covers it. Not everyone stops and says, "Oh my
    God, I don't know whether this is evaluated when the function is
    defined or called. I better find out." (And of course, not everyone
    reads the manual.)

    It seems that most people who haven't thought about time of evaluation
    tend to expect it to be evaluated when the function is called; I know
    I would expect this. (I think I even made the mistake.)



    --
    CARL BANKS http://www.aerojockey.com/software
    "If you believe in yourself, drink your school, stay on drugs, and
    don't do milk, you can get work."
    -- Parody of Mr. T from a Robert Smigel Cartoon


    • Carl Banks

      #17
      Re: Default parameters

      Paul Rubin wrote:
      > "Greg Ewing (using news.cis.dfn.de)" <g2h5dqi002@sneakemail.com> writes:
      >> In this case, evaluating the default args at call time would
      >> have a negative payoff, since it would slow down every call to
      >> the function in cases where the default value doesn't need
      >> to be evaluated more than once.
      >
      > In those cases the compiler should notice it and generate appropriate
      > code to evaluate the default arg just once. In many of the cases it
      > can put a static value into the .pyc file.

      In a perfect world, that would be a good way to do it. However,
      Python is NOT in the business of deciding whether an arbitrary object
      is constant or not, except maybe in the parsing stages. Internally,
      it's just not built that way.

      If I were designing, I would definitely make it the language's (and
      extension writers') business, because there is a lot of opportunity
      for optimization.


      --
      CARL BANKS http://www.aerojockey.com/software
      "If you believe in yourself, drink your school, stay on drugs, and
      don't do milk, you can get work."
      -- Parody of Mr. T from a Robert Smigel Cartoon


      • Carl Banks

        #18
        Re: Default parameters

        Stian Søiland wrote:
        > * J.R. spake thusly:
        >> > def f(d=[]):
        >> >     d.append(0)
        >> >     print d
        >> > f()
        >> > f()
        >> > Explain results. When is d bound?
        >
        > When is this issue going to be resolved? Enough newbie-pythoners have
        > made this mistake now.
        >
        > Why not evaluate the parameter lists at calltime instead of definition
        > time? This should work the same way as lambdas.


        Consider something like this:

        def func(param=((1, 2), (3, 4), (5, 6), (7, 8))):
            whatever

        Do you really want to be building a big-ass nested tuple every time
        the function is called?

        Python evaluates default args at time of definition mostly for
        performance reasons (and maybe so we could simulate closures before we
        had real closures). My gut feeling is, moving the evaluation to call
        time would be too much of a performance hit to justify it.
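
For anyone who wants to see the definition-time behaviour for themselves, here is a minimal sketch. It uses Python 3 syntax (the thread predates it, so `print` is a function and `func_defaults` is spelled `__defaults__`), and `make_default` and `evaluations` are hypothetical names introduced only for this illustration.

```python
evaluations = []  # records every time the default expression is evaluated


def make_default():
    # Hypothetical helper standing in for any default-value expression.
    evaluations.append("evaluated")
    return []


def f(d=make_default()):  # make_default() runs here, once, when 'def' executes
    d.append(0)
    return d


first = f()
second = f()

print(len(evaluations))             # 1: evaluated once, not once per call
print(first is second)              # True: both calls received the same list
print(second)                       # [0, 0]
print(f.__defaults__[0] is second)  # True: the list is stored on the function
```

The list lives on the function object itself, which is why it survives between calls.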


        --
        CARL BANKS http://www.aerojockey.com/software
        "If you believe in yourself, drink your school, stay on drugs, and
        don't do milk, you can get work."
        -- Parody of Mr. T from a Robert Smigel Cartoon


        • Paul Rubin

          #19
          Re: Default parameters

          Carl Banks <imbosol@aerojockey.invalid> writes:
          > It seems that most people who haven't thought about time of evaluation
          > tend to expect it to be evaluated when the function is called; I know
          > I would expect this. (I think I even made the mistake.)

          The principle of least astonishment then suggests that Python made
          a suboptimal choice.


          • Paul Rubin

            #20
            Re: Default parameters

            Carl Banks <imbosol@aerojockey.invalid> writes:
            > Consider something like this:
            >
            > def func(param=((1, 2), (3, 4), (5, 6), (7, 8))):
            >     whatever
            >
            > Do you really want to be building a big-ass nested tuple every time
            > the function is called?

            Come on, the compiler can easily recognize that that list is constant.

            > Python evaluates default args at time of definition mostly for
            > performance reasons (and maybe so we could simulate closures before we
            > had real closures). My gut feeling is, moving the evaluation to call
            > time would be too much of a performance hit to justify it.

            Python takes so many other performance hits for the sake of
            convenience and/or clarity that this particular one would be minuscule
            by comparison.


            • JCM

              #21
              Re: Beginner question - How to effectively pass a large list

              Alan Gauld <alan.gauld@btinternet.com> wrote:
              > On Tue, 16 Dec 2003 16:21:00 +0800, "J.R." <j.r.gao@motorola.com>
              > wrote:
              >> Actually, the python is passing the identity (i.e. memory address) of each
              >> parameter, and it will bind to a local name within the function.
              >>
              >> Right?
              >
              > Nope.
              > This is one case where understanding something of the insides of
              > Python helps. Basically Python variables are dictionary entries.
              > The variable values are the dictionary values associated with
              > the variable names which are the dictionary keys.
              >
              > Thus when you pass an argument to a function you are passing a
              > dictionary key. When the function uses the argument it looks up
              > the dictionary and uses the value found there.
              >
              > This applies to all sorts of things in Python including modules -
              > a local dictionary associated with the module, and classes -
              > another dictionary. Dictionaries are fundamental to how Python
              > works and memory addresses per se play no part in the proceedings.

              You're talking about the implementation of the interpreter. I
              wouldn't have used the term "memory address" as J.R. did, as this also
              implies something about the implementation, but it does make sense to
              say object IDs/object references are passed into functions and bound
              to names/variables.
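
Since the original question was about passing a large list, here is a small sketch of what "object references are passed and bound to names" means in practice. It is written in Python 3 syntax, and the function names (`describe`, `append_zero`, `rebind`) are made up for the illustration.

```python
def describe(items):
    # 'items' is a local name bound to the very object the caller passed in.
    return id(items), len(items)


def append_zero(items):
    items.append(0)      # mutates the shared list; the caller sees the change


def rebind(items):
    items = items + [0]  # builds a new list and rebinds only the local name
    return items


big = list(range(100000))
passed_id, passed_len = describe(big)
print(passed_id == id(big))     # True: same object, nothing was copied

append_zero(big)
print(len(big))                 # 100001

new_list = rebind(big)
print(len(big), len(new_list))  # 100001 100002
print(new_list is big)          # False
```

So passing a large list costs the same as passing a small one: only the reference is handed over.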


              • Bengt Richter

                #22
                Re: Beginner question - How to effectively pass a large list

                On Wed, 17 Dec 2003 18:46:10 GMT, alan.gauld@btinternet.com (Alan Gauld) wrote:

                >On Tue, 16 Dec 2003 16:21:00 +0800, "J.R." <j.r.gao@motorola.com>
                >wrote:
                >> Actually, the python is passing the identity (i.e. memory address) of each
                >> parameter, and it will bind to a local name within the function.
                >>
                >> Right?
                Depends on what you mean by "python" ;-) Python the language doesn't pass memory
                addresses, but an implementation of Python might very well. The distinction is
                important, or implementation features will be misconstrued as language features.

                I suspect Alan is trying to steer you away from discussing implementation. I think
                it can be useful to talk about both, if the discussion can be plain about which
                it is talking about.
                >
                >Nope.
                IMO that is a little too dismissive ;-)

                >This is one case where understanding something of the insides of
                >Python helps. Basically Python variables are dictionary entries.
                >The variable values are the dictionary values associated with
                >the variable names which are the dictionary keys.
                >
                >Thus when you pass an argument to a function you are passing a
                >dictionary key. When the function uses the argument it looks up
                >the dictionary and uses the value found there.
                That's either plain wrong or mighty misleading. IOW this glosses
                over (not to say mangles) some important details, and not just of
                the implementation. E.g., when you write

                foo(bar)

                you are not passing a "key" (bar) to foo. Yes, foo will get access to
                the object indicated by bar by looking in _a_ "dictionary". But
                when foo accesses bar, it will be using as "key" the parameter name specified
                in the parameter list of the foo definition, which will be found in _another_
                "dictionary", i.e., the one defining the local namespace of foo. I.e., if foo is

                def foo(x): return 2*x

                and we call foo thus
                bar = 123
                print foo(bar)

                What happens is that 'bar' is a key in the global (or enclosing) scope and 'x' is a
                key in foo's local scope. Thus foo never sees 'bar', it sees 'x'. It is the job of
                the function-calling implementation to bind parameter name x to the same thing as the specified
                arg name (bar here) is bound to, before the first line in foo is executed. You could write

                globals()['bar'] = 123
                print foo(globals()['bar'])

                and

                def foo(x): return locals()['x']*2

                to get the flavor of what's happening. Functions would be little more than global-access macros
                if it were not for the dynamic of binding local function parameter names to the call-time args.
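
The point about two separate namespaces can be checked directly. This is only a sketch in Python 3 syntax, and it leans on CPython's `locals()` and `id()` to peek at what is otherwise an abstract binding.

```python
def foo(x):
    # At this point the local namespace holds exactly one name: the parameter.
    return sorted(locals()), id(x)


bar = [1, 2, 3]
local_names, seen_id = foo(bar)

print(local_names)         # ['x']: foo never sees the name 'bar'
print(seen_id == id(bar))  # True: 'x' was bound to the object 'bar' refers to
```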
                >
                >This applies to all sorts of things in Python including modules -
                >a local dictionary associated with the module, and classes -
                >another dictionary. Dictionaries are fundamental to how Python
                >works and memory addresses per se play no part in the proceedings.
                Well, ISTM that is contrasting implementation and semantics. IOW, memory addresses may
                (and very likely do) or may not play a part in the implementation, but Python
                the language is not concerned with that for its _definition_ (though of course implementers
                and users are concerned about _implementation_ for performance reasons).

                I think the concept of name space is more abstract and more helpful in encompassing the
                various ways of finding objects by name that python implements. E.g., when you interactively
                type dir(some_object), you will get a list of key names, but typically not from one single dictionary.
                There is potentially a complex graph of "dictionaries" to search according to specific rules
                defining order for the name in question. Thus one could speak of the whole collection of visible
                names in that process as a (complex) name space, or one could speak of a particular dict as
                implementing a (simple) name space.
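
As a rough illustration of dir() drawing on several dictionaries at once, consider the sketch below (Python 3 syntax; the class and attribute names are hypothetical).

```python
class Base:
    inherited = 1


class Child(Base):
    own = 2


obj = Child()
obj.instance_attr = 3

# Each name lives in a different dictionary...
print('instance_attr' in vars(obj))  # True: the instance's own dict
print('own' in vars(obj))            # False: it lives in Child.__dict__
print('inherited' in vars(Child))    # False: it lives in Base.__dict__

# ...yet dir() reports all of them as one combined name space.
names = dir(obj)
print(all(n in names for n in ('instance_attr', 'own', 'inherited')))  # True
```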

                HTH

                Regards,
                Bengt Richter


                • Asun Friere

                  #23
                  Re: Beginner question - How to effectively pass a large list

                  "J.R." <j.r.gao@motorola.com> wrote in message news:<broje1$abb$1@newshost.mot.com>...

                  > 1. There is no value passed to the default argument
                  > The name "d" is bound to the first element of the f.func_defaults. Since the
                  > function "f" is an object, which will be kept alive as long as there is name
                  > (current is "f") referred to it, the list in the func_defaults shall be
                  > accumulated by each invoking.
                  >

                  ...

                  >
                  > I think we could eliminate such accumulation effect by changing the function
                  > as follow:
                  > >>> def f(d=[]):
                  > ...     d = d+[0]
                  > ...     print d
                  >

                  And the reason this eliminates the accumulation is that the assignment
                  ('d = d + [0]') rebinds the name 'd' to the new list object ([] +
                  [0]), i.e. it no longer points to the first value in f.func_defaults.

                  What surprised me was that the facially equivalent:
                  >>> def f (d=[]) :
                  ...     d += [0]
                  ...     print d
                  did not do so. Apparently '+=' in regards to lists acts like
                  list.append, rather than as the assignment operator it looks like.


                  • Jp Calderone

                    #24
                    Re: Beginner question - How to effectively pass a large list

                    On Thu, Dec 18, 2003 at 03:29:55PM -0800, Asun Friere wrote:
                    > "J.R." <j.r.gao@motorola.com> wrote in message news:<broje1$abb$1@newshost.mot.com>...
                    >
                    > > 1. There is no value passed to the default argument
                    > > The name "d" is bound to the first element of the f.func_defaults. Since the
                    > > function "f" is an object, which will be kept alive as long as there is name
                    > > (current is "f") referred to it, the list in the func_defaults shall be
                    > > accumulated by each invoking.
                    > >
                    >
                    > ...
                    >
                    > > I think we could eliminate such accumulation effect by changing the function
                    > > as follow:
                    > > >>> def f(d=[]):
                    > > ...     d = d+[0]
                    > > ...     print d
                    > >
                    >
                    > And the reason this eliminates the accumulation is that the assignment
                    > ('d = d + [0]') rebinds the name 'd' to the new list object ([] +
                    > [0]), i.e. it no longer points to the first value in f.func_defaults.
                    >
                    > What surprised me was that the facially equivalent:
                    > >>> def f (d=[]) :
                    > ...     d += [0]
                    > ...     print d
                    > did not do so. Apparently '+=' in regards to lists acts like
                    > list.append, rather than as the assignment operator it looks like.

                    list.extend, to be more precise, or list.__iadd__ to be completely precise
                    :) The maybe-mutate-maybe-rebind semantics of += lead me to avoid its use
                    in most circumstances.
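
The maybe-mutate-maybe-rebind behaviour can be seen side by side in the sketch below. It is in Python 3 syntax, and the function names are invented for the example.

```python
def plus_then_assign(d=[]):
    d = d + [0]  # list.__add__ builds a new list; the stored default is untouched
    return d


def augmented(d=[]):
    d += [0]     # list.__iadd__ extends the stored default list in place
    return d


print(plus_then_assign())  # [0]
print(plus_then_assign())  # [0]
print(augmented())         # [0]
print(augmented())         # [0, 0]

# A tuple has no __iadd__, so += falls back to + and merely rebinds the name.
t = original = (1, 2)
t += (3,)
print(t, original)         # (1, 2, 3) (1, 2)
print(hasattr(list, '__iadd__'), hasattr(tuple, '__iadd__'))  # True False
```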

                    Jp


                    • Greg Ewing (using news.cis.dfn.de)

                      #25
                      Re: Default parameters

                      Paul Rubin wrote:
                      > "Greg Ewing (using news.cis.dfn.de)" <g2h5dqi002@sneakemail.com> writes:
                      >
                      >>Changes are rarely if ever made to Python for the sole reason
                      >>of reducing newbie mistakes.
                      >
                      >
                      > print 3/2

                      That's not a counterexample! There are sound non-newbie-related
                      reasons for wanting to fix that.

                      --
                      Greg Ewing, Computer Science Dept,
                      University of Canterbury,
                      Christchurch, New Zealand



                      • Greg Ewing (using news.cis.dfn.de)

                        #26
                        Re: Default parameters

                        Paul Rubin wrote:
                        > In those cases the compiler should notice it and generate appropriate
                        > code to evaluate the default arg just once.

                        How is the compiler supposed to know? In the general case
                        it requires reading the programmer's mind.

                        --
                        Greg Ewing, Computer Science Dept,
                        University of Canterbury,
                        Christchurch, New Zealand



                        • Dang Griffith

                          #27
                          Re: Default parameters

                          On Thu, 18 Dec 2003 18:25:27 +1300, "Greg Ewing (using
                          news.cis.dfn.de)" <g2h5dqi002@sneakemail.com> wrote:

                          >Stian Søiland wrote:
                          >> When is this issue going to be resolved? Enough newbie-pythoners have
                          >> made this mistake now.
                          >
                          >Changes are rarely if ever made to Python for the sole reason
                          >of reducing newbie mistakes. There needs to be a payoff for
                          >long-term use of the language as well.

                          I strongly concur with this principle. Too often, I've been on a team
                          that goes to such efforts to make a system easy for newbies to learn
                          that the normal/advanced users are then handicapped by a dumbed-down
                          interface. E.g., after you've performed some process enough times, do
                          you *really* need a step-by-step wizard?

                          --dang


                          • Carl Banks

                            #28
                            Re: Default parameters

                            Paul Rubin wrote:
                            > Carl Banks <imbosol@aerojockey.invalid> writes:
                            >> Consider something like this:
                            >>
                            >> def func(param=((1, 2), (3, 4), (5, 6), (7, 8))):
                            >>     whatever
                            >>
                            >> Do you really want to be building a big-ass nested tuple every time
                            >> the function is called?
                            >
                            > Come on, the compiler can easily recognize that that list is constant.

                            Yes, but that doesn't account for all expensive parameters. What
                            about this:

                            DEFAULT_LIST = ((1,2),(3,4),(5,6),(7,8))

                            def func(param=DEFAULT_LIST):
                                pass

                            Or this:

                            import external_module

                            def func(param=external_module.create_constant_object()):
                                pass

                            Or how about this:

                            def func(param={'1': 'A', '2': 'B', '3': 'C', '4': 'D'}):
                                pass


                            The compiler couldn't optimize any of the above cases.

                            >> Python evaluates default args at time of definition mostly for
                            >> performance reasons (and maybe so we could simulate closures before we
                            >> had real closures). My gut feeling is, moving the evaluation to call
                            >> time would be too much of a performance hit to justify it.
                            >
                            > Python takes so many other performance hits for the sake of
                            > convenience and/or clarity that this particular one would be minuscule
                            > by comparison.


                            Well, I don't have any data, but my gut feeling is this would be
                            somewhat more than "minuscule" performance hit. Seeing how pervasive
                            default arguments are, I'm guessing it would be a very significant
                            slowdown if default arguments had to be evaluated every call.

                            But since I have no numbers, I won't say anything more about it.


                            --
                            CARL BANKS http://www.aerojockey.com/software
                            "If you believe in yourself, drink your school, stay on drugs, and
                            don't do milk, you can get work."
                            -- Parody of Mr. T from a Robert Smigel Cartoon


                            • Bengt Richter

                              #29
                              Re: Default parameters

                              On Sat, 20 Dec 2003 01:43:00 GMT, Carl Banks <imbosol@aerojockey.invalid> wrote:

                              >Paul Rubin wrote:
                              >> Carl Banks <imbosol@aerojockey.invalid> writes:
                              >>> Consider something like this:
                              >>>
                              >>> def func(param=((1, 2), (3, 4), (5, 6), (7, 8))):
                              >>>     whatever
                              >>>
                              >>> Do you really want to be building a big-ass nested tuple every time
                              >>> the function is called?
                              >>
                              >> Come on, the compiler can easily recognize that that list is constant.
                              >
                              >Yes, but that doesn't account for all expensive parameters. What
                              >about this:
                              >
                              > DEFAULT_LIST = ((1,2),(3,4),(5,6),(7,8))
                              >
                              > def func(param=DEFAULT_LIST):
                              >     pass
                              >
                              >Or this:
                              >
                              > import external_module
                              >
                              > def func(param=external_module.create_constant_object()):
                              >     pass
                              >
                              >Or how about this:
                              >
                              > def func(param={'1': 'A', '2': 'B', '3': 'C', '4': 'D'}):
                              >     pass
                              >
                              >
                              >The compiler couldn't optimize any of the above cases.
                              For the DEFAULT_LIST (tuple?) and that particular dict literal, why not?

                              >>> Python evaluates default args at time of definition mostly for
                              >>> performance reasons (and maybe so we could simulate closures before we
                              >>> had real closures). My gut feeling is, moving the evaluation to call
                              >>> time would be too much of a performance hit to justify it.
                              >>
                              >> Python takes so many other performance hits for the sake of
                              >> convenience and/or clarity that this particular one would be minuscule
                              >> by comparison.
                              >
                              >
                              >Well, I don't have any data, but my gut feeling is this would be
                              >somewhat more than "minuscule" performance hit. Seeing how pervasive
                              >default arguments are, I'm guessing it would be a very significant
                              >slowdown if default arguments had to be evaluated every call.
                              >
                              >But since I have no numbers, I won't say anything more about it.
                              >
                              Don't know if I got this right, but

                              [18:32] /d/Python23/Lib>egrep -c 'def .*=' *py |cut -d: -f 2|sum
                              Total = 816
                              [18:32] /d/Python23/Lib>egrep -c 'def ' *py |cut -d: -f 2|sum
                              Total = 4454

                              would seem to suggest pervasive ~ 816/4454
                              or a little less than 20%
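
The grep count above only catches defaults that appear on the same line as `def`. A rough alternative, sketched here in Python 3 with the standard `ast` module, counts them from the parsed source instead. `count_defaults` and `sample` are hypothetical names, and the sketch only works on source the running interpreter can parse (so not on the Python 2.3 library measured above).

```python
import ast


def count_defaults(source):
    """Return (functions_with_defaults, total_functions) for one source string."""
    with_defaults = total = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            total += 1
            # kw_defaults holds None for keyword-only args without a default.
            has_kw_default = any(d is not None for d in node.args.kw_defaults)
            if node.args.defaults or has_kw_default:
                with_defaults += 1
    return with_defaults, total


sample = '''
def a(x): pass
def b(x, y=1): pass
def c(x,
      y=None): pass
def d(*, key=0): pass
'''

print(count_defaults(sample))  # (3, 4)
```

Note that `c` is counted even though its default sits on the second line of the signature, which a line-based grep would miss.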

                              Of course that says nothing about which are typically called in hot loops ;-)
                              But I think it's a bad idea as a default way of operating anyway. You can
                              always program call-time evaluations explicitly. Maybe some syntactic sugar
                              could be arranged, but I think I would rather have some sugar for the opposite
                              instead -- i.e., being able to code a block of preset locals evaluated and bound
                              locally like current parameter defaults, but not being part of the call signature.
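
For reference, the usual way to program a call-time evaluation explicitly is a sentinel default. A minimal sketch in Python 3 syntax follows; `append_zero` is a made-up name for the example.

```python
def append_zero(d=None):
    # None is only a sentinel; the real default is built afresh on each call.
    if d is None:
        d = []
    d.append(0)
    return d


print(append_zero())        # [0]
print(append_zero())        # [0]: no accumulation between calls

shared = [7]
print(append_zero(shared))  # [7, 0]: an explicit argument is still used as-is
```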

                              Regards,
                              Bengt Richter


                              • Carl Banks

                                #30
                                Re: Default parameters

                                Bengt Richter wrote:
                                > On Sat, 20 Dec 2003 01:43:00 GMT, Carl Banks <imbosol@aerojockey.invalid> wrote:
                                >
                                >>Paul Rubin wrote:
                                >>> Carl Banks <imbosol@aerojockey.invalid> writes:
                                >>>> Consider something like this:
                                >>>>
                                >>>> def func(param=((1, 2), (3, 4), (5, 6), (7, 8))):
                                >>>>     whatever
                                >>>>
                                >>>> Do you really want to be building a big-ass nested tuple every time
                                >>>> the function is called?
                                >>>
                                >>> Come on, the compiler can easily recognize that that list is constant.
                                >>
                                >>Yes, but that doesn't account for all expensive parameters. What
                                >>about this:
                                >>
                                >> DEFAULT_LIST = ((1,2),(3,4),(5,6),(7,8))
                                >>
                                >> def func(param=DEFAULT_LIST):
                                >>     pass
                                >>
                                >>Or this:
                                >>
                                >> import external_module
                                >>
                                >> def func(param=external_module.create_constant_object()):
                                >>     pass
                                >>
                                >>Or how about this:
                                >>
                                >> def func(param={'1': 'A', '2': 'B', '3': 'C', '4': 'D'}):
                                >>     pass
                                >>
                                >>
                                >>The compiler couldn't optimize any of the above cases.
                                >
                                > For the DEFAULT_LIST (tuple?) and that particular dict literal, why not?


                                Well, the value of DEFAULT_LIST is not known at compile time (unless, I
                                suppose, this happens to be in the main module or command prompt).
                                The literal is not a constant, so the compiler couldn't optimize this.

                                (Remember, the idea is that default parameters should be evaluated at
                                call time, which would require the compiler to put the evaluations
                                inside the function's pseudo-code. The compiler could optimize default
                                parameters by evaluating them at compile time: but you can only do
                                that with constants, for obvious reasons.)
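
To make the DEFAULT_LIST case concrete: a module-level name can be rebound at any time, so its value is only known when the code runs. The sketch below (Python 3 syntax, with `lookup_at_call_time` as a hypothetical stand-in for call-time evaluation) shows the two behaviours diverging.

```python
DEFAULT_LIST = ((1, 2), (3, 4))


def func(param=DEFAULT_LIST):  # the name is looked up once, when 'def' executes
    return param


def lookup_at_call_time():
    # Stand-in for call-time evaluation: the name is looked up on every call.
    return DEFAULT_LIST


DEFAULT_LIST = "rebound later"  # nothing stops the module from rebinding the name

print(func())                 # ((1, 2), (3, 4))
print(lookup_at_call_time())  # rebound later
```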

                                >>Well, I don't have any data, but my gut feeling is this would be
                                >>somewhat more than "minuscule" performance hit. Seeing how pervasive
                                >>default arguments are, I'm guessing it would be a very significant
                                >>slowdown if default arguments had to be evaluated every call.
                                >>
                                >>But since I have no numbers, I won't say anything more about it.
                                >>
                                > Don't know if I got this right, but
                                >
                                > [18:32] /d/Python23/Lib>egrep -c 'def .*=' *py |cut -d: -f 2|sum
                                > Total = 816
                                > [18:32] /d/Python23/Lib>egrep -c 'def ' *py |cut -d: -f 2|sum
                                > Total = 4454
                                >
                                > would seem to suggest pervasive ~ 816/4454
                                > or a little less than 20%

                                Well, if you don't like the particular adjective I used, feel free to
                                substitute another. This happens a lot to me in c.l.p (see Martelli).
                                All I'm saying is, default arguments are common in Python code, and
                                slowing them down is probably going to be a significant performance
                                hit.

                                (You probably underestimated a little bit anyways: some functions
                                don't get to the default arguments until the second line.)

                                > Of course that says nothing about which are typically called in hot
                                > loops ;-) But I think it's a bad idea as a default way of operating
                                > anyway. You can always program call-time evaluations
                                > explicitly. Maybe some syntactic sugar could be arranged, but I think
                                > I would rather have some sugar for the opposite instead -- i.e.,
                                > being able to code a block of preset locals evaluated and bound
                                > locally like current parameter defaults, but not being part of the
                                > call signature.

                                Well, personally, I don't see much use for non-constant default
                                arguments, as we have them now, whereas they would be useful if you
                                could get a fresh copy. And, frankly, the default arguments feel like
                                they should be evaluated at call time. Now that we have nested
                                scopes, there's no need for them to simulate closures. So, from a
                                purely language perspective, I think they ought to be evaluated at
                                call time.

                                The only thing is, I very much doubt I'd be willing to take the
                                performance hit for it.
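
For readers who have not met the "simulate closures" trick mentioned above, here is a short sketch in Python 3 syntax (all names are made up). It also shows one place where definition-time defaults remain handy even with nested scopes: capturing a loop variable's current value.

```python
# Default-argument trick: each lambda captures the value i has when it is defined.
by_default = [lambda i=i: i for i in range(3)]

# Plain closures share the single variable i and see its final value.
by_closure = [lambda: i for i in range(3)]

print([f() for f in by_default])  # [0, 1, 2]
print([f() for f in by_closure])  # [2, 2, 2]


def make_adder(n):
    # With nested scopes, 'add' simply refers to n in the enclosing function.
    def add(x):
        return x + n
    return add


print(make_adder(2)(3))           # 5
```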


                                --
                                CARL BANKS http://www.aerojockey.com/software
                                "If you believe in yourself, drink your school, stay on drugs, and
                                don't do milk, you can get work."
                                -- Parody of Mr. T from a Robert Smigel Cartoon
