Re: duck-type-checking?

  • Joe Strout

    Re: duck-type-checking?

    On Nov 12, 2008, at 10:45 AM, Tim Rowe wrote:
    > What do you actually mean by "Quacks like a string"? Supports the
    > 'count()' method? Then you find out if it doesn't when you try to
    > apply the 'count()' method. Supports some method that you don't
    > actually use? Then why do you care?

    Because if I write a method with the intention of treating the
    arguments like strings in various ways (slicing, combining with other
    strings, printing to stdout or writing to a file, etc. etc.), and some
    idiot (i.e. me six weeks later or long after I should have gone to
    bed) manages to accidentally pass in something else, then I want my
    program to blow up right away, not plant a roadside bomb and
    cheerfully wait for me to drive by.

    This is not hypothetical -- just last week I had a hard-to-track-down
    abend that ultimately turned out to be an NLTK.Tree object stored
    someplace that I expected to only contain strings. I found it by
    littering my code with assertions of the form
    isinstance(foo, basestring). If I'd had those in there in the first
    place, not only documenting my assumptions but letting the computer
    check them for me, it would have saved me a lot of grief.

    But in the spirit of duck-typing, I shouldn't actually check that foo
    is a basestring. I should instead check that foo quacks like a
    basestring. I'd define that as:

    "x quacks like a basestring if it implements all the public methods of
    basestring, and can be used in pretty much any context that a
    basestring can."

    I have to say "pretty much" since obviously there may be some evil
    context that actually checks isinstance. But that's the pathological
    case, and we shouldn't let it prevent us from neatly handling the
    typical case.

    > The point about duck typing is that something might quack like a duck
    > but not walk like a duck -- one of those duck calls that hunters use,
    > for instance. Quacking like a duck doesn't actually mean it /is/ a
    > duck, it means that it will do instead of a duck if the quack is all
    > you want.

    Well, that's one point, but it's not the only point. If I have code
    that expects to be working with strings, and I want to purposely give
    it something else, then it's reasonable to expect that the something-
    else will act like a string in every way that a string is likely to be
    exercised. My string wrapper or doppleganger_string or whatever
    should implement all the methods, and support all the operators and
    type conversions, that basestring does.

    > If you need to know that it walks like a duck, mates like a duck and
    > tastes like a duck when roasted, you probably want it to really /be/ a
    > duck and should go back to inheritance.

    I can't agree; there are times when inheritance just won't do, for
    example when you don't have control over the object creation, because
    they come from some factory method you can't change. In that case you
    may need to make a wrapper instead of a subclass, but if you've
    faithfully implemented the interface of the original class, you should
    be able to use it wherever the original class could be used (within
    reason).

    So, since it's pretty clear by now that there's no standard idiom for
    this, I'll try to cook up something myself. For classes, I think we
    could do a two-stage test:

    1. If the given object isinstance of the specified class, then all is
    good and return immediately.
    2. Otherwise, check each of the public attributes of the specified
    class, and make sure that the given object has corresponding callable
    attributes.

    For case 2, we might be able to cache the result so that we don't do
    all that work again the next time the same type comparison is done.

    Anyway, I'll evolve something in our shop here and live with it a
    while, and in a few months I'll either share what we develop for this
    purpose, or admit it was a horrible idea all along. :)

    Cheers,
    - Joe

  • pruebauno@latinmail.com

    #2
    Re: duck-type-checking?

    On Nov 12, 1:22 pm, Joe Strout <j...@strout.net> wrote:
    > So, since it's pretty clear by now that there's no standard idiom for
    > this, I'll try to cook up something myself. For classes, I think we
    > could do a two-stage test:
    >
    > 1. If the given object isinstance of the specified class, then all is
    > good and return immediately.
    > 2. Otherwise, check each of the public attributes of the specified
    > class, and make sure that the given object has corresponding callable
    > attributes.

    It seems to me that what you are describing is exactly what abcs were
    added for in 2.6, in particular registration:

    from abc import ABCMeta

    class AnotherClass(object):
        __metaclass__ = ABCMeta  # Python 2.6 spelling; Python 3 uses metaclass=ABCMeta

    AnotherClass.register(basestring)

    assert issubclass(str, AnotherClass)

    Please read this first:
    This is a proposal to add Abstract Base Class (ABC) support to Python 3000. It proposes:


    and tell us why that would not work.
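
For readers on Python 3, here is a minimal sketch of the same registration idea. `StringLike` and `MyText` are hypothetical names, and `str` stands in for `basestring`, which no longer exists there. Note that `register()` is only a declaration: it does not verify that the registered class implements anything.

```python
from abc import ABC


class StringLike(ABC):
    """Hypothetical ABC meaning 'acts like a string for our purposes'."""


class MyText:
    """Hypothetical wrapper around a string obtained from elsewhere."""

    def __init__(self, value):
        self.value = value


# Register existing classes as virtual subclasses; no inheritance required.
StringLike.register(str)
StringLike.register(MyText)

# Class-level checks use issubclass(); object-level checks use isinstance().
assert issubclass(str, StringLike)
assert isinstance("hello", StringLike)
assert isinstance(MyText("hi"), StringLike)
assert not isinstance(42, StringLike)
```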


    • Joe Strout

      #3
      Re: duck-type-checking?

      On Nov 12, 2008, at 11:48 AM, pruebauno@latinmail.com wrote:
      > It seems to me that what you are describing is exactly what abcs were
      > added for in 2.6, in particular registration:
      >
      > from abc import ABCMeta
      >
      > class AnotherClass(object):
      >     __metaclass__ = ABCMeta  # Python 2.6 spelling; Python 3 uses metaclass=ABCMeta
      >
      > AnotherClass.register(basestring)
      >
      > assert issubclass(str, AnotherClass)
      >
      > Please read this first:
      > This is a proposal to add Abstract Base Class (ABC) support to Python 3000. It proposes:
      >
      > and tell us why that would not work.

      You're right, that is exactly the need I was looking to fill. Thanks
      for pointing it out!

      Now I have only two regrets: first, that our shop is still using 2.5
      and this functionality is new in 2.6; and second, that it does not
      appear to include an easy way to check the types of elements of a
      sequence or mapping.
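
One way to cover the second gap is a pair of small helpers. This is only a sketch, assuming Python 3; `all_instances` and `mapping_of` are hypothetical names, not part of the ABC machinery.

```python
from collections.abc import Mapping


def all_instances(items, cls):
    """True if every element of items is an instance of cls.

    Walks the whole iterable, so it costs O(n) and will exhaust a
    one-shot iterator; pass a list or tuple if you need the data again.
    """
    return all(isinstance(item, cls) for item in items)


def mapping_of(mapping, key_cls, value_cls):
    """True if mapping is a Mapping whose keys and values have the given types."""
    return isinstance(mapping, Mapping) and all(
        isinstance(k, key_cls) and isinstance(v, value_cls)
        for k, v in mapping.items()
    )


assert all_instances(["a", "b"], str)
assert not all_instances(["a", 3], str)
assert mapping_of({"x": 1}, str, int)
```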

      Still, I can see that this is a very carefully considered PEP and is
      the best solution for the future, so I will study it carefully and
      incorporate it into our practices ASAP.

      Thanks,
      - Joe


      • greg

        #4
        Re: duck-type-checking?

        Joe Strout wrote:
        > This is not hypothetical -- just last week I had a hard-to-track-down
        > abend that ultimately turned out to be an NLTK.Tree object stored
        > someplace that I expected to only contain strings. I found it by
        > littering my code with assertions of the form
        > isinstance(foo, basestring).

        But you have to ask yourself whether the time taken to
        write and maintain all these assertions is really
        cost-effective. In my experience, occurrences like this
        are actually extremely rare -- the vast majority of the
        time, you do get a failure very quickly, and it's fairly
        obvious from the traceback where to look for the problem.

        It's as annoying as hell when something like this does
        happen, but averaged over the lifetime of the project,
        I find it doesn't cost all that much time.

        --
        Greg


        • Ben Finney

          #5
          Re: duck-type-checking?

          Joe Strout <joe@strout.net> writes:
          > Because if I write a method with the intention of treating the
          > arguments like strings in various ways (slicing, combining with
          > other strings, printing to stdout or writing to a file, etc. etc.),
          > and some idiot (i.e. me six weeks later or long after I should have
          > gone to bed) manages to accidentally pass in something else, then I
          > want my program to blow up right away, not plant a roadside bomb and
          > cheerfully wait for me to drive by.

          This is better achieved, not by littering the functional code unit with
          numerous assertions that obscure the normal function of the code, but
          rather by employing comprehensive unit tests *separate from* the code
          unit.

          This suite of unit tests is then employed and executed any time the
          code changes, and indicates any regressions in functionality from what
          is asserted in the tests. Meanwhile, the code unit itself is clear and
          free from the masses of error checking that one often finds in code
          written without such unit test suites.

          > This is not hypothetical -- just last week I had a
          > hard-to-track-down abend that ultimately turned out to be an
          > NLTK.Tree object stored someplace that I expected to only contain
          > strings. I found it by littering my code with assertions of the form
          > isinstance(foo, basestring). If I'd had those in there in the first
          > place, not only documenting my assumptions but letting the computer
          > check them for me, it would have saved me a lot of grief.

          Rather than littering these assertions *within* the code unit, and
          thereby making that code harder to read later, those assertions would
          have been better placed in a unit test module that tests the behaviour
          of the unit in response to environment and input.

          This has the additional benefit of highlighting parts of one's code
          that do not have well-defined interfaces: if the code's behaviour in
          response to a simple, narrow specification of inputs and environment
          is not easy to describe in a simple true-or-false testable assertion,
          then the interface and/or the internals of the code unit is very
          likely too complex to be maintained well.

          > But in the spirit of duck-typing, I shouldn't actually check that
          > foo is a basestring. I should instead check that foo quacks like a
          > basestring. I'd define that is:
          >
          > "x quacks like a basestring if it implements all the public methods
          > of basestring, and can be used in pretty much any context that a
          > basestring can."

          That is not duck typing. Rather than checking what foo does in
          response to prodding that, by your admission, is only designed to find
          out what type it is, duck typing instead advocates that you should use
          foo *as though it is known to be* the type of object you want. If it
          is not suitable, then appropriate exceptions will be raised and either
          caught by some code that knows how to handle them, or crash the
          program.

          Unit tests then come into play by feeding inputs to your code unit and
          asserting that the code behaves appropriately whether fed correct or
          incorrect inputs: correct inputs, even of wholly unanticipated types,
          should cause correct behaviour (i.e. if the code is designed to handle
          ducks it should be content with a goose also); and incorrect inputs
          are incorrect by definition *only if* they cause the code unit to be
          unable to behave correctly (in which case an appropriate exception
          should be raised).
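
A minimal sketch of what such a test module could look like, assuming Python 3 and the standard `unittest` module. The unit under test (`shout`) and the `Goose` class are hypothetical, made up purely for illustration.

```python
import unittest


def shout(text):
    """Hypothetical unit under test: upper-case the text and add '!'."""
    return text.upper() + "!"


class Goose:
    """Not a str, but supplies the one method shout() relies on."""

    def upper(self):
        return "HONK"


class ShoutTests(unittest.TestCase):
    def test_real_string(self):
        self.assertEqual(shout("quack"), "QUACK!")

    def test_unanticipated_but_suitable_type(self):
        # Code designed for ducks should be content with a goose.
        self.assertEqual(shout(Goose()), "HONK!")

    def test_unsuitable_input_raises(self):
        # An int cannot be upper-cased, so an exception is the correct outcome.
        with self.assertRaises(AttributeError):
            shout(42)
```

Saved as a module, the tests can be run with `python -m unittest`; the unit itself contains no type checks at all.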

          --
          \ “Pinky, are you pondering what I'm pondering?” “I think so, |
          `\ Brain, but Zero Mostel times anything will still give you Zero |
          _o__) Mostel.” —_Pinky and The Brain_ |
          Ben Finney


          • paul

            #6
            Re: duck-type-checking?

            Ben Finney wrote:
            > Joe Strout <joe@strout.net> writes:
            >> "x quacks like a basestring if it implements all the public methods
            >> of basestring, and can be used in pretty much any context that a
            >> basestring can."
            >
            > That is not duck typing. Rather than checking what foo does in
            > response to prodding that, by your admission, is only designed to find
            > out what type it is, duck typing instead advocates that you should use
            > foo *as though it is known to be* the type of object you want. If it
            > is not suitable, then appropriate exceptions will be raised and either
            > caught by some code that knows how to handle them, or crash the
            > program.

            Warning, rant ;)

            This whole theory breaks down quickly if you're writing library code.
            How do your unittests help the user of your library to use it correctly?
            How do you communicate incorrect usage of your interfaces to the user?

            If you are able to specify the type of the arguments as part of the
            interface the compiler/interpreter will help you. Types are used to
            describe behaviour (if that's a good thing, I don't know). While Python
            has strong types, they could be used better (instead it gets worse, see
            the suddenly-not-sortable-list-type discussion and the endless
            repetition of the greatest of all after-the-fact theories ever, "duck
            typing").

            cheers
            Paul

            BTW: Back to Java? No, not really.



            • Tim Rowe

              #7
              Re: duck-type-checking?

              2008/11/13 Ben Finney <bignose+hates-spam@benfinney.id.au>:
              > That is not duck typing.

              Oh, I'm pretty sure it is. It just isn't /using/ the duck typing in
              the way you'd like.

              --
              Tim Rowe


              • Steve Holden

                #8
                Re: duck-type-checking?

                greg wrote:
                > Joe Strout wrote:
                >> This is not hypothetical -- just last week I had a hard-to-track-down
                >> abend that ultimately turned out to be an NLTK.Tree object stored
                >> someplace that I expected to only contain strings. I found it by
                >> littering my code with assertions of the form
                >> isinstance(foo, basestring).
                >
                > But you have to ask yourself whether the time taken to
                > write and maintain all these assertions is really
                > cost-effective. In my experience, occurrences like this
                > are actually extremely rare -- the vast majority of the
                > time, you do get a failure very quickly, and it's fairly
                > obvious from the traceback where to look for the problem.
                >
                > It's as annoying as hell when something like this does
                > happen, but averaged over the lifetime of the project,
                > I find it doesn't cost all that much time.

                This is particularly true since Joe is still proposing to check the
                objects when they are passed to functions and methods that use them.
                Unfortunately the assignment of the "wrong type of object" may well have
                taken place aeons ago, and it's the assignments you really need to catch
                (and it's also the assignments that static languages cause compile
                errors on).

                So it hardly matters whether the code blows up with

                Exception: my expensive type checks sounded an alarm

                or

                Attribute error: foo has no method bar()

                since both are equally informative when it comes to tracing the faulty
                assignment.
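
To make the "catch it at the assignment" point concrete, here is a minimal sketch, assuming Python 3, of a container that checks items as they are stored instead of when they are used. `StringList` is a hypothetical name, and it guards only `append()`; `extend()`, `insert()`, and item assignment would need the same treatment in real code.

```python
class StringList(list):
    """Hypothetical list that rejects non-str items at storage time."""

    def append(self, item):
        # Fail where the wrong object enters the container, so the
        # traceback points at the faulty assignment itself.
        if not isinstance(item, str):
            raise TypeError("expected str, got %s" % type(item).__name__)
        super().append(item)


names = StringList()
names.append("noun phrase")

try:
    names.append(object())  # the mistake is reported here, not at first use
except TypeError as exc:
    print(exc)
```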

                regards
                Steve
                --
                Steve Holden +1 571 484 6266 +1 800 494 3119
                Holden Web LLC http://www.holdenweb.com/



                  • Craig Allen

                    #10
                    Re: duck-type-checking?

                    > This is better achieved, not by littering the functional code unit with
                    > numerous assertions that obscure the normal function of the code, but
                    > rather by employing comprehensive unit tests *separate from* the code
                    > unit.

                    that doesn't seem to work too well when shipping a library for someone
                    else to use... we don't have access to the caller's code that needs to
                    be checked. I suppose if the intent is to have a true assert, that
                    does nothing in shipped code, then you can argue that testing
                    addresses some of the issues, but one, not all of them, specifically,
                    not the part where the problem is ably reported, and two, I don't
                    think we can assume assert meant that sort of assert macro in C which
                    compiles away in release versions.

                    Asserts also do not litter code, they communicate the assumptions of
                    the code. I like the idea of a general duck-type assertion and would
                    probably use that, especially since I also have arguments which can be
                    multiple objects, each with their own interface but similar meaning...
                    i.e. lower level file objects can be passed in, or my higher level
                    abstraction of the same file.
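
A general duck-type assertion of this kind could be as small as the sketch below, assuming Python 3. `require_methods` and `first_line` are hypothetical names; the check covers only the methods the function actually uses, so both a low-level file object and a higher-level abstraction would pass.

```python
import io


def require_methods(obj, *names):
    """Raise TypeError early if obj lacks any of the named callable attributes."""
    missing = [n for n in names if not callable(getattr(obj, n, None))]
    if missing:
        raise TypeError(
            "%s is missing: %s" % (type(obj).__name__, ", ".join(missing))
        )
    return obj


def first_line(source):
    """Return the first line of any object that offers readline()."""
    require_methods(source, "readline")
    return source.readline()


print(first_line(io.StringIO("alpha\nbeta\n")))
```

Unlike a bare `assert`, the explicit `raise` still fires when Python runs with `-O`, which matters for library code shipped to other people.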


                    • Craig Allen

                      #11
                      Re: duck-type-checking?

                      > since both are equally informative when it comes to tracing the faulty
                      > assignment.

                      steve, they are not equally informative, the assertion is designed to
                      fire earlier in the process, and therefore before much mischief and
                      corruption can be done compared to later, when you happen to hit the
                      missing attribute.


                      • Steve Holden

                        #12
                        Re: duck-type-checking?

                        Craig Allen wrote:
                        >> since both are equally informative when it comes to tracing the faulty
                        >> assignment.
                        >
                        > steve, they are not equally informative, the assertion is designed to
                        > fire earlier in the process, and therefore before much mischief and
                        > corruption can be done compared to later, when you happen to hit the
                        > missing attribute.

                        I disagree. The assertion may fire a few lines before the non-existent
                        attribute access, but typically the damage is done in an assignment
                        before the function that raises the error is even called.

                        And I therefore suspect that a whole load of heavyweight type-checking
                        will be done for no very good reason.

                        regards
                        Steve
                        --
                        Steve Holden +1 571 484 6266 +1 800 494 3119
                        Holden Web LLC http://www.holdenweb.com/


                        • Steven D'Aprano

                          #13
                          Re: duck-type-checking?

                          On Thu, 13 Nov 2008 14:28:49 +0100, paul wrote:
                          > Ben Finney wrote:
                          >> Joe Strout <joe@strout.net> writes:
                          >>> "x quacks like a basestring if it implements all the public methods of
                          >>> basestring, and can be used in pretty much any context that a
                          >>> basestring can."
                          >>
                          >> That is not duck typing. Rather than checking what foo does in response
                          >> to prodding that, by your admission, is only designed to find out what
                          >> type it is, duck typing instead advocates that you should use foo *as
                          >> though it is known to be* the type of object you want. If it is not
                          >> suitable, then appropriate exceptions will be raised and either caught
                          >> by some code that knows how to handle them, or crash the program.
                          >
                          > Warning, rant ;)
                          >
                          > This whole theory breaks down quickly if you're writing library code.
                          > How do your unittests help the user of your library to use it correctly?

                          They don't. Unittests don't help you get better fuel economy for your car
                          either. Neither of those things are the purpose of unittests.

                          > How do you communicate incorrect usage of your interfaces to the user?

                          With documentation and exceptions.

                          > If you are able to specify the type of the arguments as part of the
                          > interface the compiler/interpreter will help you.

                          Not in Python it won't. You can argue that this is a weakness of Python,
                          and you might even be correct (arguably, for some definition of
                          "weakness"), but it is by design and won't change anytime soon, not even
                          with ABCs.

                          > Types are used to
                          > describe behaviour (if that's a good thing, I don't know).

                          Types are one way of specifying behaviour, although they are subject to
                          false negatives and false positives. For example, type Spam may have the
                          exact same behaviour and interface as type Ham, but if the compiler has
                          been instructed to reject anything that is not Ham, it will wrongly
                          reject Spam. This is one of the problems duck-typing is meant to
                          ameliorate.
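
The Spam/Ham false negative can be shown in a few lines. This is a sketch only, assuming Python 3; the classes and the two `serve_*` functions are hypothetical.

```python
class Ham:
    def slice(self):
        return "sliced"


class Spam:
    """Unrelated to Ham by inheritance, but with the same interface."""

    def slice(self):
        return "sliced"


def serve_nominal(food):
    # Type-based check: rejects anything that is not declared to be Ham.
    if not isinstance(food, Ham):
        raise TypeError("only Ham accepted")
    return food.slice()


def serve_duck(food):
    # Duck typing: just use the object; unsuitable ones fail on their own.
    return food.slice()


print(serve_duck(Spam()))  # works: Spam behaves exactly like Ham

try:
    serve_nominal(Spam())  # wrongly rejected despite identical behaviour
except TypeError as exc:
    print(exc)
```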

                          [snip rest of rant, which seemed incoherent to me]



                          --
                          Steven
