Next step after pychecker

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Philippe Fremy

    Next step after pychecker


    Hi,

    I would like to develop a tool that goes one step further than pychecker
    to ensure python program validity. The idea would be to get close to
    what people get on ocaml: a static verification of all types of the
    program, without any kind of variable declaration. This would definitely
    brings a lot of power to python.

    The idea is to analyse the whole program, identify constraints on
    function arguments and check that these constraints are verified by
    other parts of the program.

    What is in your opinion the best tool to achieve this ? I had an
    extensive look at pychecker, and it could certainly be extended to do
    the job. Things that still concern me are that it works on the bytecode,
    which prevents it from working with jython and the new .NET python.

    I am currently reading the documentation on AST and visitor, but I am
    not sure that this will be the best tool either. The AST seems quite
    deep and I am afraid that it will make the analysis quite slow and
    complicated.


    For a simple pattern of what I want to do:

    def f1( a ):
    return a+1

    def f2():
    f1( [] )

    Obviously, this won't work, but neither pychecker nor the python
    interpreter will catch it. But a program analysis can catch it in this
    simple example.

    regards,

    Philippe
  • Paul Rubin

    #2
    Re: Next step after pychecker

    Philippe Fremy <phil@freehacke rs.org> writes:[color=blue]
    > I would like to develop a tool that goes one step further than
    > pychecker to ensure python program validity. The idea would be to get
    > close to what people get on ocaml: a static verification of all types
    > of the program, without any kind of variable declaration. This would
    > definitely brings a lot of power to python.[/color]

    You know, I've always thought that ML-style type inference is an
    impressive technical feat, but why is it so important to not use
    declarations? This is an aspect I've never really understood.

    Comment

    • Steven Bethard

      #3
      Re: Next step after pychecker

      Philippe Fremy wrote:[color=blue]
      > I would like to develop a tool that goes one step further than pychecker
      > to ensure python program validity. The idea would be to get close to
      > what people get on ocaml: a static verification of all types of the
      > program, without any kind of variable declaration. This would definitely
      > brings a lot of power to python.[/color]

      It's a very cool idea, and there was talk of including such a tool in
      Python 3.0, though as yet, no one has volunteered to solve this problem.
      =) Note that you can't actually really ensure program validity, but you
      can make some educated guesses like pychecker does. An example:

      def f1(a):
      return a + 1

      def f2():
      return f1([])

      Your suggestion was that this code should be deemed invalid. But if
      it's followed with:

      f1 = ''.join
      f2()

      where f1 is rebound, then it's perfectly valid code. That said, I think
      providing "suggestion s" like pychecker does is definitely worth pursuing.
      [color=blue]
      > What is in your opinion the best tool to achieve this ? I had an
      > extensive look at pychecker, and it could certainly be extended to do
      > the job. Things that still concern me are that it works on the bytecode,
      > which prevents it from working with jython and the new .NET python.[/color]

      I don't know much about what pychecker does, but if it works with the
      bytecode, shouldn't it be fine for jython and IronPython? I thought the
      bytecode was part of the language spec, and what was CPython specific
      was how the bytecodes were actually implemented...
      [color=blue]
      > I am currently reading the documentation on AST and visitor, but I am
      > not sure that this will be the best tool either. The AST seems quite
      > deep and I am afraid that it will make the analysis quite slow and
      > complicated.[/color]

      You should read up on the recent python-dev discussions about merging
      the AST branch. I don't know too much about it, but apparently there's
      some work being done currently on how the AST is used in Python. Sorry
      I don't know more about this...


      Steve

      Comment

      • huy

        #4
        Re: Next step after pychecker

        Paul Rubin wrote:[color=blue]
        > Philippe Fremy <phil@freehacke rs.org> writes:
        >[color=green]
        >>I would like to develop a tool that goes one step further than
        >>pychecker to ensure python program validity. The idea would be to get
        >>close to what people get on ocaml: a static verification of all types
        >>of the program, without any kind of variable declaration. This would
        >>definitely brings a lot of power to python.[/color]
        >
        >
        > You know, I've always thought that ML-style type inference is an
        > impressive technical feat, but why is it so important to not use
        > declarations? This is an aspect I've never really understood.[/color]

        You know, I think I agree ;-). Just because you don't declare the types,
        doesn't mean you can change the implicit type willy nilly anyway; at
        least for more complex programs anyway. In fact, it would be safer to
        have type checking when you want to do something like this. I currently
        needed to change a number parameter to a string parameter (found out
        order_no wasn't just numbers as specs had specified). Of course this
        parameter was being used in a great many places. Changing it was a bit
        scary because we had to make sure it wasn't being treated as a number
        anywhere throughout the code. Yes good coverage with unit tests would
        have helped but unfortunately we do not yet have good coverage. TDD is a
        quite hard to practice as a beginner.

        Cheers,

        Huy

        Comment

        • Jacek Generowicz

          #5
          Re: Next step after pychecker

          Paul Rubin <http://phr.cx@NOSPAM.i nvalid> writes:
          [color=blue]
          > Philippe Fremy <phil@freehacke rs.org> writes:[color=green]
          > > I would like to develop a tool that goes one step further than
          > > pychecker to ensure python program validity. The idea would be to get
          > > close to what people get on ocaml: a static verification of all types
          > > of the program, without any kind of variable declaration. This would
          > > definitely brings a lot of power to python.[/color]
          >
          > You know, I've always thought that ML-style type inference is an
          > impressive technical feat, but why is it so important to not use
          > declarations? This is an aspect I've never really understood.[/color]

          You gain something akin to duck typing, though in a formalized way
          (type classes). The code works for any types which provide the
          features which are used in the code, without regard for the specific
          type. You gain generic programming.

          You also gain not having to clutter your code with all the type
          declarations. And you gain not having to decide what types you will
          use too early on in development.

          Comment

          • Sylvain Thenault

            #6
            Re: Next step after pychecker

            On Tue, 01 Feb 2005 05:18:12 +0100, Philippe Fremy wrote:[color=blue]
            > Hi,[/color]

            Hi
            [color=blue]
            > I would like to develop a tool that goes one step further than pychecker
            > to ensure python program validity. The idea would be to get close to what
            > people get on ocaml: a static verification of all types of the program,
            > without any kind of variable declaration. This would definitely brings a
            > lot of power to python.
            >
            > The idea is to analyse the whole program, identify constraints on function
            > arguments and check that these constraints are verified by other parts of
            > the program.[/color]

            Did you take a look at the starkiller [1] and pypy projects [2] ?
            [color=blue]
            > What is in your opinion the best tool to achieve this ? I had an
            > extensive look at pychecker, and it could certainly be extended to do
            > the job. Things that still concern me are that it works on the bytecode,
            > which prevents it from working with jython and the new .NET python.
            >
            > I am currently reading the documentation on AST and visitor, but I am
            > not sure that this will be the best tool either. The AST seems quite
            > deep and I am afraid that it will make the analysis quite slow and
            > complicated.[/color]

            are you talking about the ast returned by the "parser" module, or the ast
            from the "compiler" module ? The former is a higher abstraction, using
            specific class instances in the tree, and most importantly with all the
            parsing junk removed. See [3]. You may also be interested in pylint
            [4] which is a pychecker like program built in top of the compiler ast,
            and so doesn't require actual import of the analyzed code. However it's
            not yet as advanced as pychecker regarding bug detection.

            And finally as another poster said you should probably keep an eye open
            on the python 2.5 ast branch work...

            Hope that helps !

            [1]http://www.python.org/pycon/dc2004/papers/1/paper.pdf)
            [2]http://codespeak.net/pypy/index.cgi?home
            [3]http://www.python.org/doc/current/lib/module-compiler.ast.ht ml
            [4]http://www.logilab.org/projects/pylint

            --
            Sylvain Thénault LOGILAB, Paris (France).

            http://www.logilab.com http://www.logilab.fr http://www.logilab.org


            Comment

            • Peter Maas

              #7
              Re: Next step after pychecker

              Jacek Generowicz schrieb:[color=blue]
              > You also gain not having to clutter your code with all the type
              > declarations. And you gain not having to decide what types you will
              > use too early on in development.[/color]

              But it can be useful to restrict type variety in certain situations
              e.g. prime number calculation :) And it would probably also be useful
              to check violations of restrictions before running the program in
              normal mode.

              This must not necessarily mean 'int this, float that'. E.g. the
              following would be nice:

              a := some_type_value # "type fixing" assignment
              a = other_type_valu e # error: a is restricted to some_type
              a := yet_another_typ e_value # OK, another fixed type is set
              del a # release type restriction; a can be recreated in normal
              # dynamic mode

              The type fixing assignment could be used for optimization and for
              checking the program with pychecker.

              --
              -------------------------------------------------------------------
              Peter Maas, M+R Infosysteme, D-52070 Aachen, Tel +49-241-93878-0
              E-mail 'cGV0ZXIubWFhc0 BtcGx1c3IuZGU=\ n'.decode('base 64')
              -------------------------------------------------------------------

              Comment

              • Diez B. Roggisch

                #8
                Re: Next step after pychecker

                > But it can be useful to restrict type variety in certain situations[color=blue]
                > e.g. prime number calculation :) And it would probably also be useful
                > to check violations of restrictions before running the program in
                > normal mode.[/color]

                But that's what (oca)ml and the like do - they exactly don't force you to
                specify a type, but a variable has an also variable type, that gets
                inferned upon the usage and is then fixed.

                --
                Regards,

                Diez B. Roggisch

                Comment

                • aurora

                  #9
                  Re: Next step after pychecker

                  A frequent error I encounter

                  try:
                  ...do something...
                  except IOError:
                  log('encounter an error %s line %d' % filename)

                  Here in the string interpolation I should supply (filename,linen o).
                  Usually I have a lot of unittesting to catch syntax error in the main
                  code. But it is very difficult to run into exception handler, some of
                  those are added defensely. Unfortunately those untested exception
                  sometimes fails precisely when we need it for diagnosis information.

                  pychecker sometime give false alarm. The argument of a string
                  interpolation may be a valid tuple. It would be great it we can somehow
                  unit test the exception handler (without building an extensive library of
                  mock objects).

                  Comment

                  • Francis Girard

                    #10
                    Re: Next step after pychecker

                    Hi,

                    I do not want to discourage Philippe Fremy but I think that this would be very
                    very difficult to do without modifying Python itself.

                    What FP languages rely upon to achieve type inference is a feature named
                    "strong typing". A clear definition of strong typing is :

                    "Every well-formed expression of the language can be assigned a type that can
                    be deduced from the constituents of the expression alone." Bird and Wadler,
                    Introduction to Functional Programming, 1988

                    This is certainly not the case for Python since one and the same variable can
                    have different types depending upon the execution context. Example :

                    1- if a is None:
                    2- b = 1
                    3- else:
                    4- b = "Phew"
                    5- b = b + 1

                    One cannot statically determine the type of b by examining the line 5- alone.

                    Therefore, Fremy's dream can very well turn to some very complex expert system
                    to make "educated" warning. Being "educated" is a lot harder than to be
                    "brutal".

                    It's funny that what mainly prevents us from easily doing a type inferenceris
                    at this very moment discussed with almost religious flame in the "variable
                    declaration" thread.

                    Anyway, strong typing as defined above would change the Python language in
                    some of its fundamental design. It would certainly be valuable to attempt the
                    experience, and rename the new language (mangoose would be a pretty name).

                    Francis Girard
                    FRANCE

                    Le mardi 1 Février 2005 16:49, Diez B. Roggisch a écrit :[color=blue][color=green]
                    > > But it can be useful to restrict type variety in certain situations
                    > > e.g. prime number calculation :) And it would probably also be useful
                    > > to check violations of restrictions before running the program in
                    > > normal mode.[/color]
                    >
                    > But that's what (oca)ml and the like do - they exactly don't force you to
                    > specify a type, but a variable has an also variable type, that gets
                    > inferned upon the usage and is then fixed.
                    >
                    > --
                    > Regards,
                    >
                    > Diez B. Roggisch[/color]

                    Comment

                    • Francis Girard

                      #11
                      Re: Next step after pychecker [StarKiller?]

                      Hi,

                      Do you have some more pointers to the StarKiller project ? According to the
                      paper some implementation of this very interesting project exists.

                      Thank you

                      Francis Girard

                      Le mardi 1 Février 2005 11:21, Sylvain Thenault a écrit :[color=blue]
                      > On Tue, 01 Feb 2005 05:18:12 +0100, Philippe Fremy wrote:[color=green]
                      > > Hi,[/color]
                      >
                      > Hi
                      >[color=green]
                      > > I would like to develop a tool that goes one step further than pychecker
                      > > to ensure python program validity. The idea would be to get close to what
                      > > people get on ocaml: a static verification of all types of the program,
                      > > without any kind of variable declaration. This would definitely brings a
                      > > lot of power to python.
                      > >
                      > > The idea is to analyse the whole program, identify constraints on
                      > > function arguments and check that these constraints are verified by other
                      > > parts of the program.[/color]
                      >
                      > Did you take a look at the starkiller [1] and pypy projects [2] ?
                      >[color=green]
                      > > What is in your opinion the best tool to achieve this ? I had an
                      > > extensive look at pychecker, and it could certainly be extended to do
                      > > the job. Things that still concern me are that it works on the bytecode,
                      > > which prevents it from working with jython and the new .NET python.
                      > >
                      > > I am currently reading the documentation on AST and visitor, but I am
                      > > not sure that this will be the best tool either. The AST seems quite
                      > > deep and I am afraid that it will make the analysis quite slow and
                      > > complicated.[/color]
                      >
                      > are you talking about the ast returned by the "parser" module, or the ast
                      > from the "compiler" module ? The former is a higher abstraction, using
                      > specific class instances in the tree, and most importantly with all the
                      > parsing junk removed. See [3]. You may also be interested in pylint
                      > [4] which is a pychecker like program built in top of the compiler ast,
                      > and so doesn't require actual import of the analyzed code. However it's
                      > not yet as advanced as pychecker regarding bug detection.
                      >
                      > And finally as another poster said you should probably keep an eye open
                      > on the python 2.5 ast branch work...
                      >
                      > Hope that helps !
                      >
                      > [1]http://www.python.org/pycon/dc2004/papers/1/paper.pdf)
                      > [2]http://codespeak.net/pypy/index.cgi?home
                      > [3]http://www.python.org/doc/current/lib/module-compiler.ast.ht ml
                      > [4]http://www.logilab.org/projects/pylint
                      >
                      > --
                      > Sylvain Thénault LOGILAB, Paris (France).
                      >
                      > http://www.logilab.com http://www.logilab.fr http://www.logilab.org[/color]

                      Comment

                      • John Roth

                        #12
                        Re: Next step after pychecker


                        "Sylvain Thenault" <sylvain.thenau lt@nospam.logil ab.fr> wrote in message
                        news:pan.2005.0 2.01.10.21.23.7 62637@nospam.lo gilab.fr...[color=blue]
                        > On Tue, 01 Feb 2005 05:18:12 +0100, Philippe Fremy wrote:
                        >
                        > Did you take a look at the starkiller [1] and pypy projects [2] ?[/color]

                        Has anything happened to Starkiller since PyCon 2004? The
                        latest mention I can find on Google is a blog entry (by Ted
                        Leung) on Aug 30 saying he wished someone would give the
                        author some money to finish it and publish it.

                        John Roth

                        Comment

                        • Philippe Fremy

                          #13
                          Re: Next step after pychecker

                          [color=blue]
                          > I do not want to discourage Philippe Fremy but I think that this would be very
                          > very difficult to do without modifying Python itself.[/color]

                          That's the conclusion I reached to too after lurking on the ocaml list.
                          [color=blue]
                          >
                          > What FP languages rely upon to achieve type inference is a feature named
                          > "strong typing".
                          > [...]
                          >
                          > This is certainly not the case for Python since one and the same variable can
                          > have different types depending upon the execution context. Example :
                          >
                          > 1- if a is None:
                          > 2- b = 1
                          > 3- else:
                          > 4- b = "Phew"
                          > 5- b = b + 1
                          >
                          > One cannot statically determine the type of b by examining the line 5- alone.
                          >[/color]

                          There are even worse cases:

                          1- a=1
                          2- a= someObject.some Method()

                          someObject might not have someMethod() and this would be caught by an
                          AttributeError exception handler and 'a' will stay at 1.

                          [color=blue]
                          > Therefore, Fremy's dream can very well turn to some very complex expert system
                          > to make "educated" warning. Being "educated" is a lot harder than to be
                          > "brutal".[/color]

                          And even being brutal is not that easy to achieve...
                          [color=blue]
                          > Anyway, strong typing as defined above would change the Python language in
                          > some of its fundamental design. It would certainly be valuable to attempt the
                          > experience, and rename the new language (mangoose would be a pretty name).[/color]

                          Indeed. I understood that this is one of the things that could be
                          provided by pypy.

                          What I understood (Jacek, this is an answer for you too) is that there
                          are many usage patterns for python.

                          Some people use it for all of its advanced dynamic features. I use it in
                          a java-like fashion. One could use it without the "OO crap", like a
                          modern pascal replacement.

                          This is a strength of python that it lends itself to so many programming
                          paradigm. However, one drawback is that python is not optimum in many of
                          these paradigm (although it is sometimes the best proponent). In my
                          case, dynamic typing is not a required feature. Typing immutability and
                          type inferring is a lot more valuable.

                          I really hope that pypy will provide that kind of choice. Give me python
                          with eiffel like contracts, super speed optimisation thank to type
                          inference and I will be super happy.

                          Thank everyone for its feedback.

                          Any other idea of a fun python improvement project I could join without
                          too much hassle ? I can't help but thinking that pychecker ought to be
                          able to do a better job.

                          regards,

                          Philippe


                          [color=blue]
                          >
                          > Francis Girard
                          > FRANCE
                          >
                          > Le mardi 1 Février 2005 16:49, Diez B. Roggisch a écrit :
                          >[color=green][color=darkred]
                          >>>But it can be useful to restrict type variety in certain situations
                          >>>e.g. prime number calculation :) And it would probably also be useful
                          >>>to check violations of restrictions before running the program in
                          >>>normal mode.[/color]
                          >>
                          >>But that's what (oca)ml and the like do - they exactly don't force you to
                          >>specify a type, but a variable has an also variable type, that gets
                          >>inferned upon the usage and is then fixed.
                          >>
                          >>--
                          >>Regards,
                          >>
                          >>Diez B. Roggisch[/color]
                          >
                          >[/color]

                          Comment

                          • Skip Montanaro

                            #14
                            Re: Next step after pychecker


                            Francis> "Every well-formed expression of the language can be assigned a
                            Francis> type that can be deduced from the constituents of the
                            Francis> expression alone." Bird and Wadler, Introduction to Functional
                            Francis> Programming, 1988

                            Francis> This is certainly not the case for Python since one and the
                            Francis> same variable can have different types depending upon the
                            Francis> execution context. Example :

                            Francis> 1- if a is None:
                            Francis> 2- b = 1
                            Francis> 3- else:
                            Francis> 4- b = "Phew"
                            Francis> 5- b = b + 1

                            Francis> One cannot statically determine the type of b by examining the
                            Francis> line 5- alone.

                            Do you have an example using a correct code fragment? It makes no sense to
                            infer types in code that would clearly raise runtime errors:
                            [color=blue][color=green][color=darkred]
                            >>> "Phew" + 1[/color][/color][/color]
                            Traceback (most recent call last):
                            File "<stdin>", line 1, in ?
                            TypeError: cannot concatenate 'str' and 'int' objects

                            Also, note that the type assigned to an expression may be nothing more than
                            "object". Clearly that wouldn't be very helpful when trying to write an
                            optimizing compiler, but it is a valid type.

                            Skip

                            Comment

                            • Terry Reedy

                              #15
                              Re: Next step after pychecker


                              "Steven Bethard" <steven.bethard @gmail.com> wrote in message
                              news:vvadnbwi6_ bqjWLcRVn-vQ@comcast.com. ..[color=blue]
                              > I don't know much about what pychecker does, but if it works with the
                              > bytecode, shouldn't it be fine for jython and IronPython? I thought the
                              > bytecode was part of the language spec, and what was CPython specific was
                              > how the bytecodes were actually implemented...[/color]

                              Nothing about bytecode is part of the language spec. And CPython bytecode
                              is version specific. If the CPython implementation changed from a virtual
                              stack machine to a virtual register machine, as was once discussed, the
                              stack-oriented byte code would be replaced by a register-oriented byte code
                              or virtual machine language.

                              Jython compiles Python code to JVM (Java Virtual Machine) bytecode. Parrot
                              compile to Parrot bytecode. Ironman compiles, I presume, to .Net CL or
                              whatever it's called.

                              Terry J. Reedy



                              Comment

                              Working...