Genetic programming: pygene, pygp, AST, or (gasp) Lisp?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • John Ladasky

    Genetic programming: pygene, pygp, AST, or (gasp) Lisp?

    Hi folks,

    I've played around with neural nets for a while. I wrote my own slow,
    pure-Python NN package. I knew that there were Python NN packages out
    there -- but I couldn't really understand their features and
    documentation at first, not without some hands-on experience.

    I haven't yet solved any interesting problems with NN, but I learned a
    lot about both NN and about Python along the way. One of the
    unpleasant things I learned about NN was their propensity for becoming
    trapped in local minima. I also learned about overfitting, wasting
    many hours of CPU time in the process. In short, simple neural nets
    have disappointed me.

    I have now asked myself: wouldn't it be cool if, when you ceased to
    make progress on an optimization algorithm, you could divide your data
    set with an "if" statement, which would evolve a useful dividing line
    through your data set -- and then, develop divergent solutions to
    explain each subset better?

    In mathematical terms:

    Round 1 of evolutionary programming leads to a single, but sub-optimal
    solution, probably a solution in a local minimum:

    b = f(a)

    Round 2 would start by elaborating the solution of round 1, adding a
    condition that is initially meaningless:

    if disc(a) < c:
    b = f1(a)
    else:
    b = f2(a)

    Initially, f1() = f2() = the original f(). Therefore, the output of
    the starting state of round 2 will be identical to the output of the
    final state of round 1, regardless of the initial state of the
    discriminant function, disc(). But then, f1(), f2(), and disc() will
    all be permitted to evolve. This may provide a chance to "pop out" of
    the local minimum, without sacrificing any of the progress made in
    defining the solution of round 1.

    Of course, f(), f1(), f2(), and disc() could be complex functions.
    They would best be described with parse trees, which I've just
    discovered (I'm not formally trained in computer science).

    Some reading has led me to conclude that what I'm proposing is
    basically genetic programming.



    I've tried my hand at this already in Python. I have devised a code
    tree class, which functions as a parse tree for functions and also
    handles multiple-line statements. I've tried cobbling together program
    strings from code trees, and then using the exec() function to run
    them. This is all very cool, but I'm finding it VERY cumbersome!

    As with neural nets, I know that there are genetic programming
    packages offered for Python:


    Download Python Genetic Programming Project for free. The Python Genetic Programming Project implements a Genetic Programming System a la J Koza in Python.


    But I can't really understand what they do -- both of these packages
    are, alas, minimally documented. Is anyone out there using either of
    these? I can't tell whether they will implement algorithms of the
    type I've described here. Specifically, I don't know if conditional
    statements are included in their repertoire. Also, Pygene SEEMS to
    have a bent toward Boolean logic. I'm working with analog data, and I
    want complex, nonlinear functions.

    So, as with neural nets, shall I once again proceed on my own? There
    is apparently a layer in the Python interpreter at which code is
    represented as an abstract syntax tree (AST).



    Why not do genetic programming directly on Python code? Maybe my code
    tree data structure is redundant to something which already exists?
    But apparently Python's "_ast" module offers only one-way access -- it
    will generate an AST from a piece of code, but you can't modify an
    AST, and turn it back into executable code? I would definitely need
    this latter feature.

    ALTERNATELY -- and I don't mention this to start a flame war -- in
    pondering this problem I've noticed the frightening fact that Lisp
    seems set up to handle genetic programming by design. The s-
    expression syntax IS essentially a parse tree. Now, I've spent a few
    hours with Lisp so far, and I can't make it do much of anything -- but
    I DO see how Lisp could be helpful. Still, I'm reluctant to pursue a
    language I don't know, and which I'm finding much harder to grasp than
    any language I've tried before.

    Is there a way to interface Lisp to Python, so that I can do all the
    interface programming in the language I already know best -- and just
    do the genetic parts in Lisp? I haven't seen exception handling in
    Lisp, a feature I've come to love in Python. Since it is fairly easy
    for a randomly-generated program to generate illegal output (I already
    know this from my initial experiments in Python), I don't think I can
    live without exception handling.

    I also think that Python has a higher-level understanding of a
    variable's "type" or an object's "class" than Lisp does -- if I'm
    hacking at a parse tree and I want to minimize syntax errors, I need
    to know more than the fact that an element in an expression is an atom
    or a list.

    Producing human-readable code from my genetic programming search would
    be a great bonus -- and for me, at this moment, this seems to mean
    Algol-style syntax. (Sigh.) But it's not a requirement.

    Thanks for your advice!
  • Kay Schluehr

    #2
    Re: Genetic programming: pygene, pygp, AST, or (gasp) Lisp?

    On 20 Jul., 09:52, John Ladasky <lada...@my-deja.comwrote:
    Why not do genetic programming directly on Python code? Maybe my code
    tree data structure is redundant to something which already exists?
    But apparently Python's "_ast" module offers only one-way access -- it
    will generate an AST from a piece of code, but you can't modify an
    AST, and turn it back into executable code?
    Why not? You can compile ASTs.

    Another option is to use EasyExtend



    which is a bit heavyweight though without prior knowledge of the
    framework.

    EE provides some generic functions over languages like parse/unparse.
    Python is just a special case. So you can do the following

    from EasyExtend.lang lets.zero.langl et import parse, unparse

    src = """
    if disc(a) < c:
    b = f1(a)
    else:
    b = f2(a)
    """

    parse(src) # yields a parse tree
    unparse(parse(s rc)) # yields the source code of the parse tree

    Here `zero` means Python which is just the trivial/featureless langlet
    of the system or some kind of embedding.

    For meshing fragments together on random one can use cst.py.

    For each rule in Pythons grammar cst.py implements a corresponding
    function that produces the parse tree accordingly. So if there is a
    rule

    test: or_test ['if' or_test 'else' test] | lambdef

    a corresponding function test(*args) exists that produces a parse tree
    from components that were built using or_test(), test() or lambdef().
    chaining these functions is just like building s-expr.
    I would definitely need
    this latter feature.
    >
    ALTERNATELY -- and I don't mention this to start a flame war -- in
    pondering this problem I've noticed the frightening fact that Lisp
    seems set up to handle genetic programming by design.
    Definitely. But this is nothing new. Lisp was the original language
    used by John Koza to implement GP.
    The s-
    expression syntax IS essentially a parse tree. Now, I've spent a few
    hours with Lisp so far, and I can't make it do much of anything -- but
    I DO see how Lisp could be helpful. Still, I'm reluctant to pursue a
    language I don't know, and which I'm finding much harder to grasp than
    any language I've tried before.
    You can write a primitive s-expr evaluator/manipulator using Pythons
    overloading capabilities. But this way you will just produce s-expr
    and not Python functions.

    Kay

    Comment

    • David Boddie

      #3
      Re: Genetic programming: pygene, pygp, AST, or (gasp) Lisp?

      On Sunday 20 July 2008 09:52, John Ladasky wrote:
      Is there a way to interface Lisp to Python, so that I can do all the
      interface programming in the language I already know best -- and just
      do the genetic parts in Lisp? I haven't seen exception handling in
      Lisp, a feature I've come to love in Python. Since it is fairly easy
      for a randomly-generated program to generate illegal output (I already
      know this from my initial experiments in Python), I don't think I can
      live without exception handling.
      Just searching the Web for Python and Lisp yielded some interesting
      projects:




      I've no idea if they're really that relevant to your problem, but they
      might lead somewhere useful.

      David

      Comment

      • Erik Max Francis

        #4
        Re: Genetic programming: pygene, pygp, AST, or (gasp) Lisp?

        John Ladasky wrote:
        Why not do genetic programming directly on Python code? Maybe my code
        tree data structure is redundant to something which already exists?
        But apparently Python's "_ast" module offers only one-way access -- it
        will generate an AST from a piece of code, but you can't modify an
        AST, and turn it back into executable code? I would definitely need
        this latter feature.
        >
        ALTERNATELY -- and I don't mention this to start a flame war -- in
        pondering this problem I've noticed the frightening fact that Lisp
        seems set up to handle genetic programming by design. The s-
        expression syntax IS essentially a parse tree. Now, I've spent a few
        hours with Lisp so far, and I can't make it do much of anything -- but
        I DO see how Lisp could be helpful. Still, I'm reluctant to pursue a
        language I don't know, and which I'm finding much harder to grasp than
        any language I've tried before.
        The main advantages that Lisp has to a structural language like Python
        are that in Lisp there is no distinction between code and data --
        they're represented in the same way -- and the syntatic uniformity of
        the language. Originally the idea with Lisp was that s-expressions were
        a lower-level syntax that a more Algol-like syntax would be translated
        to ... then it became aware that the syntax was actually a benefice in
        many ways, not a hindrance.
        Is there a way to interface Lisp to Python, so that I can do all the
        interface programming in the language I already know best -- and just
        do the genetic parts in Lisp? I haven't seen exception handling in
        Lisp, a feature I've come to love in Python. Since it is fairly easy
        for a randomly-generated program to generate illegal output (I already
        know this from my initial experiments in Python), I don't think I can
        live without exception handling.
        It is not likely you want to do this. Genetic programming systems are
        probably best run in a sandbox, for reasons which should be pretty
        reasonable. The point of genetic programming is to manipulate programs
        into new programs; thus, complicated syntax for those programs only
        increases the chances that either 1. you have to do an awful lot of
        extra work to make sure that you end up with valid programs or 2. you
        end up with an awful lot of syntactically invalid programs. Both of
        these waste large amounts of processing cycles that you could be
        focusing on creating new, valid programs from old ones.
        I also think that Python has a higher-level understanding of a
        variable's "type" or an object's "class" than Lisp does -- if I'm
        hacking at a parse tree and I want to minimize syntax errors, I need
        to know more than the fact that an element in an expression is an atom
        or a list.
        If you're interested in typed genetic programming systems, then you
        might want to check out some of the stack-based genetic programming
        languages like Push or PushGP. These are in effect a stack-based form
        of Lisp, but which use different data stacks for different types.
        Producing human-readable code from my genetic programming search would
        be a great bonus -- and for me, at this moment, this seems to mean
        Algol-style syntax. (Sigh.) But it's not a requirement.
        Well, you could always translate s-expressions into m-expressions ...

        --
        Erik Max Francis && max@alcyone.com && http://www.alcyone.com/max/
        San Jose, CA, USA && 37 18 N 121 57 W && AIM, Y!M erikmaxfrancis
        It isn't important to come out on top, what matters is to be the one
        who comes out alive. -- Bertolt Brecht, 1898-1956

        Comment

        • david.leoni.dontemailthisaddress@gmail.com

          #5
          Re: Genetic programming: pygene, pygp, AST, or (gasp) Lisp?

          John Ladasky said:
          Why not do genetic programming directly on Python code?

          maybe Dione is what you are looking for - it seems to manipulate
          Python asts directly

          Download dione for free. Genetic Programming framework written in Python. Takes advantage of python\'s compiler to make things simple.

          Comment

          • dmitrey

            #6
            Re: Genetic programming: pygene, pygp, AST, or (gasp) Lisp?

            also, you could look at the simple openopt example provided by GA
            "galileo" solver (connected to OO framework)


            Regards, D

            Comment

            • John Ladasky

              #7
              Re: Genetic programming: pygene, pygp, AST, or (gasp) Lisp?

              Thanks to everyone who replied. I haven't chosen a definite direction
              for my project yet. But you have given me some good leads.

              Google Books offers previews of many pages of John Koza's book,
              published in the early 1990's. I'm reading through the preview pages,
              with the idea of purchasing a more recent book on the subject once I
              understand the subject a bit better.

              Apropos to Python, I've had a quick look at the parse and compile
              functions in Python's compiler module. The Python AST is MUCH longer
              than I expected it to be, for just a few lines of code. It may
              contain more information than I need. I'll see whether I can make
              sense of EasyExtend or Dione next.

              Comment

              Working...