Python for large projects

  • Alan Gauld

    #16
    Re: Python for large projects

    On 25 Mar 2004 12:21:36 +0100, Matthias <no@spam.pls> wrote:
    > Jacek Generowicz <jacek.generowicz@cern.ch> writes:
    >
    > > "After", is far too late, in my opinion. It's a bit like suggesting to
    > > a static-typing-for-safety fan, that he should only run his program
    > > through the compiler _after_ he has finished developing.
    >
    > I think this method was advertised as the "cleanroom approach".
    > Google finds some references.

    The clean room approach was slightly different, although heading in
    that direction. It relied on rigorous review, inspection and
    testing at every stage of the process. (Sound familiar?)

    It was popular in the early/mid eighties and here are a few
    references:

    Wicked Problems, Righteous Solutions; P DeGrace & L Hulet Stahl
    - many methodologies including a section on clean room.

    Cleanroom Approach to Reliable Software Devt; Dyer & Mills
    Proceedings Validation Methods Research for Fault Tolerant
    Avionics....; Research Triangle Institute, 1981

    Cleanroom Software Devt, An Empirical Investigation;
    Selby, Basili, Baker, 1987
    IEEE Transactions on Software Engineering,
    Vol SE-13, #9, Sept 1987

    HTH,

    Alan G.

    PS. Keeping programmers away from compilers is not that old a
    practice. I was working on a VAX project in 1989 that only allowed
    us one compile each per day, with a full compile overnight (which
    took 6 hours).

    Author of the Learn to Program website


    • Hung Jung Lu

      #17
      Re: Python for large projects

      > On Tue, 2004-03-23 at 17:24, Cameron Laird wrote:
      >
      > > they're at a particular DISadvantage there. If you have a
      > > big job, you *particularly* need to look at Python (or Erlang,
      > > or Eiffel, or ...)
      --------------------------------------------
      gabor <gabor@z10n.net> wrote in message news:<mailman.300.1080071082.742.python-list@python.org>...
      > ...
      > i wanted to use python for a project in our company... we wanted to
      > build a fairly big system/program.
      >
      > but when i recommended python, i got a question like:
      > (previously all the programs were written in java)
      > "if one of our programmers changes a method in a class/interface, we
      > immediately will know about it, because the next program-rebuild will
      > simply fail. but if we would use python, we wouldn't find it out".
      --------------------------------------------

      I use C++ and Python every day. Let us be fair and point out some good
      things about each of them.

      (a) In a compiled language like C++, changing function prototypes and
      variable names is comfortable, because the compiler will find all
      those spots that you need to change. In Python, you do not have the
      same level of comfort. Sure, there are other techniques, but it's
      different from clicking a button.
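
To make point (a) concrete, here is a minimal sketch, not taken from the thread. The names Account, deposit, add_funds, pay_salary and check_pay_salary are all made up for illustration. It assumes a method was renamed from add_funds to deposit and one call site was missed. Python accepts the module without complaint; the mistake only surfaces when the stale line runs, which is why a test that exercises the call path is one of the "other techniques".

```python
class Account:
    """Toy class; assume `deposit` used to be called `add_funds`."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount


def pay_salary(account, amount):
    # Stale call site: still uses the old method name. Nothing complains
    # at import time; the AttributeError appears only when this line runs.
    account.add_funds(amount)


def check_pay_salary():
    """Tiny test: True if the call path works, False if it is stale."""
    account = Account()
    try:
        pay_salary(account, 100)
    except AttributeError:
        return False
    return account.balance == 100
```

Calling check_pay_salary() reports the stale call site, but only because something actually executed it. A C++ compiler would have refused to build the equivalent code.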

      (b) Cameron said something very true in my opinion: for large
      projects, you want Python. But he said so without giving more details.
      So let me add some comments.

      In my opinion, the essence of software development is code/task
      factorization. It seems such a trivial concept, but if you really
      really think about it, goto statements, loops, functions, classes,
      arrays, pointers, OOP, macros/templates, metaprogramming, AOP,
      databases, etc., just about every single technique in programming has
      its base in the concept of code/task factorization. Take for instance
      classes and inheritance: basically, you factor out the common parts of
      two classes and push them up into a common parent class. To go one level
      deeper, my belief is that at the bottom, all human intellectual
      activities are based on factorization: no more, no less.

      In large projects, you'll find that you need to factor out even more.
      Let us take an example. Suppose you write an application, and later on
      you realize that you need to make it transactional: that is, if some
      exceptions happen, you want to roll back the changes. Now, this kind
      of major afterthought is terrible for languages without
      metaprogramming capabilities. To add a new feature, you will have to
      make modifications in hundreds or thousands of spots. Another example:
      suppose your software is versioned; moreover, you have different
      versions for the application and for the data file format, and your
      application needs to work with legacy file formats. Again, without
      metaprogramming capabilities, your code will have many redundant lines
      of code, or be cluttered with tons of if-statements or
      switch-statements. Another similar problem: you have several different
      clients that buy your application, and they want some different extra
      features. Again, without metaprogramming, your code will be either
      hard to code (using virtual functions, function pointers, and/or
      templates in C++), or will be cluttered with if-else and switch
      statements (a terrible practice that will make your code
      unmaintainable).
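
As a rough illustration of the "make it transactional as an afterthought" case, here is a minimal sketch. It is an assumption about what such a retrofit could look like, not the poster's code. transactional, make_transactional and Inventory are hypothetical names, and the rollback only covers the object's own attributes, not files or databases.

```python
import copy
import functools


def transactional(method):
    """Run method; if it raises, restore the object's attributes and re-raise."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        snapshot = copy.deepcopy(self.__dict__)
        try:
            return method(self, *args, **kwargs)
        except Exception:
            self.__dict__.clear()
            self.__dict__.update(snapshot)
            raise
    return wrapper


def make_transactional(cls):
    """Class decorator: wrap every public method defined directly on cls."""
    for name, attr in list(vars(cls).items()):
        if callable(attr) and not name.startswith("_"):
            setattr(cls, name, transactional(attr))
    return cls


@make_transactional
class Inventory:
    def __init__(self):
        self.stock = {"widget": 5}

    def ship(self, item, count):
        self.stock[item] -= count            # state is modified first...
        if self.stock[item] < 0:
            raise ValueError("not enough stock")   # ...then the failure shows up
```

The point is that the rollback logic lives in one place and is applied with one line per class, instead of being edited into every method by hand.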

      As your project grows more and more complex (it becomes threaded, gains
      many new client requirements, has to support legacy versions, uses
      distributed computing in a cluster, etc.) you will realize more and more
      that you need to factorize efficiently, otherwise your pain will be
      unbearable.

      When you have reached that point, you'll come to appreciate simplicity
      and purity in a language. Frankly, Python is good but still not good
      enough.

      For large projects, if you use a rigid language, then your best bet is
      to use tons of programmers coding trivial interfaces and APIs to make
      up for the shortcomings of the language. In flexible languages like
      Python, you often can use metaprogramming features to factor out the
      common areas. At that point, I think that issues like automatically
      finding name changes as I mentioned in point (a) become small issues,
      because you will have bigger concerns. The fact that you may miss a
      name change or function header change is not the thing that will kill
      you. The fact that your entire system is unmaintainable is the thing
      that will kill you. Don't look at individual bugs when you are talking
      about large projects, because your worry should not be there: your
      worry should be focused on how to make your system maintainable. Bugs
      can and will be fixed. But if your language does not allow you to
      factorize efficiently, at the end of the day, that's what's going to
      kill you.
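
For the legacy-file-format case mentioned earlier in this post, one possible way to "factor out the common areas" is a registry filled in by a decorator, so that supporting another version means adding one function rather than another if/else branch. This is only a sketch under made-up assumptions: the record layout, READERS, reads_version and load_record are all hypothetical.

```python
READERS = {}


def reads_version(version):
    """Register the decorated function as the reader for one format version."""
    def register(func):
        READERS[version] = func
        return func
    return register


@reads_version(1)
def read_v1(fields):
    # Hypothetical legacy layout: name,balance
    name, balance = fields
    return {"name": name, "balance": float(balance), "currency": "USD"}


@reads_version(2)
def read_v2(fields):
    # Hypothetical newer layout adds a currency column
    name, balance, currency = fields
    return {"name": name, "balance": float(balance), "currency": currency}


def load_record(line):
    """The first field is the format version; dispatch through the registry."""
    version, *fields = line.strip().split(",")
    try:
        reader = READERS[int(version)]
    except KeyError:
        raise ValueError(f"unsupported format version {version}") from None
    return reader(fields)
```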

      regards,

      Hung Jung


      • Roger Binns

        #18
        Re: Python for large projects

        > (a) In compiled language like C++, changing function prototypes and
        > variable names is comfortable, because the compiler will find all
        > those spots that you need to change.

        It won't catch some stuff such as where a prototype changes from
        pass by value to pass by reference (or vice versa), or if another
        operator or explicit conversion is available. [That is true
        of many languages, but C++ gives the impression it has this
        rigid type checking system that avoids errors if the code compiles]

        In reality I find the best approach is to use multiple languages.
        You can code components in C++ and glue them together using
        Swig and Python. You can make multiple binaries and execute
        them telling them where to send their output, or use a pipe.
        That kind of thing also makes it easier dealing with issues
        in the field. For example you can send the customer a different
        binary (that has the same interface) or the debugging version
        of a DLL/so etc.
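
The "separate binaries connected by a pipe" arrangement can be sketched from the Python side with the standard subprocess module. Everything below is illustrative: COMPONENT is a stand-in for a real compiled component (a Python one-liner plays that role so the example runs anywhere), and run_component is a made-up helper name.

```python
import subprocess
import sys

# Stand-in for a compiled component: any executable that reads stdin and
# writes stdout would do. A Python one-liner plays the binary here.
COMPONENT = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def run_component(command, text):
    """Feed text to the component over a pipe and return what it prints."""
    result = subprocess.run(command, input=text, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

Because the glue code only depends on the command line and the pipe, swapping in a different binary with the same interface (for example a debugging build) means changing COMPONENT and nothing else.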

        At the end of the day, use the best tool for the job, and
        don't use any that preclude you from using others at the
        same time as well.

        Roger



        • Bill Rubenstein

          #19
          Re: Python for large projects

          In article <mailman.357.1080147061.742.python-list@python.org>, gabor@z10n.net
          says...
          > On Wed, 2004-03-24 at 15:16, Bill Rubenstein wrote:
          > > ...snip...
          > > > > other thing is, that in the projects i work on, there seems to be
          > > > > very hard to do unit tests
          > > ...snip...
          > >
          > > The ability to do unit testing should not be an afterthought. It should be
          > > considered as a major influence on the architecture of a project.
          > >
          > > If one cannot do proper unit testing, the architecture of the project is
          > > questionable.
          >
          > ok, so let's use a specific example:
          >
          > imagine you're building a library, which fetches webpages.
          >
          > you have a library which can fetch 1 webpage at a time, but it is a
          > synchronous library (like wget). you call him, and he returns the page.
          >
          > but you want an async one.
          >
          > so you decide to build a threadpool, where every thread will do this:
          > look into a queue, and if there is a new URL to fetch, fetches it with
          > his wget-like library, and saves the html page somewhere (and maybe
          > signals something).
          >
          > and now the user who uses your library, simply adds the URL to fetch,
          > and can check later asynchronously whether they are already fetched or
          > not.
          >
          > could you tell me what unit tests would you create for this example?
          >
          >
          > (a more generic request: is there on the internet a webpage with
          > something like this? one where they have some complex
          > modules/programs/algorithms, and they show how to write unittests for
          > them?)
          >
          > thanks,
          > gabor
          OK, I think I understand what the job is, so here is a try.

          I'm assuming that this async wget's job is to start at a url, fetch it, track
          down and fetch any links and such, get them, and make all of that available on
          the local system for later viewing.

          To make it testable, I'd design so that the application part of the system
          (described above) has as limited a knowledge of its surroundings as possible --
          except for the actual work performed. It should have no knowledge of a gui, for
          instance.

          Instead it should know about an object which represents a 'job'. This object
          should have attributes and/or functions which can be accessed to find out the
          base URL and the current status or state of the specific job (not started, in
          progress (various states here), ..., complete). There should be a log associated
          with the job object where both normal and abnormal stuff can be kept. It should
          also be able to provide information about the user if there is one, instructions
          about the base URL, where in the local file system to store the results, etc.
          During the development phase this job object is going to be a bit dynamic as new
          needs for it are discovered.

          There should probably be one object which can keep track of all of the job
          objects and is responsible for creating new ones and deleting old ones.
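
The job object and the object that tracks all jobs might be sketched roughly as follows. This is a guess at a shape that fits the description above, not the design Bill's team used. State, ALLOWED, Job, JobManager and their methods are hypothetical names, and the FAILED state is an added assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    NOT_STARTED = "not started"
    IN_PROGRESS = "in progress"
    COMPLETE = "complete"
    FAILED = "failed"          # assumption: the post does not name this state


# Allowed transitions; anything else is treated as a programming error.
ALLOWED = {
    State.NOT_STARTED: {State.IN_PROGRESS},
    State.IN_PROGRESS: {State.COMPLETE, State.FAILED},
    State.COMPLETE: set(),
    State.FAILED: set(),
}


@dataclass
class Job:
    base_url: str
    dest_dir: str
    state: State = State.NOT_STARTED
    log: list = field(default_factory=list)

    def move_to(self, new_state, note=""):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} not allowed")
        self.log.append(f"{self.state.name} -> {new_state.name} {note}".strip())
        self.state = new_state


class JobManager:
    """Creates, tracks, and deletes jobs; knows nothing about a GUI."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 1

    def create(self, base_url, dest_dir):
        job_id = self._next_id
        self._next_id += 1
        self._jobs[job_id] = Job(base_url, dest_dir)
        return job_id

    def get(self, job_id):
        return self._jobs[job_id]

    def delete(self, job_id):
        del self._jobs[job_id]
```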

          All of the interfaces to the job management object and the job object need to be
          formalized and properly documented. This whole subsystem can be tested, then, by
          a test driver requesting services via the documented interfaces, changing the
          state of a job via the documented interfaces and determining that the state
          transitions are as expected. There is no need to fetch any real URLs to do this,
          just pretend you did. This test driver also needs to exercise the interfaces
          intended for use by a gui.
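
"No need to fetch any real URLs, just pretend you did" can be sketched for gabor's thread pool like this. The fetch function is passed in, so a test driver can hand over a fake. FetchPool, fake_fetch and the test URLs are made-up names, and this is one possible arrangement rather than a reference design.

```python
import queue
import threading


class FetchPool:
    """Worker threads pull URLs from a queue and call fetch(url).

    fetch is injected, so a test can pass a fake instead of a real
    wget-like function.
    """

    def __init__(self, fetch, workers=2):
        self._fetch = fetch
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        self.results = {}   # url -> ("ok", body) or ("error", message)
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def add(self, url):
        self._queue.put(url)

    def _work(self):
        while True:
            url = self._queue.get()
            try:
                outcome = ("ok", self._fetch(url))
            except Exception as exc:
                outcome = ("error", str(exc))
            with self._lock:
                self.results[url] = outcome
            self._queue.task_done()

    def is_done(self, url):
        with self._lock:
            return url in self.results

    def wait(self):
        self._queue.join()   # block until every queued URL has been processed


def fake_fetch(url):
    """Test double: pretends to fetch, and fails for one known-bad URL."""
    if url.endswith("/missing"):
        raise IOError("404 pretend")
    return f"<html>{url}</html>"
```

A test driver would create FetchPool(fake_fetch), add a good and a bad URL, call wait(), and then check that results holds an "ok" entry for the first and an "error" entry for the second.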

          Now, as to testing the actual application code -- I'd think that you'd need a set
          of URLs which would return known and stable results and a number of error
          situations (bad links and such) to test against. Then a test driver would be
          written to use the standard interfaces to the job management object and the job
          object to schedule work against those URLs, determine when that work is done and
          test that the results are as expected, highlight the differences between a prior
          run against the particular URL and the current run, etc.
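
One way to get "a set of URLs which would return known and stable results and a number of error situations" is to serve them from a throwaway local server inside the test run. The sketch below uses only the standard library. PAGES, FixtureHandler, start_fixture_server and fetch are hypothetical names, and fetch stands in for whatever wget-like function is actually under test.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGES = {"/index.html": b"<html>known page</html>"}   # made-up fixture content


class FixtureHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)      # the "bad link" case
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass   # keep test output quiet


def start_fixture_server():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), FixtureHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Ignore any proxy settings so requests really go to the local server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url):
    """Stand-in for the synchronous wget-like function being tested."""
    with _opener.open(url, timeout=5) as response:
        return response.read()
```

A test then builds URLs from server.server_port, checks that the known page comes back byte for byte, checks that a missing page raises urllib.error.HTTPError with code 404, and calls server.shutdown() at the end.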

          I've been retired for years but that was pretty much how we did it. There were
          two small programming teams -- one writing application code against the formal
          interface documentation and one writing test scaffolding against the same
          documentation and building test cases. Things worked, the bug rate was very low,
          implementation changes were localized and testable...

          Anyway, it worked for us and we never had to claim that we just couldn't test
          something except in production.

          Bill






          • Cameron Laird

            #20
            Re: Python for large projects

            In article <c3t11s$619$1@atlantis.news.tpi.pl>,
            Jarek Zgoda <jzgoda@gazeta.usun.pl> wrote:


            • Cameron Laird

              #21
              Minor observation on the programming enterprise (was: Python for large projects)

              In article <8ef9bea6.0403260837.72a8fade@posting.google.com>,
              Hung Jung Lu <hungjunglu@yahoo.com> wrote:


              • Joe Mason

                #22
                Re: Minor observation on the programming enterprise (was: Python for large projects)

                In article <106e0498vbqa611@corp.supernews.com>, Cameron Laird wrote:
                > Remarkable fact that I see as turning up all over: we work with
                > grep(1). There are visual programming and language-savvy editors
                > and IDEs and refactoring plugins and all sorts of other tools,
                > and we find our variables with text searches. 'Know how to make
                > a C programmer mad? Name a global variable 'i'. 'Know how to
                > make him happy? Change the name to 'ii'. Both Lisp's inventor
                > and I keep our human address collection in a plaintext file.

                My address collection was scattered all over various databases and
                phones, and I lost the phone with the most recent one. I spent a good
                hour searching for an important number, and realized that the one
                database I might still have access to was for a PDA I no longer owned,
                with a desktop app that I could no longer run, in a Windows partition
                that I couldn't boot to at the time.

                I could see the actual data, but knowing the Windows world I was almost
                positive it'd be some binary database, and I'd be out of luck.

                Nope, XML. Almost as good as plain text for grepping. I've never been
                so relieved.

                Joe


                • Isaac Gouy

                  #23
                  Re: Python for large projects

                  Jacek Generowicz <jacek.generowicz@cern.ch> wrote in message news:<tyfbrmn635i.fsf@pcepsft001.cern.ch>...
                  > I am of the opinion that (explicit) static typing contributes to the
                  > bugginess of programs.

                  Is there a theory for the periodicity of static-checking /
                  dynamic-checking debates?

                  A couple of weeks' worth has drawn to a close on comp.lang.object.

                  The last discussion on comp.lang.functional was back in Nov 2003.


                  • Aahz

                    #24
                    Re: Python for large projects

                    [quoting unsnipped, voting this for post of the week]

                    In article <8ef9bea6.0403260837.72a8fade@posting.google.com>,
                    Hung Jung Lu <hungjunglu@yahoo.com> wrote:
                    >
                    >I use C++ and Python everyday. Let us be fair and point out some good
                    >things about each of them.
                    >
                    >(a) In compiled language like C++, changing function prototypes and
                    >variable names is comfortable, because the compiler will find all
                    >those spots that you need to change. In Python, you do not have the
                    >same level of comfort. Sure, there are other techniques, but it's
                    >different than clicking a button.
                    >
                    >(b) Cameron said something very true in my opinion: for large
                    >projects, you want Python. But he said so without giving more details.
                    >So let me add some comments.
                    >
                    >In my opinion, the essence of software development is code/task
                    >factorization. It seems such a trivial concept, but if you really
                    >really think about it, goto statements, loops, functions, classes,
                    >arrays, pointers, OOP, macros/templates, metaprogramming, AOP,
                    >databases, etc, just about every single technique in programming has
                    >its base in the concept of code/task factorization. Take for instance
                    >classes and inheritance, basically, you factor out the common parts of
                    >two classes and push it up into a common parent class. To go one level
                    >deeper, my belief is that at the bottom, all human intellectual
                    >activities are based on factorization: no more, no less.
                    >
                    >In large projects, you'll find that you need to factor out even more.
                    >Let us take an example. Suppose you write an application, and later on
                    >you realize that you need to make it transactional: that is, if some
                    >exceptions happen, you want to roll back the changes. Now, this kind
                    >of major after-thought is terrible for languages without
                    >metaprogramming capabilities. To add a new feature, you will have to
                    >make modifications in hundreds or thousands of spots. Another example,
                    >suppose your software is versioned, more over, you have different
                    >versions for the application and for the data file format, and your
                    >application needs to work with legacy file formats. Again, without
                    >metaprogramming capabilities, your code will have many redundant lines
                    >of code, or be cluttered with tons of if-statements or
                    >switch-statements. Another similar problem: you have several different
                    >clients that buy your application, and they want some different extra
                    >features. Again, without metaprogramming, your code will be either
                    >hard to code (using virtual functions, function pointers, and/or
                    >templates in C++), or will be cluttered with if-else- and switch-
                    >statements (a terrible practice that will make your code
                    >unmaintainable.)
                    >
                    >As your project grows more and more complex (become threaded, many new
                    >clients requirements, support for legacy versions, using distributed
                    >computing in a cluster, etc.) you will realize more and more that you
                    >need to factorize efficiently, otherwise your pain will be unbearable.
                    >
                    >When you have reached that point, you'll come to appreciate simplicity
                    >and purity in a language. Frankly, Python is good but still not good
                    >enough.
                    >
                    >For large projects, if you use a rigid language, then your best bet is
                    >to use tons of programmers coding trivial interfaces and APIs to make
                    >up for the shortcomings of the language. In flexible languages like
                    >Python, you often can use metaprogramming features to factor out the
                    >common areas. At that point, I think that issues like automatically
                    >finding name changes as I mentioned in point (a) become small issues,
                    >because you will have bigger concerns. The fact that you may miss a
                    >name change or function header change is not the thing that will kill
                    >you. The fact that your entire system is unmaintainable is the thing
                    >that will kill you. Don't look at individual bugs when you are talking
                    >about large projects, because your worry should not be there: your
                    >worry should be focused on how to make your system maintainable. Bugs
                    >can and will be fixed. But if your language does not allow you to
                    >factorize efficiently, at the end of the day, that's what's going to
                    >kill you.
                    >
                    >regards,
                    >
                    >Hung Jung


                    --
                    Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

                    "usenet imitates usenet" --Darkhawk
