python24.zip

  • Robin Becker

    python24.zip

    Investigating a query about the Python path, I see that my win32 installation has
    c:/windows/system32/python24.zip (which does not exist) second on sys.path,
    before the actual python24/lib and so on.

    Firstly, should Python start up with non-existent entries on the path?
    Secondly, is this entry the default for some other kind of Python installation?
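
The observation above can be reproduced by listing each sys.path entry next to whether it exists on disk. This is only a sketch, written for a current Python 3 rather than the 2.4 under discussion; `describe_sys_path` is a name made up for illustration.

```python
import os
import sys


def describe_sys_path(path=None):
    """Return (entry, exists) pairs for each entry of the given path list."""
    entries = sys.path if path is None else path
    # An empty string on sys.path stands for the current directory.
    return [(entry, os.path.exists(entry or os.curdir)) for entry in entries]


if __name__ == "__main__":
    for entry, exists in describe_sys_path():
        print(("ok      " if exists else "missing ") + repr(entry))
```

On an installation like the one described, the pythonXY.zip entry would typically be reported as missing.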
    --
    Robin Becker

  • Martin v. Löwis

    #2
    Re: python24.zip

    Robin Becker wrote:
    > Firstly should python start up with non-existent entries on the path?

    Yes, this is by design.

    > Secondly, is this entry the default for some other kind of python
    > installation?

    Yes. People can package everything they want in python24.zip (including
    site.py). This can only work if python24.zip is already on the path
    (and I believe it will always be sought in the directory where
    python24.dll lives).

    Regards,
    Martin


    • Dieter Maurer

      #3
      Re: python24.zip

      "Martin v. Löwis" <martin@v.loewis.de> writes on Fri, 20 May 2005 18:13:56 +0200:
      > Robin Becker wrote:
      > > Firstly should python start up with non-existent entries on the path?
      >
      > Yes, this is by design.
      >
      > > Secondly, is this entry the default for some other kind of python
      > > installation?
      >
      > Yes. People can package everything they want in python24.zip (including
      > site.py). This can only work if python24.zip is already on the path
      > (and I believe it will always be sought in the directory where
      > python24.dll lives).

      The question was:

      "should python start up with **non-existent** objects on the path".

      I think there is no reason why path needs to contain an object
      which does not exist (at the time the interpreter starts).

      In your use case, "python24.zip" does exist and therefore may
      be on the path. When "python24.zip" does not exist, it does
      not contain anything and especially not "site.py".


      I recently analysed excessive import times and
      saw thousands of costly and unnecessary filesystem operations due to:

      * a long "sys.path", especially one containing non-existent objects

        Although they do not exist, about 5 filesystem operations are
        tried on them for any module not yet located.

      * a severe weakness in Python's import hook treatment

        When there is an importer "i" for a path "p" and
        this importer cannot find module "m", then "p" is
        treated as a directory and 5 filesystem operations
        are tried to locate "p/m". Of course, all of them fail
        when "p" happens to be a zip archive.


      Dieter


      • Robin Becker

        #4
        Re: python24.zip

        Dieter Maurer wrote:
        ......
        >
        > The question was:
        >
        > "should python start up with **non-existent** objects on the path".
        >
        > I think there is no reason why path needs to contain an object
        > which does not exist (at the time the interpreter starts).
        >
        > In your use case, "python24.zip" does exist and therefore may
        > be on the path. When "python24.zip" does not exist, it does
        > not contain anything and especially not "site.py".
        >

        I think this was my intention, but also I think I have some concern over
        having two possible locations for the standard library. It seems non pythonic
        and liable to cause confusion if some package should manage to install
        python24.zip while I believe that python24\lib is being used.

        >
        > I recently analysed excessive import times and
        > saw thousands of costly and unnecessary filesystem operations due to:
        >
        > * long "sys.path", especially containing non-existing objects
        >
        > Although non-existent, about 5 filesystem operations are
        > tried on them for any module not yet located.
        >
        > * a severe weakness in Python's import hook treatment
        >
        > When there is an importer "i" for a path "p" and
        > this importer cannot find module "m", then "p" is
        > treated as a directory and 5 file system operations
        > are tried to locate "p/m". Of course, all of them fail
        > when "p" happens to be a zip archive.
        >
        >
        > Dieter

        I suppose that's a reason for eliminating duplicates and non-existent entries.
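
As a rough illustration of eliminating duplicates and non-existent entries, a path list could be filtered as sketched below. This assumes a current Python 3 and assumes every entry is a plain filesystem location; an entry served by a non-filesystem import hook would be dropped by this check, so it is not a safe general-purpose cleanup. `prune_path` is a hypothetical helper name.

```python
import os
import sys


def prune_path(entries):
    """Return entries without duplicates or missing locations, keeping order."""
    seen = set()
    kept = []
    for entry in entries:
        # An empty string stands for the current directory.
        location = os.path.abspath(entry or os.curdir)
        key = os.path.normcase(location)
        if key in seen or not os.path.exists(location):
            continue
        seen.add(key)
        kept.append(entry)
    return kept


if __name__ == "__main__":
    # Mutate in place so other references to sys.path see the change.
    sys.path[:] = prune_path(sys.path)
    print(sys.path)
```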

        --
        Robin Becker


        • Martin v. Löwis

          #5
          Re: python24.zip

          Dieter Maurer wrote:
          > The question was:
          >
          > "should python start up with **non-existent** objects on the path".
          >
          > I think there is no reason why path needs to contain an object
          > which does not exist (at the time the interpreter starts).

          There is. When the interpreter starts, it doesn't know which objects
          do or do not exist. So it must put python24.zip on the path
          just in case.

          > In your use case, "python24.zip" does exist and therefore may
          > be on the path. When "python24.zip" does not exist, it does
          > not contain anything and especially not "site.py".

          Yes, but the interpreter cannot know in advance whether
          python24.zip will be there when it starts.

          > I recently analysed excessive import times and
          > saw thousands of costly and unnecessary filesystem operations due to:

          Hmm. In my Python 2.4 installation, I only get 154 open calls and
          63 stat calls on an empty Python file. So somebody must have messed
          with sys.path really badly if you saw thousands of file operations
          (although I wonder what operating system you use so that failing
          open operations are costly; most operating systems should do them
          very efficiently).

          Regards,
          Martin


          • Steve Holden

            #6
            Re: python24.zip

            Robin Becker wrote:
            > Dieter Maurer wrote:
            [...]
            >
            > I think this was my intention, but also I think I have some concern over
            > having two possible locations for the standard library. It seems non pythonic
            > and liable to cause confusion if some package should manage to install
            > python24.zip while I believe that python24\lib is being used.
            >
            >> I recently analysed excessive import times and
            >> saw thousands of costly and unnecessary filesystem operations due to:
            >>
            >> * long "sys.path", especially containing non-existing objects
            >>
            >> Although non-existent, about 5 filesystem operations are
            >> tried on them for any module not yet located.
            >>
            >> * a severe weakness in Python's import hook treatment
            >>
            >> When there is an importer "i" for a path "p" and
            >> this importer cannot find module "m", then "p" is
            >> treated as a directory and 5 file system operations
            >> are tried to locate "p/m". Of course, all of them fail
            >> when "p" happens to be a zip archive.
            >>
            >> Dieter
            >
            > I suppose that's a reason for eliminating duplicates and non-existent entries.
            >
            There are some aspects of Python's initialization that are IMHO a bit
            too filesystem-dependent. I mentioned one in

            http://sourceforge.net/tracker/index...70&atid=105470

            but I'd appreciate further support. Ideally there should be some means
            for hooked import mechanisms to provide answers that are currently
            sought from the filestore.

            regards
            Steve
            --
            Steve Holden +1 703 861 4237 +1 800 494 3119
            Holden Web LLC http://www.holdenweb.com/
            Python Web Programming http://pydish.holdenweb.com/


            • Dieter Maurer

              #7
              Re: python24.zip

              "Martin v. Löwis" <martin@v.loewis.de> writes on Sat, 21 May 2005 23:53:31 +0200:
              > Dieter Maurer wrote:
              > ...
              > > The question was:
              > >
              > > "should python start up with **non-existent** objects on the path".
              > >
              > > I think there is no reason why path needs to contain an object
              > > which does not exist (at the time the interpreter starts).
              >
              > There is. When the interpreter starts, it doesn't know which objects
              > do or do not exist. So it must put python24.zip on the path
              > just in case.

              Really?

              Is the interpreter unable to call "C" functions ("stat" for example)
              to determine whether an object exists before it puts it on "path"?

              > Yes, but the interpreter cannot know in advance whether
              > python24.zip will be there when it starts.

              Thus, it checks dynamically when it starts.

              > > I recently analysed excessive import times and
              > > saw thousands of costly and unnecessary filesystem operations due to:
              >
              > Hmm. In my Python 2.4 installation, I only get 154 open calls and
              > 63 stat calls on an empty Python file. So somebody must have messed
              > with sys.path really badly if you saw thousands of file operations
              > (although I wonder what operating system you use so that failing
              > open operations are costly; most operating systems should do them
              > very efficiently).

              The application was Zope importing about 2,500 modules
              from 2 zip files, "zope.zip" and "python24.zip".
              This resulted in about 12,500 opens -- about 4 times more
              than would be expected -- about 10,000 of them failing opens.


              Dieter


              • Dieter Maurer

                #8
                Re: python24.zip

                Steve Holden <steve@holdenweb.com> writes on Sun, 22 May 2005 09:14:43 -0400:
                > ...
                > There are some aspects of Python's initialization that are IMHO a bit
                > too filesystem-dependent. I mentioned one in
                >
                > http://sourceforge.net/tracker/index...70&atid=105470
                >
                > but I'd appreciate further support. Ideally there should be some means
                > for hooked import mechanisms to provide answers that are currently
                > sought from the filestore.

                There are such hooks. See e.g. the "meta_path" hooks as
                described by PEP 302.
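
For readers who have not used these hooks, here is a minimal sketch of a "meta_path" finder that answers an import without touching the filesystem. It targets a current Python 3, where the finder protocol is `find_spec` (PEP 451) rather than the `find_module`/`load_module` pair that PEP 302 originally described. The class name `InMemoryFinder` and the module name `hello_from_memory` are invented for the example.

```python
import importlib.abc
import importlib.util
import sys


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve modules from a dict mapping module name to source text."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.sources:
            return None  # not ours; let the next finder on sys.meta_path try
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        # Run the stored source in the new module's namespace.
        exec(self.sources[module.__name__], module.__dict__)


# Put the finder first so it is consulted before the path-based machinery.
sys.meta_path.insert(0, InMemoryFinder({"hello_from_memory": "GREETING = 'hi'\n"}))

import hello_from_memory

print(hello_from_memory.GREETING)
```

Because the finder sits on sys.meta_path, sys.path is never consulted for the modules it knows about.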


                • Martin v. Löwis

                  #9
                  Re: python24.zip

                  Dieter Maurer wrote:
                  > Really?
                  >
                  > Is the interpreter unable to call "C" functions ("stat" for example)
                  > to determine whether an object exists before it puts it on "path".

                  What do you mean, "unable to"? It just doesn't.

                  Could it? Perhaps, if somebody wrote a patch.
                  Would the patch be accepted? Perhaps, if it didn't break something
                  else.

                  In the past, there was a silent guarantee that you could add
                  items to sys.path, and only later create the directories behind
                  these items. I don't know whether people rely on this guarantee.

                  > The application was Zope importing about 2,500 modules
                  > from 2 zip files, "zope.zip" and "python24.zip".
                  > This resulted in about 12,500 opens -- about 4 times more
                  > than would be expected -- about 10,000 of them failing opens.

                  I see. Out of curiosity: how much startup time was saved
                  when sys.path was explicitly stripped to only contain these
                  two zip files?

                  I would expect that importing 2,500 modules takes *way*
                  more time than doing 10,000 failed opens.

                  Regards,
                  Martin


                  • Steve Holden

                    #10
                    Re: python24.zip

                    Dieter Maurer wrote:
                    > Steve Holden <steve@holdenweb.com> writes on Sun, 22 May 2005 09:14:43 -0400:
                    >
                    >> ...
                    >> There are some aspects of Python's initialization that are IMHO a bit
                    >> too filesystem-dependent. I mentioned one in
                    >>
                    >> http://sourceforge.net/tracker/index...70&atid=105470
                    >>
                    >> but I'd appreciate further support. Ideally there should be some means
                    >> for hooked import mechanisms to provide answers that are currently
                    >> sought from the filestore.
                    >
                    > There are such hooks. See e.g. the "meta_path" hooks as
                    > described by PEP 302.

                    Indeed I have written PEP 302-based code to import from a relational
                    database, but I still don't believe there's any satisfactory way to have
                    [such a hooked import mechanism] be a first-class component of an
                    architecture that specifically requires an os.py to exist in the file
                    store during initialization.

                    I wasn't asking for an import hook mechanism (since I already knew these
                    to exist), but for a way to allow such mechanisms to be the sole import
                    support for certain implementations.

                    regards
                    Steve
                    --
                    Steve Holden +1 703 861 4237 +1 800 494 3119
                    Holden Web LLC http://www.holdenweb.com/
                    Python Web Programming http://pydish.holdenweb.com/


                    • Scott David Daniels

                      #11
                      Re: python24.zip

                      Martin v. Löwis wrote:
                      > Dieter Maurer wrote:
                      >
                      >> Really?
                      >>
                      >> Is the interpreter unable to call "C" functions ("stat" for example)
                      >> to determine whether an object exists before it puts it on "path".
                      >
                      > What do you mean, "unable to"? It just doesn't.

                      In fact, the interpreter doesn't necessarily know when it is
                      affecting the path.

                      > Could it? Perhaps, if somebody wrote a patch.
                      > Would the patch be accepted? Perhaps, if it didn't break something
                      > else.
                      >
                      > In the past, there was a silent guarantee that you could add
                      > items to sys.path, and only later create the directories behind
                      > these items. I don't know whether people rely on this guarantee.

                      If you only checked "lost" files/directories on the path a few
                      seconds later than the last time you checked, you might be able
                      to drive this "failed open" time down drastically without seriously
                      affecting those who care. Such an implementation should have a
                      call which allowed you to "clear" the timestamps for the "known bad"
                      entries.
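
The idea in the paragraph above can be sketched as a small "known bad" cache: a missing entry is rechecked only after a delay, and a clear call forgets all failures. This is an illustration in current Python 3 and not something the interpreter provides; the class name, the 5-second default, and the injectable `clock` and `exists` parameters are all assumptions made for the example.

```python
import os
import time


class MissingPathCache:
    """Remember path entries that were recently found to be missing."""

    def __init__(self, recheck_after=5.0, clock=time.monotonic,
                 exists=os.path.exists):
        self.recheck_after = recheck_after
        self._clock = clock
        self._exists = exists
        self._known_bad = {}  # entry -> time of the last failed check

    def is_usable(self, entry):
        now = self._clock()
        last_failure = self._known_bad.get(entry)
        if last_failure is not None and now - last_failure < self.recheck_after:
            return False  # seen missing recently; skip the filesystem call
        if self._exists(entry):
            self._known_bad.pop(entry, None)
            return True
        self._known_bad[entry] = now
        return False

    def clear(self):
        """Forget all recorded failures so the next check hits the filesystem."""
        self._known_bad.clear()
```

Code that creates a directory behind an existing sys.path item would call `clear()` afterwards, which keeps the old "add now, create later" behaviour available to those who rely on it.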

                      --Scott David Daniels
                      Scott.Daniels@Acm.Org


                      • Martin v. Löwis

                        #12
                        Re: python24.zip

                        Scott David Daniels wrote:
                        >>> Is the interpreter unable to call "C" functions ("stat" for example)
                        >>> to determine whether an object exists before it puts it on "path".
                        >>
                        >> What do you mean, "unable to"? It just doesn't.
                        >
                        > In fact, the interpreter doesn't necessarily know when it is
                        > affecting the path.

                        Now I remember what makes this stuff really difficult: PEP 302
                        introduces path hooks (sys.path_hooks), allowing imports from
                        sources other than files. So the items on sys.path don't have
                        to be directory or file names at all, and importing from them
                        may still succeed even though stat fails.
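
To make this concrete, the sketch below registers a path hook that claims a sys.path item which is not a file or directory name, and then imports a module through it. It is written for a current Python 3 (finders return specs via `find_spec`); the `memory://demo` entry, the `SOURCES` table, and the class and module names are all hypothetical.

```python
import importlib.util
import os
import sys

ENTRY = "memory://demo"  # hypothetical sys.path item, not a real location
SOURCES = {ENTRY: {"stat_free_module": "ANSWER = 42\n"}}


class MemoryEntryFinder:
    """Finder and loader for one non-filesystem sys.path item."""

    def __init__(self, modules):
        self.modules = modules

    def find_spec(self, fullname, target=None):
        if fullname not in self.modules:
            return None
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        exec(self.modules[module.__name__], module.__dict__)


def memory_hook(entry):
    # A path hook signals "not mine" by raising ImportError.
    if entry not in SOURCES:
        raise ImportError("not a memory entry: %r" % (entry,))
    return MemoryEntryFinder(SOURCES[entry])


sys.path_hooks.insert(0, memory_hook)
sys.path.append(ENTRY)

entry_exists_on_disk = os.path.exists(ENTRY)  # the stat-style check fails
import stat_free_module                       # yet the import succeeds

print(entry_exists_on_disk, stat_free_module.ANSWER)
```

A startup check that dropped every sys.path item failing an existence test would remove `memory://demo` and break this import.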

                        Regards,
                        Martin


                        • Robin Becker

                          #13
                          Re: python24.zip

                          Martin v. Löwis wrote:
                          .....
                          >
                          >
                          > Now I remember what makes this stuff really difficult: PEP 302
                          > introduces path hooks (sys.path_hooks), allowing imports from
                          > sources other than files. So the items on sys.path don't have
                          > to be directory or file names at all, and importing from them
                          > may still succeed even though stat fails.

                          ..... so is there an implication of multiplicative behaviour?

                          i.e. if we have N importers and F leading failing sys.path entries before the
                          correct one is found, do we get of the order of N*F failed stats/opens etc.?

                          --
                          Robin Becker


                          • Martin v. Löwis

                            #14
                            Re: python24.zip

                            Robin Becker wrote:
                            > i.e. if we have N importers and F leading failing sys.path entries before
                            > the correct one is found, do we get of the order of N*F failed stats/opens etc.?

                            No. Each path hook is supposed to provide a decision as to whether this
                            is a useful item on sys.path only once; the importer objects themselves
                            are then cached (with some operation to clear the cache). Each path hook
                            may apply its own algorithm, e.g. looking at the syntactical structure
                            or the type of the sys.path item, so not all of them need stat/open
                            to determine whether they support the item.

                            The multiplicative behaviour rather results from the different types of
                            modules: each path item may carry .py, .pyc, .so, module.so, etc.
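
Both points can be inspected from a running interpreter. The sketch below, for a current Python 3, lists the file names a file-based finder could consider for one module in one directory, and prints the per-item importer cache. `candidate_names` is a hypothetical helper, and the list is only an enumeration of possibilities: modern CPython caches directory listings, so it does not necessarily issue one filesystem call per name.

```python
import importlib.machinery
import sys


def candidate_names(module_name):
    """Names a file-based finder may consider for a module in one directory."""
    suffixes = importlib.machinery.all_suffixes()
    # The module as a single file, one candidate per recognised suffix.
    names = [module_name + suffix for suffix in suffixes]
    # The module as a package directory containing an __init__ file.
    names += [module_name + "/__init__" + suffix for suffix in suffixes]
    return names


if __name__ == "__main__":
    for name in candidate_names("spam"):
        print(name)
    # One cached finder (or None) per sys.path item, however many hooks exist.
    for entry, finder in sys.path_importer_cache.items():
        print(repr(entry), "->", type(finder).__name__)
```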

                            Regards,
                            Martin
