Bug in glob.glob for files w/o extensions in Windows

  • Georgy Pruss

    Bug in glob.glob for files w/o extensions in Windows

    On Windows XP glob.glob doesn't work properly for files without extensions.
    E.g. C:\Temp contains 4 files: 2 with extensions, 2 without.

    C:\Temp>dir /b *
    aaaaa.aaa
    bbbbb.bbb
    ccccc
    ddddd

    C:\Temp>dir /b *.
    ccccc
    ddddd

    C:\Temp>python
    Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import glob
    >>> glob.glob( '*' )
    ['aaaaa.aaa', 'bbbbb.bbb', 'ccccc', 'ddddd']
    >>> glob.glob( '*.' )
    []

    It looks like a bug.

    Georgy
    --
    Georgy Pruss
    E-mail: 'ZDAwMTEyMHQwMzMwQGhvdG1haWwuY29t\n'.decode('base64')


  • Georgy Pruss

    #2
    Re: Bug in glob.glob for files w/o extensions in Windows

    OK, you can call it not a bug, but different behavior.
    I've found that the fnmatch module is the reason for that.
    Here are some other examples:

    C:\temp>dir /b *.*
    .eee
    aaa.aaa
    nnn

    C:\temp>dir /b * # by definition a synonym for *.*
    .eee
    aaa.aaa
    nnn

    C:\temp>dir /b .*
    .eee

    C:\temp>dir /b *. # it looks strange too
    .eee
    nnn


    C:\temp>python
    >>> import glob
    >>> glob.glob('*.*')
    ['aaa.aaa']
    >>> glob.glob('*')
    ['aaa.aaa', 'nnn']
    >>> glob.glob('.*')
    ['.eee']
    >>> glob.glob('*.')
    []


    It seems that in any case I'll have to extract 'nnn' myself.
    Something like:

    if mask.endswith('.'): # no extension actually implies no dots in the name at all
        list = glob.glob( mask[:-1] )
        list = filter( lambda x: '.' not in x, list ) # or [x for x in list if '.' not in x]
    else:
        list = glob.glob( mask )

    G-:
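
For readers on current Python, the workaround above might be sketched as follows. This is only an illustration under assumptions: the helper name `glob_dos_style` is made up here, and the sketch tests the base name rather than the whole path, so that dots in directory names do not interfere.

```python
import glob
import os


def glob_dos_style(mask):
    """Hypothetical helper: a trailing '.' in the mask means 'no dot in the name'."""
    if mask.endswith('.'):
        # Drop the trailing dot, glob normally, then keep only dot-free names.
        candidates = glob.glob(mask[:-1])
        return [name for name in candidates
                if '.' not in os.path.basename(name)]
    # Any other mask is passed to glob unchanged.
    return glob.glob(mask)
```

With the four files from the first post, `glob_dos_style('*.')` would return the two names without dots, while `glob_dos_style('*')` behaves exactly like `glob.glob('*')`.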



    • Jules Dubois

      #3
      Re: Bug in glob.glob for files w/o extensions in Windows

      On Sun, 30 Nov 2003 03:47:38 GMT, in article
      <news:uLdyb.46389$I53.2118790@twister.southeast.rr.com>, Georgy Pruss
      wrote:

      > On Windows XP glob.glob doesn't work properly for files without extensions.
      > E.g. C:\Temp contains 4 files: 2 with extensions, 2 without.
      > [...]
      > C:\Temp>dir /b *.
      > ccccc
      > ddddd

      This is standard Windows behavior. It's compatible with CP/M and therefore
      MS-DOS, and Microsoft has preserved this behavior in all versions of
      Windows.

      Did you ever poke around in the directory system in a FAT partition
      (without VFAT)? You'll find that every file name is exactly 11 characters
      long and "." is not found in any part of any file name in any directory
      entry.

      It's bizarre but that's the way it works. If you try

      dir /b *

      does cmd.exe list only files without extensions?

      > >>> glob.glob( '*.' )
      > []
      >

      glob provides "Unix style pathname pattern expansion" as documented in the
      _Python Library Reference_: If there's a period (".") in the pattern, it
      must match a period in the filename.

      > It looks like a bug.

      No, it's proper behavior. It's Windows that's (still) screwy.
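
The rule can be checked without touching the file system. The sketch below feeds the file names from the original report to `fnmatch.filter`, on the assumption that glob's matching follows the same fnmatch-style rules:

```python
import fnmatch

# File names taken from the original report.
names = ['aaaaa.aaa', 'bbbbb.bbb', 'ccccc', 'ddddd']

# '*' matches every name in the list.
everything = fnmatch.filter(names, '*')

# '*.' requires a literal period as the last character, so nothing matches.
trailing_dot = fnmatch.filter(names, '*.')

print(everything)
print(trailing_dot)
```

Since none of the four names ends in a period, the second list comes back empty, which is the result `glob.glob('*.')` gave.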


      • Georgy Pruss

        #4
        Re: Bug in glob.glob for files w/o extensions in Windows


        "Jules Dubois" <bogus@invalid.tld> wrote in message news:nj2k03e19clm$.uctj11fclu96$.dlg@40tude.net...
        | On Sun, 30 Nov 2003 03:47:38 GMT, in article
        | <news:uLdyb.46389$I53.2118790@twister.southeast.rr.com>, Georgy Pruss
        | wrote:
        |
        | > On Windows XP glob.glob doesn't work properly for files without extensions.
        | > E.g. C:\Temp contains 4 files: 2 with extensions, 2 without.
        | > [...]
        | > C:\Temp>dir /b *.
        | > ccccc
        | > ddddd
        |
        | This is standard Windows behavior. It's compatible with CP/M and therefore
        | MS-DOS, and Microsoft has preserved this behavior in all versions of
        | Windows.

        That's what I meant, wanted and liked.

        C'mon guys, I don't care if it's FAT, NTFS, Windows, Linux, VMS or whatever.
        All I wanted was to get files w/o dots in their names (on my computer :)).
        I did it and I can do it on any system if I need.


        | Did you ever poke around in the directory system in a FAT partition
        | (without VFAT)? You'll find that every file name is exactly 11 characters
        | long and "." is not found in any part of any file name in any directory
        | entry.
        |
        | It's bizarre but that's the way it works. If you try
        |
        | dir /b *
        |
        | does cmd.exe list only files without extensions?

        By definition it's the same as *.* if my memory serves me right.


        | >>>> glob.glob( '*.' )
        | > []
        | >
        |
        | glob provides "Unix style pathname pattern expansion" as documented in the
        | _Python Library Reference_: If there's a period (".") in the pattern, it
        | must match a period in the filename.
        |
        | > It looks like a bug.
        |
        | No, it's proper behavior. It's Windows that's (still) screwy.

        I see.
        Show the world a perfect OS and you'll be a billionaire.

        G-:



        • Jules Dubois

          #5
          Re: Bug in glob.glob for files w/o extensions in Windows

          On Sun, 30 Nov 2003 06:18:36 GMT, in article
          <news:0Zfyb.48050$dl.2119318@twister.southeast.rr.com>, Georgy Pruss wrote:

          > "Jules Dubois" <bogus@invalid.tld> wrote in message news:nj2k03e19clm$.uctj11fclu96$.dlg@40tude.net...
          >| On Sun, 30 Nov 2003 03:47:38 GMT, in article
          >|
          > C'mon guys, I don't care if it's FAT, NTFS, Windows, Linux, VMS or whatever.
          > All I wanted was to get files w/o dots in their names (on my computer :)).

          I was just pointing out the reason for the behavior.

          >| dir /b *
          >|
          >| does cmd.exe list only files without extensions?
          >
          > By definition it's the same as *.* if my memory serves me right.

          I'm sure ".*" was the same as "*.*". Win2k's cmd.exe won't run under Wine,
          so I couldn't test "*".

          >| No, it's proper behavior. It's Windows that's (still) screwy.
          >
          > Show the world a perfect OS and you'll be a billionaire.

          We agree, then, that every operating system has its good points and its bad
          points. (I guess we don't agree on whether "*." should or shouldn't match
          files without periods in their name.)


          • Georgy Pruss

            #6
            Re: Bug in glob.glob for files w/o extensions in Windows


            "Jules Dubois" <bogus@invalid.tld> wrote in message news:b6xinmmkc0wp.16hmc77xoj9t2$.dlg@40tude.net...
            |
            | We agree, then, that every operating system has its good points and its bad
            | points. (I guess we don't agree on whether "*." should or shouldn't match
            | files without periods in their name.)

            Anyway, "*." is not a bad DOS convention to select files w/o extension, although
            it comes from the old 8.3 name scheme. BTW, how can you select files w/o
            extension in Unix's shells?

            G-:




            • Francis Avila

              #7
              Re: Bug in glob.glob for files w/o extensions in Windows

              Georgy Pruss wrote in message ...
              >OK, you can call it not a bug, but different behavior.


              That's true. But calling dir's behavior "different" here is quite a
              euphemism!

              >It seems that in any case I'll have to extract 'nnn' by myself.
              >Something like:
              >
              > if mask.endswith('.'): # no extension implies actually no dots in name at all
              >     list = glob.glob( mask[:-1] )
              >     list = filter( lambda x: '.' not in x, list ) # or [x for x in list if '.' not in x]
              > else:
              >     list = glob.glob( mask )
              >


              I don't understand where 'mask' is coming from. If you want files with no
              dots, just filter out those files:

              filelist = [file for file in glob.glob('*') if '.' not in file]

              Or you can use sets: symmetric difference of all files against the files
              with dots.
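
The set-based variant could look roughly like this. It is a sketch only: the function name `names_without_dots` is invented for illustration, and because the dotted names are a subset of all names, the symmetric difference is the same as a plain difference here.

```python
import glob
import os


def names_without_dots(directory):
    """Hypothetical helper: visible names in `directory` that contain no dot."""
    base = glob.escape(directory)  # protect any glob metacharacters in the path
    all_files = set(glob.glob(os.path.join(base, '*')))
    with_dots = set(glob.glob(os.path.join(base, '*.*')))
    # Symmetric difference: whatever is in exactly one of the two sets.
    return sorted(os.path.basename(p) for p in all_files ^ with_dots)
```

Both glob calls skip names that start with a dot, so a file such as '.eee' never enters either set.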

              If you're trying to recast glob in Windows' image, you'll have to
              special-case '*.*' too. And then what do you do if someone comes along who
              *really* wants *only* names with dots in them!?

              Trying to shoehorn Windows-style semantics into glob is just braindead--the
              Windows semantics are wrong because dots are not special anymore. For one
              thing, we can have more than one of them, and they can be anywhere in the
              filename. Neither was true for DOS, whence Windows inherited the *.*
              nonsense.

              Behold the awesome visage of the One True Glob (TM): (Not that I'm starting
              a holy war or anything ;)
              *.* -> Filename has a dot in it, and that dot cannot be the first or last char.
                     This is NOT the same as '*'!!
              .*  -> Filename has a dot as the first character.
              *.  -> Filename has a dot as the last character.
              *   -> Gimme everything.
              --
              Francis Avila
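
How Python's own matcher treats these four patterns can be tried directly with `fnmatch.fnmatchcase`. The sample names below are made up for illustration. At the plain string-matching level, fnmatch is somewhat more permissive than the list above: `*` may match an empty string, so `*.*` also accepts names whose only dot is the first or last character. `glob.glob` adds one rule on top of this, skipping names that start with a dot unless the pattern starts with one too.

```python
import fnmatch

# Made-up sample names: no dot, leading dot, inner dot, trailing dot.
names = ['a', '.b', 'c.c', 'd.']
patterns = ['*.*', '.*', '*.', '*']

# For each pattern, collect the names that fnmatch accepts (case-sensitive).
matches = {p: [n for n in names if fnmatch.fnmatchcase(n, p)]
           for p in patterns}

for p in patterns:
    print(f'{p!r:6} -> {matches[p]}')
```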


              • Serge Orlov

                #8
                Re: Bug in glob.glob for files w/o extensions in Windows

                > Anyway, "*." is not a bad DOS convention to select files w/o extension, although
                > it comes from the old 8.3 name scheme. BTW, how can you select files w/o
                > extension in Unix's shells?
                The same way as in Python:
                filelist = [file for file in glob.glob('*') if '.' not in file]
                Shell:
                ls|grep -v [.]
                Making up special conventions is not the Python way.

                -- Serge.
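
One caveat: "no dot in the name" and "no extension" are not quite the same test. A small sketch with `os.path.splitext` shows the difference; the helper name `has_no_extension` and the sample names are assumptions made for illustration.

```python
import os


def has_no_extension(name):
    """Hypothetical helper: True if os.path.splitext finds no extension."""
    root, ext = os.path.splitext(name)
    return ext == ''


# Made-up sample names: no dot, leading dot, inner dot, trailing dot.
names = ['a', '.b', 'c.c', 'd.']

no_dot = [n for n in names if '.' not in n]         # the filter used in this thread
no_ext = [n for n in names if has_no_extension(n)]  # the splitext-based test

print(no_dot)
print(no_ext)
```

`os.path.splitext` treats a leading dot as part of the base name, so '.b' counts as having no extension even though it contains a dot.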



                • Peter Otten

                  #9
                  Re: List files without extension, was: Bug in glob.glob for files w/o extensions in Windows

                  Georgy Pruss wrote:

                  > Anyway, "*." is not a bad DOS convention to select files w/o extension,
                  > although it comes from the old 8.3 name scheme. BTW, how can you select
                  > files w/o extension in Unix's shells?

                  ls -I*.*

                  The -I option tells the ls command what *not* to show.

                  Peter


                  • Gerrit Holl

                    #10
                    Re: Bug in glob.glob for files w/o extensions in Windows

                    Francis Avila wrote:
                    > Behold the awesome visage of the One True Glob (TM): (Not that I'm starting
                    > a holy war or anything ;)
                    > *.* -> Filename has a dot in it, and that dot cannot be the first or last
                    > char.
                    > This is NOT the same as '*'!!
                    > .* -> Filename has a dot as the first character.
                    > *. -> Filename has a dot as the last character.
                    > * -> Gimme everything.

                    Note that Bash doesn't behave like this either: * does not give
                    everything, rather it gives everything not starting with a dot. In Bash,
                    * really means: [!.]*

                    yours,
                    Gerrit.
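
Python's glob follows the same convention, as the earlier session in this thread already hints ('.eee' is missing from `glob.glob('*')`). A minimal sketch in a throwaway directory, with made-up file names and a hypothetical helper `visible_and_hidden`:

```python
import glob
import os
import tempfile


def visible_and_hidden(directory):
    """Hypothetical helper: names matched by '*' and by '.*' in `directory`."""
    base = glob.escape(directory)  # protect any glob metacharacters in the path
    visible = sorted(os.path.basename(p)
                     for p in glob.glob(os.path.join(base, '*')))
    hidden = sorted(os.path.basename(p)
                    for p in glob.glob(os.path.join(base, '.*')))
    return visible, hidden


with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files, some of them with a leading dot.
    for name in ['a', '.b', 'b', '.c', 'd']:
        open(os.path.join(tmp, name), 'w').close()
    visible, hidden = visible_and_hidden(tmp)

print(visible)
print(hidden)
```

Names with a leading dot only show up when the pattern itself starts with a dot.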

                    --
                    102. If a merchant entrust money to an agent (broker) for some
                    investment, and the broker suffer a loss in the place to which he goes, he
                    shall make good the capital to the merchant.
                    -- 1780 BC, Hammurabi, Code of Law
                    --
                    Asperger's Syndrome - a personal approach:



                    • Francis Avila

                      #11
                      Re: Bug in glob.glob for files w/o extensions in Windows

                      Gerrit Holl wrote in message ...
                      >Francis Avila wrote:
                      >Note that Bash doesn't behave like this either: * does not give
                      >everything, rather it gives everything not starting with a dot. In Bash,
                      >* really means: [!.]*


                      That behavior can be modified with the 'dotglob' shell option:

                      $ shopt -s dotglob
                      $ echo *
                      a .b b .c d ...

                      --
                      Francis Avila



                      • Stein Boerge Sylvarnes

                        #12
                        Re: List files without extension, was: Bug in glob.glob for files w/o extensions in Windows

                        In article <bqcg7i$olj$03$1@news.t-online.com>, Peter Otten wrote:
                        >Georgy Pruss wrote:
                        >
                        >> Anyway, "*." is not a bad DOS convention to select files w/o extension,
                        >> although it comes from the old 8.3 name scheme. BTW, how can you select
                        >> files w/o extension in Unix's shells?
                        >
                        >ls -I*.*
                        >
                        >The -I option tells the ls command what *not* to show.
                        >
                        That's non-standard GNU ls behaviour, I think. (Tested on OpenBSD and SunOS)

                        >Peter

                        --
                        regards/mvh
                        Stein B. Sylvarnes
                        stein.sylvarnes@student.uib.no


                        • Francis Avila

                          #13
                          Re: List files without extension, was: Bug in glob.glob for files w/o extensions in Windows


                          Stein Boerge Sylvarnes wrote in message ...
                          >In article <bqcg7i$olj$03$1@news.t-online.com>, Peter Otten wrote:
                          >>Georgy Pruss wrote:
                          >>
                          >>> Anyway, "*." is not a bad DOS convention to select files w/o extension,
                          >>> although it comes from the old 8.3 name scheme. BTW, how can you select
                          >>> files w/o extension in Unix's shells?


                          In Windows, how do you create a file with a dot as the last character? In
                          Unix you can do this, because a dot is just another character in the
                          filename. It's because you can't do this in Windows that *. is unambiguous.

                          >>ls -I*.*
                          >>
                          >>The -I option tells the ls command what *not* to show.
                          >>
                          >That's non-standard gnu ls behaviour, I think. (Tested on OpenBSD and SunOS)


                          for N in $(ls -1 | grep -v '\.'); do echo $N; done

                          Not positively sure that the -1 option is POSIX, but it's at least in
                          OpenBSD and SunOS (in fact, it's the default when output is not to a
                          terminal).

                          Bash also has an extglob option:

                          $ shopt -s extglob dotglob
                          $ ls -1 *
                          a
                          .b
                          c.c
                          d.
                          $ echo !(*.*)
                          a

                          There's also @(), ?(), *(), +(). You can use multiple patterns within the
                          parens by joining with '|'.
                          --
                          Francis Avila


                          • Jules Dubois

                            #14
                            Re: Bug in glob.glob for files w/o extensions in Windows

                            On Sun, 30 Nov 2003 08:59:55 GMT, in article
                            <news:fkiyb.48948$dl.2131385@twister.southeast.rr.com>, Georgy Pruss wrote:

                            > "Jules Dubois" <bogus@invalid.tld> wrote in message news:b6xinmmkc0wp.16hmc77xoj9t2$.dlg@40tude.net...
                            >| (I guess we don't agree on whether "*." should or shouldn't match
                            >| files without periods in their name.)
                            >
                            > Anyway, "*." is not a bad DOS convention to select files w/o extension, although
                            > it comes from the old 8.3 name scheme. BTW, how can you select files w/o
                            > extension in Unix's shells?

                            Touche.


                            • Mel Wilson

                              #15
                              Re: Bug in glob.glob for files w/o extensions in Windows

                              In article <0Zfyb.48050$dl.2119318@twister.southeast.rr.com>,
                              "Georgy Pruss" <see_signature__@hotmail.com> wrote:
                              >
                              >"Jules Dubois" <bogus@invalid.tld> wrote in message news:nj2k03e19clm$.uctj11fclu96$.dlg@40tude.net...
                              >| On Sun, 30 Nov 2003 03:47:38 GMT, in article
                              >| <news:uLdyb.46389$I53.2118790@twister.southeast.rr.com>, Georgy Pruss
                              >| wrote:
                              >|
                              >| > On Windows XP glob.glob doesn't work properly for files without extensions.
                              >| > E.g. C:\Temp contains 4 files: 2 with extensions, 2 without.
                              >| > [...]
                              >| > C:\Temp>dir /b *.
                              >| > ccccc
                              >| > ddddd
                              >|
                              >| This is standard Windows behavior. It's compatible with CP/M and therefore
                              >| MS-DOS, and Microsoft has preserved this behavior in all versions of
                              >| Windows.
                              >
                              >That's what I meant, wanted and liked.
                              >
                              >C'mon guys, I don't care if it's FAT, NTFS, Windows, Linux, VMS or whatever.
                              >All I wanted was to get files w/o dots in their names (on my computer :)).
                              >I did it and I can do it on any system if I need.

                              Looks like you need os.path.glob(), which doesn't exist, yet.

                              Regards. Mel.
