startswith( prefix[, start[, end]]) Query

  • cjt22@bath.ac.uk

    startswith( prefix[, start[, end]]) Query

    Hi

    The documentation for startswith(prefix[, start[, end]]) states:

    Return True if string starts with the prefix, otherwise return False.
    prefix can also be a tuple of prefixes to look for.

    However, when I try to pass a tuple of prefixes I get the following error:

    TypeError: expected a character buffer object

    For example:

    file = f.readlines()
    for line in file:
        if line.startswith(("abc", "df")):
            CODE

    This generates the above error.

    To work around this problem, I am currently just chaining individual
    startswith calls,
    i.e. if line.startswith("if") or line.startswith("df")
    but I know there must be a way to define all my prefixes in one tuple.

    Thanks in advance

  • Tim Golden

    #2
    Re: startswith( prefix[, start[, end]]) Query

    cjt22@bath.ac.uk wrote:
    > Hi
    >
    > startswith( prefix[, start[, end]]) States:
    >
    > Return True if string starts with the prefix, otherwise return False.
    > prefix can also be a tuple of suffixes to look for.

    That particular aspect of the functionality (multiple
    prefixes in a tuple) was only added in Python 2.5. If you're
    using 2.4 or earlier, you'll need to use "or" or some other approach,
    e.g. looping over a sequence of prefixes.

    TJG
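
A minimal sketch of the "looping over a sequence of prefixes" approach suggested above. It is written for a current Python 3 interpreter, and the helper name `starts_with_any` and the sample lines are illustrative assumptions, not something from the thread. The loop only ever passes a single string to `startswith`, which is the form that older interpreters also accept.

```python
def starts_with_any(text, prefixes):
    """Return True if text starts with at least one of the given prefixes."""
    # Only single-string calls are made, so no tuple support is required.
    for prefix in prefixes:
        if text.startswith(prefix):
            return True
    return False


# Hypothetical sample data standing in for the lines read from a file.
lines = ["abc first", "df second", "xyz third"]
matching = [line for line in lines if starts_with_any(line, ("abc", "df"))]
print(matching)
```

On interpreters where `startswith` accepts a tuple, `line.startswith(("abc", "df"))` should give the same answer without the helper.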


    • attn.steven.kuo@gmail.com

      #3
      Re: startswith( prefix[, start[, end]]) Query

      On Sep 6, 7:09 am, cj...@bath.ac.uk wrote:
      > Hi
      >
      > startswith( prefix[, start[, end]]) States:
      >
      > Return True if string starts with the prefix, otherwise return False.
      > prefix can also be a tuple of suffixes to look for. However when I try
      > and add a tuple of suffixes I get the following error:
      >
      > Type Error: expected a character buffer object
      >
      > For example:
      >
      > file = f.readlines()
      > for line in file:
      >     if line.startswith(("abc","df"))
      >         CODE
      >
      > It would generate the above error

      (snipped)

      You seem to be using an older version of Python.
      For me it works as advertised with 2.5.1,
      but runs into the problem you described with 2.4.4:

      Python 2.5.1c1 (r251c1:54692, Apr 17 2007, 21:12:16)
      [GCC 4.0.0 (Apple Computer, Inc. build 5026)] on darwin
      Type "help", "copyright", "credits" or "license" for more information.
      >>> line = "foobar"
      >>> if line.startswith(("foo", "bar")): print line
      ...
      foobar
      >>> if line.startswith(("foo", "bar")):
      ...     print line
      ...
      foobar


      VS.

      Python 2.4.4 (#1, Oct 18 2006, 10:34:39)
      [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
      Type "help", "copyright", "credits" or "license" for more information.
      >>> line = "foobar"
      >>> if line.startswith(("foo", "bar")): print line
      ...
      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
      TypeError: expected a character buffer object


      --
      Hope this helps,
      Steven


      • Bruno Desthuilliers

        #4
        Re: startswith( prefix[, start[, end]]) Query

        cjt22@bath.ac.uk wrote:
        > Hi
        >
        > startswith( prefix[, start[, end]]) States:
        >
        > Return True if string starts with the prefix, otherwise return False.
        > prefix can also be a tuple of suffixes to look for. However when I try
        > and add a tuple of suffixes I get the following error:
        >
        > Type Error: expected a character buffer object
        >
        > For example:
        >
        > file = f.readlines()
        > for line in file:

        Slightly OT, but:
        1/ you should not use 'file' as an identifier, it shadows the builtin
        file type
        2/ FWIW, it's also a pretty bad naming choice for a list of lines - why
        not just name this list 'lines' ?-)
        3/ anyway, unless you need to store this whole list in memory, you'd be
        better off using the iterator idiom (Python files are iterables):

        f = open('some_file.ext')
        for line in f:
            print line

        > if line.startswith(("abc","df"))
        >     CODE
        >
        > It would generate the above error

        May I suggest that you read the appropriate version of the docs? That
        is, the one corresponding to your installed Python version ?-)

        Passing a tuple to str.startswith is new in 2.5. I bet you're trying it
        on a 2.4 or older version.
        > To overcome this problem, I am currently just joining individual
        > startswith methods
        > i.e. if line.startswith("if") or line.startswith("df")
        > but know there must be a way to define all my suffixes in one tuple.

        You may want to try with a regexp, but I'm not sure it's worth it (hint:
        the timeit module is great for quick small benchmarks).

        Else, you could as well write your own testing function:

        def str_starts_with(astring, *prefixes):
            # bind the method once instead of looking it up on every iteration
            startswith = astring.startswith
            for prefix in prefixes:
                if startswith(prefix):
                    return True
            return False

        for line in f:
            if str_starts_with(line, 'abc', 'de', 'xxx'):
                # CODE HERE

        HTH
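
The regexp alternative mentioned above could look roughly like the sketch below. It is written for Python 3, and the helper name `compile_prefix_matcher` and the sample lines are assumptions made for illustration. `re.escape` keeps any special characters in a prefix literal, and the compiled pattern's `match` method only matches at the start of the string.

```python
import re


def compile_prefix_matcher(prefixes):
    """Build a callable that matches strings starting with any prefix."""
    # Escape each prefix so characters such as "." or "*" are taken literally.
    alternatives = "|".join(re.escape(prefix) for prefix in prefixes)
    # pattern.match() anchors at the beginning of the string by definition.
    return re.compile("(?:%s)" % alternatives).match


starts_with_known_prefix = compile_prefix_matcher(["abc", "df"])

# Hypothetical sample data standing in for the lines read from a file.
lines = ["abc one", "xdf two", "df three"]
matching = [line for line in lines if starts_with_known_prefix(line)]
print(matching)
```

Note that an empty prefix list produces a pattern that matches every string, so a caller would need to handle that case separately.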


        • Tim Williams

          #5
          Re: startswith( prefix[, start[, end]]) Query

          On 06/09/07, Bruno Desthuilliers
          <bruno.42.desthuilliers@wtf.websiteburo.oops.com> wrote:
          > You may want to try with a regexp, but I'm not sure it's worth it (hint:
          > the timeit module is great for quick small benchmarks).
          >
          > Else, you could as well write your own testing function:
          >
          > def str_starts_with(astring, *prefixes):
          >     startswith = astring.startswith
          >     for prefix in prefixes:
          >         if startswith(prefix):
          >             return True
          >     return False
          >
          > for line in f:
          >     if str_starts_with(line, 'abc', 'de', 'xxx'):
          >         # CODE HERE
          >
          Isn't slicing still faster than startswith? As you mention timeit,
          then you should probably add slicing to the pot too :)

          if astring[:len(prefix)] == prefix:
              do_stuff()

          :)
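
For comparison, the slicing test above can be extended to several prefixes along these lines. This is only a sketch for Python 3; the function name `starts_with_any_slice` is an illustrative assumption. For plain strings it should agree with `startswith`, including for an empty prefix.

```python
def starts_with_any_slice(text, prefixes):
    """Slice-based prefix test: compare the leading characters directly."""
    for prefix in prefixes:
        # text[:len(prefix)] is the part of text as long as the prefix.
        if text[:len(prefix)] == prefix:
            return True
    return False


candidates = ["abcdef", "dfg", "xabc", ""]
prefixes = ("abc", "df")
results = [starts_with_any_slice(c, prefixes) for c in candidates]
print(results)
```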


          • TheFlyingDutchman

            #6
            Re: startswith( prefix[, start[, end]]) Query

            > Else, you could as well write your own testing function:
            >
            > def str_starts_with(astring, *prefixes):
            >     startswith = astring.startswith
            >     for prefix in prefixes:
            >         if startswith(prefix):
            >             return True
            >     return False

            What is the reason for
                startswith = astring.startswith
                startswith(prefix)

            instead of
                astring.startswith(prefix)
            ?


            • Steve Holden

              #7
              Re: startswith( prefix[, start[, end]]) Query

              TheFlyingDutchman wrote:
              >> Else, you could as well write your own testing function:
              >>
              >> def str_starts_with(astring, *prefixes):
              >>     startswith = astring.startswith
              >>     for prefix in prefixes:
              >>         if startswith(prefix):
              >>             return True
              >>     return False
              >
              > What is the reason for
              >     startswith = astring.startswith
              >     startswith(prefix)
              >
              > instead of
              >     astring.startswith(prefix)

              It's an optimization: the assignment creates a "bound method" (i.e. a
              method associated with a specific string instance) and avoids having to
              look up the startswith method of astring on each iteration of the
              loop.

              Probably not really necessary, though, and they do say that premature
              optimization is the root of all evil ...

              regards
              Steve
              --
              Steve Holden +1 571 484 6266 +1 800 494 3119
              Holden Web LLC/Ltd http://www.holdenweb.com
              Skype: holdenweb http://del.icio.us/steve.holden
              --------------- Asciimercial ------------------
              Get on the web: Blog, lens and tag the Internet
              Many services currently offer free registration
              ----------- Thank You for Reading -------------
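
A small Python 3 sketch of what "bound method" means in the explanation above. The variable names and the helper `count_matching_prefixes` are illustrative assumptions. The attribute lookup happens once, when the method is assigned to a name, and the resulting object remembers which string it belongs to.

```python
line = "abcdef"

# One attribute lookup; the result is a method object tied to this string.
bound_startswith = line.startswith

# Calling the bound method gives the same answers as line.startswith(...).
first = bound_startswith("abc")
second = bound_startswith("xyz")

# The bound method keeps a reference to the instance it was taken from.
owner_is_line = bound_startswith.__self__ is line


def count_matching_prefixes(text, prefixes):
    """Count how many prefixes match, doing the method lookup only once."""
    startswith = text.startswith
    return sum(1 for prefix in prefixes if startswith(prefix))


matches = count_matching_prefixes(line, ["a", "ab", "b"])
print(first, second, owner_is_line, matches)
```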


              • Bruno Desthuilliers

                #8
                Re: startswith( prefix[, start[, end]]) Query

                Steve Holden wrote:
                > TheFlyingDutchman wrote:
                >>> Else, you could as well write your own testing function:
                >>>
                >>> def str_starts_with(astring, *prefixes):
                >>>     startswith = astring.startswith
                >>>     for prefix in prefixes:
                >>>         if startswith(prefix):
                >>>             return True
                >>>     return False
                >>
                >> What is the reason for
                >>     startswith = astring.startswith
                >>     startswith(prefix)
                >>
                >> instead of
                >>     astring.startswith(prefix)
                >
                > It's an optimization: the assignment creates a "bound method" (i.e. a
                > method associated with a specific string instance) and avoids having to
                > look up the startswith method of astring for each iteration of the inner
                > loop.
                >
                > Probably not really necessary, though, and they do say that premature
                > optimization is the root of all evil ...

                I wouldn't call this one "premature" optimization, since it doesn't
                change the algorithm, doesn't introduce (much) complication, and is
                proven to really save on lookup time.

                Now I do agree that unless you have quite a lot of prefixes to test, it
                might not be that necessary in this particular case...


                • Duncan Booth

                  #9
                  Re: startswith( prefix[, start[, end]]) Query

                  "Tim Williams" <listserver@tdw.net> wrote:
                  > Isn't slicing still faster than startswith? As you mention timeit,
                  > then you should probably add slicing to the pot too :)

                  Possibly, but there are so many other factors that affect the timing
                  that writing it clearly should be your first choice.

                  Some timings:

                  @echo off
                  setlocal
                  cd \python25\lib
                  echo "startswith"
                  ..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2'" s.startswith(t)
                  ..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1'" s.startswith(t)
                  echo "prebound startswith"
                  ..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2';startswith=s.startswith" startswith(t)
                  ..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1';startswith=s.startswith" startswith(t)
                  echo "slice with len"
                  ..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2'" s[:len(t)]==t
                  ..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1'" s[:len(t)]==t
                  echo "slice with magic number"
                  ..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2'" s[:12]==t
                  ..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1'" s[:12]==t

                  and typical output from this is:

                  "startswith"
                  1000000 loops, best of 3: 0.542 usec per loop
                  1000000 loops, best of 3: 0.514 usec per loop
                  "prebound startswith"
                  1000000 loops, best of 3: 0.472 usec per loop
                  1000000 loops, best of 3: 0.474 usec per loop
                  "slice with len"
                  1000000 loops, best of 3: 0.501 usec per loop
                  1000000 loops, best of 3: 0.456 usec per loop
                  "slice with magic number"
                  1000000 loops, best of 3: 0.34 usec per loop
                  1000000 loops, best of 3: 0.315 usec per loop

                  So for these particular strings, the naive slice wins if the comparison is
                  true, but loses to the pre-bound method if the comparison fails. The slice is
                  taking a hit from calling len every time, so pre-calculating the length
                  (which should be possible in the same situations as pre-binding startswith)
                  might be worthwhile, but I would still favour using startswith unless I knew
                  the code was time critical.
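
The same comparison can also be run from within Python instead of a batch file. The sketch below assumes Python 3 and uses `timeit.repeat`; the loop count of 100000 and the dictionary layout are arbitrary choices made for illustration, and the numbers it prints will differ from machine to machine and from the figures quoted above.

```python
import timeit

# Same strings as in the batch file; the comparison fails on the last character.
setup = (
    "s = 'abracadabra1' * 1000; "
    "t = 'abracadabra2'; "
    "startswith = s.startswith"
)

statements = {
    "startswith": "s.startswith(t)",
    "prebound startswith": "startswith(t)",
    "slice with len": "s[:len(t)] == t",
    "slice with magic number": "s[:12] == t",
}

loops = 100000
results = {}
for label, stmt in statements.items():
    # Take the best of 3 runs, as the timeit command line does.
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=loops))
    results[label] = best / loops * 1e6  # microseconds per loop
    print("%-24s %.3f usec per loop" % (label, results[label]))
```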


                  • Steve Holden

                    #10
                    Re: startswith( prefix[, start[, end]]) Query

                    Bruno Desthuilliers wrote:
                    > Steve Holden wrote:
                    > [...]
                    >> Probably not really necessary, though, and they do say that premature
                    >> optimization is the root of all evil ...
                    >
                    > I wouldn't call this one "premature" optimization, since it doesn't
                    > change the algorithm, doesn't introduce (much) complication, and is
                    > proven to really save on lookup time.
                    >
                    > Now I do agree that unless you have quite a lot of prefixes to test, it
                    > might not be that necessary in this particular case...

                    The defense rests.

                    regards
                    Steve


                    • rzed

                      #11
                      Re: startswith( prefix[, start[, end]]) Query

                      Duncan Booth <duncan.booth@invalid.invalid> wrote in
                      news:Xns99A45CD3825D7duncanbooth@127.0.0.1:

                      I went through your example to get timings for my machine, and I
                      ran into an issue I didn't expect.

                      My bat file did the following 10 times in a row:
                      (the command line wraps in this post)

                      call timeit -s "s='abracadabra1'*1000;t='abracadabra2';
                      startswith=s.startswith" startswith(t)

                      ... giving me these times:

                      1000000 loops, best of 3: 0.483 usec per loop
                      1000000 loops, best of 3: 0.49 usec per loop
                      1000000 loops, best of 3: 0.489 usec per loop
                      1000000 loops, best of 3: 0.491 usec per loop
                      1000000 loops, best of 3: 0.488 usec per loop
                      1000000 loops, best of 3: 0.492 usec per loop
                      1000000 loops, best of 3: 0.49 usec per loop
                      1000000 loops, best of 3: 0.493 usec per loop
                      1000000 loops, best of 3: 0.486 usec per loop
                      1000000 loops, best of 3: 0.489 usec per loop

                      Then I thought that a shorter name for the lookup might affect the
                      timings, so I changed the bat file, which now did the following 10
                      times in a row:

                      timeit -s "s='abracadabra1'*1000;t='abracadabra2';
                      sw=s.startswith" sw(t)

                      ... giving me these times:
                      1000000 loops, best of 3: 0.516 usec per loop
                      1000000 loops, best of 3: 0.512 usec per loop
                      1000000 loops, best of 3: 0.514 usec per loop
                      1000000 loops, best of 3: 0.517 usec per loop
                      1000000 loops, best of 3: 0.515 usec per loop
                      1000000 loops, best of 3: 0.518 usec per loop
                      1000000 loops, best of 3: 0.523 usec per loop
                      1000000 loops, best of 3: 0.513 usec per loop
                      1000000 loops, best of 3: 0.514 usec per loop
                      1000000 loops, best of 3: 0.515 usec per loop

                      In other words, the shorter name did seem to affect the timings,
                      but in a negative way. Why it would actually change at all is
                      beyond me, but it is consistently this way on my machine.

                      Can anyone explain this?

                      --
                      rzed
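
One way to probe the question above is to compare the compiled bytecode of two otherwise identical functions that differ only in the length of a local variable name. This is a hedged Python 3 sketch, not an explanation given in the thread: it assumes that names assigned in timeit's setup string behave like local variables of the timed function, and the function names are invented for illustration. If the bytecode is the same, the name length itself is unlikely to be what the timings are measuring.

```python
def with_long_name(s, t):
    startswith = s.startswith  # long local name, as in the first bat file
    return startswith(t)


def with_short_name(s, t):
    sw = s.startswith  # short local name, as in the second bat file
    return sw(t)


# Local variables are referenced by position in the bytecode, not by name.
same_bytecode = (
    with_long_name.__code__.co_code == with_short_name.__code__.co_code
)
different_names = (
    with_long_name.__code__.co_varnames != with_short_name.__code__.co_varnames
)
print(same_bytecode, different_names)
```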


                      • Bruno Desthuilliers

                        #12
                        Re: startswith( prefix[, start[, end]]) Query

                        Steve Holden wrote:
                        > Bruno Desthuilliers wrote:
                        >> Steve Holden wrote:
                        >> [...]
                        >>> Probably not really necessary, though, and they do say that premature
                        >>> optimization is the root of all evil ...
                        >>
                        >> I wouldn't call this one "premature" optimization, since it doesn't
                        >> change the algorithm, doesn't introduce (much) complication, and is
                        >> proven to really save on lookup time.
                        >>
                        >> Now I do agree that unless you have quite a lot of prefixes to test,
                        >> it might not be that necessary in this particular case...
                        >
                        > The defense rests.

                        Sorry, I don't understand this one (please bear with a poor French boy).
