Tokenizer Function (plus rant on strtok documentation)

  • Robbie Hatley

    Tokenizer Function (plus rant on strtok documentation)

    A couple of days ago I decided to force myself to really learn
    exactly what "strtok" does, and how to use it. I figured I'd
    just look it up in some book and that would be that.

    I figured wrong!

    Firstly, Bjarne Stroustrup's "The C++ Programming Language" said:

    (nothing)

    Ok, how about a C book? Stephen Prata's "C Primer Plus" said:

    (nothing)

    Aaarrrggg. Ok, how about good old Randy Schildt and his book
    "C++: Complete Reference"? It said:

    #include <cstring>
    char *strtok(char *str1, const char *str2);
    The strtok() function returns a pointer to the next token in
    the string pointed to by str1. The characters making up the
    string pointed to by str2 are the delimiters that determine
    the token. A null pointer is returned when there is no token
    to return. To tokenize a string, the first call to strtok()
    must have str1 point to the string being tokenized. Subsequent
    calls must use a null pointer for str1. In this way, the entire
    string can be reduced to its tokens. It is possible to use a
    different set of delimiters for each call to strtok().

    Ok. But when I tried using the function, it didn't do what I
    expected at all. For one thing, it severely alters the contents
    of its first argument. Randy Schildt's book doesn't mention that
    little factoid at all. :-( Bad Randy!
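
To make the "alters its first argument" point concrete, here is a minimal sketch. The helper name tokens_in_place and the sample text are made up for illustration; only std::strtok itself comes from the standard library. The function overwrites the delimiter that ends each token with '\0', so afterwards the buffer reads as just its first token:

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Tokenizes a writable buffer in place and returns copies of the tokens.
// The buffer is modified: strtok overwrites the delimiter that ends each
// token with '\0'.
std::vector<std::string> tokens_in_place(char* buffer, const char* delims)
{
    std::vector<std::string> out;
    for (char* tok = std::strtok(buffer, delims);
         tok != nullptr;
         tok = std::strtok(nullptr, delims))
    {
        out.push_back(tok);
    }
    return out;
}

// Small self-check showing the side effect on the input buffer.
void strtok_side_effect_demo()
{
    char text[] = "alpha,,beta;gamma";
    std::vector<std::string> toks = tokens_in_place(text, ",;");

    assert(toks.size() == 3);        // adjacent delimiters yield no empty token
    assert(toks[0] == "alpha");
    assert(std::strlen(text) == 5);  // the buffer now reads as just "alpha"
    assert(text[5] == '\0');         // the first ',' was overwritten
}
```

This is also why strtok cannot be handed a string literal or the result of c_str(): it needs a writable copy.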

    I had to google this function and find info on it on the web in
    order to find out how it really works. Turns out, there's lots
    of things missing in Schildt's description. (But hey, at least
    he tried. Most other C/C++ authors chicken out and won't even
    touch strtok in their books.) This is how this function REALLY
    works:



    I wish more authors would cover this useful function in their
    books. After all, it IS a part of both the C and C++ standard
    libraries. Ok, I'm done ranting now.


    For your amusement, here is a function I wrote to break a string
    into tokens, given a string of "separator" characters, and put
    the tokens in a std::vector<std::string>. I'm sure there's
    various ways this could be improved. Comments? Slings? Arrows?


    #include <cstring>   // memset, strncpy, strtok
    #include <string>
    #include <vector>

    void
    Tokenize
       (
          std::string const & RawText,
          std::string const & Delimiters,
          std::vector<std::string> & Tokens
       )
    {
       // Load raw text into an appropriately-sized dynamic char array:
       size_t StrSize = RawText.size();
       size_t ArraySize = StrSize + 5;
       char* Ptr = new char[ArraySize];
       memset(Ptr, 0, ArraySize);
       strncpy(Ptr, RawText.c_str(), StrSize);

       // Clear the Tokens vector:
       Tokens.clear();

       // Get the tokens from the array and put them in the vector:
       char* TokenPtr = NULL;
       char* TempPtr = Ptr;
       while (NULL != (TokenPtr = strtok(TempPtr, Delimiters.c_str())))
       {
          Tokens.push_back(std::string(TokenPtr));
          TempPtr = NULL;
       }

       // Free memory and scram:
       delete[] Ptr;
       return;
    }


    --
    Cheers,
    Robbie Hatley
    East Tustin, CA, USA
    lone wolf intj at pac bell dot net
    (put "[usenet]" in subject to bypass spam filter)



  • jmoy

    #2
    Re: Tokenizer Function (plus rant on strtok documentation)

    Robbie Hatley wrote:
    > A couple of days ago I decided to force myself to really learn
    > exactly what "strtok" does, and how to use it.
    >
    > This is how this function REALLY
    > works:
    >
    > I wish more authors would cover this useful function in their
    > books. After all, it IS a part of both the C and C++ standard
    > libraries. Ok, I'm done ranting now.

    strtok is one of the weird functions that maintain internal state, so
    that you cannot tokenize two strings in an interleaved manner or use it
    in a multithreaded program. POSIX offers a strtok_r which is somewhat
    saner.
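
As a rough illustration of the strtok_r point, assuming a POSIX system (strtok_r is not part of ISO C or C++): the sketch below nests one tokenizing loop inside another, which plain strtok cannot do because it has only one hidden position. The function name parse_pairs and the "key=value;key=value" format are invented for the example.

```cpp
#include <cassert>
#include <string.h>   // strtok_r is POSIX, not ISO C or C++
#include <string>
#include <utility>
#include <vector>

// Splits "k1=v1;k2=v2" into key/value pairs using two nested strtok_r loops.
// Each loop keeps its position in a caller-supplied char* instead of hidden
// static state, so the inner loop does not disturb the outer one.
std::vector<std::pair<std::string, std::string>> parse_pairs(std::string text)
{
    std::vector<std::pair<std::string, std::string>> out;
    char* outer_state = nullptr;
    for (char* item = strtok_r(&text[0], ";", &outer_state);
         item != nullptr;
         item = strtok_r(nullptr, ";", &outer_state))
    {
        char* inner_state = nullptr;
        char* key = strtok_r(item, "=", &inner_state);
        char* value = strtok_r(nullptr, "=", &inner_state);
        if (key != nullptr && value != nullptr)
            out.emplace_back(key, value);
    }
    return out;
}

void parse_pairs_demo()
{
    auto pairs = parse_pairs("width=80;height=24");
    assert(pairs.size() == 2);
    assert(pairs[1].first == "height" && pairs[1].second == "24");
}
```

The text is taken by value so that the caller's string is left untouched; strtok_r still writes '\0' bytes into the local copy.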

    > For your amusement, here is a function I wrote to break a string
    > into tokens, given a string of "separator" characters, and put
    > the tokens in a std::vector<std::string>. I'm sure there's
    > various ways this could be improved. Comments? Slings? Arrows?
    >
    > void
    > Tokenize
    >    (
    >       std::string const & RawText,
    >       std::string const & Delimiters,
    >       std::vector<std::string> & Tokens
    >    )
    > {
    >    // Load raw text into an appropriately-sized dynamic char array:
    >    size_t StrSize = RawText.size();
    >    size_t ArraySize = StrSize + 5;
    >    char* Ptr = new char[ArraySize];
    >    memset(Ptr, 0, ArraySize);
    >    strncpy(Ptr, RawText.c_str(), StrSize);
    >
    >    // Clear the Tokens vector:
    >    Tokens.clear();
    >
    >    // Get the tokens from the array and put them in the vector:
    >    char* TokenPtr = NULL;
    >    char* TempPtr = Ptr;
    >    while (NULL != (TokenPtr = strtok(TempPtr, Delimiters.c_str())))
    >    {
    >       Tokens.push_back(std::string(TokenPtr));
    >       TempPtr = NULL;
    >    }
    >
    >    // Free memory and scram:
    >    delete[] Ptr;
    >    return;
    > }

    I guess tying the tokenizer to vector<string> is not a good idea. If it
    took an output iterator it could be used with any container or even
    with things like ostream_iterators. Here is my attempt, which also gets
    rid of strtok:

    #include <string>
    using namespace std;
    template <class OIter> void tokenize( const string &str,
                                          const string &delim,
                                          OIter oi)
    {
        typedef string::size_type Sz;

        Sz begin=0;
        while(begin<str.size()){
            Sz end=str.find_first_of(delim,begin);
            *oi++=str.substr(begin,end-begin);
            begin=str.find_first_not_of(delim,end);
        }
    }

    I use find_first_not_of in order to be compatible with strtok's
    behaviour of treating multiple adjacent delimiters as a single
    delimiter. I have not measured the performance of this version against
    the strtok version.


    • sonison.james@gmail.com

      #3
      Re: Tokenizer Function (plus rant on strtok documentation)

      jmoy wrote:
      > I use find_first_not_of in order to be compatible with strtok's
      > behaviour of treating multiple adjacent delimiters as a single
      > delimiter. I have not measured the performance of this version against
      > the strtok version.

      Check out http://www.boost.org/libs/tokenizer/index.html
      One cool thing about the Boost tokenizer is that you can get empty
      tokens if you have adjacent separators, which I believe can't be
      handled by strtok.
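
To show what "empty tokens between adjacent separators" means, here is a small standard-library sketch. It is not Boost's API and makes no claim about how Boost implements this; split_keep_empty is a hypothetical helper that simply steps over one delimiter at a time instead of a whole run of them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits text at every delimiter character and keeps empty fields, so
// "a,,b" gives {"a", "", "b"} rather than {"a", "b"}.
std::vector<std::string> split_keep_empty(const std::string& text,
                                          const std::string& delims)
{
    std::vector<std::string> fields;
    std::string::size_type begin = 0;
    for (;;) {
        std::string::size_type end = text.find_first_of(delims, begin);
        if (end == std::string::npos) {
            fields.push_back(text.substr(begin));  // last field, may be empty
            break;
        }
        fields.push_back(text.substr(begin, end - begin));
        begin = end + 1;                           // step over one delimiter only
    }
    return fields;
}

void split_keep_empty_demo()
{
    std::vector<std::string> f = split_keep_empty("a,,b,", ",");
    assert(f.size() == 4);
    assert(f[0] == "a" && f[1].empty() && f[2] == "b" && f[3].empty());
}
```

This behaviour matters for formats like CSV, where an empty field is data; strtok-style splitting silently drops it.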

      Thanks and regards
      SJ


      • Mehturt@gmail.com

        #4
        Re: Tokenizer Function (plus rant on strtok documentation)


        jmoy wrote:
        > #include <string>
        > using namespace std;
        > template <class OIter> void tokenize( const string &str,
        >                                       const string &delim,
        >                                       OIter oi)
        > {
        >     typedef string::size_type Sz;
        >
        >     Sz begin=0;
        >     while(begin<str.size()){
        >         Sz end=str.find_first_of(delim,begin);
        >         *oi++=str.substr(begin,end-begin);
        >         begin=str.find_first_not_of(delim,end);
        >     }
        > }

        I like this implementation, but don't you assume the space for data
        (tokens) is already pre-allocated?
        If I use your function with something like this, I get a segmentation fault:

        std::vector<std::string> v;
        std::vector<std::string>::iterator it = v.begin();
        tokenize<std::vector<std::string>::iterator>("a b c", " ", it);

        > I use find_first_not_of in order to be compatible with strtok's
        > behaviour of treating multiple adjacent delimiters as a single
        > delimiter. I have not measured the performance of this version against
        > the strtok version.


        • Robbie Hatley

          #5
          Re: Tokenizer Function (plus rant on strtok documentation)

          "jmoy" <jmoy.matecon@gmail.com> wrote:
          > strtok is one of the weird functions that maintain internal state, so
          > that you cannot tokenize two strings in an interleaved manner or use it
          > in a multithreaded program. POSIX offers a strtok_r which is somewhat
          > saner.

          Ah, sort of like the code my ex-boss left me to maintain after he
          got fired. Hundreds of global variables, which he uses to pass
          data from function to function, like a dumbass. Of course, since
          the program is a complex Windows app with timers and interrupts,
          the data often gets over-written on its way from one place to
          another. ::sigh:: Global variables are the work of Sauron.

          > I guess tying the tokenizer to vector<string> is not a good idea.

          It does limit the user to a std::vector<std::string>, yes. However,
          that construct is pretty good for this app. I find it hard to
          think of cases which couldn't use that to hold a bunch of tokens.

          > If it took an output iterator it could be used with any container
          > or even with things like ostream_iterators.

          Provided that the output container was big enough. If you start
          with an empty container and try writing to it using output
          iterators, you'll get an "illegal memory access" or "general
          protection fault" or some such thing. So you'd have to make sure
          that the container was huge. I don't like that approach.

          > #include <string>
          > using namespace std;
          > template <class OIter> void tokenize( const string &str,
          >                                       const string &delim,
          >                                       OIter oi)
          > {
          >     typedef string::size_type Sz;
          >
          >     Sz begin=0;
          >     while(begin<str.size()){
          >         Sz end=str.find_first_of(delim,begin);
          >         *oi++=str.substr(begin,end-begin);
          >         begin=str.find_first_not_of(delim,end);
          >     }
          > }
          >
          > I use find_first_not_of in order to be compatible with strtok's
          > behaviour of treating multiple adjacent delimiters as a single
          > delimiter. I have not measured the performance of this version against
          > the strtok version.

          Alluring in its simplicity, yes. But has two major bugs:

          1. Memory corruption danger if used to write to a small container.
          2. You don't take into account the fact that the string might START
             with one or more delimiters.

          Maybe something like THIS might be better:

          #include <string>
          // using namespace std; // Ewww.
          template <class Container>
          void
          tokenize
             (
                const std::string & str,
                const std::string & delim,
                Container & C
             )
          {
             typedef std::string::size_type Sz;
             Sz begin = 0;
             Sz end = 0;
             while (begin < str.size())
             {
                begin = str.find_first_not_of (delim, begin);
                end = str.find_first_of (delim, begin);
                C.push_back(str.substr (begin, end-begin));
             }
          }

          I haven't tested that, but I think something like that would work
          better. It does require that the container for the tokens have
          the push_back() method defined. Other than that, it's pretty
          generic.

          Note that to take care of the "starts with delimiters" case,
          I simply moved your "find_first_not_of" up to the top of the loop.
          That should work nicely.


          --
          Cheers,
          Robbie Hatley
          East Tustin, CA, USA
          lone wolf intj at pac bell dot net
          (put "[usenet]" in subject to bypass spam filter)




          • Jerry Coffin

            #6
            Re: Tokenizer Function (plus rant on strtok documentation)

            In article <zBHsg.129441$dW3.67625@newssvr21.news.prodigy.com>,
            bogus.address@no.spam says...

            [ ... ]
            > Maybe something like THIS might be better:
            >
            > #include <string>
            > // using namespace std; // Ewww.
            > template <class Container>
            > void
            > tokenize
            >    (
            >       const std::string & str,
            >       const std::string & delim,
            >       Container & C
            >    )
            > {
            >    typedef std::string::size_type Sz;
            >    Sz begin = 0;
            >    Sz end = 0;
            >    while (begin < str.size())
            >    {
            >       begin = str.find_first_not_of (delim, begin);
            >       end = str.find_first_of (delim, begin);
            >       C.push_back(str.substr (begin, end-begin));
            >    }
            > }

            IMO, this is a poor idea. Take an iterator for the output. If the
            user wants the data pushed onto the back, they can use back_inserter
            to get that. If they want it inserted into something like a set, they
            can use inserter to get that.
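
A minimal sketch of that suggestion, assuming a tokenizer that writes through an output iterator. The name split_to is made up here, and the loop is only one way to write it. The caller decides how the container grows by choosing the iterator adaptor, so an empty vector or set is fine:

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Writes each token of `text` through the output iterator `out`.
// Leading, trailing, and repeated delimiters produce no tokens.
template <class OutIt>
void split_to(const std::string& text, const std::string& delims, OutIt out)
{
    std::string::size_type end = 0;
    for (;;) {
        std::string::size_type begin = text.find_first_not_of(delims, end);
        if (begin == std::string::npos)
            break;
        end = text.find_first_of(delims, begin);
        *out++ = text.substr(begin, end - begin);
    }
}

void split_to_demo()
{
    std::vector<std::string> v;                            // starts empty
    split_to(" a b  a ", " ", std::back_inserter(v));      // grows via push_back
    assert(v.size() == 3);

    std::set<std::string> s;
    split_to(" a b  a ", " ", std::inserter(s, s.end()));  // grows via insert
    assert(s.size() == 2);                                 // duplicates collapse
}
```

Passing a plain v.begin() of an empty vector would still be undefined behaviour; the adaptors are what make the writes safe.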

            --
            Later,
            Jerry.

            The universe is a figment of its own imagination.


            • Robbie Hatley

              #7
              Re: Tokenizer Function (plus rant on strtok documentation)

              "Jerry Coffin" <jcoffin@taeus.com> wrote:
              > IMO, this is a poor idea. Take an iterator for the output.

              Puts extreme burden on the user to provide the right kind of
              container and iterator. Such a function would often get mis-used
              and cause memory corruption and program crashes.

              > If the user wants the data pushed onto the back, they can
              > use back_inserter to get that.

              If they know any better.

              > If they want it inserted into something like a set, they
              > can use inserter to get that.

              If they know that they should, and if they know how.

              So it really depends on which kind of function one wants to write:

              1. Something efficient but dangerous, that requires having and
              reading and understanding some external documentation to use it
              correctly.

              or

              2. Something safe and easy and self-documenting, but a bit limited.

              I can see use for both, actually. But the iterator version will
              always be the more dangerous one.

              --
              Cheers,
              Robbie Hatley
              East Tustin, CA, USA
              lone wolf intj at pac bell dot net
              (put "[usenet]" in subject to bypass spam filter)




              • Jerry Coffin

                #8
                Re: Tokenizer Function (plus rant on strtok documentation)

                In article <I5Isg.129443$dW3.1302@newssvr21.news.prodigy.com>,
                bogus.address@no.spam says...
                > "Jerry Coffin" <jcoffin@taeus.com> wrote:
                >
                > > IMO, this is a poor idea. Take an iterator for the output.
                >
                > Puts extreme burden on the user to provide the right kind of
                > container and iterator.

                IMO, it's not extreme at all. They're going to have to provide the
                right kind of container in any case -- but the code you provided will
                often _prevent_ them from using the right container. Just for
                example, putting the output into a set might well make sense -- but
                your code simply won't work with it at all.

                [ ... ]
                > I can see use for both, actually. But the iterator version will
                > always be the more dangerous one.

                The iterator version is the only one that really works. In any case,
                for a programmer to become at all proficient in using C++, they need
                to learn how to do this anyway -- look through most of the algorithms
                in the standard library, and note that they also take an iterator to
                tell them where to put the results -- with precisely the same result.

                --
                Later,
                Jerry.

                The universe is a figment of its own imagination.


                • Alex Vinokur

                  #9
                  Re: Tokenizer Function (plus rant on strtok documentation)

                  [snip]
                  sonison.james@gmail.com wrote:
                  [snip]

                  Also "Splitting string into vector of vectors":




                  Alex Vinokur
                  email: alex DOT vinokur AT gmail DOT com




                  • Alex Vinokur

                    #10
                    Re: Tokenizer Function (plus rant on strtok documentation)


                    Alex Vinokur wrote:
                    [snip]
                    --------------------------------------------------
                    Instead of
                    should be

                    Sorry
                    --------------------------------------------------
                    [snip]

                    Alex Vinokur
                    email: alex DOT vinokur AT gmail DOT com




                    • jmoy

                      #11
                      Re: Tokenizer Function (plus rant on strtok documentation)

                      Robbie Hatley wrote:
                      > "jmoy" <jmoy.matecon@gmail.com> wrote:
                      >
                      > > #include <string>
                      > > using namespace std;
                      > > template <class OIter> void tokenize( const string &str,
                      > >                                       const string &delim,
                      > >                                       OIter oi)
                      > > {
                      > >     typedef string::size_type Sz;
                      > >
                      > >     Sz begin=0;
                      > >     while(begin<str.size()){
                      > >         Sz end=str.find_first_of(delim,begin);
                      > >         *oi++=str.substr(begin,end-begin);
                      > >         begin=str.find_first_not_of(delim,end);
                      > >     }
                      > > }
                      >
                      > ...
                      > Alluring in its simplicity, yes. But has two major bugs:
                      >
                      > 1. Memory corruption danger if used to write to a small container.

                      No. As mentioned by other posters, if you are adding tokens to a
                      container the right thing is to call the function with something like a
                      back_inserter, in which case there is no memory corruption.

                      > 2. You don't take into account the fact that the string might START
                      > with one or more delimiters.

                      You are right. My mistake.

                      > Maybe something like THIS might be better:
                      >
                      > #include <string>
                      > // using namespace std; // Ewww.
                      > template <class Container>
                      > void
                      > tokenize
                      >    (
                      >       const std::string & str,
                      >       const std::string & delim,
                      >       Container & C
                      >    )
                      > {
                      >    typedef std::string::size_type Sz;
                      >    Sz begin = 0;
                      >    Sz end = 0;
                      >    while (begin < str.size())
                      >    {
                      >       begin = str.find_first_not_of (delim, begin);
                      >       end = str.find_first_of (delim, begin);
                      >       C.push_back(str.substr (begin, end-begin));
                      >    }
                      > }

                      The problem with this is that it fails for the reverse case of a string
                      ending with a delimiter. Also, I don't like the idea of tying
                      algorithms with containers---iterators are a much more general concept.
                      Here is a corrected version of my function:

                      template <class OIter> void tokenize( const string &str,
                                                            const string &delim,
                                                            OIter oi)
                      {
                          typedef string::size_type Sz;

                          Sz end=0;
                          for(;;){
                              Sz begin=str.find_first_not_of(delim,end);
                              if (begin==string::npos)
                                  break;
                              end=str.find_first_of(delim,begin);
                              *oi++=str.substr(begin,end-begin);
                          }
                      }


                      • Jerry Coffin

                        #12
                        Re: Tokenizer Function (plus rant on strtok documentation)

                        In article <1152622377.108600.157680@m79g2000cwm.googlegroups.com>,
                        jmoy.matecon@gmail.com says...

                        [ ... ]
                        > template <class OIter> void tokenize( const string &str,
                        >                                       const string &delim,
                        >                                       OIter oi)
                        > {
                        >     typedef string::size_type Sz;
                        >
                        >     Sz end=0;
                        >     for(;;){
                        >         Sz begin=str.find_first_not_of(delim,end);
                        >         if (begin==string::npos)
                        >             break;
                        >         end=str.find_first_of(delim,begin);
                        >         *oi++=str.substr(begin,end-begin);
                        >     }
                        > }

                        I think I'd also make the character type a template parameter:

                        template <class charT, class OIter>
                        void tokenize( std::basic_string<charT> input,
                                       std::basic_string<charT> delim,
                                       OIter oi)
                        {
                            typedef std::basic_string<charT> str;
                            typedef typename str::size_type Sz;

                            Sz end = 0;
                            for (;;) {
                                Sz begin = input.find_first_not_of(delim, end);
                                if (str::npos == begin)
                                    break;
                                end = input.find_first_of(delim, begin);
                                *oi++ = input.substr(begin, end-begin);
                            }
                        }

                        This way, the code can work with strings of either narrow or wide
                        characters.
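
A usage sketch for the character-type-generic idea. The function is restated under the hypothetical name tokenize_any so the block compiles on its own, and it takes its strings by const reference, which is a choice made for this example. One thing to watch: template argument deduction does not look through implicit conversions, so both string arguments must already be basic_string objects; passing a bare literal such as L"a b" will not compile.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Same algorithm as above, parameterized on the character type.
template <class CharT, class OutIt>
void tokenize_any(const std::basic_string<CharT>& input,
                  const std::basic_string<CharT>& delims,
                  OutIt out)
{
    typedef std::basic_string<CharT> Str;
    typename Str::size_type end = 0;
    for (;;) {
        typename Str::size_type begin = input.find_first_not_of(delims, end);
        if (begin == Str::npos)
            break;
        end = input.find_first_of(delims, begin);
        *out++ = input.substr(begin, end - begin);
    }
}

void tokenize_any_demo()
{
    // Wide strings: CharT is deduced as wchar_t.
    std::vector<std::wstring> wide;
    tokenize_any(std::wstring(L"uno, dos"), std::wstring(L", "),
                 std::back_inserter(wide));
    assert(wide.size() == 2 && wide[1] == L"dos");

    // Narrow strings: CharT is deduced as char.
    std::vector<std::string> narrow;
    tokenize_any(std::string("uno, dos"), std::string(", "),
                 std::back_inserter(narrow));
    assert(narrow.size() == 2 && narrow[0] == "uno");
}
```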

                        --
                        Later,
                        Jerry.

                        The universe is a figment of its own imagination.


                        • Noah Roberts

                          #13
                          Re: Tokenizer Function (plus rant on strtok documentation)


                          Robbie Hatley wrote:
                          > A couple of days ago I decided to force myself to really learn
                          > exactly what "strtok" does, and how to use it.

                          Every once in a while I also have this masochistic urge to torture
                          myself for no reason. Eventually though I tire of it and move on.

                          The strtok function is useless. It is insecure, works on insecure
                          types, and is a general PITA to use. There are better options that are
                          much easier to use and have type safety. Look into string streams as a
                          much better alternative to strtok.
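
For readers who have not used string streams for this, a short sketch of the two usual idioms. The function names split_words and split_fields are invented for the example. Neither is a drop-in replacement for strtok, since operator>> splits on whitespace only and std::getline takes a single delimiter character:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Whitespace-separated words: operator>> skips any run of whitespace.
std::vector<std::string> split_words(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string word;
    while (in >> word)
        words.push_back(word);
    return words;
}

// Fields separated by one specific character: std::getline stops at each
// occurrence of `sep`, so empty fields between adjacent separators are kept.
// A trailing separator does not produce a final empty field.
std::vector<std::string> split_fields(const std::string& text, char sep)
{
    std::istringstream in(text);
    std::vector<std::string> fields;
    std::string field;
    while (std::getline(in, field, sep))
        fields.push_back(field);
    return fields;
}

void stream_split_demo()
{
    assert(split_words("  one two\tthree ").size() == 3);

    std::vector<std::string> f = split_fields("a,,b", ',');
    assert(f.size() == 3 && f[1].empty());
}
```

Both versions leave the input string untouched and hold no hidden state between calls.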


                          • Roland Pibinger

                            #14
                            Re: Tokenizer Function (plus rant on strtok documentation)

                            On Tue, 11 Jul 2006 07:06:16 GMT, "Robbie Hatley"
                            <bogus.address@no.spam> wrote:
                            > So it really depends on which kind of function one wants to write:
                            >
                            > 1. Something efficient but dangerous, that requires having and
                            > reading and understanding some external documentation to use it
                            > correctly.
                            >
                            > or
                            >
                            > 2. Something safe and easy and self-documenting, but a bit limited.

                            The following tokenize function, derived from John Potter's
                            implementation
                            (http://groups.google.com/group/comp....daafacd01ce26),
                            is IMO safe (no output iterators) and efficient (no dynamic
                            allocation, no substr). Usable for all STL-like containers (except
                            map) with bidirectional iterators:

                            #include <algorithm>

                            template <typename StringT, typename ContainerT>
                            size_t tokenize (const StringT& text, const StringT& delim,
                                             ContainerT& result) {
                               size_t num = 0;
                               typename StringT::size_type b = text.find_first_not_of(delim);
                               while (b != StringT::npos) {
                                  typename StringT::size_type e(text.find_first_of(delim, b));
                                  StringT s (text.c_str() + b, e - b);
                                  result.insert (result.end(), StringT());
                                  typename ContainerT::iterator iter = result.end();
                                  (*--iter).swap (s);
                                  ++num;
                                  b = text.find_first_not_of(delim, std::min(e, text.size()));
                               }
                               return num;
                            }

                            For std::vector as result container, efficiency can be increased with
                            reserve().

                            Best wishes,
                            Roland Pibinger


                            • Mehturt@gmail.com

                              #15
                              Re: Tokenizer Function (plus rant on strtok documentation)


                              Roland Pibinger wrote:
                              > #include <algorithm>
                              >
                              > template <typename StringT, typename ContainerT>
                              > size_t tokenize (const StringT& text, const StringT& delim,
                              >                  ContainerT& result) {
                              >    size_t num = 0;
                              >    typename StringT::size_type b = text.find_first_not_of(delim);
                              >    while (b != StringT::npos) {
                              >       typename StringT::size_type e(text.find_first_of(delim, b));

                              I think e can be StringT::npos here and that would cause problems in
                              the line below..

                              >       StringT s (text.c_str() + b, e - b);
                              >       result.insert (result.end(), StringT());
                              >       typename ContainerT::iterator iter = result.end();
                              >       (*--iter).swap (s);
                              >       ++num;
                              >       b = text.find_first_not_of(delim, std::min(e, text.size()));
                              >    }
                              >    return num;
                              > }
                              >
                              > For std::vector as result container efficiency can be increased with
                              > reserve().
                              >
                              > Best wishes,
                              > Roland Pibinger
