Code review requested, Accelerated C++

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Joe Van Dyk

    Code review requested, Accelerated C++

    This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
    any comments (regarding style, efficiency, standards, etc) you fellows
    have.

    The resulting program seems reasonably fast, searching through a
    dictionary of 300,000+ words in 3 seconds on my PowerBook G4.

    // Write a program that finds all the palindromes in a dictionary,
    // and also find the largest palindrome.

    #include <iostream>
    #include <vector>

    using std::cout; using std::cin;
    using std::string; using std::endl;
    using std::vector;

    bool is_palindrome(c onst string&);
    bool compare_word_le ngth(const string &, const string &);

    int main()
    {
    cout << "This program will find all the palindromes in a dictionary "
    "and will find the largest one." << std::endl;

    string input;
    vector<string> palindromes;

    // Get the palindromes
    while (cin >> input)
    if (is_palindrome( input))
    palindromes.pus h_back(input);

    // Display the palindromes
    cout << "\nThe palindromes were:\n";
    vector<string>: :const_iterator iter;
    for (iter = palindromes.beg in(); iter != palindromes.end (); ++iter)
    {
    cout << *iter << '\t';
    }
    cout << '\n';

    // Sorting palindromes
    sort (palindromes.be gin(), palindromes.end (), compare_word_le ngth);

    // Display the shortest and longest one
    string smallest = *palindromes.be gin();
    string largest = *(palindromes.e nd()-1);

    cout << "\nThe shortest palindrome was '" << smallest << "'"
    " and the longest one was '" << largest << "'\n";

    cout << endl;

    return 0;
    }

    bool is_palindrome(c onst string &input)
    {
    string::const_i terator begin, end;
    begin = input.begin();
    end = input.end() - 1;

    while (begin < end)
    {
    if (*begin != *end)
    return false;
    begin++;
    end--;
    }
    return true;
    }

    bool compare_word_le ngth(const string &a, const string &b)
    {
    return (a.size() < b.size());
    }

  • Joe Van Dyk

    #2
    Re: Code review requested, Accelerated C++

    On 2004-10-16 17:26:45 -0700, Joe Van Dyk <joevandyk@nosp am.gmail.com> said:
    [color=blue]
    > This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
    > any comments (regarding style, efficiency, standards, etc) you fellows
    > have.[/color]

    <snip>
    [color=blue]
    >
    >
    > while (begin < end)
    > {
    > if (*begin != *end)
    > return false;
    > begin++;
    > end--;[/color]

    Just realized this should be:
    ++begin;
    --end;

    Joe

    Comment

    • E. Robert Tisdale

      #3
      Re: Code review requested, Accelerated C++

      Joe Van Dyk wrote:[color=blue]
      > This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
      > any comments (regarding style, efficiency, standards, etc) you fellows
      > have.
      >
      > The resulting program seems reasonably fast, searching through a
      > dictionary of 300,000+ words in 3 seconds on my PowerBook G4.
      >
      > // Write a program that finds all the palindromes in a dictionary,
      > // and also find the largest palindrome.[/color]
      [color=blue]
      > cat main.cc[/color]
      // Write a program that finds all the palindromes
      // in a dictionary and also find the largest palindrome.

      #include <iostream>
      #include <vector>

      bool is_palindrome(c onst std::string& input) {
      std::string::co nst_iterator begin = input.begin();
      std::string::co nst_iterator end = input.end() - 1;

      while (begin < end) {
      if (*begin != *end)
      return false;
      ++begin;
      --end;
      }
      return true;
      }

      bool compare_word_le ngth(const std::string &a,
      const std::string &b) {
      return (a.size() < b.size());
      }

      int main(int argc, char* argv[]) {
      std::cout << std::endl <<
      "This program will find and print all the palindromes\n"
      "in a dictionary then finds and prints\n"
      "the the smallest and largest ones." << std::endl;

      std::string input;
      std::vector<std ::string> palindromes;

      // Get the palindromes
      while (std::cin >> input)
      if (is_palindrome( input))
      palindromes.pus h_back(input);

      // Display the palindromes
      std::cout << std::endl;
      std::cout << "The palindromes were:\n";
      for (std::vector<st d::string>::con st_iterator iter =
      palindromes.beg in(); iter != palindromes.end (); ++iter) {
      std::cout << *iter << '\t';
      }
      std::cout << std::endl;

      // Sorting palindromes
      sort(palindrome s.begin(),
      palindromes.end (), compare_word_le ngth);

      // Display the shortest and longest one
      std::string smallest = *palindromes.be gin();
      std::string largest = *(palindromes.e nd() - 1);

      std::cout << std::endl;
      std::cout << "The shortest palindrome was '" << smallest
      << "' and the longest one was '" << largest << "'.";

      std::cout << std::endl;

      return 0;
      }
      [color=blue]
      > g++ -Wall -ansi -pedantic -o main main.cc
      > time ./main < /usr/share/dict/words[/color]

      This program will find and print all the palindromes
      in a dictionary then finds and prints
      the the smallest and largest ones.

      The palindromes were:
      bib bob boob civic dad deed did dud \
      eke ere ewe eye gag gig huh level \
      madam non noon nun peep pep pip pop \
      pup radar redder refer reviver rotator rotor sees \
      sexes solos tit

      The shortest palindrome was 'tit' \
      and the longest one was 'rotator'.
      0.303u 0.007s 0:00.31 96.7% 0+0k 0+0io 0pf+0w[color=blue]
      > wc /usr/share/dict/words[/color]
      45427 45427 409305 /usr/share/dict/words

      Comment

      • John Harrison

        #4
        Re: Code review requested, Accelerated C++


        "Joe Van Dyk" <joevandyk@nosp am.gmail.com> wrote in message
        news:2004101617 264416807%joeva ndyk@nospamgmai lcom...[color=blue]
        > This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
        > any comments (regarding style, efficiency, standards, etc) you fellows
        > have.
        >[/color]

        Basically looks fine to me. But one or two suggestions follow.
        [color=blue]
        > The resulting program seems reasonably fast, searching through a
        > dictionary of 300,000+ words in 3 seconds on my PowerBook G4.
        >
        > // Write a program that finds all the palindromes in a dictionary,
        > // and also find the largest palindrome.
        >
        > #include <iostream>
        > #include <vector>
        >
        > using std::cout; using std::cin;
        > using std::string; using std::endl;
        > using std::vector;
        >
        > bool is_palindrome(c onst string&);
        > bool compare_word_le ngth(const string &, const string &);
        >
        > int main()
        > {
        > cout << "This program will find all the palindromes in a dictionary "
        > "and will find the largest one." << std::endl;
        >
        > string input;
        > vector<string> palindromes;
        >
        > // Get the palindromes
        > while (cin >> input)
        > if (is_palindrome( input))
        > palindromes.pus h_back(input);
        >
        > // Display the palindromes
        > cout << "\nThe palindromes were:\n";
        > vector<string>: :const_iterator iter;
        > for (iter = palindromes.beg in(); i ter!=palindrome s.end ++iter)
        > {
        > cout << *iter << '\t';
        > }
        > cout << '\n';[/color]

        You could consider this as a replacement for the above loop

        copy(palindrome s.begin (), palindromes.end (), ostream_iterato r(cout,
        "\t"));
        cout << '\n';

        You'd need to include <algorithm> for copy, and <iterator>, for
        ostream_iterato r.
        [color=blue]
        >
        > // Sorting palindromes
        > sort (palindromes.be gin(), palindromes.end ,compare_word_l ength
        > // Display the shortest and longest one
        > string smallest = *palindromes.be gin();
        > string largest = *(palindromes.e nd()-1);[/color]

        I would add a check that you have at least one palindrome, otherwise you are
        going to crash, and I'd use front() and back()

        if (!palindromes.e mpty())
        {
        string smallest = palindromes.fro nt();
        string largest = palindromes.bac k();
        ...
        }
        [color=blue]
        >
        > cout << "\nThe shortest palindrome was '" << smallest << "'"
        > " and the longest one was '" << largest << "'\n";
        >
        > cout << endl;
        >
        > return 0;
        > }
        >
        > bool is_palindrome(c onst string &input)
        > {
        > string::const_i terator begin, end;
        > begin = input.begin end = i nput.end-1[/color]

        I think its better style to combine declaration and initialisation, in some
        circumstances it is also more efficient.

        string::const_i terator begin = input.begin ();
        string::const_i terator end = input.end ()-1;
        [color=blue]
        >
        > while (begin < end)
        > {
        > if (*begin != *end)
        > return false;
        > begin++;
        > end--;
        > }
        > return true;
        > }
        >
        > bool compare_word_le ngth(const string &a, const string &b)
        > {
        > return (a.size() < b.size());
        > }
        >[/color]

        john


        Comment

        • Andrew Koenig

          #5
          Re: Code review requested, Accelerated C++

          "Joe Van Dyk" <joevandyk@nosp am.gmail.com> wrote in message
          news:2004101617 264416807%joeva ndyk@nospamgmai lcom...
          [color=blue]
          > The resulting program seems reasonably fast, searching through a
          > dictionary of 300,000+ words in 3 seconds on my PowerBook G4.[/color]

          I haven't read the code in detail, but I did read enough of it to make a
          strategic suggestion.

          Your program works by making a vector of all of the palindromes, then
          sorting it to determine the largest smallest. This technique is legitimate,
          but takes substantiallly more space and time than needed.

          Consider: If there are n palindromes, your program creates an n-element
          vector, all the elements of which must fit in memory at the same time.
          Moreover, sorting those elements takes time proportional to n*log(n).

          I am not generally fond of optimizing programs too early, but I make a
          distinction between local optimizations and writing code in a way that
          changes the exponent in the expression of how much time or space a program
          takes. In this case, for example, storing an n-element array means you have
          a space exponent of 1 (that is, the amount of space is proportional to the
          1st power of the amount of input), and if you can reduce that to 0, it may
          well be worth doing.

          The way to do it is to stop remembering every palindrome, and remember only
          the shortest and longest you've seen so far. Each time you find a new one,
          print it, check whether it's shorter than the current shortest or longer
          than the current longest, and then throw it away.

          This strategy has the advantage of running in time proportional to n, rather
          than to n*log(n), and in space proportional to the longest word in the file,
          rather than proportional to the sum of the lengths of all the palindromes.
          That could be a big saving.

          Of course you then give up the possibility of printing the palindromes in
          alphabetical order, but I don't think you were doing that anyway.



          Regards,

          Andrew Koenig


          Comment

          • Joe Van Dyk

            #6
            Re: Code review requested, Accelerated C++

            On 2004-10-17 07:56:02 -0700, "Andrew Koenig" <ark@acm.org> said:
            [color=blue]
            > "Joe Van Dyk" <joevandyk@nosp am.gmail.com> wrote in message
            > news:2004101617 264416807%joeva ndyk@nospamgmai lcom...
            >[color=green]
            >> The resulting program seems reasonably fast, searching through a
            >> dictionary of 300,000+ words in 3 seconds on my PowerBook G4.[/color]
            >
            > I haven't read the code in detail, but I did read enough of it to make
            > a strategic suggestion.
            >
            > Your program works by making a vector of all of the palindromes, then
            > sorting it to determine the largest smallest. This technique is
            > legitimate, but takes substantiallly more space and time than needed.
            >
            > Consider: If there are n palindromes, your program creates an n-element
            > vector, all the elements of which must fit in memory at the same time.
            > Moreover, sorting those elements takes time proportional to n*log(n).
            >
            > I am not generally fond of optimizing programs too early, but I make a
            > distinction between local optimizations and writing code in a way that
            > changes the exponent in the expression of how much time or space a
            > program takes. In this case, for example, storing an n-element array
            > means you have a space exponent of 1 (that is, the amount of space is
            > proportional to the 1st power of the amount of input), and if you can
            > reduce that to 0, it may well be worth doing.
            >
            > The way to do it is to stop remembering every palindrome, and remember
            > only the shortest and longest you've seen so far. Each time you find a
            > new one, print it, check whether it's shorter than the current shortest
            > or longer than the current longest, and then throw it away.
            >
            > This strategy has the advantage of running in time proportional to n,
            > rather than to n*log(n), and in space proportional to the longest word
            > in the file, rather than proportional to the sum of the lengths of all
            > the palindromes. That could be a big saving.
            >
            > Of course you then give up the possibility of printing the palindromes
            > in alphabetical order, but I don't think you were doing that anyway.
            >
            >
            >
            > Regards,
            >
            > Andrew Koenig[/color]

            Thanks for the idea. I tried it the way you suggested and it actually
            took longer (4 seconds compared to 3 seconds on my first approach) to
            do on a 300K word set. I think it's because of the repeated cout
            calls, right? (i.e. we're calling cout everytime we find a palindrome,
            and not once at the end.)

            I also tried adding all the palindromes to a string (what's the best
            way to do string concatenation?) and then printing that out at the end,
            but that took about the same amount of time as my first approach.

            Joe

            Comment

            • Mike Wahler

              #7
              Re: Code review requested, Accelerated C++

              "Joe Van Dyk" <joevandyk@no.s pam.gmail.com> wrote in message
              news:2004101710 234716807%joeva ndyk@nospamgmai lcom...[color=blue]
              >
              > Thanks for the idea. I tried it the way you suggested and it actually
              > took longer (4 seconds compared to 3 seconds on my first approach) to
              > do on a 300K word set. I think it's because of the repeated cout
              > calls, right?[/color]

              Probably (but we can only guess without seeing the code). The only
              way to know for sure is to test it with and without the outputs.
              I/O is almost always the slowest part of a program, so it will tend
              to distort timing results.
              [color=blue]
              > (i.e. we're calling cout everytime we find a palindrome,
              > and not once at the end.)
              >
              > I also tried adding all the palindromes to a string (what's the best
              > way to do string concatenation?)[/color]

              Use the std::string type's '+' operator. But note that expansion
              of a string can result in reallocation and copying. However, there
              are a few ways to alleviate that, e.g. first estimate the ultimate size
              of the string and pass that value to 'std::string::r eserve()' or
              perhaps initially create the string (of e.g. characters with value zero)
              with that size (via a constructor). But depending upon how many
              palindromes you find, this could still result in a rather large
              string (and could again impact performance due to e.g. paging on
              a 'virtual memory' system).

              But note that if your only goal is to find e.g. a 'largest' or
              'smallest' item, there's no reason to save all the items. As
              Francis suggested, you only need a single string to replace
              with the 'current' 'largest' item as you iterate and compare.


              [color=blue]
              > and then printing that out at the end,
              > but that took about the same amount of time as my first approach.[/color]

              Try several approaches, and Test, Test, Test! :-)

              -Mike


              Comment

              • David Rubin

                #8
                Re: Code review requested, Accelerated C++

                Joe Van Dyk <joevandyk@nosp am.gmail.com> wrote in message news:<200410161 7264416807%joev andyk@nospamgma ilcom>...[color=blue]
                > This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
                > any comments (regarding style, efficiency, standards, etc) you fellows
                > have.
                >
                > The resulting program seems reasonably fast, searching through a
                > dictionary of 300,000+ words in 3 seconds on my PowerBook G4.
                >
                > // Write a program that finds all the palindromes in a dictionary,
                > // and also find the largest palindrome.
                >
                > #include <iostream>
                > #include <vector>
                >
                > using std::cout; using std::cin;
                > using std::string; using std::endl;
                > using std::vector;
                >
                > bool is_palindrome(c onst string&);
                > bool compare_word_le ngth(const string &, const string &);
                >
                > int main()
                > {
                > cout << "This program will find all the palindromes in a dictionary "
                > "and will find the largest one." << std::endl;
                >
                > string input;
                > vector<string> palindromes;[/color]

                [snip - store plandromes in the vector, then sort, and pull out the
                largest one]

                If you store input in this way, use a deque at the very least. It is
                much more efficient. Storing into a vector (without pre-allocating
                space) is O(n^2). But, given the problem specification, there is no
                reason to either store or sort input. You just need to maintain the
                "largest" palindrome and display each palindrome as you examine each
                input. This reduces the time complexity to O(n), and the space
                requirement to O(1). /david

                Comment

                • David Rubin

                  #9
                  Re: Code review requested, Accelerated C++

                  "Andrew Koenig" <ark@acm.org> wrote in message news:<6Kvcd.710 524$Gx4.622789@ bgtnsc04-news.ops.worldn et.att.net>...[color=blue]
                  > "Joe Van Dyk" <joevandyk@nosp am.gmail.com> wrote in message
                  > news:2004101617 264416807%joeva ndyk@nospamgmai lcom...
                  >[color=green]
                  > > The resulting program seems reasonably fast, searching through a
                  > > dictionary of 300,000+ words in 3 seconds on my PowerBook G4.[/color][/color]
                  ^^^^^^^^^^

                  [snip][color=blue]
                  > Of course you then give up the possibility of printing the palindromes in
                  > alphabetical order, but I don't think you were doing that anyway.[/color]

                  Presumably, input from a dictionary is already sorted. Not that
                  "dictionary " implies a literal dictionary, even in the sense of an
                  ordered list of words, but it's not a bad assumption.

                  Comment

                  • Andrew Koenig

                    #10
                    Re: Code review requested, Accelerated C++

                    "Joe Van Dyk" <joevandyk@no.s pam.gmail.com> wrote in message
                    news:2004101710 234716807%joeva ndyk@nospamgmai lcom...
                    [color=blue]
                    > Thanks for the idea. I tried it the way you suggested and it actually
                    > took longer (4 seconds compared to 3 seconds on my first approach) to do
                    > on a 300K word set. I think it's because of the repeated cout calls,
                    > right? (i.e. we're calling cout everytime we find a palindrome, and not
                    > once at the end.)
                    >
                    > I also tried adding all the palindromes to a string (what's the best way
                    > to do string concatenation?) and then printing that out at the end, but
                    > that took about the same amount of time as my first approach.[/color]

                    Interesting. That suggests that the program spends most of its time in the
                    I/O library no matter what you do. Although that conclusion is not
                    surprising, it's also worth testing; it's a useful exercise to try to figure
                    out a way of doing so.


                    Comment

                    • Andrew Koenig

                      #11
                      Re: Code review requested, Accelerated C++

                      "David Rubin" <davidrubin@war pmail.net> wrote in message
                      news:82b37be.04 10171209.6be045 cb@posting.goog le.com...
                      [color=blue]
                      > If you store input in this way, use a deque at the very least. It is
                      > much more efficient. Storing into a vector (without pre-allocating
                      > space) is O(n^2).[/color]

                      No it isn't. std::vector is required to be implemented in such a way that
                      calling push_back n times on an initially empty vector executes in O(n)
                      time. Individual executions might take more than O(1), but the total time
                      for n calls must be O(n).


                      Comment

                      • Mike Wahler

                        #12
                        Re: Code review requested, Accelerated C++

                        "Andrew Koenig" <ark@acm.org> wrote in message
                        news:2nCcd.3987 $OD2.2137@bgtns c05-news.ops.worldn et.att.net...[color=blue]
                        > "David Rubin" <davidrubin@war pmail.net> wrote in message
                        > news:82b37be.04 10171209.6be045 cb@posting.goog le.com...
                        >[color=green]
                        > > If you store input in this way, use a deque at the very least. It is
                        > > much more efficient. Storing into a vector (without pre-allocating
                        > > space) is O(n^2).[/color]
                        >
                        > No it isn't. std::vector is required to be implemented in such a way that
                        > calling push_back n times on an initially empty vector executes in O(n)
                        > time. Individual executions might take more than O(1), but the total time
                        > for n calls must be O(n).[/color]

                        aka "amortized constant time".

                        -Mike


                        Comment

                        • Joe Van Dyk

                          #13
                          Re: Code review requested, Accelerated C++

                          On 2004-10-17 10:51:44 -0700, "Mike Wahler" <mkwahler@mkwah ler.net> said:
                          [color=blue]
                          > "Joe Van Dyk" <joevandyk@no.s pam.gmail.com> wrote in message
                          > news:2004101710 234716807%joeva ndyk@nospamgmai lcom...[color=green]
                          >>
                          >> Thanks for the idea. I tried it the way you suggested and it actually
                          >> took longer (4 seconds compared to 3 seconds on my first approach) to
                          >> do on a 300K word set. I think it's because of the repeated cout
                          >> calls, right?[/color]
                          >
                          > Probably (but we can only guess without seeing the code). The only
                          > way to know for sure is to test it with and without the outputs.
                          > I/O is almost always the slowest part of a program, so it will tend
                          > to distort timing results.
                          >[color=green]
                          >> (i.e. we're calling cout everytime we find a palindrome,
                          >> and not once at the end.)
                          >>
                          >> I also tried adding all the palindromes to a string (what's the best
                          >> way to do string concatenation?)[/color]
                          >
                          > Use the std::string type's '+' operator. But note that expansion
                          > of a string can result in reallocation and copying. However, there
                          > are a few ways to alleviate that, e.g. first estimate the ultimate size
                          > of the string and pass that value to 'std::string::r eserve()' or
                          > perhaps initially create the string (of e.g. characters with value zero)
                          > with that size (via a constructor). But depending upon how many
                          > palindromes you find, this could still result in a rather large
                          > string (and could again impact performance due to e.g. paging on
                          > a 'virtual memory' system).
                          >
                          > But note that if your only goal is to find e.g. a 'largest' or
                          > 'smallest' item, there's no reason to save all the items. As
                          > Francis suggested, you only need a single string to replace
                          > with the 'current' 'largest' item as you iterate and compare.[/color]

                          One of the program requirements was to print out all the palindromes in
                          the dictionary, so either I'd need to print them out as I found them or
                          save them somehow.
                          [color=blue]
                          >
                          >
                          >[color=green]
                          >> and then printing that out at the end,
                          >> but that took about the same amount of time as my first approach.[/color]
                          >
                          > Try several approaches, and Test, Test, Test! :-)
                          >
                          > -Mike[/color]


                          Comment

                          • David Rubin

                            #14
                            Re: Code review requested, Accelerated C++

                            "Andrew Koenig" <ark@acm.org> wrote in message news:<2nCcd.398 7$OD2.2137@bgtn sc05-news.ops.worldn et.att.net>...[color=blue]
                            > "David Rubin" <davidrubin@war pmail.net> wrote in message
                            > news:82b37be.04 10171209.6be045 cb@posting.goog le.com...
                            >[color=green]
                            > > If you store input in this way, use a deque at the very least. It is
                            > > much more efficient. Storing into a vector (without pre-allocating
                            > > space) is O(n^2).[/color]
                            >
                            > No it isn't. std::vector is required to be implemented in such a way that
                            > calling push_back n times on an initially empty vector executes in O(n)
                            > time. Individual executions might take more than O(1), but the total time
                            > for n calls must be O(n).[/color]

                            Thanks.

                            Comment

                            • Mike Wahler

                              #15
                              Re: Code review requested, Accelerated C++


                              "Joe Van Dyk" <joevandyk@no.s pam.gmail.com> wrote in message
                              news:2004101717 094016807%joeva ndyk@nospamgmai lcom...[color=blue]
                              > One of the program requirements was to print out all the palindromes in
                              > the dictionary, so either I'd need to print them out as I found them or
                              > save them somehow.[/color]

                              OK, I didn't thoroughly review the requirments. Outputting as
                              you go would probably be faster, since you wouldn't have any
                              overhead caused by storing them.

                              -Mike


                              Comment

                              Working...