sorting K+1 vectors with NlogN comparisons not (K+1)NlogN

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Pratyush

    sorting K+1 vectors with NlogN comparisons not (K+1)NlogN

    Hi,

    Suppose there is a vector of objects of class A, i.e., std::vector<A>
    vec_A(N); The class A satisifies all the STL vector requirements.

    Now I wish to add some attributes for each of the objects in the
    vector vec_A. Suppose there are K attributes to be added. For each of
    the attributes I define K vectors of appropriate types. Say, the
    attributes have types type1, type2, ..., typeK. So I define
    std::vector<typ e1> attr1(vec_A.siz e());
    std::vector<typ e2> attr2(vec_A.siz e());
    ....
    std::vector<typ eK> attrK(vec_A.siz e());

    The important condition is that the object at location i in vec_A has
    attributes at location i in attr1, attr2, ..., attrK.

    Now suppose I need to sort the vector vec_A on basis of some of the
    attributes. How can it be efficiently (both in terms of time and
    memory) done so that the above mentioned condition is satisfied?

    One approach that I used is to have copies of the basis attributes
    vectors. Then sort vec_A, attr1, attr2, ..., attrK using the copies.
    But this involves on average (K+1)*NlogN comparisons of the basis
    attributes. Additionally I am creating copies of the basis attributes.
    So this approach seems quite unsatisfactory. Is it possible to have
    only NlogN comparisons? Best if no copies of the basis attributes are
    involved.

    Of course, one can define class A to have the attributes as its
    fields. But then its not general and flexible enough. In that case,
    one loses the flexibilty of having one attribute, two attributes and
    so on.

    One example of such a need can be the case where class A represent a
    vertex of a graph, and the attributes are different properties of a
    vertex.


    Thanks & regards,
    Pratyush
  • tom_usenet

    #2
    Re: sorting K+1 vectors with NlogN comparisons not (K+1)NlogN

    On 10 Nov 2003 00:24:27 -0800, pkpras@noida.at renta.com (Pratyush)
    wrote:
    [color=blue]
    >Hi,
    >
    >Suppose there is a vector of objects of class A, i.e., std::vector<A>
    >vec_A(N); The class A satisifies all the STL vector requirements.
    >
    >Now I wish to add some attributes for each of the objects in the
    >vector vec_A. Suppose there are K attributes to be added. For each of
    >the attributes I define K vectors of appropriate types. Say, the
    >attributes have types type1, type2, ..., typeK. So I define
    >std::vector<ty pe1> attr1(vec_A.siz e());
    >std::vector<ty pe2> attr2(vec_A.siz e());
    >...
    >std::vector<ty peK> attrK(vec_A.siz e());
    >
    >The important condition is that the object at location i in vec_A has
    >attributes at location i in attr1, attr2, ..., attrK.[/color]

    Parallel vectors are rarely a good idea!
    [color=blue]
    >
    >Now suppose I need to sort the vector vec_A on basis of some of the
    >attributes. How can it be efficiently (both in terms of time and
    >memory) done so that the above mentioned condition is satisfied?[/color]

    Create a vector containing (0, 1, 2, 3, 4, ..., vec_A.size() - 1). Now
    sort that, using a sorting criteria that looks up elements in the A
    vector. e.g.

    struct IndexSort
    {
    private:
    //...
    //probably begin iterators for the relevent attibute vectors.
    //or references to those containers
    public:
    typedef bool result_type;
    typedef int first_argument_ type;
    typedef int second_argument _type;

    IndexSort(/*probably begin iterators for the relevent attibute
    vectors*/)
    {}

    bool operator()(firs t_argument_type lhs,
    second_argument _type rhs) const
    {
    //lhs and rhs give the index in the vectors to compare.
    return true; //perform comparison
    }
    };

    std::sort(index vec.begin(), indexvec.end(), IndexSort(/*args*/));

    Now you have a vector like:

    (10, 99, 2, 7, ...)

    You now use this to rearrange all of your parallel vectors into the
    correct order. e.g.

    template <class It, class RanIt>
    void index_shuffle(I t begin, It end, RanIt toSort)
    {
    using std::iter_swap;
    RanIt it = toSort;
    while(begin != end)
    {
    iter_swap(it, toSort + *begin);
    ++begin;
    ++it;
    }
    }

    calling it on each:
    index_shuffle(i ndexvec.begin() , indexvec.end(), vec_A.begin());
    index_shuffle(i ndexvec.begin() , indexvec.end(), attr1.begin());
    ....
    index_shuffle(i ndexvec.begin() , indexvec.end(), attrK.begin());
    [color=blue]
    >
    >One approach that I used is to have copies of the basis attributes
    >vectors. Then sort vec_A, attr1, attr2, ..., attrK using the copies.
    >But this involves on average (K+1)*NlogN comparisons of the basis
    >attributes. Additionally I am creating copies of the basis attributes.
    >So this approach seems quite unsatisfactory. Is it possible to have
    >only NlogN comparisons? Best if no copies of the basis attributes are
    >involved.[/color]

    Sort a vector of indices then, applying the reordering to each vector
    in turn. NlogN comparisons as required.
    [color=blue]
    >
    >Of course, one can define class A to have the attributes as its
    >fields. But then its not general and flexible enough. In that case,
    >one loses the flexibilty of having one attribute, two attributes and
    >so on.
    >
    >One example of such a need can be the case where class A represent a
    >vertex of a graph, and the attributes are different properties of a
    >vertex.[/color]

    There are many ways to give flexible attribute lists. At the very
    least you have:
    struct AHolder
    {
    A a;
    type1 attr1;
    //...
    typek attrk;
    };

    A runtime map<string, type>, or compile time template parameters
    (typelist, tuple, etc.) are alternatives.

    Incidently, you can find a graphs library here:



    Tom

    Comment

    • David Rubin

      #3
      Re: sorting K+1 vectors with NlogN comparisons not (K+1)NlogN

      tom_usenet wrote:

      [snip - how to sort many parallel vectors?][color=blue][color=green]
      > >Now suppose I need to sort the vector vec_A on basis of some of the
      > >attributes. How can it be efficiently (both in terms of time and
      > >memory) done so that the above mentioned condition is satisfied?[/color]
      >
      > Create a vector containing (0, 1, 2, 3, 4, ..., vec_A.size() - 1). Now
      > sort that, using a sorting criteria that looks up elements in the A
      > vector. e.g.[/color]

      [snip][color=blue]
      > Now you have a vector like:
      >
      > (10, 99, 2, 7, ...)
      >
      > You now use this to rearrange all of your parallel vectors into the
      > correct order. e.g.[/color]

      Once you have this vector (v), you don't need to rearrange the others.
      The ith element in this vector refers to element vec_A[v[i]] with
      attributes attr*[v[i]].

      /david

      --
      Andre, a simple peasant, had only one thing on his mind as he crept
      along the East wall: 'Andre, creep... Andre, creep... Andre, creep.'
      -- unknown

      Comment

      • Pratyush

        #4
        Re: sorting K+1 vectors with NlogN comparisons not (K+1)NlogN

        David Rubin <bogus_address@ nomail.com> wrote in message news:<3FAFE492. 43394C54@nomail .com>...[color=blue]
        > tom_usenet wrote:
        >
        > [snip - how to sort many parallel vectors?][color=green][color=darkred]
        > > >Now suppose I need to sort the vector vec_A on basis of some of the
        > > >attributes. How can it be efficiently (both in terms of time and
        > > >memory) done so that the above mentioned condition is satisfied?[/color]
        > >
        > > Create a vector containing (0, 1, 2, 3, 4, ..., vec_A.size() - 1). Now
        > > sort that, using a sorting criteria that looks up elements in the A
        > > vector. e.g.[/color]
        >
        > [snip][color=green]
        > > Now you have a vector like:
        > >
        > > (10, 99, 2, 7, ...)
        > >
        > > You now use this to rearrange all of your parallel vectors into the
        > > correct order. e.g.[/color]
        >
        > Once you have this vector (v), you don't need to rearrange the others.
        > The ith element in this vector refers to element vec_A[v[i]] with
        > attributes attr*[v[i]].
        >
        > /david[/color]


        Thanks a lot guys. I think if the usage is such that we need to access
        the attributes too often then its better to rearrange the vectors. In
        this case, instead of two deferences we will be doing one. Otherwise
        we can use indexing as suggested by David.

        Thanks & regards,
        Pratyush

        Comment

        • Pratyush

          #5
          Re: sorting K+1 vectors with NlogN comparisons not (K+1)NlogN

          tom_usenet <tom_usenet@hot mail.com> wrote in message news:<60ruqvsti m62boc6aoam6rjt rrhd2ns816@4ax. com>...[color=blue]
          > On 10 Nov 2003 00:24:27 -0800, pkpras@noida.at renta.com (Pratyush)
          > wrote:
          >[color=green]
          > >Hi,[/color][/color]
          [snip][color=blue][color=green]
          > >
          > >Now suppose I need to sort the vector vec_A on basis of some of the
          > >attributes. How can it be efficiently (both in terms of time and
          > >memory) done so that the above mentioned condition is satisfied?[/color]
          >
          > Create a vector containing (0, 1, 2, 3, 4, ..., vec_A.size() - 1). Now
          > sort that, using a sorting criteria that looks up elements in the A
          > vector. e.g.
          >
          > struct IndexSort
          > {[/color]
          [snip][color=blue]
          > };
          >
          > std::sort(index vec.begin(), indexvec.end(), IndexSort(/*args*/));
          >
          > Now you have a vector like:
          >
          > (10, 99, 2, 7, ...)
          >
          > You now use this to rearrange all of your parallel vectors into the
          > correct order. e.g.
          >
          > template <class It, class RanIt>
          > void index_shuffle(I t begin, It end, RanIt toSort)
          > {
          > using std::iter_swap;
          > RanIt it = toSort;
          > while(begin != end)
          > {
          > iter_swap(it, toSort + *begin);
          > ++begin;
          > ++it;
          > }
          > }[/color]

          It seems that this index_shuffle algorithm is not correct. For
          example, consider the following:

          int vi[] = {5,3};

          // vector containing (0,1)
          vector<int> indexvec(2); indexvec[0] = 0; indexvec[1] = 1;

          // sort using index sort mechanism as suggested by tom_usenet
          std::sort(index vec.begin(), indexvec.end(), IndexSort(/*args*/))

          //now indexvec contains (1,0)

          index_shuffle(i ndexvec.begin() , indexvec.end(), vi);

          // this step will produce vi containing (5,3) and not (3,5)...
          //
          // basically the while-loop in index-shuffle will loop twice.
          // in the first loop it will do iter_swap(vi, vi+1).
          // in the second loop it will do iter_swap(vi+1, vi).


          But anyways thanks for the pointer. I now need to write the correct
          index_shuffle algorithm. By sorting the indices, we are creating a
          permutation of those indices. It seems one needs to take into account
          the cycles in a permutation while performing the index_shuffle.[color=blue]
          >[/color]
          [snip]

          Comment

          • tom_usenet

            #6
            Re: sorting K+1 vectors with NlogN comparisons not (K+1)NlogN

            On 11 Nov 2003 01:02:02 -0800, pkpras@noida.at renta.com (Pratyush)
            wrote:
            [color=blue]
            >It seems that this index_shuffle algorithm is not correct. For
            >example, consider the following:
            >
            >int vi[] = {5,3};
            >
            >// vector containing (0,1)
            >vector<int> indexvec(2); indexvec[0] = 0; indexvec[1] = 1;
            >
            >// sort using index sort mechanism as suggested by tom_usenet
            >std::sort(inde xvec.begin(), indexvec.end(), IndexSort(/*args*/))
            >
            >//now indexvec contains (1,0)
            >
            >index_shuffle( indexvec.begin( ), indexvec.end(), vi);
            >
            >// this step will produce vi containing (5,3) and not (3,5)...
            >//
            >// basically the while-loop in index-shuffle will loop twice.
            >// in the first loop it will do iter_swap(vi, vi+1).
            >// in the second loop it will do iter_swap(vi+1, vi).[/color]

            Yes, you're right, I'm afraid I didn't put much thought into it.

            [color=blue]
            >But anyways thanks for the pointer. I now need to write the correct
            >index_shuffl e algorithm. By sorting the indices, we are creating a
            >permutation of those indices. It seems one needs to take into account
            >the cycles in a permutation while performing the index_shuffle.[/color]

            This can't be done in place without making it at least O(n log n), and
            more likely O(n*n). You have to find the disjoint cycles and then
            rotate them (this takes me back to discrete maths and combinatorics!) .
            So I suggest you do a copy algorithm:

            #include <algorithm>

            template <class FwdIt1, class RanIt, class FwdIt2>
            void index_shuffle_c opy(FwdIt1 indexBegin, FwdIt1 indexEnd, RanIt
            source, FwdIt2 output)
            {
            using std::iter_swap;
            while(indexBegi n != indexEnd)
            {
            *output = *(source + *indexBegin);
            ++indexBegin;
            ++output;
            }
            }

            Unfortunately this annoyingly makes you copy everything. You can of
            course do a swap at the end to minimize copying.

            Tom

            Comment

            Working...