algorithm for "common array of longs"

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Peter

    algorithm for "common array of longs"

    Hi

    I have a number of arrays of longs, from which I need to find a single
    array which only contains the values which appear in all the original
    arrays.

    For example, I could have the three arrays:

    1, 3, 2, 8, 5
    3, 6, 1
    6, 1, 4, 3

    from which I need to generate the "common" array:
    1, 3
    (as these are the only values which appear in all the original arrays).


    What is a good algorithm for this? The below is what I have written,
    which does seem to do what I want. But more than likely there is a much
    better, more efficient way to do this...

    static void Main(string[] args)
    {
    long[] x = new long[5] { 1, 3, 2, 8, 5 };
    long[] y = new long[3] { 3, 6, 1 };
    long[] z = new long[4] { 6, 1, 4, 3 };

    List<long[]originals = new List<long[]{ x, y, x };

    long[] commonArray = Common(original s);
    }

    static long[] Common(List<lon g[]originals)
    {
    long[] temp = originals[0];

    for (int i = 1; i < originals.Count ; i++)
    {
    temp = Common(temp, originals[i]);
    }

    return temp;
    }

    static long[] Common(long[] one, long[] two)
    {
    List<longcommon = new List<long>();
    foreach (long val in one)
    {
    if (two.Contains(v al))
    {
    common.Add(val) ;
    }
    }
    return common.ToArray( );
    }


    Thanks,
    Peter
  • Michel Walsh

    #2
    Re: algorithm for &quot;common array of longs&quot;

    If the lists are large and sorted, a double walk over the 'index' in
    parallel can be implemented. If the list are short and not sorted, the list
    can be walked through, kind of what you did, rather than first sorted and
    next using a index walk in parallel, because the time spent to sort them may
    be more than what you will win by using the double walk on the 'index'. That
    is why SQL has execution plan based on statistics: given different cases,
    the 'imperative code' used will be different. There is also a question of
    memory: can you use a lot or are you limited to the strict minimum? Anyhow,
    here how a walk on index in parallel works (note that it is what SQL can
    perform to execute an INNER JOIN between two lists with no dup).

    0. R is an empty list,
    FingerA points to before the start of the listA
    FingerB points to before the start of the listB
    1. Finger A points the next item in list A, itemA
    2. If fingerA points to nothing, it is done.
    3. Find next item in listB equals to itemA
    3a. If none is found, goto 1
    4. Append itemA to R
    5. Move FingerB to the next item in listB, itemB
    6. If fingerB points to nothing, it is done.
    7. Find next item in listA equals to itemB
    7a. If none if found, goto 5
    8. Append itemB to R
    9 Goto 1.


    note that you can change lightly the algorithm if listB has duplicated
    values (I assumed both list have not dup), but I don't think this is the
    kind of JOIN you want.

    Example: listA: 1, 2, 3, 5, 8
    list B: 1, 5, 6

    Step 1: fa -1
    Step 3: fb-1, R =={ 1 }
    Step 5: fb-5
    Step 7: fa-5, R =={ 1, 5 }
    Step 1: fa->8
    Step 3a.
    Step 1: fa-eof
    so, R={ 1, 5 }



    Now, for the code... I would use SQL, but feel free to do it in C# though
    :-)



    Vanderghast, Access MVP



    "Peter" <xdzgor@hotmail .comwrote in message
    news:OXvlgh8IJH A.4600@TK2MSFTN GP06.phx.gbl...
    Hi
    >
    I have a number of arrays of longs, from which I need to find a single
    array which only contains the values which appear in all the original
    arrays.
    >
    For example, I could have the three arrays:
    >
    1, 3, 2, 8, 5
    3, 6, 1
    6, 1, 4, 3
    >
    from which I need to generate the "common" array:
    1, 3
    (as these are the only values which appear in all the original arrays).
    >
    >
    What is a good algorithm for this? The below is what I have written,
    which does seem to do what I want. But more than likely there is a much
    better, more efficient way to do this...
    >
    static void Main(string[] args)
    {
    long[] x = new long[5] { 1, 3, 2, 8, 5 };
    long[] y = new long[3] { 3, 6, 1 };
    long[] z = new long[4] { 6, 1, 4, 3 };
    >
    List<long[]originals = new List<long[]{ x, y, x };
    >
    long[] commonArray = Common(original s);
    }
    >
    static long[] Common(List<lon g[]originals)
    {
    long[] temp = originals[0];
    >
    for (int i = 1; i < originals.Count ; i++)
    {
    temp = Common(temp, originals[i]);
    }
    >
    return temp;
    }
    >
    static long[] Common(long[] one, long[] two)
    {
    List<longcommon = new List<long>();
    foreach (long val in one)
    {
    if (two.Contains(v al))
    {
    common.Add(val) ;
    }
    }
    return common.ToArray( );
    }
    >
    >
    Thanks,
    Peter

    Comment

    • Rudy Velthuis

      #3
      Re: algorithm for &quot;common array of longs&quot;

      Peter wrote:
      Hi
      >
      I have a number of arrays of longs, from which I need to find a single
      array which only contains the values which appear in all the original
      arrays.
      >
      For example, I could have the three arrays:
      >
      1, 3, 2, 8, 5
      3, 6, 1
      6, 1, 4, 3
      >
      from which I need to generate the "common" array:
      You need the intersection of different sets. Intersect the first with
      the second and the result with the third. There are plenty of
      algorithms to do that. I have a few (relying on the sets of numbers
      being sorted), but they are unfortunately not in C#.

      A quick translation (may be faulty) - note that the lists must be
      sorted for this to work - it is generally much faster than calling
      Contains() in a loop:

      public static int SetIntersection <T>(IList<Tsour ce1,
      IList<Tsource2, IList<Tdestinat ion, IComparer<Tcomp arer)
      {
      int i1 = 0;
      int i2 = 0;
      int d = 0;
      while (i1 < source1.Count && i2 < source2.Count)
      {
      int comp = comparer.Compar e(source1[i1], source2[i2]);
      if (comp < 0)
      i1++;
      else if (comp 0)
      i2++;
      else
      {
      destination[d] = source1[i1];
      i1++;
      i2++;
      d++;
      }
      }
      return d;
      }

      Of course destination must be big enough to contain the largest of the
      source lists entirely.

      if list1, list2 and list3 are the 3 lists, then do

      int len = SetIntersection <int>(list1, list2, intermediate,
      Comparer<int>.D efault);
      intermediate.De leteRange(len, intermediate.Co unt - len);
      len = SetIntersection <int>(intermedi ate, list3, result,
      Comparer<int>.D efault);
      result.DeleteRa nge(len, result.Count - len);

      And result will contain the elements present in all. You could modify
      the routine to pass d back as well, so you know the number of items
      written to it.


      --
      Rudy Velthuis http://rvelthuis.de

      "We are not retreating - we are advancing in another Direction."
      -- General Douglas MacArthur (1880-1964)

      Comment

      • =?ISO-8859-1?Q?G=F6ran_Andersson?=

        #4
        Re: algorithm for &quot;common array of longs&quot;

        Peter wrote:
        Hi
        >
        I have a number of arrays of longs, from which I need to find a single
        array which only contains the values which appear in all the original
        arrays.
        >
        For example, I could have the three arrays:
        >
        1, 3, 2, 8, 5
        3, 6, 1
        6, 1, 4, 3
        >
        from which I need to generate the "common" array:
        1, 3
        (as these are the only values which appear in all the original arrays).
        >
        >
        What is a good algorithm for this? The below is what I have written,
        which does seem to do what I want. But more than likely there is a much
        better, more efficient way to do this...
        >
        static void Main(string[] args)
        {
        long[] x = new long[5] { 1, 3, 2, 8, 5 };
        long[] y = new long[3] { 3, 6, 1 };
        long[] z = new long[4] { 6, 1, 4, 3 };
        >
        List<long[]originals = new List<long[]{ x, y, x };
        >
        long[] commonArray = Common(original s);
        }
        >
        static long[] Common(List<lon g[]originals)
        {
        long[] temp = originals[0];
        >
        for (int i = 1; i < originals.Count ; i++)
        {
        temp = Common(temp, originals[i]);
        }
        >
        return temp;
        }
        >
        static long[] Common(long[] one, long[] two)
        {
        List<longcommon = new List<long>();
        foreach (long val in one)
        {
        if (two.Contains(v al))
        {
        common.Add(val) ;
        }
        }
        return common.ToArray( );
        }
        >
        >
        Thanks,
        Peter
        The Contains method for a List gets slower the larger the list is, a
        HashSet scales much better:

        static long[] Common(long[] one, long[] two) {
        HashSet<longset = new HashSet<long>(t wo);
        List<longcommon = new List<long>();
        foreach (long val in one) {
        if (set.Contains(v al)) common.Add(val) ;
        }
        return common.ToArray( );
        }

        --
        Göran Andersson
        _____
        Göran Anderssons privata hemsida.

        Comment

        • Michel Walsh

          #5
          Re: algorithm for &quot;common array of longs&quot;

          Seems much more compact that the pseudo code I tried to used...and working
          fine. I would use destination.Add ( source1[i1] ) instead of
          destination[d]=source1[i1], though.



          I wonder what LINQ.Intesect uses?

          Int[] first = { 1, 3, 2, 8, 5};
          Int[] second = { 6, 1, 4, 3};
          IEnumerable<int common = first.Intersect (second);


          Vanderghast, Access MVP



          "Rudy Velthuis" <newsgroups@rve lthuis.dewrote in message
          news:xn0fvx0jq0 000003@news.mic rosoft.com...
          Peter wrote:
          >
          >Hi
          >>
          >I have a number of arrays of longs, from which I need to find a single
          >array which only contains the values which appear in all the original
          >arrays.
          >>
          >For example, I could have the three arrays:
          >>
          >1, 3, 2, 8, 5
          >3, 6, 1
          >6, 1, 4, 3
          >>
          >from which I need to generate the "common" array:
          >
          You need the intersection of different sets. Intersect the first with
          the second and the result with the third. There are plenty of
          algorithms to do that. I have a few (relying on the sets of numbers
          being sorted), but they are unfortunately not in C#.
          >
          A quick translation (may be faulty) - note that the lists must be
          sorted for this to work - it is generally much faster than calling
          Contains() in a loop:
          >
          public static int SetIntersection <T>(IList<Tsour ce1,
          IList<Tsource2, IList<Tdestinat ion, IComparer<Tcomp arer)
          {
          int i1 = 0;
          int i2 = 0;
          int d = 0;
          while (i1 < source1.Count && i2 < source2.Count)
          {
          int comp = comparer.Compar e(source1[i1], source2[i2]);
          if (comp < 0)
          i1++;
          else if (comp 0)
          i2++;
          else
          {
          destination[d] = source1[i1];
          i1++;
          i2++;
          d++;
          }
          }
          return d;
          }
          >
          Of course destination must be big enough to contain the largest of the
          source lists entirely.
          >
          if list1, list2 and list3 are the 3 lists, then do
          >
          int len = SetIntersection <int>(list1, list2, intermediate,
          Comparer<int>.D efault);
          intermediate.De leteRange(len, intermediate.Co unt - len);
          len = SetIntersection <int>(intermedi ate, list3, result,
          Comparer<int>.D efault);
          result.DeleteRa nge(len, result.Count - len);
          >
          And result will contain the elements present in all. You could modify
          the routine to pass d back as well, so you know the number of items
          written to it.
          >
          >
          --
          Rudy Velthuis http://rvelthuis.de
          >
          "We are not retreating - we are advancing in another Direction."
          -- General Douglas MacArthur (1880-1964)

          Comment

          • Rudy Velthuis

            #6
            Re: algorithm for &quot;common array of longs&quot;

            Michel Walsh wrote:
            Seems much more compact that the pseudo code I tried to used...and
            working fine. I would use destination.Add ( source1[i1] ) instead of
            destination[d]=source1[i1], though.
            I designed mine to work with arrays too.
            I wonder what LINQ.Intesect uses?
            >
            Int[] first = { 1, 3, 2, 8, 5};
            Int[] second = { 6, 1, 4, 3};
            IEnumerable<int common = first.Intersect (second);
            I translated that from a language without LINQ. <g>

            I guess LINQ would be a lot easier.
            --
            Rudy Velthuis http://rvelthuis.de

            "Our children are not born to hate, they are raised to hate."
            -- Thomas della Peruta

            Comment

            • Hans Kesting

              #7
              Re: algorithm for &quot;common array of longs&quot;

              Göran Andersson brought next idea :
              Peter wrote:
              >Hi
              >>
              >I have a number of arrays of longs, from which I need to find a single
              >array which only contains the values which appear in all the original
              >arrays.
              >>
              >For example, I could have the three arrays:
              >>
              >1, 3, 2, 8, 5
              >3, 6, 1
              >6, 1, 4, 3
              >>
              >from which I need to generate the "common" array:
              >1, 3
              >(as these are the only values which appear in all the original arrays).
              >>
              >>
              >What is a good algorithm for this? The below is what I have written,
              >which does seem to do what I want. But more than likely there is a much
              >better, more efficient way to do this...
              >>
              >static void Main(string[] args)
              >{
              > long[] x = new long[5] { 1, 3, 2, 8, 5 };
              > long[] y = new long[3] { 3, 6, 1 };
              > long[] z = new long[4] { 6, 1, 4, 3 };
              >>
              > List<long[]originals = new List<long[]{ x, y, x };
              >>
              > long[] commonArray = Common(original s);
              >}
              >>
              >static long[] Common(List<lon g[]originals)
              >{
              > long[] temp = originals[0];
              >>
              > for (int i = 1; i < originals.Count ; i++)
              > {
              > temp = Common(temp, originals[i]);
              > }
              >>
              > return temp;
              >}
              >>
              >static long[] Common(long[] one, long[] two)
              >{
              > List<longcommon = new List<long>();
              > foreach (long val in one)
              > {
              > if (two.Contains(v al))
              > {
              > common.Add(val) ;
              > }
              > }
              > return common.ToArray( );
              >}
              >>
              >>
              >Thanks,
              >Peter
              >
              The Contains method for a List gets slower the larger the list is, a HashSet
              scales much better:
              >
              static long[] Common(long[] one, long[] two) {
              HashSet<longset = new HashSet<long>(t wo);
              List<longcommon = new List<long>();
              foreach (long val in one) {
              if (set.Contains(v al)) common.Add(val) ;
              }
              return common.ToArray( );
              }
              If you are using HashSets, why not:

              long[] x = new long[5] { 1, 3, 2, 8, 5 };
              long[] y = new long[3] { 3, 6, 1 };
              long[] z = new long[4] { 6, 1, 4, 3 };

              HashSet<longhs1 = new HashSet<long>(x );
              hs1.IntersectWi th(y);
              hs1.IntersectWi th(z);

              foreach(var l in hs1)
              Console.WriteLi ne(l);


              (returns 1 and 3)

              Hans Kesting


              Comment

              Working...