algorithm for "common array of longs"

**Michel Walsh** · Oct 1 '08, 02:15 PM

Re: algorithm for "common array of longs"

If the lists are large and sorted, a double walk over the 'index' in
parallel can be implemented. If the list are short and not sorted, the list
can be walked through, kind of what you did, rather than first sorted and
next using a index walk in parallel, because the time spent to sort them may
be more than what you will win by using the double walk on the 'index'. That
is why SQL has execution plan based on statistics: given different cases,
the 'imperative code' used will be different. There is also a question of
memory: can you use a lot or are you limited to the strict minimum? Anyhow,
here how a walk on index in parallel works (note that it is what SQL can
perform to execute an INNER JOIN between two lists with no dup).

0. R is an empty list,
FingerA points to before the start of the listA
FingerB points to before the start of the listB
1. Finger A points the next item in list A, itemA
2. If fingerA points to nothing, it is done.
3. Find next item in listB equals to itemA
3a. If none is found, goto 1
4. Append itemA to R
5. Move FingerB to the next item in listB, itemB
6. If fingerB points to nothing, it is done.
7. Find next item in listA equals to itemB
7a. If none if found, goto 5
8. Append itemB to R
9 Goto 1.

note that you can change lightly the algorithm if listB has duplicated
values (I assumed both list have not dup), but I don't think this is the
kind of JOIN you want.

Example: listA: 1, 2, 3, 5, 8
list B: 1, 5, 6

Step 1: fa -1
Step 3: fb-1, R =={ 1 }
Step 5: fb-5
Step 7: fa-5, R =={ 1, 5 }
Step 1: fa->8
Step 3a.
Step 1: fa-eof
so, R={ 1, 5 }

Now, for the code... I would use SQL, but feel free to do it in C# though
:-)

Vanderghast, Access MVP

"Peter" <xdzgor@hotmail .comwrote in message
news:OXvlgh8IJH A.4600@TK2MSFTN GP06.phx.gbl...

Hi
>
I have a number of arrays of longs, from which I need to find a single
array which only contains the values which appear in all the original
arrays.
>
For example, I could have the three arrays:
>
1, 3, 2, 8, 5
3, 6, 1
6, 1, 4, 3
>
from which I need to generate the "common" array:
1, 3
(as these are the only values which appear in all the original arrays).
>
>
What is a good algorithm for this? The below is what I have written,
which does seem to do what I want. But more than likely there is a much
better, more efficient way to do this...
>
static void Main(string[] args)
{
long[] x = new long[5] { 1, 3, 2, 8, 5 };
long[] y = new long[3] { 3, 6, 1 };
long[] z = new long[4] { 6, 1, 4, 3 };
>
List<long[]originals = new List<long[]{ x, y, x };
>
long[] commonArray = Common(original s);
}
>
static long[] Common(List<lon g[]originals)
{
long[] temp = originals[0];
>
for (int i = 1; i < originals.Count ; i++)
{
temp = Common(temp, originals[i]);
}
>
return temp;
}
>
static long[] Common(long[] one, long[] two)
{
List<longcommon = new List<long>();
foreach (long val in one)
{
if (two.Contains(v al))
{
common.Add(val) ;
}
}
return common.ToArray( );
}
>
>
Thanks,
Peter

**Rudy Velthuis** · Oct 1 '08, 05:35 PM

Re: algorithm for "common array of longs"

Peter wrote:

Hi
>
I have a number of arrays of longs, from which I need to find a single
array which only contains the values which appear in all the original
arrays.
>
For example, I could have the three arrays:
>
1, 3, 2, 8, 5
3, 6, 1
6, 1, 4, 3
>
from which I need to generate the "common" array:

You need the intersection of different sets. Intersect the first with
the second and the result with the third. There are plenty of
algorithms to do that. I have a few (relying on the sets of numbers
being sorted), but they are unfortunately not in C#.

A quick translation (may be faulty) - note that the lists must be
sorted for this to work - it is generally much faster than calling
Contains() in a loop:

public static int SetIntersection <T>(IList<Tsour ce1,
IList<Tsource2, IList<Tdestinat ion, IComparer<Tcomp arer)
{
int i1 = 0;
int i2 = 0;
int d = 0;
while (i1 < source1.Count && i2 < source2.Count)
{
int comp = comparer.Compar e(source1[i1], source2[i2]);
if (comp < 0)
i1++;
else if (comp 0)
i2++;
else
{
destination[d] = source1[i1];
i1++;
i2++;
d++;
}
}
return d;
}

Of course destination must be big enough to contain the largest of the
source lists entirely.

if list1, list2 and list3 are the 3 lists, then do

int len = SetIntersection <int>(list1, list2, intermediate,
Comparer<int>.D efault);
intermediate.De leteRange(len, intermediate.Co unt - len);
len = SetIntersection <int>(intermedi ate, list3, result,
Comparer<int>.D efault);
result.DeleteRa nge(len, result.Count - len);

And result will contain the elements present in all. You could modify
the routine to pass d back as well, so you know the number of items
written to it.

--
Rudy Velthuis http://rvelthuis.de

"We are not retreating - we are advancing in another Direction."
-- General Douglas MacArthur (1880-1964)

**=?ISO-8859-1?Q?G=F6ran_Andersson?=** · Oct 1 '08, 05:55 PM

Re: algorithm for "common array of longs"

Peter wrote:

Hi
>
I have a number of arrays of longs, from which I need to find a single
array which only contains the values which appear in all the original
arrays.
>
For example, I could have the three arrays:
>
1, 3, 2, 8, 5
3, 6, 1
6, 1, 4, 3
>
from which I need to generate the "common" array:
1, 3
(as these are the only values which appear in all the original arrays).
>
>
What is a good algorithm for this? The below is what I have written,
which does seem to do what I want. But more than likely there is a much
better, more efficient way to do this...
>
static void Main(string[] args)
{
long[] x = new long[5] { 1, 3, 2, 8, 5 };
long[] y = new long[3] { 3, 6, 1 };
long[] z = new long[4] { 6, 1, 4, 3 };
>
List<long[]originals = new List<long[]{ x, y, x };
>
long[] commonArray = Common(original s);
}
>
static long[] Common(List<lon g[]originals)
{
long[] temp = originals[0];
>
for (int i = 1; i < originals.Count ; i++)
{
temp = Common(temp, originals[i]);
}
>
return temp;
}
>
static long[] Common(long[] one, long[] two)
{
List<longcommon = new List<long>();
foreach (long val in one)
{
if (two.Contains(v al))
{
common.Add(val) ;
}
}
return common.ToArray( );
}
>
>
Thanks,
Peter

The Contains method for a List gets slower the larger the list is, a
HashSet scales much better:

static long[] Common(long[] one, long[] two) {
HashSet<longset = new HashSet<long>(t wo);
List<longcommon = new List<long>();
foreach (long val in one) {
if (set.Contains(v al)) common.Add(val) ;
}
return common.ToArray( );
}

--
Göran Andersson
_____

guffa.com

http://www.guffa.com

Göran Anderssons privata hemsida.

**Michel Walsh** · Oct 1 '08, 06:35 PM

Re: algorithm for "common array of longs"

Seems much more compact that the pseudo code I tried to used...and working
fine. I would use destination.Add ( source1[i1] ) instead of
destination[d]=source1[i1], though.

I wonder what LINQ.Intesect uses?

Int[] first = { 1, 3, 2, 8, 5};
Int[] second = { 6, 1, 4, 3};
IEnumerable<int common = first.Intersect (second);

Vanderghast, Access MVP

"Rudy Velthuis" <newsgroups@rve lthuis.dewrote in message
news:xn0fvx0jq0 000003@news.mic rosoft.com...

Peter wrote:
>

>Hi
>>
>I have a number of arrays of longs, from which I need to find a single
>array which only contains the values which appear in all the original
>arrays.
>>
>For example, I could have the three arrays:
>>
>1, 3, 2, 8, 5
>3, 6, 1
>6, 1, 4, 3
>>
>from which I need to generate the "common" array:

>
You need the intersection of different sets. Intersect the first with
the second and the result with the third. There are plenty of
algorithms to do that. I have a few (relying on the sets of numbers
being sorted), but they are unfortunately not in C#.
>
A quick translation (may be faulty) - note that the lists must be
sorted for this to work - it is generally much faster than calling
Contains() in a loop:
>
public static int SetIntersection <T>(IList<Tsour ce1,
IList<Tsource2, IList<Tdestinat ion, IComparer<Tcomp arer)
{
int i1 = 0;
int i2 = 0;
int d = 0;
while (i1 < source1.Count && i2 < source2.Count)
{
int comp = comparer.Compar e(source1[i1], source2[i2]);
if (comp < 0)
i1++;
else if (comp 0)
i2++;
else
{
destination[d] = source1[i1];
i1++;
i2++;
d++;
}
}
return d;
}
>
Of course destination must be big enough to contain the largest of the
source lists entirely.
>
if list1, list2 and list3 are the 3 lists, then do
>
int len = SetIntersection <int>(list1, list2, intermediate,
Comparer<int>.D efault);
intermediate.De leteRange(len, intermediate.Co unt - len);
len = SetIntersection <int>(intermedi ate, list3, result,
Comparer<int>.D efault);
result.DeleteRa nge(len, result.Count - len);
>
And result will contain the elements present in all. You could modify
the routine to pass d back as well, so you know the number of items
written to it.
>
>
--
Rudy Velthuis http://rvelthuis.de
>
"We are not retreating - we are advancing in another Direction."
-- General Douglas MacArthur (1880-1964)

**Rudy Velthuis** · Oct 1 '08, 07:25 PM

Re: algorithm for "common array of longs"

Michel Walsh wrote:

Seems much more compact that the pseudo code I tried to used...and
working fine. I would use destination.Add ( source1[i1] ) instead of
destination[d]=source1[i1], though.

I designed mine to work with arrays too.

I wonder what LINQ.Intesect uses?
>
Int[] first = { 1, 3, 2, 8, 5};
Int[] second = { 6, 1, 4, 3};
IEnumerable<int common = first.Intersect (second);

I translated that from a language without LINQ. <g>

I guess LINQ would be a lot easier.
--
Rudy Velthuis http://rvelthuis.de

"Our children are not born to hate, they are raised to hate."
-- Thomas della Peruta

**Hans Kesting** · Oct 2 '08, 01:55 PM

Re: algorithm for "common array of longs"

Göran Andersson brought next idea :

Peter wrote:

>Hi
>>
>I have a number of arrays of longs, from which I need to find a single
>array which only contains the values which appear in all the original
>arrays.
>>
>For example, I could have the three arrays:
>>
>1, 3, 2, 8, 5
>3, 6, 1
>6, 1, 4, 3
>>
>from which I need to generate the "common" array:
>1, 3
>(as these are the only values which appear in all the original arrays).
>>
>>
>What is a good algorithm for this? The below is what I have written,
>which does seem to do what I want. But more than likely there is a much
>better, more efficient way to do this...
>>
>static void Main(string[] args)
>{
> long[] x = new long[5] { 1, 3, 2, 8, 5 };
> long[] y = new long[3] { 3, 6, 1 };
> long[] z = new long[4] { 6, 1, 4, 3 };
>>
> List<long[]originals = new List<long[]{ x, y, x };
>>
> long[] commonArray = Common(original s);
>}
>>
>static long[] Common(List<lon g[]originals)
>{
> long[] temp = originals[0];
>>
> for (int i = 1; i < originals.Count ; i++)
> {
> temp = Common(temp, originals[i]);
> }
>>
> return temp;
>}
>>
>static long[] Common(long[] one, long[] two)
>{
> List<longcommon = new List<long>();
> foreach (long val in one)
> {
> if (two.Contains(v al))
> {
> common.Add(val) ;
> }
> }
> return common.ToArray( );
>}
>>
>>
>Thanks,
>Peter

>
The Contains method for a List gets slower the larger the list is, a HashSet
scales much better:
>
static long[] Common(long[] one, long[] two) {
HashSet<longset = new HashSet<long>(t wo);
List<longcommon = new List<long>();
foreach (long val in one) {
if (set.Contains(v al)) common.Add(val) ;
}
return common.ToArray( );
}

If you are using HashSets, why not:

long[] x = new long[5] { 1, 3, 2, 8, 5 };
long[] y = new long[3] { 3, 6, 1 };
long[] z = new long[4] { 6, 1, 4, 3 };

HashSet<longhs1 = new HashSet<long>(x );
hs1.IntersectWi th(y);
hs1.IntersectWi th(z);

foreach(var l in hs1)
Console.WriteLi ne(l);

(returns 1 and 3)

Hans Kesting

algorithm for "common array of longs"

algorithm for "common array of longs"

Comment

Comment

Comment

Comment

Comment

Comment