intersection of two vectors (different sizes)
giigle@gmail.com

2005-04-25, 3:59 pm

Using: Intel Fortran Compiler 8.1 on Linux kernel version
2.6.11-1.14_FC3smp

I have two vectors of different size A(1:6) and B(1:441)

I want to find the unique values of A and B that are in both A and B
(unique set intersection). Is there a relatively simple method for
doing this? I don't care about order.

Thanks!

Rich Townsend

2005-04-25, 3:59 pm

giigle@gmail.com wrote:
> Using: Intel Fortran Compiler 8.1 on Linux kernel version
> 2.6.11-1.14_FC3smp
>
> I have two vectors of different size A(1:6) and B(1:441)
>
> I want to find the unique values of A and B that are in both A and B
> (unique set intersection). Is there a relatively simple method for
> doing this? I don't care about order.
>
> Thanks!
>


Are A and B already individually unique? That is, do they contain no
duplicate elements? If so, then the elements of their intersection can
be calculated using something of the form:

PACK(SPREAD(B, DIM=1, NCOPIES=SIZE(A)), &
MASK=(SPREAD(B, DIM=1, NCOPIES=SIZE(A)) == &
SPREAD(A, DIM=2, NCOPIES=SIZE(B)))

This is nice and compact, and will parallelize very well.

cheers,

Rich
Rich Townsend

2005-04-25, 3:59 pm

Rich Townsend wrote:

<snip>

> PACK(SPREAD(B, DIM=1, NCOPIES=SIZE(A)), &
> MASK=(SPREAD(B, DIM=1, NCOPIES=SIZE(A)) == &
> SPREAD(A, DIM=2, NCOPIES=SIZE(B)))
>
> This is nice and compact, and will parallelize very well.


Small syntax error; there is a missing parenthesis at the end of this
expression. A corrected version is in the proof-of-concept code below:

program foo

implicit none

integer, dimension(5) :: A = (/1,2,3,4,5/)
integer, dimension(3) :: B = (/2,4,6/)

print *, PACK(SPREAD(B, DIM=1, NCOPIES=SIZE(A)), &
MASK=(SPREAD(B, DIM=1, NCOPIES=SIZE(A)) == &
SPREAD(A, DIM=2, NCOPIES=SIZE(B))))

end program foo

cheers,

Rich
giigle@gmail.com

2005-04-25, 3:59 pm

Thanks Rich!

The values of each set are individually unique EXCEPT for the fill
values, but I can select appropriate slices of A and B .. e.g.,
replacing SPREAD(B, ... with SPREAD(B(1:cnt), ...

PACK is one of the most obscure (to me) yet useful commands in Fortran.
:-) I'm already using it in a simple way to "find" indices of arrays
matching a certain mask, but I didn't consider using SPREAD in
conjunction to find an intersection.
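
A minimal sketch of the slicing idea described above, together with the PACK "find indices" idiom. The data, the fill value -999, and the names fill, na and nb are illustrative assumptions, not taken from the original poster's code; the sketch also assumes the fill values sit at the end of each array.

```fortran
program slice_intersection
  implicit none

  ! Hypothetical data: -999 marks fill values, assumed to be trailing.
  integer, parameter :: fill = -999
  integer, dimension(6) :: a = (/ 3, 7, 9, 12, fill, fill /)
  integer, dimension(8) :: b = (/ 1, 3, 5, 7, 9, 11, fill, fill /)
  integer :: na, nb, i

  ! Number of valid (non-fill) entries at the start of each array.
  na = count(a /= fill)
  nb = count(b /= fill)

  ! Intersection of the valid slices only.
  print *, pack(spread(b(1:nb), dim=1, ncopies=na), &
                mask=(spread(b(1:nb), dim=1, ncopies=na) == &
                      spread(a(1:na), dim=2, ncopies=nb)))

  ! PACK used to "find" indices: positions in b(1:nb) whose value exceeds 5.
  print *, pack((/ (i, i = 1, nb) /), mask=(b(1:nb) > 5))
end program slice_intersection
```

With this data the first PRINT should list 3, 7 and 9, and the second should list the indices 4, 5 and 6.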

Gordon Sande

2005-04-25, 3:59 pm



giigle@gmail.com wrote:
> Using: Intel Fortran Compiler 8.1 on Linux kernel version
> 2.6.11-1.14_FC3smp
>
> I have two vectors of different size A(1:6) and B(1:441)
>
> I want to find the unique values of A and B that are in both A and B
> (unique set intersection). Is there a relatively simple method for
> doing this? I don't care about order.
>
> Thanks!
>


Sort each and follow the logic for merging sorted subfiles.
Ties will be noticed. You can even find duplicates in each
subfile if that is an issue.
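
A rough sketch of the sort-and-merge approach, assuming integer data. Fortran has no intrinsic sort, so the insertion_sort helper below is a hypothetical stand-in; any sort would do. The sketch assumes each array is individually unique; with duplicates, a repeated value could be printed more than once unless repeats are skipped.

```fortran
program merge_intersection
  implicit none

  integer, dimension(5) :: a = (/ 5, 1, 4, 2, 3 /)
  integer, dimension(3) :: b = (/ 6, 2, 4 /)
  integer :: i, j

  call insertion_sort(a)
  call insertion_sort(b)

  ! Walk both sorted arrays in step, as in the merge phase of a merge sort.
  i = 1
  j = 1
  do while (i <= size(a) .and. j <= size(b))
    if (a(i) < b(j)) then
      i = i + 1
    else if (a(i) > b(j)) then
      j = j + 1
    else
      ! A tie: the value is present in both arrays.
      print *, a(i)
      i = i + 1
      j = j + 1
    end if
  end do

contains

  ! Simple insertion sort (illustrative helper); fine for short arrays.
  subroutine insertion_sort(v)
    integer, dimension(:), intent(inout) :: v
    integer :: k, m, key

    do k = 2, size(v)
      key = v(k)
      m = k - 1
      do while (m >= 1)
        ! Separate test: Fortran does not guarantee short-circuit evaluation.
        if (v(m) <= key) exit
        v(m + 1) = v(m)
        m = m - 1
      end do
      v(m + 1) = key
    end do
  end subroutine insertion_sort

end program merge_intersection
```

For these inputs the program should print 2 and then 4. After the two sorts, the merge pass itself touches each element at most once, so no two-dimensional temporaries are needed.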





Michael Metcalf

2005-04-26, 8:59 am


"Rich Townsend" <rhdt@barvoidtol.udel.edu> wrote in message
news:d4j94e$29e$2@scrotar.nss.udel.edu...
>
> print *, PACK(SPREAD(B, DIM=1, NCOPIES=SIZE(A)), &
> MASK=(SPREAD(B, DIM=1, NCOPIES=SIZE(A)) == &
> SPREAD(A, DIM=2, NCOPIES=SIZE(B))))
>

A delightful solution! For huge arrays, this might get stuck because large
temporaries are created. Then something along the lines of

do i = 1, size(b)
   if (any(a - b(i) == 0)) print *, b(i)
end do

will work (for size(b) < size(a)).

Regards,

Mike Metcalf
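
A self-contained sketch of the loop approach, using the same test data as the proof-of-concept program earlier in the thread. As a variation (an assumption, not part of the post above), it stores the result of each ANY test in a logical array named found, so the matches can be collected with PACK instead of printed one at a time.

```fortran
program loop_intersection
  implicit none

  integer, dimension(5) :: a = (/ 1, 2, 3, 4, 5 /)
  integer, dimension(3) :: b = (/ 2, 4, 6 /)
  logical, dimension(3) :: found
  integer :: i

  ! One pass over the shorter array; ANY scans the longer one each time.
  do i = 1, size(b)
    found(i) = any(a == b(i))
  end do

  ! Collect the matching elements of b; only rank-one temporaries are involved.
  print *, pack(b, mask=found)
end program loop_intersection
```

For these inputs the program should print 2 and 4.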


NuclearWizard

2005-04-26, 4:00 pm

That's very nice. Wouldn't it be better to put SPREAD(B, DIM=1,
NCOPIES=SIZE(A)) into a temp array, C?

ALLOCATE( C(1:SIZE(A),1:SIZE(B)) )
C = SPREAD(B, DIM=1, NCOPIES=SIZE(A))
print *, PACK(C, MASK=(C == SPREAD(A, DIM=2, NCOPIES=SIZE(B))) )
DEALLOCATE( C )

If there is any reason why you should *NOT* use a temp array, C,
(besides for simplicity's sake) please tell me.

Rich Townsend

2005-04-26, 8:58 pm

NuclearWizard wrote:
> That's very nice. Wouldn't it be better to put SPREAD(B, DIM=1,
> NCOPIES=SIZE(A)) into a temp array, C?
>
> ALLOCATE( C(1:SIZE(A),1:SIZE(B)) )
> C = SPREAD(B, DIM=1, NCOPIES=SIZE(A))
> print *, PACK(C, MASK=(C == SPREAD(A, DIM=2, NCOPIES=SIZE(B))) )
> DEALLOCATE( C )
>
> If there is any reason why you should *NOT* use a temp array, C,
> (besides for simplicity's sake) please tell me.
>


No reason!
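
For completeness, a sketch of the temp-array variant as a full program. The declaration of C as a rank-two allocatable integer array is an assumption, since the fragment above does not show it; the test data is borrowed from the earlier proof-of-concept program.

```fortran
program temp_intersection
  implicit none

  integer, dimension(5) :: a = (/ 1, 2, 3, 4, 5 /)
  integer, dimension(3) :: b = (/ 2, 4, 6 /)
  integer, dimension(:,:), allocatable :: c

  ! c(i,j) holds b(j) for every row i, so it is computed only once.
  allocate(c(size(a), size(b)))
  c = spread(b, dim=1, ncopies=size(a))

  ! Compare against a spread along the second dimension, where element (i,j) is a(i).
  print *, pack(c, mask=(c == spread(a, dim=2, ncopies=size(b))))

  deallocate(c)
end program temp_intersection
```

For these inputs the program should print 2 and 4, the same result as the one-expression version.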