Choosing only different vectors from a matrix

Hello users,
I have a matrix A:

 A=[1.0  7.0  4.0  8.0  9.0  4.0  5.0  0.0  10.0  10.0  1.0  7.0
     1.0   0.0  0.0  9.0   8.0  1.0  3.0  5.0    7.0   1.0  1.0  0.0
     1.0   0.0  0.0  5.0  10.0  8.0  3.0  5.0   1.0   5.0  1.0  0.0];

How can I choose only the columns with different elements?
Is there a command for this?
In this example, columns 1 to 10 are all different, and columns 11 and 12 repeat earlier columns.
Thanks a lot.

unique(eachcol(A)) will filter for the unique columns. However, if you just want to keep the columns that are not repeated, you can count them and then filter, e.g.:

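using DataStructures
# Count how many times each column of A occurs
C = counter(eachcol(A))
# Keep only the columns that occur exactly once (C.map is the Dict inside the counter)
filter(x -> last(x) == 1, C.map)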

Note that if you are doing anything numerical, floating point error may make the simple notion of equality implicit in these examples quite useless.
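To illustrate the floating-point caveat, here is a minimal sketch that treats two columns as equal when they agree within an absolute tolerance. The helper name unique_cols_approx and the default atol of 1e-8 are assumptions made for this example, not something from the thread. Approximate equality is not transitive, so the result can depend on column order.

```julia
using LinearAlgebra  # isapprox on arrays compares via a norm

# Hypothetical helper: keep the first column of each group of approximately equal columns.
# The tolerance `atol` is an assumed default; choose one that suits your data.
function unique_cols_approx(A::AbstractMatrix; atol::Real = 1e-8)
    kept = Int[]                          # indices of the columns kept so far
    for j in axes(A, 2)
        col = view(A, :, j)
        # Keep column j only if no previously kept column is approximately equal to it
        if !any(k -> isapprox(col, view(A, :, k); atol = atol), kept)
            push!(kept, j)
        end
    end
    return A[:, kept]
end

A_noisy = [1.0  1.0 + 1e-12  2.0
           3.0  3.0          4.0]
U_approx = unique_cols_approx(A_noisy)    # columns 1 and 2 are merged
```

With exact comparison, the first two columns of A_noisy would count as different even though they differ only by rounding noise.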


I used these commands, but an error appears:

UndefVarError: eachcol not defined
top-level scope at none:0

I am using Julia 1.0.4 and I have installed the package DataStructures.

What version of Julia are you using? eachcol was added in v1.1.


I am on Julia 1.0.4.
Is there a solution for this version?
Thanks so much!

You can add OnlineStats and use OnlineStats.eachcol, but I have never tested it.

Perhaps adding all those packages may be overkill. How about the simple, albeit verbose solution:

# Return true if the vector v is equal to at least one column of the matrix m.
function in_matrix(v, m)
    for ic in axes(m, 2)
        if v == @view m[:, ic]
            return true
        end
    end
    return false
end


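# Drop every column that reappears later in B, so the last occurrence
# of each duplicated column is the one that is kept.
function drop_duplicates(B)
    nCols = size(B, 2)
    kept = trues(nCols)
    @views begin
        for ic in 1:(nCols - 1)
            # Compare column ic against all columns to its right
            if in_matrix(B[:, ic], B[:, (ic + 1):nCols])
                kept[ic] = false
            end
        end
    end
    return B[:, kept]
end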


A=[1.0  7.0  4.0  8.0  9.0  4.0  5.0  0.0  10.0  10.0  1.0  7.0
	 1.0   0.0  0.0  9.0   8.0  1.0  3.0  5.0    7.0   1.0  1.0  0.0
	 1.0   0.0  0.0  5.0  10.0  8.0  3.0  5.0   1.0   5.0  1.0  0.0];
 
# Round to integers so that == compares the columns exactly in this example
B = round.(Int, A);

drop_duplicates(B)

You can always operate on, e.g.,

[A[:, i] for i in axes(A, 2)]

instead, but frankly, I would just upgrade to 1.2 unless you have a compelling reason not to.
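As a small sketch of that suggestion, the vector of columns can be passed to unique and then glued back into a matrix. The names A_small, colvecs, and U are illustrative only, and the comparison is exact.

```julia
A_small = [1.0 7.0 1.0
           1.0 0.0 1.0]

# Vector of column vectors (each one is a copy of a column of A_small)
colvecs = [A_small[:, i] for i in axes(A_small, 2)]

# Keep the first occurrence of each distinct column, then concatenate horizontally
U = reduce(hcat, unique(colvecs))
```

This uses only Base functions, so it should not depend on eachcol being available.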

You could just copy/paste the implementation of eachcol:

eachcol(A) = (view(A, :, i) for i in axes(A, 2))
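Building on that one-liner, here is a sketch that counts the columns with a plain Dict, so no extra package is needed. The names cols and nonrepeated_cols are hypothetical; cols is used instead of eachcol so the definition does not clash with Base.eachcol on Julia 1.1 and later. The comparison is exact, so the floating-point caveat mentioned earlier still applies.

```julia
# Hypothetical name `cols`: same generator as the eachcol one-liner, under another name.
cols(A) = (view(A, :, i) for i in axes(A, 2))

# Return the indices of the columns that occur exactly once in A (exact comparison).
function nonrepeated_cols(A::AbstractMatrix)
    counts = Dict{Vector{eltype(A)}, Int}()
    for c in cols(A)
        key = collect(c)                  # copy the view so the key is independent of A
        counts[key] = get(counts, key, 0) + 1
    end
    return [j for j in axes(A, 2) if counts[A[:, j]] == 1]
end

A_demo = [1.0 7.0 4.0 1.0 7.0
          1.0 0.0 0.0 1.0 0.0
          1.0 0.0 0.0 1.0 0.0]

idx = nonrepeated_cols(A_demo)       # indices of the columns that are never repeated
only_once = A_demo[:, idx]           # the matrix restricted to those columns
distinct = unique(A_demo, dims = 2)  # alternatively: one copy of every distinct column
```

If one copy of each distinct column is all you need, unique(A, dims = 2) alone is enough; the Dict is only needed to drop every column that has a duplicate.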