using LinearAlgebra, Statistics

function pca(data::Array{T,2}) where T
    # center each row (rows are variables, columns are observations)
    X = data .- mean(data, dims=2)
    Y = X' ./ sqrt(T(size(X,2)-1))
    U,S,PC = svd(Y)
    S = diagm(0=>S)
    V = S .* S
    # sort the components by variance, largest first
    indexList = sortperm(diag(V); rev=true)
    PCs = map(x->PC[:,x], indexList)
    return PCs, diag(V)[indexList]
end
I have data.
I need to do everything step by step, without a function.
What is 'T' in the line Y = X' ./ sqrt(T(size(X,2)-1))?
T here is the type of data in the array. An Array is formally an Array{T,N} where N where T. The T is the type of data, and the N is the number of dimensions.
To expand on Oscar’s answer, the reason for it in this case is to use the same type as your input: if you give it a matrix of Float64s, T(...) will convert what’s returned by size (an integer) to a Float64.
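As a small illustration of the two answers above, the sketch below shows what T is bound to and what T(...) does to the integer returned by size. The helper elementtype_of and the random matrix A are made up for this example and are not part of the original code.

```julia
# Hypothetical helper, only to show what T is bound to inside a method.
elementtype_of(data::Array{T,2}) where T = T

A = rand(Float32, 3, 5)   # made-up data: 3 variables, 5 observations
T = elementtype_of(A)     # Float32, the same as eltype(A)
n = size(A, 2) - 1        # size returns an Int, so n is an Int
converted = T(n)          # the Int converted to the element type
```

Outside a function, eltype(A) gives the same type without needing a helper.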
In your step-by-step version without the function, define T = Float64 before that line.
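For reference, here is a rough sketch of the same steps run without the function wrapper. The 2×5 matrix is made-up data, T is taken from eltype(data) rather than hard-coded, and the singular values are squared directly instead of going through diagm, which is a simplification of the original code rather than a copy of it.

```julia
using LinearAlgebra, Statistics

# Made-up data: rows are variables, columns are observations.
data = [2.5 0.5 2.2 1.9 3.1;
        2.4 0.7 2.9 2.2 3.0]
T = eltype(data)                     # Float64 for this matrix

X = data .- mean(data, dims=2)       # center each row
Y = X' ./ sqrt(T(size(X, 2) - 1))    # scale so that Y'Y is the sample covariance
F = svd(Y)                           # Y == F.U * Diagonal(F.S) * F.Vt
variances = F.S .^ 2                 # squared singular values = variance per component
order = sortperm(variances; rev=true)
PCs = [F.V[:, i] for i in order]     # principal directions, largest variance first
sorted_variances = variances[order]
```

The variances should add up to the total variance of the rows of data, which is a quick sanity check on the scaling.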
Separate point - you’re quoting code as a text quote instead of as code. Compare:
function foo()
println(“This doesn’t look like code!”)
end
to
function bar()
println("Oh, much better!")
end
Which look like this when writing:
> function foo()
> println("This doesn't look like code!")
> end
to
```
function bar()
println("Oh, much better!")
end
```
But wouldn’t it work just as well without the type conversion, e.g. X ./ sqrt(size(X,2)-1)?
I would imagine that size returns integers and sqrt returns at least floats, so dividing X by the result of sqrt would have type T (or higher) anyway. Using T(size(X,2)-1) seems out of place to me, since the type may be promoted by sqrt anyway. Or if not out of place, then at least a bit obscure.
I wonder:
Is it necessary to cast to a particular type at all? Could one leave out the type, e.g. X' ./ sqrt(size(X,2)-1), with no issues?
If one is to cast to T, would it make more sense to do so after the root, e.g. T(sqrt(size(X,2)-1))?
Are there more direct or idiomatic (or transparent) ways to ensure type, e.g. T.(X' ./ sqrt(size(X,2)-1))?
Yeah, I was thinking that myself. Similar formulations are often used to generate Arrays of the correct type (e.g. zeros(T, n)), but in this case sqrt seems to always return a Float, regardless of T.
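One way to check the variants discussed above is to compare the element types they produce. This is only a sketch, and it assumes Float32 input (made-up random data), because that is a case where the placement of the conversion is visible in the result.

```julia
X = rand(Float32, 3, 5)              # made-up Float32 data
T = eltype(X)

a = X' ./ sqrt(T(size(X, 2) - 1))    # convert before sqrt
b = X' ./ sqrt(size(X, 2) - 1)       # no conversion: sqrt of an Int gives a Float64
c = X' ./ T(sqrt(size(X, 2) - 1))    # convert after sqrt
d = T.(X' ./ sqrt(size(X, 2) - 1))   # convert the whole result afterwards

eltypes = (eltype(a), eltype(b), eltype(c), eltype(d))
```

With Float32 input, only the variant without any conversion ends up as Float64; the other three keep the input type. With Float64 input all four give Float64, so the conversion makes no visible difference there.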