Problems with tuples in Statistics function

brett_knoss · May 8, 2021, 2:19am

using Statistics
using Distributions
x=[1,2,3,4,5,6,7,8]
y=[8,4,5,5,6,5,6,5]
function TwoSampleT2Test(X,Y)
    nx, p = size(X)
    ny, _ = size(Y)
    δ = mean(X, dims=1) - mean(Y, dims=1)
    Sx = cov(X)
    Sy = cov(Y)
    S_pooled = ((nx-1)*Sx + (ny-1)*Sy)/(nx+ny-2)
    t_squared = (nx*ny)/(nx+ny) * δ * inv(S_pooled) * transpose(δ)
    statistic = t_squared[1,1] * (nx+ny-p-1)/(p*(nx+ny-2))
    F = FDist(p, nx+ny-p-1)
    p_value = 1 - pvalue(Kolmogorov(), sqrt(x.n)*x.δ; tail=:right)
    println("Test statistic: $(statistic)\nDegrees of freedom: $(p) and $(nx+ny-p-1)\np-value: $(p_value)")
    return([statistic, p_value])
end
TwoSampleT2Test(x,y)

I get a bounds error, something to do with selecting an invalid value of a tuple. Is there a problem with the function, or is it the array/tuple that I was using as an example?

I also tried to use RDatasets, but

using RDatasets
iris = dataset("datasets", "iris")

versicolor = convert(Matrix, iris[iris.Species .== "versicolor", 1:2])

virginica = convert(Matrix, iris[iris.Species .== "virginica", 1:2])

gave me a multiple definition error in Pluto.

jling · May 8, 2021, 3:54am

Please format your code as code.

brett_knoss · May 8, 2021, 5:18am

Sorry, I used quotation marks by mistake.

nilshg · May 8, 2021, 8:54am

Your error has nothing to do with statistics. The MWE would be:

julia> x=[1,2,3,4,5,6,7,8]
8-element Vector{Int64}:
 1
 2
 3
 4
 5
 6
 7
 8

julia> nx, p = size(x)
ERROR: BoundsError: attempt to access Tuple{Int64} at index [2]

You are calling size on a vector, which gives you back a 1-element tuple (as a vector is one-dimensional, it has only length). You then try to destructure that by assigning it to two variables (nx, p) but that doesn’t work as there’s only one element in the tuple.
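To make the difference concrete, here is a small sketch (the variable names are only for illustration) comparing what size returns for a vector and for a matrix:

```julia
v = [1, 2, 3, 4, 5, 6, 7, 8]   # a Vector: one dimension
M = reshape(v, 4, 2)           # a 4×2 Matrix built from the same data

size(v)        # (8,)   a 1-element tuple
size(M)        # (4, 2) a 2-element tuple

nrows, ncols = size(M)   # works: two names, two tuple elements
n = length(v)            # for a vector, ask for its length instead
```

So a function that starts with nx, p = size(X) is expecting X to be a matrix with observations in rows and variables in columns.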

brett_knoss · May 8, 2021, 4:08pm

Ok, do I need to assign the tuple as an array? How do I fix this?

jling · May 8, 2021, 4:09pm

What do you expect nx and p to be, respectively?

brett_knoss · May 8, 2021, 4:38pm

I don’t know, I was trying to get someone else’s function to work. I instead found the Hotelling test in HypothesisTests.jl, but I’m not sure how to enter it.

using HypothesisTests

x = randn(100_000); y = randn(100_000);
BartlettTest(x,y) #not the Hotelling test, but I need to understand how to use matrix variables first

tells me that there is no matching method. Instead I need to use an abstract matrix.
I know that a matrix is an array of arrays, and that an array is a matrix with 1 row, but I’m not sure if this is a formatting problem, i.e., do I need to change the type, or arrange both x and y into a single matrix?

Or, if I need more tests.

jling · May 8, 2021, 4:53pm

No, a Matrix is an Array with two dimensions.

julia> Matrix
Matrix{T} where T (alias for Array{T, 2} where T)

Also no: what you’re probably thinking of here is a Vector, and that is a column vector, not a row vector:

julia> a = rand(3)
3-element Vector{Float64}:
 0.9434075399317472
 0.42470401598192176
 0.4838410682979659

julia> a' * a
1.3044934669629604
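As a rough illustration with fixed numbers instead of random ones (so the results are easy to check by hand), the shapes involved look like this:

```julia
a = [1.0, 2.0, 3.0]

size(a)        # (3,): a Vector has a single dimension
size(a')       # (1, 3): the adjoint behaves like a row
a' * a         # 14.0: row times column gives a scalar (the dot product)
size(a * a')   # (3, 3): column times row gives a matrix

Matrix{Float64} === Array{Float64, 2}   # true: Matrix is only an alias
```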

brett_knoss · May 9, 2021, 7:06pm

I took linear algebra, so I should know this, but what do a_prime and a mean?

AndiMD · May 9, 2021, 7:37pm

a' is short for adjoint(a).
You can find this information in the REPL help with ?something, for example ?'.

brett_knoss · May 9, 2021, 8:35pm

So is there a way to put two arrays into a two-dimensional matrix?

sylvaticus · May 9, 2021, 9:31pm

If you mean to put the first as the first column and the second as the second column, it is hcat(vector1,vector2)

Have a look at my tutorial: 2 - Data types - Julia language: a concise tutorial
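A minimal sketch using the two vectors from the first post (X and V are names chosen here for illustration):

```julia
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [8, 4, 5, 5, 6, 5, 6, 5]

X = hcat(x, y)     # 8×2 Matrix{Int64}: x is column 1, y is column 2
size(X)            # (8, 2)
X[:, 2] == y       # true

V = vcat(x', y')   # 2×8 Matrix: x and y as rows instead
```

With X built this way, nx, p = size(X) gives nx = 8 and p = 2.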

brett_knoss · May 9, 2021, 11:15pm

OK, I got this. Now I get that a zero vector is a point, but is mu[0] a common mean between two ranges?

sylvaticus · May 10, 2021, 6:16am

Not sure what you mean here. By “zero vector” do you mean a vector of zeros or an empty vector? In both cases it is hard to interpret it as a “point”.

mu[0] looks odd in Julia, as vector indices start from 1.

brett_knoss · May 10, 2021, 3:05pm

from Juliastat.org
Multivariate tests · HypothesisTests.jl

OneSampleHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix, μ₀=<zero vector>)

and here is my understanding of a zero vector

Jeff_Emanuel · May 10, 2021, 4:39pm

The line from the docs below tells you what you need to know. Where it says μ₀=<zero vector>, that means that the default value of μ₀ if not specified, is a vector containing all zeros (Functions · The Julia Language). Since the docs don’t tell you what the function returns, click on the source link and you can clearly see that it returns an instance of the OneSampleHotellingT2Test struct.

test of the hypothesis that the vector of mean column differences between X and Y is equal to μ₀ .

brett_knoss · May 10, 2021, 4:43pm

Ok, I think I get that, so what do I enter there? An array of N zeros?

Jeff_Emanuel · May 10, 2021, 4:45pm

You pass a vector that you want to test the mean column differences against. That’s up to your problem. If you want to test whether the column means are equal, i.e., whether their differences are zero, don’t pass anything, so that the default value is used.
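The idea can be sketched with a small stand-in function. Note that mean_column_difference is a hypothetical helper written for this example, not part of HypothesisTests.jl; it only mimics the way a trailing argument can default to a zero vector:

```julia
using Statistics

# Hypothetical helper, NOT the HypothesisTests.jl API: mean column differences
# between X and Y, measured relative to μ₀ (all zeros unless the caller passes one).
function mean_column_difference(X::AbstractMatrix, Y::AbstractMatrix,
                                μ₀::AbstractVector = zeros(size(X, 2)))
    d = vec(mean(X .- Y, dims = 1))   # one mean difference per column
    return d .- μ₀
end

X = [1.0 2.0; 3.0 4.0; 5.0 6.0]
Y = [0.0 2.0; 2.0 4.0; 4.0 6.0]

r_default  = mean_column_difference(X, Y)               # μ₀ omitted: [0.0, 0.0] is used
r_explicit = mean_column_difference(X, Y, [1.0, 0.0])   # μ₀ supplied explicitly
```

Calling the function with only X and Y is all that is needed to get the default; there is nothing extra to type.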

brett_knoss · May 10, 2021, 7:39pm

OK, I got that, so what would I enter, so that it knows I want default?

nilshg · May 10, 2021, 7:49pm

The default value for a keyword argument is the value that the argument will take if it is not supplied when the function is called. This is the same as what I showed you with tail in the p value calculations before.
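As a small sketch of the mechanism (describe_tail is a made-up function for illustration, and :both is an assumed default, not taken from any package):

```julia
# Hypothetical function for illustration only; `tail` is a keyword argument
# (declared after the semicolon) with the default value :both.
function describe_tail(statistic; tail = :both)
    return "statistic = $statistic, tail = $tail"
end

s_default = describe_tail(2.5)                 # tail not supplied, so :both is used
s_right   = describe_tail(2.5; tail = :right)  # tail supplied explicitly
```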
