Rayleigh Test circular stats

Dear Julia peeps,

I am trying to relate brain activity to an amplitude modulated tone stimulus. As in, does the peak of brain activity occur in synchrony with the phase of the modulated tone? HypothesisTests.jl has this function for a Rayleigh Test: ( HypothesisTests.jl/circular.jl at master · JuliaStats/HypothesisTests.jl (github.com)) but I am not sure from the documentation how to use it. Below is the code I’m using to find instantaneous phase of a 2 Hz amplitude modulation wave and relate a series of mock peak latencies to it. I then take the phase as orientation for a series of unit vectors and find the sum of vectors for the length of the resultant vector (vstrength). I want to compare that resultant vector length between conditions and I believe I should be able to do that with the Rayleigh Test. Any help would be appreciated on how to use this function properly.

(I’m using this publication as reference: Auditory Cortex Phase Locking to Amplitude-Modulated Cochlear Implant Pulse Trains (physiology.org))

using DSP, FFTW, HypothesisTests

# prep the data which will fluctuate
peaklat1   = Int.(ceil.(rand(100)*100)) # list of fake latencies
peaklat1[peaklat1 .> 50] = peaklat1[peaklat1 .> 50].+150 # create more variance
peaklat2   = Int.(ceil.(rand(100)*100)) # list of fake latencies
clickfreq  = 2    # stimulus frequency
sr         = 1000 # sampling rate

function getAMwave(clickfreq=2,sr=1000)
    t    = 0:1/sr:1 # time course
    wave = sin.((2*π*clickfreq*t).+3π/2) # AM wave starting at the bottom
    return wave

# generate the AM wave used and get instantaneous phase
wave = getAMwave(clickfreq,sr)    # function above
wavephase = angle.(hilbert(wave)) # Compute phase of wave

# get the phase at each time point of signal peak amplitude
orientation1 = wavephase[peaklat1]

# break vector into components i and j to compute sums, averages, and then the resultant vector 
i1 = cos.(orientation1) 
j1 = sin.(orientation1)
sumi1 = sum(i1) 
sumj1 = sum(j1)
avgi1 = sumi1/length(i1) # average of component i
avgj1 = sumj1/length(j1) # average of component j

vstrength1 = sqrt((avgi1^2)+(avgj1^2)) # find the vector based on i and j components (Pythagorean)

# now the same for the second latency series
orientation2 = wavephase[peaklat2]
i2 = cos.(orientation2) 
j2 = sin.(orientation2)
sumi2 = sum(i2) 
sumj2 = sum(j2)
avgi2 = sumi2/length(i2) 
avgj2 = sumj2/length(j2) 
vstrength2 = sqrt((avgi2^2)+(avgj2^2)) 

# what I feel I should be able to do: 
# or 
RayleighTest([vstrength1,vstrength2],100) # 100 observations per condition

It looks like the RayleighTest constructor is expecting the i and j components as a single complex vector. Since you are already calculating the resultant length (vstrength), you can use the default constructor as RayleighTest(vstrength1, 100).

However, as implemented in HypothesisTests, it does not appear that you can make the kind of comparison that you wish to. I’m not familiar with the Rayleigh Test, so I don’t know if that is a limitation of the test itself or the implementation.

1 Like

@ararslan, Sorry for the direct tag but I’m working on implementing the HypothesisTests.jl - Rayleigh test in my analysis. I have been able to figure out what I was misunderstanding in the original post here but now I’m comparing results with R and python versions of this test and I noticed that the p value is inconsistent between those two to Julia.

R and python only take exp(-Z) but it’s written here to only perform this calculation if the sample size is larger than 1e6. Could you tell me why there is are is the extra equation in the p value calculations if sample size is less than a million?

function pvalue(x::RayleighTest)
    Z = x.Rbar^2 * x.n
    x.n > 1e6 ? exp(-Z) :
        exp(-Z)*(1+(2*Z-Z^2)/(4*x.n)-(24*Z - 132*Z^2 + 76*Z^3 - 9*Z^4)/(288*x.n^2))

Thank you! It took me a while but I understand now that it is indeed just the way the test works. :slight_smile:

That equation is found here. My guess is that the latter method is more accurate, albeit more expensive, and that for n > 1e6, exp(-Z) is likely sufficiently accurate (the exponential is going to dominate) and computationally cheaper to make sense.

1 Like

The longer computation put me a couple of orders of magnitude off of the exp(-Z) computation so I was unsure. But it’s great to see the source here, thank you for finding that!