Decomposing an EEG Signal from Scratch

Hey folks! :wave:

I’ve recently been tinkering with decomposing some EEG data into various visualizations and want to make sure I am going about things the right way (the data I am using is openly available here). Here is how I approached processing some of this EEG data, along with the questions I had at the end.

Minor Data Prep

First, as the data was in .fdt files, I wrote my own parser to read it (this is where the raw channel-level data comes from):

fdt Parser
# Read an EEGLAB .fdt file: raw Float32 samples laid out as channels × points,
# with the dimensions taken from the accompanying .set metadata
fdt_parser(fdt_file, mat_file) =
    open(fdt_file) do io
        read!(
            io,
            Array{Float32}(undef, (Int(mat_file["nbchan"]), Int(mat_file["pnts"]))),
        )
    end

Then, I loaded the .set and .fdt files to serve as a reference for the session recording I was analyzing. This was pretty straightforward:

Loading Data
using MAT

# EEGLAB .set files are MAT-files holding the recording metadata
mat_file = matread(
    joinpath(
        session_1_path,
        "sub-01_ses-1_task-eyesclosed_eeg.set"
    )
)

# Raw channels × samples matrix read from the companion .fdt file
channel_data = fdt_parser(
    joinpath(
        session_1_path,
        "sub-01_ses-1_task-eyesclosed_eeg.fdt"
    ),
    mat_file
)
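As a quick sanity check (untested sketch), the parsed matrix can be verified against the channel and sample counts stored in the .set metadata:

@assert size(channel_data) == (Int(mat_file["nbchan"]), Int(mat_file["pnts"]))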

Visualizing Electrode Positions

This session recorded EEG signals across 61 different channels (i.e. electrodes). For a quick visualization of the layout, I used TopoPlots.jl by @behinger, which gave the following figure (code is below):

Topoplot Code
using CairoMakie, TopoPlots

# `pts` holds the electrode positions and `electrodes.name` the channel labels,
# both loaded from the dataset's electrodes file
fig = eeg_topoplot(
    data[1][1:61, 1, 1], 
    electrodes.name, 
    positions = pts, 
    interpolation = NullInterpolator(), 
    label_scatter = 
        (
            markersize=8, 
            color = :black
        ), 
    axis = 
        (
            type=Axis, 
            title = "61-Lead Electrode Array Layout",
            aspect= DataAspect(), 
            leftspinevisible = false,
            rightspinevisible = false,
            topspinevisible = false,
            bottomspinevisible = false,
            xgridvisible = false,
            ygridvisible = false,
            xminorgridvisible = false,
            yminorgridvisible = false,
            xminorticksvisible = false,
            yminorticksvisible = false,
            xticksvisible = false,
            yticksvisible = false,
            xticklabelsvisible = false,
            yticklabelsvisible = false,
        ), 
    labels = electrodes.name, 
    label_text = 
        (; 
            fontsize = 6, 
            offset = (-3, 0)
        ), 
)

Quick Data Preview

For the next parts of this work, I decided to use only 1 channel for my analysis, figuring I could scale up the complexity later (i.e. analyzing and averaging all channels together). To get a sense of the signal on this channel, I did a quick plot of the recorded voltage amplitude (measured in \mu V); here’s that plot (with code below):

Raw Voltage Plot
using CairoMakie

fig = Figure(
        size = (1200, 500),
        fontsize = 20
);

ax = CairoMakie.Axis(
   fig[1, 1],
)

# Plot every sample of the first channel against its sample index
lines!(ax, 1:length(channel_data[1, :]), channel_data[1, :])
ax.xlabel = "Samples"
ax.ylabel = "Voltage (μV)"
ax.title = "Raw Signal (Sampling Rate: 500 Hz)"

Signal Decomposition

After this, I got to the part that is the most challenging for me: decomposing the signal into different bands (i.e. delta, theta, alpha, beta, and gamma). This is the part where I could use the most checking. I first used DSP.jl to compute a Welch periodogram of the time series:

using DSP

# Welch periodogram of the first channel (500 Hz sampling rate)
s = welch_pgram(channel_data[1, :], fs = 500)

f = s.freq
p = s.power

Then, leaning heavily on Prof. Makoto Miyakoshi’s notes, I think I somewhat figured out how to extract the bands after translating some MATLAB ideas into Julia. Here is how I approached it:

# Indices of the Welch frequencies (in Hz) falling within each canonical band
delta = findall(x -> x >= 0 && x < 4, f);
theta = findall(x -> x >= 4 && x < 8, f);
alpha = findall(x -> x >= 8 && x < 12, f);
beta = findall(x -> x >= 12 && x < 30, f);
gamma = findall(x -> x >= 30 && x < 200, f);
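As a quick sanity check (my own rough, untested addition, not from the notes above), the average Welch power per band can also be summarized from these indices:

using Statistics

# Average Welch power within each band's frequency indices
band_indices = Dict(
    "delta" => delta, "theta" => theta, "alpha" => alpha,
    "beta"  => beta,  "gamma" => gamma,
)
band_power = Dict(name => mean(p[idxs]) for (name, idxs) in band_indices)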

And with these bands segmented out, I created the following plot:

Segmented Wave Bands
fig = Figure(
   size = (1000, 1200),
   fontsize = 20
);

colors = ["red", "green", "blue", "orange", "black"]
labels = ["delta", "theta", "alpha", "beta", "gamma"]
# Note: the first 49 delta bins (the very-low-frequency end) are skipped here
bands = [delta[50:end], theta, alpha, beta, gamma]

for (idx, band) in enumerate(bands)
    ax = CairoMakie.Axis(
      fig[idx, 1],
    )

    lines!(ax, f[band], p[band], color = colors[idx])
    ax.title = L"%$(labels[idx])\text{-Band}"
    ax.ylabel = "Power"
    if idx == 5
        ax.xlabel = "Frequency (Hz)"
    end
end

Additionally, I created a power spectral density (PSD) plot for further visualization (with some aesthetic banding for easier viewing):

Abbreviated, Banded PSD Plot
fig = Figure(
   size = (1600, 400), 
   fontsize = 20
);

ax = CairoMakie.Axis(
  fig[1, 1], 
)

colors = ["red", "green", "blue", "orange", "black"]
labels = ["delta", "theta", "alpha", "beta", "gamma"]
bands = [delta[50:end], theta, alpha, beta, gamma[1:800]]

# Common y-limit for the shaded band regions
band_ymax = p[vcat(bands...)] |> maximum

for (idx, band) in enumerate(bands)

    # Shaded background strip spanning the band, plus the PSD curve itself
    band!(ax, f[band], fill(0, length(band)), fill(band_ymax, length(band)), color = (colors[idx], 0.1))
    lines!(ax, f[band], p[band], color = colors[idx], label = L"%$(labels[idx])")
    ax.title = "EEG Recording Session"
    ax.ylabel = "Power"
    if idx == 5
        ax.xlabel = "Frequency (Hz)"
        axislegend()
    end
end

Note: I cut off the gamma band in this particular diagram prematurely as it resulted in a rather “scrunched up” graphic. This was purely for presentation purposes.

Outstanding Questions

I hope that more or less made sense! It was somewhat tough tracking down these methods outside of established software (like MATLAB’s EEGLAB or Python’s MNE-Python), so I definitely could have made errors along the way. That leads me to the outstanding questions I have from this work so far:

First, in the plot where I segmented out the different regions, I am unsure how to make a visualization like this:

[reference image: one EEG signal segment split into delta, theta, alpha, beta, and gamma traces, stacked against a shared time axis]

As one can see, the different bands of that one signal are all aligned against the same x-axis, while my segmented plots are not. How can I make a plot like this?

Second, are the methods I have used so far to extract the bands correct? I tried creating a nice bandpass filter with DSP.jl but couldn’t get it to work, so I opted to directly use the Welch periodogram function that comes with DSP.jl instead. Would there be a better way to do things here?

Third, is it standard practice to take the mean of signals across all channels over time when computing brainwaves? In my readings, it seems like that is one way to aggregate activity across an entire recording session, but I was curious what else folks do.

Thanks all!

~ tcp :deciduous_tree:

1 Like

P.S. I do not mean to be a bother, but I am going to CC some folks whom I’d be curious to hear from about how they tackle problems like these: @Zach_Christensen @Jakub_Mitura @tim.holy @jcorream @AdamWysokinski

I have not tried it, but knowing the locations of the electrodes, one should be able to triangulate the position to some area of the brain (though not precisely without an MRI of the patient). Still, the relative amplitude of the extracted event at different spots should make it possible to show, on a template brain model, where the source of a given electrical activity is.

1 Like

Hey,
what you did looks fine, but I think it is not exactly what you wanted to get. :wink:
You have extracted the spectral representation; however, from the graphic in your first question it seems you want a time series.
So, just as you mentioned, you need to bandpass filter the data with bounds matched to the different types of waves. Can you share the code for filtering that you used and the error you got?

As for the third question, I am not sure what you are referring to. Do you mean taking an average of the raw signal? Or an average of some metric? Typically, people calculate metrics over smaller segments of data (e.g. a couple of seconds) and then average the values from the segments. It will, however, vary depending on the chosen measure. Averaging the raw signal is not a typical practice.
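To make that concrete, here is a rough, untested sketch (reusing `channel_data` and the 500 Hz rate from your post, with an arbitrary 2-second segment length) of averaging a metric, here mean alpha power, over segments:

using DSP, Statistics

fs = 500
seg_len = 2 * fs  # 2-second segments (arbitrary choice for illustration)

# Split the first channel into non-overlapping segments
starts = 1:seg_len:(size(channel_data, 2) - seg_len + 1)
segments = [channel_data[1, i:(i + seg_len - 1)] for i in starts]

# Mean alpha-band (8-12 Hz) Welch power per segment
alpha_power = map(segments) do seg
    pg = welch_pgram(seg; fs = fs)
    idx = findall(x -> 8 <= x < 12, pg.freq)
    mean(pg.power[idx])
end

# Average the segment-level metric across the recording
mean_alpha_power = mean(alpha_power)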

1 Like

Hi! Happy to chime in.
Yes, the frequency ranges that you’ve used are consistent with the literature. See Herrmann, C.S., Strüber, D., Helfrich, R.F. and Engel, A.K., 2016. EEG oscillations: from correlation to causality. International Journal of Psychophysiology, 103, pp.12-21.

The extent to which the oscillations are extracted accurately in the time domain depends on the properties of the filter that you design. One way to assess this is to look at the frequency response of the filter and see whether there’s good attenuation outside of the passband of interest. See the DSP.jl documentation here and the tutorial at the end. There are, of course, many other criteria. Oppenheim’s Digital Signal Processing book has more information on this, and the MIT OCW materials for his DSP class are an excellent resource as well.
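As a rough, untested sketch of what I mean (the alpha-band edges and the order of 4 here are arbitrary choices, and I’m assuming DSP.jl’s `freqresp`):

using DSP

fs = 500
# Example Butterworth bandpass around the alpha band
bp = digitalfilter(Bandpass(8, 12; fs = fs), Butterworth(4))

# Magnitude response on a normalized angular-frequency grid
w = range(0, π; length = 512)
mag_db = 20 .* log10.(abs.(freqresp(bp, w)))
freqs_hz = w .* fs ./ (2π)  # convert to Hz to read off attenuation outside 8-12 Hz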

To compute the spectrum, I recommend using multitaper methods. They’ll reduce the variance of your spectral estimate and use tapers that maximize the average power within a passband of interest. There’s a Julia package for doing this.

1 Like

Just to add, DSP.jl has multitaper methods as well, and they are reference-tested against MATLAB and PyMNE. I actually hadn’t heard of Multitaper.jl, so I’m not sure how DSP.jl’s methods compare to those.
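For example, a minimal, untested sketch of the DSP.jl route, reusing `channel_data` from the original post:

using DSP

# Multitaper PSD estimate of the first channel (default taper settings)
mt = mt_pgram(channel_data[1, :]; fs = 500)
f_mt = mt.freq
p_mt = mt.power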

2 Likes

Shameless plug here, sorry for being off-topic! If some new methods for decomposing signals are discussed here, perhaps they can be added via PR to JuliaDynamics/SignalDecomposition.jl (decompose a signal/timeseries into structure and noise, or seasonal and residual, components), or an issue could be opened describing the method so that at least it is on record for the library!

edit: right, after reading the original post in more detail, I can see that I am way too off-topic. This post is about decomposition in the frequency domain, while what I mentioned above is decomposition in time (albeit the two are conjugates of each other, but oh well).

1 Like

Right! Thanks. I forgot to mention that DSP.jl also has multitaper methods. :slight_smile: I am also unsure as to how they compare to one another.

2 Likes

Hi all,
I’m very busy for the next few days, so please excuse my very brief reply. All these operations (and much more :grinning:) can be done using NeuroAnalyzer.jl. I’ll prepare a brief tutorial on this problem, just give me a day or two :slight_smile:
PS 1. Your plot is the power spectrum; the reference plot is a 5-second signal segment split into 5 frequency bands, e.g. using a band-pass filter for each band.
PS 2. The delta band is usually defined from 0.1 Hz or 0.5 Hz (e.g. in sleep studies).
ATB,
Adam

1 Like

Unfortunately, it’s much more difficult, since source localization is an underdetermined problem: for each set of channel recordings there is an infinite number of solutions for their sources.

3 Likes

Hey @mkoculak,

Thanks for the comments! Sure thing, here is the code I was trying to use:

using DSP

#=

...
Code for loading data here
...

=#

fs = 500 # Sampling frequency

# Normalized passband/stopband edges (relative to Nyquist) for the theta band
Wp = (4, 8) ./ (fs / 2);
Ws = (3.5, 8.5) ./ (fs / 2);
Rp = 3  # Passband ripple (dB)
Rs = 40 # Stopband attenuation (dB)

# Estimate the minimum Butterworth order meeting these specs
n, Wn = buttord(Wp, Ws, Rp, Rs)

theta_band = Bandpass(4, 8; fs = fs);
theta_filter = digitalfilter(theta_band, Butterworth(n));

# Zero-pole-gain representation of the designed filter
z = theta_filter.z
p = theta_filter.p
k = theta_filter.k

After this point, I was a bit confused about how to properly apply the filter to my data, as well as what form my data should be in (i.e. \mu V or something else). Do you know what I might want to do after this point, if what I did so far seems reasonable? I was using this Stack Overflow discussion to try to track down how to do it: "EEG bandpass filter in MATLAB".

Yea, I was more referring to a processed signal (i.e. after some denoising and associated filtering had been done).

P.S. You may want to see the following comment now as I retried using DSP.jl and I think I may have gotten something more like what I wanted for question 1.

After refreshing myself with a bit more EEG DSP material, I took another look at the tutorial you mentioned and things started clicking! After playing around with it some more, I think I may be getting closer to what I am after. Here’s the latest diagram:

Here’s my associated code where I am using DSP.jl and processing my raw channel data (in \mu V):

Brain Wave Extraction Attempt Code
# Band edges in Hz (`fs`, `labels`, and `channel_data` come from my earlier code)
delta_range = [1, 4]
theta_range = [4, 8]
alpha_range = [8, 12]
beta_range = [12, 30]
gamma_range = [30, 200]

fig = Figure(
    size = (1200, 1400),
    fontsize = 20
);

for (idx, r) in enumerate([delta_range, theta_range, alpha_range, beta_range, gamma_range])
    # Stopband edges sit 0.5 Hz outside the passband (normalized to Nyquist)
    if r[1] > 0
        Ws = (r[1] - 0.5, r[2] + 0.5) ./ (fs / 2)
    else
        Ws = (0.5, r[2] + 0.5) ./ (fs / 2)
    end
    Wp = (r[1], r[2]) ./ (fs / 2)
    Rp = 3  # Passband ripple
    Rs = 40 # Stopband attenuation

    # Minimum Butterworth order meeting the specs for this band
    n, ωn = buttord(Wp, Ws, Rp, Rs)

    ax = CairoMakie.Axis(
        fig[idx, 1],
    )

    btr_band_filt = digitalfilter(Bandpass(r[1], r[2]; fs = fs), Butterworth(n))
    signal = filt(btr_band_filt, channel_data[1, :]);

    # Plot a 10-second window (5000 samples at 500 Hz), skipping the first 1000 samples
    lines!(ax, 1:5000, signal[1001:6000]);
    ax.title = L"%$(labels[idx])\text{-Band}"
    ax.ylabel = "Amplitude (μV)"
    if idx == 5
        ax.xlabel = "Samples"
    end
end

Additionally, here is the estimated order (n) for each Butterworth filter, along with the other design parameters for the signals shown here:

Variable Information
delta Band Information:
  Wp = (0.004, 0.016)
  Ws = (0.002, 0.018)
  n = 25

theta Band Information:
  Wp = (0.016, 0.032)
  Ws = (0.014, 0.034)
  n = 28

alpha Band Information:
  Wp = (0.032, 0.048)
  Ws = (0.03, 0.05)
  n = 25

beta Band Information:
  Wp = (0.048, 0.12)
  Ws = (0.046, 0.122)
  n = 120

gamma Band Information:
  Wp = (0.12, 0.8)
  Ws = (0.118, 0.802)
  n = 380

The signals for the brain waves look almost like what I’d expect, but gamma is just obliterated by the high order estimate of n = 380. I am not sure if this is exactly right, but many thanks @jcorream, I might be getting… closer?
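One thing I am considering trying next (just a guess on my part, not validated): skip buttord for gamma, fix the order at something modest, and apply the filter with zero-phase filtfilt instead:

using DSP

# Guess: a fixed low-order Butterworth bandpass for gamma (30-200 Hz at fs = 500),
# run forwards and backwards with filtfilt to avoid phase distortion
gamma_filt = digitalfilter(Bandpass(30, 200; fs = fs), Butterworth(4))
gamma_signal = filtfilt(gamma_filt, channel_data[1, :])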

I had no idea this was the case, and it makes using DSP.jl even more appealing in my eyes! Could I add a quick docs PR to highlight this fact? I didn’t come across it anywhere outside of your reference.

Oh, thanks for the comments! I’m looking forward to seeing more about NeuroAnalyzer.jl’s usage. I think it is a lovely package and am eager to hear more soon! Hope you are doing great, Adam (and hope to see you at a JuliaHealth meeting again sometime! :smiley: )! Take care!

Thanks for correcting me here! I was extrapolating from ECG, where the electrode readings can tell you quite a lot about the spatial localisation of an anomaly.

1 Like

I’ve discovered a nasty bug in the .FDT import routine. It is fixed, but please use the current NeuroAnalyzer version (0.24.7-dev):

git clone https://codeberg.org/AdamWysokinski/NeuroAnalyzer.jl
cd NeuroAnalyzer.jl
julia

Ok, briefly, here’s the code:

using Pkg
using Plots

Pkg.activate(@__DIR__)
Pkg.instantiate()
using NeuroAnalyzer

# load the data
eeg = import_set("sub-01_ses-1_task-eyesopen_eeg.set")

# there is a huge low-frequency drift, we should remove it
NeuroAnalyzer.filter!(eeg, fprototype=:fir, ftype=:hp, cutoff=0.5)

# split into bands
# s is the split signal (band × channel × samples × epochs), bn is the band names, and bf is their boundary frequencies
s, bn, bf = bpsplit(eeg)

# let's plot the second channel
# we will plot the first 20 seconds (`1:20*sr(eeg)`)
ch = 2
# time points of the segment
t = eeg.time_pts[1:20*sr(eeg)]
# plot individual bands
p_d = Plots.plot(t, s[1, ch, 1:20*sr(eeg), 1], ylims=(-15, 15), title="Band: $(string(bn[1]))", legend=false, xlabel="[s]", ylabel="[μV]")
p_t = Plots.plot(t, s[2, ch, 1:20*sr(eeg), 1], ylims=(-15, 15), title="Band: $(string(bn[2]))", legend=false, xlabel="[s]", ylabel="[μV]")
p_a = Plots.plot(t, s[3, ch, 1:20*sr(eeg), 1], ylims=(-15, 15), title="Band: $(string(bn[3]))", legend=false, xlabel="[s]", ylabel="[μV]")
p_b = Plots.plot(t, s[6, ch, 1:20*sr(eeg), 1], ylims=(-15, 15), title="Band: $(string(bn[6]))", legend=false, xlabel="[s]", ylabel="[μV]")
p_g = Plots.plot(t, s[9, ch, 1:20*sr(eeg), 1], ylims=(-15, 15), title="Band: $(string(bn[9]))", legend=false, xlabel="[s]", ylabel="[μV]")
# combine all plots
Plots.plot(p_d, p_t, p_a, p_b, p_g, layout=(5, 1), labelfontsize=5, titlefontsize=6, ytickfontsize=4, xtickfontsize=5, lw=0.25)

To plot the topographical map:

# first, load channel locations
load_locs!(eeg, file_name="sub-01_ses-1_electrodes.tsv")
# plot the topomap: average amplitude of the first 20 seconds
plot_topo(eeg, seg=(0, 20))

ATB, Adam

1 Like

The major difference in source localization between ECG and EEG is that in the heart you have periodic, sequential activity (P wave, P-Q segment, QRS complex, S-T segment, T wave, and back to the P wave). Each part represents the electrophysiological activity of a distinct part of the heart (atria, ventricles, etc.). Therefore, you can map these elements onto the heart’s structures when you have them recorded at various positions. Brain activity, at the macroscopic level, is aperiodic and non-sequential, hence the problem.

2 Likes

Sure, doc improvements are of course always welcome! I only know about it because I helped upstream it from some private code. Multitaper methods seem somewhat niche outside of EEG analysis, though, so they probably shouldn’t be much more prominent than the standard methods most folks are looking for.

1 Like

Added a wee little doc PR here: [DOCS] Add Small Note about Method Validations by TheCedarPrince · Pull Request #561 · JuliaDSP/DSP.jl · GitHub

It might not make sense in that location, so I could move the notice around if needed.

Out of curiosity: I guess Independent Component Analysis has been tried in the literature. Is that useful at all? What would its interpretation be?

1 Like