EEG.jl -> Present and Future

I think it is better to treat EEG data as it is, that is, as a channels × time array. Stimulations and other markers, for example those used to extract epochs, are usually given in an accessory channel. Cheers.
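To make the "array plus accessory marker channel" idea concrete, here is a minimal sketch in plain Julia. It assumes that the last row of the array is the marker channel and that it is zero everywhere except at stimulus onsets; the function name `extract_epochs` and this layout are illustrative assumptions, not part of EEG.jl or any existing package.

```julia
# Assumed layout: every row is an EEG channel except `marker_row`, which is
# zero everywhere except at stimulus onsets (hypothetical convention).
function extract_epochs(data::AbstractMatrix, marker_row::Integer, epoch_len::Integer)
    onsets = findall(!iszero, view(data, marker_row, :))
    eeg_rows = setdiff(axes(data, 1), marker_row)
    nsamples = size(data, 2)
    # Keep only onsets whose full epoch fits inside the recording.
    onsets = filter(t -> t + epoch_len - 1 <= nsamples, onsets)
    # One channels × time matrix per epoch.
    return [data[eeg_rows, t:(t + epoch_len - 1)] for t in onsets]
end

data = zeros(3, 20)
data[1, :] .= 1:20
data[2, :] .= 101:120
data[3, [5, 12, 19]] .= 1    # three markers; the last one is too close to the end
epochs = extract_epochs(data, 3, 4)
```

Each element of `epochs` is again a plain channels × time matrix, so anything that accepts a matrix can consume it directly.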


Hello, I have never used EEG.jl. I am collecting code that may be used to build upon it or to make a new package for EEG processing. Cheers.

The nice part of working with named dimensions is that we could even have time × channels × observations and create a super simple method that produces an iterator over each observation of channels × time.
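A rough sketch of such an iterator, using a plain 3-D array as a stand-in for a named-dimension array. The helper name `each_observation` is hypothetical; with a named-dimension package the positional `dims=3` would presumably be replaced by a dimension name.

```julia
# Plain-array stand-in for a named-dimension array: time × channels × observations.
ntime, nchan, nobs = 5, 2, 3
x = reshape(collect(1.0:(ntime * nchan * nobs)), ntime, nchan, nobs)

# Hypothetical helper: lazily yield one channels × time matrix per observation.
each_observation(x::AbstractArray{<:Any,3}) =
    (permutedims(obs) for obs in eachslice(x; dims=3))

obs = collect(each_observation(x))
```

Here `eachslice` is from Base, so the sketch has no package dependencies.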

I am super into it

So you say one IO package and many for analysis?
Would you say one for pre-processing (e.g. PCA, rereferencing), another for spatial stuff (e.g. EEG topography), and so on?

What @Marco-Congedo says makes sense to me, at first glance.

So you say one IO package and many for analysis?

It wouldn’t necessarily need to be one IO package. We have FileIO.jl already for a generic IO interface. As long as the loaded type has a generic interface, we can generically build algorithms around it. Once that’s in, it’s pretty straightforward to do stuff like PCA.

Correction: I shouldn’t have said build algorithms around it. We just need to have a method that does something like get_channels_by_time_matrix(x) and pass the result to already built stuff.
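As a sketch of that idea: the container type `MyRecording` below is hypothetical, `get_channels_by_time_matrix` is the accessor name suggested above, and the PCA is a bare-bones SVD version written with the standard library only, standing in for "already built stuff".

```julia
using LinearAlgebra, Statistics

# Hypothetical container that happens to store samples as time × channels.
struct MyRecording
    samples::Matrix{Float64}   # time × channels
end

# The one accessor generic code would rely on.
get_channels_by_time_matrix(r::MyRecording) = permutedims(r.samples)

# Generic PCA over any channels × time matrix, via SVD of the centered data.
function pca_components(x::AbstractMatrix, k::Integer)
    centered = x .- mean(x; dims=2)     # remove each channel's mean
    F = svd(centered)
    return F.U[:, 1:k]                  # channels × k spatial components
end

rec = MyRecording(randn(1000, 4))
components = pca_components(get_channels_by_time_matrix(rec), 2)
```

`pca_components` never sees `MyRecording`; it only needs a matrix, which is the point of the accessor.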

Having a common interface for (de)serializing LPCM sample data from arbitrary domain-specific formats (living in arbitrary storage layers - local filesystems, S3, etc.) to a common matrix representation (that can be backed by any <:AbstractMatrix) is exactly the motivation behind Onda.jl :slight_smile: The package currently implements a TimeSpan type and an Annotation type that wraps it; both can be used to index sample data. Onda’s Paths/Serialization API enables users to efficiently request discontiguous data chunks on a per-TimeSpan basis as long as the underlying storage layer/file format supports it (if not, a slower but still correct fallback path is used). In a similar vein, since Onda.Samples types just wrap AbstractMatrix values, you can also, e.g., just use mmap on top of raw LPCM blobs.
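The mmap remark can be illustrated without touching Onda's own API. The following is only a sketch using the `Mmap` standard library; the blob layout (native-endian `Int16`, column-major channels × samples) and the helper `span_to_indices` are assumptions made for the example, not Onda's actual types or functions.

```julia
using Mmap

# Assumed blob layout: native-endian Int16 samples stored as a column-major
# channels × samples matrix written straight to disk.
nchannels, nsamples, sample_rate = 2, 1000, 250   # sample_rate in Hz
path = tempname()
write(path, Int16.(reshape(1:(nchannels * nsamples), nchannels, nsamples)))

# Hypothetical helper: convert a half-open time span [start, stop) in seconds
# to a range of sample indices.
span_to_indices(start_s, stop_s, rate) =
    (floor(Int, start_s * rate) + 1):floor(Int, stop_s * rate)

# Memory-map the blob as a matrix; nothing is read until it is indexed.
samples = open(path) do io
    Mmap.mmap(io, Matrix{Int16}, (nchannels, nsamples))
end

# Only the pages backing this span need to be touched.
chunk = samples[:, span_to_indices(1.0, 2.0, sample_rate)]
```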

We’re also highly interested in @shashi’s recent release of FileTrees.jl, which we believe will compose quite well as a compute framework on top of Onda Datasets.


But doesn’t Onda.jl conform to a specific dataset structure and use its own time series interface? The point of TimeAxes.jl is that you can arbitrarily define which dimension is the time dimension, get the same type of functionality as you get from TimeSeries.jl, and it works with things like FFTW.jl. If we rely on Onda.jl for describing the generic interface, then packages that don’t care about any sort of file reading would have to depend on it.

What I’m proposing is that binding to something like Flux.jl could be completely agnostic to where the data originally came from. For example, we could do something like this…

(c::Flux.Chain)(x::SomeEEGType) = c(x[channels=:,time=:,observation=:])

BTW, I built TimeAxes.jl after people kept mentioning the need for a generic way of dealing with time data in Julia. If it’s absolutely horrible and we need to start from the ground up on it, I’m fine with that as long as it’s still accomplishing the same basic goal.


I don’t work too much with EEG, but I’m happy to help to the extent that it’s a common data structure with MEA, ECoG, and VSD (and I don’t see why it shouldn’t be – they’re all just MultiChannelTimeSeries with some specialized info tacked on)

To be honest, I’m not super familiar with the Onda format. I think the utility of the Onda format and a common interface are orthogonal issues. To be clear, nothing I’ve said so far should be interpreted as being against the Onda format. What I’m proposing is something that is implemented entirely independently of the file format or dataset organization, but that IO routines can rely on to make imported data compatible with the larger Julia ecosystem.


I would say this is an important issue. Maybe the package needs to be named differently to include all extracellular recordings, in this sense:

This would, at least:

  1. Invite more people.
  2. Offer an integrated environment to work in. For instance, you may want to analyse LFP-ECoG cross-correlation, etc.

Oh sure! I was more replying to your first post in agreement with "EEG data is conceptually just channels x time x epochs" and in hopes of reinforcing multimodality (I interpreted previous uses of “common interface” as still being within the confines of the many different EEG formats. Though I do see that Onda is more generic, and of course your packages are also more generic, but the conversation still felt like “use these generic tools to implement an EEG.jl package”). Basically I agree with what @VMHidalgo said.

I also didn’t mean to imply anything about Onda, as I also don’t know anything about it. Whatever it may be, in general I support the Julian approach of implementing an AbstractX.jl that creates a common interface, completely agnostic to implementation. It seems like Onda is a specific implementation of a data structure and associated methods to handle large timeseries signal data well, in which case I would like to have something like a hypothetical AbstractTimeAxes on top of it, so that someone else with a different approach could implement it in a different way and my code could simply not care. But I could be completely wrong about that.
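A minimal sketch of what such an "AbstractX" layer might look like. Every name here (`AbstractMultiChannelSeries`, `channels_by_time`, `sampling_rate`, and the two concrete types) is hypothetical and only meant to show how downstream code can ignore the storage details.

```julia
# Hypothetical interface: a subtype only has to say how to produce a
# channels × time matrix and what its sampling rate is.
abstract type AbstractMultiChannelSeries end

channels_by_time(x::AbstractMultiChannelSeries) =
    error("channels_by_time not implemented for $(typeof(x))")
sampling_rate(x::AbstractMultiChannelSeries) =
    error("sampling_rate not implemented for $(typeof(x))")

# Generic code written only against the interface.
duration_seconds(x::AbstractMultiChannelSeries) =
    size(channels_by_time(x), 2) / sampling_rate(x)

# Two independent implementations with different internal layouts.
struct InMemoryEEG <: AbstractMultiChannelSeries
    data::Matrix{Float64}      # channels × time
    rate::Float64
end
channels_by_time(x::InMemoryEEG) = x.data
sampling_rate(x::InMemoryEEG) = x.rate

struct TimeMajorECoG <: AbstractMultiChannelSeries
    data::Matrix{Float64}      # time × channels
    rate::Float64
end
channels_by_time(x::TimeMajorECoG) = transpose(x.data)   # lazy, no copy
sampling_rate(x::TimeMajorECoG) = x.rate

eeg = InMemoryEEG(zeros(8, 500), 250.0)
ecog = TimeMajorECoG(zeros(2000, 16), 1000.0)
```

`duration_seconds` works on both types without knowing which one stores time along rows, which is the "my code could simply not care" property.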

While we’re on the topic of different modalities, I’d love to have something like this for ECG and/or PPG as well. Working with high-frequency, irregular-length waveform data in Flux requires quite a bit of massaging right now, so a common interface for slicing/batching/padding/truncating/etc. would be very much welcome.

When you say “irregular length” do you mean different channels have different lengths?

Different records. I think it’s safe to assume channels in the same record will have the same length for a given modality.
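For records of different lengths, the padding/truncating step could look roughly like the sketch below. The helper names and the channels × time × batch output layout are assumptions for illustration; the layout a given Flux layer expects may differ, so a `permutedims` may still be needed afterwards.

```julia
# Hypothetical helper: force a channels × time record to exactly `len` samples,
# truncating long records and right-padding short ones with `padval`.
function pad_or_truncate(record::AbstractMatrix{T}, len::Integer; padval=zero(T)) where {T}
    out = fill(T(padval), size(record, 1), len)
    n = min(len, size(record, 2))
    out[:, 1:n] .= view(record, :, 1:n)
    return out
end

# Stack the equal-length records into a channels × time × batch array.
make_batch(records, len) = cat((pad_or_truncate(r, len) for r in records)...; dims=3)

# Three records with the same channels but 3, 7, and 5 samples.
records = [ones(Float32, 2, 3), ones(Float32, 2, 7), ones(Float32, 2, 5)]
batch = make_batch(records, 5)
```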


So is it really about conveniently being able to do the “slicing/batching/padding/truncating/etc” so you can pipeline it to Flux?

More or less? Having just had a look through Onda, I think something like the Onda data structures without the on-disk format (for those of us who have to work with existing datasets) would be a good sweet spot.

Just saw this and feel very happy that people are interested in working with EEG data in Julia. I have mainly worked with MNE in the past, and while I generally like the idea of relying on Julia’s existing ecosystem, I think that the most common analysis steps should be centralized in a specific package. The scope of such a package would definitely be debatable, but having all the basic tools for working with EEG/MEG/etc. data in one place while honoring existing best practices (e.g. filtering) seems like a good idea. From there on it should be easy to pipe data into any appropriate package for more sophisticated analysis steps.

I would also like to stress the need to not just look at this from a software engineering standpoint, but also consider how a modern analysis package could improve research practices, encourage data sharing, reproducibility and “good science” in general.

Anyway, I find this exciting, and I’m happy to contribute!


Could you please expand on this? Maybe: what would the ideal open/good science Julia environment be, and what have you seen in the MNE environment?

That way we can all be on the same page about what it means.

Sorry, I can see how this might be a bit unclear. I’m actually referring to building infrastructure to work with things like BIDS [1], offering standardized reporting (like e.g. fmriprep [2] does), and other similar things (e.g. automatic generation of method sections, etc.) that have recently been implemented in other packages (not necessarily for EEG) in an attempt to make research more reproducible.

Also, there are a lot of specific conventions and recommendations in EEG literature (and most certainly other fields) that would make a specialized package go beyond “it’s just a special case of biodata”.

I’m not sure MNE does these things particularly well (albeit better than EEGLAB et al.), but I just wanted to point out that to make any EEG/MEG/whatever-related packages attractive to a wider (scientific) audience, one should definitely consider this.


Btw. I’m typing this on my phone so beware the autocorrect! :slight_smile: