Would you sonify your data?

Legend [1] has it that we might never have discovered the black hole at the center of our galaxy if Karl Jansky had not heard static while listening for weirdness in radio waves. The sound of a Geiger counter freaking out, or of a handheld metal detector sliding up and down in pitch as it passes over a hidden metallic object, is etched into our cultural memory. Why, then, don’t we scientists sonify our data?

Visualization remains a barrier to blind scientists. Imagine you get asked to review a paper on a topic that you are a specialist in but have to decline because you can’t see the graphs. It happens [2].

Accessibility is not the whole story. We perceive a much wider range of pitches than of colors. We also process sound at much finer temporal resolution than we do video, which only needs something like 24 images per second to look continuous. (For spatial localization, we are much better off with vision.) I mean, our ears are living signal-processing machines.

There are some really beautiful sonifications of telescope images of galaxies and star systems [3]. But these are more for “science outreach”. To do actual science, you need something that accurately maps data to how we perceive sound, without too many layers of imposed cultural meaning. We need to turn data into sound, not music. Some people do that: they sonify earthquake vibrations [4] and gravitational waves [5].

The problem is, everybody seems to be developing their own tools for sonification. They are domain-specific and, a lot of the time, buggy. What we need is an equivalent of the grammar of graphics: a set of semantics that is agnostic to scientific domain and programming language.

I’ve been working on that, the semantics of sonics, for the last few years. The heart of it is implemented in Julia. (The source code is not open. Not yet? Never?) I can take solutions from OrdinaryDiffEq.jl and plug them right in (the visuals use Makie):
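(For the curious, here’s a rough, generic sketch of the kind of pipeline this involves; it is not my package’s actual API. The idea is to solve the system on a regular time grid, rescale one state variable into the audio range, and write it out much faster than real time so slow oscillations land at audible frequencies.)

```julia
using OrdinaryDiffEq, WAV

# The Lorenz system, as a stand-in for "some solution worth listening to".
function lorenz!(du, u, p, t)
    du[1] = 10.0 * (u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8.0 / 3.0) * u[3]
end

prob = ODEProblem(lorenz!, [1.0, 0.0, 0.0], (0.0, 2000.0))
sol = solve(prob, Tsit5(); saveat = 0.05)   # regular samples for audification

# Rescale one state variable into [-1, 1] and treat it as an audio signal.
x = sol[1, :]
audio = 2 .* (x .- minimum(x)) ./ (maximum(x) - minimum(x)) .- 1

# Playing 0.05-model-second steps at 8000 samples/s compresses time ~400x,
# which pushes the slow lobe-switching oscillations up into the audible range.
wavwrite(audio, "lorenz.wav"; Fs = 8000)
```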

Or I can read seismic data from the 2023 Turkey-Syria earthquake using CSV.jl and plug it in to get this:
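(Again, a hedged sketch of the idea rather than my actual code; the file and column names here are made up. For regularly sampled waveform data like a seismogram, the simplest sonification is audification: keep the waveform and just reinterpret its sampling rate.)

```julia
using CSV, DataFrames, WAV

# Hypothetical file: one column of regularly sampled ground velocity.
df = CSV.read("seismogram.csv", DataFrame)
v = df.velocity                          # column name invented for this sketch

# Audification: rescale to [-1, 1] and reinterpret the sampling rate.
# Seismometers record at roughly 100 Hz; writing the same samples out at
# 44100 Hz speeds playback up ~441x and shifts the rumble into audible range.
audio = Float32.(2 .* (v .- minimum(v)) ./ (maximum(v) - minimum(v)) .- 1)
wavwrite(audio, "earthquake.wav"; Fs = 44100)
```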

Or I can add some annotations, using PythonCall.jl to call a text-to-speech library (this is a bit broken right now), and let you experience how we discovered the COVID-19 vaccine:
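(The text-to-speech part is roughly this shape; pyttsx3 here is just one off-the-shelf Python engine you could call through PythonCall.jl, not necessarily the one I use.)

```julia
using PythonCall

# Import a Python text-to-speech engine (pyttsx3 chosen only as an example).
pyttsx3 = pyimport("pyttsx3")
engine = pyttsx3.init()

# Render a spoken annotation to a file so it can be mixed into the sonification.
engine.save_to_file("Spoken annotation goes here.", "annotation.wav")
engine.runAndWait()
```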

I wanted to ask you this:

  • What are the weird, quirky, niche fields you work in where sonification might have a place?
  • As scientists, what would it take for you to start seriously considering sonification (Edit: other than sonification software being open source)?

[1] Jessica Manning Lovett corroborates the legend in their PhD thesis The Sound Culture of Space Science.

[2] “Accessibility in astronomy for the visually impaired” by Jake Noel-Storr and Michelle Willebrands

[3] https://www.youtube.com/watch?v=NqBfQeJqkfU

[4] https://www.youtube.com/watch?v=q1wg2IbA0oo

[5] https://www.youtube.com/watch?v=gT1VwCTe_90

Open-source code, which is something that goes against your comment that the source is not open.

Playing data through a speaker isn’t all that magical — nor is it uncommon in my experience. In addition to astronomers, neuroscientists have been doing it for many decades to “listen” to neurons.

Just give your data a sampling rate and play it through PortAudio.jl.
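Something like this (a minimal sketch, with random numbers standing in for your data):

```julia
using PortAudio, SampledSignals

fs = 8000                           # declare whatever sampling rate you want
data = randn(2 * fs)                # stand-in for two seconds of your data
data ./= maximum(abs, data)         # keep the signal within [-1, 1]

# Wrapping the vector in a SampleBuf attaches the sampling rate;
# the stream then plays it back through the default output device.
PortAudioStream(0, 2) do stream
    write(stream, SampleBuf(data, fs))
end
```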

That makes sense. For now, I ask everyone to focus on things other than the code being open source (and I’m going to edit my question above to reflect this).

I can point you to at least a few open-source sonification packages. But being open source is not good enough for science if the software:

  • is buggy
  • is undocumented
  • has no test suite
  • has no examples

I love open source stuff. Semantics of Sonics is built on openly available research/knowledge. The Julia implementation is built on open source software, and I would for sure consider making it open source if I were convinced I could make it sustainable.

That said, I do consider the idea of “the semantics of sonics” to be open source. Just not the current implementation in Julia. The catch is, I haven’t written it down in English.

I’m not claiming it is magic. The idea has been out there for at least a hundred years, probably more. Kepler apparently advocated translating the motion of the planets into music for the sake of science outreach.

No, it’s not magic. But I don’t think the folks who sonify NASA’s images (see my original post) just “played data through speakers”. There’s WAY more to it than that if you want to do it right. There’s a whole field of “auditory display”. People write peer-reviewed papers and hold conferences and do PhDs on this stuff. There’s psychoacoustics, which, as it turns out, is more than just another field with a silly name. In visualization, we have to think about how to map data to things like shape and color. We’re learning that there’s a right way to do it, and there’s a wrong way. It’s the same with sonification.

The point of having go-to software is that scientists shouldn’t need to think about sound engineering or psychoacoustics too much. They should be able to sonify data using just a handful of concepts. The point of the semantics of sonics is that it is a grammar that accounts for the physical nature and limitations of sound, and aims to allow the most accurate mapping between data and the characteristics of sound.
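To give a flavor of what I mean by “a handful of concepts” (this is a toy illustration, not the actual grammar): one such concept is a pitch mapping that works in log-frequency, so equal steps in the data come out as roughly equal perceived pitch intervals.

```julia
# Toy pitch mapping: place data values on a log-frequency axis.
function pitch_map(x; fmin = 220.0, fmax = 880.0)
    lo, hi = extrema(x)
    t = (x .- lo) ./ (hi - lo)            # normalize the data to [0, 1]
    return fmin .* (fmax / fmin) .^ t     # interpolate in log-frequency
end

# Render the mapped frequencies as a sequence of short sine tones.
function tone_sequence(freqs; fs = 8000, dur = 0.15)
    n = round(Int, dur * fs)
    t = (0:n-1) ./ fs
    return vcat((sin.(2π * f .* t) for f in freqs)...)
end

audio = tone_sequence(pitch_map(rand(20)))
```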

(PortAudio.jl is pretty neat, by the way.)

I work in professional audio, I use Julia, and I have been involved in some sonification projects. I would be very interested in an advanced Julia sonification package. However, I second the open-source requirement, since if it’s not open source I can’t contribute code… but even so, I’m happy to contribute ideas and support in other ways.

A few months ago I started playing around with a Pluto notebook that I called The Sounds of Statistics. I never finished it (for various reasons), but I thought I’d share here in case someone finds it interesting or useful:

If someone decides to improve upon it, I’d love it if you’d ping me so I can see how it develops.

Thanks for sharing @mthelm85. I think it’s missing a few files? It was fun to play around with though.

It’s great to hear that you are interested. I think the project could benefit from people who know more about audio engineering and programming than I do. I will keep you in mind and may reach out to you in the future.

There’s just one .midi file that it reads from - I uploaded it to the repo just now.

I hate to be this guy, but the answer to this question is “when journals allow it and reviewers understand it.” Or when the terrible inertia that makes peer-reviewed journal articles the only dissemination method that provides any career benefit dissipates.

Alas, that seems unlikely to happen any time soon :sweat:

I think that data sonification may be a useful tool, if not (yet) for publication, then at least for exploratory analysis and for communicating results to a broader public.

I myself have only tried very naive approaches, but I would like to use it more extensively. @Perrin_Meyer, are there Julia packages, other tools, or resources that you would suggest?

That sounds like a real barrier. I think it is possible that a piece of scientific work won’t be taken seriously if it uses sonification instead of visualization. I also agree that reviewers would need to be able to “read” data sonification, which makes it all the more important to have a common language for sonification.

But maybe journals wouldn’t have an issue if sonification could be submitted as supplementary material? I know that in some fields authors also write shorter, less technical summaries of their work for wider dissemination. Maybe sonification could be useful there, though these do not typically carry career benefits.

This question was not addressed to me, but I can tell you what I know.

I haven’t come across any Julia package for sonification. Python has sonipy. There’s the Highcharts Sonification Studio, and there’s erie-web; both are implemented in JavaScript. These are probably the most advanced. I think R has something too.

I’ve found these to be buggy. You can’t expect to use them with datasets of something like 10 million observations. So far as I can tell they don’t support “audification”, which is a very useful kind of sonification you can do with large, regularly sampled data (that’s what @mbauman was referring to). They don’t always have a notion of shared reference frames built in, so it’s hard to make sonics from two different datasets and have them be comparable. I also don’t think they are designed with psychoacoustics in mind, so they don’t maintain perceptual uniformity.
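To make the “shared reference frame” point concrete, here is a toy sketch (not any existing package’s API): normalize both datasets against the same extrema before mapping to pitch, so the same value always sounds the same in both sonifications.

```julia
# Toy shared reference frame: both series are scaled by the same extrema,
# so equal data values map to equal frequencies across the two sonifications.
function shared_pitch_map(a, b; fmin = 220.0, fmax = 880.0)
    lo = min(minimum(a), minimum(b))
    hi = max(maximum(a), maximum(b))
    scale(x) = fmin .* (fmax / fmin) .^ ((x .- lo) ./ (hi - lo))
    return scale(a), scale(b)
end
```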