Would you sonify your data?

Legend [1] has it that we might never have discovered the black hole at the center of our galaxy if Karl Jansky had not heard static while listening for weirdness in radio waves. The sound of a Geiger counter freaking out, or of a handheld metal detector sliding up and down in pitch as it passes over a hidden metallic object, is etched into our cultural memory. Why, then, don’t we scientists sonify our data?

Visualization remains a barrier to blind scientists. Imagine you get asked to review a paper on a topic that you are a specialist in but have to decline because you can’t see the graphs. It happens [2].

Accessibility is not the whole story. We perceive a much wider range of pitches than of colors. We also process sound at much finer temporal resolution than we do video, which only needs something like 24 images per second to look continuous. (For spatial localization, we are much better off with vision.) I mean, our ears are living signal-processing machines.

There are some really beautiful sonifications of telescope images of galaxies and star systems [3]. But these are more for “science outreach”. To do actual science, you need something that accurately maps data to how we perceive sound, without too many layers of imposed cultural meaning. We need to turn data into sound, not music. Some people do that: they sonify earthquake vibrations [4] and gravitational waves [5].

The problem is, everybody seems to be developing their own tools for sonification. They are domain-specific and, a lot of the time, buggy. What we need is an equivalent of the grammar of graphics: a set of semantics that is agnostic to scientific domain and programming language.

I’ve been working on that, the semantics of sonics, for the last few years. The heart of it is implemented in Julia. (The source code is not open. Not yet? Never?) I can take solutions from OrdinaryDiffEq.jl and plug them right in (the visuals use Makie):
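(For the curious, here’s a rough, generic sketch of the kind of pipeline this involves; it is not my package’s actual API. The idea is to solve the system on a regular time grid, rescale one state variable into the audio range, and write it out much faster than real time so slow oscillations land at audible frequencies.)

```julia
using OrdinaryDiffEq, WAV

# The Lorenz system, as a stand-in for "some solution worth listening to".
function lorenz!(du, u, p, t)
    du[1] = 10.0 * (u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8.0 / 3.0) * u[3]
end

prob = ODEProblem(lorenz!, [1.0, 0.0, 0.0], (0.0, 2000.0))
sol = solve(prob, Tsit5(); saveat = 0.05)   # regular samples for audification

# Rescale one state variable into [-1, 1] and treat it as an audio signal.
x = sol[1, :]
audio = 2 .* (x .- minimum(x)) ./ (maximum(x) - minimum(x)) .- 1

# Playing 0.05-model-second steps at 8000 samples/s compresses time ~400x,
# which pushes the slow lobe-switching oscillations up into the audible range.
wavwrite(audio, "lorenz.wav"; Fs = 8000)
```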

Or I can read seismic data from the 2023 Turkey-Syria earthquake using CSV.jl and plug it in to get this:
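(Again, a hedged sketch of the idea rather than my actual code; the file and column names here are made up. For regularly sampled waveform data like a seismogram, the simplest sonification is audification: keep the waveform and just reinterpret its sampling rate.)

```julia
using CSV, DataFrames, WAV

# Hypothetical file: one column of regularly sampled ground velocity.
df = CSV.read("seismogram.csv", DataFrame)
v = df.velocity                          # column name invented for this sketch

# Audification: rescale to [-1, 1] and reinterpret the sampling rate.
# Seismometers record at roughly 100 Hz; writing the same samples out at
# 44100 Hz speeds playback up ~441x and shifts the rumble into audible range.
audio = Float32.(2 .* (v .- minimum(v)) ./ (maximum(v) - minimum(v)) .- 1)
wavwrite(audio, "earthquake.wav"; Fs = 44100)
```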

Or I can add some annotations, using PythonCall.jl to call a text-to-speech library (this is a bit broken right now), and let you experience how we discovered the COVID-19 vaccine:
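(The text-to-speech part is roughly this shape; pyttsx3 here is just one off-the-shelf Python engine you could call through PythonCall.jl, not necessarily the one I use.)

```julia
using PythonCall

# Import a Python text-to-speech engine (pyttsx3 chosen only as an example).
pyttsx3 = pyimport("pyttsx3")
engine = pyttsx3.init()

# Render a spoken annotation to a file so it can be mixed into the sonification.
engine.save_to_file("Spoken annotation goes here.", "annotation.wav")
engine.runAndWait()
```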

I wanted to ask you this:

  • What are the weird, quirky, niche fields you work in where sonification might have a place?
  • As scientists, what would it take for you to start seriously considering sonification (Edit: other than sonification software being open source)?

[1] Jessica Manning Lovett corroborates the legend in their PhD thesis The Sound Culture of Space Science.

[2] “Accessibility in astronomy for the visually impaired” by Jake Noel-Storr and Michelle Willebrands

[3] https://www.youtube.com/watch?v=NqBfQeJqkfU

[4] https://www.youtube.com/watch?v=q1wg2IbA0oo

[5] https://www.youtube.com/watch?v=gT1VwCTe_90

Open-source code, which is something that goes against your comment that the source is not open.

Playing data through a speaker isn’t all that magical — nor is it uncommon in my experience. In addition to astronomers, neuroscientists have been doing it for many decades to “listen” to neurons.

Just give your data a sampling rate and play it through PortAudio.jl.
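Something like this (a minimal sketch, with random numbers standing in for your data):

```julia
using PortAudio, SampledSignals

fs = 8000                           # declare whatever sampling rate you want
data = randn(2 * fs)                # stand-in for two seconds of your data
data ./= maximum(abs, data)         # keep the signal within [-1, 1]

# Wrapping the vector in a SampleBuf attaches the sampling rate;
# the stream then plays it back through the default output device.
PortAudioStream(0, 2) do stream
    write(stream, SampleBuf(data, fs))
end
```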

That makes sense. For now, I ask everyone to focus on things other than the code being open source (and I’m going to edit my question above to reflect this).

I can point you to at least a few open-source sonification packages. But being open source is not good enough for science if the software:

  • is buggy
  • is undocumented
  • has no test suite
  • has no examples

I love open source stuff. Semantics of Sonics is built on openly available research/knowledge. The Julia implementation is built on open source software, and I would for sure consider making it open source if I were convinced I could make it sustainable.

That said, I do consider the idea of “the semantics of sonics” to be open source. Just not the current implementation in Julia. The catch is, I haven’t written it down in English.

I’m not claiming it is magic. The idea has been out there for at least a hundred years, probably more. Kepler apparently advocated translating the motion of the planets into music for the sake of science outreach.

No, it’s not magic. But I don’t think the folks who sonify NASA’s images (see my original post) just “played data through speakers”. There’s WAY more to it than that if you want to do it right. There’s a whole field of “auditory display”. People write peer-reviewed papers and hold conferences and do PhDs on this stuff. There’s psychoacoustics, which, as it turns out, is more than just another field with a silly name. In visualization, we have to think about how to map data to things like shape and color. We’re learning that there’s a right way to do it, and there’s a wrong way. It’s the same with sonification.

The point of having go-to software is that scientists shouldn’t need to think about sound engineering or psychoacoustics too much. They should be able to sonify data using just a handful of concepts. The point of the semantics of sonics is that it is a grammar that accounts for the physical nature and limitations of sound, and aims to allow the most accurate mapping between data and the characteristics of sound.
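To give a flavor of what I mean by “a handful of concepts” (this is a toy illustration, not the actual grammar): one such concept is a pitch mapping that works in log-frequency, so equal steps in the data come out as roughly equal perceived pitch intervals.

```julia
# Toy pitch mapping: place data values on a log-frequency axis.
function pitch_map(x; fmin = 220.0, fmax = 880.0)
    lo, hi = extrema(x)
    t = (x .- lo) ./ (hi - lo)            # normalize the data to [0, 1]
    return fmin .* (fmax / fmin) .^ t     # interpolate in log-frequency
end

# Render the mapped frequencies as a sequence of short sine tones.
function tone_sequence(freqs; fs = 8000, dur = 0.15)
    n = round(Int, dur * fs)
    t = (0:n-1) ./ fs
    return vcat((sin.(2π * f .* t) for f in freqs)...)
end

audio = tone_sequence(pitch_map(rand(20)))
```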

(PortAudio.jl is pretty neat, by the way.)

I work in professional audio, I use Julia, and I have been involved in some sonification projects. I would be very interested in an advanced Julia sonification package. However, I second the open-source requirement, since if it’s not open source I can’t contribute code… but even so, I’m happy to contribute ideas and support in other ways.

A few months ago I started playing around with a Pluto notebook that I called The Sounds of Statistics. I never finished it (for various reasons), but I thought I’d share here in case someone finds it interesting or useful:

If someone decides to improve upon it, I’d love it if you’d ping me so I can see how it develops.

Thanks for sharing @mthelm85. I think it’s missing a few files? It was fun to play around with though.

It’s great to hear that you are interested. I think the project could benefit from people who know more about audio engineering and programming than I do. I will keep you in mind and may reach out to you in the future.

There’s just one .midi file that it reads from - I uploaded it to the repo just now.

I hate to be this guy, but the answer to this question is “when journals allow it and reviewers understand it.” Or when the terrible inertia that makes peer-reviewed journal articles the only dissemination method that provides any career benefit dissipates.

Alas, that seems unlikely to happen any time soon :sweat:

I think that data sonification may be a useful tool, if not (yet) for publication, then at least for exploratory analysis and for communicating results to a broader public.

I myself have only tried very naive approaches, but I would like to use it more extensively. @Perrin_Meyer, are there Julia packages, other tools, or resources that you would suggest?

That sounds like a real barrier. I think it is possible that a piece of scientific work won’t be taken seriously if it uses sonification instead of visualization. I also agree that reviewers would need to be able to “read” data sonification, which makes it all the more important to have a common language for sonification.

But maybe journals wouldn’t have an issue if sonification could be submitted as supplementary material? I know that in some fields authors also write shorter, less technical summaries of their work for wider dissemination. Maybe sonification could be useful there, though these do not typically carry career benefits.

This question was not addressed to me, but I can tell you what I know.

I haven’t come across any Julia package for sonification. Python has sonipy. There’s the Highcharts Sonification Studio, and there’s erie-web; both are implemented in JavaScript. These are probably the most advanced. I think R has something too.

I’ve found these to be buggy. You can’t expect to use them with datasets of something like 10 million observations. So far as I can tell they don’t support “audification”, which is a very useful kind of sonification you can do with large, regularly sampled data (that’s what @mbauman was referring to). They don’t always have a notion of shared reference frames built in, so it’s hard to make sonics from two different datasets and have them be comparable. I also don’t think they are designed with psychoacoustics in mind, so they don’t maintain perceptual uniformity.
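To make the “shared reference frame” point concrete, here is a toy sketch (not any existing package’s API): normalize both datasets against the same extrema before mapping to pitch, so the same value always sounds the same in both sonifications.

```julia
# Toy shared reference frame: both series are scaled by the same extrema,
# so equal data values map to equal frequencies across the two sonifications.
function shared_pitch_map(a, b; fmin = 220.0, fmax = 880.0)
    lo = min(minimum(a), minimum(b))
    hi = max(maximum(a), maximum(b))
    scale(x) = fmin .* (fmax / fmin) .^ ((x .- lo) ./ (hi - lo))
    return scale(a), scale(b)
end
```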