[ANN] ReadWorkWrite.jl - a pipeline for CPU/memory limited analysis

A Julia package for efficient parallel processing pipelines that separates IO-bound operations from CPU-intensive work.

Overview

ReadWorkWrite.jl implements a pattern where:

  • Read: Single-threaded IO (loading files from disk)
  • Work: Multi-threaded CPU-intensive processing (e.g., MCMC sampling, data analysis)
  • Write: Single-threaded IO (writing to databases, files)

This design keeps IO out of the multi-threaded code (avoiding thread-safety problems with IO), limits how much data is held in memory at once, and leaves the threads free for the computational work.
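To make the three stages concrete, here is a minimal sketch of the pattern using only Base Julia. It is not the package's implementation or API: the function name `pipeline_sketch`, its arguments, and the `buffer` keyword are all hypothetical, and error handling is deliberately minimal.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical helper, NOT the ReadWorkWrite.jl API: a Base-only sketch of the pattern.
function pipeline_sketch(readfn, workfn, writefn, inputs; buffer = 2 * nthreads())
    raw  = Channel{Any}(buffer)   # read -> work; bounded, so the reader cannot run far ahead
    done = Channel{Any}(buffer)   # work -> write

    # Read: a single task does all the loading.
    reader = @spawn try
        for x in inputs
            put!(raw, readfn(x))  # blocks while `raw` is full
        end
    finally
        close(raw)                # lets the workers' loops terminate
    end

    # Work: one task per thread pulls items from `raw` and processes them.
    workers = map(1:nthreads()) do _
        @spawn for item in raw
            put!(done, workfn(item))
        end
    end

    # Close `done` once every worker has finished, so the write loop can end.
    closer = @spawn try
        foreach(wait, workers)
    finally
        close(done)
    end

    # Write: the calling task does all the writing, one result at a time.
    for result in done
        writefn(result)
    end
    wait(reader)
    wait(closer)
    return nothing
end

# Usage: "read" is the identity, "work" squares, "write" pushes to a vector.
results = Int[]
pipeline_sketch(identity, x -> x^2, r -> push!(results, r), 1:20)
```

Results arrive in whatever order the workers finish, so `results` is not guaranteed to follow the input order. Start Julia with more than one thread (for example `julia -t auto`) for the work stage to actually run in parallel.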

Motivation

I am a neuroscientist. In my work, we often need to process data in batches and these processes are CPU bound, not IO bound. For example, we might need to do model comparison on many thousands of neurons.

Other packages, like Folds.jl or ThreadsX.jl, are convenient for multi-core or multi-threaded map-like functions. But they process an entire iterator together, so if you have thousands of elements the process may be more memory intensive than necessary.

This package, ReadWorkWrite.jl, takes advantage of Base Channels and Threads to read in data only as fast as the workers can handle it.
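The "only as fast as the workers can handle" behaviour comes from how a bounded Channel works in Base: `put!` blocks once the buffer is full. The snippet below is a small illustration of that mechanism with a hypothetical buffer size of 2; it shows the Base behaviour only, not how the package configures its channels.

```julia
ch = Channel{Int}(2)          # buffer holds at most 2 items
produced = Ref(0)             # last item the producer managed to hand over

producer = @async begin
    for i in 1:10
        put!(ch, i)           # blocks while the buffer is full
        produced[] = i
    end
    close(ch)
end

sleep(0.1)                    # give the producer time to fill the buffer and block
produced_before = produced[]  # nothing has been taken yet, so the producer is stuck on its third put!

consumed = collect(ch)        # draining the channel lets the producer finish
wait(producer)
```

With nothing consuming, the producer stops after filling the two buffer slots, so at most a buffer's worth of loaded data is waiting in memory at any time.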

small print: This is the first package I have submitted to the registry, and getting the testing and CI working was a fun learning exercise for me.
