Julia's Broadcast vs Jax's vmap

I think this line of questioning is getting a bit off topic, since the issues the OP brought up around vmap/Julia broadcasting don't relate to memory pressure.

But in any case, what you’re looking for can be done in JAX. Something like this should do the trick:

import jax.numpy as jnp
from jax import vmap

really_big_dataset = ...

aggregation = 0
for chunk in jnp.split(really_big_dataset, 50):
  # vmap your_func over just this chunk, so only one chunk's worth
  # of intermediate results is materialized at a time.
  chunk_result = vmap(your_func)(chunk)
  aggregation += some_kind_of_aggregation(chunk_result)

And you can swap that Python for loop out for a jax.lax.fori_loop for some extra speed if you'd like! This pattern is actually used in some of the JAX example code for training loops, etc. Or you could use jax.pmap to run vmap-batched computations in parallel across multiple devices and then aggregate them together. The possibilities are endless!
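To make the fori_loop variant concrete, here's a minimal runnable sketch. The function `your_func` and the dataset shape are hypothetical stand-ins (the original post elides them); I stack the chunks so the loop body can index into them dynamically:

```python
import jax
import jax.numpy as jnp

# Hypothetical per-example function; substitute your own.
def your_func(x):
    return jnp.sum(x ** 2)

# Hypothetical dataset: 100 examples of length 8.
really_big_dataset = jnp.arange(100 * 8, dtype=jnp.float32).reshape(100, 8)

# Split into 50 chunks and stack them, giving shape (50, 2, 8),
# so the loop body can index chunk i with a traced index.
chunks = jnp.stack(jnp.split(really_big_dataset, 50))

def body(i, aggregation):
    # vmap your_func over the examples in chunk i, then aggregate
    # (a plain sum here, standing in for some_kind_of_aggregation).
    chunk_result = jax.vmap(your_func)(chunks[i])
    return aggregation + jnp.sum(chunk_result)

total = jax.lax.fori_loop(0, chunks.shape[0], body, 0.0)
```

Unlike the Python for loop, fori_loop traces the body once rather than unrolling it 50 times, which keeps compile times down when the whole thing is wrapped in jit.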
