RayTracer.jl - a differentiable ray tracer

I am really enthused by @ChrisRackauckas's tweets about ScientificAI, the new paper and JuliaCon.
I guess the answer is ‘go to JuliaCon’ - which sadly I cannot.
@avikpal's paper on RayTracer.jl caught my eye. Please could some kind soul explain how differentiation comes into use in a ray tracer?

I am not completely ignorant - I spent a few years working in animation/post production so I know what a ray tracer is. If anyone has ever seen the BBC series Walking with Dinosaurs, I had a small part in that.


In the paper at https://arxiv.org/pdf/1907.07587.pdf there is a very simple use of this where the derivative of the RMS difference between rendered image and reference image is taken, with respect to a light position.

Knowing this derivative you can use an optimization algorithm to infer the most plausible light location by adjusting the light position.
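
To make that concrete, here is a minimal sketch in plain Python. It is not RayTracer.jl and not the paper's code: the scene (a flat floor lit by one point light), the names `GRID`, `render`, `loss`, `loss_grad` and `fit_light`, and the step-size rule are all assumptions for illustration. It uses mean squared error rather than RMS, and the gradient is derived by hand where a differentiable ray tracer would get it from AD.

```python
# Hypothetical toy scene: an 8x8 grid of floor points on z = 0, normal (0, 0, 1).
GRID = [(-1 + 2 * i / 7, -1 + 2 * j / 7) for i in range(8) for j in range(8)]


def render(light):
    """Lambertian intensity at each floor point: cos(theta) / r^2 = lz / r^3."""
    lx, ly, lz = light
    image = []
    for x, y in GRID:
        r2 = (lx - x) ** 2 + (ly - y) ** 2 + lz ** 2
        image.append(lz / r2 ** 1.5)
    return image


def loss(light, reference):
    """Mean squared difference between rendered and reference image."""
    image = render(light)
    return sum((a - b) ** 2 for a, b in zip(image, reference)) / len(reference)


def loss_grad(light, reference):
    """Hand-derived gradient of the loss w.r.t. the light position (stand-in for AD)."""
    lx, ly, lz = light
    image = render(light)
    g = [0.0, 0.0, 0.0]
    for (x, y), a, b in zip(GRID, image, reference):
        r2 = (lx - x) ** 2 + (ly - y) ** 2 + lz ** 2
        r5 = r2 ** 2.5
        d_image = (
            -3 * lz * (lx - x) / r5,           # d intensity / d lx
            -3 * lz * (ly - y) / r5,           # d intensity / d ly
            1 / r2 ** 1.5 - 3 * lz ** 2 / r5,  # d intensity / d lz
        )
        for k in range(3):
            g[k] += 2 * (a - b) * d_image[k] / len(reference)
    return g


def fit_light(reference, start, iterations=500):
    """Gradient descent; the step grows after a success and shrinks after a failure."""
    light, step = list(start), 1.0
    current = loss(light, reference)
    for _ in range(iterations):
        g = loss_grad(light, reference)
        trial = [p - step * gk for p, gk in zip(light, g)]
        # Keep the light above the floor and only accept steps that reduce the loss.
        if trial[2] > 0 and loss(trial, reference) < current:
            light, current = trial, loss(trial, reference)
            step *= 1.5
        else:
            step *= 0.5
    return light, current


if __name__ == "__main__":
    reference = render((0.3, -0.2, 1.5))  # "observed" image from a hidden light
    start = (-0.4, 0.4, 2.0)              # initial guess
    fitted, final_loss = fit_light(reference, start)
    print("start loss:", loss(start, reference))
    print("final loss:", final_loss)
    print("fitted light:", fitted)
```

The only part that needs the renderer to be differentiable is `loss_grad`; everything else is an ordinary optimization loop.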

It would be interesting to use the same thing for inference of geometry (a classic task in computer vision). Spatial antialiasing would be required as part of the loss function so that the derivatives are nonzero.
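
A small sketch of why the antialiasing matters, again in plain Python with made-up names (`render_hard`, `render_antialiased`, `d_loss`) and a made-up 1D scene: a single edge at position `edge` across a row of 8 unit-wide pixels. With one sample per pixel centre, nudging the edge changes nothing unless it crosses a sample, so the derivative of the loss is zero almost everywhere. With box-filter coverage the pixel under the edge varies continuously and the derivative carries information.

```python
def render_hard(edge, n_pixels=8):
    """One sample at each pixel centre: 1.0 if left of the edge, else 0.0."""
    return [1.0 if i + 0.5 < edge else 0.0 for i in range(n_pixels)]


def render_antialiased(edge, n_pixels=8):
    """Box-filter coverage: fraction of pixel [i, i+1] lying left of the edge."""
    return [min(1.0, max(0.0, edge - i)) for i in range(n_pixels)]


def loss(image, reference):
    """Sum of squared pixel differences."""
    return sum((a - b) ** 2 for a, b in zip(image, reference))


def d_loss(render, edge, reference, h=1e-4):
    """Central finite difference of the loss w.r.t. the edge position."""
    up = loss(render(edge + h), reference)
    down = loss(render(edge - h), reference)
    return (up - down) / (2 * h)


if __name__ == "__main__":
    reference = render_antialiased(5.3)  # "observed" image, true edge at 5.3
    guess = 3.2
    print(d_loss(render_hard, guess, reference))         # 0.0: no signal
    print(d_loss(render_antialiased, guess, reference))  # about -1.6: move edge right
```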


Thank you, Chris. Going off topic, as I normally do, I remember a DARPA-funded project which concerned seeing around corners. Not such a stupid thing if you are perhaps a soldier in urban combat. The concept was to use shadows which leak around the corner. I guess Julia can now do this!


I imagine it would take quite some work to build such a system… but yes that’s an interesting example of the kind of problem where having a differentiable physical model of light transport might be a good idea.

Then again, for production deployment it might be better to have a lightweight approximation of the inverse problem as a whole because it can be very fast. E.g., use a feed-forward neural network trained to do the inverse problem as a whole, rather than doing optimization at inference time using the physical model. The best approach to take is unclear to me at this stage but it’s very interesting.

The way I see it, image acquisition and processing may be part of a real-time control loop, so you need to be able to treat that like the rest of the control system. We usually are trying to solve scientific problems like: given that we see data X, fit a neural ODE to uncover the dynamics, or fit a neural ODE mixed with a real ODE to find a nonlinear optimal control strategy… but this assumes we have data X.

Let’s take it one step further back. Let’s say we have an autonomous vehicle with a few cameras and some laser sensors, and we want to have this follow a target. You have the image processing to pinpoint the (x,y,z) location of the target, and then the control loop to the target, and you want to optimize your strategy to get there, but the generation of (x,y,z) itself is a problem of rendering a 3D image from the cameras and lasers. So that image generation needs to be a first-class differentiable portion of the training loop, so you can handle this thing as a whole by backpropagating through the dynamical system control problem (neural ODE) and the image processing / raytracing portion, to go all the way back from prediction to input data.


Good point; in some cases the training itself benefits a lot from gradients taken through a physical model or models.

Optimal control is a great example. Are there other standard examples?

Standard? I don’t think we can say things are standard here yet. But I can think of some off the top of my head. What about automatically finding out the dynamical system for some object through a neural ODE with observations through some camera?

Hehe sorry; not exactly standard. I’ve just been thinking about differentiable programming a lot lately. And also following the great stuff you guys have been doing with great interest.

Whoa! Inferring a dynamical system from a video would indeed be an amazing demo of Julia’s strengths that are unique to the language. Is there such a demo somewhere?

That example would be even more impressive than the trebuchet one (as awesome as that one is)

This is the kind of thing we’re trying to do. Stay tuned.


This sounds really interesting, indeed!
I have a very short and slightly off-topic question: what are the differences between scientificAI and hybrid modelling?

Not wishing to talk over @Modatu

A long time ago I worked briefly in radiotherapy planning. This is an inverse problem - you know the answer. Maximum dose to the tumour, minimum to healthy tissues. The patient is treated with many beams from the therapy machine, so the weights of these can be optimised.
If anyone is interested:

With differentiable programming you would now be able to do real physics simulations for each beam.

Combining the differentiable ray tracer with differential geometric algebra would be an interesting project. At JuliaCon I will be presenting Grassmann.jl, which is an implementation of conformal geometric algebra combined with differential geometry.


The PDF discusses my mathematical foundations. Have a look at ganja.js to see what a ray tracer with conformal geometric algebra looks like. Grassmann.jl could be used as a foundation for that.

With Grassmann.jl differentiable programming will enter a whole new realm of possibilities using conformal geometry and differential geometry.


What you people call artificial intelligence… I just call it Grassmann-Leibniz algebra, since that’s what differentiable programming foundationally is.


But the target position here is not controllable I assume, so IIUC there is no loss of information by treating the tracing function as a constant function wrt the control parameters, i.e. there is no merit to making it differentiable. Am I misunderstanding something?

You’re assuming that the “data reading” is something that is clear and precise. Say you’re doing it with a neural network. Then you can update your object detection neural net at the same time as you are using it for control.

It is unclear to me what you mean by data reading with a neural net. Object detection seems like a separate module with separate decisions (NN weights) and separate objective, e.g. loss function for a training dataset. I don’t see why you would need to mix them together. When an object is detected, we can tell the robot to follow it. Again I may be missing the point here.

Edit: controlling the camera/laser positions to “detect” as many objects as possible might be an interesting application.

You are right in assuming that the target position is not controllable. But the scene that is visible to the agent (in this case the autonomous vehicle) is a function of the control variables, because the position (and configuration) of the agent (or the camera) at any time is determined by applying the control parameters to the previous configuration. Hence, treating it as a constant will lead to loss of information.
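
In symbols, as a rough sketch: all the notation here is assumed for illustration and is not taken from the paper. Let $\theta$ be the control parameters, $s_t$ the agent/camera configuration, $\pi_\theta$ the controller, $f$ the dynamics, $R$ the ray tracer and $\ell$ a per-frame loss.

```latex
\begin{aligned}
u_t &= \pi_\theta(I_t), \qquad s_{t+1} = f(s_t, u_t), \qquad I_t = R(s_t), \\
L(\theta) &= \sum_t \ell(I_t), \\
\frac{dL}{d\theta} &= \sum_t
  \frac{\partial \ell}{\partial I_t}\,
  \frac{\partial R}{\partial s_t}\,
  \frac{d s_t}{d\theta}.
\end{aligned}
```

The middle factor $\partial R / \partial s_t$ is the Jacobian of the rendered image with respect to the camera configuration. Treating the ray tracer as a constant amounts to setting that factor to zero, which removes every term in the sum.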

There’s a philosophy which says we should mix all the things together to do end-to-end learning of the control system as a whole. It’s a useful and interesting idea if the “object detection” step turns out to be poorly engineered for the downstream control algorithm. For example, imagine there are features in the input space which determine the target’s velocity as well as position, but the object detector was only designed to compute position. An end-to-end system has a chance to pick up on this extra information.

Another example - suppose your vehicle should follow targets which are cats and dogs of various sizes. The animal type and shape are very relevant to the control algorithm, but a generic object detector might not be built with that in mind.

I get the cases where the output of a ray tracer is input to the control NN or is used to define a loss function, but that doesn’t require differentiating through the ray tracer IIUC. AD through the ray tracer would be required if the output of the control NN is input to the ray tracer and the output of the ray tracer is part of a loss function.

I think @avikpal's comment implies that the inputs to the ray tracer are functions of the control signals (output of control NN). The output of the ray tracer is then used to define the loss function. This would require AD through the ray tracer. If you guys know of a good paper or reference I can read, I would appreciate it. Thanks for your explanation.