Hi all,
I am trying to replicate a Likelihood Approximation Network (LAN) with Flux.jl. LANs are used to learn the likelihood function of computational models whose likelihood is intractable. As a proof of concept, I am applying the method to two simple models for which the likelihood function is known: a Gaussian model and a decision model called the Linear Ballistic Accumulator (LBA). I was able to develop a LAN for the Gaussian model, but the LAN for the LBA produces NaNs as predictions. I tried various solutions from other threads, such as decreasing the learning rate and using BatchNorm, but those recommendations did not solve the problem. Changing the activation function to relu eliminated the NaNs, but interfered with the ability of the network to learn the likelihood function.
Can I do anything to fix this problem? Please let me know if there are more details I can provide.
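
One possible source of NaNs worth ruling out is non-finite training targets: if the network is trained on log-likelihoods and any density evaluates to zero, the target becomes -Inf, which can turn the loss and gradients into NaN. The sketch below is only an illustration under that assumption. The helper names `count_nonfinite` and `floor_targets`, the toy `densities` vector, and the floor of -20.0 are all hypothetical and not taken from my actual code; it uses only Base Julia, no Flux calls.

```julia
# Hypothetical helper: count targets that are NaN, Inf, or -Inf.
count_nonfinite(y) = count(!isfinite, y)

# Hypothetical helper: floor log-likelihood targets at a finite lower bound.
# The default of -20.0 is an arbitrary illustrative value, not a recommendation.
floor_targets(y; lower = -20.0) = clamp.(y, lower, Inf)

# Toy densities standing in for simulated likelihood values.
# The 0.0 entry mimics an observation with zero density.
densities = [0.8, 0.05, 0.0, 1.2]

# Log-likelihood targets; log(0.0) evaluates to -Inf in Julia.
targets = log.(densities)

n_bad = count_nonfinite(targets)   # number of non-finite targets
clean = floor_targets(targets)     # same targets with -Inf replaced by the floor

println("non-finite targets: ", n_bad)
println("all finite after flooring: ", all(isfinite, clean))
```

If a check like this reports a nonzero count on the real training set, flooring or filtering those targets before training might be a place to start.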