I am really talking about a corporate practitioner who will just make use of whatever is already there and doesn't have the ability to come up with anything new. E.g. I know "senior data scientists" who don't know how to fit a logistic regression, so I think you think too highly of the practitioners I have in mind. But that's an extreme case; the practitioner I am thinking about is someone like me, who can make use of an LSTM or a ResBlock but wouldn't at any point try to invent something new: the "I just know how to throw a few layers together and roughly know what each layer is meant to do" kind of practitioner.
Well, maybe AutoML? I don't think throwing a few layers together can make a paper, but people do need that in products. That is actually why we are still supporting Python, why PyTorch is merging with Caffe2 for production, why TensorFlow is supporting training on mobile, and why there are even new frameworks written in C/C++ for ARM, etc.
I personally think that if someone is using Python without Numba, C/C++, Cython, etc., then Python is fine, and you probably won't have much motivation to switch languages unless you want to use fresher tech.
And it is always fine to write in your favored language and framework; there is ONNX for models and many other formats for parameters.
And since it is way easier to implement new things in Julia, I believe that at some point there will be plenty of existing models here, and then engineers without an ML background will be able to use them in products. That's when everyone moves to Julia.
I am not sure that catering to the demands of someone like this is a priority for many Julia packages at the moment. I don't see a problem with this; a lot of commercial software exists precisely to fill in this niche.
Impressive and elegant blog post. Thanks for writing it!
A small note: you have a minor typo in equation 3, in the Compute-graph section. I believe you meant for the 2 to be subscripted, i.e. y1 = z2 x should be y_1 = z_2 x. Similarly, y2 = b \cdot x has the same issue and should read y_2 = b \cdot x.
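In display form (LaTeX), the corrected pair, as I read them, would be:

\[ y_1 = z_2\,x, \qquad y_2 = b \cdot x \]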
Just playing around a bit with your nice AD package. I'm trying a simple affine transformation for a small MLP. The following code leads to an error. I cannot fully understand what's going wrong, but my guess would be that it doesn't have a registered method for dealing with a matrix-vector multiplication. Am I right about that, or is there something else going on here?
using YAAD
using LinearAlgebra  # for tr

function yaadderivtest(nhidden=100_000, ninput=1_000)
    # parameters are tracked Variables, the input x is a plain array
    W = Variable(rand(nhidden, ninput) .- 0.5)
    b = Variable(rand(nhidden) .- 0.5)
    x = rand(ninput)
    f(W, b) = sum(W * x .+ b)   # affine layer followed by a sum
    y = tr(f(W, b))
    backward(y)
end

yaadderivtest()
Yes, this is because sum is not implemented (see the backtrace). I'm still exploring whether there's an elegant way to implement backward propagation for all the iterator-style functions. This is awkward because the node type here is not a subtype of AbstractArray (whereas Tracker has its own tracked array type).
But if you want to use sum, simply register this function:
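Something like the following should work. It is a rough, untested sketch written from memory: the names AbstractNode, register and gradient follow the convention used in the blog post, so treat them as assumptions and adjust if they differ in the released package.

using YAAD
import YAAD: register, gradient, AbstractNode

# forward rule: calling sum on a tracked node records it in the compute graph
Base.sum(x::AbstractNode) = register(Base.sum, x)

# backward rule: d(sum(x))/dx is all ones, so spread the incoming
# (scalar) grad over the shape of the input
gradient(::typeof(sum), grad, output, x) = (fill(grad, size(x)),)

With that in place, your yaadderivtest above should be able to backpropagate through the sum and accumulate gradients for W and b.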