Hi everyone! I would like to announce my first package: ExtremeLearning.jl.
For now it is a very simple framework for creating extreme learning machines (ELMs), based on the now-archived ELM.jl package.
You can find a simple example of its usage here: GitHub - gbaraldi/ExtremeLearning.jl
If anyone has feedback or suggestions, please comment.
Did you pick “row as feature” for performance reasons?
Did you consider adding an interface to MLJ? (It would be pretty easy given your API, I think.)
Edit: having had a quick look at your code, I have more comments:
You might also be interested in using NNlib.jl as a dependency, since it contains implementations of basic things like sigmoid that are numerically (more) stable.
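To illustrate what "numerically stable" can mean for a sigmoid, here is a minimal sketch. Both function names are hypothetical, and neither is claimed to be the code in ExtremeLearning.jl, ELM.jl, or NNlib.jl. It compares one textbook form that overflows for large inputs with a rearranged form that only ever exponentiates a non-positive number.

```julia
# Hypothetical naive form: exp(x) overflows to Inf for large x,
# so the ratio becomes Inf / Inf = NaN.
naive_sigmoid(x::Real) = exp(x) / (1 + exp(x))

# Hypothetical stable form: exp is only called on -abs(x) <= 0,
# so it can underflow to 0 but never overflow.
function stable_sigmoid(x::Real)
    t = exp(-abs(x))
    return x >= 0 ? 1 / (1 + t) : t / (1 + t)
end

for x in (-1000.0, 0.0, 1000.0)
    println("x = ", x,
            "  naive = ", naive_sigmoid(x),
            "  stable = ", stable_sigmoid(x))
end
```

In practice, depending on NNlib.jl and calling its exported `sigmoid` avoids having to maintain this kind of code yourself.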
I picked rows as features because Julia is column-major. However, since most operations are matrix multiplications, it might not matter, so switching to row major (features as columns) might be better for working with things like DataFrames.
I might add an interface to MLJ since it would add some nice features.
I should probably add NNlib as a dependency. I haven’t used it yet because I started by porting the old ELM.jl package and it used that sigmoid function. Ditto for the pinv call, although the original ELM paper calls for the Moore-Penrose inverse, which is what pinv is.
OK. Re the Moore-Penrose inverse: check out what a minimum-MSE (least-squares) linear solve is and you’ll see that it gives the same result as Moore-Penrose, except cheaper because you don’t construct the operator. Calling pinv makes sense when you reuse the result, which is not your case, so you should probably just use \ (similar to the usual recommendation not to call inv when solving a linear system). Maybe you could try both, check that the results match, and see what @btime tells you.
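A quick way to try the comparison is sketched below. The names `H`, `T`, and the matrix sizes are made up for illustration and are not taken from ExtremeLearning.jl; the sketch also assumes `H` has full column rank, which is almost surely true for random Gaussian data.

```julia
using LinearAlgebra, Random

Random.seed!(42)

# Hypothetical shapes: 200 samples, 20 hidden neurons, 3 outputs.
H = randn(200, 20)   # stand-in for the hidden-layer output matrix
T = randn(200, 3)    # stand-in for the training targets

# Explicitly builds the 20×200 pseudoinverse, then multiplies.
beta_pinv = pinv(H) * T

# Least-squares solve; no pseudoinverse is ever materialized.
beta_solve = H \ T

println("results match: ", isapprox(beta_pinv, beta_solve; rtol = 1e-8))

# To time both (assumes BenchmarkTools.jl is installed):
#   using BenchmarkTools
#   @btime pinv($H) * $T
#   @btime $H \ $T
```

If the hidden-layer matrix can be rank-deficient, the two approaches may use different rank tolerances, so it is worth checking the match on real data rather than relying on this random example.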
Re row major, my advice would indeed be to drop rows as features, since quite a lot (most?) of the packages for data handling use the other convention, which means most users would have to pay the price of reshaping their data anyway.
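For reference, the reshaping cost mentioned here looks roughly like this on the user side. This is only a sketch with a made-up matrix `X`; it is not code from the package.

```julia
using LinearAlgebra

# Hypothetical data set in the common convention:
# 5 observations as rows, 3 features as columns.
X = [ 1.0  2.0  3.0;
      4.0  5.0  6.0;
      7.0  8.0  9.0;
     10.0 11.0 12.0;
     13.0 14.0 15.0]

# Lazy wrapper: no data is copied, indices are just swapped.
X_lazy = transpose(X)

# Materialized copy: allocates a new 3×5 Matrix with features as rows.
X_copy = permutedims(X)

println(size(X_lazy), " ", typeof(X_lazy))
println(size(X_copy), " ", typeof(X_copy))
```

The lazy `transpose` is cheap to create, but a package that needs a plain `Matrix` would force the copying variant on every call, which is the price being referred to.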
Thanks for the advice. I didn’t know about the \ operator (I’m kind of a newbie). It did make it a little bit faster, probably because it reduced the allocations. In my next version I will also switch to columns as features. I did some testing and it has no measurable impact on speed.