ML feature importance in Julia

I am trying to fit decision tree and random forest classifiers in Julia. I wanted to know what the Julia equivalent of Python's classifier.feature_importances_ is.

Welcome to the forums! It might be worth checking out this post to improve your chance of getting a useful answer.

In this case, I think you’ll need to provide a lot more context for your question. What problem are you trying to solve? What package in Julia are you using? Do you have some code you can share?

For example, you don’t mention which Python package the function you’re interested in comes from, nor what it does. There may not be (I would say probably isn’t) an exact replica, but if you describe what you’re trying to do, you may be pointed to analogues that accomplish the same thing or something similar.

3 Likes

I am using the DecisionTree.jl classifier in Julia to fit decision trees. I have searched the related docs but could not find any mention of extracting feature importances. I know I can print the trees, but I am looking for a simple way to get the feature importances. In Python this is possible after the classifier is fit, via the classifier.feature_importances_ attribute. I hope this info is enough, but I am willing to share code.
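Roughly, this is what I am doing (a simplified sketch with toy data; my real dataset is larger):

```julia
using DecisionTree

# Toy stand-ins for my real data: 100 samples, 4 features, binary labels.
features = rand(100, 4)
labels = rand(["a", "b"], 100)

# Fit a single decision tree with the native DecisionTree.jl API.
tree = build_tree(labels, features)
tree = prune_tree(tree, 0.9)
print_tree(tree, 5)                  # printing the tree works fine...

# A random forest works too.
forest = build_forest(labels, features, 2, 10)
preds = apply_forest(forest, features)

# ...but I can't find anything analogous to scikit-learn's feature_importances_.
```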

Maybe not directly answering your question, but check out KoalaTrees.jl, part of the Koala.jl suite of packages. The example in the docs includes feature importance output.

Koala does not seem to have been updated for Julia 1.0. I am having trouble installing it.

Yep, sorry, I should have added that. I dev’ed my own version and, with a few changes, was able to get it up and running on 0.7/1.0, although it does require some additional effort!

Make a PR?

3 Likes

@dpsanders: I don’t know how to do that. Please guide me.

I assume this was in response to me? I’m toying with the idea, although my patches are currently very rough and ready (e.g. I haven’t managed to get the show methods to work properly and display the type information they displayed on 0.6 according to the docs), hence I’m not sure whether my PR would create additional effort rather than making life easier for the package author!

I suspect that, if it creates additional effort, that effort would be trivial, while if it helps, it will help a lot. It doesn’t take much to look at a PR and assess it, and in the worst case the author says “no thanks, I got this,” in which case neither of you has lost much.

2 Likes

Perhaps take a look at ShapML.jl. It computes Shapley values (a model-agnostic way of ranking feature importances), and it looks like you can use it with any MLJ.jl model. It is quite new; I have not used or tested it myself.
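From a quick read of its README, usage looks roughly like this (untested; the exact keyword arguments and the y_pred column in the prediction wrapper are my reading of the docs, so treat this as a sketch):

```julia
using ShapML, DataFrames, DecisionTree

# Toy data standing in for a real dataset.
X = DataFrame(x1 = rand(200), x2 = rand(200), x3 = rand(200))
y = 2 .* X.x1 .+ 0.1 .* rand(200)

# Any fitted model should work; here a DecisionTree.jl regression forest.
model = build_forest(y, Matrix(X), 2, 20)

# ShapML is model-agnostic: you pass a wrapper that takes (model, data)
# and returns a DataFrame with a y_pred column of predictions.
function predict_function(model, data)
    return DataFrame(y_pred = apply_forest(model, Matrix(data)))
end

# Monte Carlo Shapley values, one row per feature per explained instance.
data_shap = ShapML.shap(explain = X,
                        reference = X,   # background/baseline data
                        model = model,
                        predict_function = predict_function,
                        sample_size = 60,
                        seed = 1)
first(data_shap, 5)
```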

6 Likes

I have also implemented this in a package called Duff.jl; the idea behind Shapley values was independently invented in the paper “Dependency aware feature selection” by Petr Somol, Jiri Grim and Pavel Pudil, though, as the name suggests, for feature selection.

In Duff.jl, I have added a hook for statistically testing whether the difference between a feature being present or absent is significant.

1 Like

ShapML would be my handiwork. Your take on it is right: model-agnostic and, really, package-agnostic feature importance for each instance in a dataset. The algorithm should be spot on (I tested it against other non-Julia packages; there’s also a vignette where I compare results to Lundberg’s TreeSHAP method from his shap package), but I’m fairly new to this whole Julia thing, so it’s probably slower than it needs to be. There’s some cool research in the Shapley space around causality, so I’ll be maintaining and adding to this package for a good long while.

4 Likes

Interesting paper. I’d be curious to see a simulation of how various Shapley-esque algorithms converge and under what conditions. The “Dependency…” part is the big part to get right: conditioning on the right features and comparing to the right population. In my experience, feature correlation does split the feature-level effects, but it’s easy enough to cluster related features to get meta-effects for groups of features, though this post-hoc approach seems a bit hacky.

Also, cool package Mill.jl. Haven’t used it, but I’ve read the papers.

1 Like

Thanks for supporting Mill.jl. At the moment, I use Duff.jl just as it is. At some point I would be interested in using biased sampling, such that features that seem to be important are sampled more frequently, and then correcting for the biased sampling. I will need to take a look at ShapML and discontinue Duff.jl if ShapML turns out to be better.

I have not taken a closer look at it yet, but does Duff.jl also work with Mill.jl without any additional modifications?