Survival analysis in Julia

Hi all,
What are good packages for survival analysis in Julia (i.e. working with truncated data, the Kaplan-Meier estimator, the Cox model and so on…)? So far I have only found packages that seem incomplete or abandoned (e.g. https://github.com/ContaTP/Survival.jl), or this unmerged PR in GLMnet.
I wanted to know if I’m missing something, and if/where this set of analyses will become part of the Julia statistics ecosystem.

Thanks,
Pietro

There’s also https://github.com/kkholst/EventHistory.jl. But none of the packages appear to be under very active development. As often with free software, it will depend on whether somebody steps in or not…

If you need a stop-gap measure, you can call into Python or R. But the best solution may be to write your own package, as @nalimilan suggests.

I was just wondering the same thing. I am taking survival analysis this semester. Looks like I will have to lean on R. https://www.openintro.org/download.php?file=survival_analysis_in_R&referrer

In the end I started working on it, although very slowly as I don’t have as much time as I’d like to dedicate to this project. Out of curiosity, what type of analysis would you need? Here’s a list of what I’m planning to implement:

  • Kaplan-Meier estimate of the survival function
  • Nelson-Aalen estimate of the cumulative hazard
  • Some smoothing to get hazard from cumulative hazard
  • Cox proportional hazard models
  • Accelerated failure time models

Feel free to let me know if there are more things you’d like to see implemented/implement yourself.
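
For readers unfamiliar with the first three items on that list, here is a minimal, dependency-free sketch of what they compute. It is written in plain Python purely for illustration and is not the API of any package mentioned in this thread; the function names (`event_table`, `kaplan_meier`, `nelson_aalen`, `smoothed_hazard`), the Epanechnikov kernel and the toy data are all assumptions made for the example.

```python
from collections import Counter


def event_table(times, events):
    """Return (time, deaths, at_risk) rows for each distinct event time.

    times  : observed follow-up times
    events : True if the event was observed, False if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # every subject leaves the risk set at its own time
    at_risk = len(times)
    rows = []
    for t in sorted(leaving):
        if deaths[t] > 0:
            rows.append((t, deaths[t], at_risk))
        at_risk -= leaving[t]     # events and censorings at t leave after t
    return rows


def kaplan_meier(times, events):
    """Kaplan-Meier: S(t) = product over event times <= t of (1 - d/n)."""
    surv, out = 1.0, []
    for t, d, n in event_table(times, events):
        surv *= 1.0 - d / n
        out.append((t, surv))
    return out


def nelson_aalen(times, events):
    """Nelson-Aalen: H(t) = sum over event times <= t of d/n."""
    cum, out = 0.0, []
    for t, d, n in event_table(times, events):
        cum += d / n
        out.append((t, cum))
    return out


def smoothed_hazard(times, events, t, bandwidth):
    """Kernel-smooth the Nelson-Aalen increments d/n around time t
    (Epanechnikov kernel, chosen here only as an example)."""
    total = 0.0
    for ti, d, n in event_table(times, events):
        u = (t - ti) / bandwidth
        if abs(u) <= 1.0:
            total += 0.75 * (1.0 - u * u) * (d / n)
    return total / bandwidth


# Toy data: six subjects; False marks a right-censored observation.
times = [2, 3, 3, 5, 8, 9]
events = [True, True, False, True, False, True]
km = kaplan_meier(times, events)
na = nelson_aalen(times, events)
```

With tied times, the sketch treats events as happening before censorings at the same time, which is the usual convention. It ignores left truncation, confidence intervals and boundary correction for the kernel, all of which a real package would need.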

It is hard for me to know. I am just doing a grad level class - no research. If you are willing I would love to contribute. Let me know what you need.

Cool! I’ll try to work a bit more on the core structure of the thing and, as soon as the “skeleton” is clear I’ll open an issue with a “to do list”.

For anybody bumping into this old thread, there is now also Survival.jl in JuliaStats.

Hi. Yes, thanks for the update. Unfortunately Survival.jl does not include accelerated failure time models. Also, quite an important feature which is missing is left-censoring (or, equivalently, stratification).

I tried to install AcceleratedFailure from the link and it seemed to load OK, but when I typed

using AcceleratedFailure

I got:

ERROR: ArgumentError: Package AcceleratedFailure [3f1df495-9e49-5d95-a7ce-ef4aed63100f] is required but does not seem to be installed:

  • Run Pkg.instantiate() to install all recorded dependencies.

I ran the instantiate command as suggested, but nothing happened and it still did not work.

I thought that when a package installs OK, then ‘using’ should work after that?

Thanks for any help (or any update on alternative packages?)

PS My interest in this now is Covid-19 related.

I went to the link but I couldn’t find any solved example. The docs are very minimal.

I appreciate this is over a year late, but it might be useful for anyone who stumbles across this thread like I did when searching for the same question: I’ve just released SurvivalAnalysis.jl, which is in active development and will hopefully cover the above needs!

I really should publish my result, but I have done some survival analysis using Bayesian models and got dramatically different results from the Kaplan-Meier curves my colleagues were plotting. I assumed it was something wrong with my model. But after weeks of debugging it turned out my model was correct. Kaplan-Meier curves were TERRIBLE for the data we were looking at. This was cancer data. The assumptions surrounding K-M missingness are usually very bad for cancer data. Missingness there is very often about a patient dying before their next follow-up and the study not being able to contact the patient or bereaved family. I did simulations showing that K-M does a terrible job under these conditions. Please don’t use K-M for studying cancer.

Nice work! I recommend creating a separate package announcement under the “Package Announcements” category, so that your announcement reaches more people.

Come to think of it @RaphaelS1, your package may very well make it easier for me to get a publishable article. I didn’t want to jump back and forth between my simulations in Julia and making K-M curves in R… it was too much hassle. Now that I can plot K-M curves in Julia, it’ll be easier to get those results out there.

Thanks for the tip! Done now!

I’d be interested in chatting about your paper. I’ve done a fair amount of work with non-small cell lung cancer and whilst I wouldn’t use KM for prediction (ever) it’s been fine for basic estimates.

Anyway happy to hear the package will help you publish. Please let me know if you encounter any bugs or have any feature requests, just post them in the repo.

It all depends on your study design and follow-up. The key finding concerned missingness caused by patients dying and the study not being able to contact the family to get the date of death. When missingness is due to this, KM curves wind up way off. Simulation studies confirm the issue.
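
To make the mechanism concrete, here is a minimal simulation sketch of that kind of missingness. It is not the simulation referred to above: every setting (exponential survival with a 12-month mean, visits every 3 months, a 24-month study, half of all deaths never reported) and every name (`km_survival_at`, `simulate`, `p_lost`) is an assumption chosen for illustration, and it is plain Python so that it runs without any packages.

```python
import math
import random


def km_survival_at(times, events, t_query):
    """Kaplan-Meier estimate of S(t_query) for right-censored data."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk, surv, i = len(times), 1.0, 0
    while i < len(order) and times[order[i]] <= t_query:
        j, deaths = i, 0
        # Gather everyone who shares this observed time.
        while j < len(order) and times[order[j]] == times[order[i]]:
            deaths += events[order[j]]
            j += 1
        surv *= 1.0 - deaths / at_risk
        at_risk -= j - i
        i = j
    return surv


def simulate(n, p_lost, seed, mean_survival=12.0, visit_gap=3.0, study_end=24.0):
    """Return (fully observed data, data with unreported deaths)."""
    rng = random.Random(seed)
    full_t, full_e, obs_t, obs_e = [], [], [], []
    for _ in range(n):
        death = rng.expovariate(1.0 / mean_survival)
        if death >= study_end:
            # Alive at study end: administratively censored in both data sets.
            full_t.append(study_end)
            full_e.append(False)
            obs_t.append(study_end)
            obs_e.append(False)
            continue
        full_t.append(death)
        full_e.append(True)
        if rng.random() < p_lost:
            # Death never reported: censored at the last visit attended,
            # so the censoring is caused by the death itself.
            obs_t.append(math.floor(death / visit_gap) * visit_gap)
            obs_e.append(False)
        else:
            obs_t.append(death)
            obs_e.append(True)
    return (full_t, full_e), (obs_t, obs_e)


full, observed = simulate(n=5000, p_lost=0.5, seed=1)
true_s12 = math.exp(-12.0 / 12.0)   # true 12-month survival, about 0.368
km_full = km_survival_at(full[0], full[1], 12.0)
km_obs = km_survival_at(observed[0], observed[1], 12.0)
print(f"true S(12)={true_s12:.3f}  KM, all deaths seen={km_full:.3f}  "
      f"KM, deaths lost={km_obs:.3f}")
```

Under these assumed settings the K-M estimate from the data with unreported deaths sits well above the true 12-month survival, while the estimate from fully observed data stays close to it. How large the gap is in a real study depends entirely on the follow-up design and on how many deaths go unreported.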

Oh for sure when the independent censoring assumption is violated. I’ve read some good papers on this, I can share if useful?

I’d be happy to see some links, and others probably would too. Could you put them here? Or PM me if you prefer.