I’ve recently started working at a biotech startup. One of the tools used by these types of companies is the macromolecular modelling tool Rosetta. I don’t totally understand Rosetta’s various capabilities and limitations. To gain a better understanding, I was hoping to contrast it with other tools, specifically Julia tools, because I’m curious about the Julia bio community.
In Julia, there are several tools organized under BioJulia. What capabilities of Rosetta have equivalents in BioJulia?
Not many, to my knowledge. Rosetta is a mature and powerful suite that encompasses core code, protocols and a forcefield. Rosetta can do things that few other software packages can in any language, which makes a comparison difficult.
The structural bioinformatics scene in Julia is more limited, and the following packages include features that can also be found in Rosetta (apologies for any omissions, please add below):
Stay tuned, there are more and more as time goes on!
Now having a year of experience working with Rosetta, I’d like to use my experience to expand @jgreener64’s reply. Specifically, the three components mentioned in:
Forcefields are statistical descriptions of how protein components interact. They are typically used to evaluate a protein shape’s energy score, wherein you want a protein to fold so as to minimize its energy score.
I still don’t totally understand how a forcefield is designed or validated; however, I realize that this mostly requires me to read a bunch of papers.
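To make the "energy score" idea concrete, here is a minimal Python sketch. It is not Rosetta's energy function: I am assuming a single toy pairwise term (a 12-6 Lennard-Jones potential) purely for illustration, and the names `lennard_jones` and `total_energy` are made up for this example. The point is only that a shape gets one number, and that a relaxed shape scores lower than a clashing one.

```python
import math

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Energy of one atom pair at distance r (toy 12-6 Lennard-Jones term)."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)

def total_energy(coords, epsilon=1.0, sigma=1.0):
    """Score a shape by summing the pair term over every unique pair of points."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            energy += lennard_jones(r, epsilon, sigma)
    return energy

# Distance at which this toy pair term is lowest (assumes sigma = 1).
r_min = 2 ** (1 / 6)

relaxed = [(0.0, 0.0, 0.0), (r_min, 0.0, 0.0)]   # atoms at the ideal distance
clashing = [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0)]    # atoms pushed too close

print("relaxed :", total_energy(relaxed))
print("clashing:", total_energy(clashing))
```

A real forcefield combines many such terms with fitted weights, which is presumably where the design and validation effort goes.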
I most commonly use Rosetta’s built-in protocols, such as Simple Cyclic Peptide Prediction and Protein Docking. Based on the videos from the 2020 Rosetta boot-camp, a protocol is a series of steps to repeat until convergence: for example, rotating the backbone, mutating side chains, etc.
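The "repeat a step until the score stops improving" idea can be sketched in plain Python. This is a toy under stated assumptions, not a Rosetta protocol: the score `toy_energy` is hypothetical (it just prefers every angle to be 60 degrees), the accept/reject rule is a standard Metropolis criterion that I am assuming here, and the loop runs a fixed number of steps instead of testing for convergence.

```python
import math
import random

def toy_energy(angles):
    """Hypothetical score: lowest (zero) when every angle equals 60 degrees."""
    return sum((a - 60.0) ** 2 for a in angles)

def run_protocol(angles, n_steps=5000, max_move=10.0, temperature=1.0, seed=0):
    """Repeat 'perturb one angle, rescore, accept or reject' n_steps times."""
    rng = random.Random(seed)
    current = list(angles)
    current_e = toy_energy(current)
    best, best_e = list(current), current_e
    for _ in range(n_steps):
        # Propose a move: rotate one randomly chosen angle by a small amount.
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] += rng.uniform(-max_move, max_move)
        trial_e = toy_energy(trial)
        # Metropolis rule: always accept downhill moves, sometimes accept uphill ones.
        delta = trial_e - current_e
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e

start = [0.0, 120.0, -45.0]
best, best_e = run_protocol(start)
print("start energy:", toy_energy(start))
print("best energy :", best_e)
```

Swapping in a different move (say, a side-chain mutation) or a different score changes the protocol without changing the loop, which is roughly how I picture the built-in protocols being assembled.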
You can specify more complicated protocols with PyRosetta or RosettaScripts XML.
Protocols have a tendency to break between Rosetta releases.
Rosetta’s goals required the creation of a lot of custom C++ mathematical code, as shown in this YouTube video from the 2016 boot-camp. From my high-level perspective, I think most of this mathematical functionality, such as vector and matrix operations, is already built into Julia.
Summary: Rosetta would be nice in Julia
Rosetta seems to be an ideal candidate for a Julia re-write, given:
- It suffers from a three-language problem (Python, C++, XML)
- Compiling it is difficult
- It mostly deals with throughput-based scientific computations
However, I believe a religious figure once said “May those that wish for a re-implementation of software, write the first lines of code.”
Rosetta is not a single thing; it is a monster suite of hundreds of tools written over the last 30 years. Although each of these tools could be rewritten in Julia, and for some uses that would be very nice (particularly for adjusting force-field function terms, defining new constraints, and avoiding all that compilation hassle), I bet that will never happen.
It is more realistic to think about implementing one of the things Rosetta does (a docking method, for example) in Julia. But really, that only makes sense if one has something original to contribute to the methods, instead of simply translating existing code.
A colleague just pointed me here.
I wanted to revive this thread to showcase ProtoSyn.jl. Realistically, it is not a replacement for Rosetta (it could never be). But it is a Julia-based platform for molecular manipulation (with a focus on peptide design). When I started my PhD, we wanted to try out some new models for energy calculation, such as TorchANI or even Dr. Fogolari’s work on Born radii estimation using an ML model (10.1093/bioinformatics/btz818) … But Rosetta was such a pain-in-the-*** to modify and use. It was not open-source, the documentation was infamously not up-to-date and we would have to constantly battle between Python and C++.
So, naturally, we started development on ProtoSyn.jl. This Julia package is now filled with goodies: you can build arbitrarily complex nested algorithms, build custom energy functions, and plug-and-play with almost any work you see online or in papers (using PyCall.jl to call Python code, for example; we even implemented PyRosetta’s REF-15 energy function). It has a lot to offer, hopefully in an easy-to-understand way: completely open-source, well documented and with lots of up-to-date examples/tutorials.
As some examples, we’ve used ProtoSyn.jl to build algorithms for peptide folding, design, small molecule docking, in-target conformation library generation, sidechain packing, etc. It’s also useful as a pre-step in most computational chemistry protocols: sometimes all you need is to add those pesky hydrogens to a peptidic structure! ahah
Anyways, give it a look, who knows what you may find!
I’m pretty sure over the years I’ve interacted with all of you, showing ProtoSyn.jl around, but I’ll just leave this comment here for anyone who comes looking ahah. Cheers!