Is there any comprehensive list of known security vulnerabilities or associated strengths of Julia? I’m working with the DoD to enable ML applications for data science and am hoping to use Julia in our implementation. However, the DoD is hesitant due to the lack of known use in sensitive settings and general unfamiliarity with Julia. Other languages, such as Java 8, C++, and Python, are supported because they have lists of Common Vulnerabilities and Exposures (CVEs) and Common Weakness Enumerations (CWEs), which are tracked by MITRE and OWASP (the Open Web Application Security Project). Scanning against these known lists provides a threat baseline when software enters sensitive areas. As far as I can tell, Julia does not have a list of CVEs or CWEs from these organizations, which is why they are unable to approve it. In other words, there is no baseline threat assessment for Julia, so they don’t know the risks of bringing the software in.
Is there any comprehensive list of vulnerabilities or weaknesses inherent in Julia? If not, is there any interest in working with MITRE or OWASP to generate one? As the DoD continues to implement more sophisticated ML applications, it would be advantageous for Julia’s limitations to be properly documented, to enable more widespread usage.
I guess the question is “what kind of vulnerability”? Looking for vulnerabilities in languages has historically had to do with buffer overflows and the like, with other vulnerabilities attributed to a language most often being targeted at a specific application written in that language. I’m not aware of filed CVEs or similar though (and julia being a garbage-collected language, buffer overflows are unlikely to be the most prevalent/pressing vulnerability), and it’s unclear what a CVE in the context of machine learning would even look like. Does the use of a machine learning algorithm in and of itself constitute a “vulnerability”, since such algorithms are bound to have some uncertainty in classification and to be “vulnerable” to crafted inputs? And what would make julia different here from using python for the same algorithm, if the vulnerability is inherent to the algorithm and not the language?
There is some similar discussion here:
I’m happy to discuss/elaborate in more detail, though I imagine it’ll be of limited importance to the DoD, which, as is often the case with large institutions, may not appreciate that such broad classification into “good” and “bad” doesn’t necessarily work here. It’s more nuanced than that.
That being said, a security audit (whatever form that may take for a language itself, most probably GC and compiler) would be appreciated.
Another thought: if what they’re looking for is a list of “patterns to avoid to minimize possible vulnerabilities”, such a list doesn’t exist yet, mostly because such broad, sweeping vulnerabilities have not been found, as far as I’m aware (well, there was one incident a few years back concerning the security of the Distributed stdlib, which was fixed promptly).
A preliminary list could be
don’t eval or otherwise interpret user-supplied data (a generic catch-all that’s just as well suited to python as it is to julia)
keep memory growth in check by writing mutating (in-place) functions that reuse existing buffers, instead of allocating a fresh array on every call
And probably a few more that I can’t think of right now (not for the pay I get right now anyway; this is a free forum of generic julia users, after all).
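As a rough illustration of the first two points, here is a sketch in Julia. The whitelist approach, the `ALLOWED_OPS`/`run_user_op`/`scale!` names, and the buffer-reuse pattern are all illustrative assumptions of mine, not an official recommendation:

```julia
# Point 1: never eval user-supplied strings. Instead, dispatch on a fixed
# whitelist of allowed operations (hypothetical example, not an official API).
const ALLOWED_OPS = Dict{String,Function}("sum" => sum, "max" => maximum)

function run_user_op(name::AbstractString, data::Vector{Float64})
    haskey(ALLOWED_OPS, name) || error("operation not allowed: $name")
    return ALLOWED_OPS[name](data)
end

# Point 2: prefer mutating (in-place) functions that reuse a caller-owned
# buffer over functions that allocate new memory on every call. The trailing
# `!` is the Julia naming convention for functions that mutate an argument.
function scale!(out::Vector{Float64}, v::Vector{Float64}, a::Float64)
    length(out) == length(v) || error("buffer size mismatch")
    @inbounds for i in eachindex(v)
        out[i] = a * v[i]
    end
    return out
end

buf = zeros(3)
scale!(buf, [1.0, 2.0, 3.0], 2.0)    # buf is now [2.0, 4.0, 6.0]
run_user_op("sum", [1.0, 2.0, 3.0])  # 6.0, without ever calling eval
```

The key property of the whitelist pattern is that the set of reachable code is fixed at write time, so crafted input can at worst select a pre-approved operation, never inject a new one.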
This is an important topic for Julia, and neglecting it could hinder adoption by industry. Is there vulnerability testing, for example, that could be applied to registered packages? It seems far-fetched, but even having a way to flag vulnerabilities in packages, so an institution can turn off those packages for incorporation into applications or projects, would help. Or flags for packages that have been “certified safe” in some way, covering trojan horses, spyware, and security-permission problems. Is there an industry standard that Julia as an organization could benchmark against? I know nothing of these things.
As a second discussion point and example, Python sometimes disables MD5 (for instance, when the underlying crypto library is running in FIPS mode). This is by design, to comply with e.g. FIPS 140. I don’t know what attention is applied in this direction, but attention toward these requirements would be valuable to increase adoption.
I agree with @alhirzel in that it would be fantastic for Julia to be listed as approved Open Source Software. Right now, it seems the main hesitancy toward applying Julia in vulnerable applications is the lack of scanning tools that compare code against known CVEs and CWEs. So far, I’ve found Julia tagged in a single CVE topic here. One possible idea is to re-use scanning tools meant for C++, but I’m unfamiliar with the details needed to understand how valuable this would be. This would most likely require Julia to be compiled to C, such as by using the LLVM-C backend.
That article has nothing to do with modern cybersecurity work or research - it’s literally just an article implementing historic shift-ciphers.
Any automatic scanners for CVEs would have to be bespoke to julia. Simply compiling down to e.g. C does not work/help, because then you’re only checking that your transpilation didn’t introduce new implementation bugs in your target language. It doesn’t help verify the original semantics.
On top of this, I’m missing concrete proposals on what you require to be allowed/disallowed - these requirements are going to be specific to your usecase & your organization. There is no “one size fits all” for security.
I think if anyone wants to rely on Julia in security-sensitive environments, they should have a conversation with the core team about what assurances are needed and what the team is willing to offer in terms of language safety provisions and disclosures whenever those provisions are violated.
I haven’t checked all CVEs, but I’d wager most of them are in one of the myriad of stdlib packages, yes. I guess we could file CVEs for these as well, which would go hand in hand with backporting all those fixes.
Yes, agreed. As discussed in another thread on this topic, julia is, at the end of the day, a programming language designed to do arbitrary things. While support for fixing bugs & vulnerabilities in stdlibs can be given, a large part of the responsibility lies on the user to write their code in a way that is not vulnerable. With a smaller stdlib (though not quite that small), this naturally means more of the attack surface comes from packages rather than the language itself.