Use a tool like Codacy to check common issues pattern

#1

Recently, I used Codacy on my fork of Julia, and it discovered 936 issues (237 security + 36 error prone + 663 code style). So I think to suggest a discussion about “Use a tool like Codacy to check common issues pattern” for speed up issue discovering.

I mark all issues pattern to check, so it leaves some ineffective issues. But if we configure it proper, it can be useful.

Screenshot%20from%202019-05-05%2002-43-13

2 Likes
Security scanning of Julia code
#2

I think this could be useful for languages where these patterns are more established and recognizable, especially C. Some of the issues I looked at look potentially relevant, and even though Julia does not claim to be especially secure when it comes to exploits etc, fixing them could not hurt.

After some discussion here, consider making some PRs based on these reports.

#3

even though Julia does not claim to be especially secure when it comes to exploits etc, fixing them could not hurt

Some of the issues seem to very valid, e.g. the strlen check for \0 terminated strings and memcpy not checking buffer bounds. Today security must not be explicitly claimed, it is accepted as being important in every code.

#4

This is not really true; I know plenty of programs that I use daily that could have buffer overflows and other vulnerabilities and could never actually compromise my security (they might annoy me when they crash or give wrong results, of course).

What’s really important is the competency of the user of software. The reason so many security issues have arisen these days is likely because we have many more technically incompetent users being placed in front of systems which assume some level of technical competence. Additionally, the rise of the internet simply provides more opportunities to exploit services that utilize it. But a decently-configured firewall and a competent user sitting behind it can run buggy software all day and likely never end up getting exploited.

Anyway, I still agree that it’s useful to fix such bugs when possible and when they don’t impact performance in performance-sensitive areas. Even more important is identifying the parts of your software which can be influenced by the “outside world”, and properly unit- and fuzz-test those parts to ensure they remain robust in the face of malicious interaction.

#5

While I don’t necessarily disagree with this, I think that in this particular case we are just talking about fixing issues identified by an automated tool, which is just a low-hanging fruit — we can have correct code at the price of programmer effort, without considering whether it is relevant or not from a security perspective.

It is always nice to fix bugs before they show up…

In any case, it would be nice to hear from the core devs about whether they would be receptive to small PRs fixing issues identified by this tool or similar ones.

1 Like
#6

You did proof me wrong, as it seems, that at least two people do not acccept the importance of security in every software. Ok, then lets talk only about julia and security.

I want to stress the importance of inherent security in the julia code base, as it is the base of all applications where julia is used as a language. We can not tell, who uses julia for what in future, but, I am sure we agree on this point, we all would be happy if julia is widely used for, lets say everything.

For the point of security it is much simpler: we all use julia, we use packages, some of us use it, to open and process data. It is probably not wrong, if I assume, that there are enough people who use it like that, and who don’t have control over the incoming data, don’t bother to control the code of packages and trust in the julia base code. This is the scenario where exploits grow easily.

We are not there yet, we are still in some kind of niche, where developing exploits seems not to be economic. But this is also the time, where fixing potential security flaws is cheaper than in future.

My proposal would be: install some kind of automatic checking like the one in the OP, do it open and public and let the community work on the solutions. Maybe some knowledgable people could prioritize the found issues to make it more motivating for the volunteers. (And to anticipate the unavoidable: yes I would to some work in such a setting).

What is important for me is now written above, but this

The reason so many security issues have arisen these days is likely because we have many more technically incompetent users being placed in front of systems which assume some level of technical competence.

is just wrong and I just have to say this. Maybe I got it wrong but anyways it seems not constructive to the topic to discuss it as the result would always be: there is nothing to do, because ignorant people us it wrongly. (yes I should have ignored it, I couldn’t, but maybe others can ignore this now for further discussion).

1 Like
#7

The problem with automatic checkers is that they tend to be noisy. Just like malware scanners, they try to give as many results they possibly can to show their “value” and thereby getting people to buy the product, c.f. the front page of that scanner:

Logging in with Google will allow you to try out the product, but you will not be able to experience its full potential.

That being said, if people find security problems in the Julia codebase, PRs to improve the security are of course always welcome. For people who are interested, I would suggest starting with small PRs.

3 Likes
#8

Possibly you misunderstand: I don’t think anyone was claiming that Julia should not be as exploit-free as possible, but it is not as simple as setting an automated scanner on it. It is very likely that a fraction of these flags are false positives and/or irrelevant (because the C code gets sanitized input from the caller), so each of them would need to be reviewed in turn by someone who knows the relevant code. There are not that many people working with the internals of Julia.

This is potentially important—but so are many other things. Julia at this stage does not claim to be especially secure or hardened, either explicitly or impicitly.

That said, if someone cares about this, trying to fix things in small PR could be a viable strategy. I am sure these would be well-received, especially if accompanied by a test that shows breakage.

2 Likes
#9

No, but I could have emphasized better: at least two people do not acccept the importance of security in every!! software, and even I could imagine or actually use software, which must not be secure, because everything which goes in is controlled by myself. So, if it’s not you, it’s just me and one is enough to falsify :grinning::grin: no worries! Really just a sidenote.

On the topic:
Using automated software for code checking is one of the ways security people (good or bad) are searching for exploits. Doing it beforehand automatically as part of the quality control process is not preventing all security issues but helps to avoid making them obvious for the bad guys and does prevent some of the possible issues before they are real (=exploited).

Many of the issues I have checked (from the above link in the OP, just a few) are no real issues, but trigger just some test of the tool. But they also very easy to adress, even of somebody without any knowledge of the purpose of the function ,e.g.
changing
jl_value_t *result;
to
jl_value_t *result = NULL;

Looking at the code right now, this wouldn’t change the logic at all. But code is fluid and other unknown bugs may change this in future fixes, but result may still be uninitialized with some sideeffect in the bug fix, so why not changing it now, even if it has no good or bad effect now.

Still it would be some effort and it is not so clear what benefit julia would gain. But, at least one is willing to contribute (me) :grin:

#10

FWIW, I looked over the report and everything it flagged that I looked at was a false alarm.

1 Like
#11

There’s a down side to assigning NULL: if you do that, a static analysis tool cannot tell you if there’s a code path where you forgot to assign a non-null value. We do run more precise static analysis tools on Julia periodically, to catch issues arising from potential undefined behavior, etc. E.g. https://github.com/JuliaLang/julia/pull/31548.

3 Likes
#12

Sounds good.
I do not say that OPs tool and rules are those which should be applied, but there should be something applied and at least for uninitialized values there is something which is great. There is probably more.
I disagree to refuse automatic tools (and that unknown users are the problem), and I do this in general and not to argue against specific opinions which come up here (except about the unknown users). And I agree that automatic tools are not the single solution and if you only rely on them, they are the problem.
:laughing:
(very offtopic I am) (searched for yoda but only found :viking: which is even better)