How do you use debuggers?


#24

I am not sure if I should post my naive thoughts here, but perhaps I represent quite a large user group: those with low to medium programming skills who want to deal with scientific problems without spending a lot of time programming (one of the goals of Julia, I think 🙂). I guess the long wish list is largely influenced by quite technically oriented people, since Julia is probably mostly embraced by people who like such challenges. Many of my colleagues, currently using tools like R and Matlab, want an easy-to-use language that is, for example, stable (not changing all the time) and has user-friendly tools. So, if the goal of the Julia language is to reach a broader audience, there perhaps must be a balance in development between tools many people are used to (a Matlab-like debugger, easy handling of missing data …) and more sophisticated language features that help developers create advanced Julia code. For me to convince my colleagues to start using Julia, I think a good debugger, along with some other tools that make Julia more user-friendly, will help a lot. So I would not be disheartened by so many wanting a debugger. It would help to “sell” Julia to more people and raise interest in the language. I guess what I wrote is known by everyone here, but I wanted to mention it so that it doesn’t get forgotten. I like the development strategy posted here (List of most desired features for Julia v1.x) by StefanKarpinski.


#25

That should all be covered by unit tests. That doesn’t mean the interactive debugger part is useless, but in the end unit and integration tests should be validating your assumptions. I think testing is vastly underused and large test suites are what actually make debugging easier.

(Side note: I honestly don’t know how in 2017 code could be allowed to be published in scientific papers without unit tests showing convergence and error estimates. IMO a repo with tests should become standard sooner rather than later, because tests are what give trust. There should always be a 1-click button to reproduce results (not the good ole’ “uncomment these for figure 1” approach).)
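To make "unit tests showing convergence" concrete, here is a minimal sketch of what such a test could look like. It assumes a recent Julia 1.x (where the test macros live in the `Test` standard library), and the `euler` helper and the test problem are hypothetical stand-ins, not code from any package mentioned in this thread.

```julia
using Test

# Forward Euler for u' = f(u, t) on [0, T] with n steps (illustrative helper).
function euler(f, u0, T, n)
    h = T / n
    u = u0
    t = 0.0
    for _ in 1:n
        u += h * f(u, t)
        t += h
    end
    return u
end

# Test problem with a known solution: u' = -u, u(0) = 1, so u(1) = exp(-1).
f(u, t) = -u
exact = exp(-1.0)

# Error at T = 1 for successively halved step sizes.
errors = [abs(euler(f, 1.0, 1.0, n) - exact) for n in (100, 200, 400, 800)]

# Observed order of convergence between consecutive refinements.
orders = [log2(errors[i] / errors[i + 1]) for i in 1:length(errors) - 1]

@testset "Euler converges at first order" begin
    @test all(diff(errors) .< 0)            # error shrinks as the step shrinks
    @test all(abs.(orders .- 1) .< 0.05)    # observed order is close to 1
end
```

The tolerance of 0.05 on the observed order is an arbitrary choice for this toy problem; a real method would need a tolerance matched to its own error constants.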


#26

Couldn’t agree more @ChrisRackauckas.


#27

Personally, I don’t find debuggers essential in my workflow, but I do think that they are a lot of fun and they can reduce stress when working with complicated code. Moreover, I don’t think that you can watch this video without having to admit that Gallium is going to be freakin amazing once it becomes fully mature.

I don’t think that I would say that Gallium is 1.0 blocking, I would just say that it will make working with Julia a lot more fun. Also, the possibility of being able to use it in a Jupyter notebook would be fairly mind blowing.

The point is that the prospect of a high quality debugger excites me quite a lot: it would make my life a lot easier and it could lower the barrier for newcomers to Julia. Also, I think that I should reiterate that the presence of a debugger makes Julia look a lot more modern to outsiders.

As for the original question, I think that it kind of answers itself. Debuggers are a tool for finding out exactly what caused your code to behave the way it did. Yes, you can use the stacktrace/printing thing that Chris explained, but a debugger is so much more elegant.


#28

I agree.
In that sense a debugger is like a dynamic unit test …
It saves a lot of time in the R&D phase, specifically because you don’t need to write unit tests.

Once your algorithms, types and data flow stabilize, it is time to build the unit tests.


#29

It’s interesting that this turned into a “debugger vs unit tests” debate. I never thought of those as competing methodologies before, but now that I think about it, I rarely use a debugger on projects where I have unit tests.

If somebody wants to unite those two worlds, I would be really happy to have a debugger that can output unit tests. Basically, I’d like to be able to “capture” the function call that I’m in, including all of the arguments that were passed to the function (I don’t care about global state) and save all that to a text file as a @test. (Bonus points if it can also read my mind and fill in the expected return value, but otherwise I’ll do that myself.)
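For illustration only, the "capture the call as a test" idea can be approximated without a debugger. This is a rough sketch assuming a recent Julia 1.x; `capture_test` and `weighted_sum` are hypothetical names made up for this example, and the approach relies on `repr` producing text that parses back to an equal value, which holds for numbers, strings and plain arrays but not for arbitrary objects.

```julia
using Test

# Hypothetical helper: run `f(args...)`, then render the call and its result
# as the source text of an `@test` line.
function capture_test(f, args...)
    result = f(args...)
    call = string(nameof(f), "(", join(map(repr, args), ", "), ")")
    return string("@test ", call, " == ", repr(result))
end

# Example function under investigation.
weighted_sum(x, w) = sum(x .* w)

line = capture_test(weighted_sum, [1, 2, 3], [0.5, 0.5, 1.0])
println(line)

# The generated line can be appended to a test file, or checked right away.
eval(Meta.parse(line))
```

The expected value is simply whatever the function returned at capture time, so the generated test pins current behaviour; deciding whether that behaviour is actually right is still up to you.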


#30

You make a valid point. However, most scientists are self-taught programmers and I imagine that the majority of them are not aware of unit tests, let alone automating them with continuous integration. Writing good unit tests and designing code to be testable is a skill one picks up through experience. In an ideal world TDD would be prevalent in scientific code and the debugger would be needed only occasionally; however, the reality is different. With its TDD/CI culture (e.g. PkgDev.generate setting things up by default), Julia could have an important impact on this, though.


#31

IMO, IDE+debuggers are simply the MOST important companion tool of a programmer. I now virtually program inside a debugger all the time (C and Matlab) and cannot believe I took so long to learn about them.
I want to dive more into a certain external API that will communicate with Julia via callback functions. If I could stop the program flow inside those callback functions, development would be a breeze, because I could experiment with commands at the REPL and fix them if they error. Without that, I have to run, see the error, try to fix it, restart Julia, try again, … and again …

Failing to see this … makes me smile.


#32

My two cents (I’m a self-taught programmer from a mostly Matlab background):

In my experience almost all practitioners from a Matlab background use a Matlab-style debugger extensively in their day-to-day workflow. Whether this is a good thing or a bad thing I don’t really have an opinion. What I am certain of, is that most of these practitioners will not switch to Julia if it doesn’t come with a Matlab/Visual Studio/RStudio style debugger that “just works”.

I’ve seen the following argument several times in this thread: “why not just add some print statements in the relevant file…”. It is worth remembering that Matlab code tends to end up as long functions with lots of variables, most of which are arrays. Assuming this style of code, adding print statements is a real pain. It is so much easier to just set a breakpoint and then use a GUI to look at the (possibly hundreds) of variables currently in the workspace, often taking advantage of the built-in spreadsheet facilities.

Do I personally like this style of coding? Absolutely not! I love how Julia encourages short (often one-liner) functions, but it doesn’t change the fact that this is what most people coming from Matlab will try to do, at least, initially. I certainly did.


#33

A recent example for the friends of printf/display debugging:

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.5.2 (2017-05-06 16:34 UTC)
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |  x86_64-linux-gnu

julia> Pkg.test("CairoScript")
INFO: Testing CairoScript
"%!CairoScript\n<< /content //COLOR_ALPHA /width 400 /height 300 >> surface context\n0 0 1 rgb set-source\nn 128 25.601562 m 230.398438 230.398438 l 128 230.398438 l 51.199219 230.398438 51.199219 128 128 128 c h 64 25.601562 m 115.199219 76.800781 l 64 128 l 12.800781 76.800781 l h 64 25.601562 m\nfill+\n0 g set-source\n10 set-line-width\nstroke\n"Test Summary:   | Pass  Total
  Interpreter Run |    6      6
INFO: CairoScript tests passed

julia> Pkg.test("CairoScript")
INFO: Testing CairoScript
Interpreter Run: Test Failed
  Expression: status == 0
   Evaluated: 0x0000000000000020 == 0
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:431
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281
 in macro expansion; at /home/lobi/.julia/v0.5/CairoScript/test/runtests.jl:83 [inlined]
 in macro expansion; at ./test.jl:674 [inlined]
 in anonymous at ./<missing>:?
 in include_from_node1(::String) at ./loading.jl:488
 in process_options(::Base.JLOptions) at ./client.jl:265
 in _start() at ./client.jl:321
Test Summary:   | Pass  Fail  Total
  Interpreter Run |    5     1      6
ERROR: LoadError: Some tests did not pass: 5 passed, 1 failed, 0 errored, 0 broken.
 in finish(::Base.Test.DefaultTestSet) at ./test.jl:498
 in macro expansion; at ./test.jl:681 [inlined]
 in anonymous at ./<missing>:?
 in include_from_node1(::String) at ./loading.jl:488
 in process_options(::Base.JLOptions) at ./client.jl:265
 in _start() at ./client.jl:321
while loading /home/lobi/.julia/v0.5/CairoScript/test/runtests.jl, in expression starting on line 20
=============================[ ERROR: CairoScript ]=============================

failed process: Process(`/home/lobi/julia05/usr/bin/julia -Cnative -J/home/lobi/julia05/usr/lib/julia/sys.so --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /home/lobi/.julia/v0.5/CairoScript/test/runtests.jl`, ProcessExited(1)) [1]

where the first run displays testdata, while the second doesn’t.


#34

As a researcher in economics who mainly uses MATLAB, I find debuggers really useful. Surely, I could think about some of my code more before I run it, but going step-by-step through code can sometimes be insightful, especially if you have a lot of complex formulas, as is often the case in economic models. This way you can check that matrices are of the correct size, or catch other mistakes.

It is also useful in the case of non-linear optimization. If the optimizer does not work as expected, I can view some iterations just by using the debugger, which is impossible without one. This way I can check if I made a mistake in a formula or chose the wrong starting values.

Recently I had an issue where NLsolve did not work as expected. I went back to MATLAB, copied the code one-to-one, and all I had to do was adjust the starting values. That is not a conclusion I would have reached in Julia without a debugger. I am still not sure where the code fails, as it still does not work with Julia.
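As a rough illustration of why seeing the iterations helps with starting values, here is a small sketch, assuming a recent Julia 1.x. It uses a hand-written scalar Newton iteration, not NLsolve's API, and the function names (`newton_trace`, `f`, `df`) and the test equation atan(x) = 0 are chosen purely for the example.

```julia
# Newton's method for a scalar equation f(x) = 0 that keeps every iterate,
# so the path of the solver can be inspected afterwards.
function newton_trace(f, df, x0; tol = 1e-10, maxiter = 8)
    trace = [x0]
    x = x0
    for _ in 1:maxiter
        abs(f(x)) < tol && break
        x -= f(x) / df(x)
        push!(trace, x)
    end
    return trace
end

f(x) = atan(x)
df(x) = 1 / (1 + x^2)

good = newton_trace(f, df, 1.0)   # starting value close enough to the root
bad = newton_trace(f, df, 2.0)    # starting value too far away

println("start 1.0: ", good[1:4])
println("start 2.0: ", bad[1:4])
```

The first trace shrinks towards the root at zero, while the second one overshoots further on every step. That pattern is obvious from the first few iterates, which is exactly the information a debugger (or a recorded trace like this) exposes.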


#35

I’m not an experienced programmer (or a programmer at all: I use MATLAB for algorithms, not programming).

But my view of things is:

  • Debugger
    Helps the programmer make sure the code does what the programmer intended it to do.
  • Unit Tests
    Helps the programmer make sure the code does what the code should do.

Hence they are complementary and not one instead of the other.

The MATLAB debugger should certainly be a reference point.


#36

Totally agree with RoyiAvital and IljaK91. I am a researcher at an institute, and many of my colleagues use programming languages to implement models and analyze data, including Matlab, R, Python and Java. When selecting a language, I always consider whether it lets me finish the task conveniently. I have experience using C++, Matlab, Python and R, and am currently starting to consider Julia. I am used to the convenient debuggers in Visual C++, PyDev, RStudio, and PyCharm. When I started using Julia, I found it hard to work without a convenient debugger.

A large part of the Julia users would be engineers, students or researchers, who are not programming experts. Julia language itself is elegant and productive. However, the end users who are not programming experts would care more about the efficiency of the whole solution, including the language, packages, and editors. For me, the reason to use Julia is JuMP, which saves me a lot of time.

As for the debugger, maybe programming experts depend more on unit tests, while researchers and engineers rely more on debuggers.


#37

You can also do the opposite, which is a very good practice in my opinion: write the tests first and then implement the function to pass the tests. This is known as test-driven development, and it can save you a lot of debugging as well.
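A minimal sketch of that order of work, assuming a recent Julia 1.x with the `Test` standard library. The `moving_average` function is a hypothetical example, not something taken from this thread.

```julia
using Test

# Step 1: write down the expected behaviour before any implementation exists.
function run_tests()
    @testset "moving_average" begin
        @test moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]
        @test moving_average([5, 5, 5], 3) == [5.0]
        @test_throws ArgumentError moving_average([1, 2], 3)
    end
end

# Step 2: implement just enough to make those tests pass.
function moving_average(x, k)
    k <= length(x) || throw(ArgumentError("window longer than data"))
    return [sum(x[i:i + k - 1]) / k for i in 1:length(x) - k + 1]
end

run_tests()
```

Calling `run_tests()` before step 2 is written fails with an undefined-function error, which is the "red" state that the implementation then turns green.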


#38

Good programming habits with test suites, code coverage, etc. eliminate almost entirely the need for a debugger.

I am a scientist that writes code, not a coder that does science. This is an important distinction. I would love to go back to school, learn about computer internals, get to the point where data structures are second nature, and learn and always implement good programming habits from the start of a project.

But I was in school for a decade learning about biology, and now that I’m doing computational biology I don’t have time to go back and learn these things from scratch. I’m picking up knowledge as I need it.

“You can also do the opposite, which is a very good practice in my opinion: write the tests first and then implement the function to pass the tests. This is known as test-driven development.”

I know what test-driven development is, but it’s simply not feasible for what I’m working on. I work on datasets that are not uniform - I have to do exploratory data analysis before I can even know where to start. A lot of code I write is one-off code that will only ever be used once.

Truthfully, I haven’t used a debugger much because I haven’t had time to learn how to use one. I just litter my code with print statements and then ctrl-c in the middle of massive loops when I have to find problems. Or I give up, delete everything and start over because that’s often faster.

You might be cringing now, but it’s important to understand that I almost never write anything that should be considered “production code,” because I generally don’t need to.


#39

@kevbonham it really depends on what your goals are. If you want to push the frontiers of science within your research group by writing one-shot scripts that get the job done, that is perfectly fine. However, if you care about educating other people and delivering reproducible science outside of your lab that works with others’ datasets, then there is no escape from test suites and a good amount of assertions in your code. Users of your science aren’t computer scientists either, and you have to ask yourself if you want them to 1) spend time debugging your code or 2) learn what the input is supposed to be from the assertions you wrote.
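To show what option 2 might look like in practice, here is a small sketch assuming Julia 1.2 or later (for the `>=(0)` shorthand). The `normalize_counts` function and its rules are hypothetical, and the checks use explicit `throw` calls rather than `@assert`, since the Julia documentation notes that asserts may be disabled at some optimization levels.

```julia
# Hypothetical analysis routine: each check doubles as documentation of what
# the input is supposed to look like.
function normalize_counts(counts::AbstractMatrix{<:Real})
    isempty(counts) && throw(ArgumentError("counts must not be empty"))
    all(>=(0), counts) || throw(ArgumentError("counts must be non-negative"))
    totals = sum(counts, dims = 1)
    all(>(0), totals) || throw(ArgumentError("every column (sample) needs at least one count"))
    return counts ./ totals
end

abundances = normalize_counts([1 0; 3 4])
println(abundances)

# A malformed input fails immediately with a readable message instead of
# producing NaN values further down the pipeline.
try
    normalize_counts([1 0; -3 0])
catch err
    println(err.msg)
end
```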


#40

I’ve developed the habit of coding without a debugger since coming to Julia. However, when I speak about Julia to other colleagues (scientists), they often ask two questions:

  1. “Does it have an IDE?” -> For which I can speak about the greatness of Juno.

  2. “Does it have a (MATLAB-like) debugger?” -> For which I need to tell them “no”.

Usually, their interest fades after they learn that there is no debugger, even if I have already told them that Julia is fast and proven it to them.

The thing is, scientists are not programmers. Some of them are really good, but most of them would fail the majority of CS courses.


#41

Thanks everyone for the thoughts in here. It is very interesting to see the variety of use-cases, particularly since the wide appeal of Julia is one of the charms of this community. The debate on the merits of each approach is less so; people should be free to use whatever workflow they are comfortable in.

For the benefit of the onlookers, I should probably clarify that we do have a debugger. It works reasonably well on 0.5. Setting a breakpoint is sometimes dodgy from Juno, but it’s better from the command line, and stepping and variable inspection are quite good. So it’s not as if people have not put huge amounts of effort into the debugger; let’s not diminish that work. It’s not perfect, but it’s there.

Yes, it is unfortunate that it does not yet work on 0.6. But you will appreciate that the debugger is deeply tied to the internals of the language, and so will need substantial changes given the evolution of the language. Also, it needs skills both in language internals and OS internals, so the pool of people willing to work on it is smaller. That work will happen eventually, and the fact that it’s somewhat delayed is no comment on the importance of the debugging workflow.

Regards

Avik


#42

An IDE without a debugger is just a DE.


#43

@juliohm it seems you misunderstand me, or you misunderstand the sort of work that many biologists do. If one wants to calculate the mean of 5 numbers, it is not necessary to provide the code that was used to make the calculation reproducible. One can simply say they took the sum of the numbers and divided it by 5. Sure, if no mean() function existed, it might be nice to provide that for the community, and many scientists are in the business of writing tools.

But a lot of data processing work is idiosyncratic, and spending the time to write a well-tested suite of functions that no one else will ever use is a waste of time. Better to describe the analysis and let someone implement it the way that makes sense to them. Further, by far the majority of the code I write is exploratory and is not intended for publication.

My science shouldn’t be reproducible because of my code, it should be reproducible because I design well controlled experiments. Again, I’m not a computer scientist - I hope to god no one is trying to learn from the code I write. I know you don’t intend to be insulting, since you said as much to someone else, but when you imply (as you do here) that the work someone is doing is less valuable than other work, some people might take offense.

Actually, I don’t want them to do either. My science typically isn’t my code, that’s exactly the point. Most of my science doesn’t have “users.” To the extent my work has value, it has value in that I’m designing good experiments, executing them properly and thoroughly describing the results. People that care about the biology won’t bother looking at the code that analyzed the data, and people that care about the data are almost certainly better off working their own code to analyze it.

I did write a piece of software that’s intended for others to use, and that’s documented, tested, and I’ve got CI running on it. But that package accounts for maybe 5% of the coding that I’ve done or will likely do in the future.