Collecting all output from shell commands

I want to run a command that may or may not exit gracefully. In any case I want to get all of the following information from it:

  • content of stdout as String
  • content of stderr as String
  • exit code as Int

What is the recommended way to do this? (Currently, my hacky approach is to redirect the output to a file and read that file back.)
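
For reference, the file-based workaround mentioned above might look roughly like the sketch below. The function name `execute_via_files` and the file names are made up for illustration; only `run`, `pipeline`, `ignorestatus`, and `mktempdir` are standard Julia.

```julia
# Hypothetical helper: capture stdout/stderr through temporary files.
function execute_via_files(cmd::Cmd)
    mktempdir() do dir
        out_path = joinpath(dir, "stdout.txt")  # illustrative file names
        err_path = joinpath(dir, "stderr.txt")
        # ignorestatus prevents run from throwing on a nonzero exit code
        process = run(pipeline(ignorestatus(cmd), stdout=out_path, stderr=err_path))
        (
            stdout = read(out_path, String),
            stderr = read(err_path, String),
            code = process.exitcode,
        )
    end
end

result = execute_via_files(`sh -c "echo out; echo err 1>&2; exit 3"`)
```

Because the child writes to files rather than pipes, it cannot block on a full pipe buffer, at the cost of touching the disk.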

I did this recently… let me find the code…

"Run a Cmd object, returning the stdout & stderr contents plus the exit code"
function execute(cmd::Cmd)
  out = Pipe()
  err = Pipe()

  process = run(pipeline(ignorestatus(cmd), stdout=out, stderr=err))
  close(out.in)
  close(err.in)

  (
    stdout = String(read(out)), 
    stderr = String(read(err)),  
    code = process.exitcode
  )
end

execute(`ls`)
execute(`ls --invalid-option`)

Cool, thanks! I like your solution, but I will leave the question open a bit longer to see if there are alternatives.

To avoid the deadlock situation inherent in doing IO operations sequentially, this should be written as:

stdout = @async String(read(out))
stderr = @async String(read(err))
return (
    stdout = fetch(stdout),
    stderr = fetch(stderr),
    code = process.exitcode
)
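
Putting the pieces together, a complete version might look like the sketch below. It assumes the process is started with `wait=false` so that both readers drain the pipes while the command is still running; the name `execute_async` is made up for illustration. Note that `fetch` returns a task's result, whereas `wait` only blocks until the task finishes.

```julia
# Hypothetical helper: read stdout and stderr concurrently while the command runs.
function execute_async(cmd::Cmd)
    out = Pipe()
    err = Pipe()

    # wait=false returns immediately, so the readers below can start
    # before the child fills the pipe buffers
    process = run(pipeline(ignorestatus(cmd), stdout=out, stderr=err), wait=false)
    close(out.in)   # close the parent's write ends so the reads see EOF
    close(err.in)

    stdout_task = @async String(read(out))
    stderr_task = @async String(read(err))
    wait(process)

    return (
        stdout = fetch(stdout_task),
        stderr = fetch(stderr_task),
        code = process.exitcode,
    )
end

result = execute_async(`sh -c "echo out; echo err 1>&2; exit 3"`)
```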

Let me add another thing to the challenge. I still want to run a command and get the stdout, stderr, exitcode back. However this time, I also want to write from julia to the stdin of the command.

With this suggestion

function execute(cmd::Cmd)
  out = Pipe()
  err = Pipe()

  process = run(pipeline(ignorestatus(cmd), stdout=out, stderr=err))
  close(out.in)
  close(err.in)
  stdout = @async String(read(out))
  stderr = @async String(read(err))
  (
    stdout = String(read(out)), 
    stderr = String(read(err)),  
    code = process.exitcode
  )
end

julia> a, = execute(`cmd /c dir C:`); println(a)
julia> a, = execute(`cmd /c dir C:`); println(a)
julia> a, = execute(`cmd /c dir C:`); println(a)
(Repeatedly enter the line above: execute it, press the up arrow, and execute it again.)

What you will see is that it sometimes prints the output but occasionally prints nothing. Is there a way to avoid this?

It looks like this works.

function communicate(cmd::Cmd, input)
    inp = Pipe()
    out = Pipe()
    err = Pipe()

    process = run(pipeline(cmd, stdin=inp, stdout=out, stderr=err), wait=false)
    close(out.in)
    close(err.in)

    stdout = @async String(read(out))
    stderr = @async String(read(err))
    write(process, input)
    close(inp)
    wait(process)
    return (
        stdout = fetch(stdout),
        stderr = fetch(stderr),
        code = process.exitcode
    )
end

@show communicate(`cat`, "hello")
@show communicate(`sh -c "cat; echo errrr 1>&2; exit 3"`, "hello")

Cool, it works! It looks a bit weird, though, since inp is not really used.

You can use write(inp, input) instead of write(process, input).

Well, actually I don’t fully understand it. I needed to put a Pipe there because, when passing wait=false, run otherwise sets stdin to devnull. Maybe you can use open(pipeline(cmd, stderr=err), "r+") or something similar to avoid some of the explicit Pipe objects? I haven’t tried that route.

@musm can you open an issue? That observed behavior doesn’t sound right.

Separately, I’m working on some changes to make this more convenient; stay tuned.

Any updates?

https://docs.julialang.org/en/v1/manual/running-external-programs/#Running-External-Programs-1
Seems read should be able to work for those cases now.
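
As a minimal sketch of what that manual section describes, `read` on a command returns its stdout. The shell command here is only an example. Note that this form captures stdout only: stderr is not collected, and the exit code is not returned.

```julia
# ignorestatus keeps `read` from throwing on a nonzero exit code.
cmd = `sh -c "echo hi; exit 2"`
text = read(ignorestatus(cmd), String)
```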

I discovered that for some larger outputs, run(pipeline(ignorestatus(cmd), stdout=out, stderr=err)) hangs forever, while run(cmd) finishes quickly, so there seems to be a problem with larger outputs in Julia 1.5.4.

Yes, reading long output is indeed nontrivial, and this is not specific to 1.5.4. See my question here: "How to read long output from external command?" There was no answer, so I still don’t know how to do it properly.

I routinely read multimegabyte amounts of data produced by an external process. The way I do it is similar to that described above: read in an async task and concatenate the output until the external process is done.

Oh yikes, that code seems awful. Using readavailable is nearly always a bad sign.

The code samples above seem to have converged to a proper solution (e.g. Collecting all output from shell commands - #7 by tkf)

In response to my previous comment about making this more convenient: You can now pass a Base.BufferStream object as the pipeline, which will manage setting up the async reader in the background for you. However, this also disables flow control (unless you also hack the BufferStream object to have a maximum buffer size), so do this at your own risk. But for most use cases, this is not an issue.

You can also use the open(cmd, devnull, out, err) do; end form instead of run(pipeline(cmd, stdout=out, stderr=err)), which is shorter, and should soon support detection and reporting of this particular deadlock risk, once https://github.com/JuliaLang/julia/pull/39544 is merged.

Sure it’s awful. I’ve spent nine years frustrated by that code.

The solution above applies only to processes whose entire output is read in one shot. In other words, communicate() does not return until cmd ends.

In my case, I need to interact with a long-lived process, continuously and asynchronously. I send it a command via stdin and it replies via stdout. The process may crash, may send a partial response, may take a long time to respond, etc.

That awful code is the best I’ve been able to cobble together (and it works, so far). I’ve asked for help with this problem many times before; if you have suggestions, I’d love to hear them. I already have an issue to remove readavailable(). I also want to get rid of Base.throwto() but I’m not sure how to handle timeouts.

I don’t know anything about gnuplot. The examples in their docs don’t show stopping at any point to read stdout, so they aren’t particularly useful. Does it print anything other than echoing the input and then writing out “gnuplot>”? Perhaps you want to do read(out, byteswritten); readuntil(out, "\ngnuplot>")? I think the latter is relatively new compared to when you started Gaston.jl.
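
A rough sketch of that request/reply idea with a timeout is shown below. It uses `cat` as a stand-in for the long-lived process (it simply echoes its input), and the helper name `request`, the sentinel string, and the timeout value are all made up for illustration. This is not how Gaston.jl does it.

```julia
# Stand-in for a long-lived interactive process: `cat` echoes its input back.
proc = open(`cat`, "r+")

# Hypothetical helper: send `text`, then wait up to `timeout` seconds for
# output ending in `sentinel`. Returns the reply, or nothing on timeout.
function request(proc, text::AbstractString; sentinel = "Done\n", timeout = 5.0)
    println(proc, text)                           # write to the child's stdin
    reader = @async readuntil(proc, sentinel)     # read the child's stdout in the background
    status = timedwait(() -> istaskdone(reader), timeout)
    status === :ok || return nothing              # no complete reply in time
    return fetch(reader)                          # reply without the sentinel
end

reply = request(proc, "plot sin(x)\nDone")

close(proc.in)   # signal EOF so `cat` exits
wait(proc)
```

One caveat with this design: after a timeout, the reader task is still blocked on the pipe and will consume whatever the process prints later, so a real implementation would need to decide how to handle or discard that stale reader.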

To answer the original post, there is the OutputCollectors.jl package which does all that was asked:

julia> using OutputCollectors

julia> script = """
       #!/bin/sh
       echo 1
       sleep 1
       echo 2 >&2
       sleep 1
       echo 3
       sleep 1
       echo 4
       false
       """
"#!/bin/sh\necho 1\nsleep 1\necho 2 >&2\nsleep 1\necho 3\nsleep 1\necho 4\nfalse\n"

julia> oc = OutputCollector(`sh -c $script`; verbose = true);

julia> [19:28:26] 1
[19:28:27] 2
[19:28:28] 3
[19:28:29] 4
julia> 

julia> oc.P.exitcode
1

julia> collect_stdout(oc)
"1\n3\n4\n"

julia> collect_stderr(oc)
"2\n"

(Maybe this should be moved to a different thread?)

Gnuplot reads from stdin when invoked as part of a pipe; when doing so, you don’t get the usual gnuplot REPL. This command results in the plot of a sine wave in addition to the console output:

$ echo "plot sin(x); set output '-'; print 'Done'" | gnuplot --persist
Done

(The set output '-' command results in Gnuplot printing to stdout.)

What I do in Gaston is keep the pipe open, so that I can send subsequent commands to the same gnuplot process. In addition, I try to fail gracefully when Gnuplot fails to respond (that is, it fails to print Done after a timeout period).