Error spawn file when file exists in Julia

Hello,
I am trying to run a shell-based bioinformatics tool (blastn) from Julia. I created a file containing a text query and passed it as a parameter to the command. However, Julia reports that the file does not exist even though it has been created correctly. I tried assigning the output of the command to an object, but the issue persists even when simply running the command:

julia> q_file
"/home/gigiux/Downloads/h32/query.fa"
julia> readdir("/home/gigiux/Downloads/h32/")
14-element Array{String,1}:
 "1_readSelection.sh"
 "2_readCounter.jl"
 "3_readPurger.jl"
 "4_readHandler.jl"
 "5_readWrapper.jl"
 "6_nls.R"
 "7_readConfirm.jl"
 "confirm"
 "counts"
 "deNovo"
 "plot"
 "query.fa"  ### here it is
 "rslt"
 "src"
julia> hit = chop(read(pipeline(`blastn -db /home/gigiux/refSeq/fusion/fusion38-10k \
                   -max_target_seqs 5 -max_hsps 1 -outfmt  '6 qseqid sseqid evalue bitscore pident sacc' \
                   -query q_file`), String))
ERROR: IOError: could not spawn `blastn -db /home/gigiux/refSeq/fusion/fusion38-10k '
' -max_target_seqs 5 -max_hsps 1 -outfmt '6 qseqid sseqid evalue bitscore pident sacc' '
' -query q_file`: no such file or directory (ENOENT) ###
Stacktrace:
 [1] _jl_spawn(::String, ::Array{String,1}, ::Cmd, ::Tuple{Base.DevNull,Base.PipeEndpoint,Base.TTY}) at ./process.jl:367
 [2] (::getfield(Base, Symbol("##493#494")){Cmd})(::Tuple{Base.DevNull,Base.PipeEndpoint,Base.TTY}) at ./process.jl:509
 [3] setup_stdio(::getfield(Base, Symbol("##493#494")){Cmd}, ::Tuple{Base.DevNull,Pipe,Base.TTY}) at ./process.jl:490
 [4] #_spawn#492(::Nothing, ::Function, ::Cmd, ::Tuple{Base.DevNull,Pipe,Base.TTY}) at ./process.jl:508
 [5] _spawn(::Cmd, ::Tuple{Base.DevNull,Pipe,Base.TTY}) at ./process.jl:504
 [6] #open#502(::Bool, ::Bool, ::Function, ::Cmd, ::Base.DevNull) at ./process.jl:594
 [7] open at ./process.jl:584 [inlined]
 [8] open(::Cmd, ::String, ::Base.DevNull) at ./process.jl:565
 [9] read(::Cmd) at ./process.jl:634
 [10] read(::Cmd, ::Type{String}) at ./process.jl:645
 [11] top-level scope at none:0
julia> run(`blastn -db /home/gigiux/refSeq/fusion/fusion38-10k \
                   -max_target_seqs 5 -max_hsps 1 -outfmt  '6 qseqid sseqid evalue bitscore pident sacc' \
                   -query q_file`)
ERROR: IOError: could not spawn `blastn -db /home/gigiux/refSeq/fusion/fusion38-10k '
' -max_target_seqs 5 -max_hsps 1 -outfmt '6 qseqid sseqid evalue bitscore pident sacc' '
' -query q_file`: no such file or directory (ENOENT)
Stacktrace:
 [1] _jl_spawn(::String, ::Array{String,1}, ::Cmd, ::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:367
 [2] (::getfield(Base, Symbol("##493#494")){Cmd})(::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:509
 [3] setup_stdio(::getfield(Base, Symbol("##493#494")){Cmd}, ::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:490
 [4] #_spawn#492(::Nothing, ::Function, ::Cmd, ::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:508
 [5] _spawn at ./process.jl:504 [inlined]
 [6] #run#503(::Bool, ::Function, ::Cmd) at ./process.jl:662
 [7] run(::Cmd) at ./process.jl:661
 [8] top-level scope at none:0

Why is the input file not recognized? Is it a Julia issue, or is it down to the shell command?

Does blastn exist in your current PATH variable? Can you run the same command from the unix shell?

Yes, I can; in fact, I am simply translating a shell script I already used into Julia.

Does running push!(LOAD_PATH, pwd()) make a difference (assuming your working directory is the one containing the file)?

I think q_file is not interpolated.
Try $q_file:

julia> hit = chop(read(pipeline(`blastn -db /home/gigiux/refSeq/fusion/fusion38-10k \
                   -max_target_seqs 5 -max_hsps 1 -outfmt  '6 qseqid sseqid evalue bitscore pident sacc' \
                   -query $q_file`), String))
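
A minimal sketch of the difference, using echo as a stand-in so it runs without BLAST installed (the path is the one from the question). Inside backticks a bare word is passed literally, while $name inserts the variable's value as a single argument:

```julia
q_file = "/home/gigiux/Downloads/h32/query.fa"

literal = `echo -query q_file`        # passes the literal text "q_file"
interpolated = `echo -query $q_file`  # passes the value stored in q_file

# The exec field of a Cmd holds the argument vector handed to the OS.
println(literal.exec)
println(interpolated.exec)
```

Inspecting the arguments this way, before calling run, is a cheap check that the command contains what you expect.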

Oops! I missed that, but it is still a no-go:

julia> hit = run(`blastn -db /home/gigiux/refSeq/fusion/fusion38-10k \
                   -max_target_seqs 5 -max_hsps 1 -outfmt  '6 qseqid sseqid evalue bitscore pident sacc' \
                   -query $q_file`)
ERROR: IOError: could not spawn `blastn -db /home/gigiux/refSeq/fusion/fusion38-10k '
' -max_target_seqs 5 -max_hsps 1 -outfmt '6 qseqid sseqid evalue bitscore pident sacc' '
' -query /home/gigiux/Downloads/h32/query.fa`: no such file or directory (ENOENT)
Stacktrace:
 [1] _jl_spawn(::String, ::Array{String,1}, ::Cmd, ::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:367
 [2] (::getfield(Base, Symbol("##493#494")){Cmd})(::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:509
 [3] setup_stdio(::getfield(Base, Symbol("##493#494")){Cmd}, ::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:490
 [4] #_spawn#492(::Nothing, ::Function, ::Cmd, ::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:508
 [5] _spawn at ./process.jl:504 [inlined]
 [6] #run#503(::Bool, ::Function, ::Cmd) at ./process.jl:662
 [7] run(::Cmd) at ./process.jl:661
 [8] top-level scope at none:0

To be sure try the full path to blastn.
You can get it on the shell with
which blastn
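
The same lookup can be done from inside Julia, which has the advantage of using the PATH that the Julia session itself sees. A small sketch, assuming the tool is named blastn as in this thread:

```julia
# Sys.which returns the absolute path of an executable found on the PATH
# visible to this Julia process, or nothing if it cannot be found.
blastn_path = Sys.which("blastn")

if blastn_path === nothing
    println("blastn is not on the PATH seen by Julia")
else
    println("Julia would run: ", blastn_path)
end
```

If this prints that blastn is not found while the shell's which succeeds, the two environments differ.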

I think this is it: even if blastn was in $PATH, it really required a full path:

julia> hit = run(`/home/gigiux/src/blast/bin/blastn -db /home/gigiux/refSeq/fusion/fusion38-10k \
                   -max_target_seqs 5 -max_hsps 1 -outfmt  '6 qseqid sseqid evalue bitscore pident sacc' \
                   -query $q_file`)
USAGE
  blastn [-h] [-help] [-import_search_strategy filename]
    [-export_search_strategy filename] [-task task_name] [-db database_name]
    [-dbsize num_letters] [-gilist filename] [-seqidlist filename]
    [-negative_gilist filename] [-negative_seqidlist filename]
    [-taxids taxids] [-negative_taxids taxids] [-taxidlist filename]
    [-negative_taxidlist filename] [-entrez_query entrez_query]
    [-db_soft_mask filtering_algorithm] [-db_hard_mask filtering_algorithm]
    [-subject subject_input_file] [-subject_loc range] [-query input_file]
    [-out output_file] [-evalue evalue] [-word_size int_value]
    [-gapopen open_penalty] [-gapextend extend_penalty]
    [-perc_identity float_value] [-qcov_hsp_perc float_value]
    [-max_hsps int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]
    [-xdrop_gap_final float_value] [-searchsp int_value] [-penalty penalty]
    [-reward reward] [-no_greedy] [-min_raw_gapped_score int_value]
    [-template_type type] [-template_length int_value] [-dust DUST_options]
    [-filtering_db filtering_database]
    [-window_masker_taxid window_masker_taxid]
    [-window_masker_db window_masker_db] [-soft_masking soft_masking]
    [-ungapped] [-culling_limit int_value] [-best_hit_overhang float_value]
    [-best_hit_score_edge float_value] [-subject_besthit]
    [-window_size int_value] [-off_diagonal_range int_value]
    [-use_index boolean] [-index_name string] [-lcase_masking]
    [-query_loc range] [-strand strand] [-parse_deflines] [-outfmt format]
    [-show_gis] [-num_descriptions int_value] [-num_alignments int_value]
    [-line_length line_length] [-html] [-sorthits sort_hits]
    [-sorthsps sort_hsps] [-max_target_seqs num_sequences]
    [-num_threads int_value] [-remote] [-version]

DESCRIPTION
   Nucleotide-Nucleotide BLAST 2.9.0+

Use '-help' to print detailed descriptions of command line arguments
========================================================================

Error:  (CArgException::eSynopsis) Too many positional arguments (1), the offending value:

ERROR: failed process: Process(`/home/gigiux/src/blast/bin/blastn -db /home/gigiux/refSeq/fusion/fusion38-10k '
' -max_target_seqs 5 -max_hsps 1 -outfmt '6 qseqid sseqid evalue bitscore pident sacc' '
' -query /home/gigiux/Downloads/h32/query.fa`, ProcessExited(1)) [1]
Stacktrace:
 [1] pipeline_error at ./process.jl:705 [inlined]
 [2] #run#503(::Bool, ::Function, ::Cmd) at ./process.jl:663
 [3] run(::Cmd) at ./process.jl:661
 [4] top-level scope at none:0

The problem here was the backslashes used to split the shell command across lines. After removing them and using read(pipeline()), the result is:

julia> hit = read(pipeline(`/home/gigiux/src/blast/bin/blastn -db /home/gigiux/refSeq/fusion/fusion38-10k -max_target_seqs 5 -max_hsps 1 -outfmt  '6 qseqid sseqid evalue bitscore pident sacc' -query $q_file`))
0-element Array{UInt8,1}
julia> hit
0-element Array{UInt8,1}

which is the output of the command, as expected. Thank you.
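
The error messages above show each backslash-newline pair being passed to blastn as a separate '\n' argument. As a sketch of an alternative to putting everything on one line: newlines inside backticks are treated as ordinary whitespace, so the command can be split across lines without backslashes. echo and the shortened format string are stand-ins so the example runs without BLAST:

```julia
# Newlines inside backticks separate arguments like spaces do,
# so no trailing backslash is needed.
cmd = `echo -max_target_seqs 5
            -max_hsps 1
            -outfmt '6 qseqid sseqid evalue'`

# Each element is one argument; the single-quoted format string stays whole.
foreach(println, cmd.exec)
```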

The environment in which Julia runs commands may not be the same (in this case, the PATH environment variable) as that of the shell in which your shell script runs.

By the way, it is good practice, for security reasons, to always use the full path to a command in shell scripts. If you always do so, you can be sure that, for example, a suid-bit script cannot be subverted by someone setting the PATH variable before running it, with the malicious goal of running, say, another “blastn” command that alters the /etc/security file from a suid-root script. (I admit suid-root scripts are somewhat outdated.)


You shouldn’t need the full path; there’s probably some difference between the shell environment and your Julia session. You could try printing the value of ENV["PATH"] to debug that.
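
One way to do that check, sketched under the assumption of a Unix-like system (PATH entries separated by ':'); the BLAST directory is the one reported earlier in this thread:

```julia
# Directories searched for executables, as seen by this Julia session.
path_dirs = split(get(ENV, "PATH", ""), ':')
foreach(println, path_dirs)

# Is the BLAST directory from this thread among them?
blast_dir = "/home/gigiux/src/blast/bin"
println(blast_dir, blast_dir in path_dirs ? " is on PATH" : " is not on PATH")
```

If the directory is missing, prepending it for the session with ENV["PATH"] = string(blast_dir, ":", ENV["PATH"]) should let the bare command name resolve.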