BioSequences upgrade works on 1.0 but not 0.7

Hi,

We’ve been upgrading BioSequences for Julia 0.7/1.0 in this PR: https://github.com/BioJulia/BioSequences.jl/pull/41

I’ve managed to get the tests on Travis CI passing for Julia 1.0 but, oddly, not for 0.7. The error thrown on 0.7 is:

ERROR: LoadError: LoadError: LoadError: syntax: invalid variable expression in "where"
Stacktrace:
[1] include at ./boot.jl:317 [inlined]
[2] include_relative(::Module, ::String) at ./loading.jl:1038
[3] include at ./sysimg.jl:29 [inlined]
[4] include(::String) at /home/travis/build/BioJulia/BioSequences.jl/src/BioSequences.jl:11
[5] top-level scope at none:0
[6] include at ./boot.jl:317 [inlined]
[7] include_relative(::Module, ::String) at ./loading.jl:1038
[8] include at ./sysimg.jl:29 [inlined]
[9] include(::String) at /home/travis/build/BioJulia/BioSequences.jl/src/BioSequences.jl:11
[10] top-level scope at none:0
[11] include at ./boot.jl:317 [inlined]
[12] include_relative(::Module, ::String) at ./loading.jl:1038
[13] include(::Module, ::String) at ./sysimg.jl:29
[14] top-level scope at none:2
[15] eval at ./boot.jl:319 [inlined]
[16] eval(::Expr) at ./client.jl:399
[17] top-level scope at ./none:3
in expression starting at /home/travis/build/BioJulia/BioSequences.jl/src/bioseq/constructors.jl:65
in expression starting at /home/travis/build/BioJulia/BioSequences.jl/src/bioseq/bioseq.jl:119
in expression starting at /home/travis/build/BioJulia/BioSequences.jl/src/BioSequences.jl:208

Yet this does not happen on Julia 1.0.

What’s more, the methods at the location indicated in the stacktrace, BioSequences.jl/src/bioseq/constructors.jl:65, do not use the “where” syntax. I’m struggling to tell whether this error is the fault of the BioSequences code or a quirk of 0.7 that got fixed in 1.0. Could I ask a few extra pairs of eyes to look over this and see whether (likely) and where I’m being silly? The actual branch with the upgrade changes is at:

(v1.0) pkg> add https://github.com/dcjones/BioSequences.jl.git#julia1

Thanks!


I have a local patch where it passes tests on v0.7. It seems you got bitten by constructors no longer having the `convert` fallback.
Testing my local patch:

┌ Warning: Constructors no longer fall back to `convert`. A constructor `BioSequences.BioSequence{BioSequences.DNAAlphabet{2}}(::BioSequences.BioSequence{BioSequences.DNAAlphabet{4}})` should be defined instead.
│   caller = convert(::Type{BioSequences.BioSequence{BioSequences.DNAAlphabet{2}}}, ::BioSequences.BioSequence{BioSequences.DNAAlphabet{4}}) at conversion.jl:25
└ @ BioSequences ~/.julia/packages/BioSequences/lsGtC/src/bioseq/conversion.jl:25
┌ Warning: Constructors no longer fall back to `convert`. A constructor `BioSequences.BioSequence{BioSequences.DNAAlphabet{2}}(::BioSequences.BioSequence{BioSequences.DNAAlphabet{4}})` should be defined instead.
│   caller = convert at conversion.jl:25 [inlined]
└ @ Core ~/.julia/packages/BioSequences/lsGtC/src/bioseq/conversion.jl:25

This is after I patched the constructor methods from

function BioSequence{DNAAlphabet{4}}(seq::BioSequence{DNAAlphabet{2}}) 
    newseq = BioSequence{DNAAlphabet{4}}(length(seq))
    for (i, x) in enumerate(seq)
        unsafe_setindex!(newseq, x, i)
    end
    return newseq
end

to

function BioSequence{DNAAlphabet{N}}(seq::BioSequence{DNAAlphabet{M}}) where {N<:Int,M<:Int}
    @assert N == 4 && M == 2
    newseq = BioSequence{DNAAlphabet{4}}(length(seq))
    for (i, x) in enumerate(seq)
        unsafe_setindex!(newseq, x, i)
    end
    return newseq
end

The code ran with the rest of the methods patched as in this gist, but the tests hung until I quit :confused:, and I don’t know what to do next.

That’s really interesting, but how does it know which method to dispatch to when there are multiple methods? N and M can be any Int value until the assert statements are run.
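A small sketch that may help with the dispatch question above. As far as I understand Julia’s subtyping, a value parameter such as `4` is not a type, so it cannot satisfy a bound like `N<:Int`; only a type parameter (for example `Int` itself) can. `ToyAlphabet` and `bits` are hypothetical stand-ins, not BioSequences names.

```julia
# ToyAlphabet is a hypothetical stand-in for DNAAlphabet.
struct ToyAlphabet{N} end

# The value 4 is not a type, so it does not satisfy the bound `N<:Int`;
# a type parameter such as Int does.
bounded_matches_value = ToyAlphabet{4}   <: (ToyAlphabet{N} where {N<:Int})
bounded_matches_type  = ToyAlphabet{Int} <: (ToyAlphabet{N} where {N<:Int})
unbounded_matches     = ToyAlphabet{4}   <: (ToyAlphabet{N} where {N})

# With concrete parameters in the signature, dispatch is decided by the
# parameter values alone, before any method body (or @assert) runs.
bits(::ToyAlphabet{2}) = 2
bits(::ToyAlphabet{4}) = 4
bits(::ToyAlphabet{N}) where {N} = error("unsupported alphabet width $N")

println((bounded_matches_value, bounded_matches_type, unbounded_matches))
println(bits(ToyAlphabet{4}()))
```

If that is right, a method written with `where {N<:Int,M<:Int}` would not be selected for `DNAAlphabet{4}` or `DNAAlphabet{2}` at all, and a check such as `N == 4` inside the body happens only after dispatch has already chosen the method.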

@jgreener64 also had a shot at this and found that the following would work on 0.7 and 1.0:

abstract type Alphabet end
abstract type Sequence end

struct DNAAlphabet{n} <: Alphabet end

mutable struct BioSequence{A<:Alphabet} <: Sequence
    data::Vector{UInt64}  # encoded character sequence data
    part::UnitRange{Int}  # interval within `data` defining the (sub)sequence
    shared::Bool          # true if and only if `data` is shared between sequences

    function BioSequence{A}(data::Vector{UInt64},
                            part::UnitRange{Int},
                            shared::Bool) where A
        return new(data, part, shared)
    end
end

function BioSequence{A}(seq::BioSequence{DNAAlphabet{2}}) where A <: DNAAlphabet{4}
    newseq = BioSequence{A}(length(seq))
    for (i, x) in enumerate(seq)
        unsafe_setindex!(newseq, x, i)
    end
    return newseq
end

Although using a parametric constraint and a “where” clause when only one possibility is allowed is not ideal.
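To make the “only one possibility” point concrete, here is a minimal sketch of the same pattern with toy types. `MiniAlphabet`, `MiniDNA` and `MiniSeq` are hypothetical names, and the single `len` field stands in for the real data fields.

```julia
abstract type MiniAlphabet end
struct MiniDNA{n} <: MiniAlphabet end

struct MiniSeq{A<:MiniAlphabet}
    len::Int   # stand-in for the real packed sequence data
end

# Same shape as the constructor above: A is bounded by MiniDNA{4}, which is
# a concrete type, so MiniDNA{4} is the only alphabet this method can build.
MiniSeq{A}(seq::MiniSeq{MiniDNA{2}}) where {A<:MiniDNA{4}} = MiniSeq{A}(seq.len)

two  = MiniSeq{MiniDNA{2}}(10)    # default constructor
four = MiniSeq{MiniDNA{4}}(two)   # dispatches to the method defined above

println(isconcretetype(MiniDNA{4}))
println(typeof(four))
```

So the `where` clause here does not really add generality; it only gives the constructor a form that both 0.7 and 1.0 accept.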

Interestingly, @jgreener64’s version made me think of the following alternative, which passes on both 0.7 and 1.0:

abstract type Alphabet end
abstract type Sequence end

struct DNAAlphabet{n} <: Alphabet end

mutable struct BioSequence{A<:Alphabet} <: Sequence
    data::Vector{UInt64}  # encoded character sequence data
    part::UnitRange{Int}  # interval within `data` defining the (sub)sequence
    shared::Bool          # true if and only if `data` is shared between sequences

    function BioSequence{A}(data::Vector{UInt64},
                            part::UnitRange{Int},
                            shared::Bool) where A
        return new(data, part, shared)
    end
end

function (::Type{BioSequence{DNAAlphabet{4}}})(seq::BioSequence{DNAAlphabet{2}})
    newseq = BioSequence{DNAAlphabet{4}}(length(seq))
    for (i, x) in enumerate(seq)
        unsafe_setindex!(newseq, x, i)
    end
    return newseq
end
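
For anyone who wants to try the callable-type form in isolation, here is a minimal sketch with toy types. `DemoAlphabet`, `DemoDNA` and `DemoSeq` are hypothetical, and the `convert` method that delegates to the constructor is my assumption about how to deal with the deprecation warning quoted earlier, not something taken from the BioSequences source.

```julia
abstract type DemoAlphabet end
struct DemoDNA{n} <: DemoAlphabet end

struct DemoSeq{A<:DemoAlphabet}
    len::Int   # stand-in for the real packed sequence data
end

# Callable-type form: a method on the type object DemoSeq{DemoDNA{4}} itself,
# with no `where` clause and no type parameters left free.
function (::Type{DemoSeq{DemoDNA{4}}})(seq::DemoSeq{DemoDNA{2}})
    return DemoSeq{DemoDNA{4}}(seq.len)
end

# Assumption: convert is defined in terms of the constructor, since
# constructors no longer fall back to convert.
Base.convert(::Type{DemoSeq{DemoDNA{4}}}, seq::DemoSeq{DemoDNA{2}}) =
    DemoSeq{DemoDNA{4}}(seq)

narrow    = DemoSeq{DemoDNA{2}}(7)
wide      = DemoSeq{DemoDNA{4}}(narrow)
also_wide = convert(DemoSeq{DemoDNA{4}}, narrow)

println(typeof(wide))
```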