How to convert LinearAlgebra.Transpose into Array?

I’m first converting a dataframe to array:

a = Matrix(df[:,["col1","col2"]])
a = transpose(a)
println(typeof(a))

=> LinearAlgebra.Transpose{Union{Missing, Float64},Array{Union{Missing, Float64},2}}

The problem is that when I pass “a” to PyCall, it results in a Python list rather than an ndarray. I am puzzled how to convert the matrix a into a regular Array{Float64,2}?

To answer (parts of?) your question: a transposed matrix can be converted to a regular Matrix by calling the Matrix constructor (or, alternatively, by using copy).

julia> using LinearAlgebra

julia> a = reshape(rand(1:10, 9), (3,3))
3×3 Matrix{Int64}:
 8  5  10
 4  3  10
 9  3   3

julia> at = transpose(a)
3×3 transpose(::Matrix{Int64}) with eltype Int64:
  8   4  9
  5   3  3
 10  10  3

julia> b = Matrix(at)
3×3 Matrix{Int64}:
  8   4  9
  5   3  3
 10  10  3

julia> b = copy(at)
3×3 Matrix{Int64}:
  8   4  9
  5   3  3
 10  10  3

Assuming that your matrix doesn’t contain any missings, you can get rid of the Union{Missing, Float64} by calling Matrix{Float64}(at) instead of Matrix(at):

julia> amissing = transpose(convert(Matrix{Union{Missing, Int64}}, a))
3×3 transpose(::Matrix{Union{Missing, Int64}}) with eltype Union{Missing, Int64}:
  8   4  9
  5   3  3
 10  10  3

julia> Matrix{Float64}(amissing)
3×3 Matrix{Float64}:
  8.0   4.0  9.0
  5.0   3.0  3.0
 10.0  10.0  3.0
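
The "doesn’t contain any missings" assumption matters here. As a minimal sketch (the data is made up for illustration), this is roughly what happens when a missing value really is present: the conversion raises an error instead of returning a matrix, because missing cannot be converted to Float64.

```julia
using LinearAlgebra

# Hypothetical data: same element type as in the question, with one real missing.
m = convert(Matrix{Union{Missing, Float64}}, [1.0 2.0; 3.0 4.0])
m[1, 2] = missing
with_missing = transpose(m)

# The conversion throws, so `converted` ends up as `nothing` here.
converted = try
    Matrix{Float64}(with_missing)
catch err
    println("Conversion failed with a ", typeof(err))
    nothing
end
```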

Final note, just in case you aren’t familiar, Matrix{Float64} is just an alias for Array{Float64, 2}, i.e. it is the same thing.

julia> Matrix{Float64}
Matrix{Float64} (alias for Array{Float64, 2})

Thanks for the thorough answer! For my case, the more important issue seems to be the union type Union{Missing, Float64}, which can be converted as you said. Another option would be to convert the dataframe beforehand:

df = dropmissing(df, disallowmissing=true)

The drawback is that missings are silently deleted without warning.
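
One way to make the drop visible is to compare the row counts before and after. This is only a sketch: the column names follow the question, while the data and the variable names (clean, n_dropped) are invented for illustration.

```julia
using DataFrames

# Hypothetical data with one incomplete row.
df = DataFrame(col1 = [1.0, missing, 3.0], col2 = [4.0, 5.0, 6.0])

n_before = nrow(df)
clean = dropmissing(df[:, ["col1", "col2"]])  # disallowmissing=true is the default
n_dropped = n_before - nrow(clean)

# Emit a warning so the deletion is no longer silent.
n_dropped > 0 && @warn "dropmissing removed $n_dropped row(s)"

# The columns are now plain Float64, so the matrix has a concrete element type.
a = Matrix(transpose(Matrix(clean)))
```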

Well I wouldn’t necessarily call this “silent deletion without warning” - the function is literally called DROPmissing, so it drops rows with missing values.

Admittedly, the function does what it is supposed to do. Still, there should be another function that checks whether any missing values are present.

You might be looking for completecases

any(ismissing, x)
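
The same check works on a matrix or a transposed matrix, not only on a single column. A small sketch, where to_float_matrix is a hypothetical helper name and not part of any package:

```julia
using LinearAlgebra

# Hypothetical helper: convert only when no missing values are present,
# otherwise fail with an explicit message.
function to_float_matrix(x::AbstractMatrix)
    n = count(ismissing, x)
    n == 0 || error("cannot convert: $n missing value(s) found")
    return Matrix{Float64}(x)
end

# Element type allows missing, but no missing is actually stored.
at = transpose(convert(Matrix{Union{Missing, Float64}}, [1.0 2.0; 3.0 4.0]))
b = to_float_matrix(at)
```

Using count instead of any costs a full pass over the data, but it lets the error message say how many values are affected.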

I think the disallowmissing! function is what you are looking for. See the DataFrames documentation for more information on handling missing data. I wrote a function below that will safely convert your data frame to concrete element types with disallowmissing!, plus some extra error information. Also check out the coalesce function if you want to replace your missing data with another value (maybe NaN or 0) before sending to Python.

using DataFrames

function makeconcrete!(df)
    # Flag every row that contains at least one missing value.
    missingrows = .!completecases(df)
    if any(missingrows)
        # Prepend the original row numbers so the offending rows can be located.
        dfm = hcat(DataFrame(row = 1:nrow(df)), df)
        println(dfm[missingrows, :])
        throw(ErrorException(
            "Cannot safely convert DataFrame to concrete type." *
            " Missing values detected in the rows above."))
    else
        disallowmissing!(df)
    end
    return df
end

df1 = DataFrame(x = [11, 12, 13, 14], y = [21, 22, 23, 24])
df2 = DataFrame(x = [11, missing, 13, 14], y = [21, 22, 23, missing])
allowmissing!(df1)
julia> makeconcrete!(df1)
4×2 DataFrame
 Row │ x      y     
     │ Int64  Int64
─────┼──────────────
   1 │    11     21
   2 │    12     22
   3 │    13     23
   4 │    14     24

julia> makeconcrete!(df2)
2×3 DataFrame
 Row │ row    x        y       
     │ Int64  Int64?   Int64?
─────┼─────────────────────────
   1 │     2  missing       22
   2 │     4       14  missing
ERROR: Cannot safely convert DataFrame to concrete type. Missing values detected in the rows above.
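
For the coalesce route mentioned above, a minimal sketch on a transposed matrix could look like this. The data is made up, and NaN as the replacement value is just one possible choice.

```julia
using LinearAlgebra

# Hypothetical data with one missing entry.
a = convert(Matrix{Union{Missing, Float64}}, [1.0 2.0; 3.0 4.0])
a[1, 2] = missing
at = transpose(a)

# Broadcasting coalesce replaces each missing with NaN and returns a new,
# ordinary matrix rather than a Transpose wrapper.
filled = coalesce.(at, NaN)
```

Since no missing remains after the replacement, the result has element type Float64 without a separate conversion step.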

Thanks! disallowmissing! also throws an exception itself if there are missings. I could catch that exception and show the error message.
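
A rough sketch of that idea, with made-up data. The exact exception type may differ between DataFrames versions, so the catch block below deliberately does not depend on it.

```julia
using DataFrames

# Hypothetical data with missing values in two rows.
df = DataFrame(x = [11, missing, 13], y = [21, 22, missing])

ok = try
    disallowmissing!(df)
    true
catch err
    # Show the library's own message plus the incomplete row numbers.
    println("Could not disallow missings: ", sprint(showerror, err))
    println("Rows with missing values: ", findall(.!completecases(df)))
    false
end
```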