How to compare non-missing elements of two DataFrames

mkarikom · June 30, 2020, 9:23pm

Seeking terseness in order to keep my @testset readable.
In this particular @test, query responses (as DataFrames) need to be checked for accuracy.

There has to be a better way to do this:

response = DataFrame(:a=>["a",missing],:g=>["a",missing])
actual = DataFrame(:a=>["a",missing],:g=>["a",missing])
@test actual[:a][(!ismissing).(actual[:a])] == response[:a][(!ismissing).(actual[:a])] &&
      actual[:g][(!ismissing).(actual[:g])] == response[:g][(!ismissing).(actual[:g])]

I’m looking for something like the following, such that equality of non-missing values in all columns (:a, :g, etc.) is checked without the columns themselves being listed explicitly:

notmissing = (!ismissing).(actual)
@test all(actual[notmissing] .== response[notmissing])

Edit: added multiple columns and corrected syntax on !ismissing

dpsanders · June 30, 2020, 11:55pm

You need (!ismissing) in parentheses.

mkarikom · July 1, 2020, 2:52am

Thanks for this, I have updated the original post for clarity

pdeffebach · July 1, 2020, 3:04am

For looking at the whole dataframe, could you do dropmissing for both data frames? I guess that wouldn’t allow for different columns to have different missing placement. You would be dropping too many rows.
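To illustrate the "dropping too many rows" concern, here is a minimal sketch, assuming DataFrames.jl is installed. The data frame `df` is made up for this example and has missings in different rows of each column.

```julia
using DataFrames

# Hypothetical data: the two columns have missings in different rows
df = DataFrame(:a => ["a", missing, "c"], :g => [missing, "b", "c"])

# dropmissing removes every row that has a missing in any column
rows_left = nrow(dropmissing(df))

# skipmissing on a single column only drops that column's missings
a_values = collect(skipmissing(df.a))
g_values = collect(skipmissing(df.g))
```

Here `dropmissing` keeps only the last row, while the per-column approach keeps two values in each column.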

Also, you should be using the two-argument getindex method. Instead of

actual[:a][(!ismissing).(actual[:a])]

do

actual[ismissing.(actual.a) .== false, :a]

But I think you should do

julia> function check_nonmissing(df1, df2)
       # Compare each column after skipping its missing values
       for n in names(df1)
           if collect(skipmissing(df1[!, n])) != collect(skipmissing(df2[!, n]))
               return false
           end
       end
       return true
       end
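A minimal sketch of how such a function might be used inside a @testset, assuming DataFrames.jl and the Test standard library are available. The loop iterates over `names(df1)`, and the `different` data frame is made up for illustration. Note that this compares the sequence of non-missing values in each column, so it does not check that the missings sit in the same rows.

```julia
using DataFrames, Test

# Compare each column of two data frames with missings skipped
function check_nonmissing(df1, df2)
    for n in names(df1)
        if collect(skipmissing(df1[!, n])) != collect(skipmissing(df2[!, n]))
            return false
        end
    end
    return true
end

# Data from the original question, plus a hypothetical mismatch
response  = DataFrame(:a => ["a", missing], :g => ["a", missing])
actual    = DataFrame(:a => ["a", missing], :g => ["a", missing])
different = DataFrame(:a => ["b", missing], :g => ["a", missing])

@testset "query response" begin
    @test check_nonmissing(actual, response)
    @test !check_nonmissing(actual, different)
end
```

This keeps each check to a single line without listing the columns explicitly.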
