Another UndefRefError: access to undefined reference

Is there any way to inspect a DataFrame to see why this error is returned when running a function on it? The error message is too vague. I got this error when running:

unique!(df,[col1,col2,col3])

I’ve only received this error on one dataset. I clean out missing values beforehand and don’t use undef.

StackTrace

ERROR: UndefRefError: access to undefined reference

Stacktrace:

[1] getindex

   @ .\essentials.jl:917 [inlined]

[2] _broadcast_getindex

   @ .\broadcast.jl:644 [inlined]

[3] _getindex

   @ .\broadcast.jl:674 [inlined]

[4] _broadcast_getindex

   @ .\broadcast.jl:650 [inlined]

[5] getindex

   @ .\broadcast.jl:610 [inlined]

[6] copyto_widen!(res::Vector{…}, bc::Base.Broadcast.Broadcasted{…}, pos::Int64, col::Int64)

   @ DataFrames C:\Users\programmer8\.julia\packages\DataFrames\kcA9R\src\other\broadcasting.jl:27

[7] copy(bc::Base.Broadcast.Broadcasted{DataFrames.DataFrameStyle, Tuple{…}, typeof(coalesce), Tuple{…}})

   @ DataFrames C:\Users\programmer8\.julia\packages\DataFrames\kcA9R\src\other\broadcasting.jl:77

[8] materialize(bc::Base.Broadcast.Broadcasted{DataFrames.DataFrameStyle, Nothing, typeof(coalesce), Tuple{…}})

   @ Base.Broadcast .\broadcast.jl:872

[9] top-level scope

   @ c:\data\process.jl:111

Some type information was truncated. Use `show(err)` to see complete types.

Nobody will be able to say anything without a dataset to reproduce this with, or at least a stacktrace :slight_smile:

I’ll recreate the stack trace. I cannot share the dataset and am unsure how to create a sample dataset that would reproduce the issue.

I’d hazard a guess that there is an unassigned array element somewhere; that’s not the same as being assigned to a missing value. Check:

((col, name) -> (name, isbitstype(eltype(col)), all(isassigned.(Ref(col), eachindex(col)))) ).(eachcol(df), names(df))

If I guessed right, it’ll show for each column whether its elements can have a reference and if so, whether they are all assigned. If you see 0, 0 then that’s a column that can throw the error. If that’s the case, you’ll have to reevaluate how you’re instantiating the DataFrame, and you’ll also have to consider the columns that can’t have references because those elements just silently hold garbage values if not assigned.
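To make the distinction concrete, here is a minimal Base-only sketch (no DataFrames needed). The vectors `strs` and `nums` and the helper `all_assigned` are hypothetical stand-ins for DataFrame columns and for the check above, not anything taken from the OP's data.

```julia
# Hypothetical stand-ins for two DataFrame columns.
strs = Vector{String}(undef, 3)   # String is not a bits type, so slots start out unassigned
strs[1] = "a"
strs[3] = "c"                     # slot 2 is never assigned

nums = Vector{Int}(undef, 3)      # Int is a bits type: slots hold arbitrary values but are never "undefined"

# Same per-column check as in the one-liner above, applied to a plain vector.
all_assigned(col) = all(isassigned.(Ref(col), eachindex(col)))

println(isbitstype(eltype(strs)), " ", all_assigned(strs))
println(isbitstype(eltype(nums)), " ", all_assigned(nums))

# Reading the unassigned slot is what raises UndefRefError.
err = try
    strs[2]
catch e
    e
end
println(err isa UndefRefError)
```

The `strs` column is the "0, 0" case: it can hold references and not all of them are assigned. The `nums` column never throws, which is exactly why unassigned bits-type elements are easy to miss.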

ERROR: MethodError: no method matching keys(::DataFrame)

You’ll need to explain the context, paste exact code, and paste the full stack trace. I could imagine it happening with eachindex(::DataFrame), which is evidently happening but shouldn’t.

That’s from running the line of your code with the dataframe name modified to fit my needs.

Stacktrace:
 [1] eachindex(itrs::DataFrame)
   @ Base .\abstractarray.jl:318
 [2] (::var"#21#22")(col::Vector{String}, name::String)
   @ Main c:\data\process.jl:111
 [3] _broadcast_getindex_evalf
   @ .\broadcast.jl:678 [inlined]
 [4] _broadcast_getindex
   @ .\broadcast.jl:651 [inlined]
 [5] getindex
   @ .\broadcast.jl:610 [inlined]
 [6] copy
   @ .\broadcast.jl:911 [inlined]
 [7] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{…}, Nothing, var"#21#22", Tuple{…}})
   @ Base.Broadcast .\broadcast.jl:872
 [8] top-level scope
   @ c:\data\process.jl:111
Some type information was truncated. Use `show(err)` to see complete types.

Yeah that could be a problem, hence the recommendations to paste the code and stacktrace. If you can’t reproduce the DataFrame object directly, could you at least show typeof(df), eltype.(eachcol(df)) as well?

show(eltype.(eachcol(allNcoa))) returns a vector of the 74 cols as String types

Alright, any one of those could have an undef element. And the line that caused the eachindex(itrs::DataFrame) error?

Trying to find something that would show the undef.
Both of these also give the undef error:
VSCodeServer.vscodedisplay(df)

println(ismissing.(df))

This is not how you do it. You want to use isassigned(col, idx) to check whether a given index in a column is assigned before trying to access it (accessing it is what ismissing.(col) would do).
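A minimal sketch of that approach, using only Base. The helper name `unassigned_positions` and the sample vectors `a` and `b` are hypothetical; with DataFrames, the argument would presumably be `eachcol(df)`.

```julia
# Hypothetical helper: list the (row, column) positions that are unassigned.
# `columns` is any iterable of vectors.
function unassigned_positions(columns)
    positions = Tuple{Int,Int}[]
    for (j, col) in enumerate(columns)
        for i in eachindex(col)
            # isassigned only inspects the slot; it never reads the value, so it cannot throw
            isassigned(col, i) || push!(positions, (i, j))
        end
    end
    return positions
end

a = Vector{String}(undef, 3)
a[1] = "x"
a[2] = "y"                 # row 3 of this column is left unassigned
b = ["p", "q", "r"]        # fully assigned column

println(unassigned_positions([a, b]))
```

Unlike a try-catch around each access, this reports positions without ever triggering the UndefRefError.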

I defined a String Matrix with some undefined values and then tried to build a DataFrame from it, but the constructor fails with an undefined reference error.
I wonder if and how it is possible to have an undefined value inside a table in DataFrames.

Not really. In general it’s not a good idea to keep undefined entries in Julia arrays. Better use missing or nothing for entries where there’s no value.
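As a small Base-only illustration of that advice, a column can be created already filled with missing instead of undef. The names `n` and `col` are made up for this sketch.

```julia
n = 3

# Every slot is assigned from the start; the placeholder is `missing`, not undef.
col = Vector{Union{Missing,String}}(missing, n)
col[1] = "a"
col[3] = "c"

println(all(isassigned.(Ref(col), eachindex(col))))  # every slot can be read safely
println(ismissing.(col))                             # element-wise checks work
println(coalesce.(col, ""))                          # and so does coalesce
```

Because every element is readable, calls such as ismissing.(col) and coalesce.(col, "") cannot hit an unassigned slot.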

Of course. But my intention was to create a similar situation (from what I understand, but I’m not sure) to the one described by the OP

I need to revisit that dataset from when I originally raised the question. I got this same error on a dataset yesterday. I had to write a quick function to drill down using a try-catch. Yesterday, the error was caused by double quotes inside a double-quoted CSV field. The error appears when calling coalesce() after the data has been read in with CSV.jl.

"John "Jonathan" Smith","","",""

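	using DataFrames

	# Scan every cell. Cells that cannot be read (e.g. UndefRefError) are reported via println;
	# the returned list holds the positions of readable cells that contain `missing`.
	function find_undef(df::DataFrame)
		undef_indices = Tuple{Int,Int}[]
		for col in 1:ncol(df)
			for row in 1:nrow(df)
				try
					if ismissing(df[row, col])
						push!(undef_indices, (row, col))
					end
				catch
					# accessing the cell threw, so print where it happened
					println("Error at ", row, ",", col)
				end
			end
		end
		return undef_indices
	end
	undef_positions = find_undef(nonm)
	println(undef_positions)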

This may indicate a bug in the CSV reader. Are you using CSV.jl? A file to reproduce the problem would be very useful.

If you put the data below in a CSV file and then try to read it in, it gives the error Cannot 'convert' an object of type Missing to an object of type String. Removing types=String truncates the cell and places only Tom in row 1, col 1.

df = DataFrame(CSV.File("/home/user/sample.csv",types=String))
"name","id","keycode"
"Tom "Thomas" Smith",1,"U9T"
"Sally Jones",2,"Y9E"
"Lori Johns",3,"U7T"
"Paul Gray",4,"E4R"

Interesting. This is definitely worth reporting against CSV.jl, even though it doesn’t seem to produce an undef cell.

(Note that I needed to add a final " after E4R to reproduce this behavior, I guess that’s just a typo.)
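For context, the common CSV convention (RFC 4180) is to escape a double quote inside a quoted field by doubling it. The sketch below uses only Base; `quote_field` is a hypothetical helper, and whether writing the file this way avoids the error in the OP's case is an assumption that has not been checked against CSV.jl here.

```julia
# Hypothetical helper: wrap one CSV field in double quotes,
# doubling any double quotes embedded in the value.
quote_field(s::AbstractString) = "\"" * replace(s, "\"" => "\"\"") * "\""

row = ["Tom \"Thomas\" Smith", "1", "U9T"]
line = join(quote_field.(row), ",")
println(line)
```

The first field comes out as "Tom ""Thomas"" Smith", which is the form a standards-following reader would treat as a single cell.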

I’ll have to issue a bug report for it. And yes the missing dq was a typo. Thank you for catching it.