Why will we have to say fieldnames(typeof(v)) instead of fieldnames(v)?

Sorry for being dense, but what is this ambiguity? I don’t understand it.

If the input is a type, then return the field names. If it is not, then return the field names of the type of the input.

I’m not upset about this change, and I can appreciate the argument that the new behaviour is a bit more logically rigorous, but I flat out cannot see where the ambiguity is.

1 Like

Because a type is also a value, and its type has its own fields.

So what? You return the field names of whatever type is passed in, except when it’s not a type at all.

2 Likes

So that’s the ambiguity. This is literally your question.
Edit: and since it was apparently not obvious to everyone, I don’t like your tone. Please be more polite.

So simple, so logical, so non-breaking.

1 Like

And more importantly, so ambiguous and so dangerous.

As someone who works with types as values a lot, I’m happy to see this change. Let’s say we have a type like this:

struct Foo 
    x::Int 
end

foo = Foo(108)

It looks appealing to write fieldnames(Foo) or fieldnames(foo) to get the list [:x]. We can use this meta facility to write generic code like this:

function dump_object(x)
    for f in fieldnames(x)
        println("$f: $(getfield(x, f))")
    end
end
dump_object(Foo(3))  # ==> "x: 3"

This is actually what I used a couple of times for serialization and various meta-programming tasks. What sounds good about this function is that it’s general enough to apply to literally any value in Julia, right? Not quite: what if we want to dump the type Foo instead of the value?

dump_object(Foo)
# ERROR: type DataType has no field x
# Stacktrace:
#  [1] dump_object(::Type{T} where T) at ./REPL[14]:3

D’oh! Even though we passed a type into dump_object, fieldnames thought we wanted to get the list of fields of its instance. To make this function really general, we need to use fieldnames(typeof(x)) instead:

function dump_object(x)
    for f in fieldnames(typeof(x))
        v = isdefined(x, f) ? getfield(x, f) : "<undefined>"
        println("$f: $v")
    end
end
dump_object(Foo(42))           # ==> "x: 42"
dump_object(Foo)               # ==> long list of DataType's fields and their values

I’ve fallen into this trap several times, sometimes in packages that I wrote months or years ago and considered stable. So in all my new code I try to never use fieldnames(v), exactly because it’s ambiguous and hard to predict in practice. Julia 0.7 will also help me find this issue in old code, which sounds good.

At the same time, I like the idea of having fieldnames_t() as a shorthand for the fieldnames(typeof(...)) pattern.
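A minimal sketch of what such a shorthand could look like. The name fieldnames_t is hypothetical (it is not defined in Base), and Foo is the struct from the example above:

```julia
# Hypothetical helper, not part of Base: always look at the type of the argument.
fieldnames_t(x) = fieldnames(typeof(x))

struct Foo
    x::Int
end

fieldnames_t(Foo(42))   # fields of Foo, because typeof(Foo(42)) is Foo
fieldnames_t(Foo)       # fields of DataType, because typeof(Foo) is DataType
```

Because there is only one method, the helper means the same thing for every input, including types.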

10 Likes

This is a similar issue to my answer back here:

julia> using Base: @pure

julia> @pure is_mutable(x::DataType) = x.mutable    # for a type
is_mutable (generic function with 1 method)

julia> @pure is_mutable(x) = is_mutable(typeof(x))    # for a value
is_mutable (generic function with 2 methods)

julia> @pure is_immutable(x) = !is_mutable(x)
is_immutable (generic function with 1 method)

You could use DataType to dispatch on it and use the generic method for everything else.

Julia’s dispatch mechanisms allow you to define different implementations based on the input type, but the guiding principle should be that all the methods for a given function should mean the same thing. Obviously that’s pretty loosely-defined, so I try to think about what the help string for a given generic function should be, and minimise the number of “except when” clauses you’d need.

So in this case it seems like the new behavior is “fieldnames(x) returns a list of the fields of x”. You’re arguing for

“fieldnames(x) returns a list of the fields of typeof(x). fieldnames(x::Type) returns the fields of x.”

This sort of punning is discouraged because, while it may make things more convenient when you know a priori the type of x, it makes it harder to write generic code that means the same thing regardless of the type of x.
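To make that concrete, here is a small sketch. The names punned_fieldnames, count_fields, and Point are made up for illustration; punned_fieldnames imitates the two-clause behavior described above:

```julia
# Hypothetical punned function: one meaning for values, another for types.
punned_fieldnames(x) = fieldnames(typeof(x))
punned_fieldnames(T::Type) = fieldnames(T)

struct Point
    x::Float64
    y::Float64
end

# Generic code that assumes "the field names of x" always describes x itself.
count_fields(x) = length(punned_fieldnames(x))

p = Point(1, 2)
count_fields(p) == nfields(p)          # true: both describe the object p
count_fields(Point) == nfields(Point)  # false: nfields counts the fields of the
                                       # DataType object, the pun counts Point's
```

The generic function silently changes meaning when it is handed a type, which is exactly the “except when” clause the help string would need.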

9 Likes

Note that this will lead to a wrong answer for types like Array.
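A short sketch of why, assuming the dispatch pattern from the snippet above. The function which_method is hypothetical and only reports which method gets selected:

```julia
# Hypothetical stand-ins for the two is_mutable methods above.
which_method(x::DataType) = :type_method
which_method(x) = :value_method

which_method(Int)          # :type_method
which_method(Vector{Int})  # :type_method
which_method(Array)        # :value_method, because Array is a UnionAll, not a DataType

Array isa DataType         # false
Array isa UnionAll         # true
Array isa Type             # true
```

Dispatching on x::Type instead of x::DataType would route Array to the type method, although the body would then have to handle UnionAll as well.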

It is indeed loosely-defined in general, but it’s very clear in this case. If the function can have two different meanings for the same input, then 1) it’s ambiguous and 2) it definitely is not doing a single thing.

2 Likes

Perhaps the problem can be solved by using two appropriate function names for the different meanings and not overloading a single function name with qualitatively distinct actions. As an example, apart from fieldnames, there is also the names function. If we define,

Base.names(v) = fieldnames(typeof(v))

then we get the second operation with a user-friendly name. The upshot: names is already defined for Modules with a sort-of isomorphic action. It returns a list of symbols which can be .-indexed on the value of the parameter.

For example:

julia> names(Base.Random)
21-element Array{Symbol,1}:
 :AbstractRNG    
 :GLOBAL_RNG     
 :MersenneTwister
 :Random         
 :RandomDevice   
 :bitrand        
 :rand           
⋮ 
 :randsubseq     
 :randsubseq!    
 :shuffle        
 :shuffle!       
 :srand          

julia> names(1:10)
2-element Array{Symbol,1}:
 :start
 :stop 

I’ve said many times that this is certainly fine. It just doesn’t have to be both in base, and since the operation on types can be used to define the one operating on values but not the other way around, the one in base should be the one operating on types. Checking the fields of a value/type, or providing other necessary info about the type, is also a help system feature, as mentioned above.

1 Like

Also, you shouldn’t be overriding names. For one, you are basically introducing type piracy if you define it in your own code, and if you want to put it in base, you are creating a different ambiguity.

1 Like

That’s not what ‘ambiguity’ means. That is just a branch. if A do B else do C. It’s only ambiguous if A is not definitively either true or false in each case. The statement A here is “x is a type”. Are you really saying that this does not always have a definitive answer?

If we ask a related question: “is x a type or is it a value?”, that is ambiguous.

I think @ssfrr and @dfdx make good arguments for why the new behaviour is sensible, but none of them have anything to do with ambiguities, so I don’t think it makes sense to advance that as an argument.

Unless, of course, there really is some case where A is ambiguous, but that was my original question, which has not been answered.

2 Likes

How about Base.propertynames?
https://docs.julialang.org/en/latest/base/base/#Base.propertynames

See this PR:
https://github.com/JuliaLang/julia/pull/25311
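For reference, a brief sketch of how propertynames behaves on ordinary values in Julia 1.x. The struct Sample is made up for illustration. Note that types may overload propertynames, so it is not guaranteed to match fieldnames(typeof(x)) in general:

```julia
struct Sample
    x::Int
end

s = Sample(108)

propertynames(s)     # (:x,), the names usable as s.x
propertynames(1:10)  # (:start, :stop), like the names(1:10) example above

# For a plain struct with no overloads, this agrees with the typeof pattern.
propertynames(s) == fieldnames(typeof(s))  # true
```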

1 Like

It is. Your branch ignores the ambiguity and doesn’t prove anything at all. It is of course always possible to force the ambiguous case to take one of its meanings, but that’s a terrible implementation, and it does not mean the ambiguity is not there.

No, none of their arguments makes sense if there’s no ambiguity. See below.

And I’ll repeat again that there are two different things here.

  1. Get the field names of a type. This means returning empty for Array and Int, and an error for [].
  2. Get the field names of a value. This means returning the field names of the types UnionAll, DataType and Vector{Any} for the above three cases.
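The two meanings can be sketched side by side. This assumes Julia 1.x, where fieldnames only accepts types; the exact fields of Array differ between Julia versions, so no output is shown for it:

```julia
# Meaning 1: the argument is treated as a type.
fieldnames(Int)            # (), Int is a primitive type without fields

# On Julia 1.x, passing a non-type is a MethodError.
threw = try
    fieldnames([])
    false
catch err
    err isa MethodError
end

# Meaning 2: the argument is treated as a value, so we look at its type.
typeof(Array)              # UnionAll
typeof(Int)                # DataType
typeof([])                 # Vector{Any}

fieldnames(typeof(Array))  # fields of UnionAll
fieldnames(typeof(Int))    # fields of DataType, a long list
```

For every input where meaning 1 is valid, meaning 2 is valid too and gives a different answer.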

The examples given above show that there are already two cases where both meanings are valid, and that IS ambiguity. In fact, this is necessarily true for everything for which the first meaning is valid (i.e. all types) since

If this ambiguity didn’t exist, no one would ever think about writing generic code that requires a specific behavior for all objects, so that wouldn’t be a valid argument at all, and a function that implements both meanings based on dispatch wouldn’t so clearly be doing two different things.

5 Likes

Array is a type, Int is a type, [] is not a type, UnionAll is a type, DataType is a type, Vector{Any} is a type. No ambiguity. They are also values, but whether or not they are types is unambiguous (???).

Unfortunately, I don’t get it, and possibly won’t. Too bad, because I’m curious, but since I don’t really mind the change, I give up now.

Here’s an example of how the old behavior could be confusing in generic code:

julia> firstfieldname(x) = fieldnames(x)[1]
       getfirstfield(x) = getfield(x, 1)
getfirstfield (generic function with 1 method)

julia> firstfieldname(1im)
┌ Warning: `fieldnames(v)` is deprecated, use `fieldnames(typeof(v))` instead.
│   caller = firstfieldname(::Complex{Int64}) at REPL[13]:1
└ @ Main REPL[13]:1
:re

julia> getfirstfield(1im)
0

julia> firstfieldname(Complex{Int})
:re

julia> getfirstfield(Complex{Int})
Complex

The object Complex{Int} does not have a field named re. Its first field name is, well, the name of the type:

julia> fieldnames(typeof(Complex{Int}))
(:name, :super, :parameters, :types, :names, :instance, :layout, :size, :ninitialized, :uid, :abstract, :mutable, :hasfreetypevars, :isconcretetype, :isdispatchtuple, :isbitstype, :zeroinit, :isinlinealloc, Symbol("llvm::StructType"), Symbol("llvm::DIType"))
4 Likes

I never said that checking whether something is a type is ambiguous; that’s also completely irrelevant to the whole discussion.

I’m saying that the two meanings of the function are ambiguous and you should realize that the meaning of the function is the whole point of this thread, starting from the very title of it.

Edit: I don’t like your tone. Please be more constructive in a conversation you started.

Good to know.

I think I finally understand what you are saying. By ‘ambiguity’, you refer directly to the fact that the fieldnames function has two different meanings depending on whether you interpret the input as a value or as a type. Correct?

If so, that’s an ok argument for making this change, because it can be confusing and is error-prone, as @mbauman demonstrated.

And in that case, you are simply using the word “ambiguous” incorrectly. This is not an example of an ambiguity. As far as I can tell, that is the origin of this whole misunderstanding.