Why can't I access a GroupedDataFrame (via get function) through a GroupKey?

I'm sorry, this should be an easily answered question, I guess, but why does the following not work?

grouped_df = groupby(original_df,[:field_to_group])
get(grouped_df, keys(grouped_df)[1])

LoadError: MethodError: no method matching get(::GroupedDataFrame{DataFrame}, ::DataFrames.GroupKey{GroupedDataFrame{DataFrame}})

The documentation for the get function clearly states:

get(gd::GroupedDataFrame, key, default)
[...] key may be a GroupKey [...]

When I run typeof(keys(grouped_df)[1]), I get DataFrames.GroupKey{GroupedDataFrame{DataFrame}}

What am I missing? Thank you for the time!

I think you want getindex instead of get here. A more idiomatic way of writing this is

grouped_df[first(keys(grouped_df))]

or

first(grouped_df)

Thank you for the answer. But what if I want to access a specific subgroup (not by index, but by name)? And what if I want to loop over this GroupedDataFrame and access each of its individual groups?

And I still don't understand why get does not work with my GroupKey, as stated in the documentation…
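
Both questions above can be illustrated with a minimal sketch. It assumes the small example table from the DataFrames.jl docs (columns x1, x2, x3); the variable names nines, sizes, and rows are only illustrative. A group can be looked up by the value of its grouping column using a NamedTuple or a Tuple, and pairs(gd) yields each GroupKey together with its group.

```julia
using DataFrames

# Example data (an assumption, mirroring the docs example).
df = DataFrame(x1 = [10, 10, 7], x2 = [9, 7, 7], x3 = [8, 6, 4])
gd = groupby(df, :x2)

# Access a group "by name", i.e. by the value of the grouping column.
nines = gd[(x2 = 9,)]      # NamedTuple key
nines_again = gd[(9,)]     # Tuple key, values in grouping-column order

# Loop over keys and groups together.
sizes = Dict{Int,Int}()
for (key, sdf) in pairs(gd)
    sizes[key.x2] = nrow(sdf)   # a GroupKey supports property access
end

# Loop over the groups alone.
rows = Int[]
for sdf in gd
    push!(rows, nrow(sdf))
end
```

Each group here is a SubDataFrame, a view into df, so no rows are copied while looping.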

The docs have an example, right?

julia> df
3×3 DataFrame
 Row │ x1     x2     x3
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │    10      9      8
   2 │    10      7      6
   3 │     7      7      4

julia> gd = groupby(df, :x2)
GroupedDataFrame with 2 groups based on key: x2
First Group (2 rows): x2 = 7
 Row │ x1     x2     x3
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │    10      7      6
   2 │     7      7      4
⋮
Last Group (1 row): x2 = 9
 Row │ x1     x2     x3
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │    10      9      8

julia> get(gd, (x2=9,), 0)
1×3 SubDataFrame
 Row │ x1     x2     x3
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │    10      9      8

julia> get(gd, keys(gd)[1], 0)
2×3 SubDataFrame
 Row │ x1     x2     x3
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │    10      7      6
   2 │     7      7      4

Yes, but I still don't understand, if you could clarify. I could not reproduce this with the code from my question.

You need a third argument, which is the default value.
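
As a minimal sketch of the difference, assuming the same example table as in the docs excerpt above (the names key, sdf, and missing_group are only illustrative): get always takes a default that is returned when the key is absent, while indexing takes the key alone.

```julia
using DataFrames

# Example data (an assumption, mirroring the docs example).
df = DataFrame(x1 = [10, 10, 7], x2 = [9, 7, 7], x3 = [8, 6, 4])
gd = groupby(df, :x2)
key = first(keys(gd))          # a DataFrames.GroupKey

# get requires three arguments: the grouped frame, the key, and a default.
sdf = get(gd, key, nothing)

# For a key that matches no group, the default is returned.
missing_group = get(gd, (x2 = 100,), nothing)

# Indexing needs only the key, but raises an error if the key is absent.
same = gd[key]
```

Calling get(gd, key) with two arguments fails with the MethodError from the original question, because no two-argument method exists for these types.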

Ok, now it works! It really was only the missing argument. The error message was a little misleading IMO… Thanks @jling! Have a nice day :)

The error message suggests this too: it highlights the potentially missing argument.

Interesting, because to me ::Any does not imply that I need to pass this argument, but rather that it would choose something as a default. But I will see it this way going forward. Thanks!

A method taking an ::Any argument requires that argument but places no type constraint on it. The method might do additional checks on the argument internally, but it can be called with any value. There must, however, be a value.

basically:

julia> f(x) = 3
f (generic function with 1 method)

julia> g() = 3
g (generic function with 1 method)

julia> f()
ERROR: MethodError: no method matching f()
Closest candidates are:
  f(::Any) at REPL[15]:1
Stacktrace:
 [1] top-level scope
   @ REPL[17]:1

julia> g()
3
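
For contrast, here is a small sketch of what an argument with a default looks like. The function names f and h are only illustrative. When a default is declared, Julia defines an extra method with fewer arguments, so the shorter call appears as its own candidate instead of relying on ::Any.

```julia
# Required positional argument: a single method, f(::Any).
f(x) = x

# Optional argument: Julia defines two methods, h() and h(::Any).
h(x = 3) = x

h()                   # falls back to the declared default
h(10)                 # uses the value passed in

length(methods(f))    # one method
length(methods(h))    # two methods
applicable(f)         # false: f cannot be called without an argument
```

In the get case, the third parameter has no declared default, so only the three-argument form exists for a GroupedDataFrame and a GroupKey.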

Thank you for the clarification!

Thank you for taking the time with all these examples!