Side effects (or intended effects) in TidierData?

julia> df4 = DataFrame(x = ["a", "b", "a", "b", "C", "a"], y = 1:6, yz = 13:18, a = [join(rand('a':'z',4)) for _ in 1:6], ab = 12:-1:7)
6×5 DataFrame
 Row │ x       y      yz     a       ab    
     │ String  Int64  Int64  String  Int64
─────┼─────────────────────────────────────
   1 │ a           1     13  coes       12
   2 │ b           2     14  nwoz       11
   3 │ a           3     15  gber       10
   4 │ b           4     16  ompu        9
   5 │ C           5     17  ktgq        8
   6 │ a           6     18  edkt        7

julia> nested_df = @nest(df4, n2 = starts_with("a"), n3 = y:yz)
3×3 DataFrame
 Row │ x       n3             n2            
     │ String  DataFrame      DataFrame
─────┼──────────────────────────────────────
   1 │ a       3×2 DataFrame  3×2 DataFrame
   2 │ b       2×2 DataFrame  2×2 DataFrame
   3 │ C       1×2 DataFrame  1×2 DataFrame

julia> @chain nested_df begin
           @unnest_wider(n3:n2, names_sep = nothing)        
           @unnest_longer(y:ab)
       end
9×5 DataFrame
 Row │ x       y        yz       a     ab      
     │ String  Int64?   Int64?   Any   Int64?
─────┼─────────────────────────────────────────
   1 │ a             1       13  coes       12
   2 │ a             3       15  gber       10
   3 │ a             6       18  edkt        7
   4 │ b             2       14  nwoz       11
   5 │ b             4       16  ompu        9
   6 │ C             5       17  k           8
   7 │ C       missing  missing  t     missing
   8 │ C       missing  missing  g     missing
   9 │ C       missing  missing  q     missing

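That looks unfortunate; you probably want to open an issue. If the column holds CategoricalArrays values, Symbols, or something similar, you get an error that there is no iterate method for it. A String, on the other hand, is iterable, so instead of an error it is silently split into its characters (rows 6 to 9 above).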
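The difference between the two cases can be reproduced in Base Julia without TidierData. This is only a sketch of the likely mechanism, under the assumption that the value of a single-row group ends up being iterated directly instead of being wrapped in a vector first; it is not a description of the package internals.

```julia
# Base Julia only; no TidierData involved.
s = "ktgq"

# A String is iterable: collecting it yields its characters.
chars = collect(s)

# A Symbol has no iterate method, so collecting it would throw a MethodError.
symbol_is_iterable = applicable(iterate, :ktgq)

# Wrapping the string in a one-element vector keeps it intact when iterated.
wrapped = collect([s])

println(chars)
println(symbol_is_iterable)
println(wrapped)
```

This matches the output above, where "ktgq" turned into the four rows k, t, g, q while the integer columns were padded with missing.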

I don’t know how to open an issue. I just wanted to point out something that apparently (but I may be wrong) isn’t consistent with the rest of the examples in the manual chapter on the @unnest_xyz macros.
If you also think it is relevant, please open the issue yourself.

That is really not difficult and worth learning. Just go to the package’s repository on GitHub and click the “New Issue” button.

Then describe the issue you encountered, preferably including an example.

Is that okay?

Not really. An issue should contain:

  • the code you executed
  • the output of executing that code
  • the expected output

and usually also the output of the command:

versioninfo()

Sometimes a link to Discourse can be useful, but an issue should not consist only of a link to a discussion on Discourse.
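A minimal sketch of how the environment details could be captured as text for pasting into the issue. versioninfo comes from the InteractiveUtils standard library, which the REPL loads automatically; the variable name env_report is only illustrative.

```julia
using InteractiveUtils  # provides versioninfo; loaded automatically in the REPL

# Capture the output of versioninfo() as a string so it can be pasted into the issue.
env_report = sprint(versioninfo)

println(env_report)
```

The installed package version is also worth including; in the package REPL it can be shown with pkg> status TidierData.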

Thank you for pointing this out. This is not intended.

It seems to be an issue that occurs when unnesting wider before unnesting longer.

I have now fixed it in the “unnest_wider_edgecase” branch.

julia> nested_df = @nest(df4, n2 = starts_with("a"), n3 = y:yz)
3×3 DataFrame
 Row │ x       n3             n2            
     │ String  DataFrame      DataFrame     
─────┼──────────────────────────────────────
   1 │ a       3×2 DataFrame  3×2 DataFrame 
   2 │ b       2×2 DataFrame  2×2 DataFrame 
   3 │ C       1×2 DataFrame  1×2 DataFrame 

julia> @chain nested_df begin
           @unnest_wider(n3:n2, names_sep = nothing)
           @unnest_longer(y:ab)
       end
6×5 DataFrame
 Row │ x       y      yz     a       ab    
     │ String  Int64  Int64  String  Int64 
─────┼─────────────────────────────────────
   1 │ a           1     13  yzyb       12
   2 │ a           3     15  tijm       10
   3 │ a           6     18  dxcd        7
   4 │ b           2     14  ijkj       11
   5 │ b           4     16  gavn        9
   6 │ C           5     17  zvtt        8

julia> @chain nested_df begin
           @unnest_longer(n3:n2)
           @unnest_wider(n3:n2, names_sep = nothing)
       end
6×5 DataFrame
 Row │ x       yz     y      a       ab    
     │ String  Int64  Int64  String  Int64 
─────┼─────────────────────────────────────
   1 │ a          13      1  yzyb       12
   2 │ a          15      3  tijm       10
   3 │ a          18      6  dxcd        7
   4 │ b          14      2  ijkj       11
   5 │ b          16      4  gavn        9
   6 │ C          17      5  zvtt        8

Is it possible to update to the fixed version?
How?

You can add the specific branch by doing

(@v1.11) pkg> add TidierData#unnest_wider_edgecase
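
The same can be done from a script through the Pkg API. This is a sketch that assumes the branch still exists upstream and that network access is available; the temporary environment is an optional addition so that the default environment stays untouched.

```julia
using Pkg

# Work in a throwaway environment (optional; drop this line to change the active one).
Pkg.activate(temp = true)

# Track the branch that contains the fix instead of the registered release.
Pkg.add(name = "TidierData", rev = "unnest_wider_edgecase")
```

Once the fix is part of a tagged release, pkg> free TidierData switches the package back to tracking the registered version.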