Julia reporting an extra ) when it doesn't exist

I have this for loop in Julia:

    for country in countries_data_labels
        new_dataframe = get_country(df, country)
        new_dataframe = DataFrame(new_dataframe)
        df_rows, df_columns = size(new_dataframe)
        new_dataframe_long = stack(new_dataframe, begin:end-4)
        y_axis[!, Symbol("$country")] = new_dataframe_long[!, :value]
    end

and I'm getting this error:

syntax: extra token ")" after end of expression

I commented out the whole body of the for loop except the 1st line, then uncommented one line at a time and re-ran the cell to see which line was throwing this error. It was the 4th line in the body:

new_dataframe_long = stack(new_dataframe, begin:end-4)

There is no reason for this error to exist. There are no extra bracket pieces in this line.

On my phone so can't check, but begin and end syntax only works when indexing with square brackets [].

As an aside, DataFrames accept strings as column names, so no reason to do y_axis[!, Symbol("$country")]
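A minimal sketch of that aside, assuming DataFrames.jl is installed. The column values here are made up purely for illustration:

```julia
using DataFrames

y_axis = DataFrame()
country = "United States"

# A String can be used directly as the column name,
# so wrapping it in Symbol("$country") is unnecessary.
y_axis[!, country] = [1, 2, 3]   # placeholder values, not real data

println(names(y_axis))
```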


Could you elaborate the 1st part? This is my full cell:

begin
	countries_data_labels = ["Canada", "Italy", "China", "United States", "Spain"]
	y_axis = DataFrame()
	
	
	for country in countries_data_labels
		
		new_dataframe = get_country(df, country)
		
		new_dataframe = DataFrame(new_dataframe)
		
		df_rows, df_columns = size(new_dataframe)
		
		new_dataframe_long = stack(new_dataframe, begin:end-4)
		
		y_axis[!, Symbol("$country")] = new_dataframe_long[!, :value]
		
	end
end

What I tried to say was that begin and end are special cased by the parser if they occur within an indexing expression. Take this example:

julia> a = collect(1:5)
5-element Vector{Int64}:
 1
 2
 3
 4
 5

We can now get subsets of this vector using begin and end inside square brackets:

julia> a[begin:end-3]
2-element Vector{Int64}:
 1
 2

but let's define our own indexing function which, when called, does not use square brackets:

julia> my_indexer(vector, idxs) = vector[idxs]
my_indexer (generic function with 1 method)

If we pass some numerical range to this it works as expected:

julia> my_indexer(a, 2:4)
3-element Vector{Int64}:
 2
 3
 4

but when we try to use begin and end:

julia> my_indexer(a, begin:end-2)
ERROR: syntax: "begin" at REPL[15]:1 expected "end", got ")"
Stacktrace:
 [1] top-level scope
   @ none:1

this doesn't work, as they don't occur inside square brackets. When inside square brackets, end is just syntactic sugar for lastindex, i.e. a[end] is the same as a[lastindex(a)].
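If you do need to hand such a range to a function, one way around it (a sketch using only Base functions, with the same toy vector as above) is to spell the bounds out with firstindex and lastindex, or to do the begin/end indexing at the call site:

```julia
a = collect(1:5)

# Inside brackets, `end` stands for lastindex(a) and `begin` for firstindex(a).
@assert a[end] == a[lastindex(a)]
@assert a[begin] == a[firstindex(a)]

# A function that indexes on our behalf, so no square brackets at the call site.
my_indexer(vector, idxs) = vector[idxs]

# Outside brackets, write the range explicitly instead of begin:end-2.
explicit_range = firstindex(a):lastindex(a)-2
via_function = my_indexer(a, explicit_range)

# Alternatively, index with begin/end directly where brackets are available.
via_brackets = a[begin:end-2]

println(via_function == via_brackets)
```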

More broadly, it's not really clear what your code is trying to achieve but it looks overly complicated. It seems you just want to subset your original DataFrame with a set of countries and then stack to turn it into long format, in which case it seems your whole cell could be replaced by a single line:

stack(df[in(countries_data_labels).(df."Country/Region"), :], measure_vars)

(check the docstring for stack to make sure you get the correct measure_vars and id_vars, this is always a little hard to do without an MWE)


To explain further, what the parser sees here would be similar to this:

my_indexer(a, begin
  :end - 2)

Since begin outside of indexing expressions opens a new block and :end is parsed as the Symbol Symbol("end"), the parser expects an end that closes that open block instead of a closing parenthesis.
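You can see the difference without running any indexing code by asking the parser directly. This is only a sketch using Meta.parse from Base. The exact error text varies between Julia versions, so the example only checks whether parsing succeeds:

```julia
# Inside square brackets the parser accepts begin/end as index keywords
# and produces an indexing (:ref) expression.
inside_brackets = Meta.parse("a[begin:end-2]")
println(inside_brackets.head)

# Inside a call's parentheses, `begin` opens a block, so the parser hits ")"
# while still waiting for the closing `end` and throws an error.
parse_failed = try
    Meta.parse("my_indexer(a, begin:end-2)")
    false
catch
    true
end
println(parse_failed)
```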


I did check the docstring but Pluto.jl is telling me that measure_vars is not defined

measure_vars are just the variables that you want to stack into the value column, as the docstring shows:

  julia> df = DataFrame(a = repeat([1:3;], inner = [2]),
                        b = repeat([1:2;], inner = [3]),
                        c = randn(6),
                        d = randn(),
                        e = map(string, 'a':'f'))
  6×5 DataFrame
   Row │ a      b      c          d         e
       │ Int64  Int64  Float64    Float64   String
  ─────┼───────────────────────────────────────────
     1 │     1      1   0.867347  0.532813  a
     2 │     1      1  -0.901744  0.532813  b
     3 │     2      1  -0.494479  0.532813  c
     4 │     2      2  -0.902914  0.532813  d
     5 │     3      2   0.864401  0.532813  e
     6 │     3      2   2.21188   0.532813  f
  
  julia> stack(df, [:c, :d])
  12×5 DataFrame
   Row │ a      b      e       variable  value
       │ Int64  Int64  String  String    Float64
  ─────┼───────────────────────────────────────────
     1 │     1      1  a       c          0.867347
     2 │     1      1  b       c         -0.901744
     3 │     2      1  c       c         -0.494479
     4 │     2      2  d       c         -0.902914
    ⋮  │   ⋮      ⋮      ⋮        ⋮          ⋮
    10 │     2      2  d       d          0.532813
    11 │     3      2  e       d          0.532813
    12 │     3      2  f       d          0.532813
                                     5 rows omitted

You need to replace measure_vars with whichever columns you want to stack into the value column.
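For the original begin:end-4 attempt, here is a sketch of two ways to select "every column except the last four" as the columns to stack. The table and its column names (day1, day2, country, province, lat, long) are made up for illustration and are not taken from your data:

```julia
using DataFrames

# Hypothetical wide table: two value columns followed by four metadata columns.
wide = DataFrame(
    day1 = [10, 20],
    day2 = [11, 22],
    country = ["Canada", "Italy"],
    province = ["", ""],
    lat = [56.1, 41.9],
    long = [-106.3, 12.6],
)

# Option 1: integer positions, computed with ncol instead of begin/end.
long_by_position = stack(wide, 1:ncol(wide)-4)

# Option 2: index the vector of column names, where begin/end are legal
# because they appear inside square brackets.
long_by_name = stack(wide, names(wide)[begin:end-4])

println(long_by_position)
```

Both calls stack day1 and day2 into the variable and value columns and keep the remaining four columns as identifiers.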

There are loads of additional examples in the excellent DataFrames tutorial which I've already recommended twice to you. Long-to-wide and wide-to-long reshapes are covered here: Julia-DataFrames-Tutorial/09_reshaping.ipynb at master · bkamins/Julia-DataFrames-Tutorial · GitHub
