Indexing a DataFrame Index

Hello, I am trying to write a function that takes in a DataFrame that has an index, df[x:x2,c], and uses those index numbers for calculations.

An example would be taking df[1:100,1] and then multiplying 1 and 100 by 2.

Whenever I try to create a function with df[x:x2,c] as a variable it throws this error: "df[x:x2, c]" is not a valid function argument name.

I was wondering whether my way of naming the variable is wrong, or whether I just have to write a function with a variable that encompasses the whole index and then index the index within the function (which I can't figure out either).

Thanks,
W

It would be helpful if you could share a code snippet.

(Cary1[1327:1528,2], Cary1[1634:1883,2], Cary1[2495:2680,2], Cary1[3211:3298,2], Cary1[3330:3505,2], Cary1[3558:3668,2], Cary1[3888:4078,2])
function ConvertCaryDftoCRDDf(CaryDF, CrdDF, Df[x:x2,c]...)
	DistanceDiff = length(CaryDF[!,1])/length(CrdDF[!,1])
	Replace = replace("$(Df)[$(round(x/DistanceDiff)):$(round(x2/DistanceDiff)),c]", "Df" => "CrdDF")
	
	ConvertedDF = eval(Meta.parse(Replace))
	return ConvertedDF
end

Basically I am trying to divide the indexes of one DataFrame and then set another DataFrame with those new indexes.

The DataFrames are on a different timescale so I am trying to match them.

You have three options:

  • pass them as a separate argument to your function
  • store them as a column of your data frame
  • store them as metadata of your data frame

How does this work?

julia> using DataFrames

julia> df = DataFrame(a=1:10, b=11:20)
10×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1     11
   2 │     2     12
   3 │     3     13
   4 │     4     14
   5 │     5     15
   6 │     6     16
   7 │     7     17
   8 │     8     18
   9 │     9     19
  10 │    10     20

julia> metadata!(df, "idxs", [2, 4, 6])
10×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1     11
   2 │     2     12
   3 │     3     13
   4 │     4     14
   5 │     5     15
   6 │     6     16
   7 │     7     17
   8 │     8     18
   9 │     9     19
  10 │    10     20

and now you can retrieve this metadata later:

julia> metadata(df, "idxs")
3-element Vector{Int64}:
 2
 4
 6

(here I made the metadata volatile, since I assume the metadata can stop being correct once the data frame is mutated)
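
As a rough sketch of how the stored indices might then be used inside a function: the helper name rows_from_metadata is made up for illustration, and it assumes the key "idxs" was set as in the call above.

```julia
using DataFrames

df = DataFrame(a = 1:10, b = 11:20)

# Same call as in the post above: attach the row indices to the data frame.
metadata!(df, "idxs", [2, 4, 6])

# Hypothetical helper: read the stored indices back and use them to select rows.
function rows_from_metadata(df, col)
    idxs = metadata(df, "idxs")   # assumes the "idxs" key exists
    return df[idxs, col]
end

result = rows_from_metadata(df, :b)   # rows 2, 4 and 6 of column b
```

With this approach the function signature only needs the data frame and the column, because the indices travel with the data frame.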

Hello, thank you for explaining the metadata example.

Out of curiosity, how would you accomplish "pass them as a separate argument to your function"?

Best,
W

E.g. define a signature as:

function ConvertCaryDftoCRDDf(CaryDF, CrdDF, Df, x:x2, c)

and inside you do not need to use metaprogramming as you have access to both x:x2 and c.

I get this error.

"x:x2" is not a valid function argument name

Ah, sorry. I did not use valid argument names; I reused the names from your code to show where the arguments should go. The signature should be:

function ConvertCaryDftoCRDDf(CaryDF, CrdDF, Df, row_index, c)

and then pass x:x2 as the row_index positional argument.
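
A minimal sketch of that idea, assuming the goal is to rescale a row range by the ratio of row counts (as DistanceDiff does in the original function) and then index CrdDF with it. The name convert_rows, the clamping, and the toy data are illustrative assumptions, and the Df argument is dropped here because the range itself is passed in:

```julia
using DataFrames

# Hypothetical helper: map a row range from CaryDF's timescale onto CrdDF
# and return the matching rows of column c. No eval or Meta.parse needed.
function convert_rows(CaryDF, CrdDF, row_index::AbstractUnitRange, c)
    # Ratio of row counts, as in the original DistanceDiff.
    ratio = nrow(CaryDF) / nrow(CrdDF)
    # round(Int, ...) gives integer indices; clamp keeps them inside CrdDF.
    lo = clamp(round(Int, first(row_index) / ratio), 1, nrow(CrdDF))
    hi = clamp(round(Int, last(row_index) / ratio), 1, nrow(CrdDF))
    return CrdDF[lo:hi, c]
end

# Toy data: cary has ten times as many rows as crd.
cary = DataFrame(t = 1:100, y = 101:200)
crd = DataFrame(t = 1:10, y = 11:20)

single = convert_rows(cary, crd, 20:50, 2)   # rows 2:5 of column 2 of crd

# Several ranges can be handled by looping over them.
several = [convert_rows(cary, crd, r, 2) for r in (20:50, 60:80)]
```

The range is an ordinary value, so first(row_index) and last(row_index) give the x and x2 from the original attempt.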

I would recommend you read the Julia docs or some beginner level tutorials to get a sense for how Julia works - your issues mainly seem to stem from a lack of understanding of base Julia syntax.

My lack of understanding is exactly why I am looking for help. I was unsure whether there was a way to pass a DataFrame with an index as an argument. I tried looking for documentation online on the subject but came up short.

Thank you for your help. I will try both the metadata and the separate argument function to see which works best.