Enforcing Schema on Data Frame Passed as Function Argument

There are a couple of other options that haven’t been mentioned.

Wrap a data frame in a struct

You could create various custom types with constructors that represent different tables in your data flow. These custom types with constructors can provide some of the safety you are looking for. However, I’ve tried this before and the thing I didn’t like about it is that all the actions become nouns (the constructors) rather than verbs.

Let it fail

This is the approach I’ve taken. When I’m writing internal code (in other words, code that is not user facing), I can just assume that I, the programmer, am smart enough to provide the right kind of data frame. In fact, this approach applies to a lot of things. When you are writing internal code, you have to assume to some degree that the developer (not the user) knows what they are doing and is providing the correct input to functions. You typically don’t fill your code with assertions (@assert) in every function.

1 Like