Relation Extraction project: request for benchmarking suggestions

Hello everyone,

I’m working on a project about relation extraction (RE).
More specifically, my project focuses on a new method to improve the training data used with any (supervised) RE model. I would like to benchmark my method by pairing it with some state-of-the-art RE models to check whether it yields a significant improvement.

How can I benchmark my method properly (i.e. to the standard a computer science paper would require)?
Do I need working implementations of the RE models? And if so, where can I find them?
Any help on how to proceed with benchmarking would be much appreciated :slightly_smiling_face: