Dataset statistics showing the distribution of figurative language types across our data.

Type        Entails  Contradicts  Total
Paraphrase  1339     -            1339
+Sarcasm    -        2678         2678
Simile      750      750          1500
Metaphor    750      750          1500
Idiom       1000     1000         2000

Train Test Split

We have 7500 samples for training and 1500 samples reserved for a blind test set. We do not provide a validation set. If you want one, you are free to hold out a portion of the training set using your own split.
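One possible way to carve out a validation set is a seeded random hold-out, sketched below. This is only an illustration, not an official split: the function name `split_train_validation`, the 10% fraction, and the seed are assumptions, and the integer list stands in for the real training samples, whose file format is not described here.

```python
import random


def split_train_validation(samples, val_fraction=0.1, seed=0):
    """Shuffle a copy of `samples` and hold out `val_fraction` of it.

    Hypothetical helper: the fraction and seed are illustrative defaults,
    not values prescribed by the dataset authors.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(samples)               # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder indices standing in for the 7500 training samples.
train_ids = list(range(7500))
train_split, val_split = split_train_validation(train_ids, val_fraction=0.1)
print(len(train_split), len(val_split))  # 6750 750
```

Because the table shows several figurative language types of different sizes, a split stratified by type or label may be preferable to a purely random one. That would require knowing the field names in the data, so it is left out of this sketch.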