The Neptune AI blog has a nice summary of popular NLP augmentation strategies:

Data Augmentation in NLP: Best Practices From a Kaggle Master - neptune.ai

This can be a good starting point for the project: explore some of these existing approaches and see how they perform on standard tasks.

The augmentation game in CV is at another level:

CutMix: This method literally fills the hole that CutOut leaves. Instead of simply removing pixels, it replaces the removed region with a patch from another image (see Table). The ground-truth labels are also mixed, in proportion to the number of pixels each image contributes to the combined image.

[Figure: Screen Shot 2022-07-10 at 8.44.10 PM.png]

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
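
To make the CutMix description concrete, here is a minimal NumPy sketch for a single pair of images. The function name `cutmix`, the `alpha` parameter of the Beta distribution, and the one-hot label format are assumptions made for illustration; this is not the paper's reference implementation.

```python
import numpy as np

def cutmix(img_a, label_a, img_b, label_b, alpha=1.0, rng=None):
    """Paste a random rectangle of img_b into img_a and mix the labels.

    Assumes both images share the same (H, W, C) shape and that the labels
    are one-hot (or soft) vectors of equal length.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img_a.shape[:2]

    # Target fraction of img_a to keep (assumed Beta(alpha, alpha) sampling).
    lam = rng.beta(alpha, alpha)

    # Rectangle whose area is roughly (1 - lam) of the image.
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]

    # Recompute the mixing ratio from the pixels actually replaced,
    # because the rectangle may have been clipped at the border.
    lam_adj = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    mixed_label = lam_adj * label_a + (1.0 - lam_adj) * label_b
    return mixed, mixed_label
```

Calling `cutmix(img_a, label_a, img_b, label_b)` returns the patched image and a soft label whose entries still sum to 1, with each image weighted by the share of pixels it contributes.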

AugMix: Augmentation operations (such as translate x) and mixing weights (such as m) are randomly sampled. Randomly sampled operations and their compositions let us explore the semantically equivalent input space around an image. Mixing these augmented images together produces a new image without veering too far from the original.

AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty
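
The AugMix mixing step can be sketched roughly as follows. The four operations are simplified NumPy stand-ins chosen for illustration (for example, `np.roll` wraps pixels around instead of padding), and the names `augmix`, `width`, `max_depth`, and `alpha` are assumptions, not the paper's code. Images are assumed to be float arrays in [0, 1].

```python
import numpy as np

# Simplified stand-in operations; each takes an image and a random generator.
def translate_x(img, rng):
    return np.roll(img, int(rng.integers(-3, 4)), axis=1)  # wraps around

def translate_y(img, rng):
    return np.roll(img, int(rng.integers(-3, 4)), axis=0)  # wraps around

def posterize(img, rng):
    levels = int(rng.integers(4, 17))
    return np.floor(img * levels) / levels

def solarize(img, rng):
    threshold = rng.uniform(0.5, 1.0)
    return np.where(img < threshold, img, 1.0 - img)

OPS = [translate_x, translate_y, posterize, solarize]

def augmix(img, width=3, max_depth=3, alpha=1.0, rng=None):
    """Mix several randomly composed augmentation chains with the original."""
    rng = np.random.default_rng() if rng is None else rng
    w = rng.dirichlet([alpha] * width)  # weights of the augmented chains
    m = rng.beta(alpha, alpha)          # weight of the original image

    mix = np.zeros_like(img, dtype=np.float64)
    for i in range(width):
        chain = img.astype(np.float64)
        # Compose between 1 and max_depth randomly chosen operations.
        for _ in range(int(rng.integers(1, max_depth + 1))):
            op = OPS[int(rng.integers(len(OPS)))]
            chain = op(chain, rng)
        mix += w[i] * chain

    # Convex combination of the original and the mixed augmentations.
    return m * img + (1.0 - m) * mix
```

Because every step is a convex combination of images in [0, 1], the output stays in the same range and, for typical weights, remains visually close to the input.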

It would be interesting to see how some of the strategies above could be applied to NLP tasks.
