NoisyHate: Benchmarking Content Moderation Machine Learning Models with Human-Written Perturbations Online
Yiran Ye, Thai Le, Dongwon Lee
2023 · DOI: 10.48550/arXiv.2303.10430
arXiv.org · 5 Citations
TLDR
A benchmark test set of human-written perturbations, named NoisyHate, built from real perturbations written by human users on various social platforms, to help develop better toxic speech detection models.
