Download - Challenge Txt

SOLID is the official dataset for SemEval 2020 Task 12 (OffensEval).

Note: If you are looking for the evaluation script or the specific "challenge" format used in the SemEval task, check the "Participate" or "Files" tab on the OffensEval 2020 CodaLab page.

Quick Facts about SOLID

Full Name: Semi-Supervised Offensive Language Identification Dataset
Size: Over 9 million English tweets
Purpose: Training and benchmarking offensive language identification models

The SOLID dataset was introduced in the paper "SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification". Because it consists of millions of tweets, the full dataset is often distributed via tweet IDs or through specific competition portals to comply with X (formerly Twitter) Developer Policies.
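
Since the dataset is often shared as tweet IDs rather than full text, a minimal sketch of reading such an ID file might look like the following. The tab delimiter, the column names `id` and `average`, the 0.5 threshold, and the function name `read_tweet_ids` are all assumptions made for illustration; they are not the documented layout of the SOLID release, so check the actual file header before relying on them.

```python
import csv
import io


def read_tweet_ids(handle, min_confidence=0.5):
    """Yield (tweet_id, confidence) pairs from a tab-separated file.

    Assumes a header row with hypothetical columns `id` and `average`.
    Tweet IDs are kept as strings so that large values are never
    rounded or reformatted.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        score = float(row["average"])
        if score >= min_confidence:
            yield row["id"], score


# Tiny in-memory sample standing in for a real file (made-up IDs and scores).
sample = io.StringIO("id\taverage\n1001\t0.91\n1002\t0.12\n1003\t0.50\n")
offensive = list(read_tweet_ids(sample))
print(offensive)
```

For a real download, pass an open file object instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points to whichever file you obtained from the competition page.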