DEBAGREEMENT is a dataset of 42,894 comment-reply pairs from the popular online debate platform Reddit, each labeled as agree, neutral, or disagree. We gathered interactions from five forums: r/BlackLivesMatter, r/Brexit, r/Climate, r/Democrats, and r/Republican. The pairs for each forum were chosen so that, taken together, they form a user interaction graph.
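To make the graph structure concrete, here is a minimal sketch of how labeled comment-reply pairs could be turned into a user interaction graph. The `Pair` record and its field names (`parent_author`, `child_author`, `label`) are assumptions made for illustration, as are the user names; the released dataset may use a different layout.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical record layout; the real dataset's field names may differ.
@dataclass(frozen=True)
class Pair:
    parent_author: str  # user who wrote the original comment
    child_author: str   # user who wrote the reply
    label: str          # "agree", "neutral", or "disagree"


def build_interaction_graph(pairs):
    """Map each directed (replier, parent author) edge to its list of labels."""
    edges = defaultdict(list)
    for p in pairs:
        edges[(p.child_author, p.parent_author)].append(p.label)
    return dict(edges)


# Toy data, invented for this example.
pairs = [
    Pair("alice", "bob", "disagree"),
    Pair("bob", "carol", "agree"),
    Pair("alice", "bob", "neutral"),
]
graph = build_interaction_graph(pairs)
```

Each key is a directed edge from the replying user to the user being replied to, and repeated interactions between the same two users accumulate on one edge. Keeping every label rather than a single aggregate is a design choice of this sketch, not something the dataset prescribes.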
DEBAGREEMENT is challenging for Natural Language Processing (NLP) systems because it contains slang, irony, and topic-specific humor, all of which are common in online discussions. We compared the performance of state-of-the-art language models on a (dis)agreement detection task and examined the use of contextual information available to the models during training (graph, authorship, and temporal information).
Recent research shows that context, such as social context or knowledge graph information, enables language models to perform better on downstream NLP tasks. In light of this, DEBAGREEMENT provides novel opportunities for combining graph-based and text-based machine learning techniques to detect agreement and disagreement online.
Key Takeaways from this Tech Talk:
Language models trained on existing datasets underperform when run on real-world data. Evaluating the models on the new DEBAGREEMENT dataset exposes this discrepancy.
DEBAGREEMENT presents new opportunities for modeling diverse online interactions with text and context (authorship, graph, temporal information). Its graph structure allows for the combination of text-based machine learning (ML) with graph representation learning (GRL) approaches.
By modeling online discussion forums as graphs of user interactions, researchers can: 1) transform the agreement/disagreement detection problem into a signed link prediction task, and 2) use existing signed graph embedding techniques that have been evaluated on publicly accessible signed graphs such as Epinions and Slashdot.
The sentiment and polarization of topics are not static over time. This dataset is the first to expose that and to show how topic sentiment realistically evolves.
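The signed link prediction framing mentioned above can be illustrated with a small sketch. It assumes that agree maps to a positive edge, disagree to a negative edge, and neutral pairs are dropped; that mapping is an assumption of this example. The predictor is a simple structural-balance vote over common neighbours, used here as a stand-in and not one of the signed graph embedding techniques the talk refers to. All user names are invented.

```python
from collections import defaultdict

# Assumption: neutral pairs carry no sign and are dropped.
SIGN = {"agree": 1, "disagree": -1}


def to_signed_edges(labeled_pairs):
    """Convert (user_a, user_b, label) triples into an undirected signed adjacency map.

    If two users interact more than once, the last signed label wins.
    """
    adj = defaultdict(dict)
    for a, b, label in labeled_pairs:
        if label in SIGN:
            adj[a][b] = SIGN[label]
            adj[b][a] = SIGN[label]
    return adj


def predict_sign(adj, u, v):
    """Predict the sign of a missing edge (u, v) from common neighbours.

    Each common neighbour w votes sign(u, w) * sign(w, v), following
    structural balance. Returns 1, -1, or 0 (no evidence or a tie).
    """
    nu, nv = adj.get(u, {}), adj.get(v, {})
    score = sum(nu[w] * nv[w] for w in nu if w in nv)
    return (score > 0) - (score < 0)


# Toy data, invented for this example.
labeled_pairs = [
    ("alice", "bob", "agree"),
    ("bob", "carol", "disagree"),
    ("alice", "dave", "disagree"),
    ("dave", "carol", "agree"),
    ("alice", "erin", "neutral"),
]
adj = to_signed_edges(labeled_pairs)
prediction = predict_sign(adj, "alice", "carol")
```

In this toy graph both common neighbours of alice and carol vote for a negative edge, so the sketch predicts disagreement. A real system would replace the vote with learned signed graph embeddings and could combine the result with a text-based classifier's output.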