This is a paper that I wrote together with Piek Vossen. We were able to achieve SOTA back then by simply training a RoBERTa (a variant of BERT) with speaker tokens! Check out https://arxiv.org/abs/2108.12009
Abstract: We present EmoBERTa: Speaker-Aware Emotion Recognition in Conversation with RoBERTa, a simple yet expressive scheme of solving the ERC (emotion recognition in conversation) task. By simply prepending speaker names to utterances and inserting separation tokens between the utterances in a dialogue, EmoBERTa can learn intra- and inter- speaker states and context to predict the emotion of a current speaker, in an end-to-end manner. Our experiments show that we reach a new state of the art on the two popular ERC datasets using a basic and straight-forward approach. We’ve open-sourced our code and models at https://github.com/tae898/erc.