Compositional De-Attention Networks

Abstract

Attentional models are distinctly characterized by their ability to learn relative importance, i.e., to assign different weights to input values. This paper proposes a new quasi-attention that is compositional in nature, i.e., it learns whether to add, subtract, or nullify a certain vector when learning representations. This is in strong contrast to vanilla attention, which simply re-weights input tokens. Our proposed Compositional De-Attention (CoDA) is fundamentally built upon the intuition of both similarity and dissimilarity (negative affinity) when computing affinity scores, benefiting from a greater extent of expressiveness. We evaluate CoDA on six NLP tasks, i.e., open-domain question answering, retrieval/ranking, natural language inference, machine translation, sentiment analysis, and text-to-code generation. We obtain promising experimental results, achieving state-of-the-art performance on several tasks/datasets.
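The sketch below is only an illustrative reading of the abstract, not the paper's exact formulation: it assumes a tanh-squashed scaled dot product as the signed similarity (add vs. subtract) and a sigmoid over a negative L1 distance as the nullifying gate, with hypothetical temperature parameters alpha and beta and a made-up function name coda_quasi_attention. The precise centering and scaling terms should be taken from the paper itself.

import numpy as np

def coda_quasi_attention(Q, K, V, alpha=1.0, beta=1.0):
    """Illustrative sketch of a compositional quasi-attention step.

    Q: (n_query, d), K: (n_key, d), V: (n_key, d).
    alpha and beta are hypothetical temperature terms for the two affinities.
    """
    d = Q.shape[-1]

    # Signed similarity: scaled dot product squashed by tanh, so each
    # affinity lies in [-1, 1] (add vs. subtract a value vector).
    sim = np.tanh((Q @ K.T) / (alpha * np.sqrt(d)))

    # Dissimilarity gate: sigmoid of the negative L1 distance, so distant
    # query-key pairs gate toward 0 (nullify) and close pairs toward 1.
    l1 = np.abs(Q[:, None, :] - K[None, :, :]).sum(-1)
    gate = 1.0 / (1.0 + np.exp(l1 / (beta * np.sqrt(d))))

    # Element-wise composition yields weights in [-1, 1]:
    # positive -> add, negative -> subtract, near zero -> nullify.
    weights = sim * gate
    return weights @ V

# Toy usage: 3 queries attending over 4 keys/values of dimension 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
out = coda_quasi_attention(Q, K, V)
print(out.shape)  # (3, 8)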

Publication
Advances in Neural Information Processing Systems (NeurIPS 2019)

Link: https://papers.nips.cc/paper_files/paper/2019/hash/16fc18d787294ad5171100e33d05d4e2-Abstract.html

Luu Anh Tuan
Assistant Professor

My research interests lie at the intersection of Artificial Intelligence and Natural Language Processing.