tf-attentive-conv: Attentive Convolution

This is a TensorFlow implementation of Yin Wenpeng's paper "Attentive Convolution", published in TACL in 2018. Wenpeng's original code is written in Theano.

I only implement the light attentive convolution described in Sect. 3.1 of the paper. The authors argue that even this light version of AttConv outperforms some pioneering attentive RNNs in both the intra-context setting (context = query, i.e. self-attention) and the extra-context setting (context != query). A figure in the paper illustrates this idea.
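
To make the idea concrete, here is a minimal NumPy sketch of a light attentive convolution for a single unbatched example. It is not the code in this repository: it assumes a dot-product energy function and a filter width of 3, and the names `light_attentive_conv`, `W`, `W_c` and `b` are hypothetical. Each query position attends over the context, and the resulting attentive context vector is fed into the convolution together with the usual local window.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def light_attentive_conv(query, context, W, W_c, b):
    """Sketch of a light attentive convolution (assumed form, width-3 filter).

    query:   (n, d) hidden states of the text being convolved
    context: (m, d) hidden states attended over (pass query itself for self-attention)
    W:       (3 * d, h) weights for the local window [h_{i-1}, h_i, h_{i+1}]
    W_c:     (d, h) weights for the attentive context vector
    b:       (h,) bias
    """
    # Dot-product energies between every query and context position: (n, m).
    energy = query @ context.T
    attn = softmax(energy, axis=-1)
    # Attentive context: attention-weighted average of context states, (n, d).
    att_ctx = attn @ context

    # Zero-pad one position on each side so every position has two neighbours.
    padded = np.pad(query, ((1, 1), (0, 0)))
    windows = np.concatenate([padded[:-2], padded[1:-1], padded[2:]], axis=-1)

    # Convolve the local window and the attentive context jointly.
    return np.tanh(windows @ W + att_ctx @ W_c + b)
```

Passing the same array as `query` and `context` gives the intra-context (self-attention) setting; passing two different texts gives the extra-context setting.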

What did I change?

Nothing major. I added two features:

  1. a dropout-resnet-layernorm block (dropout, residual connection, layer normalization) before the output
  2. masking to ensure causality, so that the layer can also be used for decoding.

Both features are disabled by default.
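
The two additions can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the repository's TensorFlow code: the function names are hypothetical, the layer normalization omits the learned gain and bias, the residual connection assumes the layer input and output have the same shape, and the causal mask assumes the self-attention setting where query and context positions are aligned.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def causal_mask(energy):
    """Block attention to future positions (j > i) in an (n, n) energy matrix."""
    n, m = energy.shape
    future = np.triu(np.ones((n, m), dtype=bool), k=1)
    # A large negative energy becomes a weight of ~0 after the softmax.
    return np.where(future, -1e9, energy)


def layer_norm(x, eps=1e-6):
    # Normalize each position over its feature dimension (no learned gain/bias here).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def dropout_residual_layernorm(inputs, outputs, rate=0.1, training=False, rng=None):
    """Apply dropout to the layer output, add the input back, then layer-normalize."""
    if training and rate > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        keep = rng.random(outputs.shape) >= rate
        # Inverted dropout: rescale kept units so the expected value is unchanged.
        outputs = outputs * keep / (1.0 - rate)
    return layer_norm(inputs + outputs)
```

With the mask applied before the softmax, position `i` only receives attention weight from positions `0..i`, which is what makes the layer usable in a left-to-right decoder.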


Run for a simple test on toy data.