tf-attentive-conv: Attentive Convolution
This is a TensorFlow implementation of Yin Wenpeng's paper "Attentive Convolution", published in TACL in 2018. Wenpeng's original code is written in Theano.
I only implement the light attentive convolution described in Sect. 3.1 of the paper. The authors argue that even this light version of AttConv outperforms some of the pioneering attentive RNNs in both the intra-context (context = query, i.e. self-attention) and extra-context (context != query) settings. The following figure (from the paper) illustrates this idea:
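As a complement to the figure, here is a minimal NumPy sketch of the light attentive convolution for a single unbatched example. It is not the code in this repository. It assumes dot-product matching scores, a filter width of 3 with zero padding, and a tanh activation; the names `light_attentive_conv`, `W`, `W_c` and `b` are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def light_attentive_conv(query, context, W, W_c, b):
    """Sketch of a light attentive convolution (assumed formulation).

    query:   (T_q, d) hidden states of the text being convolved
    context: (T_c, d) hidden states of the context (pass query itself
             for the intra-context / self-attention setting)
    W:       (3 * d, d_out) weights of the width-3 convolution filter
    W_c:     (d, d_out) weights applied to the attentive context
    b:       (d_out,) bias
    """
    # Matching score between every query position and every context position
    # (dot product is an assumption; other scoring functions are possible).
    scores = query @ context.T                      # (T_q, T_c)
    attn = softmax(scores, axis=-1)                 # each row sums to 1
    attentive_context = attn @ context              # (T_q, d)

    # Width-3 windows [h_{i-1}, h_i, h_{i+1}] with zero padding at both ends.
    padded = np.pad(query, ((1, 1), (0, 0)))
    windows = np.concatenate(
        [padded[:-2], padded[1:-1], padded[2:]], axis=-1
    )                                               # (T_q, 3 * d)

    # Convolution over the local window plus the attentive context term.
    return np.tanh(windows @ W + attentive_context @ W_c + b)
```

Calling `light_attentive_conv(x, x, W, W_c, b)` gives the intra-context case, while passing a different `context` gives the extra-context case. The output has one `d_out`-dimensional vector per query position.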
What did I change?
Nothing big. I did add some features:
- add a dropout-resnet-layernorm block before the output
- add masking to ensure causality, so that one may use it for decoding as well.
By default these features are all disabled.
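The two optional features can be sketched in NumPy as below. This is an illustration under assumptions, not the repository's code: the ordering (dropout on the layer output, then the residual add, then layer normalization) is inferred from the block's name, the learned gain and bias of layer normalization are omitted, and `causal_mask` and `dropout_resnet_layernorm` are hypothetical names.

```python
import numpy as np


def causal_mask(scores):
    """Block attention to future positions.

    scores: (T_q, T_c) attention scores before the softmax. Entry (i, j)
    with j > i is replaced by a large negative value, so that after the
    softmax position i puts (almost) no weight on later positions.
    """
    t_q, t_c = scores.shape
    future = np.triu(np.ones((t_q, t_c), dtype=bool), k=1)
    return np.where(future, -1e9, scores)


def layer_norm(x, eps=1e-6):
    """Normalize each row to zero mean and unit variance (no gain/bias)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def dropout_resnet_layernorm(inputs, outputs, rate=0.1, training=False, rng=None):
    """Compute layer_norm(inputs + dropout(outputs)).

    inputs and outputs must have the same shape for the residual add.
    Dropout is only applied when training is True (inverted dropout).
    """
    if training and rate > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        keep = rng.random(outputs.shape) >= rate
        outputs = outputs * keep / (1.0 - rate)
    return layer_norm(inputs + outputs)
```

The mask would be applied to the score matrix right before the softmax, which is what makes the layer usable for left-to-right decoding. The residual connection requires the layer's output dimension to match its input dimension.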
See app.py for a simple test on toy data.