hierarchical-transformer-1d

Implementation of H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequence Learning, built with Hugging Face Transformers
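The core idea of hierarchical attention is that far-apart token blocks can attend to each other at coarse resolution, so the full n × n attention matrix never has to be materialized. The snippet below is an illustrative one-level sketch of that coarsening (not the paper's full multi-level H-matrix scheme, and not code from this repo): keys and values are averaged within blocks, and queries attend to the block averages.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coarse_attention(q, k, v, block=4):
    """One-level coarsening sketch: queries attend to block-averaged
    keys/values, shrinking attention from n x n to n x (n // block)."""
    n, d = q.shape
    k_coarse = k.reshape(n // block, block, d).mean(axis=1)  # (n//block, d)
    v_coarse = v.reshape(n // block, block, d).mean(axis=1)  # (n//block, d)
    weights = softmax(q @ k_coarse.T / np.sqrt(d))           # (n, n//block)
    return weights @ v_coarse                                # (n, d)
```

The full method applies this recursively, keeping fine-grained attention near the diagonal and progressively coarser attention for more distant blocks, which is what brings the cost down to linear in sequence length.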

Applied methodology

  • Reversible residual layers
  • Chunked feed-forward layers
  • Rotary positional embeddings
  • Pre-shifted feature space (token shift)
  • Hierarchical matrix attention
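Of the techniques above, rotary positional embeddings encode position by rotating pairs of feature channels through position-dependent angles. The following is a minimal NumPy sketch of the idea (an illustration, not the implementation used in this repo); `base` follows the common 10000 convention.

```python
import numpy as np

def apply_rotary(x, base=10000.0):
    """Rotate feature pairs by position-dependent angles.
    x: (seq_len, dim) with even dim."""
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = 1.0 / (base ** (np.arange(half) / half))
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # a 2-D rotation applied to each (x1, x2) channel pair
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

Because each pair is only rotated, vector norms are preserved and position 0 is left unchanged, while relative offsets between positions become visible to dot-product attention.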

Work in progress (2021-11-12).
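The feature-space pre-shift listed above (often called token shifting) can be sketched as follows; this is an illustrative assumption of the common variant, where half of the feature channels are shifted one position along the sequence so each token sees part of its predecessor's features before attention.

```python
import numpy as np

def token_shift(x):
    """Shift half the feature channels one step along the sequence.
    x: (seq_len, dim); position 0 gets zeros in the shifted half."""
    seq_len, dim = x.shape
    half = dim // 2
    shifted = np.zeros((seq_len, half))
    shifted[1:] = x[:-1, :half]  # channel i at position t <- position t-1
    return np.concatenate([shifted, x[:, half:]], axis=-1)
```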

Citations

@misc{zhu2021htransformer1d,
    title   = {H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences}, 
    author  = {Zhenhai Zhu and Radu Soricut},
    year    = {2021},
    eprint  = {2107.11906},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

@software{lucidrains2021,
    author       = {Phil Wang},
    title        = {lucidrains/h-transformer-1d: 0.1.7},
    month        = {nov},
    year         = {2021},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/lucidrains/h-transformer-1d}},
    commit       = {4e7f4fc58bab9a0bedd31951dce509c401ecdb7f}
}
