hierarchical-transformer-1d

Implementation of H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequence Learning, built with Hugging Face Transformers
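The core idea of hierarchical attention is that far-apart token blocks can attend to each other at coarse resolution, so the full n × n attention matrix never has to be materialized. The snippet below is an illustrative one-level sketch of that coarsening (not the paper's full multi-level H-matrix scheme, and not code from this repo): keys and values are averaged within blocks, and queries attend to the block averages.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coarse_attention(q, k, v, block=4):
    """One-level coarsening sketch: queries attend to block-averaged
    keys/values, shrinking attention from n x n to n x (n // block)."""
    n, d = q.shape
    k_coarse = k.reshape(n // block, block, d).mean(axis=1)  # (n//block, d)
    v_coarse = v.reshape(n // block, block, d).mean(axis=1)  # (n//block, d)
    weights = softmax(q @ k_coarse.T / np.sqrt(d))           # (n, n//block)
    return weights @ v_coarse                                # (n, d)
```

The full method applies this recursively, keeping fine-grained attention near the diagonal and progressively coarser attention for more distant blocks, which is what brings the cost down to linear in sequence length.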

Applied methodology

  • Reversible residual layers
  • Chunked feed-forward layers
  • Rotary positional embeddings
  • Pre-shifted feature space (token shift)
  • Hierarchical matrix attention
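Of the techniques above, rotary positional embeddings encode position by rotating pairs of feature channels through position-dependent angles. The following is a minimal NumPy sketch of the idea (an illustration, not the implementation used in this repo); `base` follows the common 10000 convention.

```python
import numpy as np

def apply_rotary(x, base=10000.0):
    """Rotate feature pairs by position-dependent angles.
    x: (seq_len, dim) with even dim."""
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = 1.0 / (base ** (np.arange(half) / half))
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # a 2-D rotation applied to each (x1, x2) channel pair
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

Because each pair is only rotated, vector norms are preserved and position 0 is left unchanged, while relative offsets between positions become visible to dot-product attention.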

Work in progress (2021-11-12).
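The feature-space pre-shift listed above (often called token shifting) can be sketched as follows; this is an illustrative assumption of the common variant, where half of the feature channels are shifted one position along the sequence so each token sees part of its predecessor's features before attention.

```python
import numpy as np

def token_shift(x):
    """Shift half the feature channels one step along the sequence.
    x: (seq_len, dim); position 0 gets zeros in the shifted half."""
    seq_len, dim = x.shape
    half = dim // 2
    shifted = np.zeros((seq_len, half))
    shifted[1:] = x[:-1, :half]  # channel i at position t <- position t-1
    return np.concatenate([shifted, x[:, half:]], axis=-1)
```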

Citations

@misc{zhu2021htransformer1d,
    title   = {H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences}, 
    author  = {Zhenhai Zhu and Radu Soricut},
    year    = {2021},
    eprint  = {2107.11906},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

@software{lucidrains2021,
    author       = {Phil Wang},
    title        = {lucidrains/h-transformer-1d: 0.1.7},
    month        = {nov},
    year         = {2021},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/lucidrains/h-transformer-1d}},
    commit       = {4e7f4fc58bab9a0bedd31951dce509c401ecdb7f}
}
