SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Document |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers (IEEE), 2022 |
Online Access: | https://hdl.handle.net/1721.1/143674 |