RWKV
RWKV is an RNN with transformer-level performance at some language modeling tasks. Unlike other RNNs, it can be scaled to tens of billions of parameters quite efficiently.
RWKV is an RNN with transformer-level performance at some language modeling tasks. Unlike other RNNs, it can be scaled to tens of billions of parameters quite efficiently.