tuned-lens
A library implementing the Tuned Lens, along with other tools for extracting, manipulating, and studying the learned representations of transformers across layers.
https://github.com/norabelrose/tuned-lens
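For illustration, a minimal sketch of applying a tuned lens to an intermediate layer, assuming the `TunedLens.from_model_and_pretrained` loader and per-layer `forward(h, idx)` interface described in the project's README (the model name is just an example; check the repo for the current API):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from tuned_lens.nn.lenses import TunedLens

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m-deduped")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m-deduped")

# Load a lens trained for this model (downloaded from the Hugging Face Hub).
lens = TunedLens.from_model_and_pretrained(model)

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states

# Translate an intermediate hidden state into next-token logits.
h = hidden[6][:, -1, :]          # residual stream after layer 6, last position
logits = lens.forward(h, 6)      # lens is trained per layer, hence the index
print(tokenizer.decode(logits.argmax(-1)))
```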
RWKV
RWKV is an RNN with transformer-level performance on some language modeling tasks. Unlike other RNNs, it can be scaled efficiently to tens of billions of parameters.
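To make the "RNN" claim concrete, here is a simplified NumPy sketch of RWKV's WKV operator in its recurrent form, following the formula in the RWKV paper (the log-space stabilization tricks used in real implementations are omitted). The recurrent state is O(channels) regardless of sequence length, which is why inference scales like an RNN:

```python
import numpy as np

def wkv_recurrent(w, u, k, v):
    """Simplified RWKV WKV operator in recurrent form.

    w: per-channel decay (> 0), u: per-channel bonus for the current token,
    k, v: [T, C] key/value sequences. The state (num, den) has size O(C)
    no matter how long the sequence is.
    """
    T, C = k.shape
    num = np.zeros(C)   # running decayed sum of e^{k_i} * v_i
    den = np.zeros(C)   # running decayed sum of e^{k_i}
    out = np.zeros((T, C))
    for t in range(T):
        # The current token gets an extra bonus weight e^{u + k_t}.
        out[t] = (num + np.exp(u + k[t]) * v[t]) / (den + np.exp(u + k[t]))
        # Decay past contributions and fold in token t for future steps.
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]
        den = np.exp(-w) * den + np.exp(k[t])
    return out
```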
GPT-NeoX
A library for efficiently training large language models with tens of billions of parameters across multiple machines. It is currently maintained by EleutherAI.
LM Eval Harness
Our library for reproducible and transparent evaluation of LLMs.
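As a hedged example, recent versions of the harness expose a `simple_evaluate` entry point for evaluating a Hugging Face model from Python (the model and task names below are placeholders; see the repo for the current interface):

```python
import lm_eval

# Evaluate a Hugging Face causal LM on a benchmark task.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m",
    tasks=["lambada_openai"],
)
print(results["results"])
```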
Mesh Transformer JAX
A JAX- and TPU-based library for model-parallel transformer training, developed by Ben Wang. It was used to train GPT-J.
https://github.com/kingoflolz/mesh-transformer-jax
GPT-Neo Library
A Mesh TensorFlow library for training language models. It was used to train the GPT-Neo models but has since been retired and is no longer maintained; we currently recommend the GPT-NeoX library for LLM training.
https://github.com/EleutherAI/gpt-neo