Mesh Transformer JAX
A library for training transformer models with JAX on TPUs, developed by Ben Wang. It was used to train GPT-J.
https://github.com/kingoflolz/mesh-transformer-jax
VQGAN-CLIP
A technique for doing text-to-image synthesis cheaply using pretrained CLIP and VQGAN models.
VQGAN-CLIP is a methodology for using multimodal embedding models such as CLIP to guide text-to-image generation without any additional training: a pretrained generator's output is iteratively optimized so that CLIP scores it as matching the text prompt. While the results tend to be worse than those of purpose-trained text-to-image models, this approach is orders of magnitude cheaper and can often be assembled out of pre-existing, independently valuable models. Our core approach has been adapted to a variety of domains, including text-to-3D and audio-to-image synthesis, as well as the development of novel synthetic materials.
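The guidance loop behind this methodology can be sketched abstractly: holding both pretrained models frozen, optimize only the generator's latent input so that the embedding of the generated image moves toward the embedding of the prompt. Below is a minimal toy illustration in NumPy; the linear "decoder" and "encoder" maps and all names are hypothetical stand-ins for a real frozen VQGAN and CLIP, and a finite-difference gradient substitutes for backpropagation purely for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the pretrained models (hypothetical linear maps).
# Real VQGAN-CLIP uses a frozen VQGAN decoder and a frozen CLIP image encoder.
DECODER = rng.normal(size=(16, 8))      # "VQGAN": latent (8,) -> image features (16,)
IMG_ENCODER = rng.normal(size=(4, 16))  # "CLIP": image features -> embedding (4,)

def embed_image(z):
    """Decode a latent, then embed the result. Both 'models' stay frozen."""
    e = IMG_ENCODER @ (DECODER @ z)
    return e / np.linalg.norm(e)

def clip_guided_search(text_embedding, steps=500, lr=0.05):
    """Ascend cosine similarity between the generated image's embedding and
    the text embedding by updating ONLY the latent -- no weights are trained."""
    t = text_embedding / np.linalg.norm(text_embedding)
    z = rng.normal(size=8)
    for _ in range(steps):
        base = embed_image(z) @ t
        # Finite-difference gradient w.r.t. the latent (for illustration;
        # a real implementation would backpropagate through both models).
        grad = np.zeros_like(z)
        for i in range(z.size):
            dz = z.copy()
            dz[i] += 1e-4
            grad[i] = (embed_image(dz) @ t - base) / 1e-4
        z += lr * grad
    return z, embed_image(z) @ t

target = rng.normal(size=4)  # stand-in for a CLIP text embedding of a prompt
z_opt, similarity = clip_guided_search(target)
```

Because only the small latent vector is optimized, this costs a fraction of training a text-to-image model, which is the source of the "orders of magnitude cheaper" trade-off described above.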
GPT-Neo Library
A library for training language models written in Mesh TensorFlow. This library was used to train the GPT-Neo models, but has since been retired and is no longer maintained. We currently recommend the GPT-NeoX library for LLM training.
https://github.com/EleutherAI/gpt-neo
The Pile
The Pile is a curated, large-scale corpus for training language models, composed of 22 diverse, high-quality sub-datasets. It is publicly available and freely downloadable, and has been used by a number of organizations to train large language models.
OpenWebText2
OpenWebText2 is an enhanced version of the original OpenWebTextCorpus, a web-scraped dataset built from URLs shared in Reddit submissions; it covers submissions from 2005 through April 2020. It was developed primarily for inclusion in the Pile.