Representation Learning and Retrieval
Tags: Machine Learning, Transformers, Finetuning
A look at extending pre-trained representations with document retrieval to better solve downstream tasks.
by Madison May

Large Memory Layers with Product Keys
Tags: JAX, Transformers, Attention
Augmenting transformer language models with sparse access to large memory matrices.
by Madison May

Finetuning Transformers with JAX + Haiku
Tags: JAX, Machine Learning, Programming
A practical, code-first look at DeepMind's new Haiku library.
by Madison May

A Survey of Long-Term Context in Transformers
Tags: Machine Learning, Attention, Transformers
Exploring six noteworthy approaches for incorporating longer-term context in transformer models.
by Madison May