Representation Learning and Retrieval
A look at extending pre-trained representations with document retrieval to better solve downstream tasks.
Tags: Machine Learning, Transformers, Finetuning
by Madison May

Large Memory Layers with Product Keys
Augmenting transformer language models with sparse access of large memory matrices.
Tags: JAX, Transformers, Attention
by Madison May

Pattern-Exploiting Training
Leveraging the knowledge locked away in language models by reframing categorical tasks as constrained text generation.
Tags: Machine Learning, Transformers, Finetuning
by Madison May

Optimal Transport and the Sinkhorn Transformer
Optimal Transport, the Sinkhorn Transformer, and Charmin Ultra-Soft.
Tags: Machine Learning, Transformers
by Madison May

A Survey of Methods for Model Compression in NLP
A foray into numeric precision reduction, operation fusion, pruning, knowledge distillation, and module replacement.
Tags: Machine Learning, Transformers, Model Compression
by Madison May

Talking-Heads Attention
Put on your headphones, jam out to some funky 80s rock, and read about an equally funky variation on multi-head attention.
Tags: Machine Learning, Programming, Transformers
by Madison May

Finetuning Transformers with JAX + Haiku
A practical, code-first look at DeepMind's new Haiku library.
Tags: JAX, Machine Learning, Programming
by Madison May

A Survey of Long-Term Context in Transformers
Exploring 6 noteworthy approaches for incorporating longer-term context in transformer models.
Tags: Machine Learning, Attention, Transformers
by Madison May