The Essência, by K

Brief Summaries and Quick Thoughts - One post at a time...

Only yesterday the practical things of today were decried as impractical, and the theories which will be practical tomorrow will always be branded as valueless games by the practical men of today! - William Feller

Taken for Granted? It's time we reconsider the Instruction Tuning Loss! 👉

Uncovering the benefits of Weighted Instruction Tuning (WIT)

11 min read · July 30, 2025

2025 · LLMs instruction-tuning generalization loss

Towards Understanding Multi-Task Learning (Generalization) of LLMs via Detecting and Exploring Task-Specific Neurons

12 min read · December 24, 2024

2024 · instruction-tuning interpretability

Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models

11 min read · December 24, 2024

2024 · instruction-tuning interpretability

Transformer-Based Language Models

3 min read · December 15, 2024

2024 · transformers LMs pre-training