Bibliographic Details
Title: |
Efficient Training of Sparse Autoencoders for Large Language Models via Layer Groups |
Authors: |
Ghilardi, Davide; Belotti, Federico; Molinari, Marco |
Publication Year: |
2024 |
Collection: |
Computer Science |
Subject Terms: |
Computer Science - Computation and Language, Computer Science - Artificial Intelligence |
Description: |
Sparse Autoencoders (SAEs) have recently been employed as an unsupervised approach for understanding the inner workings of Large Language Models (LLMs). They reconstruct the model's activations with a sparse linear combination of interpretable features. However, training SAEs is computationally intensive, especially as models grow in size and complexity. To address this challenge, we propose a novel training strategy that reduces the number of trained SAEs from one per layer to one for a given group of contiguous layers. Our experimental results on Pythia 160M highlight a speedup of up to 6x without compromising the reconstruction quality and performance on downstream tasks. Therefore, layer clustering presents an efficient approach to train SAEs in modern LLMs. |
Document Type: |
Working Paper |
Access URL: |
http://arxiv.org/abs/2410.21508 |
Accession Number: |
edsarx.2410.21508 |
Database: |
arXiv |