Report
Asterisk*: Keep it Simple
Title: | Asterisk*: Keep it Simple |
---|---|
Authors: | Semenov, Andrew |
Publication Year: | 2024 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Computation and Language, Computer Science - Artificial Intelligence |
Description: | This paper describes Asterisk, a compact GPT-based model for generating text embeddings. The model uses a minimalist architecture with two layers, two attention heads, and 256 embedding dimensions. By applying knowledge distillation from larger pretrained models, we explore the trade-offs between model size and performance while minimizing computational and memory requirements. The model is primarily evaluated and optimized for classification tasks, with experimental results showing its moderate performance in zero-shot classification across various downstream applications. With additional configuration, the model performance can approach or even surpass that of larger architectures on specific classification tasks. |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2411.05691 |
Accession Number: | edsarx.2411.05691 |
Database: | arXiv |
Publication Date: | 2024-11-08 |
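
The description gives the model's size as two layers, two attention heads, and 256 embedding dimensions. As a rough illustration of what that implies, the sketch below counts parameters for a GPT-2-style decoder with those three values. Only `n_layer`, `n_head`, and `n_embd` come from the record; the vocabulary size, context length, 4x feed-forward width, bias terms, and tied output head are assumptions borrowed from GPT-2 conventions, not details confirmed by the paper, and the names `TinyGPTConfig`, `block_params`, and `total_params` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TinyGPTConfig:
    n_layer: int = 2         # from the record's description
    n_head: int = 2          # from the record's description
    n_embd: int = 256        # from the record's description
    vocab_size: int = 50257  # ASSUMPTION: GPT-2 BPE vocabulary size
    block_size: int = 1024   # ASSUMPTION: GPT-2 context length


def head_dim(cfg: TinyGPTConfig) -> int:
    """Width of each attention head; the head count does not change the parameter total."""
    assert cfg.n_embd % cfg.n_head == 0
    return cfg.n_embd // cfg.n_head


def block_params(cfg: TinyGPTConfig) -> int:
    """Parameters in one transformer block (ASSUMPTION: GPT-2 layout with biases)."""
    d = cfg.n_embd
    attention = 4 * d * d + 4 * d  # Q, K, V and output projections, with biases
    mlp = 8 * d * d + 5 * d        # d -> 4d -> d feed-forward, with biases
    layer_norms = 4 * d            # two LayerNorms, each with scale and shift
    return attention + mlp + layer_norms


def total_params(cfg: TinyGPTConfig) -> int:
    """Total parameters, assuming the output head shares the token embedding matrix."""
    d = cfg.n_embd
    embeddings = cfg.vocab_size * d + cfg.block_size * d  # token + position tables
    final_norm = 2 * d
    return embeddings + cfg.n_layer * block_params(cfg) + final_norm


if __name__ == "__main__":
    cfg = TinyGPTConfig()
    print("head dimension:", head_dim(cfg))
    print("parameters per block:", block_params(cfg))
    print("total parameters:", total_params(cfg))
```

Under these assumptions the two transformer blocks account for only about 1.58 million parameters, so the embedding tables dominate the total; a smaller vocabulary or shorter context would shrink the model far more than removing a layer would.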