The following image was shared on X by Yann LeCun.

The architecture nomenclature for LLMs is somewhat confusing and unfortunate.
What's called "encoder only" actually has an encoder and a decoder (just not an auto-regressive decoder).
What's called "encoder-decoder" really means "encoder with auto-regressive decoder".
What's called "decoder only" really means "auto-regressive encoder-decoder".

Along with the unfortunate nomenclature, I was also not fortunate enough to have used all of these models. This is the first time I have seen many of them, and I knew and had used just 5-6 of them. How many of these have you used?
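For anyone wondering what "auto-regressive" changes in practice, here is a rough sketch of my own (not from the post or the image). It assumes the usual way of drawing the distinction: a non-auto-regressive stack lets every token attend to every other token, while an auto-regressive one applies a causal mask so position i only sees positions up to i. The function names `bidirectional_mask`, `causal_mask`, and `show` are made up for illustration.

```python
def bidirectional_mask(n):
    """Non-auto-regressive case: every position may attend to every position."""
    return [[True] * n for _ in range(n)]


def causal_mask(n):
    """Auto-regressive case: position i may attend only to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def show(mask):
    """Render a mask as text: 'x' = attention allowed, '.' = blocked."""
    return "\n".join("".join("x" if v else "." for v in row) for row in mask)


if __name__ == "__main__":
    print("bidirectional (the 'encoder only' style):")
    print(show(bidirectional_mask(4)))
    print("causal (the 'decoder only' style):")
    print(show(causal_mask(4)))
```

Running it prints a full 4x4 block of `x` for the bidirectional case and a lower triangle for the causal case. In the "encoder-decoder" style, the encoder side would typically use the first pattern and the auto-regressive decoder side the second.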