It seems that large language models keep getting bigger, and as they grow they need more and more processing power.
I know that some LLM developers have released smaller versions to see how small a model can be while still functioning.
But what happens when you want an LLM to do one specific job? Surely it only needs a fraction of the data a general-purpose model does.
Potential benefits of SLMs (small language models):
- Less training data needed.
- Potentially faster responses.
- Less room to hallucinate/go wrong.
- A smaller set of possible behaviors, so complete testing becomes more realistic.
- Reduced running costs.
- Lower-spec hardware requirements.
Has anyone tried dedicating an LLM to a specific job/task and then shrinking it down to create an SLM?
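One concrete version of what I mean is knowledge distillation: freeze a big teacher model, then train a much smaller student to match the teacher's outputs on task-specific data only. A rough sketch of the standard Hinton-style loss (the commented training loop is a placeholder, not something I've actually run):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions and push the student toward the teacher.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence scaled by T^2, as in classic distillation.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * temperature ** 2

# Training loop sketch: teacher is frozen, student sees only task-specific data.
# for batch in task_specific_dataloader:
#     with torch.no_grad():
#         teacher_logits = teacher(batch["input_ids"]).logits
#     student_logits = student(batch["input_ids"]).logits
#     loss = distillation_loss(student_logits, teacher_logits)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
```

The idea is that the student only has to cover the narrow task distribution, not everything the teacher knows.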
TL;DR: How large does an LLM have to be for a toaster or microwave?
Talkie Toaster https://www.youtube.com/watch?v=vLm6oTCFcxQ
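To make the toaster question slightly less rhetorical, here's the napkin math I'd start from. The parameter counts are arbitrary examples, and this only counts weight storage, ignoring activations, the runtime, and everything else a real deployment needs:

```python
# Rough weight-storage footprint of a tiny task-specific model at 4-bit quantization.
def model_memory_mb(num_params, bits_per_weight=4):
    return num_params * bits_per_weight / 8 / 1024 / 1024

for params in (1_000_000, 10_000_000, 100_000_000):
    print(f"{params:>11,} params @ 4-bit ~ {model_memory_mb(params):5.1f} MB")
# ~0.5 MB, ~4.8 MB and ~47.7 MB respectively: the first might squeeze into a
# beefier microcontroller's flash, the last already wants something like a Pi.
```

Whether a model that small can actually hold a useful "toaster conversation" is exactly what I'm wondering.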