LLMs, Google, and whatever articles I've read have failed me on this, so I'm hoping to find someone with some insight into this simple question.
How is compute distributed in research? Sutskever's 20% for superalignment is confusing to me: 20% of what available compute? Do AI companies partition off portions of their compute on a project-by-project basis? Is this compute reserved only during training, or is 100% of compute dedicated to training while a training run is underway? If so, given that B2B/consumer model serving seems to use the same GPUs as training, what hardware do researchers specifically use?
I'm having trouble conceptualizing that 20% in practical terms.