/u/ChipmunkUpstairs1876

Built a pipeline for training HRM-sMOE LLMs

Just as the title says, I've built a pipeline for building HRM and HRM-sMOE LLMs. However, I only have two RTX 2080 Ti GPUs, and training is painfully slow. I'm currently training a model on the TinyStories dataset and will then be running eval…