It's difficult to find a motherboard with more than two PCIe x16 slots. What if I connect the GPUs through PCIe x1 slots instead? Would that only slow down loading the model, once per boot, and have no impact on performance after that? Or does the model need to be reloaded many times during a session?
I imagine that when you start a new conversation you need to load a clean copy? So maybe the model is loaded once per conversation, and then you can make many queries without being limited by PCIe bandwidth?
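
To put a rough number on the one-time load cost I'm asking about, here is a back-of-envelope sketch. It assumes PCIe 3.0 (about 0.985 GB/s theoretical per lane) and a hypothetical 24 GB of weights; both are my assumptions, and real transfers will be slower because of protocol overhead and disk speed. It only estimates the initial copy to the GPU and says nothing about traffic during inference.

```python
# Rough estimate of how long it takes to copy model weights to a GPU
# over PCIe links of different widths. All figures are assumptions.

# Theoretical PCIe 3.0 throughput per lane in GB/s (8 GT/s, 128b/130b encoding).
PCIE3_GB_PER_S_PER_LANE = 0.985


def load_time_seconds(model_gb: float, lanes: int,
                      gb_per_s_per_lane: float = PCIE3_GB_PER_S_PER_LANE) -> float:
    """Best-case time to push model_gb gigabytes through a link with `lanes` lanes."""
    if model_gb < 0 or lanes <= 0 or gb_per_s_per_lane <= 0:
        raise ValueError("model size must be >= 0; lanes and bandwidth must be > 0")
    return model_gb / (lanes * gb_per_s_per_lane)


if __name__ == "__main__":
    MODEL_GB = 24.0  # hypothetical model size, not a measured value
    for lanes in (1, 4, 16):
        seconds = load_time_seconds(MODEL_GB, lanes)
        print(f"x{lanes:<2}: about {seconds:.1f} s to load {MODEL_GB:.0f} GB")
```

In this best case, an x1 link takes 16 times longer than an x16 link for the same copy, so the open question is really whether anything beyond that initial copy has to cross the bus.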