Replika/Character AI: How is it possible to handle all those predictions so fast?

Hi,

As the title suggests, how are they doing it? I have developed a platform for commercial use that uses Goliath on Replicate to run predictions.

The problem is that, at 10 incoming messages per second, the queue backs up and the last messages take hours to process.
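To put rough numbers on that backlog, here is a back-of-the-envelope sketch. The 20-second latency per prediction, the single worker, and the one-minute burst are assumed figures for illustration only, not measurements from my setup, and the function names are made up for this example.

```python
import math


def last_message_wait_seconds(arrival_rate, service_time, workers, burst_seconds):
    """Estimate how long the last queued message waits after a burst of traffic.

    arrival_rate  -- incoming messages per second (10 in my case)
    service_time  -- seconds one prediction takes (assumed, not measured)
    workers       -- predictions that can run in parallel (assumed)
    burst_seconds -- how long the traffic lasts (assumed)
    """
    throughput = workers / service_time  # messages completed per second
    if arrival_rate <= throughput:
        return 0.0  # capacity keeps up, so no backlog builds in this simple model
    backlog = (arrival_rate - throughput) * burst_seconds  # messages still queued
    return backlog / throughput  # time to drain the queue after the burst


def workers_needed(arrival_rate, service_time):
    """Minimum parallel predictions needed so the queue does not grow."""
    return math.ceil(arrival_rate * service_time)


wait = last_message_wait_seconds(arrival_rate=10, service_time=20, workers=1, burst_seconds=60)
print(f"last message waits about {wait / 3600:.1f} hours")
print(f"parallel predictions needed to keep up: {workers_needed(10, 20)}")
```

Under those assumptions, a single minute of traffic already leaves a backlog that takes hours to clear, which is why I suspect the answer lies in parallelism or batching, or in a much faster model.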

Can you suggest a good platform or a faster LLM (Mixtral or Vicuna, for example) so that users can expect a response in a reasonable time? Even 5 seconds would be perfect.

Thank you

submitted by /u/Sapessiii