I saw a paper/article on Hacker News at one point about building LLMs that don't use floating-point GPU math for their calculations, so you wouldn't get the non-determinism problem (ask the same question, get a different response).
How is that going?
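For anyone wondering why floats cause this in the first place, here's a minimal sketch in plain Python (no GPU needed, just an assumption-free illustration of the underlying issue): float addition isn't associative, so if a GPU reduces the same numbers in a different order across runs, the result can differ slightly, and near a tie that's enough to flip which token gets picked.

```python
# Demo: float addition is not associative, so summation order matters.
# This order-dependence is the root cause of run-to-run differences
# when parallel hardware sums values in a nondeterministic order.
import random

random.seed(0)  # make this demo itself repeatable
vals = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

forward = sum(vals)             # sum in one order
backward = sum(reversed(vals))  # same numbers, opposite order

print(forward == backward)      # often False
print(abs(forward - backward))  # tiny, but enough to flip an argmax near a tie
```

Integer or fixed-point math doesn't have this problem, which (as I understand it) is the pitch of that line of work.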
I work with RAG tech and it seems amazing, but it's also sketchy when a table is read incorrectly and values end up off by a significant figure.