/u/vagobond45

Compact offline medical SLM with Native Knowledge Graph + RAG audit (benchmark + HF demo)

I’ve been experimenting with a slightly different approach to medical LMs and would really value feedback from people working on ML, health IT, or clinical education. Instead of chasing more parameters, I built a ~6 GB medical SLM that’s tightly couple…

LLMs' Path to GenAI: Graph Info Maps

LLMs: a race for more data centers, Nvidia chips, and more model parameters — yet no LLM can understand concepts and their relationships; they remain limited to next-token prediction. Trying to increase model parameters in each generation is akin to trying …

6GB Offline Medical SLM with Native Knowledge Graph, zero hallucinations, runs on your phone

We built a 6 GB, fully self-contained Medical SLM that runs offline on laptops and phones — no cloud, no data leaks. It combines BioGPT-Large + a native biomedical knowledge graph (5,000+ nodes, 25,000+ edges) with graph-aware embeddings and real-time R…
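The "RAG audit" idea above — checking a generated statement against the knowledge graph before surfacing it — can be sketched as a triple-membership test. This is a minimal illustration, not the post's actual implementation; the edge set and `audit_claim` helper are hypothetical stand-ins:

```python
# Hypothetical edge set standing in for the biomedical knowledge graph.
# Real graphs would be loaded from the 25,000+ edge dictionary.
KG_EDGES = {
    ("metformin", "treats", "type_2_diabetes"),
    ("aspirin", "treats", "headache"),
}

def audit_claim(subject: str, relation: str, obj: str) -> bool:
    """Return True only if the triple exists in the graph, i.e. the
    model's statement is grounded; ungrounded claims get flagged."""
    return (subject, relation, obj) in KG_EDGES

print(audit_claim("metformin", "treats", "type_2_diabetes"))  # True
print(audit_claim("metformin", "treats", "headache"))         # False
```

Grounding every emitted claim in an explicit edge set is what lets a small model refuse rather than hallucinate when the graph has no supporting path.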

Modular AI Architecture with Dynamic Digital Information Maps

I already created a medical graph dictionary with nodes and edges, generated uniform graph vectors (85%), combined them with MiniLLM vectors (15%), and used them successfully in MLM and CLM (next-token prediction) training. With only 500 PubMed data sam…
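The 85/15 blend of graph vectors and MiniLLM vectors described above can be sketched as a weighted sum of normalized embeddings. This is an assumption about how the mixing works — the function name and normalization step are mine, only the 85%/15% weights come from the post:

```python
import numpy as np

def blend_embeddings(graph_vec: np.ndarray,
                     text_vec: np.ndarray,
                     graph_weight: float = 0.85) -> np.ndarray:
    """Combine a graph-derived vector with a text-model vector.

    Both vectors are L2-normalized first so neither source dominates
    by magnitude; the 85% graph / 15% text split follows the post.
    """
    g = graph_vec / np.linalg.norm(graph_vec)
    t = text_vec / np.linalg.norm(text_vec)
    return graph_weight * g + (1.0 - graph_weight) * t

# Toy example with random 32-dimensional vectors.
rng = np.random.default_rng(0)
combined = blend_embeddings(rng.normal(size=32), rng.normal(size=32))
print(combined.shape)  # (32,)
```

Normalizing before mixing keeps the weighting meaningful; otherwise a larger-magnitude source would dominate regardless of the chosen ratio.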

Code to create Uniform Graph Vectors

The code below was used to create uniform graph vectors based on the nodes and edges of a medical graph dictionary with 500 nodes (body parts, cellular structure, diseases, medical treatments, symptoms), hierarchical order (parent, child), and medical relationsh…
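Since the preview cuts off before the code, here is a hedged sketch of what a "uniform" (fixed-length) graph vector per node might look like: a category one-hot followed by parent and relation indicator slots over all nodes. The miniature `GRAPH` dictionary and the exact encoding are illustrative assumptions, not the author's actual code:

```python
import numpy as np

# Hypothetical miniature graph dictionary: node -> (category, parents, related).
# The real dictionary has 500 nodes across the five categories.
GRAPH = {
    "heart":       ("body_part", [], ["myocarditis"]),
    "myocarditis": ("disease", ["heart"], ["chest_pain"]),
    "chest_pain":  ("symptom", [], []),
}
CATEGORIES = ["body_part", "cellular_structure", "disease", "treatment", "symptom"]
NODES = sorted(GRAPH)

def uniform_graph_vector(node: str) -> np.ndarray:
    """Fixed-length ('uniform') vector for one node: a category one-hot,
    then parent and relation indicator slots over all nodes."""
    cat, parents, related = GRAPH[node]
    cat_part = [1.0 if c == cat else 0.0 for c in CATEGORIES]
    parent_part = [1.0 if n in parents else 0.0 for n in NODES]
    rel_part = [1.0 if n in related else 0.0 for n in NODES]
    return np.array(cat_part + parent_part + rel_part)

vec = uniform_graph_vector("myocarditis")
print(vec.shape)  # (11,): 5 categories + 3 parent slots + 3 relation slots
```

Because every node's vector has the same layout, these can be mixed directly with token embeddings during MLM/CLM training.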

Medical SLM model output based on the graph dictionary: 85% to 100% token accuracy, 0.002 loss, 1.01 perplexity — all from only 500 PubMed dataset samples with an 85% weight on graph-dictionary vector embeddings. These are the results of 20 epochs of MLM training; next I will run a CLM training
