Sold my crypto data company last year. We processed everything in the cloud - that was the whole model. Now I'm building the opposite.
Running all my inference locally on a NAS with an eGPU. Not because it's cheaper (it isn't, upfront) or faster (it isn't, for big models). Because the data never leaves.
The more I watch the AI space evolve, the more I think there's going to be a split. Most people will use cloud AI and not care. But there's a growing segment - developers, professionals handling sensitive data, privacy-conscious users - who will want capable models running on hardware they control.
I wrote up my thinking on this - the short version is that local-first isn't about rejecting cloud AI, it's about having the option.
Current setup is Ollama on an RTX 4070 12GB. The 7B-13B models are genuinely useful for daily work now. A year ago they weren't. That trajectory is what makes local viable.
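For anyone wondering what "daily work" looks like in practice, here's a minimal sketch of hitting a local Ollama server over its HTTP API from Python. Assumes the default port and a pulled 7B-class model; the model tag is just a stand-in for whatever you run.

```python
# Minimal sketch: query a local Ollama instance over its HTTP API.
# Assumes Ollama is listening on the default port (11434) and that a
# 7B-class model (here "llama3:8b" as a stand-in) has been pulled.
import requests

def ask_local(prompt: str, model: str = "llama3:8b") -> str:
    # The whole round trip stays on the machine: localhost -> GPU -> localhost.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask_local("Summarize the tradeoffs of local vs. cloud inference."))
```

Swap the endpoint for any OpenAI-compatible cloud API and the code barely changes, which is the point: local-first as an option, not a rewrite.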
Anyone else moving toward local inference? Curious whether this is a niche concern or something more people are thinking about.