We built an open-source data platform where AI agents are the primary operators — not humans clicking buttons

Most data platforms treat AI as a feature — slap on a chatbot, call it "AI-native." We went the other way: we built Datris so AI agents could operate an entire data platform autonomously through MCP (Model Context Protocol).

The idea: Agents shouldn't need custom integration code to work with data infrastructure. They should be able to discover what's available, create pipelines, validate data, query databases, search documents, and diagnose problems — all through a standardized protocol.
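To make the "standardized protocol" part concrete: MCP messages are JSON-RPC 2.0, and tool discovery and invocation use the `tools/list` and `tools/call` methods. Here is a minimal sketch of the two requests an agent's client would send after the usual `initialize` handshake. The tool name `list_tables` and its arguments are hypothetical placeholders, not confirmed Datris tool names; the real names come back in the `tools/list` response.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object, the message format MCP uses."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Step 1: ask the server which tools it exposes (names, descriptions, input schemas).
list_tools = jsonrpc_request("tools/list")

# Step 2: call one tool by name.
# "list_tables" and its arguments are hypothetical, used only for illustration.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "list_tables", "arguments": {"database": "postgres"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

In practice an MCP client library builds and sends these messages for you; the point is that the agent needs no Datris-specific integration code, only the generic discover-then-call loop.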

What agents can actually do with Datris (30+ MCP tools):

- Discover schemas, tables, and metadata across PostgreSQL, MongoDB, and 5 vector databases
- Create data pipelines from sample files: schema auto-detected, no config writing
- Validate data using plain English rules ("prices must be positive", "emails must be valid"): AI generates the validation code and runs it locally
- Transform data with natural language instructions
- Run SQL or describe what you want in plain English, and Datris generates and executes it
- Full RAG pipeline: ingest PDFs/Word/HTML, chunk, embed, semantic search
- Monitor jobs, diagnose failures, profile data quality
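For the plain-English validation item, here is a rough sketch of the kind of code a model might generate from the two example rules. This is an illustration only, not Datris's actual generated output; the column names `price` and `email`, the function names, and the simple email pattern are all assumptions.

```python
import re

# Deliberately simple pattern (assumption): one "@", no whitespace, a dot in the domain.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def price_is_positive(row):
    """Rule: "prices must be positive"."""
    value = row.get("price")
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and value > 0


def email_is_valid(row):
    """Rule: "emails must be valid"."""
    value = row.get("email")
    return isinstance(value, str) and EMAIL_RE.match(value) is not None


# Map each plain-English rule to the check generated for it.
RULES = {
    "prices must be positive": price_is_positive,
    "emails must be valid": email_is_valid,
}


def validate(rows):
    """Return a (row_index, rule) pair for every failed check."""
    failures = []
    for index, row in enumerate(rows):
        for rule, check in RULES.items():
            if not check(row):
                failures.append((index, rule))
    return failures


rows = [
    {"price": 19.99, "email": "a@example.com"},
    {"price": -5, "email": "b@example.com"},
    {"price": 3, "email": "not-an-email"},
]
failures = validate(rows)
print(failures)
```

Keeping each rule as a small named function tied to its original sentence makes failures easy to report back to the agent in the same language the rule was written in.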

Why this matters for AI development: The bottleneck for autonomous agents isn't intelligence — it's infrastructure access. Most agents today can reason well but can't actually do things with data without a human writing glue code. MCP changes that by giving agents a standard way to interact with tools. We built Datris around that idea from day one.

Stack: Open-source, Docker Compose, runs anywhere. AI via Anthropic, OpenAI, or Ollama (fully local/air-gapped). Spring Boot, MinIO, MongoDB, PostgreSQL, 5 vector DBs supported.

**Quick start:**

    git clone https://github.com/datris/datris-platform-oss
    cd datris-platform-oss
    cp .env.example .env
    docker compose up -d

MCP server: `uvx datris-mcp-server`

CLI: `brew tap datris/tap && brew install datris`
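If your MCP client reads a JSON config with an `mcpServers` map (Claude Desktop and several other clients do), the `uvx` command can be registered roughly as sketched here. The server label `datris` and the helper function are assumptions for illustration, and the post does not say which environment variables, if any, the server expects, so the `env` entry is left empty by default.

```python
import json


def mcp_client_config(server_name="datris", env=None):
    """Build an mcpServers-style config entry that launches the server via uvx."""
    entry = {"command": "uvx", "args": ["datris-mcp-server"]}
    if env:
        # Only include env when the caller supplies values; none are documented in the post.
        entry["env"] = dict(env)
    return {"mcpServers": {server_name: entry}}


config = mcp_client_config()
print(json.dumps(config, indent=2))
```

Paste the printed JSON into your client's MCP config file, and check the Datris docs for any connection settings the server needs.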

I've been building data infrastructure in financial services for 30+ years (Goldman Sachs, Bridgewater, Deutsche Bank, Freddie Mac). The pattern I kept seeing: platforms that can't evolve as fast as the agents using them. So we built one that puts agents first.

Curious what this community thinks — where do you see the biggest gaps in how AI agents interact with data infrastructure today?

GitHub: https://github.com/datris/datris-platform-oss

Docs: https://docs.datris.ai

Website: https://datris.ai

submitted by /u/toddfearn