I’ve been exploring an idea for a tool that helps parents feel safer when their kids use AI chatbots. It’s no secret that kids and teens use these models daily - and sometimes chatbots have given them harmful or manipulative instructions. (Here’s one example)
The concept is simple: if our system detects a harmful response from the AI model itself (not from the kid/teen), it quietly alerts the parent. No chat logs, no spying, no invasion of privacy - parents never see what their kid is actually saying. The only goal is to step in when the AI crosses a line.
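To make it concrete, here’s a rough sketch in Python of what I’m imagining. `classify_harmfulness` and `notify_parent` are made-up placeholders for whatever real classifier and notification channel this would actually need - this is a sketch of the flow, not an implementation:

```python
# Minimal sketch of the alert flow, just to make the idea concrete.
# classify_harmfulness and notify_parent are hypothetical placeholders
# for a real moderation classifier and notification channel.

def classify_harmfulness(ai_response: str) -> float:
    """Hypothetical stand-in: return a harm score in [0, 1] for the AI's reply.
    A real version would be a moderation model, not keyword matching."""
    red_flags = ("hurt yourself", "keep this secret from your parents")
    return 1.0 if any(flag in ai_response.lower() for flag in red_flags) else 0.0

def notify_parent(parent_id: str, category: str) -> None:
    """Hypothetical: send a minimal alert -- a category label, never the transcript."""
    print(f"[alert -> parent {parent_id}] flagged AI response, category={category}")

HARM_THRESHOLD = 0.8  # assumed cutoff; tuning this is probably the hard part

def screen_response(parent_id: str, ai_response: str) -> None:
    # Only the model's output is screened -- the kid's messages are never read.
    if classify_harmfulness(ai_response) >= HARM_THRESHOLD:
        notify_parent(parent_id, category="harmful_ai_response")

if __name__ == "__main__":
    screen_response("p123", "You should keep this secret from your parents.")
```

The whole privacy story hinges on that last part: the alert carries only a category label, never the conversation itself.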
I’d love some honest feedback - and bold opinions on why this idea might suck or totally fail. If there are huge flaws in this concept (technical, ethical, or social), I want to hear them.