Storytime AI: Voice-First, Mobile, and Built for Imagination
Storytime AI is a voice-first, multilingual storytelling platform for children aged 2–10. Built with Gemini, Whisper, LangChain, and ElevenLabs, it delivers personalized, screen-free narratives that adapt to each child's voice input and language level. I led the product end to end, from concept to launch, optimizing for low latency, educational value, and engagement.

Coming soon to mobile and selected for Google's AI Startup Program
Why We Built Storytime
Parents told us:
  • “Too much screen time.”
  • “Apps don’t really listen.”
  • “I need 10 minutes of peace.”
We built Storytime to give children agency through voice and give parents peace of mind through privacy-first design.
How Storytime Works
Parent Setup
Parent selects story length and theme preferences from our intuitive interface.
AI Greeting
TTS greets the child: “Hi Kiaan! What story would you like to hear?”
Voice Input
Child responds aloud. STT captures the input.
Dynamic Branching
The story branches across 2–5 interaction points based on the child's age and the time available.
Real-Time Adaptation
LLM uses token-based pacing to keep responses short, relevant, and engaging.
Satisfying Conclusion
Each session ends with a positive moral and a soft voice cue. No data is stored. Fully COPPA/GDPR compliant.
Engineered for <5 s response latency to match children's attention spans
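The flow above maps to a single session loop. Below is a minimal sketch of that loop, assuming hypothetical `listen_and_transcribe`, `generate_story_turn`, and `speak` helpers in place of the real Whisper, Gemini, and ElevenLabs calls; the branch counts and token limits are illustrative, not production values.

```python
import time

# Hypothetical helpers standing in for the real Whisper (STT), Gemini (LLM),
# and ElevenLabs (TTS) integrations; actual SDK signatures differ.
def listen_and_transcribe() -> str: ...
def generate_story_turn(prompt: str, *, theme: str, max_tokens: int) -> str: ...
def speak(text: str) -> None: ...

def interaction_points(age: int, story_minutes: int) -> int:
    """Branch the story across 2-5 interaction points based on age and time."""
    base = 2 if age <= 3 else 3 if age <= 6 else 4
    return min(5, base + (1 if story_minutes >= 10 else 0))

def max_tokens_for(age: int) -> int:
    """Token-based pacing: younger children get shorter story beats."""
    return 120 if age <= 4 else 200

def run_session(child_name: str, age: int, story_minutes: int, theme: str) -> None:
    speak(f"Hi {child_name}! What story would you like to hear?")   # TTS greeting
    child_input = listen_and_transcribe()                           # STT captures the reply
    for _ in range(interaction_points(age, story_minutes)):
        start = time.monotonic()
        beat = generate_story_turn(child_input, theme=theme,
                                   max_tokens=max_tokens_for(age))  # short, paced story beat
        turn_latency = time.monotonic() - start                     # tracked against the <5 s budget
        speak(beat)
        child_input = listen_and_transcribe()                       # child chooses the next branch
    speak(generate_story_turn("Wrap up with a gentle moral.",
                              theme=theme, max_tokens=max_tokens_for(age)))
    # Nothing is persisted after the session ends (privacy-first design).
```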
User Personas & Journey
See how Storytime AI integrates into real lives
Primary User – "Kiaan", Age 4
Loves trains and animals, craves screen-free playtime with educational content. Enjoys voice-based interactions and simple prompts.
Secondary User – “Lucy”, Working Parent
Juggling work and parenting, she seeks screen-free engagement for her child and prioritizes learning and routine.
📖 Journey Overview
Before Storytime AI: passive watching on YouTube. After Storytime AI: a personalized, interactive storytelling experience with measurable gains in engagement.
Customer Journey – From Curiosity to Connection
How parents and children experience Storytime from first touch to final moral.
AI Hypothesis
If our conversational AI voice-agent engages children ages 3–6 in adaptive, research-backed language education through personalized, interactive storytelling—while maintaining at least 80% alignment with vocabulary objectives and a 100% child-safe content score—then we can create measurable learning gains and parent satisfaction, opening monetization opportunities through B2C (freemium/subscription) and B2B (early learning centers).
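The hypothesis carries two measurable gates: at least 80% alignment with vocabulary objectives and a 100% child-safe content score. Below is a minimal sketch of how generated turns could be checked against those gates; the `TurnEvaluation` record and scorers are hypothetical stand-ins for the real evaluation pipeline.

```python
from dataclasses import dataclass

@dataclass
class TurnEvaluation:
    vocab_alignment: float   # share of target vocabulary objectives covered, 0.0-1.0
    child_safe: bool         # output of the content-safety filter

def passes_hypothesis_gates(evals: list[TurnEvaluation]) -> bool:
    """True only if vocab alignment averages >= 0.80 and every turn is child-safe."""
    if not evals:
        return False
    avg_alignment = sum(e.vocab_alignment for e in evals) / len(evals)
    return avg_alignment >= 0.80 and all(e.child_safe for e in evals)
```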
AI Architecture
Inputs: Child’s voice, name, language → transcribed by Whisper
Processing: Gemini 2.0 + LangChain (interaction logic)
Memory: MongoDB + LangSmith traces (session + preference)
Output: ElevenLabs voice (TTS) → personalized, multilingual
Loop: Real-time feedback & sentiment tracing refine future responses
*The architecture is actively evolving and may not reflect the current state (migrated from v1, which used Universal and OpenAI for STT and TTS).
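For illustration, here is a rough sketch of how one turn flows through these components. The adapter interfaces and the `estimate_sentiment` helper are hypothetical; in the real system Whisper, Gemini + LangChain, MongoDB, LangSmith, and ElevenLabs each sit behind their own integration.

```python
from typing import Protocol

class STT(Protocol):
    def transcribe(self, audio: bytes, language: str) -> str: ...

class StoryLLM(Protocol):
    def next_beat(self, transcript: str, preferences: dict) -> str: ...

class TTS(Protocol):
    def synthesize(self, text: str, voice: str, language: str) -> bytes: ...

class SessionStore(Protocol):
    def preferences(self, session_id: str) -> dict: ...
    def record_trace(self, session_id: str, transcript: str,
                     beat: str, sentiment: float) -> None: ...

def estimate_sentiment(text: str) -> float:
    """Hypothetical sentiment scorer feeding the feedback loop (-1.0 to 1.0)."""
    return 0.0

def handle_turn(audio: bytes, session_id: str, language: str,
                stt: STT, llm: StoryLLM, tts: TTS, store: SessionStore) -> bytes:
    transcript = stt.transcribe(audio, language)       # Whisper: child's voice -> text
    prefs = store.preferences(session_id)              # MongoDB: name, favourites, language level
    beat = llm.next_beat(transcript, prefs)            # Gemini 2.0 + LangChain interaction logic
    store.record_trace(session_id, transcript, beat,   # LangSmith-style trace + sentiment signal
                       sentiment=estimate_sentiment(transcript))
    return tts.synthesize(beat, voice=prefs.get("voice", "default"),
                          language=language)           # ElevenLabs: text -> speech
```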
Risks & Tradeoffs
What I Learned
  • Designing for delight means designing for unpredictability.
    Building for kids taught me to embrace ambiguity, not control it.
  • Latency isn’t a tech issue — it’s a trust issue.
    Speed = presence. Delays break not just flow, but emotional connection.
  • Scope clarity is the most underrated skill.
    Saying “no” to non-core features protected the product’s soul.
  • Privacy isn't a feature — it’s a promise.
    Earning parental trust required proactive, not reactive, guardrails.
Proven Results That Matter
92%
Story Completion
Children consistently engage from start to finish
200+
Pilot Participants
Successfully tested with families nationwide
100%
Privacy Compliance
Full COPPA and GDPR-K certification
Selected by Google's prestigious AI Startup Program for innovation in child-safe AI technology. Our token-safe Gemini and LangChain implementation sets new industry standards.
Storytime AI Product Roadmap
Phase 1: MVP
- Voice-to-Voice MVP (English)
- Gemini 2.0 Flash integration
- LangChain for interaction handling
- Whisper for voice transcription
- ElevenLabs for output voice
Phase 2: Mobile & Multilingual Voice Support
- Mobile app launch
- Support for multiple character and language voices
- Offline mode
- Parental control for voice and language selection
Phase 3: RAG & Personalization
- Integrate RAG with LangChain
- Vector DB for reusable story components
- Support for multilingual user prompts
- Preference memory across sessions
- Checkpoint continuation logic
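To illustrate the Phase 3 retrieval step, here is a minimal sketch of reusing the closest stored story segment. `embed` is a hypothetical stand-in for the embedding model, and in practice the similarity search runs inside the vector DB rather than in Python.

```python
import math

def embed(text: str) -> list[float]:
    """Hypothetical embedding call; the real system queries a vector DB."""
    ...

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_segment(prompt: str,
                     library: list[tuple[str, list[float]]],
                     min_score: float = 0.75) -> str | None:
    """Return the best reusable story segment, or None to generate from scratch."""
    query = embed(prompt)
    best_text, best_score = None, 0.0
    for text, vector in library:
        score = cosine(query, vector)
        if score > best_score:
            best_text, best_score = text, score
    # A hit above the threshold counts toward the vector-hit-accuracy metric.
    return best_text if best_score >= min_score else None
```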
Phase 4: AI Feedback Loop
- LangSmith analytics traces
- Drop-off and interaction scoring
- Sentiment-driven fallback logic
- Adaptive prompt tuning and model refinement
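A minimal sketch of the Phase 4 sentiment-driven fallback decision; the threshold and fallback prompt are illustrative, not production values.

```python
FALLBACK_PROMPT = ("The story pauses for a moment. Ask the child a short, "
                   "playful question about their favourite character.")

def choose_next_prompt(planned_prompt: str, sentiment: float,
                       fallback_threshold: float = -0.2) -> tuple[str, bool]:
    """Return (prompt, fallback_triggered); the flag feeds the fallback-rate KPI."""
    if sentiment < fallback_threshold:
        return FALLBACK_PROMPT, True
    return planned_prompt, False
```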
Phase 5: Future Roadmap
- AI-driven voice style personalization
- Parent-facing dashboard and learning insights
- Story-level comprehension and engagement analytics
- Real-time multilingual narration powered by on-device inference
Customer-Driven Product Strategy
Storytime's roadmap, architecture, and core features are grounded in real-world voice-of-customer insights. Each user story and feature set reflects tested needs from parents and children across multiple environments — from bedtime routines to multilingual homes.
The following sections illustrate how user input shaped our design, from MVP to multilingual, adaptive storytelling.
Key User Stories — Designed Around Real Use Cases
Phase 1: Voice Interaction (MVP)
As a child, I want to speak to the app and hear a response in a fun voice,
so that I can enjoy a magical story experience hands-free.
As a parent, I want the story to wait for my child's response before continuing,
so that it feels more interactive and less like a passive video.
Phase 2: Mobile Usability & Control
As a parent, I want to select different story voices and language preferences,
so that my child can hear familiar or diverse voices and learn new languages.
As a parent, I want to quickly open the app and see recent stories and themes played,
so I can track what my child is engaging with.
Phase 3: Personalization & Learning
As a child, I want the story to remember my favorite animal and my name,
so that it feels like it was made just for me.
As a parent, I want the story to help teach empathy or problem-solving,
so my child can learn while being entertained.
Phase 4: AI Feedback Loop & Dashboard
As a parent, I want to see how long my child engaged and where they dropped off,
so that I can understand attention span and improvement over time.
As a PM, I want to know which interaction points are working and which themes are skipped,
so that I can fine-tune our AI prompts for better outcomes.
Phase 5: Global Expansion & Safety
As a multilingual parent, I want stories to be available in multiple languages,
so that my child can learn and connect in our native language.
As a parent, I want to ensure the story is safe, educational, and ad-free,
so I feel confident letting my child use it independently.
These stories directly shaped Storytime's product architecture, roadmap phases, and interaction-point logic.
Success Metrics & AI-Specific KPIs
Key metrics used to evaluate product performance (a sketch of how a few of them can be computed follows the lists below):
Voice Interaction Metrics
  • Completion Rate: % of stories finished in one session
  • Interaction Accuracy: STT-to-expected intent match rate
  • Drop-off Points: Avg. story step where users disengage
  • STT Latency: Avg. Whisper response time in seconds
Personalization Impact
  • Re-engagement Rate: % of users returning within 3 days
  • Story Continuation Rate: % of stories picked up across sessions
  • Vector Hit Accuracy: RAG success in reusing correct segments
  • Preference Recall: # of personalized variables retrieved/session
AI Feedback & Prompt Loop
  • Prompt Adaptation Score: % of successful fallbacks triggered by sentiment
  • LangSmith Trace Depth: Avg. steps traced per interaction path
  • Fallback Trigger Rate: % of sessions needing recovery logic
Parental Engagement & Trust
  • Voice Customization Usage: % of parents customizing voice
  • Language Preference Setting: % of sessions using non-default language
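A minimal sketch of how a few of these KPIs can be computed from per-session logs; the `Session` shape is illustrative, since real sessions come from LangSmith traces and the app's own event log.

```python
from dataclasses import dataclass

@dataclass
class Session:
    completed: bool             # did the story reach its conclusion?
    last_step: int              # story step where the session ended
    stt_latencies: list[float]  # Whisper response times in seconds
    returned_within_3_days: bool

# All helpers assume a non-empty list of sessions.
def completion_rate(sessions: list[Session]) -> float:
    return sum(s.completed for s in sessions) / len(sessions)

def avg_drop_off_step(sessions: list[Session]) -> float:
    dropped = [s.last_step for s in sessions if not s.completed]
    return sum(dropped) / len(dropped) if dropped else 0.0

def avg_stt_latency(sessions: list[Session]) -> float:
    latencies = [t for s in sessions for t in s.stt_latencies]
    return sum(latencies) / len(latencies)

def re_engagement_rate(sessions: list[Session]) -> float:
    return sum(s.returned_within_3_days for s in sessions) / len(sessions)
```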
Product Requirements & Acceptance Criteria
1. MVP Requirements: Interactive Story Generator, Voice-to-Voice Interface, Theme Selection Engine, Parent Dashboard
2. Post-MVP Requirements: Multilingual Story Support, Sentiment-Adaptive Storytelling, Memory-Based Personalization
3. Global Safety & Quality Criteria: COPPA/GDPR compliance, audio moderation, token limits, age-appropriate interaction times
Key Acceptance Criteria
  • Stories must have branching paths and pass content filters
  • Voice narration must be responsive and accurate
  • Themes must influence story content and be trackable
  • Parent dashboard must be mobile-friendly and data-driven
  • Multilingual support, sentiment analysis, and personalization must work as specified
  • Strict safety and quality standards must be enforced for child users
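For illustration, here is a sketch of an automated gate over these criteria; the field names and checks are hypothetical simplifications of the real content filters and validators.

```python
from dataclasses import dataclass, field

@dataclass
class GeneratedStory:
    branch_count: int
    passed_content_filter: bool
    theme: str
    themes_reflected: list[str] = field(default_factory=list)

def meets_acceptance_criteria(story: GeneratedStory) -> list[str]:
    """Return the list of failed criteria; an empty list means the story passes."""
    failures = []
    if story.branch_count < 2:
        failures.append("story must offer branching paths")
    if not story.passed_content_filter:
        failures.append("story must pass child-safety content filters")
    if story.theme not in story.themes_reflected:
        failures.append("selected theme must influence the story content")
    return failures
```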
📘 Storytime AI Case Study
Role
Principal Product Manager (Founder-led)
Led the full lifecycle — from concept → MVP → mobile platform with multilingual, voice-first experience. Defined architecture, roadmap, and metrics for production-quality AI integration.
Team & Tech Stack
Team: 4 PMs, 2 frontend devs, 1 ML engineer, 1 content designer
Stack: Gemini 2.0 Flash, LangChain, Whisper STT, ElevenLabs TTS, LangSmith, MongoDB, Vector DB
Tools: Notion, Figma, Retool, ElevenLabs, Latitude AI, Replit, Claude, DoveTail
Key Highlights
  • Built MVP in 60 days → 100% early user retention
  • Selected for Google’s AI Startup Program
  • Reduced parent interaction time by 40%
  • RAG loop with LangSmith + fallback tuning
  • Personalization for multilingual homes
Product Impact
"From bedtime stories to multilingual, intelligent narration — Storytime is voice-first AI storytelling that listens, adapts, and delights."
Used by families across 3 continents. Built for kids aged 2–10. Designed for high recall, repeat engagement, and voice-based learning.
Closing Statement
🎖️ Designed to demonstrate Principal PM-level thinking, execution, and AI fluency across Gemini, LangChain, and RAG loops.
Think I might be a good fit for your team?
📩 Let's connect 🔗 LinkedIn