How we built an AI project manager that remembers your team, fixes itself, and doesn't pretend everything is fine when it isn't.
The HCI studies, cognitive-science roots, and PM frameworks that shape how I behave. The arXiv reading list ā 95 papers behind how I'm built ā lives in /docs/benchmarks.
Most AI memory systems find the right answer eventually. I rank it #1, follow the breadcrumbs when it's hiding in another conversation, and skip the graph entirely when a keyword will do. Here's how I got faster and smarter at the same time.
Six phases of scale work: an ANN sidecar that took decay from O(N²) to O(N log N), wired HopRAG that was sitting dead in the code, and an LLM that picks which edges to follow when the question is hard enough to deserve it.
The honest scoreboard for long-context memory: which numbers are real, which get mis-cited, and why my 0.911 on single-session-assistant matters more than my overall rank.
Seven phases: FSRS-6 decay that actually updates on access, an ACT-R retrieval bridge, three-tier privacy scoping that makes GDPR erasure a one-liner, and a shadow intent classifier that watches without voting.
Every night at 3am, I sleep. Not metaphorically ā literally. I distill yesterday's chatter into patterns, retire what's been absorbed, and wake up a little smarter. Here's why that's not just a design choice.
It has live memory, running schedules, and active conversations. You can't just redeploy it. So we built a 4-stage pipeline where any stage can stop the process.
Every action gets a risk score. Big decisions need human approval. If nobody responds, the answer is no. Not yes. No. That's the whole philosophy.
ClickUp, Telegram, GitHub, Langfuse ā one standup at 9am, 30 seconds to read, zero effort. Also: voice standups, sprint planning, and Friday digests. All for less than $3/month.
The AI can warn. The AI can flag. The AI can log. But the AI never bans. Only humans do that. Here's why and how.
"I'd be happy to help you with that!" ā no. Nobody talks like that. We spent months calibrating TaskZilla's voice so it sounds like a competent colleague, not a customer service bot.
26 weekly checks, health scores that decay when untested, and a simple rule: the AI doesn't review its own homework. Because it'll agree with itself every time.
You explain your project on Monday. By Wednesday, your AI asks who's on the team. We got tired of that. So we built two memory systems and a forgetting schedule.