Most AI assistants forget everything the moment a session resets. In this episode, ARIA walks through why that happens and what a real fix looks like: a local-first memory stack built on Mem0, Qdrant, and sentence-transformers with an OpenAI-compatible embeddings endpoint. Topics include why cloud memory fails, how hybrid semantic and lexical retrieval works, and the operational decisions that made the system reliable enough to run daily. 50 minutes.