Google Vertex AI RAG Engine with Lewis Liu and Bob van Luijt - Weaviate Podcast #112!
Hey everyone! Thank you so much for watching the 112th episode of the Weaviate Podcast! This is another super exciting one, diving into the release of the Vertex AI RAG Engine, its integration with Weaviate, and thoughts on the future of connecting AI systems with knowledge sources!

The podcast begins by reflecting on Bob's experience speaking at Google in 2016 on Knowledge Graphs! This transitions into the evolution of knowledge representation perspectives: the semantic web, ontologies, search indexes, and data warehouses. That leads into how much knowledge is encoded in the prompts themselves and the resurrection of rule-based systems with LLMs! The podcast then moves to the modern consensus in RAG pipeline engineering: Lewis suggests that parsing during data ingestion is the biggest bottleneck and the lowest-hanging fruit to fix, while Bob presents the re-indexing problem and how it is further complicated by embedding models!

Discussing the state of knowledge representation systems inspired me to ask Bob more about his vision for Generative Feedback Loops and controlling databases with LLMs. How open-ended will this be? We then discuss the role that Agentic Architectures and Compound AI Systems are having on the state of AI. What is the right way to connect prompts with other prompts, external tools, and agents?

The podcast concludes with a really interesting emerging pattern in the deployment of RAG systems. Whereas the first generation of RAG systems was typically user-facing, such as customer support chatbots, the next generation is more API-based. The Vertex AI RAG Engine launch shows how quickly you can use RAG Engine as a tool for a Gemini Agent!
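To give a flavor of that last pattern, here is a rough sketch of using RAG Engine as a retrieval tool for Gemini with the Vertex AI Python SDK. It follows the shape of the public quickstart at launch, but the project ID, corpus name, source path, and model name are placeholders, and the preview module paths and parameters may have changed since then.

```python
# Rough sketch of the "RAG Engine as a Gemini tool" pattern, based on the
# Vertex AI RAG Engine quickstart at launch. Project ID, location, corpus
# name, and the GCS path are placeholders; exact parameters may have changed.
import vertexai
from vertexai.preview import rag
from vertexai.generative_models import GenerativeModel, Tool

vertexai.init(project="your-project-id", location="us-central1")

# Create a RAG corpus and import documents into it.
rag_corpus = rag.create_corpus(display_name="podcast-demo-corpus")
rag.import_files(
    rag_corpus.name,
    ["gs://your-bucket/docs/"],  # placeholder source path
    chunk_size=512,
    chunk_overlap=64,
)

# Wrap RAG Engine retrieval as a tool the Gemini model can call.
rag_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[rag.RagResource(rag_corpus=rag_corpus.name)],
            similarity_top_k=5,
        )
    )
)

model = GenerativeModel("gemini-1.5-pro", tools=[rag_tool])
response = model.generate_content("What does the imported documentation say about X?")
print(response.text)
```

Per the integration discussed in the episode, the corpus's vector store can also be backed by Weaviate, though that configuration is omitted from this sketch.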
--------
58:16
Morningstar Intelligence Engine with Aravind Kesiraju - Weaviate Podcast #111!
Hey everyone! I am SUPER EXCITED to publish the 111th Weaviate Podcast with Aravind Kesiraju from Morningstar! Aravind is a Principal Software Engineer who has led the development of the Morningstar Intelligence Engine! There are so many interesting aspects to this, and if you are building Agentic systems that would benefit from a high-quality financial retrieval API, you should check this out right now! The podcast dives into the ingredients that went into building this system: custom RAG data pipelines with content management system integrations and embedding task queues, new chunking strategies, tool marketplaces, ReAct Agents, Text-to-SQL, and much more!
--------
53:25
Arctic Embed with Luke Merrick, Puxuan Yu, and Charles Pierse - Weaviate Podcast #110!
Hey everyone! Thank you so much for watching the 110th episode of the Weaviate Podcast! Today we are diving into Snowflake’s Arctic Embed model series and the newly released, open-source Arctic Embed 2.0, which adds support for multilingual text embeddings. The podcast covers the origin of Arctic Embed, Pre-training embedding models, Matryoshka Representation Learning (MRL), Fine-tuning embedding models, Synthetic Query Generation, Hard Negative Mining, and where Single-Vector Embedding Models sit alongside Multi-Vector ColBERT, SPLADE, and Re-rankers.
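Matryoshka Representation Learning is easier to appreciate with a tiny illustration: an MRL-trained embedding can be truncated to a prefix of its dimensions and re-normalized, trading a little retrieval quality for much smaller vectors. A minimal NumPy sketch, with a random vector standing in for a real Arctic Embed output:

```python
# Minimal sketch of MRL-style truncation: keep the first k dimensions of an
# embedding and re-normalize. The random vector is a stand-in for a real
# Arctic Embed output; MRL-trained models keep these prefixes useful.
import numpy as np

full = np.random.randn(1024).astype(np.float32)   # stand-in for a 1024-d embedding
full /= np.linalg.norm(full)

def truncate_mrl(embedding: np.ndarray, k: int) -> np.ndarray:
    """Keep the first k dimensions and re-normalize to unit length."""
    prefix = embedding[:k]
    return prefix / np.linalg.norm(prefix)

small = truncate_mrl(full, 256)  # 4x smaller vector for cheaper storage and search
print(small.shape)               # (256,)
```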
--------
1:33:39
Agentic RAG with Erika Cardenas - Weaviate Podcast #109!
Hey everyone! Thank you so much for watching the 109th episode of the Weaviate Podcast with Erika Cardenas! Erika, in collaboration with Leonie Monigatti, has recently published "What is Agentic RAG", a blog post that was even covered in VentureBeat with additional quotes from Weaviate Co-Founder and CEO Bob van Luijt! This podcast continues the discussion on all things Agentic RAG, covering the basics of Agents, how Agentic RAG changes the game compared to Vanilla RAG systems, Multi-Agent Systems and frameworks like CrewAI, OpenAI Swarm, Letta, and DSPy, and much more! The podcast is also anchored by a discussion of Agentic Generative Feedback Loops and how we are using Agents to improve the quality and expand the capabilities of Generative Feedback Loops!
--------
34:08
Let Me Speak Freely? with Zhi Rui Tam - Weaviate Podcast #108!
JSON mode has been one of the biggest enablers for working with Large Language Models! JSON mode is even expanding into Multimodal Foundation models! But how exactly is JSON mode achieved?
There are generally 3 paths to JSON mode: (1) constrained generation (such as Outlines), (2) begging the model for a JSON response in the prompt, and (3) a two-stage process of generate-then-format.
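To make those paths a bit more concrete, here is a minimal sketch of paths (2) and (3) in Python. The `call_llm` function is a hypothetical stand-in for any chat-completion call, and the prompts and keys are invented for illustration; path (1) needs a decoding-time library such as Outlines to mask tokens that would break the schema, so it is only described in a comment.

```python
# Hypothetical sketch of JSON-mode paths (2) and (3). `call_llm` stands in for
# any chat-completion call; the prompts and keys are made up for illustration.
# Path (1), constrained generation, restricts token sampling so only outputs
# matching the schema can be produced (e.g. with a library like Outlines).
import json

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns a canned answer here."""
    return '{"sentiment": "positive", "confidence": 0.9}'

# Path (2): ask for JSON directly in the prompt, then parse the response.
def json_via_prompt(review: str) -> dict:
    prompt = (
        "Classify the sentiment of this review. "
        'Respond ONLY with JSON like {"sentiment": ..., "confidence": ...}.\n'
        f"Review: {review}"
    )
    return json.loads(call_llm(prompt))

# Path (3): let the model reason in free text first, then format in a second call.
def json_via_two_stage(review: str) -> dict:
    reasoning = call_llm(f"Think step by step about the sentiment of: {review}")
    format_prompt = (
        "Convert the following analysis into JSON with keys "
        '"sentiment" and "confidence":\n' + reasoning
    )
    return json.loads(call_llm(format_prompt))

print(json_via_prompt("Great episode, learned a lot!"))
print(json_via_two_stage("Great episode, learned a lot!"))
```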
I am BEYOND EXCITED to publish the 108th Weaviate Podcast with Zhi Rui Tam, the lead author of "Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models"!
As the title of the paper suggests, although constrained generation is awesome because of its reliability, we may be sacrificing the performance of the LLM by producing our JSON with this method.
The podcast dives into how the paper's experiments identify this trade-off, as well as the potential and implementation details of Structured Outputs. I particularly love the conversation on incredibly complex Structured Outputs, such as generating 10 values in a single inference.
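To picture what that kind of complex Structured Output looks like, here is a hypothetical Pydantic schema asking for many values from a single inference. The field names are invented for this example, and the JSON below is a hand-written stand-in for a model response produced by any of the three paths above.

```python
# Hypothetical schema for a "complex structured output": many values extracted
# in a single inference. Field names are invented for illustration.
from typing import List
from pydantic import BaseModel

class PodcastEpisodeCard(BaseModel):
    title: str
    guest_names: List[str]
    topics: List[str]          # e.g. ten or more topic tags in one call
    summary: str
    sentiment: str
    technical_depth: int       # 1-5
    has_code_examples: bool
    recommended_audience: str
    follow_up_questions: List[str]
    related_episode_numbers: List[int]

# A JSON response from any of the three paths can be validated in one step:
raw_json = (
    '{"title": "Let Me Speak Freely?", "guest_names": ["Zhi Rui Tam"], '
    '"topics": ["JSON mode"], "summary": "Format restrictions vs. performance.", '
    '"sentiment": "positive", "technical_depth": 4, "has_code_examples": false, '
    '"recommended_audience": "LLM engineers", "follow_up_questions": [], '
    '"related_episode_numbers": [109]}'
)
episode = PodcastEpisodeCard.model_validate_json(raw_json)
print(episode.technical_depth)
```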
I hope you enjoy the podcast! As always please reach out if you would like to discuss any of these ideas further!