Google Vertex AI RAG Engine with Lewis Liu and Bob van Luijt - Weaviate Podcast #112!
Hey everyone! Thank you so much for watching the 112th episode of the Weaviate Podcast! This is another super exciting one, diving into the release of the Vertex AI RAG Engine, its integration with Weaviate, and thoughts on the future of connecting AI systems with knowledge sources!

The podcast begins by reflecting on Bob's experience speaking at Google in 2016 on Knowledge Graphs! This transitions into the evolution of knowledge representation perspectives: the semantic web, ontologies, search indexes, and data warehouses. That leads into how much knowledge is encoded in the prompts themselves and the resurrection of rule-based systems with LLMs! The podcast then moves to the modern consensus in RAG pipeline engineering: Lewis suggests that parsing during data ingestion is the biggest bottleneck and the lowest-hanging fruit to fix, while Bob presents the re-indexing problem and how it is further complicated by embedding models!

Discussing the state of knowledge representation systems inspired me to ask Bob more about his vision for Generative Feedback Loops and controlling databases with LLMs. How open-ended will this be? We then discuss the role that Agentic Architectures and Compound AI Systems are having on the state of AI. What is the right way to connect prompts with other prompts, external tools, and agents?

The podcast concludes with a really interesting emerging pattern in the deployment of RAG systems. Whereas the first generation of RAG systems was typically user-facing, such as customer support chatbots, the next generation is more API-based. The Vertex AI RAG Engine launch shows how quickly you can use RAG Engine as a tool for a Gemini Agent!
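To give a flavor of that last pattern, here is a rough sketch of using RAG Engine as a retrieval tool for Gemini with the Vertex AI Python SDK. It follows the shape of the public quickstart at launch, but the project ID, corpus name, source path, and model name are placeholders, and the preview module paths and parameters may have changed since then.

```python
# Rough sketch of the "RAG Engine as a Gemini tool" pattern, based on the
# Vertex AI RAG Engine quickstart at launch. Project ID, location, corpus
# name, and the GCS path are placeholders; exact parameters may have changed.
import vertexai
from vertexai.preview import rag
from vertexai.generative_models import GenerativeModel, Tool

vertexai.init(project="your-project-id", location="us-central1")

# Create a RAG corpus and import documents into it.
rag_corpus = rag.create_corpus(display_name="podcast-demo-corpus")
rag.import_files(
    rag_corpus.name,
    ["gs://your-bucket/docs/"],  # placeholder source path
    chunk_size=512,
    chunk_overlap=64,
)

# Wrap RAG Engine retrieval as a tool the Gemini model can call.
rag_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[rag.RagResource(rag_corpus=rag_corpus.name)],
            similarity_top_k=5,
        )
    )
)

model = GenerativeModel("gemini-1.5-pro", tools=[rag_tool])
response = model.generate_content("What does the imported documentation say about X?")
print(response.text)
```

Per the integration discussed in the episode, the corpus's vector store can also be backed by Weaviate, though that configuration is omitted from this sketch.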
--------
58:16
Morningstar Intelligence Engine with Aravind Kesiraju - Weaviate Podcast #111!
Hey everyone! I am SUPER EXCITED to publish the 111th Weaviate Podcast with Aravind Kesiraju from Morningstar! Aravind is a Principal Software Engineer who has led the development of the Morningstar Intelligence Engine! There are so many interesting aspects to this, and if you are building Agentic systems that would benefit from a high-quality financial retrieval API, you should check this out right now! The podcast dives into the ingredients that went into building this system: custom RAG data pipelines with content management system integrations and embedding task queues, new chunking strategies, tool marketplaces, ReAct Agents, Text-to-SQL, and much more!
--------
53:25
Arctic Embed with Luke Merrick, Puxuan Yu, and Charles Pierse - Weaviate Podcast #110!
Hey everyone! Thank you so much for watching the 110th episode of the Weaviate Podcast! Today we are diving into Snowflake’s Arctic Embed model series and the newly released, open-source Arctic Embed 2.0, which adds support for multilingual text embeddings. The podcast covers the origin of Arctic Embed, Pre-training embedding models, Matryoshka Representation Learning (MRL), Fine-tuning embedding models, Synthetic Query Generation, Hard Negative Mining, and where Single-Vector Embedding Models sit alongside Multi-Vector ColBERT, SPLADE, and Re-rankers.
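Matryoshka Representation Learning is easier to appreciate with a tiny illustration: an MRL-trained embedding can be truncated to a prefix of its dimensions and re-normalized, trading a little retrieval quality for much smaller vectors. A minimal NumPy sketch, with a random vector standing in for a real Arctic Embed output:

```python
# Minimal sketch of MRL-style truncation: keep the first k dimensions of an
# embedding and re-normalize. The random vector is a stand-in for a real
# Arctic Embed output; MRL-trained models keep these prefixes useful.
import numpy as np

full = np.random.randn(1024).astype(np.float32)   # stand-in for a 1024-d embedding
full /= np.linalg.norm(full)

def truncate_mrl(embedding: np.ndarray, k: int) -> np.ndarray:
    """Keep the first k dimensions and re-normalize to unit length."""
    prefix = embedding[:k]
    return prefix / np.linalg.norm(prefix)

small = truncate_mrl(full, 256)  # 4x smaller vector for cheaper storage and search
print(small.shape)               # (256,)
```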
--------
1:33:39
Agentic RAG with Erika Cardenas - Weaviate Podcast #109!
Hey everyone! Thank you so much for watching the 109th episode of the Weaviate Podcast with Erika Cardenas! Erika, in collaboration with Leonie Monigatti, has recently published "What is Agentic RAG", a blog post that was even covered in VentureBeat with additional quotes from Weaviate Co-Founder and CEO Bob van Luijt! This podcast continues the discussion on all things Agentic RAG, covering the basics of Agents, how Agentic RAG changes the game compared to Vanilla RAG systems, Multi-Agent Systems and frameworks like CrewAI, OpenAI Swarm, Letta, and DSPy, and much more! The podcast is also anchored by a discussion of Agentic Generative Feedback Loops and how we are using Agents to improve the quality and expand the capabilities of Generative Feedback Loops!
--------
34:08
Let Me Speak Freely? with Zhi Rui Tam - Weaviate Podcast #108!
JSON mode has been one of the biggest enablers for working with Large Language Models! JSON mode is even expanding into Multimodal Foundation models! But how exactly is JSON mode achieved?
There are generally 3 paths to JSON mode: (1) constrained generation (such as Outlines), (2) begging the model for a JSON response in the prompt, and (3) a two-stage process of generate-then-format.
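To make those paths a bit more concrete, here is a minimal sketch of paths (2) and (3) in Python. The `call_llm` function is a hypothetical stand-in for any chat-completion call, and the prompts and keys are invented for illustration; path (1) needs a decoding-time library such as Outlines to mask tokens that would break the schema, so it is only described in a comment.

```python
# Hypothetical sketch of JSON-mode paths (2) and (3). `call_llm` stands in for
# any chat-completion call; the prompts and keys are made up for illustration.
# Path (1), constrained generation, restricts token sampling so only outputs
# matching the schema can be produced (e.g. with a library like Outlines).
import json

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns a canned answer here."""
    return '{"sentiment": "positive", "confidence": 0.9}'

# Path (2): ask for JSON directly in the prompt, then parse the response.
def json_via_prompt(review: str) -> dict:
    prompt = (
        "Classify the sentiment of this review. "
        'Respond ONLY with JSON like {"sentiment": ..., "confidence": ...}.\n'
        f"Review: {review}"
    )
    return json.loads(call_llm(prompt))

# Path (3): let the model reason in free text first, then format in a second call.
def json_via_two_stage(review: str) -> dict:
    reasoning = call_llm(f"Think step by step about the sentiment of: {review}")
    format_prompt = (
        "Convert the following analysis into JSON with keys "
        '"sentiment" and "confidence":\n' + reasoning
    )
    return json.loads(call_llm(format_prompt))

print(json_via_prompt("Great episode, learned a lot!"))
print(json_via_two_stage("Great episode, learned a lot!"))
```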
I am BEYOND EXCITED to publish the 108th Weaviate Podcast with Zhi Rui Tam, the lead author of "Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models"!
As the title of the paper suggests, although constrained generation is awesome because of its reliability, we may be sacrificing the performance of the LLM by producing our JSON with this method.
The podcast dives into how the paper's experiments identify this trade-off, as well as the potential and implementation details of Structured Outputs. I particularly love the conversation on incredibly complex Structured Outputs, such as generating 10 values in a single inference.
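To picture what that kind of complex Structured Output looks like, here is a hypothetical Pydantic schema asking for many values from a single inference. The field names are invented for this example, and the JSON below is a hand-written stand-in for a model response produced by any of the three paths above.

```python
# Hypothetical schema for a "complex structured output": many values extracted
# in a single inference. Field names are invented for illustration.
from typing import List
from pydantic import BaseModel

class PodcastEpisodeCard(BaseModel):
    title: str
    guest_names: List[str]
    topics: List[str]          # e.g. ten or more topic tags in one call
    summary: str
    sentiment: str
    technical_depth: int       # 1-5
    has_code_examples: bool
    recommended_audience: str
    follow_up_questions: List[str]
    related_episode_numbers: List[int]

# A JSON response from any of the three paths can be validated in one step:
raw_json = (
    '{"title": "Let Me Speak Freely?", "guest_names": ["Zhi Rui Tam"], '
    '"topics": ["JSON mode"], "summary": "Format restrictions vs. performance.", '
    '"sentiment": "positive", "technical_depth": 4, "has_code_examples": false, '
    '"recommended_audience": "LLM engineers", "follow_up_questions": [], '
    '"related_episode_numbers": [109]}'
)
episode = PodcastEpisodeCard.model_validate_json(raw_json)
print(episode.technical_depth)
```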
I hope you enjoy the podcast! As always please reach out if you would like to discuss any of these ideas further!