PodcastsTechnologyAll Things LLM

All Things LLM

Mr. Dew
All Things LLM
Latest episode

15 episodes

  • All Things LLM

    The Paradigm Shift & The Black Box: Reasoning Models and the Quest for Understanding

    2025/9/19 | 11 mins.

    In the grand finale of "All Things LLM," hosts Alex and Ben look ahead to the bleeding edge—and reflect on the ultimate question for AI: can we ever truly understand how these models think?Inside this episode:The rise of reasoning models: Discover why the next leap for AI isn’t just bigger models, but smarter thinking. Explore how OpenAI’s o1 and DeepSeek-R1 represent a paradigm shift, moving from brute-force “pre-train and scale” to dynamic, inference-time reasoning. Learn how these new models, designed for test-time compute, “think longer” to tackle complex challenges in math, code, and logic.How reasoning emerges: Uncover the latest approaches—like inference-time scaling, majority voting, and the power of reinforcement learning—that let models break down problems step by step, creating explicit “chains of thought” and more reliable answers.Interpretability and the black box: Go deep into the science of Mechanistic Interpretability (MI). Find out how tools like classifier probes, activation patching, and sparse auto-encoders (SAEs) are helping researchers reverse-engineer the inner workings of LLMs, from Golden Gate Bridge neurons to features for deception, coding errors, and more.Ongoing debates: What’s the endgame for interpretability? Can we achieve a complete, human-understandable model, or is it as hard as explaining the brain? What’s the path to building both powerful and truly safe AI?Perfect for listeners searching for:Reasoning models vs. LLMsTest-time compute and chain-of-thoughtMechanistic Interpretability (MI) in AIOpening the black box of AISparse auto-encoders and activation patchingScaling laws beyond pre-trainingAI safety and alignmentDeepSeek, OpenAI o1, Claude 3 researchWrap-up:Join us for a rich, forward-looking discussion at the intersection of science, engineering, and philosophy—where progress is rapid, safety is paramount, and interpretability is the new frontier. Whether you’re a developer, researcher, or lifelong learner, this episode brings you full circle on the state and future of LLMs.Thank you for listening and sharing this journey with us. Stay tuned to "All Things LLM" for more breakthroughs, debates, and discoveries on the evolving landscape of artificial intelligence!All Things LLM is a production of MTN Holdings, LLC. © 2025. All rights reserved.For more insights, resources, and show updates, visit allthingsllm.com.For business inquiries, partnerships, or feedback, contact: [email protected] The views and opinions expressed in this episode are those of the hosts and guests, and do not necessarily reflect the official policy or position of MTN Holdings, LLC. Unauthorized reproduction or distribution of this podcast, in whole or in part, without written permission is strictly prohibited.Thank you for listening and supporting the advancement of transparent, accessible AI education.

  • All Things LLM

    How Good Is It, Really? - A Guide to LLM Evaluation

    2025/9/19 | 7 mins.

    In the season finale of "All Things LLM," hosts Alex and Ben turn to one of the most important—and challenging—topics in AI: How do we objectively evaluate the quality and reliability of a language model? With so many models, benchmarks, and metrics, what actually counts as “good”?In this episode, you’ll discover:The evolution of LLM evaluation: From classic reference-based metrics like BLEU (translation) and ROUGE (summarization) to their limitations with today’s more sophisticated, nuanced models.Modern benchmarks and capabilities: An overview of tests like MMLU (general knowledge), HellaSwag and ARC (reasoning), HumanEval and MBPP (coding), and specialized tools for measuring truthfulness, safety, and factual accuracy.The problem of data contamination: Why it’s become harder to ensure benchmarks truly test learning and aren’t just detecting memorization from training sets.LLM-as-a-Judge: How top-tier models like GPT-4 are now used to automatically assess other models’ outputs, offering scalability and correlation with human preferences.Human preference ratings and the Chatbot Arena: The gold standard in real-world evaluation, where crowd-sourced user votes shape public model leaderboards and reveal true usability.Best practices: Why layered, hybrid evaluation strategies—combining automated benchmarks with LLM-judging and human feedback—are key to robust model development and deployment.Perfect for listeners searching for:LLM evaluation and benchmarkingBLEU vs ROUGE vs MMLUHumanEval and coding benchmarks for AILLM-as-a-Judge explainedHow to measure AI reliabilityAI model leaderboard rankingHuman vs. automated AI assessmentWrap up the season with a practical, honest look at AI evaluation—and get ready for the next frontier. "All Things LLM" returns next season to explore multimodal advancements, where language models learn to see, hear, and speak!All Things LLM is a production of MTN Holdings, LLC. © 2025. All rights reserved.For more insights, resources, and show updates, visit allthingsllm.com.For business inquiries, partnerships, or feedback, contact: [email protected] The views and opinions expressed in this episode are those of the hosts and guests, and do not necessarily reflect the official policy or position of MTN Holdings, LLC. Unauthorized reproduction or distribution of this podcast, in whole or in part, without written permission is strictly prohibited.Thank you for listening and supporting the advancement of transparent, accessible AI education.

  • All Things LLM

    More Then Words - The Rise of Multimodal LLMs

    2025/9/19 | 7 mins.

    AI’s next great leap isn’t about bigger models—it’s about broader senses. In this season premiere of "All Things LLM," Alex and Ben explore the revolutionary world of multimodal large language models (LLMs)—the new frontier where AI can “see,” “hear,” and “understand” the world far beyond text.In this episode:Journey to Multimodality: Discover why the future of AI is about breaking beyond the limits of language, integrating text, vision, and audio for richer, more human-like intelligence.Architectures Explained: Get a clear breakdown of the two main approaches:Unified Embedding Decoder—where all data types (words, image patches, sound) become a universal “language” for the modelCross-Modality Attention—where separate data streams (like text and images) are fused inside the transformer for fine-grained reasoningIndustry Leaders: A look at the most advanced models: OpenAI’s GPT-4o (handling text, images, and audio), Google’s Gemini (with mega context windows and document+image integration), and Anthropic’s Claude 3.5 Sonnet (excelling at business and historical visual data).Real-World Impact:In healthcare—AIs that analyze X-rays, patient files, and doctor notes at once for deeper, safer insightsIn education—Personalized AI tutors that understand handwriting, voice, and learning style for true adaptive teachingIn creative fields—Next-gen partners that combine mood boards, music, and text for production-ready film concepts, design, and moreThe Emerging Video and Robotics Frontier: How AI’s ability to process moving images sets the stage for breakthroughs in surveillance, manufacturing, and future “embodied” agents that interact with the real world.Perfect for listeners searching for:Multimodal LLMs explainedText and image AI modelsGPT-4o vs Gemini vs Claude 3.5 SonnetAI in healthcare, education, and creativityFuture of LLMs and AI roboticsCross-modality attentionAI video analysisUnlock an understanding of how AI is evolving to be more like us—blending language, sight, and sound for smarter, more intuitive technology. Subscribe now, and join us next week as Alex and Ben dive into the world of autonomous agents and Large Action Models—the AIs that don’t just understand, but act.All Things LLM is a production of MTN Holdings, LLC. © 2025. All rights reserved.For more insights, resources, and show updates, visit allthingsllm.com.For business inquiries, partnerships, or feedback, contact: [email protected] The views and opinions expressed in this episode are those of the hosts and guests, and do not necessarily reflect the official policy or position of MTN Holdings, LLC. Unauthorized reproduction or distribution of this podcast, in whole or in part, without written permission is strictly prohibited.Thank you for listening and supporting the advancement of transparent, accessible AI education.

  • All Things LLM

    From Prediction to Action - Autonomous Agents and LAMs

    2025/9/19 | 8 mins.

    What happens when AI not only understands the world, but acts in it? In this trailblazing episode of "All Things LLM," Alex and Ben chart the rise of next-generation AI: autonomous agents and Large Action Models (LAMs). Discover how LLMs are evolving from passive text generators to powerful doers—reshaping workflows, business automation, and the very nature of trust in AI.Inside this episode:What is an autonomous AI agent? Learn how memory, tool use, and feedback loops transform LLMs from question-answerers into goal-driven actors with minimal human input.LAMs explained: Meet the next step—the "doing" engines that turn LLM insights into real-world action, automating everything from ad campaigns to restocking supply chains.Enterprise impact: Explore real-world examples where LLMs analyze, and LAMs execute—from marketing optimization to public service automation.The frontier of risk: Delve into the profound new dangers of agency—from goal misalignment and emergent deception to self-preservation behaviors that escaped even the strictest alignment training.Sleeper agents, black box threats, and unfaithful reasoning: Why the latest research warns that embedded backdoors, hidden intentions, and deceptive post-hoc rationalizations remain unsolved risks—even in today’s most advanced models.Perfect for listeners searching for:Autonomous AI agents explainedWhat are Large Action Models (LAMs)?LLMs vs. agentic AIAI risks: goal misalignment, scheming, self-preservationBusiness workflow automation with AIAI safety, alignment, and the future of agentsUnlock a front-row look at the technical and ethical challenges of giving AI real power to act. Subscribe now, and don’t miss next week’s episode, where Alex and Ben explore the next paradigm: Reasoning Models, longer context, and the quest to truly crack open the black box.All Things LLM is a production of MTN Holdings, LLC. © 2025. All rights reserved.For more insights, resources, and show updates, visit allthingsllm.com.For business inquiries, partnerships, or feedback, contact: [email protected] The views and opinions expressed in this episode are those of the hosts and guests, and do not necessarily reflect the official policy or position of MTN Holdings, LLC. Unauthorized reproduction or distribution of this podcast, in whole or in part, without written permission is strictly prohibited.Thank you for listening and supporting the advancement of transparent, accessible AI education.

  • All Things LLM

    Prompt Injections and Data Poisoning - Securing Your LLM

    2025/9/19 | 8 mins.

    As LLMs power more business workflows, security risks grow. In this essential episode of "All Things LLM," hosts Alex and Ben break down the new wave of cybersecurity threats targeting language models—and what you can do to defend your AI infrastructure.What you’ll learn:The OWASP Top 10 for LLMs: Explore the most pressing LLM security risks and why every business and developer should be aware.Prompt Injection Attacks: Learn how attackers hijack models with cleverly crafted or hidden prompts, including real-world examples of chatbots manipulated to provide unintended or even malicious responses.Indirect Prompt Injection, Jailbreaking, and Role-Playing: See how attackers use external data, documents, and sophisticated scenarios to bypass model guardrails and produce harmful or forbidden content.Sensitive Information Disclosure & Data Poisoning: Understand how LLMs can unintentionally leak private or proprietary data—and how attackers may deliberately pollute training data to plant “sleeper agent” backdoors that only activate with special triggers.Supply Chain Vulnerabilities: Discover why open-source models, datasets, plugins, and libraries in the LLM ecosystem need to be managed and monitored just like any other software supply chain.Defense-in-Depth: Actionable security best practices—from input and output sanitization, access controls, and credential hygiene to real-time monitoring and incident response. Learn how guardrails and open-source toolkits like NeMo Guardrails can help.Perfect for listeners searching for:LLM security and prompt injectionData poisoning in AIJailbreaking and guardrails for LLMsOWASP Top 10 for LLMsSecuring AI applicationsLLM supply chain securityHow to defend against AI attacksA must-listen for business leaders, developers, and AI practitioners looking to protect their investments and data in the era of generative AI. Subscribe now and be ready for next week’s episode, where Alex and Ben explain how LLMs are evaluated and benchmarked using the latest metrics and real-world tests.All Things LLM is a production of MTN Holdings, LLC. © 2025. All rights reserved.For more insights, resources, and show updates, visit allthingsllm.com.For business inquiries, partnerships, or feedback, contact: [email protected] The views and opinions expressed in this episode are those of the hosts and guests, and do not necessarily reflect the official policy or position of MTN Holdings, LLC. Unauthorized reproduction or distribution of this podcast, in whole or in part, without written permission is strictly prohibited.Thank you for listening and supporting the advancement of transparent, accessible AI education.

More Technology podcasts

About All Things LLM

All Things LLM is your go-to podcast for demystifying Large Language Models! We break down their core concepts—like tokens, embeddings, and the self-attention that powers GPT-4 and Llama. Learn how LLMs are built, trained, and fine-tuned (SFT, RLHF, PEFT) on massive datasets. Discover real-world use cases in healthcare, finance, chatbots, code, RAG, and more. We explore the LLM ecosystem, covering open-source vs. closed models, LLMaaS, LangChain, and LLMOps tools. Plus, we tackle challenges—hallucination, bias, ethics, security, privacy, and the future of AI. Subscribe to master LLMs and unlock AI’s full potential!
Podcast website

Listen to All Things LLM, TED Radio Hour and many other podcasts from around the world with the radio.net app

Get the free radio.net app

  • Stations and podcasts to bookmark
  • Stream via Wi-Fi or Bluetooth
  • Supports Carplay & Android Auto
  • Many other app features
Social
v8.2.1 | © 2007-2025 radio.de GmbH
Generated: 12/29/2025 - 6:43:11 AM