Max Agency

LangChain

5 episodes

  • Max Agency

    How Cogent builds AI agents that have to be right every single time | Geng Sng (Co-founder & CTO - Cogent)

    2026/05/22 | 1h 14 mins.
    Geng Sng is co-founder and CTO of Cogent, which builds autonomous agents that remediate vulnerabilities for enterprise security teams. Today, Cogent's agents process billions of security events per day, maintaining a live context graph of every asset and vulnerability across customer environments. In this conversation, Geng walks through Cogent's hot vs cold context split, the sub-agents that handle side quests, and the two graphs they run in parallel.

    We also discuss:
    Why defensive security is harder for AI than offensive
    Under the hood of Cogent's three agents
    Inside Cogent's “read-only by default” sandboxes
    Why graph databases don't scale for security data
    Cogent Research and the move into formal verification
    Why interactive agents need a deeper planning phase to one-shot

    Referenced:
    Abnormal AI
    Amazon S3
    Anthropic
    Bash
    ChatGPT
    Claude Code
    Claude Mythos
    CodeMender
    Codex
    Cogent
    Cursor
    Google DeepMind
    GPT-5.5-Cyber
    Jupyter
    Letta
    Mozilla
    OpenAI
    Opus 4.6
    Opus 4.7
    Vercel

    Where to find Geng:
    LinkedIn

    Where to find Harrison:
    LinkedIn
    Twitter/X

    Where to find LangChain:
    Website
    Docs

    Send feedback or questions to [email protected]

    Timestamps:
    00:00 Why mean time to exploit collapsed from years to minutes
    02:08 Inside Cogent's Agent Lake architecture
    05:11 Why Cogent rejected graph databases
    10:48 The trust ladder before agents touch production
    15:13 The three types of agents inside Cogent
    17:07 How Cogent sandboxes its agents
    19:16 Short-circuiting interactive agents with a deeper planning phase
    24:31 What to do when users believe agents too much
    31:21 Why sub-agents let agents go on side quests
    34:59 Two-tiered evals and the metric that catches bad prompts
    40:00 Cogent’s unique approach to context
    48:39 Cogent Research and the move into formal verification
    51:33 The single trait Cogent hires for
    54:00 Open-sourcing models within six months
    57:07 Why defensive security won’t be commoditized anytime soon
    1:00:51 The founding insight behind Cogent
  • Max Agency

    How Ramp built an AI agent that can think outside of tokens | Alex Shevchenko

    2026/05/07 | 44 mins.
    Alexander Shevchenko is the head of applied research at Ramp, where he leads Ramp Labs – the team behind Ramp Sheets and a steady stream of public AI engineering experiments. Ramp Sheets started as an internal process mining tool that turned Loom videos of accountants into Markov diagrams, before evolving into the agentic spreadsheet editor that shipped in November. In this conversation, Alex walks through the architecture under the hood, why Ramp biases the agent toward Excel formulas over Python code gen, and two recent Labs experiments: Latent Briefing and a user-steerable revival of Golden Gate Claude.

    We also discuss:
    Under the hood of Ramp Sheets
    Inspect, Ramp's internal coding agent, and the self-improving monitor loop it powers
    Why finance professionals rejected code gen as too "black box"
    Why Anthropic models tend to excel at agentic spreadsheet manipulation
    The case for putting the agent outside the sandbox, not inside it
    The Loom-to-Markov-diagram process mining pipeline
    RLMs and how subagents can share memory in latent space
    Latent Briefing and KV-cache communication between subagents
    Reviving Golden Gate Claude with steering vectors on Gemma

    Referenced:
    Alex Levinson
    Anthropic
    Ben Geist
    Claude
    Efficient Memory Sharing for Multi-Agent Systems via KV Cache Compaction (Ben Geist)
    Gemma
    Golden Gate Claude
    Graphviz
    Inspect
    Latent Briefing
    Loom
    Modal
    OpenAI
    Opus
    Qwen
    Ramp
    Ramp Labs
    Ramp Sheets
    Recursive Language Models (Alex Zhang)
    Retool
    Self-maintaining Ramp Sheets
    Steer AI

    Where to find Alex:
    LinkedIn
    Twitter/X
    Website

    Where to find Harrison:
    LinkedIn
    Twitter/X

    Where to find LangChain:
    Website
    Docs

    Send feedback or questions to [email protected]

    Timestamps:
    00:00 Introduction
    01:13 The origin of Ramp Sheets
    02:27 The Loom-to-Markov-diagram process mining pipeline
    04:28 Why code gen approaches felt too "black box" to finance
    06:13 Meeting finance where they already are: inside the spreadsheet
    09:08 How far process mining got them
    10:31 Text descriptions and Graphviz DAGs as output
    12:41 Under the hood of Ramp Sheets
    14:52 Why the agent uses Python only as an escape hatch
    15:47 Why Anthropic models excel at agentic spreadsheet manipulation
    17:12 Frankensteining the OpenAI Agents SDK
    17:43 The Ramp Sheets UX and fast vs. expert mode
    19:58 Agent in a sandbox vs. agent with a sandbox
    21:55 Vibe evals with expert humans
    23:40 Inspect, the internal coding agent
    24:13 The self-monitoring loop and auto-PRs
    28:01 Other wacky experiments on Sheets
    28:43 Memory experiments that didn't pan out
    31:16 Latent Briefing and KV-cache subagent communication
    35:13 Reviving Golden Gate Claude
    37:47 Contrastive pairs and steering vectors
    39:47 Picking the right layers in Gemma
    41:37 What Ramp Labs looks for when hiring
  • Max Agency

    How Listen is building a system of AI Agents & subagents for specialized tasks | Florian Juengermann, CTO

    2026/04/23 | 47 mins.
    Florian Juengermann is the co-founder and CTO of Listen, an AI startup that turns qualitative research across hundreds of interviews, surveys, and focus groups into structured, traceable insights. Listen's agents analyze responses at scale, and Florian has rearchitected the system multiple times to get there. In this conversation, he walks through the virtual table architecture at the core of their Research Agent, how small models run map-reduce classification across thousands of open-ended responses, and the self-reviewing feedback subagent that catches errors during long async runs.

    We also discuss:
    The three agents inside Listen's platform
    How Listen rearchitected from a simple RAG bot to a multi-agent system multiple times
    Why the PowerPoint subagent was completely rebuilt using the Claude Code SDK
    Contextual prompt engineering as an alternative to skills
    How Listen keeps report numbers live as new interview responses come in
    When to trigger the long-running agent vs. showing early results
    What Florian looks for when hiring agent engineers

    References:
    Anthropic
    ChatGPT
    Claude
    Claude Code SDK
    E2B
    Emotional Intelligence
    GPT Mini
    Haiku
    Listen
    OpenAI
    Pandas
    Postgres
    Python
    Research Agent
    Render
    Zoom

    Where to find Florian:
    LinkedIn
    Twitter/X

    Where to find Harrison:
    LinkedIn
    Twitter/X

    Where to find LangChain:
    Website
    Docs

    Send feedback or questions to [email protected]

    Timestamps:
    00:00 Introduction
    01:25 The three agents inside Listen's platform
    03:15 Live chat vs. long async runs, and how Listen tunes for each
    05:33 Under the hood of the Research Agent
    06:37 Listen's virtual table architecture
    07:34 How small models classify thousands of open-ended responses
    10:05 Running code in a sandbox: how E2B fits in
    11:52 Why Listen rebuilt the PowerPoint subagent from scratch
    14:11 Contextual prompt engineering instead of skills
    16:32 The feedback subagent that reviews its own reports
    18:14 How Listen runs evals in production
    19:47 Unexpected ways users push the agent to its limits
    21:42 How many times Listen has rearchitected, and why
    24:59 Trace observability: depth over breadth
    26:10 Lessons from running Claude Code SDK inside E2B
    27:42 Memory: what's solved and what isn't
    29:10 The Composer agent UX: co-editing a document with AI
    35:50 How Listen keeps report numbers live as new responses come in
    43:47 What Listen looks for when hiring agent engineers
  • Max Agency

    How Hex builds AI agents that reason like human data analysts | Izzy Miller, AI Engineer

    2026/04/09 | 1h 8 mins.
    Izzy Miller is an AI engineer at Hex, an AI analytics platform that was one of the first companies to ship data agents to real paying users. Today, Hex runs a multi-agent system with nearly 100K tokens of tools, and Izzy is building a 90-day simulation to evaluate whether those agents actually get smarter over time. In this conversation, he walks through the harness decisions that shaped their architecture, the failure modes Hex is seeing at scale, and what it takes to build an eval that no current model can pass.

    We also discuss:
    Why data agents are harder to verify than coding agents
    Under the hood of Hex’s agents
    How Hex is unifying separate agents
    Why most eval sets are bad
    The 90-day simulation for long-horizon evals
    How Izzy went from marketer to AI engineer

    References:
    Andon Labs
    Anthropic
    Barry McCardel
    ChatGPT
    Claude Code
    Claude Sonnet 4.6
    dbt
    GPT-3.5 Turbo
    GPT-5.3 Codex Spark
    GPT-5.4
    Hex
    LangChain
    LangSmith
    Looker
    OpenAI
    Opus 4.6
    Satya Nadella
    Snowflake
    Vending Machine

    Where to find Izzy:
    LinkedIn
    Twitter/X

    Where to find Harrison:
    LinkedIn
    Twitter/X

    Where to find LangChain:
    Website
    Docs

    Send feedback or questions to [email protected]

    Timestamps:
    01:35 Where Hex's notebook agent started
    03:46 The moment Hex knew it was time for agents
    07:36 Why data agents are harder to verify than coding agents
    09:30 How Hex is unifying separate agents
    13:28 Under the hood of the notebook agent
    15:41 The harness features that are now holding the agent back
    17:41 Why Hex built their own orchestrator
    18:59 Managing nearly 100K tokens of tools
    20:49 Ephemeral queries and agent behavior trade-offs
    24:46 The UX problem with showing agents' thinking
    27:28 Why verification is harder than transparency for data agents
    31:00 Memory, context conflicts, and collapse modes
    34:38 How Hex built their internal eval system
    39:29 Why most eval sets are bad
    44:30 The 900% quota eval that every model fails
    46:55 Model upgrades and the "in distribution" debate
    51:34 How Izzy went from marketer to AI engineer
    59:59 The 90-day simulation for long-horizon evals
  • Max Agency

    Welcome to Max Agency

    2026/04/08 | 0 mins.
    Welcome to Max Agency, the podcast that goes deep into how the best agents are being built by builders like you. I'm Harrison Chase, CEO of LangChain, the agent engineering company, and I'll be your host.
About Max Agency
Welcome to Max Agency, a podcast about how the best AI agents are actually being built. Hosted by Harrison Chase, CEO of LangChain, each episode goes deep with the builders designing, deploying, and learning from real agent systems in the wild. From architecture decisions to evals, tooling, and failure modes, Max Agency is for people who want to understand what it really takes to build useful agents.
