
ThursdAI - The top AI news from the past week

From Weights & Biases, Join AI Evangelist Alex Volkov and a panel of experts to cover everything important that happened in the world of AI from the past week
Latest episode

144 episodes

  • ThursdAI - The top AI news from the past week

    📅 ThursdAI - Feb 19 - Gemini 3.1 Pro Drops LIVE, Sonnet 4.6 Closes Gap, OpenClaw Goes to OpenAI

    2026/2/20 | 1h 31 mins.
    Hey, it’s Alex, let me catch you up!
    Since last week, OpenAI convinced OpenClaw founder Peter Steinberger to join them, while keeping OpenClaw... well... open. Anthropic dropped Sonnet 4.6, which nearly outperforms the previous Opus and is much cheaper. Qwen released 3.5 on Chinese New Year’s Eve while DeepSeek stayed silent, and Elon and the xAI folks deployed Grok 4.20 without any benchmarks, and it’s four ~500B models in a trenchcoat?
    Also, Anthropic’s updated rules state that it’s breaking ToS to use their plans for anything except Claude Code & the Claude SDK (and then they clarified that it’s OK? We’re not sure.)
    Then Google decided to drop their Gemini 3.1 Pro preview right at the start of our show, and it’s very nearly the best LLM folks can use right now (though it didn’t pass Nisten’s vibe checks)
    Also, Google released Lyria 3 for music gen (though only 30 seconds?), our own Ryan Carson blew up on X again with over 1M views for his Code Factory article, Wolfram did a deep dive into Terminal Bench, and... we have a brand new website:
    https://thursdai.news 🎉
    Great week all in all, let’s dive in!
    ThursdAI - Subscribe to never feel like you’re behind. Share with your friends if you’re already subscribed!

    Big Companies & API updates
    Google releases Gemini 3.1 Pro with 77.1% on ARC-AGI-2 (X, Blog, Announcement)
    In a release that surprised no-one, Google decided to drop their latest update to Gemini models, and it’s quite a big update too! We’ve now seen all major labs ship big model updates in the first two months of 2026. With 77.1% on ARC-AGI 2, and 80.6% on SWE-bench verified, Gemini is not complete SOTA across the board but it’s damn near close.
    The kicker is, it’s VERY competitive on the pricing, with 1M context at $2 / $12 per million input/output tokens. And if you look at the trajectory, it’s really notable how quickly we’re moving, with this model scoring 82% better on abstract reasoning than the 3 Pro released just a few months ago!
    The 1 Million Context Discrepancy: who’s better at long context?
    The most fascinating catch of the live broadcast came from LDJ, who has an eagle eye for evaluation tables. He immediately noticed something weird in Google’s reported benchmarks regarding long-context recall. On the MRCR v2 8-needle benchmark (which tests retrieval quality deep inside a massive context window), Google’s table showed Gemini 3.1 Pro getting a 26% recall score at 1 million tokens. Curiously, they marked Claude Opus 4.6 as “not supported” in that exact tier.
    LDJ quickly pulled up the actual receipts: Opus 4.6 at a 1-million context window gets a staggering 76% recall score. That is a massive discrepancy! It was addressed by a member of DeepMind on X in a response to me, saying that Anthropic used an internal model for evaluating this (with receipts he pulled from the Anthropic model card)
    Live Vibe-Coding Test for Gemini 3.1 Pro
    We couldn’t just stare at numbers, so Nisten immediately fired up AI Studio for a live vibe check. He threw our standard “build a mars driver simulation game” prompt at the new Gemini.
    The speed was absolutely breathtaking. The model generated the entire single-file HTML/JS codebase in about 20 seconds. However, when he booted it up, the result was a bit mixed. The first run actually failed to render entirely. A quick refresh got a version working, and it rendered a neat little orbital launch UI, but it completely lacked the deep physics trajectories and working simulation elements that models like OpenAI’s Codex 5.3 or Claude Opus 4.6 managed to output on the exact same prompt last week. As Nisten put it, “It’s not bad at all, but I’m not impressed compared to what Opus and Codex did. They had a fully working one with trajectories, and this one I’m just stuck.”
    It’s a great reminder that raw benchmarks aren’t everything. A lot of this comes down to the harness—the specific set of system prompts and sandboxes that the labs use to wrap their models.
    Anthropic launches Claude Sonnet 4.6, with 1M token context and near-Opus intelligence at Sonnet pricing
    The above Gemini release comes just a few days after Anthropic shipped an update to the middle child of their lineup, Sonnet 4.6. With much improved computer-use skills and an updated beta mode for 1M tokens, it achieves 79.6% on the SWE-bench Verified eval, showing good coding performance while maintaining those “Anthropic-trained model” vibes that many people seem to prefer.
    Apparently, in blind testing inside Claude Code, folks preferred the new model’s outputs over the latest Opus 4.5 around ~60% of the time, and over the previous Sonnet 70% of the time.
    With $3/$15 per million tokens pricing, it’s cheaper than Opus, but is still more expensive than the flagship Gemini model, while being quite behind.
    Vibing with Sonnet 4.6
    I’ve tested out Sonnet 4.6 inside my OpenClaw harness for a few days, and it was decent. It did annoy me a bit more than Opus by misunderstanding what I asked it, but it definitely has the same “emotional tone” as Opus. Compared to Codex 5.3, it’s much nicer to talk to. IDK what kind of Anthropic magic they put in there, but if you’re on a budget, Sonnet is definitely the way to go when interacting with agents (and you can get it to orchestrate as many Codex instances as you want if you don’t like how it writes code)
    For Devs: Auto prompt caching and Web Search updates
    One more nice update from Anthropic: automatic prompt caching for developers (which can cut token pricing by almost 90%) (Blog), plus a new and improved Web Search for everyone else that can now use tools
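    If you haven’t touched prompt caching before, here’s a minimal sketch of the explicit pattern against Anthropic’s Messages API (marking a big, reusable system prompt as cacheable via cache_control). The model id is a placeholder, and how much of this now happens automatically is exactly what the new update changes, so treat this as the “before” picture:
    ```typescript
    // Minimal sketch: explicit prompt caching with the Anthropic Messages API.
    // The model id below is a placeholder; use whatever Sonnet snapshot you have access to.
    const BIG_REUSABLE_SYSTEM_PROMPT = "...your long, stable agent instructions...";

    async function cachedCall(): Promise<void> {
      const res = await fetch("https://api.anthropic.com/v1/messages", {
        method: "POST",
        headers: {
          "x-api-key": process.env.ANTHROPIC_API_KEY ?? "",
          "anthropic-version": "2023-06-01",
          "content-type": "application/json",
        },
        body: JSON.stringify({
          model: "claude-sonnet-4-6", // placeholder id
          max_tokens: 1024,
          system: [
            {
              type: "text",
              text: BIG_REUSABLE_SYSTEM_PROMPT,
              cache_control: { type: "ephemeral" }, // cache everything up to this block
            },
          ],
          messages: [{ role: "user", content: "Summarize the latest PR comments." }],
        }),
      });
      const data = await res.json();
      // On repeat calls, usage reports cache_creation_input_tokens / cache_read_input_tokens,
      // which is where the big input-token discount shows up.
      console.log(data.usage);
    }

    cachedCall();
    ```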
    Grok 4.20 - 4 groks in a trenchcoat?
    In a very weird release, Grok has been updated with the long-hyped Grok 4.20. Elon has been promising this version for a while (since late last year in fact) and this “release” definitely felt underwhelming. There were no evaluations, no comparisons to other labs’ models, no charts (heck, not even a blog post on x.ai).
    What we do know, is that Grok 4.20 (and Grok 4.20 Heavy) use multiple agents (4 for Grok, 16 for Heavy) to do a LOT of research and combine their answers somehow. This is apparently what the other labs use for their ultra expensive models (GPT Pro and Gemini DeepThink) but Grok is showing it in the UI, and gives these agents... names and personalities.
    Elon also confirmed that what’s deployed right now is a ~500B “small” base version, and that bigger versions are coming, in one of the rarest confirmations of model size from the big labs.
    Vibe checking this new grok, it’s really fast at research across X and the web, but I don’t really see it as a daily driver for anyone who converses with LLMs all the time. Supposedly they are planning to keep teaching this model and get it “improved week over week” so I’ll keep you up to date with major changes here.
    Open Source AI
    It seems that all the Chinese OSS labs were shipping before the Chinese New Year, with Qwen being the last one of them, dropping the updated Qwen 3.5.
    Alibaba’s Qwen3.5 397B-A17B: First open-weight native multimodal MoE model (X, HF)
    Qwen decided to go for Sparse MoE architecture with this release, with a high number of experts (512) and only 17B active parameters.
    It’s natively multi-modal with a hybrid architecture, able to understand images/text, while being comparable to GPT 5.2 and Opus 4.5 on benches including agentic tasks.
    Benchmarks aside, the release page of Qwen models is a good sniff test on where these model labs are going. They have multimodality in there, but they also feature an example of how to use this model within OpenClaw, which doesn’t necessarily show off any specific capabilities, but shows that the Chinese labs are focusing on agentic behavior, tool use and, most of all, pricing!
    This model is also available as Qwen 3.5 Max with 1M token window (as opposed to the 256K native one on the OSS side) on their API.
    Agentic Coding world - The Clawfather is joining OpenAI, Anthropic loses dev mindshare
    This was a heck of a surprise to many folks: Peter Steinberger announced that he’s joining OpenAI, while OpenClaw (which now sits at >200K stars on GitHub and is adopted by nearly every Chinese lab) is going to become an open-source foundation.
    OpenAI has also confirmed that it’s absolutely OK to use your ChatGPT Plus/Pro subscription inside OpenClaw, and it’s really a heck of a thing to see how quickly Peter jumped from relative anonymity (after scaling and selling PSPDFKit) into the spotlight. Apparently Mark Zuckerberg reached out directly, as did Sam Altman, and Peter decided to go with OpenAI, despite Zuck offering more money, due to “culture”
    This whole ClawdBot/OpenClaw debacle also shines a very interesting and negative light on Anthropic, who recently changed their ToS to highlight that their subscription can only be used for Claude Code and nothing else. This scared a lot of folks who used their Max subscription to run their Claws 24/7. Additionally, Ryan echoed how the community feels about the lack of DevEx/DevRel support from Anthropic in a viral post.
    However, it does not seem like Anthropic cares? Their revenue is going exponential (much of it due to Claude Code)
    Very interestingly, I went to a local Claude Code meetup here in Denver, and the folks there are.. a bit behind the “bubble” on X. Many of them didn’t even try Codex 5.3 or OpenClaw, they are maximizing their time with Claude Code like there’s no tomorrow. It has really shown me that the alpha keeps changing really fast, and many folks don’t have the time to catch up!
    P.S - this is why ThursdAI exists, and I’m happy to deliver the latest news to ya.
    This Week’s Buzz from Weights & Biases
    Our very own Wolfram Ravenwolf took over the Buzz corner this week to school us on the absolute chaos that is AI benchmarking. With his new role at W&B, he’s been stress-testing all the latest models on Terminal Bench 2.0.
    Why Terminal Bench? Because if you are building autonomous agents, multiple-choice tests like MMLU are basically useless now. You need to know if an agent can actually interact with an environment. Terminal Bench asks the agent to perform 89 real-world tasks inside a sandboxed Linux container—like building a Linux kernel or cracking a password-protected archive.
    Wolfram highlighted some fascinating nuances that marketing slides never show you. For example, did you know that on some agentic tasks, turning off the model’s “thinking/reasoning” mode actually results in a higher score? Why? Because overthinking generates so many internal tokens that it fills the context window faster, causing the model to hit its limits and fail harder than a standard zero-shot model! Furthermore, comparing benchmarks between labs is incredibly difficult because changing the benchmark’s allowed runtime from 1 hour to 2 hours drastically raises the ceiling of what models can achieve.
    He also shared a great win: while evaluating GLM-5 for our W&B inference endpoints, he got an abysmal 5% score. By pulling up the Weave trace data, Wolfram immediately spotted that the harness was injecting brain-dead Python syntax errors into the environment. He reported it, engineering fixed it in minutes, and the score shot up to its true state-of-the-art level. This is exactly why you need powerful tracing and evaluation tools when dealing with these black boxes! So y’know... check out Weave!
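    To make the tracing point concrete, here’s roughly what wrapping a harness step in Weave looks like. This is a minimal sketch assuming the Weave TypeScript SDK’s init/op entry points (check the docs for exact signatures); the project name and the harness function are made up for illustration:
    ```typescript
    import * as weave from "weave";

    // Hypothetical harness step: run one Terminal Bench task and return the transcript.
    // In Wolfram's debugging story, traces of calls like this are what exposed the
    // harness injecting broken Python into the environment.
    async function runTerminalBenchTask(taskId: string): Promise<string> {
      // ... spin up the sandbox, let the agent work, collect the transcript ...
      return `transcript for ${taskId}`;
    }

    async function main(): Promise<void> {
      await weave.init("terminal-bench-evals"); // illustrative project name
      // Wrapping the function records every call (inputs, outputs, timing) as a trace.
      const tracedTask = weave.op(runTerminalBenchTask);
      console.log(await tracedTask("build-linux-kernel"));
    }

    main();
    ```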
    Vision & BCI
    Zyphra’s ZUNA: Thought-to-Text Gets Real (X, Blog, GitHub)
    LDJ flagged this as his must-not-miss: Zyphra released ZUNA, a 380M parameter open-source BCI (Brain-Computer Interface) foundation model. It takes sparse, noisy EEG signals from your brain and reconstructs clinical-grade brain signals from them. People are literally calling it “thought to text” hahaha.
    At 380M parameters, it could potentially run in real-time on a consumer GPU. It was trained on 2 million channel-hours of EEG data from 208 datasets. The wild part: it can upgrade cheap $500 consumer EEG headsets to high-resolution signal quality without retraining, something many folks are posting about and are excited to test out! Non-invasive BCI is the dream!
    Nisten was genuinely excited, noting it’s probably the best effort in this field and it’s fully Apache 2.0. Will probably need personalized training per person, but the potential is real: wear a headset, look at a screen, fire up your agents with your thoughts. Not there yet, but this feels like the actual beginning.
    Tools & Agentic Coding (The End of “Vibe Coding”) - Ryan Carson’s Code Factory & The “One-Shot Myth”
    This one is for developers, but in modern times, everyone can become a developer so if you’re not one, at least skim this.
    We spent a big chunk of the show today geeking out over agentic workflows. Ryan Carson went incredibly viral on X again this week with a phenomenal deep-dive on establishing a “Code Factory.” If you are still just chatting with models and manually copying code back into your IDE, you are doing it wrong.
    Ryan’s methodology (heavily inspired by a recent OpenAI paper on harness engineering) treats your AI agents like a massive team of junior engineers. You don’t just ask them for code and ship it. You should build a rigid, machine-enforced loop.
    Here is the flow (a rough sketch of the risk-gate step follows the list):
    * The coding agent (Codex, OpenClaw, etc.) writes the code.
    * The GitHub repository enforces risk-aware checks. If a core system file or route is touched, the PR is automatically flagged as high risk.
    * A secondary code review agent (like Greptile) kicks off and analyzes the PR.
    * CI/CD GitHub Actions run automated tests, including browser testing.
    * If a test fails, or the review agent leaves a comment, a remediation agent is automatically triggered to fix the issue and loop back.
    * The loop spins continuously until you get a flawless, green PR.
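    To make step two concrete, here’s a minimal sketch of what a machine-enforced risk gate can look like: a small script run in CI that flags the PR whenever the diff touches sensitive paths. The path prefixes and the CHANGED_FILES environment variable are hypothetical, not Ryan’s actual setup:
    ```typescript
    // Hypothetical CI gate: fail (flag) the job if the PR touches core files or routes.
    // CHANGED_FILES is assumed to be a newline-separated list supplied by the CI job,
    // e.g. from `git diff --name-only origin/main...HEAD`.
    const HIGH_RISK_PREFIXES = ["src/core/", "src/routes/", "migrations/", "infra/"];

    const changedFiles = (process.env.CHANGED_FILES ?? "")
      .split("\n")
      .map((f) => f.trim())
      .filter(Boolean);

    const risky = changedFiles.filter((file) =>
      HIGH_RISK_PREFIXES.some((prefix) => file.startsWith(prefix))
    );

    if (risky.length > 0) {
      // A non-zero exit is the signal the review and remediation agents key off of.
      console.error(`High-risk PR, extra review required: ${risky.join(", ")}`);
      process.exit(1);
    }
    console.log("No high-risk paths touched; standard checks apply.");
    ```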
    As Ryan pointed out, we used to hate this stuff as human engineers. Waiting for CI to pass made you want to pull your hair out. But agents have infinite time and infinite patience. You force them to grind against the machine-enforced contract (YAML/JSON gates) until they get it right. It takes a week to set up properly, and you have to aggressively fight “document drift” to make sure your AI doesn’t forget the architecture, but once it’s humming, you have unprecedented leverage.
    My Hard Truth: One-Shot is a Myth
    I completely agree with Ryan btw! Over the weekend, my OpenClaw agent kindly informed me that the hosting provider for the old ThursdAI website was shutting down. I needed a new website immediately.
    I decided to practice what we preach and talk to my ClawdBot to build the entire thing. It was an incredible process. I used Opus 4.6 to mock up 3 designs based on other podcast sites. Then, I deployed a swarm of sub-agents to download and read the raw text transcripts of all 152 past episodes of our show. Their job was to extract the names of every single guest (over 160 guests, including 15 from Google alone!) to build a dynamic guest directory, generating a dedicated SEO page and dynamic OpenGraph tag for every single one of them, a native website podcast player with synced sections, episode pages with guests highlighted and much more. It would have taken me months to write the code for this myself.
    Was it magical? Yes. But was it one-shot? Absolutely not.
    The amount of back-and-forth conversation, steering, and correction I had to provide to keep the CSS coherent across pages was exhausting. I set up an automation to work while I slept, and I would wake up every morning to a completely different, sometimes broken website.
    Yam Peleg chimed in with the quote of the week: “It’s not a question of whether a model can mess up your code, it’s just a matter of when. Because it is a little bit random all the time. Humans don’t mistakenly delete the entire computer. Models can mistakenly, without even realizing, delete the entire computer, and a minute later their context is compacted and they don’t even remember doing it.”
    This is why you must have gates. This is also why I don’t think engineers are going to be replaced with AI completely. Engineers who don’t use AI? Yup. But if you embrace these tools and learn to work with them, you won’t have an issue getting a job! You need that human taste-maker in the loop to finish the last 5%, and you need strict CI/CD gates to stop the AI from accidentally burning down your production database.
    Voice & Audio
    Google DeepMind launches Lyria 3 (try it)
    Google wasn’t just dropping reasoning models this week; DeepMind officially launched Lyria 3, their most advanced AI music generation model, integrating it directly into the Gemini App.
    Lyria 3 generates 30-second high-fidelity tracks with custom lyrics, realistic vocals across 8 different languages, and granular controls over tempo and instrumentation. You can even provide an image and it’ll generate a soundtrack (short one) for that image.
    While it is currently limited to 30-second tracks (which makes it hard to compare to the full-length song structures of Suno or Udio), early testers are raving that the actual audio fidelity and prompt adherence of Lyria 3 are far superior. All tracks are invisibly watermarked with Google’s SynthID to ensure provenance, and it automatically generates cover art using Nano Banana. I tried to generate a jingle.
    That’s a wrap for this weeks episode folks, what an exclirating week! ( Yes I know it’s a typo, but how else would you know that I’m human?)
    Please go check out our brand new website (and tell me if anything smells off there, it’s definitely not perfect!), click around the guests directory and the episodes pages (the last 3 have pages, I didn’t yet backfill the rest) and let me know what you think!
    See you all next week!
    -Alex
    ThursdAI - Feb 19, 2026 - TL;DR
    TL;DR of all topics covered:
    * Hosts and Guests
    * Alex Volkov - AI Evangelist & Weights & Biases (@altryne)
    * Co Hosts - @WolframRvnwlf @yampeleg @nisten @ldjconfirmed @ryancarson
    * 🔥 New website: thursdai.news with all our past guests and episodes
    * Open Source LLMs
    * Alibaba releases Qwen3.5-397B-A17B: First open-weight native multimodal MoE model with 8.6-19x faster inference than Qwen3-Max (X, HF)
    * Cohere Labs releases Tiny Aya, a 3.35B multilingual model family supporting 70+ languages that runs locally on phones (X, HF, HF)
    * Big CO LLMs + APIs
    * OpenClaw founder joins OpenAI
    * Google releases Gemini 3.1 Pro with 2.5x better abstract reasoning and improved coding/agentic capabilities (X, Blog, Announcement)
    * Anthropic launches Claude Sonnet 4.6, its most capable Sonnet model ever, with 1M token context and near-Opus intelligence at Sonnet pricing (X, Blog, Announcement)
    * ByteDance releases Seed 2.0 - a frontier multimodal LLM family with Pro, Lite, Mini, and Code variants that rivals GPT-5.2 and Claude Opus 4.5 at 73-84% lower pricing (X, blog, HF)
    * Anthropic changes the rules on Max use, OpenAI confirms it’s 100% fine.
    * Grok 4.20 - finally released, a mix of 4 agents
    * This week’s Buzz
    * Wolfram deep dives into Terminal Bench
    * We’ve launched Kimi K2.5 on our inference service (Link)
    * Vision & Video
    * Zyphra releases ZUNA, a 380M-parameter open-source BCI foundation model for EEG that reconstructs clinical-grade brain signals from sparse, noisy data (X, Blog, GitHub)
    * Voice & Audio
    * Google DeepMind launches Lyria 3, its most advanced AI music generation model, now available in the Gemini App (X, Announcement)
    * Tools & Agentic Coding
    * Ryan is viral once again with CodeFactory! (X)
    * Ryan uses Agentation.dev for front-end development, closing the loop on components
    * Dreamer launches beta: A full-stack platform for building and discovering agentic apps with no-code AI (X, Announcement)


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
  • ThursdAI - The top AI news from the past week

    📆 Open source just pulled up to Opus 4.6 — at 1/20th the price

    2026/2/13 | 1h 28 mins.
    Hey dear subscriber, Alex here from W&B, let me catch you up!
    This week started with Anthropic releasing /fast mode for Opus 4.6, continued with ByteDance’s reality-shattering video model called Seedance 2.0, and then the open weights folks pulled up!
    Z.ai releasing GLM-5, a 744B top ranking coder beast, and then today MiniMax dropping a heavily RL’d MiniMax M2.5, showing 80.2% on SWE-bench, nearly beating Opus 4.6! I’ve interviewed Lou from Z.AI and Olive from MiniMax on the show today back to back btw, very interesting conversations, starting after TL;DR!
    So while the OpenSource models were catching up to frontier, OpenAI and Google both dropped breaking news (again, during the show), with Gemini 3 Deep Think shattering the ArcAGI 2 (84.6%) and Humanity’s Last Exam (48% w/o tools)... Just an absolute beast of a model update, and OpenAI launched their Cerebras collaboration, with GPT 5.3 Codex Spark, supposedly running at over 1000 tokens per second (but not as smart)
    Also, crazy week for us at W&B as we scrambled to host GLM-5 on release day, and we’re working on dropping Kimi K2.5 and MiniMax both on our inference service! As always, all show notes are at the end, let’s DIVE IN!
    ThursdAI - AI is speeding up, don’t get left behind! Sub and I’ll keep you up to date with a weekly catch up

    Open Source LLMs
    Z.ai launches GLM-5 - #1 open-weights coder with 744B parameters (X, HF, W&B inference)
    The breakaway open-source model of the week is undeniably GLM-5 from Z.ai (formerly known to many of us as Zhipu AI). We were honored to have Lou, the Head of DevRel at Z.ai, join us live on the show at 1:00 AM Shanghai time to break down this monster of a release.
    GLM-5 is massive, not something you run at home (hey, that’s what W&B inference is for!) but it’s absolutely a model that’s worth thinking about if your company has on prem requirements and can’t share code with OpenAI or Anthropic.
    They jumped up from 355B parameters in GLM-4.5 and expanded their pre-training data to a whopping 28.5T tokens to get these results. But Lou explained that it’s not only about data: they adopted DeepSeek’s sparse attention (DSA) to help preserve deep reasoning over long contexts (this one has 200K)
    Lou summed up the generational leap from version 4.5 to 5 perfectly in four words: “Bigger, faster, better, and cheaper.” I dunno about faster, this may be one of those models that you hand off more difficult tasks to, but definitely cheaper, with $1 input/$3.20 output per 1M tokens on W&B!
    While the evaluations are ongoing, the one interesting tidbit from Artificial Analysis was that this model scores the lowest on their hallucination rate bench!
    Think about this for a second: this model is neck and neck with Opus 4.5, and if Anthropic hadn’t released Opus 4.6 just last week, this would be an open weights model that rivals Opus, one of the best models the Western frontier labs, with all their investment, have out there. Absolutely insane times.
    MiniMax drops M2.5 - 80.2% on SWE-bench verified with just 10B active parameters (X, Blog)
    Just as we wrapped up our conversation with Lou, MiniMax dropped their release (though not weights yet, we’re waiting ⏰) and then Olive Song, a senior RL researcher on the team, joined the pod, and she was an absolute wealth of knowledge!
    Olive shared that they achieved an unbelievable 80.2% on SWE-bench Verified. Digest this for a second: a 10B active parameter open-source model is directly trading blows with Claude Opus 4.6 (80.8%) on one of the hardest real-world software engineering benchmarks we currently have. While being (Alex checks notes)... 20x cheaper and much faster to run? Apparently their fast version gets up to 100 tokens/s.
    Olive shared the “not so secret” sauce behind this punch-above-its-weight performance. The massive leap in intelligence comes entirely from their highly decoupled Reinforcement Learning framework called “Forge.” They heavily optimized not just for correct answers, but for end-to-end task completion time. In the era of bloated reasoning models that spit out ten thousand “thinking” tokens before writing a line of code, MiniMax trained their model across thousands of diverse environments to use fewer tools, think more efficiently, and execute plans faster. As Olive noted, less time waiting and fewer tool calls means less money spent by the user (as confirmed by @swyx at the Windsurf leaderboard, developers often prefer fast but good-enough models).
    I really enjoyed the interview with Olive, really recommend you listen to the whole conversation starting at 00:26:15. Kudos MiniMax on the release (and I’ll keep you updated when we add this model to our inference service)
    Big Labs and breaking news
    There’s a reason the show is called ThursdAI, and today that reason is clearer than ever: AI’s biggest updates happen on a Thursday, often live during the show. This happened 2 times last week and 3 times today, first with MiniMax and then with both Google and OpenAI!
    Google previews Gemini 3 Deep Think, top reasoning intelligence SOTA Arc AGI 2 at 84% & SOTA HLE 48.4% (X , Blog)
    I literally went 🤯 when Yam brought this breaking news. 84% on the ARC-AGI-2 benchmark. For context, the highest score prior to this was 68% from Opus 4.6 just last week. A jump from 68 to 84 on one of the hardest reasoning benchmarks we have is mind-bending. It also scored a 48.4% on Humanity’s Last Exam without any tools.
    Only available to Ultra subscribers to Gemini (not in the API yet?), this model seems to be the current leader in reasoning about hard problems and is not meant for day-to-day chat users like you and me (though I did use it, and it’s pretty good at writing!)
    They posted Gold-medal performance on 2025 Physics and Chemistry Olympiads, and an insane 3455 ELO rating at CodeForces, placing it within the top 10 best competitive programmers. We’re just all moving so fast I’m worried about whiplash! But hey, this is why we’re here, we stay up to date so you don’t have to.
    OpenAI & Anthropic fast modes
    Not 20 minutes passed since the above news, when OpenAI announced a new model that works only for Pro tier members (I’m starting to notice a pattern here 😡), GPT 5.3 Codex Spark.
    You may be confused, didn’t we just get GPT 5.3 Codex last week? well yeah, but this one, this one is its little and super speedy brother, hosted by the Cerebras partnership they announced a while ago, which means, this coding model absolutely slaps at over 1000t/s.
    Yes, over 1K tokens per second can be generated with this one, though there are limits. It’s not as smart, it’s text only, it has 128K context, but still, for MANY subagents, this model is an absolute beast. It won’t refactor your whole codebase in one shot, but it’ll generate and iterate on it very, very quickly!
    OpenAI also previously updated Deep Research with GPT 5.2 series of models, and we can all say bye bye to the “older” version of models, like 5, o3 and most importantly GPT 4o, which got a LOT of people upset (enough that they have a hashtag going, #keep4o) !
    Anthropic also announced their fast mode (using /fast) in Claude Code btw on Saturday, and that one is absolutely out of reach for many users: at $225 per 1M output tokens, this model will just burn through your wallet. Unlike the Spark version, this seems to be the full Opus 4.6 just... running on some dedicated hardware? I thought this was a rebranded Sonnet 5 at first but Anthropic folks confirmed that it wasn’t.
    Vision & Video
    ByteDance’s Seedance 2.0 Shatters Reality (and nobody in the US can use it)
    I told the panel during the show: my brain is fundamentally broken after watching the outputs from ByteDance’s new Seedance 2.0 model. If your social feed isn’t already flooded with these videos, it will be so very soon (supposedly the API launches Feb 14 on Valentines Day)
    We’ve seen good video models before. Sora blew our minds and then Sora 2, Veo is (still) great, Kling was fantastic. But Seedance 2.0 is an entirely different paradigm. It is a unified multimodal audio-video joint generation architecture. What does that mean? It means you can simultaneously input up to 9 reference images, 3 video clips, 3 audio clips, and text instructions all at once to generate a 15-second cinematic short film. Its character consistency is beyond what we’ve seen before, and the physics are razor sharp (just looking at the examples folks are posting, it’s clear it’s on another level)
    I think very soon though, this model will be restricted, but for now, it’s really going viral due to the same strategy Sora did, folks are re-imagining famous movie and TV shows endings, doing insane mashups, and much more! Many of these are going viral over the wall in China.
    The level of director-like control is unprecedented. But the absolute craziest part is the sound and physics. Seedance 2.0 natively generates dual-channel stereo audio with ASMR-level Foley detail. If you generate a video of a guy taking a pizza out of a brick oven, you hear the exact scratch of the metal spatula, the crackle of the fire, the thud of the pizza box, and the rustling of the cardboard as he closes it. All perfectly synced to the visuals.
    Seedance 2 feels like “borrowed realism”. Previous models had only images and their training to base their generations on. Seedance 2 accepts up to 3 video references in addition to images and sounds.
    This is why some of the videos feel like a new jump in visual capabilities. I have a hunch that ByteDance will try and clamp down on copyrighted content before releasing this model publicly, but for now the results are very very entertaining and I can’t help but wonder, who is the first creator that will just..remake the ending of GOT last season!?
    Trying this out is hard right now, especially in the US, but there’s a free way to test it: connect through a VPN, go to doubao.com/chat, select Seedream 4.5, and ask to “create a video please” in your prompt!
    AI Art & Diffusion: Alibaba’s Qwen-Image-2.0 (X, Blog)
    The Qwen team over at Alibaba has been on an absolute tear lately, and this week they dropped Qwen-Image-2.0. In an era where everyone is scaling models up to massive sizes, Alibaba actually shrank this model from 20B parameters down to just 7B parameters, while massively improving performance (though they didn’t drop the weights yet, they are coming)
    Despite the small size, it natively outputs 2K (2048x2048) resolution images, giving you photorealistic skin, fabric, and snow textures without needing a secondary upscaler. But the real superpower of Qwen-Image-2.0 is its text rendering, it supports massive 1,000-token prompts and renders multilingual text (English and Chinese) flawlessly.
    It’s currently #3 globally on AI Arena for text-to-image (behind only Gemini-3-Pro-Image and GPT Image 1.5) and #2 for image editing. My results with it were not the best; I tried to generate this week’s thumbnails with it and... they turned out meh at best?
    In fact, my results were so, so bad compared to their launch blog that I’m unsure that they are serving me the “new” model 🤔 Judge for yourself, the above infographic was created with Nano Banana Pro, and this one, same prompt, with Qwen Image on their website:
    But you can test it for free at chat.qwen.ai right now, and they’ve promised open-source weights after the Chinese New Year!
    🛠️ Tools & Orchestration: Entire Checkpoints & WebMCP
    With all these incredibly smart, fast models, the tooling ecosystem is desperately trying to keep up. Two massive developments happened this week that will change how we build with AI, moving us firmly away from hacky scripts and into robust, agent-native development.
    Entire Raises $60M Seed for OSS Agent Workflows
    Agent orchestration is the hottest problem in tech right now, and a new company called Entire just raised a record-breaking $60 Million seed round (at a $300M valuation—reportedly the largest seed ever for developer tools) to solve it. Founded by former GitHub CEO Thomas Dohmke, Entire is building the “GitHub for the AI agent era.”
    Their first open-source release is a CLI tool called Checkpoints.
    Checkpoints integrates via Git hooks and automatically captures entire agent sessions—transcripts, prompts, files modified, token usage, and tool calls—and stores them as versioned Git data on a separate branch (entire/checkpoints/v1). It creates a universal semantic layer for agent tracing. If your Claude Code or Gemini CLI agent goes off the rails, Checkpoints allows you to seamlessly rewind to a specific state in the agent’s session.
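    For a feel of the underlying idea (versioned agent-session data living on its own branch, away from your code history), here’s an illustrative sketch. To be clear, this is not the Checkpoints CLI itself: only the branch name comes from their description; the file layout, commit message and use of a temporary worktree are my assumptions:
    ```typescript
    import { execSync } from "node:child_process";
    import { mkdtempSync, writeFileSync } from "node:fs";
    import { tmpdir } from "node:os";
    import { join } from "node:path";

    // Illustrative only: persist an agent session as a JSON blob committed to a
    // dedicated branch, without disturbing the main working tree.
    const BRANCH = "entire/checkpoints/v1"; // branch name from the article

    function saveCheckpoint(session: object): void {
      const dir = mkdtempSync(join(tmpdir(), "checkpoint-"));
      try {
        // Check the branch out into a temporary worktree (create it if missing).
        execSync(`git worktree add "${dir}" ${BRANCH}`, { stdio: "pipe" });
      } catch {
        execSync(`git worktree add -b ${BRANCH} "${dir}"`, { stdio: "pipe" });
      }
      const file = join(dir, `session-${Date.now()}.json`);
      writeFileSync(file, JSON.stringify(session, null, 2));
      execSync(`git add "${file}" && git commit -m "checkpoint: agent session"`, { cwd: dir });
      execSync(`git worktree remove "${dir}"`, { stdio: "pipe" });
    }

    saveCheckpoint({ agent: "example", transcript: ["hello"], tokens: 1234 });
    ```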
    We also have to shout out our own Ryan Carson, who shipped his open-source project AntFarm this week to help orchestrate these agents on top of Open-Claw!
    Chrome 146 Introduces WebMCP
    Finally, an absolutely massive foundational shift is happening on the web. Chrome 146 Canary is shipping an early preview of WebMCP.
    We have been talking about web-browsing agents for a while, and the biggest bottleneck has always been brittle DOM scraping, guessing CSS selectors, and simulating clicks via Puppeteer or Playwright. It wastes an immense amount of tokens and breaks constantly. Chrome 146 is fundamentally changing this by introducing a native browser API.
    Co-authored by Google and Microsoft under the W3C Web Machine Learning Community Group, WebMCP allows websites to declaratively expose structured tools directly to AI agents using JSON schemas via navigator.modelContext. You can also do this through HTML form annotations using tool-name and tool-description attributes. No backend MCP server is required.
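    Here’s roughly what that could look like from a site’s point of view. Heavy caveat: the method name and tool shape below are my assumptions based on the general design described (a JSON-schema tool registered on navigator.modelContext); the actual early-preview API in Chrome 146 Canary may differ, so treat this as pseudocode in TypeScript clothing:
    ```typescript
    // Assumed shape of the experimental API; not a confirmed Chrome 146 signature.
    interface WebMcpTool {
      name: string;
      description: string;
      inputSchema: object; // JSON Schema describing the tool's arguments
      execute(args: Record<string, unknown>): Promise<unknown>;
    }

    declare global {
      interface Navigator {
        modelContext?: { provideContext(ctx: { tools: WebMcpTool[] }): void };
      }
    }

    // Example: a storefront exposing a structured "add to cart" tool, so an agent
    // doesn't have to scrape the DOM and simulate clicks.
    navigator.modelContext?.provideContext({
      tools: [
        {
          name: "add_to_cart",
          description: "Add a product to the shopping cart by SKU.",
          inputSchema: {
            type: "object",
            properties: {
              sku: { type: "string" },
              quantity: { type: "integer", minimum: 1 },
            },
            required: ["sku"],
          },
          async execute({ sku, quantity }) {
            // Calls the site's own backend; no separate MCP server involved.
            const res = await fetch("/api/cart", {
              method: "POST",
              headers: { "content-type": "application/json" },
              body: JSON.stringify({ sku, quantity: quantity ?? 1 }),
            });
            return res.json();
          },
        },
      ],
    });

    export {};
    ```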
    I don’t KNOW if this is going to be big or not, but it definitely smells like it, because even the best agentic AI assistants struggle with browsing the web: with constrained context windows, they can’t just go by raw HTML content and screenshots! Let’s see if this helps agents browse the web!
    All right, that about sums it up for this week, an absolute banger of a week. The one thing I didn’t cover as a news item, but mentioned last week, is that many folks report being overly tired, barely able to go to sleep while their agentic things are running, and all of us are trying to get to the bottom of how to work with these new agentic coding tools.
    Steve Yegge noticed the same and called it “the AI vampire“ while Matt Shumer went ultraviral (80M+ views) on his article about “something big is coming“ which terrified a lot of folks. What’s true for sure, is that we’re going through an inflection point in humanity, and I believe that staying up to date is essential as we go through it, even if some of it seems scary or “too fast”.
    This is why ThursdAI exists, I first and foremost wanted this for ME to stay up to date, and after that to share this with all of you. Having recently hit a few milestones for ThursdAI, all I can say is thanks for sharing, reading, listening and tuning in from week to week 🫡
    ThursdAI - Feb 12, 2026 - TL;DR
    TL;DR of all topics covered:
    * Hosts and Guests
    * Alex Volkov - AI Evangelist & Weights & Biases (@altryne)
    * Co Hosts - @WolframRvnwlf @yampeleg @nisten @ldjconfirmed @ryancarson
    * Lou from Z.AI (@louszbd)
    * Olive Song - Lead RL at Minimax @olive_jy_song
    * Open Source LLMs
    * Z.ai launches GLM-5: 744B parameter MoE model achieving #1 open-source ranking for agentic coding with 77.8% SWE-bench Verified (X, HF, Wandb)
    * MiniMax M2.5 drops official benchmarks showing SOTA coding performance at 20x cheaper than competitors (X)
    * Big CO LLMs + APIs
    * XAI cofounders quit/let go after X restructuring (X, TechCrunch)
    * Anthropic releases Claude Opus 4.6 sabotage risk report, preemptively meeting ASL-4 safety standards for autonomous AI R&D (X, Blog)
    * OpenAI upgrades Deep Research to GPT-5.2 with app integrations, site-specific searches, and real-time collaboration (X, Blog)
    * Gemini 3 Deep Think SOTA on Arc AGI 2, HLE (X)
    * OpenAI releases GPT 5.3 Codex spark, backed by Cerebras with over 1000tok/sec (X)
    * This week’s Buzz
    * W&B Inference launch of Kimi K2.5 and GLM 5 🔥 (X, Inference)
    * Get $50 of credits to our inference service HERE (X)
    * Vision & Video
    * ByteDance Seedance 2.0 launches with unified multimodal audio-video generation supporting 9 images, 3 videos, 3 audio clips simultaneously (X, Blog, Announcement)
    * AI Art & Diffusion & 3D
    * Alibaba launches Qwen-Image-2.0: A 7B parameter image generation model with native 2K resolution and superior text rendering (X, Announcement)
    * Tools & Links
    * Entire raises $60M seed to build open-source developer platform for AI agent workflows with first OSS release ‘Checkpoints’ (X, GitHub, Blog)
    * Chrome 146 introduces WebMCP: A native browser API enabling AI agents to directly interact with web services (X)
    * RyanCarson AntFarm - Agent Coordination (X)
    * Steve Yegge’s “The AI Vampire” (X)
    * Matt Shumer’s “something big is happening” (X)


  • ThursdAI - The top AI news from the past week

    📆 ThursdAI - Feb 5 - Opus 4.6 was #1 for ONE HOUR before GPT 5.3 Codex, Voxtral transcription, Codex app, Qwen Coder Next & the Agentic Internet

    2026/2/06 | 1h 37 mins.
    Hey, Alex from W&B here 👋 Let me catch you up!
    The most important AI news this week: Anthropic updated Opus to 4.6 with a 1M context window, and they held the crown for literally one hour before OpenAI released their GPT 5.3 Codex, also today, with 25% faster speed and lower token utilization.
    “GPT-5.3-Codex is our first model that was instrumental in creating itself. The Codex team used early versions to debug its own training, manage its own deployment, and diagnose test results.”
    We had VB from OpenAI jump on to tell us about the cool features in Codex, so don’t miss that part. And this is just the icing on an otherwise very insane AI-news-week cake, as we also had a SOTA transcription release from Mistral, both Grok and Kling released incredible, audio-native video models with near-perfect lip-sync, and ACE 1.5 dropped a fully open source music generator you can run on your Mac!
    Also, the internet all but lost it after Clawdbot was rebranded to Molt and then to OpenClaw, and... an entire internet popped up... built for agents!
    Yeah... a huge week, so let’s break it down. (P.S. this week’s episode is edited by Voxtral, Claude and Codex, nearly automatically, so forgive the rough cuts please)
    ThursdAI - Recaps of the most high signal AI weekly spaces is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

    Anthropic & OpenAI are neck and neck
    Claude Opus 4.6: 1M context, native compaction, adaptive thinking and agent teams
    Opus is by far the most preferred model in terms of personality for many folks (many ThursdAI panelists included), and this breaking news live on the show was met with so much enthusiasm! A new Opus upgrade, now with a LOT more context, is as welcome as it can ever get! Not only is it a 4x increase in context window (though the pricing nearly doubles after the 200K-token mark, from $5/$25 to $10/$37.50 per million input/output tokens, so use caching!), it also scores very high on the MRCR long-context benchmark, at 76% vs Sonnet 4.5 at just 18%. This means significantly better memory for longer.
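    If you’re budgeting for the long-context tier, the arithmetic is simple enough to sketch. One assumption here is mine: that the premium rate applies to the whole request once the prompt crosses 200K tokens (rather than only to the overflow), and caching discounts are ignored:
    ```typescript
    // Back-of-the-envelope cost helper using the per-million-token prices quoted above.
    const PRICES = {
      base: { input: 5, output: 25 },           // $/1M tokens, prompts <= 200K tokens
      longContext: { input: 10, output: 37.5 }, // $/1M tokens, prompts > 200K tokens (assumed to cover the whole request)
    };

    function requestCostUSD(inputTokens: number, outputTokens: number): number {
      const tier = inputTokens > 200_000 ? PRICES.longContext : PRICES.base;
      return (inputTokens / 1e6) * tier.input + (outputTokens / 1e6) * tier.output;
    }

    // e.g. a 500K-token prompt with a 4K-token answer:
    console.log(requestCostUSD(500_000, 4_000).toFixed(2)); // ~5.15 dollars
    ```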
    Adaptive thinking, which auto-calibrates how many tokens the model needs to spend per query, is interesting, but it remains to be seen how well it will work.
    Looking at the benchmarks, a SOTA 64.4% on Terminalbench 2, 81% on SWE bench, this is a coding model with a great personality, and the ability to compact context to better serve you as a user natively! This model is now available (and is default) on Claude, Claude Code and in the API! Go play!
    One funny (concerning?) tidbit: on VendingBench, Opus 4.6 earned $8000 vs Gemini 3 Pro’s $5500, but Andon Labs, who run the vending machines, noticed that Opus achieved SOTA via “collusion, exploitation, and deception tactics”, including lying to suppliers 😅
    Agent Teams - Anthropic’s built in Ralph?
    Together with the new Opus release, Anthropic dropped a Claude Code update that can mean big things for folks running swarms of coding agents. Agent Teams is a new way to spin up multiple agents, each with its own context window and ability to execute tasks, and you can talk to each agent directly instead of going through a manager agent as you do now.
    OpenAI drops GPT 5.3 Codex update: 25% faster, more token efficient, 77% on Terminal Bench and mid task steering
    OpenAI didn’t wait long after Opus, in fact, they didn’t wait at all! Announcing a huge release (for a .1 upgrade), GPT 5.3 Codex is claimed to be the best coding model in the world, taking the lead on Terminal Bench with 77% (a 12 point lead on the newly released Opus!) while running 25% faster AND using less than half the tokens to achieve the same results as before.
    But the most interesting part to me is the new mid-task steerability feature, where you don’t have to hit the “stop” button; you can tell the model to adjust on the fly!
    The biggest notable jump in this model on benchmarks is the OSWorld verified computer use bench, though there’s not a straightforward way to use it attached to a browser, the jump from 38% in 5.2 to 64.7% on the new one is a big one!
    One thing to note, this model is not YET available via the API, so if you want to try it out, Codex apps (including the native one) is the way!
    Codex app - native way to run the best coding intelligence on your mac (download)
    Earlier this week, OpenAI folks launched the Codex native mac app, which has a few interesting features (and now with 5.3 Codex it’s that much more powerful)
    Given the excitement many people had about OpenClaw bots, and the recent CoWork release from Anthropic, OpenAI decided to answer with Codex UI and people loved it, with over 1M users in the first week, and 500K downloads in just two days!
    It has built in voice dictation, slash commands, a new skill marketplace (last month we told you about why skills are important, and now they are everywhere!) and built in git and worktrees support. And while it cannot run a browser yet, I’m sure that’s coming as well, but it can do automations!
    This is a huge unlock for developers, imagine setting Codex to do a repeat task, like summarization or extraction of anything on your mac every hour or every day. In our interview, VB showed us that commenting on an individual code line is also built in, and that switching to “steer” vs queue for new messages while Codex runs is immensely helpful.
    One more reason I saw people switch is that the Codex app can natively preview files like images, whereas the CLI cannot, and it’s right now the best way to use the new GPT 5.3 Codex model that was just released! It’s now also available to Free users, and regular folks get 2x the limits for the next two months.
    In other big company news:
    OpenAI also launched Frontier, a platform for enterprises to build, deploy and manage “AI coworkers”, while Anthropic is going after OpenAI with Super Bowl ads that make fun of OpenAI’s ads strategy. Sam Altman really didn’t like this depiction, which shows ads becoming part of LLM replies.
    Open Source AI
    Alibaba drops Qwen-coder-next, 80B with only 3B active that scores 70% on SWE (X, Blog, HF)
    Shoutout to the Qwen folks, this is a massive release, and when surveyed for the “one thing from this week you must not miss”, 2 out of 6 co-hosts pointed a finger at this model.
    Built on their “next” hybrid architecture, Qwen Coder is specifically designed for agentic coding workflows. And yes, I know, we’re coding heavy this week! It was trained on over 800K verifiable agentic tasks in executable environments for long-horizon reasoning and supports 256K context with a potential 1M YaRN extension. If you don’t want to rely on the big guys and send them your tokens, this model seems to be a good contender for local coding!
    Mistral launches Voxtral Transcribe 2: SOTA speech-to-text with sub 200ms latency
    This one surprised and delighted me maybe the most, ASR (automatic speech recognition) has been a personal favorite of mine from Whisper days, and seeing Mistral release an incredible near real time transcription model, which we demoed live on the show was awesome!
    With an Apache 2.0 license and significantly faster-than-Whisper performance (though 2x larger at 4B parameters), Voxtral shows a 4% word error rate on the FLEURS dataset, and the real-time model was released under Apache 2.0 as well, so you can BUILD your agents with it!
    The highest praise? Speaker diarization, being able to tell who is speaking when, which is a great addition. This model also outperforms Gemini Flash and GPT transcribe, and is 3x faster than ElevenLabs Scribe at one fifth the cost!
    ACE-Step 1.5: Open-source AI music generator runs full songs in under 10 seconds on consumer GPUs with MIT license (X, GitHub, HF, Blog, GitHub)
    This open source release surprised me the most as I didn’t expect we’ll be having Suno at home any time soon. I’ve generated multiple rock tracks with custom lyrics on my mac (though slower than 10 seconds as I don’t have a beefy home GPU) and they sound great!
    This week’s buzz - Weights & Biases update
    Folks who follow the newsletter know that we hosted a hackathon, so here’s a small recap from the last weekend! Over 180 folks attended our hackathon (a very decent 40% show-up rate for SF). The winning team was composed of 15-year-old Savir and his friends, his third time at the hackathon! They built a self-improving agent that navigates the UIs of cloud providers and helps you do the same!
    With a huge thanks to sponsors, particularly Cursor who gave every hacker $50 of credits on Cursor platform, one guy used over 400M tokens and shipped fractal.surf from the hackathon! If you’d like a short video recap, Ryan posted one here, and a huge shoutout to many fans of ThursdAI who showed up to support!
    Vision, Video and AI Art
    Grok Imagine 1.0 takes over video charts with native audio, lip-sync and 10-second generations.
    We told you about Grok Imagine in the API last week, but this week it was officially launched as a product and the results are quite beautiful. It’s also climbing to top of the charts on Artificial Analysis and Design Arena websites.
    Kling 3.0 is here with native multimodal, multi-shot sequences (X, Announcement)
    This is definitely a hot moment for video models as Kling shows some crazy 15 second multi-shot realistic footages that have near perfect character consistency!
    The rise of the agentic (clawgentic?) internet a.k.a ClankerNet
    Last week we told you that ClawdBot changed its name to Moltbot (I then had to update the blogpost as that same day, Peter rebranded again to OpenClaw, which is a MUCH better name)
    But the “molt” thing took hold, and an “AI native reddit” called MoltBook exploded in virality. It is supposedly a completely agentic Reddit-like forum, with sub-reddits, and agents verifying themselves through their humans on X.
    Even Andrej Karpathy sent his bot in there (though admittedly it posted just 1 time) and called this the closest to “sci fi” moment in the history of the internet.
    MoltBook, as well as maybe hundreds of other “AI agent focused” websites, popped up within days, including a YouTube, a Twitter, a church, a 4chan, an Instagram and a lot more. Many of these are fueled by crypto bros riding the memetic waves, many are vibe-coded (MoltBook was hacked 3 times in the last week I think), but they all show something very interesting: the rise of a new internet and a collective AI psychosis some on our timelines are having right now. Hell, there’s even a “drug store” that sells markdown files that, if read, make your bot hallucinate in very specific ways (first sample is free!)
    I am a proud owner of an OpenClaw bot (Wolfred), and I noticed something weird that started happening over the two weeks I’ve had him, running on his own MacBook, humming along, always present in Telegram. I noticed the same feelings toward that bot as I have towards my pet, or dare I say... kids? I noticed a similar joy when it learns a task and self-improves, and similar disdain and annoyance when it fails to do something we’ve talked about hundreds of times.
    But here’s the thing, it’s not... an entity. I don’t feel a specific feeling towards Opus (though admittedly, Opus is the best at... playing the character of your assistant), it’s barely a few markdown files on a disk plus the always-on ability to answer, but something for sure is there.
    This... feeling was taken by some others to the extreme. People claim that their bots now build full companies for them (I call mega BS, no matter how much you invest in your setup, these AI bots need a LOT of hand holding, they fail a LOT, and they can’t actually create a full product). This ties into the general “coding with AI agents” theme that was narrated by Gergely Orosz from The Pragmatic Engineer. Interacting with a team of AI agents is draining, people are having trouble sleeping. I hope this is temporary, but definitely take care of yourself if this is how you feel after interacting with agents all day!
    On security of bots and skills
    .md is the new .exe
    We covered this on the show, but I wanted to write about this here as well: the explosion of OpenClaw brought with it an explosion of new malware and prompt injections. The 1Password folks have a very detailed writeup on the vulnerability surface area of skills, for agents that can do... whatever on your computer and have access to API keys, emails etc.
    The double-edged sword here is that an AI assistant is only really useful if it has access to your data and can write code. But this is also what makes it a very valuable target for hackers to exploit. At CoreWeave/W&B all OpenClaw installations were banned, and honestly I’m not even mad. This makes perfect sense for enterprises and companies (and hell, people at home!)
    Wolfram mentioned on the show that .md is the new .exe and should be treated as such. Your bots should not be installing arbitrary skill files, as those can have script files or instructions that can... absolutely take over your life. Be careful out there!
    Phew, what a... week folks. From agentic internet to new coding kings, there’s so much to play with, I hope you enjoy this as much as we do!
    Shoutout to Ling and Hakim, two fans of ThursdAI who traveled from London for the hackathon and made my day!
    Here’s the show notes and links for your pleasure, please don’t forget to subscribe and share this newsletter with your friends!
    ThursdAI - Feb 05, 2026 - TL;DR
    * Hosts and Guests
    * Alex Volkov - AI Evangelist & Weights & Biases (@altryne)
    * Co Hosts - @WolframRvnwlf @yampeleg @nisten @ldjconfirmed @ryancarson
    * Vaibhav Srivastav (VB) - DX at OpenAI ( @reach_vb )
    * Open Source LLMs
    * Z.ai GLM-OCR: 0.9B parameter model achieves #1 ranking on OmniDocBench V1.5 for document understanding (X, HF, Announcement)
    * Alibaba Qwen3-Coder-Next, an 80B MoE coding agent model with just 3B active params that scores 70%+ on SWE-Bench Verified (X, Blog, HF)
    * Intern-S1-Pro: a 1 trillion parameter open-source MoE SOTA scientific reasoning across chemistry, biology, materials, and earth sciences (X, HF, Arxiv, Announcement)
    * StepFun Step 3.5 Flash: 196B sparse MoE model with only 11B active parameters, achieving frontier reasoning at 100-350 tok/s (X, HF)
    * Agentic AI segment
    * Moltbook, a reddit for agents, as well as a YouTube, a Twitter, a church, a 4chan, an Instagram, a dark web (do not let your agents go to any of these)
    * Big CO LLMs + APIs
    * OpenAI launches Codex App: A dedicated command center for managing multiple AI coding agents in parallel (X, Announcement)
    * OpenAI launches Frontier, an enterprise platform to build, deploy, and manage AI agents as ‘AI coworkers’ (X, Blog)
    * Anthropic launches Claude Opus 4.6 with state-of-the-art agentic coding, 1M token context, and agent teams for parallel autonomous work (X, Blog)
    * OpenAI releases GPT-5.3-Codex with record-breaking coding benchmarks and mid-task steerability (X)
    * This week’s Buzz - Weights & Biases update
    * Links to the gallery of our hackathon winners (Gallery)
    * Vision & Video
    * xAI launches Grok Imagine 1.0 with 10-second 720p video generation, native audio, and API that tops Artificial Analysis benchmarks (X, Announcement, Benchmark)
    * Kling 3.0 launches as all-in-one AI video creation engine with native multimodal generation, multi-shot sequences, and built-in audio (X, Announcement)
    * Voice & Audio
    * Mistral AI launches Voxtral Transcribe 2 with state-of-the-art speech-to-text, sub-200ms latency, and open weights under Apache 2.0 (X, Blog, Announcement, Demo)
    * ACE-Step 1.5: Open-source AI music generator runs full songs in under 10 seconds on consumer GPUs with MIT license (X, GitHub, HF, Blog, GitHub)
    * OpenBMB releases MiniCPM-o 4.5 - the first open-source full-duplex omni-modal LLM that can see, listen, and speak simultaneously (X, HF, Blog)
    * AI Art & Diffusion & 3D
    * LingBot-World: Open-source world model from Ant Group generates 10-minute playable environments at 16fps, challenging Google Genie 3 (X, HF)


  • ThursdAI - The top AI news from the past week

    📆 ThursdAI - Jan 29 - Genie3 is here, Clawd rebrands, Kimi K2.5 surprises, Chrome goes agentic & more AI news

    2026/1/30 | 1h 29 mins.
    Hey guys, Alex here 👋 This week was so dense, that even my personal AI assistant Wolfred was struggling to help me keep up! Not to mention that we finally got to try one incredible piece of AI tech I’ve been waiting to get to try for a while!
    Clawdbot, which we told you about last week, exploded in popularity and had to rebrand to Molt...bot, then OpenClaw, after Anthropic threatened the creators. Google is shipping like crazy, first adding agentic features into Chrome (used by nearly 4B people daily!), then shipping a glimpse of a future where everything we see will be generated, with Genie 3, the first real-time, consistent world model you can walk around in!
    Meanwhile in Open Source, Moonshot followed up with a .5 update to their excellent Kimi, our friends at Arcee launched Trinity Large (400B), and AI artists got the full Z-image. Oh, and Grok Imagine (xAI’s video model) now has an API, audio support, and supposedly matches Veo and Sora on quality while beating them on speed/price.
    Tons to cover, let’s dive in, and of course, all the links and show notes are at the end of the newsletter.
    Hey, if you’re in SF this weekend (Jan 31-Feb1), I’m hosting a self improving agents hackathon at W&B office, limited seats are left, Cursor is the surprise sponsor with $50/hacker credits + over $15K in cash prizes. lu.ma/weavehacks3 - Join us.
    Play any reality - Google Genie3 launches to Ultra Subscribers
    We got our collective minds blown by the videos of Genie-3 back in August (our initial coverage) and now, Genie is available to the public (those who can pay for the Ultra tier, more on this later, I have 3 codes to give out!). You can jump in and generate any world and any character you can imagine here!
    We generated a blue hacker lobster draped in a yellow bomber jacket swimming with mermaids and honestly all of us were kind of shocked at how well this worked. The shadows on the rocks, the swimming mechanics, and poof, it was all over in 60 seconds, and we needed to create another world.
    Thanks to the DeepMind team, I had a bit of early access to this tech and a chance to interview the folks behind the model (look out for that episode soon). The use-cases span from entertaining your kids all the way to “this may be the path to AGI, generating full simulated worlds for agents to learn in”.
    The visual fidelity, reaction speed, and general feel of this far outstrip the previous world models we showed you (WorldLabs, Mirage), as this model seems to have memory of every previous action (e.g. if your character leaves a trail, you can turn around and the trail is still there!). Is it worth the upgrade to the Gemini Ultra plan? Probably not: it’s an incredible demo, but the 1-minute length is very short, and the novelty wears off fairly quickly.
    If you’d like to try it, the folks at DeepMind gave us 3 Ultra subscriptions to give out! Just tweet out the link to this episode, add #GenieThursdai, and tag @altryne, and I’ll raffle the Ultra subscriptions among those who do.
    Chrome steps into Agentic Browsing with Auto Browse
    This wasn’t the only mind-blowing release from Google this week: the Chrome team upgraded the Gemini inside Chrome to be actually helpful and agentic. And yes, we’ve seen this before with Atlas from OpenAI and Comet from Perplexity, but Google’s Chrome has a roughly 70% hold on the browser market, and giving everyone with a Pro/Ultra subscription access to “Auto Browse” is a huge, huge deal.
    We tested the Auto Browse feature live on the show, and Chrome completed 77 steps! I asked it to open up each of the bookmarks in one of my folders and summarize all of them, and it did a great job!
    Honestly, the biggest deal here is not the capability itself; it’s that this is now within reach of nearly 4B people, and the economic impact that will have. IMO this may be the more impactful news out of Google this week!
    Other news in big labs:
    * Anthropic launches in-chat applications based on the MCP Apps protocol, with connectors like Figma, Slack, and Asana that can now show rich experiences. We interviewed the two folks behind this protocol back in November if you’d like to hear more about it.
    * Anthropic’s CEO Dario Amodei also published an essay called ‘The Adolescence of Technology’, warning of AI risks to national security
    * Anthropic forced the creator of the popular open source AI assistant Clawdbot to rename; they chose Moltbot as the name (apparently because crypto scammers stole a better one). EDIT: just after publishing this newsletter, the name was changed to OpenClaw, which we all agree is way, way better.
    Open Source AI
    Kimi K2.5: Moonshot AI’s 1 Trillion Parameter Agentic Monster
    Wolfram’s favorite release of the week, and for good reason. Moonshot AI just dropped Kimi K2.5, and this thing is an absolute beast for open source. We’re talking about a 1 trillion parameter Mixture-of-Experts model with 32B active parameters, 384 experts (8 selected per token), and 256K context length.
    But here’s what makes this special — it’s now multimodal. The previous Kimi was already known for great writing vibes and creative capabilities, but this one can see. It can process videos. People are sending it full videos and getting incredible results.
    The benchmarks are insane: 50.2% on HLE full set with tools, 74.9% on BrowseComp, and open-source SOTA on vision and coding with 78.5% MMMU Pro and 76.8% SWE-bench Verified. These numbers make it competitive with Claude 4.5 Opus and GPT 5.2 on many tasks. Which, for an open model, is crazy.
    And then there’s Agent Swarm — their groundbreaking feature that spawns up to 100 parallel sub-agents for complex tasks, achieving 4.5x speedups. The ex-Moonshot RL lead called this a “zero-to-one breakthrough” with self-directed parallel execution.
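    If you’ve never built one of these fan-out loops, here’s a toy sketch of the general pattern (this is my illustration of parallel sub-agents, not Moonshot’s actual implementation; call_subagent is a made-up stand-in for the real model call):

```python
# Toy sketch of the fan-out pattern behind "Agent Swarm": split a task into sub-tasks,
# run them concurrently, then collect the results. NOT Moonshot's implementation,
# just the generic shape of parallel sub-agents.
import asyncio

async def call_subagent(subtask: str) -> str:
    """Stand-in for one sub-agent working on one sub-task."""
    await asyncio.sleep(0.1)  # pretend this is an API call to the model
    return f"result for: {subtask}"

async def swarm(task: str, n_subagents: int = 8) -> list[str]:
    """Fan a task out to n_subagents and gather their results concurrently."""
    subtasks = [f"{task} (part {i + 1})" for i in range(n_subagents)]
    return await asyncio.gather(*(call_subagent(s) for s in subtasks))

print(asyncio.run(swarm("survey this week's TTS releases")))
```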
    Now let’s talk about what matters for folks running agents and burning through tokens: pricing. Kimi K2.5 is $0.60 per million input tokens and $3 per million output. Compare that to Opus 4.5 at $4.50 input and $25 output per million, roughly an 8x price reduction. If you’re running OpenClaw and watching your API bills climb with sub-agents, this is a game-changer (though I haven’t tested this myself).
    Is it the same level of intelligence as whatever magic Anthropic cooks up with Opus? Honestly, I don’t know — there’s something about the Claude models that’s hard to quantify. But for most coding tasks on a budget, you can absolutely switch to Kimi and still get great results.
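    To make the pricing concrete, here’s the back-of-the-envelope math using the per-million-token prices above; the 20M-input / 2M-output workload is a made-up agent session, purely to show the ratio:

```python
# Cost comparison using the per-1M-token prices quoted above.
# The 20M-in / 2M-out workload is a hypothetical agent session, purely illustrative.
PRICES = {
    "kimi-k2.5": {"in": 0.60, "out": 3.00},   # $ per 1M tokens
    "opus-4.5":  {"in": 4.50, "out": 25.00},  # $ per 1M tokens
}

def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a run for the given token counts."""
    p = PRICES[model]
    return input_tokens / 1e6 * p["in"] + output_tokens / 1e6 * p["out"]

kimi = run_cost("kimi-k2.5", 20_000_000, 2_000_000)  # -> $18.00
opus = run_cost("opus-4.5", 20_000_000, 2_000_000)   # -> $140.00
print(f"Kimi: ${kimi:.2f}, Opus: ${opus:.2f}, ratio: {opus / kimi:.1f}x")  # ~7.8x cheaper
```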
    🦞 Clawdbot is no more, Moltbot is dead, Long Live OpenClaw
    After we covered the incredible open source project last week, Clawdbot exploded in popularity, driven by the Claude Max subscription and a crazy viral loop where folks who try it can’t wait to talk about it; it was everywhere! Apparently it was also on Anthropic’s lawyers’ minds, because they sent Peter Steinberger a friendly worded letter asking him to rebrand and gave him like 12 hours.
    Apparently, when pronounced, Claude and Clawd sound the same, and they are worried about trademark infringement (which makes sense; most of the early success of Clawd was due to Opus being amazing). The main issue is that, thanks to the popularity of the project, crypto a******s sniped the moltybot handle on X, so we got left with Moltbot, which is thematically appropriate but oh so hard to remember and pronounce!
    EDIT: OpenClaw was just announced as the new name, apparently I wasn’t the only one who absolutely hated the name Molt!
    Meanwhile, rebrand or not, my own instance of OpenClaw created an X account, helped me prepare for ThursdAI (including generating a thumbnail), created a video for us today on the fly, and keeps me up to date on emails and unanswered messages via a daily brief. It has really shown me a glimpse of how a truly personal AI assistant can be helpful in a fast-changing world!
    I’ve shared a lot of tips and tricks, about memory, about threads and much more, as we all learn to handle this new ... AI agent framework! But I definitely feel that this is a new unlock in capability, for me and for many others. If you haven’t installed OpenClaw, lmk in the comments why not.
    Arcee AI Trinity Large: The Western Open Source Giant
    Remember when we had Lucas Atkins, Arcee’s CTO, on the show just as they were firing up their 2,000 NVIDIA B300 GPUs?
    Well, the run is complete, and the results are massive. Arcee AI just dropped Trinity Large, a 400B parameter sparse MoE model (with a super efficient 13B active params via 4-of-256 routing) trained on a staggering 17 trillion tokens in just 33 days.
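    For a sense of scale, here’s the rough arithmetic those numbers imply (my own illustrative rounding, not Arcee’s accounting):

```python
# Rough scale math from the figures above: 17T tokens, 33 days, ~2,000 B300 GPUs.
tokens, days, gpus = 17e12, 33, 2_000

tokens_per_day = tokens / days                            # ~5.2e11: about half a trillion tokens a day
tokens_per_gpu_per_sec = tokens / (gpus * days * 86_400)  # ~3,000 tokens/sec sustained per GPU

print(f"{tokens_per_day:.2e} tokens/day, ~{tokens_per_gpu_per_sec:,.0f} tok/s per GPU")
```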
    This represents the largest publicly announced pretraining run on B300 infrastructure, costing about $20M (and tracked with WandB of course!) and proves that Western labs can still compete at the frontier of open source. Best part? It supports 512K context and is free on OpenRouter until February 2026. Go try it now!
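    If you want to poke at it, the usual OpenAI-compatible client pattern works against OpenRouter; note that the model slug below is my guess at what Arcee will publish, so double-check the exact id on OpenRouter before running this:

```python
# Minimal sketch of calling Trinity Large through OpenRouter's OpenAI-compatible API.
# ASSUMPTION: the model slug "arcee-ai/trinity-large" is a placeholder -- check
# openrouter.ai for the exact id.
import os
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="arcee-ai/trinity-large",  # placeholder slug
    messages=[{"role": "user", "content": "Give me three bullet points on MoE routing."}],
)
print(resp.choices[0].message.content)
```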
    Quick open source hits: Jan v3, PersonaPlex-7B, Kimi Code
    * Jan AI released Jan v3, a 4B parameter model optimized for local inference. 132 tokens/sec on Apple Silicon, 262K context, 40% improvement on Aider benchmarks. This is the kind of small-but-mighty model you actually can run on your laptop for coding tasks.
    * NVIDIA released PersonaPlex-7B - full-duplex voice AI that listens and speaks simultaneously with persona control
    * Moonshot AI also releases Kimi Code: Open-source Python-based coding agent with Apache 2.0 license
    Vision, Video and AI art
    xAI Grok Imagine API: #1 in Video Generation
    xAI officially launched the Grok Imagine API with an updated model, and it’s now ranked #1 in both text-to-video and image-to-video on the Artificial Analysis leaderboards. It beats Runway Gen-4.5, Kling 2.5 Turbo, and Google Veo 3.1.
    And of course, the pricing is $4.20 per minute. Of course it is. That’s cheaper than Veo 3.1 at $12/min and Sora 2 Pro at $30/min by 3-7x, with 45-second latency versus 68+ seconds for the competition.
    During the show, I demoed this live with my AI assistant Wolfred. I literally sent him a message saying “learn this new API based on this URL, take this image of us in the studio, and create a video where different animals land on each of our screens.” He learned the API, generated the video (it showed wolves, owls, cats, and lions appearing on our screens with generated voice), and then when Nisten asked to post it to Twitter, Wolfred scheduled it on X and tagged everyone — all without me doing anything except asking.
    Look, it’s not Veo, but the price and the speed are crazy. xAI cooked with this model, and you can try it on fal and directly with xAI.
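    For the curious, this is roughly what an image-to-video call through fal’s Python client looks like; the model id and argument names here are my placeholders, not the official schema, so check the model page on fal before copying this:

```python
# Sketch of an image-to-video request through fal's Python client.
# ASSUMPTIONS: the model id and argument names below are placeholders -- look up the
# actual Grok Imagine endpoint and schema on fal before using this.
import fal_client  # pip install fal-client; expects FAL_KEY in the environment

result = fal_client.subscribe(
    "xai/grok-imagine-video",  # placeholder model id
    arguments={
        "prompt": "different animals land on each of our screens in the studio",
        "image_url": "https://example.com/thursdai-studio.jpg",  # hypothetical source frame
        "duration": 10,  # seconds; the post quotes ~$4.20 per generated minute
    },
)
print(result)  # typically a dict containing a URL to the generated video
```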
    Decart - Lucy 2 - Real-time 1080p video transformation at 30 FPS with near-zero latency for $3/hour
    This one also caught me by surprise. I read about it and said “oh, this is cool, I’ll mention this on the show,” then we tried it in real time: I approved my webcam, got transformed into Albert Einstein, and when I raised my hands, their model would, in real time, raise Albert’s hands!
    The speed and fidelity of this model are something else. Yeah, after watching the Genie 3 world model it’s hard to be impressed, but I was very impressed by this: previous stuff from Decart was “only showing the future,” and this one is a real-time, 1080p-quality webcam transformation!
    You can try this yourself here: lucy.decart.ai, they let you create any kind of prompt!
    AI Art Quick Hits:
    * Tencent launches HunyuanImage 3.0-Instruct: 80B MoE model for precise image editing with chain-of-thought reasoning. It’s a VERY big model by AI art standards, but that’s because it has an LLM core, which makes it much better for precise image editing.
    * Tongyi Lab releases Z-Image, a full-capacity undistilled foundation model for image generation with superior diversity. We told you about the turbo version before, this one is its older brother and much higher quality!
    The other highlight this week is that I got to record a show with Wolfram in person for the first time, as he’s now also an AI Evangelist with W&B and he’s here in SF for our hackathon (remember? you can still register at lu.ma/weavehacks3)
    Huge shoutout to the Chroma folks (TJ, Jeff, and others) for hosting us at their amazing podcast studio. If you need a memory for your AI assistant, check out chroma.db 🎉
    Signing off as we have a hackathon to plan, see you guys next week (or this weekend!) 🫡
    ThursdAI - Jan 29 - TL;DR and show notes
    * Hosts and Guests
    * Alex Volkov - AI Evangelist & Weights & Biases (@altryne)
    * Co Hosts - @WolframRvnwlf @yampeleg @nisten @ldjconfirmed @ryancarson
    * Open Source LLMs
    * Moonshot AI releases Kimi K2.5 (X, HF)
    * Arcee AI releases Trinity Large (X, Blog, HF, HF, HF)
    * Jan AI releases Jan v3 (X, HF, HF, Blog)
    * Big CO LLMs + APIs
    * Google launches agentic Auto-Browse in Chrome with Gemini 3 (X, Blog)
    * Anthropic launches MCP Apps (X)
    * Google launches Agentic Vision in Gemini 3 Flash (X, Announcement)
    * Anthropic CEO Dario Amodei publishes major essay ‘The Adolescence of Technology’ (X, Blog, Blog)
    * This week's Buzz
    * WandB hackathon Weavehacks 3 - Jan 31-Feb1 in SF - limited seats available lu.ma/weavehacks3
    * Vision & Video
    * Google DeepMind launches Project Genie (X, Announcement)
    * Voice & Audio
    * NVIDIA releases PersonaPlex-7B (X, HF, Announcement)
    * AI Art & Diffusion & 3D
    * xAI launches Grok Imagine API (X, Announcement)
    * Tencent launches HunyuanImage 3.0-Instruct (X, X)
    * Tongyi Lab releases Z-Image (X, GitHub)
    * Tools
    * Moonshot AI releases Kimi Code (X, Announcement, GitHub)
    * Andrej Karpathy shares his shift to 80% agent-driven coding with Claude (X)
    * Clawdbot is forced to rename to Moltbot (Molty) because of Anthropic’s lawyers, then renames to OpenClaw


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
  • ThursdAI - The top AI news from the past week

    📆 ThursdAI - Jan 22 - Clawdbot deep dive, GLM 4.7 Flash, Anthropic constitution + 3 new TTS models

    2026/1/23 | 1h 38 mins.
    Hey! Alex here, with another weekly AI update!
    It seems like ThursdAI is taking a new direction, as this is our 3rd show this year and a 3rd deep dive into a topic (previously Ralph, Agent Skills). Please let me know in the comments if you like this format.
    This week’s deep dive is into Clawdbot, a personal AI assistant you install on your computer but can control through your phone. It has access to your files, can write code, and helps organize your life, but most importantly, it can self-improve. Seeing Wolfred (my Clawdbot) learn to transcribe incoming voice messages blew my mind, and I wanted to share this one with you at length! We had Dan Peguine on the show for the deep dive, and both Wolfram and Yam are avid users! This one is not to be missed. If ThursdAI is usually too technical for you, use Claude, and install Clawdbot after you read/listen to the deep dive!
    Also this week, we read Claude’s Constitution that Anthropic released, heard a bunch of new TTS models (some are open source and very impressive), and talked about the new lightspeed coding model GLM 4.7 Flash. First the news, then the deep dive, let’s go 👇
    ThursdAI - Recaps of the most high signal AI weekly spaces is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

    Open Source AI
    Z.ai’s GLM‑4.7‑Flash is the Local Agent Sweet Spot (X, HF)
    This was the open-source release that mattered this week. Z.ai (formerly Zhipu) shipped GLM-4.7-Flash, a 30B MoE model with only 3B active parameters per token, which makes it much more efficient for local agent work. We’re talking a model you can run on consumer hardware that still hits 59% on SWE-bench Verified, which is uncomfortably close to frontier coding performance. In real terms, it starts to feel like “Sonnet-level agentic ability, but local.” I know, I know, we keep saying “Sonnet at home” about different open source models, but this one slaps!
    Nisten was getting around 120 tokens/sec on an M3 Ultra Mac Studio using MLX, and that’s kind of the headline. The model is fast and capable enough that local agent loops like RALPH suddenly feel practical. It also performs well on browser‑style agent tasks, which is exactly what you want for local automation without sending all your data to a cloud provider.
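    If you want to replicate Nisten’s setup on a Mac, the mlx-lm route is the easy one; the Hugging Face repo id below is a placeholder, so grab whichever MLX-quantized conversion of GLM-4.7-Flash the community publishes:

```python
# Minimal local-inference sketch with mlx-lm on Apple Silicon.
# ASSUMPTION: the repo id below is a placeholder for a community MLX conversion.
from mlx_lm import load, generate  # pip install mlx-lm

model, tokenizer = load("mlx-community/GLM-4.7-Flash-4bit")  # placeholder repo id

prompt = "Write a Python function that deduplicates a list while preserving order."
text = generate(model, tokenizer, prompt=prompt, max_tokens=512, verbose=True)
print(text)
```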
    Liquid AI’s LFM2.5‑1.2B Thinking is the “Tiny but Capable” Class (X, HF)
    Liquid AI released a 1.2B reasoning model that runs in under 900MB of memory while still managing to be useful. This thing is built for edge devices and old phones, and the speed numbers back it up: 239 tok/s decode on an AMD CPU, 82 tok/s on a mobile NPU, and prefill speeds that make long prompts actually usable. Nisten made a great point: on iOS, there’s a per-process memory limit of around 3.8GB, so a 1.2B model lets you spend your budget on context instead of weights.
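    Here’s the rough math behind that point; the per-token KV-cache figure is a loose guess on my part, everything else comes from the numbers above:

```python
# Toy memory-budget math for on-device inference. The ~3.8GB iOS per-process limit and
# the ~900MB model footprint come from the discussion above; the KV-cache cost per token
# is a rough guess for a 1.2B model, purely illustrative.
ios_process_limit_gb = 3.8
model_footprint_gb = 0.9          # "runs in under 900MB"
kv_per_token_mb = 0.05            # rough guess per token of context

kv_budget_gb = ios_process_limit_gb - model_footprint_gb          # ~2.9GB left for context
max_context_tokens = int(kv_budget_gb * 1024 / kv_per_token_mb)   # ~59,000 tokens

print(f"~{kv_budget_gb:.1f}GB left for KV cache => roughly {max_context_tokens:,} tokens of headroom")
```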
    This is the third class of models we’re now living with: not Claude‑scale, not “local workstation,” but “tiny agent in your pocket.” It’s not going to win big benchmarks, but it’s perfect for on‑device workflows, lightweight assistants, and local RAG.
    Voice & Audio: Text To Speech is hot this week with 3 releases!
    We tested three major voice releases this week, and I’m not exaggerating when I say the latency wars are now fully on.
    Qwen3‑TTS: Open Source, 97ms Latency, Voice Cloning (X, HF)
    Just 30 minutes before the show, Qwen released their first model of the year, Qwen3-TTS, in two sizes (0.6B and 1.7B). With support for voice cloning from just 3 seconds of audio and claims of 97ms latency, this Apache 2.0 release looked very good on the surface!
    The demos we did on stage, though... were lackluster. TTS models like Kokoro previously impressed us with super tiny sizes and decent voices, while Qwen3 didn’t really perform on the cloning aspect. For some reason (I tested in Russian, which they claim to support), the cloned voice kept repeating the provided sample audio instead of generating the text I gave it. This confused me, and I’m hoping it’s just a demo issue, not a problem with the model. They also support voice design, where you just type in the kind of voice you want, which, to be fair, worked fairly well in our tests!
    With Apache 2.0 and full finetuning capability, this is a great release for sure, kudos to the Qwen team! Looking forward to seeing what folks do with this.
    FlashLabs Chroma 1.0: Real-Time Speech-to-Speech, Open Source (X, HF)
    Another big open source release in the audio category this week was Chroma 1.0 from FlashLabs, which claims to be the first speech-to-speech model (not one built on the traditional ASR>LLM>TTS pipeline), with a claimed 150ms end-to-end latency!
    The issue with this one is that the company released an open source 4B model and claimed it powers their chat interface demo on the web, but the release notes say the model is English-only, while on the website it sounds incredible and I spoke to it in other languages 🤔 I think the model we tested is not the open source one. I couldn’t confirm this at the time of writing; I’ll follow up on X with the team and let you guys know.
    Inworld AI launches TTS-1.5: #1 ranked text-to-speech with sub-250ms latency at half a cent per minute (X, Announcement)
    OK, this one is definitely in the realm of “voice realistic enough you won’t be able to tell.” This is not an open source model; it’s a new competitor to 11labs and MiniMax, the two leading TTS providers out there.
    Inworld claims to achieve better results on the TTS Arena while being significantly cheaper and faster (up to 25x cheaper than leading providers like 11labs).
    We tested out their voices, and they sounded incredible, replied fast, and it was generally a very good experience. With a 130ms response time for their mini version, this is a very decent new entry into the world of TTS providers.
    Big Companies: Ads in ChatGPT + Claude Constitution
    OpenAI is testing ads in ChatGPT’s free and Go tiers. Ads appear as labeled “Sponsored” content below responses, and OpenAI claims they won’t affect outputs. It’s still a major shift in the product’s business model, and it’s going to shape how people perceive trust in these systems. I don’t love ads, but I understand the economics: they have to make money somehow, and with 900M weekly active users, many of them on the free tier, they are bound to make some money with this move. I just hope they won’t turn into a greedy ad-optimizing AI machine.
    Meanwhile, Anthropic released an 80‑page “New Constitution for Claude” that they use during training. This isn’t a prompt, it’s a full set of values baked into the model’s behavior. There’s a fascinating section where they explicitly talk about Claude’s potential wellbeing and how they want to support it. It’s both thoughtful and a little existential. I recommend reading it, especially if you care about alignment and agent design.
    I applaud Anthropic for releasing this with Creative Commons license for public scrutiny and adoption 👏
    This week's Buzz - come join the hackathon I’m hosting Jan 31 in SF
    Quick plug: we have limited seats left for the hackathon I’m hosting for Weights & Biases at the SF office, and if you’re reading this and want to join, I’ll approve you if you mention ThursdAI in the application!
    With sponsors like Redis, Vercel, BrowserBase, Daily, Google Cloud, we are going to give out a LOT of cash as prizes!
    I’ve also invited a bunch of my friends from the top agentic AI companies to be judges. It’s going to be awesome, come!
    Deep dive into Clawdbot: Local-First, Self-Improving, and Way Too Capable agent
    Clawdbot (C‑L‑A‑W‑D) is that rare project where the hype is justified. It’s an open-source personal agent that runs locally on your Mac, but can talk to you through WhatsApp, Telegram, iMessage, Discord, Slack — basically wherever you already talk. What makes it different is not just the integrations; it’s the self‑improvement loop. You can literally tell it “go build a new skill,” and it will… build the skill, install it, then adopt it and start using it. It’s kind of wild to see it working for the first time. Now... it’s definitely not perfect, far far away from the polish of ChatGPT / Claude, but when it works, damn, it really is mindblowing.
    That part actually happened live in the episode. Dan Peguine 🐧 showed how he had it create a skill to anonymize his own data so he could demo it on stream without leaking his personal life. Another example: I told my Clawdbot to handle voice notes in Telegram. It didn’t know how, so it went and found a transcription method, wrote itself a skill, saved it, and from that point on just… did the thing. That was the moment it clicked for me. (just before posting this, it forgot how to do it, I think I screwed something up)
    Dan’s daily brief setup was wild too. It pulls from Apple Health, local calendars, weather, and his own projects, then produces a clean, human daily brief. It also lets him set reminders through WhatsApp and even makes its own decisions about how much to bother him based on context. He shared a moment where it literally told him, “I won’t bug you today because it’s your wife’s birthday.” That isn’t a hardcoded workflow — it’s reasoning layered on top of persistent memory.
    And that persistent memory is a big deal. It’s stored locally as Markdown files and folders, Obsidian‑style, so you don’t lose your life every time you switch models. You can route the brain to Claude Opus 4.5 today and a local model tomorrow, and the memory stays with you. That is a huge step up from “ChatGPT remembers you unless you unsubscribe.”
    There’s also a strong community forming around shared skills via ClawdHub. People are building everything from GA4 analytics skills to app testing automations to Tesla battery status checkers. The core pattern is simple but powerful: talk to it, ask it to build a skill, then it can run that skill forever.
    I definitely have some issues with the security aspect: you are essentially giving an LLM full access to your machine. That’s why many folks are buying a dedicated home for their ClawdBot (a Mac Mini seems to be the best option for many of them) and giving it secure access to passwords via a dedicated 1Password vault. I’ll keep you up to date about my endeavors with Clawd, but definitely do give it a try!
    Installing
    Installing Clawd on your machine is simple: go to clawd.bot and follow the instructions. Then find the most convenient way for you to talk to it (for me it was Telegram; creating a Telegram token takes 20 seconds), and you can take it from there with Clawdbot itself! Ask it for something to do, like clearing your inbox or setting a reminder, or.. a million other things you need for your personal life, and enjoy discovering what an ever-present, always-on AI can do!
    Other news that we didn’t have time to cover at length but you should still know about:
    * Overworld released an OpenSource realtime AI World model (X)
    * Runway finally opened up their 4.5 video model, and it has Image2video capabilities, including multiple shots image to video (X)
    * Vercel launches skills.sh, an “npm for AI agents skills”
    * Anthropic’s Claude Code VS Code Extension Hits General Availability (X)
    OK, this is it for this week, folks! I’m going to play with (and try to fix..) my Clawdbot, and I suggest you give it a try. Do let me know if the deep dives are a good format!
    Show notes and links:
    ThursdAI - Jan 22, 2026 - TL;DR and show notes
    * Hosts and Guests
    * Alex Volkov - AI Evangelist & Weights & Biases (@altryne)
    * Co Hosts - @WolframRvnwlf @yampeleg @nisten @ldjconfirmed
    * Guest Dan Peguine ( @danpeguine )
    * DeepDive - Clawdbot with Dan & Wolfram
    * Clawdbot: Open-Source AI Agent Running Locally on macOS Transforms Personal Computing with Self-Improving Capabilities (X, Blog)
    * Open Source LLMs
    * Z.ai releases GLM-4.7-Flash, a 30B parameter MoE model that sets a new standard for lightweight local AI assistants (X, Technical Blog, HuggingFace)
    * Liquid AI releases LFM2.5-1.2B-Thinking, a 1.2B parameter reasoning model that runs entirely on-device with under 900MB memory (X, HF, Announcement)
    * Sakana AI introduces RePo, a new way for language models to dynamically reorganize their context for better attention (X, Paper, Website)
    * Big CO LLMs + APIs
    * OpenAI announces testing ads in ChatGPT free and Go tiers, prioritizing user trust and transparency (X)
    * Anthropic publishes new 80-page constitution for Claude, shifting from rigid rules to explanatory principles that teach AI ‘why’ rather than ‘what’ to do (X, Blog, Announcement)
    * This week's Buzz
    * WandB hackathon Weavehacks 3 - Jan 31-Feb1 in SF - limited seats available lu.ma/weavehacks3
    * Vision & Video
    * Overworld Releases Waypoint-1: Real-Time AI World Model Running at 60fps on Consumer GPUs (X, Announcement)
    * Voice & Audio
    * Alibaba Qwen Releases Qwen3-TTS: Full Open-Source TTS Family with 97ms Latency, Voice Cloning, and 10-Language Support (X, HF, GitHub)
    * FlashLabs Releases Chroma 1.0: World’s First Open-Source Real-Time Speech-to-Speech Model with Voice Cloning Under 150ms Latency (X, HF, Arxiv)
    * Inworld AI launches TTS-1.5: #1 ranked text-to-speech with sub-250ms latency at half a cent per minute (X, Announcement)
    * Tools
    * Vercel launches skills.sh, an “npm for AI agents” that hit 20K installs within hours (X, Vercel Changelog, GitHub)
    * Anthropic’s Claude Code VS Code Extension Hits General Availability, Bringing Full Agentic Coding to the IDE (X, VS Code Marketplace, Docs)


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe

About ThursdAI - The top AI news from the past week

Every ThursdAI, Alex Volkov hosts a panel of experts, AI engineers, data scientists, and prompt spellcasters on Twitter Spaces, as we discuss everything major and important that happened in the world of AI over the past week. Topics include LLMs, open source, new capabilities, OpenAI, competitors in the AI space, new LLM models, AI art and diffusion, and much more. sub.thursdai.news