
What the Freakiness of 2025 in AI Tells Us About 2026
2025/12/23 | 33 mins.
It’s probably not possible to satisfactorily condense 12 months’ worth of weird progress in AI, as well as predictions for the year to come, into one video. But I’m gonna try anyway because it has been a very strange time.
http://matsprogram.org/s26-aie
My new app! https://lmcouncil.ai
Patreon Interview: https://www.patreon.com/posts/robot-in-your-27-146376094

Chapters:
00:00 - Introduction
00:34 - Reasoning Models … and limits
02:54 - A playable world
03:36 - Realism
03:50 - AI Slop gone mainstream
05:03 - DolphinGemma
05:39 - Public Mood
07:34 - AI Enlisted
08:30 - GPT-5
11:05 - Open Weight not out
13:00 - METR Breakout
17:30 - VASA-1
18:28 - Lateral Productivity
20:15 - 1 or 1000 benchmarks needed?
24:54 - Continual Learning + Altman on Superintelligence
28:08 - Automated Information Discovery ft AlphaEvolve

Hassabis on Generality: https://x.com/demishassabis/status/2003097405026193809
https://www.youtube.com/watch?v=PqVbypvxDto
Gemini 3: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini_3_table_final_HLE_Tools_on.gif
Reasoning Trade-offs: https://arxiv.org/pdf/2504.13837
DolphinGemma: https://blog.google/technology/ai/dolphingemma/?s=09
Genie 3: https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/
METR Time Horizon: https://arxiv.org/pdf/2503.14499
https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
Flaws: https://x.com/ShashwatGoel7/status/2002369517499105443
https://shash42.substack.com/p/how-to-game-the-metr-plot
https://x.com/METR_Evals/status/2002203627377574113
GPT-5 - Altman phd in everything: https://edition.cnn.com/2025/08/14/business/chatgpt-rollout-problems
https://simple-bench.com/
AI Slop: https://www.youtube.com/watch?v=I_3vxoJDD9k
https://www.theguardian.com/technology/2025/dec/16/boost-for-artists-in-ai-copyright-battle-as-only-3-per-cent-back-uk-active-opt-out-plan
Survey: https://x.com/SearchlightInst/status/2001057144842387920/photo/1
Nvidia Nemotron: https://x.com/percyliang/status/2000608134205985169
OpenAI Compute Flywheel: https://x.com/OpenAI/status/2001363007209914399/photo/1
Altman Interview: https://www.youtube.com/watch?v=2P27Ef-LLuQ
AI in Govt: https://x.com/jdcmedlock/status/1939814516503847259
Benchmark Gaming: https://techcrunch.com/2025/04/07/meta-exec-denies-the-company-artificially-boosted-llama-4s-benchmark-scores/
AlphaEvolve: https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf?utm_source=deepmind.google&utm_medium=referral&utm_campaign=gdm&utm_content=
Continual Learning: https://abehrouz.github.io/files/NL.pdf
Job Risk: https://archive.ph/20250708204527/https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic
GPT4o: https://x.com/AISafetyMemes/status/1916889492172013989
Vasa-1: https://www.microsoft.com/en-us/research/project/vasa-1/
Three Views: https://www.lesswrong.com/posts/K2D45BNxnZjdpSX2j/ai-timelines
Turing Test: https://x.com/tunguz/status/1907185471211422147
Karpathy Year in Review: https://karpathy.bearblog.dev/year-in-review-2025/
LLM Brainrot: https://arxiv.org/pdf/2510.13928
Lateral Productivity: https://www.aisi.gov.uk/frontier-ai-trends-report
Emotional Quotient: https://arxiv.org/pdf/2511.08394
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/
AI Insiders ($9!): https://www.patreon.com/AIExplained

Gemini Exponential, Demis Hassabis' ‘Proto-AGI’ coming, but …
2025/12/19 | 19 mins.
The condensed highlights of hours of AI lab leader interviews, model releases, Gemini 3 Flash insights (plus its hidden flaw), Hassabis’ ‘proto-AGI’ and much more…
https://matsprogram.org/apply?utm_source=ai-explained&utm_medium=youtube&utm_campaign=s26
Also, do check out my new app: https://lmcouncil.ai

Chapters:
00:00 - Introduction
00:50 - Results
02:44 - But… the Flaw
04:49 - So Benchmarks are fake? No
07:37 - Spatial Reasoning + Hassabis
10:06 - Proto-AGI
12:07 - Minimal AGI
15:07 - Compute Slowdown
17:56 - New Data Paradigm

Gemini 3 Flash: https://deepmind.google/models/gemini/flash/
Hassabis Interview: https://www.youtube.com/watch?v=PqVbypvxDto
Legg Interview: https://www.youtube.com/watch?v=l3u_FAv33G0
Pre-training Lead Interview: https://www.youtube.com/watch?v=cNGDAqFXvew
Altman Interview: https://www.youtube.com/watch?v=2P27Ef-LLuQ
Brockman Video: https://x.com/OpenAI/status/2001336514786017417
Post-Training Reveal: https://x.com/OfficialLoganK/status/2001742530472534442
Hallucinations Paper: https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf
Patreon Hallucinations Vid: https://www.patreon.com/posts/blockers-to-and-139264812
AA-Omniscience Benchmark: https://artificialanalysis.ai/evaluations/omniscience
https://arxiv.org/pdf/2511.13029
lmcouncil.ai/benchmarks
https://simple-bench.com/
https://x.com/scaling01/status/1999620587744813205
5.2 Codex Drop: https://cdn.openai.com/pdf/ac7c37ae-7f4c-4442-b741-2eabdeaf77e0/oai_5_2_Codex.pdf
OpenAI Compute Trend: https://www.theinformation.com/articles/openais-350-billion-computing-cost-problem?rc=sy0ihq
Cramer Tweet/Response: https://x.com/BorisMPower/status/2001440650210976018
OpenAI Valuation: https://www.theinformation.com/articles/openai-discussed-raising-tens-billions-valuation-around-750-billion?rc=sy0ihq
Indian Data: https://www.reuters.com/world/india/with-freebies-openai-google-vie-indian-users-training-data-2025-12-17/
TheInformation Data: https://x.com/theinformation/status/2001421225751351778
Genie 3: https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/
Sima 2: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
Veo 3.1: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
METR: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
AI Insiders ($9!): https://www.patreon.com/AIExplained
Non-hype Newsletter: https://signaltonoise.beehiiv.com/

GPT 5.2: OpenAI Strikes Back
2025/12/12 | 17 mins.
Full GPT-5.2 breakdown - did OpenAI reclaim the crown? A story of tokens, time and cost, plus 9 details you wouldn’t get just from reading the headlines.
https://www.youtube.com/@eightythousandhours
AI Insiders ($9!): https://www.patreon.com/AIExplained
https://lmcouncil.ai

Chapters:
00:00 - Introduction
00:55 - Better than Human @ Professional Tasks?
04:42 - Test time Compute
07:05 - Benchmark Selection
09:32 - Simple Results + council comparison
13:01 - Long Context
13:52 - Self-Improvement
15:00 - 10 Years + New Models

Release Page: https://openai.com/index/introducing-gpt-5-2/
GPT 5.2 Benchmark Comparison: https://www.reddit.com/r/singularity/comments/1pka1y9/gpt52_all_20_benchmarks_rankings_and_pricing/
https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini_3_table_final_HLE_Tools_on.gif
https://lmcouncil.ai/benchmarks
Charxiv: https://charxiv.github.io/#leaderboard
GDPval: https://arxiv.org/pdf/2510.04374
My vid: https://www.youtube.com/watch?v=oK5LxMaROSA
Kilpatrick: https://x.com/OfficialLoganK/status/1999270402712023158/photo/1
Noam Brown: https://x.com/polynoamial/status/1999189845164667132
New Model in New Year: https://www.theinformation.com/articles/openai-developing-garlic-model-counter-googles-recent-gains?rc=sy0ihq
10 Years of OpenAI: https://openai.com/index/ten-years/
GPQA: https://x.com/idavidrein/status/1841265634170278063
ARC-AGI 1-2: https://arcprize.org/arc-agi/2/
Sunday Robotics: https://x.com/tonyzzhao/status/1991204839578300813
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
https://lmcouncil.ai

You Are Being Told Contradictory Things About AI: 8 examples
2025/12/05 | 20 mins.
With headlines of an imminent job apocalypse, code red for ChatGPT and recursive self-improvement, at the same time as Anthropic's CEO yesterday saying we know how to scale to AGI, and Gemini 3 DeepThink out today, it is easy to get lost among the narratives and counter-narratives. So here are both, plus the facts behind them, for you to decide.
https://epoch.ai/data/data-centers
Epoch AI is the sponsor of today’s video, and my views, and those expressed in this video, do not necessarily reflect Epoch AI’s views in any way.

Chapters:
00:00 - Introduction
00:42 - Job Apocalypse?
01:45 - Scaling to AGI
04:15 - Recursive Self-Improvement Needed, or Not
09:57 - OpenAI Code Red vs Gemini 3 DeepThink vs Claude Opus 4.5
13:27 - DeepSeek Speciale vs Mistral Large v3
16:45 - Claude Soul Document

https://lmcouncil.ai/
AI Insiders ($9!): https://www.patreon.com/AIExplained
Guardian Interview: https://www.theguardian.com/technology/ng-interactive/2025/dec/02/jared-kaplan-artificial-intelligence-train-itself
MIT Study on Jobs/Tasks: https://iceberg.mit.edu/report.pdf
vs https://www.cnbc.com/2025/11/26/mit-study-finds-ai-can-already-replace-11point7percent-of-us-workforce.html
Amodei on Scaling: https://www.youtube.com/watch?v=FEj7wAjwQIk
Claude Soul Document: https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5-opus-soul-document
Capabilities Original Stance: https://www.anthropic.com/news/core-views-on-ai-safety
Ilya Interview: https://www.dwarkesh.com/p/ilya-sutskever-2
Ricursive Intelligence: https://x.com/RicursiveAI/status/1995932204703346946
Economist Worker Usage of GenAI: https://www.economist.com/finance-and-economics/2025/11/26/investors-expect-ai-use-to-soar-thats-not-happening#selection-1409.94-1413.42
Mistral v3 Large: https://docs.mistral.ai/models/mistral-large-3-25-12
Compute Slowdown Paper: https://joel-becker.com/images/publications/forecasting_time_horizon_under_compute_slowdown.pdf
https://x.com/joel_bkr/status/1993023436541903155
METR Chart: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
https://www.theinformation.com/articles/openais-350-billion-computing-cost-problem?rc=sy0ihq
OpenAI Code Red: https://www.anthropic.com/news/core-views-on-ai-safety
Rocket Company: https://www.independent.co.uk/news/world/americas/sam-altman-rocket-elon-musk-spacex-b2878351.html
DeepSeek Paper: https://arxiv.org/html/2512.02556v1
DeepSeek Crowdstrike CCP: https://www.crowdstrike.com/en-us/blog/crowdstrike-researchers-identify-hidden-vulnerabilities-ai-coded-software/
https://simple-bench.com/
Patreon Post: https://www.patreon.com/c/aiexplained/posts
Robot: https://x.com/jloganolson/status/1985850115379351799

Gemini 3 is Here: 11 Details You Might Have Missed
2025/11/19 | 21 mins.
Gemini 3 Pro is out, and records fell like snowflakes in Svalbard. No long description, chapters or links today, huge technical difficulties, including with audio, so just want to publish asap.
https://app.grayswan.ai/ai-explained
https://lmcouncil.ai
AI Insiders ($9!): https://www.patreon.com/AIExplained
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/



AI Explained Official Podcast