Companies are shifting from generative AI that simply answers questions to autonomous agents that perceive, reason, and act on their behalf. Attempting to scale these agents on legacy stacks exposes structural failures that can lead to fractured governance, a persistent trust gap, and broken reasoning loops, all while causing costs to spiral. To solve this, … continue reading
Anthropic releases Claude Sonnet 4.6
Claude Sonnet 4.6 features improved skills in coding, computer use, long-context reasoning, agent planning, knowledge work, and design. It is now the default model in claude.ai and Claude Cowork, has a 1M context window (beta), and is priced the same as Sonnet 4.5, at $3 per million input tokens and … continue reading
OpenAI releases research preview GPT-5.3-Codex-Spark for ChatGPT Pro users
GPT-5.3-Codex-Spark is a lightweight version of the company’s coding model, GPT-5.3-Codex, that is optimized to run on ultra-low-latency hardware and can deliver over 1,000 tokens per second. It is the first outcome of the company’s recently announced partnership with Cerebras to add 750MW of ultra … continue reading
Anthropic launches Claude Opus 4.6
Claude Opus 4.6 is the latest version of the company’s most powerful class of AI models. Anthropic says that this release improves on Opus’ coding skills, and it now plans more carefully, sustains agent tasks for longer, can run more reliably in larger codebases, and has better code review and … continue reading
Google is attempting to make it easier for developers to access its documentation by creating the Developer Knowledge API and corresponding MCP server, both now in public preview. The Developer Knowledge API allows developers to search and retrieve documentation for Google’s services in Markdown. This includes documentation from firebase.google.com, developer.android.com, docs.cloud.google.com, and more. The two … continue reading
Google unveils new open-source standard for agentic commerce
Google has announced a new open-source standard for agentic commerce called the Universal Commerce Protocol (UCP). Developed in collaboration with a number of commerce companies, including Shopify, Etsy, Wayfair, Target, and Walmart, UCP establishes a common language and primitives for the commerce journey between consumer surfaces, businesses, … continue reading
Kaggle has announced that it now offers Community Benchmarks, enabling AI practitioners to design, run, and share their own benchmarks for evaluating AI models. Kaggle is a community platform run by Google that offers models and resources for data scientists and machine learning practitioners. Last year, it introduced Kaggle Benchmarks to provide evaluations from … continue reading
Google has announced a new open-source standard for agentic commerce called the Universal Commerce Protocol (UCP). Developed in collaboration with a number of commerce companies, including Shopify, Etsy, Wayfair, Target, and Walmart, UCP establishes a common language and primitives for the commerce journey between consumer surfaces, businesses, and payment providers. “As consumers embrace conversational experiences, … continue reading
Anthropic makes Skills an open standard
Skills, a capability that allows users to teach Claude repeatable workflows, was first introduced in October, and now the company is making it an open standard. “Like MCP, we believe skills should be portable across tools and platforms—the same skill should work whether you’re using Claude or other AI platforms,” the … continue reading
Google has announced the release of Gemini 3 Flash, its latest frontier model designed for speed at a lower token cost. According to Google, this model is ideal for iterative development, as it is able to quickly reason and solve tasks in high-frequency workflows. It also outperforms all Gemini 2.5 models as well as Gemini … continue reading
Google has announced a new project that aims to leverage generative AI to build contextually relevant UIs. A2UI is an open-source tool that generates UIs based on the current conversation’s needs. For example, an agent designed to help users book restaurant reservations would be more useful if it featured an interface to input the … continue reading
OpenAI announces GPT-5.2
GPT-5.2 is optimized for professional knowledge work, scoring 70.9% (using GPT-5.2 Thinking) on knowledge work tasks on the GDPval benchmark, compared to just 38.8% for GPT-5.1 Thinking. The company has started rolling out GPT-5.2 in ChatGPT today, with Instant, Thinking, and Pro modes, starting with paid plans. It is also available … continue reading