All categories

AI

1791 articles

The PM’s Playbook for Shipping AI Features That Actually Work in Production

The demo-to-production Death Valley. If you’ve worked on an AI feature, you know the feeling. You start building something you are excited about and set launch timelines. The model spits out a perfect response, the prototype works magically, and everybody in the room is mentally calculating how big this product will be when […]

June 10, 2026

O'Reilly Radar — AI/ML

If Claude Fable stops helping you, you'll never know

AI Programming

Jonathon Ready highlights one of the more eyebrow-raising details from the 319-page system card for Fable 5 and Mythos 5. Here's a longer excerpt, highlights mine: In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training…

June 10, 2026

Simon Willison's Weblog

Initial impressions of Claude Fable 5

AI Programming

I didn't have early access to today's Claude Fable 5 release, but I've spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a beast. It's slow, expensive and has been quite happily churning through everything I've thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can't do. First, let's review the key characteristics. Anthropic claim that Claude Fable 5 offers the same…

June 9, 2026

Simon Willison's Weblog

Loop Engineering: Design the System That Prompts Agents

...explained visually.

June 9, 2026

Daily Dose of Data Science

AI Programming

llm 0.32a3

Release: llm 0.32a3. Almost entirely written by the new Claude Fable 5; see my write-up for more details. Tags: projects, ai, generative-ai, llms, llm, claude-mythos

June 9, 2026

Simon Willison's Weblog

Setting a custom price for a model in AgentsView

AI Programming

TIL: Setting a custom price for a model in AgentsView. I've been really enjoying AgentsView by Wes McKinney as a tool for exploring my token usage across different coding agents running on my laptop. Claude Fable 5 came out today and wasn't yet included in the pricing database AgentsView uses. I used Fable to reverse-engineer AgentsView and figured out this recipe for setting custom prices. Here's my Claude Fable 5 usage for today so far, plotted by AgentsView as a treemap across my different…

June 9, 2026

Simon Willison's Weblog

Quoting Andrej Karpathy

AI Programming

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). — Andrej…

June 9, 2026

Simon Willison's Weblog

The Subsidy Ended: What Tool-Using Agents Actually Cost

On June 1, GitHub Copilot’s usage-based billing became active for all Copilot plans, and developers reacted quickly and loudly. A Pro plan still costs $10, but it now comes with a monthly pool of AI credits. Those credits are priced at a penny each, and they’re consumed according to the model used and the tokens […]
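The credit arithmetic described in this excerpt can be sketched as follows. Only the one-cent credit price comes from the excerpt; the per-token consumption rate is a hypothetical placeholder, since the excerpt is cut off before it gives real rates, and the function names are illustrative.

```python
from decimal import Decimal

# From the excerpt: each AI credit is priced at a penny.
CREDIT_PRICE_USD = Decimal("0.01")


def credits_to_usd(credits: Decimal) -> Decimal:
    """Convert a number of credits into dollars at one cent per credit."""
    return CREDIT_PRICE_USD * credits


def credits_for_request(tokens: int, credits_per_1k_tokens: Decimal) -> Decimal:
    """Estimate credits consumed by one request.

    credits_per_1k_tokens is a HYPOTHETICAL per-model rate; the real
    rates are not given in the excerpt.
    """
    return Decimal(tokens) / 1000 * credits_per_1k_tokens


# Example with an assumed rate of 1.5 credits per 1,000 tokens.
used = credits_for_request(4000, Decimal("1.5"))
cost = credits_to_usd(used)
```

Using `Decimal` rather than floats keeps cent-level amounts exact when many small requests are summed.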

June 9, 2026

O'Reilly Radar — AI/ML

Siri AI at WWDC 2026

AI Programming

Given how badly burned anyone who took Apple's 2024 WWDC Apple Intelligence announcements at face value was, I'm holding to a strict "I'll believe it when I see it" policy for everything they announced today. The new Siri AI features do at least look feasible with today's technology, especially since Apple are licensing a custom Gemini-derived model that they can run on their own Private Cloud Compute. It sounds like they'll be taking advantage of vision LLMs to extract information from the…

June 8, 2026

Simon Willison's Weblog

Your Agent Harness Should Repair Itself

...covered with an open-source solution.

June 8, 2026

Daily Dose of Data Science

Announcing major new donations, and recapping the 2025 fundraiser

This past December, we ran our first fundraiser in six years, setting an ambitious goal of $6M. We ended up receiving a total of $1.8M from small donors and $1.6M in matching from the Survival and Flourishing Fund (SFF) for a total of $3.4M. We’re incredibly grateful for all this support! In the rest of […] The post Announcing major new donations, and recapping the 2025 fundraiser appeared first on Machine Intelligence Research Institute.

June 8, 2026

Long-Running Agents

The following article originally appeared on Addy Osmani’s blog and is being reposted here with the author’s permission. A long-running AI agent can keep making progress over hours, days, or weeks. It can do this across many context windows and sandboxes, recover from failure, leave structured artifacts behind, and resume where it left off. For […]

June 8, 2026

O'Reilly Radar — AI/ML

The AI Agents Stack (2026 Edition)

The following article originally appeared on Paolo Perrone’s The AI Engineer Substack and is being reposted here with the author’s permission. Your team picks LangGraph for a customer support chatbot. Three weeks in, you’ve got 14 nodes in a state graph, a custom checkpointer writing to Redis, and retry logic for tool calls that fail […]

June 8, 2026

O'Reilly Radar — AI/ML

datasette-agent-edit 0.1a0

AI Programming

Release: datasette-agent-edit 0.1a0. I'm planning several plugins for Datasette Agent which can make edits to existing pieces of text - things like collaborative Markdown editing, updating large SQL queries, and editing SVG files. Agentic editing of text is a little tricky to get right. My favorite published design for this is for the Claude text editor, which implements the following tools: view - view sections of a file, with line numbers added to every line. str_replace - find an exact…
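As a rough illustration of the `view` tool described in this excerpt, here is a minimal sketch. It is not the plugin's or the Claude text editor's actual implementation: the function signature, the 1-indexed inclusive range, and the `N: ` prefix format are all assumptions made for the example.

```python
from typing import Optional


def view(text: str, start: int = 1, end: Optional[int] = None) -> str:
    """Return lines start..end of text, each prefixed with its line number.

    Line numbers are 1-indexed and the range is inclusive (an assumption).
    The "N: " prefix format is also an assumption.
    """
    if start < 1:
        raise ValueError("start must be >= 1")
    lines = text.splitlines()
    if end is None:
        end = len(lines)
    selected = lines[start - 1:end]
    # Number from `start` so prefixes match positions in the full text.
    return "\n".join(f"{n}: {line}" for n, line in enumerate(selected, start=start))


sample = "SELECT id\nFROM users\nWHERE active = 1"
numbered = view(sample, 2, 3)
```

Numbering against the full text, rather than restarting at 1 for each section, lets an agent refer back to the same line numbers in a later edit.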

June 7, 2026

Simon Willison's Weblog

REINFORCE and Actor-critic Methods in RL

The full RL nanodegree, covered with implementation.

June 7, 2026

Daily Dose of Data Science