Flux
Couleur d'accent
Toutes les catégories

Programmation

2090 articles

If Claude Fable stops helping you, you'll never know

If Claude Fable stops helping you, you'll never know

If Claude Fable stops helping you, you'll never know Jonathon Ready highlights one of the more eyebrow-raising details from the 319 page system card for Fable 5 and Mythos 5. Here's a longer excerpt, highlights mine: In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training…

Simon Willison's Weblog
Initial impressions of Claude Fable 5

Initial impressions of Claude Fable 5

I didn't have early access to today's Claude Fable 5 release, but I've spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a beast. It's slow, expensive and has been quite happily churning through everything I've thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can't do. First, let's review the key characteristics. Anthropic claim that Claude Fable 5 offers the same…

Simon Willison's Weblog
Setting a custom price for a model in AgentsView

Setting a custom price for a model in AgentsView

TIL: Setting a custom price for a model in AgentsView I've been really enjoying AgentsView by Wes McKinney as a tool for exploring my token usage across different coding agents running on my laptop. Claude Fable 5 came out today and wasn't yet included in the pricing database AgentsView uses. I used Fable to reverse-engineer AgentsView and figured out this recipe for setting custom prices. Here's my Claude Fable 5 usage for today so far, plotted by AgentsView as a treemap across my different…

Simon Willison's Weblog
npm Tooling Bug Incorrectly Marks One-Character Packages as Security Holders

npm Tooling Bug Incorrectly Marks One-Character Packages as Security Holders

npm incorrectly applied security-holder metadata to multiple one-character packages, including letters, numbers, and the - package. Socket reviewed public npm registry metadata and found several affected packages had been assigned 0.0.1-security or 0.0.1-security.0 versions, with the latest dist-tag moved to the security placeholder. Older package versions remained available. npm confirmed the markings were not intentional and that they are working on rolling it back. “This happened due to a…

Socket
Quoting Andrej Karpathy

Quoting Andrej Karpathy

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). — Andrej…

Simon Willison's Weblog
Help Shape the Future of PHP: Take the State of PHP Survey

Help Shape the Future of PHP: Take the State of PHP Survey

PHP is powered by a global community of developers, maintainers, contributors, companies, and users. To better understand that community and the ecosystem around it, The PHP Foundation and PhpStorm, a JetBrains IDE, are launching the first annual State of PHP survey. This survey is an effort to build a clearer picture of how PHP is used today: who is using it, which tools and frameworks developers rely on, what challenges they face, how they feel about the language, and where they see PHP going…

The PHP Foundation
Siri AI at WWDC 2026

Siri AI at WWDC 2026

Given how badly burned anyone who took Apple's 2024 WWDC Apple Intelligence announcements at face value was, I'm holding to a strict "I'll believe it when I see it" policy for everything they announced today. The new Siri AI features do at least look feasible with today's technology, especially since Apple are licensing a custom Gemini-derived model that they can run on their own Private Cloud Compute. It sounds like they'll be taking advantage of vision LLMs to extract information from the…

Simon Willison's Weblog
Mini Shai-Hulud, Miasma, and Hades Worms Target Bioinformatics and MCP Developers via Malicious PyPI Wheels

Mini Shai-Hulud, Miasma, and Hades Worms Target Bioinformatics and MCP Developers via Malicious PyPI Wheels

Socket Threat Research team identified a newer PyPI wave connected to the broader Mini Shai-Hulud, Miasma, and Hades supply chain attacks. This wave expands beyond the 37 malicious PyPI wheels covered in our weekend report and shows that the threat actors are iterating quickly across delivery mechanisms, package themes, and runtime triggers. The campaign has since added 23 newly identified PyPI package-version artifacts, expanding beyond the 37 malicious PyPI wheels covered in our weekend…

Socket
Evals in Laravel: How to Prove Your AI Output Is Actually Good

Evals in Laravel: How to Prove Your AI Output Is Actually Good

Your Agent::fake() tests prove your Laravel AI feature runs — not that its output is any good. This evals a real ticket classifier with the AI SDK: a golden dataset for the fields you can check, an LLM-as-judge for the free text you can't, and a regression gate that catches a bad prompt before your customers do. Read more

Freek Van der Herten
Esc