Flux
Couleur d'accent
Setting a custom price for a model in AgentsView

Setting a custom price for a model in AgentsView

TIL: Setting a custom price for a model in AgentsView I've been really enjoying AgentsView by Wes McKinney as a tool for exploring my token usage across different coding agents running on my laptop. Claude Fable 5 came out today and wasn't yet included in the pricing database AgentsView uses. I used Fable to reverse-engineer AgentsView and figured out this recipe for setting custom prices. Here's my Claude Fable 5 usage for today so far, plotted by AgentsView as a treemap across my different…

Simon Willison's Weblog
npm Tooling Bug Incorrectly Marks One-Character Packages as Security Holders

npm Tooling Bug Incorrectly Marks One-Character Packages as Security Holders

npm incorrectly applied security-holder metadata to multiple one-character packages, including letters, numbers, and the - package. Socket reviewed public npm registry metadata and found several affected packages had been assigned 0.0.1-security or 0.0.1-security.0 versions, with the latest dist-tag moved to the security placeholder. Older package versions remained available. npm confirmed the markings were not intentional and that they are working on rolling it back. “This happened due to a…

Socket
Quoting Andrej Karpathy

Quoting Andrej Karpathy

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). — Andrej…

Simon Willison's Weblog
Help Shape the Future of PHP: Take the State of PHP Survey

Help Shape the Future of PHP: Take the State of PHP Survey

PHP is powered by a global community of developers, maintainers, contributors, companies, and users. To better understand that community and the ecosystem around it, The PHP Foundation and PhpStorm, a JetBrains IDE, are launching the first annual State of PHP survey. This survey is an effort to build a clearer picture of how PHP is used today: who is using it, which tools and frameworks developers rely on, what challenges they face, how they feel about the language, and where they see PHP going…

The PHP Foundation
Siri AI at WWDC 2026

Siri AI at WWDC 2026

Given how badly burned anyone who took Apple's 2024 WWDC Apple Intelligence announcements at face value was, I'm holding to a strict "I'll believe it when I see it" policy for everything they announced today. The new Siri AI features do at least look feasible with today's technology, especially since Apple are licensing a custom Gemini-derived model that they can run on their own Private Cloud Compute. It sounds like they'll be taking advantage of vision LLMs to extract information from the…

Simon Willison's Weblog
Mini Shai-Hulud, Miasma, and Hades Worms Target Bioinformatics and MCP Developers via Malicious PyPI Wheels

Mini Shai-Hulud, Miasma, and Hades Worms Target Bioinformatics and MCP Developers via Malicious PyPI Wheels

Socket Threat Research team identified a newer PyPI wave connected to the broader Mini Shai-Hulud, Miasma, and Hades supply chain attacks. This wave expands beyond the 37 malicious PyPI wheels covered in our weekend report and shows that the threat actors are iterating quickly across delivery mechanisms, package themes, and runtime triggers. The campaign has since added 23 newly identified PyPI package-version artifacts, expanding beyond the 37 malicious PyPI wheels covered in our weekend…

Socket
Announcing major new donations, and recapping the 2025 fundraiser

Announcing major new donations, and recapping the 2025 fundraiser

This past December, we ran our first fundraiser in six years, setting an ambitious goal of $6M. We ended up receiving a total of $1.8M from small donors and $1.6M in matching from the Survival and Flourishing Fund (SFF) for a total of $3.4M. We’re incredibly grateful for all this support! In the rest of […] The post Announcing major new donations, and recapping the 2025 fundraiser appeared first on Machine Intelligence Research Institute.

MIRI Blog
Long-Running Agents

Long-Running Agents

The following article originally appeared on Addy Osmani’s blog and is being reposted here with the author’s permission. A long-running AI agent can keep making progress over hours, days, or weeks. It can do this across many context windows and sandboxes, recover from failure, leave structured artifacts behind, and resume where it left off. For […]

O'Reilly Radar — AI/ML
Evals in Laravel: How to Prove Your AI Output Is Actually Good

Evals in Laravel: How to Prove Your AI Output Is Actually Good

Your Agent::fake() tests prove your Laravel AI feature runs — not that its output is any good. This evals a real ticket classifier with the AI SDK: a golden dataset for the fields you can check, an LLM-as-judge for the free text you can't, and a regression gate that catches a bad prompt before your customers do. Read more

Freek Van der Herten
Esc