1. Urgent Security: Malicious Axios versions drop remote access trojan
Two malicious versions of the widely used Axios HTTP client (1.14.1 and 0.30.4) were published to the npm registry in a supply chain attack. The compromised releases inject a fake dependency whose postinstall script deploys a cross-platform remote access trojan on macOS, Windows, and Linux. The malware contacts a command-and-control server, delivers second-stage payloads, and then replaces its own package.json to evade detection. Developers must immediately audit their environments and assume compromise if either version was installed.
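As a starting point for that audit, here is a minimal sketch that scans an npm lockfile for the two version numbers named above. It assumes a package-lock.json with lockfileVersion 2 or 3, where installed packages are listed under the top-level "packages" key. The function names are illustrative and not part of any tool. A clean result does not prove a machine is safe, since the malware is described as altering files after install.

```python
import json

# The two Axios versions named in the advisory above.
COMPROMISED_AXIOS = {"1.14.1", "0.30.4"}


def find_compromised_axios(lock: dict) -> list:
    """Return (path, version) pairs for compromised axios entries.

    Assumes the lockfileVersion 2/3 layout, where keys under "packages"
    look like "node_modules/axios" or "node_modules/foo/node_modules/axios".
    """
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # The text after the last "node_modules/" is the package name.
        name = path.split("node_modules/")[-1]
        if name == "axios" and meta.get("version") in COMPROMISED_AXIOS:
            hits.append((path, meta["version"]))
    return hits


def scan_lockfile(lockfile_path: str) -> list:
    """Load a package-lock.json from disk and scan it."""
    with open(lockfile_path, encoding="utf-8") as fh:
        return find_compromised_axios(json.load(fh))
```

Calling `scan_lockfile("package-lock.json")` in each project directory returns an empty list when neither version is pinned. Older lockfiles (lockfileVersion 1) nest packages under "dependencies" and would need a different traversal.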
2. Unconfirmed: Claude Code source code leaked via npm source map
Anthropic reportedly exposed the TypeScript source code for its Claude Code agentic AI harness through a 59.8 MB JavaScript source map file published to the public npm registry. The leak is said to reveal approximately 512,000 lines of the internal codebase, including tool implementations and undercover modes. Developers have mirrored the code on GitHub to analyze Anthropic's agent architecture. In a separate incident, a misconfigured repository push exposed pretraining datasets belonging to the Hugging Face Research team.
3. OpenClaw agent compromise exposes root shell access risks
A threat actor successfully compromised a self-hosted OpenClaw AI personal assistant, gaining and selling root shell access to a corporate executive's computer. The incident highlights security vulnerabilities in autonomous AI agents that execute tasks via messaging platforms without enterprise kill switches or least-privilege constraints. Developers deploying open-source agents like OpenClaw on private infrastructure must implement strict zero-trust boundaries to prevent agent exploitation from escalating to host system compromise.
4. TimesFM 2.5: Google releases 200M-parameter time-series model with 16k context
Google Research has released TimesFM 2.5, an updated time-series forecasting foundation model available on Hugging Face. The new version reduces the parameter count from 500M to 200M while expanding the context length from 2,048 to 16,000 tokens. It introduces support for continuous quantile forecasting up to a 1k horizon via an optional 30M quantile head and removes the frequency indicator requirement. Developers can run the model using PyTorch or Flax backends via the updated inference API.
5. pg_textsearch v1.0: Open-source Postgres extension for BM25 search
Tiger Data has released pg_textsearch v1.0, an open-source PostgreSQL extension providing BM25 relevance-ranked full-text search. The extension is designed to complement semantic search tools like pgvector by offering scalable keyword search directly within Postgres. Benchmark results using MS-MARCO indicate a 4.7x query throughput advantage over existing solutions like ParadeDB. The release allows developers to build hybrid search stacks without relying on AGPL-licensed alternatives.
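For readers unfamiliar with BM25, the sketch below shows the standard Okapi BM25 scoring formula in plain Python. It is not pg_textsearch's implementation: the whitespace tokenizer, the non-negative (Lucene-style) IDF, and the defaults k1=1.2 and b=0.75 are common textbook choices assumed here for illustration.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list, k1: float = 1.2, b: float = 0.75) -> list:
    """Score each document against the query with Okapi BM25.

    Assumes `docs` is non-empty and contains at least one token overall.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher weight; the "+ 1" keeps IDF non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term frequency saturates via k1; b normalizes by document length.
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

In a hybrid stack, a keyword score like this is typically combined with a vector-similarity score from a tool such as pgvector. How the two are weighted is an application decision that the item above does not specify.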
6. Claude Code introduces Auto Mode with safety classifiers
Anthropic has added an Auto Mode to Claude Code that uses a two-layer classifier to evaluate command safety. The system automatically approves safe operations and blocks risky commands, reducing the need for manual developer intervention. This provides a middle ground between requiring explicit approval for every action and allowing full, unmonitored agent autonomy.
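The item does not describe how Anthropic's classifiers work, so the following is only a toy illustration of the general two-stage idea: a cheap first pass settles the obvious cases and hands everything else to a slower reviewer. All names, the command lists, and the fail-closed default are hypothetical and chosen for the example.

```python
import shlex
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    APPROVE = "approve"
    BLOCK = "block"


# Hypothetical rule sets, for illustration only.
READ_ONLY = {"ls", "cat", "pwd", "echo", "grep"}
DESTRUCTIVE = {"rm", "dd", "mkfs", "shutdown"}
SHELL_OPERATORS = "|;&><`$"


def fast_filter(command: str) -> Optional[Decision]:
    """Layer 1: decide obvious cases cheaply; return None when unsure."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return Decision.BLOCK  # unbalanced quotes: fail closed
    if not tokens:
        return Decision.BLOCK
    # Operators can chain or embed a second command, so defer to layer 2.
    if any(ch in command for ch in SHELL_OPERATORS):
        return None
    if tokens[0] in DESTRUCTIVE:
        return Decision.BLOCK
    if tokens[0] in READ_ONLY:
        return Decision.APPROVE
    return None


def classify(
    command: str,
    reviewer: Callable[[str], Decision] = lambda _cmd: Decision.BLOCK,
) -> Decision:
    """Run layer 1, then fall back to the layer-2 reviewer if undecided.

    The default reviewer blocks everything, so unknown commands fail closed.
    """
    verdict = fast_filter(command)
    return verdict if verdict is not None else reviewer(command)
```

Here `classify("ls -la")` is approved by the first layer, while `classify("ls; rm -rf /")` contains a shell operator and falls through to the reviewer, which blocks by default. A real second layer would be a stronger model or policy engine passed in as `reviewer`.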
7. llm-d joins CNCF to provide native Kubernetes LLM inference
IBM Research, Red Hat, and Google Cloud have donated the llm-d project to the Cloud Native Computing Foundation (CNCF). The framework provides a production-grade, distributed LLM inference stack built natively for Kubernetes using vLLM. This integration allows infrastructure teams to manage and scale large language model deployments using standard Kubernetes orchestration patterns.
8. Claude Platform launches Compliance API for audit logging
Anthropic has introduced a Compliance API for the Claude Platform to help administrators monitor user and system activities. The API tracks administrative actions, system events, and resource modifications such as file creation or deletion. Organizations can integrate these audit logs into their existing compliance and security monitoring systems by generating an admin API key through their account team.
9. Transformers.js v4 introduces WebGPU runtime
Transformers.js v4 adds a new WebGPU Runtime for running machine learning models directly in the browser. Developers can use the same Transformers.js codebase across a wide variety of JavaScript environments with hardware acceleration. The WebGPU integration is intended to speed up client-side inference for web applications.
10. KwaiKAT releases KAT-Coder-Pro V2 non-reasoning coding model
KwaiKAT has launched KAT-Coder-Pro V2, a proprietary non-reasoning coding model featuring a 256K context window. The model matches Claude Sonnet 4.6 on the Artificial Analysis Intelligence Index while operating at a lower cost of $0.30 per 1M input tokens and $1.20 per 1M output tokens. It achieves high token efficiency and low latency by avoiding the reasoning delays typical of frontier models, though it shows some regression in long-context knowledge recall compared to its predecessor. The model is accessible via StreamLake and AtlasCloud API endpoints.
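To put the quoted prices in concrete terms, the short sketch below computes the cost of a single request from its token counts. The rates are the ones stated above. The function and constant names are illustrative, and any discounts, caching rates, or minimum charges are not covered because the item does not mention them.

```python
# Prices quoted in the item above, in USD per one million tokens.
INPUT_USD_PER_MILLION = 0.30
OUTPUT_USD_PER_MILLION = 1.20


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost
```

For example, a request that sends 200,000 input tokens and receives 50,000 output tokens costs about $0.06 for input plus $0.06 for output, or roughly $0.12 in total.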
11. Preview: Ollama introduces MLX backend for Apple Silicon acceleration
Ollama has released a preview version built on Apple's MLX machine learning framework to accelerate local inference on macOS. The update leverages unified memory and GPU Neural Accelerators on M5-series chips to improve both time-to-first-token and generation speeds. It also introduces support for NVIDIA's NVFP4 quantization format and improves cache reuse across conversations, specifically optimizing performance for local coding agents like OpenClaw and Claude Code.
12. Universal CLAUDE.md configuration reduces agent verbosity
An open-source CLAUDE.md configuration file has been released to help developers control the output verbosity of the Claude Code agent. By dropping the file into a project root, developers can suppress sycophantic responses, unsolicited suggestions, and formatting noise without modifying application code. The configuration reduces output token consumption by approximately 63%, which makes it useful for automation pipelines and repeated structured tasks.