TurboQuant-WASM Brings Google's Vector Quantization to the Browser

1. TurboQuant-WASM Brings Google's Vector Quantization to the Browser

An experimental WebAssembly build of Google's TurboQuant algorithm, accelerated with relaxed SIMD, is now available for browsers and Node.js. The library compresses vectors to roughly 4.5 bits per dimension while approximately preserving inner products, which enables fast client-side vector search and 3D Gaussian Splatting compression. It includes a TypeScript API for encoding, decoding, and computing batched dot products directly in the browser. Developers can therefore run memory-efficient similarity search on the client without relying on a cloud vector database.
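To put the 4.5 bits-per-dimension figure in perspective, the following back-of-the-envelope sketch compares raw storage for float32 vectors against the quantized size. The corpus size (one million vectors) and embedding width (768) are illustrative assumptions, and the estimate ignores any per-vector metadata the library may store; it does not use the TurboQuant-WASM API.

```typescript
// Rough storage estimate for a vector index.
// The 4.5 bits/dimension figure comes from the summary above; COUNT and DIMS
// are illustrative assumptions, and per-vector overhead is ignored.

/** Bytes needed to store `count` vectors of `dims` dimensions at `bitsPerDim`. */
function indexBytes(count: number, dims: number, bitsPerDim: number): number {
  return Math.ceil((count * dims * bitsPerDim) / 8);
}

const COUNT = 1_000_000; // assumed corpus size
const DIMS = 768;        // assumed embedding width

const float32Bytes = indexBytes(COUNT, DIMS, 32);
const quantizedBytes = indexBytes(COUNT, DIMS, 4.5);
const ratio = float32Bytes / quantizedBytes;

console.log(`float32:   ${(float32Bytes / 1e9).toFixed(2)} GB`);   // 3.07 GB
console.log(`quantized: ${(quantizedBytes / 1e9).toFixed(2)} GB`); // 0.43 GB
console.log(`ratio:     ${ratio.toFixed(1)}x`);                    // 7.1x
```

Under these assumptions the index shrinks from about 3 GB to under half a gigabyte, which is the difference between an impractical download and one a browser tab can plausibly hold in memory.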

2. Unconfirmed: Malware Distributed via Fake Claude Code Leaks

Security researchers report that threat actors are distributing malware disguised as leaked versions of Anthropic's Claude Code. The malicious payloads are being posted on forums and code repositories, targeting developers who want early or unauthorized access to the agentic coding tool. Developers should not download Claude Code binaries or source files from unofficial channels. A compromised machine risks exposing local environment variables, API keys, and source code to attackers.

3. sllm Launches Shared GPU Nodes for Private LLM Inference

A new service called sllm lets developers share dedicated GPU nodes to run large models such as DeepSeek V3 (685B) at a fraction of the standard hardware cost. The platform groups developers into cohorts that split the cost of 8xH100 clusters, and it charges users only once a cohort fills. It provides an OpenAI-compatible API powered by vLLM, so integrating it into an existing application requires only a base URL swap. The service says it does not log traffic, and it targets developers with modest throughput needs who do not require dedicated enterprise capacity.
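The base URL swap can be sketched as follows. This is a minimal illustration of how an OpenAI-compatible chat-completions request is assembled, not sllm's documented client: the sllm host name, the model id, and the API key are all placeholders invented for the example. The block only builds the request, so it runs without network access.

```typescript
// Sketch of a "base URL swap" against an OpenAI-compatible endpoint.
// ASSUMPTIONS: the sllm base URL, model id, and API key below are placeholders,
// not values published by the service.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

/** Builds a chat-completions request; only `baseUrl` differs between providers. */
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  return {
    // Strip trailing slashes so base URLs with or without one both work.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

const messages: ChatMessage[] = [{ role: "user", content: "Hello" }];

// Same call site, two providers: only the first argument changes.
const hosted = buildChatRequest("https://api.openai.com/v1", "sk-placeholder", "placeholder-model", messages);
const shared = buildChatRequest("https://sllm.example/v1/", "sk-placeholder", "placeholder-model", messages);

console.log(hosted.url);
console.log(shared.url);
// To actually send the request: await fetch(shared.url, shared.init);
```

Because the request path, headers, and body are identical across providers, an application that already reads its base URL from configuration would need no code changes at all.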

4. Apple Signs Tiny Corp Driver for Nvidia eGPUs on Apple Silicon

Apple has officially signed a third-party driver from Tiny Corp that enables AMD and Nvidia eGPUs to work with Apple Silicon Macs. The driver is specifically designed to accelerate local LLM inference rather than general graphics rendering. Developers must compile the driver using Docker, but the official signature means users no longer need to disable macOS System Integrity Protection (SIP) to use it. This provides a new hardware pathway for running heavy local AI workloads on ARM-based Macs.

5. Hugging Face Releases OpenClaw Migration Tooling and Gemma4 Support

Hugging Face has published step-by-step instructions and tooling to transition OpenClaw deployments to open-source and local models. The update includes a new CLI command to authenticate and connect local environments directly to Hugging Face endpoints. Additionally, the community has confirmed that the newly released Gemma4 model is now supported and running within the OpenClaw framework. This allows developers to swap proprietary backends for local or open-weight alternatives without changing their agent harness.

6. Open Source Travel Hacking Toolkit Releases 6 New MCP Servers

A new open-source Travel Hacking Toolkit provides six Model Context Protocol (MCP) servers designed to integrate real-time travel data into AI coding assistants like Claude Code and OpenCode. The toolkit allows AI agents to directly query award flight availability, cash prices, loyalty balances, and hotel data across multiple providers. Five of the six included MCP servers operate without requiring API keys, lowering the barrier to entry for testing. This release serves as a practical reference implementation for developers building custom MCP integrations for complex, multi-API workflows.
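For readers new to MCP, the sketch below shows the shape of the JSON-RPC 2.0 message a client sends when an agent invokes a server tool. The `tools/call` method and the `name`/`arguments` parameters follow the MCP specification; the tool name `search_award_flights` and its argument fields are hypothetical, since the toolkit's actual tool names and schemas are not described here.

```typescript
// Sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
// ASSUMPTIONS: the tool name and its arguments are hypothetical examples,
// not the toolkit's real schema.

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1; // JSON-RPC ids let the client match responses to requests

function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical query: award availability on one route and date.
const request = toolCall("search_award_flights", {
  origin: "SFO",
  destination: "NRT",
  date: "2026-03-01",
});

console.log(JSON.stringify(request, null, 2));
```

In practice the assistant's MCP client builds and sends this message on the agent's behalf; a server author's job is mainly to declare the tool, validate the arguments, and return a result.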