~/news vault: 164 entries last sync: 06:26 model: gemma4:26b
— Tech briefing archive

The signal
through the noise.

A locally-curated stream of what matters in software, AI, and security. Filtered, scored, summarized, indexed.

All 164 AI 58 Dev-tools 29 General 19 Infra 25 Releases 2 Security 31
§ 02

This week

8 entries
6/ 10
LLM 0.32a0 is a major backwards-compatible refactor
LLM 0.32a0 is an alpha release of the LLM Python library that introduces a major, backwards-compatible refactor of its core abstraction. The update shifts the library from a simple text-in/text-out model to a more...
7/ 10
AI evals are becoming the new compute bottleneck
AI evaluation is transitioning from a static, compressible task into a significant compute bottleneck. As benchmarks shift from simple text predictions to agentic rollouts and training-in-the-loop protocols, evaluation...
7/ 10
What's new in pip 26.1 - lockfiles and dependency cooldowns!
Pip 26.1 introduces native support for lockfiles and dependency cooldowns, providing new mechanisms for environment reproducibility and package stability. This release also officially drops support for Python 3.9.
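The cooldown idea above can be sketched in a few lines. This is a conceptual illustration only, not pip's implementation or API; the release data and the `eligible_versions` helper are made up:

```python
from datetime import date, timedelta

# Conceptual sketch of a "dependency cooldown": ignore releases younger than a
# cooldown window when resolving, so brand-new (possibly broken or compromised)
# versions are not installed the moment they appear.

def eligible_versions(releases, today, cooldown_days=7):
    """Return versions released at least `cooldown_days` before `today`."""
    cutoff = today - timedelta(days=cooldown_days)
    return [version for version, released in releases if released <= cutoff]

releases = [
    ("1.0.0", date(2026, 3, 1)),
    ("1.1.0", date(2026, 3, 20)),
    ("1.2.0", date(2026, 3, 28)),   # too new: still inside the cooldown window
]
print(eligible_versions(releases, today=date(2026, 3, 30)))  # → ['1.0.0', '1.1.0']
```

The same filter applied a month later would admit all three versions, since every release would then be outside the window.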
7/ 10
OpenAI models, Codex, and Managed Agents come to AWS
OpenAI has expanded the availability of its GPT models, Codex, and Managed Agents to Amazon Web Services (AWS). This integration allows enterprises to deploy and manage OpenAI's generative capabilities directly within...
7/ 10
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
NVIDIA has released Nemotron 3 Nano Omni, an omni-modal model designed for integrated processing of text, image, video, and audio. The model is optimized for long-context workloads, including complex document analysis,...
6/ 10
microsoft/VibeVoice
Microsoft's VibeVoice is an MIT-licensed, Whisper-style speech-to-text model that features integrated speaker diarization. It allows for the transcription of audio files while simultaneously identifying and labeling...
6/ 10
How to build scalable web apps with OpenAI's Privacy Filter
This entry details the implementation of scalable web applications using OpenAI's Privacy Filter and `gradio.Server`. It demonstrates how to integrate a 1.5B-parameter PII detection model into custom HTML/JS frontends...
6/ 10
An open-source spec for orchestration: Symphony
Symphony is an open-source specification designed for orchestrating Codex-based agents. It enables the transformation of standard issue trackers into autonomous, always-on agent systems to automate software engineering...
§ 03

Earlier

50 entries
8/ 10
Quoting Romain Huet
OpenAI has transitioned from a bifurcated model architecture to a unified system, integrating the specialized Codex model into the primary model starting with GPT-5.4. The subsequent GPT-5.5 release focuses on advancing...
8/ 10
GPT-5.5 prompting guide
GPT-5.5 is now available via API, requiring a new approach to prompt engineering and model migration. Developers should treat this release as a distinct model family rather than a drop-in replacement for GPT-5.2 or...
7/ 10
[AINews] DeepSeek V4 Pro (1.6T-A49B) and Flash (284B-A13B), Base and Instruct — runnable on Huawei Ascend chips
DeepSeek has released the V4 model family, consisting of DeepSeek V4 Pro and DeepSeek V4 Flash, marking a significant architectural update to the series. The release introduces a 1M token context window and advanced...
6/ 10
llm 0.31
The release of `llm` version 0.31 introduces support for OpenAI's GPT-5.5 model and implements new configuration parameters for controlling output characteristics. These updates provide developers with more granular...
8/ 10
DeepSeek-V4: a million-token context that agents can actually use
DeepSeek-V4 introduces a 1-million-token context window optimized specifically for long-running agentic workloads. The architecture utilizes a hybrid attention mechanism to significantly reduce KV cache memory and...
8/ 10
DeepSeek V4 - almost on the frontier, a fraction of the price
DeepSeek has released the DeepSeek-V4 series, featuring the DeepSeek-V4-Pro and DeepSeek-V4-Flash models. These Mixture of Experts (MoE) models provide a 1 million token context window at a significantly lower price...
6/ 10
An update on recent Claude Code quality reports
Recent reports of performance degradation in Claude Code were traced to bugs within the tool's execution harness rather than the underlying models. A logic error in session management caused the tool to lose context...
7/ 10
What is Codex?
Codex is a system designed to extend the capabilities of conversational AI by enabling task automation and tool integration. It moves beyond simple text-based chat to produce functional, tangible outputs.
9/ 10
Introducing GPT-5.5
OpenAI has released GPT-5.5, an updated model iteration designed for high-complexity tasks. The release focuses on increasing processing speed and improving the model's ability to operate across various integrated tools.
8/ 10
A pelican for GPT-5.5 via the semi-official Codex backdoor API
GPT-5.5 has been released for ChatGPT subscribers and OpenAI Codex, though the official OpenAI API deployment is currently pending due to ongoing safety and security scaling requirements. Developers can access the model...
6/ 10
Introducing workspace agents in ChatGPT
OpenAI has introduced workspace agents for ChatGPT, designed to automate multi-step, complex workflows. These agents leverage Codex to facilitate secure automation across various software tools within a cloud-based...
7/ 10
Introducing OpenAI Privacy Filter
OpenAI has released the OpenAI Privacy Filter, an open-weight model designed for the identification and redaction of personally identifiable information (PII) within text. This tool enables developers to automate the...
8/ 10
[AINews] OpenAI launches GPT-Image-2
OpenAI has launched GPT-Image-2, a new image generation model featuring "thinking" capabilities and enhanced text rendering. The release marks a shift toward using image generation for functional, structured outputs...
7/ 10
Scaling Codex to enterprises worldwide
OpenAI has launched Codex Labs to facilitate the enterprise-scale deployment of Codex throughout the software development lifecycle (SDLC). Through strategic partnerships with consulting firms including Accenture, PwC,...
7/ 10
[AINews] Anthropic Claude Opus 4.7 - literally one step better than 4.6 in every dimension
Anthropic has released Claude Opus 4.7, an update to the Opus model family focused on improving coding, instruction following, and computer-use capabilities. The release introduces a new tokenizer and expanded vision...
7/ 10
Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers
This entry details the process of finetuning multimodal embedding and reranker models using the Sentence Transformers library, specifically for tasks like Visual Document Retrieval (VDR). It demonstrates how...
6/ 10
Codex for (almost) everything
The Codex application for macOS and Windows has been updated with new features designed to extend its operational capabilities. The update introduces tools for system interaction, web access, and persistent context...
7/ 10
The next evolution of the Agents SDK
OpenAI has updated the Agents SDK to include native sandbox execution and a model-native harness. These updates are designed to enable the development of secure, long-running agents capable of interacting with multiple...
6/ 10
Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents
VAKRA is an executable benchmark designed to evaluate the reasoning and tool-use capabilities of AI agents within enterprise-like environments. It moves beyond testing isolated skills by measuring compositional...
7/ 10
Enterprises power agentic workflows in Cloudflare Agent Cloud with OpenAI
Cloudflare has integrated OpenAI's GPT-5.4 and Codex models into its Agent Cloud platform. This integration allows enterprises to develop, deploy, and scale autonomous AI agents designed for executing complex,...
8/ 10
Our response to the Axios developer tool compromise
OpenAI has addressed a supply chain attack originating from the Axios developer tool. The response involved rotating macOS code signing certificates and deploying application updates to mitigate the impact of the...
6/ 10
Safetensors is Joining the PyTorch Foundation
Safetensors, originally a Hugging Face project, is transitioning to the PyTorch Foundation under the Linux Foundation to establish vendor-neutral, community-driven governance. This move shifts the project's trademark...
7/ 10
Components of A Coding Agent
A coding agent is an application layer, or "agentic harness," that wraps a Large Language Model (LLM) in a control loop to perform software engineering tasks. While the LLM provides the core reasoning, the harness...
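The control loop described above can be sketched minimally. This is a toy illustration of the harness pattern, not any particular product's code; the stub model, tool table, and message format are all made up:

```python
# Minimal sketch of an "agentic harness": a loop that wraps an LLM, executes
# the tool calls it proposes, and feeds results back until the model signals
# it is done. The model here is a scripted stub standing in for a real LLM.

def run_agent(model, tools, task, max_steps=10):
    """Drive the model/tool loop until the model returns a final answer."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        action = model(history)                  # LLM proposes the next step
        if action["type"] == "final":            # model says it is finished
            return action["content"]
        tool = tools[action["tool"]]             # harness resolves the tool...
        result = tool(*action["args"])           # ...and executes it
        history.append({"role": "tool", "content": str(result)})
    raise RuntimeError("agent exceeded step budget")

# Scripted stub: first call a tool, then answer using the tool's result.
def stub_model(history):
    if history[-1]["role"] == "user":
        return {"type": "tool", "tool": "read_file", "args": ("setup.py",)}
    return {"type": "final", "content": f"file says: {history[-1]['content']}"}

tools = {"read_file": lambda path: f"<contents of {path}>"}
print(run_agent(stub_model, tools, "summarize setup.py"))
```

Everything outside the `model(history)` call is harness responsibility: tool resolution, execution, history management, and the step budget that stops runaway loops.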
8/ 10
Welcome Gemma 4: Frontier multimodal intelligence on device
Gemma 4 is a new family of open-weights multimodal models released under the Apache 2 license, designed for both on-device and large-scale deployment. The series supports text, image, audio, and video inputs, featuring...
6/ 10
TRL v1.0: Post-Training Library Built to Move with the Field
TRL v1.0 introduces a formal stability contract to the library, transitioning it from a research project to a stable infrastructure component. The update implements a bifurcated API structure that separates stable,...
6/ 10
Granite 4.0 3B Vision: Compact Multimodal Intelligence for Enterprise Documents
Granite 4.0 3B Vision is a compact multimodal model designed for high-precision extraction of structured data from enterprise documents. Released as a LoRA adapter for the Granite 4.0 Micro language model, it...
6/ 10
Holotron-12B - High Throughput Computer Use Agent
H Company has released Holotron-12B, a multimodal model optimized for high-throughput computer-use agent tasks. The model is post-trained from NVIDIA's Nemotron-Nano-2 VL architecture to enhance performance in screen...
6/ 10
Introducing Storage Buckets on the Hugging Face Hub
Hugging Face has introduced Storage Buckets, a mutable, S3-like object storage layer on the Hub designed for high-throughput ML artifacts such as training checkpoints, optimizer states, and processed datasets. Built on...
7/ 10
Ulysses Sequence Parallelism: Training with Million-Token Contexts
Ulysses Sequence Parallelism, part of the Arctic Long Sequence Training (ALST) protocol, enables training with million-token contexts by distributing attention computation across multiple GPUs. It utilizes attention...
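The key move in Ulysses-style sequence parallelism is an all-to-all that switches the sharding axis: before it, each rank holds a sequence shard for all attention heads; after it, each rank holds the full sequence for a subset of heads, so attention runs locally per head. A toy sketch of that reshuffle, with nested lists standing in for tensors and no real GPUs involved:

```python
# Toy sketch of the Ulysses all-to-all. Layout before: data[rank][head_group]
# is that rank's sequence shard for every head group. Layout after: each
# "rank" (outer index) holds the FULL token sequence for one head group.

def all_to_all(data):
    world = len(data)
    # Rank r sends its shard of head group h to rank h; the receiver
    # concatenates shards from every rank, recovering the full sequence.
    return [[tok for rank in range(world) for tok in data[rank][h]]
            for h in range(world)]

# 2 ranks, 2 head groups; each rank holds half the sequence for both groups.
data = [
    [["t0", "t1"], ["t0", "t1"]],   # rank 0: tokens 0-1 for head groups 0 and 1
    [["t2", "t3"], ["t2", "t3"]],   # rank 1: tokens 2-3 for head groups 0 and 1
]
print(all_to_all(data))
# → [['t0', 't1', 't2', 't3'], ['t0', 't1', 't2', 't3']]
```

Because attention is quadratic in sequence length but independent across heads, this redistribution lets each device compute exact attention for its heads without ever materializing the full sequence for all heads at once.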
6/ 10
LeRobot v0.5.0: Scaling Every Dimension
LeRobot v0.5.0 expands the framework's capabilities from tabletop manipulation to full-body humanoid control with the addition of Unitree G1 support. The release also introduces advanced VLA policies, optimized data...
7/ 10
Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines
Modular Diffusers introduces a composable architecture for diffusion pipelines, replacing the standard `DiffusionPipeline` with a system of interchangeable, self-contained blocks. This allows developers to construct...
6/ 10
Bringing Robotics AI to Embedded Platforms: Dataset Recording, VLA Fine-Tuning, and On-Device Optimizations
This guide details strategies for deploying Vision-Language-Action (VLA) models on embedded robotic platforms, specifically focusing on the NXP i.MX 95 SoC. It addresses the computational and latency challenges of...
7/ 10
Mixture of Experts (MoEs) in Transformers
The `transformers` library has undergone a significant architectural refactor to transition from dense-model-centric loading to a specialized pipeline for Mixture of Experts (MoE) architectures. This update introduces a...
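The routing pattern at the heart of an MoE layer can be sketched without any framework. This is a scalar toy, not the `transformers` implementation: real layers use learned FFN experts, batched dispatch, and load-balancing losses, and every name below is made up:

```python
import math

# Minimal sketch of top-k Mixture-of-Experts routing: a router scores every
# expert for a token, only the top-k experts run, and their outputs are
# combined weighted by the renormalized router probabilities.

def softmax(xs):
    m = max(xs)                              # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token, router_logits, experts, top_k=2):
    probs = softmax(router_logits)
    # Pick the k highest-probability experts for this token.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)        # renormalize over selected experts
    return sum(probs[i] / norm * experts[i](token) for i in top)

# Four toy "experts": each just scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
out = moe_forward(10.0, router_logits=[0.1, 2.0, 0.1, 2.0], experts=experts)
print(out)  # → 30.0 (experts 1 and 3 tie, each weighted 0.5: (20 + 40) / 2)
```

The efficiency argument is visible even here: only `top_k` of the experts execute per token, so parameter count grows with the expert pool while per-token compute does not.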
7/ 10
A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026
The first two months of 2026 saw a significant surge in the release of advanced open-weight LLM architectures, characterized by highly efficient Mixture-of-Experts (MoE) configurations and multimodal integration. These...
7/ 10
Train AI models with Unsloth and Hugging Face Jobs for FREE
Hugging Face Jobs, integrated with Unsloth, provides a managed cloud GPU environment for fine-tuning small language models (SLMs). This workflow can be automated using coding agents like Claude Code and Codex through...
7/ 10
GGML and llama.cpp join HF to ensure the long-term progress of Local AI
The development team behind GGML and llama.cpp, led by Georgi Gerganov, has joined Hugging Face to provide long-term sustainable resources for local AI inference. The collaboration aims to unify model definition via the...
6/ 10
IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST
IBM Research and UC Berkeley have introduced a diagnostic framework to move beyond simple success-rate metrics in evaluating agentic LLM systems for IT automation. By applying the Multi-Agent System Failure Taxonomy...
7/ 10
Transformers.js v4: Now Available on NPM!
Transformers.js v4 introduces a new WebGPU runtime written in C++ and transitions the project to a pnpm monorepo structure. This update enables high-performance, hardware-accelerated AI model execution across various...
6/ 10
Community Evals: Because we're done trusting black-box leaderboards over the community
Hugging Face has introduced a decentralized evaluation system designed to aggregate and transparently report benchmark scores across the Hub. This feature allows the community to contribute results via Pull Requests,...
7/ 10
We Got Claude to Build CUDA Kernels and teach open models!
The `upskill` tool enables the transfer of complex capabilities from large-scale "teacher" models, such as Claude Opus 4.5, to smaller or open-source "student" models using "agent skills." By converting execution traces...
7/ 10
Categories of Inference-Time Scaling for Improved LLM Reasoning
Inference-time scaling, also referred to as test-time or inference-compute scaling, involves allocating additional computational resources and time during the generation phase to improve the accuracy and reasoning of...
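One of the simplest categories is parallel sampling with best-of-N selection: draw several candidate answers and keep the one a verifier scores highest. A deterministic toy sketch, with stubs standing in for the LLM and the reward model (all names below are made up):

```python
# Best-of-N sampling: spend more inference compute by generating N candidates
# and selecting the best one under a scoring function.

def best_of_n(generate, score, prompt, n=8):
    """Generate n candidates for `prompt` and return the highest-scoring one."""
    candidates = [generate(prompt, i) for i in range(n)]
    return max(candidates, key=score)

# Stub "model": the i-th sample is a deterministic guess near 17 * 24 = 408.
def generate(prompt, i):
    return 405 + i

# Stub "verifier": closer to the true product scores higher.
def score(answer):
    return -abs(answer - 17 * 24)

print(best_of_n(generate, score, "What is 17 * 24?"))  # → 408
```

Accuracy improves with `n` only insofar as the verifier can tell good candidates from bad ones, which is why verifier quality dominates this regime.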
7/ 10
The State Of LLMs 2025: Progress, Problems, and Predictions
The LLM landscape in 2025 has shifted from pure architectural scaling to the implementation of reasoning-focused architectures using Reinforcement Learning with Verifiable Rewards (RLVR) and GRPO. This transition...
7/ 10
From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates
DeepSeek has evolved its model architecture from the base DeepSeek V3 and the dedicated reasoning model DeepSeek R1 toward a hybrid architecture in the V3.1 and V3.2 series. This evolution includes the implementation of...
7/ 10
Beyond Standard LLMs
While standard transformer-based LLMs using quadratic attention remain the industry standard, recent architectural shifts are exploring linear attention hybrids and subquadratic mechanisms. These alternatives aim to...
7/ 10
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch)
Large Language Model (LLM) evaluation is categorized into two primary frameworks: benchmark-based evaluation and judgment-based evaluation. These frameworks encompass four main approaches—multiple-choice benchmarks,...
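The multiple-choice benchmark approach reduces to comparing the model's picked option against a gold label and reporting accuracy. A minimal sketch with a lookup-style stub model (real harnesses typically compare per-option log-likelihoods instead; the dataset and names below are made up):

```python
# Multiple-choice benchmark evaluation: the model picks one option letter per
# question; accuracy is the fraction of picks that match the gold answer.

def evaluate(model, dataset):
    correct = sum(model(q["question"], q["choices"]) == q["answer"] for q in dataset)
    return correct / len(dataset)

dataset = [
    {"question": "2 + 2 = ?", "choices": ["3", "4", "5"], "answer": "B"},
    {"question": "Capital of France?", "choices": ["Paris", "Rome", "Oslo"], "answer": "A"},
    {"question": "Largest planet?", "choices": ["Mars", "Venus", "Jupiter"], "answer": "C"},
]

# Stub model that answers "B" for everything, a common baseline sanity check.
always_b = lambda question, choices: "B"
print(evaluate(always_b, dataset))  # → 0.3333... (1 of 3 correct)
```

A constant-answer baseline like this is worth running first: any model that cannot beat it on a balanced benchmark is not extracting signal from the questions.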
7/ 10
Understanding and Implementing Qwen3 From Scratch
Qwen3 is a family of open-weight large language models (LLMs) released under the Apache License v2.0. The architecture is highly scalable, ranging from 0.6B dense models to 480B parameter Mixture-of-Experts (MoE)...
8/ 10
Understanding and Coding the KV Cache in LLMs from Scratch
The KV cache is a technique used during LLM inference to store intermediate key (K) and value (V) computations for previously processed tokens. This mechanism eliminates the need to recompute these vectors for every new...
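The mechanism reduces to a growing per-layer store that decoding appends to. A toy sketch, assuming scalar stand-ins for the learned K and V projections (the class and function names are made up):

```python
# Minimal sketch of a KV cache: keys and values for past tokens are stored
# and reused, so each decoding step only projects the newest token instead
# of recomputing K and V for the entire prefix.

class KVCache:
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)
        return self.keys, self.values      # full history, ready for attention

def project(token):
    # Toy stand-ins for the learned K and V projection matrices.
    return token * 2, token * 3

cache = KVCache()
for token in [1, 2, 3]:                    # decode three tokens in sequence
    k, v = project(token)                  # computed once per token, never redone
    keys, values = cache.append(k, v)

print(keys, values)  # → [2, 4, 6] [3, 6, 9]
```

Without the cache, step t would recompute projections for all t prefix tokens; with it, each step does constant projection work, trading memory (the stored K and V) for compute.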
7/ 10
The State of Reinforcement Learning for LLM Reasoning
The paradigm of Large Language Model (LLM) development is shifting from simple scaling of parameters and data to the strategic use of reinforcement learning (RL) to enhance reasoning capabilities. While conventional...
7/ 10
The State of LLM Reasoning Model Inference
Recent advancements in LLM reasoning focus on scaling compute during the inference phase to improve performance on complex, multi-step tasks. This approach allows models to "think longer" by increasing the number of...
7/ 10
Understanding Reasoning LLMs
Reasoning LLMs are specialized models optimized for complex, multi-step tasks such as mathematics, coding, and logic puzzles. These models utilize techniques like reinforcement learning and inference-time scaling to...