📁 LLM

The LLM archive monitors model releases, quantization updates, reasoning capabilities, and real-world deployment implications for local and hybrid AI. We focus on what materially changes selection and operations: context windows, latency, memory footprint, licensing, and evaluation evidence across open and commercial families. This section is designed for teams that need dependable model intelligence, not hype cycles. Pair these updates with the LLM pillar and references to hardware constraints and framework integration.

OpenAI has announced that on February 13, 2026, it will retire the GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini models from ChatGPT. The decision does not currently impact the APIs. This announcement follows the previous communication regarding the retirement of GPT-5 (Instant, Thinking, and Pro).

2026-01-29 Source

"Distilled" models such as Qwen 8B DeepSeek R1 have demonstrated reasoning capabilities beyond what their size would suggest. The article asks why more models of this kind, able to run on resource-constrained hardware, have not appeared.

2026-01-29 Source

While major companies pour billions into large language models, San Francisco-based startup Logical Intelligence is taking a different approach to achieving AGI, aiming to emulate the human brain. The company seeks to develop artificial intelligence that more closely resembles human reasoning.

2026-01-29 Source

OpenAI built an in-house AI data agent that uses GPT-5, Codex, and memory to reason over massive datasets and deliver reliable insights in minutes, enhancing data processing and analysis efficiency.

2026-01-29 Source

OpenAI has released Prism, a free AI-powered workspace for scientists. This tool, integrated with GPT-5.2, aims to facilitate the writing of scientific papers and collaboration. However, some researchers fear that Prism could contribute to an increase in low-quality publications, an existing problem in the sector.

2026-01-29 Source

Google has announced Project Genie, a new tool for generating virtual worlds powered by advanced AI models like Genie 3, Nano Banana Pro, and Gemini. Initially available to AI Ultra subscribers in the U.S., it offers new creative possibilities.

2026-01-29 Source

Anthropic's secret to building a better AI assistant might be treating Claude like it has a soul—whether or not anyone actually believes that's true. Anthropic released Claude's Constitution, outlining the company's vision for how its AI assistant should behave, notable for the highly anthropomorphic tone it takes toward Claude. It remains unclear whether this is a development strategy or a genuine belief about the nature of AI.

2026-01-29 Source

The Qwen3-ASR family includes 1.7B and 0.6B parameter models that can identify the spoken language and transcribe audio in 52 languages and dialects. The larger model achieves performance comparable to proprietary commercial APIs, making it a viable open-source alternative for speech recognition applications.

2026-01-29 Source

An engineer has built Mini-LLM, an 80-million-parameter transformer language model based on the Llama 3 architecture, from scratch. The project covers tokenization, memory-mapped data loading, mixed-precision training, and inference with KV caching. It is suited to students who want to understand modern LLM architecture.
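
As an illustration of the KV caching mentioned above, here is a minimal single-head sketch in plain Python. It is not taken from Mini-LLM: the names `KVCache`, `attend`, and `full_recompute` are hypothetical, and the query, key, and value vectors are passed in directly rather than produced by learned projections. The point it tries to show is that appending one key/value pair per step gives the same attention output as recomputing over the whole prefix.

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query over the given positions."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

class KVCache:
    """Hypothetical cache holding keys and values of tokens already processed."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, query, key, value):
        # Store only the newest token's key/value, then attend over the cache.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)

def full_recompute(queries, keys, values, t):
    """Reference path: causal attention at position t over the full prefix."""
    return attend(queries[t], keys[: t + 1], values[: t + 1])
```

During generation, `step` would be called once per new token, so earlier keys and values are reused rather than recomputed; the cost is memory that grows with sequence length.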

2026-01-29 Source

OpenMOSS has released MOVA (MOSS-Video-and-Audio), a fully open-source model with 18 billion active parameters (MoE architecture, 32 billion total). MOVA offers day-0 support for SGLang-Diffusion and targets scalable, synchronized video and audio generation.
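
For readers sizing hardware, the active versus total split matters because the two numbers drive different costs. The following is a rough back-of-the-envelope sketch, not a figure from OpenMOSS. It assumes every expert stays resident in memory, so the weight footprint follows the 32 billion total, and it ignores activations, caches, and runtime overhead. The helper `weight_memory_gib` and the bit widths are illustrative choices.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GiB (excludes activations and caches)."""
    return num_params * bits_per_param / 8 / 2**30

TOTAL_PARAMS = 32e9   # from the announcement: all experts combined
ACTIVE_PARAMS = 18e9  # from the announcement: parameters active per token

# Assumption: all experts are kept loaded, so memory scales with TOTAL_PARAMS.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gib(TOTAL_PARAMS, bits):.1f} GiB")

# Compute per token scales roughly with the active share instead.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active fraction: {active_fraction:.1%}")
```

Whether quantized builds of MOVA exist at these bit widths is not stated in the announcement; the loop only shows how the estimate changes with precision.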

2026-01-29 Source

A developer has created a system where an LLM generates procedural spells for a virtual reality prototype. The system uses a pool of spell components and converts words into instructions to create unique effects. The soundtrack was made with Suno.

2026-01-29 Source

A user discovered that Devstral 2 123B and 24B models can be forced into more consistent logical reasoning through the use of Jinja templates. Adding a specific Jinja statement appears to significantly enhance the reasoning capabilities of the models, although the smaller version may have difficulty exiting the thinking process in some configurations.

2026-01-29 Source

A new study introduces Gap-K%, a novel technique for identifying data used in the pre-training of large language models (LLMs). The method analyzes discrepancies between the model's top-1 prediction and the target token, leveraging the optimization dynamics of pre-training to improve detection accuracy.
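
To make the idea concrete, here is a hedged sketch of a gap-based score. The summary above only says the method compares the top-1 prediction with the target token, so everything else is an assumption: the per-token gap is taken as log p(top-1) minus log p(target), and the aggregation (mean of the largest k% of gaps, in the spirit of Min-K%-style detectors) is a stand-in rather than the paper's formula. The function names and the `k_percent` parameter are hypothetical.

```python
import math

def token_gaps(log_probs, target_ids):
    """Per-position gap: log p(top-1 token) - log p(target token), always >= 0."""
    gaps = []
    for dist, target in zip(log_probs, target_ids):
        # The gap is zero exactly when the target is the model's top-1 choice.
        gaps.append(max(dist) - dist[target])
    return gaps

def gap_k_score(log_probs, target_ids, k_percent=20.0):
    """Mean of the largest k% gaps (assumed aggregation, not the paper's rule)."""
    gaps = sorted(token_gaps(log_probs, target_ids), reverse=True)
    n = max(1, math.ceil(len(gaps) * k_percent / 100.0))
    return sum(gaps[:n]) / n
```

Here `log_probs` is a list of per-position log-probability vectors over the vocabulary and `target_ids` holds the actual next tokens. Under these assumptions a lower score means the model's top choice agrees with the text even at its hardest positions, which would be weak evidence that the text was seen in pre-training; any decision threshold would need to be calibrated on known member and non-member data.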

2026-01-29 Source

A 2025 workshop explores synergies between neuroscience and artificial intelligence, identifying promising areas such as embodiment, language, robotics, learning, and neuromorphic engineering. The goal is to develop NeuroAI to improve algorithms and the understanding of biological neural computations, analyzing benefits and risks through SWOT analyses.

2026-01-29 Source

Assistant_Pepe_8B, an 8 billion parameter LLM, has been released, designed to combine top-tier shitposting capabilities with actual helpfulness. The model boasts a 1 million token context window and aims to provide useful and irreverent responses, while avoiding excessive pandering. No system prompt is needed.

2026-01-29 Source