📁 LLM

The LLM archive monitors model releases, quantization updates, reasoning capabilities, and real-world deployment implications for local and hybrid AI. We focus on what materially changes selection and operations: context windows, latency, memory footprint, licensing, and evaluation evidence across open and commercial families. This section is designed for teams that need dependable model intelligence, not hype cycles. Pair these updates with the LLM pillar and references to hardware constraints and framework integration.

A hardware coder has expressed frustration with the performance of large language models (LLMs) running locally on a 5090 GPU. Despite the powerful hardware, the models seem underutilized and unable to leverage external tools to improve context. The discussion revolves around the actual utility of such setups compared to cloud-based IDEs and the tools needed to optimize local performance.

2026-01-24 Fonte

A prompt library for large language models (LLM), specifically designed for Retrieval-Augmented Generation (RAG) architectures, has been created and made available. The library includes prompts focused on grounding constraints, citation rules, and handling uncertainty and multiple sources. The templates are easily usable via copy-and-paste, and the community is invited to contribute and evaluate the prompts to improve their effectiveness.

2026-01-24 Fonte

Newelle, a virtual AI assistant for the GNOME desktop with API integration for Google Gemini, OpenAI, Groq, and also local LLMs, has a new release. Newelle has been steadily expanding its AI integration and capabilities, and with the new Newelle 1.2, there are even more capabilities for those wanting AI on the GNOME desktop.

2026-01-24 Fonte

Hugging Face has released and updated several AI and machine learning models. These include multilingual reasoning models like GLM-4.7, tools for automated report generation, and multimodal models for translation and medical image processing. Also noteworthy are models for image editing and video generation, as well as solutions for speech recognition and customized text-to-speech.

2026-01-24 Fonte

A Reddit user is seeking an uncensored large language model (LLM) capable of generating particularly spicy and intelligent prompts for sexually explicit role-playing games (NSFW). The discussion is open within the LocalLLaMA community, with the aim of identifying suitable solutions for this type of application.

2026-01-24 Fonte

A user reported a significant performance drop with GLM 4.7 Flash in LM Studio after exceeding 10,000 tokens, despite using recommended settings and updated software. The discussion explores whether other implementations, such as vllm, might mitigate this issue. A patch for ik_llama.cpp seems to address the slowdown, but compiling it is proving difficult.

2026-01-24 Fonte

A developer has created Context Engine, a self-hosted retrieval system for codebases, designed to work with various MCP clients. It uses a hybrid search that combines dense embeddings with lexical search and AST parsing. The goal is to avoid overloading LLMs with irrelevant contexts or missing important information, keeping the code local and compatible with different models.

2026-01-24 Fonte

A new data-driven report examines ChatGPT adoption across industries, highlighting key automated tasks, departmental usage patterns, and the future prospects of AI in the workplace. The analysis is based on concrete data to provide a clear and useful overview for businesses.

2026-01-24 Fonte

LuxTTS, a diffusion-based text-to-speech model with only 120 million parameters, has been released. It stands out for its high-quality voice cloning capabilities, comparable to models ten times larger, and its efficiency, requiring less than 1GB of VRAM. The speed is remarkable, exceeding real-time performance several times over even on CPUs. The code is available on GitHub, with the model hosted on Hugging Face.

2026-01-24 Fonte

AMI Labs, Yann LeCun's new venture after leaving Meta, has immediately captured the attention of the industry. The company will focus on developing advanced AI models, promising to revolutionize the field of artificial intelligence. LeCun, a leading figure in the AI world, aims for new frontiers with this startup.

2026-01-24 Fonte

South Korea is engaged in an intense competition to develop its own artificial intelligence. This "AI Squid Game," as it has been dubbed, sees various companies and institutions vying for supremacy in the field of AI, with the goal of achieving technological independence and competing globally.

2026-01-24 Fonte

Donald Trump and major AI companies shared the stage at the World Economic Forum in Davos. This episode of 'Uncanny Valley' analyzes the implications of this meeting, exploring the dynamics between politics, technology, and the global economy. A focus on the hot topics of the moment.

2026-01-23 Fonte

Google Photos introduces a new feature that allows users to create custom memes from their photos. The integration leverages Google's Gemini AI, offering a fun way to experiment with images.

2026-01-23 Fonte
📁 LLM AI generated

Unrolling the Codex agent loop

A technical deep dive into the Codex agent loop, explaining how Codex CLI orchestrates models, tools, prompts, and performance using the Responses API. We explore the architecture and inner workings of this key component for developing applications based on language models.

2026-01-23 Fonte

OpenAI has outlined its PostgreSQL scaling strategies to support ChatGPT's 800 million users. The original article delves into the challenges faced and the solutions implemented to manage such a high workload, while ensuring optimal performance and service reliability.

2026-01-23 Fonte

Sweep AI has released a 1.5B parameter open-source model, named Sweep, designed to predict the next code edits. Available on Hugging Face and via a JetBrains plugin, this tool uses recent edits as context, outperforming larger models in speed and accuracy. Training involved both SFT and RL, with a focus on prompt format and code cleanup.

2026-01-23 Fonte

Meta has temporarily paused teen access to its AI characters. The company is developing new versions of these characters, designed to provide age-appropriate responses. The move is a precautionary measure, pending the release of the updates.

2026-01-23 Fonte

A behind-the-scenes look at 404 Media. This week, the focus is on the impact of generative artificial intelligence, a conference on money laundering, and the removal of symbols related to slavery. The interview with the Wikimedia Foundation CTO addresses the challenges and opportunities of AI for Wikipedia, a crucial site both as a source of training data and as a potential victim of AI-generated content.

2026-01-23 Fonte

Meta is developing new versions of its AI characters, designed to provide age-appropriate responses to teenagers. The company has temporarily paused access to this feature for younger users in order to refine and calibrate the responses provided by the artificial intelligence.

2026-01-23 Fonte