Topic / Trend Rising

Advancements in Large Language Models (LLMs)

Significant advancements are being made in large language models (LLMs), including new architectures, training techniques, and applications. Research focuses on improving efficiency, reducing memory requirements, and enhancing performance in specific domains.

Detected: 2026-01-19 · Updated: 2026-01-19

Related Coverage

2026-01-19 LocalLLaMA

JARVIS: Progress Report on LLM Agent Development

A Reddit user shared an update on the development of JARVIS, an agent based on large language models (LLMs). The original post includes a link to a demonstration video of the project. The development of LLM agents is a rapidly growing research area, w...

2026-01-19 LocalLLaMA

Local LLM Coding: Is it Still Worth it with a 16GB GPU?

A user with a 16GB Nvidia RTX 5070 Ti GPU asks whether coding with a local large language model (LLM) is still worthwhile. Experience with Kilo Code and Qwen 2.5 Coder 7B via Ollama revealed problems with context management: the available context quickly runs out even wit...
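
As a rough illustration of why context is the limiting factor on a 16GB card, the sketch below estimates KV-cache memory for a hypothetical 7B-class model. The layer count, head dimensions, and cache precision are assumed values chosen for illustration, not the actual Qwen 2.5 Coder 7B configuration.

```python
# Back-of-the-envelope KV-cache sizing for a local coding model.
# All model dimensions below are illustrative assumptions, not the
# actual Qwen 2.5 Coder 7B configuration.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_len, bytes_per_value):
    # Each layer stores one key and one value vector per token per KV head.
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_value

if __name__ == "__main__":
    for context_len in (8_192, 32_768, 131_072):
        size = kv_cache_bytes(
            num_layers=32,       # assumed
            num_kv_heads=8,      # assumed (grouped-query attention)
            head_dim=128,        # assumed
            context_len=context_len,
            bytes_per_value=2,   # fp16 cache; a quantized cache would roughly halve this
        )
        print(f"{context_len:>7} tokens -> {size / 2**30:.2f} GiB of KV cache")
```

With these assumed sizes the cache alone grows from about 1 GiB at 8k tokens to 16 GiB near the 128k mark, before the model weights are even counted, which is consistent with the "context quickly runs out" experience described in the post.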

#Hardware #LLM On-Premise
2026-01-19 DigiTimes

Apple-Google AI partnership could reshape voice assistant market

A potential collaboration between Apple and Google in the field of artificial intelligence could reshape the voice assistant market. The partnership, if realized, would have an estimated value of up to $5 billion. Implications and details of the agre...

2026-01-19 ArXiv cs.CL

Conversational Agents: Does Conciseness Reduce Expertise?

A new study analyzes the unexpected side effects of using specific stylistic features in prompts for conversational agents based on large language models (LLMs). The research reveals how prompting for conciseness can compromise the perceived expertis...

#Fine-Tuning
2026-01-19 ArXiv cs.CL

BYOL: Bring Your Own Language Into LLMs

A new study introduces BYOL, a framework for improving the performance of large language models (LLMs) in languages with limited digital presence. BYOL classifies languages based on available resources and adapts training techniques, including synthe...

2026-01-19 ArXiv cs.AI

LLMs: How Do They Assess Trustworthiness of Online Information?

Large language models (LLMs) are increasingly important in online search and recommendation systems. New research analyzes how these models encode perceived trustworthiness in web narratives, revealing that models internalize psychologically grounded...

#Fine-Tuning
2026-01-19 LocalLLaMA

cuda-nn: Custom MoE inference engine in Rust/CUDA without PyTorch

cuda-nn, a MoE (Mixture of Experts) inference engine developed in Rust, Go, and CUDA, has been introduced. This open-source project stands out for its ability to handle models with 6.9 billion parameters without PyTorch, thanks to manually optimized ...
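
The post focuses on the engineering rather than the math, but the core operation any MoE engine has to implement is top-k expert routing. The NumPy sketch below shows that routing step in its simplest form; it is a generic illustration, not cuda-nn's hand-optimized Rust/CUDA kernels, and all shapes are made up.

```python
import numpy as np

# Minimal top-k Mixture-of-Experts routing sketch (illustrative only;
# cuda-nn implements this with hand-written CUDA kernels, not NumPy).

def moe_layer(x, gate_w, expert_ws, top_k=2):
    """x: (tokens, d_model); gate_w: (d_model, n_experts); expert_ws: list of (d_model, d_model)."""
    logits = x @ gate_w                                    # router score per token per expert
    top = np.argsort(logits, axis=-1)[:, -top_k:]          # indices of the k best experts per token
    weights = np.take_along_axis(logits, top, axis=-1)
    weights = np.exp(weights - weights.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # softmax over the selected experts only
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                            # dispatch each token to its experts
        for k in range(top_k):
            e = top[t, k]
            out[t] += weights[t, k] * (x[t] @ expert_ws[e])
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))
gate = rng.standard_normal((16, 8))
experts = [rng.standard_normal((16, 16)) for _ in range(8)]
print(moe_layer(x, gate, experts).shape)  # (4, 16)
```

Only the selected experts' weights are touched per token, which is why a 6.9B-parameter MoE model can run with far less compute per token than a dense model of the same size.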

2026-01-19 LocalLLaMA

DetLLM: tool to ensure deterministic inference in LLMs

A developer has created DetLLM to address the issue of non-reproducibility in LLM inference. The tool verifies repeatability at the token level, generates a report, and creates a minimal reproduction package for each run, including environment snapsh...
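
The post does not show DetLLM's interface, but the check it describes is easy to state: run the same prompt several times and compare the generated token ids position by position. A minimal sketch of that idea follows; `generate` is a placeholder for whatever inference backend is being tested, not DetLLM's actual API.

```python
# Minimal token-level repeatability check, in the spirit of what the post
# describes. `generate` is assumed to return a list of token ids.

def first_divergence(runs):
    """Return (run_index, token_position) of the first mismatch, or None."""
    reference = runs[0]
    for i, run in enumerate(runs[1:], start=1):
        for pos, (a, b) in enumerate(zip(reference, run)):
            if a != b:
                return i, pos
        if len(run) != len(reference):
            return i, min(len(run), len(reference))
    return None

def check_determinism(generate, prompt, n_runs=5):
    runs = [generate(prompt) for _ in range(n_runs)]
    mismatch = first_divergence(runs)
    if mismatch is None:
        print(f"deterministic across {n_runs} runs ({len(runs[0])} tokens)")
    else:
        run_idx, pos = mismatch
        print(f"run {run_idx} diverges from run 0 at token position {pos}")

if __name__ == "__main__":
    # Trivial stand-in backend, purely for demonstration.
    check_determinism(lambda prompt: [ord(c) for c in prompt], "hello", n_runs=3)
```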

2026-01-19 LocalLLaMA

SLM Prompting: How to Outperform Larger Language Models?

A user asks how to get the most out of small language models (SLMs), especially when they are fine-tuned for a specific topic. The challenge is that traditional prompts, effective with large language models (LLMs), often produce incoherent results w...

2026-01-18 LocalLLaMA

Analyzing 1M+ Emails for Context Engineering: Key Learnings

A team processed over a million emails to turn them into structured context for AI agents. The analysis revealed that thread reconstruction is complex, attachments are crucial, multilingual conversations are frequent, and data retention is a hurdle f...

2026-01-18 LocalLLaMA

Faster LLM Inference with Speculative Decoding

Speculative Decoding promises a 2x-3x speedup in large language model (LLM) inference without sacrificing accuracy. By using a smaller model to generate draft tokens and then verifying them in parallel with the main model, hardware utilization ...
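
As a sketch of the mechanism described here: the small model drafts a handful of tokens one by one, the large model scores all of them in a single forward pass, and the longest prefix the large model agrees with is kept. The version below is the simple greedy variant, with both model calls left as placeholders rather than any particular library's API.

```python
# Sketch of greedy speculative decoding: a small draft model proposes
# `k` tokens, the large target model checks them in a single forward
# pass, and the longest agreeing prefix is accepted. The two model
# functions are placeholders, not a specific library's API.

def speculative_step(prefix, draft_next, target_argmax_tokens, k=4):
    # 1) The draft model proposes k tokens autoregressively (cheap).
    drafted = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        drafted.append(tok)
        ctx.append(tok)

    # 2) The target model scores prefix + drafts in ONE pass and returns
    #    its own greedy choice at every drafted position, plus one extra.
    target_choices = target_argmax_tokens(prefix, drafted)  # length k + 1

    # 3) Accept drafted tokens for as long as they match the target's choices.
    accepted = []
    for d, t in zip(drafted, target_choices):
        if d == t:
            accepted.append(d)
        else:
            accepted.append(t)   # first disagreement: take the target's token and stop
            break
    else:
        accepted.append(target_choices[k])  # all k accepted: keep the bonus token

    return list(prefix) + accepted
```

Because every accepted draft token saves one full forward pass of the large model, the realized speedup depends directly on how often the draft model's guesses are accepted.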

#Hardware #Fine-Tuning
2026-01-17 LocalLLaMA

Prompt Repetition Improves Non-Reasoning LLMs

New research demonstrates that repeating prompts can significantly improve the performance of large language models (LLMs) in tasks that do not require complex reasoning. The approach does not impact latency and could become a standard practice.
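
The summary does not reproduce the paper's exact template, so the snippet below is only a hypothetical illustration of what "repeating the prompt" means in practice: the same question is stated more than once before the model is asked to answer.

```python
# Hypothetical prompt-repetition template; the exact wording used in the
# paper is not given in the post, so treat this format as an assumption.

def repeated_prompt(question: str, repetitions: int = 2) -> str:
    body = "\n\n".join([question] * repetitions)
    return body + "\n\nAnswer:"

print(repeated_prompt("List the prime numbers between 10 and 30."))
```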

2026-01-17 LocalLLaMA

DeepSeek Engram: A static memory unit for LLMs

DeepSeek AI introduced Engram, a novel static memory unit for LLMs. Engram separates remembering from reasoning, allowing models to handle larger contexts and improve performance in complex tasks like math and coding, all while reducing the computati...

#Hardware
2026-01-16 ArXiv cs.CL

AI Creativity: Advanced Workflows for Original Research Plans

A new study explores how multi-step workflows based on large language models (LLMs) can generate more innovative and feasible research plans. By comparing different architectures, the research highlights how decomposition-based and long-context analy...

2026-01-15 ArXiv cs.LG

SGFM: A Physics-Inspired Approach to Generative Models

Spectral Generative Flow Models (SGFM) are proposed as an alternative to transformer-based large language models. By leveraging constrained stochastic dynamics in a multiscale wavelet basis, SGFM offers a generative mechanism grounded in continuity, ...

2026-01-14 ArXiv cs.LG

Hierarchical Compression for LLMs: Reducing Memory and Compute

A novel approach to compressing large language models (LLMs) promises to significantly reduce memory and compute requirements. The technique, called Hierarchical Sparse Plus Low-Rank (HSS) compression, combines sparsity with low-rank ...
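
The abstract only names the two ingredients, so the sketch below shows the generic sparse-plus-low-rank decomposition they refer to: a weight matrix W is approximated as S + U @ V, with S keeping only the largest-magnitude entries and U @ V a truncated SVD of the remainder. This illustrates the decomposition itself, not the paper's hierarchical HSS procedure or its reported savings.

```python
import numpy as np

# Generic sparse-plus-low-rank approximation W ~= S + U @ V, as an
# illustration of the ingredients named in the abstract; this is not
# the paper's hierarchical HSS algorithm.

def sparse_plus_low_rank(W, rank=8, sparsity=0.01):
    # Sparse part S: keep the largest-magnitude entries of W.
    k = max(1, int(sparsity * W.size))
    threshold = np.partition(np.abs(W).ravel(), -k)[-k]
    S = np.where(np.abs(W) >= threshold, W, 0.0)

    # Low-rank part: truncated SVD of the residual.
    U, s, Vt = np.linalg.svd(W - S, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]
    V_r = Vt[:rank, :]
    return S, U_r, V_r

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
S, U_r, V_r = sparse_plus_low_rank(W, rank=32, sparsity=0.02)
err = np.linalg.norm(W - (S + U_r @ V_r)) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.3f}")
```

Storing S as indices plus values alongside the two thin factors is where the memory savings come from: at rank r and sparsity s, an m-by-n matrix costs roughly s·m·n + r·(m+n) values instead of m·n.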

#Fine-Tuning
2026-01-13 ArXiv cs.CL

Operation Veja: A New Approach for More Realistic Characters

A new study identifies the limitations of current roleplaying models, which struggle to reproduce believable characters. The VEJA (Values, Experiences, Judgments, Abilities) framework proposes a new training method based on manually curated data, ach...

#Fine-Tuning #RAG
2026-01-12 TechCrunch AI

Anthropic’s new Cowork tool offers Claude Code without the code

Anthropic has introduced Cowork, a new feature integrated into the Claude Desktop app. Cowork allows users to designate specific folders where Claude can read or modify files, with further instructions given through the standard chat interface. The g...
