On-Premise AI Hardware Guide

Other

Linux MD RAID5 gains up to 17% scalability boost: implications for on-prem storage

A fresh patch series for Linux MD RAID5 brings scalability improvements of 10–17% in certain configurations. The development is directly relevant...

2026-06-27

LLM

Orthrus brings diffusion head to Qwen 3.5/3.6 and Gemma 4: open-source code dropping soon

Orthrus models with a diffusion head are about to land on Hugging Face, joined by full end-to-end training and evaluation code. A pairing that...

2026-06-27

Frameworks

GNOME’s AI Assistant Now Generates Images: Newelle 1.4.5 Arrives

After three years of development, Newelle reaches version 1.4.5 with two major updates: AI image generation support and a redesigned chat...

2026-06-27

Other

The next AI won’t be powered by better models alone

Oxylabs CEO suggests the real leap lies beyond models — in data quality and freshness. For those running LLMs on-prem, data sovereignty and robust...

2026-06-27

Hardware

A 96GB VRAM RTX 5090 from Shenzhen's Huaqiangbei Market for $8,200

A hands-on report from Shenzhen's Huaqiangbei market confirms offers of modified GeForce RTX 5090 cards with 96GB of VRAM. With a base card cost...

2026-06-27

Other

US clears Anthropic to restore Mythos 5 to a small group of cyber defenders

The Commerce Department greenlights Anthropic to restore access to Mythos 5, its most powerful cybersecurity model, for a select group of trusted...

2026-06-27

Frameworks

Llama.cpp cuts CUDA synchronizations, boosting on-premise inference performance

A recent llama.cpp commit reintroduces more aggressive asynchronous handling for CUDA backends, cutting synchronizations between tokens and...

2026-06-27

Hardware

AI chip demand squeezes global freight, putting on-premise plans at risk

Surging demand for AI accelerators is congesting air and sea freight, driving up shipping rates. For enterprises building on-premise LLM...

2026-06-27

Market

SYM's Profits Fall in 2025 Despite Record Market Share

The Taiwanese motorcycle manufacturer saw profits decline in 2025, even as it captured its highest-ever market share. A paradox that mirrors...

2026-06-27

Hardware

JCET's US$1.1bn expansion shows where China's AI chip crunch is moving

JCET’s $1.1 billion expansion in advanced packaging shows China's strategy to bypass semiconductor restrictions and secure AI accelerator supply....

2026-06-27

LLM

Qwen Fine-tunes: Why Optimized Models Struggle to Impress

Despite the popularity of fine-tuning Qwen models, concrete evidence of versions truly outperforming the base is scarce. This raises questions...

2026-06-27

Frameworks

DeepSeek V4 Flash and MiniMax M3 on llama.cpp: When will native support arrive?

The community is waiting for official integration of DeepSeek V4 Flash and MiniMax M3 models into llama.cpp. Forks provide partial solutions, but...

2026-06-27

LLM

DeepSeek-V4-Pro-DSpark: A New Open-Source LLM Targeting Local Deployment

DeepSeek releases the V4-Pro-DSpark model on Hugging Face along with the DSpark technical paper. This release fuels the strategy of those betting...

2026-06-27

LLM

Ornith-1.0-35B Q3_K_M: 17 GB VRAM, all benchmarks pass, extreme quantization holds up

Ornith-1.0-35B has been quantized to Q3_K_M, achieving 16.8 GB on disk and ~17 GiB loaded VRAM. Validated with KL divergence probes and 14/14...

2026-06-27

LLM

Distilling Your Own LLM for Theorem Proving: When On-Premise Beats the Cloud

A user with hardware funding but no cloud credits considers distilling an LLM for theorem proving in Rocq, a niche lacking tailored models. The...

2026-06-27

Hardware

Liquid Cooling Comes to 800V DC Busbars for AI Data Centers

At Wiwynn's booth, TE Connectivity's new 800V DC busbars with integrated liquid cooling were on display. It's a clear sign that power delivery for...

2026-06-27

LLM

Anthropic’s Mythos 5 Authorized for Over 100 US Entities: A Turn for Sovereign AI?

The Trump administration has authorized over 100 companies and government agencies to use Anthropic's Mythos 5, including their non-American...

2026-06-27

LLM

Trump Administration Allows Anthropic to Release Mythos to Select US Organizations

After weeks of negotiations, the White House authorized Anthropic to restore access to its most advanced model, Mythos, for a select group of US...

2026-06-27

Other

South Korea to train entire military as drone warriors: Edge AI puts inference on the front line

Seoul aims to make drones a universal combat tool for its half-million troops, inspired by lessons from Ukraine. The move shifts the weight of...

2026-06-26

Frameworks

llama.cpp: Vulkan Tensor Parallelism Now Within Reach

Pull request #25051 by Piotr ‘pwilkin’ makes Vulkan tensor parallelism usable in llama.cpp, opening LLM inference to non-NVIDIA GPUs. A concrete...

2026-06-26

Other

Nemotron-3-Super Nails 504K-Token Needle Retrieval on 4× RTX 3090

NVIDIA's hybrid Mamba+MoE model, quantized to 71 GB, runs entirely on consumer GPUs and achieves perfect needle retrieval up to 504,482 tokens....

2026-06-26

Frameworks

A software veteran builds a local LLM harness and asks the community: what do you need?

A developer with 45 years of enterprise tooling experience is about to release an open-source harness designed to simplify local LLM deployment....

2026-06-26

Market

Ford had to rehire 350 engineers after AI got vehicle quality wrong

The automaker admitted it overestimated AI capabilities in quality control and had to rehire 350 engineers. The story reignites the debate on...

2026-06-26

Market

Microsoft built supercomputer to help OpenAI infringe copyrights, NYT alleges in amended complaint

The New York Times amends its complaint, alleging Microsoft built a bespoke supercomputer to enable OpenAI’s copyright infringement. The filing...

2026-06-26

Market

Zettabyte calls for a quality standard in AI compute as demand surges

Zettabyte urges a new standard for AI compute quality amid a two-year surge in demand, as organizations struggle to compare on-prem and cloud...

2026-06-26

Market

DrayTek's revenue slide drags into 2026, company bets on Wi-Fi 7 and cybersecurity

The Taiwanese networking gear maker faces a revenue decline extending into 2026. It is betting on Wi-Fi 7 and cybersecurity to turn things around,...

2026-06-26

Hardware

Intel readies HDR support for DP MST configurations on Linux

The Intel Linux kernel graphics driver is set to fix a gap: the inability to use HDR over DisplayPort Multi-Stream Transport connections. This...

2026-06-26

Other

OpenAI limits GPT-5.6 rollout after government request, says restrictions shouldn't be the norm

OpenAI restricted the rollout of GPT-5.6 following a government request, sparking debate on digital sovereignty and LLM access. The move puts a...

2026-06-26

Market

OpenAI hires ex-Uber India chief to lead expansion in its largest non-U.S. market

The hire strengthens OpenAI’s push into India, a crucial market for scale and opportunity. Bringing in a veteran with deep local experience...

2026-06-26

Other

On-prem LLMs: the workflow you wish you had discovered sooner

A Reddit thread asks which local AI workflow made the biggest difference. The answers reveal that the real value lies not in models but in...

2026-06-26

Hardware for Local Intelligence