Topic / Trend Rising

AI Hardware & Semiconductor Industry

This trend covers rapid advances in the design, manufacturing, and performance of AI-specific silicon such as GPUs, TPUs, ASICs, and high-bandwidth memory (HBM). It highlights the intense competition and the strategic importance of advanced semiconductor technology for AI workloads.

Detected: 2026-05-06 · Updated: 2026-05-06

Related Coverage

2026-05-06 DigiTimes

AI Revolutionizes Semiconductor Testing: AEM CEO's Vision

The CEO of AEM highlights how artificial intelligence is radically transforming the semiconductor testing sector. This evolution presents new challenges and opportunities for the industry, driving the adoption of more efficient and automated solution...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-06 DigiTimes

VIS Joins CoWoS Chain: New Interposer Foundry in Singapore Backed by TSMC

Vanguard International Semiconductor (VIS) is joining the CoWoS supply chain, crucial for AI chips. An interposer foundry in Singapore, backed by TSMC, strengthens the production of essential components for high-bandwidth memory integration. This dev...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-06 DigiTimes

Flex Exceeds 2027 Outlook, Plans AI Data Center Unit Spinoff

Flex announced financial prospects for 2027 that surpassed expectations, alongside a plan to spin off its artificial intelligence data center unit. This strategic move highlights the growing importance of AI infrastructure and companies' willingness ...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-06 DigiTimes

MediaTek's Airoha Targets Optical Growth for AI Networking

Airoha, a MediaTek unit, is focusing its efforts on the artificial intelligence networking sector. The company aims for "triple optical growth," highlighting the importance of high-speed interconnections to support increasing AI workloads. This focus...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-06 DigiTimes

AMD and AI: CPUs Return to the Main Event

Artificial intelligence is redefining the role of Central Processing Units (CPUs) in IT infrastructure. Recent statements from AMD, via CEO Lisa Su, highlight how AI is bringing CPUs back into focus, influencing deployment strategies and TCO consider...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-06 DigiTimes

Synnex Reports Record First-Quarter Revenue and Profit Driven by AI Demand

Synnex announced exceptional financial results for its first quarter, achieving record revenue and profit. This growth is attributed to strong demand in the artificial intelligence sector, which is fueling sales in both the semiconductor and cloud se...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-06 DigiTimes

Lumentum Sees Explosive Expansion as AI Demand Fuels Record Results

Lumentum, a key supplier of optical components, is experiencing explosive growth and record financial results, driven by the increasing demand in the artificial intelligence sector. This trend highlights the critical importance of high-speed network ...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-06 DigiTimes

AMD Lifts Outlook: AI Demand Fuels Data Center Growth

AMD has raised its financial outlook, citing robust demand for AI solutions that is fueling data center expansion. This trend underscores the growing need for dedicated hardware for artificial intelligence workloads, prompting companies to carefully ...

#Hardware #LLM On-Premise #DevOps
2026-05-05 LocalLLaMA

AMD Strix Halo and llama.cpp: MTP Accelerates On-Premise LLM Inference

A recent experiment showcased a significant performance boost in Large Language Model (LLM) inference on AMD Strix Halo hardware, leveraging `llama.cpp` with Multi-Token Prediction (MTP) support. The setup, featuring a system with 128GB of DDR5 at 80...

#Hardware #LLM On-Premise #DevOps
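The speedup pattern behind MTP can be illustrated with a toy draft-and-verify loop. This is a hedged sketch: the model, draft head, and acceptance rule below are invented stand-ins, not llama.cpp code; the real gain comes from verifying a whole draft in one batched forward pass rather than one pass per token.

```python
# Toy sketch of the draft-and-verify loop behind multi-token
# prediction (MTP) speedups. target_next and draft_tokens are
# stand-in functions for illustration, not llama.cpp APIs.

def target_next(ctx):
    """Stand-in for one expensive full-model forward pass."""
    return (sum(ctx) * 31 + 7) % 100

def draft_tokens(ctx, k):
    """Cheap draft head guessing k tokens ahead; deliberately
    wrong at every third absolute position to show rejection."""
    out, c = [], list(ctx)
    for _ in range(k):
        t = target_next(c)
        if (len(c) + 1) % 3 == 0:
            t = (t + 1) % 100  # a wrong guess
        out.append(t)
        c.append(t)
    return out

def decode(ctx, new_tokens, k=4):
    c, batched_passes = list(ctx), 0
    while len(c) - len(ctx) < new_tokens:
        draft = draft_tokens(c, k)
        batched_passes += 1  # one batched pass verifies all k drafts
        accepted = []
        for t in draft:
            expected = target_next(c + accepted)
            accepted.append(expected)  # the target's token always advances
            if t != expected:
                break  # reject the rest of the draft
        c.extend(accepted)
    # Output matches plain decoding exactly; only the pass count differs.
    return c[len(ctx):len(ctx) + new_tokens], batched_passes
```

In this toy run the draft is right about two times in three, so 12 tokens cost 4 batched verification passes instead of 12 sequential ones, while the output stays identical to plain autoregressive decoding.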
2026-05-05 The Register AI

Astera Labs Unveils NVSwitch Alternative for Rack-Scale AI Systems

Astera Labs has introduced a high-speed connectivity solution for rack-scale AI systems, positioning itself as an alternative to Nvidia's NVSwitch. The technology promises compatibility with a wide range of accelerators, offering greater flexibility ...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 The Next Web

The Volatility of the AI Chip Market: The Intel Case and On-Premise Challenges

Intel's journey in the AI chip market, from a disadvantaged position in 2025 to an all-time high in 2026, highlights the rapid evolution of the sector. This context underscores the importance of robust infrastructure strategies for on-premise LLM dep...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 Tom's Hardware

JSR and TSMC: Strategic Investment in Taiwan for Advanced Photoresists

JSR and TSMC will collaborate on building a new advanced photoresist plant in Taiwan, representing a multi-million dollar investment. Expected to be operational by 2028, this initiative aims to strengthen the semiconductor supply chain, crucial for n...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 The Next Web

Intel Bolsters AI and Client Computing Division with Qualcomm Veteran

Intel has announced the appointment of Alex Katouzian, a former Qualcomm executive with 25 years of experience in mobile, compute, and extended-reality sectors. Katouzian will lead the new Client Computing and Physical AI group, reflecting Intel's st...

#Hardware #LLM On-Premise #DevOps
2026-05-05 The Next Web

Apple Explores Intel and Samsung for M-series Chips: End of TSMC's Monopoly?

Apple has initiated preliminary discussions with Intel and Samsung regarding the production of some of its M-series chips. This move signals a potential shift in the company's silicon procurement strategy, which for nearly a decade has relied exclusi...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 DigiTimes

Tesla AI5 Dual Sourcing Strategy: Implications for AI Silicon Supply

Tesla is implementing a dual-sourcing strategy for its AI5 chip, with Samsung among the partners, though production shares may not be split evenly. This move highlights the increasing importance of supply chain diversification for AI silicon, a critical fac...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 DigiTimes

Google's TPU Push Challenges Nvidia's Neocloud AI Dominance

Google is intensifying its Tensor Processing Unit (TPU) offerings, putting pressure on Nvidia's established leadership in the cloud-based AI infrastructure market. This competition redefines dynamics for companies evaluating computing solutions for L...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 DigiTimes

Ardentec to Start AI ASIC Testing in Longtan in 3Q26

Ardentec, a semiconductor testing company, has announced the commencement of testing activities for its AI-dedicated ASICs at its Longtan plant, scheduled to begin in the third quarter of 2026. This move highlights the growing importance of specializ...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 DigiTimes

South Korea Eyes Memory-Led AI Strategy Against Nvidia Dominance

A recent DIGITIMES report suggests South Korea is exploring a "memory-led" strategy for artificial intelligence. This move indicates a potential alternative or competitive approach to Nvidia's current leadership in the AI hardware market, focusing on...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 DigiTimes

China Accelerates Domestic AI Silicon Development as Nvidia Retreats

US export restrictions are significantly reducing Nvidia's presence in the Chinese market. In response, Beijing is intensifying efforts to develop proprietary AI hardware solutions, aiming for technological self-sufficiency. This drive seeks to ensur...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 DigiTimes

AI and Chips Propel US Back to Top of Taiwan's Trade Chart

The acceleration in artificial intelligence development and the surging demand for advanced semiconductors have driven the United States to reclaim its position as Taiwan's leading trade partner. This scenario underscores the centrality of silicon in...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 DigiTimes

MediaTek's ASIC Growth Aims for 60% Market Share

MediaTek is consolidating its position in the ASIC sector, with an expansion that could lead the company to achieve 60% market share in the segment. This development reflects a strategic evolution towards higher-value-added solutions, with significan...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 DigiTimes

AI-Driven HPC Demand Fuels CHPT's Record April Revenue

CHPT reported record revenue in April, a result driven by the increasing demand for High-Performance Computing (HPC) fueled by artificial intelligence workloads. This highlights the significant impact of AI on technological infrastructure, prompting ...

#Hardware #LLM On-Premise #DevOps
2026-05-05 DigiTimes

Taiwan's Manufacturing PMI Rises: AI and Semiconductor Demand Tightens Supply

Taiwan's manufacturing PMI jumped to 60.3%, indicating strong expansion. This growth is driven by sustained demand for AI technologies and semiconductors, leading to a tightening of the global supply chain. Implications extend to hardware costs and a...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-05 DigiTimes

Onsemi Targets AI Data Centers and Treo for Revenue and Margin Recovery

Onsemi, a leading semiconductor company, has identified AI-dedicated data centers and its Treo segment as key drivers for future growth. This strategy aims to strengthen the company's position in a rapidly expanding market, focusing on essential hard...

#Hardware #LLM On-Premise #DevOps
2026-05-05 DigiTimes

CoWoS Crunch and Intel's Uncertainties: A Look at the AI Market

The scarcity of CoWoS manufacturing capacity, essential for advanced AI chips, is creating market tensions. MediaTek's hiring of Dr. Douglas Yu and growing questions about Intel's ability to meet demand highlight supply chain challenges. This scenari...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 TechCrunch AI

Cerebras, Strategic Partner of OpenAI, Prepares for Billion-Dollar IPO

Cerebras, an AI chip maker, is preparing for an initial public offering (IPO) that could value it at over $26.6 billion. Its close collaboration with OpenAI highlights the importance of strategic partnerships in the AI sector, influencing on-premise ...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 Tom's Hardware

Intel strengthens client computing with Qualcomm veteran for on-device AI

Intel has announced the appointment of Alex Katouzian, a former Qualcomm executive with 25 years of experience, to lead its client computing group. Katouzian will oversee consumer CPUs and on-device AI, a strategic move that highlights Intel's commit...

#Hardware #LLM On-Premise #DevOps
2026-05-04 Phoronix

ROCm 7.2.3: Minor Updates and XIO Documentation for AMD's AI Stack

AMD has released ROCm 7.2.3, a minor update for its open-source GPU compute and AI stack. This version, available less than a month after the previous one, introduces improvements and makes ROCm XIO documentation available. The update is relevant for...

#Hardware #LLM On-Premise #DevOps
2026-05-04 Tom's Hardware

AMD Ryzen AI 5 435G: A New Zen 5 Chip for Local AI

AMD has unveiled the Ryzen AI 5 435G APU, a six-core processor based on the Zen 5 architecture with integrated AI capabilities. Aimed at budget-conscious systems, it competes with the Ryzen 5 8600G, promising new opportunities for local inference and...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 Tom's Hardware

Intel Arc Pro B70: 32GB VRAM Workstation GPU, Doubled Performance

Intel's new Arc Pro B70 GPU, equipped with 32GB of VRAM, demonstrates significant performance in tests. With an average speed roughly twice that of the Arc B580 and the ability to surpass the RTX 5060 Ti in certain scenarios, it positions itself as a...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 Tom's Hardware

Nvidia: Exposure to Asian Supply Chains and the Impact on Edge AI

Roughly 90% of Nvidia's component production costs are now tied to Asian supply chains, up sharply from 65%. This growing reliance, potentially amplified by the expansion of physical AI with platforms like Nvidia Jetson, raises questions about c...

#Hardware #LLM On-Premise #DevOps
2026-05-04 Tom's Hardware

AMD Ryzen AI Max+ PRO 495: 192GB of Unified Memory for Next-Gen APUs

New leaked PassMark benchmarks suggest the arrival of the AMD Ryzen AI Max+ PRO 495 APU. This new unit could integrate up to 192GB of unified memory, representing an update over the Strix Halo series. The memory increase is a key factor for on-premis...

#Hardware #LLM On-Premise #DevOps
2026-05-04 LocalLLaMA

AMD Strix Halo: 192GB Memory for On-Premise LLMs, a New Horizon?

Recent rumors suggest that AMD's upcoming Strix Halo APU, potentially named "Gorgon Halo 495 Max" or "Ryzen AI Max Pro 495," could integrate 192GB of memory. This capacity, coupled with a Radeon 8065S iGPU, would mark a significant advancement for ru...

#Hardware #LLM On-Premise #DevOps
2026-05-04 DigiTimes

South Korea's 260,000 GPU Plan: Reliance on Taiwan and the AI Challenge

South Korea's ambitious plan to acquire 260,000 GPUs for AI initiatives underscores a critical reliance on Taiwanese manufacturing capabilities. As highlighted by the DIGITIMES Chair, this scenario emphasizes the importance of international collabora...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 DigiTimes

Samsung Strike: HBM Risks for AI and On-Premise Supply Chain

A strike at Samsung raises concerns about the supply of High Bandwidth Memory (HBM), a crucial component for AI GPUs. The potential disruption highlights the fragility of the tech supply chain and its implications for on-premise Large Language Model ...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 DigiTimes

Optical Acceleration: Taiwan's Micro LEDs for AI Data Centers

Taiwanese Micro LED suppliers are intensifying their focus on optical links for AI data centers. This trend highlights the increasing demand for high-speed, low-latency connectivity, essential for AI and Large Language Model (LLM) workloads. For comp...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 DigiTimes

Cerebras Eyes $40 Billion IPO, Challenges Nvidia in AI Chip Market

Cerebras, a company specializing in artificial intelligence chips, is reportedly considering an initial public offering that could value it up to $40 billion. This move positions the company as a direct competitor to Nvidia, the market leader, highli...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 DigiTimes

The AI Hardware Boom: Impact on the Supply Chain and Passive Components

Yageo's Pierre Chen highlights how the rapid expansion of the artificial intelligence hardware sector is generating a significant increase in demand for passive components. This phenomenon, crucial for the production of high-performance servers and G...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 DigiTimes

AI Cooling and Optics Demand Drives Asia Optical's Record Revenues

Asia Optical reported record revenues and profits for Q1 2026, driven by the increasing demand for cooling solutions and optical components for artificial intelligence. This result highlights the significant impact that the expansion of AI workloads,...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-04 DigiTimes

Croma ATE posts record 1Q26 revenue and profit driven by AI server demand

Croma ATE reported record revenue and profit in the first quarter of 2026. This exceptional performance is attributed to the increasing demand for AI servers, which boosted orders in the SLT and photonics sectors. The trend highlights the impact of t...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-03 LocalLLaMA

Hummingbird+: Low-Cost FPGAs for LLM Inference

A new study introduces Hummingbird+, a low-cost FPGA-based solution designed for Large Language Model inference. The system, with an estimated mass production cost of $150, can run the Qwen3-30B-A3B model with 4-bit quantization, achieving 18 tokens ...

#Hardware #LLM On-Premise #Fine-Tuning
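The 4-bit quantization mentioned above can be shown with a minimal sketch. Symmetric per-tensor round-to-nearest INT4 is assumed here purely for illustration; the article does not specify which scheme Hummingbird+ actually uses.

```python
# Minimal sketch of symmetric round-to-nearest 4-bit quantization,
# the general technique behind low-bit LLM inference. This is an
# illustrative assumption, not the scheme Hummingbird+ implements.

def quantize_int4(weights):
    """Map floats to integer codes in [-8, 7] plus a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 7.0 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights from the INT4 codes."""
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.7]
q, s = quantize_int4(w)
w_hat = dequantize_int4(q, s)  # each weight within scale/2 of the original
```

A single per-tensor scale is the simplest choice; production schemes typically use per-group scales (e.g. one per 32 or 128 weights) to reduce quantization error.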
2026-05-02 Phoronix

Linux 7.1-rc2: Updates for Older AMD GPUs

The upcoming Linux kernel release, version 7.1-rc2, introduces a series of updates and fixes for the Direct Rendering Manager (DRM) drivers. These interventions are specifically aimed at improving the support and stability of previous-generation AMD ...

#Hardware #LLM On-Premise #DevOps
2026-05-01 LocalLLaMA

Intel Auto-Round: SOTA Quantization for LLM Inference on CPU, XPU, and CUDA

Intel has released Auto-Round, a state-of-the-art quantization algorithm designed to optimize low-bit LLM inference with high accuracy. The solution is compatible with CPUs, XPUs, and CUDA, supports multiple data types, and integrates with frameworks...

#Hardware #LLM On-Premise #DevOps
2026-05-01 Tom's Hardware

Skyrocketing AI Component Costs Push Big Tech CapEx to Record $725 Billion

Big Tech's capital expenditure has reached a record $725 billion, driven by surging component prices. Microsoft, in particular, has allocated $25 billion of its AI budget to increased memory and chip costs, as stated by Satya Nadella at the World Eco...

#Hardware #LLM On-Premise #DevOps
2026-05-01 Phoronix

AMD Introduces HDMI 2.1 FRL Support for AMDGPU Linux Driver

AMD has released official patches for its AMDGPU Linux graphics driver, introducing support for HDMI Fixed Rate Link (FRL). This implementation, while not full HDMI 2.1 support, marks a significant step. FRL technology, part of the HDMI 2.1+ standard...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-01 Tom's Hardware

ASML's Roadmap: From DUV to EUV, the Future of Lithography for AI Chips

ASML, a key player in semiconductor manufacturing, outlines its lithography technology roadmap, from DUV to advanced EUV. These advancements are crucial for developing increasingly powerful chips, essential for Large Language Model inference and trai...

#Hardware #LLM On-Premise #DevOps
2026-05-01 The Next Web

Thomas Reardon and the Challenge of Low-Power AI: Thinking on Just 20 Watts

Thomas Reardon, known for creating Internet Explorer and co-founding CTRL-labs, is embarking on a new challenge: developing artificial intelligence capable of "thinking" while consuming just 20 watts. This ambitious goal aims to redefine energy effic...

#Hardware #LLM On-Premise #DevOps
2026-05-01 DigiTimes

Advantest and AI Chip Testing: Positive Results and Cautious Outlook

Advantest, a leader in semiconductor testing, exceeded expectations driven by AI chip demand. Despite strong performance, a cautious future outlook impacted its share value. This scenario highlights the complexity of the AI hardware market and its im...

#Hardware #LLM On-Premise #Fine-Tuning
2026-05-01 Phoronix

Linux 7.2: 'Fair' DRM Scheduler and AMDXDNA AIE4 Hardware Integration

The upcoming Linux 7.2 kernel, expected this summer, will introduce significant hardware resource management enhancements. Key among these is the adoption of a default 'Fair' priority for the DRM scheduler, aimed at optimizing GPU resource allocation...

#Hardware #LLM On-Premise #DevOps
2026-04-30 Wired AI

Rapid AI Adoption Strains Supply Chain: Mac Mini Scarcity for Months

Apple CEO Tim Cook revealed that artificial intelligence adoption is exceeding expectations, with direct repercussions on hardware availability. The scarcity of Mac Minis for the coming months highlights growing challenges for companies planning on-p...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-30 LocalLLaMA

AMD Halo Box: A Look at the Demo System with Ryzen 395 and 128GB RAM

An AMD demo unit, dubbed "Halo Box," has surfaced online, showcasing a system equipped with a Ryzen 395 processor and 128GB of RAM. This device, running Ubuntu and featuring a programmable light strip, offers a glimpse into potential hardware configu...

#Hardware #LLM On-Premise #DevOps
2026-04-30 DigiTimes

Google and the Future of AI Chips: The Shift Towards Specialized Accelerators

Google is shifting the development of its TPU chips towards more specialized solutions, moving away from a universal approach. This evolution reflects a trend in the AI industry favoring efficiency and performance for specific workloads, with signifi...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-30 Phoronix

AMD and Linux: New Patches to Optimize Page Migration and Performance

AMD has released new patches for the Linux kernel, aimed at accelerating page migration. This work, originally started by NVIDIA, is now being continued by AMD engineers, leveraging batch copies and hardware offloading to significantly improve perfor...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-30 DigiTimes

Powertech Raises Stakes: $1.6 Billion for AI Packaging and Sector Growth

Powertech, a leading Taiwanese Outsource Semiconductor Assembly and Test (OSAT) company, has announced a significant increase in its capital expenditure, reaching $1.6 billion. This initiative aims to boost production capacity in the AI component pac...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-30 DigiTimes

Taiwan's Episil triples capex, focusing on silicon photonics for AI

Taiwanese wafer manufacturer Episil has announced a significant increase in capital expenditure (capex), tripling its investment to accelerate the development and production of silicon photonics solutions. This strategic move aims to support the grow...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-30 DigiTimes

Taiwan's Optics Industry Finds a New Role in the AI Imaging Boom

Taiwan's optics industry is strategically redefining its role within the AI imaging ecosystem. This evolution highlights the critical importance of advanced hardware components for visual data acquisition and processing, a key consideration for enter...

#Hardware #LLM On-Premise #DevOps
2026-04-30 DigiTimes

Cambricon Reports Revenue Surge Driven by AI Compute Demand

Cambricon, a company specializing in AI chips, has reported a significant increase in revenue, propelled by the growing demand for artificial intelligence computing capacity. This trend underscores the strategic importance of dedicated hardware and i...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-30 DigiTimes

ASE Raises Capex to $8.5 Billion: Advanced Packaging Boost for AI

ASE, a key player in the semiconductor industry, has announced a record increase in its CapEx to $8.5 billion by 2026. This decision is driven by strong demand for advanced packaging, a critical component for hardware architectures dedicated to artif...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-30 The Register AI

Google Cloud to Offer TPUs to External Customers: Diversification and AI Boost

Google Cloud has announced it will make its custom Tensor Processing Units (TPUs) available for sale to a selection of external customers. This initiative addresses the rising demand for specialized AI hardware and aims to diversify the tech giant's ...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-30 DigiTimes

Samsung Highlights Stable 4nm Tech Amid Growing AI, Automotive Demand

Samsung has emphasized the stability of its 4-nanometer process technology, highlighting its crucial role in meeting the increasing demand from the artificial intelligence and automotive sectors. The ability to produce reliable and high-performing ch...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-30 DigiTimes

Qualcomm Navigates Near-Term Headwinds While Data Center Push Gains Traction

Qualcomm is facing near-term challenges, but its data center market strategy is gaining traction. This scenario highlights the complexity of the semiconductor industry, where innovation and expansion into new segments, such as on-premise AI, are cruc...

#Hardware #LLM On-Premise #DevOps
2026-04-29 DigiTimes

Nvidia and the AI Chip Race: CEO's View on Google's TPUs

Nvidia's CEO has shared his perspective on the competition in the artificial intelligence chip market, stating that Google's TPUs do not pose a significant threat. This declaration comes amidst increasing demand for AI accelerators, where companies c...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-29 DigiTimes

Google's TPU Shortage and the AI Infrastructure Challenge

Google's Tensor Processing Unit (TPU) shortage is highlighting a growing disparity in AI infrastructure. This scenario underscores the critical role of specialized hardware for the development and deployment of Large Language Models, influencing stra...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-29 Phoronix

Intel Lunar Lake: CPU Performance Gains on Linux

This analysis focuses on the evolution of Intel Lunar Lake CPU performance on Linux systems. Following an examination of Xe2 integrated graphics performance gains, attention now shifts to the processor's computational capabilities. Benchmarks, conduc...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-29 Phoronix

OpenCL Introduces Cooperative Matrix Extensions for AI Inference

The OpenCL API is integrating Cooperative Matrix Extensions, a move that follows the introduction of similar functionality in Vulkan in 2023. These extensions are designed to optimize machine learning and AI inference operations, offering new oppor...

#Hardware #LLM On-Premise #DevOps
2026-04-29 PyTorch Blog

AutoSP: Simplifying Long-Context LLM Training on Multi-GPU Setups

AutoSP, a compiler-based solution, automates the implementation of Sequence Parallelism (SP) for training Large Language Models (LLM) with extended contexts. Integrated into DeepSpeed, it addresses out-of-memory (OOM) issues and the complexity associ...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-29 LocalLLaMA

llama.cpp: Native NVFP4 Accelerates Prompt Processing on Blackwell

A recent llama.cpp benchmark reveals that native NVFP4 support significantly improves prompt processing performance (up to 68%) for the Qwen3.6-27B-NVFP4 model on an NVIDIA RTX 5090 GPU. Token generation speed remains unchanged. This advantage is cru...

#Hardware #LLM On-Premise #DevOps
2026-04-29 Tom's Hardware

Intel 18A: Wafer Optimization Boosts Revenue and CPU Availability

New details reveal how Intel is increasing revenue per wafer through careful production optimization. According to analyses, a reduction in yield variability across each wafer, particularly for the 18A node, allows for a greater number of marketable ...

#Hardware #LLM On-Premise #Fine-Tuning
2026-04-29 LocalLLaMA

Hipfire: A New Inference Engine for AMD GPUs with a Focus on Quantization

Hipfire is a new inference engine designed to optimize Large Language Model (LLM) performance across all AMD GPUs. It utilizes an `mq4` quantization methodology and, according to the Localmaxxing benchmarking site, offers significant inference speedu...

#Hardware #LLM On-Premise #DevOps