Hardware Acceleration and Optimization for AI

2026-02-06 • LocalLLaMA

Qwen3-Coder: improved performance on RTX 5090 with llama.cpp

A user reported a significant throughput increase, up to 26 tokens/second, using the Qwen3-Coder-Next-Q4_K_S model with llama.cpp on an RTX 5090. The optimization was achieved by offloading MoE expert tensors to the CPU and quantizing the KV cache.

#Hardware #LLM On-Premise

2026-02-06 • DigiTimes

CSPs turn to custom silicio to break Nvidia dependence

Cloud service providers (CSPs) are exploring custom silicio solutions to diversify their hardware options and reduce dependence on traditional vendors like Nvidia. This trend could lead to new architectures optimized for specific workloads.

#Hardware #LLM On-Premise #DevOps

2026-02-06 • DigiTimes

Google doubles AI capex, turning TPU ASIC orders into high-stakes supplier race

Google is significantly increasing its investments in AI infrastructure, particularly in TPU ASICs. This move intensifies competition among suppliers and signals a strong push towards custom hardware solutions for artificial intelligence workloads.

#Hardware #LLM On-Premise #Fine-Tuning

2026-02-06 • DigiTimes

Wistron posts strongest January on AI server growth

Taiwanese manufacturer Wistron reported an exceptionally positive January, driven by strong demand for servers dedicated to artificial intelligence. This highlights the growing market interest in specialized hardware solutions for AI workloads.

#Hardware #LLM On-Premise #Fine-Tuning

2026-02-06 • LocalLLaMA

Tensor Parallelism in Llama.cpp: A Promising Update

A pull request introduces tensor parallelism in Llama.cpp, paving the way for faster and more efficient inference on large language models. The community welcomes this development, which could significantly improve performance on distributed hardware...

#Hardware #LLM On-Premise #DevOps

2026-02-06 • DigiTimes

AI and AP Drive Load Board Shipments with January Revenue Up

According to DIGITIMES, artificial intelligence and advanced applications (AP) are boosting shipments of load boards. January revenues show growth, indicating strong demand in the sector.

#Hardware #LLM On-Premise #DevOps

2026-02-06 • DigiTimes

South Korea's AI Push: Nvidia Powers with Over 260,000 GPUs

South Korea is making significant investments in artificial intelligence, supported by a hardware infrastructure powered by over 260,000 Nvidia GPUs. This strategic move aims to position the country as a leader in the AI sector, with a focus on advan...

#Hardware

2026-02-05 • Tom's Hardware

Tenstorrent reduces Tensor Cores on Blackhole p150 via Firmware Update

Tenstorrent announced a reduction in the number of Tensor cores on its Blackhole p150 cards, from 140 to 120, via a firmware update. The company anticipates a 1-2% performance drop for existing users. New cards will ship with 120 Tensor cores.

#Hardware #LLM On-Premise #DevOps

2026-02-05 • Tom's Hardware

Western Digital details 14-platter 3.5-inch HAMR HDD designs with 140 TB and beyond

Western Digital announces the development of 3.5-inch HDDs (Hard Disk Drives) based on HAMR (Heat-Assisted Magnetic Recording) technology with a capacity reaching 140 TB, thanks to the use of 14 platters. This technology promises to significantly inc...

#LLM On-Premise #DevOps

2026-02-05 • Tom's Hardware

Nvidia DLSS 4.5: Ray Reconstruction without Denoisers?

Nvidia is reportedly developing DLSS 4.5, an advanced version of its upscaling technology that could eliminate the need for denoisers in ray tracing. This is thanks to a Transformer model that reconstructs ray-traced reflections more accurately.

#Hardware

2026-02-05 • Phoronix

Intel Arc B390 Graphics Performance On Linux With Panther Lake

First Linux benchmarks of the Intel Arc B390 GPU, integrated in high-end Panther Lake models. The Xe3 graphics card, equipped with 12 Xe cores, promises interesting performance in desktop and mobile environments for graphics and compute workloads.

#Hardware #LLM On-Premise #DevOps

2026-02-05 • DigiTimes

Nvidia reportedly seeks faster HBM4 deliveries from Samsung

Nvidia is reportedly seeking faster deliveries of HBM4 memory from Samsung, amid a global crunch in high-bandwidth memory supply. The move highlights the competition to secure resources for upcoming AI accelerators.

#Hardware #Fine-Tuning

2026-02-05 • DigiTimes

Google AI platform win elevates Innoscience's 8-inch GaN manufacturing clout

Google's selection of Innoscience for its AI platform highlights the importance of GaN (gallium nitride) manufacturing on 8-inch wafers. This technology promises to improve the efficiency and performance of artificial intelligence systems, opening ne...

#LLM On-Premise #DevOps

2026-02-05 • DigiTimes

Alphabet's US$185 billion hardware mandate: Breaking the AI supply bottleneck

Alphabet plans to invest US$185 billion in hardware infrastructure dedicated to artificial intelligence. The initiative aims to overcome current supply chain bottlenecks and ensure the computing capacity needed for its ambitious AI projects.

#Hardware #LLM On-Premise #DevOps

2026-02-05 • DigiTimes

Jensen Huang: AI factories will power a trillion-dollar reindustrialization

According to Jensen Huang, CEO of NVIDIA, AI factories are the engine of a new wave of reindustrialization. These specialized infrastructures will be fundamental for the development and deployment of advanced AI solutions in various industrial sector...

#Hardware #LLM On-Premise #DevOps

2026-02-05 • DigiTimes

Alphabet pledges record $185 billion capital spend as AI fuels cloud boom

Alphabet plans to invest a record $185 billion, fueled by cloud growth and AI opportunities. The company aims to strengthen its infrastructure to support the increasing demand for AI and cloud services.

#Hardware #LLM On-Premise #DevOps

2026-02-04 • TechCrunch AI

A16z invests $1.7B in AI infrastructure

Andreessen Horowitz has allocated $1.7 billion from its new $15 billion fund for investments in AI infrastructure. The team will focus on companies like Black Forrest Labs, Cursor, OpenAI, ElevenLabs, Ideogram, and Fal.

#LLM On-Premise #DevOps

2026-02-04 • TechCrunch AI

Positron challenges Nvidia with AI chips: $230M Series B round

Positron has raised $230 million in a Series B funding round, with participation from the Qatar Investment Authority. The company aims to compete with Nvidia in the artificial intelligence chip market, amid growing demand and with Qatar aiming to dev...

#Hardware

2026-02-04 • DigiTimes

Nvidia's HBM4 tests near completion as SK Hynix ramps 1b DRAM

Nvidia's HBM4 memory tests are nearing completion, while SK Hynix is increasing production of 1b DRAM. This development could lead to a significant increase in memory bandwidth for future Nvidia GPUs, with important implications for artificial intell...

#Hardware #LLM On-Premise #DevOps

2026-02-04 • DigiTimes

Intel CEO unveils plans to enter GPU market dominated by Nvidia

Intel's CEO has announced plans to enter the GPU market, currently dominated by Nvidia. This strategic move could bring new dynamics to the hardware acceleration sector for artificial intelligence and graphics workloads.

#Hardware #LLM On-Premise #DevOps

2026-02-04 • DigiTimes

Nvidia shapes the HBM4 race as Samsung, SK Hynix jockey for position

The race for HBM4 memory production intensifies, with Nvidia playing a key role in defining the specifications. Samsung and SK Hynix are vying for leadership in this sector crucial for future GPUs and AI accelerators.

#Hardware #LLM On-Premise #DevOps

2026-02-04 • DigiTimes

Vanguard International Semiconductor sees strong 2026 AI server power demand

Vanguard International Semiconductor anticipates strong growth in power demand for AI servers starting in 2026. The company expects a significant impact on the semiconductor market, with implications for hardware manufacturers and cloud service provi...

#LLM On-Premise #DevOps

2026-02-04 • DigiTimes

AI upgrades intensify high-capacity NOR Flash shortages

The rise of artificial intelligence applications is intensifying the shortage of high-capacity NOR Flash memory, especially SLC and MLC variants. This situation could impact the production of devices requiring these memories.

#Hardware #LLM On-Premise #DevOps

2026-02-04 • DigiTimes

Nvidia expands validation and testing, silicio photonics could be GTC 2026 focus

Nvidia is expanding its validation and testing processes. The company may focus on silicio photonics as a key element for future GPUs, with potential announcements at GTC 2026. This technology promises to significantly improve the speed and energy ef...

#Hardware

2026-02-03 • TechCrunch AI

Intel to enter the Nvidia-dominated GPU market

Intel is ramping up efforts to compete in the GPU market, currently dominated by Nvidia. The company is building a dedicated team and will develop a GPU strategy focused on customer needs. This marks a significant evolution in the graphics processor ...

#Hardware #LLM On-Premise #DevOps

2026-02-03 • Tom's Hardware

Intel is co-developing new Z-Angle Memory for AI data centers

Intel and SoftBank subsidiary, Saimemory, are collaborating to develop Z-Angle Memory (ZAM), a vertical-stacked memory for AI data centers. ZAM promises 2 to 3x more capacity, greater bandwidth, and half the power consumption compared to current solu...

#Hardware #LLM On-Premise #DevOps

2026-02-03 • LocalLLaMA

Intel Xeon 600 Workstation CPUs Launched: Up To 86 Cores

Intel has launched the new Xeon 600 series processors for workstations, offering up to 86 cores. These processors support memory up to 8000 MT/s, 128 PCIe Gen5 lanes, and a TDP of 350W with overclocking support. They are positioned as an alternative ...

#Hardware #LLM On-Premise #DevOps

2026-02-03 • DigiTimes

C Sun invests NT$1.48 billion in Taichung plant for AI packaging

C Sun is investing NT$1.48 billion (approximately €46 million) in its Taichung plant to expand the production of advanced chip packaging equipment for artificial intelligence applications. The investment aims to meet the growing demand in the sector.

#LLM On-Premise #DevOps

2026-02-03 • The Register AI

xAI merges into SpaceX: the goal is universal consciousness?

Elon Musk announced that his space company SpaceX has acquired his AI outfit xAI. The integration aims to leverage solar energy to overcome earthly limitations and spread a universal consciousness. SpaceX's valuation rises to $250 billion.

#LLM On-Premise #DevOps

2026-02-03 • DigiTimes

SpaceX's xAI acquisition reframes AI energy constraints and complicates the IPO narrative

SpaceX's acquisition of xAI raises questions about the future energy needs of artificial intelligence models and could impact the aerospace company's initial public offering (IPO) plans. The article highlights the growing challenges related to energy...

#LLM On-Premise #DevOps

2026-02-03 • DigiTimes

Taiwan's top tech talent pivots to healthcare, eyeing TSMC-style success

Taiwan's top tech talents are shifting their focus to the healthcare sector, aiming to replicate the success of companies like TSMC. This transition is driven by the increasing demand for innovative solutions in medicine and the desire to apply advan...

2026-02-02 • Tom's Hardware

Memory makers unite against hoarding: will prices rise faster?

Samsung, SK Hynix, and Micron are teaming up to block memory hoarding. This move might accelerate price increases, but could encourage increased supply in the long term.

2026-02-02 • Tom's Hardware

Jensen Huang warns TSMC needs to 'work very hard' to meet AI demand

Nvidia CEO Jensen Huang says TSMC needs to work very hard to expand capacity in order to keep up with AI demand. Huang says its demand alone may force doubling its capacity over the next decade.

#Hardware #LLM On-Premise #DevOps

2026-02-02 • DigiTimes

Nvidia GB200 fuels chassis sector pivot to liquid cooling, rack integration

The introduction of the Nvidia GB200 GPU is accelerating the adoption of liquid cooling systems and rack-level integration in the chassis sector. This transition is driven by the need to manage the increased power density and thermal requirements of ...

#Hardware #LLM On-Premise #DevOps

2026-02-02 • DigiTimes

Advantech sole IPC vendor at Nvidia banquet as edge AI becomes next frontier

Advantech stands out as the sole IPC vendor invited to Nvidia's banquet, signaling a growing interest in edge AI solutions. This move underscores the importance of distributed AI inference and local computing capabilities for advanced applications.

#Hardware #LLM On-Premise #DevOps

2026-02-02 • DigiTimes

Taiwan PCB makers vie for AI server market with new 2026 capacity

Taiwanese printed circuit board (PCB) manufacturers are investing in new production capacity, expected by 2026, to meet the growing demand for AI servers. This strategic move aims to position Taiwanese companies as key suppliers in a rapidly expandin...

#LLM On-Premise #DevOps

2026-02-02 • DigiTimes

Nvidia: Huang expects TSMC capacity to double, reaffirms OpenAI investment

Nvidia CEO Jensen Huang anticipates TSMC's capacity to double by 2026. He also highlighted memory supply challenges and reaffirmed Nvidia's investment in OpenAI. Huang's Taiwan visit underscores the region's strategic importance for the company.

#Hardware #LLM On-Premise #DevOps

2026-02-02 • DigiTimes

Micron ramps global memory investments as Nvidia prepares HBM4 rollout

Micron is ramping up its global investments in memory technology. This strategic move comes at a crucial time, with Nvidia preparing to roll out its next-generation HBM4 memory, intended for high-performance GPUs for artificial intelligence and high-...

#Hardware #LLM On-Premise #DevOps

2026-02-02 • DigiTimes

Nvidia speeds up silicio photonics momentum in 2026, optical supply chain mobilizes

Nvidia is targeting mass production of silicio photonics solutions by 2026. This development could significantly impact the optical supply chain, paving the way for faster and more efficient interconnects for future high-performance computing workloa...

#Hardware #LLM On-Premise #DevOps

2026-02-02 • DigiTimes

Taiwan's cooling and power solutions rise to serve soaring AI chips demand

Taiwan's cooling and power solutions industry is responding to the increasing demand for AI chips. The ability to manage increased power consumption and heat dissipation is crucial for the efficient operation of AI systems.

#Hardware #LLM On-Premise #DevOps

2026-02-01 • LocalLLaMA

vLLM-MLX on Apple Silicio: Up to 87% Higher Throughput

Recent research compares the performance of vLLM-MLX on Apple Silicio with llama.cpp, highlighting significantly higher throughput. The results suggest potential advantages in using Apple hardware for local inference of large language models (LLMs).

#LLM On-Premise #DevOps

2026-02-01 • DigiTimes

CSPs ramp up AI capex as supply chain gains confidence

Cloud service providers (CSPs) are increasing investments in AI infrastructure, thanks to a more stable supply chain. This increase in CapEx is an indicator of the growing demand for computational resources for artificial intelligence and machine lea...

#Hardware #LLM On-Premise #DevOps

2026-01-30 • DigiTimes

KLA: AI-driven process control boom lifts 2026 semiconductor equipment demand

KLA forecasts that the increasing adoption of AI-driven process control systems will boost demand for semiconductor manufacturing equipment by 2026. This trend is driven by the need to improve efficiency and yield in the fabrication of increasingly c...

#LLM On-Premise #DevOps

2026-01-30 • DigiTimes

ASIC server demand boosts Taiwan's high-end CCL shipments

The increasing demand for ASIC servers, driven by artificial intelligence applications, is boosting shipments of high-end CCL (Copper Clad Laminate) materials from Taiwan. This trend reflects the growing importance of specialized hardware for AI work...

#Hardware #LLM On-Premise #Fine-Tuning

Hardware Acceleration and Optimization for AI

Related Coverage