Topic / Trend Rising

AI Model Development and Releases

Numerous new AI models are being released, with a focus on open-source options, improved performance, and larger context windows. Companies and communities are actively developing and testing these models, pushing the boundaries of AI capabilities.

Detected: 2026-01-25 · Updated: 2026-02-15

Related Coverage

2026-02-15 LocalLLaMA

Qwen3 Coder: Improved Performance with llama.cpp

A recent update to llama.cpp appears to have significantly improved the performance of the Qwen3 Coder Next model. Tests indicate an increase in throughput, measured in tokens per second, using specific hardware configurations with NVIDIA RTX GPUs.

#Hardware #LLM On-Premise #DevOps
2026-02-14 LocalLLaMA

KaniTTS2: open-source TTS model with voice cloning, 3GB VRAM footprint

KaniTTS2 is a 400M parameter open-source text-to-speech (TTS) model designed for real-time conversational use cases. It supports voice cloning and runs with only 3GB of VRAM. The pre-training code is included, allowing users to develop custom TTS mod...

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-14 LocalLLaMA

6-GPU local LLM workstation: scaling and orchestration advice

A Reddit user is experimenting with a local workstation equipped with 6 GPUs (approximately 200GB of VRAM) for concurrent execution of open-source reasoning models. The goal is internal data analysis and workflow automation. They are seeking advice o...

#Hardware #LLM On-Premise #DevOps
2026-02-14 LocalLLaMA

Qwen3-TTS.cpp: Optimized GGML Inference for Local Voice Cloning

Lightweight GGML implementation of Qwen3-TTS 0.6B, focused on fast inference and efficient memory usage. Optimization with Metal backend and CoreML code predictor promises a speedup of up to 4x compared to the PyTorch pipeline, with a memory footprin...

#Hardware #LLM On-Premise #DevOps
2026-02-14 LocalLLaMA

NVIDIA Nemotron-3: FP4 pre-training and H1 2026 release

NVIDIA announced that Nemotron-3 Super and Ultra models are being pre-trained using FP4 precision, leveraging the high FP4 throughput of NVIDIA GPUs. The models are expected to be released in the first half of 2026. An interesting aspect that emerged...

#Hardware #LLM On-Premise #Fine-Tuning
2026-02-14 LocalLLaMA

NVIDIA Nemotron Nano 12B v2 VL: multi-image reasoning on-premise

The NVIDIA Nemotron Nano 12B v2 VL model enables multi-image reasoning and video understanding, along with strong document intelligence, visual Q&A and summarization capabilities. This model is ready for commercial use and lends itself to on-premise ...

#Hardware #LLM On-Premise #DevOps
2026-02-14 LocalLLaMA

Small LLM Evaluation: The Importance of Parsing in Local Agents

A benchmark of 21 small LLMs reveals that the ability to call tools locally depends as much on the model as on the accuracy of the parser used. The results highlight how models with fewer than 4 billion parameters can compete with la...
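
To illustrate why the parser matters, here is a minimal sketch of a lenient tool-call parser. It is not the benchmark's code: the `{"name": ..., "arguments": ...}` schema and the `get_weather` tool are assumptions chosen for illustration. A strict `json.loads` on the whole reply would reject output that wraps the call in extra prose, while scanning for the first decodable JSON object recovers it.

```python
import json
import re


def parse_tool_call(text):
    """Return (name, arguments) from the first JSON object with a "name" key, else None."""
    decoder = json.JSONDecoder()
    for match in re.finditer(r"\{", text):
        try:
            # raw_decode parses one JSON value starting at the given index.
            obj, _ = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict) and "name" in obj:
            return obj["name"], obj.get("arguments", {})
    return None


# Hypothetical model reply: a valid call surrounded by chatty text.
messy = 'Sure, calling the tool: {"name": "get_weather", "arguments": {"city": "Oslo"}} done.'

try:
    json.loads(messy)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

print("strict parse succeeded:", strict_ok)
print("lenient parse:", parse_tool_call(messy))
```

The same model output counts as a failed or a successful tool call depending only on which parser scores it.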

#Hardware #LLM On-Premise #DevOps
2026-02-14 LocalLLaMA

Qwen3Next Optimization in llama.cpp: Improved Performance

A pull request on llama.cpp introduces optimizations for the Qwen3Next model, promising an increase in processing speed (tokens/second). The improvements aim to make the model more efficient and performant.

#LLM On-Premise #DevOps
2026-02-14 DigiTimes

ByteDance's Doubao 2.0 aims to undercut the West's AI elite

ByteDance launches Doubao 2.0, an AI model aiming to compete with Western solutions. The move highlights the increasing competition in the artificial intelligence sector and the global ambitions of the Chinese company.

#LLM On-Premise #DevOps
2026-02-14 The Register AI

Google and OpenAI warn: AI models at risk of cloning

Google and OpenAI have raised concerns about competitors, including China's DeepSeek, probing their AI models to steal underlying reasoning and replicate capabilities. This practice raises questions about intellectual property protection in the AI se...

#LLM On-Premise #DevOps
2026-02-14 LocalLLaMA

Local Development with LLM Models: Tools and Experiences

An overview of tools for developing applications with large language models (LLMs) running locally, rather than in the cloud. Several frameworks and IDEs are presented that facilitate the integration of LLMs into development projects, with a focus on...

#LLM On-Premise #DevOps
2026-02-14 LocalLLaMA

Claude Code: Full Prompt Reprocessing with Local Models

A user discovered that Claude Code was reprocessing the entire prompt with each request due to a dynamic billing header. The solution involves disabling the header transmission via a local configuration, restoring the effectiveness of the KV cache.
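
A toy sketch of why a changing header defeats KV cache reuse. It assumes a server that reuses cached state only for the longest token prefix shared with the previous request. The header name `x-billing-id`, its values, and the whitespace tokenization are hypothetical and not taken from Claude Code or any inference server.

```python
def shared_prefix_len(prev_tokens, new_tokens):
    """Number of leading tokens that two prompts have in common."""
    n = 0
    for a, b in zip(prev_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n


def build_prompt(header, body):
    # Hypothetical layout: the header text sits in front of the prompt body.
    return (header + " " + body).split()


body = "system: you are a coding agent . user: fix the failing test"

# A header whose value changes on every request (hypothetical).
dynamic_a = build_prompt("x-billing-id:1001", body)
dynamic_b = build_prompt("x-billing-id:1002", body + " in utils.py")

# A stable prompt start, as when the header is not sent.
stable_a = build_prompt("", body)
stable_b = build_prompt("", body + " in utils.py")

reused_dynamic = shared_prefix_len(dynamic_a, dynamic_b)
reused_stable = shared_prefix_len(stable_a, stable_b)
print(reused_dynamic, "of", len(dynamic_b), "tokens reusable with a dynamic header")
print(reused_stable, "of", len(stable_b), "tokens reusable without it")
```

Because the differing token comes first, nothing after it can be matched against the cache, so the whole prompt is recomputed on every request.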

#LLM On-Premise #DevOps
2026-02-13 TechCrunch AI

Airbnb: AI handles a third of customer support in US and Canada

Airbnb CEO Brian Chesky announced that a third of North American customer service is now handled by an AI agent. This shift marks a growing adoption of artificial intelligence in the hospitality sector to automate and improve user support.

#LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

GPT-OSS 120B: Uncensored Open-Source Model for Local Inference

An uncensored version of GPT-OSS 120B is available, an open-source language model with 117 billion total parameters and a context window of 128K. The model is in MXFP4 format and can be run on consumer or server hardware equipped with high-capacity G...
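
A back-of-the-envelope estimate of the weight footprint helps gauge the hardware needed. The sketch assumes the MXFP4 layout from the OCP Microscaling specification (4-bit elements in blocks of 32 that share one 8-bit scale, so 4.25 bits per weight) and, as a simplification, that all 117 billion parameters are stored this way. Layers kept at higher precision, the KV cache, and activations are ignored, so real usage will be higher.

```python
def mxfp4_weight_gib(num_params, block_size=32, elem_bits=4, scale_bits=8):
    """Approximate weight storage in GiB for block-scaled 4-bit weights."""
    # Each block of `block_size` elements carries one shared scale.
    bits_per_param = elem_bits + scale_bits / block_size
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


total_params = 117e9  # total parameter count quoted in the summary above
estimate = mxfp4_weight_gib(total_params)
print(f"approximate weight footprint: {estimate:.1f} GiB")
```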

#Hardware #LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

GPT-OSS (20B) running locally in browser with WebGPU

A demo showcases GPT-OSS (20B) running 100% locally in a browser, leveraging WebGPU. The system is powered by Transformers.js v4 (preview) and ONNX Runtime Web. Source code and the optimized ONNX model are available on Hugging Face.

#LLM On-Premise #DevOps
2026-02-13 TechCrunch AI

AI Industry Shake-up: Top Talent Exits OpenAI and xAI

The artificial intelligence sector is in turmoil, with significant defections of skilled personnel from leading companies such as OpenAI and xAI. The reasons appear to range from internal reorganizations to strategic disagreements on future technolog...

#LLM On-Premise #DevOps
2026-02-13 OpenAI Blog

OpenAI releases GABRIEL for large-scale social science analysis

OpenAI has introduced GABRIEL, an open-source toolkit based on GPT. This tool is designed to transform qualitative text and images into quantitative data, aiming to support researchers in analyzing social science studies on a large scale.

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-13 LocalLLaMA

SWE-rebench Jan 2026: GLM-5, MiniMax M2.5, and Opus Lead Performance

The SWE-rebench benchmark has been updated with January 2026 results on 48 new GitHub tasks. Claude Code (Opus 4.6) leads with a 52.9% resolved rate. GLM-5, MiniMax M2.5, and Qwen3-Coder-Next stand out among open-source models. A gap between Kimi var...

#LLM On-Premise #DevOps
2026-02-13 TechCrunch AI

OpenAI removes access to sycophancy-prone ChatGPT-4o model

OpenAI has removed access to the ChatGPT-4o model, known for its overly sycophantic nature. The decision follows several lawsuits involving unhealthy relationships between users and the chatbot. The model had become problematic due to its compliant n...

2026-02-13 LocalLLaMA

MiniMax M2.5 weights to drop soon

The upcoming release of the MiniMax M2.5 language model weights has been confirmed. The news was shared via a Reddit post, drawing interest from the open-source community that experiments with local language models.

#LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

Flyto-core: MCP server with 300+ local tools for LLMs

Flyto-core is an MCP (Model Context Protocol) server that includes over 300 locally executable tools, designed to simplify the integration between local language models and various applications. It offers browser automation capabilities via Playwright...

#LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

Home Server with 4x MI50 and 2TB RAM: Configuration and Optimizations

A user has finalized the specifications for their home server, featuring 4 MI50 GPUs, 2 8260L CPUs, and 2TB of DDR4 RAM. The configuration includes a custom VBIOS for Linux, raising questions about potential optimizations and ideal workloads for such...

#Hardware #LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

Nvidia's DMS Cuts LLM Inference Costs by Up to 8x

Nvidia introduced Dynamic Memory Sparsification (DMS), a technique that optimizes KV cache management in LLMs during inference. DMS, through a learned "keep or evict" signal for each token, reduces memory usage by up to 8x, enabling more performant m...
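
The "keep or evict" idea can be sketched as a score-based cache trim. This is not Nvidia's implementation: in the technique described above the per-token signal is learned, whereas here the scores are a plain hypothetical list, and the function simply keeps the highest-scoring fraction of entries.

```python
def evict_kv(cache, keep_scores, compression=8):
    """Keep the highest-scoring 1/compression of cached tokens, in original order."""
    assert len(cache) == len(keep_scores)
    budget = max(1, len(cache) // compression)
    # Rank token positions by score, highest first.
    ranked = sorted(range(len(cache)), key=lambda i: keep_scores[i], reverse=True)
    # Restore positional order for the survivors.
    kept = sorted(ranked[:budget])
    return [cache[i] for i in kept]


# Toy cache of 16 (key, value) entries with two tokens marked as important.
cache = [(f"k{i}", f"v{i}") for i in range(16)]
scores = [0.1] * 16
scores[3] = 0.9
scores[12] = 0.8

compact = evict_kv(cache, scores)
print(len(cache), "->", len(compact), "entries:", compact)
```

With `compression=8`, the 16-entry cache shrinks to 2 entries, which mirrors the "up to 8x" memory reduction in spirit only.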

#Hardware #LLM On-Premise #DevOps
2026-02-13 ServeTheHome

OpenAI GPT-5.3 Achieves 1000 Tokens/Second on Cerebras Chips

OpenAI's GPT-5.3-Codex-Spark model has been optimized to run on Cerebras WSE-3 processors, achieving an inference speed of over 1000 tokens per second. This performance opens new perspectives for applications requiring fast, low-latency responses.

#LLM On-Premise #DevOps
2026-02-13 TechCrunch AI

Claude climbs the charts after Super Bowl ads

Claude's app reached the top 10 on the U.S. App Store following Anthropic's Super Bowl ad campaign. The advertisement, centered on a parody of artificial intelligence, helped increase the visibility and adoption of the application.

2026-02-13 OpenAI Blog

OpenAI: Scaling Access to Codex and Sora Beyond Rate Limits

OpenAI built a real-time access system for Codex and Sora, managing rate limits, tracking usage, and implementing a credit system. This approach ensures continuous access to the platforms, optimizing resources and maintaining service stability.

2026-02-13 Wired AI

Zillow Has Gone Wild—for AI

As the housing market stalls, Zillow’s CEO sees AI as “an ingredient rather than a threat” that can both help the company protect its turf and reinvent how people search for homes.

#LLM On-Premise #DevOps
2026-02-13 TechCrunch AI

xAI: Mass Resignations or Internal Purge?

At least nine engineers, including two co-founders, have announced their exits from xAI in the past week. The resignations raise questions about the stability of Elon Musk's company, already at the center of several controversies. Speculation suggest...

2026-02-13 AI News

AI for Healthcare: Predictive Model to Optimize Resources

Researchers at the University of Hertfordshire have developed an AI model to improve efficiency in healthcare resource allocation. The system analyzes historical data to forecast future demand, supporting decisions on staffing, patient care, and infr...

2026-02-13 LocalLLaMA

DeepSeek testing a new model: focus on reading comprehension

DeepSeek, a Chinese group active in the development of large language models (LLMs), has announced that it is testing a new model. Preliminary benchmarks focus on reading comprehension skills, with results showing variable performance across different...

#LLM On-Premise #DevOps
2026-02-13 TechCrunch AI

Cohere’s $240M year sets stage for IPO

Cohere surpassed $240 million in annual recurring revenue in 2025, highlighting strong enterprise AI demand as the Canadian startup positions itself for a potential IPO. This comes amid intensifying competition from OpenAI and Anthropic.

#LLM On-Premise #DevOps
2026-02-13 Ars Technica AI

RentAHuman: The New Frontier of Gig Work?

RentAHuman is a platform that aims to connect AI agents with human workers for the execution of physical tasks. Launched in early February, the platform was developed by Alexander Liteplo and Patricia Tani and presents itself as a marketplace for on-...

#LLM On-Premise #DevOps
2026-02-13 The Next Web

Stanhope AI raises $8M to build adaptive AI for robotics and defence

London-based deep tech startup Stanhope AI has closed a €6.7 million ($8 million) Seed funding round to advance a new class of adaptive artificial intelligence. The aim is to power autonomous systems in the physical world, moving beyond the limitatio...

2026-02-13 Tech.eu

ScyAI secures €2M and launches AI risk platform for real assets

Zurich-based startup ScyAI has closed a €2 million pre-seed funding round. The company has developed a platform that creates quantified risk profiles for companies with large physical asset portfolios, combining operational data and external hazard m...

2026-02-13 LocalLLaMA

MiniMaxAI releases MiniMax-M2.5 language model on Hugging Face

MiniMaxAI has released its MiniMax-M2.5 language model on the Hugging Face platform. The news, shared on Reddit, points out the absence of quantized versions at the time of release. The LocalLLaMA community is already evaluating the implications and ...

#Hardware #LLM On-Premise #DevOps
2026-02-13 Tom's Hardware

Google: State-sponsored hackers using Gemini in attacks

Google reports that state-sponsored actors from China, Russia, and Iran are leveraging Gemini in various stages of cyberattacks. The AI is being used for phishing, malicious code development, and vulnerability testing, enhancing the offensive capabil...

#LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

DeepSeek tests model with 1 million token context window

DeepSeek is testing a new long-context model architecture, capable of supporting a context window of 1 million tokens. The announcement was shared via a post on X (formerly Twitter) by AiBattle, signaling a significant step forward in long-sequence h...

#LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

ByteDance Releases Protenix-v1 for Biomolecular Structure Prediction

ByteDance has released Protenix-v1, a new open-source model for biomolecular structure prediction. The model achieves AlphaFold3-level performance. The source code is available on GitHub, opening new possibilities for research and development in the ...

#LLM On-Premise
2026-02-13 LocalLLaMA

Google Releases Conductor: Gemini CLI Extension

Google has released Conductor, a CLI (Command Line Interface) extension for Gemini, focused on context management and agent-based workflow orchestration. Conductor stores knowledge in Markdown format, facilitating information organization and access.

#LLM On-Premise #DevOps
2026-02-13 Tech.eu

Simmetry.ai expands AI training platform following €330K funding

Simmetry.ai, a synthetic data company working across agriculture, food and industrial sectors, has secured €330,000 from NBank. The funding, provided through the High-Tech Incubator (HTI) accelerator programme, will support the development of a scala...

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-13 The Register AI

Anthropic's AI-built C compiler fails to impress developers

Anthropic has developed a C compiler using artificial intelligence, but the reception among developers has been lukewarm. The initiative is seen more as a demonstration of capability than as a revolutionary breakthrough in the field of software engin...

#LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

MiniMax onX: Model weights dropping soon

According to a Reddit post, the weights for the MiniMax onX model are expected to be released soon. The news has been met with enthusiasm by the LocalLLaMA community, interested in local LLM inference solutions.

#Hardware #LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

MiniMax-M2.5: Checkpoints Available on Hugging Face

MiniMax-M2.5 model checkpoints will be available on Hugging Face. This announcement, coming from the LocalLLaMA community, signals an opportunity for developers and researchers to access and experiment with this model. Availability on Hugging Face fa...

#Fine-Tuning
2026-02-13 LocalLLaMA

UG student launches Dhi-5B, LLM trained from scratch on a budget

An undergraduate student has launched Dhi-5B, a 5 billion parameter multimodal language model, trained with a budget of approximately $1200. The model was developed using a custom codebase and advanced training methodologies, in several stages, from ...

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-13 LocalLLaMA

Step 3.5 Flash: a promising open-source model for complex tasks?

A user tested Step 3.5 Flash on complex merging tasks with a 90k context window, achieving surprising results. Performance exceeds Gemini 3.0 Preview in agentic scenarios, with remarkable speed. The model demonstrated flexibility with opencode and Cl...

#LLM On-Premise #DevOps
2026-02-13 ArXiv cs.CL

HybridRAG: LLM Chatbot Framework with Pre-Generated Knowledge Base

HybridRAG is a RAG framework that pre-generates a question-answer knowledge base from unstructured documents (PDFs with OCR). This approach aims to reduce latency and improve answer quality in chatbots, compared to standard RAG systems that operate i...
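
A rough sketch of the pre-generated knowledge base idea: answer from stored question-answer pairs when a query matches closely enough, otherwise hand off to a slower pipeline. Everything here is an assumption for illustration, not the paper's method: token-overlap (Jaccard) similarity stands in for whatever matching HybridRAG uses, and the `fallback` function, the threshold, and the sample QA pairs are hypothetical.

```python
def tokens(text):
    return set(text.lower().split())


def jaccard(a, b):
    """Token-set overlap in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def answer(query, qa_base, fallback, threshold=0.5):
    """Return a pre-generated answer if a stored question is similar enough."""
    q_tokens = tokens(query)
    best_q = max(qa_base, key=lambda q: jaccard(q_tokens, tokens(q)))
    if jaccard(q_tokens, tokens(best_q)) >= threshold:
        return qa_base[best_q]      # fast path: no generation at query time
    return fallback(query)          # slow path: e.g. retrieve and generate


# Hypothetical QA pairs generated offline from documents.
qa_base = {
    "what is the warranty period": "Two years.",
    "how do i reset the device": "Hold the power button for ten seconds.",
}


def fallback(query):
    return "fallback: " + query


hit = answer("what is the warranty period for this", qa_base, fallback)
miss = answer("shipping cost to canada", qa_base, fallback)
print(hit)
print(miss)
```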

#LLM On-Premise #DevOps #RAG
2026-02-13 ArXiv cs.LG

KBVQ-MoE: Low-Bit Quantization for MoE Large Language Models

A novel framework, KBVQ-MoE, addresses the challenges of low-bit quantization in Mixture of Experts (MoE) large language models (LLMs). By leveraging redundancy elimination and bias-corrected output stabilization, KBVQ-MoE aims to preserve accuracy e...

#LLM On-Premise #DevOps
2026-02-13 ArXiv cs.LG

Enhancing LLMs for Automated Optimization via MIND

A novel approach, MIND, aims to enhance the capabilities of Large Language Models (LLMs) in automated optimization. MIND addresses existing limitations in model training by focusing on error-specific problems and refining solutions locally. Results d...

#Fine-Tuning
2026-02-13 ArXiv cs.AI

Explaining AI Without Code: A User Study on Explainable AI

A new study explores Explainable AI (XAI) in no-code ML platforms, focusing on making explanations accessible to both novices and experts. The research evaluates an XAI module in DashAI, an open-source platform, using techniques like Partial Dependen...

2026-02-13 DigiTimes

AUO to hire 1,000 in 2026 as AI expands display, smart mobility push

Display manufacturer AUO plans to hire 1,000 people in 2026. The expansion is driven by increasing demand for AI solutions in the display and smart mobility sectors. The company aims to strengthen its presence in these growing markets.

#LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

StepFun Team: AMA session on Step 3.5 Flash models

The StepFun team hosted an AMA (Ask Me Anything) session on Reddit, focusing on Step 3.5 Flash models and other Step models. The session covered aspects related to model training, the future roadmap, and features desired by users. The team's research...

#LLM On-Premise #DevOps
2026-02-13 LocalLLaMA

GLM-5 and MiniMax-2.5 benchmarked on Fiction.liveBench

A user shared on Reddit the results of a comparative benchmark between the GLM-5 and MiniMax-2.5 language models, using the Fiction.liveBench dataset. The analysis, focused on the models' performance in narrative content generation scenarios, offers ...

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-13 The Register AI

Cloudflare turns websites into faster food for AI agents

Cloudflare shifts its focus from bot barriers to offering structured data for AI agents. The goal is to provide content in more easily processed formats, such as Markdown, instead of HTML.

#LLM On-Premise #DevOps
2026-02-13 DigiTimes

Anthropic's hive-mind model sets a new pace for AI development

Anthropic is pushing the boundaries of artificial intelligence development with a new 'hive-mind' approach. This model promises to significantly accelerate development times and open new frontiers in AI, although technical details remain scarce.

2026-02-12 TechCrunch AI

IBM to Focus on Entry-Level Talent in the Age of AI

IBM plans to triple its entry-level hiring in the U.S. by 2026. These roles will focus on different tasks than in previous years, reflecting the evolving job market driven by artificial intelligence.

2026-02-12 Ars Technica AI

OpenAI sidesteps Nvidia with GPT-5.3-Codex-Spark coding model on Cerebras

OpenAI released GPT-5.3-Codex-Spark, its first production AI model to run on non-Nvidia hardware, deploying on Cerebras chips. The model delivers code at over 1,000 tokens per second, roughly 15 times faster than its predecessor. Access is available ...

#Hardware #LLM On-Premise #DevOps
2026-02-12 The Register AI

OpenAI adopts Cerebras silicon for its models

OpenAI unveiled GPT-5.3-Codex-Spark, its first model designed to run on Cerebras Systems' AI accelerators. These accelerators, known for their large size and high-speed on-chip memory, directly compete with Nvidia and AMD solutions in the artificial ...

#Hardware #LLM On-Premise #DevOps
2026-02-12 LocalLLaMA

MiniMaxAI: M2.5 model with 230 billion parameters

OpenHands announced that the MiniMaxAI M2.5 model has 230 billion parameters, with 10 billion active parameters. Currently, the model is not yet available on Hugging Face. The news was shared via a Reddit post.

#LLM On-Premise #DevOps
2026-02-12 Tom's Hardware

Nvidia DGX Spark update cuts idle power by 32% or more

An Nvidia DGX Spark update introduces hot-plug detection on the ConnectX NIC, cutting the workstation's idle power consumption by 32% or more.

#Hardware #LLM On-Premise #DevOps
2026-02-12 TechCrunch AI

Anthropic: Valuation Soars to $380 Billion After Series G Round

Anthropic, a chatbot maker, has announced a new Series G funding round, bringing its valuation to $380 billion. This reflects the increasing investor interest in the generative artificial intelligence sector and its potential applications.

#LLM On-Premise #DevOps
2026-02-12 LocalLLaMA

Ant Group releases Ming-flash-omni-2.0, a 100B multimodal model

Ant Group has released Ming-flash-omni-2.0, a multimodal model with 100 billion parameters (6 billion active). This unified model handles image, text, video, and audio inputs, generating outputs in the same formats. The architecture promises integrat...

#LLM On-Premise #DevOps
2026-02-12 TechCrunch AI

OpenAI's Codex: new version powered by a dedicated chip

OpenAI announced a new version of its Codex coding tool, highlighting it as a milestone in its relationship with a chipmaker. No details were provided on the chip's technical specifications or the performance improvements achieved.

#Hardware #Fine-Tuning
2026-02-12 LocalLLaMA

MiniMax M2.5 Officially Released: Promising Performance

MiniMax has officially announced the release of its new language model, M2.5. Early benchmarks show promising results in several tests, including SWE-Bench and BrowseComp. The company has published a dedicated webpage with more details on the model a...

#LLM On-Premise #DevOps
2026-02-12 LocalLLaMA

inclusionAI releases Ring-1T-2.5, model optimized for deep thinking

inclusionAI has announced the release of Ring-1T-2.5, a new large language model (LLM) designed to deliver state-of-the-art performance in tasks requiring deep thinking. The model is available on Hugging Face in FP8 format, facilitating its use and i...

#Hardware #LLM On-Premise #DevOps
2026-02-12 Google AI Blog

Gemini 3 Deep Think: Advancing science, research and engineering

Google introduces Gemini 3 Deep Think, an update designed to navigate the complex challenges of modern science, advanced research, and precision engineering. The initiative aims to provide enhanced tools and resources for professionals in these field...

#LLM On-Premise #DevOps
2026-02-12 LocalLLaMA

Ovis2.6-30B-A3B: New Open Source Multimodal Model

Ovis2.6-30B-A3B, a multimodal language model (MLLM) building on Ovis2.5, has been released. This model introduces a Mixture-of-Experts (MoE) architecture to improve multimodal performance and understanding of long contexts and complex documents, whil...

2026-02-12 Tom's Hardware

Cadence embeds AI for advanced chip design

Cadence introduces an AI-powered 'super agent' to assist engineers with chip design in its EDA tools. The company aims to handle complex projects with over a trillion transistors by 2030, leveraging AI for debugging and verification.

#LLM On-Premise #DevOps
2026-02-12 The Register AI

Elon Musk paints exodus of xAI co-founders as 'evolution'

Elon Musk has framed the recent exodus of talent from his artificial intelligence startup, xAI, as a necessary growing pain, saying the company's evolution "required parting ways with some people." The initial 12-strong founding team is now down to 6...

#LLM On-Premise #DevOps
2026-02-12 Ars Technica AI

Tested: How Chrome’s Auto Browse agent handles common web tasks

Google has released Chrome's Auto Browse agent in preview for AI Pro and AI Ultra subscribers. The article analyzes the capabilities of this AI agent in automating common web tasks, evaluating its effectiveness and reliability in performing online ta...

#LLM On-Premise #DevOps
2026-02-12 404 Media

AI Abuse: Nude AI Images Created, OnlyFans Opened in Her Name

A woman was victimized by AI-generated images. Strangers created nude images from her profile and opened an OnlyFans account in her name. The incident occurred during a surge in the generation of sexual images via AI, raising questions about the misu...

2026-02-12 The Next Web

A2A Protocol: AI agents communicate autonomously

The agent-to-agent (A2A) protocol aims to bridge the gap between AI automation and human action. The goal is to enable AIs to interact and complete complex tasks without direct user intervention, opening new frontiers in automation and process effici...

#LLM On-Premise #DevOps
2026-02-12 LocalLLaMA

Samsung explores REAM: LLM model reduction without 'lobotomy'

Samsung proposes REAM (REAP-less) as an alternative to Cerebras' REAP for reducing the size of large language models (LLMs). REAM aims to minimize the loss of model capabilities during the compression process. Qwen3 models reduced via REAM have been ...

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-12 DigiTimes

Samsung reclaims AI crown with world-first HBM4 mass production

Samsung announces the start of mass production of HBM4 memory, a world first that could redefine performance in artificial intelligence and high-performance computing. This move consolidates Samsung's position in the high-speed memory market.

#Hardware #LLM On-Premise #DevOps
2026-02-12 DigiTimes

Z.ai unveils GLM-5, advances AI agents and China chip compatibility

Z.ai has announced GLM-5, a new version of its large language model (LLM), with improvements in AI agent capabilities and a focus on compatibility with Chinese hardware. This development could have significant implications for the AI landscape in Chi...

#Hardware #LLM On-Premise #DevOps
2026-02-12 ArXiv cs.CL

KV Policy: Reinforcement Learning for Key-Value Cache Eviction in LLMs

A novel approach to Key-Value (KV) cache management in Large Language Models (LLMs) employs reinforcement learning (RL) to optimize token eviction. KV Policy (KVP) trains lightweight RL agents to predict the future utility of tokens, outperforming tr...

#Fine-Tuning
2026-02-12 ArXiv cs.CL

LT-Tuning: Enhanced LLM Reasoning in Continuous Latent Spaces

A novel approach, Latent Thoughts Tuning (LT-Tuning), aims to enhance the reasoning capabilities of Large Language Models (LLMs) by leveraging continuous latent spaces. This method contrasts with the traditional Chain-of-Thought (CoT) approach, which...

#LLM On-Premise #DevOps
2026-02-12 LocalLLaMA

Unsloth releases GLM-5 in GGUF format for local inference

Unsloth has announced the release of GLM-5 in GGUF format, paving the way for model inference on local hardware. The GGUF format facilitates the use of the model with tools like llama.cpp, making it accessible to a wide range of users and application...

#Hardware #LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

GLM-5 scores 50 on the Intelligence Index

The GLM-5 language model has achieved a score of 50 on the Intelligence Index, positioning itself as a leader among open-source models. The news was shared on Reddit, highlighting the growing interest in increasingly performant models accessible to t...

#LLM On-Premise #DevOps
2026-02-11 TechCrunch AI

Apple’s Siri revamp reportedly delayed… again

Apple's highly anticipated Siri revamp, powered by Apple Intelligence and promised since 2024, is reportedly facing another delay. The implications for users and the competitive landscape of voice assistants remain to be seen.

#LLM On-Premise #DevOps
2026-02-11 TechCrunch AI

Who will own your company’s AI layer? Glean’s CEO explains

Enterprise AI is shifting fast from chatbots that answer questions to systems that actually do the work across an organization. Glean's CEO explores who will own the AI layer and how companies can prepare.

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

Z.ai reports GPU shortage for its workloads

Z.ai has publicly stated that it is struggling to find enough GPUs to support its activities. The news emerged on Reddit, highlighting the challenges many companies face in gaining access to the hardware resources needed for inference and training of...

#Hardware #LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

Zai-Org's GLM-5 Available on Hugging Face

The GLM-5 language model developed by Zai-Org is now accessible via Hugging Face. The news was shared on Reddit, paving the way for new experimentation and applications of the model by the open-source community. Further technical details and download...

2026-02-11 LocalLLaMA

GLM-5: New Language Model with 744 Billion Parameters Officially Released

Zai has announced GLM-5, a large language model (LLM) designed for complex systems and long-horizon agentic tasks. Compared to the previous version, GLM-5 boasts a significantly larger number of parameters (744 billion) and a more extensive pre-train...

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-11 OpenAI Blog

Prompt Engineering: Leveraging Codex in an Agent-First World

The article explores how prompt engineering, enhanced by models like Codex, is becoming crucial in a landscape where autonomous software agents increasingly drive digital interactions. It discusses the importance of well-defined prompts to achieve op...

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

Kimi-K2.5 support added to llama.cpp

The llama.cpp library has added support for the Kimi-K2.5 model. This integration allows users to utilize the model directly within llama.cpp, expanding the options available for local language model inference.

#Hardware #LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

MOSS-TTS Released: Open Source Text-to-Speech

MOSS-TTS, a new open-source text-to-speech model, has been released. The news was shared via a post on Reddit, paving the way for new experiments in the field of voice generation.

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

MiniMax M2.5: New Version Coming Soon

A user reported the upcoming release of MiniMax M2.5 on the LocalLLaMA forum. Further details on the model and its capabilities are not yet available, but the news has drawn interest from the open-source community focused on local LLM solutions.

#Hardware #LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

GLM 5.0 & MiniMax 2.5: Are We Entering China's Agent War Era?

New versions of GLM and MiniMax, two language models developed in China, have been released. GLM 5.0 focuses on advanced reasoning and code development, while MiniMax 2.5 concentrates on decomposing complex tasks and long-running execution. The compe...

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

MiniMax M2.5 Released

The release of the MiniMax M2.5 model has been announced. MiniMax is a platform providing large language models (LLMs) and tools for developing AI-powered applications. The new version promises performance improvements and new features, but specific ...

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

GLM-5 Released: Zhipu AI's New Language Model

Zhipu AI has released GLM-5, the latest version of its language model. The news was shared via a Reddit post linking to the Zhipu AI website, where users can interact with the model through a chat interface.

#LLM On-Premise #DevOps
2026-02-11 Tom's Hardware

SMIC warns of overcapacity in AI data centers

China's top chipmaker, SMIC, warns that AI data center capacity could outstrip demand. The company emphasizes the need for more careful planning to effectively utilize resources.

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

Zhipu is rolling out GLM-5: a new AI model shaking up the market

The Chinese company Zhipu has announced the release of its new artificial intelligence model, GLM-5. The launch, scheduled soon, promises to intensify competition in the sector. This update could lead to new opportunities for those seeking advanced a...

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

Grok-3 joins upcoming models list

Elon Musk hinted at the upcoming release of Grok-3, the next iteration of the language model developed by xAI. Details regarding technical specifications or release date are not yet available, but the announcement has generated interest within the op...

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

DeepSeek Updated to 1M Context Window

The DeepSeek application has been updated with a 1 million token context window. The knowledge cutoff date has been extended to May 2025. It is currently unclear whether this is a new model. There are no updates on their Hugging Face page yet.

#LLM On-Premise #DevOps
2026-02-11 The Next Web

The next Renaissance: Why creativity is the currency of the AI age

Artificial intelligence is rewriting the rules of work and human potential. Creativity, imagination, and the ability to innovate become valuable assets. Technology handles tedious tasks, allowing humans to focus on higher-level activities. A future w...

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

DeepSeek Tests New Model with 1 Million Token Context Window

DeepSeek has launched limited grayscale testing for its new language model, featuring a 1 million token context window and an updated knowledge base. Access is currently restricted to a select group of users through its official website and app.

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

Nanbeige LLM Lab introduces Nanbeige4.1-3B, a 3 billion parameter open-source model designed to excel in complex reasoning, alignment with human preferences, and agentic behavior. The model supports contexts up to 256,000 tokens and shows promising r...

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

EpsteinFiles-RAG: Building a RAG Pipeline on 2M+ Pages

A developer has built an open-source RAG (Retrieval-Augmented Generation) pipeline to query a dataset of over 2 million pages extracted from the "Epstein Files". The project aims to optimize semantic search and Q&A performance at scale, addressing th...

#Fine-Tuning #RAG
2026-02-11 LocalLLaMA

Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

Nanbeige LLM Lab introduces Nanbeige4.1-3B, a 3 billion parameter open-source model designed to excel in complex reasoning, alignment with human preferences, and agentic capabilities. The model supports contexts up to 256k tokens and demonstrates str...

#LLM On-Premise #DevOps
2026-02-11 LocalLLaMA

Fine-tuning Qwen 14B for Discord Autocomplete

A user fine-tuned the Qwen 14B model on their Discord messages to get personalized autocomplete suggestions. The model was trained with Unsloth.ai and QLoRA on a Kaggle GPU and integrated with Ollama for local use.

#Hardware #LLM On-Premise #Fine-Tuning
2026-02-11 Anthropic News

Anthropic Introduces Claude Opus 4.6: The Latest Model Evolution

Anthropic has announced Claude Opus 4.6, the latest version of its flagship language model. This release promises enhanced performance and new features, solidifying Claude's position in the landscape of large language models (LLMs). The announcement ...

#Hardware #LLM On-Premise #DevOps
2026-02-10 TechCrunch AI

Flapping Airplanes: $180 Million Seed Funding for New AI Lab

AI lab Flapping Airplanes secured $180 million in seed funding from Google Ventures, Sequoia, and Index. Their goal is to develop learning models that mimic human reasoning, moving away from the traditional approach of massive internet data analysis.

#LLM On-Premise #DevOps
2026-02-10 LocalLLaMA

Plano: AI agent framework reaches 5000 stars on GitHub

Plano, an open-source framework for developing AI agents, has surpassed 5000 stars on GitHub. The project focuses on small LLMs for routing and orchestration, with a framework-agnostic approach. Plano acts as a model-integrated proxy server and data ...

#LLM On-Premise #DevOps
2026-02-10 LocalLLaMA

Kimi: a promising LLM according to the LocalLLaMA community

The LocalLLaMA community has expressed positive opinions about Kimi, a large language model, favorably comparing it to ChatGPT and Claude. Some users consider it superior in certain applications, opening new perspectives for local inference and use i...

#LLM On-Premise #DevOps
2026-02-10 LocalLLaMA

Analyzing the 'Personality' of Open-Source LLMs via Hidden States

A researcher analyzed the hidden states of six open-source language models (7B-9B parameters) to measure their 'personality'. The analysis reveals distinct behavioral fingerprints, different reactions to hostile users, and behavioral 'dead zones,' po...

#LLM On-Premise #DevOps
2026-02-10 LocalLLaMA

Hugging Face Is Teasing Something Anthropic Related

Hugging Face has hinted at a possible collaboration with Anthropic, the company behind the Claude models. While the exact nature of the collaboration remains uncertain, speculation suggests it might be a dataset for improving model safety, rather tha...

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-10 The Register AI

Frankfurt to dethrone London as colocation king by 2031

According to the EU Data Centre Association (EUDCA), Frankfurt is set to surpass London as the leading colocation hub in Europe by 2031. The growth is driven by data sovereignty requirements and the expansion of artificial intelligence.

#LLM On-Premise #DevOps
2026-02-10 LocalLLaMA

Qwen-Image-2.0: 7B unified model for image generation and editing

The Qwen team has released Qwen-Image-2.0, a 7B unified model for image generation and editing, capable of text rendering and handling 2K images. Currently available only via API on Alibaba Cloud (invite beta) and as a free demo on Qwen Chat, the release ...

#Hardware #LLM On-Premise #DevOps
2026-02-10 Tech.eu

Vesiro raises €1.6M to optimise Elasticsearch and lower server energy use

Gothenburg-based Vesiro has raised €1.6 million to develop a plug-in for Elasticsearch. The aim is to improve search efficiency in large-scale data environments, reducing the number of servers required and energy consumption. The funding will support...

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-10 LocalLLaMA

Step-3.5-Flash: A Compact Yet Powerful LLM

A user reported the effectiveness of the Step-3.5-Flash model, highlighting its superior performance compared to larger models like GPT-OSS 120B in certain contexts. Its availability on OpenRouter and performance comparable to DeepSeek V3.2, despite ...

2026-02-10 ArXiv cs.CL

Visual Language Models: Tokenization Bypassed or Reintroduced?

A recent study analyzes whether pixel-based language models effectively overcome the limitations of tokenization, especially in languages with non-Latin scripts. The results highlight how integrating text tokenizers can reintroduce alignment issues, ...

#LLM On-Premise #DevOps
2026-02-10 ArXiv cs.LG

Lagged backward-compatible neural networks for soil consolidation analysis

A Lagged Backward-Compatible Physics-Informed Neural Network (LBC-PINN) has been developed to simulate unsaturated soil consolidation under long-term loading. The framework integrates logarithmic time segmentation and transfer learning to improve acc...

#LLM On-Premise #DevOps
2026-02-10 ArXiv cs.AI

ST-Raptor: An Agentic System for Semi-Structured Table QA

ST-Raptor is an agentic system for question answering (QA) on semi-structured tables. It combines visual editing, tree-based structural modeling, and agent-driven query resolution to improve accuracy and usability in table understanding. Experimental...

#Fine-Tuning
2026-02-10 LocalLLaMA

Local Home Assistant with Qwen3 on RTX 5060 Ti

An open-source project demonstrates a fully local home automation voice assistant, powered by Qwen3 models for ASR, LLM, and TTS. The system runs on an RTX 5060 Ti GPU with 16GB VRAM, highlighting the feasibility of on-prem AI implementations even wi...

#LLM On-Premise #DevOps
2026-02-10 LocalLLaMA

Kimi-Linear-48B-A3B-Instruct: LLM model and GGUF for extended context

A new LLM, Kimi-Linear-48B-A3B-Instruct, is available with promising support for extended contexts, surpassing GLM 4.7 Flash. The community has released a GGUF version, facilitating the model's use and integration into various environments.

#LLM On-Premise #DevOps
2026-02-09 LocalLLaMA

Waiting for DeepSeek V4, GLM-5, Qwen 3.5 and MiniMax 2.2

The LocalLLaMA community is eagerly awaiting new versions of large language models (LLMs) such as DeepSeek V4, GLM-5, Qwen 3.5, and MiniMax 2.2. There is particular interest in the performance of DeepSeek V4 via OpenRouter and the capabilities of GLM...

#Hardware #LLM On-Premise #DevOps
2026-02-09 OpenAI Blog

Custom ChatGPT for U.S. Defense on GenAI.mil

OpenAI for Government announces the deployment of a custom ChatGPT on the GenAI.mil platform, aiming to provide secure and reliable artificial intelligence tools to U.S. defense teams. The platform aims to enhance operational capabilities while maint...

#LLM On-Premise #DevOps
2026-02-09 LocalLLaMA

Aurora Alpha: New LLM Model Available on OpenRouter

A new LLM, named Aurora Alpha, has been released on OpenRouter. The model is accessible for free ($0/M tokens). Further details on the architecture and capabilities of Aurora Alpha are available on the OpenRouter platform.

#LLM On-Premise #DevOps
2026-02-09 TechCrunch AI

Databricks CEO says AI will soon make SaaS irrelevant

Databricks CEO Ali Ghodsi believes that AI will not replace major SaaS apps with vibe-coded versions, but it could give rise to competitors. The major impact will therefore be on innovation and competition in the software market.

#LLM On-Premise #DevOps
2026-02-09 LocalLLaMA

Qwen: A step forward for local LLM inference?

A recent update to llama.cpp appears to improve support for the Qwen language model. This development could facilitate the execution and inference of large models on local hardware, opening new possibilities for on-premise applications and resource-c...

#Hardware #LLM On-Premise #DevOps
2026-02-09 LocalLLaMA

Qwen3-Coder-Next: A Versatile Model That Goes Beyond Code

A user shares their positive experience with Qwen3-Coder-Next, highlighting its ability to provide stimulating conversations and pragmatic solutions. Despite the name, the model proves valuable even for tasks beyond software development, approaching ...

2026-02-09 TechCrunch AI

Anthropic eyes $20B funding round amid compute cost pressures

Anthropic, a leading AI company, is reportedly pursuing a new funding round potentially reaching $20 billion. This move is driven by intense competition and the significant compute costs associated with developing advanced AI models.

#Hardware #LLM On-Premise #DevOps
2026-02-09 TechCrunch AI

InfiniMind: AI to unlock the value of enterprise video data

Founded by former Google Japan leaders, InfiniMind is building AI solutions to transform enterprise video archives into actionable business intelligence. The goal is to make video content searchable and usable to extract valuable insights.

2026-02-09 LocalLLaMA

GLM-5: New details on model architecture released

A pull request reveals further details on the architecture and parameters of GLM-5. The documentation includes diagrams and technical specifications of the model, offering a clearer overview of its internal capabilities. This upda...

#LLM On-Premise #DevOps
2026-02-09 LocalLLaMA

GLM-5 Support Is On Its Way For Transformers: What it Means

The integration of GLM-5 into Hugging Face's Transformers framework suggests an imminent model release. Clues point to a possible stealth deployment of GLM-5, named Pony Alpha, on the OpenRouter platform. This development could broaden options for th...

#LLM On-Premise #DevOps
2026-02-09 Tech.eu

MuseCool: AI to Revolutionize Music Education

The startup MuseCool uses artificial intelligence to personalize music lessons, bridge gaps in traditional learning, and make studying more engaging. Through audio analysis, AI generates personalized exercises and provides feedback, transforming prac...

2026-02-09 LocalLLaMA

Ministral-3-3B: a compact model for local inference

A user reported a positive experience with the Ministral-3-3B model, highlighting its effectiveness in running tool calls and its ability to operate with only 6GB of VRAM. The model, in its instruct version and quantized to Q8, proves suitable for re...

#Hardware #LLM On-Premise #DevOps
2026-02-09 LocalLLaMA

GLM-5 Incoming: Spotted in vLLM Pull Request

Hints of the upcoming GLM-5 language model have surfaced in a pull request related to vLLM, a framework for LLM inference. The news, initially shared on Reddit, suggests that the new model might soon be integrated and available to the open-source com...

#Hardware #LLM On-Premise #DevOps
2026-02-09 ArXiv cs.AI

Large Language Model Reasoning Failures: An Analysis

A new study systematically analyzes reasoning failures in large language models (LLMs). The research introduces a categorization framework for reasoning types (embodied and non-embodied) and classifies failures based on their origin: intrinsic archit...

#LLM On-Premise #DevOps
2026-02-09 ArXiv cs.AI

Jackpot: Optimal Sampling for Efficient RL and LLMs

Researchers propose Jackpot, a framework for reinforcement learning (RL) with LLMs. Jackpot uses Optimal Budget Rejection Sampling (OBRS) to reduce the discrepancy between the rollout model and the evolving policy, improving training stability and ef...

2026-02-09 LocalLLaMA

WokeAI Releases Three New Open Source 'Tankie' LLM Models

The WokeAI group has announced the release of three new open-source large language models (LLMs), named 'Tankie', designed for ideological analysis and critique of power structures. The models are available on the Hugging Face Hub and can be run on v...

#Hardware #LLM On-Premise #Fine-Tuning
2026-02-09 LocalLLaMA

Qwen3.5 Support Merged in llama.cpp

Support for the Qwen3.5 language model has been merged into llama.cpp. This addition allows users to run and experiment with Qwen3.5 directly on local hardware, opening new possibilities for developers and researchers interested in on-premise inferen...

#Hardware #LLM On-Premise #DevOps
2026-02-08 LocalLLaMA

Interactive Visualization of LLM Models in GGUF Format

An enthusiast has developed a tool to visualize the internal architecture of large language models (LLMs) saved in .gguf format. The goal is to make the structure of these models more transparent, traditionally considered "black boxes". The tool allo...

#LLM On-Premise #DevOps
2026-02-08 LocalLLaMA

LLM Benchmark: Qwen MoE outperforms LLaMA-70B in neuroscience

A new benchmark in neuroscience and brain-computer interfaces (BCI) reveals that the Qwen3 235B MoE model outperforms LLaMA-3.3 70B. The results highlight a shared accuracy ceiling among different models, suggesting that limitations lie in epistemic ...

#LLM On-Premise #DevOps
2026-02-08 LocalLLaMA

StepFun 3.5 Flash vs MiniMax 2.1: comparison on Ryzen

A user compares the performance of StepFun 3.5 Flash and MiniMax 2.1, two large language models (LLMs), on an AMD Ryzen platform. The analysis focuses on processing speed and VRAM usage, highlighting the trade-offs between model intelligence and respo...

#Hardware #LLM On-Premise #DevOps
2026-02-08 LocalLLaMA

Tandem: local, open-source AI workspace using Rust and SQLite

A developer has created Tandem, an AI workspace that runs entirely locally, without sending data to the cloud. The solution uses Rust, Tauri, and sqlite-vec, offering a lightweight alternative to Python/Electron apps. It supports local Llama models v...

#LLM On-Premise #DevOps #RAG
2026-02-08 Phoronix

Intel Releases QATlib 26.02 With New APIs For Zero-Copy DMA

Intel has released QATlib 26.02, the newest version of its user-space library for leveraging QuickAssist Technology (QAT) on capable hardware. This release introduces new APIs for zero-copy DMA, improving compression and encryption performance. QAT r...

#Hardware #LLM On-Premise #DevOps
2026-02-07 DigiTimes

Record Japan blizzard threatens AI chip supply chains

Severe blizzards in Japan are threatening the supply chains of AI chips. The situation could impact the production and distribution of essential components for the sector.

#LLM On-Premise #DevOps
2026-02-07 LocalLLaMA

Full Claude Opus 4.6 System Prompt

A user shared a full system prompt for Claude Opus 4.6 on Reddit. The prompt is available on GitHub and offers an in-depth look at the model's internal configuration.

#LLM On-Premise #DevOps
2026-02-07 LocalLLaMA

DeepSeek V3.2: AIME 2026 results above 90% with minimal costs

AIME 2026 benchmark results show high performance, above 90%, for both closed and open-source models. DeepSeek V3.2 stands out with a test execution cost of only $0.09, opening new perspectives on the efficiency of language models.

#LLM On-Premise #DevOps
2026-02-07 LocalLLaMA

LLM Benchmarking: Total Wait Time vs. Tokens Per Second

A LocalLLaMA user has developed an alternative benchmarking method for evaluating the real-world performance of large language models (LLMs) locally. Instead of focusing on tokens generated per second, the benchmark measures the total time required t...

#Hardware #LLM On-Premise #DevOps
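
To illustrate why total wait time can rank setups differently from tokens per second, here is a minimal sketch. It assumes the wait is simply prompt processing time plus generation time; the post's actual method is not described in the summary, and the `RunProfile` class, the function name, and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunProfile:
    """Hypothetical measurements for one model/hardware combination."""
    prompt_tokens: int   # tokens in the input prompt
    output_tokens: int   # tokens generated in the reply
    prefill_tps: float   # prompt-processing speed, tokens per second
    decode_tps: float    # generation speed, tokens per second


def total_wait_seconds(p: RunProfile) -> float:
    """Assumed model: wait = prompt processing time + generation time."""
    return p.prompt_tokens / p.prefill_tps + p.output_tokens / p.decode_tps


# Setup A generates faster, setup B processes the prompt faster (made-up values).
setup_a = RunProfile(prompt_tokens=20_000, output_tokens=500,
                     prefill_tps=200.0, decode_tps=50.0)
setup_b = RunProfile(prompt_tokens=20_000, output_tokens=500,
                     prefill_tps=2_000.0, decode_tps=30.0)

for name, profile in (("A", setup_a), ("B", setup_b)):
    print(f"setup {name}: {total_wait_seconds(profile):.1f} s total wait")
```

With a long prompt, the setup with the lower generation speed can still finish sooner, which is the kind of effect a tokens-per-second figure alone would hide.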
2026-02-07 LocalLLaMA

DeepSeek-V2-Lite: performance on modest hardware with OpenVINO

A user compared DeepSeek-V2-Lite and GPT-OSS-20B on a 2018 laptop with integrated graphics, using OpenVINO. DeepSeek-V2-Lite showed almost double the speed and more consistent responses compared to GPT-OSS-20B, although with some logical and programm...

#Hardware
2026-02-07 LocalLLaMA

Qwen and ByteDance testing new seed models on the Arena

Potential new Qwen and ByteDance models are being tested on the Arena. The “Karp-001” and “Karp-002” models claim to be Qwen-3.5 models. The “Pisces-llm-0206a” and “Pisces-llm-0206b” models are identified as ByteDance models, suggesting further expan...

#LLM On-Premise #DevOps
2026-02-07 LocalLLaMA

Kimi-Linear-48B-A3B & Step3.5-Flash are ready - llama.cpp

Releases of Kimi-Linear-48B-A3B and Step3.5-Flash compatible with llama.cpp are now available. Official GGUF files are not yet available, but the community is already working on their creation. The availability of these models expands options for loc...

#Hardware #LLM On-Premise #DevOps
2026-02-07 LocalLLaMA

Open-sourced exact attention kernel: 1M tokens in 1GB VRAM

Geodesic Attention Engine (GAE) is an open-source kernel that promises to drastically reduce memory consumption for large language models. With GAE, it's possible to handle 1 million tokens with only 1GB of VRAM, achieving significant energy savings ...

#Hardware #LLM On-Premise #DevOps
2026-02-07 LocalLLaMA

Nemo 30B: LLM with 1M Token Context Window on a Single RTX 3090

A user tested the Nemo 30B language model, achieving a context window of over 1 million tokens on a single RTX 3090 GPU. The user reported a speed of 35 tokens per second, sufficient to summarize books or research papers in minutes. The model was com...

#Hardware #LLM On-Premise #DevOps
2026-02-06 LocalLLaMA

GLM-5 Is Being Tested On OpenRouter

The GLM-5 language model is currently being tested on the OpenRouter platform. This news, originating from a Reddit discussion, indicates a potential expansion of the models available to OpenRouter users, opening new possibilities for artificial inte...

#LLM On-Premise #DevOps
2026-02-06 Phoronix

ML-LIB: Machine Learning Library Proposed For The Linux Kernel

An IBM engineer has proposed a machine learning library (ML-LIB) for the Linux kernel. The intent is to plug running ML models directly into the kernel to optimize system performance and enable various other functionalities. The proposal is curren...

#LLM On-Premise #DevOps
2026-02-06 LocalLLaMA

Experimental Model with Subquadratic Attention: Up to 10M Context Length

A 30B experimental model with a subquadratic attention mechanism has been released, scaling at O(L^(3/2)). It enables handling contexts up to 10 million tokens on a single GPU, maintaining practical decoding speeds. Includes an OpenAI-compatible server...

#Hardware #LLM On-Premise #DevOps
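
As a rough illustration of what O(L^(3/2)) scaling means next to standard quadratic attention, the sketch below compares the two growth rates. It ignores constant factors and hardware effects entirely, so it says nothing about the released model's real speed; the function name is hypothetical.

```python
def attention_cost_ratio(context_len: int, exponent: float = 1.5) -> float:
    """Ratio of quadratic cost L^2 to subquadratic cost L^exponent.

    Constants are ignored, so this is only an asymptotic comparison.
    For exponent 1.5 the ratio simplifies to sqrt(L).
    """
    return context_len ** 2 / context_len ** exponent


for length in (10_000, 1_000_000, 10_000_000):
    ratio = attention_cost_ratio(length)
    print(f"L = {length:>10,}: quadratic / subquadratic = {ratio:,.0f}x")
```

Because the ratio grows as the square root of the context length, the asymptotic gap widens as contexts get longer, which is why such mechanisms are of interest for multi-million-token windows.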
2026-02-06 LocalLLaMA

Hugging Face: Community-Driven LLM Benchmark Repositories

Hugging Face introduces benchmark repositories for community-driven LLM evaluations. The initiative aims to address inconsistencies in benchmark results, allowing users to contribute evaluations and directly link models to leaderboards. Verified resu...

#LLM On-Premise #DevOps
2026-02-06 LocalLLaMA

llama.cpp integrates Kimi-Linear support: improved performance

The llama.cpp library has integrated support for Kimi-Linear, a model architecture that promises to improve inference performance. The support landed through a pull request on GitHub, opening new possibilities for efficient inference.

#Hardware #LLM On-Premise #DevOps
2026-02-06 AI News

How separating logic and search boosts AI agent scalability

A new framework, ENCOMPASS, separates the workflow logic of AI agents from inference strategies. This approach, developed by Asari AI, MIT CSAIL, and Caltech, aims to reduce technical debt and improve performance, enabling more efficient management o...

#LLM On-Premise #DevOps
2026-02-06 LocalLLaMA

LLM Inference: DeepSpeed Optimization and Performance

A user shares an image related to optimizing the inference of large language models (LLMs) using DeepSpeed. The image suggests an analysis of performance and configurations to improve the speed and efficiency of running these models.

#Hardware
2026-02-06 LocalLLaMA

Qwen3-235B: User Praises Local Performance

A user shared their positive experience with the Qwen3-235B language model, running it on a desktop system. The user highlighted the model's accuracy and utility, to the point of preferring it over a commercial ChatGPT subscription.

#LLM On-Premise #DevOps
2026-02-05 LocalLLaMA

Gemma 4: Is Google still developing the language model?

The LocalLLaMA community is questioning the future of Gemma 4, wondering if Google is still investing in the development of the language model. Despite progress in the sector, the fate of Gemma 4 remains uncertain.

#LLM On-Premise #DevOps
2026-02-05 LocalLLaMA

New OCR Models: LightOnOCR-2 and GLM-OCR Improve Accuracy

LightOnOCR-2 and GLM-OCR, two new models for optical character recognition (OCR), have been released. A user reported superior performance compared to solutions available in late 2025, with GLM-OCR offering speed and reliable structured output.

2026-02-05 LocalLLaMA

gWorld: 8B model beats 402B Llama 4 by generating web code

Trillion Labs and KAIST AI introduced gWorld, an open-weight visual world model for mobile GUIs. gWorld, available in 8B and 32B versions, generates executable web code instead of pixels, surpassing larger models like Llama 4 in accuracy. This approa...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-31 LocalLLaMA

Open-weight models: a realistic assessment

A Reddit discussion questions how open-source language models currently compare with state-of-the-art (SOTA) proprietary models. The analysis, based on practical experience rather than standard benchmarks, offers an interesting perspective for...

#LLM On-Premise #DevOps
2026-01-31 DigiTimes

Taiwan's capital markets reach new highs, signaling broader global role

Taiwan's capital markets are reaching new highs, signaling an expanding global role. This development underscores the island's growing importance in the global economy and its ability to attract international investment. The strength of Taiwanese mar...

#LLM On-Premise #DevOps
2026-01-30 TechCrunch AI

OpenClaw’s AI assistants are now building their own social network

The viral personal AI assistant formerly known as Clawdbot and briefly rebranded as Moltbot has now picked OpenClaw as its new name. The project is now evolving further, aiming to build its own social network, entirely managed by artificial intellig...

#LLM On-Premise #DevOps
2026-01-30 LocalLLaMA

GPT-OSS: Why is this open-source model still so good?

A local LLM user questions the outstanding performance of GPT-OSS 120B, an older but still competitive open-source model. Despite newer architectures and models, GPT-OSS excels in speed, effectiveness, and tool calling. The article explores the reaso...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-30 LocalLLaMA

LocalLLaMA: Stop the spam of unfinished projects

The LocalLLaMA community is calling for a crackdown on posts promoting incomplete and low-quality "Agentic" projects. The excessive presence of such content is making it difficult to find meaningful discussions and valid projects within the forum.

2026-01-30 The Register AI

NYC Silences AI Chatbot: Too Many Errors and a Budget to Fix

The AI-powered chatbot implemented by New York City to answer frequently asked questions from business owners has been deactivated. The decision was made due to the system's frequent incorrect answers and the need to address a $12 billion budget shor...

#LLM On-Premise #DevOps
2026-01-30 404 Media

Embarrassing Email for Musk: Parties on Epstein's Island?

New documents from the U.S. Department of Justice reveal emails between Elon Musk and Jeffrey Epstein, dating back to 2012-2013. Musk had denied involvement with Epstein, but the emails show discussions about visits to the private island and requests...

2026-01-30 404 Media

ELITE User Guide: Palantir's Tool for ICE

The user guide for ELITE, Palantir's tool used by Immigration and Customs Enforcement (ICE), has been revealed. The application allows mapping potential deportation targets and accessing individual dossiers, with a "confidence" score based on data fr...

2026-01-30 Ars Technica AI

AI for Developers: Effective, But with Reservations

AI coding tools are becoming increasingly effective, capable of developing entire applications from simple text prompts. Professional developers confirm the usefulness of solutions like Claude Code and Codex, but express concerns about the long-term ...

#LLM On-Premise #DevOps
2026-01-30 Phoronix

Ubuntu 26.04 Resolute Snapshot 3 Released For Testing

Resolute Snapshot 3 is now available as the newest monthly test candidate leading up to the Ubuntu 26.04 LTS release in April. This monthly release provides developers and users with a preview of the new features and improvements, allowing them to ident...

2026-01-30 The Register AI

IRS turns to AI helpers amid staff reductions

The IRS plans to automate tasks such as reviewing tax-exempt status requests and processing amended individual filings using AI. This move comes amid staff reductions within the agency.

2026-01-30 TechCrunch AI

Anthropic brings agentic plugins to Cowork

Anthropic has extended its plugin system to operate within Cowork, the newly launched agentic platform. This integration allows Cowork's agents to access and utilize the functionalities offered by Anthropic's plugins, expanding their operational capa...

2026-01-30 Tom's Hardware

AMD Zen 6: 48MB L3 Cache for 12-Core CCD?

Rumors indicate that AMD might increase the L3 cache of Zen 6 processors to 48MB to compensate for the increased core count in the CCDs. This move would maintain the cache-to-core ratio consistent with Zen 5.

#Hardware #LLM On-Premise #DevOps
2026-01-30 LocalLLaMA

Cline team got absorbed by OpenAI, Kilo responds by open-sourcing

Following the acquisition of the Cline team by OpenAI, Kilo Code, a fork of Cline, announced it will make its backend source code available. The move aims to provide an open-source alternative for developing programming tools with local models, offer...

2026-01-30 DigiTimes

Morris Chang resurfaces as Jensen Huang returns to Taipei

The return of key figures like Morris Chang and Jensen Huang to Taipei raises questions about the dynamics of the technology market. The DigiTimes article suggests possible developments in the sector, with potential implications for innovation and co...

#LLM On-Premise #DevOps
2026-01-30 404 Media

AI and Journalism: A Perspective from Kenya

A report from a conference in Kenya on the impact of artificial intelligence on journalism and the fight against disinformation. The event brought together experts from Africa, Europe, and Asia to discuss the challenges and opportunities in the field...

2026-01-30 404 Media

Silicon Valley’s Favorite New AI Agent Has Serious Security Flaws

Moltbot, a viral AI agent popular in Silicon Valley, has significant security flaws. A hacker demonstrated how to exploit a backdoor in its support system to access sensitive user data. This raises concerns about the security of AI agents that automa...

#LLM On-Premise #DevOps
2026-01-30 Tech.eu

Tech funding roundup: RobCo, 2150, acquisitions and new strategies

The past week saw intense activity in the European tech funding landscape, with over €710 million distributed across more than 70 deals. RobCo raised $100 million in a Series C round, while 2150 closed its Fund II at €210 million to support urban and...

2026-01-30 LocalLLaMA

Design Arena is now dominated by an open model

A Reddit post from the LocalLLaMA community reports that an open model now leads Design Arena. The discussion focuses on the impact of this trend and its implications for the industry.

#LLM On-Premise #DevOps
2026-01-30 LocalLLaMA

Kimi-k2.5: Gemini 2.5 Pro-like performance in long context!

A Reddit user reports that the Kimi-k2.5 model achieves performance similar to Gemini 2.5 Pro in handling large contexts. The discussion focuses on the implications of this result for open-source LLMs.

#LLM On-Premise #DevOps
2026-01-30 The Register AI

Oracle seeks to build bridges with MySQL developers

Oracle is taking steps to "repair" its relationship with the MySQL community, by moving "commercial-only" features into the Community Edition and prioritizing developer needs. A significant shift for Big Red.

2026-01-30 The Register AI

Autonomous cars, drones cheerfully obey prompt injection by road sign

AI vision systems can be very literal readers. Indirect prompt injection occurs when a bot takes input data and interprets it as a command. Academics have shown that self-driving cars and autonomous drones will follow illicit instructions written ont...

2026-01-30 Tech.eu

Einride boss predicts more European SPAC IPOs

The CEO of Swedish autonomous truck startup Einride believes more European companies will follow its lead and go public via SPAC (Special Purpose Acquisition Company). Einride will list on the New York Stock Exchange with a $1.8 billion valuation. SP...

2026-01-30 TechWire Asia

Shadow AI: Risks for Asian Enterprises and Data Sovereignty

A Reco report reveals that 91% of AI tools operate outside corporate IT control, creating risks for data sovereignty, especially in Asia, with fragmented privacy regulations. Lack of AI governance could compromise compliance and business continuity, ...

2026-01-30 Tech.eu

Mos Health secures $1.1M to expand personalised health offerings

Mos Health, a Polish-American startup developing an AI-based health platform for personalised protocols and supplements, has raised $1.1 million in a pre-seed round. The company aims to address the gap between generic health advice and actual adoptio...

2026-01-30 Tech.eu

Spotify confirms Turkish office opening after government spat

Spotify has announced the opening of an office in Istanbul by the end of June, emphasizing the strategic importance of the Turkish market. The decision follows a period of tension with the Turkish government, which had criticized the platform for con...

2026-01-30 Phoronix

Intel Releases LLM-Scaler-vLLM 1.3 With New LLM Model Support

Intel released the LLM-Scaler-vLLM 1.3 update, expanding support for a larger array of large language models (LLMs). This new release is designed to run on Intel Arc Battlemage graphics cards using a Docker-based stack for deploying vLLM.

#Hardware #LLM On-Premise #DevOps
2026-01-30 DigiTimes

ASIC server demand boosts Taiwan's high-end CCL shipments

The increasing demand for ASIC servers, driven by artificial intelligence applications, is boosting shipments of high-end CCL (Copper Clad Laminate) materials from Taiwan. This trend reflects the growing importance of specialized hardware for AI work...

#Hardware #LLM On-Premise #Fine-Tuning
2026-01-30 DigiTimes

ASML gains from rising EUV demand and US chip spending

The semiconductor equipment manufacturer ASML is benefiting from the rising demand for EUV (Extreme Ultraviolet) lithography and US investments in the chip sector. The Dutch company is a key supplier for advanced chip manufacturers.

#LLM On-Premise #DevOps
2026-01-30 TechWire Asia

Zebra Technologies: Automation Beyond Pilot Projects

Zebra Technologies highlights how automation often stalls after the pilot phase. Customers seek partners who deeply understand their real operations and can integrate hardware, software, and AI to solve specific business problems, moving beyond mere ...

#Hardware
2026-01-30 DigiTimes

Hyundai and Kia hit hardest by Korea-US tariffs

The Hyundai Motor Group, including the Hyundai and Kia brands, is among the automotive manufacturers most exposed to trade tariffs between South Korea and the United States. The news highlights the implications of international trade policies on the ...

2026-01-30 DigiTimes

MediaTek and Arm-backed Arbor drives layered growth with AI strategy

Arbor, backed by MediaTek and Arm, is implementing a growth strategy based on artificial intelligence. The company aims to capitalize on the opportunities offered by AI to expand its business in various sectors. The original Digitimes article delves ...

#LLM On-Premise #DevOps
2026-01-30 DigiTimes

Nvidia H200: China's approval intensifies US-China competition

Nvidia's H200 GPU receives approval from Beijing, marking a new chapter in the technological competition between the United States and China. This strategic move could have significant implications for the chip market and the geopolitical dynamics of...

#Hardware #LLM On-Premise #DevOps
2026-01-30 DigiTimes

Alibaba, Baidu advance IPO plans for AI chip subsidiaries

Alibaba and Baidu are reportedly advancing with initial public offering (IPO) plans for their respective AI chip subsidiaries. This move may reflect a growing emphasis on technological self-reliance in the AI sector.

#LLM On-Premise #DevOps
2026-01-30 ArXiv cs.CL

DeepSearchQA: A Benchmark for Advanced Research Agents

DeepSearchQA is a new benchmark with 900 tasks for evaluating research agents across 17 different fields. Unlike traditional benchmarks, it focuses on the ability to collate fragmented information, eliminate duplicates, and reason about stopping crit...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-30 ArXiv cs.LG

Finetune-Informed Pretraining Boosts Downstream Performance

A novel approach to multimodal pretraining, called Finetune-Informed Pretraining (FIP), optimizes representations by focusing on the most relevant data modality during fine-tuning. This method improves performance without requiring additional data or...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-30 ArXiv cs.LG

Rethinking LLM-Driven Heuristic Design: DASH for Optimization

A new framework, Dynamics-Aware Solver Heuristics (DASH), leverages Large Language Models (LLMs) to improve the efficiency and quality of solutions in combinatorial optimization problems. DASH reduces adaptation costs and improves runtime efficiency ...

#LLM On-Premise #DevOps
2026-01-30 DigiTimes

Apple acquires Israeli audio startup Q.ai

Apple has finalized the acquisition of Q.ai, an Israeli startup specializing in audio technologies. The financial details of the transaction have not been disclosed.

#LLM On-Premise #DevOps
2026-01-30 DigiTimes

Memory prices soar as AI demand tightens DRAM and NAND supply

The increasing demand for artificial intelligence applications is causing a contraction in the supply of DRAM and NAND memory, leading to price increases. The slowdown in spot buying near the year-end may temporarily mitigate the situation.

#LLM On-Premise #DevOps
2026-01-30 LocalLLaMA

Running Claude Code locally with OpenCode, llama.cpp and GLM-4.7 Flash

A Reddit user shared their experience running Claude Code locally using OpenCode, llama.cpp, and the GLM-4.7 Flash model. The setup, designed to replicate a workflow similar to Claude's, leverages CUDA and optimizations like flash attention and conte...

#Hardware #LLM On-Premise
2026-01-30 LocalLLaMA

Mini-cluster with 192GB of VRAM for local AI workloads

A user has built a local computing cluster based on four Lenovo P620 workstations, each equipped with two NVIDIA RTX 3090 GPUs, for a total of 192GB of VRAM. The configuration, interconnected via a 10Gbit network (awaiting a 100Gbit upgrade), is inte...

#Hardware #LLM On-Premise #DevOps
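The 192GB figure in the mini-cluster item above can be checked with simple arithmetic. The sketch below assumes the standard 24 GB of VRAM per RTX 3090; the variable names are illustrative and not taken from the original post.

```python
# Sanity check of the cluster's aggregate VRAM.
# Assumption: each RTX 3090 carries its standard 24 GB of VRAM.
WORKSTATIONS = 4            # Lenovo P620 machines, per the summary
GPUS_PER_WORKSTATION = 2    # RTX 3090 cards in each machine, per the summary
VRAM_PER_GPU_GB = 24        # assumed stock RTX 3090 memory size

total_gpus = WORKSTATIONS * GPUS_PER_WORKSTATION
total_vram_gb = total_gpus * VRAM_PER_GPU_GB

print(f"{total_gpus} GPUs, {total_vram_gb} GB of VRAM in total")
```

Note that this is pooled capacity across four machines; a model split across nodes still has to move activations over the network link between them.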
2026-01-30 DigiTimes

Taipower's Record Profit Masks Rising Pressure from AI Data Centers

Taipower's record profits mask the increasing difficulties in meeting the energy demand of AI data centers in Taiwan. The island, a crucial technology hub, faces significant challenges in sustaining the growth of the AI sector.

#LLM On-Premise #DevOps
2026-01-30 DigiTimes

JPP brings AI automation to the factory floor with Techman Robot

JPP integrates robotics solutions from Techman Robot to automate production processes, bringing artificial intelligence directly to factory environments. This collaboration aims to improve efficiency and reduce operating costs through advanced automa...

#LLM On-Premise #DevOps
2026-01-30 DigiTimes

Huawei scales cloud ecosystem in Asia-Pacific; Volcengine surges in AI cloud

Huawei is expanding its cloud ecosystem in the Asia-Pacific region, while Volcengine is experiencing strong growth in the AI cloud sector. This expansion underscores the increasing demand for cloud resources to support AI applications in the region.

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-29 TechCrunch AI

Apple and AI: Analyst Questions Tim Cook on Monetization Strategy

A Morgan Stanley analyst questioned Tim Cook on how Apple plans to monetize its AI investments. The answer, reportedly, did not surprise industry observers. The article analyzes the implications of this question for Apple's future AI strategy.

#LLM On-Premise #DevOps
2026-01-29 ServeTheHome

Dell Pro Max with GB10: Achieving ROI within 12 Months

An analysis of using the Dell Pro Max workstation equipped with a GB10 GPU to solve complex reporting tasks. The original article reports a practical experience that led to a return on investment (ROI) within a 12-month period, focusing on real-world...

#Hardware #LLM On-Premise #DevOps
2026-01-29 OpenAI Blog

Taisei Corporation shapes the next generation of talent with ChatGPT

Taisei Corporation implements ChatGPT Enterprise to support HR-led talent development and scale generative AI across its global construction business. The initiative aims to enhance employee skills and optimize business processes through the adoption...

2026-01-29 TechCrunch AI

SpaceX, Tesla, and xAI reportedly in talks to merge

Elon Musk is reportedly considering merging SpaceX, Tesla, and xAI into a single entity. The deal would integrate the Grok chatbot, Starlink satellites, and SpaceX rockets under one corporation.

#LLM On-Premise #DevOps
2026-01-29 Ars Technica AI

How often do AI chatbots lead users down a harmful path?

A recent study by Anthropic analyzed 1.5 million anonymized conversations with the Claude model, quantifying how often AI chatbots can lead users to take harmful actions or develop dangerous beliefs. The results indicate that, although such patterns ...

#LLM On-Premise #DevOps
2026-01-29 IEEE Spectrum

Benchmarks for AI Agents: Are They Ready for Autonomous Business Operations?

Researchers at Carnegie Mellon and Fujitsu have developed benchmarks to assess the safety and effectiveness of AI agents in business contexts. The tests, focused on logistics, manufacturing, and knowledge management, reveal significant limitations of...

#LLM On-Premise #DevOps #RAG
2026-01-29 LocalLLaMA

LingBot-World: Open Source Dynamic Simulation Outperforms Genie 3

The LingBot-World framework offers a high-capability world model that is fully open source, contrasting with proprietary systems like Genie 3. It surpasses Genie 3 in handling complex physics and scene transitions, maintaining 16 frames per second an...

2026-01-29 OpenAI Blog

ChatGPT: OpenAI to retire GPT-4o and related models in 2026

OpenAI has announced that on February 13, 2026, it will retire the GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini models from ChatGPT. The decision does not currently impact the APIs. This announcement follows the previous communication regarding ...

2026-01-29 LocalLLaMA

Distilled models: why aren't there more?

"Distilled" models like Qwen 8B DeepSeek R1 have demonstrated reasoning capabilities beyond what their size would suggest. The article asks why there aren't more models of this kind, capable of running on resource-limited hardware.

#Hardware #LLM On-Premise #DevOps
2026-01-29 TechCrunch AI

Microsoft defends Copilot adoption: 'High usage'

Microsoft CEO Satya Nadella addressed rumors of low usage for its Copilot AI, emphasizing the importance of data center investments to support the platform. The company aims to demonstrate the value of its approach in the artificial intelligence mark...

#LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

Mistral CEO Arthur Mensch: AI access like electricity

Mistral CEO Arthur Mensch compares AI access to electricity access, emphasizing the importance of uninterrupted and unthrottled access to this crucial resource. The statement highlights Mistral's vision of AI as a fundamental infrastructure.

#LLM On-Premise #DevOps
2026-01-29 MIT Technology Review

DHS is using Google and Adobe AI to make videos

The US Department of Homeland Security (DHS) is using AI video generators from Google (Veo 3) and Adobe (Firefly) to create and edit content shared with the public. The revelation comes from a document that inventories the commercial AI tools used by...

#LLM On-Premise #DevOps
2026-01-29 Wired AI

AI-Generated Anti-ICE Videos: Catharsis or Misinformation?

AI-generated videos depicting people of color confronting Immigration and Customs Enforcement (ICE) agents are circulating on platforms like Instagram and Facebook. These videos raise questions about their impact: are they a form of catharsis or do t...

2026-01-29 Wired AI

Logical Intelligence Challenges Big Tech with a New Approach to AGI

While major companies pour billions into large language models, San Francisco-based startup Logical Intelligence is taking a different approach to achieving AGI, aiming to emulate the human brain. The company seeks to develop artificial intelligence ...

#LLM On-Premise #DevOps
2026-01-29 TechCrunch AI

Apple buys Israeli startup Q.AI for $2 billion

Apple announced its acquisition of Q.AI, an Israeli startup specializing in artificial intelligence, for approximately $2 billion. The deal is Apple's second-largest acquisition to date, signaling a strong interest in enhancing its AI capabilities.

2026-01-29 The Register AI

Dow Chemical says AI is the element behind 4,500 job cuts

Dow Chemical, a 129-year-old chemical company, plans to cut 4,500 jobs, about 12.5 percent of its workforce, due to AI-driven automation. The company uses AI software from C3, a Palantir rival.

#LLM On-Premise #DevOps
2026-01-29 Ars Technica AI

OpenAI Prism: New AI tool sparks fears of "AI slop" in science

OpenAI has released Prism, a free AI-powered workspace for scientists. This tool, integrated with GPT-5.2, aims to facilitate the writing of scientific papers and collaboration. However, some researchers fear that Prism could contribute to an increas...

#LLM On-Premise #DevOps
2026-01-29 The Register AI

AI datacenter boom triples US gas power builds, widening carbon footprint

The growth of data centers dedicated to artificial intelligence is fueling a renewed interest in gas-fired power plants in the United States. This trend risks compromising efforts to transition to renewable sources and reduce carbon emissions, raisin...

#LLM On-Premise #DevOps
2026-01-29 TechCrunch AI

Google unveils Project Genie for AI world generation

Google has announced Project Genie, a new tool for generating virtual worlds powered by advanced AI models like Genie 3, Nano Banana Pro, and Gemini. Initially available to AI Ultra subscribers in the U.S., it offers new creative possibilities.

#LLM On-Premise #DevOps
2026-01-29 Phoronix

Libcamera 0.7 Released: GPU Acceleration for SoftISP Boosts Performance

Libcamera 0.7, a software library for image signal processors (ISPs) and embedded cameras on Linux, has been released. The key update is initial support for GPU acceleration within the software ISP (SoftISP), aiming for improved performance compared t...

#Hardware #LLM On-Premise #DevOps
2026-01-29 TechCrunch AI

OpenAI’s Sora app is struggling after its stellar launch

OpenAI's Sora mobile app is facing a decline in interest after its initial launch. Downloads decreased by 45% in January, with a consequent reduction in user spending. This raises questions about the sustainability of the initial enthusiasm.

2026-01-29 TechCrunch AI

Music publishers sue Anthropic for $3B over copyright infringement

A group of music publishers has filed a lawsuit against Anthropic, accusing it of massive copyright infringement. The lawsuit concerns the unauthorized use of approximately 20,000 copyrighted musical works, with a claim for damages amounting to $3 bi...

2026-01-29 The Register AI

Lennart Poettering Quits Microsoft to Focus on Secure Linux

Lennart Poettering, a prominent figure in the Linux world, has left Microsoft to co-found Amutable. The goal is to develop a Linux operating system with cryptographically verifiable integrity, aiming for greater security and reliability.

#LLM On-Premise #DevOps
2026-01-29 The Register AI

IBM says AI is insane in the mainframe as z17 sales surge

IBM is integrating artificial intelligence capabilities into its z17 mainframes, aiming to modernize existing COBOL applications and reduce operating costs. The company envisions a future where AI fills the skills gap left by earlier generations of C...

#LLM On-Premise #DevOps
2026-01-29 TechCrunch AI

New AI Lab: Flapping Airplanes Focuses on Research

A new artificial intelligence lab called Flapping Airplanes has been launched. A partner at Sequoia Capital shared their perspective on what makes this lab unique, emphasizing the importance of a research-driven approach.

#LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

Qwen3-ASR: Open-Source Models for Multilingual Speech Recognition

The Qwen3-ASR family includes 1.7B and 0.6B parameter models, capable of identifying the language and transcribing audio in 52 languages and dialects. The larger model achieves performance comparable to proprietary commercial APIs, offering a valid o...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-29 The Register AI

Oracle: Job Cuts and Cerner Sale to Fund AI Expansion?

According to a financial analyst, Oracle may cut up to 30,000 jobs and sell its Cerner health tech unit to finance the expensive data centers required for its AI expansion. The news comes amid changing sentiment on Oracle's investment plans.

#LLM On-Premise #DevOps
2026-01-29 The Register AI

Meta to invest $135 billion in AI infrastructure by 2026

Meta plans to nearly double its capital investments in AI infrastructure, exceeding the GDP of some nations. The company aims for 'personal superintelligence,' fueling the growing demand for AI data centers.

#LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

Mini-LLM: an 80M parameter LLM based on Llama 3 architecture

An engineer has developed Mini-LLM, an 80 million parameter transformer language model from scratch, based on the Llama 3 architecture. The project includes tokenization, memory-mapped data loading, mixed precision training, and inference with KV cac...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-29 The Next Web

AI and work: towards a redefinition of roles, not a replacement

The adoption of artificial intelligence is polarizing workers: some see it as a tool for empowerment, others as a threat. AI does not eliminate roles, but transforms low-value-added activities. IBM has stated that 7,000 back-office positions may no l...

#LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

AI Agent Framework Proliferation on GitHub: Bubble Incoming?

A Reddit post regarding GitHub trends highlights a rapid growth of AI agent frameworks. The discussion raises concerns about the long-term sustainability of many of these projects, comparing the situation to the excessive fragmentation seen in JavaSc...

#LLM On-Premise #DevOps
2026-01-29 Tom's Hardware

ASML projects $71 billion in revenue by 2030 due to AI boom

ASML projects a significant revenue increase, reaching $71 billion by 2030. The demand for EUV (Extreme Ultraviolet) systems is surging, primarily driven by the expansion of the artificial intelligence market. Sales to China are lagging behind other ...

#Fine-Tuning
2026-01-29 The Next Web

TNW Moves Its Flagship Conference to London

The Next Web (TNW) is relocating its flagship conference to London, placing the main annual event at the heart of one of the world's most powerful technology and investment ecosystems. TNW is also introducing a new invite-only global event concept: T...

2026-01-29 The Register AI

Vivaldi release surfs a wave of anti-AI sentiment

The latest version of the Vivaldi browser stands out for its clear stance against the pervasive integration of artificial intelligence, in response to widespread negative sentiment among users toward the addition of AI features in web brows...

2026-01-29 LocalLLaMA

OpenMOSS unveils MOVA: Open-Source model for video and audio

OpenMOSS has released MOVA (MOSS-Video-and-Audio), a fully open-source model with 18 billion active parameters (MoE architecture, 32 billion total). MOVA offers day-0 support for SGLang-Diffusion and aims at scalable and synchronized video and audio ...

2026-01-29 Tom's Hardware

Nvidia H200: China yet to approve imports, orders on hold

Jensen Huang confirmed that Beijing has not yet approved the import of H200 GPUs. As a result, Nvidia has not received new orders from Chinese companies. This situation raises questions about the supply chain and deployment strategies for AI solution...

#Hardware #LLM On-Premise #DevOps
2026-01-29 Tom's Hardware

Los Angeles aims to ban single-use printer cartridges

The city of Los Angeles aims to reduce waste by banning printer cartridges that are not recyclable or do not have a take-back program from the manufacturer. The new ordinance is awaiting final approval from the City Council.

2026-01-29 Phoronix

NVIDIA VA-API Driver 0.0.15 Released With A Few Fixes

NVIDIA-VAAPI-Driver 0.0.15 has been released. This VA-API driver, built atop NVIDIA's NVDEC interface, enables video acceleration on NVIDIA GPUs in the Firefox web browser on Linux, since Firefox supports VA-API but not NVDEC directly.

#Hardware #LLM On-Premise #DevOps
2026-01-29 Tech.eu

TetraxAI raises pre-seed funding for AI risk tools in clean energy

TetraxAI, an AI-powered B2B SaaS platform focused on due diligence and risk management for clean energy infrastructure, has completed a €1.2 million pre-seed funding round. The funding will be used to expand machine learning and engineering teams, br...

2026-01-29 DigiTimes

Micron aims for leadership in the AI memory era

During the Nvidia CEO's visit to Taiwan, the Taiwanese President positioned Micron as a leader in memory solutions for artificial intelligence applications. The initiative underscores the strategic importance of high-performance memory manufacturing to s...

#Hardware #LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

Voicebox: Open-Source, Local-First Voice Cloning Studio

Voicebox is a new open-source project enabling local voice cloning using Qwen3-TTS and Whisper. The desktop application, built with Tauri/Rust/Python, offers multi-track editing, audio recording and transcription features, along with a REST API for i...

#LLM On-Premise #DevOps
2026-01-29 TechWire Asia

Asian startups enter Europe: the new tech playbook

Asian startups are adopting an innovative approach to expanding into Europe, leveraging cloud infrastructure, remote teams, and virtual offices. This strategy allows them to establish an operational presence without the high costs of a physical locat...

#LLM On-Premise #DevOps
2026-01-29 TechWire Asia

Vertiv introduces prefabricated AI data centre infrastructure

Vertiv launches SmartRun, a prefabricated system for AI data centers integrating power, liquid cooling, and networking. The goal is to accelerate construction times and reduce complexity, responding to the growing demand for computing power for artif...

#LLM On-Premise #DevOps
2026-01-29 DigiTimes

Samsung reclaims memory sales crown as SK Hynix extends profit lead

Samsung has surpassed SK Hynix in memory sales, reclaiming its position as market leader. Despite this, SK Hynix continues to maintain a lead in profits. Competition in the memory sector remains intense, with significant implications for hardware man...

#Hardware #LLM On-Premise #Fine-Tuning
2026-01-29 LocalLLaMA

LLM Generates Procedural Spells for VR Prototype

A developer has created a system where an LLM generates procedural spells for a virtual reality prototype. The system uses a pool of spell components and converts words into instructions to create unique effects. The soundtrack was made with Suno.

2026-01-29 DigiTimes

Taiwan, US expand AI, drone cooperation under Pax Silica framework

Taiwan and the United States are expanding collaboration in the field of artificial intelligence and drones through the Pax Silica initiative. This partnership aims to strengthen the technological capabilities of both countries in strategic sectors.

#LLM On-Premise #DevOps
2026-01-29 DigiTimes

Nvidia's Huang clarifies: Taiwan chip capacity is new, not moved from US

Nvidia's Jensen Huang clarified that the 40% of chip production capacity in Taiwan represents an increase in overall capacity, and not a shift of production resources from the United States. The clarification comes amid strong demand for GPUs for art...

#Hardware #LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

Devstral 2: Hybrid Logical Reasoning Enhanced with Jinja

A user discovered that Devstral 2 123B and 24B models can be forced into more consistent logical reasoning through the use of Jinja templates. Adding a specific Jinja statement appears to significantly enhance the reasoning capabilities of the models...

#Hardware #LLM On-Premise #DevOps
2026-01-29 DigiTimes

DeepSeek: China's approach to AI without cutting-edge chips

An in-depth analysis reveals how DeepSeek is building AI capabilities in China, addressing challenges posed by the limited availability of cutting-edge chips. The article explores the strategies adopted to overcome these restrictions.

#Hardware #LLM On-Premise #DevOps
2026-01-29 DigiTimes

Analysis: ASML earnings prove AI demand is hitting the factory floor

Financial results from ASML, a leader in lithography equipment manufacturing, indicate strong demand growth related to artificial intelligence. The company forecasts increased earnings, a sign that the production of advanced chips for AI applications is...

#Hardware #LLM On-Premise #DevOps
2026-01-29 DigiTimes

Alibaba's new AI chip challenges Nvidia from A800 to A100-class performance

Alibaba's T-Head Semiconductor has developed a new AI chip aiming to compete with the performance of Nvidia's A800 and A100 GPUs. This move could intensify competition in the AI hardware market, potentially offering new options for inference and trai...

#Hardware #LLM On-Premise #DevOps
2026-01-29 DigiTimes

Samsung: profit jumps on AI memory demand, pressures phones, displays

Samsung reports a profit increase, driven by strong demand for high-performance memory for AI applications. Growth in the memory sector offsets difficulties encountered in the smartphone and display markets, also influenced by the evolution of AI.

#LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

Prismer: Open-Source Multi-Agent Environment for Research

Prismer, an open-source environment designed to streamline academic workflows, has been released. The goal is to provide a customizable and privacy-conscious alternative to proprietary solutions, reducing LLM hallucinations through citation verificat...

#LLM On-Premise #DevOps
2026-01-29 ArXiv cs.CL

LLM and Korean Language: Can Human Training Outperform Automation?

A new study shows that, with proper training, human experts can outperform automated systems in identifying Korean texts generated by LLMs. The approach relies on a detailed rubric that analyzes the peculiarities of the language.

#LLM On-Premise #DevOps
2026-01-29 ArXiv cs.LG

Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data

A new study introduces Gap-K%, a novel technique for identifying data used in the pre-training of large language models (LLMs). The method analyzes discrepancies between the model's top-1 prediction and the target token, leveraging the optimization d...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-29 ArXiv cs.AI

NeuroAI: Convergence of Neuroscience and Artificial Intelligence

A 2025 workshop explores synergies between neuroscience and artificial intelligence, identifying promising areas such as embodiment, language, robotics, learning, and neuromorphic engineering. The goal is to develop NeuroAI to improve algorithms and ...

#Hardware
2026-01-29 DigiTimes

AI boom threatens global chip supply, automakers warned

The exponential growth of artificial intelligence raises concerns about the global availability of chips. The automotive sector could be particularly vulnerable due to the high demand for semiconductors for AI.

#LLM On-Premise #DevOps
2026-01-29 DigiTimes

Meta: AI, subscriptions and commerce for future monetization

Meta is exploring new AI monetization strategies, moving beyond advertising. The company is focusing on subscriptions and commerce initiatives to diversify its revenue streams, leveraging the potential offered by new AI models.

#LLM On-Premise #DevOps
2026-01-29 DigiTimes

Analysis: how SK Hynix is binding customers to its AI memory

According to AFP, SK Hynix is strengthening its relationships with customers in the artificial intelligence sector through specialized memory solutions. This strategy aims to ensure greater customer loyalty and position the company as a key supplier ...

#LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

Assistant_Pepe_8B: A Hilarious and Helpful LLM with 1M Context Window

Assistant_Pepe_8B, an 8 billion parameter LLM, has been released, designed to combine top-tier shitposting capabilities with actual helpfulness. The model boasts a 1 million token context window and aims to provide useful and irreverent responses, wh...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-29 DigiTimes

Containerized data centers and liquid cooling development

According to DIGITIMES, developments in containerized data centers and liquid cooling solutions are intensifying. These technologies are crucial for managing the increasing power density and energy efficiency requirements of modern workloads, includi...

#Hardware #LLM On-Premise #Fine-Tuning
2026-01-29 DigiTimes

Microsoft tops US$50b cloud milestone as AI drives growth

Microsoft has surpassed US$50 billion in cloud revenue, with significant growth attributed to investments in artificial intelligence and the adoption of Copilot. The increase in capital expenditure reflects the expansion of AI infrastructure.

#LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

768GB "Mobile" AI Server: a deep dive into a local system

A user has built a high-performance AI server using consumer-grade components, achieving 768GB of memory between RAM and VRAM. The configuration, based on a Threadripper Pro and multiple GPUs, demonstrates how a relatively contained budget can compet...

#Hardware #LLM On-Premise #DevOps
2026-01-29 LocalLLaMA

LM Studio 0.4.0: Updates and Parallelism

Version 0.4.0 of LM Studio has been released. Updates include UI changes, with runtime settings now accessible via developer options. Parallelism tests did not show significant changes in performance.

#LLM On-Premise #DevOps
2026-01-29 Phoronix

GNU gettext Reaches Version 1.0 After 30+ Years In Development

GNU gettext, the widely-used internationalization and localization system, has reached version 1.0 after over 30 years of development. Originating at Sun Microsystems in the early 1990s and later developed by the GNU project from 1995, gettext is fun...

#LLM On-Premise #DevOps
2026-01-29 DigiTimes

Meta targets 2026 for massive AI infra push

Following solid earnings, Meta is planning a significant investment in AI infrastructure by 2026. The company aims to strengthen its computing capabilities to support ambitious future projects in the field of AI.

#Hardware #LLM On-Premise #DevOps
2026-01-28 The Register AI

ServiceNow bets on 80 billion workflows for AI

ServiceNow claims its AI agents are more effective due to 20 years of experience and 80 billion workflows. The company emphasizes that the underlying model is only a part of the final product.

#LLM On-Premise #DevOps
2026-01-28 TechCrunch AI

Tesla invested $2B in Elon Musk’s xAI

Elon Musk's AI company xAI disclosed earlier this month it had raised $20 billion. Tesla is among the investors, with a $2 billion investment. The capital injection will support the development of new AI technologies and models.

2026-01-28 LocalLLaMA

LongCat-Flash-Lite: LLM optimized for fast inference

Meituan-Longcat has released LongCat-Flash-Lite, a large language model (LLM) focused on efficient inference. The model is available on Hugging Face and discussed on Reddit, suggesting interest in local inference deployments.

#Hardware #LLM On-Premise #Fine-Tuning
2026-01-28 TechCrunch AI

Elon Musk teases a new image-labeling system for X

Elon Musk says X will begin identifying "manipulated media" but doesn't share details. The specifics of how this labeling system will work are still unknown. This initiative raises questions about the technical implementation and its effectiveness in...

2026-01-28 Phoronix

Wasmer 7.0 Released: WebAssembly Expands from Desktop to Edge

Wasmer 7.0, the WebAssembly (WASM) runtime that enables lightweight containers to run anywhere from desktop to cloud and edge, is now available. This security-minded and extensible WASM runtime release introduces new features and enhanc...

#DevOps
2026-01-28 TechCrunch AI

ServiceNow adopts a multi-model approach with Anthropic and OpenAI

ServiceNow has partnered with Anthropic, just a week after announcing a similar collaboration with OpenAI. This strategic move indicates a multi-model approach to integrating artificial intelligence into its solutions.

#LLM On-Premise #DevOps
2026-01-28 Phoronix

GNOME 50 Finally Lands Improved Discrete GPU Detection

The upcoming release of GNOME 50, expected in distributions like Ubuntu 26.04 LTS and Fedora Workstation 44, will feature improved discrete GPU detection within the GNOME Shell. This effort, which has been two years in the making, has finally been me...

#Hardware #LLM On-Premise #DevOps
2026-01-28 Wired AI

Moltbot Is Taking Over Silicon Valley: Privacy at Risk?

The AI assistant Moltbot, formerly known as Clawdbot, is rapidly gaining popularity in Silicon Valley. Despite widely raised privacy concerns, users are increasingly relying on the tool to manage various aspects of their lives.

#LLM On-Premise #DevOps
2026-01-28 TechCrunch AI

WhatsApp to Charge for AI Chatbots in Italy

WhatsApp is introducing a pricing model for developers of AI-powered chatbots operating on its platform in Italy. The cost will be calculated based on the number of messages sent.

2026-01-28 LocalLLaMA

BitMamba-2: 1.58-bit Mamba-2 model trained on CPU

BitMamba-2, a hybrid model combining Mamba-2 SSM with BitNet 1.58-bit quantization, has been released. Trained from scratch on 150 billion tokens, the 1B parameter model achieves around 53 tokens/sec on an Intel Core i3-12100F CPU, paving the way for...
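
A note on the "1.58-bit" label: a weight restricted to the three values -1, 0, and +1 carries log2(3) ≈ 1.585 bits of information. The sketch below illustrates absmean ternary rounding as described for BitNet b1.58. Whether BitMamba-2 uses exactly this scheme is an assumption, and the function name and sample weights are purely illustrative.

```python
import math

# Information content of one ternary weight: log2(3) is roughly 1.585 bits.
BITS_PER_TERNARY_WEIGHT = math.log2(3)


def absmean_ternary(weights, eps=1e-8):
    """Map floats to {-1, 0, +1} using one shared scale (mean absolute value).

    Illustrative only: real implementations work on tensors, not Python lists.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


# Hypothetical weights, chosen only to show all three output values.
sample_weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]
ternary, scale = absmean_ternary(sample_weights)
print(ternary, round(scale, 4), round(BITS_PER_TERNARY_WEIGHT, 3))
```

Small weights collapse to 0 and large ones saturate at ±1, which is why such models can replace most multiplications with additions and run on modest CPUs.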

#Hardware
2026-01-28 OpenAI Blog

OpenAI Enhances Data Security in AI Agents

OpenAI implements new safeguards for data handling when AI agents access external links. Built-in security measures aim to prevent data exfiltration via URLs and prompt injection attacks, ensuring a safer environment for users.

#LLM On-Premise #DevOps
2026-01-28 Phoronix

Mesa 26.0-rc2 Released With Numerous AMD, NVIDIA & Intel Driver Fixes

Mesa 26.0-rc2, the second release candidate, is now available with an initial batch of bug fixes for the open-source OpenGL and Vulkan drivers for AMD, NVIDIA, and Intel hardware. The Mesa 26.0 quarterly release introduces new features and improvements.

#Hardware #LLM On-Premise #DevOps
2026-01-28 Wired AI

Chrome Introduces 'Auto Browse' Agent with Generative AI

Google integrates generative AI into the Chrome browser with the new 'Auto Browse' feature. The agent automates web browsing, placing the user in a position of passive supervision. This is a further push towards integrating AI into everyday software.

#LLM On-Premise #DevOps
2026-01-28 Ars Technica AI

Google begins rolling out Chrome's "Auto Browse" AI agent

Google is expanding Gemini's capabilities in the Chrome browser with the introduction of "Auto Browse", an autonomous agent capable of automating repetitive tasks. The integration includes easier access to Gemini via a side panel and connection to ot...

2026-01-28 TechCrunch AI

Modelence raises $13 million to smooth out the AI stack

Modelence has raised $13 million to develop tools that simplify the software stack for artificial intelligence. The company aims to address the complexities of building AI-based applications, offering innovative solutions for developers.

#LLM On-Premise #DevOps
2026-01-28 Tech.eu

Voyager Ventures closes $275M Fund II

Voyager Ventures has closed its $275 million Fund II, bringing its total assets under management to $475 million. The fund will invest in technologies for energy, materials production, artificial intelligence, and advanced manufacturing, focusing on ...

2026-01-28 Ars Technica AI

China Approves Import of Nvidia H200 AI Chips

China has approved the import of Nvidia's H200 chips for ByteDance, Alibaba, and Tencent after weeks of uncertainty. The approval follows a temporary hold on shipments, despite export clearance from the United States.

#Hardware #LLM On-Premise #DevOps
2026-01-28 LocalLLaMA

Kimi K2.5: Running the 1T Parameter Hybrid Model Locally

The Kimi K2.5 model, boasting state-of-the-art performance in vision, coding, agentic, and chat tasks, can be run locally. The quantized Unsloth Dynamic 1.8-bit version reduces the required disk space by 60%, from 600GB to 240GB.
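
The figures in this summary can be sanity-checked with simple arithmetic. The sketch below recomputes the stated 60% saving and gives a rough size estimate for a 1T-parameter model at a uniform 1.8 bits per weight. The uniform-width estimate is an assumption made for illustration; a "dynamic" quantization typically keeps some tensors at higher precision, so the published size can be larger than this estimate.

```python
def reduction_percent(before_gb, after_gb):
    """Percentage of disk space saved when going from before_gb to after_gb."""
    return (before_gb - after_gb) / before_gb * 100


def approx_size_gb(num_params, bits_per_weight):
    """Rough model size in GB if every weight used the same bit width."""
    return num_params * bits_per_weight / 8 / 1e9


saving = reduction_percent(600, 240)          # figures quoted in the summary
uniform_estimate = approx_size_gb(1e12, 1.8)  # 1T parameters, per the headline

print(f"saving: {saving:.0f}%")
print(f"uniform 1.8-bit estimate: {uniform_estimate:.0f} GB")
```

The estimate lands slightly below the quoted 240GB, which is consistent with a mixed-precision layout rather than a flat 1.8 bits everywhere.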

#Hardware #LLM On-Premise #DevOps
2026-01-28 The Register AI

AI agent hype cools as enterprises struggle to get into production

The implementation of AI agents is slowing down. According to Redis CEO Rowan Trollope, only the largest businesses are successfully navigating the integration challenges and bringing these systems into production. Many organizations are reassessing ...

#LLM On-Premise #DevOps
2026-01-28 Tech.eu

finanzen.net Group snaps up AI investing startup Vickii

The finanzen.net Group, parent company of neo-broker finanzen.net ZERO, has acquired Vickii, a German startup specializing in artificial intelligence for investments. The acquisition aims to integrate Vickii's technology into the existing platform, i...

2026-01-28 LangChain Blog

Context Management for DeepAgents

LangChain's Deep Agents SDK addresses the challenges of context management in complex AI agents. Using compression techniques such as filesystem offloading and summarization, Deep Agents aims to reduce the volume of information in the agent's working...
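
To make the two techniques named here concrete, the following is a minimal sketch of filesystem offloading and history summarization. It is not the Deep Agents SDK API: `offload_if_large`, `compact_history`, and the `MAX_INLINE_CHARS` threshold are hypothetical names invented for illustration, and the "summary" is a placeholder string where a real agent would presumably call a model.

```python
import tempfile
from pathlib import Path

MAX_INLINE_CHARS = 200  # hypothetical threshold for keeping content in context


def offload_if_large(content: str, workdir: Path, name: str) -> str:
    """Keep small content inline; write large content to disk and return a pointer."""
    if len(content) <= MAX_INLINE_CHARS:
        return content
    path = workdir / f"{name}.txt"
    path.write_text(content, encoding="utf-8")
    return f"[offloaded {len(content)} chars to {path.name}; preview: {content[:80]}...]"


def compact_history(messages: list[str], keep_last: int = 2) -> list[str]:
    """Replace older messages with a one-line stand-in; keep recent ones verbatim."""
    split = max(0, len(messages) - keep_last)
    older, recent = messages[:split], messages[split:]
    if not older:
        return list(recent)
    # Placeholder summary: a real system would summarize with an LLM call here.
    total_chars = sum(len(m) for m in older)
    return [f"[summary of {len(older)} earlier messages, {total_chars} chars]"] + recent


workdir = Path(tempfile.mkdtemp())
big_result = "row,value\n" * 100  # stands in for a large tool output
stub = offload_if_large(big_result, workdir, "tool_result_1")
history = compact_history(["plan", "search", stub, "answer draft"])
print(len(big_result), len(stub), len(history))
```

The design point is that the full content stays recoverable on disk while the working context only carries a short pointer.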

2026-01-28 404 Media

Hackers Claim Data Breach at Match Group, Owner of Hinge and OkCupid

Match Group, the online dating giant including platforms like Hinge and OkCupid, has suffered a data breach. Hackers claim to have stolen 1.7GB of compressed data, including unique advertising IDs and internal company documents. Match Group is invest...

#LLM On-Premise #DevOps
2026-01-28 LocalLLaMA

AMA With Kimi: The Open-source Lab Behind K2.5 Model

The Kimi team, the open-source research lab behind the K2.5 model, participated in an AMA (Ask Me Anything) session on Reddit to answer questions from the LocalLLaMA community. The session focused on various aspects of the model and its architecture.

2026-01-28 LocalLLaMA

Anthropic CEO calls for AI regulation: Time to back up those models?

Anthropic CEO Dario Amodei expresses concern about the threats posed by artificial intelligence and urges regulation of the sector. This alarm prompts consideration of the importance of backup and protection strategies for AI models, especially in li...

#LLM On-Premise #DevOps
2026-01-28 MIT Technology Review

AI Memory and Privacy: The Next Frontier for Chatbots

AI chatbots' ability to remember preferences is becoming a key selling point. However, this personalization introduces new privacy vulnerabilities. Developers must implement granular controls over data usage and ensure transparency for users, allowin...

2026-01-28 AI News

Salesforce: Scaling enterprise AI Requires End-to-End Data Governance

Salesforce's Franny Hsiao highlights how many AI pilot projects fail to scale to production due to inadequate data governance. Companies must integrate observability and guardrails from the outset of the AI lifecycle, managing latency through 'percei...

#Fine-Tuning
2026-01-28 The Register AI

SK Hynix invests $10B in new 'AI Co.'

Flush with cash, SK Hynix is establishing a new division focused on AI solutions. The Korean company aims to capitalize on the current AI hype, although operational details of the new entity are still scarce.

#LLM On-Premise #DevOps
2026-01-28 TechCrunch AI

Anthropic, OpenAI CEOs condemn ICE enforcement tactics

Anthropic's Dario Amodei and OpenAI's Sam Altman spoke out against ICE enforcement tactics following Minneapolis violence, with one addressing concerns publicly and the other in an internal message.

#LLM On-Premise #DevOps
2026-01-28 MIT Technology Review

LLM Security: Rules succeed at the boundary, fail at the prompt

Prompt injection attacks and the malicious use of AI agents require a paradigm shift in security. Defenses based on semantic rules are fragile. Solid governance, access control, continuous monitoring, and policies enforced at architectural boundaries...

#LLM On-Premise #DevOps
2026-01-28 Tom's Hardware

Apple and Nvidia considering Intel for 2028 chip production

Apple and Nvidia are reportedly considering using Intel to produce some of their chips in the U.S. The decision is said to be motivated by geopolitical issues and tariffs, but it remains to be seen which products Intel will actually be able to manufa...

#Hardware #LLM On-Premise #DevOps
2026-01-28 The Register AI

Old Windows quirks help punch through new admin defenses

A Google researcher discovered a bypass for Windows User Account Control (UAC). The vulnerability was exploited due to delayed patches from Microsoft, highlighting risks in administrator privilege management.

#LLM On-Premise #DevOps
2026-01-28 LocalLLaMA

OpenAI Slows Hiring Pace Amidst Financial Pressure

Sam Altman admitted that OpenAI is 'dramatically slowing down' hiring as the company faces increasing financial pressure. An internal memo signals the need for urgent fixes to ChatGPT, while analysts warn of a potential cash crunch. The company is ex...

#LLM On-Premise #DevOps
2026-01-28 The Register AI

UK bets big on AI policing, expands facial recognition van fleet

The UK government is investing heavily in AI for law enforcement, allocating millions of pounds for live facial recognition (LFR), a new Police.AI unit, and a bespoke legal framework. The aim is to reform law enforcement through the intensive use of ...

#LLM On-Premise #DevOps
2026-01-28 Tech.eu

Funnel Secures $80M Debt Facility for AI-Driven Marketing

Funnel, a Stockholm-based marketing intelligence platform, has secured an $80 million debt facility from HSBC Innovation Banking and Hercules Capital. The funding will support the development of advanced AI-driven features and international expansion...

#LLM On-Premise #DevOps
2026-01-28 AI News

White House compares industrial revolution with AI era

A White House paper draws parallels between the industrial revolution and the current era of artificial intelligence, positioning the latter as a driving force for economic growth. AI is at the center of US economic strategy, with infrastructure inve...

#Hardware
2026-01-28 Tom's Hardware

Starlink reduces orbit to avoid collisions with Chinese satellites

Chinese researchers claim Starlink lowered the orbit of a significant portion of its satellite constellation following a near-miss incident with a Chinese satellite launch in December 2025. Over 4,000 satellites were reportedly pulled to a 300-mile o...

2026-01-28 LocalLLaMA

Kimi K2.5: a promising open-source model for coding

According to a Reddit post, Kimi K2.5 stands out as a particularly effective open-source model for programming tasks. The online discussion suggests that the model offers remarkable results in this specific area.

#LLM On-Premise #DevOps
2026-01-28 AI News

AI Adoption in US Workplaces: Still Fragmented and Role-Dependent

A Gallup survey reveals that the adoption of artificial intelligence in US workplaces is growing, but remains uneven. Usage is concentrated in the technology, finance, and professional services sectors, with lower adoption in customer-facing or manua...

#LLM On-Premise #DevOps
2026-01-28 LocalLLaMA

LLM API Pricing Freefall: Does On-Premise Still Make Sense?

The cost of APIs for large language models (LLMs) is rapidly decreasing, raising questions about the cost-effectiveness of maintaining on-premise infrastructure. Privacy, latency, and customization remain key advantages, but hardware and management c...
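
The trade-off described above comes down to a break-even calculation. The sketch below shows one simple way to frame it. Every figure (token volume, prices, hardware and operating costs) is hypothetical and chosen only to make the arithmetic easy to follow.

```python
def monthly_api_cost(tokens_per_month, price_per_million_tokens):
    """API spend per month for a given token volume and per-million-token price."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def breakeven_months(hardware_cost, monthly_onprem_opex, monthly_api):
    """Months until on-premise hardware pays for itself; None if it never does."""
    monthly_saving = monthly_api - monthly_onprem_opex
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


# Hypothetical scenario: 2B tokens per month, 30,000 of hardware, 500 per month to run it.
api_now = monthly_api_cost(2_000_000_000, price_per_million_tokens=1.0)
months_now = breakeven_months(30_000, 500, api_now)

# Same workload after a hypothetical 5x API price drop.
api_later = monthly_api_cost(2_000_000_000, price_per_million_tokens=0.2)
months_later = breakeven_months(30_000, 500, api_later)

print(months_now, months_later)
```

In this toy scenario the price drop pushes the monthly API bill below the on-premise running cost, so the hardware never pays for itself on cost alone, which leaves privacy, latency, and customization as the remaining arguments.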

#Hardware #LLM On-Premise #DevOps
2026-01-28 The Register AI

UK tax collector plans £2B tech binge: AWS and Capgemini in the lead

The UK's tax collector is budgeting to spend more than £2 billion on new tech deals in the next couple of years. Among the most important contracts, one is set for AWS and another for Capgemini, both to be awarded without competition.

#LLM On-Premise #DevOps
2026-01-28 Tech.eu

Modern Milkman lands £10M to scale its doorstep delivery model

Modern Milkman, a UK-based sustainable grocery delivery service, has raised £10 million in a funding round led by Salica Investments. Founded in 2019, the company aims to further develop its logistics platform and expand integrated services for custo...

2026-01-28 AI News

Standard Chartered: AI and Privacy, an Inseparable Pair

For Standard Chartered, data privacy issues are the starting point for any artificial intelligence project. Data protection regulations influence the type of data that can be used, the transparency of the systems, and their monitoring. The bank adopt...

#LLM On-Premise #DevOps
2026-01-28 The Register AI

Britain's Ministry of Defence signs on the dotted line with Palantir

The UK's Ministry of Defence has directly awarded a £240.6 million contract to US technology company Palantir to continue to licence and support its data analytics work. The 3-year agreement follows protests in the US over Palantir's contracts with I...

2026-01-28 DigiTimes

SK Hynix pledges US$10 billion for US AI arm under tariff pressure

SK Hynix has announced a US$10 billion investment to strengthen its presence in the artificial intelligence sector in the United States. The decision comes amid increasing competition and tariff pressures in the global semiconductor market.

#LLM On-Premise #DevOps
2026-01-28 LocalLLaMA

SanityHarness: Benchmark to evaluate coding agents and LLM models

A developer has created SanityHarness, a benchmark tool to evaluate the capabilities of coding agents and language models in various programming languages. The results are published on SanityBoard, a leaderboard comparing the performance of 49 differ...

#Fine-Tuning
2026-01-28 Tech.eu

b2venture closes €150M Fund V for European startups

b2venture has announced the closing of its Fund V at €150 million, exceeding its hard cap. The fund will support approximately 35 early-stage startups in Europe, focusing on scalable and defensible technologies in deep tech, AI, and robotics. The inv...

2026-01-28 OpenAI Blog

OpenAI Accelerates AI Adoption in Europe with New Initiatives

OpenAI launches the EU Economic Blueprint 2.0, a program featuring new data, partnerships, and initiatives to promote the adoption of artificial intelligence, skills development, and economic growth across Europe. The initiative aims to support Europ...

#LLM On-Premise #DevOps
2026-01-28 OpenAI Blog

EMEA Youth & Wellbeing Grant

Apply for the EMEA Youth & Wellbeing Grant, a €500,000 program funding NGOs and researchers advancing youth safety and wellbeing in the age of AI. The initiative aims to support projects addressing the challenges and opportunities presented by AI for...

2026-01-28 TechWire Asia

Zebra Technologies focuses on AI to optimize frontline operations

Zebra Technologies integrates artificial intelligence into frontline operations to address challenges related to labor shortages, customer expectations, and supply chain unpredictability. The company focuses on solutions that combine AI, data, and hu...

#LLM On-Premise #DevOps
2026-01-28 DigiTimes

SK Hynix reports record-breaking 2025 earnings driven by AI memory boom

SK Hynix has reported record earnings for 2025, driven by strong demand for high-performance memory for artificial intelligence applications. The growth is primarily attributed to rising demand for specialized memory solutions for AI workloads.

#LLM On-Premise #DevOps
2026-01-28 DigiTimes

ASML orders beat expectations as company plans 1,700 layoffs

Semiconductor equipment manufacturer ASML has announced better-than-expected orders, along with a workforce reduction plan involving the layoff of 1,700 employees. The news arrives during a period of change in the technology sector.

2026-01-28 Tech.eu

Pallma AI closes $1.6M pre-seed round for AI agent security

London-based Pallma AI has closed a $1.6 million pre-seed round to develop a centralized security platform for AI agents. The solution aims to protect AI-powered applications from real-time threats, integrating with existing technology stacks and mit...

#LLM On-Premise #DevOps
2026-01-28 DigiTimes

China reportedly approves first Nvidia H200 AI chip imports

China has reportedly approved the first imports of Nvidia H200 chips, a high-end artificial intelligence accelerator. This move could have significant implications for the Chinese AI market and competition among tech companies.

#Hardware #LLM On-Premise #Fine-Tuning
2026-01-28 DigiTimes

ASML posts record 2025 results on surging orders

The lithography systems manufacturer ASML has posted record financial results for 2025, supported by strong order growth. The company, crucial for the production of advanced semiconductors, continues to benefit from the global demand for increasingly po...

#LLM On-Premise #DevOps
2026-01-28 DigiTimes

Phison's Pascari SSDs power world's first lunar data center

Phison's Pascari SSDs have set a new reliability benchmark by powering the world's first lunar data center. This milestone demonstrates the ability of SSDs to operate in extreme environments, opening new frontiers for data processing in space.

#LLM On-Premise #DevOps
2026-01-28 Tech.eu

Co-reactive raises €6.5M for CO₂-negative materials tech

German startup Co-reactive has secured €6.5 million in seed funding to develop CO₂-negative building materials. The continuous mineralization process converts captured CO₂ and natural minerals into supplementary cementitious materials, reducing emiss...

2026-01-28 DigiTimes

Anthropic targets OpenAI, compute costs remain a challenge

Anthropic aims to compete with OpenAI by increasing revenue, but high compute infrastructure costs pose a significant obstacle. The company is evaluating strategies to optimize resources and scale operations.

#Hardware #LLM On-Premise #DevOps
2026-01-28 DigiTimes

Musk plans to cut AI chip design cycle to 9 months

Elon Musk has announced plans to drastically reduce the design cycle for AI chips to just nine months. This acceleration could significantly impact the development of new AI capabilities, but also raises questions about the validity of testing and va...

#LLM On-Premise #DevOps
2026-01-28 DigiTimes

'Taiwan Dome' drives shift toward networked defense warfare

The 'Taiwan Dome' initiative aims to strengthen defense capabilities through a networked approach. This strategic shift underscores the importance of connectivity and information sharing for a more effective response to threats.

#LLM On-Premise #DevOps
2026-01-28 DigiTimes

SoftBank in talks to invest additional US$30 billion in OpenAI

According to press sources, SoftBank is considering an investment of approximately US$30 billion in OpenAI. If confirmed, the deal would represent a significant capital injection for the company that developed ChatGPT and other generative artificial ...

#LLM On-Premise #DevOps
2026-01-28 DigiTimes

Tesla restarts Dojo development with new focus on space AI

Tesla has restarted the development of its Dojo supercomputer, with a renewed focus on artificial intelligence applications in the space sector. The project, previously focused on autonomous driving, appears to be expanding its scope.

#LLM On-Premise #DevOps
2026-01-28 ArXiv cs.CL

Multilingual ASR: LLM Connectors Optimized for Language Families

A new study explores an efficient approach to multilingual Automatic Speech Recognition (ASR) based on LLMs. The technique involves sharing connectors between language families, reducing the number of parameters and improving generalization across di...

#LLM On-Premise #DevOps
2026-01-28 ArXiv cs.AI

LLM Driven Design of Continuous Optimization Problems

A new study explores the use of large language models (LLMs) to generate continuous optimization problems with controllable characteristics. The LLaMEA framework guides an LLM in creating problem code from natural-language descriptions, expanding the...

2026-01-28 ArXiv cs.AI

A-BPMS Systems: Agentic AI Transforms Business Process Management

A new study envisions a transformation in Business Process Management (BPM) thanks to Agentic Artificial Intelligence. A-BPMS systems integrate autonomy, reasoning, and learning for data-driven process management, extending automation to fully autono...

#LLM On-Premise #DevOps
2026-01-28 OpenAI Blog

TrustBank: AI-Powered Personalized Tax Donations

TrustBank partnered with Recursive to build Choice AI using OpenAI models, delivering personalized, conversational recommendations that simplify Furusato Nozei gift discovery. A multi-agent system helps donors navigate thousands of options and find g...

2026-01-28 LocalLLaMA

Kimi K2.5: Open-Source Model Competitive with Proprietary Alternatives

A Reddit user reported that Kimi K2.5, an open-source model, offers performance comparable to more expensive proprietary models like Opus, at about 10% of the cost. It is highlighted as performing better than GLM, especially in tasks other than just ...

#LLM On-Premise #DevOps
2026-01-28 LocalLLaMA

Arcee AI releases Trinity Large: OpenWeight 400B-A13B

Arcee AI has released Trinity Large, an open-weight large language model (LLM) with 400 billion total parameters, of which roughly 13 billion are active per token according to the "A13B" designation. The openly available weights create new possibilities for research and development in the field of generative artific...

#LLM On-Premise #DevOps
2026-01-27 DigiTimes

Ta-i Technology raises chip resistor prices from February

Ta-i Technology will increase chip resistor prices starting in February, due to increasing pressure on production costs. The decision reflects the challenges that electronic component manufacturers are facing globally.

2026-01-27 LocalLLaMA

Kimi K2: Synthetic Analysis Score of an LLM

A user shared a synthetic analysis score for the Kimi K2 language model on Reddit. The original post links to a tweet with further details, sparking discussion about the model's performance in specific scenarios.

2026-01-27 LocalLLaMA

Dual RTX PRO 6000 Workstation: Multi-user and Long Context Benchmarks

A team benchmarked a workstation with dual RTX PRO 6000s and 1.15TB of RAM for multi-user AI workloads, comparing GPU-only (INT4) and CPU+GPU (FP8) inference with MiniMax M2.1. Results show INT4 is faster in prefill but limited by KV-cache, whi...

#Hardware #LLM On-Premise #DevOps
2026-01-27 LocalLLaMA

[LEAKED] Kimi K2.5's System Prompt and Tools Released

The full system prompt for Moonshot's Kimi K2.5 model has been leaked, along with tool schemas, memory CRUD protocols, and external datasource integrations. The leak also includes information on context engineering and user profile assembly.

#LLM On-Premise #DevOps
2026-01-27 The Register AI

Nudify app proliferation shows naked ambition of Apple and Google

A study by the Tech Transparency Project reveals the presence of apps on Apple's App Store and Google Play that allow users to create fake non-consensual nudes. Despite their claims to ban such software, the two companies have reportedly made millions ...

#LLM On-Premise #DevOps
2026-01-27 TechCrunch AI

Anthropic reportedly upped its latest raise to $20B

Anthropic looks to raise $20 billion at more than $300 billion valuation, according to reports. The financial operation could consolidate the company's position in the large language model market.

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-27 LocalLLaMA

Qwen3-32B: INT4 Quantization Achieves 12x Capacity Gain

A benchmark of Qwen3-32B reveals that INT4 quantization, compared to BF16, allows serving 12 times more concurrent users with only a 1.9% accuracy drop. The test was performed on a single H100 GPU, evaluating different precisions (BF16, FP8, INT8, IN...
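
One way to see why lower-precision weights raise serving capacity: memory not used by weights becomes available for the KV cache, which bounds how many sequences can be held at once. The sketch below is a back-of-the-envelope model only. The 80 GB GPU size, the overhead figure, and the per-sequence KV footprint are assumptions, and the result is not meant to reproduce the 12x figure reported in the benchmark.

```python
def weight_gb(params_billion, bits_per_weight):
    """Approximate weight memory in GB for a model of the given size and precision."""
    return params_billion * bits_per_weight / 8


def concurrent_sequences(gpu_gb, weights_gb, overhead_gb, kv_gb_per_sequence):
    """How many sequences fit in the memory left over for the KV cache."""
    free_gb = gpu_gb - weights_gb - overhead_gb
    return max(0, int(free_gb // kv_gb_per_sequence))


GPU_GB = 80.0          # assumed GPU memory
OVERHEAD_GB = 8.0      # hypothetical: activations, runtime buffers
KV_GB_PER_SEQ = 1.0    # hypothetical KV-cache footprint per active sequence

bf16_slots = concurrent_sequences(GPU_GB, weight_gb(32, 16), OVERHEAD_GB, KV_GB_PER_SEQ)
int4_slots = concurrent_sequences(GPU_GB, weight_gb(32, 4), OVERHEAD_GB, KV_GB_PER_SEQ)

print(bf16_slots, int4_slots, int4_slots / bf16_slots)
```

Because the BF16 weights leave so little headroom, even modest changes to the assumed overhead shift the ratio considerably, which is why measured capacity gains can be much larger than the 4x reduction in weight size.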

#Hardware #LLM On-Premise #DevOps
2026-01-27 Tom's Hardware

Zotac warns: component shortages threaten GPU manufacturers

Zotac Korea has expressed concerns about the graphics card market situation. Component shortages threaten the survival of manufacturers and distributors. The company warns about the potential severity of the crisis.

#Hardware #LLM On-Premise #DevOps
2026-01-27 Phoronix

GNU C Library Moving From Sourceware To Linux Foundation Hosted CTI

The GNU C Library "glibc" developers have decided to move ahead with plans to migrate their core services from Sourceware.org infrastructure to the Core Toolchain Infrastructure "CTI" project hosted by the Linux Foundation. This transition aims...

#LLM On-Premise #DevOps
2026-01-27 Wired AI

State-Led Crackdown on Grok and xAI Begins

At least 37 attorneys general for US states and territories are taking action against xAI. The reason is Grok's generation of nonconsensual sexual images of women and minors.

#LLM On-Premise #DevOps
2026-01-27 Google AI Blog

Google delves into the development of the Gemini model in a podcast

The latest episode of the Google AI: Release Notes podcast explores the development process of Gemini, one of the world's leading AI coding models. Logan Kilpatrick interviews the "Smokejumpers" team to reveal the secrets behind its creation and the ...

#LLM On-Premise #DevOps
2026-01-27 The Register AI

European firms push on with AI pilots even as payoff doubts grow

Despite a growing number of reports that AI is not benefiting many businesses, Lenovo and IDC say that firms in EMEA are pushing ahead with pilot deployments and still expect it to drive growth and transform how they operate.

#LLM On-Premise #DevOps
2026-01-27 TechCrunch AI

Anthropic and OpenAI CEOs criticize US immigration policies

The CEOs of Anthropic and OpenAI, Dario Amodei and Sam Altman, have publicly criticized US immigration policies following incidents of violence. The statements were made through both official channels and internal communications.

#LLM On-Premise #DevOps
2026-01-27 TechCrunch AI

Google AI Plus with Gemini 3 Pro Now Available Globally

Google has expanded the availability of the Google AI Plus plan, which includes access to Gemini 3 Pro and other AI tools, to all markets, including the United States. The cost in the US is $7.99 per month.

#LLM On-Premise #DevOps
2026-01-27 Tom's Hardware

Iluvatar CoreX targets Nvidia Rubin with GPU roadmap to 2027

Chinese chip designer Shanghai Iluvatar CoreX has unveiled a multi-year GPU architecture roadmap explicitly targeting Nvidia’s next-gen Rubin platform. The company aims to compete directly with Nvidia by 2027, outlining an ambitious challenge in the ...

#Hardware #LLM On-Premise #DevOps
2026-01-27 TechCrunch AI

OpenAI launches Prism, a new AI workspace for scientists

OpenAI has launched Prism, a new scientific workspace program that integrates AI into existing standards for composing research papers. The goal is to improve the efficiency and productivity of researchers.

2026-01-27 Phoronix

KDE Plasma 6.6 Beta 2 Released For Testing

The second beta of the upcoming KDE Plasma 6.6 desktop is now available for testing. The stable version of KDE Plasma 6.6 is still on track for a mid-February release. This release focuses on improving stability and introducing new features for users...

2026-01-27 LocalLLaMA

Rocinante X 12B v1: Open Source LLM for Local Role-Playing

Rocinante X 12B v1, an open-source large language model (LLM) designed for creative role-playing tasks, is now available. The model, inspired by Claude, is intended to be run locally, giving users complete control over their data and experience. The Local...

#LLM On-Premise #DevOps
2026-01-27 AI News

Databricks: Enterprise AI adoption shifts to agentic systems

According to Databricks, enterprise AI adoption is shifting towards "agentic" systems, where models independently plan and execute workflows. There has been a 327% increase in the use of multi-agent workflows on the Databricks platform between June a...

#LLM On-Premise #DevOps
2026-01-27 Phoronix

Google Cloud N4A Instances with Axion CPUs Now Available

Google is expanding its Axion ARM processor offerings on Google Cloud with the new N4A instances, now generally available. Optimized for scale-out web servers, microservices, and data analytics, these instances promise a more efficient development an...

2026-01-27 Google AI Blog

Enhanced Search: New AI Capabilities for All Users

Search users worldwide now have easier access to cutting-edge artificial intelligence capabilities directly through Search. The article announces an enhanced user experience, aiming to make AI more accessible.

2026-01-27 LocalLLaMA

Z-Image: New Image Generation Model from Tongyi-MAI

Tongyi-MAI has released Z-Image, a new model for image generation. The model is available on Hugging Face, opening up new possibilities for generative artificial intelligence applications. Further details on the model's architecture and capabilities ...

#LLM On-Premise #DevOps
2026-01-27 The Register AI

Pope warns against naive reliance on AI

The Pope urges Catholics to develop critical thinking skills regarding artificial intelligence, warning against the risks of uncritical reliance on technology and unnatural interactions with chatbots. He calls for protecting one's voice and identity.

#LLM On-Premise #DevOps
2026-01-27 Tom's Hardware

SoftBank pauses Switch acquisition: AI data center plans in doubt

SoftBank has paused talks to acquire U.S. data center operator Switch. The deal, which would have been one of SoftBank's largest, was halted due to regulatory roadblocks, jeopardizing plans to build AI-focused data centers.

#LLM On-Premise #DevOps
2026-01-27 Tom's Hardware

Nvidia DGX Spark review: Blackwell power for AI developers

Nvidia's DGX Spark brings a slice of Grace Blackwell power to the desktop for AI developers. With a 20-core Arm CPU, a Blackwell GPU with 6144 CUDA cores, and 128GB of unified memory, the DGX Spark can run a wide range of AI models and workflows with...

#Hardware #LLM On-Premise #DevOps
2026-01-27 The Register AI

US weather alerts: AI translations still incomplete, says GAO

The Government Accountability Office (GAO) has urged the National Weather Service (NWS) to finalize its plans for AI-powered language translation. Delays and policy uncertainties risk compromising the effectiveness of weather alerts for non-English s...

#LLM On-Premise #DevOps
2026-01-27 The Next Web

TNW Council: Early Insights into Startup Support

The TNW Council has identified significant differences in the needs of startups based on their growth stage. Companies with revenues between €1 and €10 million seek growth strategies and positioning clarity. Those between €10 and €100 million, howeve...

#LLM On-Premise #DevOps
2026-01-27 AI News

Anthropic to Build Government AI Assistant Pilot in the UK

The UK government has selected Anthropic to develop an AI assistant aimed at modernizing citizen interaction with state services. The project focuses on deploying agentic systems powered by Claude to guide users through complex processes, with a focu...

#LLM On-Premise #DevOps
2026-01-27 Tech.eu

How Studocu Is Redefining Exam Prep With AI

Studocu, a platform with over 50 million documents, is transforming exam preparation by integrating AI tools. The platform offers instant summaries, study assistants, and interactive quizzes, supporting millions of students worldwide.

2026-01-27 The Register AI

Japan and US collaborate on AI supercomputing: Genesis project revived

Japan's RIKEN, Fujitsu, Argonne National Laboratory (USA), and Nvidia are collaborating to build next-gen compute infrastructure for AI and high-performance computing (HPC). The initiative revives the Genesis project promoted by the Trump administrat...

#Hardware #LLM On-Premise #DevOps
2026-01-27 Tom's Hardware

Intel XeSS 3: Multi-Frame Generation enabled on Arc GPUs and Core Ultra iGPUs

The latest Intel graphics drivers introduce XeSS 3 Multi-Frame Generation with 2x, 3x, and 4x modes. The technology supports existing XeSS 2 games without requiring developer updates, expanding frame generation capabilities across a wide range of Int...

#Hardware #LLM On-Premise #DevOps
2026-01-27 Tech.eu

ZOHO.VC completes first closing at 70% of target fund

ZOHO.VC, the venture capital arm of ZOLLHOF, has completed the first closing of its inaugural fund, securing 70 per cent of its target volume. The fund focuses on pre-seed and seed investments in technology-driven startups, combining capital with tec...

#Hardware
2026-01-27 Tech.eu

Scoro acquires Envoice to close the project cost visibility gap

Scoro, an Estonian-founded project management platform, has acquired Envoice, an Estonian AI-driven expense and bill management company. The integration aims to provide professional services firms with a clearer, real-time view of project costs, impr...

2026-01-27 AI News

Cold snap highlights airlines’ proactive use of AI

The recent cold snap in the US has strained the airline industry. Companies like Air France-KLM and United Airlines are using generative AI to respond faster to customer queries and optimize operations, from flight management to communication. AI ado...

#LLM On-Premise #DevOps #RAG
2026-01-27 LocalLLaMA

Qwen Devs Teasing a New Model: Vision-Language?

The developers of Qwen, the open-source large language model, appear to be teasing the release of a new model. The community speculates that it will be a vision-language model, capable of processing both text and images. More details are expected soo...

#LLM On-Premise #DevOps
2026-01-27 Wired AI

Where Tech Leaders and Students Really Think AI Is Going

A survey explores the opinions of tech CEOs, journalists, students, and other figures in the tech industry regarding the promise and peril of artificial intelligence. The article summarizes the diverse perspectives that emerged, offering a snapshot o...

2026-01-27 Tom's Hardware

Exploring Clawdbot, the AI agent taking the internet by storm

Clawdbot is a new pseudo-locally-hosted gateway for agentic AI that offers a sneak peek at both good and bad futures for the technology. It automates tasks online, but raises security and control issues.

#LLM On-Premise #DevOps
2026-01-27 TechCrunch AI

xAI's Grok Under Fire for Child Safety Failures

A report by Common Sense Media heavily criticizes xAI's Grok chatbot for serious shortcomings in child protection. According to the organization, Grok ranks among the worst chatbots evaluated in terms of safety for young users.

#LLM On-Premise #DevOps
2026-01-27 DigiTimes

Taipower and Westinghouse: Nuclear safety checks for the AI era

Taipower partners with Westinghouse for nuclear safety checks, responding to AI's growing energy demands and net-zero goals. The initiative aims to ensure safe and reliable nuclear plant operations amid new energy challenges.

#LLM On-Premise #DevOps
2026-01-27 DigiTimes

Nvidia launches open models to speed weather forecasting

Nvidia has launched new open source models to accelerate weather forecasting. This initiative aims to provide more accessible and powerful tools for climate modeling, potentially reducing computation times and improving forecast accuracy.

#Hardware #LLM On-Premise #DevOps
2026-01-27 The Register AI

Salesforce AI buffet won't stay all-you-can-eat forever

Gartner is warning Salesforce users that a capped enterprise agreement for its AI and data platforms will not be available when they come to renew, leaving a struggle to predict costs and understand value.

#LLM On-Premise #DevOps
2026-01-27 LocalLLaMA

OpenAI could run out of cash by mid-2027, analyst warns

A new financial analysis predicts OpenAI could burn through its cash reserves by mid-2027. Training costs are exploding, but revenue isn't keeping up. Sam Altman’s '$100 billion Stargate' strategy is reportedly hitting a wall, with competitors like D...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-27 DigiTimes

US grants Taiwan MFN status on Section 232 tariffs, officials say

Taiwanese officials report that the US has granted Taiwan Most Favored Nation (MFN) status regarding Section 232 tariffs. The decision represents an endorsement of Taiwan's industry and could have significant implications for bilateral trade.

2026-01-27 DigiTimes

TSMC and Nvidia ignite AI growth, Taiwan supply chain accelerates expansion

Strong demand for AI solutions, driven by Nvidia, is pushing TSMC and the entire Taiwanese technology supply chain to accelerate expansion plans. The article highlights how the partnership between the chip manufacturer and the GPU giant is fueling si...

#Hardware #LLM On-Premise #DevOps
2026-01-27 The Next Web

Noora Saksa steps in as new Slush CEO

Slush, the Finnish nonprofit behind one of the most influential startup gatherings in Europe, has named Noora Saksa as its new Chief Executive Officer. The shift indicates a strategic evolution for the organisation as it expands beyond its flagship e...

2026-01-27 DigiTimes

Global PCB market: 13.9% growth predicted for 2026

According to TPCA, the global printed circuit board (PCB) market is projected to grow by 13.9% in 2026, driven by increased production capacity related to artificial intelligence. This expansion reflects the growing demand for specialized hardware fo...

#Hardware #LLM On-Premise #Fine-Tuning
2026-01-27 The Register AI

London's Elizabeth Line and its modern 'borks'

London's Elizabeth Line, the latest in urban public transport, also stands out for its modern 'borks'. The article offers an ironic commentary on technological evolution applied to even the most unexpected aspects of urban infrastructure.

#LLM On-Premise #DevOps
2026-01-27 DigiTimes

GlobalFoundries' MIPS takes aim at Arm's hold on automotive AI

MIPS, led by CEO Sameer Wasson, aims to compete with Arm in the artificial intelligence sector for the automotive industry. The competition focuses on innovation and efficiency of computing architectures for advanced applications in vehicles.

#LLM On-Premise #DevOps
2026-01-27 LocalLLaMA

Kimi K2.5: New Open-Source Model with Visual Agentic Intelligence

Moonshot AI introduces Kimi K2.5, an open-source model excelling in agentic tasks, computer vision, and code generation. It features a multi-agent system running in parallel, promising faster speeds compared to single-agent setups. It's available in ...

2026-01-27 LocalLLaMA

Kimi-K2.5: New Open-Source Language Model Released

Kimi-K2.5, a new open-source language model, has been released and is available on Hugging Face. The announcement came in a post on the Reddit community dedicated to local LLMs.

#LLM On-Premise #DevOps
2026-01-27 ArXiv cs.CL

Crystal-KV: Efficient KV Cache Management for Chain-of-Thought LLMs

Crystal-KV is a framework for Key-Value (KV) cache management in large language models (LLMs) using Chain-of-Thought (CoT) reasoning. It optimizes cache utilization by prioritizing information relevant to the final answer, improving throughput and re...

#LLM On-Premise #DevOps
2026-01-27 ArXiv cs.LG

Dataset of Dengue Hospitalizations in Brazil (1999 to 2021)

A new dataset released on Zenodo provides harmonized municipal-level data on dengue hospitalizations in Brazil from 1999 to 2021, disaggregated weekly. The goal is to improve the accuracy of AI models for epidemiological forecasting, including enviro...

#Fine-Tuning
2026-01-27 LocalLLaMA

Jan v3 Instruct: a 4B coding Model with +40% Aider Improvement

The Jan team has released Jan-v3-4B-base-instruct, a 4 billion-parameter model trained with continual pre-training and reinforcement learning. The goal is to improve capabilities across common tasks while preserving general capabilities. The model is...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-27 DigiTimes

Why South Korea's new AI law left manufacturing outside core rules

South Korea's new artificial intelligence law has sparked debate for excluding the manufacturing sector from its core regulations. This strategic choice raises questions about the country's approach to AI regulation and its impact on key industries.

#LLM On-Premise #DevOps
2026-01-27 LocalLLaMA

DeepSeek-OCR-2: New Open Source OCR Model by DeepSeek AI

DeepSeek AI has released DeepSeek-OCR-2, an open-source Optical Character Recognition (OCR) model. The news was shared on Reddit, with a direct link to the model available on Hugging Face. This release could foster the adoption of OCR solutions local...

#LLM On-Premise #DevOps
2026-01-27 DigiTimes

Microsoft Maia 200: Capacity to Increase Tenfold from Maia 100

According to DIGITIMES, Microsoft anticipates an over tenfold increase in the capacity of its Maia 200 systems compared to the previous Maia 100. This significant increase suggests a strong commitment to expanding its computing infrastructure, likely...

#LLM On-Premise #DevOps
2026-01-27 DigiTimes

Nvidia pours US$2 billion into CoreWeave for AI infrastructure expansion

Nvidia has invested US$2 billion in CoreWeave, a cloud infrastructure provider specializing in artificial intelligence workloads. The investment aims to support the expansion of CoreWeave's computing capabilities to meet the growing demand for AI res...

#Hardware #LLM On-Premise #DevOps
2026-01-27 DigiTimes

Amazon: AI applications are already pervasive in 2026

According to DIGITIMES, Amazon anticipates that artificial intelligence applications will be pervasive by 2026. This shift will significantly impact various sectors, transforming how businesses operate and interact with customers. The article explore...

#LLM On-Premise #DevOps
2026-01-27 TechCrunch AI

Qualcomm backs SpotDraft to scale on-device contract AI

SpotDraft, specializing in AI for contract management, receives support from Qualcomm. The company processes over 1 million contracts annually through its AI platform, recording a 173% year-over-year growth. The company aims for a valuation of $400 m...

#LLM On-Premise #DevOps
2026-01-27 LocalLLaMA

Kimi K2.5: New Language Model Released for Testing

A new version of the Kimi language model, named K2.5, has been released. Currently, availability is limited to the official website and there are no official announcements yet, suggesting that the model is still in the testing phase. The previous ver...

#LLM On-Premise #DevOps
2026-01-27 LocalLLaMA

AI skill supply chain vulnerability: developers exposed

A researcher demonstrated how to exploit vulnerabilities in AI model skill sharing platforms, injecting malicious code and executing it on developers' machines. The simulated attack highlights significant supply chain security risks in the world of a...

#LLM On-Premise #DevOps
2026-01-26 DigiTimes

AI-driven power demand tests Taiwan's grid resilience

The surge in power demand driven by artificial intelligence is straining Taiwan's power grid, amid a global shortage of gas turbines and transformers. The resilience of the infrastructure is crucial to support the growth of the AI sector.

#LLM On-Premise #DevOps
2026-01-26 OpenAI Blog

Indeed: AI transforms job search and talent acquisition

Indeed's CRO, Maggie Hulce, explains how artificial intelligence is revolutionizing job search, recruitment, and talent acquisition for both employers and job seekers. AI is optimizing processes, making them more efficient and targeted.

#LLM On-Premise #DevOps
2026-01-26 Tom's Hardware

Nvidia pumps another $2 billion into CoreWeave

Nvidia is investing another $2 billion in CoreWeave, an AI infrastructure provider. The decision reflects Nvidia's confidence in CoreWeave's growth and management, further solidifying the partnership between the two companies.

#Hardware
2026-01-26 Ars Technica AI

OpenAI spills technical details about how its AI coding agent works

OpenAI engineer Michael Bolin published a detailed technical breakdown of how the company's Codex CLI coding agent works internally, offering developers insight into AI coding tools that can write code, run tests, and fix bugs with human supervision....

2026-01-26 The Register AI

TrapC: A Memory-Safe C Language Extension Built with Claude

Robin Rowe introduces TrapC, a memory-safe extension of the C programming language, developed with the help of the Claude language model. The project is almost ready for testing. The article explores the implications of artificial intelligence in the...

2026-01-26 LocalLLaMA

Prompt injection: Local LLM compromised via email

A researcher demonstrated how a single email, containing a masked prompt injection, can trick a local LLM (ClawdBot) into exfiltrating sensitive data. The attack, which doesn't exploit software vulnerabilities, highlights the risks of using AI agents...

#LLM On-Premise #DevOps
2026-01-26 TechCrunch AI

AI startup CVector raises $5M for its industrial ‘nervous system’

New York-based industrial AI startup CVector has raised $5 million in funding. The company has developed a platform that acts as a "nervous system" for industry, with the goal of translating AI into tangible savings on a large scale. The funding will...

#LLM On-Premise #DevOps
2026-01-26 TechCrunch AI

Anthropic launches interactive Claude apps, including Slack

Anthropic has announced the integration of interactive apps within the Claude chatbot interface. Among the initial integrations, Slack and other workplace collaboration tools stand out, opening up new possibilities for using the model in professional...

#LLM On-Premise #DevOps
2026-01-26 The Register AI

AI adoption at work flatlined in Q4, says Gallup

According to a Gallup survey, AI adoption in the workplace stalled in the fourth quarter of 2025. However, those who have already started using it are making increased use of it. Frequent AI users remain a tiny minority of the workforce.

#LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

Multi-agent orchestration for Claude Code: a "hive mind"

A developer has built a multi-agent system for Claude Code, consisting of seven specialized agents that share persistent memory and communicate with each other. The goal is to simulate more intelligent and contextualized collaboration in code de...

#LLM On-Premise
2026-01-26 LocalLLaMA

2TB SSD at bargain price: the deal is at Walmart!

A Reddit user found a 2TB SSD at an incredibly low price at a local Walmart. The discovery highlights how, sometimes, hardware components can be found at bargain prices in less conventional distribution channels. A great opportunity for those assembl...

#Hardware #LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

Transformers v5: New stable release with performance boosts

Hugging Face has released the stable version 5 of Transformers, focused on improved performance (especially for Mixture-of-Experts), simplified APIs for tokenizers, and dynamic weight loading. A migration guide is available to facilitate the upgrade.

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-26 The Register AI

AI is coming to solve your system outages

A sudden system outage in the middle of the night can trigger panic. But what if artificial intelligence could intervene to diagnose and resolve issues before they manifest, reducing downtime and improving overall infrastructure resilience?

#LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

Pushing Qwen3-Max-Thinking Beyond its Limits

A Reddit discussion analyzes the capabilities of the Qwen3-Max-Thinking language model, exploring its potential and limitations. The LocalLLaMA community questions the model's performance and possible applications, with a focus on inference and optim...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-26 TechCrunch AI

Microsoft announces Maia 200, a powerful new chip for AI inference

Microsoft has announced Maia 200, a new chip designed for scaling AI inference. This processor, successor to the Maia 100 released in 2023, is optimized to run powerful AI models at faster speeds and with more efficiency. The company describes it as ...

#Hardware #LLM On-Premise #DevOps
2026-01-26 TechCrunch AI

Nvidia invests $2B in CoreWeave to boost AI compute capacity

Nvidia will invest $2 billion in CoreWeave, a company specializing in accelerated computing infrastructure, to support the expansion of its AI compute capacity by 5GW. The agreement also includes the integration of future Nvidia architectures, includ...

#Hardware #LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

Benchmarking Used Tesla GPUs for Local LLMs: VRAM Analysis

A Reddit user is benchmarking secondhand Tesla GPUs with high VRAM to evaluate their performance in parallel configurations for local LLMs. The aim is to compare these cost-effective cards against more modern devices, quantifying the results using a ...

#Hardware #LLM On-Premise #DevOps
2026-01-26 404 Media

How AI is Exploited to Smear Controversial Figures

The article analyzes how right-wing influencers are exploiting artificial intelligence to create denigrating memes against individuals who have become symbols of protest movements. This phenomenon, accelerated by the spread of generative AI and meme ...

2026-01-26 Tom's Hardware

Neurophos: Silicon Photonics Chip 10,000x Smaller

Bill Gates-backed Neurophos has developed a silicon photonics chip that promises performance exceeding Nvidia's Vera Rubin GPUs while consuming the same power. The technology boasts a 10,000x size reduction compared to current solutions and advanced ...

#Hardware #LLM On-Premise #DevOps
2026-01-26 AI News

Formula E: Google Cloud AI for net-zero targets

Formula E is leveraging Google Cloud AI to meet its net-zero targets by optimizing global logistics and commercial operations. The multi-year agreement includes the integration of Gemini models for performance analysis, back-office workflows, and eve...

2026-01-26 Ars Technica AI

EU investigates xAI over Grok's sexualized deepfakes

The European Union has launched a formal investigation into Elon Musk's xAI following the spread of sexualized deepfake images, including those of minors, generated by its Grok chatbot. The investigation aims to assess whether xAI has taken adequate ...

#LLM On-Premise #DevOps
2026-01-26 Phoronix

Initial AMD GFX13 Target Merged To LLVM 23 Git - Presumably RDNA5

The AMDGPU GFX13 target, presumably related to the next-generation RDNA5 architecture, has been added to the LLVM 23 Git repository. This update represents a preliminary step towards supporting the new hardware in development toolchains.

#Hardware #LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

Minimax Is Teasing M2.2: Busy February for Chinese Labs

February is shaping up to be a busy month for Chinese AI labs. In addition to the already announced Deepseek v4 and Kimi K3, Minimax is reportedly about to release the M2.2 model. There are also rumors of a proprietary model coming from ByteDance.

2026-01-26 The Register AI

Windows 11: Boot Failures Reported After January Security Updates

Microsoft is investigating reports of boot issues on Windows 11 machines after installing the January security updates. Some systems are stuck in a boot loop, requiring further analysis by Microsoft engineers.

#Hardware #LLM On-Premise #DevOps
2026-01-26 The Register AI

When AI 'builds a browser,' check the repo before believing the hype

Cursor's claim of building a browser almost entirely with AI agents has raised doubts. The article urges careful verification of claims before accepting them as truth, highlighting that code generation is only one part of shipping working software.

#LLM On-Premise #DevOps
2026-01-26 Tom's Hardware

PS4 Slim transformed into a handheld with 7-inch OLED screen

A fan has created a portable PS4 based on a PS4 Slim, integrating a 7-inch OLED screen, HDMI output, and a 3-hour battery. The modified console retains the original functionalities and can also be used while charging.

2026-01-26 Tom's Hardware

Saudi Arabia's 'The Line' megacity scaled back, may become AI hub

Saudi Arabia is reportedly scaling back its ambitious 'The Line' megacity project. New reports suggest a potential shift in focus towards becoming a hub for AI data centers. Originally planned for 9 million residents, the city may now prioritize digi...

#LLM On-Premise #DevOps
2026-01-26 Tech.eu

Synthesia doubles valuation to $4BN in 12 months

UK-based startup Synthesia, specializing in AI-powered corporate videos, has nearly doubled its valuation to $4 billion in just one year. A new $200 million funding round, led by Google Ventures, will support the development of interactive AI agents ...

#Hardware #LLM On-Premise #DevOps
2026-01-26 Phoronix

Linux: New Patches Aim to Lower Memory Use For Swap

A new patch series for the Linux kernel, developed by Kairui Song of Tencent, aims to enhance swap memory management. The changes promise memory savings and a slight increase in system performance.

2026-01-26 AI News

Modernizing apps triples the odds of AI returns, Cloudflare says

According to a Cloudflare report, companies that have modernized their applications are almost three times more likely to see a return on their AI investments. The report highlights application modernization as a key factor for AI success, surpassing...

#LLM On-Premise #DevOps
2026-01-26 DigiTimes

TAISIC Materials shifts focus to high-end SiC substrates

TAISIC Materials is shifting its focus towards the production of high-end silicon carbide (SiC) substrates. The strategic decision aims to capitalize on the increasing demand for advanced materials in the semiconductor industry...

2026-01-26 DigiTimes

Strategies of Nvidia, Arm, Qualcomm for AI ASICs

According to DIGITIMES, Nvidia, Arm, and Qualcomm are defining specific strategies for the development of Application-Specific Integrated Circuits (ASICs) dedicated to artificial intelligence. The article analyzes the different directions taken by th...

#Hardware #LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

AI Chatbots Replace Customer Support: A Double-Edged Sword?

Companies are increasingly replacing customer support staff with AI-powered chatbots, often with unsatisfactory results. A user shares negative experiences with Ebay and Payoneer, highlighting irrelevant and inaccurate responses. The discussion focus...

#LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

ChatGPT Subscriptions Canceled Over MAGA Donation

OpenAI's COO's decision to donate heavily to MAGA, Inc. has sparked backlash among ChatGPT users. Many subscribers have announced the cancellation of their premium accounts in protest, raising questions about the ethical alignment of AI companies.

#LLM On-Premise #DevOps
2026-01-26 Tech.eu

Kime raises €2M for AI-driven brand visibility analytics

Copenhagen-based startup Kime has raised €2 million in pre-seed funding to develop an analytics platform that tracks brand visibility within AI-generated responses. The aim is to provide companies with measurable and actionable data on the impact of ...

#LLM On-Premise #DevOps
2026-01-26 TechCrunch AI

Synthesia hits $4B valuation, lets employees cash out

British startup Synthesia, which provides an AI platform for creating interactive training videos, has raised a $200 million Series E funding round. This brings its valuation to $4 billion, up from $2.1 billion just a year ago.

#LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

Reflow Studio: Local workstation for voice cloning and lip sync

Reflow Studio v0.5 is a local and portable workstation for neural dubbing, integrating RVC (voice cloning), Wav2Lip (lip sync), and GFPGAN (face enhancement). It doesn't require Python installation and offers a Cyberpunk-themed interface for an offli...

#LLM On-Premise #DevOps
2026-01-26 The Register AI

Red Teaming for AI: The Cornerstone of Secure Compliance

Red teaming emerges as a cornerstone practice for safeguarding AI systems, especially in the age of agentic AI, where multi-LLM systems make autonomous decisions. Transparency in AI development and deployment is crucial to mitigate vulnerabilities an...

#LLM On-Premise #DevOps
2026-01-26 Tech.eu

Orbital raises $60M Series B to automate real estate law with AI

Orbital, an AI platform for real estate law, has raised $60 million in a Series B funding round. The company aims to expand its presence in the US and UK, further developing its technology to automate legal processes in the real estate sector and cre...

#LLM On-Premise #DevOps
2026-01-26 ArXiv cs.AI

LLM Agent Reliability: A Diagnostic Framework for Tool Invocation

A new diagnostic framework evaluates the reliability of multi-agent LLM systems in enterprise automation, focusing on deployments in privacy-sensitive environments. The research analyzes various hardware architectures and models, identifying bottlenec...

#Hardware
2026-01-26 ArXiv cs.CL

ChiEngMixBench: Evaluating LLMs on Chinese-English Code-Mixed Generation

ChiEngMixBench, a new benchmark, evaluates large language models (LLMs) on Chinese-English code-mixing in real-world communication. It analyzes the spontaneity and naturalness of language, revealing cognitive alignment strategies between LLMs and hum...

#LLM On-Premise #Fine-Tuning #DevOps
2026-01-26 ArXiv cs.LG

Fitbit and Mental Health: Study on Students During the Pandemic

A study analyzed data collected via Fitbit devices to assess the mental health of students during the pandemic. The results indicate that physiological parameters such as heart rate and sleep quality can be useful for early identification of anxie...

#Fine-Tuning
2026-01-26 ArXiv cs.LG

Causal Discovery: New Method for Discrete Data

A new study introduces a generalized score matching approach to identify causal relationships in discrete data. The method, which focuses on identifying the topological order of directed acyclic graphs (DAGs), promises to improve the accuracy of caus...

2026-01-26 DigiTimes

AI development: focus shifts to real-world impact in 2026

In 2026, artificial intelligence development will shift towards concrete applications and tangible results. This change of direction indicates a phase of industry recalibration, with a greater focus on the effectiveness and integration of AI in real-...

#LLM On-Premise #DevOps
2026-01-26 DigiTimes

Foxconn leads EMS/ODM rankings on AI servers and Apple boost

According to DIGITIMES, Foxconn maintains its leading position in the production of AI servers, also benefiting from strong demand from Apple. Competition in the sector remains high, with other manufacturers seeking to gain market share.

#LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

Nvidia DGX Spark GB10 won at a hackathon: what now?

A user won a Dell DGX Spark GB10 workstation at an Nvidia hackathon and is asking for advice on how to best utilize it. Previously, they were using it for inferencing a Nemotron 30B model with vLLM, which required over 100 GB of memory. Now they are ...

#Hardware #LLM On-Premise #DevOps
2026-01-26 LocalLLaMA

AutoGen: accelerated inference with Speculative Reasoning Execution

An engineer optimized Microsoft AutoGen's reasoning loop, reducing agent latency by 85% using Speculative Reasoning Execution (SRE). The module, currently under approval, predicts "tool calls" in parallel with LLM inference. A distributed training sy...

#Hardware #Fine-Tuning
2026-01-26 Phoronix

Several New X.Org Libraries See 2026 Releases

While we await developments on the future of X.Org Server and a possible 26.1 release, several X.Org libraries received new point releases. These releases primarily focus on build fixes and minor improvements.

2026-01-26 DigiTimes

Pegatron aims for growth in AI servers and automotive sectors

Pegatron, despite market challenges, aims for significant growth in the AI server and automotive sectors. The Taiwanese company plans to expand its presence in these strategic markets to offset difficulties encountered in other areas.

#LLM On-Premise #DevOps
2026-01-25 DigiTimes

The ASIC server wars begin as Nvidia squeezes the margins

According to DIGITIMES, the ASIC-based server market is about to enter a phase of more intense competition, with Nvidia putting pressure on profit margins. This could lead to new dynamics in the artificial intelligence hardware sector.

#Hardware #LLM On-Premise #DevOps
2026-01-25 DigiTimes

Ennostar’s Micro LED bet targets a bottleneck inside AI server racks

Ennostar is betting on Micro LEDs to solve overheating issues inside AI servers. The technology could improve the efficiency and reliability of cooling systems, which are crucial for the performance of artificial intelligence workloads.

#Hardware #LLM On-Premise #DevOps
2026-01-25 DigiTimes

Humanoid robots edge toward mass production as Tesla aims for 2026 launch

Tesla is accelerating plans for the mass production of humanoid robots, aiming for a launch in 2026. This initiative could mark a turning point in the robotics sector, opening new perspectives for automation and human-machine interaction. Mass produc...

#LLM On-Premise #DevOps
2026-01-25 DigiTimes

Tariff uncertainty pushes US allies to rethink China ties

Growing trade tensions and uncertainty about tariffs imposed by the United States are prompting its allies to reconsider their economic relationships with China. This situation could lead to a diversification of supply chains and a greater focus on d...

2026-01-25 TechCrunch AI

From Firefighting to AI: Startup Aims for AI Gold Mine

A founder is transforming his experience in the firefighting industry into an opportunity in the field of artificial intelligence. The company sees the nozzle as just the beginning of a journey leading to innovative AI solutions.

#LLM On-Premise #DevOps
2026-01-25 TechCrunch AI

ChatGPT is pulling answers from Elon Musk’s Grokipedia

ChatGPT is incorporating information from Grokipedia, the AI-generated encyclopedia developed by Elon Musk's xAI, into its search results. This raises questions about the origin and reliability of the sources used by large language models.

#LLM On-Premise #DevOps
2026-01-25 TechCrunch AI

Humans&: New Foundation Models for AI Collaboration

Humans&, a startup founded by alumni of Anthropic, Meta, OpenAI, xAI, and Google DeepMind, is building next-generation foundation models focused on collaboration, moving beyond the traditional chat-based approach.

#LLM On-Premise #DevOps
2026-01-25 TechCrunch AI

Science fiction writers, Comic-Con say goodbye to AI

Major players in science fiction and pop culture are taking firmer stances against generative AI. The article explores how these communities are reacting to the advancement of artificial intelligence and what implications this may have for the future...

#LLM On-Premise #DevOps
2026-01-25 LocalLLaMA

GLM-4.7-Flash: performance further improved

A Reddit discussion highlights speed improvements achieved with GLM-4.7-Flash, a large language model. Specific technical details and benchmark results are available via a GitHub link, providing developers with useful information to optimize performa...

#LLM On-Premise #DevOps
2026-01-25 LocalLLaMA

GLM-4.7-Flash: performance slowdown with large contexts?

A user reported a performance drop in the GLM-4.7-Flash model as the context length increases. Benchmarks show a decrease in tokens per second (t/s) when moving from short to longer contexts, suggesting a possible bottleneck in processing long sequen...

#Hardware
2026-01-25 The Next Web

EU: Single company structure for startups, ban on 'high-risk' tech

The European Union is accelerating innovation with "EU Inc," a unified legal structure for startups. Simultaneously, it aims to eliminate technology suppliers deemed "high-risk" from critical infrastructure. These measures aim to strengthen the conti...

2026-01-25 Phoronix

AMD: Graphics Driver Fixes Incoming for Linux 7.0

AMD is planning to release a series of fixes to the open-source AMDGPU and AMDKFD graphics drivers. These changes have been queued up ahead of the next Linux 7.0 kernel merge window and aim to improve the stability and reliability of the drivers. The...

#Hardware
2026-01-25 TechCrunch AI

Gemini-powered Siri: Apple to unveil the upgrade in February?

Rumors suggest Apple might unveil the new version of its Siri voice assistant, powered by Google's Gemini AI, in February. This move would mark a turning point for Siri, long criticized for its limited capabilities compared to competitors.

2026-01-25 LocalLLaMA

Iran: Internet Blackout and Local LLMs as an Alternative

In Iran, a prolonged internet blackout that began over 400 hours ago amid protests has led to severe restrictions on online access. Only a few sites, including Google and ChatGPT, have been whitelisted. In this scenario, local uncensored language mo...

#Hardware
2026-01-25 LocalLLaMA

Open Source Coding Ideas for AI-Assisted Engineering

A Reddit user seeks advice on structuring a guide for developers, from beginners to veterans, interested in AI-assisted engineering. The goal is to create a collaborative learning environment and identify useful tools for hackathons and long-term pro...

2026-01-25 LocalLLaMA

TrustifAI: A Framework for Evaluating the Reliability of AI Responses

TrustifAI is a new framework designed to quantify and explain the reliability of responses generated by large language models (LLMs). Instead of a simple correctness score, TrustifAI calculates a multi-dimensional 'Trust Score' based on evidence cove...

#RAG
2026-01-25 LocalLLaMA

KV cache fix for GLM 4.7 Flash: longer contexts

An optimization for GLM 4.7 Flash reduces VRAM usage of the KV cache. The modification, which involves removing 'Air', allows handling much longer contexts with the same hardware setup, saving gigabytes of video memory.

#Hardware
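
For a sense of scale, KV cache memory can be estimated with the standard formula for plain attention: two tensors per layer (keys and values), each holding `n_kv_heads * head_dim` elements per token. The sketch below is only illustrative. The layer count, head count, and head dimension are hypothetical placeholders, not GLM 4.7 Flash's actual configuration, and the code does not model the specific fix described above.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV cache size in bytes for standard attention."""
    # Factor 2: one tensor for keys and one for values, per layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Hypothetical model shape, chosen only for illustration (16-bit cache entries).
small = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_len=8_192)
large = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_len=65_536)

print(f"8K context:  {small / 2**30:.2f} GiB")
print(f"64K context: {large / 2**30:.2f} GiB")
```

Because the estimate grows linearly with context length, any reduction in per-token cache size translates directly into longer usable contexts on the same card.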
2026-01-25 LocalLLaMA

Extreme Modding: RTX 4090 Upgraded to 48 GB of Memory

An enthusiast has published a detailed guide to increase the memory of an RTX 4090 up to 48 GB. The procedure, which requires advanced soldering skills and in-depth hardware knowledge, is documented in a video and is generating interest in the moddin...

#Hardware
2026-01-25 LocalLLaMA

SOSM: An Open-Source Graph-Based Alternative to Transformers

A researcher has open-sourced the Self-Organizing State Model (SOSM) project, a language model architecture exploring alternatives to standard Transformer attention. SOSM uses graph-based routing, separates semantic representation from temporal learn...

2026-01-25 Tom's Hardware

ChatGPT found sourcing data from AI-generated content

ChatGPT has been found to be citing Grokipedia in some of its answers, returning recursive results that risk spreading hallucinated or incorrect information. This raises concerns about the quality and reliability of the language model's output.

2026-01-25 LocalLLaMA

Zerotap: The Android App Aiming to Control Your Phone with AI

The developers of Zerotap, an Android app that allows AI to interact with the phone like a human, are asking users for feedback. The app supports Ollama as well as OpenAI and Gemini models. Planned features include: connection to external services, adv...

#LLM On-Premise
2026-01-25 Tom's Hardware

RTX 2080 Ti modded into 900W Titan RTX with transplanted core

A modder has transformed an RTX 2080 Ti Hall of Fame graphics card into a supercharged Titan RTX. The modification involved transplanting the core and adding 24GB of GDDR6 memory, along with a 900W power limit modification. The resulting card outperf...

#Hardware
2026-01-25 LocalLLaMA

What happened to moondream3? The state of the visual model

The Moondream3 visual model, unveiled last year, seems to have disappeared. Despite an MLX version being available, Llama.cpp implementations and public updates are missing. The community is wondering about the future of this promising project.

#LLM On-Premise
2026-01-25 Phoronix

Focusrite Forte USB Audio Interface To Be Supported By Linux 7.0

The Focusrite Forte 2-in, 4-out USB audio interface, a portable audio recording solution, will be supported by the mainline Linux 7.0 kernel. The patches are queued in the Linux kernel's sound subsystem development tree. While a convenient little dev...

#Hardware
2026-01-25 Phoronix

Qualcomm: Display and Graphics Support Enhanced with Linux 7.0

Rob Clark sent out the latest MSM DRM kernel driver updates, bringing new Qualcomm display and graphics enhancements ahead of next month's Linux 7.0 merge window. Highlights include support for Snapdragon 8 Elite Gen 5 and enablement of the older A...

#Hardware
2026-01-25 Tech in Asia

Singapore to invest $786m in public AI research by 2030

Singapore has announced a $786 million investment in public artificial intelligence research by 2030. This initiative follows previous government allocations, including a US$393 million fund allocated to AI Singapore for research and development. The...

2026-01-25 Tech in Asia

Nvidia CEO visits China as H200 chip faces customs uncertainty

Nvidia's CEO has visited China as the company awaits approval from Beijing to sell its H200 AI chip. The sale has been authorized by the US, but uncertainty remains regarding Chinese customs. The move underscores the importance of the Chinese market ...

#Hardware
2026-01-25 Tech in Asia

South Korea denies bias in Coupang investigation

Prime Minister Kim Min-seok clarified that the Korean government has not discriminated against US firms, including Coupang, during an ongoing investigation. The government reaffirms its impartiality towards foreign businesses.

2026-01-25 Tech in Asia

Taiwan: AI and Big Data Drive Innovation in Startups

A recent study in Taiwan reveals that over 80% of local startups focus their activities on artificial intelligence and big data. This figure highlights how these technologies are becoming increasingly central to new businesses, driving innovation and...

2026-01-25 LocalLLaMA

Qwen 3 VL: Distilling Gemini 3 Flash visual reasoning

A user is working on a synthetic data pipeline for high-precision image-to-image models. The goal is to transfer the visual reasoning capabilities of Gemini 3 Flash into the open-source model Qwen 3 VL 32B, to obtain a local engine for high-scalabili...

#Fine-Tuning
2026-01-25 LocalLLaMA

Stable-DiffCoder: a new code LLM built on Seed-Coder

Stable-DiffCoder, a new large language model (LLM) specializing in code generation, has been unveiled. Built upon the Seed-Coder model, Stable-DiffCoder utilizes diffusion techniques to enhance the quality and consistency of the generated code. The p...

2026-01-25 Phoronix

GIMP 3.0.8 Released In Advance Of GIMP 3.2

GIMP 3.0.8 is now available, potentially marking the final bug fix release for the 3.0 series. This update arrives ahead of the anticipated GIMP 3.2 release, the next major version of the popular image manipulation software.

2026-01-25 DigiTimes

Singapore moves to secure AI sovereignty with $786m investment

Singapore aims to strengthen its technological independence in the field of artificial intelligence, allocating $786 million. The goal is to promote the development of local capabilities and reduce dependence on foreign suppliers in a strategic secto...

2026-01-25 DigiTimes

Fan motor demand drives growth for Taiwan analog chip makers

Taiwanese analog chip makers are benefiting from strong demand in the fan motor sector. Growth is driven by the need for efficient cooling solutions in various sectors, from consumer electronics to data centers. This positive trend highlights Taiwan'...

2026-01-25 DigiTimes

Nvidia's China AI chip share falls to 8% as local rivals ramp up

Nvidia's market share in the Chinese AI chip sector has fallen to 8%. Chinese manufacturers are increasing production and gaining market share, eroding the dominance previously held by the US company. This dynamic reflects increasing competition in t...

#Hardware
2026-01-25 DigiTimes

From cars to robots: automakers expand into AI and wearable tech

Automakers are expanding their horizons by investing in artificial intelligence and wearable technologies. This strategic diversification aims to integrate new features into vehicles and explore adjacent sectors, paving the way for future innovations...

2026-01-25 LocalLLaMA

Drift: Codebase Analysis Without AI, Just AST Parsing

A developer has created Drift, a tool for code analysis that uses AST parsing and Regex. It scans the codebase, extracts patterns, and makes them accessible via CLI or IDE. Unlike rule-based tools, Drift learns from the codebase, helping agents avoid...

#Fine-Tuning
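
As a rough illustration of AST-based pattern extraction, the sketch below uses Python's standard `ast` module to count a few recurring constructs in a sample source string. It is not Drift's implementation: the sample code, the `count_patterns` helper, and the choice of patterns are all hypothetical.

```python
import ast
from collections import Counter

# Hypothetical snippet standing in for a scanned codebase.
SAMPLE = '''
import logging

def load_user(user_id):
    try:
        return db.get(user_id)
    except KeyError:
        logging.warning("missing user")
        return None

def load_order(order_id):
    try:
        return db.get(order_id)
    except KeyError:
        return None
'''


def count_patterns(source: str) -> Counter:
    """Count a few syntactic patterns by walking the AST."""
    counts: Counter = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            counts["function"] += 1
        elif isinstance(node, ast.ExceptHandler) and isinstance(node.type, ast.Name):
            # Record which exception types the code tends to catch.
            counts[f"except:{node.type.id}"] += 1
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            # Record method-style calls such as db.get(...) by attribute name.
            counts[f"call:.{node.func.attr}"] += 1
    return counts


patterns = count_patterns(SAMPLE)
print(patterns.most_common())
```

Counts like these could be surfaced to a coding agent as "this codebase usually catches `KeyError` and returns `None`", which is the kind of convention a purely rule-based linter would not know about in advance.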
2026-01-24 LocalLLaMA

Qwen3-TTS: Ultra-Low Latency, Voice Cloning & OpenAI-Compatible API

The Qwen team has released Qwen3-TTS, an open-source speech synthesis system offering low latency (97ms), voice cloning, and OpenAI API compatibility. It supports 10+ languages and includes high-quality voices. It can be easily integrated into existi...

#Hardware
2026-01-24 LocalLLaMA

LLM: Which local model on 24GB GPU in 2026?

A LocalLLaMA user is wondering about the evolution of large language models (LLMs) that can be run locally. Specifically, he asks if, nine months after the release of Gemma 3 27b, there are better alternatives available that can run on a single 3090t...

#Hardware
2026-01-24 TechCrunch AI

Tech CEOs boast and bicker about AI at Davos

This week's World Economic Forum meeting saw tech leaders hotly debating artificial intelligence. The event transformed, at times, into a high-powered tech conference, with CEOs clashing over future visions and strategies.

2026-01-24 LocalLLaMA

GLM 4.7 Flash: Uncensored "Balanced" & "Aggressive" Variants

Uncensored versions of Z.ai's GLM 4.7 Flash model are now available. This 30B MoE model features approximately 3B active parameters and a 200K token context. The "Balanced" variant, suitable for agentic coding, and the "Aggressive" variant, for uncen...

#LLM On-Premise
2026-01-24 LocalLLaMA

South Korea: An Emerging Power in Artificial Intelligence

South Korea is establishing itself as a leading nation in the field of artificial intelligence, thanks in part to the Korean National Sovereign AI Initiative. This government program incentivizes the development of domestic AI models, funding the mos...

2026-01-24 LocalLLaMA

DIY Audiobook: Open-Source Tool with Qwen3 and Voice Cloning

A developer has created an open-source converter to transform PDFs, EPUBs, and other formats into high-quality audiobooks. The tool uses Qwen3 TTS, an open-source voice model, and supports voice cloning. The goal is to offer a free alternative to pai...

2026-01-24 Tom's Hardware

Custom RTX 3080 Heatsink: 100W Car Amplifier Hack Slashes Temps

A Redditor replaced their RTX 3080's backplate with a massive heatsink repurposed from a 100W car amplifier. This custom mod reportedly slashes GPU temperatures by 10°C, improving thermal performance and potentially extending the lifespan of the grap...

#Hardware
2026-01-24 The Next Web

Mews raises €255M to accelerate AI and automation in hospitality

Amsterdam-based hospitality tech platform Mews has raised €255 million in a Series D funding round. The aim is to accelerate the adoption of automation and AI-powered solutions for hotels worldwide. The funding, led by EQT Growth, values the company ...

2026-01-24 LocalLLaMA

AI Media Player: Automatic Subtitles and Video Chat in the Browser

A new AI-powered media player promises to revolutionize the way we consume video and audio content directly in the browser. With no installation required, it offers automatic subtitles in over 100 languages, translation, summaries, a built-in diction...

2026-01-24 Tom's Hardware

Intel Q4: Revenue at 2010 Low, Slow Recovery, Supply Chain Issues

Intel closed Q4 with positive results but anticipates supply constraints in Q1 2026. The foundry division continues to generate losses, despite the start of 18A production. Corporate management stabilized in 2025, but the path to full recovery still ...

#Hardware
2026-01-24 Tom's Hardware

Zotac reportedly hikes Nvidia RTX GPU prices by up to $200

Zotac has reportedly increased the MSRP of its Nvidia RTX GPUs by up to $200. Some users are also reporting order cancellations prior to the price hike, sparking controversy over the company's handling of deliveries. The company blames a "system erro...

#Hardware
2026-01-24 LocalLLaMA

LLM Workstation: Which Setup Under $5k?

A user, tired of Claude Code's limitations, seeks advice on assembling or purchasing a dedicated machine for offline language model development, with a budget of $5,000. They are considering various options, including pre-configured workstations and ...

#Hardware
2026-01-24 LocalLLaMA

Field test of GLM 4.7 Flash Q6 with RTX 5090

A user shares their hands-on experience with the GLM 4.7 Flash Q6 model, focusing on its ability to work with Roo Code in personal web projects. The model proved more reliable and precise than alternatives like GPT-OSS 120b and GLM 4.5 Air, especially w...

#LLM On-Premise
2026-01-24 404 Media

Cow Uses Tools Like a Chimpanzee: Discovery in Austria

A Swiss Brown cow named Veronika has been observed using tools to scratch herself, a behavior previously documented mainly in primates, orcas, and birds. The discovery challenges assumptions about bovine intelligence and raises questions about the ro...

2026-01-24 LocalLLaMA

Local LLM Development: A Challenge for Hardware Coders?

A hardware coder has expressed frustration with the performance of large language models (LLMs) running locally on a 5090 GPU. Despite the powerful hardware, the models seem underutilized and unable to leverage external tools to improve context. The ...

#Hardware #LLM On-Premise
2026-01-24 Tom's Hardware

Applied Digital builds AI data center in secret to avoid protests

Applied Digital is building a new 430 MW AI data center in a secret location. The company, previously involved in crypto mining, wants to avoid media attention and potential protests from local residents. The decision to operate in secrecy is motivat...

2026-01-24 LocalLLaMA

LLM Prompt Library for RAG: An Open-Source Collection

A prompt library for large language models (LLMs), designed specifically for Retrieval-Augmented Generation (RAG) architectures, has been released. The library includes prompts focused on grounding constraints, citation rules, and ha...

#RAG
2026-01-24 Phoronix

Linux 6.19: AMDGPU Driver Fixes Regressions

The AMDGPU driver for Linux 6.19 has received urgent fixes to address regressions affecting many users. Developers have worked to integrate the necessary patches and stabilize the system, ensuring a smoother user experience. This timely intervention ...

#Hardware
2026-01-24 Tom's Hardware

China reveals 200-strong AI drone swarm controlled by single soldier

The People's Liberation Army has revealed its latest drone swarm tech, featuring 200 units. The system is resistant to jamming, capable of autonomous decisions, and controlled by a single soldier thanks to an "intelligent algorithm" that allows units...

2026-01-24 LocalLLaMA

Hugging Face: AI & ML Model Highlights of the Week

Hugging Face has released and updated several AI and machine learning models. These include multilingual reasoning models like GLM-4.7, tools for automated report generation, and multimodal models for translation and medical image processing. Also no...

2026-01-24 LocalLLaMA

Uncensored LLM for NSFW Interactions: The Search is On

A Reddit user is seeking an uncensored large language model (LLM) capable of generating particularly spicy and intelligent prompts for sexually explicit role-playing games (NSFW). The discussion is open within the LocalLLaMA community, with the aim o...

2026-01-24 LocalLLaMA

GLM 4.7 Flash: Speed Issues with Large Contexts?

A user reported a significant performance drop with GLM 4.7 Flash in LM Studio after exceeding 10,000 tokens, despite using recommended settings and updated software. The discussion explores whether other implementations, such as vLLM, might mitigate...

#Hardware #LLM On-Premise
2026-01-24 LocalLLaMA

Context Engine: Self-Hosted Code Search for LLMs

A developer has created Context Engine, a self-hosted retrieval system for codebases, designed to work with various MCP clients. It uses a hybrid search that combines dense embeddings with lexical search and AST parsing. The goal is to avoid overload...

#LLM On-Premise #DevOps #RAG
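
One common way to combine dense and lexical rankings is reciprocal rank fusion (RRF). The summary above does not say how Context Engine merges its results, so the sketch below is an assumption rather than a description of the tool. The file names and the two input rankings are hypothetical, and `k = 60` is simply the conventional default for RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list receive a larger share.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from two retrievers over the same codebase.
dense_hits = ["auth.py", "session.py", "utils.py"]
lexical_hits = ["session.py", "tokens.py", "auth.py"]

fused = reciprocal_rank_fusion([dense_hits, lexical_hits])
print(fused)
```

RRF needs only rank positions, not comparable scores, which makes it a convenient choice when the retrievers use different scoring scales.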
2026-01-24 DigiTimes

Taiwanese optics industry poised to reshape automotive electronics

Taiwan's optics industry is gearing up for a key role in automotive electronics, driven by the increasing demand for autonomous driving. Taiwanese companies aim to provide innovative solutions for Advanced Driver Assistance Systems (ADAS) and other a...

2026-01-24 DigiTimes

Taiwan drone alliance sets US FCC certification in sight

A Taiwanese drone alliance is working towards obtaining FCC (Federal Communications Commission) certification in the United States. This strategic move aims to facilitate access to the US market, opening up new business opportunities for Taiwanese co...

2026-01-24 DigiTimes

Energy limits AI growth; supply chains reshaped

The increasing energy consumption of artificial intelligence poses new challenges. Geopolitical tensions over rare earths and packaging innovations are reshaping global supply chains. An analysis by DIGITIMES highlights how these interconnected facto...

2026-01-24 OpenAI Blog

Inside GPT-5 for Work: How Businesses Use GPT-5

A new data-driven report examines ChatGPT adoption across industries, highlighting key automated tasks, departmental usage patterns, and the future prospects of AI in the workplace. The analysis is based on concrete data to provide a clear and useful...

2026-01-24 LocalLLaMA

LuxTTS: Efficient voice cloning with a compact TTS model

LuxTTS, a diffusion-based text-to-speech model with only 120 million parameters, has been released. It stands out for its high-quality voice cloning capabilities, comparable to models ten times larger, and its efficiency, requiring less than 1GB of V...

2026-01-24 LocalLLaMA

Strix Halo: MiniMax Q3 K_XL Runs Surprisingly Fast

A user tested Strix Halo (Bosgame M5 with 128GB) on Ubuntu 25.10, achieving remarkable results with the MiniMax Q3 K_XL model. Specifically, a token generation (TG) speed of approximately 30 tokens per second makes the model usable for brainstorming and discu...

2026-01-24 TechCrunch AI

AMI Labs: Yann LeCun's new startup in the world of AI models

AMI Labs, Yann LeCun's new venture after leaving Meta, has immediately captured the attention of the industry. The company will focus on developing advanced AI models, promising to revolutionize the field of artificial intelligence. LeCun, a leading ...

2026-01-24 LocalLLaMA

South Korea's Ruthless Race to Sovereign AI

South Korea is engaged in an intense competition to develop its own artificial intelligence. This "AI Squid Game," as it has been dubbed, sees various companies and institutions vying for supremacy in the field of AI, with the goal of achieving techn...

2026-01-23 Wired AI

Trump and AI at Davos: Analysis from Uncanny Valley

Donald Trump and major AI companies shared the stage at the World Economic Forum in Davos. This episode of 'Uncanny Valley' analyzes the implications of this meeting, exploring the dynamics between politics, technology, and the global economy. A focu...

2026-01-23 TechCrunch AI

Google Photos Update: Create Memes with Gemini AI

Google Photos introduces a new feature that allows users to create custom memes from their photos. The integration leverages Google's Gemini AI, offering a fun way to experiment with images.

2026-01-23 The Register AI

Surrender as a service: Microsoft unlocks BitLocker for feds

A report indicates that Microsoft provided the FBI with BitLocker encryption keys to unlock the laptops of Windows users. This raises questions about the actual security of data protected with BitLocker and the importance of independently managing yo...

2026-01-23 OpenAI Blog

Unrolling the Codex agent loop

A technical deep dive into the Codex agent loop, explaining how Codex CLI orchestrates models, tools, prompts, and performance using the Responses API. We explore the architecture and inner workings of this key component for developing applications b...

2026-01-23 LocalLLaMA

ChatGPT: Scaling PostgreSQL to power 800 million users

OpenAI has outlined its PostgreSQL scaling strategies to support ChatGPT's 800 million users. The original article delves into the challenges faced and the solutions implemented to manage such a high workload, while ensuring optimal performance and s...

2026-01-23 LocalLLaMA

Sweep: Open-weights 1.5B model for next-edit autocomplete

Sweep AI has released a 1.5B parameter open-source model, named Sweep, designed to predict the next code edits. Available on Hugging Face and via a JetBrains plugin, this tool uses recent edits as context, outperforming larger models in speed and acc...

#Fine-Tuning
2026-01-23 TechCrunch AI

Meta pauses teen access to AI characters ahead of new version

Meta has temporarily paused teen access to its AI characters. The company is developing new versions of these characters, designed to provide age-appropriate responses. The move is a precautionary measure, pending the release of the updates.

2026-01-23 404 Media

Behind the Blog: Artificial Intelligence, Banks, and Censorship

A behind-the-scenes look at 404 Media. This week, the focus is on the impact of generative artificial intelligence, a conference on money laundering, and the removal of symbols related to slavery. The interview with the Wikimedia Foundation CTO addre...

#Fine-Tuning
2026-01-23 TechCrunch AI

Meta pauses teen access to AI characters

Meta is developing new versions of its AI characters, designed to provide age-appropriate responses to teenagers. The company has temporarily paused access to this feature for younger users in order to refine and calibrate the responses provided by t...

2026-01-23 LocalLLaMA

Voice Agents: Better Models or Tighter Constraints?

In the development of voice agents, the debate centers on the relative importance of model quality versus well-defined behavioral constraints. A smarter model does not always translate into superior performance if not properly constr...

2026-01-23 Wired AI

The Math on AI Agents Doesn’t Add Up

A research paper suggests AI agents are mathematically doomed to fail. The industry doesn’t agree. This raises fundamental questions about the actual ability of AI agents to achieve their advertised promises.

2026-01-23 TechCrunch AI

AI CEOs transformed Davos into a tech conference

The World Economic Forum's annual meeting in Davos felt different this year, with AI dominating the conversation. CEOs openly discussed AI implications, overshadowing traditional topics like climate change and global poverty. The event marked a turni...

2026-01-23 Tom's Hardware

Alibaba plans T-Head chip-arm IPO to boost AI infrastructure

Alibaba is reportedly preparing an IPO for its chip arm, T-Head. The primary goal is to raise significant capital to fund the development of AI accelerator solutions and support ambitious infrastructure projects. T-Head would compete wi...

2026-01-23 TechCrunch AI

OpenAI's Sam Altman Plans India Visit Amid AI Focus

OpenAI CEO Sam Altman is set to visit India for the first time in nearly a year. The visit comes at a time of great excitement in the artificial intelligence sector, with many industry leaders converging in New Delhi to discuss the future of technolo...

2026-01-23 Tech.eu

Cloover secures over $1.2B, EU Inc launched at Davos

The past week witnessed significant tech funding activity in Europe, with over €2.7 billion distributed across more than 70 deals. Key highlights include the launch of EU Inc at the World Economic Forum in Davos and the announcement of new investment...

2026-01-23 Phoronix

AMD Ryzen AI Software 1.7 Released For Improved Performance

AMD has released a new version of Ryzen AI Software, a user-space package for Microsoft Windows and Linux designed to leverage Ryzen AI NPUs in various AI tasks. The update promises improved performance and new model support.

#Hardware
2026-01-23 Tech.eu

CyberAlloy launches to unite Europe’s cyber defenders

CyberAlloy, an independent network connecting companies, governments, research institutions, venture capitalists, and security specialists, has officially launched. Its goal is to create a cyber-resilient ecosystem by promoting collaboration and info...

2026-01-23 Tom's Hardware

Lian Li RS1200G ATX 3.1 power supply review:

The Lian Li RS1200G ATX 3.1 power supply pairs rotational innovation with reliability, although case compatibility remains a legitimate concern. ATX 3.1 power supplies are designed to support the latest motherboards and GPUs, offering greater energy efficie...

2026-01-23 Tom's Hardware

Asus announces 'immediate internal review' of 800-series motherboards

Asus says it is investigating reports concerning its 800-series motherboards and 9800X3D processors following user complaints of hardware failures. The company aims to shed light on the causes of the malfunctions and assess possible solutions to addr...

#Hardware
2026-01-23 Tech.eu

ClearScore snaps up London mortgage outfit Acre Platforms

London-based fintech ClearScore, which provides credit score services, has acquired UK mortgage platform Acre Platforms. The deal, for an undisclosed amount, marks ClearScore’s first move into mortgages and follows its acquisition of Aro Finance last...

2026-01-23 The Next Web

Can AI replace the humanity of Classical Music?

In October 2021, the Beethoven Orchestra Bonn performed the first movement of Beethoven’s unfinished 10th symphony, which was completed with the use of artificial intelligence. A team developed an AI to analyze Beethoven’s music style and life, gen...

2026-01-23 Tom's Hardware

Intel shares down 13% despite shrinking losses

Intel reports flat revenue for 2025, but shares plummet due to a $300 million loss, despite a massive external investment. Demand is expected to outpace supply until at least 2026.

#Hardware
2026-01-23 LocalLLaMA

DeepSeek-V3.2: Open-Source Model Rivals GPT-5 at 10x Lower Cost

DeepSeek has released V3.2, an open-source model that reportedly matches GPT-5 on math reasoning while costing 10x less to run. By using a new 'Sparse Attention' architecture, the Chinese lab has achieved frontier-class performance for a total traini...

2026-01-23 LocalLLaMA

Llama.cpp now supports OpenAI Responses API

The integration of the OpenAI Responses API into Llama.cpp is now a reality. This news, welcomed by the community, promises to simplify interaction with language models and open new possibilities in the development of AI-based applications. Initial t...

#Hardware #LLM On-Premise
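
A minimal sketch of what a Responses-style request to a local server might look like, using only the Python standard library. The `model` and `input` fields follow the OpenAI Responses API. The base URL, the `/v1/responses` path on the local server, and the model name `local-model` are assumptions for illustration and should be checked against the Llama.cpp server documentation. The request is built but not sent.

```python
import json
import urllib.request


def build_responses_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request in the Responses API shape."""
    payload = {"model": model, "input": prompt}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/responses",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local server address and model name.
req = build_responses_request("http://localhost:8080", "local-model", "Say hello.")
print(req.get_method(), req.full_url)

# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```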
2026-01-23 LocalLLaMA

GLM4.7-Flash REAP: new model for agentic coding

A REAP version of the GLM4.7-Flash model, optimized for agentic coding, has been released. Initial tests indicate a significant improvement over previous versions, positioning it among the most efficient models in relation to size. REAP versio...

#Fine-Tuning
2026-01-23 DigiTimes

Foxconn Industrial Internet accelerates AI-driven transformation

Foxconn Industrial Internet (FII) is accelerating its AI-driven transformation to reshape its global manufacturing platform. The company aims to enhance production processes through the integration of artificial intelligence solutions, making them mo...

2026-01-23 DigiTimes

Taiwan: Auto Industry Recovers After Tariff Talks

Taiwan's automotive industry shows signs of recovery after recent tariff negotiations. A DIGITIMES analysis reveals a 15% improvement, while highlighting the need for further reassurance for sustainable growth in the sector.

2026-01-23 DigiTimes

Compal chair celebrates 50 years with Kinpo Group

The chairman of Compal celebrated his 50th anniversary with the Kinpo Group. The event comes at a moment of reflection on the demographic challenges facing Taiwan and the strategies to ensure sustainable long-term growth. The Compal group i...

2026-01-23 DigiTimes

Taiwan-US tariff agreement expected to boost machine tool industry

A new tariff agreement between Taiwan and the United States is expected to significantly boost Taiwan's machine tool industry. The agreement is projected to reshape global market dynamics, fostering growth and expansion in the sector. The deal could ...

2026-01-23 LocalLLaMA

OpenAI Considers Outcome-Based Pricing: A Tax on Innovation?

OpenAI is reportedly considering an outcome-based pricing model, potentially applying royalties on customer profits. This strategic shift could drive adoption of local solutions, offering greater control over costs and terms of use.

2026-01-23 DigiTimes

Compal eyes explosive AI server growth in 2026 after revenue dip

Taiwanese manufacturer Compal anticipates strong expansion in the AI server market starting in 2026, following a period of revenue decline. The company is investing in new technologies and production capabilities to meet the growing demand for AI sol...

#Hardware
2026-01-23 ArXiv cs.CL

AfriEconQA: A New Dataset for African Economic Analysis

AfriEconQA, a benchmark dataset for African economic analysis based on World Bank reports, has been introduced. Comprising nearly 9,000 QA instances, the dataset aims to evaluate Information Retrieval and RAG systems in a context of numerical reasoni...

#Fine-Tuning #RAG
2026-01-23 ArXiv cs.CL

Entropy-Tree: Tree-Based Decoding with Entropy-Guided Exploration

A novel decoding method for large language models (LLMs), called Entropy-Tree, leverages entropy to guide tree-based exploration. This approach aims to improve both accuracy and reliability in reasoning tasks, outperforming traditional sampling strat...

#Fine-Tuning
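
The general idea of using entropy to decide where to explore can be sketched as follows. This is not the paper's algorithm, whose details are cut off in the summary above. It only shows how the Shannon entropy of a next-token distribution can serve as a branching signal. The threshold and the two example distributions are hypothetical.

```python
import math


def shannon_entropy(probs: list[float]) -> float:
    """Entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def should_branch(probs: list[float], threshold_bits: float = 1.0) -> bool:
    """Branch the search tree only where the model is uncertain."""
    return shannon_entropy(probs) > threshold_bits


# Hypothetical next-token distributions at two decoding steps.
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]

print(should_branch(confident), should_branch(uncertain))
```

Branching only at high-entropy steps concentrates the search budget on positions where alternative continuations are plausible, instead of sampling uniformly at every step.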
2026-01-23 ArXiv cs.LG

Language Models Entangle Language and Culture

New research highlights how the quality of LLM responses is affected by the language used in the query. Low-resource languages receive lower quality answers. The study also reveals that the choice of language significantly impacts the cultural contex...

2026-01-23 ArXiv cs.AI

Uncovering Latent Bias in LLM-Based Emergency Department Triage

New research highlights how large language models (LLMs) integrated into hospital triage systems may exhibit hidden biases against patients from diverse racial, social, and economic backgrounds. The study uses proxy variables to assess the discrimina...

#Fine-Tuning
2026-01-23 DigiTimes

Tata Group to invest US$11 billion in India, boosting AI and chips

Tata Group will invest $11 billion in Maharashtra's Innovation City, India. The aim is to strengthen the country's ambitions in artificial intelligence and semiconductors, strategic sectors for Indian and global technological growth. The initiative i...

2026-01-23 DigiTimes

US backs Taiwan's asymmetric warfare strategy, defense supply chain

The American Institute in Taiwan (AIT) has pledged US backing for Taiwan's asymmetric warfare strategy and its defense supply chain. The AIT emphasized the importance of Taiwan's freedom and security, reiterating the US commitment to support the isla...

2026-01-23 DigiTimes

Palo Alto Networks urges fighting AI threats with AI in 2026

James Yu, Palo Alto Networks' country manager in Taiwan, emphasizes the need to combat AI-driven threats with AI-powered security solutions. The company anticipates a rise in sophisticated attacks and recommends a proactive approach to protect digita...

2026-01-23 DigiTimes

Taiwanese firms remain cautious as AI bubble debate persists

Taiwanese companies are maintaining a cautious attitude towards artificial intelligence, despite the great enthusiasm surrounding this sector. Doubts persist about the sustainability of growth and the real long-term impact of these technologies, curb...

2026-01-23 DigiTimes

Google: Multi-Agent Debate in AI Improves Reasoning

Google research reveals that multi-agent debate within AI models enhances reasoning capabilities, surpassing the limitations of sheer computing power. This innovative approach opens new perspectives in the development of more sophisticated AI systems...

2026-01-23 Phoronix

AMD: Performance Improvements for RDNA4 in RadeonSI Driver

New optimizations for AMD Radeon RDNA4 graphics cards have been merged into the RadeonSI Gallium3D (OpenGL) driver within Mesa. These changes, arriving shortly after the Mesa 26.0 release, will be included in Mesa 26.1, expected in Q2. The focus i...

#Hardware
2026-01-23 LocalLLaMA

Unsloth: 1.8-3.3x faster Embedding finetuning

Unsloth announced an improvement in embedding finetuning speed, with increases of 1.8-3.3x and a 20% reduction in VRAM usage. The new feature supports larger contexts and promises no accuracy loss. It requires only 3GB of VRAM for 4-bit QLoRA and 6GB ...

#LLM On-Premise #Fine-Tuning #RAG
2026-01-23 TechCrunch AI

OpenAI targets the enterprise market in 2026: the strategy

OpenAI has appointed Barret Zoph to lead its push into the enterprise sector. The move comes just a week after Zoph rejoined the company, signaling OpenAI's strong interest in this market segment. The goal is to compete with the major players in the ...

2026-01-23 DigiTimes

NAND shortages help Phison to enter US server supply chain

Taiwanese manufacturer Phison Electronics is leveraging the NAND memory shortage to expand its presence in the US server market. The company specializes in storage solutions and flash memories, and aims to establish itself as a key supplier for Ameri...

2026-01-23 DigiTimes

Taiwan air cargo volume set to hit historic peak in 2026

Taiwan's air cargo volume is projected to reach a historic peak in 2026, driven primarily by chip exports and artificial intelligence. The island is solidifying its position as a crucial hub for air transport of high-value technology goods.

2026-01-22 Ars Technica AI

cURL scraps bug bounties amid AI-generated false positives

The cURL project, a popular open-source networking tool, has decided to discontinue its bug bounty program. The decision was made due to the overwhelming number of low-quality reports, often automatically generated by artificial intelligence systems,...

2026-01-22 TechCrunch AI

Voice AI engine and OpenAI partner LiveKit hits $1B valuation

LiveKit, a voice AI engine and OpenAI partner, has reached a valuation of $1 billion. This milestone was achieved through a $100 million funding round led by Index Ventures. The company, founded five years ago, is positioning itself as a key player i...

2026-01-22 LocalLLaMA

vLLM raising $150M confirms inference as the new bottleneck

The $150 million funding for vLLM (Inferact) signals a shift in priorities in the AI sector. After years of massive investments in model training, the focus is now on inference, particularly on efficiency, latency, and throughput. The competition wil...

#Hardware #LLM On-Premise #Fine-Tuning
2026-01-22 TechCrunch AI

Are AI agents ready for the workplace? A new benchmark raises doubts

New research assesses how leading AI models perform on actual white-collar work tasks, drawn from consulting, investment banking, and law. The results show that most models failed to complete the tasks effectively, raising doubts about their current ...

2026-01-22 Ars Technica AI

Apple Developing AI-Powered Wearable Pin, Launch Expected in 2027

Apple is reportedly developing a wearable device with artificial intelligence capabilities. The device, similar in size to an AirTag, would be worn as a pin. The launch could happen as early as 2027. It remains to be seen whether the device will be s...

2026-01-22 LocalLLaMA

Unsloth announces support for finetuning embedding models

Daniel Han from Unsloth announced support for finetuning embedding models with Unsloth and Sentence Transformers. It promises faster speeds (up to 3.3x) and lower VRAM usage (up to 20% less). Example notebooks are available for RAG and semantic similarity...

#Fine-Tuning #RAG
2026-01-22 Phoronix

Linux Kernel: Fix for Unauthorized GPU Memory Consumption

A vulnerability in the Linux kernel's Direct Rendering Manager (DRM) driver allowed unprivileged users to exhaust kernel memory. The flaw has been fixed to prevent system crashes due to out-of-memory errors.

#Hardware
2026-01-22 PyTorch Blog

Feast Joins the PyTorch Ecosystem: Bridging Feature Stores and Deep Learning

Feast, the open-source platform for managing data in AI, integrates with PyTorch. The goal is to resolve inconsistencies between training and production data, accelerating the release of accurate and reliable models. The integration enables feature s...

#Hardware #Fine-Tuning #DevOps
2026-01-22 TechCrunch AI

Humans&: Coordination is the next frontier for AI

Humans&, a startup founded by alumni of Anthropic, Meta, OpenAI, xAI, and Google DeepMind, is building the next generation of foundation models for collaboration, not chat. The company aims to create AI systems capable of working synergistically with...

2026-01-22 Wired AI

AI-Powered Disinformation Swarms Threaten Democracy

Advances in artificial intelligence are creating a perfect environment for the spread of disinformation on an unprecedented scale and speed. Experts warn that detecting these manipulative campaigns is becoming increasingly difficult, jeopardizing dem...

2026-01-22 Wired AI

How Claude Code Is Reshaping Software—and Anthropic

WIRED spoke with Boris Cherny, head of Claude Code, about how the viral coding tool is changing the way Anthropic works. The adoption of such tools could revolutionize the future of software development, making processes more efficient and accessible...

2026-01-22 The Register AI

Female-dominated careers among most exposed to AI disruption

A recent study by the Brookings Institution highlights how some professions with a high percentage of female workers are particularly vulnerable to the impact of artificial intelligence. Dentists, on the other hand, appear to be among the least expos...

2026-01-22 404 Media

Size Matters: Study on the Impact of Penis Size Among Rivals

A study reveals that male penis size influences both female attraction and the perception of threat among men. The findings suggest that, throughout evolution, penis size may have played a role in male competition, influencing access to partners. The...

2026-01-22 TechCrunch AI

Google now offers free SAT practice exams, powered by Gemini

Google now offers college-bound students a new free resource: practice SAT exams powered by Gemini's artificial intelligence. The initiative aims to make test preparation more accessible, leveraging the advanced capabilities of Google's language mode...

2026-01-22 Tech.eu

Mews raises $300M to accelerate AI-powered hospitality operations

Mews, a hospitality management software provider, has raised $300 million in a Series D funding round led by EQT Growth. The investment aims to enhance the use of artificial intelligence in the hospitality sector, automating processes and improving g...

2026-01-22 MIT Technology Review

ChatGPT Health: Can It Outperform "Dr. Google"?

OpenAI has launched ChatGPT Health, a version of its language model designed to provide medical advice. The initiative arrives at a sensitive time, with growing concerns about the accuracy and safety of health information generated by artificial inte...

2026-01-22 The Register AI

Palantir helps Ukraine train interceptor drone brains

Ukraine is getting a little AI help with its war against Russia. The country is giving Palantir a new level of access to critical warfighting data so its interceptor drones can become more autonomous.

2026-01-22 The Register AI

PowerShell architect retires after decades at the prompt

Jeffrey Snover, chief PowerShell boffin and hero of Windows administrators around the world, has retired from Microsoft after decades dedicated to automation. His career, spent between Microsoft and Google, has profoundly marked the IT world.

2026-01-22 Ars Technica AI

Google AI Mode: Customized Responses with Gmail and Photos

Google is enhancing AI Mode, its AI-powered search interface, with a new feature called "Personal Intelligence." This allows the system to customize responses by drawing on data from the user's Gmail and Google Photos. The feature is available to Goo...

2026-01-22 TechCrunch AI

Google reportedly snags up team behind AI voice startup Hume AI

Google has hired the CEO and top team behind voice AI startup Hume AI, signaling that voice is increasingly becoming the preferred interface over screens. The acquisition could lead to new advanced voice features in Google products.

2026-01-22 Phoronix

Intel Updates IPU Firmware for Panther Lake Laptops

Intel has released an updated IPU 7.5 (Image Processing Unit) firmware for its upcoming Core Ultra Series 3 Panther Lake laptops. The update addresses the image processing unit used by the web cameras on the higher-end models, improving performance a...

#Hardware
2026-01-22 LocalLLaMA

Qwen3 TTS: New Open-Source Text-to-Speech Model Released

Qwen3 TTS, a new open-source text-to-speech (TTS) model, has been released. The project is available on GitHub and Hugging Face, offering developers new options for speech synthesis. This tool promises to expand possibilities in the field of generati...

2026-01-22 Tom's Hardware

US Congress Seeks Veto Power Over AI Chip Exports to China

US lawmakers are considering the AI Overwatch Act, a bill that would grant Congress the power to veto exports of high-performance AI processors, made by companies like AMD and Nvidia, to China and other adversarial nations.

#Hardware
2026-01-22 The Register AI

Uncle Sam's VMware 'bargain' doesn't include the actual hypervisor

The US General Services Administration is touting discounts of up to 64 percent on Broadcom's VMware portfolio under a OneGov Agreement. However, the core vSphere platform, which is central to VMware, is mysteriously absent from the agreement. This r...

2026-01-22 TechCrunch AI

Spotify brings AI-powered Prompted Playlists to the U.S. and Canada

Spotify's AI-powered Prompted Playlists are now available in the US and Canada. Users can describe the music they want to hear using natural language commands, making playlist creation more intuitive. This feature enhances the music listening experie...

2026-01-22 The Register AI

Notepad will now tell you all the ways Microsoft has enshittified it

Microsoft is meddling with Notepad again, this time adding a "What's New" screen so users know the latest indignities heaped on the once-humble text editor. The company seems determined not to leave one of Windows' simplest and longest-lived applicat...

2026-01-22 LocalLLaMA

Qwen3-TTS: Open-Sourced Family of Models for Text-to-Speech

Qwen has open-sourced the full Qwen3-TTS model family, including VoiceDesign, CustomVoice, and Base. Five models are available in two sizes (0.6B & 1.8B), supporting ten languages. Code, pre-trained models, and demos are accessible via GitHub and Hug...

2026-01-22 LocalLLaMA

Qwen developer active on Twitter

A developer of the large language model (LLM) Qwen has been spotted on Twitter. The news was shared on Reddit, sparking discussions in the LocalLLaMA community. Qwen is a model developed by Alibaba, known for its capabilities and performance in vario...

2026-01-22 OpenAI Blog

Praktika's conversational approach to language learning

Praktika uses conversational AI to provide a tailored language learning experience. By leveraging advanced models like GPT-4.1 and GPT-5.2, the platform builds adaptive AI tutors that personalize lessons, track progress, and help learners achieve rea...

2026-01-22 LocalLLaMA

Hugging Face: the week's top trending models

Hugging Face has released several models that are gaining considerable traction. Highlights include GLM-4.7-Flash for fast text generation, GLM-Image for image editing, pocket-tts for speech synthesis, and VibeVoice-ASR for multilingual speech recogn...

2026-01-22 LocalLLaMA

Llama.cpp: CUDA fix for GLM 4.7 Flash Attention merged

A CUDA fix for GLM 4.7 Flash Attention has been integrated into Llama.cpp. The change, proposed via a pull request on GitHub, should improve performance and stability when using large language models (LLM) with CUDA acceleration. The integration is a...

#Hardware #LLM On-Premise
2026-01-22 Tom's Hardware

AMD ROCm: Radical Transformation for AI Development

AMD presented significant updates to ROCm, its software platform, at CES 2026. The company aims to break down barriers in the development of artificial intelligence applications, making ROCm an increasingly accessible and powerful tool for developers...

#Hardware
2026-01-22 TechCrunch AI

Sparkli: Interactive AI-Powered Learning App for Kids by Ex-Google Team

A team of former Google employees is developing Sparkli, an interactive application powered by generative artificial intelligence, designed to make learning more engaging for children. The app aims to overcome the limitations of current solutions, wh...

#Hardware
2026-01-22 AI News

Gates Foundation and OpenAI test AI in African healthcare

The Gates Foundation and OpenAI are collaborating to test the use of artificial intelligence (AI) in primary healthcare in Africa. The initiative, called Horizon1000, aims to introduce AI tools in 1,000 clinics in Rwanda and surrounding communities b...

2026-01-22 DigiTimes

AI PC battle heats up as Nvidia and MediaTek join forces

Nvidia and MediaTek are intensifying the competition in the AI-powered PC sector. The collaboration aims to integrate their respective expertise to offer advanced solutions, in a rapidly expanding and increasingly competitive market. The Digitimes ar...

#Hardware
2026-01-22 The Register AI

AI vibe coding: does automation increase security debt?

The integration of AI in software development brings efficiency, but security risks are emerging. An AI-coded honeypot revealed hidden vulnerabilities, raising concerns about the use of automated coding tools and the potential security debt they gene...

2026-01-22 LocalLLaMA

Qwen3 TTS Open Source Coming Soon via vLLM-Omni PR

A pull request on GitHub suggests an upcoming open-source release of Qwen3 TTS via the vLLM-Omni project. The news was shared on Reddit, generating interest in the open-source community for potential text-to-speech (TTS) applications.

#LLM On-Premise
2026-01-22 LocalLLaMA

Slow LLM Generation? Here's a Possible Cause

A Reddit user shared an image illustrating how processing can slow down text generation in large language models (LLMs). The visualization details the steps involved in the generation process, suggesting potential bottlenecks that contribute to the p...

2026-01-22 The Next Web

Digital Networks Act: EU aims to modernize networks for AI

The European Commission has proposed the Digital Networks Act (DNA) to modernize EU telecom networks. The goal is to support AI infrastructure, promote connectivity equity, and foster a more dynamic startup ecosystem. The law aims to modernize how ne...

2026-01-22 Tech.eu

Optalysys raises £23M to support photonic computing development

Optalysys, a Leeds-based photonic computing company, has raised £23 million in a Series A extension round. The funding will be used to accelerate the commercialization of its proprietary photonic chips and further develop its programmable computing t...

2026-01-22 LocalLLaMA

LLMs in Software Development: One Year In

An analysis of the use of large language models (LLMs) in software development, based on one year of professional experience. Chatbots are useful for exploring code and checking regressions. The largest open-source models compete with proprietary one...

#Hardware
2026-01-22 ArXiv cs.CL

LLMs for mental health: the risks of prolonged interactions

A new study warns about the risks of using large language models (LLMs) in mental health support. The research highlights how, in prolonged dialogues, LLMs tend to overstep safety boundaries, offering definitive guarantees or assuming inappropriate p...

2026-01-22 ArXiv cs.CL

Schema-Constrained AI for Biomedical Evidence Extraction from PDFs

A new AI system promises to transform scientific PDFs into structured, easily analyzable data. Using predefined schemas and controlled vocabularies, the system automates the extraction of key variables from complex documents, reducing time and improv...

2026-01-22 ArXiv cs.LG

GCG Attacks: Vulnerabilities in Diffusion Language Models?

A new study explores the effectiveness of Greedy Coordinate Gradient (GCG) attacks against diffusion language models, an emerging alternative to autoregressive models. The research focuses on LLaDA, an open-source model, analyzing different attack va...

#Fine-Tuning
2026-01-22 ArXiv cs.AI

The Ontological Neutrality Theorem: A New Impossibility Result

A new study on arXiv demonstrates that neutral ontologies, essential for modern data systems that must handle legal and political disagreements, cannot include causal or normative commitments at the foundational level. This finding imposes strict con...

2026-01-22 LocalLLaMA

World Labs: Fei-Fei Li's new 3D world model

Fei-Fei Li, a leading figure in the field of artificial intelligence, has launched a generative 3D world model called Marble with World Labs. Unlike traditional approaches, Marble uses Neural Radiance Fields (NeRF) and Gaussian splatting to create ex...

2026-01-22 DigiTimes

Taiwan-US industries: new wafer and green energy strategy

The chairman of SAS (Sino-American Silicon Products) outlined the challenges ahead for Taiwan and US industries. New strategies focusing on wafer production and green energy development were announced, aiming to strengthen bilateral cooperation a...

2026-01-22 The Register AI

eBay updates legalese to ban AI-powered shop-bots

eBay has updated its policies to ban the use of automated software agents, or shop-bots, powered by artificial intelligence. The decision aims to protect the user experience on the e-commerce platform.

2026-01-22 DigiTimes

Cloud ASIC shipments set to surge in 2026

According to a DIGITIMES report, the market for ASICs (Application-Specific Integrated Circuits) for the cloud is experiencing strong growth. Shipments are expected to surge starting in 2026. Memory capacity remains a critical factor and a potential ...

#Hardware
2026-01-22 LocalLLaMA

Kimi-Linear-48B: GGUF Support and llama.cpp Integration

The implementation of Kimi-Linear-48B in llama.cpp is being discussed online, given its effectiveness in handling long contexts. The community is wondering about the timeline for the model's integration, which promises significant performance improve...

#Hardware #LLM On-Premise
2026-01-22 DigiTimes

EMS watch: The year AI servers broke the EMS rankings

According to AFP, 2024 has been a year of upheaval in the EMS (Electronic Manufacturing Services) rankings due to the increasing importance of servers dedicated to artificial intelligence. These high-power systems have altered the balance of the indu...

2026-01-22 LocalLLaMA

Michigan: Bill Proposed to Limit Children's Access to Chatbots

Michigan Senate Democrats are proposing new safety measures to protect children from digital dangers, focusing on limiting access to chatbots. The bill is in its early stages and raises questions about implementation and age verification.

2026-01-22 DigiTimes

Taiwan PCB direct shipments to the US remain limited

According to DIGITIMES sources, direct shipments of printed circuit boards (PCBs) from Taiwan to the United States remain limited. This highlights a potential reorganization of global supply chains in the electronics sector, with implications for Tai...

2026-01-22 DigiTimes

Raana Semiconductors raises US$3 million for silicon ingot systems

Raana Semiconductors has announced a US$3 million seed funding round. The goal is to develop silicon ingot growth systems. The company aims to reduce reliance on imports in the semiconductor sector, which is crucial for electronics and computing.

2026-01-22 The Register AI

AI networking startup Upscale scores $200M to challenge Nvidia's NVSwitch

Upscale AI has raised $200 million in Series A funding to challenge Nvidia's dominance in the market for switches for rack-scale AI systems. The company plans to use the funds to develop its own SkyHammer silicon-based UALink switches, entering into ...

#Hardware #Fine-Tuning
2026-01-22 The Register AI

Future jobs in AI? Hardhats and boots, tech bigshots argue

AI leaders gathered in Davos for the World Economic Forum, sharing their predictions on AI's impact on jobs. While some fear job losses, others emphasize the growing need for specialized technical skills to support the expanding AI infrastructure, su...

#Hardware
2026-01-21 The Register AI

Davos discussion mulls how to keep AI agents from running wild

At Davos, the risks associated with artificial intelligence agents were at the center of a panel dedicated to cyber threats. In particular, they discussed how to secure these systems and prevent them from becoming an insider threat, exploiting vulner...

2026-01-21 TechCrunch AI

SGLang spins out as RadixArk with $400M valuation

The open-source project SGLang, originating from Ion Stoica’s UC Berkeley lab, is spinning out as RadixArk. The move, backed by funding from Accel, values the new entity at $400 million, amid rapid growth in the inference market.

2026-01-21 TechCrunch AI

Apple plans to make Siri an AI chatbot, report says

Reportedly, Apple is planning to evolve Siri, transforming it from a simple integrated assistant into a more sophisticated chatbot, similar to ChatGPT. This move would mark a significant shift in Apple's approach to artificial intelligence and user i...

2026-01-21 The Register AI

AI hasn't delivered the profits it was hyped for, says Deloitte

A Deloitte study reveals that, for most companies, adopting AI tools hasn't helped the bottom line at all. Despite this, researchers continue to praise the technology's potential, suggesting that the benefits may manifest in the future with a broader...

2026-01-21 LocalLLaMA

LLM Inference: 8 AMD MI50 GPUs for Performance and Affordability

A setup with eight 32GB AMD MI50 GPUs delivers notable performance in large language model (LLM) inference. It achieves 26 tokens per second with MiniMax-M2.1, and 15 tokens per second with GLM 4.7. The system, costing approximately $880 for the GPUs...

#Hardware #LLM On-Premise
2026-01-21 TechCrunch AI

NeurIPS: Hallucinated citations found in AI conference papers

The prestigious AI conference NeurIPS is facing a growing problem: the presence of "hallucinated" citations within scientific papers. Startup GPTZero has highlighted how, in the age of AI-generated content, even the most authoritative venues risk pub...

2026-01-21 PyTorch Blog

PyTorch 2.10: Optimizations and Numerical Debugging

The new PyTorch 2.10 release introduces significant improvements in performance and tools for numerical debugging. Key features include experimental support for Python 3.14, reduced latency thanks to combo-kernels, and new APIs for handling ragged se...

#Hardware
2026-01-21 LangChain Blog

Deep Agents: Building Multi-Agent Applications with Deep Agents

Deep Agents simplifies building complex AI systems through specialized agents. It introduces subagents for context isolation and skills for progressive capability disclosure. The article illustrates how to implement multi-agent systems, preserving co...

2026-01-21 LocalLLaMA

Lemonade v9.1.4: GLM-4.7-Flash-GGUF support and LM Studio compatibility

Lemonade v9.1.4, a local server for large language models (LLMs), has been released. New features include support for GLM-4.7-Flash-GGUF on ROCm and Vulkan, GGUF import from LM Studio, and improved support for various platforms, including Arch, Fedora...

#LLM On-Premise #DevOps
2026-01-21 LocalLLaMA

Fine-tuned Qwen3-14B on DeepSeek Traces: +20% Security Boost

A researcher fine-tuned the Qwen3-14B language model using 10,000 DeepSeek traces, achieving a 20% performance increase on a custom security benchmark. This demonstrates how fine-tuning smaller models with specific datasets can be a viable and more c...

2026-01-21 Phoronix

PyTorch 2.10 Released With More Improvements For AMD ROCm & Intel GPUs

PyTorch 2.10 is out today as the latest feature update to this widely-used deep learning library. The new PyTorch release continues improving support for Intel GPUs as well as for the AMD ROCm compute stack along with still driving more enhancements ...

#Hardware
2026-01-21 LocalLLaMA

Microsoft releases VibeVoice-ASR for speech recognition

Microsoft has released VibeVoice-ASR, a new model for Automatic Speech Recognition (ASR). The model is accessible via Hugging Face, opening new possibilities for developers working on voice applications. The release includes a link to the Hugging Fac...

2026-01-21 404 Media

Podcast: Here’s What Palantir Is Really Building

A new podcast analyzes ELITE, a tool Palantir is developing for ICE (Immigration and Customs Enforcement). It also discusses how AI influencers are creating fake sex tape-style photos with celebrities, and Comic-Con’s ban of AI art after artist pushb...

2026-01-21 Anthropic News

Claude's new constitution: what changes for AI?

Anthropic has introduced a new constitution for Claude, its flagship language model. This update aims to improve the model's alignment with human values and make it safer and more effective in its applications. The initiative represents a crucial ste...

2026-01-21 Tom's Hardware

Intel axes 12th Gen Alder Lake and 4th Gen Xeon Sapphire Rapids

Intel has announced the end-of-life (EOL) for its 12th Generation Alder Lake and 4th Generation Xeon Sapphire Rapids processors. Customers will have a limited time to place final orders for these hybrid CPUs, marking a significant shift in Intel's pr...

#Hardware
2026-01-21 The Register AI

OpenAI Reaches Out to Locals Near Stargate Facilities

OpenAI is trying to alleviate concerns about its new Stargate datacenters. The company promises plans that take into account local needs, minimizing the environmental impact and the impact on electricity costs. The initiative comes at a time of incre...

2026-01-21 LocalLLaMA

Z.ai's new model, GLM-OCR, spotted on GitHub

A new model named GLM-OCR from Z.ai has been spotted on GitHub. The finding was reported on Reddit, in the LocalLLaMA subreddit, via a post including an image and links to the discussion and the original resource. Further details on the model's capab...

2026-01-21 Tom's Hardware

Nvidia Dethrones Apple as TSMC’s Largest Customer

Nvidia CEO Jensen Huang confirmed that his company has overtaken Apple as TSMC's biggest customer, becoming its top client after more than 20 years. This shift underscores Nvidia's growing prominence in the semiconductor industry.

#Hardware
2026-01-21 LocalLLaMA

GLM-4.7-Flash-GGUF bug fix: redownload for better outputs

A bug in GLM-4.7-Flash-GGUF causing looping and poor outputs has been fixed. Users are advised to redownload the model for significantly improved results. Z.ai has suggested optimal parameters for various use cases, including general use and tool-cal...

#LLM On-Premise
2026-01-21 TechCrunch AI

OpenAI aims to ship its first device in 2026, and it could be earbuds

OpenAI is on track to announce its first hardware device, possibly earbuds, in 2026. OpenAI Chief Global Affairs Officer Chris Lehane said that the company plans to unveil its first hardware in the second half of this year. This move marks a signific...

#Hardware
2026-01-21 MIT Technology Review

AI to Boost Productivity by Augmenting, Not Replacing, Workers

A new study by Vanguard forecasts that artificial intelligence (AI) will have a significant impact on productivity, comparable to that of the personal computer. AI will augment human capabilities rather than completely replace them, leading to a transformation of wo...

2026-01-21 Ars Technica AI

Has Gemini surpassed ChatGPT? We put the AI models to the test

We compared the AI models from Google (Gemini 3.2 Fast) and OpenAI (ChatGPT 5.2) to evaluate their performance. The tests, based on complex prompts, aim to simulate the experience of typical users, that is, those who do not pay for subscriptions. The an...

2026-01-21 LocalLLaMA

GLM 4.7: How to Run with llama.cpp and Flash Attention

Here's how to get GLM 4.7 working on llama.cpp using Flash Attention for improved performance. The guide includes configuration details and a link to a specific Git branch. Note that quantizations may need to be recreated to avoid nonsensical outputs...

#Hardware #LLM On-Premise
2026-01-21 Tom's Hardware

Nvidia CEO Jensen Huang to visit China as H200 shipments loom

Nvidia CEO Jensen Huang is heading to China in late January for a customary Lunar New Year visit. The trip gains importance as it coincides with negotiations regarding the quantity of H200 GPUs Beijing will be allowed to import, amid U.S. export rest...

#Hardware
2026-01-21 Tom's Hardware

OpenAI soothes investors ahead of IPO: revenue scaling confirmed

OpenAI aims to reassure investors ahead of its potential initial public offering (IPO), demonstrating a clear correlation between computing power and revenue growth. The company continues to invest heavily in infrastructure, with expenditure currentl...

2026-01-21 Tom's Hardware

Microsoft: AI needs broad social impact or risks a bubble

Microsoft CEO Satya Nadella warns that artificial intelligence must generate benefits for a broad segment of the population, otherwise it risks losing social permission and turning into a speculative bubble. A wider impact is needed to prevent the be...

2026-01-21 TechCrunch AI

Zanskar thinks 1 TW of geothermal power is being overlooked

Zanskar has raised $115 million to find about a dozen geothermal resources throughout the U.S. West. The goal is to power the grid with clean energy, exploiting previously unexplored potential. The initiative is expected to significantly contribute t...

2026-01-21 IEEE Spectrum

Why AI Keeps Falling for Prompt Injection Attacks

Large language models (LLMs) continue to be vulnerable to prompt injection attacks, a technique that tricks AI into performing unauthorized actions. The difficulty lies in their inability to understand context as a human would, making them susceptibl...

2026-01-21 Tech.eu

Ukrainian-founded Preply hits $1.2B valuation with $150M Series D

Preply, a Ukrainian-founded language learning marketplace, has raised $150 million in Series D funding led by WestCap, valuing the company at $1.2 billion. Preply connects over 100,000 tutors with learners in 180 countries, offering one-on-one lesson...

2026-01-21 LocalLLaMA

Fix for GLM 4.7 Flash Merged into llama.cpp

A fix for an issue related to GLM 4.7 Flash has been merged into llama.cpp. In parallel, FA (Flash Attention) support for CUDA is under development, aiming to further improve performance and efficiency in using NVIDIA GPUs for language model inferenc...

#Hardware #LLM On-Premise
2026-01-21 LocalLLaMA

File Brain: Open-Source Local Semantic Search for Your Documents

File Brain is an open-source search engine that indexes local files and allows searching using natural language. It supports multilingual semantic search, built-in OCR, and is available for Windows and Linux. The goal is to overcome the limitations o...

2026-01-21 TechCrunch AI

Preply: Language learning marketplace achieves unicorn status

Language learning marketplace Preply is now valued at $1.2 billion after raising $150 million. This milestone marks a new chapter for the 14-year-old company and embodies the resilience of the Ukrainian tech sector, where Preply has its roots.

2026-01-21 Tom's Hardware

OpenAI commits to AI data centers with no impact on energy bills

OpenAI is committed to ensuring that electricity prices do not increase in the communities where it builds its Stargate data centers. The company will fund grid upgrades and flexible load management systems to reduce stress on the energy supply. The ...

2026-01-21 Tom's Hardware

Customer Buys RTX 5080, Receives Relabelled RTX 5060 Ti

An Amazon customer was scammed: instead of an RTX 5080 graphics card, they received a relabelled RTX 5060 Ti. The package was sold and shipped by Amazon, suggesting a possible return switcheroo. The deception was spotted due to the 8-pin power connec...

#Hardware
2026-01-21 Phoronix

Linux: One Line of Code Reduces Latency on Xeon CPUs by 5x

A Linux kernel patch aims to significantly reduce wake-up latency on modern Intel Xeon servers. The modification, involving a single line of code, aims to optimize performance in scenarios where responsiveness is critical, especially with NOHZ_FULL c...

#Hardware
2026-01-21 Source

Deep Dive on the new features of LLMOnPremise

This comparison matrix presents decision axes, trade-offs, and constraints solely for evaluation purposes. It does not constitute a recommendation, endorsement, or ranking of deployment models. Final decisions should be guided by your organization's ...

2026-01-21 AI News

Balancing AI cost efficiency with data sovereignty

AI cost efficiency clashes with data sovereignty, forcing companies to rethink their risk frameworks. The case of DeepSeek, a Chinese AI lab, raises concerns about data sharing with state intelligence services. This requires stricter governance, espe...

2026-01-21 AI News

Citi trains 4,000 employees to use AI

Citi has undertaken an internal initiative to integrate artificial intelligence into the daily work of its employees. Approximately 4,000 people, from various business sectors, have been trained to use approved AI tools. The goal is to improve effici...

2026-01-21 DigiTimes

Sequoia Capital signals potential investment shift to Anthropic

Venture capital firm Sequoia Capital is reportedly considering an investment in Anthropic, an artificial intelligence company. This move could signal a shift in the fund's investment strategy, with a greater focus on emerging technologies in the AI f...

2026-01-21 DigiTimes

AI demand boosts Unimicron and Nan Ya PCB profits in December 2025

Unimicron and Nan Ya anticipate increased profits for December 2025, driven by strong demand for PCBs (Printed Circuit Boards) fueled by the artificial intelligence sector. This increase underscores the crucial role of PCB manufacturers in the rapidl...

2026-01-21 OpenAI Blog

OpenAI launches "Edu for Countries" to modernize education with AI

OpenAI introduces "Edu for Countries", a new initiative designed to support governments in adopting artificial intelligence. The goal is to modernize education systems and prepare the workforce of the future, providing tools and resources to integrat...

2026-01-21 OpenAI Blog

Adoption of advanced AI: new initiatives to increase productivity

A new report highlights the stark differences in the adoption of advanced artificial intelligence across countries. New strategic initiatives are outlined to help nations maximize the productivity gains from AI. The goal is to bridge the existing gap...

2026-01-21 The Next Web

When Corporate Knowledge Becomes an Obstacle

An article explores how corporate knowledge, if poorly structured and rigidly transferred, can transform from an asset into a disadvantage, both for companies and employees. The onboarding process is crucial: inadequate information management can com...

2026-01-21 Tech.eu

Fracttal raises $35M to expand AI-driven maintenance

Fracttal, a Madrid-based company specializing in AI-powered maintenance solutions, has closed a $35 million funding round led by Riverwood Capital. The investment will support the company's continued growth, product development, and global expansion....

#Hardware
2026-01-21 Tech.eu

Antidote completes $5M seed round for billing compliance automation

Antidote, a provider of AI-based billing compliance software for law firms, has raised $5 million in a seed funding round. The funding will support the advancement of its platform and expand its presence in the US, aiming to reduce billing errors and...

2026-01-21 DigiTimes

Twin rocket failures expose risks in China's space race

Two recent rocket launch failures in China, highlighted by Galactic Energy, underscore the risks and challenges inherent in the country's growing space ambitions. These incidents raise questions about the reliability of Chinese launch technologies an...

2026-01-21 DigiTimes

Weekly research roundup: EV market splits and edge AI foundry war

The electric vehicle market is showing signs of division, while competition in the edge AI sector is intensifying. New analysis reveals emerging trends and the challenges companies face to succeed in these rapidly evolving sectors. Insights into winn...

2026-01-21 LocalLLaMA

Building an LM from Scratch: Day 6 Update

An enthusiast shares progress on building a language model (LM) from scratch. After stabilizing the system, the focus shifted to training, revealing the need for a significantly higher number of steps to achieve optimal results. Despite initial chall...

#Hardware
2026-01-21 DigiTimes

US tariff easing slows supply chain shift to Southeast Asia

According to Digitimes sources, the easing of US tariffs is slowing down the shift of supply chains towards Southeast Asia. This unexpected change is significantly impacting the strategies of companies that aimed to diversify their production to avoi...

2026-01-21 DigiTimes

China Highlights AI Large Models in Strategic Development Talks

A recent statement by the Chinese Premier emphasized the importance of large AI models (LLMs) in the country's strategic development. This move underscores China's commitment to technological innovation and its ambition to compete globally in the ...

#Fine-Tuning
2026-01-21 OpenAI Blog

Horizon 1000: OpenAI and Gates Foundation Advance AI in Africa

OpenAI and the Gates Foundation launch Horizon 1000, a $50M pilot program to advance AI capabilities for healthcare in Africa. The initiative aims to reach 1,000 clinics by 2028, bringing innovation and improving access to medical care.

2026-01-21 ArXiv cs.CL

LLM: Does Excessive KV Memory Penalize Performance and Quality?

New research analyzes the trade-off between performance and quality of Large Language Models (LLMs) when exposed to large and distracting contexts. The study highlights a non-linear performance degradation linked to the growth of the Key-Value (KV) c...
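
As a rough illustration of why long contexts inflate the KV cache, the sketch below estimates cache size for a standard attention layout. It is not taken from the paper: the function name and the model dimensions (32 layers, 8 KV heads, head dimension 128, 16-bit values) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch_size=1):
    """Estimate KV cache size in bytes for a standard attention layout.

    The leading factor of 2 accounts for storing one key tensor and one
    value tensor per layer. All dimensions passed in are assumptions.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * bytes_per_value * batch_size)


# Hypothetical model configuration, not any specific released model.
for tokens in (8_192, 32_768, 131_072):
    size_gib = kv_cache_bytes(32, 8, 128, tokens) / 2**30
    print(f"{tokens:>7} tokens -> {size_gib:.1f} GiB")
```

Under these assumptions the cache grows linearly with context length (128 KiB per token), so memory alone does not explain a non-linear degradation; the sketch only shows how quickly the footprint adds up.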

2026-01-21 ArXiv cs.LG

AdaFRUGAL: Adaptive Memory-Efficient Training with Dynamic Control

A new framework, AdaFRUGAL, promises to drastically reduce memory consumption and training times for large language models (LLMs). Through dynamic controls that automate hyperparameter management, AdaFRUGAL offers a more practical and autonomous appr...

#Hardware #Fine-Tuning
2026-01-21 ArXiv cs.LG

CSyMR: Benchmarking Compositional Symbolic Music Reasoning With LLMs

A new benchmark, CSyMR-Bench, evaluates the compositional symbolic music reasoning capabilities of large language models (LLMs). The dataset, comprising multiple-choice questions derived from expert forums and professional examinations, requires the ...

2026-01-21 ArXiv cs.AI

Rare disease diagnosis: Is AI really up to the task?

A new study challenges the effectiveness of large language models (LLMs) in the differential diagnosis of rare diseases. The MIMIC-RD benchmark reveals that current LLMs struggle to handle real-world clinical complexity, highlighting a significant ga...

#Fine-Tuning
2026-01-21 LocalLLaMA

Alert on LocalLLaMA: Possible Attacks via Suspicious Repositories

A Reddit user raises the alarm about the proliferation of suspicious repositories in the LocalLLaMA subreddit. The linked GitHub profiles appear to be created ad hoc and the posts generated with artificial intelligence tools. Caution is recommended w...

2026-01-21 LocalLLaMA

Camb AI: New Model with Minimal Latency for Live Sports?

A user reported the launch of a new Camb AI model, particularly effective in live sports broadcasts. The most notable aspect is its low latency and high voice quality, making it indistinguishable from human speech. The technology raises questions abo...

2026-01-21 DigiTimes

Intel recruits Qualcomm GPU chief to lead future AI PC efforts

Intel has recruited a former Qualcomm GPU executive to lead its future AI PC efforts. This strategic move aims to strengthen Intel's position in the rapidly growing market for PCs equipped with advanced AI capabilities, leveraging the new leader's ex...

#Hardware
2026-01-21 DigiTimes

Asia Optical bets on humanoid robots as its next growth engine

Asia Optical chairman I-Jen Lai sees humanoid robots as the company's next growth engine. The company is investing in this emerging sector, betting on the long-term potential of advanced robotics. Increased demand is expected in the coming years, wit...

2026-01-21 LocalLLaMA

vLLM releases version 0.14.0: optimizing LLMs

vLLM, a framework designed to optimize inference for large language models (LLMs), has released version 0.14.0. The new version promises improvements in performance and efficiency, making these models easier to deploy and use.

#LLM On-Premise
2026-01-21 DigiTimes

China's AI industry reshapes as GPUs rise to be core strategic asset

China's artificial intelligence sector is undergoing a profound transformation, with GPUs taking on an increasingly central role as strategic assets. This shift is driven by the growing demand for computing power to train increasingly complex AI mode...

#Hardware
2026-01-21 DigiTimes

Nvidia unveils Alpamayo platform for L4 self-driving

Nvidia has announced Alpamayo, a new platform designed for the development of Level 4 self-driving vehicles. The platform aims to provide car manufacturers and technology suppliers with the tools necessary to accelerate the realization of fully auton...

#Hardware
2026-01-21 DigiTimes

Nvidia challenges Apple's longtime TSMC priority

Nvidia aims to displace Apple as TSMC's priority customer. The competition to secure TSMC's manufacturing capabilities is intensifying, with significant implications for the future of hardware.

#Hardware
2026-01-21 TechCrunch AI

Bolna nabs $6.3M for its India-focused voice orchestration platform

Bolna, specializing in voice orchestration platforms focused on the Indian market, has raised $6.3 million in funding led by General Catalyst. The company stated that 75% of its revenue comes from self-service customers, highlighting strong adoption ...

2026-01-21 DigiTimes

Taiwan: Major Drone Order and IC Design Investment Boost

Taiwan's Ministry of National Defense has announced a major drone procurement order, simultaneously increasing investments in domestic integrated circuit (IC) design. The strategic move aims to bolster the island's defense capabilities and promote te...

2026-01-21 The Register AI

OpenAI: Age Prediction Model for ChatGPT Users

OpenAI has begun deploying an age prediction model for its ChatGPT users. The goal is to filter access to sensitive or potentially harmful content for underage users. This initiative could unlock new monetization opportunities by restricting access b...

2026-01-21 DigiTimes

100% Tariff Threat Puts Taiwan’s Memory Makers on Notice

A 100% tariff threat puts Taiwan's memory makers on notice. This protectionist move could have significant repercussions on the local industry, altering the balance of the global market and prompting companies to revise their production and distribut...

2026-01-20 DigiTimes

Quanta EVP Mike Yang: AI industry paradigm shift just begun

Mike Yang, executive vice president of Quanta Computer, foresees a paradigm shift in the artificial intelligence sector. According to Yang, this is just the beginning of a profound transformation, with significant implications for the future of techn...

2026-01-20 DigiTimes

Taiwan rolls out tiered electricity rates for data centers

Taiwan is introducing tiered electricity rates for data centers in response to the growing energy consumption linked to artificial intelligence. The move aims to incentivize energy efficiency and better manage electricity demand, amid a potential ene...

2026-01-20 DigiTimes

Taiwan's industrial park model goes global

Taiwan's industrial park model, known for its efficiency and ability to foster innovation, is expanding internationally. This approach, which integrates advanced infrastructure and government support, aims to create ecosystems conducive to the growth...

2026-01-20 DigiTimes

Sony and TCL move toward joint venture in home entertainment

Sony and TCL are reportedly considering a joint venture in the home entertainment sector. The potential agreement could lead to increased collaboration in the development and production of televisions and other home entertainment devices. Further det...

2026-01-20 DigiTimes

Inventec doubles 2026 capex to US$1 billion for AI servers

Inventec has announced a doubling of its planned capital expenditure for 2026, bringing it to US$1 billion. The decision is driven by growing opportunities in the artificial intelligence (AI) server market. The company aims to strengthen its position...

#Hardware
2026-01-20 LocalLLaMA

GLM-4.7-Flash implementation in llama.cpp: issues confirmed

Recent discussions suggest that the GLM-4.7-Flash implementation in llama.cpp has issues. Significant differences in logprobs compared to vLLM could explain anomalous behaviors reported by users, such as infinite loops and poor response quality. It i...
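
A comparison of this kind can be sketched as follows, assuming per-token logprobs for the same prompt and the same token sequence have already been exported from each backend. The function name and the sample values are hypothetical; the code calls no llama.cpp or vLLM API and does not reproduce the numbers discussed in the thread.

```python
def logprob_gap(reference, candidate):
    """Summarize per-token logprob differences between two backends.

    Both lists must hold logprobs for the same tokens in the same order.
    """
    if len(reference) != len(candidate):
        raise ValueError("logprob lists must cover the same tokens")
    if not reference:
        raise ValueError("logprob lists must not be empty")
    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    return {"mean_abs": sum(diffs) / len(diffs), "max_abs": max(diffs)}


# Made-up values for illustration only.
backend_a = [-0.12, -1.85, -0.40, -2.31]
backend_b = [-0.15, -1.80, -1.95, -2.29]
print(logprob_gap(backend_a, backend_b))
```

A small mean gap with one large outlier, as in the made-up data above, would point to a divergence at specific tokens rather than uniform numerical noise.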

#LLM On-Premise
2026-01-20 TechCrunch AI

ChatGPT: age estimation to protect young users

OpenAI introduces a new feature in ChatGPT: the model now estimates the age of users. The goal is to prevent the delivery of potentially problematic content to individuals under 18, strengthening safety measures for young people.

2026-01-20 TechCrunch AI

Tesla restarts Dojo3 project for space-based AI applications

Elon Musk announced that Tesla will restart the development of Dojo3, its previously abandoned third-generation AI chip. Unlike the original plans, Dojo3 will now be dedicated to space-based AI compute, opening new frontiers for Tesla's space applica...

2026-01-20 LocalLLaMA

Giga Potato:free, an LLM Challenging Top Performers?

A user discovered a free language model named Giga Potato:free on Kilo Code, and was impressed by its performance. According to initial tests, the model rivals Sonnet 4.5 and Opus 4.5, handling complex prompts with surprising results. Its origin rema...

2026-01-20 The Next Web

Von der Leyen launches "Europe Inc.": a shift for the EU?

At the World Economic Forum in Davos, Ursula von der Leyen outlined a potential shift in European economic policy. The phrase "Europe Inc.", while not a law, represents a strong political signal: the European Commission intends to accelerate a struct...

2026-01-20 The Register AI

Mozilla starts offering RPMs of Firefox Nightly

Mozilla is now offering native RPM packages of Firefox Nightly for Linux distributions in the Red Hat and SUSE families. This provides users with more installation options to try out the latest features of the open-source browser.

2026-01-20 OpenAI Blog

Cisco and OpenAI: AI agents for enterprise engineering

Cisco and OpenAI are collaborating to redefine enterprise engineering. The focus is Codex, an AI software agent embedded in workflows to speed up development, automate defect fixes, and enable AI-native development.

2026-01-20 OpenAI Blog

ChatGPT: Age Prediction Rollout for Enhanced Online Safety

OpenAI is rolling out age estimation on ChatGPT to protect younger users. The system assesses whether an account belongs to a minor or an adult, applying specific safeguards for teenagers. The company plans to progressively improve the model's accura...

2026-01-20 The Register AI

VoidLink: Linux malware targeting the cloud, written by an AI agent

A new Linux malware, named VoidLink, has been discovered targeting cloud infrastructures. What makes it special? According to researchers, it was developed almost entirely by an artificial intelligence agent, likely directed by a single individual. VoidLink u...

2026-01-20 Phoronix

AMD Making It Easier To Install vLLM For ROCm

AMD has introduced a simpler method for installing vLLM on Radeon/Instinct hardware via ROCm. A new Python wheel facilitates installation without Docker, improving the experience for developers using AMD GPUs for large language model (LLM) inference.

#Hardware #LLM On-Premise #DevOps
2026-01-20 LocalLLaMA

New LongPage Dataset: Over 6K Novels to Train Full Book Writing LLMs

An update to the LongPage dataset has been released, now including over 6,000 full-length novels paired with reasoning traces. These traces break down the story into hierarchical sections, from the general idea to individual chapters and scenes. The ...

#Fine-Tuning
2026-01-20 Tom's Hardware

Noveon Magnetics Raises $215M to Expand Rare Earth Magnet Production

Texas-based Noveon Magnetics has raised $215 million to expand its U.S. operations. The investment aims to improve American access to rare-earth magnets, essential for HDD production and crucial for reducing reliance on China. An estimated $630 milli...

2026-01-20 LocalLLaMA

Liquid AI released the best thinking Language Model Under 1GB

Liquid AI released LFM2.5-1.2B-Thinking, a reasoning model that runs entirely on-device. Trained specifically for concise reasoning, it generates internal thinking traces before producing answers, enabling systematic problem-solving at edge-scale lat...

2026-01-20 The Register AI

AI PCs for the Enterprise: Does TOPS Trump Everything Else?

Artificial intelligence is becoming ubiquitous in the enterprise technology world. But are AI PCs really that widespread? An analysis of the role of computing power (TOPS) in the adoption of AI PCs in the enterprise and whether this parameter is the ...

2026-01-20 LocalLLaMA

GLM-4.7-Flash: impressive benchmarks on H200 and RTX 6000 Ada

The GLM-4.7-Flash model demonstrates remarkable performance in new benchmarks. On a single H200 GPU, it achieves a peak throughput of 4,398 tokens per second. Using an RTX 6000 Ada, the model generates 112 tokens per second utilizing Unsloth dynamic ...

#Hardware #LLM On-Premise
2026-01-20 MIT Technology Review

The era of agentic chaos and how data will save us

The adoption of AI agents is growing rapidly, but many companies are not ready. A solid data infrastructure is essential to avoid chaos and maximize the value of AI. Market leaders invest in quality data to ensure agent reliability and achieve concre...

2026-01-20 404 Media

FAA: Drone No Fly Zone Near DHS Agents and Facilities

The Federal Aviation Administration (FAA) has established a drone no-fly zone within 3,000 feet of Department of Homeland Security (DHS) facilities and mobile assets. The measure, which replaces a previous ban limited to military bases and Department...

2026-01-20 The Register AI

Majority of CEOs report zero payoff from AI splurge

A PwC survey of over 4,500 business leaders reveals that more than half have seen neither increased revenue nor decreased costs following massive investments in AI. The findings raise questions about the actual economic return of these technologies.

2026-01-20 Phoronix

Linux 7.0: Intel GPU Firmware Updates on Non-x86 Systems Ready

Support for updating Intel discrete GPU firmware on non-x86 systems is coming with Linux 7.0. The necessary patches are ready for integration into the upcoming Linux 6.20~7.0 kernel cycle, expanding hardware compatibility and simplifying graphics dri...

#Hardware
2026-01-20 The Register AI

AI framework flaws put enterprise clouds at risk of takeover

Two vulnerabilities in the popular open-source AI framework Chainlit put major enterprises' cloud environments at risk. According to Zafran, the flaws are easy to exploit and could lead to data leaks or full system takeover. It is recommended to upda...

2026-01-20 LocalLLaMA

DeepSeek: a new model appears, codenamed "model1"

A DeepSeek repository has been updated with a reference to a new model identified as "model1". The discovery was made via a file within DeepSeek's FlashMLA repository on GitHub. Further details on the model's specifications or capabilities are curren...

2026-01-20 TechCrunch AI

Emergent: Indian vibe-coding startup raises $70M

Indian startup Emergent, specializing in "vibe-coding", has announced a $70 million funding round, reaching a valuation of $300 million. Investors include SoftBank and Khosla Ventures. The company aims to achieve an annual recurring revenue (ARR) of ...

2026-01-20 Tech.eu

UK to reimburse visa fees for overseas tech talents

The UK government has announced a package of measures to attract talent in the tech sector, offering visa fee reimbursement to key figures working at promising UK startups. The initiative aims to position Britain as a haven of stability and innovatio...

2026-01-20 The Register AI

Windows 11, not AI, kick-started the PC upgrade cycle

In 2025, corporate IT hardware upgrades were driven by the necessity to maintain support, rather than excitement for new AI-related features. IT departments refreshed systems to keep up with compatibility requirements, demonstrating that the urgency ...

#Hardware
2026-01-20 LocalLLaMA

LocalLLaMA: The unstoppable rise of local language models

A Reddit post highlights the surprising capabilities of language models running locally with LocalLLaMA. The discussion emphasizes how these models, while running on consumer hardware, demonstrate a context understanding and responsiveness that often...

#Hardware
2026-01-20 LocalLLaMA

GLM-4.7-Flash: an LLM with a clear thinking process

A user tested GLM-4.7-Flash and noted a very clear thinking process, divided into distinct phases such as request analysis, brainstorming, drafting, and response revision. Despite the longer process duration, the final result is considered high quali...

#Fine-Tuning
2026-01-20 Tom's Hardware

Micron acquires PSMC fab site in Taiwan for $1.8 billion

Micron Technology has announced the acquisition of a production site from PSMC (Powerchip Semiconductor Manufacturing Corp.) in Taiwan for $1.8 billion. The move aims to expand Micron's production capabilities in the region. The deal marks a shift in...

2026-01-20 The Register AI

Windows 95: the (Weird) Trick for Faster Restarts

A Microsoft veteran reveals an unexpected method to speed up Windows 95 restarts: holding down the Shift key. This simple action apparently bypassed certain processes, reducing waiting times. An anecdote that brings to light the peculiarities of oper...

2026-01-20 LocalLLaMA

GLM-4.7-Flash: Z.ai's model for local inference

Z.ai has introduced GLM-4.7-Flash, a 30B MoE model designed for local inference. Optimized for coding, agentic workflows, and chat, the model boasts high performance with only 3.6B active parameters and supports a 200K token context. GLM-4.7-Flash ex...

#Fine-Tuning
2026-01-20 The Next Web

Odoo tops €7 billion valuation as General Atlantic increases stake

Belgian business software company Odoo has reached a fresh milestone, exceeding €7 billion valuation. Growth investor General Atlantic has increased its stake in the firm, buying additional shares from Wallonie Entreprendre. This move isn’t a typical...

2026-01-20 Phoronix

DragonFlyBSD Now Allows Optional AMD GCN 1.1 Support In AMDGPU Driver

DragonFlyBSD's AMDGPU kernel graphics driver continues to be a port of the AMDGPU Linux kernel driver. Their latest porting effort for AMD graphics on DragonFlyBSD is now enabling optional support for the GCN 1.1 "Sea Islands" (CIK) graphics processor...

#Hardware
2026-01-20 Tom's Hardware

Tesla restarts Dojo supercomputer project with in-house AI5 chip

Elon Musk has confirmed the restart of the Dojo supercomputer project. This renewed focus is due to advancements in the design of the AI5 chip, entirely developed in-house. Dojo will be the first Tesla supercomputer to feature all in-house hardware, ...

#Hardware
2026-01-20 The Register AI

Manchester ATM ups PIN requirement to full Windows login

An ATM in Manchester has been spotted displaying a Windows 7 login screen, an operating system no longer supported by Microsoft. The image raises concerns about customer data security and the vulnerability of outdated banking systems. The incident hi...

2026-01-20 The Register AI

AI crashes in UK finance: MPs ask who is responsible

UK MPs are urging financial regulators to conduct thorough stress tests. The goal is to prepare businesses for market shocks triggered by artificial intelligence. The crucial issue of assigning responsibility for automated decisions remains unresolve...

2026-01-20 Tom's Hardware

NYSE exploring 24/7 tokenized stock and ETF exchange using blockchain

The New York Stock Exchange (NYSE) is considering launching a 24/7 exchange for tokenized stocks and ETFs, leveraging blockchain technology to modernize trading. This initiative aims to enable continuous operation, potentially revolutionizing how sec...

2026-01-20 The Register AI

UK's Department of Health Seeks Tech Director: £285k Salary

England's Department of Health and Social Care is recruiting a head of technology, digital and data with a salary of up to £285,000 a year, exceeding the salary of the department's boss. The role is pivotal in driving technological innovation...

2026-01-20 Tech.eu

EIT Food: Bridging Foodtech Startups and European Retail

EIT Food's Straight2Market program facilitates the entry of agrifood startups into the European market by directly connecting them with major retailers. The initiative offers financial support, market testing, and experimentation opportunities for in...

2026-01-20 DigiTimes

AI servers: Taiwan supply chain broadly lifted in 2025

Demand for AI servers broadly lifted Taiwan's supply chain in 2025. The primary beneficiaries were ODM/EMS manufacturers, cooling system specialists, and optical component suppliers. The growth of the AI server market cont...

#Hardware
2026-01-20 Tech.eu

French accounting software platform Pennylane raises $200M

French accounting software platform Pennylane has raised $200m in a funding round led by TCV, with participation from Blackstone Growth and existing investors including Sequoia and CapitalG. Despite not having an immediate need for funds, the company...

2026-01-20 Tech.eu

Orbem raises €55.5M Series B to scale AI-powered MRI technology

Munich-based Orbem, a deeptech company applying artificial intelligence to magnetic resonance imaging, has closed a €55.5 million Series B financing round. The company uses AI to industrialise magnetic resonance imaging, with applications in agricult...

2026-01-20 Tech.eu

British Business Bank invests £25M in Kraken Technologies

The UK government-backed British Business Bank (BBB) is taking a £25m stake in Kraken Technologies, the software entity being spun out of Octopus Energy, marking the bank's biggest ever direct investment into a private firm. The move follows Octopus ...

2026-01-20 DigiTimes

Strategies and deployment of edge AI chip startups

An analysis of the strategies and deployments of startups specializing in chips for artificial intelligence in edge computing. The report examines the challenges and opportunities these companies face in a rapidly evolving market, with a focus on tec...

#Hardware
2026-01-20 The Next Web

Tech events: industry leaders now prefer more targeted meetings

Tech events used to focus on quantity. More attendees meant greater success. But this model is outdated. Today, industry leaders are looking for smaller, more targeted events, where the quality of interactions is higher than the mere size of the crow...

2026-01-20 Tech.eu

Stilla emerges from stealth with $5M to boost AI collaboration

Stockholm-based Stilla has raised $5 million to develop a platform that enhances collaboration between people and AI systems. The goal is to provide an intelligence layer that connects workplace tools like Slack, GitHub, and Notion, ensuring teams st...

2026-01-20 LocalLLaMA

Deepseek-R1: One Year Since the Release of the LLM

It has been a year since the release of Deepseek-R1, a language model that has garnered interest in the community. The news was shared via a Reddit post, marking the anniversary of the release and inviting further discussion about the model and its a...

2026-01-20 LocalLLaMA

GLM 4.7 Flash GGUF Released by Bartowski

Bartowski has released GLM 4.7 Flash GGUF, a new version of the language model. The files are available on Hugging Face. The LocalLLaMA community is actively discussing the implications and potential of this new release. The initiative aims to improv...

2026-01-20 DigiTimes

Taiwan IC designers wager on algorithms for the next boom

Taiwan's integrated circuit (IC) designers are investing in new algorithms to fuel the next wave of industry growth. The goal is to improve efficiency and innovation in chip design, in an increasingly competitive and rapidly evolving market.

2026-01-20 DigiTimes

Nvidia Alpamayo sparks VLA computing power race

Nvidia unveiled Alpamayo, an open-source vision-language-action (VLA) model series, signaling a new phase in autonomous driving technologies. The launch has intensified competition among global automakers, now ramping up investment to secure computin...

#Hardware
2026-01-20 DigiTimes

Alibaba's Qwen expansion links AI directly to consumer services

Alibaba is expanding the integration of its Qwen artificial intelligence model directly into consumer-facing services. This strategic move aims to enhance user experience and offer advanced AI-powered features across various domains, solidifying Alib...

2026-01-20 DigiTimes

China's rare earth prices rise for 6th consecutive quarter

Rare earth prices in China continue to rise, marking the sixth consecutive quarter of increases. This upward trend is bringing inflationary pressures back to the global supply chain, with potential impacts on various industrial sectors that rely on t...

2026-01-20 DigiTimes

Nvidia targets 2026 launch for Windows on Arm notebooks

Nvidia is reportedly planning to enter the Windows on Arm notebook market starting in 2026. This strategic move could lead to increased competition in the laptop processor sector, currently dominated by Qualcomm and other manufacturers. Nvidia's init...

#Hardware
2026-01-20 The Register AI

Micron Acquires $1.8bn DRAM Chip Plant in Taiwan

Micron has announced the acquisition of a DRAM chip manufacturing campus from Powerchip Semiconductor Manufacturing Corporation (PSMC) in Taiwan for $1.8 billion. This acquisition will allow Micron to quickly increase its DRAM manufacturing capacity....

2026-01-20 LocalLLaMA

Unsloth Releases GLM-4.7-Flash in GGUF Format

Unsloth has released the GLM-4.7-Flash language model in GGUF (GPT-Generated Unified Format). This format facilitates the use of the model on various hardware platforms, making it accessible to a wider audience of developers and researchers intereste...

#Hardware
2026-01-20 LocalLLaMA

GLM-4.7-Flash-GGUF is here!

A new GGUF build of GLM-4.7-Flash, a large language model (LLM) designed for local inference, has been released. This implementation, available on Hugging Face, allows users to run the model directly on their devices, opening new possibilities for o...

#Hardware
2026-01-20 OpenAI Blog

AI for self empowerment: new growth opportunities

Artificial intelligence can expand human capabilities, bridging the skills gap and unlocking new opportunities for productivity and growth for individuals, businesses, and nations. An analysis of how AI can foster self-empowerment and development.

2026-01-19 LocalLLaMA

GLM 4.7 Flash: Official Support Merged into llama.cpp

Official support for GLM 4.7 Flash has been merged into llama.cpp. This integration, reported on Reddit, allows developers to leverage the capabilities of GLM 4.7 Flash within the llama.cpp environment, opening up new possibilities for inference and ...

#Hardware #LLM On-Premise
2026-01-19 LocalLLaMA

GLM 4.7 Flash: A Reliable LLM Agent for Lower-End GPUs?

A user reports excellent performance of GLM 4.7 Flash as an LLM agent, even on systems with lower-end GPUs. The model appears to handle complex tasks such as cloning GitHub repositories and editing files without errors, opening new possibilities for ...

#Hardware
2026-01-19 Phoronix

Valve: Power Management Improvements for AMD GCN 1.0 GPUs

A Valve contractor has significantly improved the AMDGPU driver for older GCN 1.0 and GCN 1.1 GPUs. With Linux 6.19, AMDGPU is now the default for these GPUs, offering better performance and RADV Vulkan support. New patches focus on optimizing power ...

#Hardware
2026-01-19 LocalLLaMA

LightOn OCR: New Open Source Model for Optical Character Recognition

LightOn AI has released LightOnOCR-2-1B, an open-source Optical Character Recognition (OCR) model. The model is available on Hugging Face and aims to provide an accessible solution for extracting text from images. Its release has been welcomed by the...

2026-01-19 LocalLLaMA

GLM-4.7-FLASH: Mixed Precision NVFP4 Version Available on Hugging Face

A mixed precision NVFP4 quantized version of GLM-4.7-FLASH has been published on Hugging Face. The author encourages the community to test the model and provide feedback. The model has a size of 20.5 GB and aims to optimize performance while maintain...

#Hardware
2026-01-19 LocalLLaMA

Gemma 3:1b: What are the main uses of small models?

A user wonders about the possible uses of small language models like Gemma 3:1b. These models, while running on less powerful hardware, open up interesting scenarios. It remains to be seen whether they are suitable for basic tasks or simple calculati...

#Hardware
2026-01-19 Ars Technica AI

Musk seeks $134B from OpenAI, accused of 'making up math'

Elon Musk is suing OpenAI, seeking damages between $79 billion and $134 billion. Musk accuses OpenAI of abandoning its nonprofit mission and "making a fool out of him" as an early investor. The amount is based on an expert's estimate that Musk's earl...

2026-01-19 TechCrunch AI

US AI startups raise record funding in 2025

2024 was a pivotal year for the AI industry in the US and beyond. It remains to be seen whether 2025 will be equally positive. Analysis reveals that numerous AI startups have raised over $100 million in funding, marking an unprecedented wave of inves...

2026-01-19 LocalLLaMA

Nvidia GB10 vs GH200: early performance benchmarks

Early benchmarks comparing the performance of Nvidia's GB10 GPU with the GH200 have surfaced online. The data, originating from a Reddit source, offers a preview of the potential of Nvidia's new architecture, although they should be taken with cautio...

#Hardware
2026-01-19 LocalLLaMA

llama.cpp adopts Anthropic Messages API

The llama.cpp library has integrated Anthropic's Messages API, opening new possibilities for interacting with language models. This integration, announced on Reddit and Hugging Face, allows developers to leverage the capabilities of llama.cpp for adv...

#LLM On-Premise
2026-01-19 LocalLLaMA

Z-AI (GLM): Devs Woke Up And Chose Violence

Z-AI (GLM) developers have reportedly adopted an 'aggressive' development strategy. A Reddit post highlights this choice, suggesting direct competition with other teams, particularly those at Qwen. The online discussion focuses on the implications of...

2026-01-19 Tom's Hardware

Eric Demers leaves for Intel after 14 years at Qualcomm

Eric Demers, a key figure in the development of Radeon and Adreno GPUs, leaves Qualcomm after 14 years to join Intel. This move represents a significant reinforcement for Intel's team, led by Lip-Bu Tan, in the dedicated graphics card sector.

#Hardware
2026-01-19 LocalLLaMA

GLM-4.7-Flash: a 30B model that impresses on BrowseComp

A Reddit post highlights the performance of the 30B-parameter GLM-4.7-Flash model on the BrowseComp benchmark, suggesting that Qwen may need to catch up. The comparison also includes GPT-OSS-20B. The model is available on Hugging Face.

2026-01-19 LocalLLaMA

GLM 4.7 Flash Released: Massive Benchmark Gains?

GLM 4.7 Flash has been released. The open-source community is questioning the potential performance gains compared to Qwen 30b, with a focus on benchmarks. Currently, there is no objective data to support this.

#Fine-Tuning
2026-01-19 LocalLLaMA

Ghost Engine: Run Llama-3-8B in 3GB VRAM by Generating Weights

A new inference engine, called Ghost Engine, promises to drastically reduce memory consumption when running large language models (LLMs). Instead of loading static weights, Ghost Engine generates them on the fly, trading memory bandwidth for compute....

2026-01-19 Tech.eu

Isle of Man launches National AI Office with £1M investment

The Isle of Man Government has launched its National AI Office (NAIO), backed by a £1 million investment. The aim is to coordinate the responsible adoption of artificial intelligence across the island, supporting businesses and the public sector. The...

2026-01-19 LocalLLaMA

GLM-4.7-Flash: New Open-Source Language Model on Hugging Face

The GLM-4.7-Flash language model is now available on Hugging Face. The news was shared on Reddit, sparking discussion within the LocalLLaMA community. The open-source model promises new opportunities for developing generative artificial intelligence ...

2026-01-19 404 Media

ICE’s Facial Recognition App Misidentified a Woman. Twice

The Mobile Fortify app, used by Immigration and Customs Enforcement (ICE) to identify individuals and determine their immigration status, provided two incorrect names for the same woman during a check. The incident raises doubts about the accuracy of...

2026-01-19 IEEE Spectrum

AI Boosts Research Careers, but Flattens Scientific Discovery

An analysis of over 40 million academic papers reveals that scientists using AI tools publish more and reach leadership positions faster. However, AI-driven research tends to focus on narrow areas, limiting originality and diversity in scientific inq...

2026-01-19 LocalLLaMA

On-device browser agent with Qwen: local demo on Chrome

A new demo showcases a local browser agent, powered by WebGPU, Liquid LFM, and Alibaba's Qwen models, running as a Chrome extension. The agent opens the 'All-In Podcast' on YouTube. The source code is available on GitHub for those interested in exploring ...

#Hardware
2026-01-19 AI News

Artificial intelligence: transforming credit unions

Artificial intelligence is rapidly transforming financial services, offering new opportunities but also challenges for credit unions. These institutions, built on trust and community alignment, must integrate AI to meet member expectations and compet...

2026-01-19 The Register AI

Police chief resigns after AI hallucination

The chief constable of West Midlands Police has resigned after his police force used fictional output from Microsoft Copilot in deciding to ban Israeli fans from attending a football match. The officer had denied the use of artificial intelligence sy...

2026-01-19 LocalLLaMA

GLM-4.7-Flash soon? Leaks about the new language model

Hints of a possible imminent release of GLM-4.7-Flash are surfacing. An update to the GLM-4.7 collection, containing a hidden item, has caught the attention of experts. Initial analysis suggests that Zai is preparing to launch this new version. A com...

#LLM On-Premise
2026-01-19 Tom's Hardware

China leads in advanced robotics and world models: AI's next frontier

The AI race is shifting towards advanced robotics and world models. China is positioning itself as a leader in this field, with a high number of operational robots expected as early as 2025. This trend could redefine the global balance in the technol...

2026-01-19 Phoronix

RADV Vulkan Driver Now Implements HPLOC For Faster Ray-Tracing

Valve's RADV Vulkan driver continues to improve ray tracing performance on Linux. The latest implementation, HPLOC, promises a further performance boost for games that leverage this technology. Mesa 26.0 will include this update, bringing tangible be...

#Hardware
2026-01-19 The Register AI

Price, battery life, performance drive PC sales; on-device AI lags

In Q4, commercial resellers primarily shipped AI-capable PCs to enterprise customers. However, the key drivers for purchase were price, battery life, and performance. Integrated artificial intelligence, at least for now, appears to play a less signif...

2026-01-19 Phoronix

SPDX SBOM Generation Tool Proposed For The Linux Kernel

Proposed patches to the Linux kernel introduce an SPDX SBOM Generation Tool. The goal is to increase the transparency of software components, improve vulnerability management, ensure license compliance, and secure the software supply chain.

2026-01-19 LocalLLaMA

Top-K: Optimized Algorithm Up to 20x Faster Than PyTorch

A developer has created an optimized Top-K implementation, crucial for sampling in large language models (LLMs). The AVX2-optimized implementation outperforms PyTorch on CPU by 4-20x, depending on vocabulary size. Integration into llama.cpp r...

#Hardware #LLM On-Premise
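
For readers unfamiliar with the Top-K step mentioned in the entry above, the following is a minimal pure-Python sketch of Top-K sampling. It is not the developer's AVX2 implementation and not PyTorch's; the function names `top_k_indices` and `top_k_sample` are hypothetical and chosen only for illustration.

```python
import heapq
import math
import random


def top_k_indices(logits, k):
    """Return the indices of the k largest logits, highest first."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # nlargest keeps only k candidates instead of sorting the whole vocabulary.
    top = heapq.nlargest(k, ((value, index) for index, value in enumerate(logits)))
    return [index for _, index in top]


def top_k_sample(logits, k, rng=random):
    """Sample one token index from a softmax restricted to the top-k logits.

    Illustrative only: `rng` may be the `random` module or a `random.Random`.
    """
    indices = top_k_indices(logits, k)
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = logits[indices[0]]
    weights = [math.exp(logits[i] - peak) for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    example_logits = [0.1, 2.0, -1.0, 3.5]
    print(top_k_indices(example_logits, 2))
    print(top_k_sample(example_logits, 2, random.Random(0)))
```

The selection step runs once per generated token over the full vocabulary, which is why a faster Top-K can matter for CPU inference.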
2026-01-19 The Next Web

Europe invests €307 million in AI projects

The European Commission has allocated €307.3 million to fund artificial intelligence and related technology projects under the Horizon Europe program. The initiative aims to promote trustworthy AI and European digital autonomy, focusing on data servi...

2026-01-19 LocalLLaMA

Flog: Free iOS Nutrition Tracker App with Local LLM Support

A developer has created Flog, a free iOS app that tracks nutrition through photos, leveraging local LLM models to estimate portions and nutrients. The app integrates with Apple Health and supports LLM models run directly on the device or via LM Studi...

2026-01-19 Tech.eu

Anzen Industries raises $2.2M for chemical production innovation

UK-based startup Anzen Industries has raised $2.2 million in pre-seed funding. The company focuses on producing high-value chemicals using cell-free enzyme systems, aiming to improve the scalability and resilience of global supply chains. The funding...

2026-01-19 DigiTimes

Taiwan carves robotics niche as humanoids proliferate

Taiwan is positioning itself as a key player in the robotics sector, particularly in the development of humanoids. The island aims to leverage its technological and industrial expertise to compete in this growing market, with a focus on applications ...

2026-01-19 LocalLLaMA

JARVIS: Progress Report on LLM Agent Development

A Reddit user shared an update on the development of JARVIS, an agent based on large language models (LLMs). The original post includes a link to a demonstration video of the project. The development of LLM agents is a rapidly growing research area, w...

2026-01-19 DigiTimes

Quanta rushing to hire and expand as AI server demand holds strong

Quanta Computer is ramping up hiring and expanding its operations to meet the sustained demand for AI servers. The company aims to strengthen its position in a rapidly expanding market, where the ability to meet customer demands has become crucial.

#Hardware
2026-01-19 DigiTimes

US-Taiwan investment MOU brings clarity on future auto tariffs

A memorandum of understanding (MOU) between the US and Taiwan outlines the future of automotive tariffs. The agreement aims to promote bilateral investments and establish clearer trade conditions, particularly in the automotive sector. The initiative...

2026-01-19 DigiTimes

Apple-Google AI partnership could reshape voice assistant market

A potential collaboration between Apple and Google in the field of artificial intelligence could reshape the voice assistant market. The partnership, if realized, would have an estimated value of up to $5 billion. Implications and details of the agre...

2026-01-19 ArXiv cs.CL

Conversational Agents: Does Conciseness Reduce Expertise?

A new study analyzes the unexpected side effects of using specific stylistic features in prompts for conversational agents based on large language models (LLMs). The research reveals how prompting for conciseness can compromise the perceived expertis...

#Fine-Tuning
2026-01-19 ArXiv cs.CL

BYOL: Bring Your Own Language Into LLMs

A new study introduces BYOL, a framework for improving the performance of large language models (LLMs) in languages with limited digital presence. BYOL classifies languages based on available resources and adapts training techniques, including synthe...

2026-01-19 ArXiv cs.LG

Analytic Bijections for Smooth and Interpretable Normalizing Flows

A new study introduces three families of analytic functions for normalizing flows, offering more efficient and interpretable alternatives to existing approaches. The advantages include increased training stability and the ability to drastically reduc...

2026-01-19 ArXiv cs.AI

LLMs: How Do They Assess Trustworthiness of Online Information?

Large language models (LLMs) are increasingly important in online search and recommendation systems. New research analyzes how these models encode perceived trustworthiness in web narratives, revealing that models internalize psychologically grounded...

#Fine-Tuning
2026-01-19 DigiTimes

Optics manufacturers strengthen ties with semiconductor firms

Optics manufacturers are strengthening ties with semiconductor firms in the silicon photonics race. Asia Optical is among the companies targeted for these strategic partnerships. Asia Optical chairman I-Jen Lai is leading the company through this cru...

2026-01-19 LocalLLaMA

cuda-nn: Custom MoE inference engine in Rust/CUDA without PyTorch

cuda-nn, a MoE (Mixture of Experts) inference engine developed in Rust, Go, and CUDA, has been introduced. This open-source project stands out for its ability to handle models with 6.9 billion parameters without PyTorch, thanks to manually optimized ...

2026-01-19 LocalLLaMA

Hot take: OpenAI should open-source GPT-4o

A user suggested that OpenAI should open-source the GPT-4o model. Despite safety concerns, the move could cover OpenAI's open-source rally for the next few months and save on the costs of maintaining the model.

#Fine-Tuning
2026-01-19 LocalLLaMA

Strix Halo as LLM Server: Which Linux Distro to Choose?

A user is considering their Strix Halo machine as both a large language model (LLM) server and a media server, and is looking for the most suitable Linux distribution. Fedora 43 is already installed, but alternatives are being considered for optimal RDP suppor...

2026-01-19 LocalLLaMA

Chatterbox: Memory Spikes During PDF Conversion?

A user reports excessive memory consumption with Chatterbox-TTS-Server while converting a PDF to an audiobook. The process, based on a FastAPI wrapper, increases memory usage from 3GB to over 8GB while processing small chunks of the book.

2026-01-19 LocalLLaMA

DetLLM: tool to ensure deterministic inference in LLMs

A developer has created DetLLM to address the issue of non-reproducibility in LLM inference. The tool verifies repeatability at the token level, generates a report, and creates a minimal reproduction package for each run, including environment snapsh...
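
As a rough illustration of what a token-level repeatability check can look like, here is a minimal sketch. It is not DetLLM's code or API; the helper names `first_divergence` and `repeatability_report` are hypothetical, and each run is assumed to be a plain list of token IDs produced from the same prompt and settings.

```python
def first_divergence(run_a, run_b):
    """Return the position of the first differing token, or None if identical."""
    for position, (token_a, token_b) in enumerate(zip(run_a, run_b)):
        if token_a != token_b:
            return position
    # One run is a strict prefix of the other: they diverge where the shorter ends.
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None


def repeatability_report(runs):
    """Compare every run against the first one and summarize the outcome."""
    if not runs:
        raise ValueError("at least one run is required")
    baseline = runs[0]
    divergences = [first_divergence(baseline, run) for run in runs[1:]]
    return {
        "runs": len(runs),
        "deterministic": all(d is None for d in divergences),
        "first_divergences": divergences,
    }


if __name__ == "__main__":
    # Hypothetical token IDs from three runs of the same prompt.
    print(repeatability_report([[5, 8, 13], [5, 8, 13], [5, 9, 13]]))
```

Reporting the first divergent position, rather than only a pass/fail flag, makes it easier to tell an early sampling difference from a late numerical drift.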

2026-01-19 LocalLLaMA

SLM Prompting: How to Outperform Larger Language Models?

A user is questioning how to get the most out of small language models (SLMs), especially when fine-tuned for a specific topic. The challenge is that traditional prompts, effective with large language models (LLMs), often produce incoherent results w...

2026-01-19 DigiTimes

US-Taiwan defense ties deepen due to 15% tariff cap

According to DIGITIMES, defense ties between the US and Taiwan are deepening, partly due to a 15% tariff cap. This move highlights the increasing collaboration between the two nations in a strategically crucial area.

2026-01-19 LocalLLaMA

Hardware setup with 3 V620 GPUs for 96GB of VRAM

A user has shared their new hardware setup online, which includes three V620 graphics cards for a total of 96GB of VRAM. This configuration is designed for applications that require high video memory capacity, such as training machine learning models...

#Hardware
2026-01-18 DigiTimes

AI: Machine identities outnumber humans in Asia-Pacific

Artificial intelligence is reshaping the cybersecurity landscape in the Asia-Pacific region, with an exponential increase in machine identities. This shift poses new challenges for protecting systems and data, requiring more sophisticated and automat...

2026-01-18 LocalLLaMA

How do you pronounce "GGUF"? The pronunciation dilemma in AI

The pronunciation of "GGUF", a file format used in the field of artificial intelligence, is generating a heated debate in the community. The most common options include "jee-guff", "giguff", and "jee jee you eff". The discussion highlights the challe...

2026-01-18 LocalLLaMA

Are LLM Agents Mostly Markdown Todo List Processors?

A user has raised an interesting question regarding the internal architecture of major agents based on large language models (LLMs). It appears that many of these agents break down complex tasks into simple todo lists, executing them sequentially. Th...

2026-01-18 LocalLLaMA

ROCm+Linux Support on Strix Halo: January 2026 Stability Update

A user on Reddit reported the future release of a stability update for ROCm and Linux support on Strix Halo. The delivery, expected in January 2026, aims to improve the integration of these technologies. Strix Halo is an AMD hardware platform designe...

#Hardware
2026-01-18 OpenAI Blog

AI for human agency: a driver of growth and opportunity

Artificial intelligence can expand human capabilities, bridging the skills gap and unlocking new growth opportunities for individuals, businesses, and nations. An analysis of AI's potential as a tool to increase productivity and foster economic devel...

2026-01-18 LocalLLaMA

RLVR and GRPO: From-Scratch Implementation with Notebook

A code notebook illustrating the from-scratch implementation of RLVR (Reinforcement Learning with Verifiable Rewards) with GRPO (Group Relative Policy Optimization) is now available. The resource, hosted on GitHub, was shared on Reddit and is intended for th...
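
To make the two ideas concrete, the sketch below shows a verifiable reward (an exact-match check against a known answer) and the group-relative advantage used in GRPO, where each sampled completion is scored against the mean and standard deviation of its own group. This is not taken from the notebook; the function names are hypothetical, the exact-match reward is an assumed example, and the use of the population standard deviation with a small `eps` is one common choice among several.

```python
import statistics


def verifiable_reward(completion, expected):
    """Assumed example reward: 1.0 if the answer matches exactly, else 0.0."""
    return 1.0 if completion.strip() == expected.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    advantage_i = (reward_i - mean(rewards)) / (std(rewards) + eps)
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; implementations differ
    return [(reward - mean) / (std + eps) for reward in rewards]


if __name__ == "__main__":
    # Four hypothetical completions sampled for the prompt "2 + 2 = ?".
    completions = ["4", "5", " 4 ", "22"]
    rewards = [verifiable_reward(c, "4") for c in completions]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Because the baseline comes from the group itself, no separate value network is needed; a group in which every completion earns the same reward yields zero advantage for all of them.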

2026-01-18 Phoronix

Linux 6.19: USB Issues Fixed for Apple M1/M2 Macs

Linux 6.19-rc6 brings two USB fixes specifically for Apple Macs with M1 and M2 chips. The patches, intended for the mainline kernel, will be back-ported to stable Linux versions. This should improve hardware compatibility for those using Li...

#Hardware
2026-01-18 OpenAI Blog

OpenAI: A Business Model Scaling with Intelligence

OpenAI's business model scales with the value of intelligence. The company leverages subscriptions, APIs, advertising, commerce, and compute, all driven by the increasing adoption of ChatGPT. This strategy allows OpenAI to grow efficiently, adapting ...

2026-01-18 Tom's Hardware

Tesla: New AI Chips Every Nine Months, Challenging Nvidia and AMD

Elon Musk aims for a faster development and release cycle for new AI accelerators compared to Nvidia and AMD. The goal is to produce chips in extremely high volumes, but the engineering challenge is significant. Tesla intends to accelerate its roadma...

#Hardware #Fine-Tuning
2026-01-18 TechCrunch AI

Confer: Moxie Marlinspike's privacy-conscious alternative to ChatGPT

Moxie Marlinspike, known for his work on Signal, has launched Confer, an alternative to ChatGPT and Claude focused on privacy. Unlike the latter, Confer ensures that user conversations are not used for model training or advertising purposes, offering...

#Fine-Tuning
2026-01-18 Tom's Hardware

Photoshop on Linux: Developer Patches Wine to Fix Installation Issues

An open-source developer, PhialsBasement, has released a series of patches for Wine that address HTML and JavaScript rendering issues, as well as XML parsing errors. These fixes enable the smooth installation and execution of Adobe Photoshop 2021 and...

2026-01-18 LocalLLaMA

GPU Market in Germany and EU: a critical situation

A Reddit post highlights the difficulties in finding certain graphics cards (GPUs) in Germany and the European Union. The limited availability of these hardware components poses a challenge for gaming enthusiasts, graphics professionals, and research...

#Hardware
2026-01-18 Tom's Hardware

Vintage Resurrection: 1974 Altair 8800 Computer Fixed and Runs in 2026

A 1974 Altair 8800 computer, incorrectly assembled, was repaired and successfully ran its first program in 2026. The machine, powered by an Intel 8080 processor, came to life over fifty years after its construction. The repair was documented by a com...

#Hardware
2026-01-18 Tom's Hardware

U.S. EPA Requires Permits for Musk's xAI Gas Turbine Generators

The U.S. EPA now requires permits to operate gas turbine generators, even temporary ones, closing loopholes in some local ordinances that waived this requirement for deployments that lasted for less than 364 days. This affects Elon Musk's xAI.

2026-01-18 The Register AI

Nvidia leans on emulation to squeeze more HPC oomph from AI chips

Nvidia is leaning on emulation to boost the performance of its AI chips in high-performance computing (HPC), amid competition with AMD. AMD researchers argue that algorithms like the Ozaki scheme merit investigation but aren't yet ready for prime tim...

#Hardware
2026-01-18 LocalLLaMA

Ministral 3 Reasoning Heretic: Uncensored LLM Models and GGUFs

Ministral 3 Reasoning Heretic models, uncensored versions with vision capabilities, are now available. User coder3101 released quantized models (Q4, Q5, Q8, BF16) with MMPROJ for vision features, speeding up release times for the community. 4B, 8B and...

#Hardware
2026-01-18 LocalLLaMA

Newelle 1.2: AI assistant for Linux gets an update

Version 1.2 of Newelle, the AI assistant designed for Linux, is now available. The update includes llama.cpp integration, a new model library for ollama/llama.cpp, and hybrid search optimized for document reading. Other new features include the addit...

#LLM On-Premise #RAG
2026-01-18 LocalLLaMA

Analyzing 1M+ Emails for Context Engineering: Key Learnings

A team processed over a million emails to turn them into structured context for AI agents. The analysis revealed that thread reconstruction is complex, attachments are crucial, multilingual conversations are frequent, and data retention is a hurdle f...
