Topic / Trend Rising

Open Source LLMs

Open-source large language models (LLMs) are gaining traction, offering increased transparency and customizability compared to proprietary models. Companies and researchers are actively developing and releasing new open-source LLMs, driving innovation in the field.

Detected: 2026-02-22 · Updated: 2026-02-22

Related Coverage

2026-02-21 LocalLLaMA

Qwen Code: Open-Source Coding Agent with No-Telemetry Fork

Qwen Code is an open-source CLI coding agent developed by Alibaba's Qwen team. It automates development tasks by directly interacting with the code. A modified version is available that removes telemetry, ensuring greater privacy. Integration with LM...

#LLM On-Premise #DevOps
2026-02-21 LocalLLaMA

Ouro-2.6B-Thinking: First Working Inference for ByteDance's Model

Inference issues with ByteDance's Ouro-2.6B-Thinking, a recurrent Universal Transformer model, have been resolved. The fix addresses incompatibilities with Transformers 4.55, and the model now produces valid outputs. Tested on an NVIDIA L4, achieving 3.8 t...

#Hardware
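The "recurrent Universal Transformer" design mentioned above reuses one weight-tied layer across depth instead of stacking distinct layers. A minimal sketch of that recurrence (illustrative only, not Ouro's actual architecture; all names are hypothetical):

```python
import numpy as np

def recurrent_block(x: np.ndarray, W: np.ndarray, steps: int = 4) -> np.ndarray:
    """Apply one weight-tied layer `steps` times.

    A Universal-Transformer-style model gains effective depth by
    iterating a single layer, so the parameter count stays small while
    the computation deepens.
    """
    for _ in range(steps):
        x = x + np.tanh(x @ W)  # residual update reusing the same weights W
    return x

hidden = recurrent_block(np.ones((1, 3)), np.eye(3) * 0.1, steps=4)
print(hidden.shape)  # (1, 3)
```

The shape is unchanged across iterations, which is what lets the same weight matrix be applied an arbitrary number of times.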
2026-02-21 LocalLLaMA

The importance of key figures in open source LLM innovation

A Reddit post highlights the potential impact of prominent figures like Andrej Karpathy on the development of open-source large language models (LLMs). The discussion underscores how the presence of such experts can significantly accelerate progress and c...

#LLM On-Premise #Fine-Tuning #DevOps
2026-02-20 LocalLLaMA

Hugging Face acquires GGML and llama.cpp for Local AI advancement

Hugging Face announced the acquisition of GGML and llama.cpp, two open-source projects crucial for efficient execution of large language models (LLMs) on consumer hardware. The goal is to ensure the long-term development of local AI and democratize a...

#Hardware #LLM On-Premise #DevOps
2026-02-20 LocalLLaMA

Hugging Face Acquires GGML.AI, Focused on Efficient LLM Inference

Hugging Face has acquired GGML.AI, known for its work on efficient inference of large language models (LLMs). The acquisition, discussed on Reddit and GitHub, could lead to greater integration of GGML technologies into the Hugging Face ecosystem, ben...

#Hardware #LLM On-Premise #DevOps
2026-02-20 LocalLLaMA

PaddleOCR-VL now in llama.cpp

The open-source multilingual model PaddleOCR-VL has been integrated into llama.cpp. This integration allows running model inference directly on local hardware, opening new possibilities for OCR applications with privacy and data sovereignty requireme...

#LLM On-Premise #DevOps
2026-02-20 ArXiv cs.LG

MMCAformer: Traffic Speed Prediction with Connected Vehicle Data

A new model, MMCAformer, integrates macroscopic traffic flow data with microscopic driving behavior insights from connected vehicles to improve traffic speed prediction accuracy. The approach reduces uncertainty and increases accuracy, especially in ...

2026-02-19 LocalLLaMA

Llama.cpp: IQ*_K and IQ*_KS quantization support

A pull request to llama.cpp introduces support for IQ*_K and IQ*_KS quantization schemes, derived from the ik_llama.cpp project. This implementation could lead to more compact and efficient models, particularly relevant for inference on resource-cons...

#LLM On-Premise #DevOps
2026-02-18 LocalLLaMA

FlashLM v4: 4.3M ternary model trained on CPU in 2 hours

FlashLM v4 is a 4.3-million-parameter language model with ternary weights (-1, 0, +1), trained on a CPU in just two hours. It generates coherent stories, demonstrating that small models can achieve interesting results with efficient training ...

#Hardware #Fine-Tuning
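The post does not detail FlashLM's exact quantization recipe; the sketch below shows one common way to ternarize a weight tensor, using an absmean threshold (the `threshold_ratio` value and function name are hypothetical):

```python
import numpy as np

def ternarize(w: np.ndarray, threshold_ratio: float = 0.7):
    """Quantize a weight tensor to {-1, 0, +1} plus a per-tensor scale.

    Values whose magnitude falls below threshold_ratio * mean(|w|) are
    zeroed; the rest keep only their sign. The scale is the mean
    magnitude of the surviving values, so scale * q approximates w.
    """
    delta = threshold_ratio * np.mean(np.abs(w))
    q = np.where(np.abs(w) > delta, np.sign(w), 0.0)
    mask = q != 0
    scale = float(np.abs(w[mask]).mean()) if mask.any() else 1.0
    return q.astype(np.int8), scale

q, scale = ternarize(np.array([0.9, -0.05, 0.4, -1.2, 0.02]))
print(q)  # [ 1  0  1 -1  0]
```

Each weight then needs under two bits of storage plus one shared scale, which is how models this small stay trainable on a CPU.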
2026-02-18 LocalLLaMA

ByteShape LLMs: Coder Models for Every Hardware, Including Raspberry Pi

ByteShape releases Devstral-Small-2-24B and Qwen3-Coder-30B, models optimized for various hardware platforms. Devstral excels on RTX 40/50 GPUs, while Qwen3-Coder remains usable even on a Raspberry Pi 5. The choice depends on available resources and con...

#Hardware #LLM On-Premise #DevOps
2026-02-18 LocalLLaMA

Qwen 3.5: MXFP4 quantization coming soon

Junyang Lin confirmed the upcoming release of Qwen 3.5 models with MXFP4 quantization. This format, already adopted by OpenAI with gpt-oss and Google with Gemma 3 QAT, promises higher quality than conventional quantization of BF16 checkpoints. The initiative...

#Hardware #LLM On-Premise #DevOps
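MXFP4 stores small blocks of values as 4-bit floats (E2M1) sharing one power-of-two scale. A rough sketch of that quantize/dequantize round-trip, not the actual OCP implementation (block size and rounding are simplified, names hypothetical):

```python
import numpy as np

# Magnitudes representable by a 4-bit E2M1 float, per the OCP MX spec.
FP4_LEVELS = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def mxfp4_roundtrip(block: np.ndarray) -> np.ndarray:
    """Quantize a block to FP4 values under a shared power-of-two scale,
    then dequantize, returning the approximation of the input."""
    amax = np.abs(block).max()
    if amax == 0.0:
        return np.zeros_like(block)
    # power-of-two scale chosen so that amax / scale <= 6.0
    scale = 2.0 ** np.ceil(np.log2(amax / FP4_LEVELS[-1]))
    scaled = block / scale
    sign = np.sign(scaled)
    # round each magnitude to the nearest representable FP4 level
    idx = np.abs(np.abs(scaled)[:, None] - FP4_LEVELS).argmin(axis=1)
    return sign * FP4_LEVELS[idx] * scale

print(mxfp4_roundtrip(np.array([0.1, -0.7, 2.6, 5.5])))  # approx. [0.0, -0.5, 3.0, 6.0]
```

With quantization-aware training, the model learns weights that survive this coarse rounding, which is the quality argument behind QAT releases like Gemma 3.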
2026-02-18 TechCrunch AI

Sarvam to bring its AI models to feature phones and edge devices

Indian startup Sarvam is developing small-footprint AI models designed to run on edge devices such as feature phones, cars, and smart glasses. The models, with a footprint of only a few megabytes, can operate offline and with standard processors.

#LLM On-Premise #DevOps
2026-02-18 TechCrunch AI

Sarvam AI bets on open-source with new language models

Indian AI lab Sarvam AI has unveiled a new lineup of models, including language models with 30 and 105 billion parameters, a text-to-speech model, a speech-to-text model, and a vision model for document parsing. A major bet on open-source AI.

#LLM On-Premise #DevOps
2026-02-17 LocalLLaMA

Alibaba's Qwen3.5-397B: #3 open-weights model globally

Alibaba's Qwen3.5-397B large language model (LLM) has reached third place among open-weights models in the Artificial Analysis Intelligence Index. This result highlights the advancements in the field of open AI and the grow...

#LLM On-Premise #DevOps
2026-02-17 LocalLLaMA

Qwen 3.5: a replacement for Llama 4 Scout?

A Reddit user has raised an interesting question: could Qwen 3.5 be a valid replacement for Llama 4 Scout? The question has sparked a debate in the LocalLLaMA community, with differing opinions on the actual comparability of the two models.

#LLM On-Premise #DevOps
2026-02-16 LocalLLaMA

Open Source Models Dominate OpenRouter: A Growing Trend

Recent data from OpenRouter indicates that open source models are gaining traction in real-world usage. The trend highlights a growing confidence in open alternatives for AI applications, with significant implications for costs, customization, and da...

#LLM On-Premise #DevOps
2026-02-16 LocalLLaMA

Qwen 3.5: Open Source Multimodal Model with Ultimate Efficiency

The multimodal model Qwen 3.5-397B-A17B has been released as open source. This latest generation model promises high efficiency and native multimodal capabilities. The news was shared on Reddit, attracting the attention of the LocalLLaMA community.

#LLM On-Premise #DevOps
2026-02-16 LocalLLaMA

Qwen3.5-397B-A17B Released: The Open-Source Language Model

Qwen3.5-397B-A17B, a large language model (LLM) developed by Qwen, has been released. The model is accessible via Hugging Face, opening new possibilities for research and development in the field of generative artificial intelligence. Its open-source...

#LLM On-Premise #DevOps
2026-02-16 LocalLLaMA

Qwen3.5-397B-A17B: Open Source Language Model Coming Soon

The large language model (LLM) Qwen3.5-397B-A17B will be released as open source. The announcement was shared via an image from the chat.qwen.ai website, generating interest in the LocalLLaMA community.

#LLM On-Premise #DevOps
2026-02-16 LocalLLaMA

Alibaba to release Qwen 3.5: next-generation open-source model

Sources indicate that Alibaba will release Qwen 3.5 today, a next-generation open-source large language model (LLM). The model is expected to feature significant innovations in its architecture, opening new possibilities for the artificial intelligen...

#LLM On-Premise #DevOps
2026-02-15 LocalLLaMA

InclusionAI unveils Ling-2.5-1T: 1T parameter open-source model

InclusionAI has released Ling-2.5-1T, an open-source language model with 1 trillion parameters (63 billion active). Trained on a corpus of 29 trillion tokens, Ling-2.5-1T aims to balance efficiency and performance, offering advanced reasoning capabil...

#LLM On-Premise #DevOps
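The "1 trillion total / 63 billion active" split is the signature of mixture-of-experts routing: each token activates only a few experts, so most parameters sit idle on any given forward pass. A minimal top-k gating sketch (illustrative, not Ling's actual router; names hypothetical):

```python
import numpy as np

def topk_route(gate_logits: np.ndarray, k: int = 2):
    """Select the top-k experts for one token and softmax-normalize
    their gate weights. Only the selected experts' parameters run,
    which is why active parameters << total parameters in an MoE."""
    top = np.argsort(gate_logits)[::-1][:k]
    w = np.exp(gate_logits[top] - gate_logits[top].max())
    return top, w / w.sum()

experts, weights = topk_route(np.array([0.1, 2.0, -1.0, 1.5]), k=2)
print(experts)  # [1 3]
```

The token's output is then the weighted sum of just those k experts, so inference cost tracks the active count, not the total.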
2026-02-15 LocalLLaMA

Open-weight models dominate OpenRouter leaderboard

For the first time, the top four models on the OpenRouter leaderboard are all open-weight. This marks a potential turning point for the adoption and trust in open-source language models, offering viable alternatives to proprietary models.

#LLM On-Premise #DevOps