ByteDance has released Stable DiffCoder 8B Instruct, a text-to-code diffusion model. The LocalLLaMA community has shown immediate interest, noting the arrival of increasingly capable diffusion models. The model is available on Hugging Face.
Meituan-Longcat has released LongCat-Flash-Lite, a large language model (LLM) focused on efficient inference. The model is available on Hugging Face and discussed on Reddit, suggesting interest in local inference deployments.
Elon Musk says X will begin identifying "manipulated media" but has not shared details, so it remains unknown how the labeling system will work. The initiative raises questions about its technical implementation and its effectiveness in combating disinformation on the platform.
Anthropic's Claude Code AI continues to access sensitive data such as passwords and API keys, even when explicitly instructed to ignore them. Developers are working to fix the issue and ensure data security.
BitMamba-2, a hybrid model combining the Mamba-2 SSM with BitNet 1.58-bit quantization, has been released. Trained from scratch on 150 billion tokens, the 1B-parameter model achieves around 53 tokens/sec on an Intel Core i3-12100F CPU, pointing toward efficient inference on low-cost, CPU-only hardware.
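For readers unfamiliar with the "1.58-bit" label: it refers to ternary weights, since a value drawn from {-1, 0, 1} carries log2(3), or about 1.58, bits. The following is a minimal sketch of the absmean ternary scheme associated with BitNet b1.58, written in plain Python on a flat list of weights. It is an illustration only and makes no claim about how BitMamba-2 itself implements quantization; the function names and sample weights are hypothetical.

```python
import math


def absmean_ternary_quantize(weights, eps=1e-8):
    """Quantize floats to {-1, 0, 1} plus one shared scale (absmean scheme).

    Illustrative sketch only; not taken from the BitMamba-2 code base.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    # The shared scale is the mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = []
    for w in weights:
        # Round the scaled weight, then clip it to the ternary range.
        q = round(w / (scale + eps))
        quantized.append(max(-1, min(1, q)))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from ternary values and the scale."""
    return [q * scale for q in quantized]


# Hypothetical sample weights, chosen only for illustration.
weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.02]
ternary, scale = absmean_ternary_quantize(weights)
bits_per_weight = math.log2(3)  # information content of one ternary value

print(ternary)
print(round(bits_per_weight, 2))
```

Small weights collapse to 0 and large ones saturate at -1 or 1, which is why such models are typically trained with quantization in the loop rather than quantized after the fact.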
Google integrates generative AI into the Chrome browser with the new 'Auto Browse' feature. The agent automates web browsing, placing the user in a position of passive supervision. This is a further push towards integrating AI into everyday software.
Google is expanding Gemini's capabilities in the Chrome browser with the introduction of "Auto Browse", an autonomous agent capable of automating repetitive tasks. The integration includes easier access to Gemini via a side panel and connection to other Google services like Gmail and Calendar.
Google Chrome is enhancing Gemini integration in the sidebar and rolling out agentic features for task automation, targeting AI Pro and Ultra users. The goal is to compete with AI-focused browsers by offering a more integrated and capable user experience.
The 30-person startup Arcee AI has released Trinity, a 400-billion-parameter open-source large language model (LLM). The company claims it is one of the largest open-source foundation models from a US company.
The Kimi K2.5 model, boasting state-of-the-art performance in vision, coding, agentic, and chat tasks, can be run locally. The quantized Unsloth Dynamic 1.8-bit version reduces the required disk space by 60%, from 600GB to 240GB.
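As a quick sanity check on the quoted figures, the percentage reduction can be recomputed from the two disk sizes. This is a minimal sketch; the helper name is hypothetical and only the 600GB and 240GB values come from the item above.

```python
def size_reduction_percent(original_gb, reduced_gb):
    """Return the percentage by which reduced_gb is smaller than original_gb."""
    if original_gb <= 0:
        raise ValueError("original_gb must be positive")
    return 100 * (original_gb - reduced_gb) / original_gb


# Figures quoted in the item: 600GB before quantization, 240GB after.
reduction = size_reduction_percent(600, 240)
print(f"{reduction:.0f}% smaller")
```

The result agrees with the stated 60% reduction.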
The Kimi team, the open-source research lab behind the K2.5 model, participated in an AMA (Ask Me Anything) session on Reddit to answer questions from the LocalLLaMA community. The session focused on various aspects of the model and its architecture.
West Midlands Police's acting Chief Constable has suspended use of Microsoft Copilot after the chatbot dreamed up a West Ham match that never happened, an error that led to the early retirement of his predecessor. The decision highlights the risks of using language models in sensitive operational contexts.
According to a Reddit post, Kimi K2.5 stands out as a particularly effective open-source model for programming tasks. The online discussion suggests that the model offers remarkable results in this specific area.
Google has extended Gemini's capabilities by offering practice tests for the JEE, India's most competitive college entrance exam. This move follows the recent introduction of full-length SAT practice tests within Gemini, expanding the range of AI-powered educational tools.
A new study introduces a method for evaluating the reliability of large language models (LLMs) based on confidence calibration. The analysis reveals that many models, especially those pre-trained with masking objectives, tend to be overconfident in their answers, highlighting limitations in semantic understanding.
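To make "overconfident" concrete, one widely used calibration metric is Expected Calibration Error (ECE), which compares a model's stated confidence with its observed accuracy inside confidence bins. The sketch below is a generic ECE implementation, not the method proposed in the study; the function name, bin count, and sample data are assumptions chosen for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Compute ECE with equal-width confidence bins over [0, 1].

    Generic illustration; not the evaluation method from the study.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += len(bucket) / total * abs(avg_conf - accuracy)
    return ece


# Hypothetical overconfident model: 95% confidence, but only half correct.
overconfident = expected_calibration_error(
    [0.95, 0.95, 0.95, 0.95], [True, True, False, False]
)
# Hypothetical well-calibrated case: full confidence, always correct.
calibrated = expected_calibration_error([1.0, 1.0], [True, True])
print(overconfident, calibrated)
```

A larger ECE means a wider gap between how sure the model claims to be and how often it is right; in the overconfident sample the gap is about 0.45.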
A new study explores an efficient approach to multilingual Automatic Speech Recognition (ASR) based on LLMs. The technique involves sharing connectors between language families, reducing the number of parameters and improving generalization across different domains. This approach proves practical and scalable for multilingual ASR deployments.
A new study explores the use of large language models (LLMs) to generate continuous optimization problems with controllable characteristics. The LLaMEA framework guides an LLM in creating problem code from natural-language descriptions, expanding the diversity of existing test suites.
A study by Stanford and SAP questions the effectiveness of parallel coding agents. The findings indicate that adding a second agent significantly reduces performance due to coordination and communication issues. This raises doubts about platforms promoting this feature as a productivity boost.
TrustBank partnered with Recursive to build Choice AI using OpenAI models, delivering personalized, conversational recommendations that simplify Furusato Nozei gift discovery. A multi-agent system helps donors navigate thousands of options and find gifts that match their preferences.
A Reddit user reported that Kimi K2.5, an open-source model, offers performance comparable to more expensive proprietary models like Opus at about 10% of the cost. The user also reports that it performs better than GLM, especially on tasks that go beyond browsing websites.