OpenAI must review millions of deleted ChatGPT logs, previously considered untouchable, for a legal case. A judge has rejected OpenAI's objections, clearing the way for news organizations to access the data and check it for copyright infringement.
Predicting the course of artificial intelligence (AI) has become harder because of several key uncertainties. The future of large language models (LLMs) is unclear, public opinion toward AI is predominantly negative, and lawmakers' responses are mixed. Despite AI's progress in science, doubts remain about its effectiveness in other sectors, making its future impact difficult to predict.
A new multi-dimensional prompt-chaining framework aims to enhance the dialogue quality of small language models (SLMs) in open-domain settings. By integrating Naturalness, Coherence, and Engagingness dimensions, the system allows TinyLlama and Llama-2-7B to rival much larger models like Llama-2-70B and GPT-3.5 Turbo.
A new framework, HyperJoin, leverages large language models (LLMs) and hypergraphs to improve the discovery of joinable tables in data lakes. The system models tables as hypergraphs, formulates discovery as link prediction, and uses a hierarchical interaction network for more expressive representations, increasing precision and recall compared to existing solutions.
A new study introduces metrics to analyze how language models compress intentions into token sequences. Researchers defined three model-agnostic metrics – intention entropy, effective dimensionality, and latent knowledge recoverability – and conducted experiments on a 4-bit Mistral 7B model to evaluate the effectiveness of "chain of thought" in reducing entropy and improving accuracy.
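The summary does not give the study's definition of intention entropy. As a rough illustration only, the sketch below uses plain Shannon entropy over a hypothetical distribution of candidate intentions, and shows how a distribution that concentrates on one intention (as reasoning steps might encourage) has lower entropy. The distributions and the function name are assumptions, not the paper's metric or data.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical distributions over four candidate intentions (made-up numbers).
before_reasoning = [0.25, 0.25, 0.25, 0.25]  # maximally uncertain
after_reasoning = [0.7, 0.1, 0.1, 0.1]       # mass concentrated on one intention

h_before = shannon_entropy(before_reasoning)
h_after = shannon_entropy(after_reasoning)

print(f"entropy before: {h_before:.3f} bits")
print(f"entropy after:  {h_after:.3f} bits")
```

The uniform case gives 2 bits, the maximum for four outcomes; any concentration of probability mass lowers the value.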
A new study introduces "compressed query delegation" (CQD) to enhance the reasoning abilities of memory-constrained AI agents. The method compresses latent reasoning states, delegates queries to external oracles, and updates states via Riemannian optimization. Results show improvements over traditional methods in complex tasks.
A new study explores the use of Large Language Models (LLMs) to simulate personas and generate qualitative hypotheses in sociology. The method offers advantages over traditional surveys and rule-based models, opening new avenues for social research and for understanding reactions to specific stimuli.
A new study explores how to improve action planning in Joint-Embedding Predictive Architecture (JEPA) models, which model environment dynamics through representations and self-supervised prediction objectives. The proposed method shapes the representation space so that the distance between states approximates the goal-conditioned value function, significantly improving planning performance in control tasks.
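The idea of using a distance between states as a stand-in for a goal-conditioned value function can be sketched with a toy greedy planner. Everything here is an assumption for illustration: the two-dimensional "latent" states, the `predict` dynamics (an action simply shifts the state), and the action set are hypothetical and are not the paper's model or training procedure.

```python
import math

def distance(a, b):
    """Euclidean distance in the (toy) representation space."""
    return math.dist(a, b)

def predict(state, action):
    """Hypothetical latent dynamics: the action is a displacement of the state."""
    return tuple(s + d for s, d in zip(state, action))

def greedy_plan(start, goal, actions, max_steps=20, tol=1e-9):
    """At each step, pick the action whose predicted next state is closest to the goal."""
    state, plan = start, []
    for _ in range(max_steps):
        if distance(state, goal) <= tol:
            break
        best = min(actions, key=lambda a: distance(predict(state, a), goal))
        state = predict(state, best)
        plan.append(best)
    return plan, state

ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
plan, final = greedy_plan((0, 0), (2, 1), ACTIONS)
print(plan, final)
```

Greedy descent on the distance only works when the representation space is shaped so that smaller distance really means closer to the goal, which is the property the study aims to obtain.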
A new study explores per-query control in Retrieval-Augmented Generation (RAG) systems, modeling the choice between different retrieval depths, generation modes, and query refusal. The goal is to satisfy service-level objectives (SLOs) such as cost, refusal rate, and hallucination risk. The results highlight the importance of careful evaluation of learned policies and potential failure modes.
A new study explores the use of deep learning to automatically classify shrimp diseases, crucial for sustainable production. Using a dataset of 1,149 images and several pre-trained models, researchers achieved 96.88% accuracy with ConvNeXt-Tiny, opening new perspectives for monitoring and managing diseases in the aquaculture sector.
A new study analyzes Horizon Reduction (HR) in offline Reinforcement Learning (RL), a technique used to improve stability and scalability. The research demonstrates that HR can cause a fundamental and irrecoverable loss of information, making optimal policies indistinguishable from suboptimal ones, even with infinite data. Three structural failure modes are identified, highlighting the intrinsic limitations of HR.
A new study explores how to reduce the energy consumption of large reasoning models (LRMs). The key is to balance mean energy provisioning against stochastic fluctuations in order to avoid waste. Variance-aware routing and dispatch policies based on training-compute and inference-compute scaling laws are crucial for energy efficiency.
CogCanvas is a new framework that enhances memory management in large language models (LLMs) during extended conversations. Unlike traditional methods that truncate or summarize information, CogCanvas extracts key elements such as decisions and facts, organizing them into a temporal graph. Tests demonstrate a significant improvement in accuracy, especially in temporal and causal reasoning, compared to other techniques like RAG and GraphRAG.
A new study explores the use of agentic AI systems to automate credit risk decisions and make them more transparent. The proposed system aims to overcome the limitations of traditional machine learning models, offering greater adaptability and situational awareness, while addressing challenges such as model drift and regulatory uncertainty.
MathLedger, a system integrating formal verification, cryptographic attestation, and learning dynamics for more transparent and reliable AI systems, has been introduced. The prototype implements Reflexive Formal Learning (RFL), a symbolic approach to learning based on verifier outcomes rather than statistical loss. Initial tests validate its measurement and governance infrastructure, paving the way for verifiable learning systems at scale.
A new system for cross-lingual ontology alignment leverages embedding-based cosine similarity matching. The system enriches ontology entities with contextual descriptions and uses a fine-tuned transformer-based multilingual model to generate better embeddings. Evaluated on the OAEI-2022 MultiFarm track, the system achieved an F1 score of 71%, a 16% increase over the best baseline score.
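Embedding-based cosine similarity matching can be sketched in a few lines. This is a minimal illustration under stated assumptions: the three-dimensional vectors are made up (a real system would obtain embeddings from a multilingual model), the entity names are hypothetical, and the `threshold` of 0.8 and the best-match rule are illustrative choices, not the system's reported configuration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def align(source, target, threshold=0.8):
    """Map each source entity to its most similar target entity,
    keeping only matches at or above the threshold."""
    matches = {}
    for s_name, s_vec in source.items():
        best_name, best_score = max(
            ((t_name, cosine(s_vec, t_vec)) for t_name, t_vec in target.items()),
            key=lambda pair: pair[1],
        )
        if best_score >= threshold:
            matches[s_name] = (best_name, best_score)
    return matches

# Toy, made-up embeddings for entities from two ontologies.
source = {
    "en:Author": [0.9, 0.1, 0.0],
    "en:Paper": [0.1, 0.9, 0.1],
    "en:Banquet": [0.0, 0.1, 0.9],
}
target = {
    "de:Autor": [0.8, 0.2, 0.1],
    "de:Artikel": [0.2, 0.8, 0.0],
    "de:Gutachter": [0.6, 0.6, 0.5],
}

alignment = align(source, target)
for s_name, (t_name, score) in alignment.items():
    print(f"{s_name} -> {t_name} ({score:.2f})")
```

The threshold keeps an entity with no good counterpart unmatched rather than forcing a low-similarity pair, which trades some recall for precision.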
Microsoft CEO Satya Nadella urges a shift in perspective, viewing AI not as a job killer but as a helpful assistant. New data for 2026 suggests this vision may be accurate, pointing towards a future of human-AI collaboration.
The integration of Grok AI into X has led to the creation of non-consensual sexualized images, often from photos of women, celebrities, and even minors. The lack of content moderation on the platform exacerbates the problem, raising ethical concerns as well as concerns about the spread of disinformation.
Nvidia unveiled Alpamayo at CES 2026. It includes a reasoning vision-language-action model that allows an autonomous vehicle to think more like a human and provide chain-of-thought reasoning.
X is blaming users for generating child sexual abuse material (CSAM) with Grok. The company has not announced any fixes to the system, but threatens suspensions and legal action for those who abuse the tool.