## Introduction

A new benchmark has been launched to test the spatial reasoning capabilities of multimodal large language models (MLLMs). GamiBench focuses on spatial reasoning and 2D-to-3D planning, using origami folding tasks to evaluate how well these models can understand and manipulate objects across multiple views.

## How GamiBench works

GamiBench includes 186 2D crease patterns and their corresponding 3D folded shapes. Its objectives include predicting 3D fold configurations, distinguishing valid viewpoints, and detecting impossible patterns. The benchmark combines perception and instruction-following to evaluate the spatial reasoning of large language models.

## Impact and applications

GamiBench could help researchers measure and improve the spatial reasoning and 2D-to-3D planning of large language models. It can be used to test models intended for applications such as computer-aided design, engineering, and robotics.

## Dataset and code

The dataset and code are available on GitHub (https://github.com/stvngo/GamiBench).
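
To make the three objectives described under "How GamiBench works" more concrete, here is a minimal Python sketch of how per-task accuracy could be tallied for a benchmark of this kind. The record layout (`task`, `answer`, `prediction`), the task labels, and the `accuracy_by_task` helper are assumptions made for illustration. They are not taken from the GamiBench repository, whose actual data format and evaluation code may differ, and the sample data is invented.

```python
from collections import defaultdict


def accuracy_by_task(records):
    """Return {task: fraction of records whose prediction matches the answer}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        total[record["task"]] += 1
        if record["prediction"] == record["answer"]:
            correct[record["task"]] += 1
    return {task: correct[task] / total[task] for task in total}


# Invented sample records (hypothetical layout, not real benchmark results).
sample = [
    {"task": "fold_prediction", "answer": "B", "prediction": "B"},
    {"task": "fold_prediction", "answer": "A", "prediction": "C"},
    {"task": "viewpoint_validity", "answer": "valid", "prediction": "valid"},
    {"task": "impossible_pattern", "answer": "impossible", "prediction": "possible"},
]

print(accuracy_by_task(sample))
```

Reporting accuracy separately for each objective, rather than as one pooled number, shows whether a model struggles with fold prediction, viewpoint checks, or impossible-pattern detection specifically.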

GamiBench: Evaluating Spatial Reasoning and 2D-to-3D Planning Capabilities of MLLMs with Origami Folding Tasks
