## Introduction

Text-centric forgery poses a significant challenge to information security and authenticity. Current methods for text-centric forgery analysis are often limited to coarse-grained visual analysis and lack the capacity for sophisticated reasoning.

## The LogicLens Framework

To address these challenges, we introduce LogicLens, a unified framework for Visual-Textual Co-reasoning that reformulates forgery detection, grounding, and explanation into a single joint task. The framework is powered by our novel Cross-Cues-aware Chain of Thought (CCT) mechanism, which iteratively validates visual cues against textual logic. To ensure robust alignment across all tasks, we further propose a weighted multi-task reward function for GRPO-based optimization.

## The PR$^2$ Pipeline

Complementing this framework, we first design the PR$^2$ (Perceiver, Reasoner, Reviewer) pipeline, a hierarchical and iterative multi-agent system that generates high-quality, cognitively-aligned annotations.

## The RealText Dataset

We then construct the RealText dataset, comprising 5,397 images with fine-grained annotations for model training: textual explanations, pixel-level segmentation, and authenticity labels.

## Experimental Results

Extensive experiments demonstrate the superiority of LogicLens across multiple benchmarks. In zero-shot evaluation on T-IC13, LogicLens surpasses the specialized framework by 41.4% and GPT-4o by 23.4% in macro-average F1 score. On the challenging dense-text T-SROIE dataset, LogicLens establishes a significant lead over other MLLM-based methods in mF1, CSS, and macro-average F1.

## Conclusion

LogicLens represents a significant step forward in the fight against text-centric forgery and opens new opportunities for information security and authenticity.
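
The weighted multi-task reward for GRPO-based optimization mentioned above can be pictured with a minimal sketch. Everything specific here is an assumption made for illustration: the three reward components (`detection`, `grounding`, `explanation`), their weights, and the helper names are hypothetical and are not taken from LogicLens. The advantage computation follows the standard group-relative normalization used in GRPO, where each sampled response is scored against the mean and standard deviation of its own group.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

# Hypothetical weights; the actual values used by LogicLens are not stated.
WEIGHTS = {"detection": 0.4, "grounding": 0.3, "explanation": 0.3}


@dataclass
class TaskScores:
    """Per-task scores for one sampled response, each assumed to lie in [0, 1]."""
    detection: float    # e.g. 1.0 if the authenticity label is correct, else 0.0
    grounding: float    # e.g. IoU between predicted and reference regions
    explanation: float  # e.g. a similarity score for the textual explanation


def weighted_reward(scores: TaskScores, weights: dict = WEIGHTS) -> float:
    """Combine the per-task scores into one scalar reward."""
    return (
        weights["detection"] * scores.detection
        + weights["grounding"] * scores.grounding
        + weights["explanation"] * scores.explanation
    )


def group_relative_advantages(rewards: list, eps: float = 1e-8) -> list:
    """Normalize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every response gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Three hypothetical responses sampled for the same forged image.
    group = [
        TaskScores(detection=1.0, grounding=0.8, explanation=0.7),
        TaskScores(detection=1.0, grounding=0.2, explanation=0.4),
        TaskScores(detection=0.0, grounding=0.0, explanation=0.1),
    ]
    rewards = [weighted_reward(s) for s in group]
    for r, a in zip(rewards, group_relative_advantages(rewards)):
        print(f"reward={r:.3f} advantage={a:+.3f}")
```

Under these assumptions, a response that gets the label right but localizes poorly still earns partial credit, and the group normalization rewards it only to the extent that it beats the other samples for the same image.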