Multimodal Retrieval for Engineering Archives
A new multimodal retrieval system called Blueprint tackles the problem of searching the vast archives of engineering drawings and technical documents accumulated over decades. These legacy archives often lack consistent metadata, which makes information hard to find and access.
Blueprint combines several techniques: detection of canonical drawing regions, region-restricted VLM-based OCR, and normalization of identifiers (such as DWG, part numbers, and facility identifiers). The system fuses lexical and dense retrieval with a lightweight region-level reranker.
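To make two of these steps concrete, here is a minimal sketch of identifier normalization and of fusing a lexical ranking with a dense one. The article does not say how Blueprint does either. The normalization rule (uppercase, strip separators) and the choice of reciprocal rank fusion are assumptions made for illustration, and the function names and the constant `k=60` are hypothetical.

```python
import re
from collections import defaultdict


def normalize_identifier(raw: str) -> str:
    """Assumed rule: uppercase and drop every non-alphanumeric character,
    so that differently formatted spellings of one identifier compare equal."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids (best first).

    Each document scores 1 / (k + rank) per list it appears in; the sum
    decides the fused order. Reciprocal rank fusion is a stand-in here,
    not necessarily the fusion method Blueprint uses.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a lexical and a dense retriever.
lexical_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_c", "doc_a"]
fused = reciprocal_rank_fusion([lexical_hits, dense_hits])
```

In a pipeline of this shape, the fused list would then go to the reranker, which is not sketched here.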
Performance and Evaluation
The system was tested on approximately 770,000 unlabeled files, automatically generating structured metadata suitable for cross-facility search. Evaluation on a 5,000-file benchmark with 350 expert-curated queries demonstrated a 10.1% improvement in Success@3 and an 18.9% relative improvement in nDCG@3 compared to the strongest vision-language baseline. The results highlight the potential for further improvement with refinement of region detection and OCR.
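For readers unfamiliar with the two metrics, the sketch below shows how Success@3 and nDCG@3 are commonly computed for a single query. It uses the standard textbook definitions with a log2 discount; the article does not specify the exact variant used in the benchmark, and the function names and sample data are hypothetical.

```python
import math


def success_at_k(ranked, relevant, k=3):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc_id in relevant for doc_id in ranked[:k]) else 0.0


def ndcg_at_k(ranked, gains, k=3):
    """Normalized discounted cumulative gain over the top k results.

    `gains` maps document id to a graded relevance value; documents
    missing from the mapping count as 0.
    """
    dcg = sum(
        gains.get(doc_id, 0.0) / math.log2(position + 2)
        for position, doc_id in enumerate(ranked[:k])
    )
    ideal_gains = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(
        gain / math.log2(position + 2)
        for position, gain in enumerate(ideal_gains)
    )
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical query: the only relevant file is returned in second place.
ranked_results = ["file_x", "file_y", "file_z"]
relevance = {"file_y": 1.0}
query_success = success_at_k(ranked_results, set(relevance), k=3)
query_ndcg = ndcg_at_k(ranked_results, relevance, k=3)
```

Benchmark figures such as those above are typically averages of these per-query values over the whole query set.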
Those evaluating on-premise deployments have trade-offs to weigh carefully. AI-RADAR offers analytical frameworks on /llm-onpremise to evaluate these aspects.