Semantic Distance as Routing Layer: A Decentralized On-Device Discovery Model

The Era of the Central Index and Its Challenges

For nearly thirty years, the discovery of information and people has been mediated by a central index. Search engines and recommender systems have dominated the landscape, computing rankings and suggestions server-side. This approach, while historically convenient, has significant drawbacks: ranking rules are often opaque, and the incentives behind those rankings do not always align with the interests of end users.

For companies and technical decision-makers, this centralization raises crucial questions about data control, sovereignty, and transparency. Dependence on external infrastructure and proprietary algorithms can create compliance and security risks and can ultimately raise the long-term Total Cost of Ownership (TCO), especially for sensitive AI workloads that require granular control over the entire pipeline.

A Prototype for Decentralized Discovery

Building on these premises, a recent prototype explores a radical alternative: a serverless, on-device discovery model. The underlying hypothesis is that if each device can locally run a competent embedding model and communicate peer-to-peer with other devices, relevance can be computed at the edge, eliminating the need for a central index or a privileged ranking entity.

The prototype, built to pressure-test this idea in a real environment rather than in simulation, works as follows. Each "post" (or piece of information) is encoded into an embedding by a model running directly on the device, in this case EmbeddingGemma-300M. A lightweight, signed announcement containing the author and the embedding is gossiped peer-to-peer within a shared "room." Full content bodies are pulled only for the bounded set of items that a node actually admits. Each device ranks incoming posts against its own posts by cosine similarity and maintains a bounded local inbox. The architecture has no server, no accounts, and no global ranking; the address space is defined by semantic meaning.
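
The ranking and bounded-inbox step can be illustrated with a minimal Python sketch. It is not the prototype's code: the names Announcement, LocalInbox, and offer are hypothetical, the choice to score an incoming post by its best cosine match against any of the device's own posts is an assumption, and the capacity value is arbitrary. Embedding generation, signature verification, and the gossip transport are left out, and short toy vectors stand in for real embeddings.

```python
import heapq
import math
from dataclasses import dataclass


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


@dataclass
class Announcement:
    """Hypothetical announcement: author plus embedding (signature omitted)."""
    author: str
    embedding: list


class LocalInbox:
    """Keeps only the `capacity` announcements closest to the local posts."""

    def __init__(self, own_embeddings, capacity=3):
        self.own_embeddings = own_embeddings
        self.capacity = capacity
        self._heap = []  # min-heap of (score, arrival_order, announcement)
        self._arrivals = 0

    def score(self, announcement):
        # Assumption: relevance is the best match against any local post.
        return max(
            (cosine_similarity(announcement.embedding, own)
             for own in self.own_embeddings),
            default=0.0,
        )

    def offer(self, announcement):
        """Return True if admitted, i.e. the full body is worth pulling."""
        self._arrivals += 1
        entry = (self.score(announcement), self._arrivals, announcement)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        if entry[0] > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)  # evict the weakest item
            return True
        return False

    def ranked(self):
        """Admitted announcements, most similar first."""
        ordered = sorted(self._heap, key=lambda e: (e[0], e[1]), reverse=True)
        return [announcement for _, _, announcement in ordered]
```

In this sketch a node would call offer for every announcement it hears in the room and request the full body only when the call returns True, which keeps both storage and bandwidth bounded by the inbox capacity.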

Implications for On-Premise Deployments and Data Sovereignty

This decentralized approach offers significant advantages for organizations evaluating on-premise deployments or air-gapped environments. Eliminating the centralized server infrastructure for discovery and ranking reduces single points of failure and strengthens data sovereignty, because sensitive information remains on user devices or within the company's controlled perimeter. This is particularly relevant for sectors with stringent compliance requirements, such as finance or healthcare.

Furthermore, the same technological substrate can be extended to let AI agents discover each other. An agent could publish a "need" or an "offer" in the form of an embedding, and other agents with semantically close profiles could respond. This capacity for self-organization and mutual discovery among agents opens new possibilities for distributed AI architectures, reducing reliance on central orchestrators and improving overall system resilience.
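
As a rough illustration of the agent case, the sketch below decides which agents would respond to a published need. Everything in it is an assumption made for the example: the names AgentNotice and responders, the idea of a fixed similarity threshold, the threshold value of 0.75, and the three-dimensional toy profiles. The article does not describe how the prototype would implement agent matching.

```python
import math
from dataclasses import dataclass


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class AgentNotice:
    """Hypothetical notice an agent gossips to its peers."""
    agent_id: str
    kind: str        # "need" or "offer"
    embedding: list


def responders(notice, profiles, threshold=0.75):
    """Return ids of agents whose profile is close enough to the notice.

    `profiles` maps agent id -> profile embedding. In a real peer-to-peer
    setting each agent would run this check locally against its own
    profile only; evaluating all of them here just simulates the outcome.
    """
    scored = []
    for agent_id, profile in profiles.items():
        if agent_id == notice.agent_id:
            continue  # an agent does not answer its own notice
        similarity = cosine_similarity(notice.embedding, profile)
        if similarity >= threshold:
            scored.append((similarity, agent_id))
    scored.sort(reverse=True)  # closest profile first
    return [agent_id for _, agent_id in scored]


# Toy profiles standing in for real embeddings.
profiles = {
    "translator": [0.9, 0.1, 0.0],
    "planner": [0.0, 1.0, 0.0],
    "summarizer": [0.8, 0.0, 0.6],
}
need = AgentNotice("agent-x", "need", [1.0, 0.0, 0.0])
matches = responders(need, profiles)
```

With these toy vectors, the translator and summarizer profiles clear the threshold and the planner does not, so only the first two would answer the need.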

Towards a Distributed Future for LLMs

The prototype demonstrates that reliance on central indexes for discovery is not a fundamental requirement, but rather a historical convenience that can be overcome with current edge processing capabilities and peer-to-peer networks. Implementing large-scale systems on this paradigm may raise new challenges in consistency management and global scalability, but the potential benefits for control, privacy, and resilience are clear.

For companies investing in Large Language Models (LLMs) and AI infrastructure, exploring decentralized models like this one offers a path to mitigate the risks associated with centralization and to optimize TCO by shifting part of the computational load to the edge. AI-RADAR continues to monitor these innovations, providing analytical frameworks to evaluate the trade-offs between centralized and distributed architectures and supporting technical decision-makers in choosing the solutions best suited to their sovereignty and control needs.