NLLB-200: An Analysis of Multilingual Representations

A recent study examined the internal representations of Meta's neural machine translation model NLLB-200, a Transformer encoder-decoder supporting 200 languages. The goal was to determine whether the model acquires universal conceptual representations across languages or merely groups languages by surface similarity.

Methodology and Results

The research used the Swadesh core vocabulary list in 135 languages to probe the model's representation geometry. The results indicate that distances between the model's embeddings are significantly correlated with phylogenetic distances ($\rho = 0.13$, $p = 0.020$), suggesting that NLLB-200 has implicitly learned the genealogical structure of human languages. The study also found that frequently colexified concept pairs exhibit significantly higher embedding similarity than non-colexified pairs ($U = 42656$, $p = 1.33 \times 10^{-11}$, $d = 0.96$), indicating that the model has internalized universal conceptual associations.
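The two statistical tests above can be sketched with SciPy on toy data. Everything below (matrix sizes, sample sizes, distributions) is a synthetic stand-in, not the study's actual data or pipeline.

```python
# Hedged sketch of the study's two tests on synthetic data; all
# numbers and names here are illustrative, not the paper's data.
import numpy as np
from scipy.stats import spearmanr, mannwhitneyu

rng = np.random.default_rng(0)
n_lang = 20

# Toy symmetric "embedding" and "phylogenetic" distance matrices.
base = rng.random((n_lang, n_lang))
emb_dist = (base + base.T) / 2
noise = rng.random((n_lang, n_lang))
phylo_dist = emb_dist + 0.5 * (noise + noise.T) / 2

# Spearman correlation over unordered language pairs (upper triangle).
iu = np.triu_indices(n_lang, k=1)
rho, p_rho = spearmanr(emb_dist[iu], phylo_dist[iu])

# Mann-Whitney U test: embedding similarities of "colexified" vs.
# "non-colexified" concept pairs (toy samples with shifted means).
colex_sim = rng.normal(0.6, 0.1, size=200)
noncolex_sim = rng.normal(0.5, 0.1, size=200)
u_stat, p_u = mannwhitneyu(colex_sim, noncolex_sim, alternative="greater")

print(f"Spearman rho = {rho:.2f} (p = {p_rho:.3g})")
print(f"Mann-Whitney U = {u_stat:.0f} (p = {p_u:.3g})")
```

Only the upper triangle of each distance matrix is compared, so each unordered language pair counts once, mirroring how pairwise-distance correlations are usually computed.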

Implications and Tools

Per-language mean-centering of embeddings improves the between-concept to within-concept distance ratio by a factor of 1.19, providing geometric evidence for a language-neutral conceptual store. Semantic offset vectors between fundamental concept pairs show high cross-lingual consistency (mean cosine = 0.84), suggesting that second-order relational structure is preserved across typologically diverse languages. The authors released InterpretCognates, an open-source interactive toolkit for exploring these phenomena, alongside a fully reproducible analysis pipeline.
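The centering and offset analyses can be illustrated on synthetic embeddings. The array shapes, the injected per-language offset, and the helper function below are all assumptions for the sketch, not the authors' released pipeline.

```python
# Hedged sketch of per-language mean-centering and the between/within
# distance ratio; shapes, data, and function names are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n_lang, n_concepts, dim = 5, 8, 16
emb = rng.normal(size=(n_lang, n_concepts, dim))
emb = emb + 2.0 * rng.normal(size=(n_lang, 1, dim))  # per-language offset

def between_within_ratio(x):
    """Mean distance between different concepts, divided by mean
    distance between instances of the same concept across languages."""
    n_l, n_c, _ = x.shape
    flat = x.reshape(n_l * n_c, -1)
    concept = np.tile(np.arange(n_c), n_l)          # concept id per row
    d = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=-1)
    same = concept[:, None] == concept[None, :]
    off_diag = ~np.eye(n_l * n_c, dtype=bool)
    return d[~same].mean() / d[same & off_diag].mean()

# Subtracting each language's mean embedding removes the shared
# language-identity component, so concept structure dominates.
centered = emb - emb.mean(axis=1, keepdims=True)
r_raw = between_within_ratio(emb)
r_centered = between_within_ratio(centered)

# Cross-lingual consistency of one semantic offset (concept 0 - concept 1):
# mean pairwise cosine of the offset vectors across languages.
offs = centered[:, 0] - centered[:, 1]
offs = offs / np.linalg.norm(offs, axis=1, keepdims=True)
mean_cos = (offs @ offs.T)[np.triu_indices(n_lang, k=1)].mean()

print(f"ratio raw = {r_raw:.2f}, centered = {r_centered:.2f}")
print(f"mean offset cosine = {mean_cos:.2f}")
```

On this toy data the centered ratio exceeds the raw one because the injected per-language offset inflates within-concept distances; real embeddings would show the effect only to the extent that a language-identity component actually exists.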
