Script Invariance in Language Models
A recent study posted on arXiv investigates whether the features learned by large language models (LLMs) represent abstract meaning or are tied to the specific written form of the text. The research focuses on Serbian digraphia: Serbian can be written in either the Latin or the Cyrillic alphabet, with a near one-to-one mapping between characters.
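To make the setup concrete, here is a minimal sketch of the Latin-to-Cyrillic transliteration that underlies Serbian digraphia. This is not code from the paper; the character table and digraph handling follow standard Serbian orthography, and the function name is purely illustrative.

```python
# Serbian Latin -> Cyrillic transliteration sketch.
# Digraphs (lj, nj, dž) must be handled before single characters;
# everything else maps one-to-one, which is what makes the two
# scripts near-perfectly interchangeable.

DIGRAPHS = {"lj": "љ", "nj": "њ", "dž": "џ", "Lj": "Љ", "Nj": "Њ", "Dž": "Џ"}
SINGLES = {
    "a": "а", "b": "б", "c": "ц", "č": "ч", "ć": "ћ", "d": "д", "đ": "ђ",
    "e": "е", "f": "ф", "g": "г", "h": "х", "i": "и", "j": "ј", "k": "к",
    "l": "л", "m": "м", "n": "н", "o": "о", "p": "п", "r": "р", "s": "с",
    "š": "ш", "t": "т", "u": "у", "v": "в", "z": "з", "ž": "ж",
}

def latin_to_cyrillic(text: str) -> str:
    out, i = [], 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:  # digraphs take priority over single letters
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            ch = text[i]
            mapped = SINGLES.get(ch.lower(), ch)
            out.append(mapped.upper() if ch.isupper() else mapped)
            i += 1
    return "".join(out)

print(latin_to_cyrillic("Mačka sedi na krovu."))  # -> Мачка седи на крову.
```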
Methodology and Results
The researchers analyzed the feature activations of sparse autoencoders (SAEs) across the Gemma model family (270M-27B parameters). They found that identical sentences written in the two Serbian scripts activate highly overlapping sets of features, far exceeding random baselines. Notably, changing script causes less representational divergence than paraphrasing within the same script, suggesting that SAE features prioritize meaning over orthographic form. Cross-script, cross-paraphrase comparisons argue against simple memorization: such combinations rarely co-occur in training data, yet they still show substantial feature overlap. This script invariance strengthens with model scale.
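The paper's exact pipeline is not reproduced here, but the kind of overlap measurement it describes can be sketched as follows: take the set of SAE features that fire for a sentence in each script and compare their Jaccard overlap to a random-pair baseline. The helper names, the SAE width, and the placeholder activations below are all assumptions for illustration; real activations would come from running both script variants of a sentence through a Gemma model and its SAE.

```python
import numpy as np

def active_features(sae_acts: np.ndarray, threshold: float = 0.0) -> set[int]:
    """Indices of SAE features firing above a threshold for one input."""
    return set(np.flatnonzero(sae_acts > threshold))

def jaccard_overlap(acts_a: np.ndarray, acts_b: np.ndarray) -> float:
    """Overlap of the active-feature sets of two inputs (1.0 = identical)."""
    a, b = active_features(acts_a), active_features(acts_b)
    return len(a & b) / max(len(a | b), 1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_features = 16_384  # illustrative SAE width, not the paper's

    # Placeholder sparse activations standing in for real SAE outputs.
    latin_acts = rng.random(n_features) * (rng.random(n_features) < 0.01)

    # A script-invariant representation would share most active features,
    # so we perturb only a small fraction for the Cyrillic variant.
    cyrillic_acts = latin_acts.copy()
    flip = rng.random(n_features) < 0.001
    cyrillic_acts[flip] = rng.random(flip.sum()) * (rng.random(flip.sum()) < 0.01)

    # Unrelated sentence as a random baseline.
    unrelated_acts = rng.random(n_features) * (rng.random(n_features) < 0.01)

    print("same sentence, two scripts:", jaccard_overlap(latin_acts, cyrillic_acts))
    print("random sentence pair:      ", jaccard_overlap(latin_acts, unrelated_acts))
```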
Implications
The findings suggest that SAE features can capture semantics at a level of abstraction above surface tokenization. The study proposes Serbian digraphia as a general evaluation paradigm for probing the abstractness of learned representations. For readers evaluating on-premise LLM deployments, AI-RADAR offers analytical frameworks at /llm-onpremise for weighing the relevant trade-offs.