Contrastive Data Injection to Improve Language Models
A recent study explored a technique to improve bias resistance and reduce sycophancy in language models, achieving promising results with relatively small models. The approach is based on injecting contrastive data pairs during the pre-training phase, which shows an effect even at minimal proportions (0.05%).
The results indicate that a 7-million-parameter model trained with this technique can match the behavioral performance of standard models with a significantly larger parameter count (18-34 million).
Implementation Details
The technique does not require modifications to the model architecture or the addition of an auxiliary loss function. The injection of contrastive data appears to provide the model with clear examples of the desired behaviors, compensating for the lack of sufficient signals in standard pre-training datasets such as OpenWebText.
Interestingly, the injection rate affects the results in a non-linear way: a proportion of 5% appears to be optimal, while 10% worsens both behavioral scores and factual accuracy.
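The mixing step described above can be sketched as follows. This is a hypothetical illustration, not the study's actual pipeline: the function name `mix_contrastive`, the pair format, and the probabilistic sampling scheme are all assumptions; the study does not specify how pairs are interleaved into the corpus.

```python
import random

def mix_contrastive(pretrain_docs, contrastive_pairs, injection_rate=0.05, seed=0):
    """Interleave contrastive pairs into a pre-training stream at a fixed rate.

    Hypothetical sketch: before each ordinary document, a contrastive pair is
    emitted with probability `injection_rate`. A pair holds a desired and an
    undesired completion of the same prompt, giving the model an explicit
    example of the target behavior.
    """
    rng = random.Random(seed)
    for doc in pretrain_docs:
        if contrastive_pairs and rng.random() < injection_rate:
            good, bad = rng.choice(contrastive_pairs)
            yield f"PROMPT-PAIR\nPREFERRED: {good}\nREJECTED: {bad}"
        yield doc

# Usage: a 5% injection rate over a toy 1000-document corpus.
corpus = [f"doc {i}" for i in range(1000)]
pairs = [("states the evidence plainly", "flatters the user's wrong claim")]
mixed = list(mix_contrastive(corpus, pairs, injection_rate=0.05))
injected = sum(1 for d in mixed if d.startswith("PROMPT-PAIR"))
```

With this scheme, varying `injection_rate` (0.0005, 0.05, 0.10) would reproduce the dosage sweep the study reports, while everything else in pre-training stays unchanged.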
Results on Larger Models
The technique was successfully replicated on 12 and 34 million parameter models, showing a similar trend. In particular, contrastive injection seems to resolve a scaling anomaly observed in vanilla 64 million parameter models, where bias resistance tends to regress. With contrastive injection, however, bias resistance remains stable across all scales tested.
The study suggests that, if this technique proves effective on larger models as well, it could make it possible to achieve behavioral quality comparable to that of models with 5-10 times as many parameters. This would pave the way for running capable language models on resource-constrained devices, such as smartphones, without the need for dedicated GPUs.