The article raises an interesting point: the need to return to the "experimental" era of large language model (LLM) development.

The Homogenization of Models

Today, most models converge on the same "helpful assistant" persona, trained on overlapping datasets and fine-tuned for similar tasks. This approach, while effective, risks narrowing creativity and discouraging the exploration of new possibilities.

The Importance of Unconventional Data

The author suggests re-evaluating "unconventional" data sources for training. Projects like GPT-4chan, which was fine-tuned on posts from an online forum, demonstrate the potential of this approach. The idea is that combining high-performance base models with distinctive, niche datasets could yield surprising and unexpected results.

Beyond Simple Provocation

The article cites MechaEpstein as an example of a model moving in this direction, but suggests there is room for greater creativity and originality beyond mere provocation or stereotypical responses.