NanoLLama is a framework for training Llama 3-architecture models from scratch. Unlike fine-tuning or LoRA-based approaches, NanoLLama performs complete pre-training and produces a GGUF file compatible with llama.cpp.
Key Features
- Simplified Training: The entire training process, from data download to GGUF export, is executed with a single command.
- Llama 3 Architecture: Supports the full Llama 3 architecture, with configurations ranging from 46 million to 7 billion parameters.
- Multi-corpus Training: Uses a multi-corpus training approach, based on the SmolLM2 recipe, including FineWeb-Edu, DCLM, code, and mathematics.
- Native GGUF Export: Exports directly to GGUF v3 format, without the need for conversions via HuggingFace or safetensors.
- Personality Injection: Allows training a base model and a model with personality, then subtracting the weights to obtain a portable personality vector.
- Go Inference Engine: Includes an inference engine developed in Go (approximately 9MB), which directly reads GGUF files, useful when the entire llama.cpp stack is not needed.
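The personality-injection feature above amounts to simple tensor arithmetic: train a base model and a personality-tuned model on the same architecture, subtract the base weights from the tuned weights to get a portable delta, and add that delta to another base model. A minimal sketch of the idea, assuming weights are held in a plain name-to-tensor dictionary (the function names and state-dict format here are illustrative, not NanoLLama's actual API):

```python
def extract_personality_vector(base, tuned):
    """Return the per-tensor delta {name: tuned - base}.

    Both models must share the same architecture, i.e. identical
    tensor names and shapes. (Illustrative helper, not NanoLLama's API.)
    """
    assert base.keys() == tuned.keys(), "models must share architecture"
    return {name: tuned[name] - base[name] for name in base}


def apply_personality_vector(base, vector, scale=1.0):
    """Inject a personality delta into another base model.

    `scale` lets you dial the personality strength up or down.
    """
    return {name: base[name] + scale * vector.get(name, 0.0)
            for name in base}
```

With scalar stand-ins for tensors, `extract_personality_vector({"w": 1.0}, {"w": 3.0})` yields `{"w": 2.0}`, which `apply_personality_vector` can then add onto any compatible base model.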
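A small reader like the bundled Go engine starts from the fixed GGUF preamble: the 4-byte magic `GGUF`, a little-endian uint32 version, then two uint64 counts for tensors and metadata key-value pairs. The sketch below parses just that preamble in Python; it follows the GGUF layout as documented in the llama.cpp repository and is not code from NanoLLama itself:

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF preamble: magic, version, tensor and KV counts."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: {magic!r}")
        # Little-endian: uint32 version, uint64 tensor_count, uint64 kv_count.
        version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
    return {"version": version,
            "tensor_count": tensor_count,
            "metadata_kv_count": kv_count}
```

For a GGUF v3 file this returns `version == 3`; the metadata key-value pairs and tensor descriptors follow immediately after this preamble.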
Pre-trained Models
Several models have already been trained and verified, including nano (46M), micro (87M), mini (175M), and small (338M). Training is underway for goldie (1.1B), a multilingual model.