Task automation with Qwen2-0.5B on CPU
A developer has presented the results of fine-tuning the Qwen2-0.5B model for task automation. The system receives tasks in natural language (e.g., "copy logs to backup"), identifies the task type (atomic, repetitive, or clarification), and generates execution plans consisting of CLI commands and hotkeys.
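The article does not specify the plan format, so here is a minimal sketch of how such a pipeline might validate the model's output, assuming a hypothetical JSON schema with a `task_type` field and a list of `steps`:

```python
import json

# Task types named in the article; the JSON schema itself is an assumption.
TASK_TYPES = {"atomic", "repetitive", "clarification"}

def parse_plan(raw_output: str) -> dict:
    """Validate a model response shaped like
    {"task_type": "...", "steps": ["cmd1", "cmd2", ...]}."""
    plan = json.loads(raw_output)
    if plan.get("task_type") not in TASK_TYPES:
        raise ValueError(f"unknown task type: {plan.get('task_type')!r}")
    if not isinstance(plan.get("steps"), list):
        raise ValueError("steps must be a list of CLI commands/hotkeys")
    return plan

raw = '{"task_type": "atomic", "steps": ["cp /var/log/app.log /backup/"]}'
plan = parse_plan(raw)
print(plan["task_type"])  # -> atomic
```

Validating against a strict schema like this is one common way to keep a small model's free-form output from producing unrunnable plans.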
Inference runs entirely locally on the CPU, with no GPU or cloud APIs. The base model is Qwen2-0.5B, fine-tuned with LoRA on approximately 1,000 custom task examples. The model is quantized to GGUF Q4_K_M (~300 MB), and inference is handled by llama.cpp, with response times of 3-10 seconds on i3/i5 processors.
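A llama.cpp invocation for such a setup might look like the following; the model filename and prompt format are assumptions, not taken from the article:

```shell
# Hypothetical sketch: run the quantized model on CPU with llama.cpp.
# -t pins the thread count to the machine's physical cores.
./llama-cli -m qwen2-0.5b-task-Q4_K_M.gguf \
    -p "Task: copy logs to backup\nPlan:" \
    -n 128 --temp 0.2 -t 4
```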
Challenges and limitations
The main challenges during training involved data quality, overfitting, and EOS token handling. Converting to GGUF format required the BF16 data type and imatrix quantization to obtain stable outputs.
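The BF16-plus-imatrix pipeline described above can be sketched with the standard llama.cpp tools; the file names and the calibration file are assumptions:

```shell
# 1. Export the merged model to a BF16 GGUF (hypothetical paths).
python convert_hf_to_gguf.py ./qwen2-0.5b-merged --outtype bf16 \
    --outfile qwen2-0.5b-bf16.gguf
# 2. Build an importance matrix from a calibration text file.
./llama-imatrix -m qwen2-0.5b-bf16.gguf -f calibration.txt -o imatrix.dat
# 3. Quantize to Q4_K_M, guided by the imatrix.
./llama-quantize --imatrix imatrix.dat qwen2-0.5b-bf16.gguf \
    qwen2-0.5b-Q4_K_M.gguf Q4_K_M
```

The imatrix step weights quantization error by how strongly each tensor influences outputs on the calibration data, which is why it helps small models stay stable at 4-bit precision.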
Currently, the system requires full file paths (without smart search), only supports CPU inference, and performs basic tasks without visual understanding. Performance varies: 3-5 seconds on i5 (2018+) with SSD, 5-10 seconds on i3 (2015+) with SSD, and 30-90 seconds on older hardware (Pentium + HDD).