A user in the LocalLLaMA community expressed disappointment with Tencent's Youtu-VL-4B-Instruct model after finding it incomplete, despite its advertised advanced computer vision features.
Incomplete Implementation
The model is advertised on Hugging Face as a state-of-the-art (SOTA) solution for object detection, semantic segmentation, and grounding. The released version, however, can describe image content but lacks those advanced capabilities. The user discovered that the missing features were listed as "TODO" on GitHub and mentioned in a discussion related to another model.
Restrictive Usage License
In addition to the shortcomings in the code, the model's usage license explicitly prohibits use within the European Union. This, combined with the incomplete state of the model, led the user to advise against its use, warning others of potential wasted time.
Those evaluating the deployment of models of this type, especially in environments with data sovereignty requirements, may find the analytical frameworks offered by AI-RADAR on /llm-onpremise useful for weighing the trade-offs between different options.