Alibaba, yes that Alibaba, has just released an updated multimodal large language model called Qwen2-VL.
A year in the making, the new model supersedes their original Qwen-VL model and, according to Alibaba, offers:
- State-of-the-art understanding of images of various resolutions and aspect ratios
- Understanding and interpretation of longer-form videos of more than 20 minutes
- Agentic behaviour that can operate mobiles, robots, etc.
- Multilingual support, including understanding of text in different languages inside images: most European languages, Japanese, Korean, Arabic, and Vietnamese, in addition to English and Chinese
Initially, three models will be released. Qwen2-VL-2B and Qwen2-VL-7B are open-source and come with an Apache 2.0 licence. The 72B-parameter model is not being open-sourced for now, but an API for it is being released.
Model performance
According to one of the tables on the Qwen website, the new 72B model performs impressively, beating both OpenAI’s GPT-4o (0513) and Claude 3.5 Sonnet. Check it out.