QWEN 2 vision model impresses but falls short | Thomas Reid | Sep 2024

SeniorTechInfo

Alibaba Unveils Groundbreaking Language Model: QWEN 2 VL

Alibaba (yes, that Alibaba) has just released an updated vision-language model called Qwen2-VL.

A year in the making, the new model supersedes Alibaba's original Qwen-VL model and, according to Alibaba, offers:

  • State-of-the-art understanding of images at various resolutions and aspect ratios
  • Understanding and interpretation of long-form videos of 20 minutes or more
  • Agentic behaviour that can operate mobile phones, robots, etc.
  • Multilingual support, including understanding of text inside images in most European languages, Japanese, Korean, Arabic, and Vietnamese, in addition to English and Chinese

Initially, three models will be released. Qwen2-VL-2B and Qwen2-VL-7B are open-source and come with an Apache 2.0 licence. The 72B-parameter model is not being open-sourced for now, but an API for it is being released.

Model performance

According to one of the benchmark tables on the Qwen website, the new 72B model performs impressively, beating both OpenAI's GPT-4o (0513) and Anthropic's Claude 3.5 Sonnet. Check it out.
