French Startup Mistral AI Unveils Free Pixtral 12B Model, Rivals OpenAI with Multimodal Capabilities

September 11, 2024
French Startup Mistral AI Unveils Free Pixtral 12B Model, Rivals OpenAI with Multimodal Capabilities
  • Sophia Yang, head of developer relations at Mistral, emphasized that Pixtral 12B uniquely supports an arbitrary number of images of various sizes.

  • The architecture of Pixtral 12B includes 40 layers, 14,336 hidden dimensions, and 32 attention heads, ensuring high computational capability.

  • Concerns have been raised regarding the training data for Pixtral 12B, particularly its reliance on publicly available web data, which has led to copyright lawsuits against other AI companies.

  • This new model, which boasts a size of 24GB, is available for free under the Apache 2.0 license, allowing users unrestricted access and modification.

  • The release of Pixtral 12B follows Mistral's recent funding success, raising $645 million and achieving a valuation of $6 billion, with notable investors including Microsoft.

  • The announcement of Pixtral 12B was made via Mistral's official account on X (formerly Twitter), and the model weights can be found on platforms like Hugging Face and GitHub.

  • Overall, Mistral aims to democratize access to visual applications for content and data analysis through this innovative launch.

  • Mistral AI, a French startup, has launched Pixtral 12B, a multimodal model designed to process both images and text.

  • Users will soon be able to access Pixtral 12B through platforms like Le Chat and Le Platforme, provided they have accounts.

  • The model features a dedicated vision encoder capable of processing high-resolution images up to 1024x1024 pixels, enhancing its image processing capabilities.

  • Pixtral 12B allows users to input images via URLs or base64 encoding, enabling functionalities like image captioning and object counting.

  • With the launch of Pixtral 12B, Mistral positions itself as a competitor to industry leaders like OpenAI and Anthropic, which offer similar multimodal capabilities.

Summary based on 13 sources


Get a daily email with more Startups stories

More Stories