1sec.ai

Tag

#multilingual

Every item tagged multilingual, newest first.

5 items

models · May 14

Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality

IBM released Granite Embedding Multilingual R2 under Apache 2.0. The open multilingual embedding model offers a 32K-token context window, supports 100+ languages, and is billed as having the best retrieval quality among sub-100M-parameter models. It targets builders who want high-quality embeddings they can deploy locally, for applications that need low-latency, high-accuracy retrieval.

Key takeaways
  • 32K context window for handling long input sequences.
  • Billed as the best retrieval quality among sub-100M-parameter models.
  • Supports 100+ languages for multilingual applications.

research · Nov 21

Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks

The Open ASR Leaderboard now includes multilingual and long-form speech recognition tracks, expanding its scope beyond English-only short-form transcription. This update enables more comprehensive evaluation of automatic speech recognition systems across diverse languages and audio formats. You can explore the refreshed leaderboard and dataset to assess ASR model performance in real-world scenarios. The leaderboard's growth reflects increasing demand for robust, multilingual ASR capabilities.

Key takeaways
  • Open ASR Leaderboard adds multilingual and long-form tracks.
  • Expanded scope enables more comprehensive ASR system evaluation.
  • Leaderboard now reflects growing demand for multilingual ASR.

models · Mar 12

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Google released Gemma 3, a new open LLM that is multimodal, multilingual, and has a long context window. Gemma 3 is available on Hugging Face and aims to provide a high-performance, open alternative for builders. The model supports multiple languages and modalities, making it suitable for a wide range of applications.

Key takeaways
  • Gemma 3 is multimodal and multilingual.
  • Available on Hugging Face.
  • Long context window for handling complex inputs.

models · May 24

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

The Technology Innovation Institute released Falcon 2, an 11B parameter pretrained language model and vision-language model (VLM) trained on over 5000B (5 trillion) tokens across 11 languages. Falcon 2 targets applications that need broad multilingual and multimodal capabilities. The models are available on Hugging Face for research and product development.

Key takeaways
  • 11B parameter model trained on over 5000B tokens across 11 languages.
  • Supports both language and vision-language tasks.
  • Available on Hugging Face for research and development.

models · Jul 12

Introducing The World's Largest Open Multilingual Language Model: BLOOM

The BLOOM model, developed by the BigScience research workshop, is a multilingual language model with 176 billion parameters, making it one of the largest open models available. It was trained on about 1.5 TB of preprocessed text (roughly 350 billion tokens) across 46 natural languages. BLOOM is designed to be a more accessible and transparent alternative to closed language models, allowing builders to fine-tune and adapt it for specific use cases. The model's large size and diverse training data enable it to handle a wide range of natural language…

Key takeaways
  • 176 billion parameters, one of the largest open models.
  • Trained on about 1.5 TB of text (roughly 350 billion tokens) across 46 languages.
  • Designed for accessibility and transparency, allowing fine-tuning.