IndicContextEval: A Benchmark for Evaluating Context Utilisation in Audio Large Language Models Across 8 Indic Languages
Researchers introduce IndicContextEval, a 56-hour benchmark that evaluates how well audio large language models utilise contextual inputs across 8 Indian languages. The benchmark tests whether models can incorporate domain descriptions and entity lists into speech recognition, with the aim of determining whether they genuinely use the supplied context or fall back on pre-trained knowledge. It can be used to develop and evaluate models that make better use of contextual cues in multilingual speech.
Key takeaways
- IndicContextEval is a 56-hour multilingual benchmark.
- Evaluates context utilisation in audio LLMs across 8 Indian languages.
- Tests models' ability to incorporate domain descriptions and entity lists.
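To make the idea of "context utilisation" concrete, the following is a minimal sketch of how one might compare a transcript produced with context against one produced without it. It is not the benchmark's actual protocol or metric: the `Sample` structure, the prompt layout in `build_prompt`, the `entity_recall` measure, and the example sentences are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical record: one utterance plus the context given to the model."""
    reference: str        # ground-truth transcript
    entities: list[str]   # entity list supplied as context (may include distractors)
    domain: str           # short domain description supplied as context


def build_prompt(sample: Sample) -> str:
    """Assumed prompt layout; the benchmark's real template is not stated in the text."""
    entity_str = ", ".join(sample.entities)
    return f"Domain: {sample.domain}\nEntities: {entity_str}\nTranscribe the audio."


def entity_recall(hypothesis: str, sample: Sample) -> float:
    """Fraction of entities spoken in the reference that also appear in the hypothesis."""
    # Only entities that actually occur in the reference count; distractors are ignored.
    spoken = [e for e in sample.entities if e in sample.reference]
    if not spoken:
        return 0.0
    hits = sum(1 for e in spoken if e in hypothesis)
    return hits / len(spoken)


def context_gain(hyp_without: str, hyp_with: str, sample: Sample) -> float:
    """Positive values suggest the model used the supplied context."""
    return entity_recall(hyp_with, sample) - entity_recall(hyp_without, sample)


# Made-up example data, not taken from the benchmark.
sample = Sample(
    reference="please book a ticket to Tiruchirappalli on Vande Bharat",
    entities=["Tiruchirappalli", "Vande Bharat", "Rajdhani"],
    domain="railway booking",
)
hyp_without = "please book a ticket to Trichy on Vande Bharat"
hyp_with = "please book a ticket to Tiruchirappalli on Vande Bharat"
gain = context_gain(hyp_without, hyp_with, sample)
```

A gain near zero on samples like this would suggest the model is relying on what it already knows rather than on the entity list, which is the distinction the benchmark is described as probing. Exact substring matching is a deliberate simplification; a real evaluation would likely need script-aware normalisation for Indic text.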