IndicContextEval: A Benchmark for Evaluating Context Utilisation in Audio Large Language Models Across 8 Indic Languages

arXiv

Researchers introduce IndicContextEval, a 56-hour benchmark that evaluates how well audio large language models use contextual inputs across 8 Indic languages. It tests whether models can incorporate domain descriptions and entity lists into speech recognition, and so whether they actually use the supplied context or fall back on pre-trained knowledge. The benchmark can be used to develop and evaluate models that make better use of contextual cues in multilingual speech.

Key takeaways

IndicContextEval is a 56-hour multilingual benchmark.
Evaluates context utilisation in audio LLMs across 8 Indian languages.
Tests models' ability to incorporate domain descriptions and entity lists.

#multilingual-llms #speech-recognition #benchmarks
