#scientific-llms — 1sec.ai

Testing LLMs on superconductivity research questions

Researchers tested four LLMs (PaLM 2, PaLM 2-S, Gemini 1.0, and a fine-tuned BERT) on superconductivity research questions. Performance varied significantly across tasks, which highlights the need for domain-specific evaluation of LLMs. You can use these results to inform your own evaluation of LLMs for scientific research applications.

Key takeaways

PaLM 2-S performed best on superconductivity Q&A.
Fine-tuned BERT matched or exceeded PaLM 2 on some tasks.
LLM performance varied significantly across tasks.

Google Research
#scientific-llms #domain-specific #fine-tuning