research · 94d ago

Testing LLMs on superconductivity research questions

Google Research · score 0.18

Researchers tested four models (PaLM 2, PaLM 2-S, Gemini 1.0, and a fine-tuned BERT) on superconductivity research questions. Performance varied significantly across tasks, which underscores the need for domain-specific evaluation of LLMs. The results can inform your own evaluation of LLMs for scientific research applications.

Key takeaways

PaLM 2-S performed best on superconductivity Q&A.
Fine-tuned BERT matched or exceeded PaLM 2 on some tasks.
LLM performance varied significantly across tasks.

#scientific-llms #domain-specific #fine-tuning

