New datasets will train AI models to think like scientists
The initiative, called Polymathic AI, uses technology like that powering large language models such as OpenAI’s ChatGPT or Google’s Gemini. But instead of ingesting text, the project’s models learn using scientific datasets from across astrophysics, biology, acoustics, chemistry, fluid dynamics and more, essentially giving the models cross-disciplinary scientific knowledge.
“These datasets are by far the most diverse large-scale collections of high-quality data for machine learning training ever assembled for these fields,”…

