Training LLMs to synthesize interdisciplinary research on climate change

ClimateGPT is a family of domain-specific large language models built for climate change research. The models use a 7-billion-parameter neural network architecture and were trained on scientific papers totaling over 300 billion tokens (words and subword units). After specialized training and fine-tuning on a dataset developed with climate scientists, they perform on par with the larger 70-billion-parameter Llama-2-70B chat model on climate benchmarks. To reduce incorrect outputs, the models use hierarchical retrieval-augmented generation (RAG), which grounds responses in verified sources. The models were trained using renewable energy and are publicly available. However, their reliance on primarily English-language scientific papers from Global North institutions risks missing crucial climate research and adaptation strategies from the Global South. Future plans include machine translation capabilities, though addressing the underlying gaps in data representation remains crucial.
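
To make the idea of hierarchical retrieval concrete, here is a minimal Python sketch of a two-stage lookup: rank whole documents first, then rank passages only inside the best documents, and place the selected passages in the prompt. This is an illustration under stated assumptions, not ClimateGPT's actual pipeline. The function names, the toy corpus, and the bag-of-words similarity (standing in for a learned embedding model) are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; titles and texts are placeholders, not real sources.
CORPUS = [
    {
        "title": "Doc A (placeholder)",
        "summary": "sea level rise and coastal flooding",
        "passages": [
            "Coastal flooding risk grows as sea level rises.",
            "Urban drainage planning can reduce flooding damage.",
        ],
    },
    {
        "title": "Doc B (placeholder)",
        "summary": "drought and crop yields",
        "passages": [
            "Drought reduces crop yields in dry regions.",
            "Irrigation planning helps farms adapt to drought.",
        ],
    },
]


def bag_of_words(text):
    """Lowercased word counts; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hierarchical_retrieve(query, corpus, top_docs=1, top_passages=2):
    """Stage 1 ranks documents by summary; stage 2 ranks passages within the winners."""
    q = bag_of_words(query)
    ranked_docs = sorted(
        corpus,
        key=lambda doc: cosine(q, bag_of_words(doc["summary"])),
        reverse=True,
    )[:top_docs]
    # Only passages from the selected documents are considered in stage 2.
    candidates = [(doc["title"], p) for doc in ranked_docs for p in doc["passages"]]
    candidates.sort(key=lambda pair: cosine(q, bag_of_words(pair[1])), reverse=True)
    return candidates[:top_passages]


def build_prompt(query, passages):
    """Number the retrieved passages so an answer can cite its sources."""
    sources = "\n".join(
        f"[{i}] {title}: {text}" for i, (title, text) in enumerate(passages, start=1)
    )
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {query}"


if __name__ == "__main__":
    question = "How does sea level rise affect coastal flooding?"
    print(build_prompt(question, hierarchical_retrieve(question, CORPUS)))
```

Narrowing to documents before scoring passages keeps the second stage small and ties every retrieved passage to a named source, which is the general motivation for grounding responses this way.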

As climate research expands, these models can serve as valuable tools for synthesizing findings and supporting urban planning, risk assessment, and resilience-building efforts, provided they evolve to incorporate diverse knowledge sources and research perspectives from all climate-affected regions.

Source: arxiv.org

Sector: Innovation Systems
