IAP-25-082
Secure and Decentralized Federated Learning for Environmental Forecasting
Weather forecasting and environmental prediction are vital for managing risks related to flooding, extreme temperatures, and air quality. However, most current forecasting systems rely on centralised data processing, raising concerns about privacy, data ownership, and resilience to cyber attacks or system failures. These issues have received little attention, both in the UK and globally.
This project builds on recent work by Dr Aydin Abadi and colleagues at Newcastle University, who developed a decentralised weather forecasting framework that combines Federated Learning (FL) and blockchain technology. The approach enables multiple organisations, such as weather agencies and research institutes, to train shared forecasting models without exchanging raw data. Blockchain ensures transparent model validation, while privacy-preserving methods protect sensitive local observations.
This PhD will extend that research by improving the scalability, accuracy, and security of decentralised environmental forecasting. It will explore advanced cryptographic techniques, including secure aggregation, privacy-preserving consensus, and private set intersection, to protect participants’ data, and it will evaluate performance using real and synthetic meteorological datasets. The goal is to deliver a scalable and secure collaborative forecasting framework that strengthens environmental resilience and benefits society.
The project sits within IAPETUS’s “Hazards, Risks and Resilience” theme and bridges computing, artificial intelligence, and environmental data science. It will equip the student with interdisciplinary skills in secure machine learning, blockchain systems, and environmental modelling, contributing to the UK’s broader efforts to ensure climate and hazard resilience.
Methodology
The project will develop and evaluate a privacy-preserving federated learning (FL) framework for weather and environmental forecasting. It builds on previous work [1] at Newcastle University that integrated FL, blockchain, and decentralised storage into a prototype system for secure collaborative forecasting. Traditional forecasting systems require centralising data from multiple organisations, such as weather stations, local authorities, and environmental sensors. These datasets often contain sensitive or proprietary information (for example, precise locations of critical infrastructure or privately operated sensors), making direct data sharing impractical. A privacy-preserving FL framework enables joint model training across these data sources without exposing raw data, thus supporting trustworthy and collaborative forecasting. The methodology consists of three main components: data and model design, privacy-preserving system development, and implementation and evaluation. These complementary phases cover the entire research process, from data preparation and algorithmic design to prototype development and empirical testing.
Data and Model Design. The study will use publicly available meteorological datasets (such as temperature, humidity, and air-pressure records) and synthetic environmental data. It will also explore data from the Newcastle Urban Observatory [4] to evaluate the framework under realistic UK conditions. The research will compare a range of machine learning algorithms, including non-parametric methods (e.g., K-nearest neighbour and kernel regression) and parametric models, to capture both temporal and spatial weather patterns. Each participating node will train locally on its dataset to maintain data privacy.
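The two non-parametric methods named above can be illustrated with a minimal sketch. It assumes NumPy, a Gaussian kernel, and a tiny made-up local dataset; the function names, feature choice, and bandwidth are hypothetical and are not taken from the prior work [1].

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Average the targets of the k training points closest to x_query."""
    distances = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(distances)[:k]
    return float(np.mean(y_train[nearest]))

def kernel_predict(X_train, y_train, x_query, bandwidth=1.0):
    """Nadaraya-Watson kernel regression with a Gaussian kernel."""
    sq_dist = np.sum((X_train - x_query) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return float(np.sum(weights * y_train) / np.sum(weights))

# Hypothetical observations held by one node:
# features are [humidity %, pressure hPa], target is temperature in Celsius.
X_local = np.array([[60.0, 1012.0], [65.0, 1010.0], [70.0, 1008.0], [80.0, 1002.0]])
y_local = np.array([15.0, 14.0, 13.0, 10.0])
query = np.array([66.0, 1010.0])

knn_estimate = knn_predict(X_local, y_local, query, k=2)
kernel_estimate = kernel_predict(X_local, y_local, query, bandwidth=5.0)
print(knn_estimate, kernel_estimate)
```

Both estimators operate only on the node's own records, which is consistent with local training. In practice the features would likely be standardised first so that no single variable dominates the distance computation.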
Secure Collaboration (Privacy-Preserving System Design). Instead of sharing raw data, each node will exchange model updates through a secure aggregation protocol. The research will develop efficient and scalable FL mechanisms that allow edge devices, which are often resource-constrained, to collaborate with sensors and blockchain-based smart contracts to produce reliable models in a verifiable way. A privacy-preserving consensus (i.e., voting) layer will be designed to resist Sybil, data-poisoning, and model-tampering attacks while preserving the privacy of participants’ inputs when validating model quality. The design builds on work by Dr Abadi and colleagues on federated learning [1, 2] and on privacy-preserving voting and dispute-resolution mechanisms [3].
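The idea behind secure aggregation can be sketched with pairwise additive masking, one common construction. This is a simplified illustration rather than the protocol the project will adopt: it assumes honest nodes, no dropouts, and pairwise seeds that in a real system would come from a key-agreement step. All names, the modulus, and the fixed-point scale are hypothetical.

```python
import numpy as np

MODULUS = 2 ** 32  # all masked arithmetic is done modulo this value
SCALE = 10 ** 4    # fixed-point scale used to encode real-valued updates

def encode(update):
    """Encode a real-valued vector as fixed-point integers modulo MODULUS."""
    return np.round(update * SCALE).astype(np.int64) % MODULUS

def decode(total):
    """Map a modular sum back to signed real values."""
    signed = np.where(total >= MODULUS // 2, total - MODULUS, total)
    return signed / SCALE

def pairwise_mask(seed, size):
    """Expand a shared seed into a pseudorandom mask vector."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, MODULUS, size=size, dtype=np.int64)

def mask_update(node_id, update, pair_seeds):
    """For each pair (i, j), node i adds the shared mask and node j subtracts it."""
    masked = encode(update)
    for (i, j), seed in pair_seeds.items():
        if node_id == i:
            masked = (masked + pairwise_mask(seed, update.size)) % MODULUS
        elif node_id == j:
            masked = (masked - pairwise_mask(seed, update.size)) % MODULUS
    return masked

# Hypothetical model updates from three nodes.
updates = {
    0: np.array([0.10, -0.20, 0.30]),
    1: np.array([0.05, 0.15, -0.25]),
    2: np.array([-0.02, 0.01, 0.04]),
}
# Stand-ins for secrets agreed between each pair of nodes.
pair_seeds = {(0, 1): 11, (0, 2): 22, (1, 2): 33}

masked = {node: mask_update(node, u, pair_seeds) for node, u in updates.items()}
aggregate = decode(sum(masked.values()) % MODULUS)
print(aggregate)  # the masks cancel, leaving only the sum of the updates
```

The aggregator sees only the masked vectors, yet their sum equals the sum of the true updates because every mask is added once and subtracted once. Handling node dropouts and malicious participants requires additional machinery that this sketch omits.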
System Implementation and Evaluation. A prototype of the proposed system will be implemented in Python using the Flower FL framework (developed by Flower Labs, where external collaborator Dr Naseri is based). The framework will be evaluated for prediction accuracy, computational overhead, communication efficiency, and robustness under different settings (e.g., number of nodes, dataset sizes, and adversarial conditions).
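As an illustration of the aggregation step that such a prototype performs each round, the sketch below implements sample-weighted federated averaging (FedAvg) in plain NumPy and reports the size of one model update as a rough proxy for communication cost. It deliberately avoids the Flower API; the function name, the toy weights, and the sample counts are assumptions made for the example.

```python
import numpy as np

def federated_average(node_weights, node_sample_counts):
    """Average each layer across nodes, weighting by local sample count."""
    total = sum(node_sample_counts)
    n_layers = len(node_weights[0])
    averaged = []
    for layer in range(n_layers):
        layer_sum = sum(
            weights[layer] * (count / total)
            for weights, count in zip(node_weights, node_sample_counts)
        )
        averaged.append(layer_sum)
    return averaged

# Hypothetical linear models (weight matrix and bias) trained at two nodes.
node_a = [np.array([[1.0, 2.0]]), np.array([0.0])]
node_b = [np.array([[3.0, 4.0]]), np.array([1.0])]

global_weights = federated_average([node_a, node_b], [100, 300])

# Bytes a node would send per round if it transmitted the full model.
update_bytes = sum(layer.nbytes for layer in global_weights)
print(global_weights, update_bytes)
```

Node B holds three times as many samples as node A, so its parameters receive three times the weight in the global model. Varying the number of nodes and their sample counts in a harness like this is one simple way to explore the evaluation settings listed above.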
The candidate will have the opportunity to collaborate with an industry-based external expert, Dr Mohammad Naseri at Flower Labs, to study potential attack vectors and security challenges in federated learning, particularly in scenarios involving autonomous or sensor-driven data sources that lack direct human oversight. Furthermore, the candidate will work with Dr Thomas Zacharias on formally modelling and developing provably secure privacy-enhancing technologies to strengthen the theoretical foundations of the framework. In addition, the project will collaborate with Dr Ameer Mohammad from Kuwait University, whose research group has a strong track record in cryptography, privacy-preserving machine learning, and federated learning. Their involvement will provide complementary expertise in secure protocol design, performance evaluation on various datasets, and cross-institutional testing. The collaboration will also enhance international research exchange between Newcastle University and Kuwait University, broadening the project’s experimental scope and contributing to global efforts on trustworthy AI and environmental resilience.
Project Timeline
Year 1
The first year will focus on establishing the theoretical, experimental, and technical foundations of the project.
Months 1–3: The candidate will conduct a comprehensive literature review covering federated learning, privacy-preserving machine learning, and blockchain-based validation for environmental forecasting. The candidate will also identify relevant datasets (public meteorological data and Urban Observatory records) and define evaluation metrics.
Furthermore, the candidate will take relevant introductory courses in cryptography and machine learning. These courses are offered by Newcastle University; if the student misses the start date, comparable courses are available online via Coursera for a modest fee.
Months 4–6: The candidate will develop the initial data-processing pipeline and baseline forecasting models (including non-parametric algorithms such as K-nearest neighbour and kernel regression, alongside parametric models). The candidate will evaluate baseline performance using standard metrics (e.g., RMSE).
Months 7–9: The candidate will design and implement the preliminary privacy-preserving federated learning framework, focusing on secure aggregation and data-handling mechanisms. The candidate will begin with small-scale experiments on synthetic datasets to test functionality. Several publicly available datasets on Kaggle, e.g., Kaggle weather prediction [5], are also available for these early experiments.
Months 10–12: During this period, the candidate will integrate blockchain-based model validation and reputation mechanisms; the main supervisor will establish collaboration links with Flower Labs, Kuwait University, and Dr Thomas Zacharias. The candidate will present initial findings at a workshop or internal research event and prepare a short paper or technical report summarising progress.
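For reference, the RMSE metric named for the Months 4–6 baseline evaluation can be computed as in the short sketch below. The observed and forecast values are invented for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and forecast values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical observed and forecast temperatures in Celsius.
observed = [12.0, 14.0, 15.0, 11.0]
forecast = [13.0, 14.0, 13.0, 11.0]

error = rmse(observed, forecast)
print(error)
```

Because the errors are squared before averaging, RMSE penalises large misses more heavily than mean absolute error does, which matters when forecasting extreme events.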
Year 2
The main focus of the second year is on system development.
Months 13–18: The candidate will expand dataset coverage to Urban Observatory and other real-world data for multi-party training, and will develop enhanced privacy-preserving federated learning and voting mechanisms.
Months 19–24: The candidate will conduct formal security modelling with Dr Thomas Zacharias and integrate proofs of correctness and privacy. The candidate will also begin cross-institutional testing with Kuwait University to validate robustness under various network and dataset conditions.
The expected outputs include a conference paper submission (e.g., to NDSS or ACM CCS) and an open-source release of the updated system.
Year 3
The third year focuses on performance optimisation, benchmarking, and dissemination.
Months 25–30: The candidate will conduct a comprehensive evaluation on Met Office and Urban Observatory datasets, measuring trade-offs between accuracy, runtime, and privacy.
Months 31–36: The candidate will optimise the computation and communication efficiency of the developed framework for edge devices and incorporate feedback from project collaborators. The enhanced system will be deployed and tested on resource-constrained platforms, such as Raspberry Pi 4 units configured for federated learning tasks, to evaluate scalability, latency, and energy efficiency under real-world conditions. The candidate will also prepare journal submissions and policy briefs on privacy-preserving data collaboration for environmental resilience.
Year 3.5
The candidate will focus on thesis writing, result consolidation, and viva preparation. The candidate will finalise documentation of experimental findings, complete any outstanding journal submissions, and present results to project partners and environmental stakeholders (e.g., the Urban Observatory and the Met Office).
Training & Skills
The candidate will receive interdisciplinary training in federated learning, cryptography, and distributed systems, developing the technical and research skills needed to design and evaluate secure, scalable, and privacy-preserving forecasting systems.
Core technical training will include: (a) advanced machine learning and federated learning frameworks (e.g., TensorFlow, Flower) for distributed model development and evaluation; (b) cryptographic protocol design and privacy-enhancing technologies, including secure aggregation, differential privacy, and blockchain-based consensus mechanisms; and (c) software engineering and deployment on edge devices (e.g., Raspberry Pi) for real-world testing of distributed systems. Complementary skills will be developed through participation in IAPETUS and Newcastle University doctoral training activities, including workshops on scientific writing, research ethics, reproducibility, and data management. The candidate will also gain experience in collaborative and cross-institutional research, working with Flower Labs, Kuwait University, and Dr Thomas Zacharias on applied and theoretical aspects of privacy-preserving AI. These collaborations will enhance the candidate’s ability to translate research into practical impact while contributing to international efforts on trustworthy, secure, and sustainable AI for environmental resilience.
References & further reading
[1] Abadi et al. Decentralized Weather Forecasting via Distributed Machine Learning and Blockchain-Based Model Validation. FLTA 2025.
https://arxiv.org/pdf/2508.09299
[2] Abadi et al. Starlit: Privacy-Preserving Federated Learning to Enhance Financial Fraud Detection. FLTA 2025.
https://arxiv.org/pdf/2401.10765
[3] Abadi et al. Payment with Dispute Resolution: A Protocol for Reimbursing Fraud Victims. ACM AsiaCCS 2023.
https://dl.acm.org/doi/10.1145/3579856.3595789
[4] Newcastle Urban Observatory.
https://urbanobservatory.ac.uk/
[5] Kaggle Weather Prediction.
https://www.kaggle.com/datasets/ananthr1/weather-prediction
