Project Overview
This engagement applied machine learning to satellite imagery to improve the accuracy of annual crop-yield estimates from limited training data, using feature engineering and a range of modeling techniques. The goal was to explore the predictive potential of the customer's annual crop-yield dataset and deliver a production-grade model that the customer could deploy and retrain on future seasons.
Data Acquisition & Feature Engineering
We acquired additional data from external geospatial satellite sources and engineered new features from the existing ones. The work is a representative example of our geospatial machine learning consulting and remote sensing data analysis services: publicly available Sentinel-2 and Landsat imagery was combined with the customer's ground-truth dataset to extract signal that was not visible in the raw yield data alone. We ran both regression and classification models to explore the full predictive potential of the combined dataset.
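As an illustration of the kind of feature that can be derived from satellite bands, the sketch below computes NDVI (the normalized difference vegetation index, (NIR - Red) / (NIR + Red)) and summarizes it per field. Whether NDVI was among the features engineered in this project is an assumption. The function names `ndvi` and `field_features` are hypothetical, and the inputs are assumed to be reflectance arrays already clipped to a single field (for Sentinel-2, the red and near-infrared bands are B4 and B8).

```python
import numpy as np


def ndvi(nir, red):
    """Per-pixel NDVI; pixels where NIR + Red is zero are returned as NaN."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Only divide where the denominator is non-zero; other pixels stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


def field_features(nir, red):
    """Summarize one field's NDVI into scalar features (hypothetical choice of statistics)."""
    index = ndvi(nir, red)
    return {
        "ndvi_mean": float(np.nanmean(index)),
        "ndvi_max": float(np.nanmax(index)),
    }


# Toy 2x2 field with one no-data pixel (both bands zero).
nir_band = [[0.6, 0.8], [0.0, 0.5]]
red_band = [[0.2, 0.2], [0.0, 0.5]]
features = field_features(nir_band, red_band)
```

Per-field summaries like these could then be joined to the ground-truth yield records by field and season, giving each training row a few imagery-derived columns.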
Results
Support Vector Regression was the best-performing regression model, achieving an average MAPE (mean absolute percentage error) of 15.6% across the crops. For various technical reasons, the problem was then reframed as a classification problem, and an XGBoost model achieved F1-scores between 0.78 and 0.85 across the crops. This is a promising result and gives the customer better predictive performance than the baseline approach.
Engagements like this, which combine predictive analytics for agriculture with satellite imagery and custom machine learning model development, are a service line we're actively expanding. If you have similar yield, growth-stage, or land-use prediction problems, we'd like to hear about them.
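For readers unfamiliar with the two metrics quoted in the results, here is a minimal sketch of how MAPE and F1 are computed, and of one way a continuous yield target can be turned into classes. The binning by fixed yield thresholds and the macro-averaged F1 are assumptions made for illustration, since the project's actual class definitions and averaging are not stated here. All function names and numbers are hypothetical.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no true value is zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def yield_to_class(yields, edges):
    """Bin yields into classes: 0 is below edges[0], len(edges) is at or above edges[-1]."""
    return np.digitize(yields, edges)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for label in np.unique(y_true):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))


# Made-up values, for illustration only.
regression_error = mape([100.0, 200.0], [90.0, 220.0])
classes = yield_to_class([1.0, 2.5, 4.0], edges=[2.0, 3.0])
classification_score = macro_f1([0, 1, 1, 0], [0, 1, 0, 0])
```

Framing the target as a small number of yield classes trades numeric precision for a coarser question (for example low, medium, or high yield), which can be easier to learn from limited training data.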
Interested in a similar system?
Let's talk about your requirements.