Training models using Satellite imagery on Amazon Rekognition Custom Labels
Satellite imagery is becoming an increasingly important source of insights about changes happening worldwide. Several satellites provide publicly available data with near-global coverage at roughly weekly frequency.
One of the main challenges with satellite imagery is extracting insights from a large dataset that is continuously updated. In this blog post, I want to showcase how you can use Amazon Rekognition Custom Labels to train a model that produces insights from Sentinel-2 satellite imagery, which is publicly available on AWS.
The Sentinel-2 mission is a land monitoring constellation of two satellites that provide high-resolution optical imagery. It has around a 5-day frequency and 10-meter resolution. In our example, we will use satellite imagery to detect and classify agricultural fields.
How to access Sentinel-2 imagery
There are several ways to access satellite imagery:
- Use one of the browsers mentioned on the page https://registry.opendata.aws/sentinel-2/
- Download imagery from the Amazon S3 bucket directly. Amazon S3 is a storage service that provides scalable and secure access for downloading and uploading data.
A browser is the best option for finding single images for specific dates and places. Amazon S3 is a better option if you want to automate a workflow or develop an application using satellite imagery. I want to showcase the Sentinel-2 Amazon S3 storage, as it's the best way to consume the imagery when you need to build a processing pipeline.
Find Sentinel-2 scene
One of the simplest ways to find the necessary scene is to use a Sentinel-2 browser. There are multiple browsers out there; in our example we will use EO Browser, as it provides the S3 path for each image. You just need to define the search parameters and then copy the "AWS path" field.
Download image bands
When dealing with satellite imagery, we work with multiple images from different spectral bands. The idea is that each spectral band can give a different insight. Because of that, it is sometimes more useful to construct images not from the RGB bands but from other bands, for example NIR, which stands for Near Infrared. This band gives great insight into vegetation, which is why the NIR-Red-Green band combination is popular; it's called False Color, since it's not a true RGB image. Once we have the AWS path, we can use AWS CLI commands to download the necessary bands. In our case, we want to construct a False Color vegetation image, which has higher contrast in vegetated areas, so we need to download the NIR, Red, and Green bands (8, 4, and 3, respectively).
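Assuming the AWS CLI is installed, the downloads can be sketched as below. The tile prefix shown is a hypothetical example of what EO Browser's "AWS path" field returns — replace it with the prefix for your own scene. Note that the Sentinel-2 buckets on AWS are requester-pays, so the `--request-payer` flag is required.

```shell
# Hypothetical tile prefix copied from EO Browser's "AWS path" field --
# substitute the prefix for the scene you found.
S3_PREFIX="s3://sentinel-s2-l1c/tiles/36/U/UA/2021/9/1/0"

# B08 = NIR, B04 = Red, B03 = Green for the False Color composite.
for band in B08 B04 B03; do
  # Remove the leading "echo" to actually run the downloads.
  echo aws s3 cp "${S3_PREFIX}/${band}.jp2" . --request-payer requester
done
```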
Prepare False Color Sentinel-2 image
Once the bands are downloaded, we can use the rasterio Python library to compile the False Color image and scale the bands: they are provided in uint16 format, but their values actually lie in the range 0–2¹². Here is example code that does that:
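A minimal sketch of that step, assuming the rasterio library is installed and the band files from the previous step sit in the working directory (the `B08.jp2`/`B04.jp2`/`B03.jp2` file names and the `false_color.tif` output name are assumptions):

```python
import numpy as np


def scale_band(band, max_value=2 ** 12):
    """Rescale a uint16 Sentinel-2 band (values in 0..2^12) to uint8 0..255."""
    band = np.clip(band.astype(np.float32) / max_value, 0.0, 1.0)
    return (band * 255).astype(np.uint8)


def build_false_color(nir, red, green):
    """Stack NIR-Red-Green bands into a 3-channel False Color uint8 image."""
    return np.dstack([scale_band(nir), scale_band(red), scale_band(green)])


if __name__ == "__main__":
    # Requires rasterio; the input/output file names are assumptions.
    import rasterio

    with rasterio.open("B08.jp2") as nir_src, \
         rasterio.open("B04.jp2") as red_src, \
         rasterio.open("B03.jp2") as green_src:
        image = build_false_color(nir_src.read(1), red_src.read(1), green_src.read(1))

    with rasterio.open(
        "false_color.tif", "w", driver="GTiff", dtype="uint8",
        count=3, width=image.shape[1], height=image.shape[0],
    ) as dst:
        for i in range(3):
            dst.write(image[:, :, i], i + 1)
```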
We will get the following result:
Once we have this False Color vegetation image, we can crop it into multiple areas, using some of the crops for the training and test datasets and keeping others for evaluation examples.
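The cropping can be sketched with plain NumPy slicing; the 512-pixel tile size is an arbitrary assumption that keeps crops comfortably within Rekognition's input limits:

```python
import numpy as np


def crop_tiles(image, tile_size=512):
    """Split an H x W x C image array into non-overlapping square tiles.

    Tiles that would fall off the edge are skipped; 512 px is an arbitrary
    choice, not a value mandated by Rekognition.
    """
    tiles = []
    height, width = image.shape[:2]
    for row in range(0, height - tile_size + 1, tile_size):
        for col in range(0, width - tile_size + 1, tile_size):
            tiles.append(image[row:row + tile_size, col:col + tile_size])
    return tiles
```

Each tile can then be saved as a PNG or JPEG and uploaded to the Rekognition dataset in the next step.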
How to train and run Amazon Rekognition model
Once we have the images, we can create a dataset and train our model. One of the main advantages of using Amazon Rekognition is that you don't need to know deep learning frameworks or write code for training or inference. You also don't need to manage deep learning infrastructure and can start using an Amazon Rekognition model right away.
Create and label Dataset
- Create dataset
- Choose “Upload images from your computer”
- On the dataset page click “Add images”
- In the pop-up window, click "Choose Files" and choose files from your computer. Then click "Upload images"
- Create labels “active field”, “semi-active field”, “non-active field”
- Click “Start labeling”, choose images, and then click “Draw bounding box”
- On the new page, you can now choose labels and then draw rectangles for each label. After you’ve finished labeling you can switch to a different image or click “Done”.
Train and run the model
- On the projects page click “Create Project”
- On the project page choose “Train new model”
- Choose the dataset which we just created and then choose “Split training dataset” for the test dataset. Then click “Train”.
- Once the model is trained you can start making predictions.
You can evaluate the model using one of the crops from the image we processed earlier. Rekognition returns JSON results with predicted regions that we can visualize.
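A minimal inference sketch with boto3, assuming AWS credentials are configured and the model has been started (from the console or via `start_project_version`); the project-version ARN shown in the usage comment is a placeholder:

```python
def filter_labels(response, min_confidence=50.0):
    """Keep only predictions at or above the confidence threshold."""
    return [
        label for label in response.get("CustomLabels", [])
        if label["Confidence"] >= min_confidence
    ]


def detect_fields(image_path, project_version_arn):
    """Run one crop through the trained model.

    The ARN is shown on the model's "Use model" tab in the Rekognition
    console. Requires boto3 and AWS credentials.
    """
    import boto3

    client = boto3.client("rekognition")
    with open(image_path, "rb") as f:
        response = client.detect_custom_labels(
            ProjectVersionArn=project_version_arn,
            Image={"Bytes": f.read()},
            MinConfidence=50,
        )
    return filter_labels(response)


# Usage (placeholder ARN):
# labels = detect_fields("crop_0.png", "arn:aws:rekognition:...:version/...")
```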
Visualize the result
You can visualize the results using the following code in a Jupyter notebook. It parses the response from the Amazon Rekognition model and draws prediction boxes on the image, using a different color for each label class.
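A sketch of that visualization, assuming Pillow is installed; Rekognition returns bounding boxes as ratios of the image size in `Geometry.BoundingBox`, so they are converted to pixels first (the color mapping is an arbitrary choice):

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert Rekognition's relative BoundingBox (Left/Top/Width/Height
    ratios) to absolute pixel coordinates."""
    left = int(bounding_box["Left"] * image_width)
    top = int(bounding_box["Top"] * image_height)
    right = left + int(bounding_box["Width"] * image_width)
    bottom = top + int(bounding_box["Height"] * image_height)
    return left, top, right, bottom


# One color per class; label names match those created in the console.
COLORS = {
    "active field": "lime",
    "semi-active field": "yellow",
    "non-active field": "red",
}


def draw_predictions(image_path, custom_labels, out_path="predictions.png"):
    """Draw one colored rectangle per predicted region (requires Pillow)."""
    from PIL import Image, ImageDraw

    image = Image.open(image_path).convert("RGB")
    draw = ImageDraw.Draw(image)
    for label in custom_labels:
        box = to_pixel_box(
            label["Geometry"]["BoundingBox"], image.width, image.height
        )
        draw.rectangle(box, outline=COLORS.get(label["Name"], "white"), width=3)
    image.save(out_path)
```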
Here is what the output looks like:
We've trained and deployed a model for finding agricultural fields in satellite imagery using Amazon Rekognition. As you can see, setting everything up was pretty simple, and you can use this example to develop more complex models for other satellite imagery tasks, such as forest monitoring or building detection.
Feel free to check the code in the following repo: