This repository contains code to perform inference on images to:
- Detect and isolate objects
- Track objects
- Classify objects as moth or non-moth
- Identify the order
- Predict the species
It is intended to be built into a Docker container and run on Amazon Web Services (AWS) Elastic Container Service (ECS). The images are expected to be located in the AWS Simple Storage Service (S3). The container should run on an Elastic Compute Cloud (EC2) instance with GPU hardware.
This has been forked from https://github.com/AMI-system/amber-inferences.
The container is intended to run on a machine with NVIDIA GPUs and includes PyTorch 2.6.0, Python 3.12, and CUDA 12.4.
You can build the Docker image using
docker build -t lepisense-inferences .
The build copies the local code files, incorporating any changes you have made, which is convenient for development. Make sure to push changes to the repository if they are intended for production.
The Dockerfile contains commented options for how it should start up.
To run the scripts manually, uncomment the line
CMD ["tail", "-f", "/dev/null"]
The container will start but do nothing. You can SSH to the container and then execute commands as you wish.
To start the Jupyter server so that you can run the tutorial notebook, uncomment the lines
EXPOSE 80
CMD ["/.venv/bin/jupyter", "notebook", "--no-browser", "--allow-root", "--ip=0.0.0.0", "--port=80", "/lepisense-inferences/examples/tutorial.ipynb"]
To automatically run the inferencing on all the pending data, uncomment the corresponding CMD line in the Dockerfile.
Once the work is complete the container will terminate and the AWS infrastructure should scale down to nothing.
To deploy the image to ECS we first push it to the Amazon Elastic Container Registry (ECR)
Before pushing the container you need to authenticate with the image registry using an AWS account that has permission. You can do this using the AWS Command Line Interface.
First sign in to your AWS account. If it is the first time you will want to
aws configure sso
otherwise it is
aws sso login --profile <your-profile-name>
Check to see if the destination repository already exists.
aws ecr describe-repositories \
--repository-names lepisense/inferences \
--region eu-west-2 \
--profile <your-profile-name>
If the repository does not already exist, create it.
aws ecr create-repository \
--repository-name lepisense/inferences \
--region eu-west-2 \
--profile <your-profile-name>
The output from these commands contains the repositoryUri to which Docker must now authenticate.
Use the following command
aws ecr get-login-password \
--region eu-west-2 \
--profile <your-profile-name> \
| \
docker login \
--username AWS \
--password-stdin <repositoryUri>
Now add a tag to the image we created earlier and then push using the tag name:
docker tag lepisense-inferences <repositoryUri>
docker push <repositoryUri>
There should already be buckets for the images that we will be processing, set
up when establishing the
lepisense-input-api. They are
named lepisense-images-<environment> where environment is
replaced by the value of the environment variable of that name.
We need to use the console to create additional buckets for the results and the
models. Use the names lepisense-results-<environment> and
lepisense-inference-models.
TODO: Convert this to infrastructure as code.
The code running in the container needs to be authenticated to access other AWS resources like the images stored in S3. To do this we create an access policy which is then assigned to a role which is then given to the ECS task definition. See https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html
In the IAM console create a policy with the following JSON. This gives read access to the S3 bucket of dev images. Similar policies for test and prod can be anticipated.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "LepisenseDevImageEcsPolicy",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::lepisense-images-dev",
"arn:aws:s3:::lepisense-images-dev/*"
]
}
]
}
Create another policy allowing read/write access to the dev results bucket.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "LepisenseDevResultEcsPolicy",
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::lepisense-results-dev",
"arn:aws:s3:::lepisense-results-dev/*"
]
}
]
}
Create another policy to give read access to the bucket which will contain the models.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "LepisenseModelEcsPolicy",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::lepisense-inference-models",
"arn:aws:s3:::lepisense-inference-models/*"
]
}
]
}
Continuing in the IAM console, create a role of type "AWS Service" and use case
"Elastic Container Service Task". Add the three permission policies that
were created in the previous steps. Name the role something like
LepisenseDevECSTaskRole
No longer necessary. Omit.
In order to SSH to the container you need a key pair. Follow the console instructions at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/create-key-pairs.html. It may have been finger trouble but an ED25519 key worked for me where an RSA key failed.
No longer necessary. Omit.
A load balancer is needed to allow remote users to browse the Jupyter server running on the container as the container is not accessible on the public internet. This can be omitted if you have no need to use Jupyter notebooks. A tutorial notebook exists to introduce how to run the inference code and it is a useful way to test code without having to deploy a new docker image.
- Go to the EC2 Dashboard and select Security Groups from the left-hand menu.
- Click the Create Security Group button.
- In the Security Group settings Security group name: PublicLoadBalancer Description: Allows http input from specific IP addresses. VPC: Default
- Inbound rules - lock down the source as far as possible Type: HTTP, Source: Custom: Specific IP addresses, e.g. 192.171.199.99/32
- Outbound rules Type: All traffic, Destination: Custom: 0.0.0.0/0
- Click the Create Security Group button.
- Go to the EC2 Dashboard and select Security Groups from the left-hand menu.
- Click the Create Security Group button.
- In the Security Group settings Security group name: LoadBalancerTargets Description: Allows http traffic from load balancer to target. VPC: Default
- Inbound rules Type: HTTP, Source: Custom: security group of load balancer created above.
- Outbound rules Type: All traffic, Destination: Custom: 0.0.0.0/0
- Click the Create Security Group button.
- Go to the EC2 Dashboard and select Target Groups from the left-hand menu.
- Click the Create Target Group button.
- In the Target Group settings Type: IP Target group name: LepiSenseInference VPC: Default
- Health checks Protocol: HTTP Health check path: / Success codes: 405,302
- Click the Next button.
- Click the Create Target Group button. There is no need to register a target as ECS will register the private IP of the container with the target group.
- Go to the EC2 Dashboard and select Load Balancers from the left-hand menu.
- Click the Create Load Balancer button.
- Click the Create button for an Application Load Balancer.
- In the Load Balancer settings Load balancer name: LepiSenseInference VPC: Default Availability Zones and subnets: Pick two, e.g. eu-west-2a and eu-west-2b. Security groups: Select the load balancer security group created above.
- In the Listener settings for HTTP:80 settings Target group: Select the target group created above.
- Click the Create Load Balancer button.
This is needed to allow our container to request content from the internet. An ECS container with AWSVPC networking has no public IP so cannot access the internet directly through a public subnet with internet gateway. Note, the load balancer only routes incoming traffic and won't handle outgoing requests. It is also essential that the EC2 instance and the ECS container are created in the same availability zone and this will be the case if they both exist on the same private subnet.
- Go to the VPC Dashboard and select Subnets from the left-hand menu.
- Click the Create Subnet button.
- Select the default VPC.
- In the Subnet Settings Subnet name: lepisense-private-subnet Availability Zone: No preference IPv4 subnet CIDR block: 172.31.48.0/20 assuming that the default subnets are occupying 172.31.0.0/20, 172.31.16.0/20, and 172.31.32.0/20.
- Click the Create Subnet button.
- Go to the VPC Dashboard and select NAT Gateways from the left-hand menu.
- Click the Create NAT Gateway button.
- In the NAT Gateway Settings Name: lepisense-nat-gateway Subnet: Select a public subnet, e.g. eu-west-2a Connectivity Type: Public Elastic IP allocation ID: Click the Allocate Elastic IP button or select from the drop down list (the latter especially if there is an error about the maximum number of addresses being reached).
- Click the Create NAT Gateway button.
- Go to the VPC Dashboard and select Route Tables from the left-hand menu.
- Click the Create Route Table button.
- In the Route Table Settings Name: lepisense-route-table VPC: Select the default VPC
- Click the Create Route Table button.
- In the Routes tab of the newly created route table, click the Edit Routes button.
- Click the Add Route button.
- In the Route settings Destination: 0.0.0.0/0 Target: Select NAT Gateway in the drop down and find the NAT Gateway created above in the search box.
- Click the Save Changes button.
- In the Subnet Associations tab of the newly created route table, click the Edit Subnet Associations button in the Explicit Subnet Associations list.
- Check the lepisense-private-subnet created above.
- Click the Save Associations button.
It is possible to add VPC endpoints for AWS services so that network traffic to them is kept internal to AWS and not routed via the public internet. This can usefully be done for
- ECS
- ECR
- CloudWatch
- S3
There is a cost associated with these endpoints (as there is with the NAT Gateway). However, for S3 there is a no-cost alternative called a Gateway Endpoint.
To enable this, we follow this procedure.
As our EC2 instances will be started on the private subnet just created they will have no public IP so we cannot SSH directly to them. One way round this is to start a CloudShell in the VPC on the private subnet but the recommendation is to use the Systems Manager Session Manager.
First, you need to create an IAM role that allows the task to call AWS services (the Task role).
- Go to the IAM Dashboard and select Roles from the left-hand menu.
- Click the Create Role button.
- For the Trusted entity type, select AWS service.
- For the Use case, select EC2, then click Next.
- Select the following two managed policies, then click Next:
- AmazonEC2ContainerServiceforEC2Role: This is required for the EC2 instance to register itself as a Container Instance with the ECS cluster.
- AmazonSSMManagedInstanceCore: This is the policy that enables the AWS Systems Manager Agent (SSM Agent) to communicate with the AWS Systems Manager service, allowing you to use Session Manager.
- Name the role ECS-ContainerInstance-SSM-Role and click Create Role.
This role automatically creates the necessary Instance Profile.
Second, you need to create an IAM role that allows the container agent to call AWS services (the Task Execution role). This allows logs to be written to CloudWatch, I believe. This role needs to be given the AmazonECSTaskExecutionRolePolicy.
The Launch Template configuration and choice of AMIs has changed during the course of development. The AMI listed below appears to be superseded.
- Go to the EC2 Dashboard and select Launch Templates from the left-hand menu.
- Click the Create Launch Template button.
- In the Launch Template Settings Name: LepiSenseInferenceGpu AMI: al2023-ami-ecs-gpu-hvm-2023.0.20250923-kernel-6.1-x86_64-ebs Instance type: g4dn.xlarge Common security groups: Select default Advanced details: IAM instance profile: Select the Instance Profile associated with the IAM Role you created above. Advanced details: User data: Add the following lines:
#!/bin/bash
echo ECS_CLUSTER=LepiSenseInferenceGpu >> /etc/ecs/ecs.config;
echo ECS_BACKEND_HOST=https://ecs.eu-west-2.amazonaws.com >> /etc/ecs/ecs.config;
echo ECS_ENABLE_GPU_SUPPORT=true >> /etc/ecs/ecs.config;
if ! id "ssm-user" &>/dev/null; then
adduser -m ssm-user
fi
usermod -a -G docker ssm-user
- Click the Create Launch Template button.
- Go to the EC2 Dashboard and select Auto Scaling Groups from the left-hand menu.
- Click the Create Auto Scaling Group button.
- In the settings, set the following then click Next. Name: LepiSenseInferenceGpu Launch template: Select the template created above.
- In the settings, set the following then click Next Availability zones and subnets: Select the private subnet created above.
- Click Next
- In the settings, set the following then click Skip to preview. Min desired capacity: 0
- Click the create Auto Scaling Group button.
A task describes the container that we want ECS to deploy.
Follow the ECS console procedure to create a task with the following values:
- Task definition family: LepisenseInferenceTask
- Launch type: EC2
- Operating system/architecture: Linux/x86_64
- Task size: 4 vCPU, 16 GB
- Network mode: awsvpc
- Task role: As created above
- Task execution role: None
- Container name: LepisenseInference
- ImageUri: As created above, e.g. 916498879384.dkr.ecr.eu-west-2.amazonaws.com/lepisense:latest
Add the following environment variables with appropriate values
- ENVIRONMENT: [dev|test|prod]
- LOG_LEVEL: [DEBUG|INFO|WARNING|ERROR|CRITICAL]
- POSTGRES_DB: lepisense
- POSTGRES_HOST: lepisense.c14qc2uwid2u.eu-west-2.rds.amazonaws.com
- POSTGRES_PASSWORD:
- POSTGRES_PORT: 5432
- POSTGRES_USER: postgres
The task we have just created describes a container that we want to run. The next step is to create a cluster which will be built from the launch template we defined earlier.
The infrastructure configuration options have changed midway through development. Where previously the option was for Fargate or EC2 only, and we chose EC2 in order to explicitly choose instances with GPUs, there is now an option for 'Fargate and Managed Instances' which looks like it would be better for our needs.
- Go to the ECS Dashboard and select Clusters from the left-hand menu.
- Click the Create Cluster button.
- In the Cluster Settings Name: LepiSenseInferenceGpu. Previously: deselect AWS Fargate and select Amazon EC2 Instances. As of 28/10/2025: select Fargate and Self-Managed Instances. Autoscaling group: Select the ASG created above.
- Click the Create button
Previously, before I was creating a launch template and auto-scaling group, I did the following.
Follow the ECS console procedure at https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-ec2-cluster-console-v2.html to create the cluster with the following values
- Cluster name: e.g. g4dn-xlarge-eu-west-2-on-demand
- Infrastructure: Remove Fargate and add EC2 instances
- Auto Scaling Group: Create new Provisioning model: On-demand Container instance AMI: Amazon Linux 2023 GPU x86_64 EC2 instance type: g4dn.xlarge EC2 instance role: Create new role Desired capacity: Min 0, Max 1 SSH key pair: As created above.
- Network settings: VPC: Use default Subnets: Use private subnet created above. Security groups: Select SSH for shell access
Now we can deploy the task as a service on the cluster. Because we are using an EC2 instance with one GPU and the task will use that one GPU, the deployment options have to allow the current task to stop before deploying a new one. This means that service is interrupted. If you want to avoid this then you will need to revise the hardware available or use a blue/green strategy. The current choice of instance is simply based on least cost for a GPU and the strategy is selected to be the fastest to aid development.
TODO: Convert this to infrastructure as code.
- From the list of Task definitions select the one created above. Click the Deploy button and select Create service.
- Service name: LepisenseInferenceService
- Existing cluster: As created above.
- Compute options: Capacity provider strategy
- Capacity provider strategy: Use cluster default.
- Deployment Configuration: Availability Zone re-balancing: Uncheck (so max running tasks can be 100) Deployment Options: Min running tasks: 0 Max running tasks: 100
- Networking VPC: Use default Subnets: Use the private subnet created above (lepisense-private-subnet) Security groups: Select default for inter VPC communication, Select Load Balancer Targets for external internet connection.
- Load Balancing Type: Application Load Balancer Container: LepiSenseInference 80:80 Application Load Balancer: Use existing and select the one created above. Listener: Use existing and select HTTP:80. Target Group: Use existing and select the one created above.
It can take 5 or 10 minutes to deploy. When successful you can switch to the EC2 console and see the running instance.
Because the models can be large and may be subject to their own version control
I don't want them in the code repository. Create an S3 bucket for them called
lepisense-inference-models and upload the models as follows:
- localisation/flat_bug_M.pt
- binary/moth-nonmoth-effv2b3_20220506_061527_30.pth
- order/thresholdsTestTrain.csv
- order/dhc_best_128.pth
- species/01_japan_data_category_map.json
- species/turing-japan_v01_resnet50_2024-11-22-17-22_state.pt
These will be downloaded to the container when needed.
If you have configured the launch template to enable the AWS Systems Manager (SSM) then you have easy access to a shell on the EC2 instance.
- Go to the Systems Manager console and select Explore Nodes from the left-hand menu.
- Select the relevant node and click the Connect button to start a new terminal session.
Obsolete if using SSM.
If the container is in a security group with SSH permissions and you have set up an SSH key pair then this is the alternative way to access the EC2 instance.
Because the instance is on a private subnet you cannot SSH directly to it. Instead we use the AWS CloudShell. Within this we need to create a VPC environment. Use the Actions button and select the relevant option.
In the settings choose
- Name: VPC
- Virtual private cloud: Select the default VPC
- Subnet: Select the private subnet
- Security group: Default VPC
Now we need to copy over our private key, which we can do using S3 as an intermediary. You may want to create a bucket for this.
To copy your key to S3 execute aws s3 cp ~/.ssh/<filename> s3://<bucketname>
in your local terminal.
To then copy this from S3 to the VPC terminal, execute
aws s3 cp s3://<bucketname>/<filename> .
It is then a good idea to delete the file from S3.
Change the permissions on the file in the VPC terminal using
chmod 400 <filename>
Now the environment has been created, go to the EC2 console where the list of instances should include one started by ECS. Select it and click on the Connect button. Copy the example SSH command, paste it into the CloudShell terminal, and execute.
At the EC2 prompt, run docker container ls to get the id of the container
started by ECS. Then run docker exec -it <ContainerId> /bin/bash to enter
the docker container.
To test network connectivity you can docker-exec into the container and
execute curl -v s3.eu-west-2.amazonaws.com, which should give a positive
reply.
If networking is successful then you should be able to list images in the S3
bucket using the AWS CLI. aws s3 ls s3://lepisense-images-dev/
Docker-exec into the container and execute nvidia-smi. If this works it will
list the GPU driver and CUDA version.
Now confirm that torch has been successfully configured.
python -c "import torch; print(torch.cuda.is_available())"
If True, you have successfully configured torch!
When the inference code changes and needs redeploying
- Rebuild the Docker image
- Push the image to the Elastic Container Registry
- Goto the ECS Console and select Clusters from the left-hand menu
- Click on the LepiSenseInferenceGpu cluster to see its details and select the LepiSenseInferenceService.
- Click on the arrow in the Update Service button and select Force New Deployment. Confirm the dialog which pops up.
Because our chosen instance has one GPU and we have opted for rolling updates, expect some down time between the old container shutting down and the new one starting.
The model files are stored in the S3 bucket called lepisense-inference-models and not built in to the Docker image. They are downloaded to the container when they are first needed. The following model types exist:
- localisation_model: the model for isolating insects in an image
- binary_model: the model for distinguishing moth/non-moth
- order_model: the model for identifying taxonomic order
- order_threshold: the order data thresholds
- species_model: the regional model for identifying species
- species_labels: the species labels
AMBER team members can find these files on OneDrive. Others can contact Katriona Goldmann for the model files.
The Flatbug object detection model is used in this analysis. The confidence
threshold to define object bounding boxes defaults to 0.0. The box threshold can
be altered using the --box_threshold argument in
slurm_scripts/array_processor.sh.
As described above, when documenting how to build the Docker image, there are three ways you can configure the inference code to start: manual, jupyter, and automatic.
In this mode, you open a shell to your EC2 instance and Docker-exec in to the container, as described above in the section on accessing the ECS container, whereupon you can execute commands.
You must activate the virtual environment before running these commands using
source .venv/bin/activate
This allows you to check what deployments exist.
python -m amber_inferences.cli.deployments
Filter to
- an organisation with --organisation_name <value>
- a country with --country_code <value>
- a network with --network_name <value>
- a deployment with --deployment_name <value>
- a device type with --devicetype_name <value>
where <value> is substituted by the value to filter with. Several filters can be combined.
- --no-active lists inactive deployments rather than active ones.
- --deleted lists deleted deployments.
This allows you to find which deployments have files waiting to be processed.
python -m amber_inferences.cli.inference_jobs
The filter options are the same as for printing deployments.
- --completed lists completed inference jobs rather than pending.
- --deleted lists deleted deployments.
- --limit <value> limits the number of rows returned.
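The filter flags shared by these listing commands follow a simple name-to-flag pattern; a hypothetical helper (not part of the repository) makes the convention explicit:

```python
def cli_filter_args(**filters) -> list[str]:
    """Turn keyword filters into the flag form the CLI commands accept.

    Hypothetical convenience helper for scripting; the real scripts parse
    these flags themselves.
    """
    args = []
    for name, value in filters.items():
        # Each filter becomes a --<name> <value> pair, e.g. --country_code JP.
        args += [f"--{name}", str(value)]
    return args
```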
This generates a list of files to process for a device on a date.
python -m amber_inferences.cli.generate_keys \
    --inference_id <value>
A value for inference_id, obtained from the list of jobs, is required.
An optional parameter is
- --output_file <value>: Output file of S3 keys. Default is /tmp/lepisense/s3_keys.txt.
This processes a list of image files, identifying moths to species level. It outputs a results file which lists each detection in each image.
python -m amber_inferences.cli.perform_inferences \
    --inference_id <value>
A value for inference_id, obtained from the list of jobs, is required. The list of S3 keys to process should have been created by the generate_keys script.
Optional parameters include
- --json_file <value>: Input file of S3 keys. Default is /tmp/lepisense/s3_keys.txt
- --output_dir <value>: Default is /tmp/lepisense/
- --result_file <value>: Output file of results. Default is /tmp/lepisense/results.csv
- --remove_image: Default is false
- --save_crops: Default is false
- --localisation_model_name <value>: Default is flat_bug_M.pt
- --box_threshold <value>: Default is 0.00
- --binary_model <value>: Default is moth-nonmoth-effv2b3_20220506_061527_30.pth
- --order_model <value>: Default is dhc_best_128.pth
- --order_thresholds <value>: Default is thresholdsTestTrain.csv
- --species_model <value>: Default is turing-uk_v03_resnet50_2024-05-13-10-03_state.pt
- --species_labels <value>: Default is 03_uk_data_category_map.json
- --top_n_species <value>: Default is 5
- --skip_processed: If re-running a job that was interrupted, whether to skip over files that had already been processed. Default is false.
- --verbose: Whether to print extra information about progress. Default is false.
This attempts to connect detections in consecutive images.
python -m amber_inferences.cli.get_tracks
Optional parameters include
- --tracking_threshold <value>: Threshold for the track cost. Default is 1
- --result_file <value>: The file to process and append tracking to. Default is /tmp/lepisense/results.csv
- --verbose: Whether to print extra information about progress. Default is false.
The results of inferencing are stored locally in the EC2 instance and will disappear with it so we should save them back to S3 using the following.
python -m amber_inferences.cli.save_results \
    --inference_id <value>
An optional parameter is
- --result_file <value>: Output file of results. Default is /tmp/lepisense/results.csv.
You can also process all outstanding inference jobs with one command. In automatic mode, all that happens is that this command is executed on a schedule.
python -m amber_inferences.cli.auto_inference
With everything deployed and running you can use your browser to access the
tutorial notebook. The URL you need to access it can be found from the load
balancer information. Go to the EC2 console
and select Load Balancers from the left-hand menu. Select the
relevant load balancer and copy the DNS name. Paste it into your browser,
prefix with http:// and go.
You should arrive at a login page requesting a token. To obtain this, SSH to the
relevant instance. If the instance has just started, list the docker containers
and then enter docker logs <container_id>. The log will display
lines like the following:
To access the server, open this file in a browser:
file:///root/.local/share/jupyter/runtime/jpserver-1-open.html
Or copy and paste one of these URLs:
http://ip-172-31-27-174.eu-west-2.compute.internal:80/tree?token=4e33a3b89
The token you require is in the url shown in the log.
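If you are scripting this, the token can be pulled out of the logged URL with a few lines of Python (a convenience sketch, not part of the codebase):

```python
from urllib.parse import urlparse, parse_qs

def token_from_url(url: str) -> str:
    # Pull the ?token=... query parameter out of a Jupyter startup URL.
    return parse_qs(urlparse(url).query)["token"][0]

url = "http://ip-172-31-27-174.eu-west-2.compute.internal:80/tree?token=4e33a3b89"
print(token_from_url(url))  # 4e33a3b89
```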
If your instance has been up for some time the log will be full of cruft. To then
obtain the token, docker-exec into the container and execute the command
jupyter server list. You can copy the token from the output.
Copy the token and paste it in to the browser.
To automate the inferencing we create a schedule to start the ECS task, which will run until there are no more images to process. When the task is complete the number of containers should scale to zero and that should cause the EC2 instances to scale to zero.
- Go to the Amazon EventBridge console
- Click the Create Schedules button.
- In the schedule settings Schedule name: LepiSenseInference Occurrence: Recurring schedule Time zone: (UTC +00:00) Europe/London Schedule type: Cron-based Cron expression: 0 12 * * ? * Flexible time window: 30 minutes
- Click the Next button.
- Select the ECS RunTask target API and supply the following settings ECS cluster: LepiSenseInferenceGpu ECS task: LepiSenseInferenceTaskGPU Task count: 1 (for now) Subnets: The ID of the private subnet Security groups: The ID of the default VPC security group
- Click the Skip to Review and Create Schedule button.
- Click the Save Schedule button.
The results of the inference will be saved in the output directory specified by
the --output_dir argument. The output will include:
- A CSV file containing the results of the inference, including species predictions, order predictions, and bounding box coordinates. The description of the columns in the CSV file is outlined in output_description.md.
- A directory containing the cropped images of the detected objects, if the --save_crops argument is specified.
- A directory containing the original images, if the --remove_image argument is not specified.
python3 -m unittest discover -s tests
For coverage:
pytest --cov=src/amber_inferences tests/ --cov-report=term-missing