Calculate the environmental impact of your R computations 🌱
The greenAlgoR package provides tools to estimate the carbon footprint
and energy consumption of computational tasks in R. Based on the Green
Algorithms framework (Lannelongue, Grealey, and Inouye (2021)), this
package helps researchers and data scientists understand and minimize
the environmental impact of their computational work.
- 🔍 Calculate CO2 emissions from R computations based on runtime, CPU, and memory usage
- 🌍 Location-aware estimates using regional carbon intensity data
- 🎯 Targets integration for complete pipeline carbon footprint analysis
- 📊 Visualization tools to compare and contextualize your footprint
- ⚙️ Flexible configuration for different hardware specifications
library(greenAlgoR)
# Calculate footprint for a 2-hour computation
result <- ga_footprint(runtime_h = 2, location_code = "WORLD")
result$carbon_footprint_total_gCO2 # CO2 emissions in grams
# For your current R session
session_footprint <- ga_footprint(runtime_h = "session")
# For targets pipelines (in a targets project)
targets_footprint <- ga_targets()
greenAlgoR is not yet available on CRAN. You can install the development version from GitHub with:
# Install from GitHub (development version)
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("adrientaudiere/greenAlgoR")
This package implements the methodology from Lannelongue, Grealey, and Inouye (2021), which provides a standardized approach to quantifying the carbon footprint of computational research. The framework considers:
- Energy consumption: Based on CPU usage, memory requirements, and runtime
- Carbon intensity: Varies by geographical location and energy sources
- Hardware efficiency: Different processors and systems have varying power draws
- Infrastructure: Data center efficiency (PUE - Power Usage Effectiveness)
The carbon footprint is calculated by estimating the energy draw of the algorithm and the carbon intensity of producing this energy at a given location:
carbon footprint = energy needed × carbon intensity
Where the energy needed is:
energy needed = runtime × (power draw for cores × usage + power draw for memory) × PUE × PSF
The key factors are:
- Power draw for cores: Depends on CPU model and number of cores
- Memory power draw: Based on available RAM
- Usage factor: Corrects for actual core utilization (default: 100%)
- PUE: Power Usage Effectiveness for data center efficiency
- PSF: Pragmatic Scaling Factor for multiple runs
- Carbon intensity: Location-dependent based on energy sources
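As a rough illustration of how these factors combine, the base-R sketch below multiplies them out by hand. It does not call greenAlgoR: `footprint_sketch()` is a hypothetical helper, and its default constants (12 W per core, 0.3725 W per GB of RAM, a PUE of 1.67, and 475 gCO2e/kWh for the world average) are assumptions about typical Green Algorithms defaults, so check them against the package data before relying on them.

```r
# Hypothetical helper: a hand-rolled version of the energy and carbon
# calculation described above. All default constants are assumptions.
footprint_sketch <- function(runtime_h, n_cores, memory_gb,
                             tdp_per_core_w = 12,      # assumed generic W per core
                             usage = 1,                # core usage factor (100%)
                             memory_w_per_gb = 0.3725, # assumed W per GB of RAM
                             pue = 1.67,               # assumed default PUE
                             psf = 1,                  # single run
                             carbon_intensity_g_per_kwh = 475) { # assumed world average
  power_cores_w <- n_cores * tdp_per_core_w * usage
  power_memory_w <- memory_gb * memory_w_per_gb
  # Watts x hours gives Wh; divide by 1000 to get kWh
  energy_kwh <- runtime_h * (power_cores_w + power_memory_w) * pue * psf / 1000
  list(
    energy_kwh = energy_kwh,
    carbon_g = energy_kwh * carbon_intensity_g_per_kwh
  )
}

sketch <- footprint_sketch(runtime_h = 2, n_cores = 4, memory_gb = 16)
sketch$energy_kwh
sketch$carbon_g
```

With these assumed values, a 2-hour run on 4 cores and 16 GB of RAM comes to about 0.18 kWh and 85.6 g CO2, in line with the basic usage example output in this README.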
library("greenAlgoR")
# Calculate footprint for a 2-hour computation
result <- ga_footprint(
  runtime_h = 2,
  location_code = "WORLD", # Global average
  n_cores = 4,
  memory_ram = 16
)
cat("Carbon footprint:", result$carbon_footprint_total_gCO2, "g CO2\n")
#> Carbon footprint: 85.60754 g CO2
cat("Energy consumption:", result$energy_needed_kWh, "kWh\n")
#> Energy consumption: 0.1802264 kWh
# Specify exact CPU model (automatically sets cores and TDP)
fp_specific <- ga_footprint(
  runtime_h = 1,
  cpu_model = "Core i3-10300",
  location_code = "FR" # France (low carbon intensity)
)
fp_specific$carbon_footprint_total_gCO2
#> [1] 7.458519
# Compare carbon footprint across different locations
locations <- c("WORLD", "FR", "US", "NO", "CN")
footprints <- sapply(locations, function(loc) {
  ga_footprint(runtime_h = 1, location_code = loc)$carbon_footprint_total_gCO2
})
comparison <- data.frame(Location = locations, CO2_grams = footprints)
print(comparison)
#>       Location  CO2_grams
#> WORLD    WORLD 29.4247968
#> FR          FR  3.1766391
#> US          US 26.2617860
#> NO          NO  0.4720357
#> CN          CN 33.2902858
# Create a simple comparison plot
fp_example <- ga_footprint(runtime_h = 4, n_cores = 4, memory_ram = 16)
# Simple reference comparison
ref_subset <- fp_example$ref_value[1:5, ] # Top 5 reference activities
ref_subset$type <- "Reference"
# Add our computation
our_computation <- data.frame(
  variable = "Our Computation",
  value = fp_example$carbon_footprint_total_gCO2,
  prop_footprint = NA,
  type = "Computation"
)
plot_data <- rbind(
  ref_subset[, c("variable", "value", "type")],
  our_computation[, c("variable", "value", "type")]
)
plot_data$value <- as.numeric(plot_data$value)
library(ggplot2) # needed for ggplot(), aes(), geom_col(), etc.
ggplot(plot_data, aes(x = reorder(variable, value), y = value, fill = type)) +
  geom_col(alpha = 0.8) +
  scale_fill_manual(values = c(
    "Reference" = "lightblue",
    "Computation" = "darkred"
  )) +
  coord_flip() +
  labs(
    title = "Carbon Footprint Comparison",
    x = "Activity",
    y = "CO2 Emissions (g)",
    fill = "Type"
  ) +
  theme_minimal()
Calculate the carbon footprint of your current R session:
# Analyze current R session
fp_session <- ga_footprint(runtime_h = "session", add_storage_estimation = TRUE)
cat("Session footprint:", fp_session$carbon_footprint_total_gCO2, "g CO2\n")
#> Session footprint: 0.01999902 g CO2
cat("Session runtime:", fp_session$runtime_h, "hours\n")
#> Session runtime: 0.0006766667 hours
For targets workflows, calculate the complete pipeline footprint:
# In a targets project directory
pipeline_footprint <- ga_targets(
  location_code = "FR",
  n_cores = 4,
  memory_ram = 16
)
pipeline_footprint$carbon_footprint_total_gCO2
- Getting Started: See vignette("greenAlgoR-intro") for comprehensive examples
- Targets Integration: See vignette("targets-integration") for pipeline analysis
- Function Reference: Use ?ga_footprint and ?ga_targets for detailed documentation
We welcome contributions! Please:
- Check existing issues
- Submit bug reports or feature requests
- Fork the repository and submit pull requests
- Follow the existing code style and add tests for new features
To reduce the footprint of your own computations:
- Optimize your code: Reduce runtime to minimize carbon footprint
- Choose efficient hardware: Match computational resources to your needs
- Consider location: Run computations in regions with cleaner energy
- Monitor regularly: Track your carbon footprint across projects
- Share awareness: Include carbon footprint in research reporting
Planned improvements:
- Submit to CRAN
- Allow custom carbon intensity values (e.g., from Electricity Maps)
- Add more visualization options
If you use greenAlgoR in your research, please cite both the package
and the underlying methodology:
# For greenAlgoR package
Taudière, A. (2024). greenAlgoR: Carbon Footprint Estimation for R Computations.
R package version 0.1.1. https://github.com/adrientaudiere/greenAlgoR
# For the Green Algorithms methodology
Lannelongue, L., Grealey, J., Inouye, M. (2021). Green Algorithms:
Quantifying the Carbon Footprint of Computation. Advanced Science, 8(12), 2100707.
https://doi.org/10.1002/advs.202100707
greenAlgoR is an R package that estimates the carbon footprint and
energy consumption of computational tasks. It’s based on the Green
Algorithms framework by Lannelongue et al. (2021) and helps researchers
understand the environmental impact of their computational work in R.
The estimates are based on the peer-reviewed Green Algorithms methodology and use real-world data for:
- CPU power consumption from hardware specifications
- Regional carbon intensity from energy grid data
- Memory power consumption from published research
However, actual consumption may vary based on specific hardware configurations, software optimization, and other factors.
The package supports carbon intensity data for many countries and
regions. Common location codes include:
- "WORLD" - Global average
- "US" - United States
- "GB" - United Kingdom
- "DE" - Germany
- "CN" - China
- "FR" - France
See the Green Algorithms database for the complete list.
Problem: You get an error when specifying a cpu_model.
Solution:
1. Use "Any" as the cpu_model to fall back on generic TDP values instead of a specific model
2. Check that your CPU model name exactly matches the Green Algorithms database
3. Manually specify TDP_per_core and n_cores instead of using cpu_model
Problem: Memory RAM is not detected automatically.
Solution: Manually specify the memory_ram parameter:
ga_footprint(runtime_h = 1, memory_ram = 16) # 16 GB
Problem: runtime_h = "session" gives unexpected results.
Explanation: Session runtime is calculated from when R started, not when your analysis began. For specific computations, use explicit runtime:
# Time a specific operation
start_time <- Sys.time()
# ... your computation ...
end_time <- Sys.time()
runtime_hours <- as.numeric(difftime(end_time, start_time, units = "hours"))
ga_footprint(runtime_h = runtime_hours)
Problem: ga_targets() fails or gives zero footprint.
Solutions:
1. Ensure you’re in a directory with a targets project
2. Check that targets have been run with tar_make()
3. Verify targets metadata exists:
# Check if targets data exists
targets::tar_meta()
# If no data, run the pipeline first
targets::tar_make()
Hardware Configuration:
- Use actual hardware specs when possible
- For cloud computing, check provider documentation
- Personal laptops typically have a PUE close to 1.0
- Data centers typically have a PUE between 1.2 and 2.0
Location Selection:
- Use your actual geographical location
- For cloud computing, use the data center location
- Consider running computations in regions with cleaner energy (lower carbon intensity)
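To make the PUE guidance concrete, the short base-R sketch below scales the same IT energy use by a few PUE values. The 0.1 kWh figure and the three labels are hypothetical, chosen only to span the ranges mentioned above. The sketch does not call greenAlgoR.

```r
# Hypothetical energy drawn by cores and memory alone, before overhead (kWh)
it_energy_kwh <- 0.1

# Illustrative PUE values: a laptop and the two ends of the data center range
pue_values <- c(
  laptop = 1.0,
  efficient_data_center = 1.2,
  inefficient_data_center = 2.0
)

# Total energy scales linearly with PUE
total_energy_kwh <- it_energy_kwh * pue_values
print(total_energy_kwh)
```

Because PUE is a plain multiplier, the same workload in a facility with a PUE of 2.0 uses twice the energy it would at a PUE of 1.0, before any difference in carbon intensity is considered.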
- Reduce runtime: Optimize your code for efficiency
- Choose efficient hardware: Match resources to your needs
- Select clean energy regions: Run computations where renewable energy is prevalent
- Cache results: Avoid re-running expensive computations
- Profile your code: Identify and optimize bottlenecks
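As one possible way to follow the "cache results" advice, the base-R sketch below stores the output of a slow step on disk and reuses it on later calls. `expensive_computation()`, `get_result()`, and the cache file name are hypothetical placeholders, not part of greenAlgoR. In a targets project the pipeline handles this kind of caching for you.

```r
# Hypothetical cache location; a real project would use a stable path
cache_file <- file.path(tempdir(), "expensive_result.rds")
if (file.exists(cache_file)) file.remove(cache_file) # start from a clean state

# Placeholder for a slow analysis step
expensive_computation <- function() {
  sum(sqrt(seq_len(1e6)))
}

# Return the cached value if present, otherwise compute and store it
get_result <- function() {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  result <- expensive_computation()
  saveRDS(result, cache_file)
  result
}

first <- get_result()  # computes and writes the cache
second <- get_result() # reads from the cache, no recomputation
```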
For research projects:
# Include in your analysis scripts
footprint <- ga_footprint(runtime_h = "session")
cat("Analysis carbon footprint:", footprint$carbon_footprint_total_gCO2, "g CO2\n")
# Save for reporting (create the output directory if it does not exist)
dir.create("results", showWarnings = FALSE)
saveRDS(footprint, "results/carbon_footprint.rds")
For targets pipelines:
# Add to your _targets.R file
list(
  # ... your other targets ...
  tar_target(
    carbon_footprint,
    ga_targets(location_code = "FR"),
    description = "Calculate pipeline carbon footprint"
  )
)
The ga_footprint() function returns a list with a detailed breakdown:
- carbon_footprint_total_gCO2: Total CO2 emissions in grams
- carbon_footprint_cores: CPU contribution to emissions
- carbon_footprint_memory: Memory contribution to emissions
- energy_needed_kWh: Total energy consumption in kilowatt-hours
- runtime_h: Actual runtime used in calculation
- ref_value: Reference activities for comparison (if requested)
Custom carbon intensity: Currently, the package uses predefined carbon intensity values per country. If you are interested in custom values, please post an issue.
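In the meantime, one possible workaround is to multiply the reported energy by your own carbon intensity. The base-R sketch below uses a stand-in list in place of a real ga_footprint() result, and the 100 gCO2e/kWh intensity is a made-up value. It assumes energy_needed_kWh already includes PUE and PSF, which is consistent with the basic usage example, where 0.1802264 kWh corresponds to 85.60754 g CO2. If storage estimation is enabled, the result may not match the package total.

```r
# Stand-in for the list returned by ga_footprint(); only the field used here.
# The energy value is taken from the basic usage example in this document.
fp <- list(energy_needed_kWh = 0.1802264)

# Hypothetical grid carbon intensity in gCO2e per kWh (replace with your own)
custom_intensity_g_per_kwh <- 100

# Carbon footprint = energy needed x carbon intensity
custom_footprint_g <- fp$energy_needed_kWh * custom_intensity_g_per_kwh
custom_footprint_g
```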
Custom hardware parameters: You can specify hardware configurations:
ga_footprint(
  runtime_h = 2,
  TDP_per_core = 25, # High-performance CPU
  n_cores = 16,      # Many cores
  memory_ram = 128,  # Large memory
  PUE = 1.4,         # Data center efficiency
  PSF = 3            # Account for 3 repeated runs
)
- Check the documentation: Use ?ga_footprint and ?ga_targets
- Read the vignettes: vignette("greenAlgoR-intro") and vignette("targets-integration")
- Report issues: Submit bug reports at https://github.com/adrientaudiere/greenAlgoR/issues
We welcome contributions! See the repository README for guidelines on:
- Reporting bugs
- Suggesting features
- Submitting code improvements
- Improving documentation
- Lannelongue, L., Grealey, J., Inouye, M. (2021). Green Algorithms: Quantifying the Carbon Footprint of Computation. Advanced Science, 8(12), 2100707.
- Green Algorithms website: https://calculator.green-algorithms.org/
- Package repository: https://github.com/adrientaudiere/greenAlgoR
Lannelongue, Loïc, Jason Grealey, and Michael Inouye. 2021. “Green Algorithms: Quantifying the Carbon Footprint of Computation.” Advanced Science 8 (12): 2100707. https://doi.org/10.1002/advs.202100707.
