Evaluation of ML Resource Utilization Requires Model Life Cycle Assessment

arXiv cs.LG 06/09/26, 04:00 AM Papers
Summary
This position paper argues that current methods for evaluating AI resource usage are insufficient and advocates for the adoption of life cycle assessment (LCA) to properly account for energy and environmental costs across the entire ML pipeline, from hardware manufacturing to training and inference.
arXiv:2606.07632v1 Announce Type: new Abstract: Proper accounting of the energy requirements and environmental impact of artificial intelligence (AI) systems is necessary for researchers, developers, policy makers, and users to assess the barriers to building systems at scale. With the growing complexity of pipelines and underlying infrastructure needed to develop and deploy AI systems, previous approaches for evaluating AI efficiency which focus on the costs of a single training run or an individual inference prediction are no longer sufficient. In this position paper, we enunciate the need for applying life cycle assessment to evaluate the costs of the machine learning model development and deployment pipeline to properly account for the required resources and downstream impact. Life cycle assessments enable the incorporation of costs across the full life cycle of an AI system and its underlying infrastructure, from the embodied costs associated with the physical computing hardware through the operational costs in training and inference.
Cached at: 06/09/26, 08:52 AM
# Evaluation of ML Resource Utilization Requires Model Life Cycle Assessment
Source: [https://arxiv.org/html/2606.07632](https://arxiv.org/html/2606.07632)
###### Abstract

Proper accounting of the energy requirements and environmental impact of artificial intelligence \(AI\) systems is necessary for researchers, developers, policy makers, and users to assess the barriers to building systems at scale\. With the growing complexity of pipelines and underlying infrastructure needed to develop and deploy AI systems, previous approaches for evaluating AI efficiency which focus on the costs of a single training run or an individual inference prediction are no longer sufficient\. In this position paper, we enunciate the need for applying life cycle assessment to evaluate the costs of the machine learning model development and deployment pipeline to properly account for the required resources and downstream impact\. Life cycle assessments enable the incorporation of costs across the full life cycle of an AI system and its underlying infrastructure, from the embodied costs associated with the physical computing hardware through the operational costs in training and inference\.

Machine Learning, ICML

## 1 Introduction

As with any emerging technology, an understanding of the resource utilization and byproducts is necessary to both provision the necessary infrastructure and assess their downstream societal impacts\. Artificial Intelligence is no different: its resource requirements continue to grow but remain largely uncertain\. The current scaling paradigm, in which large language model performance continues to benefit from increasing scales of computation, has led to growth projections which predict data centers will consume more than 10% of the total U\.S\. energy demand by 2030 \(Green et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib80); Shehabi et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib121)\)\. However, such estimates are highly uncertain, and annual load growth expectations vary by more than a factor of four \(Aljbour et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib5)\)\.

Reducing the uncertainty in these projections is necessary to ensure that power generation can be properly provisioned while also avoiding placing strain on existing grid infrastructure or increasing utility prices to individual ratepayers \(Joint Legislative Audit and Review Commission, [2024](https://arxiv.org/html/2606.07632#bib.bib106)\)\. Proper measurement enables informed decision making for governments and industry institutions which have committed hundreds of billions of dollars to investments in computing hardware and energy infrastructure needed to support the development and deployment of large\-scale machine learning models with costs rivaling those of “Big Science” projects \(OpenAI, [2025a](https://arxiv.org/html/2606.07632#bib.bib92); Bobrowsky, [2025](https://arxiv.org/html/2606.07632#bib.bib137); Isaac, [2025](https://arxiv.org/html/2606.07632#bib.bib90); Smith, [2025](https://arxiv.org/html/2606.07632#bib.bib85); Cai and Sophia, [2025](https://arxiv.org/html/2606.07632#bib.bib39); Parashar et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib93)\)\.

However, despite the increasing costs of AI and the breadth of stakeholders affected by AI systems, the methodologies we use to evaluate the resource requirements of machine learning models and their socio\-economic impacts have not evolved in kind\. Existing approaches for evaluating the resource consumption of ML models are often limited to measuring the cost of a marginal step in the development or deployment life cycle of an ML model – i\.e\. the cost of an individual training run or inference prediction\. Measuring the resource utilization of a system requires aggregating and attributing the resources used across all stages of both its production and use\. Approaches focused on the cost of a single constituent stage can fail to account for how efficiency improvements in one stage of a system’s life cycle affect the resources consumed in another stage\.

Fortunately, techniques for analyzing the resource requirements and downstream impacts over the lifetime of manufactured products are well established in the field of industrial ecology; namely, with life cycle assessment \(ISO 14040, ISO 14044; ISO 14040:2006, [2006](https://arxiv.org/html/2606.07632#bib.bib51), [2006](https://arxiv.org/html/2606.07632#bib.bib52)\)\. Life cycle assessment \(LCA\) quantifies the impact of a product by decomposing resource consumption and emissions across the stages of manufacture, use, and disposal; and across types of resources \(e\.g\. energy, carbon emissions, human health impacts\)\. The costs associated with previous life cycle stages, such as hardware manufacture and model training, are totaled and amortized through use\. Life cycle assessment has been used in semiconductor manufacturing and computing hardware research to quantify the embodied and operational carbon cost of fabrication, recycling, and use of physical hardware \(Gupta et al\., [2021](https://arxiv.org/html/2606.07632#bib.bib42); Wu et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib138); Schneider et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib118); Ji et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib184); Gupta et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib151)\)\. However, systematic methods for applying LCA to machine learning models are nascent\.

In this paper, we enunciate the need for life cycle assessment to evaluate the efficiency and environmental impact of machine learning models through development and deployment by:

1. Presenting the existing landscape for evaluating ML efficiency and resource use, and its limitations \(§[2](https://arxiv.org/html/2606.07632#S2)\)
2. Outlining how these issues can be addressed via application of life cycle assessment to machine learning models \(§[3](https://arxiv.org/html/2606.07632#S3)\)
3. Discussing the benefits and insights provided by applying LCA to ML models \(§[5](https://arxiv.org/html/2606.07632#S5)\)
4. Providing alternative perspectives on AI’s resource requirements \(§[4](https://arxiv.org/html/2606.07632#S4)\)
5. Stating what is needed to enable life cycle assessment of ML models \(§[6](https://arxiv.org/html/2606.07632#S6)\)\.

## 2 Limitations in Existing Approaches to Evaluating ML’s Resource Needs

In response to the growing resource consumption of machine learning models, there has been a significant increase in scientific inquiry into both \(1\) the evaluation of ML’s resource consumption and environmental impact, and \(2\) the design of efficient ML methods; as reflected in a myriad of research surveys \(Menghani, [2023](https://arxiv.org/html/2606.07632#bib.bib108); Treviso et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib112); Tay et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib110); Wan et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib107); Sui et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib109)\) and publication venues dedicated to the topic \(Rezagholizadeh et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib117); Dao et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib115); Wang et al\., [2024a](https://arxiv.org/html/2606.07632#bib.bib116); Sadat Moosavi et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib113)\)\. Such efforts are necessary first steps towards understanding the overall resource requirements of AI and ML systems\. However, existing efforts have often relied on assumptions which are not reflective of the real\-world systems and workloads which underpin modern AI systems\.

![Refer to caption](https://arxiv.org/html/2606.07632v1/x1.png) Figure 1: LCA enables aggregation across ML model development and deployment life cycles of increasing complexity\. The pre\- and post\-training pipelines of modern LLMs \(e\.g\. OLMo with the Tulu post\-training recipe; Walsh et al\. \([2025](https://arxiv.org/html/2606.07632#bib.bib91)\); Lambert et al\. \([2025](https://arxiv.org/html/2606.07632#bib.bib60)\)\) have significantly more stages than classical train\-test settings, and a larger variety of methods for conducting inference \(Welleck et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib174)\)\.

### 2\.1 Reliance on Proxy Measures of Efficiency\.

A wide range of efficiency metrics have motivated research in the design of efficient machine learning algorithms, model architectures, and computer systems\. For example, service\-level objectives \(SLOs\) have been used to optimize cloud serving settings where models are deployed to support latency\-sensitive APIs\. By contrast, the hardware limitations of mobile and edge settings have yielded model compression methods which reduce the memory overheads of models\. At the same time, theoretical investigations, which are often based on proxy metrics for efficiency such as FLOPs, have yielded parameter\-, data\-, and sample\-efficient ML architectures and training algorithms\. Although such research yields improvements on these efficiency proxy metrics, such proxies are often not highly correlated with more tangible measures such as latency and energy \(Dehghani et al\., [2022b](https://arxiv.org/html/2606.07632#bib.bib26); Fernandez et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib31)\)\. For measures of resource utilization to be informative for stakeholders seeking to standardize or account for resource consumption, reporting must correspond to the real\-world quantities of interest\.

### 2\.2 Failure to Account for Growing Complexity in the ML Model Life Cycle\.

Previous efforts to account for resource usage and environmental impacts of machine learning models have mainly focused on the resources consumed for a single stage of the model life cycle – e\.g\. the energy or water use of large\-scale model training \(Strubell et al\., [2020](https://arxiv.org/html/2606.07632#bib.bib125); Patterson et al\., [2021](https://arxiv.org/html/2606.07632#bib.bib165); Faiz et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib30); Morrison et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib84)\), the marginal impacts of individual inference predictions \(Luccioni et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib76); Fernandez et al\., [2025a](https://arxiv.org/html/2606.07632#bib.bib34); Patel et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib95); Wu et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib140); Ding and Shi, [2024](https://arxiv.org/html/2606.07632#bib.bib149); Nguyen et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib156)\), or the embodied costs from manufacture of computing hardware \(Li et al\., [2025b](https://arxiv.org/html/2606.07632#bib.bib70), [2024b](https://arxiv.org/html/2606.07632#bib.bib176)\)\. However, focusing on individual stages of the model life cycle is insufficient to measure the total resources and environmental impact associated with the choice to build a new machine learning model or AI system\.

To evaluate the resource demands of an AI system, it is necessary to account for and attribute the resources consumed across all stages of development and deployment\. Conducting such an evaluation is increasingly difficult as the complexity of modern models continues to grow with bespoke deployment and development pipelines\. For instance, state\-of\-the\-art large language models \(Figure [1](https://arxiv.org/html/2606.07632#S2.F1)\) require multiple stages of pre\- and post\-training; leverage auxiliary models for synthetic data, distillation, and reward modeling; rely on an assortment of inference\-time algorithms; and can be deployed across variable hardware platforms\. Each stage of the growing model pipeline introduces further complexity to decision\-making about development and deployment, as well as additional challenges to accounting for models’ resource consumption and environmental impact\.

### 2\.3 Sector\-Wide Projections are Not Grounded in Real Computational Workloads\.

Concerns around the rising power demands of AI data centers have led to the rise of various studies that estimate and project growth in data center energy use \(Shehabi et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib121); Green et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib80); Aljbour et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib5)\)\. To obtain projections on energy use, such studies rely on estimates of future chip shipments and energy efficiency to forecast the total demands of computing hardware\. Sector\-level analysis is critical for providing information to developers of electrical grid infrastructure\. With infrastructure lead times of multiple years, accurate sector projections enable grid infrastructure to be built out to support the increased capacity demands of data centers, often in excess\.

However, these studies rely on assumptions about hardware utilization and energy efficiency at a level of abstraction that obfuscates individual workloads\. These assumptions make it impossible to assess the impact of models developed and deployed by machine learning researchers and practitioners, or to evaluate the impact of model efficiency improvements or design choices\.

## 3 Life Cycle Assessment for ML Models

As described in the previous section, existing efforts to measure the resource requirements of ML are limited: often relying on coarse\-grained sector\-level estimates, failing to measure the real\-world resource of interest, or only representing a component of the total costs of an AI system rather than incorporating the total costs across the development and deployment life cycle\.

As a means to address these limitations, we believe that evaluation of the resource demands of ML models requires life cycle assessment\. Life Cycle Assessment \(LCA; Curran, [2006](https://arxiv.org/html/2606.07632#bib.bib23)\) provides a methodological basis to determine the environmental and social impacts of a manufactured product or service by accounting for its required resources and environmental impacts through resource extraction, material processing, manufacture, use, and disposal \(i\.e\. from cradle to grave\)\.

Concretely, LCAs are standardized and defined across two separate ISO standards\. ISO 14040 specifies LCA’s conceptual framework, whereas ISO 14044 specifies the technical requirements for conducting an LCA \(ISO 14040:2006, [2006](https://arxiv.org/html/2606.07632#bib.bib51), [2006](https://arxiv.org/html/2606.07632#bib.bib52)\)\. \(Footnote 1: Specifically, ISO 14040 specifies the stages of an LCA\. By contrast, ISO 14044 provides requirements and guidance on how to conduct an LCA, e\.g\. considerations for determining system boundaries and factors for assessing the quality of data\.\) For conducting an ML model LCA, we direct practitioners to the practices outlined in ISO 14044\.

At the core of LCA is a functional unit which defines a quantitative reference for the value provided by a process, which can be compared across potential systems\. In turn, a system can be defined that produces the functional unit of interest in relation to resources and emissions\.

In this section, we demonstrate how life cycle assessment can be used to enable more holistic accounting of machine learning models’ total resource consumption and environmental impacts for producing a functional unit\. We examine the four stages of LCA as defined in ISO standards: Goal Definition and Scoping, Life Cycle Inventory, Life Cycle Impact Assessment, and Interpretation\.

### 3\.1 Goal Definition and Scoping

The first stage of an LCA defines a functional unit corresponding to the service being delivered by an AI system and its constraints, and the system boundaries which isolate the processes and resource flows encompassed in the study\.

#### 3\.1\.1 Functional Units for Machine Learning

Depending on the stakeholder conducting an LCA, the functional unit and the process of interest may vary\. For example, institutional developers of large foundation models may be interested in the environmental impact and cost associated with the development of families of models, and the functional unit could be defined as a “set of trained foundation models for a language task\.” Downstream users may be concerned with the costs associated with using machine learning models, where a functional unit can be defined as a “processed batch of queries to a machine learning model\.”

While prior work has performed direct measures of the operational costs of conducting model training and inference, reported values are not comparable when they are not grounded in standardized functional units or consistent system boundaries\. For instance, ambiguity in the workloads served renders the resulting energy measurements of different model providers, such as OpenAI and Google, difficult to compare \(Altman, [2025](https://arxiv.org/html/2606.07632#bib.bib208); Elsworth et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib204)\)\.

#### 3\.1\.2 Product Systems for Modern Machine Learning Model Life Cycles\.

After identifying a functional unit for study, a candidate product system which produces the functional unit is modeled, spanning material extraction, manufacture, use and maintenance, and disposal\. Input resources and output emissions and waste byproducts are associated with each stage, with system boundaries which specify which resource flows to include in the evaluation\.

In the case of machine learning models, LCA enables aggregation of costs across the full product system encompassing both the resources consumed during hardware manufacture \(i\.e\. embodied costs\) as well as those consumed during the development and deployment of the model \(i\.e\. operational costs\)\. Historically, the process of developing and deploying models followed a simple process of training and performing validation on small sets of in\-domain i\.i\.d\. datasets\. However, modern ML models are developed using complex pipelines with rapid iterative experimentation processes and multiple stages of training \(see Figure [1](https://arxiv.org/html/2606.07632#S2.F1)\)\. Development now encompasses additional stages, such as neural architecture search, automated machine learning and experimentation, and long\-context pretraining, as well as continuous retraining of models during development \(Tornede et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib130); Sangarya et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib153)\), each requiring additional resources\. Likewise, the variety of methods for model inference has grown, as new paradigms have emerged which shift computation from training to inference to attain higher performance, e\.g\. via chain\-of\-thought reasoning, self\-refinement, tool use, retrieval\-augmented generation, and in\-context learning \(Welleck et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib174)\)\.

As with functional units, LCA requires consistent system boundaries; inconsistent boundaries render evaluations incomparable\. For instance, differences in the inclusion of Scope 2 offsite water utilization contributed to estimates of water use by LLMs differing by orders of magnitude \(Li et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib67); Elsworth et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib204)\)\.
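
The comparability requirement can be made mechanical. The following is a minimal sketch, not a method from the paper: the class names `LcaScope` and `LcaResult`, the function `relative_cost`, and the stage labels are all hypothetical, and the scope is assumed to be fully described by a functional-unit string plus a set of included stages or flows. It refuses to compare two results unless both were produced under the same functional unit and system boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LcaScope:
    """Goal-and-scope definition (hypothetical representation)."""
    functional_unit: str   # e.g. "batch of 1000 processed queries"
    boundary: frozenset    # names of the stages/flows included in the study


@dataclass(frozen=True)
class LcaResult:
    system: str            # label for the system under study
    scope: LcaScope
    cost_per_unit: float   # cost per functional unit, in any single consistent unit


def relative_cost(a: LcaResult, b: LcaResult) -> float:
    """Return cost(a) / cost(b), but only when the two scopes match."""
    if a.scope.functional_unit != b.scope.functional_unit:
        raise ValueError("functional units differ; results are not comparable")
    if a.scope.boundary != b.scope.boundary:
        # Report which stages or flows appear in only one of the two studies.
        mismatch = sorted(a.scope.boundary ^ b.scope.boundary)
        raise ValueError(f"system boundaries differ on: {mismatch}")
    return a.cost_per_unit / b.cost_per_unit
```

A stricter variant could also require matching units and data-quality metadata; the point of the sketch is only that the scope travels with the number it qualifies.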

### 3\.2 Life Cycle Inventory

The life cycle inventory stage describes and quantifies the environmental flows associated with the functional unit \(National Academies of Sciences, Engineering, and Medicine, [2022](https://arxiv.org/html/2606.07632#bib.bib191)\)\. LCA provides an extensible framework which inventories resource and byproduct flows associated with embodied costs \(e\.g\. rare earth minerals, PFAS, CFCs\) \(Elgamal et al\., [2025a](https://arxiv.org/html/2606.07632#bib.bib160), [b](https://arxiv.org/html/2606.07632#bib.bib29)\); as well as those incurred during operational use such as energy, water, carbon emissions and air pollution \(Wu et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib138); Morrison et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib84); Han et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib44)\)\.
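
As an illustration of inventorying one operational flow, the sketch below converts sampled device power into energy by trapezoidal integration. It assumes the samples are `(seconds, watts)` pairs collected by some external power monitor (for example, a periodic query such as `nvidia-smi --query-gpu=power.draw --format=csv -l 1`); the function name `energy_kwh` and the sample layout are illustrative choices, and the sketch ignores facility overheads such as cooling.

```python
from typing import Sequence, Tuple


def energy_kwh(samples: Sequence[Tuple[float, float]]) -> float:
    """Integrate (time_s, power_w) samples into energy in kWh.

    Uses the trapezoidal rule between consecutive samples, so irregular
    sampling intervals are handled as long as timestamps do not decrease.
    """
    if len(samples) < 2:
        return 0.0  # a single sample spans no time interval
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 < t0:
            raise ValueError("timestamps must be non-decreasing")
        joules += 0.5 * (p0 + p1) * (t1 - t0)
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


if __name__ == "__main__":
    # A constant 300 W draw sampled over one hour.
    trace = [(0.0, 300.0), (1800.0, 300.0), (3600.0, 300.0)]
    print(f"{energy_kwh(trace):.3f} kWh")
```

Other inventoried flows (water, embodied materials) would need their own data sources; energy is used here only because it can be measured directly in-workload.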

![Refer to caption](https://arxiv.org/html/2606.07632v1/x2.png) \(a\) Aggregate costs: Total environmental impact of models incorporates factors from all stages of the model life cycle\. Utilizing efficient serving optimizations increases the number of functional units produced under a fixed resource budget\. \(Footnote 3: Following Morrison et al\. \([2025](https://arxiv.org/html/2606.07632#bib.bib84)\), we ground our inference efficiency estimates in ShareGPT data and likewise assume a 4\-year lifetime for GPU hardware\.\)
![Refer to caption](https://arxiv.org/html/2606.07632v1/x3.png) \(b\) Per\-inference costs: For the functional unit defined as a batch of processed requests, the upfront embodied and operational costs are amortized with use and asymptotically approach the marginal cost of inference\.

Figure 2: CO2e emissions of OLMo2 7b training and inference \(Morrison et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib84); Walsh et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib91)\)\. Increasing inference efficiency via offline batching reduces the unit cost, as does amortization of embodied costs over model use\. Decomposition of the resource use across life cycle stages enables identification of the significant issues \(i\.e\. the life cycle stage which maximally contributes to total costs\)\.

### 3\.3 Life Cycle Impact Assessment

Using the quantified costs determined through the life cycle inventory, the total environmental impact of ML models can be determined by translating the inventoried resources into associated impact categories, such as the contribution to global warming from increased emissions; ozone depletion from CFCs; or human health impacts resulting from water depletion, noise, or air quality pollution\.

Although LLM developers have begun to report on energy requirements and carbon dioxide emission equivalents \(CO2e\), the downstream environmental impact remains largely unreported \(Dubey et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib28); Walsh et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib91)\)\. Fortunately, life cycle impact assessment provides standard conversions and characterization factors for converting inventoried resources into their associated net environmental impact, such as the U\.S\. Environmental Protection Agency’s Tool for the Reduction and Assessment of Chemical and other environmental Impacts \(TRACI\) \(Bare, [2002](https://arxiv.org/html/2606.07632#bib.bib168), [2011](https://arxiv.org/html/2606.07632#bib.bib169)\)\.
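
The conversion step amounts to a weighted sum: each inventoried flow is multiplied by a characterization factor for the impact category of interest. The sketch below illustrates this for global warming potential. The factor table uses the IPCC AR5 100-year values (28 for methane, 265 for nitrous oxide, relative to CO2) as an assumed example and is not taken from TRACI; the inventory quantities are invented purely for illustration, and the names `GWP100_KG_CO2E_PER_KG` and `characterize` are hypothetical.

```python
from typing import Mapping

# Assumed example factors: kg CO2e per kg emitted (IPCC AR5, 100-year horizon).
# A real study would take its factors from its chosen impact assessment method.
GWP100_KG_CO2E_PER_KG = {"co2": 1.0, "ch4": 28.0, "n2o": 265.0}


def characterize(inventory_kg: Mapping[str, float],
                 factors: Mapping[str, float]) -> float:
    """Sum inventoried flows (kg) weighted by their characterization factors."""
    missing = sorted(set(inventory_kg) - set(factors))
    if missing:
        # Silently skipping a flow would understate the impact score.
        raise KeyError(f"no characterization factor for: {missing}")
    return sum(mass * factors[flow] for flow, mass in inventory_kg.items())


if __name__ == "__main__":
    # Illustrative inventory only; these are not measured values.
    inventory = {"co2": 100.0, "ch4": 0.5}
    print(characterize(inventory, GWP100_KG_CO2E_PER_KG), "kg CO2e")
```

The same function applies to other impact categories by swapping the factor table, which is why a missing factor is treated as an error rather than as zero.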

### 3\.4 Interpretation

For researchers, model developers, and policy makers to utilize the results of an LCA, it is necessary to contextualize and interpret the results of the investigation, by: \(1\) identifying significant issues with the inventory and assessment; \(2\) evaluating the completeness, sensitivity, and consistency of data; and \(3\) providing conclusions and recommendations based on the impact assessment\.

Identification of the significant issues \(i\.e\. the components of the life cycle that have the greatest impact on the total result\) enables location of resource bottlenecks in machine learning models, whether it be the costs associated with hardware fabrication, the upfront costs associated with model training, or the marginal costs of individual inferences\. Once the inventory and impact assessment have been validated, the LCA’s results enable estimation of the downstream environmental impact of the full machine learning model life cycle, for example to identify which model design choices yield the most efficient system for providing the specified functional unit \(e\.g\., watt\-hours per batched inference\)\.

### 3\.5 Case Study: Comparing the Effects of LLM System Design Choices with LCA

As an example, we consider an LCA with a functional unit corresponding to a batch of processed examples by a large language model \(see Figure [3](https://arxiv.org/html/2606.07632#footnote3)\) with total costs equivalent to the sum of operational inference costs with both upfront training and embodied costs\. For our example, computing the cost to produce the functional unit \($C_{\text{FU}}$\) requires consideration of not only the marginal cost of inference computation but also the amortized costs of upstream training and hardware manufacturing associated with the inference\. More precisely, we consider a simple time\-share approach for attributing GPU embodied costs, in which the hardware’s manufacturing costs are attributed to training and inference loads based on the length of the workload as a fraction of the hardware’s usable lifetime\. However, we note that the efficacy of different attribution approaches for embodied costs remains an open research problem\.

$$
C_{\text{FU}} = C_{\text{Per Inference}} + \frac{\text{Hardware Utilization Time} \times C_{\text{Embodied}}}{\text{Hardware Lifespan}} + \frac{C_{\text{Experimentation}} + C_{\text{Training}}}{\text{Total Lifetime Inferences}}
$$

Accordingly, to compute the cost per functional unit, the practitioner must minimally account for and attribute the embodied costs of hardware and the operational costs of training, experimentation, and per\-request inference\. This information can be obtained through direct first\-party disclosures from hardware manufacturers \(NVIDIA, [2025b](https://arxiv.org/html/2606.07632#bib.bib205), [a](https://arxiv.org/html/2606.07632#bib.bib206)\) for embodied costs; or measured directly in\-workload through freely available GPU and CPU utilities \(e\.g\. NVIDIA nvidia\-smi and DCGM, or Intel VTune\) to account for operational cost\.
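
The cost equation for the functional unit can be evaluated directly once its terms are known. The sketch below is one possible transcription under stated assumptions: all costs are expressed in a single consistent unit (e.g. kg CO2e), "Hardware Utilization Time" is read as the hardware time consumed to produce one functional unit, and the names `LifecycleCosts` and `cost_per_functional_unit` are hypothetical. The numbers in the usage example are placeholders chosen for easy arithmetic, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LifecycleCosts:
    """Terms of the cost-per-functional-unit equation, in one consistent unit."""
    per_inference: float           # marginal operational cost of one functional unit
    embodied: float                # total embodied (manufacturing) cost of the hardware
    hardware_lifespan_h: float     # usable hardware lifetime, in hours
    utilization_h_per_unit: float  # hardware hours used per functional unit (assumed reading)
    experimentation: float         # upfront operational cost of experimentation
    training: float                # upfront operational cost of training


def cost_per_functional_unit(c: LifecycleCosts,
                             total_lifetime_inferences: float) -> float:
    """Marginal cost + time-share embodied cost + amortized upfront costs."""
    if total_lifetime_inferences <= 0 or c.hardware_lifespan_h <= 0:
        raise ValueError("lifetime inferences and hardware lifespan must be positive")
    embodied_share = c.utilization_h_per_unit * c.embodied / c.hardware_lifespan_h
    upfront_share = (c.experimentation + c.training) / total_lifetime_inferences
    return c.per_inference + embodied_share + upfront_share


if __name__ == "__main__":
    # Placeholder values for illustration only.
    costs = LifecycleCosts(per_inference=1.0, embodied=1000.0,
                           hardware_lifespan_h=100.0, utilization_h_per_unit=0.5,
                           experimentation=200.0, training=800.0)
    for n in (1_000, 1_000_000):
        print(n, cost_per_functional_unit(costs, n))
```

Evaluating the same function for two serving configurations that differ only in `per_inference` and `utilization_h_per_unit` gives a like-for-like comparison of design choices, provided both use the same functional unit and boundary.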

In addition to providing baselines for the total resources required by a machine learning model, LCA can be used to evaluate and provide comparisons across multiple systems that produce the same functional unit, and the relative impact of efficiency improvements to stages of the model life cycle\. For modern LLM serving, there exists a variety of design choices that affect the total efficiency and resource consumption, including: parallelization strategies, machine learning software frameworks, cluster scheduling algorithms, and choices in the underlying hardware accelerators\. Life cycle assessment enables comparisons of the cost per functional unit when varying such configurations in the context of the full model lifetime\. For instance, as shown in Figure [2\(b\)](https://arxiv.org/html/2606.07632#S3.F2.sf2), simply increasing the efficiency of LLM serving with increased batching to produce our example functional unit improves hardware utilization, and enables more requests to be served under fixed carbon emissions budgets\.

Furthermore, life cycle assessment enables the study of efficiency optimizations that affect multiple stages of the model life cycle\. The advent of multi\-stage training and inference\-time computing yields complex interactions across stages of model development and use, which allow resources to be traded off between constituent stages\. For instance, “reasoning” models \(such as DeepSeek\-R1, OpenAI o1, and Gemini\) can utilize substantially more resources during inference to attain higher performance on difficult tasks that would otherwise require additional domain\-specific training\. Alternatively, continual or domain\-specific pretraining may extend a model’s utility, delaying the need for full model retraining and/or further offsetting the initial training cost\. LCA can enable analysis of the trade\-offs of these methods, relative efficiencies, and resource\-optimal settings\.

Without systematic quantification as part of an LCA, the relative magnitudes of efficiency measures across the entire model life cycle, under different assumptions about the shape of that life cycle, remain largely elusive\. For example, Figure [2\(b\)](https://arxiv.org/html/2606.07632#S3.F2.sf2) shows that the total per\-inference cost is extremely sensitive to the total “lifespan” of the model until it has been used at least tens of billions of times\.
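
This sensitivity follows from the amortization structure of the cost per functional unit. As a sketch, assume the per-unit terms (the marginal inference cost plus the time-share embodied term) are collected into a constant $m$ that does not depend on usage, write $U = C_{\text{Experimentation}} + C_{\text{Training}}$ for the upfront costs, and let $N$ be the total lifetime inferences. The symbols $m$, $U$, $N$, and the tolerance $\varepsilon$ are notation introduced here for illustration and do not appear in the paper.

$$
\begin{aligned}
C_{\text{FU}}(N) &= m + \frac{U}{N}, \qquad \lim_{N \to \infty} C_{\text{FU}}(N) = m, \\
\frac{\mathrm{d}\ln C_{\text{FU}}}{\mathrm{d}\ln N} &= -\frac{U/N}{m + U/N}, \\
\frac{U}{N} \le \varepsilon\, m &\iff N \ge \frac{U}{\varepsilon\, m}.
\end{aligned}
$$

Under these assumptions the elasticity is close to $-1$ while $N \ll U/m$, so doubling usage roughly halves the cost per unit, and it approaches $0$ once $N \gg U/m$, where the cost per unit flattens toward the marginal cost $m$. The threshold $U/(\varepsilon m)$ gives the usage at which the amortized upfront cost falls below a fraction $\varepsilon$ of the per-unit cost.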

## 4 Alternative Views

While we believe LCA provides a comprehensive methodology for evaluating the resource requirements and environmental impact of ML models, we acknowledge critiques and alternative views for evaluating AI’s resource use\.

##### Increasing Costs of AI Will Be Offset by Efficiency Improvements in Algorithms, Systems, and Hardware\.

Historically, the energy efficiency of computing hardware has doubled approximately every 1\.57 years \(Koomey et al\., [2010](https://arxiv.org/html/2606.07632#bib.bib209)\)\. Likewise, the energy efficiency of AI in particular has benefited from additional improvements in software frameworks and model architectures which together are poised to lead to overall reductions in AI’s resource demands \(Patterson et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib97); Oviedo et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib203)\)\.

However, the increased efficiency of computing hardware has given way to rebound effects, such as Jevons’ paradox \(Jevons, [1866](https://arxiv.org/html/2606.07632#bib.bib105)\), in which lower cost of use yields increased uptake, leading to increased total resource consumption despite higher utilization and reductions in the resources consumed per unit of output \(Luccioni et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib73)\)\.

Under the assumption of Jevons’ paradox and increased capability and profitability, ML demand will expand to consume all resources that can be allocated to it\. \(Footnote 4: For instance, Microsoft’s CEO recently [referenced Jevons’ paradox](https://x.com/satyanadella/status/1883753899255046301) to reassure stakeholders that ML efficiency improvements will lead to “skyrocket\[ing\]” demand for AI, alongside significant investments in energy infrastructure, such as Microsoft’s bid to [re\-commission the nuclear reactor at Three Mile Island](https://www.constellationenergy.com/newsroom/2024/Constellation-to-Launch-Crane-Clean-Energy-Center-Restoring-Jobs-and-Carbon-Free-Power-to-The-Grid.html) in order to power an AI data center\.\) In this setting, managing ML’s resource consumption becomes less a question of reducing resource use than one of resource allocation: Given a limited set of resources \(e\.g\. datacenter energy, land\), what is the most effective allocation of those resources in order to maximize output? There is a need for methodologies and data enabling analysis of such resource allocations, e\.g\. between training and inference workloads, across different model types \(task\-specific, general\-purpose\), tasks, deployment scenarios, and hardware\.

##### LCA of Computing Hardware is More Informative than Model\-Based LCA\.

As datacenters are the physical entities that consume resources and perform the computation of an ML model, they should be the object of study for LCAs, as performed in Gupta et al\. \([2021](https://arxiv.org/html/2606.07632#bib.bib42)\); NVIDIA \([2025b](https://arxiv.org/html/2606.07632#bib.bib205), [a](https://arxiv.org/html/2606.07632#bib.bib206)\); Schneider et al\. \([2025](https://arxiv.org/html/2606.07632#bib.bib118)\)\. Analysis of the environmental costs associated with hardware provides insight into the efficiency of the underlying computing platform and informs decision\-making around hardware provisioning and building of physical infrastructure\.

Hardware\-based accounting alone does not provide insight into the resource requirements and associated emissions of the machine learning model, which is often developed and deployed across multiple heterogeneous hardware platforms over the course of its lifetime\. For instance, an LCA focused on hardware may examine the carbon emissions associated with the production of computing hardware or the construction of a data center \(e\.g\., the tCO2e attributed to manufacturing a GPU server\)\. While such analyses are critical to developers of physical infrastructure, they do not inform model developers and deployers about the resource utilization or environmental impact of the AI system itself\. By contrast, model LCAs study the resource requirements associated with the computing workload that utilizes the underlying physical hardware, and provide insight into the tCO2e attributed to serving an LLM request or training a model\.

Compared with the resources consumed by the physical infrastructure, resource consumption across a model life cycle is more interpretable both to model developers seeking to understand the costs of their deployments and to consumers, as the model itself provides the primary functionality of AI systems\.

##### LCA is Unnecessary as ML Resource Consumption is Concentrated in a Single Life Cycle Stage\.

Depending on the use case of an AI system, the total resource requirements can be disproportionately concentrated in a single life cycle stage\. For instance, in the case of the largest frontier models, the frequency and scale of inference lead operational inference costs to dominate as the primary contributor to resource consumption, with industry players such as Meta and Google reporting that inference makes up 70% and 60% of their AI power consumption, respectively \(Oviedo et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib203); Wu et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib138); Patterson et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib97)\)\. In such cases, totaling the marginal costs of inference may be sufficient for determining total costs\. Conversely, experimentation and training largely dominate resource consumption for AI researchers, as their systems are not deployed at scale and would otherwise require billions of served inferences to amortize the upfront costs of development \(Luccioni et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib74); Morrison et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib84)\)\.

Although the impact of fixed embodied and training costs may be amortized through use, the upfront costs of modern ML models remain substantial, such that they require deployment at the scale of billions of inference requests before the costs of inference begin to exceed the costs of training \(see Figure [2\(b\)](https://arxiv.org/html/2606.07632#S3.F2.sf2)\)\. LCA additionally provides a robust framework for evaluating emerging applications of AI computation, such as multi\-agent workflows, which may introduce additional life cycle stages along with additional computation and resource requirements\.
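The break\-even point described above can be sketched with simple arithmetic\. The function name and the energy figures below are hypothetical placeholders, not measurements from this paper, and the sketch assumes a constant marginal energy per inference\.

```python
import math


def break_even_inferences(upfront_wh: float, per_inference_wh: float) -> int:
    """Smallest number of inferences whose cumulative energy matches or
    exceeds the fixed upfront (training plus embodied) energy."""
    if upfront_wh < 0 or per_inference_wh <= 0:
        raise ValueError("upfront_wh must be >= 0 and per_inference_wh > 0")
    return math.ceil(upfront_wh / per_inference_wh)


# Hypothetical inputs: 1 GWh of upfront energy, 0.3 Wh per served request.
requests_needed = break_even_inferences(upfront_wh=1e9, per_inference_wh=0.3)
print(f"Break-even after about {requests_needed:.2e} inferences")
```

With these placeholder values the crossover lands in the billions of requests, which matches the order of magnitude discussed in the text\. Different hardware, batch sizes, or model sizes would shift it substantially\.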

##### LCAs are Infeasible Due to Information Unavailability\.

LCAs often rely on information that is disclosed by hardware manufacturers and model providers\. As a result, LCA calculations and estimates are often dependent on proprietary information that private corporations may be reluctant to provide in order to maintain competitive advantage\. In such settings, LCAs can still be conducted in the presence of missing data by using representative averages or secondary sources\. As a first step, LCA practitioners and model evaluators could use available disclosures in model tech reports, press releases, and public communications to provide initial approximations of resource utilization\. For instance, Google has disclosed aggregate statistics for the energy and water use of its Gemini model \(Schneider et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib118)\), and OpenAI has disclosed total tokens processed at developer conferences \(Ciaccia, [2025](https://arxiv.org/html/2606.07632#bib.bib211)\)\. While such statistics may not provide the fine\-grained information needed to identify the costs associated with an individual workload, the estimates produced by such a model LCA provide a potential range of resource utilization for an average or aggregate workload\.

That being said, the data challenges for ML model LCAs may be becoming less of an issue as information is increasingly available via voluntary disclosures \(e\.g\. disclosure of hardware embodied costs for GPUs by NVIDIA \(NVIDIA, [2025a](https://arxiv.org/html/2606.07632#bib.bib206), [b](https://arxiv.org/html/2606.07632#bib.bib205)\)\), publicly accessible databases \(e\.g\. EcoInvent \(Wernet et al\., [2016](https://arxiv.org/html/2606.07632#bib.bib212)\)\), and government mandates \(e\.g\. reporting requirements in the EU AI Act \(European Parliament and Council of the European Union, [2024](https://arxiv.org/html/2606.07632#bib.bib198)\)\)\. Likewise, direct measurement is increasingly accessible to ML practitioners, as many power measurement utilities \(e\.g\., nvidia\-smi\) have existing integrations with commonly used monitoring tools \(e\.g\., wandb\)\.
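As one possible illustration of such direct measurement, the sketch below reads instantaneous GPU power draw by calling `nvidia-smi` with its `--query-gpu=power.draw` query\. The helper names are hypothetical, the code assumes an NVIDIA driver is installed, and it captures GPU board power only\. The resulting readings could then be passed to an experiment tracker such as wandb\.

```python
import subprocess


def parse_power_watts(csv_output: str) -> list[float]:
    """Parse one power reading per line, as printed by nvidia-smi with
    --format=csv,noheader,nounits. Non-numeric fields are skipped."""
    readings = []
    for line in csv_output.strip().splitlines():
        field = line.strip()
        try:
            readings.append(float(field))
        except ValueError:
            # Some GPUs report a placeholder such as "[N/A]" instead of a
            # number; skipping it means list positions may not match GPU ids.
            continue
    return readings


def read_gpu_power_watts() -> list[float]:
    """Query the current power draw (in watts) of each visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw",
         "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_power_watts(result.stdout)
```

Calling `read_gpu_power_watts()` in a loop with timestamps gives a power trace that can be integrated into energy\. The sampling interval is a design choice that trades overhead against the risk of missing short power spikes\.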

## 5 Benefits of LCAs in Machine Learning

Life cycle assessment enables analysis of the ML model ecosystem for a broad group of stakeholders, including: machine learning researchers and practitioners, users of AI systems, energy and data center providers, policy makers, regulators, and community groups\. We discuss the insights and decision\-making that would be enabled should the ML community develop comprehensive LCAs\.

##### LCA Provides Transparency to Consumers and Enterprise Customers\.

Eco\-feedback programs, such as the United States Environmental Protection Agency’s Energy Star, have been estimated to reduce consumers’ energy use by 5 trillion kWh, saving 4 billion metric tons of CO2e \(U\.S\. Environmental Protection Agency, [2026](https://arxiv.org/html/2606.07632#bib.bib213)\)\. While such labels exist for physical appliances, consumers lack information on the individual impact and footprint associated with their use of AI\-enabled systems\.

LCA of ML models provides a method for quantifying the costs associated with a consumer’s use of AI systems, and can enable them to assess the resource footprint associated with their individual choices in AI use\. LCAs enable consumers to adjust their utilization to prioritize sustainability, such as by individual reductions in AI usage or opting for less resource intensive models\.

##### LCA Enables Evaluation of the Effectiveness of Efficiency Research\.

Existing evaluations of machine learning models rely on a fragmented assortment of efficiency and task performance benchmarks \(Reddi et al\., [2020](https://arxiv.org/html/2606.07632#bib.bib179); Mattson et al\., [2020](https://arxiv.org/html/2606.07632#bib.bib180); Tschand et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib183); Wang et al\., [2024b](https://arxiv.org/html/2606.07632#bib.bib181); Yao et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib187); Hendrycks et al\., [2021](https://arxiv.org/html/2606.07632#bib.bib182)\)\. Machine learning algorithms and systems research has addressed efficiency concerns through the development of: model architectures \(Wang et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib161)\), efficient serving configurations \(Patel et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib95); Stojkovic et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib123); Shi et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib122); Wu et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib140); Li et al\., [2025b](https://arxiv.org/html/2606.07632#bib.bib70); Stojkovic et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib124)\), improved parallelization algorithms \(Hsia et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib158); Fernandez et al\., [2025b](https://arxiv.org/html/2606.07632#bib.bib33); You et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib159); Chung et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib164)\), adaptive inference methods \(Li et al\., [2024a](https://arxiv.org/html/2606.07632#bib.bib69)\), power\-efficient hardware \(Patterson et al\., [2021](https://arxiv.org/html/2606.07632#bib.bib165); Jouppi et al\., [2017](https://arxiv.org/html/2606.07632#bib.bib54)\), and carbon\-aware and demand\-response methods \(Xing et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib171)\)\. However, the lack of standardized functional units and of a consistent treatment of spatio\-temporal uncertainty produces research that relies on inconsistent hardware platforms, serving requirements, model architectures, task domains and performance constraints, and metrics for system efficiency, with academic and industry estimates sometimes differing by orders of magnitude \(Elsworth et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib204)\)\.

Life cycle assessment grounds efficiency evaluations with functional units and system boundaries defined in terms of the end\-to\-end use case, which are more easily compared across systems\. As seen in Figure [2\(b\)](https://arxiv.org/html/2606.07632#S3.F2.sf2), LCA enables comparative analysis of different system design choices, and determination of which optimizations provided by the research community translate to reductions in real\-world resource use and environmental impacts\. As such, ML researchers can use LCAs to better identify efficiency bottlenecks across the model life cycle and motivate new lines of research which target lifetime reductions in resource use\. For instance, LCAs could be used to compare resource utilization across stages of training to evaluate whether full retraining is more efficient than alternatives that extend the usable life of a deployed large language model, such as continual training, which uses additional training compute post\-deployment, or retrieval\-augmented generation, which requires additional computation during inference\.

##### LCA Empowers Developers of AI Models and Power Infrastructure to Effectively Allocate Resources\.

Further growth in machine learning is becoming constrained by fundamental limitations in the availability of computing hardware and the energy necessary to power it\. Life cycle assessment provides insight into the relative resource consumption and intensity of model training and deployment\. By identifying significant issues \(see §[3\.4](https://arxiv.org/html/2606.07632#S3.SS4)\), LCA can be used by machine learning researchers to identify research questions and directions addressing the elements of the model life cycle with the highest resource consumption and emissions, which present the largest bottlenecks to efficiency and opportunities for improvement\.

Additionally, LCA enables industry stakeholders to provision and allocate resources so that machine learning systems meet target efficiency and environmental goals, not just in terms of the marginal cost of training or inference, but contextualized within the whole machine learning life cycle\. Understanding the relative scales of demand for different life cycle stages enables infrastructure developers to project and accommodate the resource requirements of different workloads \(e\.g\. adapting compute and electrical infrastructure to handle synchronous training or online inference\)\.

##### LCA Informs Standards for AI Resource Utilization for Policy Makers\.

As the use of AI has grown and energy, water, and other impacts have materialized in many communities, policy makers at federal and local levels are increasingly interested in assessing and mitigating these impacts\. The energy, carbon, water, air pollutant, noise, and other impacts of AI data centers have driven interest from policy makers and communities in solutions\. Lawmakers in multiple countries have introduced bills and passed laws calling for methods to evaluate the resource requirements and sustainability of AI and establish standardized reporting systems \(NIST, [2023](https://arxiv.org/html/2606.07632#bib.bib210); Congress, [2024](https://arxiv.org/html/2606.07632#bib.bib188); The White House, [2025](https://arxiv.org/html/2606.07632#bib.bib196); European Parliament and Council of the European Union, [2024](https://arxiv.org/html/2606.07632#bib.bib198)\)\.

As a commonly used methodology in the assessment of manufactured products in other domains, LCA provides a ready\-to\-use framework for policy makers and developers of minimum efficiency standards, along with access to LCA practitioners capable of conducting such analyses\. Transparently defining scope and functional units can enable guidelines for voluntary or regulated impact reporting from industry stakeholders, and inform policy maker decisions\. For example, the U\.S\. Inflation Reduction Act \(117th United States Congress, [2022](https://arxiv.org/html/2606.07632#bib.bib189)\) specifies the use of LCA and a model developed by Argonne National Laboratory as the method required to estimate the life cycle greenhouse gas emissions of hydrogen production to determine eligibility for federal incentives\. As policymakers consider both incentives and regulations to minimize the environmental impacts of AI and ML, LCA can enable model comparison and account for both impact and performance\. These incentives and regulations can inform industry decision\-making regarding model development to account for resource requirements and external impact\.

##### LCA Improves Accuracy and Completeness in Resource Estimation and Projections\.

While the speed and computing requirements of machine learning research have grown with time, the methods required to evaluate the efficiency and resource consumption of the work have not kept up\. It is necessary to develop a methodological foundation for grounded assessments of cost\.

As shown in Figure [2\(b\)](https://arxiv.org/html/2606.07632#S3.F2.sf2), LCA can be used to allocate resource usage to components, and to estimate the relative importance of the constituent stages of hardware fabrication, model training, and inference\. Additionally, by applying LCA across different types of resources, researchers can account for machine learning’s environmental burden along with other key impacts commonly associated with the costs of computing, such as raw material extraction \(Boyd, [2011](https://arxiv.org/html/2606.07632#bib.bib15)\), water usage \(Li et al\., [2025a](https://arxiv.org/html/2606.07632#bib.bib65)\), public health \(Han et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib44)\), and per\- and polyfluoroalkyl substances \(PFAS\) \(Lee et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib62); Elgamal et al\., [2025a](https://arxiv.org/html/2606.07632#bib.bib160)\)\.

Furthermore, LCA enables researchers to estimate the costs of future machine learning systems and provides a tool to understand their potential environmental impact, long\-term trends, and rebound effects across a range of scenarios \(Luccioni et al\., [2025](https://arxiv.org/html/2606.07632#bib.bib73)\)\. Evaluating hypothetical systems with differing assumptions enables projection of the impact of: further scaling of ML systems \(Hoffmann et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib46); Rae et al\., [2021](https://arxiv.org/html/2606.07632#bib.bib100)\), automation of the development process with autoML \(Tornede et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib130)\), or alternative hardware platforms such as in edge or mobile settings \(Patterson et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib157)\)\.

## 6Call to Action: Enabling LCA for ML

Finally, we describe essential components and data gaps to be addressed to enable LCA as a standard practice for analyzing efficiency in machine learning\.

##### User\-Centric Evaluations and Metrics\.

Variability in the evaluation settings used to characterize efficiency and performance in ML models hinders fair comparison between studies and models\. Moreover, while standard efficiency metrics may be measurable and reproducible \(e\.g\. model parameter count, FLOPs\), they often fail to map directly to practical user\-side requirements such as latency constraints, financial cost, or energy budget \(Dehghani et al\., [2022a](https://arxiv.org/html/2606.07632#bib.bib25); Fernandez et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib31), [2025a](https://arxiv.org/html/2606.07632#bib.bib34)\)\. For functional units to correspond to user needs, efficiency benchmarks should not only measure hardware utilization or speed but also be grounded in the performance measures demanded by the use case\.

##### Transparency in Reporting from Model Developers\.

As observed in our example in Figure [2\(b\)](https://arxiv.org/html/2606.07632#S3.F2.sf2), the cost of a functional unit of inference is directly dependent on the serving configuration, hardware selection, and ML system design decisions\. Likewise, the cost of inference depends on the total number of inferences served, a datapoint needed to appropriately attribute the amortizable training and embodied costs to each inference\.
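One way to make this dependence explicit is the following amortization sketch, in which the symbols are illustrative notation and not definitions taken from the paper\. Let $c_{\text{inf}}$ be the marginal cost of one inference, $C_{\text{train}}$ and $C_{\text{emb}}$ the fixed training and embodied costs attributed to the model, and $N$ the total number of inferences served:

$$
c(N) = c_{\text{inf}} + \frac{C_{\text{train}} + C_{\text{emb}}}{N}
$$

Under this simplification the fixed term shrinks as $N$ grows, so a per\-inference cost cannot be attributed without knowing, or at least bounding, $N$\.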

While it is increasingly common for developers to release information on the total energy use and estimated carbon emissions of model pretraining, such measurements are often limited to the final training run and fail to account for development costs, the embodied costs of hardware, or the cost and frequency of inference\. For downstream users and regulators to accurately assess the cost of ML models, it is necessary for hardware manufacturers and large\-scale model developers to release information on the embodied resources and emissions associated with hardware fabrication, as well as the scale, frequency, and settings for model inference\.

Fortunately, there is precedent for both industry\-led transparency and joint initiatives between governments and industry model providers\. For instance, model providers currently self\-report through the Foundation Model Transparency Index and with Model Cards for frontier model releases \(Bommasani et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib150); Mitchell et al\., [2019](https://arxiv.org/html/2606.07632#bib.bib82)\)\. Likewise, both the U\.S\.’s Center for AI Standards and Innovation and the U\.K\.’s AI Safety Institute collaborate with model providers OpenAI and Anthropic to conduct pre\-release auditing of their models’ capabilities \(Anthropic, [2025](https://arxiv.org/html/2606.07632#bib.bib201); OpenAI, [2025b](https://arxiv.org/html/2606.07632#bib.bib199); National Institute of Standards and Technology \(NIST\), [2025](https://arxiv.org/html/2606.07632#bib.bib197)\)\. In addition to evaluating model capabilities, we advocate for the voluntary inclusion of the resource requirement and deployment details needed for model LCA in model cards and pre\-release audits\.

##### Standardization of LCA Reporting by Public Institutions\.

As industry and private institutions may lack incentives to disclose resource utilization due to concerns over competitive advantage and trade secrets, government organizations can work to define LCA\-based measurements for AI efficiency evaluations\. For instance, both the EU AI Act and the White House’s AI Action Plan outline the need for reporting requirements for providers of general purpose AI \(GPAI\) models \(European Parliament and Council of the European Union, [2024](https://arxiv.org/html/2606.07632#bib.bib198); The White House, [2025](https://arxiv.org/html/2606.07632#bib.bib196)\)\. Standards\-setting organizations and regulators in these regions, such as the European Commission or European Committee for Standardization, and the National Institute of Standards and Technology or the American National Standards Institute, respectively, can define reporting protocols and standards for AI systems\. Concretely, the EU has already mandated reporting of technical details, including model efficiency, for GPAI models to appropriate government agencies by 2027 \(see the EU AI Act Code of Practice: Transparency\)\.

##### Improved Granularity in Metrics and Monitoring Tools\.

Energy consumption extends beyond the GPU hardware accelerator, including other components of the computing stack such as the CPU, memory, disk, and interconnects \(Dodge et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib27); McAllister et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib79)\)\. However, existing reporting is often limited to GPU\-only power draw \(Dubey et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib28); Strubell et al\., [2019](https://arxiv.org/html/2606.07632#bib.bib126)\), or relies on approximations of energy use upper\-bounded by hardware thermal design power \(TDP\) rather than empirical direct measurements\. Furthermore, existing tooling struggles to measure power usage due to processor\-to\-processor variability and insufficient granularity in sampling frequency \(Courty et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib13); Verdecchia et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib131); Yang et al\., [2023](https://arxiv.org/html/2606.07632#bib.bib143)\)\. To perform LCA for ML models, direct per\-component measurements of resource utilization are needed for all associated computation\.
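To illustrate the gap between these accounting choices, the sketch below integrates sampled power traces per component and compares the result with a TDP\-based upper bound\. The function names, the component traces, and the 400 W TDP are hypothetical values chosen for readability, not measurements\.

```python
def energy_wh(timestamps_s: list[float], power_w: list[float]) -> float:
    """Integrate a sampled power trace with the trapezoidal rule,
    returning energy in watt-hours."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must align")
    joules = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        joules += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return joules / 3600.0


def tdp_upper_bound_wh(tdp_w: float, duration_s: float) -> float:
    """Energy estimate that assumes the device runs at TDP throughout."""
    return tdp_w * duration_s / 3600.0


# Hypothetical one-hour trace; real tools sample far more frequently.
timestamps = [0.0, 1800.0, 3600.0]
samples = {
    "gpu": [200.0, 300.0, 200.0],
    "cpu": [100.0, 100.0, 100.0],
}
per_component = {
    name: energy_wh(timestamps, watts) for name, watts in samples.items()
}
total_measured = sum(per_component.values())
gpu_tdp_estimate = tdp_upper_bound_wh(tdp_w=400.0, duration_s=3600.0)
print(per_component, total_measured, gpu_tdp_estimate)
```

In this toy trace, GPU\-only reporting omits the CPU share entirely, while the TDP\-based figure overstates the GPU share\. Neither matches the per\-component total, which is why direct measurement of each component matters\.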

##### Increased Research Support and Interdisciplinary Collaboration\.

Life cycle assessment is fundamentally interdisciplinary and requires the machine learning research community to pursue collaboration with other fields\. In this work, we primarily examine case studies for the energy use and carbon emissions of ML models\. However, it is necessary to engage with researchers upstream and downstream of the machine learning model development and deployment ecosystem to understand the full impact in other capacities\.

For instance, computer systems experts can provide insight into the efficiency of the underlying computing architectures, and semiconductor researchers can provide insight into the resources required for fabrication and disposal\. Additionally, it is necessary to engage with communities directly affected by model use, such as through collaboration with environmental science and public health researchers \(Han et al\., [2024](https://arxiv.org/html/2606.07632#bib.bib44); Li et al\., [2022](https://arxiv.org/html/2606.07632#bib.bib67)\)\. Interpretation of and action on the results of an LCA are made possible by cross\-cutting collaborations with domain experts and stakeholders\.

Finally, establishing LCA as an accepted practice for use with machine learning models requires the support of and adoption by the research community\. Our hope is that machine learning researchers will find nuanced evaluation of practical efficiency and environmental impact to be a compelling and promising research direction\.

## Acknowledgments

This work was supported by the National Science Foundation [Grant 2326610](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2326610) and the Graduate Research Fellowship Program under Grant No\. DGE2140739\. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author\(s\) and do not necessarily reflect the views of the National Science Foundation\. This research has been partially supported by Microsoft Corporation as part of the Keio CMU partnership\.

## References

- 117th United States Congress \(2022\)Inflation reduction act of 2022\.Note:Public Law 117\-169Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px4.p2.1)\.
- J\. Aljbour, T\. Wilson, and P\. Patel \(2024\)Powering intelligence: analyzing artificial intelligence and data center energy consumption\.EPRI White Paper no\. 3002028905\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p1.1),[§2\.3](https://arxiv.org/html/2606.07632#S2.SS3.p1.1)\.
- S\. Altman \(2025\)The gentle singularity\.Note:Sam Altman’s BlogAccessed: 2026\-01\-28External Links:[Link](https://blog.samaltman.com/the-gentle-singularity)Cited by:[§3\.1\.1](https://arxiv.org/html/2606.07632#S3.SS1.SSS1.p2.1)\.
- Anthropic \(2025\)Strengthening our safeguards through collaboration with us caisi and uk aisi\.External Links:[Link](https://www.anthropic.com/news/strengthening-our-safeguards-through-collaboration-with-us-caisi-and-uk-aisi)Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px2.p3.1)\.
- J\. C\. Bare \(2002\)TRACI: the tool for the reduction and assessment of chemical and other environmental impacts\.Journal of industrial ecology6\(3\-4\),pp\. 49–78\.Cited by:[§3\.3](https://arxiv.org/html/2606.07632#S3.SS3.p2.1)\.
- J\. Bare \(2011\)TRACI 2\.0: the tool for the reduction and assessment of chemical and other environmental impacts 2\.0\.Clean Technologies and Environmental Policy13,pp\. 687–696\.Cited by:[§3\.3](https://arxiv.org/html/2606.07632#S3.SS3.p2.1)\.
- M\. Bobrowsky \(2025\)Meta spending to soar on ai, massive data center\.Wall Street Journal\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p2.1)\.
- R\. Bommasani, K\. Klyman, S\. Longpre, B\. Xiong, S\. Kapoor, N\. Maslej, A\. Narayanan, and P\. Liang \(2024\)Foundation model transparency reports\.InProceedings of the AAAI/ACM Conference on AI, Ethics, and Society,Vol\.7,pp\. 181–195\.Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px2.p3.1)\.
- S\. B\. Boyd \(2011\)Life\-cycle assessment of semiconductors\.Springer Science & Business Media\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p2.1)\.
- K\. Cai and D\. M\. Sophia \(2025\)Alphabet plans massive capex hike, reports cloud revenue growth slowed\.Reuters\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p2.1)\.
- J\. Chung, Y\. Gu, I\. Jang, L\. Meng, N\. Bansal, and M\. Chowdhury \(2024\)Reducing energy bloat in large model training\.InProceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles,pp\. 144–159\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- C\. Ciaccia \(2025\)OpenAI ’dominating’ consumer AI token consumption, Anthropic wins enterprise: Barclays\.Note:Accessed: 2026\-05\-29External Links:[Link](https://seekingalpha.com/news/4505254-openai-dominating-consumer-ai-token-consumption-anthropic-wins-enterprise-barclays)Cited by:[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px4.p1.1)\.
- 118th United States Congress \(2024\)Artificial intelligence environmental impacts act of 2024\.Note:Senate Bill 3732Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px4.p1.1)\.
- B\. Courty, V\. Schmidt, S\. Luccioni, Goyal\-Kamal, MarionCoutarel, B\. Feld, J\. Lecourt, LiamConnell, A\. Saboni, Inimaz, supatomic, M\. Léval, L\. Blanche, A\. Cruveiller, ouminasara, F\. Zhao, A\. Joshi, A\. Bogroff, H\. de Lavoreille, N\. Laskaris, E\. Abati, D\. Blank, Z\. Wang, A\. Catovic, M\. Alencon, M\. Stechly, C\. Bauer, L\. O\. N\. de Araujo, JPW, and MinervaBooks \(2024\)Mlco2/codecarbon: v2\.4\.1\.Zenodo\.External Links:[Document](https://dx.doi.org/10.5281/zenodo.11171501),[Link](https://doi.org/10.5281/zenodo.11171501)Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px4.p1.1)\.
- M\. A\. Curran \(2006\)Life\-cycle assessment: principles and practice\.National Risk Management Research Laboratory, Office of Research and Development, U\.S\. Environmental Protection Agency\.Cited by:[§3](https://arxiv.org/html/2606.07632#S3.p2.1)\.
- T\. Dao, D\. Y\. Fu, M\. Ryabinin, D\. Hesslow, S\. Arora, S\. Yang, D\. Biderman, B\. Chen, A\. Mirhoseini, and P\. Liang \(Eds\.\) \(2025\)Third workshop on efficient systems for foundation models at international conference on machine learning\.Vancouver, Canada\.Cited by:[§2](https://arxiv.org/html/2606.07632#S2.p1.1)\.
- M\. Dehghani, Y\. Tay, A\. Arnab, L\. Beyer, and A\. Vaswani \(2022a\)The efficiency misnomer\.InInternational Conference on Learning Representations,External Links:[Link](https://openreview.net/forum?id=iulEMLYh1uR)Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px1.p1.1)\.
- M\. Dehghani, Y\. Tay, A\. Arnab, L\. Beyer, and A\. Vaswani \(2022b\)The efficiency misnomer\.InInternational Conference on Learning Representations,External Links:[Link](https://openreview.net/forum?id=iulEMLYh1uR)Cited by:[§2\.1](https://arxiv.org/html/2606.07632#S2.SS1.p1.1)\.
- Y\. Ding and T\. Shi \(2024\)Sustainable llm serving: environmental implications, challenges, and opportunities\.In2024 IEEE 15th International Green and Sustainable Computing Conference \(IGSC\),pp\. 37–38\.Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1)\.
- J\. Dodge, T\. Prewitt, R\. Tachet des Combes, E\. Odmark, R\. Schwartz, E\. Strubell, A\. S\. Luccioni, N\. A\. Smith, N\. DeCario, and W\. Buchanan \(2022\)Measuring the carbon intensity of ai in cloud instances\.InProceedings of the 2022 ACM conference on fairness, accountability, and transparency,pp\. 1877–1894\.Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px4.p1.1)\.
- A\. Dubey, A\. Jauhri, A\. Pandey, A\. Kadian, A\. Al\-Dahle, A\. Letman, A\. Mathur, A\. Schelten, A\. Yang, A\. Fan,et al\.\(2024\)The llama 3 herd of models\.arXiv preprint arXiv:2407\.21783\.Cited by:[§3\.3](https://arxiv.org/html/2606.07632#S3.SS3.p2.1),[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px4.p1.1)\.
- M\. Elgamal, D\. Carmean, E\. Ansari, O\. Zed, R\. Peri, S\. Manne, U\. Gupta, G\. Wei, D\. Brooks, G\. Hills,et al\.\(2025a\)CORDOBA: carbon\-efficient optimization framework for computing systems\.In2025 IEEE International Symposium on High Performance Computer Architecture \(HPCA\),pp\. 1289–1303\.Cited by:[§3\.2](https://arxiv.org/html/2606.07632#S3.SS2.p1.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p2.1)\.
- M\. Elgamal, A\. Mahmoud, G\. Wei, D\. Brooks, and G\. Hills \(2025b\)Modeling pfas in semiconductor manufacturing to quantify trade\-offs in energy efficiency and environmental impact of computing systems\.arXiv preprint arXiv:2505\.06727\.Cited by:[§3\.2](https://arxiv.org/html/2606.07632#S3.SS2.p1.1)\.
- C\. Elsworth, K\. Huang, D\. Patterson, I\. Schneider, R\. Sedivy, S\. Goodman, B\. Townsend, P\. Ranganathan, J\. Dean, A\. Vahdat,et al\.\(2025\)Measuring the environmental impact of delivering ai at google scale\.arXiv preprint arXiv:2508\.15734\.Cited by:[§3\.1\.1](https://arxiv.org/html/2606.07632#S3.SS1.SSS1.p2.1),[§3\.1\.2](https://arxiv.org/html/2606.07632#S3.SS1.SSS2.p3.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- European Parliament and Council of the European Union \(2024\)Regulation \(eu\) 2024/1689 of 13 june 2024 laying down harmonised rules on artificial intelligence \(artificial intelligence act\) and amending regulation \(ec\) no 1236/2005 and directive \(eu\) 2019/1937\.Vol\.L,Publications Office of the European Union\.External Links:[Link](https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng)Cited by:[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px4.p2.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px4.p1.1),[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px3.p1.1)\.
- A\. Faiz, S\. Kaneda, R\. Wang, R\. C\. Osi, P\. Sharma, F\. Chen, and L\. Jiang \(2024\)LLMCarbon: modeling the end\-to\-end carbon footprint of large language models\.InThe Twelfth International Conference on Learning Representations,Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1)\.
- J\. Fernandez, J\. Kahn, C\. Na, Y\. Bisk, and E\. Strubell \(2023\)The framework tax: disparities between inference efficiency in nlp research and deployment\.InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing,pp\. 1588–1600\.Cited by:[§2\.1](https://arxiv.org/html/2606.07632#S2.SS1.p1.1),[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px1.p1.1)\.
- J\. Fernandez, C\. Na, V\. Tiwari, Y\. Bisk, S\. Luccioni, and E\. Strubell \(2025a\)Energy considerations of large language model inference and efficiency optimizations\.InProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics \(Volume 1: Long Papers\),W\. Che, J\. Nabende, E\. Shutova, and M\. T\. Pilehvar \(Eds\.\),Vienna, Austria,pp\. 32556–32569\.External Links:[Link](https://aclanthology.org/2025.acl-long.1563/),[Document](https://dx.doi.org/10.18653/v1/2025.acl-long.1563),ISBN 979\-8\-89176\-251\-0Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1),[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px1.p1.1)\.
- J\. Fernandez, L\. Wehrstedt, L\. Shamis, M\. Elhoushi, K\. Saladi, Y\. Bisk, E\. Strubell, and J\. Kahn \(2025b\)Efficient hardware scaling and diminishing returns in large\-scale training of language models\.Transactions on Machine Learning Research\.External Links:ISSN 2835\-8856,[Link](https://openreview.net/forum?id=p7jQEf3wlh)Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- A\. Green, H\. Tai, J\. Noffsinger, and P\. Sachdeva \(2024\)How data centers and the energy sector can sate ai’s hunger for power\.McKinsey and Company\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p1.1),[§2\.3](https://arxiv.org/html/2606.07632#S2.SS3.p1.1)\.
- U\. Gupta, M\. Elgamal, G\. Hills, G\. Wei, H\. S\. Lee, D\. Brooks, and C\. Wu \(2022\)ACT: designing sustainable computer systems with an architectural carbon modeling tool\.InProceedings of the 49th Annual International Symposium on Computer Architecture,pp\. 784–799\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p4.1)\.
- U\. Gupta, Y\. G\. Kim, S\. Lee, J\. Tse, H\. S\. Lee, G\. Wei, D\. Brooks, and C\. Wu \(2021\)Chasing carbon: the elusive environmental footprint of computing\.In2021 IEEE International Symposium on High\-Performance Computer Architecture \(HPCA\),pp\. 854–867\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p4.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px2.p1.1)\.
- Y\. Han, Z\. Wu, P\. Li, A\. Wierman, and S\. Ren \(2024\)The unpaid toll: quantifying the public health impact of ai\.arXiv preprint arXiv:2412\.06288\.Cited by:[§3\.2](https://arxiv.org/html/2606.07632#S3.SS2.p1.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p2.1),[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px5.p2.1)\.
- D\. Hendrycks, C\. Burns, S\. Basart, A\. Zou, M\. Mazeika, D\. Song, and J\. Steinhardt \(2021\)Measuring massive multitask language understanding\.Proceedings of the International Conference on Learning Representations \(ICLR\)\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- J\. Hoffmann, S\. Borgeaud, A\. Mensch, E\. Buchatskaya, T\. Cai, E\. Rutherford, D\. de Las Casas, L\. A\. Hendricks, J\. Welbl, A\. Clark,et al\.\(2022\)Training compute\-optimal large language models\.InProceedings of the 36th International Conference on Neural Information Processing Systems,pp\. 30016–30030\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p3.1)\.
- S\. Hsia, A\. Golden, B\. Acun, N\. Ardalani, Z\. DeVito, G\. Wei, D\. Brooks, and C\. Wu \(2024\)MAD\-max beyond single\-node: enabling large machine learning model acceleration on distributed systems\.In2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture \(ISCA\),pp\. 818–833\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- M\. Isaac \(2025\)Meta to increase spending to $65 billion this year in a\.i\. push\.New York Times\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p2.1)\.
- ISO 14040:2006 \(2006\)Environmental management – Life cycle assessment – Principles and framework\.Standard, Vol\.2006,International Organization for Standardization,Geneva, CH\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p4.1.1),[§3](https://arxiv.org/html/2606.07632#S3.p3.1)\.
- ISO 14044:2006 \(2006\)Environmental management – Life cycle assessment – Requirements and guidelines\.Standard, Vol\.2006,International Organization for Standardization,Geneva, CH\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p4.1.1),[§3](https://arxiv.org/html/2606.07632#S3.p3.1)\.
- W\. S\. Jevons \(1866\)The coal question\.InThe Economics of Population,pp\. 193–204\.Cited by:[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px1.p2.1)\.
- S\. Ji, Z\. Yang, X\. Chen, S\. Cahoon, J\. Hu, Y\. Shi, A\. K\. Jones, and P\. Zhou \(2024\)SCARIF: towards carbon modeling of cloud servers with accelerators\.In2024 IEEE Computer Society Annual Symposium on VLSI \(ISVLSI\),pp\. 496–501\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p4.1)\.
- Joint Legislative Audit and Review Commission \(2024\)Data Centers in Virginia\.Technical Report 598,Commonwealth of Virginia\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p2.1)\.
- N\. P\. Jouppi, C\. Young, N\. Patil, D\. Patterson, G\. Agrawal, R\. Bajwa, S\. Bates, S\. Bhatia, N\. Boden, A\. Borchers,et al\.\(2017\)In\-datacenter performance analysis of a tensor processing unit\.InProceedings of the 44th annual international symposium on computer architecture,pp\. 1–12\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- J\. Koomey, S\. Berard, M\. Sanchez, and H\. Wong \(2010\)Implications of historical trends in the electrical efficiency of computing\.IEEE Annals of the History of Computing33\(3\),pp\. 46–54\.Cited by:[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px1.p1.1)\.
- N\. Lambert, J\. Morrison, V\. Pyatkin, S\. Huang, H\. Ivison, F\. Brahman, L\. J\. V\. Miranda, A\. Liu, N\. Dziri, S\. Lyu, Y\. Gu, S\. Malik, V\. Graf, J\. D\. Hwang, J\. Yang, R\. L\. Bras, O\. Tafjord, C\. Wilhelm, L\. Soldaini, N\. A\. Smith, Y\. Wang, P\. Dasigi, and H\. Hajishirzi \(2025\)Tulu 3: pushing frontiers in open language model post\-training\.External Links:[Link](https://arxiv.org/abs/2411.15124),2411\.15124Cited by:[Figure 1](https://arxiv.org/html/2606.07632#S2.F1.2.1),[Figure 1](https://arxiv.org/html/2606.07632#S2.F1.4.2)\.
- J\. C\. Lee, S\. Smaoui, J\. Duffill, B\. Marandi, and T\. Varzakas \(2025\)Forever chemicals pfas global impact and activities, cascading consequences of colossal systems failure: long\-term health effects, food\-systems, eco\-systems\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p2.1)\.
- B\. Li, Y\. Jiang, V\. Gadepally, and D\. Tiwari \(2024a\)Sprout: green generative ai with carbon\-efficient llm inference\.InProceedings of the 2024 Conference on Empirical Methods in Natural Language Processing,pp\. 21799–21813\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- P\. Li, J\. Yang, M\. A\. Islam, and S\. Ren \(2022\)Making ai less “thirsty”: uncovering and addressing the secret water footprint of ai models\.Artificial intelligence \(AI\),pp\. 4\.Cited by:[§3\.1\.2](https://arxiv.org/html/2606.07632#S3.SS1.SSS2.p3.1),[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px5.p2.1)\.
- P\. Li, J\. Yang, M\. A\. Islam, and S\. Ren \(2025a\)Making ai less ‘thirsty’\.Communications of the ACM68\(7\),pp\. 54–61\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p2.1)\.
- Y\. Li, Z\. Hu, E\. Choukse, R\. Fonseca, G\. E\. Suh, and U\. Gupta \(2025b\)Ecoserve: designing carbon\-aware ai inference systems\.arXiv preprint arXiv:2502\.05043\.Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- Y\. L\. Li, O\. Graif, and U\. Gupta \(2024b\)Towards carbon\-efficient llm life cycle\.InProceedings of the 3rd Workshop on Sustainable Computer Systems \(HotCarbon\),Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1)\.
- A\. S\. Luccioni, E\. Strubell, and K\. Crawford \(2025\)From efficiency gains to rebound effects: the problem of jevons’ paradox in ai’s polarized environmental debate\.InProceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency,pp\. 76–88\.Cited by:[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px1.p2.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p3.1)\.
- A\. S\. Luccioni, S\. Viguier, and A\. Ligozat \(2023\)Estimating the carbon footprint of bloom, a 176b parameter language model\.Journal of Machine Learning Research24\(253\),pp\. 1–15\.Cited by:[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px3.p1.1)\.
- S\. Luccioni, Y\. Jernite, and E\. Strubell \(2024\)Power hungry processing: watts driving the cost of ai deployment?\.InProceedings of the 2024 ACM conference on fairness, accountability, and transparency,pp\. 85–99\.Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1)\.
- P\. Mattson, C\. Cheng, G\. Diamos, C\. Coleman, P\. Micikevicius, D\. Patterson, H\. Tang, G\. Wei, P\. Bailis, V\. Bittorf,et al\.\(2020\)Mlperf training benchmark\.Proceedings of Machine Learning and Systems2,pp\. 336–349\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- S\. McAllister, F\. Kazhamiaka, D\. S\. Berger, R\. Fonseca, K\. Frost, A\. Ogus, M\. Sah, R\. Bianchini, G\. Amvrosiadis, N\. Beckmann,et al\.\(2024\)A call for research on storage emissions\.InProceedings of the 3rd Workshop on Sustainable Computer Systems \(HotCarbon\),Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px4.p1.1)\.
- G\. Menghani \(2023\)Efficient deep learning: a survey on making deep learning models smaller, faster, and better\.ACM Computing Surveys55\(12\),pp\. 1–37\.Cited by:[§2](https://arxiv.org/html/2606.07632#S2.p1.1)\.
- M\. Mitchell, S\. Wu, A\. Zaldivar, P\. Barnes, L\. Vasserman, B\. Hutchinson, E\. Spitzer, I\. D\. Raji, and T\. Gebru \(2019\)Model cards for model reporting\.InProceedings of the conference on fairness, accountability, and transparency,pp\. 220–229\.Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px2.p3.1)\.
- J\. Morrison, C\. Na, J\. Fernandez, T\. Dettmers, E\. Strubell, and J\. Dodge \(2025\)Holistically evaluating the environmental impact of creating language models\.InThe Thirteenth International Conference on Learning Representations,External Links:[Link](https://openreview.net/forum?id=04qx93Viwj)Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1),[Figure 2](https://arxiv.org/html/2606.07632#S3.F2),[Figure 2](https://arxiv.org/html/2606.07632#S3.F2.4.2),[§3\.2](https://arxiv.org/html/2606.07632#S3.SS2.p1.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px3.p1.1),[footnote 3](https://arxiv.org/html/2606.07632#footnote3)\.
- National Academies of Sciences, Engineering, and Medicine \(2022\)Current methods for life\-cycle analyses of low\-carbon transportation fuels in the united states\.Cited by:[§3\.2](https://arxiv.org/html/2606.07632#S3.SS2.p1.1)\.
- National Institute of Standards and Technology \(NIST\) \(2025\)”Accelerating ai innovation through measurement science”\.External Links:[Link](https://www.nist.gov/blogs/caisi-research-blog/accelerating-ai-innovation-through-measurement-science)Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px2.p3.1)\.
- S\. Nguyen, B\. Zhou, Y\. Ding, and S\. Liu \(2024\)Towards sustainable large language model serving\.ACM SIGENERGY Energy Informatics Review4\(5\),pp\. 134–140\.Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1)\.
- NIST \(2023\)Artificial intelligence risk management framework \(ai rmf 1\.0\)\.URL: https://nvlpubs\. nist\. gov/nistpubs/ai/nist\. ai,pp\. 100–1\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px4.p1.1)\.
- NVIDIA \(2025a\)Product carbon footprint summary for nvidia hgx b200\.External Links:[Link](https://images.nvidia.com/aem-dam/Solutions/documents/HGX-B200-PCF-Summary.pdf)Cited by:[§3\.5](https://arxiv.org/html/2606.07632#S3.SS5.p2.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px2.p1.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px4.p2.1)\.
- NVIDIA \(2025b\)Product carbon footprint summary for nvidia hgx h100\.External Links:[Link](https://images.nvidia.com/aem-dam/Solutions/documents/HGX-H100-PCF-Summary.pdf)Cited by:[§3\.5](https://arxiv.org/html/2606.07632#S3.SS5.p2.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px2.p1.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px4.p2.1)\.
- OpenAI \(2025a\)Announcing the stargate project\.External Links:[Link](https://openai.com/index/announcing-the-stargate-project/)Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p2.1)\.
- OpenAI \(2025b\)Working with us caisi and uk aisi to build more secure ai systems\.External Links:[Link](https://openai.com/index/us-caisi-uk-aisi-ai-update/)Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px2.p3.1)\.
- F\. Oviedo, F\. Kazhamiaka, E\. Choukse, A\. Kim, A\. Luers, M\. Nakagawa, R\. Bianchini, and J\. M\. L\. Ferres \(2025\)Energy use of ai inference: efficiency pathways and test\-time compute\.arXiv preprint arXiv:2509\.20241\.Cited by:[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px1.p1.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px3.p1.1)\.
- M\. Parashar, T\. DeBlanc\-Knowles, E\. Gianchandani, and L\. E\. Parker \(2023\)Strengthening and democratizing artificial intelligence research and development\.Computer56\(11\),pp\. 85–90\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p2.1)\.
- P\. Patel, E\. Choukse, C\. Zhang, Í\. Goiri, B\. Warrier, N\. Mahalingam, and R\. Bianchini \(2024\)Characterizing power management opportunities for llms in the cloud\.InProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3,pp\. 207–222\.Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- D\. Patterson, J\. M\. Gilbert, M\. Gruteser, E\. Robles, K\. Sekar, Y\. Wei, and T\. Zhu \(2024\)Energy and emissions of machine learning on smartphones vs\. the cloud\.Communications of the ACM67\(2\),pp\. 86–97\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p3.1)\.
- D\. Patterson, J\. Gonzalez, U\. Hölzle, Q\. Le, C\. Liang, L\. Munguia, D\. Rothchild, D\. R\. So, M\. Texier, and J\. Dean \(2022\)The carbon footprint of machine learning training will plateau, then shrink\.Computer55\(7\),pp\. 18–28\.External Links:[Link](https://arxiv.org/abs/2204.05149),2204\.05149Cited by:[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px1.p1.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px3.p1.1)\.
- D\. Patterson, J\. Gonzalez, Q\. Le, C\. Liang, L\. Munguia, D\. Rothchild, D\. So, M\. Texier, and J\. Dean \(2021\)Carbon emissions and large neural network training\.arXiv preprint arXiv:2104\.10350\.Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- J\. W\. Rae, S\. Borgeaud, T\. Cai, K\. Millican, J\. Hoffmann, F\. Song, J\. Aslanides, S\. Henderson, R\. Ring, S\. Young,et al\.\(2021\)Scaling language models: methods, analysis & insights from training gopher\.arXiv preprint arXiv:2112\.11446\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p3.1)\.
- V\. J\. Reddi, C\. Cheng, D\. Kanter, P\. Mattson, G\. Schmuelling, C\. Wu, B\. Anderson, M\. Breughe, M\. Charlebois, W\. Chou,et al\.\(2020\)Mlperf inference benchmark\.In2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture \(ISCA\),pp\. 446–459\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- M\. Rezagholizadeh, P\. Passban, Y\. Cheng, S\. Samiee, Y\. Dong, V\. Partovi Nia, Q\. Liu, and B\. Chen \(Eds\.\) \(2024\)Fourth workshop on efficient natural language and speech processing\.Vancouver, Canada\.Cited by:[§2](https://arxiv.org/html/2606.07632#S2.p1.1)\.
- N\. Sadat Moosavi, I\. Gurevych, Y\. Hou, G\. Kim, Y\. J\. Kim, T\. Schuster, and A\. Agrawal \(Eds\.\) \(2023\)Proceedings of the fourth workshop on simple and efficient natural language processing \(sustainlp\)\.Association for Computational Linguistics,Toronto, Canada \(Hybrid\)\.External Links:[Link](https://aclanthology.org/2023.sustainlp-1.0/)Cited by:[§2](https://arxiv.org/html/2606.07632#S2.p1.1)\.
- V\. Sangarya, R\. Bradford, and J\. Kim \(2024\)Estimating environmental cost throughout model’s adaptive life cycle\.InProceedings of the AAAI/ACM Conference on AI, Ethics, and Society,Vol\.7,pp\. 1281–1291\.Cited by:[§3\.1\.2](https://arxiv.org/html/2606.07632#S3.SS1.SSS2.p2.1)\.
- I\. Schneider, H\. Xu, S\. Benecke, D\. Patterson, K\. Huang, P\. Ranganathan, and C\. Elsworth \(2025\)Life\-cycle emissions of ai hardware: a cradle\-to\-grave approach and generational trends\.arXiv preprint arXiv:2502\.01671\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p4.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px2.p1.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px4.p1.1)\.
- A\. Shehabi, A\. Hubbard, A\. Newkirk, N\. Lei, M\. A\. B\. Siddik, B\. Holecek, J\. Koomey, E\. Masanet, D\. Sartor,et al\.\(2024\)2024 united states data center energy usage report\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p1.1),[§2\.3](https://arxiv.org/html/2606.07632#S2.SS3.p1.1)\.
- T\. Shi, Y\. Wu, S\. Liu, and Y\. Ding \(2024\)GreenLLM: disaggregating large language model serving on heterogeneous gpus for lower carbon emissions\.arXiv preprint arXiv:2412\.20322\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- B\. Smith \(2025\)The golden opportunity for american ai\.External Links:[Link](https://blogs.microsoft.com/on-the-issues/2025/01/03/the-golden-opportunity-for-american-ai/)Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p2.1)\.
- J\. Stojkovic, E\. Choukse, C\. Zhang, I\. Goiri, and J\. Torrellas \(2024\)Towards greener llms: bringing energy\-efficiency to the forefront of llm inference\.arXiv preprint arXiv:2403\.20306\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- J\. Stojkovic, C\. Zhang, Í\. Goiri, J\. Torrellas, and E\. Choukse \(2025\)Dynamollm: designing llm inference clusters for performance and energy efficiency\.In2025 IEEE International Symposium on High Performance Computer Architecture \(HPCA\),pp\. 1348–1362\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- E\. Strubell, A\. Ganesh, and A\. McCallum \(2019\)Energy and policy considerations for deep learning in NLP\.InProceedings of the 57th Annual Meeting of the Association for Computational Linguistics,A\. Korhonen, D\. Traum, and L\. Màrquez \(Eds\.\),Florence, Italy,pp\. 3645–3650\.External Links:[Link](https://aclanthology.org/P19-1355/),[Document](https://dx.doi.org/10.18653/v1/P19-1355)Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px4.p1.1)\.
- E\. Strubell, A\. Ganesh, and A\. McCallum \(2020\)Energy and policy considerations for modern deep learning research\.InProceedings of the AAAI conference on artificial intelligence,Vol\.34,pp\. 13693–13696\.Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1)\.
- Y\. Sui, Y\. Chuang, G\. Wang, J\. Zhang, T\. Zhang, J\. Yuan, H\. Liu, A\. Wen, S\. Zhong, H\. Chen, and X\. Hu \(2025\)Stop overthinking: a survey on efficient reasoning for large language models\.External Links:2503\.16419,[Link](https://arxiv.org/abs/2503.16419)Cited by:[§2](https://arxiv.org/html/2606.07632#S2.p1.1)\.
- Y\. Tay, M\. Dehghani, D\. Bahri, and D\. Metzler \(2022\)Efficient transformers: a survey\.ACM Comput\. Surv\.55\(6\)\.External Links:ISSN 0360\-0300,[Link](https://doi.org/10.1145/3530811),[Document](https://dx.doi.org/10.1145/3530811)Cited by:[§2](https://arxiv.org/html/2606.07632#S2.p1.1)\.
- The White House \(2025\)Winning the race: america’s ai action plan\.Technical reportExecutive Office of the President\.External Links:[Link](https://www.whitehouse.gov/wp-content/uploads/2025/07/Americas-AI-Action-Plan.pdf)Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px4.p1.1),[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px3.p1.1)\.
- T\. Tornede, A\. Tornede, J\. Hanselle, F\. Mohr, M\. Wever, and E\. Hüllermeier \(2023\)Towards green automated machine learning: status quo and future directions\.Journal of Artificial Intelligence Research77,pp\. 427–457\.Cited by:[§3\.1\.2](https://arxiv.org/html/2606.07632#S3.SS1.SSS2.p2.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px5.p3.1)\.
- M\. Treviso, J\. Lee, T\. Ji, B\. van Aken, Q\. Cao, M\. R\. Ciosici, M\. Hassid, K\. Heafield, S\. Hooker, C\. Raffel, P\. H\. Martins, A\. F\. T\. Martins, J\. Z\. Forde, P\. Milder, E\. Simpson, N\. Slonim, J\. Dodge, E\. Strubell, N\. Balasubramanian, L\. Derczynski, I\. Gurevych, and R\. Schwartz \(2023\)Efficient methods for natural language processing: a survey\.Transactions of the Association for Computational Linguistics11,pp\. 826–860\.External Links:[Link](https://aclanthology.org/2023.tacl-1.48/),[Document](https://dx.doi.org/10.1162/tacl%5Fa%5F00577)Cited by:[§2](https://arxiv.org/html/2606.07632#S2.p1.1)\.
- A\. Tschand, A\. T\. R\. Rajan, S\. Idgunji, A\. Ghosh, J\. Holleman, C\. Kiraly, P\. Ambalkar, R\. Borkar, R\. Chukka, T\. Cockrell,et al\.\(2025\)MLPerf power: benchmarking the energy efficiency of machine learning systems from μwatts to mwatts for sustainable ai\.In2025 IEEE International Symposium on High Performance Computer Architecture \(HPCA\),pp\. 1201–1216\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- U\.S\. Environmental Protection Agency \(2026\)About ENERGY STAR\.Note:[https://www\.energystar\.gov/about](https://www.energystar.gov/about)Accessed: 2026\-05\-29Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px1.p1.1)\.
- R\. Verdecchia, J\. Sallou, and L\. Cruz \(2023\)A systematic review of green ai\.WIREs Data Mining and Knowledge Discovery13\(4\),pp\. e1507\.External Links:[Document](https://dx.doi.org/10.1002/widm.1507),[Link](https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/widm.1507)Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px4.p1.1)\.
- E\. P\. Walsh, L\. Soldaini, D\. Groeneveld, K\. Lo, S\. Arora, A\. Bhagia, Y\. Gu, S\. Huang, M\. Jordan, N\. Lambert,et al\.\(2025\)2 olmo 2 furious \(colm’s version\)\.InSecond Conference on Language Modeling,Cited by:[Figure 1](https://arxiv.org/html/2606.07632#S2.F1.2.1),[Figure 1](https://arxiv.org/html/2606.07632#S2.F1.4.2),[Figure 2](https://arxiv.org/html/2606.07632#S3.F2),[Figure 2](https://arxiv.org/html/2606.07632#S3.F2.4.2),[§3\.3](https://arxiv.org/html/2606.07632#S3.SS3.p2.1)\.
- Z\. Wan, X\. Wang, C\. Liu, S\. Alam, Y\. Zheng,et al\.\(2023\)Efficient large language models: a survey\.arXiv preprint arXiv:2312\.03863\.Cited by:[§2](https://arxiv.org/html/2606.07632#S2.p1.1)\.
- I\. Wang, M\. Elhoushi, H\. E\. Sumbul, S\. Hsia, D\. Jiang, N\. Ardalani, D\. Mahajan, C\. Wu, and B\. Acun \(2025\)CATransformers: carbon aware transformers through joint model\-hardware optimization\.InThe Thirty\-ninth Annual Conference on Neural Information Processing Systems,External Links:[Link](https://openreview.net/forum?id=IjMZfMVyLF)Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- Y\. Wang, S\. Roy, M\. Mancini, K\. Zhou, E\. Tartaglione, A\. Agrawal, G\. A\. Tadesse, M\. Cristani, and Z\. Akata \(Eds\.\) \(2024a\)First workshop on green foundation models\.Milan, Italy\.Cited by:[§2](https://arxiv.org/html/2606.07632#S2.p1.1)\.
- Y\. Wang, X\. Ma, G\. Zhang, Y\. Ni, A\. Chandra, S\. Guo, W\. Ren, A\. Arulraj, X\. He, Z\. Jiang,et al\.\(2024b\)Mmlu\-pro: a more robust and challenging multi\-task language understanding benchmark\.InThe Thirty\-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track,Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- S\. Welleck, A\. Bertsch, M\. Finlayson, H\. Schoelkopf, A\. Xie, G\. Neubig, I\. Kulikov, and Z\. Harchaoui \(2024\)From decoding to meta\-generation: inference\-time algorithms for large language models\.Transactions on Machine Learning Research\.Note:Survey CertificationExternal Links:ISSN 2835\-8856,[Link](https://openreview.net/forum?id=eskQMcIbMS)Cited by:[Figure 1](https://arxiv.org/html/2606.07632#S2.F1.2.1),[Figure 1](https://arxiv.org/html/2606.07632#S2.F1.4.2),[§3\.1\.2](https://arxiv.org/html/2606.07632#S3.SS1.SSS2.p2.1)\.
- G\. Wernet, C\. Bauer, B\. Steubing, J\. Reinhard, E\. Moreno\-Ruiz, and B\. Weidema \(2016\)The ecoinvent database version 3 \(part i\): overview and methodology\.The International Journal of Life Cycle Assessment21\(9\),pp\. 1218–1230\.External Links:[Document](https://dx.doi.org/10.1007/s11367-016-1087-8),[Link](http://link.springer.com/10.1007/s11367-016-1087-8)Cited by:[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px4.p2.1)\.
- C\. Wu, R\. Raghavendra, U\. Gupta, B\. Acun, N\. Ardalani, K\. Maeng, G\. Chang, F\. Aga, J\. Huang, C\. Bai,et al\.\(2022\)Sustainable ai: environmental implications, challenges and opportunities\.Proceedings of Machine Learning and Systems4,pp\. 795–813\.Cited by:[§1](https://arxiv.org/html/2606.07632#S1.p4.1),[§3\.2](https://arxiv.org/html/2606.07632#S3.SS2.p1.1),[§4](https://arxiv.org/html/2606.07632#S4.SS0.SSS0.Px3.p1.1)\.
- Y\. Wu, I\. Hua, and Y\. Ding \(2025\)Unveiling environmental impacts of large language model serving: a functional unit view\.InProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics \(Volume 1: Long Papers\),W\. Che, J\. Nabende, E\. Shutova, and M\. T\. Pilehvar \(Eds\.\),Vienna, Austria,pp\. 10560–10576\.External Links:[Link](https://aclanthology.org/2025.acl-long.519/),[Document](https://dx.doi.org/10.18653/v1/2025.acl-long.519),ISBN 979\-8\-89176\-251\-0Cited by:[§2\.2](https://arxiv.org/html/2606.07632#S2.SS2.p1.1),[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- J\. Xing, B\. Acun, A\. Sundarrajan, D\. Brooks, M\. Chakkaravarthy, N\. Avila, C\. Wu, and B\. C\. Lee \(2023\)Carbon responder: coordinating demand response for the datacenter fleet\.arXiv preprint arXiv:2311\.08589\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- Z\. Yang, K\. Adamek, and W\. Armour \(2023\)Part\-time power measurements: nvidia\-smi’s lack of attention\.arXiv preprint arXiv:2312\.02741\.Cited by:[§6](https://arxiv.org/html/2606.07632#S6.SS0.SSS0.Px4.p1.1)\.
- S\. Yao, N\. Shinn, P\. Razavi, and K\. R\. Narasimhan \(2025\)Tau\-bench: a benchmark for tool\-agent\-user interaction in real\-world domains\.InThe Thirteenth International Conference on Learning Representations,Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.
- J\. You, J\. Chung, and M\. Chowdhury \(2023\)Zeus: understanding and optimizing GPU energy consumption of DNN training\.In20th USENIX Symposium on Networked Systems Design and Implementation \(NSDI 23\),pp\. 119–139\.Cited by:[§5](https://arxiv.org/html/2606.07632#S5.SS0.SSS0.Px2.p1.1)\.