Benchmarks v0.1 for ClimateTech
Common quantitative goals have been foundational for other fields. Can we develop such targets for future climate-positive technologies?
This is a work in progress that I’m sharing broadly in hopes of generating feedback, ideas and community discussion. All thoughts and input are appreciated, especially as depth beyond my own expertise is needed to craft the most impactful benchmarks: the easiest way to reach me currently is via Twitter. Success in this effort would be helping onboard new people into the field of ClimateTech, with an emphasis on the bioengineering perspective. Together, I hope we can create shared targets that drive innovation, and support new funding efforts that can move at the speed of experimentation.
As more and more people look to get involved in climate technology development, it is essential that we craft entry points and shared community goals. I’ve argued previously that all hyper-productive communities share common attributes, and here I want to explore the importance of well-articulated shared goals. Many people, including myself, spend months and years just trying to find the right problem to work on. One lightweight solution to the “where do I start?” problem is to establish public targets around which people can nucleate their early-stage ideation. Success in this approach would be a set of quantitative benchmarks that aspiring entrepreneurs and scientists can aim to beat.
This is a work in progress: We intentionally put “V0.1” into the title to welcome community-wide input and conversation. There are creative tensions in this work, which we discuss in the closing section.
By establishing target goals, we can broaden the base of technologists working on this problem and direct funding to support creative, novel efforts.
Criteria for a good benchmark
Good benchmarks both activate a community and illuminate an individual’s first steps. We used the following metrics to assess quality throughout the process of assembling and editing these benchmarks.
Tangible and actionable, even for a non-expert. Is it obvious what this number means and why it is important? If it is not obvious, yet we deem it important, what information can we provide to most quickly get a newcomer up to speed?
Measurable on a minimal unit basis. What is the smallest experiment that could be run based on this benchmark? In the case of a crop plant like corn, “~50% of its nitrogen is synthetically fixed” (Bloch et al 2020) focuses on a single plant, but might be too granular depending on the state of the art of the experimental assay. The next step up in unit size might be to say “75lbs of N fertilizer per acre” (source: EPA 2019) and is perhaps the right unit to work with, whereas the largest unit, e.g. “total synthetically fixed nitrogen usage in the USA is 20 million tons,” may be interesting but unactionable (source).
Specific and easily replicable. Let’s say, for example, we eventually make a benchmark for nitrogen fixation in organisms. If the best nitrogen fixing organism’s N2 reduction reactions are measured at ~10^4/s/cell, a good benchmark would reference the paper, including the organism and assay. If a benchmark is not easily repeated in a lab, for example $300/ton of CO2 captured at atmospheric conditions, reference the best available publication.
Maximally future-proof. Benchmarks are naturally going to have a shorter horizon than roadmapping towards theoretical ideals, but good benchmarks should be maximally agnostic to current technology and with assumptions listed as explicitly as possible.
Scope of the benchmarking project
We pragmatically accept that not all benchmarks are equal in impact and mindfully avoid making prescriptive hierarchies. To properly determine which problems are more important than others is the domain of careful roadmapping with diverse stakeholders.
Consider, for example, the benchmark of 1 pound of synthetically fixed nitrogen via Haber Bosch (HB) per bushel of corn. Replacing HB-based ammonia with lower CO2 emission processes of nitrogen fixation would certainly be an important lever, yet one might argue that decreasing demand for synthetically fixed nitrogen is the “more important” benchmark. Such debates around hierarchies of impact are important but can quickly get mired in assumptions, proprietary information and projections: this is important but becomes a much larger project. As such, our principle here is that we need a portfolio of approaches for all the problems which we quantify in these benchmarks.
What are examples of successful benchmarks?
Clear goals for a field are both rallying points for community and entry points for outsiders who may bring valuable expertise. One could argue that the very public and tangible number of $/ton in the carbon dioxide removal space has helped many potential practitioners start their brainstorming. What are other cases in which clear benchmarks helped propel a field?
Protein structure prediction. For years, CASP was an annual event that challenged the structural biology and computer science community to predict the structure of a protein based on its sequence alone. The core quantitative benchmark was the Global Distance Test, with several variants. This challenge built around a benchmark attracted bigger and better efforts until in 2018, Google’s DeepMind entered the CASP13 competition and outperformed every other approach to such a degree that Mohammed AlQuraishi (a leader in the space) was compelled to blog “What just happened?” Since then, biotechnology has been permanently changed by access to estimates of previously unknown protein structures.
Autonomous Vehicles. Developing software that could autonomously drive vehicles in natural situations seemed like a long shot in the early 2000s when the DARPA Grand Challenge was announced. The milestone was to complete a 100+ mile course, but the benchmark was the longest distance the best performer of the previous year was able to navigate before getting stuck. In 2004 the best team accomplished 7.32 miles of a 150 mile course. In 2005, five vehicles successfully navigated the complete 132 mile course.
Machine Learning Challenges. The Netflix Prize was the first high-profile public benchmark challenge in modern computer science. Announced in 2006 with the then-unbelievable prize of $1MM cash, the challenge was to beat a benchmark algorithmic performance by at least 10%. Over 20,000 teams got involved, and the target performance was accomplished in 2009. Also in 2009, Fei-Fei Li and her team developed ImageNet, a landmark database and annual 2010-2017 competition that became the foundation of the Deep Learning revolution.
Athletics. Consider the 4-minute mile in 1954, the 2-hour marathon in 2019, and the 1,000 lb deadlift in 2006.
Genome reading and writing. The cost per genome sequenced was highly tangible to the biotech community and, as such, incentivized much focus and much funding. With the advent of CRISPR, there was a race in the protein engineering field to optimize the Cas9 enzyme to access as much of the genome as possible (cite Pranam’s Cas9). And for future genomic engineering challenges, such as writing whole chromosomes, Ostrov et al 2019 articulate the clear milestones of the genome writing community.
The V1 benchmarks will have four primary topics: Greenhouse Gas Removal, Materials, Agriculture and Energy.
Greenhouse Gas Removal
CO2 capture cost at atmospheric concentrations ($/ton)
Why this matters: Gigaton-scale carbon capture is needed urgently, and marginal cost is essential. According to CDRPrimer.org Section 1-1, the hard-to-avoid emissions for human society in 2022 are on the scale of 1.5-3 gigatons annually, meaning that even in optimistic policy and behavior change scenarios humanity will need billion-ton-scale carbon dioxide removal (CDR). <$100/ton has been a de facto target across the CDR community (refer to this entrepreneur’s guide to DAC from 2019). As it is unlikely that a single CDR approach can reliably reach and sustain gigaton carbon capture worldwide, we will need a portfolio of good approaches with different strengths. As one example of recent innovations, Professor Jennifer Wilcox’s oxide cycling (McQueen 2020) indicated that $46/ton capture might be possible, which is now being brought to market via the company Heirloom. Note that CDR approaches must be subject to the Measurement, Reporting and Verification (MRV) standards being established by the CDR community. Roughly, a gigaton at today’s rates might cost $1 trillion, which would be 1.1% of global GDP.
Today: ~$300-$600/ton CO2 via Direct Air Capture (DAC) at scale is the unpublished consensus of the DAC community.
Target: <$100/ton CO2. This proof of concept was established by Keith et al 2018 but not yet commercially demonstrated as of May 2022. <$100/ton would make a DAC company profitable at least in California where the carbon credit is $200/ton (World Resources Institute 2022).
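As a sanity check on the scale involved, here is a minimal back-of-envelope sketch; the ~$90 trillion global GDP figure is my assumption, not a number from this document:

```python
# Back-of-envelope: cost of gigaton-scale CDR at different $/ton prices.
GIGATON_TONS = 1e9
GDP_USD = 90e12  # assumed approximate 2022 global GDP

def annual_cdr_cost(usd_per_ton, gigatons=1.0):
    """Total cost (USD) of removing `gigatons` of CO2 at a given $/ton."""
    return usd_per_ton * gigatons * GIGATON_TONS

# At today's ~$300-600/ton DAC consensus, one gigaton costs $0.3-0.6T.
today_low = annual_cdr_cost(300)
today_high = annual_cdr_cost(600)

# At the <$100/ton target, the same gigaton costs under $100B,
# i.e. roughly 0.1% of global GDP.
target = annual_cdr_cost(100)
share_of_gdp = target / GDP_USD
```

The point of the sketch is that the $/ton benchmark translates linearly into a share of the world economy, which is what makes the <$100/ton target so consequential.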
CO2 capture energy expenditure at atmospheric concentrations (GJ/ton)
Why this matters: When designing carbon capture technologies, there may be scenarios in which energetic efficiency could be a primary variable to optimize. For example, in privileged environments with cheap or free power, energy-efficient designs could shine. The Climate Technology Primer series covered the theoretical energetic bounds of capturing CO2 gas from the air (see Section 2.1), which references a target efficiency of 30% on top of a theoretical minimum of 0.7 GJ/ton. For example, when Sahag Voskian and Professor Alan Hatton at MIT first published their electroswing technology in 2019, which later became the company Verdox, they achieved a head-turning 1 GJ/ton, but in conditions of 10-100x atmospheric concentrations. To put this all in context, global power usage is 83 exajoules per year, so without further innovation, capturing a gigaton at 10 GJ/ton would require an infeasible 12% of the world’s power budget.
Today: 5-10 GJ/ton CO2, as reported by the International Energy Agency in November 2021
Target: 2 GJ/ton CO2 (source: MacKay 2008). The theoretical minimum energy for capture would be 0.7 GJ/ton at 100% efficiency, and in Without the Hot Air MacKay estimated 30% efficiency as a reasonable best target. Storage energy cost can vary significantly by instantiation.
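Both the 12%-of-budget claim and the ~2 GJ/ton target can be rederived from the figures above; a quick sketch:

```python
# Rough check of the energy claims in this section, using figures from
# the text (10 GJ/ton today, 0.7 GJ/ton minimum, 83 EJ global budget).
GJ_PER_EJ = 1e9
GIGATON = 1e9          # tons CO2
GLOBAL_BUDGET_EJ = 83  # from the text

def share_of_budget(gj_per_ton, gigatons=1.0):
    """Fraction of the 83 EJ budget consumed by capturing `gigatons` CO2."""
    total_ej = gj_per_ton * gigatons * GIGATON / GJ_PER_EJ
    return total_ej / GLOBAL_BUDGET_EJ

# Today's 10 GJ/ton: about 12% of the budget per gigaton captured.
print(round(share_of_budget(10) * 100, 1))   # 12.0

# Target derivation: 0.7 GJ/ton theoretical minimum at 30% efficiency
# gives roughly 2.3 GJ/ton, i.e. the ~2 GJ/ton target above.
target_gj_per_ton = 0.7 / 0.30
print(round(share_of_budget(target_gj_per_ton) * 100, 1))  # 2.8
```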
Silicate weathering rates (mol CO2 / m2 s)
Why this matters: The silicate-carbonate cycle, referred to as rock weathering in the CDR community, is a natural gigaton-scale carbon flux per year. Silicates reacting with CO2 to form carbonates is perhaps the biggest natural lever the earth has for gigaton-scale capture and storage (Ciais et al 2013). While there are some big ideas, much innovation is necessary, and a perspective by Keleman et al 2020 is a good introduction. Briefly, the primary weathering reactions are silicate dissolution and carbonate precipitation, and the optimal conditions (eg, pH) for the two reactions differ. There have been some efforts to accelerate the weathering process, referred to as Enhanced Rock Weathering. Examples include distributing ground basalt across agricultural fields (Beerling 2020) or Project Vesta’s strategy of deploying olivine onto coastlines for waves to mechanically break up the rocks. There have been initial experiments using microbial co-culture (McCutcheon 2021), siderophores (Torres et al 2019) or carbonic anhydrase in a reactor context showing modest rate increases that are not yet practical (eg Power et al 2016), but there is not yet a leading paradigm toward order-of-magnitude increases in weathering rates. Choice of material is important (see Renforth 2019 for a review of alkaline materials), but acceleration is still needed. In nature, there is evidence that fungi, trees and lichens can accelerate rock weathering, but this has yet to be harnessed at scale.
Today: Natural weathering rates are 10^(−10.53) and 10^(−9.86) mol/(m^2 s) for basalt and dunite, respectively (Strefler et al 2018). Basalt is chosen because of its high availability but generally low reactivity, whereas dunite has one of the highest CO2 sequestration potentials.
Target: No consensus target exists yet, and the target can be considered in two forms: prioritizing speed or scalability for a process. In small volumes at high temperature and low pH, you could trivially increase the weathering rate by ~100x, but this is not a viable solution in practice. Hence, steps toward a >100x acceleration for a scalable process in closed reactors or 10x improvement over large, open volumes would be transformative.
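To see why acceleration matters so much, one can convert the natural basalt rate above into the surface area required to capture a gigaton per year; the land-area constant is my assumption, not from this document:

```python
# How much surface area would natural basalt weathering need to capture
# a gigaton of CO2 per year? Rate from Strefler et al 2018 (quoted above).
MOLAR_MASS_CO2 = 44.01      # g/mol
SECONDS_PER_YEAR = 3.156e7
EARTH_LAND_M2 = 1.49e14     # assumed Earth land surface area

def tons_co2_per_m2_year(log10_rate):
    """Convert a weathering rate in log10(mol CO2 / m^2 s) to tons/m^2/yr."""
    mol_per_year = 10 ** log10_rate * SECONDS_PER_YEAR
    return mol_per_year * MOLAR_MASS_CO2 / 1e6  # grams -> tons

basalt = tons_co2_per_m2_year(-10.53)       # ~4e-8 tons CO2 per m^2 per year
area_for_gigaton_m2 = 1e9 / basalt

# The required area is on the order of 100x Earth's entire land surface,
# which is why >100x rate acceleration is the interesting target.
print(area_for_gigaton_m2 / EARTH_LAND_M2)
```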
Energy to break rocks into <10 µm particles (kWh/ton)
Why this matters: Increasing the surface area of a given rock is critical to carbon capture and metal mining processes: finer material can react faster, be used in more versatile processing environments, and the impact of negative feedbacks (such as passivated surfaces) may be minimized. For an excellent introduction, we direct the reader to ARPA-E Program Director Doug Wicks’ OPEN 2021 talk on Energy Efficient Routes to Comminution. 10 µm is selected as a candidate grind size based on the tradeoff of energy input vs speed of weathering (Rinder and von Hagke 2021). However, below 10 µm particle size, 90+% of energy is lost as heat in the milling process: with existing technology, the comminution of rocks is a bottleneck to ERW-based CDR, in both operating and capital expenditure. Alternatively, there are natural mechanical weathering forces that are biological (eg, plant roots cracking rocks) or abiotic (eg, water freezing in cracks). Additionally, there are many examples from nature of biology weakening or dissolving rocks, such as the Red Alder tree (Perakis and Pett-Ridge 2019) or a common brown mushroom (Pinzari et al 2022). Rock comminution may be an obscure problem, but it could unlock scalable weathering-based carbon capture methods and lower-footprint mining operations, both of which are essential for a stable climate and an electrified industry.
Today: Olivine requires ~180 kWh/ton (Rinder and von Hagke 2021), and grinding can exceed 1000 kWh/ton in mining depending on the rock composition (Jeswiet et al 2016)
Target: Not defined. Roughly speaking: A proof of concept that demonstrates a path to significant improvement over the status quo would be of high interest.
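One hedged way to contextualize the 180 kWh/ton figure is to express it in the same GJ/ton-CO2 units as the earlier energy benchmark; the ~1:1 CO2-to-olivine capture ratio used here is an assumed round number, not a sourced value:

```python
# Scale of the grinding-energy problem, using the olivine figure above.
KWH_TO_GJ = 0.0036
OLIVINE_KWH_PER_TON = 180
CO2_PER_TON_OLIVINE = 1.0  # assumed capture ratio (optimistic round number)

# Grinding energy expressed in the GJ/ton-CO2 units used earlier:
grind_gj_per_ton_co2 = OLIVINE_KWH_PER_TON * KWH_TO_GJ / CO2_PER_TON_OLIVINE
print(round(grind_gj_per_ton_co2, 2))  # 0.65

# Grinding alone is already comparable to the 0.7 GJ/ton theoretical
# capture minimum; with 90+% of milling energy lost as heat below 10 µm,
# the useful fraction is far smaller, hence the interest in new routes.
```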
Carbon-Negative Performance Materials (tons CO2 / ton material)
Why this matters: Decarbonizing construction is an opportunity to simultaneously reduce emissions and sequester captured CO2 in the process. Wood is both a reliable building material and a long-term carbon store, and there have been efforts to increase its performance characteristics to be comparable to aluminum alloys (Xiao et al 2021). Cross-laminated timber is used today in construction and has been studied in life cycle analyses (Anderson et al 2022); one conclusion is that increases in usable biomass will be essential. Similarly, Arnold et al 2020 produced a detailed techno-economic analysis for carbon-negative carbon fiber via algal production of the polyacrylonitrile feedstock: they indicate potential for carbon fiber to reach gigaton-scale carbon storage, provided advancements in algal production can be discovered. As steel accounts for 8% of global greenhouse gas emissions, any viable replacements or footprint reductions should be explored (McKinsey 2020). At a similar scale, concrete is also 8% of global GHG emissions, and carbon-negative concrete has been a vibrant area of entrepreneurship (Nature editorial, 2021). More speculatively, there are many performance biomaterials from nature at various stages of study and deployment, such as spider silk, insect chitin or mollusc nacre. Inventing new concepts for carbon-rich, low-footprint materials could have a large carbon impact and may be best developed in collaboration with construction companies.
Today: Steel emits 1.85 tons of CO2 for each ton of steel (McKinsey 2020). Carbon fiber is <0.01% of steel’s volume and 20x the price, and each ton of carbon fiber emits almost 30 tons of CO2 GHG equivalents (source: Composites World 2019).
Target: Construction-ready material with a negative GHG footprint.
Edible calorie production rate (calories/acre/year)
Why this matters: Scalable, distributable and robust food systems are a critical component of the technologies humanity will need over the next 30 years. Focusing on calorie production density could help drive experimentation in cellular agriculture, indoor agriculture or increasing yields of existing plants. Focusing on calorie production rate could also drive innovation for stopgap solutions in dire situations of famine. As complex supply chains show fragility, and as changing weather patterns raise concern over traditional agriculture’s security, the time is right to consider new food production stacks. Organizations like ALLFED or New Harvest have been leading this new path, and more technology innovation is needed to enable decentralized production and resilience to unforeseen volatility such as drought and pests. One could start by modifying plants that we already know how to industrially process, or start from organisms with phenomenal growth rates. Consider the potential of fast-growing cyanobacteria that divide 16 times in a day (up to a ~65,000x biomass increase), or the aquatic fern Azolla, which is thought to have grown so fast as to cause an ice age (the Azolla Event); neither has yet been deployed at climate-relevant scale. It is possible that future paths for food production will be able to produce more food at a lower carbon footprint.
Today: Corn produces 15 million calories per acre per year (Washington Post, 2015)
Target: Can we beat corn on the acre scale anywhere on the planet?
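The ~65,000x figure above is simple doubling arithmetic, sketched here:

```python
# 16 cell divisions per day means a 2^16 increase in cell count,
# which is the "up to 65,000x biomass increase" quoted above.
def biomass_multiplier(doublings):
    """Fold-increase in population after `doublings` divisions."""
    return 2 ** doublings

print(biomass_multiplier(16))      # 65536

# Even fewer doublings per day compounds dramatically over a week:
# 3 doublings/day for 7 days is already a ~2-million-fold increase.
print(biomass_multiplier(3 * 7))   # 2097152
```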
Crop dependence on synthetically fixed nitrogen (pound synthetic N / bushel crop)
Why this matters: Roughly 50+% of crop nitrogen comes from synthetically fixed nitrogen. The denser the target crop yield (bushels/acre), the greater the dependence on fertilizer (example). Synthetically fixed nitrogen has an enormous footprint: 1.2% of global CO2 emissions come from Haber Bosch, at a rate of 1.8 tons of CO2 for each ton of synthetically fixed nitrogen as ammonia. Furthermore, excess nitrogen in soil has downstream effects, such as local toxic algal blooms and nitrous oxide release. 50% of the total anthropogenic flux of nitrous oxide, which is 300x more potent than CO2 and the number three source of radiative forcing on Earth, comes from agricultural fields and is strongly linked to fertilizer rate (Shcherbak 2014). In 2008 it was forecast that nitrogen-use efficiency could be increased by 50% (Erisman 2008), but some data indicate we are trending toward more, not less, fertilizer usage (FAO 2022 Report). There is significant room for innovation around reducing usage of synthetically fixed nitrogen, increasing naturally fixed nitrogen, preventing nitrous oxide formation, and engineering self-sufficient plant designs (Anas et al 2020 is a decent technical introduction in the context of sugarcane). It also raises fundamental questions about plant biology: Can nitrogen fixation occur anywhere on a plant, or why do nitrogen-fixing symbioses occur only in root nodules? Could it be done in leaf cells or in emulsion?
Today: Corn is about 1 pound fertilizer per bushel
Target: Not yet defined, anything substantially better and scalable is important.
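A rough sketch of what the 1 lb/bushel rule of thumb and the 1.8 tons CO2 per ton of fixed nitrogen imply at national scale; the acreage and yield values are assumed round numbers, not figures from this document:

```python
# CO2 footprint implied by corn's "1 pound of N per bushel" rule of thumb.
CO2_PER_TON_N = 1.8               # tons CO2 per ton fixed N (from the text)
LBS_PER_TON = 2000
ASSUMED_YIELD_BU_PER_ACRE = 175   # assumed typical US corn yield
ASSUMED_US_CORN_ACRES = 90e6      # assumed planted acreage

def corn_n_footprint_tons_co2(acres, bu_per_acre, lbs_n_per_bu=1.0):
    """Tons of Haber-Bosch CO2 implied by corn's nitrogen demand."""
    tons_n = acres * bu_per_acre * lbs_n_per_bu / LBS_PER_TON
    return tons_n * CO2_PER_TON_N

total = corn_n_footprint_tons_co2(ASSUMED_US_CORN_ACRES,
                                  ASSUMED_YIELD_BU_PER_ACRE)
# On the order of ten million tons of CO2 from US corn's nitrogen alone.
print(round(total / 1e6, 1))  # 14.2
```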
Energy efficiency of biomass production (energy in to edible calories out, %)
Why this matters: The incredible increases in crop yields in the 20th century, achieved primarily via the proliferation of synthetically fixed nitrogen and heartier staple crops, have stagnated. For biomass-based production of food, chemicals and energy to realize its trillion-dollar potential, pathways to massively scalable biomass production will be needed. Phototrophic mechanisms are undergoing rapid innovations that may reach deployment: consider the TaCo and CETCH cycles invented by Tobias Erb and his collaborators, the ability to screen Rubisco using E. coli by Robbie Wilson, or the 7% energy efficiency into carbohydrates published by Cai et al in 2021. It’s also plausible that hybrid electrotrophic paths might play a role in future biomass production. In 2016 the Nocera and Silver labs published The Bionic Leaf, claiming a 10% CO2 reduction energy efficiency in small reactors. Finally, there may be a future of entirely electricity-driven biosynthesis, in contexts such as space travel (Martínez et al 2021 contends 10-21% energy-to-food efficiency), and it has been hypothesized that if oxygen sensitivity can be solved, yields could exceed 50% (Salimijazi et al 2020). This raises speculative bioengineering challenges: would it be possible –and beneficial– to transplant metabolic pathways? For example, to put the Shewanella direct electron transfer pathway into deployable yeast or the fast-growing Vibrio natriegens.
Today: Modern crop plants are 1-2% efficient, C4 plants are 2.5% efficient (Zhu et al 2010)
Target: Not yet defined, anything substantially better and scalable is important.
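To connect this benchmark with the calories/acre benchmark above, here is a hedged sketch of what 1% full-chain efficiency would imply; the year-round average insolation figure is my assumption:

```python
# What does 1% sunlight-to-food efficiency imply in kcal/acre/year?
M2_PER_ACRE = 4047
ASSUMED_INSOLATION_W_M2 = 200   # assumed year-round average solar flux
SECONDS_PER_YEAR = 3.156e7
JOULES_PER_KCAL = 4184

def kcal_per_acre_year(efficiency):
    """Edible kcal/acre/yr if `efficiency` of incident sunlight became food."""
    joules = ASSUMED_INSOLATION_W_M2 * M2_PER_ACRE * SECONDS_PER_YEAR
    return efficiency * joules / JOULES_PER_KCAL

# At 1% full-chain efficiency: roughly 60 million kcal/acre/yr. Corn's
# 15 million edible kcal/acre/yr thus reflects ~0.25% sunlight-to-food,
# well below the 1-2% photosynthetic efficiency of the plant itself.
print(round(kcal_per_acre_year(0.01) / 1e6))  # 61
```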
Use of pesticides on corn and soybeans (pounds/acre)
Why this matters: Protection of human and ecosystem health has been underappreciated, relative to its potential impact, by the climate technology community. Just as there has been a concerted effort to monitor, mitigate and sequester the unintended excesses of methane and carbon dioxide, there may need to be a similar effort to manage the engineered small molecule toxicants that are now ubiquitous in the environment. For example, the ~250MM pounds of glyphosate used in the US annually are found in 86% of rainwater samples and correlated with negative effects on human and animal health. Furthermore, pesticides are linked to biodiversity collapse of insects (review: Wagner et al 2021, pollinator collapse: DiBartolomeis et al 2019) and amphibians (Brühl et al 2013). There is active debate around whether crops engineered to be resistant to pesticides and herbicides increase or decrease net chemical usage, and at least one study contends that there was a 7% increase in pesticide use between 1996 and 2011 (Benbrook 2012). It is essential to develop solutions which simultaneously protect the food harvest while minimizing usage of reactive small molecules that disrupt the environment. Solutions may include plant engineering, symbiote engineering or bioorthogonal chemistry innovation.
Today: ~2 pounds per acre of pesticide for corn, primarily acetochlor and glyphosate (using data from Minnesota in 2015 as a representative example).
Target: Anything below today’s level of pesticide usage, with fewer off-target environmental effects and a non-negative impact on the farmer’s yield, could be a significant contribution.
Green Hydrogen Cost ($/kg)
Why this matters: Hydrogen is a store of energy that can be created and utilized by biology and machinery alike. While hydrogen has many challenging aspects, including storage, explosion risk and leakages which can increase radiative forcing (Ocko et al 2022), it is still an important frontier with significant potential in a clean energy future. Even if hydrogen is not an ideal energy source, it could at least be an important energy carrier (MacKay 2008, page 129). The Biden Administration has invoked the Defense Production Act to direct federal resources toward rapidly increasing clean hydrogen capacity in the US. For a review of the present and future of Green Hydrogen’s potential, we direct the reader to the 2021 Report from Columbia’s Center on Global Energy Policy. The NREL Hydrogen Production and Delivery website is a good high-level introduction to the technical challenges, such as maximizing algal hydrogen production and decreasing hydrogenase oxygen sensitivity (eg, Swanson et al 2015). It’s possible that ambitious new approaches of de novo protein engineering, algal production or metagenomic searches might yield new paths to progress.
Today: $5/kg (source: KPMG)
Target: $1/kg, set as an “Earthshot” by the US Department of Energy.
Sustainable Aviation Fuel Cost ($/gallon)
Why this matters: Planes need to fly to maintain the global economy, and aviation is 2.4% of global CO2 emissions. Sustainable aviation fuels (SAF), which could offer an 80% reduction in lifecycle GHG emissions, are one path to decarbonize the aviation industry: IATA projects there could be 2% market penetration within 5 years (IATA.org factsheet). According to the Department of Energy Biotechnology Office, the US produces one billion tons of dry biomass that can theoretically be collected to produce 50-60 billion gallons of biofuels (Rogers et al, 2016), and SAF could be one possible output. The Biden Administration has formalized the opportunity into the Sustainable Aviation Fuels Grand Challenge (fact sheet presentation).
Today: As of May 2022, a gallon of SAF costs $8.67. Regular jet fuel is about $4.15 per gallon. (Flying Magazine, May 2022)
Target: SAF should be cheaper for the consumer than traditional jet fuel.
It was outside the scope of this first draft to specifically identify and describe the critical subproblems inside each benchmark. One reason subproblems were not carefully explored in this draft is that choice of subproblems is likely based on assumptions (eg, on deployment embodiments, organism choice, priority) that need to be carefully workshopped with stakeholders.
Within each of these benchmarks are decades of work that have reached subtle but important bottlenecks, and an intention of this first draft is to collectively work toward identifying the subproblems that will be the nucleation points of important innovations.
For now, in the spirit of bottom-up innovation, the discovery of subproblems is left as an exercise to the reader. If you want to start building technologies that work towards breakthroughs in these benchmarks, reach out to me on Twitter or please comment here in this document.
Acknowledgements: Many amazing people have made contributions to this document. As this is a dynamic effort, with surely some errors in its current form, I am intentionally not sharing the list of acknowledgements today. Those who have contributed so far, thank you so much! Any errors or misjudgments published here are solely my responsibility.
According to the public Stripe CDR purchase table, some groups like Project Vesta are selling tons of captured carbon for less than $100, but all approaches will ultimately need to be subject to MRV
It is difficult to put this into efficiency numbers without knowing the exact CO2 concentration and exact energy cost
This is a conversational number that I’ve heard but been unable to pin down precisely, likely because it varies so much by factors such as crop, soil and location. The number can be roughly backed out by looking at controlled no fertilizer vs fertilizer yields, such as in this paper or in Roberts 2009.