Managers in multinational companies today face increasing pressure to reduce their environmental impact, especially when it comes to data centers, which contribute significantly to global warming. But make no mistake: this reduction does not come solely from the goodwill of the company or its stakeholders. It can also be driven by tougher regulatory requirements, especially for public companies in the American market. If all data centers combined were a country, it would rank as the fifth-largest energy consumer in the world. In 2020, for example, these computing centers accounted for about 1% of global electricity consumption and about 0.3% of carbon dioxide emissions.

Today, companies are required to be transparent about their carbon footprint, and there is a race among data centers to improve their ranking in the efficiency index. Organizations like Greenpeace also contribute to these efforts by publishing lists that rank data centers according to their carbon emissions and their energy efficiency (PUE, Power Usage Effectiveness).
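For reference, PUE is the ratio of the total energy a facility draws to the energy that actually reaches its IT equipment, so a value of 1.0 would mean no overhead for cooling, lighting or power distribution. Here is a minimal Python sketch of that ratio. The function name and the sample figures are illustrative assumptions, not measurements from any real data center.

```python
def power_usage_effectiveness(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return PUE = total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT equipment energy")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical figures: the whole facility drew 1,500 kWh,
# of which 1,000 kWh reached servers, storage and network gear.
pue = power_usage_effectiveness(1500.0, 1000.0)
print(f"PUE = {pue:.2f}")  # PUE = 1.50
```

In this made-up case, every kilowatt-hour of computing costs another half kilowatt-hour of overhead. The closer the figure gets to 1.0, the less energy is spent on anything other than the computing itself.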

The need for greener code

Many of the existing data center initiatives are based on adopting renewable energy or optimizing cooling systems to reduce energy consumption. However, aside from the energy required to maintain climate control, the software itself has some impact on the amount of electricity consumed for data analytics. And when I say some, I mean quite a lot.

According to recent research, training a single machine learning model, such as Google's Meena, consumes as much energy as a passenger car that has travelled 390,000 km. Researchers from the University of Massachusetts estimated that training one machine learning model can cause the emission of 284 tons of carbon dioxide, equivalent to the emissions of 5 cars over their entire lifetimes.

Considering this, efforts and interest in creating more efficient code are increasing. The Green Software Foundation, whose active members include VMware, Microsoft, Accenture, and GitHub, has set itself the goal of designing architectures and writing software code that consume less energy.

Tips for sustainable machine learning

Several articles have already been published on how to write algorithms for AI and machine learning models, but a few basic tips are still worth repeating. One way to reduce computing resources is to minimize the number of model training sessions. Hundreds of pre-trained machine learning models are currently available, and all developers need to do is apply them to their own data to add AI capabilities to their applications. This alternative significantly reduces the time required to develop and train models and, in turn, the energy consumption.

In addition, it is important to have full transparency about an algorithm's carbon footprint in order to decide how best to optimize performance. Researchers from several universities have built tools for this purpose. For example, Green Algorithms calculates the carbon footprint of a data analysis process based on the hardware, the runtime, and the geographic location of the server farm. CodeCarbon is a lightweight software package that integrates easily into a Python codebase and estimates the amount of carbon dioxide emitted by the computing resources required to run the code.
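To get a feel for how hardware, runtime and location combine into a footprint, here is a back-of-the-envelope estimate in Python. It is a simplified sketch, not the formula or API of either tool mentioned above. The class, the function and all the numbers (power draw, PUE, grid carbon intensity) are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class JobProfile:
    runtime_hours: float       # wall-clock duration of the job
    avg_power_watts: float     # average draw of the hardware (assumed value)
    pue: float                 # data center overhead factor (assumed value)
    grid_gco2_per_kwh: float   # carbon intensity of the local grid (assumed value)


def estimate_emissions_kg(job: JobProfile) -> float:
    """Estimate CO2 emissions in kg: energy used times grid carbon intensity."""
    # Energy drawn by the hardware, scaled up by the facility overhead (PUE).
    energy_kwh = job.runtime_hours * job.avg_power_watts / 1000.0 * job.pue
    # Convert grams of CO2 to kilograms.
    return energy_kwh * job.grid_gco2_per_kwh / 1000.0


# The same hypothetical 10-hour job on two grids with different carbon intensity.
job_a = JobProfile(runtime_hours=10.0, avg_power_watts=300.0, pue=1.5, grid_gco2_per_kwh=400.0)
job_b = JobProfile(runtime_hours=10.0, avg_power_watts=300.0, pue=1.5, grid_gco2_per_kwh=100.0)

print(f"Grid A: {estimate_emissions_kg(job_a):.2f} kg CO2")  # Grid A: 1.80 kg CO2
print(f"Grid B: {estimate_emissions_kg(job_b):.2f} kg CO2")  # Grid B: 0.45 kg CO2
```

Even in this toy version, the same job emits four times more on the dirtier grid, which is why the location of the server farm matters as much as the code.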

Another tip is to use automation to reduce the duration of the training run. It is possible to reduce both the number of experiments and the scope of the analyzed data without compromising the quality of the results.
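One common form of such automation is early stopping: the training loop watches the validation loss and halts once it stops improving, instead of running a fixed number of epochs. The sketch below is a minimal, framework-free illustration in Python. The function name, the `patience` and `min_delta` parameters, and the list of losses (which stands in for real training epochs) are all assumptions made for the example.

```python
from typing import Iterable, Tuple


def run_with_early_stopping(val_losses: Iterable[float], patience: int = 2,
                            min_delta: float = 0.0) -> Tuple[int, float]:
    """Consume per-epoch validation losses and stop after `patience`
    consecutive epochs without an improvement larger than `min_delta`.
    Returns (epochs actually run, best loss seen)."""
    best = float("inf")
    stale = 0        # consecutive epochs without improvement
    epochs_run = 0
    for loss in val_losses:
        epochs_run += 1
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # further epochs would only burn energy
    return epochs_run, best


# Made-up validation losses for an 8-epoch training schedule.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.64, 0.65]
epochs, best = run_with_early_stopping(losses, patience=2)
print(f"Stopped after {epochs} of {len(losses)} epochs, best loss {best:.2f}")
# Stopped after 5 of 8 epochs, best loss 0.60
```

In this made-up run, three of the eight planned epochs are never executed, and the best model has already been found by the third.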

The software actually used for processing can also reduce the amount of resources required. There are databases specifically designed for massive data processing, and they use memory and storage efficiently to reduce energy consumption. Another advantage of these databases is that there is no need to limit the scope of the analyzed data, which reduces the risk of harming the model's accuracy in an attempt to shorten the run.

Shortening the model's run time, in addition to improving energy efficiency, reduces the time required to generate critical business insights in areas such as fraud detection, cyber security, quality control and more. More efficient code is not only better for the environment but also better for business.

Reducing the carbon footprint is a company's responsibility

Potential customers demand transparency about a company's commitment to its green strategies, and adopting a "green" code standard can be an important first step in that direction. Employees want to work at a company that is ecologically sensitive and makes responsible decisions on environmental issues. In the future, cloud providers may require transparency into the carbon footprint of processing workloads and impose penalties on processing that is perceived as excessive or non-essential.

Written by Ohad Shalev, Strategic Analyst at Sqream