Data Is The New Oil - 18.02.2021 Update
Indian power production, actual Covid infections and air traffic.
Indian power production. Researcher Robbie Andrew collected 150+ monthly data files in PDF format from India’s Central Electricity Authority and consolidated them into a single time series file in Excel. The dataset splits power generation and capacity by source at the state level, and runs from January 2008 to December 2020.
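For readers who want to try a similar consolidation themselves, here is a minimal sketch of the stacking step in Python. It is not the researcher's actual workflow: it assumes the monthly tables have already been extracted from the PDFs into CSV text, and the function name, the column names (state, source, generation_gwh) and the numbers in the example are all hypothetical.

```python
import csv
import io


def stack_monthly_tables(monthly_csvs):
    """Combine {"YYYY-MM": csv_text} into one list of long-format rows.

    Assumption: every CSV has the hypothetical columns
    state, source, generation_gwh.
    """
    rows = []
    for month, text in monthly_csvs.items():
        for rec in csv.DictReader(io.StringIO(text)):
            rows.append({
                "month": month,
                "state": rec["state"],
                "source": rec["source"],
                "generation_gwh": float(rec["generation_gwh"]),
            })
    # "YYYY-MM" strings sort chronologically, so a plain sort gives a time series.
    rows.sort(key=lambda r: (r["month"], r["state"], r["source"]))
    return rows


# Made-up figures for illustration only, not taken from the dataset.
example_tables = {
    "2020-02": "state,source,generation_gwh\nKerala,hydro,2.5\n",
    "2020-01": "state,source,generation_gwh\nKerala,hydro,1.5\nGoa,solar,0.5\n",
}
series = stack_monthly_tables(example_tables)
```

Keeping one row per month, state and source (a "long" layout) makes it straightforward to pivot or filter the result later in a spreadsheet or a data analysis library.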
Actual Covid infections. Researchers Jungsik Noh and Gaudenz Danuser published time series estimates of the actual number of Covid infections and deaths worldwide, derived from confirmed deaths and key pandemic parameters such as the infection fatality ratio. A paper explains the method. “Severe under-ascertainment of Covid-19 cases was found to be universal across US states and countries worldwide. In 25 out of the 50 countries, actual cumulative cases were estimated to be 5-20 times greater than the confirmed cases,” the paper says.
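To give a feel for how confirmed deaths and an infection fatality ratio can be turned into an infection estimate, here is a deliberately simplified back-calculation in Python. It is not the method used in the paper, which is more sophisticated; the function names, the fixed lag between infection and death, and the example values in the usage note are assumptions made for illustration.

```python
def estimate_infections(daily_deaths, ifr, lag_days):
    """Rough back-calculation of daily infections from daily confirmed deaths.

    Assumptions (illustrative only): a constant infection fatality ratio
    `ifr`, and a fixed delay of `lag_days` between infection and death.
    The i-th value returned is the estimate for day i of the input series.
    """
    if not 0 < ifr <= 1:
        raise ValueError("ifr must be in (0, 1]")
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    # Deaths on day t are attributed to infections on day t - lag_days,
    # so the last `lag_days` days of infections cannot be estimated yet.
    return [deaths / ifr for deaths in daily_deaths[lag_days:]]


def under_ascertainment_ratio(estimated_infections, confirmed_cases):
    """Ratio of estimated cumulative infections to confirmed cumulative cases."""
    return sum(estimated_infections) / sum(confirmed_cases)
```

For example, calling estimate_infections with a week of death counts, an assumed ifr of 0.005 and an assumed lag_days of 14 returns no estimates at all, because none of those deaths can yet be traced back to a day inside the window. This is one reason such estimates always trail the confirmed case counts.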
Air traffic. Eurocontrol, the organisation that monitors Europe’s skies, maintains detailed dashboards of air traffic by country and airport. One year after the start of the pandemic, the number of flights is down two-thirds from 2019 levels, with some countries such as the UK faring worse than others. I have saved a snapshot of the full database. Separately, Australia published several datasets of airport traffic running through the end of December 2020.
Other Data Updates
Germany and Switzerland published detailed vehicle sales data for January 2021, showing continued strong momentum for electric vehicles.
Brazil published anonymised data for all patients who have received a Covid-19 vaccine, including their age, sex, ethnicity and the type of vaccine used.
Switzerland released statistics on shop closures during the latest wave of Covid-19 in the country.
Research outfit Aletheia published a consolidated database of global trade agreements, tariffs and trade flows.
Amazon released several climate and geospatial datasets held on its platform and suitable for machine learning.
The European Centre for Disease Prevention and Control updated its datasets of hospital occupancy and Covid-19 testing for European countries.
Worth a Read
Machine learning at ECMWF: A roadmap for the next 10 years
The European Centre for Medium-Range Weather Forecasts, which manages the world's largest weather prediction database and several supercomputers, published a 20-page memo on how it intends to adopt artificial intelligence over the next 10 years. The Earth system is complex and often displays what statisticians call “non-linear behaviour”. At the same time, decades of data gathering and the multiplication of sensors and satellite images have created hundreds of petabytes of data. In this context, machine learning can be used to improve the efficiency of data models, to extract information from data, or to post-process model output. The paper explains the key challenges for ECMWF in making this transition. They include a work culture based on domain knowledge of the earth sciences rather than data science, as well as the lack of compatibility between legacy models written in Fortran (a very efficient but highly complex programming language first developed in the 1950s) and modern machine learning frameworks that use Python or Julia.
Why hate carbon taxes? Machine learning evidence on the roles of personal responsibility, trust, revenue recycling, and other factors across 23 European countries
Researcher Sebastian Levi analyses opposition to carbon taxes, using data on 40k European individuals. The results identify the feeling of personal responsibility for trying to reduce climate change as the most important condition for predicting opposition to carbon taxes. Interestingly, recycling revenues from existing carbon prices back to households, often considered crucial for securing public support, is only associated with minor increases in the acceptance of higher carbon taxes.
Observed impacts of the Covid-19 first wave on travel behaviour in Switzerland based on a large GPS panel
GPS tracking and a survey of 1.5k Swiss people find that they reduced the average daily distance travelled by 60% during the Covid-19 lockdown. Cycling’s share of transport increased drastically, a phenomenon which continued well after the end of lockdown in the summer. An increase in car usage during the day, with midday off-peak kilometres travelled now slightly above the 2019 baseline, suggests that people are driving more to avoid public transport.
Counting the cost of the Niger Delta’s largest oil spills: satellite remote sensing reveals extensive environmental damage with >1million people in the impact zone
Researchers used satellite images to delineate an extensive area of 393 km² that has experienced vegetation mortality resulting from two oil spills in Ogoniland in 2008/9. More than ten years after the spills, they conclude that the Niger Delta remains heavily polluted, with adverse consequences for the 1 million people who live in the area.
Fuel Economy Valuation and Preferences of Indian Two-wheeler Buyers
This paper assesses the importance of fuel economy for buyers of two-wheelers in India, based on a 2018 survey of 8k people. It concludes that the average buyer places a high value on future fuel savings, in addition to style/looks, comfort and brand. This, in turn, suggests that fuel economy standards may not be the most efficient policy in India to limit future growth in oil demand. Two-wheelers make up 84% of the domestic passenger vehicle market.
Other Papers
A point prediction method based automatic machine learning for day-ahead power output of multi-region photovoltaic plants
Benefits of Electric Vehicles Integrating into Power Grid
Developing a vehicle emission inventory with high temporal-spatial resolution in Tianjin, China
Support vector regression with asymmetric loss for optimal electric load forecasting
Smart-PGSim: Using Neural Network to Accelerate AC-OPF Power Grid Simulation
An Extended New Approach for Forecasting Short-Term Wind Power Using Modified Fuzzy Wavelet Neural Network: A Case Study in Wind Power Plant
Life Cycle Assessment (LCA) for use on renewable sourced hydrogen fuel cell buses vs diesel engines buses in the city of Rosario, Argentina
Modeling and Predicting the Electricity Production in Hydropower Using Conjunction of Wavelet Transform, Long Short-Term Memory and Random Forest Models
If you know people who might enjoy this newsletter, I would really appreciate it if you forwarded it on. Don't hesitate to contact me (oliv.lejeune@gmail.com) if you have any dataset suggestions or comments about this note. Thanks for reading.