Good-quality energy data is critical for research, and the more data, the better the research. But the energy sector does not yet have access to a large, national corpus of curated data in one location. Smart meters can fill this gap with high-quality information about energy consumption on a national scale from millions of households. Building this corpus of data in a secure, managed infrastructure will make it possible to carry out better research to inform UK government policy, and to help define strategies to address the energy ‘trilemma’ over the next decade: security, affordability and sustainability.
Readings of household electricity or gas consumption, collected every five seconds from ‘smart’ meters in just a few hundred houses, can accumulate to hundreds of millions of data points, growing to terabytes in size in just a few years. At this scale, storing, managing, analysing and sharing the data presents significant technical challenges for traditional database technologies.
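To give a feel for how quickly five-second readings add up, here is a rough back-of-the-envelope sketch. It assumes one reading per meter per interval and one meter per household, which is a simplification; the function name `readings_per_year` and its parameters are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def readings_per_year(households: int, interval_seconds: int = 5, days: int = 365) -> int:
    """Count readings produced in `days` days, assuming one meter per
    household and exactly one reading per meter every `interval_seconds`."""
    per_meter_per_day = SECONDS_PER_DAY // interval_seconds  # 17,280 at 5 s
    return households * per_meter_per_day * days


if __name__ == "__main__":
    # A single meter, then a hypothetical study of 100 households.
    print(f"{readings_per_year(1):,}")    # 6,307,200
    print(f"{readings_per_year(100):,}")  # 630,720,000
```

Under these assumptions, a study of only 100 households passes 600 million readings within a year, before counting separate gas and electricity meters.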
The many potential benefits of using smart meter data must be balanced against the need to respect households’ data privacy. In the UK, smart meter data is protected by a robust Data Access and Privacy Framework incorporated into legislation via the Smart Energy Code. This means that, beyond monthly data for billing, smart meter data can only be used with the informed consent of the energy consumer and only analysed by accredited parties that have signed up to the Smart Energy Code.
So many questions…
Smart meter researchers work with key stakeholders including government departments, Research Councils, regulatory authorities, energy companies and Distribution Network Operators. Smart data allows researchers to ask:
Once smart meters have been installed in all UK homes, they have the potential to generate over a trillion records per year, so data collection, management and analytical processes must be scalable to meet the challenges of working with data of this magnitude. The DSaaP infrastructure, based on an industry-standard Big Data Hadoop platform, provides fast and efficient tools for data exploration, transformation, aggregation and visualisation at scale. The solution offers a flexible and cost-effective infrastructure that links data on physical servers and cloud-based storage and can store petabytes of data. This ‘hybrid model’ brings robust security while also giving authorised users a seamless way of accessing data.
As the infrastructure delivering the Smart Meter Research portal, the top priority of DSaaP is to maintain the data privacy of households.
Analytical tools integrated into the DSaaP infrastructure support querying in multiple languages, including R, Python, Scala and Structured Query Language (SQL), and provide powerful visualisation tools through a standard web browser. Spark provides the engine that can process billions of records at speeds unachievable in most alternative computing environments. Complex statistical analysis can be performed using R, SQL-like queries can be run with Hive, and geo-spatial visualisations can be produced with Leaflet. Additionally, researchers who prefer to use their own software can access data products in secure remote “containers”, which is a bit like having a PC in a browser.
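As an illustration of the kind of SQL-like aggregation described above, the sketch below totals consumption per household per day. It uses Python's built-in `sqlite3` module purely as a local stand-in and is not the DSaaP or Hive interface; the table name `readings`, its columns, and the sample values are all made up for this example.

```python
import sqlite3

# Made-up sample rows: (household_id, ISO timestamp, kWh in the interval).
SAMPLE = [
    ("H1", "2024-01-01T00:00:00", 0.25),
    ("H1", "2024-01-01T00:30:00", 0.5),
    ("H1", "2024-01-02T00:00:00", 0.125),
    ("H2", "2024-01-01T00:00:00", 1.0),
]


def daily_totals(rows):
    """Return (household_id, day, total_kwh) tuples, one per household per day."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE readings (household_id TEXT, ts TEXT, kwh REAL)"
        )
        conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
        # The first ten characters of an ISO timestamp are the calendar date.
        query = """
            SELECT household_id,
                   substr(ts, 1, 10) AS day,
                   SUM(kwh)          AS total_kwh
            FROM readings
            GROUP BY household_id, substr(ts, 1, 10)
            ORDER BY household_id, day
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for household, day, total in daily_totals(SAMPLE):
        print(household, day, total)
```

A Hive query over a real smart meter table would have a similar `GROUP BY` shape, though the actual schema, date functions and access controls would differ.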