Household-vs-business wastes in Scotland during the last decade

Screenshot of the graph of household-vs-business waste in Scotland during the last decade

This graph provides an at-a-glance comparison between Scotland’s households and businesses in respect of the yearly amounts of waste that they have generated during the last decade.

The household:business ratio has been very approximately 2:3, but waste from businesses has been reducing noticeably over the decade.

Using a literate programming tool to generate content

Literate programming tools weave data, code, visualisations and natural language into a flowing narrative. These tools are often used to construct tutorial-style documents that are based on tractable/generatable material.

For us, this sounded like a promising way to generate content, since one of our aims is to develop (website-situated) how-to guides/tutorials based on the tractable waste datasets. So we created our first tutorial-style document using this approach: a walk-through on how to extract information from the data about business waste in Scotland. Here’s a screenshot of it:

Screenshot of our first document generated using a literate programming tool

It uses:
Only minimal mark-up and programming code were required (see its source file), and it has proved to be a handy means of generating a data-based tutorial.

Building linked open data about carbon savings

linked open data for carbon savings

We have written a research report which walks through how we might build linked open data (LoD) about carbon savings from dissimilar data sources.

It outlines (using small samples from the datasets) how the data pipeline that feeds our prototype-6 webapp works.

Building LoD about carbon savings - research report - coversheet

Data Commons Scotland at the SODU 2021 conference

Over the weekend (2-3 Sept 2021), I represented our DCS project at SODU 2021 – Scotland’s annual conference on Open Data.
Organised and run by the Code the City team, this event always provides a great opportunity to catch up with others in Scotland’s friendly Open Data community, and hear about their news.
This year, for me, its highlights included:
  • An ad-hoc “corridor chat” about the preservation of railway history as represented by its data records (mostly paper-based). That led us to discuss Git persistence, the zeitgeist for shared ledger databases with explicit temporal support, and what all of that might mean for recording Open Data!
  • Then, a session on the perhaps more immediate concern of how to nudge the government into making open more of the data that it holds. Proposed was the neat idea of aggregating, curating and making searchable all of the responses arising from FOI requests to local and national government. This would help highlight data that the government should be making open by default.
  • And it was heartening to see representatives from the Scottish government’s Open Data team attending the conference and running an engaging session that brought together government and community perspectives. The government’s recent initiative to “make public sector data easy to find” was one of the topics discussed.
  • The conference even gained an international dimension when two attendees joined us from Sweden to help run a live editing session on Wikidata, contributing to the project to add better data about Scottish government agencies into Wikidata.
  • Our own project received some valuable feedback after I demo-ed our latest prototype website. This wasn’t all affirmative! I got some useful insights into what people found difficult. For example: “I like the site’s tools and visualisations but more needs to be done to help me navigate my path-of-interest through the prototype website”. This nicely ties in with one of our project’s (as yet unrealised) goals: to weave interest-based navigation maps through our data site.
I enjoyed the friendly SODU sessions over the weekend – it was inspiring to hear what others are contributing towards making data more open and accessible.
This year’s SODU was online because of Covid-19. Hopefully next year it will return to its more physical manifestation in Aberdeen city!

“What are my neighbours putting into their bins?!”

What do households put into their bins and how appropriate are their disposal decisions?

To help provide an answer to that question, Zero Waste Scotland (ZWS) occasionally asks each of the 32 Scottish councils to sample their bin collections and to analyse their content. This compositional analysis uncovers the types and weights of the disposed of materials, and assesses the appropriateness of the disposal decisions (i.e. was it put into the right bin?).

Laudably, ZWS is considering publishing this data as open data. Click on the image below to see a web page that is based on an anonymised subset of this data.

household waste analysis

The Data Lab MSc data challenge event 2021

With Glasgow City hosting the UN Climate Change conference (COP26) later this year, it was appropriate that this year’s The Data Lab data analysis hackathon (held last week) had the theme “pollution reduction”.

Three organisations provided challenge projects for the hackathon teams: we provided a “waste management” project based on our easier-to-use datasets; Code the City provided an “air quality” project; and Scottish Power an “electric vehicle charging” project.

The hackathon was led by a young Scottish tech start-up company called Filament. They have an interesting product that is basically a sharable, cloud-hosted Jupyter Notebook.

Each day a new cohort of teams would tackle the project challenges. We helped by answering their questions about our datasets, and by suggesting ideas for investigation.
At the end of each day the teams presented their findings.

It was informative to see how the teams (each with a mix of skills that included programming, data analysis and business acumen) organised themselves for group working, handled the data, and applied learned analysis techniques.

The teams had a relatively short amount of time to work on their projects, so having easy-to-use datasets was a deciding factor in how much they could achieve. Therefore one take-away is clear, and helps substantiate an aim of our DCS project: open data needs to be easy to use, not just accessible. Making data easier to use for non-experts opens it to a much wider audience and to much more creativity.

Stirling’s bin collection data – revisited

Stirling Council set a precedent by being the first (and still only) Scottish local authority to have published open data about their bin collection of household waste.

The council are currently working on increasing the fidelity of this dataset, e.g. by adding spatial data to describe collection routes. However, we can still squeeze several interesting pieces of information from its current version. For details, visit the Stirling bin collection page on our website mockup.

‘Easier’ open data about waste in Scotland


Several organisations are doing a very good job of curating & publishing open data about waste in Scotland but the published data is not always “easy to use” for non-experts. We have seen several references to this at open data conference events and on social media platforms:

Whilst statisticians/coders may think that it is reasonably simple to knead together these somewhat diverse datasets into coherent knowledge, the interested layman doesn’t find it so easy.

One of the objectives of the Data Commons Scotland project is to address the “ease of use” issue over open data. The contents of this repository are the result of us re-working some of the existing source open data so that it is easier to use, understand, consume and parse, and is all in one place. It may not be as detailed as the source data or have all of its nuances – but it aims to be better for the purposes of making the information accessible to non-experts.

We have processed the source data just enough to:

  • provide value-based cross-referencing between datasets
  • add a few fields whose values are generally useful but not easily derivable by a simple calculation (such as latitude & longitude)
  • make it available as simple CSV and JSON files in a Git repository.

We have not augmented the data with derived values that can be simply calculated, such as per-population amounts, averages, trends, totals, etc.
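
Such derived values are left out because they take only a few lines of code to compute. As a minimal sketch (not part of the project's own tooling), the Python below joins rows shaped like household-waste and population on region and year, then divides to get tonnes per citizen. The column names follow the dimension names listed later in this post. The inline rows, including the "Recycled" management label, are made-up illustrative values, not figures from the real datasets.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows only (NOT real figures); columns mirror the dimension names.
HOUSEHOLD_WASTE_CSV = """region,year,material,management,tonnes
Falkirk,2019,Animal and mixed food waste,Recycled,100.0
Falkirk,2019,Animal and mixed food waste,Other Diversion,50.0
Aberdeen City,2019,Animal and mixed food waste,Recycled,60.0
"""

POPULATION_CSV = """region,year,population
Falkirk,2019,1000
Aberdeen City,2019,500
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def tonnes_per_citizen(waste_rows, population_rows):
    """Total the tonnes per (region, year), then divide by that year's population."""
    totals = defaultdict(float)
    for row in waste_rows:
        totals[(row["region"], int(row["year"]))] += float(row["tonnes"])
    populations = {
        (row["region"], int(row["year"])): int(row["population"])
        for row in population_rows
    }
    # Keep only the keys present in both datasets (an inner join).
    return {
        key: total / populations[key]
        for key, total in totals.items()
        if key in populations
    }


result = tonnes_per_citizen(read_rows(HOUSEHOLD_WASTE_CSV), read_rows(POPULATION_CSV))
for (region, year), value in sorted(result.items()):
    print(f"{region} {year}: {value:.3f} tonnes per citizen")
```

The join only works this simply because both datasets use the same labels for region and year.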

The 10 easier datasets

Dataset columns (generated February 2021): name, description, file, number of records. Source data columns (sourced January 2021): creator, supplier, licence.

| name | description | file | number of records | creator | supplier | licence |
|---|---|---|---|---|---|---|
| household-waste | The categorised quantities of the (‘managed’) waste generated by households. | CSV JSON | 19008 | SEPA | URL | OGL v3.0 |
| household-co2e | The carbon impact of the waste generated by households. | CSV JSON | 288 | SEPA | SEPA URL | OGL v2.0 |
| business-waste-by-region | The categorised quantities of the waste generated by industry & commerce. | CSV JSON | 8976 | SEPA | SEPA URL | OGL v2.0 |
| business-waste-by-sector | The categorised quantities of the waste generated by industry & commerce. | CSV JSON | 2640 | SEPA | SEPA URL | OGL v2.0 |
| waste-site | The locations, services & capacities of waste sites. | CSV JSON | 1254 | SEPA | SEPA URL | OGL v2.0 |
| waste-site-io | The categorised quantities of waste going in and out of waste sites. | CSV | 2667914 | SEPA | SEPA URL | OGL v2.0 |
| material-coding | A mapping between the EWC codes and SEPA’s materials classification (as used in these datasets). | CSV JSON | 557 | SEPA | SEPA URL | OGL v2.0 |
| ewc-coding | EWC (European Waste Classification) codes and descriptions. | CSV JSON | 973 | European Commission of the EU | Publications Office of the EU URL | CC BY 4.0 |
| households | Occupied residential dwelling counts. Useful for calculating per-household amounts. | CSV JSON | 288 | NRS | URL | OGL v3.0 |
| population | People counts. Useful for calculating per-citizen amounts. | CSV JSON | 288 | NRS | URL | OGL v3.0 |

(The fuller, CSV version of the table above.)

The dimensions of the easier datasets

One of the things that makes these datasets easier to use is that they use consistent dimension values/controlled code-lists. This makes it easier to join/link datasets.

So we have tried to rectify the inconsistencies that occur in the source data (in particular, the inconsistent labelling of waste materials and regions). However, this is still “work-in-progress” and we have yet to tease out & make consistent further useful dimensions.
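
One way to spot such inconsistencies is to compare the sets of values that two datasets use for a shared dimension. The sketch below is a hypothetical check, not the project's actual tooling. It reports the values that appear in one dataset but not in a reference dataset. The rows are illustrative, and the label "Falkirk (council)" is invented to stand in for an inconsistent spelling.

```python
def dimension_values(rows, dimension):
    """Return the set of distinct values that `rows` use for `dimension`."""
    return {row[dimension] for row in rows if dimension in row}


def unmatched_values(rows, reference_rows, dimension):
    """Values of `dimension` used in `rows` but absent from `reference_rows`."""
    return sorted(
        dimension_values(rows, dimension) - dimension_values(reference_rows, dimension)
    )


# Illustrative rows only; "Falkirk (council)" is a hypothetical inconsistent label.
population_rows = [
    {"region": "Falkirk", "year": 2019},
    {"region": "Aberdeen City", "year": 2019},
]
business_waste_rows = [
    {"region": "Falkirk", "year": 2018},
    {"region": "Falkirk (council)", "year": 2018},
    {"region": "Aberdeen City", "year": 2018},
]

extras = unmatched_values(business_waste_rows, population_rows, "region")
print(extras)
```

A non-empty result flags labels that would not match in a join on that dimension. A check along these lines could, for instance, help explain why the dimensions table in this post lists 34 region values for business-waste-by-region but 32 for the other datasets.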

| dimension | description | dataset | example value of dimension | count of values of dimension | min value of dimension | max value of dimension |
|---|---|---|---|---|---|---|
| region | The name of a council area. | household-waste | Falkirk | 32 | | |
| | | household-co2e | Aberdeen City | 32 | | |
| | | business-waste-by-region | Falkirk | 34 | | |
| | | waste-site | North Lanarkshire | 32 | | |
| | | households | West Dunbartonshire | 32 | | |
| | | population | West Dunbartonshire | 32 | | |
| business-sector | The label representing the business/economic sector. | business-waste-by-sector | Manufacture of food and beverage products | 10 | | |
| year | The integer representation of a year. | household-waste | 2011 | 9 | 2011 | 2019 |
| | | household-co2e | 2013 | 9 | 2011 | 2019 |
| | | business-waste-by-region | 2011 | 8 | 2011 | 2018 |
| | | business-waste-by-sector | 2011 | 8 | 2011 | 2018 |
| | | waste-site | 2019 | 1 | 2019 | 2019 |
| | | waste-site-io | 2013 | 14 | 2007 | 2020 |
| | | households | 2011 | 9 | 2011 | 2019 |
| | | population | 2013 | 9 | 2011 | 2019 |
| quarter | The integer representation of the year’s quarter. | waste-site-io | 4 | 4 | | |
| site-name | The name of the waste site. | waste-site | Bellshill H/care Waste Treatment & Transfer | 1246 | | |
| permit | The waste site operator’s official permit or licence. | waste-site | PPC/A/1180708 | 1254 | | |
| | | waste-site-io | PPC/A/1000060 | 1401 | | |
| status | The label indicating the open/closed status of the waste site in the record’s timeframe. | waste-site | Not applicable | 4 | | |
| latitude | The signed decimal representing a latitude. | waste-site | 55.824871489601804 | 1227 | | |
| longitude | The signed decimal representing a longitude. | waste-site | -4.035165962797409 | 1227 | | |
| io-direction | The label indicating the direction of travel of the waste from the PoV of a waste site. | waste-site-io | in | 2 | | |
| material | The name of a waste material in SEPA’s classification. | household-waste | Animal and mixed food waste | 22 | | |
| | | business-waste-by-region | Spent solvents | 33 | | |
| | | business-waste-by-sector | Spent solvents | 33 | | |
| | | material-coding | Acid, alkaline or saline wastes | 34 | | |
| management | The label indicating how the waste was managed/processed (i.e. what its end-state was). | household-waste | Other Diversion | 3 | | |
| ewc-code | The code from the European Waste Classification hierarchy. | waste-site-io | 00 00 00 | 787 | | |
| | | material-coding | 11 01 06* | 557 | | |
| | | ewc-coding | 01 | 973 | | |
| ewc-description | The description from the European Waste Classification hierarchy. | ewc-coding | WASTES RESULTING FROM EXPLORATION, MINING, QUARRYING, AND PHYSICAL AND CHEMICAL TREATMENT OF MINERALS | 774 | | |
| operator | The name of the waste site operator. | waste-site | TRADEBE UK | 753 | | |
| activities | The waste processing activities supported by the waste site. | waste-site | Other treatment | 50 | | |
| accepts | The kinds of clients/wastes accepted by the waste site. | waste-site | Other special | 42 | | |
| population | The population count as an integer. | population | 89800 | | 21420 | 633120 |
| households | The households count as an integer. | households | 42962 | | 9424 | 307161 |
| tonnes | The waste related quantity as a decimal. | household-waste | 0 | | 0 | 183691 |
| | | household-co2e | 251386.54 | | 24768.53 | 762399.92 |
| | | business-waste-by-region | 753 | | 0 | 486432 |
| | | business-waste-by-sector | 54 | | 0 | 1039179 |
| | | waste-site-io | 0 | | -8.56 | 2325652.83 |
| tonnes-input | The quantity of incoming waste as a decimal. | waste-site | 154.55 | | 0 | 1476044 |
| tonnes-treated-recovered | The quantity of waste treated or recovered as a decimal. | waste-site | 133.04 | | 0 | 1476044 |
| tonnes-output | The quantity of outgoing waste as a decimal. | waste-site | 152.8 | | 0 | 235354.51 |

(The CSV version of the table above.)
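
A per-dimension summary like the one in the table above can be derived mechanically from the rows of a dataset. The Python below is a sketch under assumptions: the function name and the inline rows are hypothetical, and this is not necessarily how the table was actually produced. For one dimension it picks an example value, counts the distinct values, and, for numeric dimensions, finds the min and max.

```python
def summarise_dimension(rows, dimension, numeric=False):
    """Summarise one dimension: example value, distinct count, optional min/max."""
    # Skip rows where the dimension is missing or blank.
    values = [row[dimension] for row in rows if row.get(dimension) not in (None, "")]
    if not values:
        return None
    summary = {"example": values[0], "count": len(set(values))}
    if numeric:
        numbers = [float(value) for value in values]
        summary["min"] = min(numbers)
        summary["max"] = max(numbers)
    return summary


# Illustrative rows only (NOT real figures), as they might be read from a CSV file.
rows = [
    {"region": "Falkirk", "year": "2011", "tonnes": "12.5"},
    {"region": "Falkirk", "year": "2012", "tonnes": "0"},
    {"region": "Aberdeen City", "year": "2011", "tonnes": "30.25"},
]

region_summary = summarise_dimension(rows, "region")
tonnes_summary = summarise_dimension(rows, "tonnes", numeric=True)
print(region_summary)
print(tonnes_summary)
```

Running a summary like this over each dataset is also a quick way to notice outliers, such as the negative minimum that the table shows for tonnes in waste-site-io.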