Objective
Several organisations are doing a very good job of curating and publishing open data about waste in Scotland, but the published data is not always “easy to use” for non-experts. We have seen several references to this at open data conference events and on social media platforms.
Whilst statisticians and coders may think it reasonably simple to knead these somewhat diverse datasets into coherent knowledge, the interested layman doesn’t find it so easy.
One of the objectives of the Data Commons Scotland project is to address the “ease of use” issue with open data. The contents of this repository are the result of us re-working some of the existing source open data so that it is easier to use, understand, consume and parse, and is all in one place. It may not be as detailed as the source data, or carry all of its nuances, but it aims to be better for the purpose of making the information accessible to non-experts.
We have processed the source data just enough to:
- provide value-based cross-referencing between datasets
- add a few fields whose values are generally useful but not easily derivable by a simple calculation (such as latitude & longitude)
- make it available as simple CSV and JSON files in a Git repository.
We have not augmented the data with derived values that can be simply calculated, such as per-population amounts, averages, trends, totals, etc.
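Derived values such as totals are left to the consumer, so the sketch below shows one way they might be computed with only the Python standard library. It assumes that the household-waste CSV file has the columns `region`, `year`, `material`, `management` and `tonnes` (the dimensions listed for that dataset further down); the sample rows, the "Landfilled" label and the function names are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extract in the assumed shape of the household-waste CSV file.
# The tonnes (and the "Landfilled" label) are invented for illustration.
SAMPLE_CSV = """region,year,material,management,tonnes
Falkirk,2018,Animal and mixed food waste,Other Diversion,9
Falkirk,2019,Animal and mixed food waste,Other Diversion,10.5
Falkirk,2019,Animal and mixed food waste,Landfilled,4.5
"""


def parse_records(text):
    """Parse CSV text into dicts, converting year to int and tonnes to float."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        row["year"] = int(row["year"])
        row["tonnes"] = float(row["tonnes"])
        records.append(row)
    return records


def total_tonnes_by_year(records, region):
    """Sum the tonnes for one region, keyed by year."""
    totals = defaultdict(float)
    for record in records:
        if record["region"] == region:
            totals[record["year"]] += record["tonnes"]
    return dict(totals)


records = parse_records(SAMPLE_CSV)
falkirk_totals = total_tonnes_by_year(records, "Falkirk")
print(falkirk_totals)
```

To use this on the real data, `SAMPLE_CSV` would be replaced by the text of the downloaded CSV file.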
The 10 easier datasets
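The datasets were generated in February 2021 from source data obtained in January 2021. In the table, the first four columns describe each easier dataset and the last three columns describe its source data.
name | description | file | number of records | creator | supplier | licence |
---|---|---|---|---|---|---|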
household-waste | The categorised quantities of the (‘managed’) waste generated by households. | CSV JSON | 19008 | SEPA | statistics.gov.scot URL | OGL v3.0 |
household-co2e | The carbon impact of the waste generated by households. | CSV JSON | 288 | SEPA | SEPA URL | OGL v2.0 |
business-waste-by-region | The categorised quantities of the waste generated by industry & commerce. | CSV JSON | 8976 | SEPA | SEPA URL | OGL v2.0 |
business-waste-by-sector | The categorised quantities of the waste generated by industry & commerce. | CSV JSON | 2640 | SEPA | SEPA URL | OGL v2.0 |
waste-site | The locations, services & capacities of waste sites. | CSV JSON | 1254 | SEPA | SEPA URL | OGL v2.0 |
waste-site-io | The categorised quantities of waste going in and out of waste sites. | CSV | 2667914 | SEPA | SEPA URL | OGL v2.0 |
material-coding | A mapping between the EWC codes and SEPA’s materials classification (as used in these datasets). | CSV JSON | 557 | SEPA | SEPA URL | OGL v2.0 |
ewc-coding | EWC (European Waste Classification) codes and descriptions. | CSV JSON | 973 | European Commission of the EU | Publications Office of the EU URL | CC BY 4.0 |
households | Occupied residential dwelling counts. Useful for calculating per-household amounts. | CSV JSON | 288 | NRS | statistics.gov.scot URL | OGL v3.0 |
population | People counts. Useful for calculating per-citizen amounts. | CSV JSON | 288 | NRS | statistics.gov.scot URL | OGL v3.0 |
(The fuller, CSV version of the table above.)
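The households and population datasets are described as useful for calculating per-household and per-citizen amounts. A minimal sketch of the per-citizen case follows. It assumes that both CSV files carry `region` and `year` columns with matching values, and that the measure columns are named `tonnes` and `population`; all numbers, the "Landfilled" label and the function name are invented for illustration.

```python
import csv
import io

# Hypothetical extracts in the assumed shape of the two CSV files.
# Every number below is invented for illustration.
HOUSEHOLD_WASTE_CSV = """region,year,material,management,tonnes
Falkirk,2019,Animal and mixed food waste,Other Diversion,1000
Falkirk,2019,Animal and mixed food waste,Landfilled,500
"""
POPULATION_CSV = """region,year,population
Falkirk,2019,150000
"""


def tonnes_per_citizen(waste_rows, population_rows):
    """Join the two datasets on (region, year) and divide total tonnes by population."""
    population = {
        (row["region"], row["year"]): int(row["population"])
        for row in population_rows
    }
    totals = {}
    for row in waste_rows:
        key = (row["region"], row["year"])
        totals[key] = totals.get(key, 0.0) + float(row["tonnes"])
    # Only keep keys present in both datasets (an inner join).
    return {
        key: total / population[key]
        for key, total in totals.items()
        if key in population
    }


per_citizen = tonnes_per_citizen(
    csv.DictReader(io.StringIO(HOUSEHOLD_WASTE_CSV)),
    csv.DictReader(io.StringIO(POPULATION_CSV)),
)
print(per_citizen)
```

The same pattern would apply to per-household amounts by swapping in the households dataset and its `households` column.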
The dimensions of the easier datasets
One of the things that makes these datasets easier to use is that they use consistent dimension values (controlled code-lists). This makes it easier to join/link datasets.
So we have tried to rectify the inconsistencies that occur in the source data (in particular, the inconsistent labelling of waste materials and regions). However, this is still “work-in-progress” and we have yet to tease out and make consistent further useful dimensions.
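Because joins rely on matching dimension values, it can be worth checking which values two datasets do not share before linking them. The table below, for instance, lists 34 region values for business-waste-by-region against 32 for the other datasets. The sketch below compares the `region` values of two sets of records; the sample rows, the "Offshore" label and the function name are invented for illustration.

```python
def region_mismatches(rows_a, rows_b):
    """Return the region values found only in rows_a and only in rows_b."""
    regions_a = {row["region"] for row in rows_a}
    regions_b = {row["region"] for row in rows_b}
    return sorted(regions_a - regions_b), sorted(regions_b - regions_a)


# Hypothetical records; "Offshore" is an invented example of an extra value.
household_rows = [
    {"region": "Falkirk"},
    {"region": "Aberdeen City"},
]
business_rows = [
    {"region": "Falkirk"},
    {"region": "Aberdeen City"},
    {"region": "Offshore"},
]

only_household, only_business = region_mismatches(household_rows, business_rows)
print("only in household rows:", only_household)
print("only in business rows:", only_business)
```

Rows whose region appears in only one dataset would drop out of an inner join, so they may need separate handling.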
dimension | description | dataset | example value of dimension | count of values of dimension | min value of dimension | max value of dimension |
---|---|---|---|---|---|---|
region | The name of a council area. | household-waste | Falkirk | 32 | | |
region | | household-co2e | Aberdeen City | 32 | | |
region | | business-waste-by-region | Falkirk | 34 | | |
region | | waste-site | North Lanarkshire | 32 | | |
region | | households | West Dunbartonshire | 32 | | |
region | | population | West Dunbartonshire | 32 | | |
business-sector | The label representing the business/economic sector. | business-waste-by-sector | Manufacture of food and beverage products | 10 | | |
year | The integer representation of a year. | household-waste | 2011 | 9 | 2011 | 2019 |
year | | household-co2e | 2013 | 9 | 2011 | 2019 |
year | | business-waste-by-region | 2011 | 8 | 2011 | 2018 |
year | | business-waste-by-sector | 2011 | 8 | 2011 | 2018 |
year | | waste-site | 2019 | 1 | 2019 | 2019 |
year | | waste-site-io | 2013 | 14 | 2007 | 2020 |
year | | households | 2011 | 9 | 2011 | 2019 |
year | | population | 2013 | 9 | 2011 | 2019 |
quarter | The integer representation of the year’s quarter. | waste-site-io | 4 | 4 | | |
site-name | The name of the waste site. | waste-site | Bellshill H/care Waste Treatment & Transfer | 1246 | | |
permit | The waste site operator’s official permit or licence. | waste-site | PPC/A/1180708 | 1254 | | |
permit | | waste-site-io | PPC/A/1000060 | 1401 | | |
status | The label indicating the open/closed status of the waste site in the record’s timeframe. | waste-site | Not applicable | 4 | | |
latitude | The signed decimal representing a latitude. | waste-site | 55.824871489601804 | 1227 | | |
longitude | The signed decimal representing a longitude. | waste-site | -4.035165962797409 | 1227 | | |
io-direction | The label indicating the direction of travel of the waste from the PoV of a waste site. | waste-site-io | in | 2 | | |
material | The name of a waste material in SEPA’s classification. | household-waste | Animal and mixed food waste | 22 | | |
material | | business-waste-by-region | Spent solvents | 33 | | |
material | | business-waste-by-sector | Spent solvents | 33 | | |
material | | material-coding | Acid, alkaline or saline wastes | 34 | | |
management | The label indicating how the waste was managed/processed (i.e. what its end-state was). | household-waste | Other Diversion | 3 | | |
ewc-code | The code from the European Waste Classification hierarchy. | waste-site-io | 00 00 00 | 787 | | |
ewc-code | | material-coding | 11 01 06* | 557 | | |
ewc-code | | ewc-coding | 01 | 973 | | |
ewc-description | The description from the European Waste Classification hierarchy. | ewc-coding | WASTES RESULTING FROM EXPLORATION, MINING, QUARRYING, AND PHYSICAL AND CHEMICAL TREATMENT OF MINERALS | 774 | | |
operator | The name of the waste site operator. | waste-site | TRADEBE UK | 753 | | |
activities | The waste processing activities supported by the waste site. | waste-site | Other treatment | 50 | | |
accepts | The kinds of clients/wastes accepted by the waste site. | waste-site | Other special | 42 | | |
population | The population count as an integer. | population | 89800 | | 21420 | 633120 |
households | The households count as an integer. | households | 42962 | | 9424 | 307161 |
tonnes | The waste related quantity as a decimal. | household-waste | 0 | | 0 | 183691 |
tonnes | | household-co2e | 251386.54 | | 24768.53 | 762399.92 |
tonnes | | business-waste-by-region | 753 | | 0 | 486432 |
tonnes | | business-waste-by-sector | 54 | | 0 | 1039179 |
tonnes | | waste-site-io | 0 | | -8.56 | 2325652.83 |
tonnes-input | The quantity of incoming waste as a decimal. | waste-site | 154.55 | | 0 | 1476044 |
tonnes-treated-recovered | The quantity of waste treated or recovered as a decimal. | waste-site | 133.04 | | 0 | 1476044 |
tonnes-output | The quantity of outgoing waste as a decimal. | waste-site | 152.8 | | 0 | 235354.51 |
(The CSV version of the table above.)
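The shared `ewc-code` dimension suggests that waste-site-io records can be linked to SEPA’s material names through material-coding. A minimal sketch of that cross-reference follows. It assumes that the codes match as exact strings, that material-coding has `ewc-code` and `material` columns, and that waste-site-io has the columns shown in the sample; the tonnes, the pairing of code to material, the "(unmapped)" label and the function name are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extracts; the code-to-material pairing and tonnes are illustrative.
MATERIAL_CODING_CSV = '''ewc-code,material
11 01 06*,"Acid, alkaline or saline wastes"
'''
WASTE_SITE_IO_CSV = """year,quarter,permit,io-direction,ewc-code,tonnes
2013,4,PPC/A/1000060,in,11 01 06*,12.5
2013,4,PPC/A/1000060,in,00 00 00,3
2013,4,PPC/A/1000060,out,11 01 06*,7
"""


def incoming_tonnes_by_material(io_rows, coding_rows, unmapped="(unmapped)"):
    """Sum incoming tonnes per material, looking each EWC code up in the coding table."""
    material_of = {row["ewc-code"]: row["material"] for row in coding_rows}
    totals = defaultdict(float)
    for row in io_rows:
        if row["io-direction"] != "in":
            continue  # skip outgoing waste
        material = material_of.get(row["ewc-code"], unmapped)
        totals[material] += float(row["tonnes"])
    return dict(totals)


by_material = incoming_tonnes_by_material(
    csv.DictReader(io.StringIO(WASTE_SITE_IO_CSV)),
    csv.DictReader(io.StringIO(MATERIAL_CODING_CSV)),
)
print(by_material)
```

Codes without an entry in the mapping are grouped under a placeholder label rather than dropped, so that the totals still account for every incoming record.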