Annotating data points on our prototype website

One item on our requirements list is to weave interest-based navigation maps through our data site. Feedback from the recent SODU 2021 conference affirmed this:

I like the site’s tools and visualisations, but more needs to be done to help me navigate my path of interest through the prototype website.

In an exploratory step towards fulfilling that requirement, we have annotated some data points with explanations/narrative. The idea is that these annotations could become waymarks in navigation maps, to guide users between the datapoints which underpin data-based stories. We might even imagine how clicking a ‘next’ button on a waymark would visually ‘fly’ the user to the next datapoint in the story (which is, perhaps, on a different graph or different page). But(!) back to our present, very simple proof-of-concept implementation…

Here’s how the annotations look in our present, proof-of-concept implementation:

Annotations plotted on Inverclyde’s household waste generated graph

Each annotation is depicted by an emoji which is plotted beside a datapoint (on a graph, or in a table). When the user hovers over (or clicks on) an annotation’s emoji, a pop-up will display some informative text.

We want to code annotations just as we would any other dataset – as a straightforward CSV file. So we have built a data-driven annotation mechanism. This has allowed us to specify annotations, as data, in a CSV file like this:

Annotations specified in a CSV data file

Each annotation record contains datapoint coordinates which specify the datapoint against which the annotation is to be plotted. The datapoint coordinates include a record-type which specifies the dataset against which the annotation is to be plotted. (In this example, the specified dataset household-waste-derivation-generation is a derived dataset, based on the household-waste and population datasets.)
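To make the idea of datapoint coordinates concrete, here is a minimal Python sketch of how annotation records might be indexed and looked up. It is an illustration only: the column names (`record-type`, `region`, `year`, `emoji`, `text`), the example row and the function names are assumptions, not the layout of our actual CSV file, and our webapp itself is not written in Python.

```python
import csv
import io

# Hypothetical annotation data: the column names and the example row are
# assumptions made for illustration, not the project's real CSV layout.
# "\U0001F4DD" is the memo emoji.
ANNOTATIONS_CSV = """record-type,region,year,emoji,text
household-waste-derivation-generation,Inverclyde,2019,\U0001F4DD,Example annotation text
"""


def load_annotations(csv_text):
    """Index annotation records by their datapoint coordinates."""
    index = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The coordinates identify the dataset (record-type) and the
        # datapoint within it (here assumed to be a region and a year).
        key = (row["record-type"], row["region"], int(row["year"]))
        index.setdefault(key, []).append(
            {"emoji": row["emoji"], "text": row["text"]}
        )
    return index


def annotations_for(index, record_type, region, year):
    """Return the annotations to plot beside one datapoint (maybe none)."""
    return index.get((record_type, region, year), [])
```

A chart or table renderer could call `annotations_for` for each datapoint that it draws, and plot an emoji wherever the returned list is not empty.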

This proof-of-concept, data-driven, annotation mechanism has been useful because it has:

  1. given us a model with moving parts to learn from,

  2. provided hints about how annotations can be used to help users understand and navigate the data,

  3. shown us that we need more structure around the naming and storage of derived datasets (and their annotations), and

  4. uncovered the difficulties of retro-fitting an annotations mechanism into our prototype-6 website. (Annotations are displayed using off-the-shelf Vega-lite tooltips and Bulma CSS dropdowns, but these don’t provide a satisfactory level of placement/control/interactivity. More customised webpage components will be needed to provide a better user experience.)

The Fair Share – the CO2e saved by this university based, reuse store

Discover how many cars’ worth of CO2e is avoided each year because of this university based, reuse store

The Fair Share is a university based, reuse store. It accepts donations of second-hand books, clothes, kitchenware, electricals, etc. and sells these to students. It is run by the Student Union at the University of Stirling. It meets the Revolve quality standard for second-hand stores.

The Fair Share is in the process of publishing its data as open data. Click on the image below to see a web page that is based on a draft of that work.

The Fair Share

The prototype’s architecture – revised

“Trialling Wikibase for our data layer” described how we evaluated the use of Wikibase as a key implementation component in our bi-layer architecture. The conclusion was that Wikibase, although a brilliant product, does not fit our immediate purpose.

In our revised architecture…​

Wikibase is replaced with (dcs-easier-open-data) a simple set of data files (CSV and JSON) hosted in a public repository (GitHub). These data files are generated by the Waste Data Tool (dcs-wdt). Together, dcs-easier-open-data and dcs-wdt implement the architecture’s data layer.

In the architecture’s revised presentation layer, the webapp reads (CSV/JSON formatted) data from the dcs-easier-open-data repository, instead of reading (via SPARQL) data from the Wikibase.

The prototype’s bi-layered architecture - revised
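As a rough illustration of why plain data files make for a simpler presentation layer, the following Python sketch parses a CSV or JSON data file into plain records. The function names are hypothetical, the caller must supply the URL of a data file, and our actual webapp does this in the browser rather than in Python.

```python
import csv
import io
import json
from urllib.request import urlopen


def parse_dataset(text, fmt):
    """Parse the text of a data file into plain Python records."""
    if fmt == "csv":
        # Each CSV row becomes a dict keyed by the header row.
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        return json.loads(text)
    raise ValueError(f"unsupported format: {fmt}")


def fetch_dataset(url):
    """Fetch and parse a data file; the format is taken from the URL's extension.

    The URL is supplied by the caller (no repository path is assumed here).
    """
    fmt = url.rsplit(".", 1)[-1].lower()
    with urlopen(url) as response:
        return parse_dataset(response.read().decode("utf-8"), fmt)
```

There is no query language or endpoint involved: an HTTP GET and a standard parser are enough, which is the simplification the revised architecture is after.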

Stirling’s bin collection data – revisited

Stirling Council set a precedent by being the first (and still only) Scottish local authority to have published open data about their bin collection of household waste.

The council are currently working on increasing the fidelity of this dataset, e.g. by adding spatial data to describe collection routes. However, we can still squeeze several interesting pieces of information from its current version. For details, visit the Stirling bin collection page on our website mockup.

“How is waste in my area?” – a regional dashboard

Introduction

Our aim in this piece of work is:

to surface facts of interest (maximums, minimums, trends, etc.) about waste in an area, to non-experts.

Towards that aim, we have built a prototype regional dashboard which is directly powered by our ‘easier datasets’ about waste.

The prototype is a webapp and it can be accessed here.

our prototype regional dashboard
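The kinds of facts that we want to surface (positions and trends) can be sketched in a few lines of Python. This is a sketch under assumptions: it ranks areas from the lowest value upwards and measures a trend as a least-squares slope, which may differ from how the dashboard itself computes them, and the function names are hypothetical.

```python
def positions(values_by_area):
    """Rank areas from the lowest value (position 1) to the highest."""
    ordered = sorted(values_by_area, key=values_by_area.get)
    return {area: i + 1 for i, area in enumerate(ordered)}


def trend(points):
    """Least-squares slope of (year, value) points.

    A negative result means that the value is decreasing over time.
    Needs at least two points with different years.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx
```

For a metric such as tonnes of household waste per citizen, position 1 is then the area that generates the least, and the area with the most negative slope has the best trend.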

Curiosities

Even this early prototype manages to surface some curiosities [1] …​

Inverclyde

Inverclyde is doing well.

Inverclyde’s household waste positions Inverclyde’s household waste generation Inverclyde’s household waste CO2e

In the latest data (2019), it generates the fewest tonnes of household waste (per citizen) of any of the council areas. And its same 1st position for CO2e indicates the close relation between the amount of waste generated and its carbon impact.

…​But why is Inverclyde doing so well?

Highland

Highland isn’t doing so well.

Highland’s household waste positions Highland’s household waste generation Highland’s household waste % recycled

In the latest data (2019), it generates the most (except for Argyll & Bute) tonnes of household waste (per citizen) of any of the council areas. And it has the worst trend for percentage recycled.

…Why has Highland’s percentage recycled been getting worse since 2014?

Fife

Fife has the best trend for household waste generation. That said, it has still been generating an above-average amount of waste per citizen.

Fife’s household waste positions Fife’s household waste generation

The graphs for Fife business waste show that there was an acute reduction in combustion wastes in 2016.

Fife’s business waste

We investigated this anomaly before and discovered that it was caused by the closure of Fife’s coal fired power station (Longannet) on 24th March 2016.

Angus

In the latest two years of data (2018 & 2019), Angus has noticeably reduced the amount of household waste that it landfills.

Angus' household waste management

During the same period, Angus has increased the amount of household waste that it processes as ‘other diversion’.

…​What underlies that difference in Angus’ waste processing?

Technologies

This prototype is built as a ‘static’ website with all content-dynamics occurring in the browser. This makes it simple and cheap to host, but results in heavier, more complex web pages.

  • The clickable map is implemented on Leaflet – with Open Street Map map tiles.
  • The charts are constructed using Vega-lite.
  • The content-dynamics are coded in ClojureScript – with Hiccup for HTML, and Reagent for events.
  • The website is hosted on GitHub.

Ideas for evolving this prototype

  1. Provide more qualitative information. This version is quite quantitative because, well, that is the nature of the datasets that currently underlie it. So there’s a danger of straying into the “management by KPI” approach when we should be supporting the “management by understanding” approach.
  2. Include more localised information, e.g. about an area’s re-use shops, or bin collection statistics.
  3. Support deeper dives, e.g. so that users can click on a CO2e trend to navigate to a choropleth map for CO2e.
  4. Allow users to download any of the displayed charts as (CSV) data or as (PNG) images.
  5. Enhance the support of comparisons by allowing users to multi-select regions and overlay their charts.
  6. Allow users to choose from a menu, what chart/data tiles to place on the page.
  7. Provide a what-if? tool. “What if every region reduced by 10% their landfilling of waste material xyz?” – where the tool has a good enough waste model to enable it to compute what-if? outcomes.
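To give a feel for the what-if? idea in the last item, here is a deliberately naive Python sketch. It assumes records with hypothetical `region`, `material`, `management` and `tonnes` fields, and it simply scales down the landfilled tonnes; it does not model where the diverted waste ends up, which is exactly the part that a good enough waste model would have to supply.

```python
def what_if_landfill_reduction(records, material, fraction):
    """Return copies of the records with every region's landfilled tonnes
    of the given material reduced by the given fraction (0.1 means 10%)."""
    result = []
    for record in records:
        record = dict(record)  # copy, so that the input is left untouched
        if record["material"] == material and record["management"] == "Landfilled":
            record["tonnes"] = record["tonnes"] * (1 - fraction)
        result.append(record)
    return result


def total_tonnes(records, management):
    """Sum the tonnes over all records with the given management type."""
    return sum(r["tonnes"] for r in records if r["management"] == management)
```

Comparing `total_tonnes` before and after the transformation gives the headline what-if? outcome for landfill; a fuller tool would also estimate the knock-on effects, such as the change in CO2e.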

1. One of the original sources of data has been off-line due to a cyberattack so, at the time of writing, it has not been possible to double-check all figures from our prototype against original sources.

A mock-up website for functionality & navigation

Introduction

A prototype website will be one of the outcomes of this research project. The website should help non-experts discover, learn about and understand the open data about waste in Scotland.

To date, we have built a couple of mock-ups [1]:

  1. functionality & navigation mock-up for exploring ideas about functionality and navigation for our eventual website.
  2. look’n’feel mock-up for exploring looks/visual aesthetics.

This document concentrates on the functionality & navigation mock-up…​

The splash page of the functionality & navigation mock-up

Functionality

This mock-up ties together a lot of the elements we’ve been working on:

Data Direct access to download the underlying datasets.
A simple, consistent set of CSV and JSON files.
Maps Interactive, on-map depictions of the information from the datasets.
Data grids with graphs A tool for slicing’n’dicing the datasets and visualising the result as a graph.
To make this easier, this tool will provide useful slicing’n’dicing presets: starting points from which users can explore.
SPARQL A query interface to a semantic web representation of the datasets.
This is unlikely to be of use to our target audience, so we’ll probably remove it from the UI but may use its semantic graph internally.
Articles Themed articles and tutorials that are based on evidence from the datasets.
Uses Asciidoc mark-up to make the articles easy to format.
The articles may incorporate data visualisations that are backed by our datasets.

Navigation

The mock-up provides 3 routes to information:

Themes The clickable blocks on the splash page allow users to explore a waste theme by taking the user to a specific set of articles and tutorials.
Navbar The menu bar at the top of each page, provides an orthogonal, more ‘functional’ classification of the website’s contents.
Search At present, this is a very basic text & tag search. In the future, a predictive/auto-suggestion search, based on a semantic graph of the contents, will be provided.

Users’ navigation histories may help power a further-reading recommender subsystem.

Architecture

Building this mock-up has required some architectural decisions that may help inform the design of our eventual website.

Static website The mock-up has been implemented as a so-called ‘static website’. This means that page content is not dynamically generated by (or saved to) the server-side. The server-side simply serves ‘static content files’.

Pros Implementation-wise, it is an order of magnitude simpler and more scalable than a ‘dynamic’ website.
There are several good, free, open source ‘static website generators/frameworks’.
Static websites can be served for free on hosting platforms such as GitHub (as used for this mock-up).
Cons It can’t support a whole class of functionality, including user uploads, and on-line content editing.
Computation is forced towards the client-side (i.e. into users’ web browsers) which sometimes can have a negative impact on the speed of the UI.
Off-line updates The content of the website can be updated – just not updated on-line. The website maintainers can add new/edit existing datasets, articles, etc. via off-line means.
For off-line updates to this mock-up we use: (i) WDT – a rough’n’ready software script that helps us to curate the datasets that underlie this mock-up; (ii) Cryogen – a static website generator; (iii) Git – to upload updates to our GitHub hosting service.
Client-side computation Page content is dynamically manipulated (e.g. datasets are sliced’n’diced) on the client-side (in users’ web browsers) using JavaScript. This enables, for example, the mock-up’s web pages to take the static content that is served by the server-side, and manipulate it so that it can support interactive data visualisations.
Progress in client-side technology even makes it possible to implement a semantic graph supporting triple store in a web browser!

Conclusion

This mock-up website…​

  • provides a concrete test-bed for evolving the functionality & navigation aspects of our eventual website, and
  • forces us to think about architectural trade-offs.

1. We use the term “mock-up” to mean an incomplete representation/model – useful for demonstration, design evaluation and acquiring user feedback.

How I chanced on Longannet in the data

I’ve added a “Household vs business waste” time-series to our map-oriented webapp from last week. The business data was parsed from SEPA’s Business Waste Data Tables.

When I watched the waste amounts change through time on this map, Fife’s amounts really stood out…​

Household vs business waste, thru time

Fife was generating so much more waste from business, than the other council areas. But why?

To look at the data in more detail, I loaded it into the data grid & graph tool that we built a couple of months ago.

First, I filtered the data grid to show me: Fife’s four largest, business wastes vs their averages link.

Fife’s four largest, business wastes vs their averages

Fife’s combustion waste stands out from the average.
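The filtering step above can be approximated outside the data grid tool. The following Python sketch finds a region's largest waste categories and compares each with the average across the regions that report that category. The field names (`region`, `category`, `tonnes`) and the function name are assumptions, and the tool itself may aggregate differently (e.g. per year).

```python
from collections import defaultdict


def largest_wastes_vs_average(records, region, n=4):
    """Return (category, region's tonnes, average tonnes per region) for
    the n categories in which the given region has the most waste."""
    # category -> region -> total tonnes
    totals = defaultdict(lambda: defaultdict(float))
    for record in records:
        totals[record["category"]][record["region"]] += record["tonnes"]

    region_totals = {
        category: by_region.get(region, 0.0)
        for category, by_region in totals.items()
    }
    top = sorted(region_totals, key=region_totals.get, reverse=True)[:n]
    return [
        (category,
         region_totals[category],
         sum(totals[category].values()) / len(totals[category]))
        for category in top
    ]
```

A category whose regional figure sits far above its average is a candidate anomaly that is worth a closer look.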

Secondly, I filtered the data grid to show me: the business combustion waste quantities by sector link.

Business combustion wastes by sector

Unfortunately this data isn’t broken down by council area, but it clearly shows that most of the combustion wastes are generated by the power industry.

An internet search with this information – i.e. “Fife combustion power” – returns a page full of references to Longannet – the coal fuelled power station.

Longannet power station (courtesy of Scottish Power)

According to Wikipedia, Longannet power station was the 21st most polluting in Europe when it closed, so no wonder that its signature in the data is so obvious! It was closed on 24th March 2016, which correlates with the sharp return towards the average in 2016, of the combustion wastes graph line for Fife.

Of course this isn’t a real discovery – SEPA, Scottish Power and the people who lived around the power station will be very familiar with this data anomaly and its cause. But I think that it’s mildly interesting that a data lay person like me could discover this from looking at these simple data visualisations.

Waste quantities through time, on a map

Preface

Shortly before the end of 2020, I attended the Code The City 21: Put Your City on the Map hack weekend which explored ideas for putting open data onto geographic maps.

It ran several interesting projects. One was especially inspiring to me: the Bioregion Dashboard. Its idea is to tell an evidence-backed story-through-the-years, involving interactive data displays against a map. James Littlejohn introduces it in this YouTube video.

This got me thinking about new ways to depict the information that is bound up in the data about waste…​

In particular, thinking about a means to convey at-a-glance, to the lay person, how council areas compare through time in respect of the amounts of (household solid) waste that they process. Now, the grid & graph prototype that we built a couple of months back conveys that same information very well (and with a greater fidelity than we will manage in this work) but, to a lay person like me, it isn’t attention grabbing. I like seeing something with movement and with features that I can relate to, such as animated charts and a geographical map.

The prototype webapp

Leveraging what I learnt at the Code the City 21 hack weekend, I hacked together a prototype webapp that shows how waste quantities change through time, on a geographic map.

The animated image of the webapp below conveys that landfilled-waste is reducing over time whilst total-waste is remaining fairly constant.

Managed solid waste, through time

UI controls

  • The dataset of interest is chosen through the dropdown control, either:
    1. Tonnes of managed solid household waste per person per year.
    2. Tonnes of CO2 equivalent from household waste per person per year.
  • Use the slider control to travel through time.
  • Each pie chart depicts the waste-related quantities for a council area.
    • The sizes of its slices and its overall size, are related to the quantities that it depicts.
  • Hover over a council area to see detailed metrics in the detail panel.
  • The usual map zoom and pan controls are supported.
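One way to relate a pie chart's size and slices to the quantities that it depicts is sketched below in Python. It assumes that the pie's area (rather than its radius) is proportional to the council area's total, so that a total four times as large draws a pie with twice the radius. The function name, the `max_radius` default and the scaling rule are assumptions; the webapp may scale its pies differently.

```python
import math


def pie_geometry(quantities, max_total, max_radius=40.0):
    """Compute a pie's radius and its slice angles (in degrees).

    quantities -- slice name -> quantity for one council area
    max_total  -- the largest total over all areas; its pie gets max_radius
    """
    total = sum(quantities.values())
    if total <= 0 or max_total <= 0:
        return 0.0, {name: 0.0 for name in quantities}
    # Area-proportional sizing: the radius grows with the square root of the total.
    radius = max_radius * math.sqrt(total / max_total)
    # Each slice's angle is its share of the area's total.
    angles = {name: 360.0 * value / total for name, value in quantities.items()}
    return radius, angles
```

Scaling the radius directly by the total would visually exaggerate the larger areas, which is why the square root is used in this sketch.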

Software and datasets

CO2 equivalent

‘Live’ instance

A ‘live’ instance of this webapp can be accessed here.

Closing thoughts

I haven’t seen these datasets about waste shown in this way before, and I think that it usefully conveys aspects of the datasets in a catchy and easy to understand way. It is low fidelity when compared to a full data grid with graph solution, but the idea is to hold the attention of the average person in the street.

Future work could integrate additional waste-relevant datasets that have geography and time dimensions. Also we should consider alternative metrics (such as ratios), alternative charts (such as bar or polar) and alternative statistics (such as deviation or trend). I went with the ‘most straightforward’ but user-testing might indicate that an alternative is better.

Increasing this project’s on-line presence

This project needs more presence on-line. Thanks for prompting this, Ian! (And be sure to read Ian’s posting on the state of open data.) So, this week…

  • Anna & I have made a start on revamping this, the project’s public WordPress site, which Anna created late last year. This site is accessible at https://campuspress.stir.ac.uk/datacommonsscotland. The idea is that we’ll publish on it limited-lifespan information such as relevant happenings & blog postings and take feedback comments.
  • Also, I’ve created a public GitHub site at https://github.com/data-commons-scotland for some of the project’s longer-lifespan outputs such as concepts/models, standards, research output and open source code. GitHub will continue to preserve these (hopefully useful) outputs beyond the lifespan of this project. I’ve made a start by adding some investigation reports (dcs-shorts) and example web application source code (dcs-wcs).