Using a literate programming tool to generate content

Literate programming tools weave data, code, visualisations and natural language into a flowing narrative. These tools are often used to construct tutorial-style documents that are based on tractable/generatable material.

For us, this sounded like a promising way to generate content, since one of our aims is to develop (website-situated) how-to guides/tutorials based on the tractable waste datasets. So we created our first tutorial-style document using this approach: a walk-through on how to extract information from the data about business waste in Scotland. Here’s a screenshot of it:

Screenshot of our first document generated using a literate programming tool

It uses:
Only minimal mark-up and programming code were required (see its source file), and it has proved to be a handy means of generating a data-based tutorial.

Annotating data points on our prototype website

On our requirements list is the aim of weaving interest-based navigation maps through our data site. Feedback from the recent SODU 2021 conference affirmed this:

I like the site’s tools and visualisations, but more needs to be done to help me navigate my path of interest through the prototype website.

In an exploratory step towards fulfilling that requirement, we have annotated some data points with explanations/narrative. The idea is that these annotations could become waymarks in navigation maps, to guide users between the datapoints which underpin data-based stories. We might even imagine how clicking a ‘next’ button on a waymark would visually ‘fly’ the user to the next datapoint in the story (which is, perhaps, on a different graph or a different page). But(!) back to our present, very simple proof-of-concept implementation…

Here’s how the annotations look in our present, proof-of-concept implementation:

Annotations plotted on Inverclyde’s household waste generated graph

Each annotation is depicted by an emoji which is plotted beside a datapoint (on a graph, or in a table). When the user hovers over (or clicks on) an annotation’s emoji, a pop-up will display some informative text.

We want to code annotations just as we would any other dataset – as a straightforward CSV file. So we have built a data-driven annotation mechanism. This has allowed us to specify annotations, as data, in a CSV file like this:

Annotations specified in a CSV data file

Each annotation record contains datapoint coordinates which specify the datapoint against which the annotation is to be plotted. The datapoint coordinates include a record-type which specifies the dataset against which the annotation is to be plotted. (In this example, the specified dataset household-waste-derivation-generation is a derived dataset, based on the household-waste and population datasets.)
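As a rough illustration of the idea (a sketch only, not our actual implementation), the Python below reads annotation records from CSV text and indexes them by their datapoint coordinates, so that a plotting step could look up the annotations for a given datapoint. The column names (`record-type`, `region`, `year`, `emoji`, `text`), the sample row and the helper function names are all assumptions made for this illustration; they are not necessarily those used in our annotations file.

```python
import csv
import io

# Hypothetical annotation records: the column names and the sample row are
# illustrative assumptions, not a copy of the real annotations file.
ANNOTATIONS_CSV = """record-type,region,year,emoji,text
household-waste-derivation-generation,Inverclyde,2017,💡,Example explanatory note for this datapoint.
"""


def load_annotations(csv_text):
    """Index annotation records by their datapoint coordinates."""
    index = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The record-type names the dataset; region and year locate the datapoint.
        key = (row["record-type"], row["region"], int(row["year"]))
        index.setdefault(key, []).append({"emoji": row["emoji"], "text": row["text"]})
    return index


def annotations_for(index, record_type, region, year):
    """Return the annotations (possibly none) specified against one datapoint."""
    return index.get((record_type, region, year), [])


annotations = load_annotations(ANNOTATIONS_CSV)
found = annotations_for(
    annotations, "household-waste-derivation-generation", "Inverclyde", 2017)
missing = annotations_for(
    annotations, "household-waste-derivation-generation", "Inverclyde", 2011)
```

Keying the index on the full set of coordinates (including the record-type) means the same lookup could serve any graph or table, whichever dataset it is drawn from.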

This proof-of-concept, data-driven, annotation mechanism has been useful because it has:

  1. given us a model with moving parts to learn from,

  2. provided hints about how annotations can be used to help users understand and navigate the data,

  3. shown us that we need more structure around the naming and storage of derived datasets (and their annotations), and

  4. uncovered the difficulties of retro-fitting an annotations mechanism into our prototype-6 website. (Annotations are displayed using off-the-shelf Vega-Lite tooltips and Bulma CSS dropdowns, but these don’t provide a satisfactory level of placement/control/interactivity. More customised webpage components will be needed to provide a better user experience.)

Building linked open data about carbon savings

linked open data for carbon savings

We have written a research report which walks through how we might build linked open data (LoD) about carbon savings from dissimilar data sources.

It outlines (using small samples from the datasets) how the data pipeline that feeds our prototype-6 webapp works.

Building LoD about carbon savings - research report - coversheet

Data Commons Scotland at the SODU 2021 conference

Over the weekend (2-3 Sept 2021), I represented our DCS project at SODU 2021 – Scotland’s annual conference on Open Data.
Organised and run by the Code the City team, this event always provides a great opportunity to catch up with others in Scotland’s friendly Open Data community, and hear about their news.
This year, for me, its highlights included:
  • A “corridor chat” that began ad hoc, about the preservation of railway history as represented by its data records (mostly paper-based). That led us to discuss Git persistence, the zeitgeist for shared ledger databases with explicit temporal support, and what all of that might mean for recording Open Data!
  • Then, a session on the perhaps more immediate concern of how to nudge the government into making open more of the data which it holds. Proposed was the neat idea of aggregating, curating and making searchable all of the responses arising from FOI requests to local and national government. This would help highlight data that the government should be making open by default.
  • And it was heartening to see representatives from the Scottish government’s Open Data team attending the conference and running an engaging session that brought together government and community perspectives. The government’s recent initiative to “make public sector data easy to find” was one of the topics discussed.
  • The conference even gained an international dimension when two attendees joined us from Sweden to help run a live editing session on Wikidata, contributing to the project to add better data about Scottish government agencies into Wikidata.
  • Our own project received some valuable feedback after I demo-ed our latest prototype website. It wasn’t all affirmative! I got some useful insights into what people found difficult. For example: “I like the site’s tools and visualisations, but more needs to be done to help me navigate my path of interest through the prototype website.” This ties in nicely with one of our project’s (as yet unrealised) goals: to weave interest-based navigation maps through our data site.
I enjoyed the friendly SODU sessions over the weekend – it was inspiring to hear what others are contributing towards making data more open and accessible.
This year’s SODU was online because of Covid-19. Hopefully next year it will return to its more physical manifestation in Aberdeen city!

“What are my neighbours putting into their bins?!”

What do households put into their bins, and how appropriate are their disposal decisions?

To help answer that question, Zero Waste Scotland (ZWS) occasionally asks each of the 32 Scottish councils to sample their bin collections and to analyse their content. This compositional analysis uncovers the types and weights of the disposed-of materials, and assesses the appropriateness of the disposal decisions (i.e. was each item put into the right bin?).

Laudably, ZWS is considering publishing this data as open data. Click on the image below to see a web page that is based on an anonymised subset of this data.

household waste analysis

The Fair Share – the CO2e saved by this university-based reuse store

Discover how many cars’ worth of CO2e is avoided each year because of this university-based reuse store

The Fair Share is a university-based reuse store. It accepts donations of second-hand books, clothes, kitchenware, electricals, etc. and sells these to students. It is run by the Student Union at the University of Stirling. It meets the Revolve quality standard for second-hand stores.

The Fair Share is in the process of publishing its data as open data. Click on the image below to see a web page that is based on a draft of that work.

The Fair Share

Our new sister project, Waste Stories

We have recently launched a new sister project that complements the Data Commons Scotland’s data-based orientation to waste and resources in Scotland with an approach based on generating stories and short fiction about the materials that enter the waste stream in Scotland.

Waste Stories is a project that aims to transform the relationships that we have with waste by exploiting the affective power of story-telling. It involves Data Commons Scotland team members Anna Wilson, Hannah Hamilton and Greg Singh. You can find out more about it here:

About Waste Stories

We’ll be using some of the images and stories generated through this project to enhance the Data Commons Scotland open data platform in future.

The Data Lab MSc data challenge event 2021

With Glasgow City hosting the UN Climate Change conference (COP26) later this year, it was appropriate that this year’s data analysis hackathon from The Data Lab (held last week) had the theme “pollution reduction”.

Three organisations provided challenge projects for the hackathon teams: we provided a “waste management” project based on our easier-to-use datasets; Code the City provided an “air quality” project; and Scottish Power an “electric vehicle charging” project.

The hackathon was led by a young Scottish tech start-up company called Filament. They have an interesting product that is basically a sharable, cloud-hosted Jupyter Notebook.

Each day a new cohort of teams would tackle the project challenges. We helped by answering their questions about our datasets, and by suggesting ideas for investigation.
At the end of each day the teams presented their findings.

It was informative to see how the teams (each with a mix of skills that included programming, data analysis and business acumen) organised themselves for group working, handled the data, and applied learned analysis techniques.

The teams had a relatively short amount of time to work on their projects, so having easy-to-use datasets was a deciding factor in how much they could achieve. One take-away is therefore clear, and it helps substantiate an aim of our DCS project: open data needs to be easy to use, not just accessible. Making data easier for non-experts to use opens it to a much wider audience and to much more creativity.