What might a Waste Commons Scotland platform look like? Initial ideas in our design scenarios

A core goal of the DCS project is the development of ways in which Open Data platforms can be designed to be both multi-level (in terms of expected expertise) and learnable.  That is, we want to identify and start to develop features that encourage users to access and use the available data in increasingly sophisticated ways, learning both how to use the platform and how to engage with data at the same time.

Because of this, it is essential that the DCS team keep future users at the centre of the research and design process.  We have therefore adopted a design approach based on the creation of personas and scenarios developed from what a range of potential users told us, in a series of in-depth, qualitative interviews.

While personas and scenarios (or user journeys) are fairly widely used in HCI design, we’ve taken a slightly different approach to building our personas.  Building on an approach we developed in previous research (Wilson et al. 2018), we used the methods of phenomenography to analyse the interview data in a way that embraces the richness and diversity of skills, backgrounds, aims and values of potential users.  We then used the results of this analysis to create personas and scenarios that are based on values and capacities rather than needs and solutions.

These scenarios also imagine what a Waste Commons Scotland platform might look like, including some of the features we think we will need in order to help people learn how to make use of the data that such a site will link them up with.

You can find the resulting personas and scenarios in the Resources section of this site.

 

The usefulness of putting datasets into Wikidata?

A week ago, I attended Ian Watt’s workshop on Wikidata at the Scottish Open Data Unconference 2020. It was an interesting session and it got me thinking about how we might upload some of our datasets of interest (e.g. amounts of waste generated & recycled per Scottish council area, ‘carbon impact’ figures) into Wikidata. Would having such datasets in Wikidata be useful?

There is interest in “per council area” and “per citizen” waste data, so I thought that I’d start by uploading into Wikidata a dataset that describes the populations per Scottish council area per year (source: the Population Estimates data cube at statistics.gov.scot).

This executable notebook steps through the nitty-gritty of doing that. SPARQL is used to pull data from both Wikidata and statistics.gov.scot; the data is compared and the QuickStatements tool is used to help automate the creation and modification of Wikidata records. 2232 edits were executed against Wikidata through QuickStatements (taking about 30 minutes). Unfortunately, QuickStatements does not yet support setting the rank of a statement, so I had to edit each of the 32 council area pages individually to mark its 2019 population value as the Preferred rank population value, indicating that it is the most up-to-date population value.
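To give a flavour of the compare-then-upload step, here is a minimal Python sketch. It is not the notebook's actual code. It assumes the two SPARQL results have already been reduced to tables keyed by (council area QID, year), it uses Wikidata's population property (P1082) with a point in time qualifier (P585), and the function names and the `Q_AREA_A` identifier are made-up placeholders.

```python
from typing import Dict, List, Tuple

# (council area Wikidata QID, year) -> population estimate
PopulationTable = Dict[Tuple[str, int], int]

POPULATION = "P1082"    # Wikidata property: population
POINT_IN_TIME = "P585"  # Wikidata qualifier: point in time


def missing_from_wikidata(source: PopulationTable,
                          wikidata: PopulationTable) -> PopulationTable:
    """Keep the source rows that have no (area, year) counterpart in Wikidata."""
    return {key: value for key, value in source.items() if key not in wikidata}


def to_quickstatements(rows: PopulationTable) -> List[str]:
    """Render each row as one pipe-separated QuickStatements (V1) command."""
    lines = []
    for (qid, year), population in sorted(rows.items()):
        # The /9 suffix marks the date as having year precision.
        date = f"+{year}-00-00T00:00:00Z/9"
        lines.append("|".join([qid, POPULATION, str(population), POINT_IN_TIME, date]))
    return lines


if __name__ == "__main__":
    # Placeholder QID and made-up numbers, for illustration only.
    source = {("Q_AREA_A", 2018): 100, ("Q_AREA_A", 2019): 110}
    already_in_wikidata = {("Q_AREA_A", 2018): 100}
    for line in to_quickstatements(missing_from_wikidata(source, already_in_wikidata)):
        print(line)
```

The resulting lines can be pasted into QuickStatements as a batch. A real upload would probably also attach a source reference to each statement, which this sketch leaves out.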

But, is having this dataset in Wikidata useful?

The uploaded dataset can be pulled (de-referenced) into Wikipedia articles quite easily. As an example, I edited the Wikipedia article Council areas of Scotland to add a new column to its main table, “Number of people (latest estimate)”, whose values are pulled (each time the page is rendered) directly from the data that I uploaded into Wikidata:

Visualisations based on the uploaded dataset can be embedded into web pages quite easily. Here’s an example that fetches our dataset from Wikidata and renders it as a line graph when this web page is loaded into your web browser:
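For readers curious about the fetching side, the sketch below shows roughly how the population time series could be requested from Wikidata and reshaped for plotting. It is an illustration in Python rather than the code behind the embedded graph. The query relies on Wikidata's statement and qualifier prefixes (`p:`, `ps:`, `pq:`), the caller has to supply the QID of the "council area" class (left as a parameter here rather than guessed), and `population_query` and `to_series` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def population_query(class_qid: str) -> str:
    """Build a SPARQL query for the yearly populations of every instance of class_qid."""
    return f"""
SELECT ?areaLabel ?year ?population WHERE {{
  ?area wdt:P31 wd:{class_qid} ;
        p:P1082 ?statement .
  ?statement ps:P1082 ?population ;
             pq:P585 ?date .
  BIND(YEAR(?date) AS ?year)
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}"""


def to_series(sparql_json: dict) -> Dict[str, List[Tuple[int, int]]]:
    """Group standard SPARQL JSON results into one (year, population) series per area."""
    series = defaultdict(list)
    for row in sparql_json["results"]["bindings"]:
        year = int(row["year"]["value"])
        population = int(float(row["population"]["value"]))
        series[row["areaLabel"]["value"]].append((year, population))
    # Sort each series by year so it can be drawn as a line.
    return {area: sorted(points) for area, points in series.items()}
```

The query text would be sent to the Wikidata SPARQL endpoint with JSON requested as the result format, and each series returned by `to_series` becomes one line on the graph.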

 

Concerns, next steps, alternative approaches.

Interestingly, there is some discussion about the pros & cons of inserting Wikidata values into Wikipedia articles. The main argument against is the immaturity of Wikidata’s structure, which raises a concern about the durability of references into its data structure. The counterpoint is that early use & evolution might be the best path to maturity.

The case study for our Data Commons Scotland project is open data about waste in Scotland. So a next step for the project might be to upload into Wikidata datasets that describe the amounts of household waste generated & recycled, and ‘carbon impact’ figures. These could also be linked to council areas – as we have done for the population dataset – to support per council area/per citizen statistics and visualisations. Appropriate properties do not yet exist in Wikidata for the description of such data about waste, so new ones would need to be ratified by the Wikidata community.
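Once waste and population figures are linked to the same council areas, a per citizen statistic is a simple division. The Python sketch below illustrates the idea; the function name is hypothetical, and it assumes waste is reported in tonnes per year and that a matching population estimate exists for the same area and year.

```python
def waste_kg_per_citizen(waste_tonnes: float, population: int) -> float:
    """Convert an area's yearly waste in tonnes into kilograms per person."""
    if population <= 0:
        raise ValueError("population must be positive")
    return waste_tonnes * 1000 / population


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    print(waste_kg_per_citizen(1000.0, 2000))
```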

Should such datasets actually be uploaded into Wikidata? These are small datasets and they seem to fit well enough into Wikidata’s knowledge graph. Uploading them into Wikidata may make them easier to access, de-silo the data and help enrich Wikidata’s knowledge graph. But then, of course, there is the issue of keeping the data up to date. Alternatively, those datasets could be pulled dynamically and directly from statistics.gov.scot into Wikipedia articles with the help of some new MediaWiki extensions.

 

 

Data Commons Scotland at SODU2020 – the build up!

We’re excited to be participating in SODU2020 this weekend (5th and 6th September 2020).  SODU is the Scottish Open Data Unconference, organized by Aberdeen’s Code the City, and this year’s purely online event looks as if it’s going to be as exciting as ever. The pitches being developed on SODU2020’s Slack channel suggest there are going to be lots of thought-provoking, critical and productive conversations. We’ll be pitching ourselves, hoping that people will be interested in the Data Commons Scotland project and willing to share their own experiences and expertise in order to help us find some solutions to the challenges we’ve been identifying.

We’re hoping to run at least one session (more, if there’s enough interest) addressing the following questions:

  • How can we help potential data providers feel more comfortable making ‘imperfect’ data open? (There are no perfect datasets, right?)
  • At the same time, how can we communicate to a variety of potential users the quality/reliability/completeness of the data that do get shared so that they can be sensibly used/applied?
  • What has already been done well on other open data sites? We don’t want to reinvent the wheel, after all.
  • What are the best linking approaches (semantic web/shared labels…)?
  • And what about community sourced linked open data – what are the reliability issues associated with that, and are there any good tools for uploading it?

To help us get some conversations going around these issues, we’ve produced a short video that highlights some of what we’ve learned so far from the perspective of both potential users and ourselves as researchers/designers.

The first part of the video is based on one of the scenarios we’ve created as part of our user-design process – we’ll post another blog about the six personas and their associated scenarios soon. The second part of the video is based on our own perspectives. We’d love to know if you have any suggestions to help us answer some of our questions.