This trip report is part of my NordiCHI experiences. You can find the NordiCHI main-conference post here. For me the conference started on Saturday with a pre-conference workshop. I attended the Curating/Fermenting Data workshop, which allowed us to draw parallels between the process that vegetables go through in fermentation, to the process that data goes through from being 'raw' to providing answers.
The workshop had three organizers and six participants, all of whom had their own perspective on data and personal goals and take-aways. For the organizers, we had Magdalena Tyżlik-Carver, an Associate Professor of Digital Communication and Culture; Lozana Rossenova, a designer and digital humanities researcher who is active on Wikidata; and Lukas Fuchsgruber, a post-doc on digital interfaces for museums and an art-historian. Together, they had various interesting perspectives on the links between fermenting and data collection and processing.
One of the interesting points that Magda made was that there is no single, established definition of data curation. However, curate comes from the latin word curere, which means things such as arranging, taking care of, and attending to. It is a process that centers care. So, if data curation is about managing the lifecycle of data, then the process of curating data is driven by the question: what does it mean for data to have a life? This is where the fermenting came in.
During the workshop, we selected recipes for fermenting, selected and prepared our ingredients, massaged in the salt, and later put the veggies in jar. We also tasted ferments that the organizers had created over the summer, and wrote down our observations, both of our senses and regarding any memories or recipes we know. The workshop was a collaborative process in all aspects, and all our interactions were then afterwards recorded as data, representing the life-cycle of our (and the pre-made) ferments.
In the afternoon we worked with Wikidata to learn more about Linked Open Data and how we could encode it. My knowledge of databases, and specifically RDF, was very convenient here. We reflected on our experiences of the day in three groups on different topics: other-than-human actors such as bacteria, the social aspect such as human agents and their interaction, and sensory data. Then, we were asked to evaluate which of these we could generalize beyond that day's workshop to encode it on Wikidata. I had chosen the social perspective, and although I learnt some new things about creating triples on this data, it was basically impossible to generalize. One statement we were able to encode was that family recipes are an instance of recipes. We did observe that this generalizability was exactly what took the human out of the loop (or the data in this case).
There were two research questions provided as guiding principles for the workshop:
The first question was the main reason for my application to this workshop. In the Knowledge Engineering course that George and I designed last year, this is one of the central questions as well. For us, the solutions is to use data sheets [1]. These are extras enclosed to the dataset, that document answers to the questions such as who, where and why? In the workshop, the provided platform was Wikidata, which does not seem fitting for this purpose, as it is more about generalized data rather than instances. I felt the workshop could perhaps have gone deeper on this question.
For the second reseach question on data that is both machine- and human-readable, the focus of the workshop was on RDF and triples. There was not much discussion on this topic, such as talking through various options. However, as most people seemed to not be familiar with these -the audience was not just computer scientists- this seems fine. I was hoping to discuss more methods for coding meta-data, but I'll just research this on my own.
Overall, perhaps the research questions that were mentioned on the website were not necessarily a representation of this instance of the workshop. Although I did not exactly got out of this workshop what I hoped, I think that the workshop nevertheless was a succes. It was a great opportunity to learn more about various perspectives and metaphors for data curation. I gained new insights into data encoding from a social and historic perspective, that can complement my more technical education and my research position. Plus, (I hope) I made some tasty cabbage!
I want to thank the organizers, Magdalena Tyżlik-Carver, Lozana Rossenova and Lukas Fuchsgruber for the interesting content and hands-on experience.