As The Big Ancient Mediterranean Conference begins to wind down, my head is swirling. People are working on amazing things and it is hard not to keep thinking of all the different ways to learn from them in order to improve my own project. We will wrap up today and I imagine that I will need some further time to process the conference and to gather some final reflections, but while it is relatively fresh I wanted to jot down four themes that have been raised and that have interested me in particular:
- It’s the data, stupid. A year or two ago I was won over to this way of thinking and now I see it as all the more important. I used to see myself as building a site; now I am producing well-structured data. I will, to be sure, have a site that will do what I want with my data, but the site – which can also draw on data from other sites – and the data itself are more disconnected. My product, as Tom Elliott said, is input for someone else. This would be my primary advice for those beginning new DH projects: separate the conception of data and service, and if you are in the data production business, make sure your data is interoperable enough to be used in ways you could never imagine. That is perhaps where it will make its biggest impact.
- Uncertainty. When we present cool visuals, whether heat maps, tables, or social networks, do we risk looking more certain than we actually are? The answer, of course, is yes: even texts do that, although we can mark our uncertainty in footnotes (which readers don’t read at their own peril). The appearance of authority worries some people more than it does me, but I understand the concern. At minimum I think that we all agree that ideally DH projects should have a level of explicitness and transparency that is rare for written work. Somewhere easily accessible I need to specify my data source, my encoding, and whatever code I use to process it. We rarely demand that degree of explicitness in print. Beyond that, I am happy with the production of clean, provocative visualizations just as I am happy with clearly stated, provocative textual arguments. Sebastian Heath’s fascinating project correlating the sites of Roman amphitheaters to major roads, or the Coptic Scriptorium’s tables of frequency of Coptic words, are terrific and useful in the way that all scholarship is: they model something important about reality and drive our conversations forward by inviting engagement and debate.
- Audience. We cannot and should not try to predict the eventual audiences for our data. But we are obligated to do so for our services and sites. Are we building a tool that three dozen scholars will find indispensable but that will mystify anybody else who wanders in out of curiosity, or a site that non-academics will meander through because it grabs their curiosity? Will people even be interested in our sites as wholes, or will they mainly dip in to grab the data that they want? Many of us hope – I certainly do – that our site will mainly be of interest and use to other scholars but that it will spark the curiosity of others as well. I’m sure that both of those things are partially true, but I increasingly suspect that such an ambition is nearly impossible, a bit like the multi-audience scholarly book. The sites most useful to scholars are simply hard to use. I was drawn into the amazing universe of ancient coins at nomisma.org, but only briefly: it’s a sophisticated collection of data, and if one can find a way to represent that data as a story, it is a story that requires a great deal of knowledge to interpret. “Wikipedia style” portals, such as the one that agglomerates things dealing with Syriac, are more accessible but perhaps a little less useful to the scholar. The Ancient Roman Graffiti project similarly offers fascinating material; who will use it, though, and for what purposes? What we need is a more deliberate, iterative approach to usability studies: how do people use our site and what can we change to make it more open to whatever audiences we are targeting? The same data can be accessed in many different ways and we need to think hard about the needs and desires of different audiences. We have not talked about this issue in a sustained way at the conference, but if we want our sites to have an impact we need to devote more attention to it.
- Citation. The gold standard of scholarly contribution is citation. For DH to go “mainstream” its insights must ultimately be citable and those citations must be trusted (to some reasonable degree). One significant barrier to citation has largely fallen, namely the sense that things digital are ephemeral. The increasingly standard practice of putting digital creations into stable digital repositories or archives with stable access numbers is critical in this regard. More challenging, however, is precisely the more ambitious modeling that DH allows. If I come out with a result from a model (e.g., the fastest vs. cheapest route from point A to point B) and want to use that as part of my argument in a book, how do I cite it? Ideally, I would provide the site that runs the model, the date (and/or version) of the model, and all the input parameters. That solves half the problem. The other half, though, is more difficult. Is the model sound? Has it been peer-reviewed, and if so, by whom? There is a very real problem finding reviewers qualified enough in both the subject and the technical aspects to adequately “certify” a project. There is thus a level of trust required that is unusual for other scholarly productions. The community needs to figure out ways to assure scholars that the results of DH scholarship are in fact to be trusted – at least to the degree that we trust anything at all.
More thoughts to come, and the Twitter feed, #BAM2016, is still live.