Is there or should there be a Digital Humanities? My very short answer to both questions is “no” and “no.” In a slightly longer answer I concede that a phrase must be about something if it is gaining currency. For me the something of the term is about the trouble that the humanities have had in absorbing digital technology into their habits of work and recognition. Unlike the natural and social sciences, they have so far put the digital into a ghetto–a mutually convenient practice for those inside and outside, but probably harmful in the long run.
Finally I wrestle with the term by engaging Stanley Fish’s recent tri(bl)logy about the Digital Humanities in the New York Times. Fish actually says very little about the use of digital technology in other humanities fields but focuses on literature departments. He is an eminent Miltonist and was a major force in the world of English departments during the turbulent quarter century from the late sixties into the early nineties. On that account alone, he is worth reading.
English departments for Fish are a story of embattled regimes, insurgencies with a martyr’s and a prophet’s face, the domestication of triumphant insurgencies into a new orthodoxy, and the repetition of the cycle with the emergence of a new insurgency. He remembers “with no little nostalgia” the era of “postmodernism in all its versions.” Now he writes with a benevolent serenity spiced with dashes of cynicism.
The title of his first blog, The Old Order Changeth, might as well be plus ça change. The new insurgents are the Digital Humanists, who all of a sudden are all over the annual convention of the MLA. Whereas in the previous seven years, the sessions dedicated to things digital fluctuated between six and fifteen, with a barely discernible trend line, in 2012 there were 27. Something is going on.
In the second blog with the ironic title “The Digital Humanities and the Transcending of Mortality” Fish describes the new insurgence and its major promise (or threat) as the transformation of a “hitherto linear experience — a lone reader facing a stable text provided by an author who dictates the shape of reading by doling out information in a sequence he controls — into a multi-directional experience in which voices (and images) enter, interact and proliferate in ways that decenter the authority of the author who becomes just another participant.” He quotes Kathleen Fitzpatrick, author of Planned Obsolescence: Publishing, Technology, and the Future of the Academy and the first director of the MLA’s recently established Office of Scholarly Communication:
we need to think less about completed products and more about text in process; less about individual authorship and more about collaboration; less about originality and more about remix; less about ownership and more about sharing.
Fish the Miltonist gleefully points out the theological resonances of such “All in All” talk (Paradise Lost 3.341). He doubts whether the digital prophets would like that but is sure they will agree with it as “a left agenda (although the digital has no inherent political valence) that self-identifies with civil liberties, the elimination of boundaries, a strong First Amendment, the weakening or end of copyright and the Facebook/YouTube revolutions that have swept across the Arab world.”
As a program director and department chair during the eighties I interviewed hundreds of candidates for positions in English and Comparative Literature. To the extent that they shared a collective sensibility, I don’t remember it as being very different from the values and voices Fish imputes to today’s digital insurgents. I remember that some candidates in those days picked up the non-trivial text processing skills that it took to babysit a dissertation through a mainframe computer. They did so because the machine would automatically and accurately renumber their footnotes. For this they would do anything. This shows that Fish is right when he says that the digital has “no inherent political valence” – or any other valence for that matter. But it also shows that Fish probably is not right when he sees the problem of the Digital Humanities as a “we/plural/text and author detesting” ethos challenging an “I/singular/text and author fetishizing” ethos. English departments are full of folks who love plurals in titles and have doubts about the identity of texts or authors but for good and bad reasons want nothing to do with the digital.
What does the digital do?
In his final blog, Fish asks how “the technologies wielded by digital humanities practitioners either facilitate the work of the humanities, as it has been traditionally understood, or bring about an entirely new conception of what work in the humanities can and should be.” He takes a single sentence from Milton’s Areopagitica: “Bishops and Presbyters are the same to us both name and thing,” a prose version of the famous Miltonic line “New Presbyter is but old Priest writ large.” Fish points out that in the surrounding sentences “b’s” and “p’s” proliferate in a “veritable orgy of alliteration and consonance.” A brilliant and entirely manual little exercise in stylometry, drawing inferences from a perceived discrepancy between expected and observed occurrences of bilabial plosives.
Fish sees this exercise as an example of hypothesis-testing criticism. He begins with a “substantive interpretive proposition”—Milton believes that the former martyrs have become oppressors. Guided by that proposition he notices formal patterns and elaborates their correlation with the proposition. In his final paragraph he speaks approvingly of “a criticism that narrows meaning to the significances designed by an author, a criticism that generalizes from a text as small as half a line, a criticism that insists on the distinction between the true and the false, between what is relevant and what is noise, between what is serious and what is mere play.”
From the perspective of such a criticism Fish argues that there is not much to love in two quite different avenues of digital criticism. There is the ludic criticism whose most eloquent advocate is Stephen Ramsay. Far from seeing the critic’s duty in narrowing meaning, Ramsay celebrates the power of algorithms to proliferate meaning through playful de- and transformations of texts. And then there is text mining, where “first you run the numbers, and then you see if they prompt an interpretive hypothesis.” There is no QED or conclusion in either method.
I share Fish’s admiration for Stephen Ramsay’s playful imagination. I also share his skepticism about how far to push a ludic element in the business of interpretation, although Fish surely underestimates the power of play, at least in the severe stance he adopts in this blog. As for text mining, Fish is not quite fair to its claims and methods. To stay with the theological language that he seems to both like and dislike, proper understanding is a form of Anselm’s fides quaerens intellectum. You start with some belief and seek to support it with argument and evidence. Without such “faith”, inquiry is just a boat aimlessly drifting at sea. The larger the ocean of data, the more aimless the drift.
Have I seen text mining that answers to this description? Yes. Is it a fair account of text mining done competently? No. Take the example of Matthew Wilkens’ analysis of place names in American novels of 1851. I have not read this essay but heard the author give a talk on a different version of the same project. Fish describes the search that is not “interpretively directed” as follows: “You don’t know what you’re looking for or why you’re looking for it. How then do you proceed? … The answer is, proceed randomly or on a whim, and see what turns up.”
But that is not how Wilkens proceeded. Instead he asked the quite precise question “What can we learn about a group of related novels by looking at the distribution of place names in them?” This question rests on the well-tested hypothesis that the distribution of proper nouns in a document will tell you quite a bit about it. In the digital realm “named entity extraction” is an important subfield of Natural Language Processing, but it has a venerable manual equivalent in the genre of the Index Nominum, which is almost as old as the printed book. Many a book has been read on the principle of “Tell me whom you quote, and I tell you what you wrote.”
Extracting place names from a set of novels is a form of “distant reading.” The term, which is Franco Moretti’s, is clearly a polemical challenge to “close reading.” Pierre Bayard’s amusing How to Talk About Books You Haven’t Read provides ample evidence that “not-reading” is an ancient and inescapable practice. Fish is quite comfortable with it himself when he bases his analysis of the Advent of Digital Humanities on a reading of the titles of MLA sessions and papers. To vary Fish, “Don’t you have to actually read the papers, before saying what the patterns discovered in them mean?”
“Yes and no,” the answer might be. Fish looks at “distant reading” and says “no thank you.” Wilkens makes a more nuanced and modest case. He presents a scenario in which the members of the profession either practice close reading on the same few dozen novels over and over again or develop new practices in which you use methods developed in Natural Language Processing to perform rough mapping operations that are then followed by a targeted examination of selected examples. I have called this technique “scalable reading.”
How these practices will shape literary analysis remains to be seen. We are very much at the beginning of an era. Speaking for myself and as a former Miltonist, the uncertainty of methods, tools, goals, and outcomes in the enterprise of digitally assisted literary analysis is captured in the comparison of Satan’s shield to the moon as seen by Galileo through his telescope, a wonderfully prophetic image of the power of search tools:
his ponderous shield
Ethereal temper, massy, large and round
Behind him cast; the broad circumference
Hung on his shoulders like the Moon, whose Orb
Through Optic Glass the Tuscan Artist views
At Ev’ning from the top of Fesole,
Or in Valdarno, to descry new Lands,
Rivers or Mountains in her spotty Globe.
(Paradise Lost 1.284-91)
Useful tools for mapping
There are two additional points. First, when it comes to the analysis of canonical texts by highly skilled readers with decades of experience, it is not likely that machines will add much insight, although they may help in producing new forms of confirming evidence – digital helpers in August Boeckh’s definition of the philological enterprise as “the further knowing of the already known.” I have been reading Kahneman’s Thinking, Fast and Slow about the System 1 and System 2 of our minds, how some skills become second nature and move from System 2 to System 1 where they are practiced automatically. Thus an attendant in an indoor garage will drive my car at speeds that make my hair stand up. So it is with Stanley Fish, a superbly gifted reader who draws on decades of his own experience and that of the Milton guild when he reads a sentence in Areopagitica. His System 1 just “sees” the pattern of a sentence and its expanding context.
The best a computer could do in such a case would be to offer a sliding window algorithm that works through 500 word stretches at 100 word intervals and measures the clustering of bilabial plosives. It might confirm the observation of a veritable orgy of them. Or it might demonstrate that actually there are not that many more of them, but that they are organized into a pattern by the chiastic structure of bishop/presbyter. This is not a matter of burning interest to a lot of readers. On the other hand, if you are interested in the count and placement of bilabial plosives at all you might as well go the whole hog.
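The sliding window just described can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not anything Fish or anyone else has actually run: the function names are made up for the example, and the letters "b" and "p" stand in as a crude orthographic proxy for bilabial plosives (the "p" in "prophet" counts, but so does the "p" in its "ph", and so does the silent "b" in "doubt"). The 500-word window and 100-word step are the figures from the paragraph above.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (ASCII letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def count_bp(word):
    """Count the letters b and p in a word: a rough proxy, not a phonetic analysis."""
    return word.count("b") + word.count("p")


def sliding_bp_density(text, window=500, step=100):
    """Return a list of (start_word_index, b/p letters per word) pairs,
    one per window. A text shorter than one window yields a single
    pair covering the whole text; an empty text yields an empty list."""
    counts = [count_bp(w) for w in tokenize(text)]
    last_start = max(len(counts) - window, 0)
    results = []
    for start in range(0, last_start + 1, step):
        chunk = counts[start:start + window]
        if chunk:
            results.append((start, sum(chunk) / len(chunk)))
    return results
```

Running `sliding_bp_density` over the full text of Areopagitica and looking for windows whose density stands well above the rest would be the machine's version of noticing the "orgy." Whether a peak is significant, and what it signifies, remains a question for the reader.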
If the computer is not likely to be of much use in the kind of situation that is exemplified by a Stanley Fish turning to a page of Areopagitica, it may nonetheless be an increasingly useful tool in helping with the mapping operations that lay the groundwork for deeper understanding. The German classicist Karl Reinhardt, torn all his life between Wilamowitz’s Altertumswissenschaft and Nietzschean hermeneutics, wrote that
it is part of philological awareness that one deals with phenomena that transcend it. How can one even try to approach the heart of a poem with philological interpretation? And yet, philological interpretation can protect one from errors of the heart.
This is the best statement I know of the necessary modesty that is so important an element of good literary criticism. Philological tools and techniques, whether digital or not, operate within the limits of their domain. But if you use them well and with an acute awareness of their limits, they offer some protection against error and may help you look beyond those limits. Like other tools, computers may open doors, but walking through them will always remain your task. The “last mile” of Boeckhian understanding is forever receding and will always need to be walked.
Diggable and re-diggable data
My second point is a quibble with the sentence “Digitize the entire corpus and you can put questions to it and get answers in a matter of seconds.” An algorithm that takes seconds or minutes to execute may depend on data that it took weeks to prepare, and it may spit out results that it takes days to analyze. Computers may save time, but they also create a lot of new work. “Digitize the entire corpus” is easily said, but quite hard to do. Several years ago I served on a review panel for the NEH competition “Digging into Data.” There were very few “diggable data” then, and there are still very few diggable data now if you think of the range of questions literary scholars are likely to address to textual data of various kinds.
Digitization projects must make some assumptions, tacit or explicit, about the uses to which the data will be put. The default assumption in most digitization projects is that the texts will be served up as surrogates for human reading. Such texts will support simple keyword searching, but they do not add up to machine-actionable data sets that support complex forms of manipulation or analysis.
In 2001 Jerry McGann wrote: “In the next fifty years the entirety of our inherited archive of cultural works will have to be re-edited within a network of digital storage, access, and dissemination. This system, which is already under development, is transnational and transcultural.” In such a system, you would hope for a high degree of “interoperability” in the sense that machines can perform at least a few of the things that human readers do when they pick up one book from one shelf, another book from another shelf, and put things together in the serendipitous and messy manner that Stephen Ramsay calls “Hermeneutically Screwing Around.” In the classic American research library of the 20th century the Library of Congress classification guaranteed the degree of interoperability that made it easier to find books on shelves. Interoperability beyond that point was left to the remarkable skills and caprice of the all-terrain vehicle known as ‘human reader.’
A decade into the half-century of digital editing, it is, alas, not possible to say that we have come 20% of the way. A reading (whether close or distant) of the MLA sessions on things digital is likely to lead to the melancholy conclusion that the profession has not yet focused on the challenges of rebuilding the documentary infrastructure of primary data in ways that will let scholars do new things with old data in digital form.
In projects of all kinds, digital or not, you must often do a lot “to” your stuff before you can do much “with” it. Scale or “Big Data” is a common challenge to maximizing the power of the computer in any domain. The Economist in a piece about the “data deluge” reported that it took the Nestlé corporation an entire decade to get their disparate data into a shape that allowed analysts to do useful things with them. In the life sciences enterprises like GenBank, an “annotated collection of all publicly available DNA sequences,” speak to the commitment of an entire discipline to the collaborative construction of sharable data sets that provide the framework for the development and testing of new hypotheses.
When Theodor Mommsen in 1854 published the first volume of his famous Roman History he had already begun the massively collaborative project Corpus Inscriptionum Latinarum, which by the beginning of World War I had created the highly systematic and “interoperable” edition of Roman inscriptions that fundamentally changed the documentary infrastructure for the study of Roman legal and administrative practices. Within the scope of existing technologies this project was the work of many hands and minds, doing things “to” data in such ways that other hands and minds could do different and unforeseen things “with” them. It made Latin inscriptions “diggable” and “re-diggable” in ways that they had not been.
In a similar way, creating digitally rediggable data will be a big challenge for humanities disciplines. It is, if you will, a Falstaffian task, in which the individual hands and minds can each say of themselves: “I am not only witty in myself, but the cause that wit is in other men” (2 Henry IV 1.2.9). So far, truly re-diggable and multiply recombinable data in the humanities remain few and far between. There is a chicken-and-egg problem here: what comes first, the insights that organize the data or the data in a format that prompts questions and creates the hope of answering them within a time frame that makes their pursuit quite literally “worthwhile”?
No Messiahs, please
As I said earlier, Fish talks about a very small slice of the domain poorly encompassed by the phrase “Digital Humanities.” It happens to be a slice I am interested in, but it is worth repeating that archaeologists, art historians, epigraphers, historians, linguists, musicologists, or papyrologists would find little in his blog entries that speaks to the many ways in which they find the ‘digital’ helpful or indispensable to their projects.
Like Fish, I worked my way through the MLA Program. I was most taken with the abstract of a talk by Alison Byerly, the Provost at Middlebury College. She observes the internal conflict of the two most commonly used terms, “Digital Humanities” and “New Media.” The implicit stance of such rhetoric is “Marcionite” (my term). Media that are “new” and humanities that are “digital” have a New Testament that makes the old one superfluous. I am not sure how many “DH folks” actually think that way. But some do, and the rhetoric has its own dynamic, with mostly unhelpful consequences.
It is different in most other disciplines. There are no self-proclaimed digital biologists, chemists, or economists, but for many practitioners in those disciplines digital tools and methods have become essential parts of their engagement with the primary data in their fields — leaving aside the matter of writing and publishing research results, which is going digital in all fields, including the humanities, albeit at different rates.
Byerly and Fish seem to be at one in their distrust of the Messianic, but Byerly, if I extrapolate correctly from the abstract of her paper, may argue for a patient, practical, and incremental engagement of ‘old’ and ‘new’, ‘digital’ and ‘analog’ with a view to a future in which those distinctions fade away. Messianic impulses are hard to curb. Some years ago the historian Dan Cohen gave a talk in which he asked whether you could think of a digital project that could compare with Jenner’s discovery of the smallpox vaccine. Implicit in the question is the idea that a new technology must legitimate itself with some spectacular breakthrough. But that may not be the best way of measuring the impact of technology over time.
If you need a prophet, the anti-Messianic Douglas Engelbart may be the better guide. In his famous essay about Augmenting Human Intellect he said:
You’re probably waiting for something impressive. What I’m trying to prime you for, though, is the realization that the impressive new tricks all are based upon lots of changes in the little things you do. This computerized system is used over and over again to help me do little things – where my methods and ways of handling little things are changed until, lo, they’ve added up and suddenly I can do impressive new things.
“Lots of changes in the little things you do.” If “you” are a scholar in some humanities discipline, there will be a lot of difference in the little things that stand in the way of getting on with your project. Overcoming them one by one, singly or collaboratively, may at some point add up to Hippolyta’s vision:
But all the story of the night told over,
And all their minds transfigured so together,
More witnesseth than fancy’s images
And grows to something of great constancy;
But, howsoever, strange and admirable.
(A Midsummer Night’s Dream 5.1.23-27)
But it will take a while.