2) Enabling new research. Every project represented in this workshop is expanding the frontiers of research beyond what was possible in print. I would highlight a project now getting under way that pursues a very traditional goal in a way that is only possible through the combination of advanced automated methods, very powerful computing and very large collections. The history of Latin is hardly a new subject, but we can place its study on a radically new footing. David A. Smith, who holds a Ph.D. in Computer Science and a B.A. in Classics, is one of the principal investigators for a four-year, $2.5 million project funded by the National Science Foundation (NSF award #0910165, "Collaborative Research: Mining a Million Scanned Books: Linguistic and Structure Analysis, Fast Expanded Search, and Improved OCR"). He has downloaded more than 1.5 million books from the Internet Archive and from these has identified twelve thousand whose language is listed as Latin. The resulting collection contains approximately 1.8 billion words of Latin, almost two hundred times as much Latin as the ten-million-word database on which the TLL has labored for more than a century. If and when we should have access to the twelve million books that Google has already scanned, the collection of available Latin will only increase. Library metadata is, however, rough: we cannot, for example, distinguish the nineteenth-century Latin introduction from the edition of Cicero that it accompanies, and many books are cataloged by the date of the published edition (e.g., 1879 in Paris) rather than that of their original creation (e.g., 1623 in Leiden). There are multiple editions of the same author (e.g., ten editions of Horace). Organizing this rough assemblage will provide plenty of opportunity for advanced automated methods. Nevertheless, we already have in hand the raw materials with which to rewrite the history of the Latin language over the course of two thousand years. Automated analysis, checked by systematic sampling to evaluate its error rates, redefines the way in which we can conceptualize new research in this subject.
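To make the idea of systematic sampling concrete, here is a minimal sketch in Python. It is an illustration only, not the method of the project described above. The catalogue is synthetic, and the field name `label_is_wrong` and both function names are hypothetical. The sketch takes evenly spaced records from a random starting point, counts how many a human checker would flag as mislabeled, and reports a Wilson score interval for the underlying error rate.

```python
import math
import random


def systematic_sample(items, sample_size, seed=None):
    """Return sample_size items spaced evenly through items, from a random start."""
    n = len(items)
    if not 1 <= sample_size <= n:
        raise ValueError("sample_size must be between 1 and len(items)")
    step = n / sample_size
    start = random.Random(seed).random() * step
    # Clamp the index in case floating-point rounding lands exactly on n.
    return [items[min(int(start + i * step), n - 1)] for i in range(sample_size)]


def wilson_interval(errors, checked, z=1.96):
    """Wilson score interval for an error rate (z=1.96 gives roughly 95%)."""
    if checked <= 0:
        raise ValueError("checked must be positive")
    p = errors / checked
    denom = 1 + z * z / checked
    centre = (p + z * z / (2 * checked)) / denom
    half = z * math.sqrt(p * (1 - p) / checked + z * z / (4 * checked ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Synthetic stand-in for a catalogue of 12,000 books. The flag simulates the
# verdict of a human checker; it is made-up data, not a measured error rate.
rng = random.Random(0)
catalogue = [{"id": i, "label_is_wrong": rng.random() < 0.1} for i in range(12_000)]

sample = systematic_sample(catalogue, sample_size=200, seed=1)
errors = sum(record["label_is_wrong"] for record in sample)
low, high = wilson_interval(errors, len(sample))
print(f"{errors}/{len(sample)} sampled labels wrong; 95% interval {low:.3f} to {high:.3f}")
```

One caution applies to any real collection: if the catalogue is ordered so that errors recur at regular intervals, evenly spaced sampling can badly over- or under-count them, so the list should be shuffled or the sampling interval varied.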

In print culture, Arabic-speaking scholars of the Greco-Roman world had little access to, and less visibility within, the largely English, French, German and Italian publication space of classical scholarship. In a networked world where knowledge bases such as treebanks emerge as pre-eminent channels through which to publish interpretations of literary texts, the first language of the scholar becomes less important. We are better positioned to establish new intellectual and collegial relationships across challenging barriers of space, language and culture.

3) Redefining who can contribute to scholarship. In this regard, Wikipedia remains a historic phenomenon because it has demonstrated a new mode of intellectual production, one that this philologist thought was at best implausible until Roy Rosenzweig confronted my prejudices with evidence and analysis. His arguments can be found in Rosenzweig 2006. Classicists had developed their own community-driven project with the Suda Online (SOL), which has so far produced English translations for more than 27,000 entries from a large tenth-century Byzantine Greek historical encyclopedia of the ancient world. The SOL, however, mobilized professional scholars and included a fairly traditional editorial process. For an overview of the SOL and its editorial process, see Mahoney 2009. The most important project for Classical scholarship in the United States may be the Homer Multitext, because this project demonstrated not only what undergraduates could do in a very complex project but also the effect of participation in this project on their work and on their view of classics. The Homer Multitext defied my own personal expectations as to what undergraduates would do or would find interesting.

