Text Mining America’s German-Language Newspapers, 1830-1914

Processing Germanness


Jana Keck

German-language newspapers flourished during the era of 19th-century mass migration to the United States. Aside from conveying local, national, and transnational news and information, they functioned as powerful tools for retaining language and preserving culture. For digital humanists, observing transnational and transcultural text circulation offers a keen reminder that “information flow” is as much a function of intimate rhizomatic accidents and technological imagination as it is of telegram networks and modal distribution. In my project, I use digitized German-language newspapers from Chronicling America and develop a computational framework to detect texts that went viral in the U.S. and to enhance the advanced search and discovery functionalities of the digital archive by classifying reprinted texts into genres. Specifically, the objective is to illustrate how texts, from hard news to poems, spread across states and decades and how these narratives constructed and preserved a German community. The resulting data can be considered a substantial set of enumerative bibliographies of German-American history for further research in the field. Studying multimodal viral phenomena in the German-language press not only sheds light on the theoretical framework of the German-language newspaper as a genre and as the voice and mirror of German ethnic life, but above all reveals patterns of intersection between multiple migrant networks and elucidates the change and continuity of migration experiences. Because newspapers are expansive and demotic, pairing computational methods with qualitative analysis of textual data in its historical and cultural contexts delivers insights into the dynamics of knowledge transfer about German culture, showing that assimilation and continuing transnational connections are neither incompatible nor binary opposites.
The guiding themes of this research are German identity in the United States, the genre of the German-language newspaper, and the benefits and limitations of text mining and machine learning.
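
To make the idea of detecting reprinted texts concrete, the following is a minimal sketch of one common approach: comparing pages by the overlap of their word n-grams ("shingles"). It is not the framework developed in this project, whose details are not given here. The function names, the page identifiers, the n-gram length, and the threshold are all illustrative assumptions, and real OCR text would likely need fuzzier matching than this.

```python
import re
from itertools import combinations


def shingles(text, n=5):
    """Return the set of word n-grams in a text (lowercased, punctuation dropped)."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_reprint_candidates(pages, n=5, threshold=0.2):
    """Return (id_a, id_b, score) for every pair of pages whose shingle
    overlap reaches the threshold.

    `pages` is a hypothetical mapping from a page identifier to its OCR text.
    The all-pairs comparison is quadratic, so this only suits small samples.
    """
    sets = {page_id: shingles(text, n) for page_id, text in pages.items()}
    candidates = []
    for p, q in combinations(sorted(sets), 2):
        score = jaccard(sets[p], sets[q])
        if score >= threshold:
            candidates.append((p, q, score))
    return candidates
```

Pairs returned by `find_reprint_candidates` are only candidates for reprinting: each would still need to be read in its historical and cultural context, and a corpus of this size would call for an indexing scheme rather than the all-pairs loop used here.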