Mining My Day is a Shiny app that visualizes the common themes, or ‘topics’, in Eleanor Roosevelt’s My Day columns. You can read the blog post about the visualization or jump straight into exploring the app.
From December 1935 until September 1962, Roosevelt authored a nationally syndicated newspaper column entitled My Day. Because Roosevelt did not keep any sort of diary and her appointment books were often incomplete, the My Day columns have become a valuable source for historians. The columns, authored six days a week until January 1961 and three days a week thereafter, represent the only consistent account of her public activities. Thanks to the work of the Eleanor Roosevelt Papers’ staff, who transcribed each My Day column and made the transcriptions available online, anyone with internet access can now explore this important piece of history.
The online repository of My Day columns allows users to look for columns by year or find a particular word or phrase within the collection through keyword search. Keyword search, however, requires a level of specificity and familiarity with the topics discussed within the archive. A topic model allows for a more thematic and interactive way of exploring the archive. By providing a thematic overview, it gives users a different entry point into the archive, one generated by the content (words) of the documents rather than by a predetermined organizing schema such as Library of Congress subject headings or even keywords.
In collaboration with Joshua Catalano, we used MALLET to divide all of the 7,987 My Day columns (as plain text files) into 70 ‘topics.’ Each ‘topic’ is a set of words that frequently co-occur throughout the corpus. These words indicate themes or discourses within the corpus. With Python and R, we then transformed the MALLET output into several different types of visualizations using Shiny and a number of other packages. The benefit of using Shiny is that it allows users to change specific input parameters such as the topic, the number of words, and the number of documents to display. The resulting app includes a word cloud representing the top words in the topic, a chart depicting the prominence of the topic over time, a map detailing the locations where the most prominent columns in the topic were written, and a list of the prominent columns in the topic (with a link to the text). Additionally, a tab at the bottom of the page allows users to view the top words in all of the topics.
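To give a rough sense of what transforming the MALLET output can involve, here is a minimal Python sketch that pulls the top words for each topic out of a topic-keys listing. It is not the code behind the app. It assumes the tab-separated layout MALLET writes with its --output-topic-keys option (topic number, a weight, then the topic's words separated by spaces). The function name parse_topic_keys, the n_words parameter, and the sample words are all invented for illustration.

```python
from typing import Dict, List


def parse_topic_keys(text: str, n_words: int = 10) -> Dict[int, List[str]]:
    """Map each topic number to its first n_words top words.

    Assumes each non-empty line looks like:
    <topic number> TAB <weight> TAB <space-separated words>
    """
    topics: Dict[int, List[str]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        topic_id, _weight, words = line.split("\t", 2)
        topics[int(topic_id)] = words.split()[:n_words]
    return topics


# Invented sample data, NOT taken from the My Day topic model.
SAMPLE_KEYS = (
    "0\t0.05\twar peace nations world united\n"
    "1\t0.05\tchildren school education young teachers\n"
)

top_words = parse_topic_keys(SAMPLE_KEYS, n_words=3)
for topic_id, words in top_words.items():
    print(topic_id, " ".join(words))
```

A dictionary like top_words is the kind of structure that could feed a word cloud or a table of top words, with n_words playing the role of the "number of words" input that the app lets users adjust.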