
Course Details
Spring 2023
Clemson University
Tuesday
Meets in person unless noted otherwise.
3 credit hours.
Meets:
4-6:30pm
Location: Hardin Hall 024.
Instructor Info
Instructor:
Dr. Amanda Regan
aeregan (at) clemson.edu
Pronouns: She/Her
Office Location: Hardin Hall 004
Office Hours: My office hours are flexible and you can schedule a time to meet with me. Make an appointment for Office Hours here.
Course Description
Welcome to History 8510, Methods in Digital History II. This course in computational history will teach you to create, manipulate, explore, and visualize historical data with the goal of advancing historical arguments.
This class is the second in a series of digital history courses at Clemson University. It is designed to build on History 8500: Digital Methods I. In that course you reviewed digital history projects and methods, experimented with out-of-the-box tools, and developed your own digital research collections. In this course we will build on that foundation, and you will learn to use computational methods to analyze historical sources in the programming language R.
This course breaks down into roughly four units. The first provides an overview of computational history and looks at examples of historical scholarship that have relied on computational methods to make historical arguments. The second focuses on creating methodologically transparent and clean datasets from primary sources. You will learn to document those datasets and methodologies in a reproducible manner using GitHub and version control. The third covers the basics of programming, with an emphasis on computational thinking skills and the foundational patterns behind every programming language. Finally, in the last half of the course, we will survey various methods within computational history and ask ourselves what kinds of historical questions these methods make possible. Each week will focus on a new method: visualization, mapping, text analysis, and network analysis. By the end of this course you’ll have gained a basic familiarity and competency with all of these approaches, but more importantly you’ll understand how to ask and answer historical questions using the computational methods that are useful to you. At the conclusion of the course you will be prepared to produce a seminar paper in digital history and potentially employ digital methods for your dissertation.
Why R?
There are numerous languages we could learn in this course, but we’ll focus on R, which is common in digital history as well as data science. With roots in statistical analysis, R is a high-level programming language, meaning that it sits further from machine code than lower-level languages such as C++. However, more important than the specific language are the programming fundamentals you will learn. After this class you may decide that Python, PHP, Go, or any number of other programming languages better suits your needs based on the historical questions you want to ask. Learning these fundamentals will allow you to pivot as needed and learn new languages, syntaxes, and tools.
A Note on Data and Sources for this Class
For the most part, I’ve provided datasets for all of the assignments in this class so that you don’t have to learn the material and create a historical dataset suited to that approach at the same time. The only exception is the assignment on dataset creation, but I will provide suggested primary source repositories in case you don’t yet have a set of sources in mind. However, you are welcome (and encouraged) to use your own data for any assignment. Doing so will only further prepare you for your future classes in digital history and for your dissertation.
Learning goals:
At the conclusion of this course you will:
- Be able to use a computer programming language (R) to make historical arguments.
- Understand the basic computational patterns in programming.
- Be able to use documentation for programming languages and libraries to learn skills as they become applicable to your work.
- Be able to create historical data from primary sources.
- Understand how to clean, organize, manipulate, and document that data.
- Be able to critically evaluate existing historical data.
- Be familiar with the major computational methodologies (e.g., mapping and text analysis), their strengths and limitations, and the types of historical data and questions they are useful for.
- Be able to visualize your work and communicate your methods to other historians.
Assignments & Grades
Assignment | Percentage of Grade |
---|---|
Worksheets & Participation | 40% |
Dataset Biography | 20% |
Historical Dataset Review | 10% |
Final Project | 30% |
Assignments:
- Worksheets/Problem Sets & Participation (40%): Most weeks you will complete a worksheet based on the method for the week. With a few exceptions, these worksheets will be due the week after we discuss that method. Worksheets will be R Markdown documents, which allow you to blend code and prose, and they will ask you to practice various concepts with a provided dataset. They will contain a mix of problems ranging from easy to very hard. The goal is not perfect completion of a worksheet, and there isn’t an answer key that I’ll grade from. Rather, the goal is to practice the methodology at hand. It’s one thing to read about a methodology and the documentation for the associated R packages, but it’s another to actually apply that methodology to historical sources and ask questions of them. In other words, the goal of these worksheets is twofold. First, it is to grasp the basic programming concepts related to the methodology. Second, it is to understand what types of historical questions can be asked with that method and what kinds of arguments or interpretations it makes possible. There are a total of seven worksheets across the semester. They are listed in the schedule on the day they are due. Note that in all cases they are due before the start of class, as we will discuss and go over them together.
- #1: R Basics - due January 31st at 4:30pm
- #2: Data Structures - due February 14th at 4:30pm
- #3: Data Manipulation - due February 21st at 4:30pm
- #4: Data Visualization - due February 28th at 4:30pm
- #5: Mapping - due March 14th at 4:30pm
- #6: Text Analysis - due April 4th at 4:30pm
- #7: Topic Modeling - due April 11th at 4:30pm
- Dataset Biography (20%): Each of you will create a Dataset Biography during the semester. Creating a historical dataset is a process fraught with decisions that impact how the dataset can be used. This assignment asks you to think about and document the process of collecting information from primary sources and turning it into data. Using primary sources of your choosing (I can provide suggestions if you need them), your biography should propose a data structure and define the key elements of the dataset. What is the dataset based on? What information will be captured and how will it be stored? Will you be importing external data (e.g., historical place names from an existing dataset or information about a person’s term in office)? If so, where will that information come from? Who owns it? What are the methods behind the data collection design and process?
- Historical Dataset Review (10%): For most weeks during the semester one student will give a short presentation on a historical dataset. Much like a book review in a traditional historical seminar, this dataset review should interrogate the methods used to create the dataset. A handout will be provided in class, but your review should ask questions like: Are the data logically and consistently organized? What primary sources is the dataset based on? What is added and what is missing? Was the data processed? Are there layers of scholarly interpretation integrated into the dataset? What digital scholarly products have been produced using this dataset? How does the dataset shape the interpretation that the authors were able to reach? You’ll sign up for a date and a dataset in the first weeks of class. I’ll provide a list of suggestions; however, if there is a dataset you’d like to review that isn’t on my list, email me to discuss the potential of including it.
- Final Project: Data-Driven Historical Vignette (30%): At the conclusion of the course you will demonstrate the skills you have learned by writing a data-driven historical vignette. This vignette can use a dataset provided for class or one of your own creation, but it will use data visualizations and analysis to make a historical argument about the data. It should blend prose and visualization in a .Rmd document and include citations to other scholarship where necessary. It should be no longer than 1500 words.
Grading Scale: A (93-100), A- (90–92), B+ (87–89), B (83–86), B- (80–82), C+ (77–79), C (73–76), C- (70–72).
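As a rough illustration of how the weights and the grading scale above combine, here is a minimal sketch in Python. The function names and the example scores are hypothetical, and two choices are assumptions rather than stated policy: fractional totals are placed using the lower bound of each band (so 92.5 falls in A-), and no letter is returned below 70 because the scale above does not list one.

```python
# Weights copied from the Assignments & Grades table.
WEIGHTS = {
    "Worksheets & Participation": 0.40,
    "Dataset Biography": 0.20,
    "Historical Dataset Review": 0.10,
    "Final Project": 0.30,
}

# Lower bound of each band in the grading scale, highest first.
# Assumption: fractional totals are not rounded before lookup.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
]


def course_grade(scores):
    """Return the weighted course percentage from per-category scores (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Map a percentage to a letter, or None below the lowest listed band."""
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    return None  # the syllabus scale stops at C- (70)


# Hypothetical scores, for illustration only.
example_scores = {
    "Worksheets & Participation": 95,
    "Dataset Biography": 88,
    "Historical Dataset Review": 90,
    "Final Project": 85,
}

total = course_grade(example_scores)
print(f"{total:.1f}", letter_grade(total))
```

With these made-up scores the weighted total is 38 + 17.6 + 9 + 25.5 = 90.1, which lands in the A- band.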
Policies & Procedures
Please note that this syllabus may be updated online as necessary. The online version of this syllabus is the only authoritative version.
Late Work
Due dates for all assignments are listed on the course syllabus and in the schedule for the class. Because of the technical nature of this class, it is essential that you keep up with the worksheets. If you miss one worksheet, the others will be much harder because you will have missed key concepts. Unless otherwise stated, assignments are due on the day listed on the syllabus and Canvas. If you submit an assignment late, I will deduct 10% for every day that it is late. Assignments submitted more than 7 days after the due date will not be accepted.
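The late-work arithmetic can be sketched as follows, in Python. This is one possible reading rather than the official calculation: it assumes that a partial day counts as a full day late and that each day removes 10% of the earned score. The function name and the example dates are hypothetical.

```python
import math
from datetime import datetime

SECONDS_PER_DAY = 24 * 60 * 60
MAX_DAYS_LATE = 7          # later submissions are not accepted
PENALTY_PER_DAY = 10       # percent of the earned score, per day late


def late_adjusted_score(score, due, submitted):
    """Return the score after the late penalty, or None if not accepted.

    Assumptions (not stated in the policy): any part of a day counts as
    a full day, and the deduction is a percentage of the earned score.
    """
    if submitted <= due:
        return score
    seconds_late = (submitted - due).total_seconds()
    if seconds_late > MAX_DAYS_LATE * SECONDS_PER_DAY:
        return None
    days_late = math.ceil(seconds_late / SECONDS_PER_DAY)
    return score * (100 - PENALTY_PER_DAY * days_late) / 100


# Hypothetical example: a worksheet due at 4:30pm and turned in
# the morning two days later counts as two days late.
due = datetime(2023, 2, 21, 16, 30)
submitted = datetime(2023, 2, 23, 9, 0)
print(late_adjusted_score(90, due, submitted))
```

Under these assumptions a score of 90 submitted two days late becomes 72, and anything submitted more than seven full days after the deadline returns `None`.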
Classroom Conduct
In order to learn, we must be open to the views of people different from ourselves. In the time we share together over the semester, please honor the uniqueness of your fellow classmates and appreciate the opportunity we have to learn from one another. Please respect each other’s opinions and refrain from personal attacks or demeaning comments of any kind. Anyone who engages in hostile or antagonistic rhetoric will be asked to leave the classroom immediately.
Academic Integrity
As members of the Clemson University community, we have inherited Thomas Green Clemson’s vision of this institution as a “high seminary of learning.” Fundamental to this vision is a mutual commitment to truthfulness, honor, and responsibility, without which we cannot earn the trust and respect of others. Furthermore, we recognize that academic dishonesty detracts from the value of a Clemson degree. Therefore, we shall not tolerate lying, cheating, or stealing in any form.
All infractions of academic dishonesty by undergraduates must be reported to Undergraduate Studies for resolution through that office. In cases of plagiarism instructors may use the Plagiarism Resolution Form.
See the Undergraduate Academic Integrity Policy website for additional information and the current catalogue for the policy.
Please keep in mind that if you are copying and pasting text that you did not write yourself, you might be plagiarizing. If you are using copied text, whether pasted or retyped manually, you must be sure to accurately cite the information. Text is accurately cited when: 1) pasted text is surrounded by quotation marks or offset as a block quote and 2) the pasted text is attributed to its author and source and 3) the pasted text is cited in a footnote, endnote, or bibliography.
Student Accessibility Services
Clemson University values the diversity of our student body as a strength and a critical component of our dynamic community. Students with disabilities or temporary injuries/conditions may require accommodations due to barriers in the structure of facilities, course design, technology used for curricular purposes, or other campus resources. Students who experience a barrier to full access to this class should let the instructor know and make an appointment to meet with a staff member in Student Accessibility Services as soon as possible. You can make an appointment by calling 864-656-6848, by emailing studentaccess@lists.clemson.edu, or by visiting Suite 239 in the Academic Success Center building. Appointments are strongly encouraged – drop-ins will be seen if at all possible, but there could be a significant wait due to scheduled appointments. Students who have accommodations are strongly encouraged to request, obtain and send these to their instructors through their AIM portal as early in the semester as possible so that accommodations can be made in a timely manner. It is the student’s responsibility to follow this process each semester.
You can access further information at the Student Accessibility website. Other information is at the university’s Accessibility Portal.
Commitment to Diversity
“Clemson University aspires to create a diverse community that welcomes people of different races, cultures, ages, genders, sexual orientation, religions, socioeconomic levels, political perspectives, abilities, opinions, values and experiences.” - The Clemson University Title IX statement regarding non-discrimination
Clemson University is committed to a policy of equal opportunity for all persons and does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender, pregnancy, national origin, age, disability, veteran’s status, genetic information or protected activity in employment, educational programs and activities, admissions and financial aid. This includes a prohibition against sexual harassment and sexual violence as mandated by Title IX of the Education Amendments of 1972. This Title IX policy is located on the Campus Life website. Ms. Alesia Smith is the Clemson University Title IX Coordinator, and the Executive Director of Equity Compliance. Her office is located at 223 Brackett Hall, 864.656.0620. Remember, email is not a fully secured method of communication and should not be used to discuss Title IX issues.
Emergency Preparedness
Emergency procedures have been posted in all buildings and on all elevators. Students should be reminded to review these procedures for their own safety. All students and employees should be familiar with guidelines from the Clemson Police Department. Visit here for information about safety.
Clemson University is committed to providing a safe campus environment for students, faculty, staff, and visitors. As members of the community, we encourage you to take the following actions to be better prepared in case of an emergency:
- Ensure you are signed up for emergency alerts
- Download the Rave Guardian app to your phone (https://www.clemson.edu/cusafety/cupd/rave-guardian/)
- Learn what you can do to prepare yourself in the event of an active threat (http://www.clemson.edu/cusafety/EmergencyManagement/)
Schedule
Note: Unless stated otherwise, all reading and worksheets should be completed before class for the day that it is listed.
Tuesday, January 17, 2023
- Topic: What is Computational History? Do Historians Need to Learn to Code?
- Readings:
- David M. Berry. “The Computational Turn: Thinking about the Digital Humanities,” Cultural Machine 12 (2011). https://doi.org/10.2337/DB11-0751
- Taylor Arnold & Lauren Tilton. New Data? The Role of Statistics in DH in Debates in the Digital Humanities 2019, ed. Matthew K. Gold and Lauren F. Klein (University of Minnesota Press, 2019).
- Fred Gibbs. New Forms of History: Critiquing Data and Its Representations in The American Historian, (February 2016).
- Adam Crymble. How to Solve Programming Problems if you’re learning Programming, (2014).
- Tasks ahead of class:
- Download a plain text editor. Choices include Atom, Sublime Text, and Visual Studio Code. I prefer Visual Studio Code, but there are many options.
- Create a GitHub account. Submit your username via Canvas.
- Download Slack to at least one of your devices and join our Slack group. Check your email for an invite. Never used Slack? Look over the quick start guide.
- You should see a channel called 8510-Spring23 in the left hand bar under channels. In that channel, introduce yourself to the class.
- Tech Fluency Survey (see link on Canvas)
- Optional: We’ll be using the Palmetto Cluster for this class. However, if you’d like, you can also spend some time setting up dependencies on your machine so that you have a local environment as well. I can help as needed, but start by trying to install each of the following:
Tuesday, January 24, 2023
- Topics:
- History and Data
- Reproducible Research
- Readings:
- Anelise Hanson Shrout. “(Re)Humanizing Data: Digitally Navigating the Bellevue Almshouse” in Current Research in Digital History 1 (2018), https://doi.org/10.31835/crdh.2018.10
- Jessica Marie Johnson. “Markup Bodies: Black [Life] Studies and Slavery [Death] Studies at the Digital Crossroads” in Social Text 36, no 4, https://doi.org/10.1215/01642472-7145658
- Abraham Gibson & Cindy Ermus. The History of Science and the Science of History: Computational Methods, Algorithms, and the Future of the Field, Isis 110, no 3 (2019). http://dx.doi.org/10.1086/705543
- The Turing Way Community, Becky Arnold, Louise Bowler, et al. The Turing Way: A Handbook for Reproducible Data Science. v0.0.4, Zenodo, 2019. https://doi.org/10.5281/ZENODO.3233986.
- Geir Kjetil Sandve, Anton Nekrutenko, James Taylor, and Eivind Hovig. Ten Simple Rules for Reproducible Computational Research, PLOS Computational Biology 9(10): e1003285. https://doi.org/10.1371/journal.pcbi.1003285
- Tutorials and Activities:
- Introduction to the Bash Command Line
- Hello World: Git Tutorial
- Read through Karl Broman’s Git/GitHub Tutorial
- Assignments:
- Fork the 8510-Worksheets repository. Clone it to the Palmetto Cluster, edit the README file, and then commit and push it back to GitHub.
- Share the link to your repository on Slack and upload it to Canvas.
- Be sure that your Readme.md file explains what the repo contains (or will contain).
- If you need it, here’s a short overview of Markdown.
Tuesday, January 31, 2023
- Topic:
- Why R?
- R Basics
- Reading:
- Bruno Rodrigues. Modern R, Chapter 1: Getting to know R Studio
- Hadley Wickham & Garrett Grolemund. R for Data Science: Chapters 2 (Basics), 4 (Scripts), 6 (Projects), 8 (Data Import), and 21 (R Markdown).
- R Markdown Document
- Assignments:
- The R Basics Worksheet Due
Tuesday, February 7, 2023
- Topics:
- Information as Data (Tidy Data)
- Reading:
- Karl W. Broman and Kara H. Woo, “Data Organization in Spreadsheets,” American Statistician 72, no. 1 (2018): 2–10, https://doi.org/10.1080/00031305.2017.1375989.
- Hadley Wickham, “Tidy Data,” Journal of Statistical Software 50, no. 10 (2014).
- Watch Hadley Wickham, “Tidy Data and Tidy Tools,” NYC Open Statistical Computing Meetup, Dec. 2011.
- Assignments Due:
- Data Structures worksheet DUE (focuses on data structures, functions, and loops).
Tuesday, February 14, 2023
- Topics:
- Wrangling & Manipulating Data (Tidy Data pt 2)
- Reading:
- Wickham & Grolemund, R for Data Science, chs 5, 12, 13.
- The Tidy Tools Manifesto
- Welcome to the Tidyverse
- Introduction to dplyr
- Assignments Due:
- Create a dataset: Using the principles of Tidy Data, create a well-structured CSV file before class. You should use either a primary source dataset that you already have or one of the primary source sets linked below. Upload your dataset to Slack along with a brief description.
Tuesday, February 21, 2023
- Topics:
- Data visualization
- Reading:
- Hadley Wickham & Garrett Grolemund, R for Data Science, chs 1, 3, 22.
- Kieran Healy, Data Visualization, 1, 3, 4.
- David Staley, Computers, Visualization, and History: How New Technology will Transform Our Understanding of the Past, Introduction and Chapter 2.
- Look at:
- Assignments Due:
- Data Manipulation Worksheet Due (focuses on data manipulation using the tools provided in the Tidyverse)
- Find one historical visualization and post it in the Slack channel. In your post, describe why this visualization is useful (or why it is flawed).
Tuesday, February 28, 2023
- Topic:
- Using Exploratory Data Analysis to Ask and Answer Questions
- Reading:
- Hadley Wickham & Garrett Grolemund, R for Data Science, Chapter 7
- Roger Peng, Exploratory Data Analysis with R, Chapters 3-6
- Kieran Healy and James Moody. “Data Visualization in Sociology”, Annual Review of Sociology 40 (July 2014). DOI:10.1146/annurev-soc-071312-145551
- Assignments Due:
- Data Visualization Worksheet Due
Tuesday, March 7, 2023
- Topic:
- Creating Historical Geographic Data
- Reading:
- Jannelle Legg. “Mapping Deaf Missions.” Also look at the data in Legg’s GitHub repository.
- Cameron Blevins and Richard W. Helbock. “US Post Offices.”
- Be sure to read the Data Biography then explore Gossamer Network
- Jordan Bratt. “Geolocating the Towns from A New Nation Votes” on Mapping Early American Elections.
- Review the ggmap documentation.
- Peter Prevos. Geocoding with ggmap and the Google API
- Assignments:
- Use one of the datasets in our class’s GitHub repository and complete Peng’s Exploratory Data Analysis Checklist in a new .Rmd file. Follow all his steps beginning with Formulate a question and then use the tools described in his book and in class last week to explore that dataset.
Tuesday, March 14, 2023
- Topics:
- Maps
- Reading:
- Robert K. Nelson and Edward L. Ayers, eds., American Panorama: An Atlas of United States History
- Cameron Blevins, “Space, Nation, and the Triumph of Region: A View of the World from Houston”, Journal of American History, Volume 101, Issue 1, June 2014, Pages 122–147, https://doi.org/10.1093/jahist/jau184
- Leaflet library documentation
- You may also find the documentation on the leaflet website useful.
- sf library documentation
- tigris library documentation
- Assignments:
- Mapping Worksheet
- Data Biography Due
Tuesday, March 21, 2023
- Spring Break
Tuesday, March 28, 2023
- Topic:
- Text Analysis I: The Basics
- Reading:
- Matthew K. Gold and Lauren F. Klein et al., “Forum: Text Analysis at Scale,” in Debates in the Digital Humanities 2016 (University of Minnesota Press, 2016), 525–568.
- Kasper Welbers, Wouter Van Atteveldt, and Kenneth Benoit. “Text Analysis in R,” Communication Methods & Measures 11, no 4 (2017). https://doi.org/10.1080/19312458.2017.1387238
- Taylor Arnold, Nicolas Ballier, Paula Lisson, and Lauren Tilton. “Beyond lexical frequencies: using R for text analysis in the digital humanities,” Language Resources and Evaluation 53 (2019). https://doi.org/10.1007/s10579-019-09456-6
- Review the Documentation for:
Tuesday, April 4, 2023
- Topic:
- Text Analysis II: Topic Modeling, Text Reuse, and Clustering
- Reading:
- Joshua Catalano, “Digitally Analyzing the Uneven Ground: Language Borrowing Among Indian Treaties,” Current Research in Digital History 1 (2018): https://doi.org/10.31835/crdh.2018.02.
- Ryan Cordell, “Reprinting, Circulation, and the Network Author in Antebellum Newspapers,” American Literary History 27, no. 3 (2015): 417–445, https://doi.org/10.1093/alh/ajv028
- Andrew Goldstone and Ted Underwood, “The Quiet Transformations of Literary Studies: What Thirteen Thousand Scholars Could Tell Us” in New Literary History 45, no.3 (Summer 2014). https://doi.org/10.1353/nlh.2014.0025
- Also explore the accompanying website Quiet Transformations and a related project Signs@40: Feminist Scholarship through Four Decades
- Robert K. Nelson, Mining the Dispatch
- Maria Antoniak, Topic Modeling for the People
- Assignments:
- Text Analysis Basics Worksheet Due
Tuesday, April 11, 2023
- Topic:
- Text Analysis III: Word Embeddings (first half of class)
- Final Project Sandbox Time (second half of class)
- Readings:
- Ben Schmidt, “Vector Space Models for the Digital Humanities” (October 25, 2015).
- Ben Schmidt, “Rejecting the Gender Binary: A Vector-Space Operation” (October 30, 2015).
- Sydney Bowen. “Using Temporal Word Embeddings to Reveal the Shifting Notion of Beauty in Vogue” from Robots Reading Vogue.
- Sandeep Soni, Lauren F. Klein, Jacob Eisenstein, “Abolitionist Networks: Modeling Language Change in Nineteenth-Century Activist Newspapers,” Journal of Cultural Analytics 6, no. 1 (2021).
- Assignments:
- Topic Modeling Worksheet Due
Tuesday, April 18, 2023
- Topic:
- Networks (first half of class)
- Final Project Sandbox Time (second half of class)
- Reading:
- Yann C. Ryan and Sebastian E. Ahnert, “The Measure of the Archive: The Robustness of Network Analysis in Early Modern Correspondence,” The Journal of Cultural Analytics 6, no. 3 (2021).
- Jim Casey, “A Committee of the Whole,” Current Research in Digital History, 2 (2019).
Tuesday, April 25, 2023
- Topic:
- Computer Vision & Next steps in Computational History (first half of class)
- Final Project Sandbox Time (second half of class)
- Reading:
- Taylor Arnold, Lauren Tilton, and Annie Berke, “Visual Style in Two Network Era Sitcoms,” Journal of Cultural Analytics, 4, no 2 (2019).
- See also Taylor Arnold & Lauren Tilton, “Distant Viewing Toolkit: A Python Package for the Analysis of Visual Culture”, Journal of Open Source Software, 45, no. 5 (2020).
- Lauren Tilton, “The Visual Turn in DH,” Keynote, Digital Humanities and the Visual World Symposium
Tuesday, May 3, 2023
- Final project due by 6pm
- Public presentation of final projects for department & campus community.