Project Methodology

James W. Pennebaker and Ashwini Ashokkumar, The Pandemic Project and the University of Texas at Austin

May 7, 2020

Never has the world mobilized so quickly against, and felt the effects of, a contagious deadly disease. Across most countries, people have been encouraged or required to remain in their homes for weeks and months while overburdened medical facilities treat the sick and dying. The biological, medical, and epidemiological patterns of the coronavirus, or COVID-19, are being studied and reported in great detail. Fewer research projects are examining the social and psychological effects of the outbreak.

The Pandemic Project follows from previous research on the social dynamics of large-scale disasters and upheavals. The current research team and many others have found striking differences in the ways that entire communities and even nations respond to both natural and man-made disruptions. Further, reactions to upheavals are highly dynamic, with rapid changes in attitudes and behaviors unfolding over time. A common observation is that in the days and weeks after an event, people come together. They offer support, congregate at memorial services and rallies, and talk more freely with friends and strangers. Despite these early periods of cohesiveness, later periods of discord and recrimination are not uncommon.

The COVID-19 outbreak is unique among major upheavals in the last few generations because the social dynamics are so different. Rather than encouraging social connections, we are all discouraged from being close to others. Our streets are quiet and most places of work, leisure, and worship are empty. Entire countries are full of people sitting quietly in their homes watching movies, reading, or interacting with others on social media.

How is COVID shaping our lives? The Pandemic Project was launched to answer this question. Using a variety of research methods, we will be releasing a number of brief reports about what we are finding. The goal is to provide general information to the public as well as to those in government, business, and healthcare about the ways people are thinking, feeling, and behaving.

The Research Methods

The Pandemic Project has several goals and associated research questions, many of which rely on very different types of research methods.

The COVID-19 Survey

Although COVID was initially diagnosed in December 2019, it wasn’t considered a serious problem until late January 2020. Even through the end of February, most U.S. government officials were dismissive about its potential impact despite strident warnings from officials within the Centers for Disease Control and Prevention (CDC) and National Institutes of Health (NIH). By the first week of March, the U.S. public started becoming aware of the approaching crisis and, on March 11, the World Health Organization announced that COVID-19 was an international pandemic. For the purposes of this project, we consider Wednesday, March 11, 2020 the first day of the U.S. COVID-19 crisis.

The first COVID survey from our lab was constructed between March 12 and March 18, with pilot versions completed by current and former graduate students, close friends, and family. Version 1 of the survey was officially released a little after midnight (CDT) on Thursday, March 19. Version 1 continued to collect responses until Friday, March 27 when Version 2 was released.

Survey content. Version 1 of the survey included 54 questions and was launched on a website developed by Sashank Macharla. The survey itself was built on a Qualtrics platform that was able to provide individual feedback to participants about their responses. In addition to basic demographic information, one of the first questions was an open-ended item that asked:

… spend at least 5 minutes writing about your thoughts and feelings related to the COVID-19 outbreak. Specifically, when you think about the coronavirus and its effects on your life, what feelings and thoughts come to your mind?

This question was used in all subsequent versions of the survey and provided personal perspectives that were largely untainted by the theories or beliefs of the researchers. Later sections of the survey asked respondents about their daily social, work-related, and everyday behaviors, their reliance on social media, and their beliefs about the possible causes of the virus’s spread. Other items asked about their coping methods and beliefs about the likely time course and implications of the disease. Finally, people were given the option of providing their Twitter and Reddit handles. On the feedback page, participants received feedback about their responses and were given the option to download it or have it emailed to them. They were informed that if they provided their email, we might contact them in the future for help in completing additional surveys.

Version 2, which was released on March 28, contained about 80% of the items from the original version. Because new issues and concerns were raised about public perceptions of the disease and its likely long-term impact, a series of additional items was added. The new version had 64 items.

Version 3 was released on April 30, just as the U.S. President and many state governors began opening up businesses to help spur the economy. The third version had 63 items and contained about 80% of the items from Version 2. The primary additions were questions focused on how people were thinking about transitioning out of over six weeks of isolation.

Sampling strategies and participants. The surveys were completed by three separate samples:

Snowball sample. The first COVID survey was officially launched on March 19, 2020. Links to the COVID survey were sent to current and former graduate students, other colleagues in psychology, and friends and family members with requests to forward links to friends and others. The links were also passed to news media and shared on social media through Twitter, Reddit, and Facebook. People who completed the survey were encouraged to forward the link to friends as well. Over the 8 days the survey was live, 11,290 people opened the survey and 9,981 completed the demographics page. Unfortunately, about 35% of participants dropped out once they reached the item asking them to respond to the open-ended writing prompt. Those who dropped out were more likely to be male (25 vs 17 percent) and younger (38 vs 43 years old).

The final full sample of those who completed the writing task and post-writing survey items was 6,588 respondents. As can be seen in Table 1, the sample was largely American and Canadian (83.7% and 5.1%, respectively), female, middle-aged, and well-educated. Overall, 54.6% were employed full time, 13.2% were retired, 13% were students, and 8.8% were employed part time. Clearly, this was not a random sample. Nevertheless, the responses were consistent with the other Version 1 samples.
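The attrition figures above can be checked with a few lines of arithmetic. The sketch below uses only the three counts reported in the text; the variable and function names are illustrative, and it assumes that everyone in the final sample also completed the demographics page. Under that assumption the loss after the demographics page comes out near 34%, in line with the "about 35%" quoted above.

```python
# Counts reported in the text for the Version 1 snowball sample.
opened = 11_290        # people who opened the survey
demographics = 9_981   # people who completed the demographics page
full_sample = 6_588    # completed the writing task and post-writing items


def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of openers who finished the demographics page.
reached_demographics = pct(demographics, opened)
# Share of demographics completers who ended up in the final sample.
retained_after_demographics = pct(full_sample, demographics)
# Share of demographics completers lost before the final sample.
lost_after_demographics = pct(demographics - full_sample, demographics)

print(reached_demographics, retained_after_demographics, lost_after_demographics)
```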











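[Table 1. Rows: Snowball V1; Snowball V2; Snowball V3; Prolific V1 (3/25, 3/27); Prolific V2 (April 4, 7, 11, 13, 16, 20, 24); Prolific V3 (May 1, 5); Student V2. Surviving column header: Employed FT %. Cell values are missing from this copy.]
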
Table 1. Demographic information for the three versions of the COVID-19 Survey, across the Snowball and Prolific samples. U.S. refers to percent of people who live in the United States; Female refers to percent who identify as female; Age is expressed in years; College is percentage of respondents with a college degree or higher; Employed refers to percentage who are employed full time. Note that a subset of both the Snowball and Prolific samples participated in two versions of the questionnaire.

Prolific samples. A common way to collect online survey responses is through crowdsourcing companies such as Amazon MTurk or Prolific. Each company maintains a large curated group of workers who are available to complete questionnaires. For all versions, groups of Prolific workers completed the COVID survey, starting on March 25. Compared with the snowball sample, the Prolific samples were more balanced in terms of sex, much younger, and somewhat less educated, although the group was still highly educated. About 50% were employed full time, and another 19.7% were full-time students.

College student sample. Approximately 530 students enrolled in a large online Introductory Psychology class at The University of Texas at Austin were given a reading assignment that included taking Version 1 of the COVID survey. Although they were not required to take the survey, they were expected to have learned the research material discussed in the feedback section of the questionnaire (including material based on Wendy Wood’s research on habits and Roxane Silver’s research on the effects of media exposure). Overall, 428 of the students completed all or most of the survey. Table 2 provides basic demographics. The college sample was made up of people who identified as Asian or Asian American (29.7%), Black or African American (6.1%), Hispanic or Latinx (25.3%), White or European American (35.5%), or other groups (3.4%).

Reddit sample

Reddit is a popular social media website on which users participate in discussion-based communities (“subreddits”). The public Reddit API provides researchers with access to all posts on the website. Note that people on Reddit use handles, as opposed to real names, which protects the identity of the users. For the current project, 1,484,151 Reddit comments posted by 158,531 users between January 10, 2020, and April 10, 2020, on 8 city subreddits were downloaded. The cities included New York City, Seattle, Austin, Boston, Houston, Chicago, Los Angeles, and Portland. Comments that were posted by bots or that had fewer than 15 words were excluded. The final dataset had 762,098 Reddit comments posted by 105,412 users. Although we cannot get information about users’ individual demographics, aggregated demographic information is available. Reddit’s users are largely from the US (over 50%) and tend to be male (about 67-69%) and younger (64% of users are below age 29). Most Redditors have some college education or a degree. The ethnic distribution of Reddit users follows US population trends.
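The exclusion rule described above (drop bot comments and comments with fewer than 15 words) could be sketched roughly as follows. This is an illustration rather than the project's actual pipeline. The text does not say how bots were identified, so the `KNOWN_BOTS` list, the name-suffix heuristic, and the `Comment` structure are all assumptions, as is counting words by splitting on whitespace.

```python
from dataclasses import dataclass

MIN_WORDS = 15  # threshold stated in the text

# Hypothetical bot handles; the text does not say how bots were identified.
KNOWN_BOTS = {"AutoModerator"}


@dataclass
class Comment:
    author: str  # Reddit handle
    body: str    # comment text


def is_probable_bot(author: str) -> bool:
    """Assumed heuristic: a listed handle or a name ending in 'bot'."""
    return author in KNOWN_BOTS or author.lower().endswith("bot")


def keep_comment(comment: Comment) -> bool:
    """Keep comments from non-bot authors that have at least MIN_WORDS words."""
    if is_probable_bot(comment.author):
        return False
    # Whitespace-delimited tokens stand in for words (an assumption).
    return len(comment.body.split()) >= MIN_WORDS


def filter_comments(comments: list[Comment]) -> list[Comment]:
    """Apply the exclusion rule to a list of downloaded comments."""
    return [c for c in comments if keep_comment(c)]
```

In the reported data, these exclusions left 762,098 of the 1,484,151 downloaded comments, or about 51%.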

Working References

Barry, J.M. (2005). The great influenza: The story of the deadliest epidemic in history. New York: Penguin.

Bonanno, G. A., Brewin, C. R., Kaniasty, K., & La Greca, A. M. (2010). Weighing the costs of disaster: Consequences, risks, and resilience in individuals, families, and communities. Psychological Science in the Public Interest, 11(1), 1-49.

Brancati, D. (2007). Political aftershocks: The impact of earthquakes on intrastate conflict. Journal of Conflict Resolution, 51(5), 715-743. [Shows that earthquakes — especially larger ones — increase conflict because of limited resources. Studied 185 countries from 1995-2002]

Cohn, M.A., Mehl, M.R., & Pennebaker, J.W. (2004). Linguistic Markers of Psychological Change Surrounding September 11, 2001. Psychological Science, 15, 687-693.

Drury, J., Novelli, D., & Stott, C. (2013). Psychological disaster myths in the perception and management of mass emergencies. Journal of Applied Social Psychology, 43(11), 2259-2270. [All data show that disasters bring people together but the beliefs are that crowds and people go wild]

Solnit, R. (2009). A paradise built in hell: The extraordinary communities that arise in disaster. New York: Viking.

Fritz, C. E., & Williams, H. B. (1957). The human being in disasters: A research perspective. The Annals of the American Academy of Political and Social Science, 309, 42–51.

Gortner, E.M., & Pennebaker, J.W. (2003). The anatomy of a disaster: Media coverage and community-wide health effects of the Texas A&M Bonfire tragedy. Journal of Social and Clinical Psychology, 22, 580-603.

Garfin, D. R., Silver, R. C., & Holman, E. A. (2020). The novel coronavirus (COVID-2019) outbreak: Amplification of public health consequences by media exposure. Health Psychology. Advance online publication.

Kaniasty, K., & Norris, F. H. (1995). In search of altruistic community: Patterns of social support mobilization following Hurricane Hugo. American Journal of Community Psychology, 23(4), 447-477. [people receive high levels of social and material support immediately after a disaster but in the longer term, most support diminishes especially for those who need it most]

Norris, F. H., Friedman, M. J., Watson, P. J., Byrne, C. M., Diaz, E., & Kaniasty, K. (2002). 60,000 disaster victims speak: Part I. An empirical review of the empirical literature, 1981–2001. Psychiatry: Interpersonal and biological processes, 65(3), 207-239. [Project on major horrible things really focusing on long term psychological data. Disasters are quite unsettling — especially young people].

Mehl, M.R., & Pennebaker, J.W. (2003). The social dynamics of a cultural upheaval: Social interactions surrounding September 11, 2001. Psychological Science, 14, 579-585.

Pennebaker, J.W. & Newtson, D. (1983). Observation of a unique event: Psychological impact of Mt. St. Helens. In H. Reis (Ed.), Naturalistic approaches to studying social interaction (pp. 93-109). San Francisco: Jossey-Bass.

Pennebaker, J.W. & Harber, K.D. (1993). A social stage model of collective coping: The Persian Gulf War and other natural disasters. Journal of Social Issues, 49, 125-145.

Silver, R. C., Holman, E. A., McIntosh, D. N., Poulin, M., & Gil-Rivas, V. (2002). Nationwide longitudinal study of psychological responses to September 11. JAMA, 288(10), 1235-1244.

Author contact information

James W. Pennebaker

Ashwini Ashokkumar
