
Covid-19 vaccination hashtag activism and frame:
Social Network Analysis of Twitter Data

Dec 2021
Graduate school individual research project
Quantitative research

In the wake of the Covid-19 pandemic, many people have become increasingly skeptical of vaccines and their efficacy. This skepticism has fueled the spread of conspiracy theories and misinformation, leading to increased confusion and hesitancy around vaccination. As a graduate student at Georgetown, I was interested in exploring the role of social media in shaping these attitudes and behaviors.

My project, Covid-19 Vaccination Hashtag Activism and Frame: Social Network Analysis of Twitter Data, aimed to analyze the group behavior of pro-vaccine and anti-vaccine communities on Twitter. By studying these groups, I hoped to gain insights into the ways in which they engage with each other and disseminate information about the Covid-19 vaccine.

To complete my research, I first had to teach myself a range of skills, including the R language, data collection, data cleaning, and graphic design. With these skills in hand, I was able to extract and process data from the Twitter API, pulling out more than 3000 user accounts over a three-month period using five hashtags.
Using quantitative methods, I analyzed the data to measure centrality and modularity within each group, as well as to identify key actors and influencers. Ultimately, my research revealed that the anti-vaccine group exhibited closer relationships and more cohesive behavior than the pro-vaccine group.

Through this project, I gained invaluable experience in conducting quantitative research and working with large datasets. I also deepened my understanding of the role that social media plays in shaping public attitudes and behaviors, particularly in the context of a public health crisis.


This research aims to fill a gap in the observation of group behavior under the secluded “new normal” lifestyle brought about by Covid-19.

Research type

R  terminal

Quantitative research

Research Design

To fill a gap in our knowledge of how Covid-19 vaccine information circulates on Twitter, I designed the research in several stages:

Main hypothesis formulation



Social Network Plot Production 

and conclusion 
Main hypothesis formulation

As one of the most severe diseases to affect human society, the coronavirus has dominated global attention since 2019. Clinical manifestations of coronavirus infection range from asymptomatic cases to acute respiratory distress syndrome with very severe pneumonia, septic shock, and multi-organ failure, which may result in death (Guan et al. 2020).

Compared with other nations, the United States faced a more dire situation, with conditions deteriorating while the whole world was in crisis. Six potential vaccines received $10 billion in funding through the government's Operation Warp Speed initiative, which is run by the U.S. Department of Health and Human Services (FDA, 2021). Pfizer and BioNTech applied for emergency authorization of their Covid-19 shot in the US on November 20, 2020, making them the first pharmaceutical companies to request regulatory approval of a coronavirus vaccine (Miller & Kuchler, 2020). About a month later, the US started distributing the first COVID vaccines in each of the 50 states (Stacey & Kuchler, 2020).

Although the scientific challenge of developing Covid-19 vaccines was overcome, the vaccines still faced significant logistical, distribution, and communication problems, the last of which were mostly caused by people's reluctance to be vaccinated (Savoia et al., 2021). Vaccine hesitancy is resistance to immunization at the level of the individual patient, and it spans positions ranging from cautious acceptance to outright denial (Puri et al. 2020). Hesitancy toward the COVID-19 vaccine is a complex matter with many varying viewpoints.


For instance, by examining survey data from 2650 people in all 50 US states as well as Puerto Rico, American Samoa, and Guam, Savoia et al. (2021) found that prior exposure to racial prejudice was a predictor of vaccine hesitancy.

Hanna Barczyk for NPR “Opinion: Vaccine Hesitancy In The U.S. Is A Peculiar Privilege”

Vaccines have raised public concerns about safety and effectiveness, even though they have been shown to reduce the chance of contracting an infectious disease and are among the most successful public interventions. Ariana Remmel reported in 2021 on public trust in COVID-19 vaccines in the US after the authorities, supported by scientific study, halted immunizations with the Johnson & Johnson shot in late April. The report's conclusion implies that social media carries more weight than it should in the field of immunization.

There has been prior research on the visual vaccination discussion on Twitter. The Covid-19 vaccinations, which are essential for the public's safety and health globally, present a unique problem. The purpose of this study is to close a knowledge gap on how Covid-19 vaccine information is disseminated on Twitter. Additionally, this research seeks to understand the perspectives of the many networks, communities, and groups that oppose Covid-19 vaccinations. As a result, this study aims to address the following questions:


  • What are the online behavior patterns of the anti- and pro-Covid-19 vaccine groups?

  • Which of these actors wields the most influence?

Based on the questions and relevant research, my hypothesis comes as follows:

H0: Users in the anti-vaccine group are more closely connected to each other than users in the pro-vaccine group in terms of interaction around Twitter hashtags. After analyzing the group behavior, I discuss the key actors in the two groups and how they can influence the distribution of vaccine information.


The aim of this study is to compare the dissemination of Covid-19 vaccine information in the anti- and pro-vaccine groups. To map the Twitter network and identify the key influential actors, I collected tweets posted on 15 December 2020 and in May 2021, using R and the Twitter API to extract post data. I gathered only tweets written in English that used at least one of the following hashtags: #antivaxxer, #Covidiots, #antivaxxers, #antivax, #vaccinesideEffects, #GetVaccinated, #VaccinesWork, #GetVaccinatedNow, #Vaccinated, #PassSanitaire. To select the hashtags for the collection criteria, I used Hashtagify, an online hashtag tracking tool, to assess the correlation and popularity of related hashtags. I also reviewed how people use these hashtags, especially their emotional expression, after mass production of Covid-19 vaccines began and after the pause in vaccinations with the Johnson & Johnson (J&J) shot.
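As a rough illustration of the hashtag-based selection, the sketch below labels a tweet by the hashtags it contains. The original collection was done in R with the Twitter API; this Python version is only a sketch. The function names, the case-insensitive matching, and the "both" label for tweets that use hashtags from both lists are my assumptions, not part of the original pipeline.

```python
import re

# Hashtag lists copied from the text above, lowercased for matching.
ANTI_TAGS = {"antivaxxer", "covidiots", "antivaxxers", "antivax",
             "vaccinesideeffects"}
PRO_TAGS = {"getvaccinated", "vaccineswork", "getvaccinatednow",
            "vaccinated", "passsanitaire"}


def extract_hashtags(text):
    """Return the set of lowercase hashtags found in a tweet's text."""
    return {tag.lower() for tag in re.findall(r"#(\w+)", text)}


def classify_tweet(text):
    """Label a tweet 'anti', 'pro', 'both', or None based on its hashtags.

    Assumption: how tweets using both kinds of hashtag were handled in
    the study is not stated, so they get their own label here.
    """
    tags = extract_hashtags(text)
    is_anti = bool(tags & ANTI_TAGS)
    is_pro = bool(tags & PRO_TAGS)
    if is_anti and is_pro:
        return "both"
    if is_anti:
        return "anti"
    if is_pro:
        return "pro"
    return None
```

For example, `classify_tweet("Got my shot today #GetVaccinated")` falls in the pro group, while a tweet with none of the ten hashtags returns `None` and would be excluded.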

Data collection

Social network analysis (SNA) is a method for investigating social systems by creating networks and graphs (Otte and Rousseau, 2002). SNA is used in this study to produce the overall network plot. Each node represents a user who posted using a hashtag, and each edge represents a connection between users. A connection occurs when a user mentions, retweets, replies to, or quotes another user's tweet.
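The node and edge definitions above can be sketched as follows. This is a minimal Python illustration, not the R code used in the project. The tweet record layout (the keys `user`, `mentions`, `retweet_of`, `reply_to`, `quote_of`) is hypothetical, and dropping self-interactions is my assumption.

```python
from collections import Counter


def build_edges(tweets):
    """Return a Counter mapping (source, target) user pairs to the number
    of tweets in which source mentioned, retweeted, replied to, or quoted
    target.

    Each tweet is a dict with a hypothetical layout: 'user' plus the
    optional keys 'mentions' (list), 'retweet_of', 'reply_to', 'quote_of'.
    """
    edges = Counter()
    for tweet in tweets:
        source = tweet["user"]
        targets = list(tweet.get("mentions", []))
        for key in ("retweet_of", "reply_to", "quote_of"):
            if tweet.get(key):
                targets.append(tweet[key])
        # Count each target at most once per tweet.
        for target in set(targets):
            if target != source:  # assumption: ignore self-interactions
                edges[(source, target)] += 1
    return edges


def node_set(edges):
    """Return every user that appears at either end of an edge."""
    return {user for pair in edges for user in pair}
```

The resulting edge counts can serve as weights, or the keys alone can be used as an unweighted edge list for the measures discussed later.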

Hashtag selection process

I then separated the data into two groups: anti- and pro-vaccine. To better understand each group’s behavior, I conducted a size and connectivity analysis. Following Milani et al. (2020), I considered parameters such as density and modularity. Density is the ratio between the number of detected retweets and the number of possible retweets in a network group. As for modularity, Newman (2010) describes it on a scale from 1 (highly connected) to 0 (not connected). Using density and modularity, I gained more insight into the two opposing clusters. Combining cluster size, connectivity, density, and modularity, I characterized how information flows within and out of the groups.
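To make the two measures concrete, here is a small Python sketch of density and of Newman's modularity Q for an undirected, unweighted graph. It is an illustration only: the project computed these values in R, and the function names and input layout are my assumptions. Note that Q computed this way can also come out slightly negative for a poor partition.

```python
def density(nodes, edges, directed=False):
    """Ratio of observed edges to possible edges among the given nodes."""
    n = len(nodes)
    if n < 2:
        return 0.0
    possible = n * (n - 1) if directed else n * (n - 1) / 2
    return len(edges) / possible


def modularity(edges, community):
    """Newman's modularity Q for an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping each node to its community label.
    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}    # edges with both ends in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())
```

A toy case is two triangles joined by a single edge: splitting the nodes by triangle gives a clearly positive Q, while putting every node in one community gives Q = 0.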

To explore users' behavior patterns across different hashtags on Twitter, I collected information about users who posted and retweeted at two specific points in time: one day after the first rollout of the vaccine in the United States (December 2020) and one week after the announcement of the withdrawal of the Johnson & Johnson vaccine (May 2021). I separated the hashtags into two groups. The negative (opposing) group included #antivaxxer, #Covidiots, #antivaxxers, #antivax, and #vaccinesideEffects; the positive group included #GetVaccinated, #VaccinesWork, #GetVaccinatedNow, #Vaccinated, and #PassSanitaire. I used these data to generate the corresponding networks, with users as nodes and forwarding interactions between users as edges.

To better understand the graph data, I examined the density of each network. Density measures the number of connections participants have relative to the total number of connections they could have. Expressed as a percentage, it indicates the proportion of actual linkages to the potential linkages that are observable (Maarten, et al., 2007). The density graphics are as follows.

According to the findings, in 2020 users of positive hashtags were more closely connected to each other than users of negative hashtags. However, the situation was reversed in 2021. To examine the performance of individual users in each network, I conducted a node-level analysis and measured each user's influence based on degree, betweenness, and closeness. A node is considered central if it has many direct connections with other nodes. In 2020, there was a significant difference in centrality between the positive and negative networks, with nodes in the negative network having almost twice the value of nodes in the positive network.
I also computed each network's own degree centrality. The larger difference shown by the negative network in both 2020 and 2021 is evidence of a greater concentration of power.
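The node-level and network-level degree measures can be sketched as below. This Python version is illustrative only, since the project's figures were computed in R. It treats the graph as undirected and uses Freeman's degree centralization as the network-level score, which is my assumption about what "each network's own degree centrality" refers to.

```python
def degrees(nodes, edges):
    """Count the number of edges attached to each node (undirected)."""
    deg = {node: 0 for node in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_centrality(nodes, edges):
    """Normalized degree: share of the other nodes each node touches."""
    n = len(nodes)
    if n < 2:
        return {node: 0.0 for node in nodes}
    return {node: d / (n - 1) for node, d in degrees(nodes, edges).items()}


def degree_centralization(nodes, edges):
    """Freeman's degree centralization for an undirected simple graph.

    Equals 1.0 for a star (one hub holds every connection) and 0.0 when
    every node has the same degree.
    """
    n = len(nodes)
    if n < 3:
        return 0.0
    deg = degrees(nodes, edges)
    d_max = max(deg.values())
    return sum(d_max - d for d in deg.values()) / ((n - 1) * (n - 2))
```

A higher centralization score means connections are concentrated on a few accounts, which is the sense of "concentration of power" used above.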









Closeness indicates how easily a node can reach other nodes; it is the reciprocal of the average distance to all other nodes. The larger the value, the more nodes are gathered around it. I computed the closeness of each node in each network in R and took the mean to derive the closeness of connection within each network. The result indicates that everyone in the negative network is more closely connected. The result is also statistically significant when tested, which supports the conclusion that users in the negative network are more closely connected than users in the positive network.
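The closeness calculation described above can be sketched in Python as follows. The project used R, so this is an illustration under stated assumptions: the graph is treated as undirected and unweighted, distances come from breadth-first search, and for disconnected graphs each node's closeness is computed only over the nodes it can reach.

```python
from collections import deque


def closeness(nodes, edges):
    """Closeness of each node: reciprocal of its average shortest-path
    distance to the nodes it can reach (0.0 if it reaches none)."""
    adj = {node: set() for node in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    result = {}
    for source in nodes:
        # Breadth-first search gives shortest distances in hops.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbor in adj[current]:
                if neighbor not in dist:
                    dist[neighbor] = dist[current] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        reached = len(dist) - 1
        result[source] = reached / total if total > 0 else 0.0
    return result


def mean_closeness(nodes, edges):
    """Average closeness over all nodes, used to compare networks."""
    values = closeness(nodes, edges)
    return sum(values.values()) / len(values) if values else 0.0
```

Comparing `mean_closeness` across two networks mirrors the comparison in the text; a significance test on the per-node values would be a separate step.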









