Peer-Reviewed

Applying Machine Learning Models to Classify Xenophobic Tweets Against Asians, With Data Analysis of Hate Crimes

Received: 23 September 2021    Accepted: 8 November 2021    Published: 19 November 2021
Abstract

This paper examines the effect of the COVID-19 pandemic on people's attitudes towards certain minority groups, particularly Asians, Asian-Americans, and Pacific Islanders. Since the coronavirus was first identified in Wuhan, China, xenophobia and racism towards groups associated with the supposed origins of the pandemic have been on the rise. While these groups have also suffered violent physical attacks, this paper focuses on the online hate and xenophobia that Asians face because of their race, ethnicity, country of origin, or other characteristics. Python is the primary programming language; external libraries such as pandas, NumPy, sklearn, WordCloud, and matplotlib are used for handling data. To analyze racism against Asians, Python is used to sift through Google News articles containing keywords such as “Asian Hate,” “Hate Crime,” and “anti-Asian” and to identify patterns in how these keywords are used. The frequencies of the keywords on online platforms such as Twitter are also analyzed from comma-separated files, identifying patterns of usage over time before and after the COVID-19 pandemic began. Randomly selected tweets are classified into five categories: anti-Asian, not anti-Asian, not English, hate against other racial groups, and support towards Asians. The tweets are classified by machine learning models (logistic regression, support vector machine, and Naive Bayes) trained on pre-classified data sets. The resulting classifications indicate how each tweet relates to xenophobia. This classification model is expected to be used in social media content censoring and to improve the etiquette of online chat. The goal of the model is to end anti-Asian hatred and lower the overall level of societal racism.
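The keyword-frequency step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the column names (date, text), the sample rows, and the cutoff date of 11 March 2020 (the day the WHO declared a pandemic) are all assumptions made for illustration, and only the Python standard library is used rather than the pandas workflow the paper reports.

```python
import csv
import io
from collections import Counter
from datetime import date

# Keywords named in the abstract, lower-cased for case-insensitive matching.
KEYWORDS = ("asian hate", "hate crime", "anti-asian")

# Assumed split point (WHO pandemic declaration); the paper's own cutoff is not stated here.
CUTOFF = date(2020, 3, 11)


def keyword_counts(csv_file, keywords=KEYWORDS, cutoff=CUTOFF):
    """Count rows that mention each keyword, split into 'before' and 'after' the cutoff."""
    counts = {"before": Counter(), "after": Counter()}
    for row in csv.DictReader(csv_file):
        # Hypothetical schema: an ISO date column and a free-text column.
        period = "before" if date.fromisoformat(row["date"]) < cutoff else "after"
        text = row["text"].lower()
        for keyword in keywords:
            if keyword in text:
                counts[period][keyword] += 1
    return counts


# Hypothetical sample rows, not taken from the paper's data.
SAMPLE = """date,text
2019-06-01,Local report on a hate crime trial
2020-04-02,Rally held to stop Asian hate
2020-05-10,Officials condemn anti-Asian rhetoric and hate crime
2021-03-20,Stop Asian Hate march draws thousands
"""

counts = keyword_counts(io.StringIO(SAMPLE))
for period in ("before", "after"):
    print(period, dict(counts[period]))
```

Counting rows rather than raw occurrences keeps a single text that repeats a keyword from inflating the totals; either choice would be a reasonable reading of "frequency" in the abstract.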

Published in International Journal of Science, Technology and Society (Volume 9, Issue 6)
DOI 10.11648/j.ijsts.20210906.14
Page(s) 281-288
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Asian Hate, COVID-19, Xenophobia, Racism, Online Hate

Cite This Article
  • APA Style

    Gi Joon Chang, Seoyoon Choi, Gyeongmin Han, Heuiseo Kim, Inselbag Lee. (2021). Applying Machine Learning Models to Classify Xenophobic Tweets Against Asians, With Data Analysis of Hate Crimes. International Journal of Science, Technology and Society, 9(6), 281-288. https://doi.org/10.11648/j.ijsts.20210906.14

    ACS Style

    Gi Joon Chang; Seoyoon Choi; Gyeongmin Han; Heuiseo Kim; Inselbag Lee. Applying Machine Learning Models to Classify Xenophobic Tweets Against Asians, With Data Analysis of Hate Crimes. Int. J. Sci. Technol. Soc. 2021, 9(6), 281-288. doi: 10.11648/j.ijsts.20210906.14

    AMA Style

    Gi Joon Chang, Seoyoon Choi, Gyeongmin Han, Heuiseo Kim, Inselbag Lee. Applying Machine Learning Models to Classify Xenophobic Tweets Against Asians, With Data Analysis of Hate Crimes. Int J Sci Technol Soc. 2021;9(6):281-288. doi: 10.11648/j.ijsts.20210906.14

  • @article{10.11648/j.ijsts.20210906.14,
      author = {Gi Joon Chang and Seoyoon Choi and Gyeongmin Han and Heuiseo Kim and Inselbag Lee},
      title = {Applying Machine Learning Models to Classify Xenophobic Tweets Against Asians, With Data Analysis of Hate Crimes},
      journal = {International Journal of Science, Technology and Society},
      volume = {9},
      number = {6},
      pages = {281-288},
      doi = {10.11648/j.ijsts.20210906.14},
      url = {https://doi.org/10.11648/j.ijsts.20210906.14},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijsts.20210906.14},
     year = {2021}
    }
    

  • TY  - JOUR
    T1  - Applying Machine Learning Models to Classify Xenophobic Tweets Against Asians, With Data Analysis of Hate Crimes
    AU  - Gi Joon Chang
    AU  - Seoyoon Choi
    AU  - Gyeongmin Han
    AU  - Heuiseo Kim
    AU  - Inselbag Lee
    Y1  - 2021/11/19
    PY  - 2021
    N1  - https://doi.org/10.11648/j.ijsts.20210906.14
    DO  - 10.11648/j.ijsts.20210906.14
    T2  - International Journal of Science, Technology and Society
    JF  - International Journal of Science, Technology and Society
    JO  - International Journal of Science, Technology and Society
    SP  - 281
    EP  - 288
    PB  - Science Publishing Group
    SN  - 2330-7420
    UR  - https://doi.org/10.11648/j.ijsts.20210906.14
    VL  - 9
    IS  - 6
    ER  - 

Author Information
  • Big Heart Christian School, YongIn, South Korea

  • Seoul International School, Seongnam, South Korea

  • Cardigan Mountain School, Canaan, United States

  • Palisades Park High School, Palisades Park, United States

  • St. Mark’s School, Southborough, United States
