This semester I am teaching two new courses, “Antisemitism: A History” and “Introduction to Digital Humanities.”
Except, of course, I’m not anymore. Better: I’m paused. I’ll pick the courses back up, online, in a bit over a week.
The pause has been highly unnerving, but has also given me a little space and time to play around with an idea that came up in the last couple of meetings of my “Antisemitism” class. The idea of Jews as contagious and spreaders of disease is a very old trope (strongly connected to notions of a “pure” corporate body threatened by invisible outside forces). With no empirical evidence whatsoever to blame Jews for the outbreak of the novel coronavirus, is such a theory nevertheless afloat, and if so, what does it look like?
While looking at the usual antisemitic websites can provide some insight, I thought it might be interesting to explore Twitter. My original approach was indirect: If we simply counted the number of times Jews were mentioned in tweets over the course of the last few months, would we see any significant upticks? I posed that question to my class, played around for a couple of hours to see if I could answer it, and then – frustrated by my inability to get anything to work – was engulfed in the usual stuff, and then mayhem.
Over the last couple of days, though, I’ve had some time to return to this question and sharpen my digital humanities skills as well. This post is deliberately exploratory, meant to help beginners – like me – think about and apply analytical approaches to Twitter while also exploring the emerging discourse around the novel coronavirus. Does this exploration really yield anything useful? Maybe.
Step-by-step:
- My original plan called for simply counting all tweets that mention Jews over the last few months. Given the limits of Twitter, though, that proves harder than it sounds. One software package, though, does promise to be able to do this: TWINT (found here). So I just needed to install TWINT and run it.
- That turned out to be easier said than done, and it is what originally frustrated me. I have some facility in using the command line on my Windows machine and with running Python, but it is pretty basic. Installing TWINT required me first to install “pip3” and “git” (I already had an Anaconda build of Python installed, although it took me forever to find where it was in my directory since it installed into “hidden files”), and to mess with the file paths so that everything can communicate with everything else. But I did get it installed and working!
- I then ran TWINT from the command line with the simple instruction to find all tweets that mention “Jews” from January 1, 2020, to today and to output the results to a csv file.
- About 20 minutes later, TWINT crashed with error messages. It had downloaded about a day-and-a-half’s worth of tweets that fit my criteria (starting with the most recent), some 25,000. Investigating the error messages, it appears that this is about all Twitter allows at a single time. But it did generate the csv file! The file contained not just the text of each tweet but all kinds of information about it. There were, in fact, 34 columns of data.
- So my next thought was to keep running TWINT for each day or two and then amalgamate the data into a single file. I may yet do that, but first I thought it might be interesting to explore what a single day of tweets about Jews looks like.
- I found a great tutorial for working with Twitter data here, opened a new Python 3 Jupyter notebook, and worked my way through it, using my csv file. I can’t say that I understood every command (especially when it comes to navigating dataframes) but I knew enough to get it to work and to get a general sense of my data.
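The day-by-day plan in the steps above can be sketched roughly as follows. This is a minimal illustration, not what I actually ran: the helper names are made up, the TWINT flags (`-s`, `--since`, `--until`, `-o`, `--csv`) are as I understand them from the TWINT README and should be checked against your installed version, and the merge assumes every csv file has the same columns, including one named `id` that identifies each tweet.

```python
import csv
from datetime import date, timedelta


def daily_windows(start, end):
    """Yield (since, until) date pairs, one per day, from start up to end."""
    day = start
    while day < end:
        yield day, day + timedelta(days=1)
        day += timedelta(days=1)


def twint_command(term, since, until):
    """Build one command line per window (flags are an assumption; verify them)."""
    return (f'twint -s "{term}" --since {since} --until {until} '
            f'-o {term}_{since}.csv --csv')


def merge_csv_files(paths, out_path, id_column="id"):
    """Combine several csv files into one, skipping tweets already seen.

    Assumes all files share the same header and contain `id_column`.
    Returns the number of distinct tweets written.
    """
    seen = set()
    writer = None
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        for path in paths:
            with open(path, newline="", encoding="utf-8") as handle:
                for row in csv.DictReader(handle):
                    if row[id_column] in seen:
                        continue  # the same tweet appeared in an earlier file
                    seen.add(row[id_column])
                    if writer is None:
                        # Take the header from the first row we keep.
                        writer = csv.DictWriter(out, fieldnames=list(row.keys()))
                        writer.writeheader()
                    writer.writerow(row)
    return len(seen)
```

Printing `twint_command("Jews", since, until)` for each pair from `daily_windows(date(2020, 1, 1), date.today())` gives one command per day, which keeps each run well below the roughly 25,000 tweets at which my single run crashed.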
And some analyses:
Top ten tweets, by number of times that the tweet occurs (the count is the number on the right):
index | tweet | count |
---|---|---|
26 | https://songwhip.com/artist/137-music-tribe … | 43 |
10211 | Jew | 21 |
10328 | Jews | 12 |
4631 | Conspiracy theory that Jews created virus spre… | 12 |
9377 | Islam permitted the marriage of Muslim men wit… | 9 |
13757 | Portugal declares official commemoration day f… | 8 |
17417 | Trump Declares Sunday ‘National Day of Prayer’… | 8 |
15495 | Thank you x | 7 |
9215 | Inside luxury resort run by Israeli spies to s… | 7 |
23870 | ???????????? | 7 |
Some of these are a bit peculiar, but digging back into the data helps. The first one is a link to a Christian ministerial music group. The fourth is, interestingly enough, about how others are antisemitically linking Jews to the spread of the coronavirus. What we do not have in this list is evidence that people are actually making this connection (but see this article, which is linked to one such tweet and is linked in my data).
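For readers who want to reproduce this kind of count without a dataframe, here is a minimal sketch using only the standard library. It is not the tutorial's code; the function name is hypothetical, and the input is assumed to be a plain list of tweet texts (for example, the text column read from the csv file).

```python
from collections import Counter


def top_tweets(texts, n=10):
    """Return the n most frequent tweet texts with their counts.

    Texts are compared after stripping surrounding whitespace, so exact
    duplicates (typically bot posts or copy-pasted links) group together.
    """
    counts = Counter(text.strip() for text in texts)
    return counts.most_common(n)
```

Counting exact duplicates mostly surfaces automated or copy-pasted posts, which may be why a music link tops the list above.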
Next, I did a hashtag analysis. Top ten hashtags (count on the right) in this group of tweets:
index | hashtag | count |
---|---|---|
0 | #Jews | 152 |
1 | #Bible | 131 |
2 | #God | 125 |
3 | #coronavirus | 81 |
4 | #Jew | 75 |
5 | #Israel | 61 |
6 | #Love | 56 |
7 | #COVID | 56 |
8 | #COVID19 | 45 |
9 | #Faith | 40 |
This is somewhat more interesting and concerning. Jews seem to be linked most commonly to issues of religion, but hashtags dealing with the coronavirus (numbers 3, 7, and 8) make up a noticeable share of the top ten: 182 of the 822 occurrences listed. Note, though, that this is out of a total dataset of about 25,000 tweets. Even if every one of those 182 occurrences sat in a separate tweet, that would be well under one percent of the dataset, so the relative total is actually quite low.
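A hashtag count like the one above can be sketched in a few lines. This is a simplified stand-in for the tutorial's approach, with a hypothetical function name: it treats a hashtag as `#` followed by letters, digits, or underscores, and it keeps capitalization as written, so `#Jew` and `#jew` would be counted separately.

```python
import re
from collections import Counter

# Simplified pattern: "#" followed by word characters.
HASHTAG = re.compile(r"#\w+")


def count_hashtags(texts):
    """Count every hashtag occurrence across a list of tweet texts."""
    counts = Counter()
    for text in texts:
        counts.update(HASHTAG.findall(text))
    return counts
```

Calling `count_hashtags(texts).most_common(10)` returns the ten most frequent hashtags with their counts, in the same shape as the table.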
Next, I did an analysis of mentions, that is, Twitter accounts marked with an “@” sign. Top ten, with counts on the right:
index | mention | count |
---|---|---|
0 | @realDonaldTrump | 56 |
1 | @YouTube | 51 |
2 | @BernieSanders | 46 |
3 | @timesofisrael | 33 |
4 | @Rosenbergradio | 24 |
5 | @JayElectronica | 24 |
6 | @POTUS | 22 |
7 | @JoeBiden | 15 |
8 | @UKLabour | 14 |
9 | @CNN | 13 |
Numbers 4 and 5 relate to a spat over a potentially antisemitic song (for more, see this). The large number of mentions of media outlets is probably typical (stories from such sources are often tweeted); the large number of political figures and parties, though, seems curious. Digging a bit deeper into this aspect of the data – which I did not do – might help us to better understand it.
Back to hashtags. Which ones appear together with others? Only hashtags used ten or more times in my datafile are included:
Remember that this is a snapshot of just one day’s hashtags and so reflects a single, fast news-cycle. Hashtags relating to Jews are associated most frequently with hashtags relating to religion (although something is going on with #Hawaii and #Einstein in this news-cycle). Maybe reassuringly, hashtags about the coronavirus are not co-appearing with hashtags referring to Jews (although all, by definition, are in tweets that somewhere mention the term “Jews”). Of some concern, though, is the more frequent appearance of hashtags relating to the virus with #Muslims in these posts.
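One way to compute this kind of co-occurrence is to count, for every tweet, each pair of hashtags that appear in it together. The sketch below is an illustration under my own assumptions rather than the tutorial's method: the function name is hypothetical, a hashtag counts once per tweet, and "used ten or more times" is read as "appears in at least ten tweets".

```python
import re
from collections import Counter
from itertools import combinations

HASHTAG = re.compile(r"#\w+")


def hashtag_pairs(texts, min_count=10):
    """Count how often two popular hashtags appear in the same tweet.

    Only hashtags found in at least `min_count` tweets are considered.
    Pairs are stored as alphabetically sorted tuples, so ("#a", "#b")
    and ("#b", "#a") are counted as the same pair.
    """
    per_tweet = [sorted(set(HASHTAG.findall(text))) for text in texts]
    totals = Counter(tag for tags in per_tweet for tag in tags)
    popular = {tag for tag, n in totals.items() if n >= min_count}

    pairs = Counter()
    for tags in per_tweet:
        kept = [tag for tag in tags if tag in popular]
        pairs.update(combinations(kept, 2))
    return pairs
```

The result maps each pair to the number of tweets containing both hashtags, which is the raw material for a co-occurrence matrix or heatmap.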
Finally, I subjected the data to topic modelling. Topic modelling is an analysis that shows which words most commonly group together. The code in the tutorial works (somewhat imperfectly, as you can see below, but good enough) to tidy the data before applying the LDA algorithm (the same one, I believe, used by MALLET). And, using 10 topics, the results:
(I had a problem with the column headings; the first two columns actually belong to Topic 0 and the last two to Topic 9.)
Some of these topics, reading down, are pretty easy to decipher: Topic 1 (after adjusting, as noted above) is about the Arab-Israeli situation. Topic 2 deals with Jews and race. Topics 7 and 9 come back to Jews and religion, particularly from a Christian perspective. Topics 4 and 6 are darker. The coronavirus is mentioned only in Topic 6, and even there has a low weight.
I originally set out with the hypothesis that Jews would be connected to the coronavirus. There are websites and people that make such a connection, but this preliminary exploration into a bit more than a day of Twitter data suggests that it is less common than we might expect. While it would be interesting to look at equivalent data over a longer period of time, and to track the topics over such a period as well, it looks like my hypothesis is largely incorrect.
That, at least, is a small bright spot in these very strange times.