
Do Expatriates Tweet Differently?

Martina Galletti
SONY Computer Science Lab - Paris

In a globally connected world, many people live abroad for significant parts of their lives. These expatriates may have experienced the Coronavirus outbreak differently from those residing in their home country. Perhaps they were dismayed at the lack of concern they saw around them while their friends and family were suffering back home? Perhaps they were more concerned with the outbreak in the country they currently reside in than with the one in their home country? To gain insight into this potential difference, we decided to identify and compare tweets from expatriates (those living abroad) and residents (those living in their home country). Identifying these two classes of people is not a clear-cut problem using Twitter data. We recommend that you keep in mind the caveats of this data analysis, which are listed at the end of this report.

So far, we have conducted this analysis only on the Italian dataset, because it was the only dataset in which the identification of expatriates and residents could be justified. In particular:

In contrast to this, 75% of English speakers and 72% of French speakers speak the language as a second language. Furthermore, both French and English are spoken as a first language across multiple countries with large populations, such as the United States and the UK in the case of English, and France and French-speaking Canada in the case of French.

Tweet Distribution

To start our analysis, we looked at the normalised frequency of Tweets for expats and residents. As you can see from the image below, the two groups seem to follow the same trends with only minor variations. This suggests that expatriates closely follow events in their home country: the peaks of their curve correspond to the main events in Italy.

We can see, in fact, a high peak on the 21st and 22nd of February, when Italian authorities confirmed the first deaths due to COVID-19 and initial lockdowns covering certain municipalities were declared. Minor peaks follow during the first ten days of March, when Prime Minister Giuseppe Conte imposed first a regional lockdown on the most affected areas (on the 8th of March) and then a national quarantine (on the 9th of March).
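The normalised frequencies compared in this section could be computed along the following lines. This is only a minimal sketch: the report does not say how the normalisation was done, so dividing each day's count by the group's total number of Tweets is an assumption, and the function name and toy dates are hypothetical.

```python
from collections import Counter
from datetime import date


def normalised_daily_frequency(tweet_dates):
    """Return {day: share of the group's Tweets posted on that day}.

    Assumption: "normalised" means each daily count is divided by the
    group's total, so groups of different sizes become comparable.
    """
    counts = Counter(tweet_dates)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {day: n / total for day, n in sorted(counts.items())}


# Hypothetical toy data: one date per Tweet in each group.
expat_dates = [
    date(2020, 2, 21), date(2020, 2, 21),
    date(2020, 2, 22), date(2020, 3, 9),
]
resident_dates = [
    date(2020, 2, 21), date(2020, 2, 21), date(2020, 2, 21),
    date(2020, 2, 22), date(2020, 2, 22),
    date(2020, 3, 8), date(2020, 3, 9), date(2020, 3, 9),
]

expat_freq = normalised_daily_frequency(expat_dates)
resident_freq = normalised_daily_frequency(resident_dates)

for day in sorted(set(expat_freq) | set(resident_freq)):
    print(day, expat_freq.get(day, 0.0), resident_freq.get(day, 0.0))
```

Because each curve sums to one, the much smaller expatriate group can be plotted on the same axis as the resident group without being flattened.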

Word Usage Difference

While the tweet distributions for expatriates and residents are similar, the words used in the tweets to talk about the Coronavirus are quite different. Since our focus is to investigate how Italians abroad and residents talk about the health crisis, we searched the dataset for the verbs, nouns, and adjectives most often used in the same sentence as the token ‘coronavirus’, and we did this separately for residents and for expatriates. To highlight the differences between the two groups, we only focus on words that appear among the 300 most frequent words of one group and not among the 300 most frequent words of the other. So the words shown for expatriates are among their 300 most frequent words and are not found among the 300 most frequent words of residents. The frequencies shown for each word are those computed before filtering by word difference.
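The filtering step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes each group's tokens have already been lemmatised, restricted to one part of speech, and limited to sentences containing ‘coronavirus’. The function name and the toy token lists are hypothetical, and `top_n` is set to 2 instead of 300 only to keep the example small.

```python
from collections import Counter


def distinctive_words(group_tokens, other_tokens, top_n=300):
    """Words in this group's top_n that are absent from the other group's top_n.

    Frequencies are the group's own counts, computed before filtering.
    """
    group_counts = Counter(group_tokens)
    other_counts = Counter(other_tokens)
    group_top = [word for word, _ in group_counts.most_common(top_n)]
    other_top = {word for word, _ in other_counts.most_common(top_n)}
    return [(word, group_counts[word]) for word in group_top if word not in other_top]


# Hypothetical toy token lists (already lemmatised verbs).
expat_tokens = ["amare"] * 3 + ["chiudere"] * 2 + ["visitare"]
resident_tokens = ["chiudere"] * 3 + ["riaprire"] * 2 + ["amare"]

expat_only = distinctive_words(expat_tokens, resident_tokens, top_n=2)
resident_only = distinctive_words(resident_tokens, expat_tokens, top_n=2)
print(expat_only)
print(resident_only)
```

Note that a word can still be used by the other group: in the toy data residents do use "amare", but it falls outside their top `top_n`, so it counts as distinctive for expatriates.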

On a verb level, residents seem to be most concerned about receiving information, perhaps about the end of the lockdown: “comunicare” (to communicate), “avvisare” (to notify), “esporre” (to expose) and “commentare” (to comment) are some key verbs in this sense. They also seem quite worried about being protected, with verbs such as “tutelare” (to protect) or “garantire” (to guarantee), maybe against the Coronavirus or the imminent economic crisis. They even seem to call for some intervention, with verbs such as “intervenire” (to intervene), “ricevere” (to receive) and “attivare” (to activate). They wonder about the reopening (“riaprire”, to reopen), perhaps of the country, while still worrying about possible growth (“crescere”, to grow), perhaps in the number of coronavirus cases, and hoping for the situation to improve in general.

Expatriates, on the other hand, seem to talk about a possible slowdown (“rallentare”, to slow down; “rilasciare”, to release), but they also seem to challenge or venture something (“sfidare”, to challenge, or “osare”, to dare). They are concerned about some spread, perhaps of the virus (“trasmettere”, to transmit; “sviluppare”, to develop; or “rilasciare”, to release), but they still talk about visiting (“visitare”, to visit), perhaps family or loved ones in Italy. Interestingly enough, there are also verbs related to positive feelings, such as “amare” (to love) or “immaginare” (to imagine); this could be a sign that expatriates tend to maintain a more positive view of the health crisis. They also seem to hope for a sort of union (“unire”, to unite). Moreover, they talk about a possible underestimation (“sottovalutare”, to underestimate), which was a shared feeling especially at the beginning of the health crisis, when Italy was already under lockdown while other countries seemed to carry on as if nothing was happening. Maybe this is also why the verb “urgere” (to be necessary) is present: they believed it was necessary to apply the same measures in the countries where they were living.

If we look more closely at the nominal entities, residents are most concerned about entities such as landings (sbarco), perhaps of immigrants in Sicily (Sicilia) or maybe of travellers on a cruise ship, as in the case of the Diamond Princess. Enterprises (impresa) are also present in their Tweets, together with gross domestic product (PIL) and shops (negozio), reflecting a general worry about the economic situation of the country. Entities related to the search for a vaccine can be found, such as researcher (ricercatore), appeal (appello), medicine (farmaco), fever (febbre) and donation (donazione). Bulletin (bollettino) can also unsurprisingly be found: every day during the months of the lockdown in Italy, all the main newscasts reported the changes in the numbers of deaths and recoveries. Chaos (caos) is also evoked, and could be considered a key entity for describing the general situation and the overall feeling provoked by the health crisis. Finally, the world of school is present with student (studente), and the world of family with parent (genitore).

Expatriates seem first of all to Tweet about (or to look for) breaking news about the health crisis. They also wonder about depression (depressione). Unsurprisingly, foreign locations such as London (Londra) or Hangzhou are present, while in the residents' lexicon foreign countries or capitals were not so common. This suggests that expatriates have a more international view of the health crisis than residents, as one might expect. Europe, strategy (strategia) and help (aiuto) can be found, which is quite interesting considering the great debate that spread in Italy about the lack of unified European help, especially during the first weeks of the national lockdown. Unsurprisingly, general tokens related to the virus can be found, such as quarantine (quarantena) and epidemic (epidemia); a reference to the SARS epidemic is also present.

Finally, residents’ adjectives describe a regional (regionale) or local (locale) dimension. They reflect the desire to be free (libero) and operative (operativo), but at the same time worry about the uncertainty of the present (and future) seems to be present, with a key adjective such as precarious (precario). They also dream of being immune (immune), and they describe something, perhaps the situation, as shameful (vergognoso).

In contrast, expatriates’ adjectives are once again related to a more international point of view: Spanish (spagnolo), foreign (estero) or Swiss (svizzero) can in fact be found. The multiple adjectives relating to Switzerland (ticinese, from Ticino, or cantonale, cantonal) are also unsurprisingly present, given the number of Italian-speaking inhabitants in those areas.
Finally, expatriates also describe the health crisis as a main (principale), current (attuale) and dramatic (drammatico) issue. As in the case of the nominal entities, we can find positive adjectives which reflect a more positive attitude, such as wonderful (meraviglioso) or beautiful (bello).

Caveats

Because we are not explicitly given information on whether a person is in their home country or abroad, we infer it from their language and reported location. We only explore countries whose language is closely tied to that country, that is, where speakers of that language were overwhelmingly either born and raised there or have had significant exposure to that country or to people from that country. This limits the possibility for exploration significantly and inevitably makes our results noisy for various reasons. For example, we could not perform this analysis for English or French, as both are widely spoken languages, and simply knowing that a person speaks English or French does not definitively identify where they are from. A native English speaker could be from the United States, the United Kingdom, Australia, India (English is the first language of many people in India) and other countries, even without considering that English is the second language of many people all over the world.
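To make the inference from language and reported location concrete, here is a minimal sketch of one possible labelling rule. It is an assumption, not the procedure actually used for this report: the lookup table, function names, and the rule itself (Italian-language Tweet plus a profile location that resolves to Italy means resident, any other resolved country means expatriate) are all hypothetical.

```python
from typing import Optional

HOME_LANGUAGE = "it"   # language code of the Tweets being analysed
HOME_COUNTRY = "IT"    # country that language is closely tied to

# Hypothetical lookup from free-text profile locations to country codes.
# A real pipeline would need a geocoder or a much larger gazetteer.
LOCATION_TO_COUNTRY = {
    "roma": "IT",
    "milano": "IT",
    "londra": "GB",
    "london": "GB",
    "lugano": "CH",
}


def resolve_country(reported_location: Optional[str]) -> Optional[str]:
    """Map a self-reported location to a country code, or None if unknown."""
    if not reported_location:
        return None
    return LOCATION_TO_COUNTRY.get(reported_location.strip().lower())


def classify_author(tweet_language: str, reported_location: Optional[str]) -> Optional[str]:
    """Label an author 'resident' or 'expatriate', or None when undecidable."""
    if tweet_language != HOME_LANGUAGE:
        return None  # language gives no evidence about the home country
    country = resolve_country(reported_location)
    if country is None:
        return None  # missing or unrecognised location
    return "resident" if country == HOME_COUNTRY else "expatriate"


print(classify_author("it", "Milano"))
print(classify_author("it", " Londra "))
print(classify_author("it", "somewhere nice"))
```

A rule of this kind would label an Italian-speaking account in Lugano as an expatriate even if its owner is Swiss, which illustrates one way such an inference can introduce noise.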

Our inference from language and location rests on the following assumptions:

We use the terms expatriate and resident very broadly here. We do not know why an individual is abroad or in their country of residence at the time of Tweeting. They may have been on holiday, on a business trip, working there, settled there, etc. We assume that:

All of these caveats make the data analysis noisy, in that there will be many Tweets for which our assumptions do not hold. However, if the assumptions are broadly correct, we can still identify interesting comparisons between expats and residents, provided these caveats are kept in mind.