UBC Research Data

COVID-19 related tweets from British Columbia sources

Paterson, Susan; Brigham, Doug

Description

Tweets about COVID-19 from sources in British Columbia. This dataset includes tweets from government officials, health authorities and journalists. The tweet IDs were collected using Documenting the Now's Twarc library (https://github.com/DocNow/twarc).

The date of the earliest available tweet is different for each handle. The date of the latest available tweet will not be later than the upload date for each file. See the file-level information below.

The tweet IDs were extracted from the raw JSON files retrieved from Twitter using Twarc. However, Twitter's terms of use do not permit the sharing of the raw JSON files for this dataset. The raw JSON files can be retrieved from Twitter, provided the content is still available, using the 'hydrate' command within Twarc. The researchers have retained the source JSON files, and other researchers who wish to access them may contact the researchers. The files of tweet IDs will be updated over time, and this metadata and the readme.txt file will be updated accordingly.
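As a rough illustration of preparing one of the tweet ID files for rehydration, the Python sketch below reads IDs from lines of text and hands them to a client object. The function names read_tweet_ids and hydrate_from_lines are hypothetical. The sketch assumes that the ID files contain one ID per line and that the client behaves like the Twarc class in twarc version 1, whose hydrate method accepts an iterable of tweet IDs. Credentials and file names are left to the reader.

```python
from typing import Any, Iterable, Iterator


def read_tweet_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield tweet IDs from lines of text, skipping blank or non-numeric lines."""
    for line in lines:
        tweet_id = line.strip()
        if tweet_id.isdigit():
            yield tweet_id


def hydrate_from_lines(client: Any, lines: Iterable[str]) -> Iterator[dict]:
    """Pass the cleaned IDs to the client's hydrate method.

    Assumption: `client` exposes hydrate(iterable_of_ids), as the Twarc
    class in twarc version 1 does. Tweets that are no longer available
    are simply absent from the result.
    """
    return client.hydrate(read_tweet_ids(lines))
```

With an open file, this could be called as hydrate_from_lines(client, open("ids.txt")), where ids.txt is a placeholder file name.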

Raw JSON files were harvested using Twarc's 'timeline' command. The 'timeline' command retrieves the most recent tweets from the specified handle, to a maximum of approximately 3,200 tweets. The data for each handle was collected approximately weekly, starting in January 2021.

To avoid losing earlier tweets, we concatenated the JSON for each new 'timeline' crawl to the earlier crawls and de-duplicated the combined JSON using Twarc's 'deduplicate' command. We then used Twarc's 'dehydrate' command to extract just the tweet IDs from the deduplicated JSON file. Finally, we sorted the tweet IDs numerically so that they would appear in ascending date order.
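Numeric order matches date order because Twitter's current ("snowflake") tweet IDs store a millisecond timestamp in their upper bits. The Python sketch below shows how such an ID can be converted to an approximate creation time. The function name tweet_id_to_datetime is hypothetical, and the formula applies only to IDs issued after Twitter adopted the snowflake scheme in November 2010; earlier IDs were assigned sequentially and do not encode a timestamp.

```python
from datetime import datetime, timedelta, timezone

# Start of Twitter's snowflake epoch: 2010-11-04T01:42:54.657Z, in milliseconds.
TWITTER_EPOCH_MS = 1288834974657
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tweet_id_to_datetime(tweet_id: int) -> datetime:
    """Return the UTC creation time encoded in a snowflake tweet ID.

    The lower 22 bits hold worker and sequence numbers; shifting them
    away leaves the milliseconds elapsed since the Twitter epoch.
    """
    timestamp_ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return UNIX_EPOCH + timedelta(milliseconds=timestamp_ms)
```

Because the timestamp occupies the most significant bits, a larger snowflake ID never corresponds to an earlier millisecond than a smaller one.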

The basic workflow looks like: twarc timeline --> concatenate JSON files --> deduplicate resulting JSON file --> dehydrate tweet ids --> sort tweet ids.
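The concatenate, deduplicate, dehydrate and sort steps can be approximated in a few lines of Python. This is only a sketch of the idea, not Twarc's own implementation: the function name merge_crawls is hypothetical, and it assumes each crawl is line-delimited JSON in which every tweet object carries an "id_str" field.

```python
import json
from typing import Iterable, List


def merge_crawls(crawls: Iterable[Iterable[str]]) -> List[str]:
    """Combine several crawls and return unique tweet IDs in numeric order."""
    seen = set()
    for crawl in crawls:              # concatenate: walk every crawl in turn
        for line in crawl:
            line = line.strip()
            if not line:
                continue
            tweet = json.loads(line)
            seen.add(tweet["id_str"])  # deduplicate and dehydrate: keep IDs only
    # Sort as integers; a plain string sort would misorder IDs of different lengths.
    return sorted(seen, key=int)
```

For example, merging one crawl containing IDs 99 and 1000 with a later crawl containing 1000 and 250 yields the three IDs 99, 250 and 1000, in that order.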

The Twitter handles include:

@BCGovNews: BC Government News. Tweets in this file start on 2019-06-06.
@CDCofBC: BC Centre for Disease Control. Tweets in this file start on 2019-06-28.
@Fraserhealth: Fraser Health Authority. Tweets in this file start on 2019-01-07.
@ImmunizeBC: Evidence-based immunization information and tools for BC residents from the BC Centre for Disease Control. Tweets in this file start on 2014-11-19.
@Interior_Health: Health authority for the Southern Interior of BC. Tweets in this file start on 2017-06-30.
@Northern_Health: Health authority for the Northern Interior of BC. Tweets in this file start on 2018-07-01.
@PHSAofBC: Provincial Health Services Authority of BC. Tweets in this file start on 2019-11-01.
@SAHoffman: Suzanne Hoffman, Superintendent of Schools for Vancouver. Tweets in this file start on 2009-08-14.
@VCHhealthcare: Vancouver Coastal Health Authority. Tweets in this file start on 2019-02-15.
@VanIslandHealth: Vancouver Island Health Authority. Tweets in this file start on 2017-12-13.
@adriandix: Adrian Dix, Member of the Legislative Assembly for Vancouver-Kingsway and BC Minister of Health. Tweets in this file start on 2019-09-13.
@fnha: First Nations Health Authority. Tweets in this file start on 2017-05-03.
@govTogetherBC: Government of BC citizen engagement. Tweets in this file start on 2016-09-03.
@j_mcelroy: Municipal Affairs Reporter for CBC Vancouver. Tweets in this file start on 2021-01-04.
@jordantinney: Jordan Tinney, Superintendent of Schools for Surrey. Tweets in this file start on 2013-01-16.
@keithbaldrey: Keith Baldrey, Political journalist for Global TV, British Columbia. Tweets in this file start on 2020-12-14.
@kennedystewart: Kennedy Stewart, 40th Mayor of Vancouver. Tweets in this file start on 2016-10-16.
@richardzussman: Reporter for Global TV, British Columbia, at the provincial legislature. Tweets in this file start on 2020-12-31.
