About

Welcome to the data repository of the COVID-19 World Symptoms Survey Microdata.

Access to the Microdata

Access to non-aggregated microdata can be granted to academic and nonprofit researchers. A request needs to be approved by UMD and Facebook and the researcher's institution must sign a Data User Agreement before data access is provided. To learn more about the requirements, eligibility, and to request access please go to COVID-19 Symptom Survey – Request for Data Access.

FAQ

1. What is the difference between Part A and Full files?
The primary difference is the weights, which are based on different inclusion criteria. A respondent is considered a Part A complete if they:
  • Answered Yes to the two consent questions (intro1 and intro2)
  • Answered Yes to A1 (they are over 18)
  • Provided a response to country/region (A2 or A2_2)
  • Gave at least one answer (yes or no) on B1 (symptoms)
A respondent is considered a full complete if they:
  • Answered Yes to the two consent questions (intro1 and intro2)
  • Answered Yes to A1 (they are over 18)
  • Provided a response to country/region (A2 or A2_2)
  • Provided a response to two additional items

Note: these two additional items can include B1. Any response that qualifies as a Part A complete also qualifies as a full complete if the respondent answers one additional question beyond B1. However, the two additional items need not include B1, so the full file is not fully nested inside the Part A file. Unless you are solely analyzing the symptom item (B1), you will likely use the full file.
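The two inclusion rules above can be sketched in a few lines of Python. This is only an illustration, not the survey team's actual processing code. It assumes a response is a dict of item name to answer, that "yes" is stored as the string "yes", that symptom sub-items are named with a "B1_" prefix, and that B1 counts as a single item toward the "two additional items" rule. The item names B3 and B4 in the usage lines are placeholders. Check the codebook for the real column names and value codes.

```python
from typing import Mapping, Optional

Answer = Optional[str]

# Items every complete must pass (names taken from the FAQ above).
REQUIRED_YES = ("intro1", "intro2", "A1")
REGION_ITEMS = ("A2", "A2_2")
BASE_ITEMS = set(REQUIRED_YES) | set(REGION_ITEMS)


def _answered(value: Answer) -> bool:
    """Treat None and the empty string as 'no response' (assumption)."""
    return value is not None and value != ""


def _passes_screening(resp: Mapping[str, Answer]) -> bool:
    """Consent and age answered 'yes', and a country/region was given."""
    consent_and_age = all(resp.get(k) == "yes" for k in REQUIRED_YES)
    has_region = any(_answered(resp.get(k)) for k in REGION_ITEMS)
    return consent_and_age and has_region


def _answered_b1(resp: Mapping[str, Answer]) -> bool:
    """At least one yes/no answer on any B1 sub-item (assumed 'B1_' prefix)."""
    return any(k.startswith("B1_") and v in ("yes", "no") for k, v in resp.items())


def is_part_a_complete(resp: Mapping[str, Answer]) -> bool:
    return _passes_screening(resp) and _answered_b1(resp)


def is_full_complete(resp: Mapping[str, Answer]) -> bool:
    if not _passes_screening(resp):
        return False
    # Collect answered items beyond the screening items; all B1 sub-items
    # collapse into one item named "B1" (assumption, see lead-in).
    extra = set()
    for key, value in resp.items():
        if key in BASE_ITEMS or not _answered(value):
            continue
        extra.add("B1" if key.startswith("B1_") else key)
    return len(extra) >= 2


base = {"intro1": "yes", "intro2": "yes", "A1": "yes", "A2": "some-country"}
only_b1 = {**base, "B1_1": "no"}             # Part A complete only
no_b1 = {**base, "B3": "1", "B4": "2"}       # full complete only
both = {**base, "B1_1": "yes", "B3": "1"}    # qualifies for both files
```

The no_b1 case is the reason the full file is not nested inside the Part A file: it meets the full criteria without any B1 answer.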

2. Is there a codebook for different columns in the microdata file?
Yes:
  • For microdata before 2020-06-27: codebook
  • For microdata starting 2020-06-27: codebook
  • For microdata starting 2020-11-23: codebook
  • For microdata starting 2020-12-21: codebook
  • For microdata starting 2021-01-14: codebook
  • For microdata starting 2021-02-06: codebook
  • For microdata starting 2021-03-02: codebook

Release Notes

  • 2021-03-02: A new version of the survey (V10) has been launched. Microdata starting from 2021-03-02 will have responses to additional questions. Please see the new codebook link above.
  • 2021-02-12: Due to a survey logic issue in the version 5 survey configuration, there are about 38 miscoded B8 values each day in both Part-A and Full weighted datasets from June 27 to November 22, 2020. As of February 12, 2021, we have implemented a fix and backfilled the data. Re-downloading the data will correct the issue. The issue can also be fixed by recoding B8 as -99 if B7 is 2.
  • 2021-02-06: A new version of the survey (V9) has been launched. Microdata starting from 2021-02-06 will have responses to additional questions. Please see the new codebook link above.
  • A country regional code table is now available for version 9. For the complete table see Country Region Response Map.

  • 2021-01-26: The COVID-19 vaccine questions 'V1' and 'V2' are now being fielded in 5 more countries: the United Arab Emirates, Seychelles, Bahrain, Singapore, and Oman. Please find an updated list of countries with secured COVID-19 vaccines here. This list can also be found in the codebook for version 8.
  • 2021-01-14: A new version of the survey (V8) has been launched. Microdata starting from 2021-01-14 will have responses to additional questions. Please see the new codebook link above.
  • A country regional code table is now available for version 8. For the complete table see Country Region Response Map.

  • 2021-01-11:
    Issue 1
    Due to a data processing issue in our pipelines, responses to questions A1, B8, C2, C3, C4, C5, C6, D3, D4, E3, and E4 were incorrectly coded in the microdata from November 23 to December 30, 2020. This issue affected both the Part A and Full weighted data sets.
    If you downloaded microdata prior to December 30 for any part of this time period, you will need to re-download the data. As of December 30, we have implemented a fix and backfilled the data, so re-downloading the data at present will correct the issue. Any analyses conducted using the errant data for the above items will need to be re-run.

    Issue 2
    We have also corrected a minor issue that does not affect data integrity at the individual response level but may have affected data aggregation. In Wave 7, some region names were formatted slightly differently than in previous waves and were missing the GID1 field. We expect that this would only have caused issues for region-level aggregations during the period when Wave 6 and Wave 7 overlapped (December 21 to December 28, 2020), or if you are explicitly using the GID1 field. We expect to land a fix and backfill data to correct this issue by January 12, 2021; we will post in the release notes on the microdata API when this fix is fully implemented. At that time, re-downloading the data will correct any issues with region names or GID1.

  • 2020-12-21: A new version of the survey (V7) has been launched. Microdata starting from 2020-12-21 will have responses to additional questions. Please see the new codebook link above.
  • A country regional code table is now available for version 7. For the complete table see Country Region Response Map.
    Seen but unanswered questions are now set to -77.
    Missing, valid skipped, and invalid answers are set to -99.

  • 2020-12-08: There was an overlap between wave 5 and wave 6b of the survey. Data for questions B13_* and B14_* was collected on 11-30-2020. We will collect this additional data for December on 12-10-2020. Moving forward it will be collected on the 1st of each month.
  • From 11-23-2020 to 12-4-2020, the survey version '5' identifier was not added to the datasets; the field was left blank. This is now fixed.

  • 2020-12-03: Column names of microdata files from 11-23-2020 to 11-30-2020 have been coded correctly to match the codebook. C7.1 is now C14. B1b_x14 was added.
  • 2020-11-20: A new version of the survey (V6b) has been launched. Microdata starting from 2020-11-23 will have responses to additional questions. Please see the new codebook link above.
  • A country regional code table is now available. For the complete table see Country Region Response Map.

  • 2020-09-30: The microdata has been updated to include unweighted countries. Column '1w_0unw' has been added to the csv files to flag weighted and unweighted countries. For the list of weighted countries see Weighted Countries List. For the Weights and Methodology Brief for the COVID-19 Symptom Survey please follow this link: Weights and Methodology.
  • 2020-06-27: A new version of the survey (V5) has been launched. Microdata starting from 2020-06-27 will have responses to additional questions.
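
Two of the notes above describe value-level rules that are easy to apply yourself: the 2021-02-12 note (recode B8 as -99 if B7 is 2) and the 2020-12-21 note (-77 for seen but unanswered, -99 for missing, valid skipped, or invalid). The following is a minimal Python sketch of both, assuming each row has already been parsed into a dict with integer values. The function names recode_b8 and to_missing are hypothetical helpers, not part of any official tooling, and re-downloading the backfilled data remains the simpler fix for B8.

```python
from typing import Dict, List, Optional

# Sentinel codes described in the 2020-12-21 release note.
SEEN_UNANSWERED = -77
MISSING_OR_SKIPPED = -99
MISSING_CODES = {SEEN_UNANSWERED, MISSING_OR_SKIPPED}


def recode_b8(rows: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Return copies of the rows with B8 set to -99 wherever B7 is 2."""
    fixed = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is left untouched
        if row.get("B7") == 2:
            row["B8"] = MISSING_OR_SKIPPED
        fixed.append(row)
    return fixed


def to_missing(value: int) -> Optional[int]:
    """Map the -77 and -99 sentinel codes to None; pass other values through."""
    return None if value in MISSING_CODES else value


rows = [{"B7": 2, "B8": 1}, {"B7": 1, "B8": 3}]
fixed = recode_b8(rows)
```

Values read straight from a CSV file arrive as strings, so convert them to integers (or compare against "2", "-77", and "-99") before applying these helpers.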