You are currently browsing the tag archive for the ‘coronavirus’ tag.

After some discussion with the applied math research groups here at UCLA (in particular the groups led by Andrea Bertozzi and Deanna Needell), one of the members of these groups, Chris Strohmeier, has produced a proposal for a Polymath project to crowdsource in a single repository (a) a collection of public data sets relating to the COVID-19 pandemic, (b) requests for such data sets, (c) requests for data cleaning of such sets, and (d) submissions of cleaned data sets.  (The proposal can be viewed as a PDF, and is also available on Overleaf).  As mentioned in the proposal, this database would be slightly different in focus than existing data sets such as the COVID-19 data sets hosted on Kaggle, with a focus on producing high quality cleaned data sets.  (Another relevant data set that I am aware of is the SafeGraph aggregated foot traffic data, although this data set, while open, is not quite public as it requires a non-commercial agreement to execute.  Feel free to mention further relevant data sets in the comments.)

This seems like a very interesting and timely proposal to me and I would like to open it up for discussion, for instance by proposing some seed requests for data and data cleaning and to discuss possible platforms that such a repository could be built on.  In the spirit of “building the plane while flying it”, one could begin by creating a basic github repository as a prototype and use the comments in this blog post to handle requests, and then migrate to a more high quality platform once it becomes clear what direction this project might move in.  (For instance one might eventually move beyond data cleaning to more sophisticated types of data analysis.)

UPDATE, Mar 25: a prototype page for such a clearinghouse is now up at this wiki page.

UPDATE, Mar 27: the data cleaning aspect of this project largely duplicates the existing efforts at the United against COVID-19 project, so we are redirecting requests of this type to that project (and specifically to their data discourse page).  The polymath proposal will now refocus on crowdsourcing a list of public data sets relating to the COVID-19 pandemic.


At the most recent MSRI board of trustees meeting on Mar 7 (conducted online, naturally), Nicolas Jewell (a Professor of Biostatistics and Statistics at Berkeley, also affiliated with the Berkeley School of Public Health and the London School of Health and Tropical Disease), gave a presentation on the current coronavirus epidemic entitled “2019-2020 Novel Coronavirus outbreak: mathematics of epidemics, and what it can and cannot tell us”.  The presentation (updated with Mar 18 data), hosted by David Eisenbud (the director of MSRI), together with a question and answer session, is now on Youtube:

(I am on this board, but could not make it to this particular meeting; I caught up on the presentation later, and thought it would of interest to several readers of this blog.)  While there is some mathematics in the presentation, it is relatively non-technical.