whitepaper - The science of contact tracing

Improved Risk Management: Reducing the epidemiological risk using contact tracing data


By Bertrand Maury and Sylvain Faure of Université Paris-Saclay, in conjunction with Microshare

Executive summary: It’s all a matter of scale

Epidemiological risk management at a national level is a matter of regulatory trade-offs, in a context that is both highly uncertain and constrained. The large-scale rollout of contact tracing apps on personal smartphones cannot reach the critical mass of regular users needed to generate reliable data. Contamination chains are therefore impossible to identify on a large scale, and the available data is essentially macroscopic. As a consequence, political stakeholders are reduced to taking half-blind decisions that apply to whole territories and disregard local features and specificities.

At the scale of a company or local institution, the situation is entirely different. Efficient and low-cost contact tracing solutions have emerged, profoundly renewing the domain of epidemiological risk assessment.

This technological revolution goes far beyond simple contact counting: it provides the basis of an irreplaceable Decision Support Tool for monitoring mid-sized companies and possibly other types of institutions, such as educational facilities or care facilities for the elderly.

Knowing the day-to-day contact matrices over a significant period of time makes it possible to:

  1. Closely track the propagation of an epidemic within the population under consideration, leading to an accurate and robust identification of the most exposed people whenever an individual tests positive;
  2. Estimate an instantaneous risk score at the global level, which makes it possible to elaborate targeted recommendations/guidelines to significantly improve the community’s resistance to an anticipated epidemiological threat;
  3. Investigate the organic behavior of a community, understand its spontaneous structure beyond regulatory organization rules, and possibly design new organization principles to increase its resistance to potential threats of various types.

We show in this paper how accurate data from a dynamic network can be used to reach optimal decisions in the short term (1), the medium term (2), and the long term (3).

Use case 1: Short term reaction

On day “D”, an individual in a company or community tests positive. After this Patient Zero (“P0”) has been isolated, it is crucial to identify the people who may have been contaminated, so that they can be tested and possibly treated quickly, and so that local propagation of the epidemic can be stopped. Simple queries on the contact tracing database identify the individuals who have had a significant number of contacts with P0. The full knowledge of day-to-day contact matrices, however, gives a more accurate picture of how risk is distributed within the population. Assuming for instance that P0 became contagious 8 days before being detected, agent-based epidemiological models can be run starting from D-8 to simulate the spread of the virus through the population. These computations produce one value per individual: the probability that this individual is infected, estimated from the knowledge at the organization’s disposal.

The state-of-the-art epidemiological model we propose takes full advantage of this rich knowledge. The contact network is not treated as a static object, but as a dynamic structure that encodes the contact behavior of the community day after day.

We illustrate the strategy on a real data set collected in a European firm with 60 employees over a three-month period in the summer of 2020. Data acquisition was based on Bluetooth Low Energy (BLE) and LoRaWAN anchor and bracelet technology supplied by the IoT technology company Kerlink, together with the IoT data-management firm Microshare. All data is fully anonymized: we refer to individual employees by an index between 0 and 59.
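The day-by-day propagation described above can be sketched in a few lines. This is not the authors' agent-based model: it is a simplified mean-field approximation that updates each person's infection probability one day at a time, and it ignores latency and recovery. The function name, the per-contact transmission probability `beta`, and the toy matrices are all assumptions made for illustration.

```python
def infection_probabilities(daily_contacts, patient_zero, beta=0.05):
    """Mean-field estimate of each individual's infection probability.

    daily_contacts: list of n x n symmetric matrices (lists of lists), where
        daily_contacts[d][i][j] is the number of contacts between i and j on day d.
    patient_zero: index of the initially infected individual.
    beta: assumed transmission probability per contact (hypothetical value).
    """
    n = len(daily_contacts[0])
    p = [0.0] * n
    p[patient_zero] = 1.0
    for contacts in daily_contacts:
        new_p = []
        for i in range(n):
            # Probability that i is not infected by anyone during this day.
            escape = 1.0
            for j in range(n):
                if j != i:
                    escape *= 1.0 - p[j] * (1.0 - (1.0 - beta) ** contacts[i][j])
            # i is infected if already infected or newly infected today.
            new_p.append(1.0 - (1.0 - p[i]) * escape)
        p = new_p
    return p


# Toy example: 0 meets 1 on the first day, 1 meets 2 on the second day.
day_1 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
day_2 = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
print(infection_probabilities([day_1, day_2], patient_zero=0, beta=0.5))
# prints [1.0, 0.5, 0.25]
```

Individual 2 never meets Patient Zero, yet ends up with a non-zero probability because the contact with individual 1 happens after individual 1 was exposed. Swapping the two days gives individual 2 a probability of zero, which is why the order of the daily matrices matters.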

The principle of the epidemiological approach is illustrated by Fig. 1: the three graphs correspond to three consecutive days, and dots represent employees. On day D (left), a single individual is contaminated (red dot at the center), and the epidemic propagates over the network, preferentially along “fat” edges (which correspond to many contacts).

Fig. 1 Day-to-day propagation


The outcome from the user’s standpoint is given in Fig. 2. We focus here on a 20-person subpopulation. Each column of the double-entry table contains the contamination probabilities induced by one individual being initially infected, where light colors correspond to high probabilities. As an illustration, we consider the case P0 = 49 and extract the corresponding incidence column. The bar representation directly shows which individuals have the highest probability of being infected. In the present situation, it reveals 5 individuals (namely 50, 54, 55, 56, and 57) with high probabilities of being contaminated. We also plotted the number of contacts with P0 (right-hand side of the figure). Some people have a high contamination probability while having few direct contacts with P0. This highlights the importance of epidemiological models based on the whole day-to-day succession of contact matrices, which properly account for chains of contamination.
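Reading one incidence column alongside the direct contact counts, as described above, can be sketched as follows. The function name and both thresholds are hypothetical, and the numbers in the example are made up for illustration; they are not taken from the data set.

```python
def rank_exposure(probabilities, direct_contacts, patient_zero,
                  prob_threshold=0.2, max_direct_contacts=2):
    """Rank individuals by infection probability and flag indirect exposure.

    probabilities[i]: estimated infection probability of i (one incidence column).
    direct_contacts[i]: number of contacts between i and Patient Zero.
    Both thresholds are hypothetical tuning values.
    """
    others = [i for i in range(len(probabilities)) if i != patient_zero]
    # Most exposed individuals first.
    ranking = sorted(others, key=lambda i: probabilities[i], reverse=True)
    # High probability despite few direct contacts: likely reached through a chain.
    indirect = [i for i in ranking
                if probabilities[i] >= prob_threshold
                and direct_contacts[i] <= max_direct_contacts]
    return ranking, indirect


# Made-up values for four individuals, with individual 0 as Patient Zero.
ranking, indirect = rank_exposure([1.0, 0.6, 0.3, 0.05], [0, 12, 1, 0], patient_zero=0)
print(ranking, indirect)  # prints [1, 2, 3] [2]
```

A query on direct contacts alone would single out individual 1 only. The second list is where the model adds information.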

Fig. 2 Contamination probabilities resulting from P0 = 49


Use case 2: Middle term global risk management in advance of an outbreak

Consider now the situation of a healthy community in a high-risk context, where an internal outbreak has to be anticipated. The key issue is now damage control ahead of time:

How can I take appropriate actions to rapidly reduce the epidemiological risk while limiting the effect they have upon the proper functioning of my company?

This approach relies on an epidemic Global Risk Index (“eGRI”), which is estimated from contact matrices. The procedure to estimate this index follows the approach described in the previous section. The idea is to explore the incidence of all possible scenarios, i.e. all possible choices of Patient Zero. For each of those virtual conditions, the probabilities of being infected are computed for all individuals, and a double averaging procedure is performed over the whole population: averaging over all possible emergence scenarios first, and then averaging over potentially infected individuals. Note that the first averaging step builds a collection of Individual Risk Indices (IRI), which makes it possible to identify the most exposed individuals in the population. The number eGRI, obtained by further averaging over the whole population and expressed as a percentage, can be interpreted as the mean probability of being infected if the epidemic enters the community. In other words, it gives the order of magnitude of the fraction of the population that is likely to be infected at the very time the epidemic is detected, without any assumption on Patient Zero.
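The double averaging can be written down directly once the incidence table is available. The sketch below assumes a table `incidence[i][p0]` holding the probability that individual `i` is infected when `p0` is Patient Zero. Whether the scenario in which `i` is itself Patient Zero enters the first average is not stated in the text, so it is exposed here as a hypothetical `count_patient_zero` switch.

```python
def risk_indices(incidence, count_patient_zero=True):
    """Return (IRI list, eGRI) from an n x n incidence table.

    incidence[i][p0]: probability that i is infected when p0 is Patient Zero.
    count_patient_zero: assumption switch; when False, the scenario p0 == i
        is left out of individual i's average. Requires n >= 2 in that case.
    """
    n = len(incidence)
    iri = []
    for i in range(n):
        # First average: over all emergence scenarios (choices of Patient Zero).
        scenarios = [p0 for p0 in range(n) if count_patient_zero or p0 != i]
        iri.append(sum(incidence[i][p0] for p0 in scenarios) / len(scenarios))
    # Second average: over all potentially infected individuals.
    egri = sum(iri) / n
    return iri, egri


# Made-up 3 x 3 incidence table, for illustration only.
incidence = [[1.0, 0.5, 0.0],
             [0.5, 1.0, 0.5],
             [0.25, 0.5, 1.0]]
iri, egri = risk_indices(incidence, count_patient_zero=False)
print(iri, f"eGRI = {100 * egri:.1f} %")  # prints [0.25, 0.5, 0.375] eGRI = 37.5 %
```

Sorting the IRI list in decreasing order gives the most exposed individuals, independently of who the actual Patient Zero turns out to be.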

Fig.3 - Risk Index eGRI


This risk index eGRI can be used to explore quantitatively how organizational arrangements affect the epidemiological risk. As an illustration, we considered an 8-day period as before, and we computed the epidemic propagation starting successively from Patient Zero P0 = 0, 1, 2, …, 59, i.e. exploring all possible choices of Patient Zero. The bar charts in Fig. 4 represent contamination probabilities, in other words Individual Risk Indices, from which eGRI is obtained by averaging. Chart (a) on the left-hand side corresponds to the current situation. Chart (b) answers the question: if I were able to divide all contact durations by 3 over the same period, how would it affect the Individual Risk Indices and the Global Risk Index? One can check that eGRI is much lower (close to 3 %). If one instead divides by 3 the number of contacts of a small targeted subgroup (the 7 most exposed individuals in the basic scenario), eGRI is also significantly reduced (c), with a much lighter change in behaviors. The last chart (d) corresponds to alternate presence at the office: one half of the employees come in on even days, the other half on odd days. It shows that, in spite of a heavy (and possibly costly) change in the functioning of the company, the effect on eGRI is limited.
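Each scenario above amounts to transforming the daily contact matrices before the risk indices are recomputed. A minimal sketch of the three transformations follows. The function names are hypothetical, the matrices are assumed to store contact counts or durations as plain lists of lists, and the split of employees into two groups is left to the caller.

```python
def divide_all(daily_contacts, divisor):
    """Scenario (b): divide every contact value by `divisor`."""
    return [[[c / divisor for c in row] for row in day] for day in daily_contacts]


def divide_subgroup(daily_contacts, subgroup, divisor):
    """Scenario (c): divide only the contacts that involve a subgroup member."""
    members = set(subgroup)
    return [[[c / divisor if (i in members or j in members) else c
              for j, c in enumerate(row)]
             for i, row in enumerate(day)]
            for day in daily_contacts]


def alternate_presence(daily_contacts, group_a):
    """Scenario (d): group A on site on even-indexed days, the others on odd ones."""
    a = set(group_a)
    result = []
    for d, day in enumerate(daily_contacts):
        n = len(day)
        present = [(i in a) == (d % 2 == 0) for i in range(n)]
        # A contact is kept only if both individuals are on site that day.
        result.append([[day[i][j] if present[i] and present[j] else 0
                        for j in range(n)] for i in range(n)])
    return result


# Toy data: the same 3-person contact matrix on two consecutive days.
day = [[0, 3, 6], [3, 0, 9], [6, 9, 0]]
print(divide_subgroup([day, day], subgroup=[2], divisor=3)[0])
# prints [[0, 3, 2.0], [3, 0, 3.0], [2.0, 3.0, 0.0]]
```

Feeding each transformed series to the same propagation model and comparing the resulting eGRI values gives the kind of side-by-side comparison shown in the figure. Note that alternate presence as sketched here simply drops contacts; it does not model people compensating with more contacts on the days they are present.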

Fig.4 - Epidemiological Individual and Global Risk indices for various scenarios.


Use case 3: Risk management in the long term

Contact matrices can also contribute to risk assessment in the long term, by making it possible to investigate the link between the organic structure of the community and its ability to withstand various sorts of viral attacks. This challenge can be addressed by transforming an aggregate series of contact matrices into a representative weighted network, and by applying clustering techniques to reveal the very heart of the living community of interacting entities. The term “cluster” is primarily meant here in the graph-theoretical sense: it corresponds to sub-groups of individuals who strongly interact with each other. These sub-groups also correspond to potential epidemiological clusters, i.e. sub-communities that are likely to share a common fate in case of a virus attack. This representation paves the way for rethinking an organization’s rules and its modalities of interaction and coordination, in order to reshape the structural network and make it more resilient to a whole range of potential threats.

To illustrate how the inner structure of a population of interacting people can be unveiled, we consider the same population over a two-week period. We assemble the global contact matrix as the sum of the day-to-day contact matrices, and we build a metric network in which the edge lengths are defined as a decreasing function of the total number of contacts: two individuals with many interactions are considered to be close to each other. A global distance table is then obtained by following a shortest-path principle on the network. From this distance table, various tools can be used to reveal the underlying structure, such as Agglomerative Clustering. Fig. 5 presents this clustering approach as a dendrogram, i.e. a tree whose leaves are the individuals. The tree can be seen as a genealogical tree, where the employees (identified here by an index between 0 and 59) are considered members of a large family. Following this metaphor, clusters appear as sub-families, each represented by a different color, gathering tightly connected people who possibly work in the same office or department.
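The pipeline described above (summed contact matrix, metric network, shortest-path distance table, agglomerative clustering) can be sketched with the standard library alone. Two choices below are assumptions rather than details given in the text: the edge length is taken as the inverse of the total number of contacts, and clusters are merged with average linkage. The function names are hypothetical.

```python
import math


def distance_table(daily_contacts):
    """Shortest-path distances on the network built from summed contacts."""
    n = len(daily_contacts[0])
    total = [[sum(day[i][j] for day in daily_contacts) for j in range(n)]
             for i in range(n)]
    # Assumed edge length: 1 / total contacts (no contact means no edge).
    dist = [[0.0 if i == j else
             (1.0 / total[i][j] if total[i][j] > 0 else math.inf)
             for j in range(n)] for i in range(n)]
    # Floyd-Warshall all-pairs shortest paths.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def agglomerative_clusters(dist, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        _, x, y = min((linkage(clusters[x], clusters[y]), x, y)
                      for x in range(len(clusters))
                      for y in range(x + 1, len(clusters)))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Toy network: pairs (0, 1) and (2, 3) interact a lot, 1 and 2 only once.
day = [[0, 10, 0, 0], [10, 0, 1, 0], [0, 1, 0, 10], [0, 0, 10, 0]]
print(agglomerative_clusters(distance_table([day]), n_clusters=2))
# prints [[0, 1], [2, 3]]
```

Recording the sequence of merges and the distance at which each one happens, instead of stopping at a fixed number of clusters, yields the data needed to draw a dendrogram. For a population of realistic size, a library implementation of hierarchical clustering would be the more practical choice.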

Fig. 5 Structuration of a 60-employee company through Agglomerative Clustering


Conclusion
We have presented a three-step procedure to extract valuable information on epidemiological risk from contact tracing data. This approach paves the way for future developments relying on sophisticated mathematical tools (epidemiological models, graph theory, clustering approaches), which are used together to estimate reliable individual and global risk indices and to provide efficient tools to reduce them in a targeted way.

As more organizations develop and use these models and share their pseudonymized data, we see the opportunity to develop a common set of tools to compare behaviors and provide benchmarks to assist in driving appropriate responses to this new normal.

If you would like to talk to us about how data from Microshare’s Universal Contact Tracing solution is helping organizations to better manage their workforce safety around the world, please get in touch.

Bertrand Maury is Co-founder of Signactif and Professor of Mathematics at Université Paris-Saclay & Ecole Normale Supérieure. Sylvain Faure is Co-founder of Signactif and CNRS Research Engineer at Université Paris-Saclay.