Wednesday, March 7, 2018

Summary of Research Paper "Review of software for space-time disease surveillance"



•SaTScan, ClusterSeer, GeoSurveillance and the R surveillance package are software packages capable of space-time disease surveillance.
•Colin Robertson and Trisalyn A Nelson reviewed these four software packages in 2009.
•Their paper, "Review of software for space-time disease surveillance", was published in 2010 in the International Journal of Health Geographics.
In this post I summarize what I read in the paper, as shown in the following table.

Research Title
  • Review of software for space-time disease surveillance
What have the authors done in this paper?
  • The authors reviewed software packages capable of space-time disease surveillance analysis and outlined the salient features, shortcomings and usability of each package.
Why are GIS (Geographical Information Systems) used?
  • They are used for disease mapping and can visualize spatial variation in disease risk.
Why are statistical methods used?
  • To detect changes in the underlying disease process.
What is the key factor in selecting a method of analysis?
  • The choice of method depends on the objective of the surveillance, for example:
         1. Outbreak detection
         2. Trend monitoring
         3. Case detection
         4. Situational awareness
How can the methods be classified?
     1.    Statistical tests
     2.    Model based approaches
What are scan statistics mostly used for?
  • They are mostly used in outbreak detection.
What are modelling approaches used for?
  • They are mostly used for adjusting the number of cases of disease.
In this research, software programs were reviewed based on two criteria
     1.    Programs with built-in methods that handle both space and time.

Under this criterion they identified four software packages.
  • SaTScan 8.0
  • ClusterSeer 2.3
  • GeoSurveillance 1.1
  • Surveillance package 1.1-2 for R

     2.    Comprehensive disease surveillance systems
  • RODS, AEGIS
  • However, software of this type is built at the enterprise level and cannot be accessed by researchers or analysts.
How did the researchers conduct the review of each software package?
  • They conducted the review in two steps, following the broad steps of a typical data analysis.
  • The two steps were:
         1. Review of the data pre-processing required by each software package
         2. Analysis of methods and issues
Reviews
Data Pre-processing

SaTScan
  • It requires input data in three separate files.

      1. Geo file (stores spatial locations)
      2. Case file (stores cases)
      3. Population file (stores the population of each area)
  • Format of SaTScan files: text based. SaTScan can also import common formats such as CSV and DBF.
  • Key advantage of SaTScan over the other software packages:
  • SaTScan can aggregate data temporally (for example, into years). This is important because it limits the amount of data restructuring needed when trying different analysis parameters.
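
To make the three input files more concrete, here is a minimal Python sketch. It assumes the space-delimited column order that I understand SaTScan's text files to use (location ID first); the records, coordinates, populations and helper names are all made up for illustration. The yearly aggregation is done by hand here only to show the idea of what SaTScan does for you.

```python
from collections import Counter

# Hypothetical daily case records: (location_id, ISO date, number_of_cases)
cases = [
    ("A", "2008-01-15", 2),
    ("A", "2008-06-02", 1),
    ("A", "2009-03-10", 4),
    ("B", "2008-11-30", 3),
]

# Hypothetical coordinates (latitude, longitude) and yearly populations
coordinates = {"A": (6.93, 79.85), "B": (7.29, 80.63)}
population = {("A", 2008): 1200, ("A", 2009): 1250, ("B", 2008): 800}


def aggregate_by_year(records):
    """Sum case counts per (location, year): a hand-made temporal aggregation."""
    totals = Counter()
    for location, date, count in records:
        totals[(location, int(date[:4]))] += count
    return dict(totals)


def case_file_lines(yearly):
    # Assumed column order: location ID, number of cases, time
    return [f"{loc} {n} {year}" for (loc, year), n in sorted(yearly.items())]


def geo_file_lines(coords):
    # Assumed column order: location ID, latitude, longitude
    return [f"{loc} {lat} {lon}" for loc, (lat, lon) in sorted(coords.items())]


def population_file_lines(pop):
    # Assumed column order: location ID, year, population
    return [f"{loc} {year} {n}" for (loc, year), n in sorted(pop.items())]


yearly_cases = aggregate_by_year(cases)
print("\n".join(case_file_lines(yearly_cases)))
print("\n".join(geo_file_lines(coordinates)))
print("\n".join(population_file_lines(population)))
```

Trying a different time aggregation (months instead of years, say) would only change `aggregate_by_year`; the raw records stay untouched, which is the convenience the paper credits to SaTScan.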
ClusterSeer
 
  • It requires a unique record for every space-time unit under surveillance.


Both ClusterSeer and GeoSurveillance can read a polygon shapefile and calculate centroid coordinates.
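
As an illustration of what calculating a centroid from a polygon involves, here is a small Python sketch of the standard area-weighted centroid formula for a simple planar polygon. The paper does not say which formula ClusterSeer or GeoSurveillance use, so treat this as an assumption; the function name and the example polygon are hypothetical.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = 0.0  # twice the signed area of the polygon
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    # The centroid formula divides by 6 * area, which equals 3 * area2
    return cx / (3 * area2), cy / (3 * area2)


# A hypothetical 4 x 2 rectangular district
print(polygon_centroid([(0, 0), (4, 0), (4, 2), (0, 2)]))
```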
GeoSurveillance
  • Requires data to be aggregated temporally and spatially
  • Counts of cases need to be attributes of a polygon shapefile or a text file
  • Fields are required to be named in sequential order.

  • Drawbacks: data pre-processing is an extensive process, and temporal aggregation of data is not supported.
Surveillance Package in R
  • Data are required to be in matrix format
  • Rows - temporal observations
  • Columns - spatial units
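
The matrix layout can be sketched in Python as follows. This only shows the shape of the data (rows are time points, columns are spatial units); it is not the R package's own code, and the records and the `build_count_matrix` helper are hypothetical.

```python
def build_count_matrix(records, times, units):
    """Return a matrix with one row per time point and one column per spatial unit."""
    row = {t: i for i, t in enumerate(times)}
    col = {u: j for j, u in enumerate(units)}
    matrix = [[0] * len(units) for _ in times]
    for time, unit, count in records:
        matrix[row[time]][col[unit]] += count
    return matrix


# Hypothetical weekly case records: (week, district, cases)
records = [(1, "north", 2), (1, "south", 1), (3, "south", 5), (3, "south", 1)]
matrix = build_count_matrix(records, times=[1, 2, 3], units=["north", "south"])
for week_counts in matrix:
    print(week_counts)
```

Weeks without any records stay in the matrix as rows of zeros, so every time point is represented even when nothing was reported.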
Methods used in these software packages
SaTScan
  • SaTScan offers a number of scan statistics
  • Examples: spatial, temporal and space-time versions, in
    retrospective and prospective modes
  • Different data types can be accommodated by choosing among several probability models
  • Examples of probability models:
         1. Poisson
         2. Bernoulli
         3. Space-time permutation
         4. Multinomial
         5. Ordinal
         6. Exponential
         7. Normal
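
To give a feel for how a scan statistic scores one candidate cluster, here is a minimal Python sketch of the log likelihood ratio under the Poisson model, in the form it is usually written for Kulldorff's scan statistic. This formula is not taken from the reviewed paper, and the function name and numbers are hypothetical. It assumes the expected count inside the window is positive and smaller than the total number of cases.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Log likelihood ratio for a high-rate cluster under the Poisson model.

    cases_in    : observed cases inside the scanning window
    expected_in : expected cases inside the window under the null hypothesis
    total_cases : observed cases in the whole study area
    """
    if cases_in <= expected_in:
        return 0.0  # no excess of cases, so not a high-rate cluster
    inside = cases_in * math.log(cases_in / expected_in)
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    outside = cases_out * math.log(cases_out / expected_out) if cases_out > 0 else 0.0
    return inside + outside


# Hypothetical windows: 100 cases overall, 10 expected inside each window
for observed in (5, 20, 30):
    print(observed, round(poisson_llr(observed, 10, 100), 3))
```

A scan statistic evaluates many such windows and keeps the one with the largest ratio; its significance is then assessed by Monte Carlo simulation, which this sketch leaves out.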

GeoSurveillance
  • This software package implements the cusum (cumulative sum) approach to surveillance.
  • Retrospective mode - performs global spatial analysis
  • Prospective mode - univariate parallel surveillance
  • Multivariate cusum is not yet supported
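
The cusum idea can be sketched in a few lines of Python. This is a textbook one-sided cusum run separately for each region (which is what I take "univariate parallel surveillance" to mean), not GeoSurveillance's actual implementation; the counts, the expected value and the parameters `k` and `h` are hypothetical.

```python
def cusum(counts, expected, k=0.5, h=4.0):
    """One-sided cusum: S_t = max(0, S_{t-1} + (x_t - expected) - k).

    k is the allowance subtracted at each step and h is the alarm threshold;
    both defaults are illustrative values only.
    """
    score = 0.0
    scores, alarms = [], []
    for t, x in enumerate(counts):
        score = max(0.0, score + (x - expected) - k)
        scores.append(score)
        if score > h:
            alarms.append(t)  # time index at which the threshold is exceeded
    return scores, alarms


# Hypothetical weekly counts for two regions, monitored in parallel
regional_counts = {
    "north": [2, 3, 2, 6, 7, 8],
    "south": [2, 2, 3, 2, 3, 2],
}
for region, counts in regional_counts.items():
    scores, alarms = cusum(counts, expected=2.5)
    print(region, scores, alarms)
```

The score stays at zero while counts hover around the expected value and only accumulates when several high counts arrive in a row, which is why the method suits prospective monitoring.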
ClusterSeer
  • ClusterSeer has the widest range of space-time methods implemented.
  • These methods, which are suited to disease surveillance, include space-time scanning.
  • ClusterSeer is therefore a useful tool for exploring disease surveillance data
R Surveillance Package
  • Provides a number of algorithms:
         1. Farrington method
         2. Poisson cusum
         3. Two-component negative binomial model
  • These algorithms are model based and non-spatial
Data Output
SaTScan
  • SaTScan outputs are limited to text files and database files
  • The database file output can be linked back to the input file in a GIS for further examination of clusters.
  • No data exploration functionality is available
GeoSurveillance
  • Analysis data can be written to a text file
  • This file can then be manipulated in other software
  • Provides a basic map linked to a list of cusum scores
  • A cusum chart displays temporal patterns of cusum scores
ClusterSeer
  • Has advanced data output facilities
  • Mapping and graphic outputs can be exported as images
  • Results can also be exported with the data to new files for further examination inside statistical or GIS software.
R Surveillance Package
  • Has access to exporting and visualization functions

Related Research on Disease Surveillance using the SaTScan Tool

The following two studies relate to vector-borne disease surveillance using the SaTScan tool.



1. Population Density, Water Supply, and the Risk of Dengue Fever in Vietnam: Cohort Study and Spatial Analysis
    By Wolf-Peter Schmidt, Motoi Suzuki, Vu Dinh Thiem, Richard G. White, Ataru Tsuzuki, Lay-Myint Yoshida, Hideki Yanai, Ubydul Haque, Le Huu Tho, Dang Duc Anh, Koya Ariyoshi
    Published in 2011


2. The spatial-temporal clustering of Plasmodium falciparum infection over eleven years in Gezira State, The Sudan
    By Samia E Mirghani, Bakri YM Nour, Sayed M Bushra, Ibrahim M Elhassan, Robert W Snow and Abdisalan M Noor
    Malaria Journal 2010 9:172

  
In Table 1 I have summarized paper 1 from the list above, and in Table 2 I have summarized paper 2.

    

 Table 1

Title
Population Density, Water Supply, and the Risk of Dengue Fever in Vietnam: Cohort Study and Spatial Analysis
Link
Method
Individual-level cohort study, applying space-time scan statistics and mathematical models
Parameters
In previous work, researchers have investigated the effect of these parameters on dengue transmission.

  1. Climate change
  2. Demographic transition
  3. Urban structure

This study focuses on two parameters: population density and inadequate water supply.

Note: The researchers mention that although population density and inadequate water supply are two major contributors to dengue epidemics, these two risk factors are poorly understood.
Data source
They used hospital admission records and detailed household data from a large census area in Vietnam.
Findings
They showed that intense dengue virus transmission may occur in a remarkably narrow range of human population densities, with a high mosquito/human host ratio in the absence of tap water.

 Table 2
Title
The spatial-temporal clustering of Plasmodium falciparum infection over eleven years in Gezira State, The Sudan
Link
Method
Space-time clustering was analysed using the Bernoulli model, and the significance of clusters was tested using the Kulldorff scan statistic.
Data source
Data were assembled from cross-sectional surveys undertaken in January each year from 1999 to 2009 in 88 villages in Gezira State.

In each survey, about 100 children between the ages of 2 and 10 were sampled and tested for the presence of the disease.
Final database consists of these data parameters
  • Name of the village
  • The longitude and latitude
  • The year of survey
  • A summary of the number of children who were examined
  • The number who were positive for P. falciparum parasites.
Why did they use Kulldorff scan statistics?
Kulldorff scan statistics are implemented in SaTScan
and were used to analyse spatial and temporal clustering in the data.

The specific aim was to identify clusters of high infection rate.
Why did they use the Bernoulli model?
 Reason 1
The number of people who participated in the surveys varied over the years. The authors needed to remove the effect of these sampling changes, so that detected clusters reflect the number of people infected with the disease rather than the number of people who participated in the survey.

Reason 2
To allow analysis with the purely spatial and/or the space-time scan statistics.
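
Reason 1 can be illustrated with a small Python sketch of the Bernoulli log likelihood ratio for one candidate cluster, written in the form usually given for Kulldorff's scan statistic. It is not code from the paper or from SaTScan; the function names and survey numbers are hypothetical, and it assumes that children were examined both inside and outside the window.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def _binomial_loglik(positives, examined):
    """Maximised binomial log likelihood for a group of examined children."""
    p = positives / examined
    return _xlogy(positives, p) + _xlogy(examined - positives, 1 - p)


def bernoulli_llr(pos_in, n_in, pos_total, n_total):
    """Log likelihood ratio for a high-rate cluster under the Bernoulli model."""
    pos_out, n_out = pos_total - pos_in, n_total - n_in
    if pos_in / n_in <= pos_out / n_out:
        return 0.0  # infection rate inside is not higher than outside
    alternative = _binomial_loglik(pos_in, n_in) + _binomial_loglik(pos_out, n_out)
    null = _binomial_loglik(pos_total, n_total)
    return alternative - null


# Many positives only because many children were examined: same 10% rate everywhere
print(bernoulli_llr(pos_in=50, n_in=500, pos_total=60, n_total=600))
# Genuinely higher rate inside the window: 30% inside against 10% outside
print(bernoulli_llr(pos_in=30, n_in=100, pos_total=40, n_total=200))
```

Because the model compares positives against the number examined, a village that simply had a larger sample in one year does not score as a cluster unless its infection rate is actually higher.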