Most Popular Article of 2017: Disease Prediction by Machine Learning Over Big Data From Healthcare Communities

With big data growth in biomedical and healthcare communities, accurate analysis of medical data benefits early disease detection, patient care, and community services. However, the analysis accuracy is reduced when the quality of medical data is incomplete. Moreover, different regions exhibit unique characteristics of certain regional diseases, which may weaken the prediction of disease outbreaks. In this paper, we streamline machine learning algorithms for effective prediction of chronic disease outbreak in disease-frequent communities. We experiment the modified prediction models over real-life hospital data collected from central China in 2013-2015. To overcome the difficulty of incomplete data, we use a latent factor model to reconstruct the missing data. We experiment on a regional chronic disease of cerebral infarction. We propose a new convolutional neural network (CNN)-based multimodal disease risk prediction algorithm using structured and unstructured data from hospital. To the best of our knowledge, none of the existing work focused on both data types in the area of medical big data analytics. Compared with several typical prediction algorithms, the prediction accuracy of our proposed algorithm reaches 94.8% with a convergence speed, which is faster than that of the CNN-based unimodal disease risk prediction algorithm.

View this article on IEEE Xplore

Dengue Epidemics Prediction: A Survey of the State-of-the-Art based on Data Science Processes

Dengue infection is a mosquito-borne disease caused by dengue viruses, which are carried by several species of mosquito of the genus Aedes, principally Ae. aegypti. Dengue outbreaks are endemic in tropical and sub-tropical regions of the world, mainly in urban and sub-urban areas. The outbreak is one of the top ten diseases causing the most deaths worldwide. According to the World Health Organization (WHO), dengue infection has increased 30-fold globally over the past five decades. About 50 to 100 million new infections occur annually in more than 80 countries. Many researchers are working on measures to prevent and control the spread. One avenue of research is collaboration between computer science and the epidemiology researchers in developing methods of predicting potential outbreaks of dengue infection. An important research objective is to develop models that enable, or enhance, forecasting of outbreaks of dengue, giving medical professionals the opportunity to develop plans for handling the outbreak, well in advance. Researchers have been gathering and analyzing data to better identify the relational factors driving the spread of the disease, as well as the development of a variety of methods of predictive modelling using statistical and mathematical analysis and Machine Learning. In this substantial review of the literature on the state of the art of research over the past decades, we identified six main issues to be explored and analyzed: (1) The available data sources, (2) Data preparation techniques, (3) Data representations, (4) Forecasting models and methods, (5) Dengue forecasting models evaluation approaches, and (6) Future challenges and possibilities in forecasting modelling of dengue outbreaks. Our comprehensive exploration of the issues provides a valuable information foundation for new researchers in this important area of public health research and epidemiology.

View this article on IEEE Xplore