Code Generation Using Machine Learning: A Systematic Review

Recently, machine learning (ML) methods have been used to create powerful language models for a broad range of natural language processing tasks. An important subset of this field is that of generating code of programming languages for automatic software development. This review provides a broad and detailed overview of studies for code generation using ML. We selected 37 publications indexed in arXiv and IEEE Xplore databases that train ML models on programming language data to generate code. The three paradigms of code generation we identified in these studies are description-to-code, code-to-description, and code-to-code. The most popular applications that work in these paradigms were found to be code generation from natural language descriptions, documentation generation, and automatic program repair, respectively. The most frequently used ML models in these studies include recurrent neural networks, transformers, and convolutional neural networks. Other neural network architectures, as well as non-neural techniques, were also observed. In this review, we have summarized the applications, models, datasets, results, limitations, and future work of 37 publications. Additionally, we include discussions on topics general to the literature reviewed. This includes comparing different model types, comparing tokenizers, the volume and quality of data used, and methods for evaluating synthesized code. Furthermore, we provide three suggestions for future work for code generation using ML.

View this article on IEEE Xplore

 

Artificial Intelligence and COVID-19: Deep Learning Approaches for Diagnosis and Treatment

COVID-19 outbreak has put the whole world in an unprecedented difficult situation bringing life around the world to a frightening halt and claiming thousands of lives. Due to COVID-19’s spread in 212 countries and territories and increasing numbers of infected cases and death tolls mounting to 5,212,172 and 334,915 (as of May 22 2020), it remains a real threat to the public health system. This paper renders a response to combat the virus through Artificial Intelligence (AI). Some Deep Learning (DL) methods have been illustrated to reach this goal, including Generative Adversarial Networks (GANs), Extreme Learning Machine (ELM), and Long/Short Term Memory (LSTM). It delineates an integrated bioinformatics approach in which different aspects of information from a continuum of structured and unstructured data sources are put together to form the user-friendly platforms for physicians and researchers. The main advantage of these AI-based platforms is to accelerate the process of diagnosis and treatment of the COVID-19 disease. The most recent related publications and medical reports were investigated with the purpose of choosing inputs and targets of the network that could facilitate reaching a reliable Artificial Neural Network-based tool for challenges associated with COVID-19. Furthermore, there are some specific inputs for each platform, including various forms of the data, such as clinical data and medical imaging which can improve the performance of the introduced approaches toward the best responses in practical applications.

View this article on IEEE Xplore

Dengue Epidemics Prediction: A Survey of the State-of-the-Art based on Data Science Processes

Dengue infection is a mosquito-borne disease caused by dengue viruses, which are carried by several species of mosquito of the genus Aedes, principally Ae. aegypti. Dengue outbreaks are endemic in tropical and sub-tropical regions of the world, mainly in urban and sub-urban areas. The outbreak is one of the top ten diseases causing the most deaths worldwide. According to the World Health Organization (WHO), dengue infection has increased 30-fold globally over the past five decades. About 50 to 100 million new infections occur annually in more than 80 countries. Many researchers are working on measures to prevent and control the spread. One avenue of research is collaboration between computer science and the epidemiology researchers in developing methods of predicting potential outbreaks of dengue infection. An important research objective is to develop models that enable, or enhance, forecasting of outbreaks of dengue, giving medical professionals the opportunity to develop plans for handling the outbreak, well in advance. Researchers have been gathering and analyzing data to better identify the relational factors driving the spread of the disease, as well as the development of a variety of methods of predictive modelling using statistical and mathematical analysis and Machine Learning. In this substantial review of the literature on the state of the art of research over the past decades, we identified six main issues to be explored and analyzed: (1) The available data sources, (2) Data preparation techniques, (3) Data representations, (4) Forecasting models and methods, (5) Dengue forecasting models evaluation approaches, and (6) Future challenges and possibilities in forecasting modelling of dengue outbreaks. Our comprehensive exploration of the issues provides a valuable information foundation for new researchers in this important area of public health research and epidemiology.

View this article on IEEE Xplore