Chat2VIS: Generating Data Visualizations via Natural Language Using ChatGPT, Codex and GPT-3 Large Language Models

The field of data visualisation has long aimed to devise solutions for generating visualisations directly from natural language text. Research in Natural Language Interfaces (NLIs) has contributed towards the development of such techniques. However, the implementation of workable NLIs has always been challenging due to the inherent ambiguity of natural language, as well as unclear and poorly written user queries, which make it difficult for existing language models to discern user intent. Instead of pursuing the usual path of developing new iterations of language models, this study uniquely proposes leveraging the advancements in pre-trained large language models (LLMs) such as ChatGPT and GPT-3 to convert free-form natural language directly into code for appropriate visualisations. This paper presents a novel system, Chat2VIS, which takes advantage of the capabilities of LLMs and demonstrates how, with effective prompt engineering, the complex problem of language understanding can be solved more efficiently, resulting in simpler and more accurate end-to-end solutions than prior approaches. Chat2VIS shows that LLMs together with the proposed prompts offer a reliable approach to rendering visualisations from natural language queries, even when queries are highly misspecified and underspecified. This solution also presents a significant reduction in costs for the development of NLI systems, while attaining greater visualisation inference abilities compared to traditional NLP approaches that use hand-crafted grammar rules and tailored models. This study also shows how LLM prompts can be constructed in a way that preserves data security and privacy while being generalisable to different datasets. This work compares the performance of GPT-3, Codex and ChatGPT across several case studies and contrasts their performance with that of prior studies.
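
To make the prompt-engineering idea concrete, the sketch below shows one way such a system can assemble a prompt from a dataframe's column names alone (keeping the underlying records private, in line with the abstract's data-security claim) and ask an LLM for plotting code. This is a minimal illustration assuming the OpenAI Python client and an arbitrary stand-in model name; it is not the authors' actual Chat2VIS prompt.

import pandas as pd
from openai import OpenAI  # assumes the official OpenAI Python client

df = pd.DataFrame({"country": ["NZ", "AU"], "gdp_trillions": [0.25, 1.55]})

# Build the prompt from the schema only; no data rows leave the machine.
prompt = (
    "You are given a pandas dataframe `df` with columns: "
    + ", ".join(f"{c} ({df[c].dtype})" for c in df.columns)
    + ". Write matplotlib code, and nothing else, to answer: "
    + "show GDP by country as a bar chart."
)

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # stand-in; the paper evaluated GPT-3, Codex and ChatGPT
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # the generated plotting code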


Code Generation Using Machine Learning: A Systematic Review

Recently, machine learning (ML) methods have been used to create powerful language models for a broad range of natural language processing tasks. An important subset of this field is the generation of programming-language code for automatic software development. This review provides a broad and detailed overview of studies for code generation using ML. We selected 37 publications indexed in the arXiv and IEEE Xplore databases that train ML models on programming language data to generate code. The three paradigms of code generation we identified in these studies are description-to-code, code-to-description, and code-to-code. The most popular applications in these paradigms were found to be code generation from natural language descriptions, documentation generation, and automatic program repair, respectively. The most frequently used ML models in these studies include recurrent neural networks, transformers, and convolutional neural networks. Other neural network architectures, as well as non-neural techniques, were also observed. In this review, we summarize the applications, models, datasets, results, limitations, and future work of the 37 publications. Additionally, we discuss topics general to the reviewed literature, including comparisons of model types and tokenizers, the volume and quality of the data used, and methods for evaluating synthesized code. Furthermore, we provide three suggestions for future work on code generation using ML.
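
As a concrete example of the description-to-code paradigm identified above, the sketch below runs a pretrained code-generation transformer. It is a minimal illustration assuming the Hugging Face transformers library and the publicly available Salesforce/codet5-base checkpoint; in practice such models are fine-tuned on a specific task before their output is useful.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# A pretrained encoder-decoder transformer for code (one of many possible choices)
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("Salesforce/codet5-base")

# Natural-language description in, program text out (description-to-code)
description = "return the sum of two numbers a and b"
inputs = tokenizer(description, return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_length=48)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))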


Energy Optimization in Massive MIMO UAV-Aided MEC-Enabled Vehicular Networks

This paper presents a novel unmanned aerial vehicle (UAV)-aided mobile edge computing (MEC) architecture for vehicular networks. It is considered that the vehicles must complete latency-critical, computation-intensive tasks either locally with on-board computation units or by offloading part of their tasks to road side units (RSUs) with collocated MEC servers. In this direction, a hovering UAV can serve as an aerial RSU (ARSU) for task processing or act as an aerial relay and further offload the computation tasks to a ground RSU (GRSU). To significantly reduce the delay during data offloading and downloading, this architecture relies on the benefits of line-of-sight (LoS) massive multiple-input multiple-output (MIMO). Therefore, it is considered that the vehicles, the ARSU, and the GRSU employ large-scale antenna arrays. A three-dimensional (3-D) geometrical representation of the MEC-enabled network is introduced, and an optimization method is proposed that minimizes the computation-based and communication-based weighted total energy consumption (WTEC) of the vehicles and the ARSU subject to transmit power allocation, task allocation, and time slot scheduling. The results verify the theoretical derivations, emphasize the effectiveness of the LoS massive MIMO transmission, and provide useful engineering insights.
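
To illustrate the kind of trade-off such an optimization captures, the toy sketch below minimizes a weighted sum of local computation energy, uplink transmission energy, and ARSU computation energy over the task split and transmit power. The simplified energy model and all parameter values are illustrative assumptions, not the paper's formulation (which additionally covers time slot scheduling).

import numpy as np
from scipy.optimize import minimize

# Illustrative constants (assumed, not taken from the paper)
L = 1e6                       # task size (bits)
C = 1e3                       # CPU cycles per bit
kappa = 1e-27                 # effective switched capacitance of the CPUs
f_veh, f_arsu = 1e9, 3e9      # CPU frequencies of vehicle and ARSU (Hz)
B, g, N0 = 10e6, 1e-7, 1e-13  # bandwidth (Hz), channel gain, noise power (W)
w_veh, w_arsu = 0.6, 0.4      # weights in the total energy objective

def wtec(x):
    alpha, p = x                                       # task split, transmit power
    rate = B * np.log2(1 + p * g / N0)                 # achievable uplink rate
    e_local = kappa * C * (1 - alpha) * L * f_veh**2   # local computation energy
    e_tx = p * alpha * L / rate                        # energy = power x offload time
    e_arsu = kappa * C * alpha * L * f_arsu**2         # ARSU computation energy
    return w_veh * (e_local + e_tx) + w_arsu * e_arsu

res = minimize(wtec, x0=[0.5, 0.1], bounds=[(0.0, 1.0), (1e-3, 1.0)])
alpha_opt, p_opt = res.x
print(f"offload fraction {alpha_opt:.2f}, tx power {p_opt:.3f} W, energy {res.fun:.3f} J")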

*Published in the IEEE Vehicular Technology Society Section within IEEE Access.


Robots and Wizards: An Investigation Into Natural Human–Robot Interaction

The goal of this study was to investigate the different communication modalities needed for intuitive Human-Robot Interaction. The study utilizes a Wizard of Oz prototyping method to enable restriction-free, intuitive interaction with an industrial robot. The data from 36 test subjects suggest a strong preference for speech input, automatic path planning, and pointing gestures. The gesture catalogue developed during the experiment contains intrinsic gestures and suggests that the two most popular gestures per action can be sufficient to cover the majority of users. The system scored an average of 74% in different user experience questionnaires, despite containing deliberately forced flaws. These findings enable the future development of an intuitive Human-Robot Interaction system with high user acceptance.

*The video published with this article received a promotional prize for the 2020 IEEE Access Best Multimedia Award (Part 2).


Network Representation Learning: From Traditional Feature Learning to Deep Learning

Network representation learning (NRL) is an effective graph analytics technique that helps users gain a deep understanding of the hidden characteristics of graph data. It has been successfully applied in many real-world tasks related to network science, such as social network data processing, biological information processing, and recommender systems. Deep learning is a powerful tool for learning data features. However, it is non-trivial to generalize deep learning to graph-structured data, since graphs differ from regular data such as images, which carry spatial information, and sounds, which carry temporal information. Recently, researchers have proposed many deep learning-based methods for NRL. In this survey, we examine classical NRL approaches, from traditional feature learning methods to deep learning-based models, analyze the relationships between them, and summarize the latest progress. Finally, we discuss open issues in NRL and point out future directions in this field.
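
For readers new to the area, the sketch below implements the classic random-walk flavour of NRL (in the spirit of DeepWalk): truncated random walks over the graph are fed to a skip-gram word2vec model, so that nodes appearing in similar neighbourhoods receive similar embeddings. It assumes networkx and gensim and a toy graph, and illustrates the technique generically rather than any specific method from this survey.

import random
import networkx as nx
from gensim.models import Word2Vec

G = nx.karate_club_graph()  # small, connected toy graph

def random_walk(g, start, length=10):
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(list(g.neighbors(walk[-1]))))
    return [str(n) for n in walk]  # word2vec expects string tokens

# Treat walks as "sentences" and nodes as "words"
walks = [random_walk(G, n) for n in G.nodes() for _ in range(20)]
model = Word2Vec(walks, vector_size=32, window=5, min_count=0, sg=1, epochs=5)
print(model.wv["0"][:5])  # first entries of node 0's embedding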


State-Based Decoding of Force Signals From Multi-Channel Local Field Potentials

The functional use of brain-machine interfaces (BMIs) in everyday tasks requires the accurate decoding of both movement and force information. In real-world tasks such as reach-to-grasp movements, a prosthetic hand should be switched between reaching and grasping modes, depending on the user intent detected in the decoder part of the BMI. Therefore, it is important for the decoder to detect the rest and active states of different actions so as to produce the corresponding continuous command output during the estimated state. In this study, we demonstrate that the resting and force-generating time segments in a key-pressing task can be accurately detected from local field potentials (LFPs) in the rat primary motor cortex. The common spatial pattern (CSP) algorithm was applied to different spectral LFP sub-bands to maximize the difference between the two classes of force and rest. We also show that combining a discrete state decoder with linear or non-linear continuous force decoders leads to higher force-decoding performance than using a continuous variable decoder alone. Moreover, the results suggest that gamma-band LFP signals (50-100 Hz) can be used successfully for decoding both the discrete rest/force states and the continuous values of the force variable. The results of this study can offer substantial benefits for the implementation of a self-paced, force-related command generator in BMI experiments without the need for manual external signals to select the state of the decoder.
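
A minimal sketch of this state-based pipeline, assuming MNE-Python for CSP and scikit-learn for the decoders, with synthetic stand-in LFP epochs (a real pipeline would first band-pass the LFPs into sub-bands such as gamma, 50-100 Hz): CSP features drive a discrete rest/force classifier, which gates a continuous force regressor so that force is only estimated during the detected active state.

import numpy as np
from mne.decoding import CSP
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Synthetic epochs: 200 trials x 16 channels x 500 samples, rest vs. force
X = rng.standard_normal((200, 16, 500))
state = rng.integers(0, 2, 200)          # 0 = rest, 1 = force generation
X[state == 1, :4] *= 3.0                 # inject class-dependent variance
force = np.where(state == 1, rng.uniform(0.2, 1.0, 200), 0.0)

csp = CSP(n_components=4)                # spatial filters separating the classes
feats = csp.fit_transform(X, state)

state_decoder = LinearDiscriminantAnalysis().fit(feats, state)                # discrete
force_decoder = LinearRegression().fit(feats[state == 1], force[state == 1])  # continuous

pred_state = state_decoder.predict(feats)
pred_force = np.where(pred_state == 1, force_decoder.predict(feats), 0.0)
print("state accuracy:", (pred_state == state).mean())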


Machine Learning Empowered Spectrum Sharing in Intelligent Unmanned Swarm Communication Systems: Challenges, Requirements and Solutions

The unmanned swarm system (USS) is seen as a promising technology that will play an extremely important role in both military and civilian fields, such as military strikes, disaster relief, and transportation. As the “nerve center” of the USS, the unmanned swarm communication system (USCS) provides the necessary information transmission medium to ensure system stability and mission implementation. However, challenges posed by multiple tasks, distributed collaboration, high dynamics, ultra-dense deployment, and jamming threats make it hard for the USCS to manage its limited spectrum resources. To tackle these problems, this paper introduces machine learning (ML)-empowered intelligent spectrum management techniques. First, based on the challenges of spectrum resource management in the USCS, the requirements of spectrum sharing are analyzed from the perspectives of spectrum collaboration and spectrum confrontation. We find that suitable multi-agent collaborative decision making is a promising way to realize effective spectrum sharing from both perspectives. Therefore, a multi-agent learning framework is proposed that contains mobile-computing-assisted and distributed structures. Based on this framework, we provide case studies. Finally, future research directions are discussed.
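
As a toy instance of such multi-agent collaborative decision making, the sketch below runs independent epsilon-greedy Q-learners in numpy: each agent repeatedly picks a channel and is rewarded only when that channel is uncontested, so the swarm gradually learns a collision-free spectrum allocation. This is a generic illustration, not the framework proposed in the paper.

import numpy as np

rng = np.random.default_rng(1)
n_agents, n_channels, episodes = 4, 4, 5000
eps, lr = 0.1, 0.1
Q = np.zeros((n_agents, n_channels))  # one stateless Q-table per agent

for _ in range(episodes):
    greedy = Q.argmax(axis=1)
    explore = rng.random(n_agents) < eps
    actions = np.where(explore, rng.integers(0, n_channels, n_agents), greedy)
    # Reward 1 for an uncontested channel, 0 on collision (co-channel interference)
    counts = np.bincount(actions, minlength=n_channels)
    rewards = (counts[actions] == 1).astype(float)
    idx = np.arange(n_agents)
    Q[idx, actions] += lr * (rewards - Q[idx, actions])

print("learned channel assignment:", Q.argmax(axis=1))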


Federating Cloud Systems for Collaborative Construction and Engineering

The construction industry has undergone a transformation in the use of data to drive its processes and outcomes, especially with the use of Building Information Modelling (BIM). In particular, project collaboration in the construction industry can involve multiple stakeholders (architects, engineers, consultants) that exchange data at different project stages. Therefore, the use of cloud computing in construction projects has continued to increase, primarily due to the ease of access, availability, and scalability in data storage and analysis available through such platforms. Federation of cloud systems can provide greater flexibility in choosing a cloud provider, enabling different members of the construction project to select a provider based on their cost-to-benefit requirements. When multiple construction disciplines collaborate online, the risk associated with project failure increases, as the capability of a provider to deliver on the project cannot be assessed a priori. In such uncontrolled industrial environments, “trust” can be an efficacious mechanism for more informed decision making that adapts to the evolving nature of such dynamic multi-organisation collaborations in construction. This paper presents a trust-based Cooperation Value Estimation (CoVE) approach to enable and sustain collaboration among disciplines in construction projects, focusing mainly on data privacy, security, and performance. The proposed approach is demonstrated with data and processes from a real highway bridge construction project, describing the entire selection process of a cloud provider. The selection process uses the audit and assessment process of the Cloud Security Alliance (CSA) and real-world performance data from construction industry workloads. Other application domains can also make use of the proposed approach by adapting it to their respective specifications. Experimental evaluation has shown that the proposed approach ensures on-time completion of projects and enhanced
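
The flavour of trust-based provider selection can be shown with a simple weighted aggregation over the privacy, security, and performance criteria mentioned above. The providers, scores, and weights below are hypothetical, and the plain weighted sum is only a stand-in for the paper's CoVE approach.

# Hypothetical normalized scores per provider (0 = worst, 1 = best)
providers = {
    "ProviderA": {"privacy": 0.9, "security": 0.8, "performance": 0.6},
    "ProviderB": {"privacy": 0.7, "security": 0.9, "performance": 0.8},
    "ProviderC": {"privacy": 0.8, "security": 0.7, "performance": 0.9},
}
weights = {"privacy": 0.4, "security": 0.4, "performance": 0.2}  # project priorities

def trust_score(metrics):
    # Weighted sum of criteria; a real scheme would also fold in audit results
    return sum(weights[k] * v for k, v in metrics.items())

scores = {name: round(trust_score(m), 2) for name, m in providers.items()}
print(scores, "-> selected:", max(scores, key=scores.get))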


A Cascaded Multimodal Natural User Interface to Reduce Driver Distraction

Natural user interfaces (NUIs) have been used to reduce driver distraction while using in-vehicle infotainment systems (IVIS), and multimodal interfaces have been applied to compensate for the shortcomings of a single modality in NUIs. These multimodal NUIs have variable effects on different types of driver distraction and on different stages of drivers’ secondary tasks. However, current studies provide a limited understanding of NUIs: the design of multimodal NUIs is typically based on evaluations of the strengths of a single modality, and studies of multimodal NUIs are not based on equivalent comparison conditions. To address this gap, we compared five single modalities commonly used for NUIs (touch, mid-air gesture, speech, gaze, and physical buttons located on the steering wheel) during a lane change task (LCT) to provide a more holistic view of driver distraction. Our findings suggest that the best approach is a cascaded multimodal interface that combines modalities according to their individual characteristics. We compared several cascaded multimodal combinations by considering the characteristics of each modality in the sequential phases of the command input process. Our results show that the combinations speech + button, speech + touch, and gaze + button represent the best cascaded multimodal interfaces for reducing driver distraction with IVIS.
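
To make “cascaded” concrete: each command passes through sequential phases, with a different modality handling each phase, e.g. speech to select the target function and a steering-wheel button to confirm it. The sketch below is a hypothetical two-phase handler illustrating the idea, not the authors’ experimental system.

from typing import Optional

class CascadedInput:
    """Two-phase cascaded interface: speech selects, button confirms."""

    def __init__(self) -> None:
        self.pending: Optional[str] = None

    def on_speech(self, command: str) -> str:
        self.pending = command  # phase 1: select the function via speech
        return f"selected '{command}', press the wheel button to confirm"

    def on_button(self) -> str:
        if self.pending is None:
            return "nothing selected"
        command, self.pending = self.pending, None
        return f"executing '{command}'"  # phase 2: confirm via button

ivis = CascadedInput()
print(ivis.on_speech("navigate home"))
print(ivis.on_button())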
