Controlling the Skyrmion Density and Size for Quantized Convolutional Neural Network
The exceptional properties of skyrmion devices, including their miniature size, topologically protected nature, and low current requirements, render them highly promising for energy-efficient neuromorphic computing applications. Examining the creation, stability, and dynamics of magnetic skyrmions in thin-film systems is imperative to realize these skyrmion-based neuromorphic devices. Herein, we report the creation, stability, and tunability of magnetic skyrmions in the Ta/IrMn/CoFeB/MgO thin-film system. We use polar magneto-optic Kerr effect (MOKE) microscopy and micromagnetic simulations to investigate the magnetic-field dependence of skyrmion density and size. The topological charge evolution with time under a magnetic field is studied, and the transformation dynamics are explained. Furthermore, we demonstrate skyrmion size and density tunability as parameters controlled by voltage, current, and magnetic field via voltage-controlled magnetic anisotropy (VCMA) and the Dzyaloshinskii-Moriya interaction (DMI). We propose a skyrmion-based synaptic device for neuromorphic computing applications. The device exhibits spin-orbit torque-controlled discrete topological resistance states with high linearity and uniformity, enabling a hardware implementation of weight quantization in a Quantized Convolutional Neural Network (QCNN). Our experimental results demonstrate that the devices can be trained and tested on the CIFAR-10 dataset, achieving a recognition accuracy of ~87%. The findings open new avenues for developing neuromorphic computing devices based on tunable skyrmion systems.
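As a rough illustration of the weight-quantization idea described above (not the authors' device model), the following minimal sketch maps a continuous weight tensor onto a small set of evenly spaced levels, analogous to encoding synaptic weights in discrete topological resistance states; the level count and array shapes are illustrative assumptions.

```python
# Minimal sketch: uniform quantization of a weight tensor onto a few discrete
# levels, analogous to mapping synaptic weights onto discrete resistance states.
import numpy as np

def quantize_weights(w: np.ndarray, n_levels: int = 16) -> np.ndarray:
    """Map continuous weights onto n_levels evenly spaced values."""
    w_min, w_max = w.min(), w.max()
    if np.isclose(w_min, w_max):          # guard against a constant tensor
        return np.full_like(w, w_min)
    step = (w_max - w_min) / (n_levels - 1)
    levels = np.round((w - w_min) / step)  # index of the nearest level
    return w_min + levels * step           # back to the weight scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(3, 3))                    # e.g. a small conv kernel
    wq = quantize_weights(w, n_levels=16)          # 16 assumed resistance states
    print(np.unique(wq).size, "distinct levels used")
```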
View this article on IEEE Xplore
EXplainable Artificial Intelligence (XAI)—From Theory to Methods and Applications
Intelligent applications supported by Machine Learning have achieved remarkable performance rates for a wide range of tasks in many domains. However, understanding why a trained algorithm makes a particular decision remains problematic. Given the growing interest in learning-based models, concerns arise when they are deployed in sensitive environments that may impact users’ lives. The complex nature of these models’ decision mechanisms makes them the so-called “black boxes,” in which understanding the logic behind automated decision-making is not trivial for humans. Furthermore, the reasoning that leads a model to a specific prediction can be more important than performance metrics, which introduces a trade-off between interpretability and model accuracy. Explaining intelligent computer decisions can be regarded as a way to justify their reliability and establish trust. In this sense, explanations are critical tools that verify predictions, uncovering errors and biases previously hidden within the models’ complex structures and opening up vast possibilities for more responsible applications. In this review, we provide theoretical foundations of Explainable Artificial Intelligence (XAI), clarifying diffuse definitions and identifying research objectives, challenges, and future research lines related to turning opaque machine learning outputs into more transparent decisions. We also present a careful overview of the state-of-the-art explainability approaches, with a particular analysis of methods based on feature importance, such as the well-known LIME and SHAP. As a result, we highlight practical applications of the successful use of XAI.
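To make the feature-importance idea concrete, here is a minimal, model-agnostic sketch using scikit-learn's permutation importance, a simpler relative of LIME/SHAP-style explanations; the dataset and classifier are placeholders, and the real LIME and SHAP packages have their own dedicated APIs.

```python
# Minimal feature-importance sketch: shuffle each feature and measure the drop
# in test accuracy; a larger drop indicates a more important feature.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranking = result.importances_mean.argsort()[::-1]
for idx in ranking[:5]:
    print(f"{X.columns[idx]:<25s} {result.importances_mean[idx]:.4f}")
```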
View this article on IEEE Xplore
Scalable Empirical Dynamic Modeling With Parallel Computing and Approximate k-NN Search
Empirical Dynamic Modeling (EDM) is a mathematical framework for modeling and predicting non-linear time series data. Although EDM is increasingly adopted in various research fields, its application to large-scale data has been limited due to its high computational cost. This article presents kEDM, a high-performance implementation of EDM for analyzing large-scale time series datasets. kEDM adopts the Kokkos performance-portable programming model to run efficiently on both CPU and GPU while sharing a single code base. We also conduct hardware-specific optimization of performance-critical kernels. kEDM achieved up to a 6.58× speedup in pairwise causal inference on real-world biology datasets compared to an existing EDM implementation. Furthermore, we integrate multiple approximate k-NN search algorithms into EDM to enable the analysis of extremely large datasets that were intractable with conventional EDM based on exhaustive k-NN search. EDM-based time series forecasting enhanced with approximate k-NN search demonstrated up to a 790× speedup compared to conventional Simplex projection, with less than a 1% increase in MAPE.
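For readers unfamiliar with EDM, the following minimal sketch shows the Simplex projection step the speedups refer to: delay-embed a scalar series, find the k nearest neighbors of the latest state, and forecast as a distance-weighted average of those neighbors' one-step futures. The exhaustive scikit-learn k-NN search here could, in principle, be swapped for an approximate index on very long series; embedding dimension and k are illustrative choices.

```python
# Minimal Simplex-projection sketch with an exhaustive k-NN search.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def delay_embed(x, E, tau=1):
    """Return the E-dimensional delay embedding of a 1-D series."""
    rows = len(x) - (E - 1) * tau
    return np.column_stack([x[i * tau : i * tau + rows] for i in range(E)])

def simplex_forecast(x, E=3, k=None):
    k = k or E + 1                                   # common EDM choice
    emb = delay_embed(x, E)
    library, targets = emb[:-1], x[E:]               # state -> next value
    query = emb[-1].reshape(1, -1)
    nn = NearestNeighbors(n_neighbors=k).fit(library)
    dist, idx = nn.kneighbors(query)
    w = np.exp(-dist[0] / max(dist[0][0], 1e-12))    # exponential distance weights
    return float(np.sum(w * targets[idx[0]]) / np.sum(w))

if __name__ == "__main__":
    t = np.linspace(0, 40, 800)
    x = np.sin(t) + 0.05 * np.random.default_rng(0).normal(size=t.size)
    print("next value ~", simplex_forecast(x, E=3))
```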
View this article on IEEE Xplore
AMS Circuit Design Optimization Technique Based on ANN Regression Model With VAE Structure
Designing an advanced analog mixed-signal (AMS) circuit is not simple: the required performance metrics must be met together with robust operation under process-voltage-temperature (PVT) variations. Even commercial products demand stringent specifications while maintaining the system’s performance. The main objectives of this study are to increase the efficiency of the design optimization process by structuring it as multiple regression-modeling stages, to characterize our target circuit as a regression model that includes PVT variations, and to enable a search for co-optimum design points while simultaneously checking performance sensitivity. We used an artificial neural network (ANN) to develop a regression model and divided the ANN modeling process into coarse and fine simulation steps. In addition, we applied a variational autoencoder (VAE) structure to the ANN model to reduce the training error caused by insufficient input samples. With the proposed algorithm, the AMS circuit designer can quickly search for the co-optimum point that yields the best performance with the least sensitive operation, because the design process uses a regression model instead of launching heavy SPICE simulations. In this study, a voltage-controlled oscillator (VCO) is selected to prove the proposed algorithm. Under various design conditions (CMOS 180 nm, 65 nm, and 45 nm processes), we proceed with the proposed design flow to obtain the best performance score as evaluated by a figure-of-merit (FoM). As a result, the proposed regression-model-based design flow achieves twice the accuracy of the conventional single-step design flow.
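A minimal PyTorch sketch of the general idea (not the paper's exact architecture): a regression network that maps design parameters to performance metrics through a VAE-style stochastic bottleneck, whose KL term regularizes the latent space when training samples are scarce. Input/output dimensions, latent size, loss weighting, and the random training data are all illustrative assumptions standing in for SPICE-simulated samples.

```python
# Minimal sketch: ANN performance regressor with a VAE-style latent bottleneck.
import torch
import torch.nn as nn

class VAERegressor(nn.Module):
    def __init__(self, n_design=8, latent=4, n_metrics=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_design, 32), nn.ReLU())
        self.mu = nn.Linear(32, latent)
        self.logvar = nn.Linear(32, latent)
        self.head = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(),
                                  nn.Linear(32, n_metrics))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.head(z), mu, logvar

def loss_fn(pred, target, mu, logvar, beta=1e-3):
    mse = nn.functional.mse_loss(pred, target)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return mse + beta * kld            # KL term regularizes the latent space

x = torch.randn(256, 8)                # design parameters (incl. PVT corners)
y = torch.randn(256, 3)                # performance metrics (e.g. FoM terms)
model = VAERegressor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    pred, mu, logvar = model(x)
    loss = loss_fn(pred, y, mu, logvar)
    opt.zero_grad(); loss.backward(); opt.step()
```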
View this article on IEEE Xplore
DNN Partitioning for Inference Throughput Acceleration at the Edge
Deep neural network (DNN) inference on streaming data requires computing resources that satisfy inference throughput requirements. However, latency- and privacy-sensitive deep learning applications cannot afford to offload computation to remote clouds because of the implied transmission cost and the lack of trust in third-party cloud providers. Among the solutions for increasing performance while keeping computation in a constrained environment, hardware acceleration can be onerous, and model optimization requires extensive design effort while hindering accuracy. DNN partitioning is a third, complementary approach: it distributes the inference workload over several available edge devices, taking into account the edge network properties and the DNN structure, with the objective of maximizing the inference throughput (number of inferences per second). This paper introduces a method to predict inference and transmission latencies for multi-threaded distributed DNN deployments, and defines an optimization process to maximize the inference throughput. A branch-and-bound solver is then presented and analyzed to quantify the achieved performance and complexity. This analysis leads to the definition of the acceleration region, which describes deterministic conditions on the DNN and network properties under which DNN partitioning is beneficial. Finally, experimental results confirm the simulations and show inference throughput improvements in sample edge deployments.
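To illustrate why partitioning can raise throughput, here is a minimal sketch that scans split points of a layer chain pipelined across two edge devices, where throughput is limited by the slowest stage (compute on either device, or transmission of the intermediate activation). The per-layer latencies and link bandwidth are made-up placeholders, and the exhaustive scan is only a stand-in for the paper's branch-and-bound solver.

```python
# Minimal sketch: pick the split point that maximizes pipelined inference throughput.
layer_ms_dev1 = [4.0, 6.0, 8.0, 3.0, 2.0]     # per-layer compute time on device 1
layer_ms_dev2 = [5.0, 7.0, 9.0, 4.0, 2.5]     # per-layer compute time on device 2
activation_kb = [512, 256, 128, 64, 10]       # size of each layer's output
bandwidth_kb_per_ms = 100.0                   # link speed between the devices

def throughput_at_split(s):
    """Inferences/second if layers [0, s) run on device 1 and [s, n) on device 2."""
    stage1 = sum(layer_ms_dev1[:s])
    stage2 = sum(layer_ms_dev2[s:])
    tx = activation_kb[s - 1] / bandwidth_kb_per_ms if 0 < s < len(activation_kb) else 0.0
    bottleneck_ms = max(stage1, tx, stage2)   # pipeline rate = slowest stage
    return 1000.0 / bottleneck_ms

best = max(range(1, len(layer_ms_dev1)), key=throughput_at_split)
print(f"best split after layer {best}: {throughput_at_split(best):.1f} inf/s")
```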
View this article on IEEE Xplore
Security Hardening of Intelligent Reflecting Surfaces Against Adversarial Machine Learning Attacks
Next-generation communication networks, also known as NextG or 5G and beyond, are the future data transmission systems that aim to connect a large number of Internet of Things (IoT) devices, systems, applications, and consumers with high-speed data transmission and low latency. NextG networks can achieve these goals with the advanced telecommunication, computing, and Artificial Intelligence (AI) technologies developed over the last decades, and they support a wide range of new applications. Among these technologies, AI makes a significant and unique contribution to beamforming, channel estimation, and Intelligent Reflecting Surface (IRS) applications of 5G and beyond networks. However, the security threats to AI-powered applications in NextG networks, and their mitigation, have not been investigated deeply in academia or industry because these applications are new and more complicated. This paper focuses on an AI-powered IRS implementation in NextG networks and its vulnerability to adversarial machine learning attacks. It also proposes the defensive distillation mitigation method to improve the robustness of the AI-powered IRS model, i.e., to reduce its vulnerability. The results indicate that defensive distillation can significantly improve the robustness of AI-powered models and their performance under adversarial attack.
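A minimal sketch of defensive distillation in PyTorch, for orientation only: a teacher is trained with a temperature-softened softmax, and a student is then trained on the teacher's soft labels at the same temperature, which tends to smooth the decision surface against small adversarial perturbations. The architectures, temperature, and random data are placeholders, not the paper's IRS model.

```python
# Minimal defensive-distillation sketch (toy MLP, random stand-in data).
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_net(n_in=64, n_cls=8):
    return nn.Sequential(nn.Linear(n_in, 128), nn.ReLU(), nn.Linear(128, n_cls))

x = torch.randn(512, 64)                      # stand-in for IRS channel features
y = torch.randint(0, 8, (512,))
T = 20.0                                      # distillation temperature

teacher, student = make_net(), make_net()

# 1) Train the teacher with a softened softmax.
opt = torch.optim.Adam(teacher.parameters(), lr=1e-3)
for _ in range(200):
    loss = F.cross_entropy(teacher(x) / T, y)
    opt.zero_grad(); loss.backward(); opt.step()

# 2) Train the student on the teacher's soft labels at the same temperature.
with torch.no_grad():
    soft_targets = F.softmax(teacher(x) / T, dim=1)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
for _ in range(200):
    log_probs = F.log_softmax(student(x) / T, dim=1)
    loss = F.kl_div(log_probs, soft_targets, reduction="batchmean")
    opt.zero_grad(); loss.backward(); opt.step()
```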
View this article on IEEE Xplore
Machine Learning Based Transient Stability Emulation and Dynamic System Equivalencing of Large-Scale AC-DC Grids for Faster-Than-Real-Time Digital Twin
Modern power systems have been expanding significantly, including the integration of high-voltage direct current (HVDC) systems, which poses a tremendous computational challenge to transient stability simulation for dynamic security assessment (DSA). In this work, a practical method for the energy control center, based on machine learning (ML) based synchronous generator models (SGMs) and dynamic equivalent models (DEMs), is proposed to reduce the computational burden of traditional transient stability (TS) simulation. The proposed ML-based models are deployed on field-programmable gate arrays (FPGAs) for faster-than-real-time (FTRT) digital twin hardware emulation of the real power system. The Gated Recurrent Unit (GRU) algorithm is adopted to train the SGM and DEM, where the training and testing datasets are obtained from the off-line simulation tool DSATools™/TSAT®. A test system containing 15 ACTIVSg 500-bus systems interconnected by a 15-terminal DC grid is established to validate the accuracy of the proposed FTRT digital twin emulation platform. Due to the complexity of emulating a large-scale AC-DC grid, multiple FPGA boards are applied, and a proper interface strategy is proposed for data synchronization. The efficacy of the hardware emulation is demonstrated by two case studies, where an FTRT ratio of more than 684 is achieved by applying the GRU-SGM, and the ratio exceeds 208 for the hybrid computational-ML based digital twin of the AC-DC grid.
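For intuition, the sketch below shows the kind of GRU sequence surrogate described above: a model trained to map input trajectories (e.g., interface or terminal quantities) to generator-response trajectories. The dimensions and random training data are placeholders for trajectories exported from an off-line TS simulator, and FPGA deployment is outside the scope of the sketch.

```python
# Minimal GRU-surrogate sketch: input trajectory -> response trajectory.
import torch
import torch.nn as nn

class GRUSurrogate(nn.Module):
    def __init__(self, n_in=4, hidden=64, n_out=2):
        super().__init__()
        self.gru = nn.GRU(n_in, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_out)

    def forward(self, u):                      # u: (batch, time, n_in)
        h, _ = self.gru(u)
        return self.out(h)                     # (batch, time, n_out)

u = torch.randn(32, 200, 4)                    # 32 trajectories, 200 time steps
y = torch.randn(32, 200, 2)                    # corresponding generator responses
model = GRUSurrogate()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    loss = nn.functional.mse_loss(model(u), y)
    opt.zero_grad(); loss.backward(); opt.step()
```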
Published in the IEEE Power & Energy Society Section within IEEE Access.
View this article on IEEE Xplore
The Internet of Federated Things (IoFT)
The Internet of Things (IoT) is on the verge of a major paradigm shift. In the IoT system of the future, IoFT, the “cloud” will be substituted by the “crowd” where model training is brought to the edge, allowing IoT devices to collaboratively extract knowledge and build smart analytics/models while keeping their personal data stored locally. This paradigm shift was set into motion by the tremendous increase in computational power on IoT devices and the recent advances in decentralized and privacy-preserving model training, coined as federated learning (FL). This article provides a vision for IoFT and a systematic overview of current efforts towards realizing this vision. Specifically, we first introduce the defining characteristics of IoFT and discuss FL data-driven approaches, opportunities, and challenges that allow decentralized inference within three dimensions: (i) a global model that maximizes utility across all IoT devices, (ii) a personalized model that borrows strengths across all devices yet retains its own model, (iii) a meta-learning model that quickly adapts to new devices or learning tasks. We end by describing the vision and challenges of IoFT in reshaping different industries through the lens of domain experts. Those industries include manufacturing, transportation, energy, healthcare, quality & reliability, business, and computing.
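The basic FL aggregation this vision builds on is federated averaging (FedAvg): each device trains locally on its private data, and the server averages the resulting parameters weighted by local sample counts. The sketch below is a toy illustration of that loop; the model, data, and hyperparameters are placeholders.

```python
# Minimal FedAvg sketch: local training on private data, weighted averaging at the server.
import copy
import torch
import torch.nn as nn

def local_update(model, x, y, epochs=5, lr=0.05):
    model = copy.deepcopy(model)                        # device trains its own copy
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return model.state_dict(), len(x)

def fed_avg(states_and_sizes):
    total = sum(n for _, n in states_and_sizes)
    avg = copy.deepcopy(states_and_sizes[0][0])
    for key in avg:                                     # sample-size-weighted average
        avg[key] = sum(state[key] * (n / total) for state, n in states_and_sizes)
    return avg

global_model = nn.Linear(3, 1)
devices = [(torch.randn(40, 3), torch.randn(40, 1)) for _ in range(5)]  # private data

for _ in range(10):                                     # communication rounds
    updates = [local_update(global_model, x, y) for x, y in devices]
    global_model.load_state_dict(fed_avg(updates))
```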
View this article on IEEE Xplore
A Data Compression Strategy for the Efficient Uncertainty Quantification of Time-Domain Circuit Responses
This paper presents an innovative modeling strategy for the construction of efficient and compact surrogate models for the uncertainty quantification of time-domain responses of digital links. The proposed approach relies on a two-step methodology. First, the initial dataset of available training responses is compressed via principal component analysis (PCA). Then, the compressed dataset is used to train compact surrogate models for the reduced PCA variables using advanced techniques for uncertainty quantification and parametric macromodeling. Specifically, sparse polynomial chaos expansion and least-squares support-vector machine regression are used in this work, although the proposed methodology is general and applicable to any surrogate modeling strategy. The preliminary compression limits the number and complexity of the surrogate models, thus leading to a substantial improvement in efficiency. The feasibility and performance of the proposed approach are investigated by means of two digital link designs with 54 and 115 uncertain parameters, respectively.
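A minimal scikit-learn sketch of the two-step idea: compress the training responses with PCA, fit one compact regressor per retained component against the uncertain parameters, then map new parameter samples back to full responses. A plain kernel SVR stands in for the sparse PCE / LS-SVM surrogates used in the paper, and the data below is a random placeholder rather than simulated link responses.

```python
# Minimal sketch: PCA compression of responses + one surrogate per reduced variable.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_train, n_params, n_time = 200, 54, 1000
X = rng.uniform(-1, 1, size=(n_train, n_params))                       # uncertain parameters
Y = np.sin(3 * X[:, :1]) + 0.1 * rng.normal(size=(n_train, n_time))    # toy time-domain responses

pca = PCA(n_components=0.99)                  # keep 99% of the response variance
Z = pca.fit_transform(Y)                      # compressed responses

surrogates = [SVR(kernel="rbf").fit(X, Z[:, j]) for j in range(Z.shape[1])]

def predict_response(x_new):
    z = np.array([[s.predict(x_new)[0] for s in surrogates]])
    return pca.inverse_transform(z)[0]        # back to the full time axis

y_hat = predict_response(rng.uniform(-1, 1, size=(1, n_params)))
print(y_hat.shape)                            # (1000,) reconstructed waveform
```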
Published in the IEEE Electronics Packaging Society Section within IEEE Access.
View this article on IEEE Xplore
A Simple Sum of Products Formula to Compute the Reliability of the KooN System
The reliability block diagram (RBD) is a well-known, high-level abstract modeling method for calculating system reliability. Increasing redundancy is the most important way to increase the fault tolerance and reliability of dependable systems, and K-out-of-N (KooN) is one of the well-known redundancy models. Redundancy causes repeated events and increases the complexity of computing the system’s reliability, and researchers use techniques like factorization to overcome this. Current methods lead to cumbersome formulas that require extensive simplification to be expressed in sum-of-products (SoP) form in terms of the reliabilities of the constituent components. In this paper, a technique for extracting a simple formula for the KooN system’s reliability in SoP form using the Venn diagram is presented. Then, the shortcoming of the Venn diagram, namely that it masks some joint events when the number of independent components is large, is explained. We propose replacing Venn diagrams with lattices to overcome this weakness, and introduce the lattice of reliabilities, which is the dual of the power-set lattice of the components. Using the basic properties of the lattice of reliabilities and its inclusion relationships, we propose an algorithm for deriving a general formula for the KooN system’s reliability in SoP form. The proposed algorithm obtains the SoP formula coefficients by computing the elements of the main diagonal and the elements below it in a square matrix; its computational and space complexity is $\theta((n-k)^2/2)$, where n is the number of distinct components and k denotes the number of components that must function. A lemma and a theorem are stated and proved as the basis of the proposed general formula for the coefficients of the SoP formula of the KooN system; with this formula, the computational and space complexity of computing all coefficients of the KooN reliability formula is reduced to $\theta(n-k)$. The resulting formula is simple, is in SoP form, and its computation is less error-prone.
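As a point of reference (not the paper's lattice-based algorithm), the sketch below computes the exact KooN reliability of independent, non-identical components by brute-force enumeration of component states; it is useful as a ground truth when checking a closed-form SoP expression. The component reliabilities are illustrative values.

```python
# Brute-force reference for the reliability of a k-out-of-n system of
# independent components: enumerate every up/down pattern and sum the
# probabilities of those with at least k working components.
from itertools import product

def koon_reliability(reliabilities, k):
    """P(at least k of the n independent components work)."""
    n = len(reliabilities)
    total = 0.0
    for states in product([0, 1], repeat=n):          # every up/down pattern
        if sum(states) >= k:
            p = 1.0
            for r, up in zip(reliabilities, states):
                p *= r if up else (1.0 - r)
            total += p
    return total

if __name__ == "__main__":
    r = [0.95, 0.90, 0.85, 0.80]                      # assumed component reliabilities
    print(f"2-out-of-4 reliability: {koon_reliability(r, 2):.6f}")
```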