-
CapsLorentzNet: Integrating Physics Inspired Features with Graph Convolution
Authors:
Rameswar Sahu
Abstract:
With the advent of advanced machine learning techniques, boosted object tagging has witnessed significant progress. In this article, we take this field further by introducing novel architectural modifications compatible with a wide array of Graph Neural Network (GNN) architectures. Our approach advocates integrating capsule layers in place of the conventional decoding blocks in standard GNNs. A capsule is a group of neurons with vector activations. The orientation of each vector represents important properties of the object under study, while its magnitude characterizes whether the object belongs to the class represented by the capsule. Moreover, capsule networks incorporate a regularization-by-reconstruction mechanism, facilitating the seamless integration of expert-designed high-level features into the analysis. We have studied the usefulness of our architecture with the LorentzNet architecture for quark-gluon tagging. Here, we have replaced the decoding block of LorentzNet with a capsulated decoding block and have called the resulting architecture CapsLorentzNet. Our new architecture can enhance the performance of LorentzNet by 20% for the quark-gluon tagging task.
Submitted 18 March, 2024;
originally announced March 2024.
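The abstract above describes capsules as vector activations whose magnitude signals class membership. A minimal sketch of that idea follows, assuming the "squash" nonlinearity from the original capsule-network literature; the abstract does not say which nonlinearity CapsLorentzNet uses, and the function name, array shapes, and example values are illustrative only.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Rescale each vector to a length in [0, 1) while keeping its direction.

    Assumption: this is the standard capsule squash function,
    v = (|s|^2 / (1 + |s|^2)) * s / |s|, not necessarily the one used
    in CapsLorentzNet.
    """
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

# Two hypothetical class capsules, each a 4-dimensional activation vector.
capsules = np.array([[3.0, 4.0, 0.0, 0.0],
                     [0.1, 0.0, 0.0, 0.0]])

# The length of each squashed vector can be read as a class score.
lengths = np.linalg.norm(squash(capsules), axis=-1)
predicted_class = int(np.argmax(lengths))
```

Long input vectors map to lengths close to 1 and short ones to lengths close to 0, which is what lets the magnitude act as a class-membership score while the direction stays free to encode other properties.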
-
To prune or not to prune : A chaos-causality approach to principled pruning of dense neural networks
Authors:
Rajan Sahu,
Shivam Chadha,
Nithin Nagaraj,
Archana Mathur,
Snehanshu Saha
Abstract:
Reducing the size of a neural network (pruning) by removing weights without impacting its performance is an important problem for resource-constrained devices. In the past, pruning was typically accomplished by ranking or penalizing weights based on criteria like magnitude and removing low-ranked weights before retraining the remaining ones. Pruning strategies may also involve removing neurons from the network in order to achieve the desired reduction in network size. We formulate pruning as an optimization problem with the objective of minimizing misclassifications by selecting specific weights. To accomplish this, we introduce the concept of chaos in learning (Lyapunov exponents) via weight updates and exploit causality to identify the causal weights responsible for misclassification. Such a pruned network maintains the original performance and retains feature explainability.
Submitted 19 August, 2023;
originally announced August 2023.
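For context, the conventional magnitude-based pruning that the abstract contrasts itself with can be sketched as below. This is the baseline only, not the chaos-causality method the paper proposes; the function name, the `fraction` parameter, and the example matrix are assumptions made for illustration.

```python
import numpy as np

def magnitude_prune(weights, fraction):
    """Zero out roughly the smallest-magnitude `fraction` of the weights.

    Returns the pruned copy and the boolean keep-mask. Weights tied with
    the cut-off magnitude are removed too, so slightly more than
    `fraction` may be pruned when there are ties.
    """
    flat = np.abs(weights).ravel()
    k = int(fraction * flat.size)              # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# Hypothetical 2x2 weight matrix; prune the lower-ranked half.
W = np.array([[0.5, -0.05],
              [0.01, -2.0]])
pruned, mask = magnitude_prune(W, fraction=0.5)
```

In a real workflow the surviving weights would then be retrained with the mask held fixed, as the abstract notes.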
-
DELFI: Deep Mixture Models for Long-term Air Quality Forecasting in the Delhi National Capital Region
Authors:
Naishadh Parmar,
Raunak Shah,
Tushar Goswamy,
Vatsalya Tandon,
Ravi Sahu,
Ronak Sutaria,
Purushottam Kar,
Sachchida Nand Tripathi
Abstract:
The identification and control of human factors in climate change is a rapidly growing concern, and robust, real-time air-quality monitoring and forecasting play a critical role in allowing effective policy formulation and implementation. This paper presents DELFI, a novel deep learning-based mixture model to make effective long-term predictions of Particulate Matter (PM) 2.5 concentrations. A key novelty in DELFI is its multi-scale approach to the forecasting problem. The observation that point predictions are more suitable in the short term and probabilistic predictions in the long term allows accurate predictions to be made as much as 24 hours in advance. DELFI incorporates meteorological data as well as pollutant-based features to ensure a robust model that is divided into two parts: (i) a stack of three Long Short-Term Memory (LSTM) networks that perform differential modelling of the same window of past data, and (ii) a fully-connected layer enabling attention to each of the components. Experimental evaluation based on deployment of 13 stations in the Delhi National Capital Region (Delhi-NCR) in India establishes that DELFI offers far superior predictions, especially in the long term, compared to even non-parametric baselines. The Delhi-NCR recorded the 3rd highest PM levels amongst 39 mega-cities across the world during 2011-2015, and DELFI's performance establishes it as a potential tool for effective long-term forecasting of PM levels to enable public health management and environment protection.
Submitted 28 October, 2022;
originally announced October 2022.
-
Long-Term Missing Value Imputation for Time Series Data Using Deep Neural Networks
Authors:
Jangho Park,
Juliane Muller,
Bhavna Arora,
Boris Faybishenko,
Gilberto Pastorello,
Charuleka Varadharajan,
Reetik Sahu,
Deborah Agarwal
Abstract:
We present an approach that uses a deep learning model, in particular, a MultiLayer Perceptron (MLP), for estimating the missing values of a variable in multivariate time series data. We focus on filling a long continuous gap (e.g., multiple months of missing daily observations) rather than on individual randomly missing observations. Our proposed gap filling algorithm uses an automated method for determining the optimal MLP model architecture, thus allowing for optimal prediction performance for the given time series. We tested our approach by filling gaps of various lengths (three months to three years) in three environmental datasets with different time series characteristics, namely daily groundwater levels, daily soil moisture, and hourly Net Ecosystem Exchange. We compared the accuracy of the gap-filled values obtained with our approach to the widely-used R-based time series gap filling methods ImputeTS and mtsdi. The results indicate that using an MLP for filling a large gap leads to better results, especially when the data behave nonlinearly. Thus, our approach enables the use of datasets that have a large gap in one variable, which is common in many long-term environmental monitoring observations.
Submitted 24 February, 2022;
originally announced February 2022.
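The abstract above distinguishes long continuous gaps from isolated missing observations. A minimal sketch of how such gaps might be located before any model is fitted is given below; it is not taken from the paper, and the function name, the `min_length` parameter, and the toy series are assumptions.

```python
import numpy as np

def find_long_gaps(values, min_length):
    """Return (start, stop) index pairs for NaN runs of at least `min_length`.

    `stop` is exclusive, so values[start:stop] is entirely missing.
    """
    missing = np.isnan(np.asarray(values, dtype=float))
    # Pad with False so runs touching either end still have a start and an end.
    padded = np.concatenate(([False], missing, [False]))
    changes = np.diff(padded.astype(int))
    starts = np.flatnonzero(changes == 1)
    stops = np.flatnonzero(changes == -1)
    return [(int(a), int(b)) for a, b in zip(starts, stops) if b - a >= min_length]

# Toy daily series: one isolated missing value and one three-step gap.
series = [1.0, np.nan, 2.0, np.nan, np.nan, np.nan, 3.0]
gaps = find_long_gaps(series, min_length=3)
```

Only the three-step run is reported here; the isolated missing value at index 1 is left for a simpler imputation method.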
-
Coverage Analysis of Broadcast Networks with Users Having Heterogeneous Content/Advertisement Preferences
Authors:
Kanchan Chaurasia,
Reena Sahu,
Abhishek Gupta
Abstract:
This work is focused on the system-level performance of a broadcast network. Since all transmitters in a broadcast network transmit the identical signal, received signals from multiple transmitters can be combined to improve system performance. We develop a stochastic geometry based analytical framework to derive the coverage of a typical receiver. We show that there may exist an optimal connectivity radius that maximizes the rate coverage. Our analysis includes the fact that users may have their individual content/advertisement preferences. We assume that there are multiple classes of users, with each user class preferring a particular type of content/advertisement, and that users will pay the network only when they can see content aligned with their interest. The operator may choose to transmit multiple contents simultaneously to cater to more users' interests and increase its revenue. We present revenue models to study the impact of the number of contents on the operator revenue. We consider two scenarios for users' distribution: one where users' interest depends on their geographical location and one where it does not. With the help of numerical results and analysis, we show the impact of various parameters, including content granularity, connectivity radius, and rate threshold, and present important design insights.
Submitted 27 January, 2021;
originally announced January 2021.
-
Anonymous proof-of-asset transactions using designated blind signatures
Authors:
Neetu Sharma,
Rajeev Anand Sahu,
Vishal Saraswat,
Joaquin Garcia-Alfaro
Abstract:
We propose a scheme to preserve the anonymity of users in proof-of-asset transactions. We assume bitcoin-like cryptocurrency systems in which a user must prove the strength of its assets (i.e., solvency) prior to conducting further transactions. The traditional way of addressing such a problem is the use of blind signatures, i.e., a kind of digital signature whose properties satisfy the anonymity of the signer. Our work focuses on the use of a designated verifier signature scheme that allows only a single authorized party (within a group of signature requesters) to verify the correctness of the transaction.
Submitted 26 October, 2020; v1 submitted 29 September, 2020;
originally announced September 2020.
-
problemConquero at SemEval-2020 Task 12: Transformer and Soft label-based approaches
Authors:
Karishma Laud,
Jagriti Singh,
Randeep Kumar Sahu,
Ashutosh Modi
Abstract:
In this paper, we present various systems submitted by our team problemConquero for SemEval-2020 Shared Task 12, Multilingual Offensive Language Identification in Social Media. We participated in all three sub-tasks of OffensEval-2020, and our final submissions during the evaluation phase included transformer-based approaches and a soft label-based approach. BERT-based fine-tuned models were submitted for each language of sub-task A (offensive tweet identification). A RoBERTa-based fine-tuned model was submitted for sub-task B (automatic categorization of offense types). We submitted two models for sub-task C (offense target identification), one using soft labels and the other using a BERT-based fine-tuned model. Our ranks for sub-task A were Greek-19 out of 37, Turkish-22 out of 46, Danish-26 out of 39, Arabic-39 out of 53, and English-20 out of 85. We achieved a rank of 28 out of 43 for sub-task B. Our best rank for sub-task C was 20 out of 39, using the BERT-based fine-tuned model.
Submitted 21 July, 2020;
originally announced July 2020.
-
Surrogate Optimization of Deep Neural Networks for Groundwater Predictions
Authors:
Juliane Mueller,
Jangho Park,
Reetik Sahu,
Charuleka Varadharajan,
Bhavna Arora,
Boris Faybishenko,
Deborah Agarwal
Abstract:
Sustainable management of groundwater resources under changing climatic conditions requires the application of reliable and accurate predictions of groundwater levels. Mechanistic multi-scale, multi-physics simulation models are often too hard to use for this purpose, especially for groundwater managers who do not have access to the complex compute resources and data. Therefore, we analyzed the applicability and performance of four modern deep learning computational models for predictions of groundwater levels. We compare three methods for optimizing the models' hyperparameters, including two surrogate model-based algorithms and a random sampling method. The models were tested using predictions of the groundwater level in Butte County, California, USA, taking into account the temporal variability of streamflow, precipitation, and ambient temperature. Our numerical study shows that the optimization of the hyperparameters can lead to reasonably accurate performance of all models (root mean squared errors of groundwater predictions of 2 meters or less), but the "simplest" network, namely a multilayer perceptron (MLP), performs overall better for learning and predicting groundwater data than the more advanced long short-term memory or convolutional neural networks in terms of prediction accuracy and time-to-solution, making the MLP a suitable candidate for groundwater prediction.
Submitted 3 February, 2020; v1 submitted 28 August, 2019;
originally announced August 2019.
-
Performance of QAM Schemes with Dual-Hop DF Relaying Systems over Mixed $η$-$μ$ and $κ$-$μ$ Fading Channels
Authors:
Dharmendra Dixit,
P. R. Sahu
Abstract:
Performance of quadrature amplitude modulation (QAM) schemes is analyzed with dual-hop decode-and-forward (DF) relaying systems over mixed $η$-$μ$ and $κ$-$μ$ fading channels. Closed-form expressions are obtained for the average symbol error rate (ASER) for general order rectangular QAM and cross QAM schemes using a moment generating function based approach. Derived expressions are in the form of Lauricella's $(F_D^{(n)}(\cdot), Φ_1^{(n)}(\cdot))$ hypergeometric functions, which can be numerically evaluated using either integral or series representation. The obtained ASER expressions include other mixed fading channel cases addressed in the literature as special cases, such as mixed Hoyt and Rice fading and mixed Nakagami-$m$ and Rice fading. We further obtain a simple expression for the asymptotic ASER, which is useful to determine a factor governing the system performance at high SNRs, i.e., the diversity order. Additionally, we analyze the optimal power allocation, which provides a practical design rule to optimally distribute the total transmission power between the source and the relay to minimize the ASER. Extensive numerical and computer simulation results are presented that confirm the accuracy of the presented mathematical analysis.
Submitted 8 September, 2015;
originally announced September 2015.
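The abstract above refers to the diversity order as the factor governing performance at high SNR. For readers unfamiliar with the term, the usual textbook definition is sketched below; the symbols $P_s$ (ASER), $\bar{\gamma}$ (average SNR), $G_d$ (diversity order), and $G_c$ (coding gain) are notation assumed here and are not taken from the paper.

```latex
\[
  G_d \;=\; -\lim_{\bar{\gamma}\to\infty}
  \frac{\log P_s(\bar{\gamma})}{\log \bar{\gamma}},
  \qquad
  P_s(\bar{\gamma}) \;\approx\; \left(G_c\,\bar{\gamma}\right)^{-G_d}
  \quad \text{as } \bar{\gamma}\to\infty .
\]
```

On a log-log plot of ASER against average SNR, $G_d$ is the magnitude of the asymptotic slope, so an asymptotic ASER expression exposes it directly.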