Search | arXiv e-print repository

Beyond Anonymization: Object Scrubbing for Privacy-Preserving 2D and 3D Vision Tasks

Authors: Murat Bilgehan Ertan, Ronak Sahu, Phuong Ha Nguyen, Kaleel Mahmood, Marten van Dijk

Abstract: We introduce ROAR (Robust Object Removal and Re-annotation), a scalable framework for privacy-preserving dataset obfuscation that eliminates sensitive objects instead of modifying them. Our method integrates instance segmentation with generative inpainting to remove identifiable entities while preserving scene integrity. Extensive evaluations on 2D COCO-based object detection show that ROAR achieves 87.5% of the baseline detection average precision (AP), whereas image dropping achieves only 74.2% of the baseline AP, highlighting the advantage of scrubbing in preserving dataset utility. The degradation is even more severe for small objects due to occlusion and loss of fine-grained details. Furthermore, in NeRF-based 3D reconstruction, our method incurs a PSNR loss of at most 1.66 dB while maintaining SSIM and improving LPIPS, demonstrating superior perceptual quality. Our findings establish object removal as an effective privacy framework, achieving strong privacy guarantees with minimal performance trade-offs. The results highlight key challenges in generative inpainting, occlusion-robust segmentation, and task-specific scrubbing, setting the foundation for future advancements in privacy-preserving vision systems.

Submitted 23 April, 2025; originally announced April 2025.

Comments: Submitted to ICCV 2025

arXiv:2503.05149 [pdf]

Development and Enhancement of Text-to-Image Diffusion Models

Authors: Rajdeep Roshan Sahu

Abstract: This research focuses on the development and enhancement of text-to-image denoising diffusion models, addressing key challenges such as limited sample diversity and training instability. By incorporating Classifier-Free Guidance (CFG) and Exponential Moving Average (EMA) techniques, this study significantly improves image quality, diversity, and stability. Utilizing Hugging Face's state-of-the-art text-to-image generation model, the proposed enhancements establish new benchmarks in generative AI. This work explores the underlying principles of diffusion models, implements advanced strategies to overcome existing limitations, and presents a comprehensive evaluation of the improvements achieved. Results demonstrate substantial progress in generating stable, diverse, and high-quality images from textual descriptions, advancing the field of generative artificial intelligence and providing new foundations for future applications. Keywords: Text-to-image, Diffusion model, Classifier-free guidance, Exponential moving average, Image generation.

Submitted 7 March, 2025; originally announced March 2025.

arXiv:2502.05416 [pdf, other]

Deep Generative Models with Hard Linear Equality Constraints

Authors: Ruoyan Li, Dipti Ranjan Sahu, Guy Van den Broeck, Zhe Zeng

Abstract: While deep generative models (DGMs) have demonstrated remarkable success in capturing complex data distributions, they consistently fail to learn constraints that encode domain knowledge and thus require constraint integration. Existing solutions to this challenge have primarily relied on heuristic methods and often ignore the underlying data distribution, harming the generative performance. In this work, we propose a probabilistically sound approach for enforcing hard constraints in DGMs to generate constraint-compliant and realistic data. This is achieved by our proposed gradient estimators that allow the constrained distribution, the data distribution conditioned on constraints, to be differentiably learned. We carry out extensive experiments with various DGM model architectures over five image datasets and three scientific applications in which domain knowledge is governed by linear equality constraints. We validate that the standard DGMs almost surely generate data violating the constraints. Among all the constraint integration strategies, ours not only guarantees the satisfaction of constraints in generation but also achieves better generative performance than the other methods across every benchmark.

Submitted 12 February, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

arXiv:2409.13000 [pdf]

Introducing the Large Medical Model: State of the art healthcare cost and risk prediction with transformers trained on patient event sequences

Authors: Ricky Sahu, Eric Marriott, Ethan Siegel, David Wagner, Flore Uzan, Troy Yang, Asim Javed

Abstract: With U.S. healthcare spending approaching $5T (NHE Fact Sheet 2024), and 25% of it estimated to be wasteful (Waste in the US health care system: estimated costs and potential for savings, n.d.), the need to better predict risk and optimal patient care is ever more important. This paper introduces the Large Medical Model (LMM), a generative pre-trained transformer (GPT) designed to guide and predict the broad facets of patient care and healthcare administration. The model is trained on medical event sequences from over 140M longitudinal patient claims records with a specialized vocabulary built from medical terminology systems and demonstrates a superior capability to forecast healthcare costs and identify potential risk factors. Through experimentation and validation, we showcase the LMM's proficiency not only in cost and risk predictions, but also in discerning intricate patterns within complex medical conditions and an ability to identify novel relationships in patient care. The LMM is able to improve both cost prediction by 14.1% over the best commercial models and chronic conditions prediction by 1.9% over the best transformer models in research predicting a broad set of conditions. The LMM is a substantial advancement in healthcare analytics, offering the potential to significantly enhance risk assessment, cost management, and personalized medicine.

Submitted 5 December, 2024; v1 submitted 19 September, 2024; originally announced September 2024.

Comments: 10 pages, 10 figures

ACM Class: I.2.1; K.4.1; K.4.3; J.1; J.3

arXiv:2406.13856 [pdf, other]

doi 10.14778/3717755.3717759

Kishu: Time-Traveling for Computational Notebooks

Authors: Zhaoheng Li, Supawit Chockchowwat, Ribhav Sahu, Areet Sheth, Yongjoo Park

Abstract: Computational notebooks (e.g., Jupyter, Google Colab) are widely used by data scientists. A key feature of notebooks is the interactive computing model of iteratively executing cells (i.e., a set of statements) and observing the result (e.g., model or plot). Unfortunately, existing notebook systems do not offer time-traveling to past states: when the user executes a cell, the notebook session state consisting of user-defined variables can be irreversibly modified - e.g., the user cannot 'un-drop' a dataframe column. This is because, unlike DBMS, existing notebook systems do not keep track of the session state. Existing techniques for checkpointing and restoring session states, such as OS-level memory snapshot or application-level session dump, are insufficient: checkpointing can incur prohibitive storage costs and may fail, while restoration can only be inefficiently performed from scratch by fully loading checkpoint files. In this paper, we introduce a new notebook system, Kishu, that offers time-traveling to and from arbitrary notebook states using an efficient and fault-tolerant incremental checkpoint and checkout mechanism. Kishu creates incremental checkpoints that are small and correctly preserve complex inter-variable dependencies at a novel Co-variable granularity. Then, to return to a previous state, Kishu accurately identifies the state difference between the current and target states to perform incremental checkout at sub-second latency with minimal data loading. Kishu is compatible with 146 object classes from popular data science libraries (e.g., Ray, Spark, PyTorch), and reduces checkpoint size and checkout time by up to 4.55x and 9.02x, respectively, on a variety of notebooks.

Submitted 28 March, 2025; v1 submitted 19 June, 2024; originally announced June 2024.

Journal ref: PVLDB, 18(4): 970 - 985, 2024

arXiv:2403.11826 [pdf, other]

Integrating Physics Inspired Features with Graph Convolution

Authors: Rameswar Sahu

Abstract: With the advent of advanced machine learning techniques, boosted object tagging has witnessed significant progress. In this article, we take this field further by introducing novel architectural modifications compatible with a wide array of Graph Neural Network (GNN) architectures. Our approach advocates for integrating capsule layers, replacing the conventional decoding blocks in standard GNNs. These capsules are a group of neurons with vector activations. The orientation of these vectors represents important properties of the objects under study, with their magnitude characterizing whether the object under study belongs to the class represented by the capsule. Moreover, capsule networks incorporate a regularization by reconstruction mechanism, facilitating the seamless integration of expert-designed high-level features into the analysis. We have studied the usefulness of our architecture with the LorentzNet architecture for quark-gluon tagging. Here, we have replaced the decoding block of LorentzNet with a capsulated decoding block and have called the resulting architecture CapsLorentzNet. Our new architecture can enhance the performance of LorentzNet by 20% for the quark-gluon tagging task.

Submitted 24 January, 2025; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: 16 pages, 3 figures

arXiv:2308.09955 [pdf, other]

To prune or not to prune: A chaos-causality approach to principled pruning of dense neural networks

Authors: Rajan Sahu, Shivam Chadha, Nithin Nagaraj, Archana Mathur, Snehanshu Saha

Abstract: Reducing the size of a neural network (pruning) by removing weights without impacting its performance is an important problem for resource-constrained devices. In the past, pruning was typically accomplished by ranking or penalizing weights based on criteria like magnitude and removing low-ranked weights before retraining the remaining ones. Pruning strategies may also involve removing neurons from the network in order to achieve the desired reduction in network size. We formulate pruning as an optimization problem with the objective of minimizing misclassifications by selecting specific weights. To accomplish this, we have introduced the concept of chaos in learning (Lyapunov exponents) via weight updates and exploited causality to identify the causal weights responsible for misclassification. Such a pruned network maintains the original performance and retains feature explainability.

Submitted 19 August, 2023; originally announced August 2023.

arXiv:2210.15923 [pdf, other]

DELFI: Deep Mixture Models for Long-term Air Quality Forecasting in the Delhi National Capital Region

Authors: Naishadh Parmar, Raunak Shah, Tushar Goswamy, Vatsalya Tandon, Ravi Sahu, Ronak Sutaria, Purushottam Kar, Sachchida Nand Tripathi

Abstract: The identification and control of human factors in climate change is a rapidly growing concern, and robust, real-time air-quality monitoring and forecasting play a critical role in allowing effective policy formulation and implementation. This paper presents DELFI, a novel deep learning-based mixture model to make effective long-term predictions of Particulate Matter (PM) 2.5 concentrations. A key novelty in DELFI is its multi-scale approach to the forecasting problem. The observation that point predictions are more suitable in the short-term and probabilistic predictions in the long-term allows accurate predictions to be made as much as 24 hours in advance. DELFI incorporates meteorological data as well as pollutant-based features to ensure a robust model that is divided into two parts: (i) a stack of three Long Short-Term Memory (LSTM) networks that perform differential modelling of the same window of past data, and (ii) a fully-connected layer enabling attention to each of the components. Experimental evaluation based on deployment of 13 stations in the Delhi National Capital Region (Delhi-NCR) in India establishes that DELFI offers far superior predictions, especially in the long-term, as compared to even non-parametric baselines. The Delhi-NCR recorded the 3rd highest PM levels amongst 39 mega-cities across the world during 2011-2015, and DELFI's performance establishes it as a potential tool for effective long-term forecasting of PM levels to enable public health management and environment protection.

Submitted 28 October, 2022; originally announced October 2022.

Comments: 6 pages

arXiv:2202.12441 [pdf, other]

Long-Term Missing Value Imputation for Time Series Data Using Deep Neural Networks

Authors: Jangho Park, Juliane Muller, Bhavna Arora, Boris Faybishenko, Gilberto Pastorello, Charuleka Varadharajan, Reetik Sahu, Deborah Agarwal

Abstract: We present an approach that uses a deep learning model, in particular, a MultiLayer Perceptron (MLP), for estimating the missing values of a variable in multivariate time series data. We focus on filling a long continuous gap (e.g., multiple months of missing daily observations) rather than on individual randomly missing observations. Our proposed gap filling algorithm uses an automated method for determining the optimal MLP model architecture, thus allowing for optimal prediction performance for the given time series. We tested our approach by filling gaps of various lengths (three months to three years) in three environmental datasets with different time series characteristics, namely daily groundwater levels, daily soil moisture, and hourly Net Ecosystem Exchange. We compared the accuracy of the gap-filled values obtained with our approach to the widely-used R-based time series gap filling methods ImputeTS and mtsdi. The results indicate that using an MLP for filling a large gap leads to better results, especially when the data behave nonlinearly. Thus, our approach enables the use of datasets that have a large gap in one variable, which is common in many long-term environmental monitoring observations.

Submitted 24 February, 2022; originally announced February 2022.

arXiv:2101.11356 [pdf, other]

Coverage Analysis of Broadcast Networks with Users Having Heterogeneous Content/Advertisement Preferences

Authors: Kanchan Chaurasia, Reena Sahu, Abhishek Gupta

Abstract: This work is focused on the system-level performance of a broadcast network. Since all transmitters in a broadcast network transmit the identical signal, received signals from multiple transmitters can be combined to improve system performance. We develop a stochastic geometry based analytical framework to derive the coverage of a typical receiver. We show that there may exist an optimal connectivity radius that maximizes the rate coverage. Our analysis includes the fact that users may have their individual content/advertisement preferences. We assume that there are multiple classes of users, that each user class prefers a particular type of content/advertisements, and that the users will pay the network only when they can see content aligned with their interest. The operator may choose to transmit multiple contents simultaneously to cater to more users' interests and increase its revenue. We present revenue models to study the impact of the number of contents on the operator revenue. We consider two scenarios for users' distribution: one where users' interest depends on their geographical location and one where it does not. With the help of numerical results and analysis, we show the impact of various parameters including content granularity, connectivity radius, and rate threshold and present important design insights.

Submitted 27 January, 2021; originally announced January 2021.

arXiv:2009.13978 [pdf, ps, other]

Anonymous proof-of-asset transactions using designated blind signatures

Authors: Neetu Sharma, Rajeev Anand Sahu, Vishal Saraswat, Joaquin Garcia-Alfaro

Abstract: We propose a scheme to preserve the anonymity of users in proof-of-asset transactions. We assume bitcoin-like cryptocurrency systems in which a user must prove the strength of its assets (i.e., solvency) prior to conducting further transactions. The traditional way of addressing such a problem is the use of blind signatures, i.e., a kind of digital signature whose properties satisfy the anonymity of the signer. Our work focuses on the use of a designated verifier signature scheme that allows only a single authorized party (within a group of signature requesters) to verify the correctness of the transaction.

Submitted 26 October, 2020; v1 submitted 29 September, 2020; originally announced September 2020.

Comments: 17 pages, extended conference version

arXiv:2007.10877 [pdf, other]

problemConquero at SemEval-2020 Task 12: Transformer and Soft label-based approaches

Authors: Karishma Laud, Jagriti Singh, Randeep Kumar Sahu, Ashutosh Modi

Abstract: In this paper, we present various systems submitted by our team problemConquero for SemEval-2020 Shared Task 12 Multilingual Offensive Language Identification in Social Media. We participated in all three sub-tasks of OffensEval-2020, and our final submissions during the evaluation phase included transformer-based approaches and a soft label-based approach. BERT-based fine-tuned models were submitted for each language of sub-task A (offensive tweet identification). A RoBERTa-based fine-tuned model was submitted for sub-task B (automatic categorization of offense types). We submitted two models for sub-task C (offense target identification), one using soft labels and the other using a BERT-based fine-tuned model. Our ranks for sub-task A were Greek-19 out of 37, Turkish-22 out of 46, Danish-26 out of 39, Arabic-39 out of 53, and English-20 out of 85. We achieved a rank of 28 out of 43 for sub-task B. Our best rank for sub-task C was 20 out of 39 using the BERT-based fine-tuned model.

Submitted 21 July, 2020; originally announced July 2020.

Comments: 10 pages, 2 figures, 8 tables, Accepted at Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval-2020)

arXiv:1908.10947 [pdf, other]

Surrogate Optimization of Deep Neural Networks for Groundwater Predictions

Authors: Juliane Mueller, Jangho Park, Reetik Sahu, Charuleka Varadharajan, Bhavna Arora, Boris Faybishenko, Deborah Agarwal

Abstract: Sustainable management of groundwater resources under changing climatic conditions requires reliable and accurate predictions of groundwater levels. Mechanistic multi-scale, multi-physics simulation models are often too hard to use for this purpose, especially for groundwater managers who do not have access to the complex compute resources and data. Therefore, we analyzed the applicability and performance of four modern deep learning computational models for predictions of groundwater levels. We compare three methods for optimizing the models' hyperparameters, including two surrogate model-based algorithms and a random sampling method. The models were tested using predictions of the groundwater level in Butte County, California, USA, taking into account the temporal variability of streamflow, precipitation, and ambient temperature. Our numerical study shows that the optimization of the hyperparameters can lead to reasonably accurate performance of all models (root mean squared errors of groundwater predictions of 2 meters or less), but the "simplest" network, namely a multilayer perceptron (MLP), performs overall better for learning and predicting groundwater data than the more advanced long short-term memory or convolutional neural networks in terms of prediction accuracy and time-to-solution, making the MLP a suitable candidate for groundwater prediction.

Submitted 3 February, 2020; v1 submitted 28 August, 2019; originally announced August 2019.

Comments: submitted to Journal of Global Optimization; main paper: 25 pages, 19 figures, 1 table; online supplement: 11 pages, 18 figures, 3 tables

Report number: LBNL-2001234

arXiv:1509.02620 [pdf, ps, other]

Performance of QAM Schemes with Dual-Hop DF Relaying Systems over Mixed $η$-$μ$ and $κ$-$μ$ Fading Channels

Authors: Dharmendra Dixit, P. R. Sahu

Abstract: Performance of quadrature amplitude modulation (QAM) schemes is analyzed with dual-hop decode-and-forward (DF) relaying systems over mixed $η$-$μ$ and $κ$-$μ$ fading channels. Closed-form expressions are obtained for the average symbol error rate (ASER) for general order rectangular QAM and cross QAM schemes using a moment generating function based approach. Derived expressions are in the form of Lauricella's $(F_D^{(n)}(\cdot), Φ_1^{(n)}(\cdot))$ hypergeometric functions, which can be numerically evaluated using either integral or series representation. The obtained ASER expressions include other mixed fading channel cases addressed in the literature as special cases, such as mixed Hoyt and Rice fading and mixed Nakagami-$m$ and Rice fading. We further obtain a simple expression for the asymptotic ASER, which is useful to determine a factor governing the system performance at high SNRs, i.e., the diversity order. Additionally, we analyze the optimal power allocation, which provides a practical design rule to optimally distribute the total transmission power between the source and the relay to minimize the ASER. Extensive numerical and computer simulation results are presented that confirm the accuracy of the presented mathematical analysis.

Submitted 8 September, 2015; originally announced September 2015.

Comments: 25

Showing 1–14 of 14 results for author: Sahu, R