-
LATTEO: A Framework to Support Learning Asynchronously Tempered with Trusted Execution and Obfuscation
Authors:
Abhinav Kumar,
George Torres,
Noah Guzinski,
Gaurav Panwar,
Reza Tourani,
Satyajayant Misra,
Marcin Spoczynski,
Mona Vij,
Nageen Himayat
Abstract:
The privacy vulnerabilities of the federated learning (FL) paradigm, primarily caused by gradient leakage, have prompted the development of various defensive measures. Nonetheless, these solutions have predominantly been crafted for and assessed in the context of synchronous FL systems, with minimal focus on asynchronous FL. This gap arises in part due to the unique challenges posed by the asynchronous setting, such as the lack of coordinated updates, increased variability in client participation, and the potential for more severe privacy risks. These concerns have stymied the adoption of asynchronous FL. In this work, we first demonstrate the privacy vulnerabilities of asynchronous FL through a novel data reconstruction attack that exploits gradient updates to recover sensitive client data. To address these vulnerabilities, we propose a privacy-preserving framework that combines a gradient obfuscation mechanism with Trusted Execution Environments (TEEs) for secure asynchronous FL aggregation at the network edge. To overcome the limitations of conventional enclave attestation, we introduce a novel data-centric attestation mechanism based on Multi-Authority Attribute-Based Encryption. This mechanism enables clients to implicitly verify TEE-based aggregation services, effectively handle on-demand client participation, and scale seamlessly with an increasing number of asynchronous connections. Our gradient obfuscation mechanism reduces the structural similarity index of data reconstruction by 85% and increases reconstruction error by 400%, while our framework improves attestation efficiency by lowering average latency by up to 1500% compared to RA-TLS, without additional overhead.
Submitted 6 February, 2025;
originally announced February 2025.
-
DarkMind: Latent Chain-of-Thought Backdoor in Customized LLMs
Authors:
Zhen Guo,
Reza Tourani
Abstract:
With the growing demand for personalized AI solutions, customized LLMs have become a preferred choice for businesses and individuals, driving the deployment of millions of AI agents across various platforms; for example, the GPT Store hosts over 3 million customized GPTs. Their popularity is partly driven by advanced reasoning capabilities, such as Chain-of-Thought, which enhance their ability to tackle complex tasks. However, their rapid proliferation introduces new vulnerabilities, particularly in reasoning processes that remain largely unexplored. We introduce DarkMind, a novel backdoor attack that exploits the reasoning capabilities of customized LLMs. Designed to remain latent, DarkMind activates within the reasoning chain to covertly alter the final outcome. Unlike existing attacks, it operates without injecting triggers into user queries, making it a more potent threat. We evaluate DarkMind across eight datasets covering arithmetic, commonsense, and symbolic reasoning domains, using five state-of-the-art LLMs with five distinct trigger implementations. Our results demonstrate DarkMind's effectiveness across all scenarios, underscoring its impact. Finally, we explore potential defense mechanisms to mitigate its risks, emphasizing the need for stronger security measures.
Submitted 24 January, 2025;
originally announced January 2025.
-
Persistent Backdoor Attacks in Continual Learning
Authors:
Zhen Guo,
Abhinav Kumar,
Reza Tourani
Abstract:
Backdoor attacks pose a significant threat to neural networks, enabling adversaries to manipulate model outputs on specific inputs, often with devastating consequences, especially in critical applications. While backdoor attacks have been studied in various contexts, little attention has been given to their practicality and persistence in continual learning, particularly in understanding how the continual updates to model parameters, as new data distributions are learned and integrated, impact the effectiveness of these attacks over time. To address this gap, we introduce two persistent backdoor attacks--Blind Task Backdoor and Latent Task Backdoor--each leveraging minimal adversarial influence. Our blind task backdoor subtly alters the loss computation without direct control over the training process, while the latent task backdoor influences only a single task's training, with all other tasks trained benignly. We evaluate these attacks under various configurations, demonstrating their efficacy with static, dynamic, physical, and semantic triggers. Our results show that both attacks consistently achieve high success rates across different continual learning algorithms, while effectively evading state-of-the-art defenses, such as SentiNet and I-BAU.
Submitted 20 September, 2024;
originally announced September 2024.
-
Unveiling the Unseen: Exploring Whitebox Membership Inference through the Lens of Explainability
Authors:
Chenxi Li,
Abhinav Kumar,
Zhen Guo,
Jie Hou,
Reza Tourani
Abstract:
The increasing prominence of deep learning applications and reliance on personalized data underscore the urgent need to address privacy vulnerabilities, particularly Membership Inference Attacks (MIAs). Despite numerous MIA studies, significant knowledge gaps persist, particularly regarding the impact of hidden features (in isolation) on attack efficacy and insufficient justification for the root causes of attacks based on raw data features. In this paper, we aim to address these knowledge gaps by first exploring statistical approaches to identify the most informative neurons and quantifying the significance of the hidden activations from the selected neurons on attack accuracy, in isolation and combination. Additionally, we propose an attack-driven explainable framework by integrating the target and attack models to identify the most influential features of raw data that lead to successful membership inference attacks. Our proposed MIA shows an improvement of up to 26% on state-of-the-art MIA.
Submitted 1 July, 2024;
originally announced July 2024.
-
Silver Linings in the Shadows: Harnessing Membership Inference for Machine Unlearning
Authors:
Nexhi Sula,
Abhinav Kumar,
Jie Hou,
Han Wang,
Reza Tourani
Abstract:
With the continued advancement and widespread adoption of machine learning (ML) models across various domains, ensuring user privacy and data security has become a paramount concern. In compliance with data privacy regulations, such as GDPR, a secure machine learning framework should not only grant users the right to request the removal of their contributed data used for model training but also facilitate the elimination of sensitive data fingerprints within machine learning models to mitigate potential attacks--a process referred to as machine unlearning. In this study, we present a novel unlearning mechanism designed to effectively remove the impact of specific data samples from a neural network while considering the performance of the unlearned model on the primary task. To achieve this goal, we crafted a novel loss function tailored to eliminate privacy-sensitive information from the weights and activation values of the target model by combining target classification loss and membership inference loss. Our adaptable framework can easily incorporate various privacy leakage approximation mechanisms to guide the unlearning process. We provide empirical evidence of the effectiveness of our unlearning approach with a theoretical upper-bound analysis through a membership inference mechanism as a proof of concept. Our results showcase the superior performance of our approach in terms of unlearning efficacy and latency as well as the fidelity of the primary task, across four datasets and four deep learning architectures.
Submitted 5 July, 2024; v1 submitted 30 June, 2024;
originally announced July 2024.
-
A Generative Framework for Low-Cost Result Validation of Machine Learning-as-a-Service Inference
Authors:
Abhinav Kumar,
Miguel A. Guirao Aguilera,
Reza Tourani,
Satyajayant Misra
Abstract:
The growing popularity of Machine Learning (ML) has led to its deployment in various sensitive domains, which has resulted in significant research focused on ML security and privacy. However, in some applications, such as Augmented/Virtual Reality, integrity verification of the outsourced ML tasks is more critical--a facet that has not received much attention. Existing solutions, such as multi-party computation and proof-based systems, impose significant computation overhead, which makes them unfit for real-time applications. We propose Fides, a novel framework for real-time integrity validation of ML-as-a-Service (MLaaS) inference. Fides features a novel and efficient distillation technique--Greedy Distillation Transfer Learning--that dynamically distills and fine-tunes a space and compute-efficient verification model for verifying the corresponding service model while running inside a trusted execution environment. Fides features a client-side attack detection model that uses statistical analysis and divergence measurements to identify, with a high likelihood, if the service model is under attack. Fides also offers a re-classification functionality that predicts the original class whenever an attack is identified. We devised a generative adversarial network framework for training the attack detection and re-classification models. The evaluation shows that Fides achieves an accuracy of up to 98% for attack detection and 94% for re-classification.
Submitted 24 April, 2024; v1 submitted 31 March, 2023;
originally announced April 2023.
-
Harpocrates: Anonymous Data Publication in Named Data Networking
Authors:
Md Washik Al Azad,
Reza Tourani,
Abderrahmen Mtibaa,
Spyridon Mastorakis
Abstract:
Named-Data Networking (NDN), a prominent realization of the Information-Centric Networking (ICN) vision, offers a request-response communication model where data is identified based on application-defined names at the network layer. This amplifies the ability of censoring authorities to restrict user access to certain data/websites/applications and monitor user requests. The majority of existing NDN-based frameworks have focused on enabling users in a censoring network to access data available outside of this network, without considering how data producers in a censoring network can make their data available to users outside of this network. This problem becomes especially challenging, since the NDN communication paths are symmetric, while producers are mandated to sign the data they generate and identify their certificates. In this paper, we propose Harpocrates, an NDN-based framework for anonymous data publication under censorship conditions. Harpocrates enables producers in censoring networks to produce and make their data available to users outside of these networks while remaining anonymous to censoring authorities. Our evaluation demonstrates that Harpocrates achieves anonymous data publication under different settings, being able to identify and adapt to censoring actions.
Submitted 16 January, 2022;
originally announced January 2022.
-
DLWIoT: Deep Learning-based Watermarking for Authorized IoT Onboarding
Authors:
Spyridon Mastorakis,
Xin Zhong,
Pei-Chi Huang,
Reza Tourani
Abstract:
The onboarding of IoT devices by authorized users constitutes both a challenge and a necessity in a world where the number of IoT devices and the tampering attacks against them continuously increase. Commonly used onboarding techniques today include the use of QR codes, pin codes, or serial numbers. These techniques typically do not protect against unauthorized device access--a QR code is physically printed on the device, while a pin code may be included in the device packaging. As a result, any entity that has physical access to a device can onboard it onto their network and, potentially, tamper with it (e.g., install malware on the device). To address this problem, in this paper, we present a framework, called Deep Learning-based Watermarking for authorized IoT onboarding (DLWIoT), featuring a robust and fully automated image watermarking scheme based on deep neural networks. DLWIoT embeds user credentials into carrier images (e.g., QR codes printed on IoT devices), thus enabling IoT onboarding only by authorized users. Our experimental results demonstrate the feasibility of DLWIoT, indicating that authorized users can onboard IoT devices with DLWIoT within 2.5-3 sec.
Submitted 17 October, 2020;
originally announced October 2020.
-
Predictive and Causal Implications of using Shapley Value for Model Interpretation
Authors:
Sisi Ma,
Roshan Tourani
Abstract:
Shapley value is a concept from game theory. Recently, it has been used for explaining complex models produced by machine learning techniques. Although the mathematical definition of Shapley value is straightforward, the implication of using it as a model interpretation tool is yet to be described. In the current paper, we analyzed Shapley value in the Bayesian network framework. We established the relationship between Shapley value and conditional independence, a key concept in both predictive and causal modeling. Our results indicate that eliminating a variable with high Shapley value from a model does not necessarily impair predictive performance, whereas eliminating a variable with low Shapley value from a model could impair performance. Therefore, using Shapley value for feature selection does not result in the most parsimonious and predictively optimal model in the general case. More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.
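As a rough illustration of the definition the abstract above refers to, the following sketch computes exact Shapley values by averaging each player's marginal contribution over all orderings. It is not taken from the paper: the function name `shapley_values`, the toy game `toy_value`, and the player labels are hypothetical, and the toy game is chosen only to mimic two redundant variables plus one irrelevant variable.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of players to a real number (the coalition's worth).
    Cost grows factorially with the number of players, so this is for toy games only.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when it joins the players ahead of it.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical game: 'a' and 'b' carry the same signal, 'c' carries none."""
    return 1.0 if ("a" in coalition or "b" in coalition) else 0.0


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

In this toy game, "a" receives a share of the credit, yet dropping "a" leaves the coalition {"b", "c"} with the same worth as the full set. That is one simple way to see how a variable with a non-trivial Shapley value can be removed without loss, under the assumptions of this example.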
Submitted 11 August, 2020;
originally announced August 2020.
-
Democratizing the Edge: A Pervasive Edge Computing Framework
Authors:
Reza Tourani,
Srikathyayani Srikanteswara,
Satyajayant Misra,
Richard Chow,
Lily Yang,
Xiruo Liu,
Yi Zhang
Abstract:
The needs of emerging applications, such as augmented and virtual reality, federated machine learning, and autonomous driving, have motivated edge computing--the push of computation capabilities to the edge. Various edge computing architectures have emerged, including multi-access edge computing and edge-cloud, all with the premise of reducing communication latency and augmenting privacy. However, these architectures rely on static and pre-deployed infrastructure, falling short in harnessing the abundant resources at the network's edge. In this paper, we discuss the design of Pervasive Edge Computing (PEC)--a democratized edge computing framework, which enables end-user devices (e.g., smartphones, IoT devices, and vehicles) to dynamically participate in a large-scale computing ecosystem. Our vision of the democratized edge involves the real-time composition of services using available edge resources like data, software, and compute-hardware from multiple stakeholders. We discuss how the novel Named-Data Networking architecture can facilitate service deployment, discovery, invocation, and migration. We also discuss the economic models critical to the adoption of PEC and the outstanding challenges for its full realization.
Submitted 1 July, 2020;
originally announced July 2020.
-
LASeR: Lightweight Authentication and Secured Routing for NDN IoT in Smart Cities
Authors:
Travis Mick,
Reza Tourani,
Satyajayant Misra
Abstract:
Recent literature suggests that the Internet of Things (IoT) scales much better in an Information-Centric Networking (ICN) model instead of the current host-centric Internet Protocol (IP) model. In particular, the Named Data Networking (NDN) project (one of the ICN architecture flavors) offers features exploitable by IoT applications, such as stateful forwarding, in-network caching, and built-in assurance of data provenance. Though NDN-based IoT frameworks have been proposed, none have adequately and holistically addressed concerns related to secure onboarding and routing. Additionally, emerging IoT applications such as smart cities require high scalability and thus pose new challenges to NDN routing. Therefore, in this work, we propose and evaluate a novel, scalable framework for lightweight authentication and hierarchical routing in the NDN IoT (NDNoT). Our ns-3 based simulation analyses demonstrate that our framework is scalable and efficient. It supports deployment densities as high as 40,000 nodes/km² with an average onboarding convergence time of around 250 seconds and overhead of less than 20 KiB per node. This demonstrates its efficacy for emerging large-scale IoT applications such as smart cities.
Submitted 24 March, 2017;
originally announced March 2017.
-
AccConF: An Access Control Framework for Leveraging In-Network Cached Data in ICNs
Authors:
S. Misra,
R. Tourani,
F. Natividad,
T. Mick,
N. Majd,
H. Huang
Abstract:
The fast-growing Internet traffic is increasingly becoming content-based and driven by mobile users, with users more interested in data rather than its source. This has precipitated the need for an information-centric Internet architecture. Research in information-centric networks (ICNs) has resulted in novel architectures, e.g., CCN/NDN, DONA, and PSIRP/PURSUIT; all agree on named-data-based addressing and pervasive caching as integral design components. With network-wide content caching, enforcement of content access control policies becomes non-trivial. Each caching node in the network needs to enforce access control policies with the help of the content provider. This becomes inefficient and prone to unbounded latencies, especially during provider outages.
In this paper, we propose an efficient access control framework for ICN, which allows legitimate users to access and use the cached content directly, and does not require verification/authentication by an online provider authentication server or the content serving router. This framework would help reduce the impact of system down-time from server outages and reduce delivery latency by leveraging caching while guaranteeing access only to legitimate users. Experimental/simulation results demonstrate the suitability of this scheme for all users, but particularly for mobile users, especially in terms of the security and latency overheads.
Submitted 10 March, 2016;
originally announced March 2016.
-
Security, Privacy, and Access Control in Information-Centric Networking: A Survey
Authors:
Reza Tourani,
Travis Mick,
Satyajayant Misra,
Gaurav Panwar
Abstract:
Information-Centric Networking (ICN) is a new networking paradigm, which replaces the widely used host-centric networking paradigm in communication networks (e.g., Internet, mobile ad hoc networks) with an information-centric paradigm, which prioritizes the delivery of named content, oblivious of the content's origin. Content and client security are more intrinsic in the ICN paradigm than in the current host-centric paradigm, where they have been instrumented as an afterthought. By design, the ICN paradigm inherently supports several security and privacy features, such as provenance and identity privacy, which are still not effectively available in the host-centric paradigm. However, given its nascency, the ICN paradigm has several open security and privacy concerns, some that existed in the old paradigm, and some new and unique. In this article, we survey the existing literature in the security and privacy research sub-space of ICN. More specifically, we explore three broad areas: security threats, privacy risks, and access control enforcement mechanisms.
We present the underlying principle of the existing works, discuss the drawbacks of the proposed approaches, and explore potential future research directions. In the broad area of security, we review attack scenarios, such as denial of service, cache pollution, and content poisoning. In the broad area of privacy, we discuss user privacy and anonymity, name and signature privacy, and content privacy. ICN's feature of ubiquitous caching introduces a major challenge for access control enforcement that requires special attention. In this broad area, we review existing access control mechanisms including encryption-based, attribute-based, session-based, and proxy re-encryption-based access control schemes. We conclude the survey with lessons learned and scope for future work.
Submitted 1 June, 2017; v1 submitted 10 March, 2016;
originally announced March 2016.