Cryptography and Security
- [1] arXiv:2405.05435 [pdf, ps, other]
-
Title: Analysis and prevention of AI-based phishing email attacksComments: Electronics, acceptedSubjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Phishing email attacks are among the most common and most harmful cybersecurity attacks. With the emergence of generative AI, phishing attacks can be based on emails generated automatically, making it more difficult to detect them. That is, instead of a single email format sent to a large number of recipients, generative AI can be used to send each potential victim a different email, making it more difficult for cybersecurity systems to identify the scam email before it reaches the recipient. Here we describe a corpus of AI-generated phishing emails. We also use different machine learning tools to test the ability of automatic text analysis to identify AI-generated phishing emails. The results are encouraging, and show that machine learning tools can identify an AI-generated phishing email with high accuracy compared to regular emails or human-generated scam email. By applying descriptive analytic, the specific differences between AI-generated emails and manually crafted scam emails are profiled, and show that AI-generated emails are different in their style from human-generated phishing email scams. Therefore, automatic identification tools can be used as a warning for the user. The paper also describes the corpus of AI-generated phishing emails that is made open to the public, and can be used for consequent studies. While the ability of machine learning to detect AI-generated phishing email is encouraging, AI-generated phishing emails are different from regular phishing emails, and therefore it is important to train machine learning systems also with AI-generated emails in order to repel future phishing attacks that are powered by generative AI.
- [2] arXiv:2405.05469 [pdf, ps, html, other]
-
Title: PLLM-CS: Pre-trained Large Language Model (LLM) for Cyber Threat Detection in Satellite NetworksSubjects: Cryptography and Security (cs.CR)
Satellite networks are vital in facilitating communication services for various critical infrastructures. These networks can seamlessly integrate with a diverse array of systems. However, some of these systems are vulnerable due to the absence of effective intrusion detection systems, which can be attributed to limited research and the high costs associated with deploying, fine-tuning, monitoring, and responding to security breaches. To address these challenges, we propose a pretrained Large Language Model for Cyber Security , for short PLLM-CS, which is a variant of pre-trained Transformers [1], which includes a specialized module for transforming network data into contextually suitable inputs. This transformation enables the proposed LLM to encode contextual information within the cyber data. To validate the efficacy of the proposed method, we conducted empirical experiments using two publicly available network datasets, UNSW_NB 15 and TON_IoT, both providing Internet of Things (IoT)-based traffic data. Our experiments demonstrate that proposed LLM method outperforms state-of-the-art techniques such as BiLSTM, GRU, and CNN. Notably, the PLLM-CS method achieves an outstanding accuracy level of 100% on the UNSW_NB 15 dataset, setting a new standard for benchmark performance in this domain.
- [3] arXiv:2405.05525 [pdf, ps, other]
-
Title: Ditto: Quantization-aware Secure Inference of Transformers upon MPCComments: to be published in ICML 2024Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Due to the rising privacy concerns on sensitive client data and trained models like Transformers, secure multi-party computation (MPC) techniques are employed to enable secure inference despite attendant overhead. Existing works attempt to reduce the overhead using more MPC-friendly non-linear function approximations. However, the integration of quantization widely used in plaintext inference into the MPC domain remains unclear. To bridge this gap, we propose the framework named Ditto to enable more efficient quantization-aware secure Transformer inference. Concretely, we first incorporate an MPC-friendly quantization into Transformer inference and employ a quantization-aware distillation procedure to maintain the model utility. Then, we propose novel MPC primitives to support the type conversions that are essential in quantization and implement the quantization-aware MPC execution of secure quantized inference. This approach significantly decreases both computation and communication overhead, leading to improvements in overall efficiency. We conduct extensive experiments on Bert and GPT2 models to evaluate the performance of Ditto. The results demonstrate that Ditto is about $3.14\sim 4.40\times$ faster than MPCFormer (ICLR 2023) and $1.44\sim 2.35\times$ faster than the state-of-the-art work PUMA with negligible utility degradation.
- [4] arXiv:2405.05590 [pdf, ps, html, other]
-
Title: TroLLoc: Logic Locking and Layout Hardening for IC Security Closure against Hardware TrojansFangzhou Wang, Qijing Wang, Lilas Alrahis, Bangqi Fu, Shui Jiang, Xiaopeng Zhang, Ozgur Sinanoglu, Tsung-Yi Ho, Evangeline F.Y. Young, Johann KnechtelSubjects: Cryptography and Security (cs.CR); Hardware Architecture (cs.AR); Machine Learning (cs.LG)
Due to cost benefits, supply chains of integrated circuits (ICs) are largely outsourced nowadays. However, passing ICs through various third-party providers gives rise to many security threats, like piracy of IC intellectual property or insertion of hardware Trojans, i.e., malicious circuit modifications.
In this work, we proactively and systematically protect the physical layouts of ICs against post-design insertion of Trojans. Toward that end, we propose TroLLoc, a novel scheme for IC security closure that employs, for the first time, logic locking and layout hardening in unison. TroLLoc is fully integrated into a commercial-grade design flow, and TroLLoc is shown to be effective, efficient, and robust. Our work provides in-depth layout and security analysis considering the challenging benchmarks of the ISPD'22/23 contests for security closure. We show that TroLLoc successfully renders layouts resilient, with reasonable overheads, against (i) general prospects for Trojan insertion as in the ISPD'22 contest, (ii) actual Trojan insertion as in the ISPD'23 contest, and (iii) potential second-order attacks where adversaries would first (i.e., before Trojan insertion) try to bypass the locking defense, e.g., using advanced machine learning attacks. Finally, we release all our artifacts for independent verification [2]. - [5] arXiv:2405.05784 [pdf, ps, html, other]
-
Title: Link Stealing Attacks Against Inductive Graph Neural NetworksYixin Wu, Xinlei He, Pascal Berrang, Mathias Humbert, Michael Backes, Neil Zhenqiang Gong, Yang ZhangComments: To appear in the 24th Privacy Enhancing Technologies Symposium (PETS 2024), July 15-20, 2024Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
A graph neural network (GNN) is a type of neural network that is specifically designed to process graph-structured data. Typically, GNNs can be implemented in two settings, including the transductive setting and the inductive setting. In the transductive setting, the trained model can only predict the labels of nodes that were observed at the training time. In the inductive setting, the trained model can be generalized to new nodes/graphs. Due to its flexibility, the inductive setting is the most popular GNN setting at the moment. Previous work has shown that transductive GNNs are vulnerable to a series of privacy attacks. However, a comprehensive privacy analysis of inductive GNN models is still missing. This paper fills the gap by conducting a systematic privacy analysis of inductive GNNs through the lens of link stealing attacks, one of the most popular attacks that are specifically designed for GNNs. We propose two types of link stealing attacks, i.e., posterior-only attacks and combined attacks. We define threat models of the posterior-only attacks with respect to node topology and the combined attacks by considering combinations of posteriors, node attributes, and graph features. Extensive evaluation on six real-world datasets demonstrates that inductive GNNs leak rich information that enables link stealing attacks with advantageous properties. Even attacks with no knowledge about graph structures can be effective. We also show that our attacks are robust to different node similarities and different graph features. As a counterpart, we investigate two possible defenses and discover they are ineffective against our attacks, which calls for more effective defenses.
- [6] arXiv:2405.05789 [pdf, ps, other]
-
Title: High-Performance Privacy-Preserving Matrix Completion for Trajectory RecoveryComments: 7 pages, 10 figuresSubjects: Cryptography and Security (cs.CR); Numerical Analysis (math.NA)
Matrix completion has important applications in trajectory recovery and mobile social networks. However, sending raw data containing personal, sensitive information to cloud computing nodes may lead to privacy exposure issue.The privacy-preserving matrix completion is a useful approach to perform matrix completion while preserving privacy. In this paper, we propose a high-performance method for privacy-preserving matrix completion. First,we use a lightweight encryption scheme to encrypt the raw data and then perform matrix completion using alternating direction method of multipliers (ADMM). Then,the complemented matrix is decrypted and compared with the original matrix to calculate the error. This method has faster speed with higher accuracy. The results of numerical experiments reveal that the proposed method is faster than other algorithms.
- [7] arXiv:2405.05846 [pdf, ps, html, other]
-
Title: Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion ModelsSubjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
The past few years have witnessed substantial advancement in text-guided image generation powered by diffusion models. However, it was shown that text-to-image diffusion models are vulnerable to training image memorization, raising concerns on copyright infringement and privacy invasion. In this work, we perform practical analysis of memorization in text-to-image diffusion models. Targeting a set of images to protect, we conduct quantitive analysis on them without need to collect any prompts. Specifically, we first formally define the memorization of image and identify three necessary conditions of memorization, respectively similarity, existence and probability. We then reveal the correlation between the model's prediction error and image replication. Based on the correlation, we propose to utilize inversion techniques to verify the safety of target images against memorization and measure the extent to which they are memorized. Model developers can utilize our analysis method to discover memorized images or reliably claim safety against memorization. Extensive experiments on the Stable Diffusion, a popular open-source text-to-image diffusion model, demonstrate the effectiveness of our analysis method.
- [8] arXiv:2405.05906 [pdf, ps, other]
-
Title: Deep Multi-Task Learning for Malware Image ClassificationJournal-ref: Journal of Information Security and Applications, Volume 64, 2022, Page 103057Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Malicious software is a pernicious global problem. A novel multi-task learning framework is proposed in this paper for malware image classification for accurate and fast malware detection. We generate bitmap (BMP) and (PNG) images from malware features, which we feed to a deep learning classifier. Our state-of-the-art multi-task learning approach has been tested on a new dataset, for which we have collected approximately 100,000 benign and malicious PE, APK, Mach-o, and ELF examples. Experiments with seven tasks tested with 4 activation functions, ReLU, LeakyReLU, PReLU, and ELU separately demonstrate that PReLU gives the highest accuracy of more than 99.87% on all tasks. Our model can effectively detect a variety of obfuscation methods like packing, encryption, and instruction overlapping, strengthing the beneficial claims of our model, in addition to achieving the state-of-art methods in terms of accuracy.
- [9] arXiv:2405.05930 [pdf, ps, html, other]
-
Title: Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and FairnessSubjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI)
AI-generated content (AIGC) models, represented by large language models (LLM), have brought revolutionary changes to the content generation fields. The high-speed and extensive 6G technology is an ideal platform for providing powerful AIGC mobile service applications, while future 6G mobile networks also need to support intelligent and personalized mobile generation services. However, the significant ethical and security issues of current AIGC models, such as adversarial attacks, privacy, and fairness, greatly affect the credibility of 6G intelligent networks, especially in ensuring secure, private, and fair AIGC applications. In this paper, we propose TrustGAIN, a novel paradigm for trustworthy AIGC in 6G networks, to ensure trustworthy large-scale AIGC services in future 6G networks. We first discuss the adversarial attacks and privacy threats faced by AIGC systems in 6G networks, as well as the corresponding protection issues. Subsequently, we emphasize the importance of ensuring the unbiasedness and fairness of the mobile generative service in future intelligent networks. In particular, we conduct a use case to demonstrate that TrustGAIN can effectively guide the resistance against malicious or generated false information. We believe that TrustGAIN is a necessary paradigm for intelligent and trustworthy 6G networks to support AIGC services, ensuring the security, privacy, and fairness of AIGC network services.
New submissions for Friday, 10 May 2024 (showing 9 of 9 entries )
- [10] arXiv:2405.05369 (cross-list from cs.LG) [pdf, ps, html, other]
-
Title: Model Reconstruction Using Counterfactual Explanations: Mitigating the Decision Boundary ShiftSubjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Machine Learning (stat.ML)
Counterfactual explanations find ways of achieving a favorable model outcome with minimum input perturbation. However, counterfactual explanations can also be exploited to steal the model by strategically training a surrogate model to give similar predictions as the original (target) model. In this work, we investigate model extraction by specifically leveraging the fact that the counterfactual explanations also lie quite close to the decision boundary. We propose a novel strategy for model extraction that we call Counterfactual Clamping Attack (CCA) which trains a surrogate model using a unique loss function that treats counterfactuals differently than ordinary instances. Our approach also alleviates the related problem of decision boundary shift that arises in existing model extraction attacks which treat counterfactuals as ordinary instances. We also derive novel mathematical relationships between the error in model approximation and the number of queries using polytope theory. Experimental results demonstrate that our strategy provides improved fidelity between the target and surrogate model predictions on several real world datasets.
- [11] arXiv:2405.05428 (cross-list from cs.CV) [pdf, ps, html, other]
-
Title: Adversary-Guided Motion Retargeting for Skeleton AnonymizationSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Skeleton-based motion visualization is a rising field in computer vision, especially in the case of virtual reality (VR). With further advancements in human-pose estimation and skeleton extracting sensors, more and more applications that utilize skeleton data have come about. These skeletons may appear to be anonymous but they contain embedded personally identifiable information (PII). In this paper we present a new anonymization technique that is based on motion retargeting, utilizing adversary classifiers to further remove PII embedded in the skeleton. Motion retargeting is effective in anonymization as it transfers the movement of the user onto the a dummy skeleton. In doing so, any PII linked to the skeleton will be based on the dummy skeleton instead of the user we are protecting. We propose a Privacy-centric Deep Motion Retargeting model (PMR) which aims to further clear the retargeted skeleton of PII through adversarial learning. In our experiments, PMR achieves motion retargeting utility performance on par with state of the art models while also reducing the performance of privacy attacks.
- [12] arXiv:2405.05502 (cross-list from cs.CV) [pdf, ps, other]
-
Title: Towards Accurate and Robust Architectures via Neural Architecture SearchComments: Accepted by CVPR2024. arXiv admin note: substantial text overlap with arXiv:2212.14049Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
To defend deep neural networks from adversarial attacks, adversarial training has been drawing increasing attention for its effectiveness. However, the accuracy and robustness resulting from the adversarial training are limited by the architecture, because adversarial training improves accuracy and robustness by adjusting the weight connection affiliated to the architecture. In this work, we propose ARNAS to search for accurate and robust architectures for adversarial training. First we design an accurate and robust search space, in which the placement of the cells and the proportional relationship of the filter numbers are carefully determined. With the design, the architectures can obtain both accuracy and robustness by deploying accurate and robust structures to their sensitive positions, respectively. Then we propose a differentiable multi-objective search strategy, performing gradient descent towards directions that are beneficial for both natural loss and adversarial loss, thus the accuracy and robustness can be guaranteed at the same time. We conduct comprehensive experiments in terms of white-box attacks, black-box attacks, and transferability. Experimental results show that the searched architecture has the strongest robustness with the competitive accuracy, and breaks the traditional idea that NAS-based architectures cannot transfer well to complex tasks in robustness scenarios. By analyzing outstanding architectures searched, we also conclude that accurate and robust neural architectures tend to deploy different structures near the input and output, which has great practical significance on both hand-crafting and automatically designing of accurate and robust architectures.
- [13] arXiv:2405.05573 (cross-list from cs.CV) [pdf, ps, html, other]
-
Title: Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive TriggersSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
Poisoning-based backdoor attacks expose vulnerabilities in the data preparation stage of deep neural network (DNN) training. The DNNs trained on the poisoned dataset will be embedded with a backdoor, making them behave well on clean data while outputting malicious predictions whenever a trigger is applied. To exploit the abundant information contained in the input data to output label mapping, our scheme utilizes the network trained from the clean dataset as a trigger generator to produce poisons that significantly raise the success rate of backdoor attacks versus conventional approaches. Specifically, we provide a new categorization of triggers inspired by the adversarial technique and develop a multi-label and multi-payload Poisoning-based backdoor attack with Positive Triggers (PPT), which effectively moves the input closer to the target label on benign classifiers. After the classifier is trained on the poisoned dataset, we can generate an input-label-aware trigger to make the infected classifier predict any given input to any target label with a high possibility. Under both dirty- and clean-label settings, we show empirically that the proposed attack achieves a high attack success rate without sacrificing accuracy across various datasets, including SVHN, CIFAR10, GTSRB, and Tiny ImageNet. Furthermore, the PPT attack can elude a variety of classical backdoor defenses, proving its effectiveness.
- [14] arXiv:2405.05588 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Model Inversion Robustness: Can Transfer Learning Help?Journal-ref: CVPR 2024Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
Model Inversion (MI) attacks aim to reconstruct private training data by abusing access to machine learning models. Contemporary MI attacks have achieved impressive attack performance, posing serious threats to privacy. Meanwhile, all existing MI defense methods rely on regularization that is in direct conflict with the training objective, resulting in noticeable degradation in model utility. In this work, we take a different perspective, and propose a novel and simple Transfer Learning-based Defense against Model Inversion (TL-DMI) to render MI-robust models. Particularly, by leveraging TL, we limit the number of layers encoding sensitive information from private training dataset, thereby degrading the performance of MI attack. We conduct an analysis using Fisher Information to justify our method. Our defense is remarkably simple to implement. Without bells and whistles, we show in extensive experiments that TL-DMI achieves state-of-the-art (SOTA) MI robustness. Our code, pre-trained models, demo and inverted data are available at: this https URL
- [15] arXiv:2405.05610 (cross-list from cs.CL) [pdf, ps, other]
-
Title: Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLMSubjects: Computation and Language (cs.CL); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Large language models (LLMs) have achieved remarkable performance in various natural language processing tasks, especially in dialogue systems. However, LLM may also pose security and moral threats, especially in multi round conversations where large models are more easily guided by contextual content, resulting in harmful or biased responses. In this paper, we present a novel method to attack LLMs in multi-turn dialogues, called CoA (Chain of Attack). CoA is a semantic-driven contextual multi-turn attack method that adaptively adjusts the attack policy through contextual feedback and semantic relevance during multi-turn of dialogue with a large model, resulting in the model producing unreasonable or harmful content. We evaluate CoA on different LLMs and datasets, and show that it can effectively expose the vulnerabilities of LLMs, and outperform existing attack methods. Our work provides a new perspective and tool for attacking and defending LLMs, and contributes to the security and ethical assessment of dialogue systems.
- [16] arXiv:2405.05611 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Privacy-Preserving Edge Federated Learning for Intelligent Mobile-Health SystemsSubjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Machine Learning (ML) algorithms are generally designed for scenarios in which all data is stored in one data center, where the training is performed. However, in many applications, e.g., in the healthcare domain, the training data is distributed among several entities, e.g., different hospitals or patients' mobile devices/sensors. At the same time, transferring the data to a central location for learning is certainly not an option, due to privacy concerns and legal issues, and in certain cases, because of the communication and computation overheads. Federated Learning (FL) is the state-of-the-art collaborative ML approach for training an ML model across multiple parties holding local data samples, without sharing them. However, enabling learning from distributed data over such edge Internet of Things (IoT) systems (e.g., mobile-health and wearable technologies, involving sensitive personal/medical data) in a privacy-preserving fashion presents a major challenge mainly due to their stringent resource constraints, i.e., limited computing capacity, communication bandwidth, memory storage, and battery lifetime. In this paper, we propose a privacy-preserving edge FL framework for resource-constrained mobile-health and wearable technologies over the IoT infrastructure. We evaluate our proposed framework extensively and provide the implementation of our technique on Amazon's AWS cloud platform based on the seizure detection application in epilepsy monitoring using wearable technologies.
- [17] arXiv:2405.05724 (cross-list from cs.SI) [pdf, ps, other]
-
Title: Private Online Community Detection for Censored Block ModelsSubjects: Social and Information Networks (cs.SI); Cryptography and Security (cs.CR); Information Theory (cs.IT)
We study the private online change detection problem for dynamic communities, using a censored block model (CBM). Focusing on the notion of edge differential privacy (DP), we seek to understand the fundamental tradeoffs between the privacy budget, detection delay, and exact community recovery of community labels. We establish the theoretical lower bound on the delay in detecting changes privately and propose an algorithm capable of identifying changes in the community structure, while maintaining user privacy. Further, we provide theoretical guarantees for the effectiveness of our proposed method by showing necessary and sufficient conditions on change detection and exact recovery under edge DP. Simulation and real data examples are provided to validate the proposed method.
Cross submissions for Friday, 10 May 2024 (showing 8 of 8 entries )
- [18] arXiv:2011.08069 (replaced) [pdf, ps, other]
-
Title: Reconciling Security and Utility in Next-Generation Epidemic Risk Mitigation SystemsPierfrancesco Ingo, Nichole Boufford, Ming Cheng Jiang, Rowan Lindsay, Matthew Lentz, Gilles Barthe, Manuel Gomez-Rodriguez, Bernhard Schölkopf, Deepak Garg, Peter Druschel, Aastha MehtaSubjects: Cryptography and Security (cs.CR); Computers and Society (cs.CY); Social and Information Networks (cs.SI); Populations and Evolution (q-bio.PE)
Epidemics like the recent COVID-19 require proactive contact tracing and epidemiological analysis to predict and subsequently contain infection transmissions. The proactive measures require large scale data collection, which simultaneously raise concerns regarding users' privacy. Digital contact tracing systems developed in response to COVID-19 either collected extensive data for effective analytics at the cost of users' privacy or collected minimal data for the sake of user privacy but were ineffective in predicting and mitigating the epidemic risks. We present Silmarillion--in preparation for future epidemics--a system that reconciles user's privacy with rich data collection for higher utility. In Silmarillion, user devices record Bluetooth encounters with beacons installed in strategic locations. The beacons further enrich the encounters with geo-location, location type, and environment conditions at the beacon installation site. This enriched information enables detailed scientific analysis of disease parameters as well as more accurate personalized exposure risk notification. At the same time, Silmarillion provides privacy to all participants and non-participants at the same level as that guaranteed in digital and manual contact tracing. We describe the design of Silmarillion and its communication protocols that ensure user privacy and data security. We also evaluate a prototype of Silmarillion built using low-end IoT boards, showing that the power consumption and user latencies are adequately low for a practical deployment. Finally, we briefly report on a small-scale deployment within a university building as a proof-of-concept.
- [19] arXiv:2011.14341 (replaced) [pdf, ps, other]
-
Title: Optimizing Virtual Payment Channel Establishment in the Face of On-Path AdversariesLukas Aumayr, Esra Ceylan, Yannik Kopyciok, Matteo Maffei, Pedro Moreno-Sanchez, Iosif Salem, Stefan SchmidComments: This is the extended technical report of the work accepted at IFIP Networking 2024Subjects: Cryptography and Security (cs.CR)
Payment channel networks (PCNs) are among the most promising solutions to the scalability issues in permissionless blockchains, by allowing parties to pay each other off-chain through a path of payment channels (PCs). However, routing transactions comes at a cost which is proportional to the number of intermediaries, since each charges a fee for the routing service. Furthermore, analogous to other networks, malicious intermediaries in the payment path can lead to security and privacy threats. Virtual channels (VCs), i.e., bridges over PC paths, mitigate the above PCN issues, as an intermediary participates only once to set up the VC and is then excluded from every future VC transaction. However, similar to PCs, creating a VC has a cost that must be paid out of the bridged PCs' balance. Currently, we are missing guidelines to where and how many VCs to set up. Ideally, VCs should minimize transaction costs while mitigating security and privacy threats from on-path adversaries.
In this work, we address for the first time the VC setup problem, formalizing it as an optimization problem. We present an integer linear program (ILP) to compute the globally optimal VC setup strategy in terms of transaction costs, security, and privacy. We then accompany the computationally heavy ILP with a fast local greedy algorithm. Our model and algorithms can be used with any on-path adversary, given that its strategy can be expressed as a set of corrupted nodes that is estimated by the honest nodes. We conduct an evaluation of the greedy algorithm over a snapshot of the Lightning Network (LN), the largest Bitcoin-based PCN. Our results confirm on real-world data that our greedy strategy minimizes costs while protecting against security and privacy threats of on-path adversaries. These findings may serve the LN community as guidelines for the deployment of VCs. - [20] arXiv:2306.04642 (replaced) [pdf, ps, html, other]
-
Title: DiffusionShield: A Watermark for Copyright Protection against Generative Diffusion ModelsSubjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Recently, Generative Diffusion Models (GDMs) have showcased their remarkable capabilities in learning and generating images. A large community of GDMs has naturally emerged, further promoting the diversified applications of GDMs in various fields. However, this unrestricted proliferation has raised serious concerns about copyright protection. For example, artists including painters and photographers are becoming increasingly concerned that GDMs could effortlessly replicate their unique creative works without authorization. In response to these challenges, we introduce a novel watermarking scheme, DiffusionShield, tailored for GDMs. DiffusionShield protects images from copyright infringement by GDMs through encoding the ownership information into an imperceptible watermark and injecting it into the images. Its watermark can be easily learned by GDMs and will be reproduced in their generated images. By detecting the watermark from generated images, copyright infringement can be exposed with evidence. Benefiting from the uniformity of the watermarks and the joint optimization method, DiffusionShield ensures low distortion of the original image, high watermark detection performance, and the ability to embed lengthy messages. We conduct rigorous and comprehensive experiments to show the effectiveness of DiffusionShield in defending against infringement by GDMs and its superiority over traditional watermarking methods. The code for DiffusionShield is accessible in this https URL.
- [21] arXiv:2306.14882 (replaced) [pdf, ps, html, other]
-
Title: Citadel: Real-World Hardware-Software Contracts for Secure Enclaves Through Microarchitectural Isolation and Controlled SpeculationSubjects: Cryptography and Security (cs.CR); Hardware Architecture (cs.AR)
Hardware isolation primitives such as secure enclaves aim to protect sensitive programs, but remain vulnerable to transient execution attacks. Complete microarchitectural isolation is not a satisfactory defense mechanism as it leaves out public shared memory, critical for usability and application performance. Conversely, hardware-software co-designs for secure speculation can counter these attacks but are not yet practical, since they make assumptions on the speculation modes, the exposed microarchitectural state, and the software, which are all hard to support for the entire software stack. This paper advocates for processors to incorporate microarchitectural isolation primitives and mechanisms for controlled speculation, enabling different execution modes. These modes can restrict what is exposed to an attacker, effectively balancing performance and program-analysis complexity. We introduce two mechanisms to securely share memory between an enclave and an untrusted OS in an out-of-order processor. We show that our two modes are complementary, achieving speculative non-interference with a reasonable performance impact, while requiring minimal code annotation and simple program analysis doable by hand. Our prototype, Citadel, is a multicore processor running on an FPGA, booting untrusted Linux, and supporting comprehensive enclave capabilities, such as shared memory, and remote attestation. To our knowledge, Citadel is the first end-to-end enclave platform to run secure applications, such as cryptographic libraries or small private inference workloads, on a speculative out-of-order multicore processor while protecting against a significant class of side-channel attacks.
- [22] arXiv:2309.01098 (replaced) [pdf, ps, other]
-
Title: martFL: Enabling Utility-Driven Data Marketplace with a Robust and Verifiable Federated Learning ArchitectureComments: Another version of this paper is published in the proceedings of ACM Conference on Computer and Communications Security (CCS) 2023Subjects: Cryptography and Security (cs.CR)
The development of machine learning models requires a large amount of training data. Data marketplaces are essential for trading high-quality, private-domain data not publicly available online. However, due to growing data privacy concerns, direct data exchange is inappropriate. Federated Learning (FL) is a distributed machine learning paradigm that exchanges data utilities (in form of local models or gradients) among multiple parties without directly sharing the raw data. However, several challenges exist when applying existing FL architectures to construct a data marketplace: (i) In existing FL architectures, Data Acquirers (DAs) cannot privately evaluate local models from Data Providers (DPs) prior to trading; (ii) Model aggregation protocols in existing FL designs struggle to exclude malicious DPs without "overfitting" to the DA's (possibly biased) root dataset; (iii) Prior FL designs lack a proper billing mechanism to enforce the DA to fairly allocate the reward according to contributions made by different DPs. To address above challenges, we propose martFL, the first federated learning architecture that is specifically designed to enable a secure utility-driven data marketplace. At a high level, martFL is powered by two innovative designs: (i) a quality-aware model aggregation protocol that achieves robust local model aggregation even when the DA's root dataset is biased; (ii) a verifiable data transaction protocol that enables the DA to prove, both succinctly and in zero-knowledge, that it has faithfully aggregates the local models submitted by different DPs according to the committed aggregation weights, based on which the DPs can unambiguously claim the corresponding reward. We implement a prototype of martFL and evaluate it extensively over various tasks. The results show that martFL can improve the model accuracy by up to 25% while saving up to 64% data acquisition cost.
- [23] arXiv:2312.10593 (replaced) [pdf, ps, other]
-
Title: A Novel RFID Authentication Protocol Based on A Block-Order-Modulus Variable Matrix Encryption AlgorithmSubjects: Cryptography and Security (cs.CR); Signal Processing (eess.SP)
In this paper, authentication for mobile radio frequency identification (RFID) systems with low-cost tags is studied. Firstly, an adaptive modulus (AM) encryption algorithm is proposed. Subsequently, in order to enhance the security without additional storage of new key matrices, a self-updating encryption order (SUEO) algorithm is designed. Furthermore, a diagonal block local transpose key matrix (DBLTKM) encryption algorithm is presented, which effectively expands the feasible domain of the key space. Based on the above three algorithms, a novel joint AM-SUEO-DBLTKM encryption algorithm is constructed. Making full use of the advantages of the proposed joint algorithm, a two-way RFID authentication protocol, named AM-SUEO-DBLTKM-RFID, is proposed for mobile RFID systems. In addition, the Burrows-Abadi-Needham (BAN) logic and security analysis indicate that the proposed AM-SUEO-DBLTKM-RFID protocol can effectively combat various typical attacks. Numerical results demonstrate that the proposed AM-SUEO-DBLTKM algorithm can save 99.59\% of tag storage over traditional algorithms. Finally, the low computational complexity as well as the low storage cost of the proposed AM-SUEO-DBLTKM-RFID protocol facilitates deployment within low-cost RFID tags.
- [24] arXiv:2405.04355 (replaced) [pdf, ps, other]
-
Title: SmmPack: Obfuscation for SMM Modules with TPM Sealed KeyComments: 21 pages, 7 figures, 4 tables. Accepted at the 21st Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA 2024)Subjects: Cryptography and Security (cs.CR)
System Management Mode (SMM) is the highest-privileged operating mode of x86 and x86-64 processors. Through SMM exploitation, attackers can tamper with the Unified Extensible Firmware Interface (UEFI) firmware, disabling the security mechanisms implemented by the operating system and hypervisor. Vulnerabilities enabling SMM code execution are often reported as Common Vulnerabilities and Exposures (CVEs); however, no security mechanisms currently exist to prevent attackers from analyzing those vulnerabilities. To increase the cost of vulnerability analysis of SMM modules, we introduced SmmPack. The core concept of SmmPack involves encrypting an SMM module with the key securely stored in a Trusted Platform Module (TPM). We assessed the effectiveness of SmmPack in preventing attackers from obtaining and analyzing SMM modules using various acquisition methods. Our results show that SmmPack significantly increases the cost by narrowing down the means of module acquisition. Furthermore, we demonstrated that SmmPack operates without compromising the performance of the original SMM modules. We also clarified the management and adoption methods of SmmPack, as well as the procedure for applying BIOS updates, and demonstrated that the implementation of SmmPack is realistic.
- [25] arXiv:2405.04760 (replaced) [pdf, ps, other]
-
Title: Large Language Models for Cyber Security: A Systematic Literature ReviewHanXiang Xu, ShenAo Wang, NingKe Li, KaiLong Wang, YanJie Zhao, Kai Chen, Ting Yu, Yang Liu, HaoYu WangComments: 46 pages,6 figuresSubjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
The rapid advancement of Large Language Models (LLMs) has opened up new opportunities for leveraging artificial intelligence in various domains, including cybersecurity. As the volume and sophistication of cyber threats continue to grow, there is an increasing need for intelligent systems that can automatically detect vulnerabilities, analyze malware, and respond to attacks. In this survey, we conduct a comprehensive review of the literature on the application of LLMs in cybersecurity (LLM4Security). By comprehensively collecting over 30K relevant papers and systematically analyzing 127 papers from top security and software engineering venues, we aim to provide a holistic view of how LLMs are being used to solve diverse problems across the cybersecurity domain. Through our analysis, we identify several key findings. First, we observe that LLMs are being applied to a wide range of cybersecurity tasks, including vulnerability detection, malware analysis, network intrusion detection, and phishing detection. Second, we find that the datasets used for training and evaluating LLMs in these tasks are often limited in size and diversity, highlighting the need for more comprehensive and representative datasets. Third, we identify several promising techniques for adapting LLMs to specific cybersecurity domains, such as fine-tuning, transfer learning, and domain-specific pre-training. Finally, we discuss the main challenges and opportunities for future research in LLM4Security, including the need for more interpretable and explainable models, the importance of addressing data privacy and security concerns, and the potential for leveraging LLMs for proactive defense and threat hunting. Overall, our survey provides a comprehensive overview of the current state-of-the-art in LLM4Security and identifies several promising directions for future research.
- [26] arXiv:2405.04803 (replaced) [pdf, ps, other]
-
Title: Blockchains for Internet of Things: Fundamentals, Applications, and ChallengesSubjects: Cryptography and Security (cs.CR); Networking and Internet Architecture (cs.NI)
Internet of Things (IoT) services necessitate the storage, transmission, and analysis of diverse data for inference, autonomy, and control. Blockchains, with their inherent properties of decentralization and security, offer efficient database solutions for these devices through consensus-based data sharing. However, it's essential to recognize that not every blockchain system is suitable for specific IoT applications, and some might be more beneficial when excluded with privacy concerns. For example, public blockchains are not suitable for storing sensitive data. This paper presents a detailed review of three distinct blockchains tailored for enhancing IoT applications. We initially delve into the foundational aspects of three blockchain systems, highlighting their strengths, limitations, and implementation needs. Additionally, we discuss the security issues in different blockchains. Subsequently, we explore the blockchain's application in three pivotal IoT areas: edge AI, communications, and healthcare. We underscore potential challenges and the future directions for integrating different blockchains in IoT. Ultimately, this paper aims to offer a comprehensive perspective on the synergies between blockchains and the IoT ecosystem, highlighting the opportunities and complexities involved.
- [27] arXiv:2205.00135 (replaced) [pdf, ps, html, other]
-
Title: Failing to hash into supersingular isogeny graphsJeremy Booher, Ross Bowden, Javad Doliskani, Tako Boris Fouotsa, Steven D. Galbraith, Sabrina Kunzweiler, Simon-Philipp Merz, Christophe Petit, Benjamin Smith, Katherine E. Stange, Yan Bo Ti, Christelle Vincent, José Felipe Voloch, Charlotte Weitkämper, Lukas ZobernigComments: 34 pages, 8 figuresSubjects: Number Theory (math.NT); Cryptography and Security (cs.CR)
An important open problem in supersingular isogeny-based cryptography is to produce, without a trusted authority, concrete examples of "hard supersingular curves" that is, equations for supersingular curves for which computing the endomorphism ring is as difficult as it is for random supersingular curves. A related open problem is to produce a hash function to the vertices of the supersingular $\ell$-isogeny graph which does not reveal the endomorphism ring, or a path to a curve of known endomorphism ring. Such a hash function would open up interesting cryptographic applications. In this paper, we document a number of (thus far) failed attempts to solve this problem, in the hope that we may spur further research, and shed light on the challenges and obstacles to this endeavour. The mathematical approaches contained in this article include: (i) iterative root-finding for the supersingular polynomial; (ii) gcd's of specialized modular polynomials; (iii) using division polynomials to create small systems of equations; (iv) taking random walks in the isogeny graph of abelian surfaces; and (v) using quantum random walks.
- [28] arXiv:2301.10053 (replaced) [pdf, ps, other]
-
Title: A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic DataJournal-ref: Published in the Proceedings of the 33rd USENIX Security Symposium (USENIX Security 2024), please cite accordinglySubjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Recent advances in synthetic data generation (SDG) have been hailed as a solution to the difficult problem of sharing sensitive data while protecting privacy. SDG aims to learn statistical properties of real data in order to generate "artificial" data that are structurally and statistically similar to sensitive data. However, prior research suggests that inference attacks on synthetic data can undermine privacy, but only for specific outlier records. In this work, we introduce a new attribute inference attack against synthetic data. The attack is based on linear reconstruction methods for aggregate statistics, which target all records in the dataset, not only outliers. We evaluate our attack on state-of-the-art SDG algorithms, including Probabilistic Graphical Models, Generative Adversarial Networks, and recent differentially private SDG mechanisms. By defining a formal privacy game, we show that our attack can be highly accurate even on arbitrary records, and that this is the result of individual information leakage (as opposed to population-level inference). We then systematically evaluate the tradeoff between protecting privacy and preserving statistical utility. Our findings suggest that current SDG methods cannot consistently provide sufficient privacy protection against inference attacks while retaining reasonable utility. The best method evaluated, a differentially private SDG mechanism, can provide both protection against inference attacks and reasonable utility, but only in very specific settings. Lastly, we show that releasing a larger number of synthetic records can improve utility but at the cost of making attacks far more effective.
- [29] arXiv:2311.05304 (replaced) [pdf, ps, other]
-
Title: Data Valuation and Detections in Federated LearningComments: CVPR 2024Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Federated Learning (FL) enables collaborative model training while preserving the privacy of raw data. A challenge in this framework is the fair and efficient valuation of data, which is crucial for incentivizing clients to contribute high-quality data in the FL task. In scenarios involving numerous data clients within FL, it is often the case that only a subset of clients and datasets are pertinent to a specific learning task, while others might have either a negative or negligible impact on the model training process. This paper introduces a novel privacy-preserving method for evaluating client contributions and selecting relevant datasets without a pre-specified training algorithm in an FL task. Our proposed approach FedBary, utilizes Wasserstein distance within the federated context, offering a new solution for data valuation in the FL framework. This method ensures transparent data valuation and efficient computation of the Wasserstein barycenter and reduces the dependence on validation datasets. Through extensive empirical experiments and theoretical analyses, we demonstrate the potential of this data valuation method as a promising avenue for FL research.
- [30] arXiv:2401.04956 (replaced) [pdf, ps, other]
-
Title: EmMixformer: Mix transformer for eye movement recognitionSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
Eye movement (EM) is a new highly secure biometric behavioral modality that has received increasing attention in recent years. Although deep neural networks, such as convolutional neural network (CNN), have recently achieved promising performance, current solutions fail to capture local and global temporal dependencies within eye movement data. To overcome this problem, we propose in this paper a mixed transformer termed EmMixformer to extract time and frequency domain information for eye movement recognition. To this end, we propose a mixed block consisting of three modules, transformer, attention Long short-term memory (attention LSTM), and Fourier transformer. We are the first to attempt leveraging transformer to learn long temporal dependencies within eye movement. Second, we incorporate the attention mechanism into LSTM to propose attention LSTM with the aim to learn short temporal dependencies. Third, we perform self attention in the frequency domain to learn global features. As the three modules provide complementary feature representations in terms of local and global dependencies, the proposed EmMixformer is capable of improving recognition accuracy. The experimental results on our eye movement dataset and two public eye movement datasets show that the proposed EmMixformer outperforms the state of the art by achieving the lowest verification error.
- [31] arXiv:2401.09027 (replaced) [pdf, ps, other]
-
Title: Exact Homomorphic EncryptionSubjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR)
Inspired by the concept of fault tolerance quantum computation, this article proposes a framework dubbed Exact Homomorphic Encryption, EHE, enabling exact computations on encrypted data without the need for pre-decryption. The introduction of quantum gates is a critical step for constructing the message encryption and the computation encryption within the framework. Of significance is that both encryptions are respectively accomplished in a multivariate polynomial set generated by quantum gates. Two fundamental traits of quantum gates, the invertibility and the noncommutativity, establish the success of EHE. The encrypted computation is exact because its encryption transformation is conducted with invertible gates. In the same vein, decryptions for both an encrypted message and encrypted computation are exact. The second trait of noncommutativity among applied quantum gates brings forth the security for the two encryptions. Toward the message encryption, a plaintext is encoded into a ciphertext via a polynomial set generated by a product of noncommuting gates randomly chosen. In the computation encryption, a desired operation is encoded into an encrypted polynomial set generated by another product of noncommuting gates. The encrypted computation is then the evaluation of the encrypted polynomial set on the ciphertext and is referred to as the cryptovaluation. EHE is not only attainable on quantum computers, but also straightforwardly realizable on traditional computing environments. Surpassing the standard security 2^128 of quantum resilience, both the encryptions further reach a security greater than the suggested threshold 2^1024 and are characterized as hyper quantum-resilient. Thanks to the two essential traits of quantum gates, this framework can be regarded as the initial tangible manifestation of the concept noncommutative cryptography.
- [32] arXiv:2405.01795 (replaced) [pdf, ps, other]
-
Title: Human Factors in the LastPass BreachSubjects: Human-Computer Interaction (cs.HC); Cryptography and Security (cs.CR)
This paper examines the complex nature of cyber attacks through an analysis of the LastPass breach. It argues for the integration of human-centric considerations into cybersecurity measures, focusing on mitigating factors such as goal-directed behavior, cognitive overload, human biases (e.g., optimism, anchoring), and risky behaviors. Findings from an analysis of this breach offers support to the perspective that addressing both the human and technical dimensions of cyber defense can significantly enhance the resilience of cyber systems against complex threats. This means maintaining a balanced approach while simultaneously simplifying user interactions, making users aware of biases, and discouraging risky practices are essential for preventing cyber incidents.
- [33] arXiv:2405.04344 (replaced) [pdf, ps, other]
-
Title: Enhancing Scalability of Metric Differential Privacy via Secret Dataset Partitioning and Benders DecompositionComments: To be published in IJCAI 2024Subjects: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Metric Differential Privacy (mDP) extends the concept of Differential Privacy (DP) to serve as a new paradigm of data perturbation. It is designed to protect secret data represented in general metric space, such as text data encoded as word embeddings or geo-location data on the road network or grid maps. To derive an optimal data perturbation mechanism under mDP, a widely used method is linear programming (LP), which, however, might suffer from a polynomial explosion of decision variables, rendering it impractical in large-scale mDP.
In this paper, our objective is to develop a new computation framework to enhance the scalability of the LP-based mDP. Considering the connections established by the mDP constraints among the secret records, we partition the original secret dataset into various subsets. Building upon the partition, we reformulate the LP problem for mDP and solve it via Benders Decomposition, which is composed of two stages: (1) a master program to manage the perturbation calculation across subsets and (2) a set of subproblems, each managing the perturbation derivation within a subset. Our experimental results on multiple datasets, including geo-location data in the road network/grid maps, text data, and synthetic data, underscore our proposed mechanism's superior scalability and efficiency.