Cryptography and Security

New submissions

Submissions received from Wed 24 Apr 24 to Thu 25 Apr 24, announced Fri, 26 Apr 24

New submissions
Cross-lists
Replacements

[ total of 39 entries: 1-39 ]
[ showing up to 2000 entries per page: fewer | more ]

New submissions for Fri, 26 Apr 24

[1] arXiv:2404.16117 [pdf, other]: Title: Cybersecurity Assessment of the Polar Bluetooth Low Energy Heart-rate Sensor

Authors: Smone Soderi

Comments: Body Area Networks: Smart IoT and Big Data for Intelligent Health Management, 2019

Subjects: Cryptography and Security (cs.CR); Computers and Society (cs.CY); Networking and Internet Architecture (cs.NI)

Wireless communications among wearable and implantable devices implement the information exchange around the human body. Wireless body area network (WBAN) technology enables non-invasive applications in our daily lives. Wireless connected devices improve the quality of many services, and they make procedures easier. On the other hand, they open up large attack surfaces and introduces potential security vulnerabilities. Bluetooth low energy (BLE) is a low-power protocol widely used in wireless personal area networks (WPANs). This paper analyzes the security vulnerabilities of a BLE heart-rate sensor. By observing the received signal strength indicator (RSSI) variations, it is possible to detect anomalies in the BLE connection. The case-study shows that an attacker can easily intercept and manipulate the data transmitted between the mobile app and the BLE device. With this research, the author would raise awareness about the security of the heart-rate information that we can receive from our wireless body sensors.
[2] arXiv:2404.16118 [pdf, other]: Title: Act as a Honeytoken Generator! An Investigation into Honeytoken Generation with Large Language Models

Authors: Daniel Reti, Norman Becker, Tillmann Angeli, Anasuya Chattopadhyay, Daniel Schneider, Sebastian Vollmer, Hans D. Schotten

Comments: 12 pages

Subjects: Cryptography and Security (cs.CR)

With the increasing prevalence of security incidents, the adoption of deception-based defense strategies has become pivotal in cyber security. This work addresses the challenge of scalability in designing honeytokens, a key component of such defense mechanisms. The manual creation of honeytokens is a tedious task. Although automated generators exists, they often lack versatility, being specialized for specific types of honeytokens, and heavily rely on suitable training datasets. To overcome these limitations, this work systematically investigates the approach of utilizing Large Language Models (LLMs) to create a variety of honeytokens. Out of the seven different honeytoken types created in this work, such as configuration files, databases, and log files, two were used to evaluate the optimal prompt. The generation of robots.txt files and honeywords was used to systematically test 210 different prompt structures, based on 16 prompt building blocks. Furthermore, all honeytokens were tested across different state-of-the-art LLMs to assess the varying performance of different models. Prompts performing optimally on one LLMs do not necessarily generalize well to another. Honeywords generated by GPT-3.5 were found to be less distinguishable from real passwords compared to previous methods of automated honeyword generation. Overall, the findings of this work demonstrate that generic LLMs are capable of creating a wide array of honeytokens using the presented prompt structures.
[3] arXiv:2404.16120 [pdf, other]: Title: Securing Hybrid Wireless Body Area Networks (HyWBAN): Advancements in Semantic Communications and Jamming Techniques

Authors: Simone Soderi, Mariella Särestöniemi, Syifaul Fuada, Matti Hämäläinen, Marcos Katz, Jari Iinatti

Journal-ref: Digital Health and Wireless Solutions, 2024

Subjects: Cryptography and Security (cs.CR); Networking and Internet Architecture (cs.NI)

This paper explores novel strategies to strengthen the security of Hybrid Wireless Body Area Networks (HyWBANs), essential in smart healthcare and Internet of Things (IoT) applications. Recognizing the vulnerability of HyWBAN to sophisticated cyber-attacks, we propose an innovative combination of semantic communications and jamming receivers. This dual-layered security mechanism protects against unauthorized access and data breaches, particularly in scenarios involving in-body to on-body communication channels. We conduct comprehensive laboratory measurements to understand hybrid (radio and optical) communication propagation through biological tissues and utilize these insights to refine a dataset for training a Deep Learning (DL) model. These models, in turn, generate semantic concepts linked to cryptographic keys for enhanced data confidentiality and integrity using a jamming receiver. The proposed model demonstrates a significant reduction in energy consumption compared to traditional cryptographic methods, like Elliptic Curve Diffie-Hellman (ECDH), especially when supplemented with jamming. Our approach addresses the primary security concerns and sets the baseline for future secure biomedical communication systems advancements.
[4] arXiv:2404.16195 [pdf, other]: Title: A Game-Theoretic Analysis of Auditing Differentially Private Algorithms with Epistemically Disparate Herd

Authors: Ya-Ting Yang, Tao Zhang, Quanyan Zhu

Subjects: Cryptography and Security (cs.CR); Computer Science and Game Theory (cs.GT)

Privacy-preserving AI algorithms are widely adopted in various domains, but the lack of transparency might pose accountability issues. While auditing algorithms can address this issue, machine-based audit approaches are often costly and time-consuming. Herd audit, on the other hand, offers an alternative solution by harnessing collective intelligence. Nevertheless, the presence of epistemic disparity among auditors, resulting in varying levels of expertise and access to knowledge, may impact audit performance. An effective herd audit will establish a credible accountability threat for algorithm developers, incentivizing them to uphold their claims. In this study, our objective is to develop a systematic framework that examines the impact of herd audits on algorithm developers using the Stackelberg game approach. The optimal strategy for auditors emphasizes the importance of easy access to relevant information, as it increases the auditors' confidence in the audit process. Similarly, the optimal choice for developers suggests that herd audit is viable when auditors face lower costs in acquiring knowledge. By enhancing transparency and accountability, herd audit contributes to the responsible development of privacy-preserving algorithms.
[5] arXiv:2404.16212 [pdf, other]: Title: An Analysis of Recent Advances in Deepfake Image Detection in an Evolving Threat Landscape

Authors: Sifat Muhammad Abdullah, Aravind Cheruvu, Shravya Kanchi, Taejoong Chung, Peng Gao, Murtuza Jadliwala, Bimal Viswanath

Comments: Accepted to IEEE S&P 2024; 19 pages, 10 figures

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Deepfake or synthetic images produced using deep generative models pose serious risks to online platforms. This has triggered several research efforts to accurately detect deepfake images, achieving excellent performance on publicly available deepfake datasets. In this work, we study 8 state-of-the-art detectors and argue that they are far from being ready for deployment due to two recent developments. First, the emergence of lightweight methods to customize large generative models, can enable an attacker to create many customized generators (to create deepfakes), thereby substantially increasing the threat surface. We show that existing defenses fail to generalize well to such \emph{user-customized generative models} that are publicly available today. We discuss new machine learning approaches based on content-agnostic features, and ensemble modeling to improve generalization performance against user-customized models. Second, the emergence of \textit{vision foundation models} -- machine learning models trained on broad data that can be easily adapted to several downstream tasks -- can be misused by attackers to craft adversarial deepfakes that can evade existing defenses. We propose a simple adversarial attack that leverages existing foundation models to craft adversarial samples \textit{without adding any adversarial noise}, through careful semantic manipulation of the image content. We highlight the vulnerabilities of several defenses against our attack, and explore directions leveraging advanced foundation models and adversarial training to defend against this new threat.
[6] arXiv:2404.16232 [pdf, other]: Title: SECO: Secure Inference With Model Splitting Across Multi-Server Hierarchy

Authors: Shuangyi Chen, Ashish Khisti

Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)

In the context of prediction-as-a-service, concerns about the privacy of the data and the model have been brought up and tackled via secure inference protocols. These protocols are built up by using single or multiple cryptographic tools designed under a variety of different security assumptions.
In this paper, we introduce SECO, a secure inference protocol that enables a user holding an input data vector and multiple server nodes deployed with a split neural network model to collaboratively compute the prediction, without compromising either party's data privacy. We extend prior work on secure inference that requires the entire neural network model to be located on a single server node, to a multi-server hierarchy, where the user communicates to a gateway server node, which in turn communicates to remote server nodes. The inference task is split across the server nodes and must be performed over an encrypted copy of the data vector.
We adopt multiparty homomorphic encryption and multiparty garbled circuit schemes, making the system secure against dishonest majority of semi-honest servers as well as protecting the partial model structure from the user. We evaluate SECO on multiple models, achieving the reduction of computation and communication cost for the user, making the protocol applicable to user's devices with limited resources.
[7] arXiv:2404.16241 [pdf, other]: Title: Synergizing Privacy and Utility in Data Analytics Through Advanced Information Theorization

Authors: Zahir Alsulaimawi

Subjects: Cryptography and Security (cs.CR)

This study develops a novel framework for privacy-preserving data analytics, addressing the critical challenge of balancing data utility with privacy concerns. We introduce three sophisticated algorithms: a Noise-Infusion Technique tailored for high-dimensional image data, a Variational Autoencoder (VAE) for robust feature extraction while masking sensitive attributes and an Expectation Maximization (EM) approach optimized for structured data privacy. Applied to datasets such as Modified MNIST and CelebrityA, our methods significantly reduce mutual information between sensitive attributes and transformed data, thereby enhancing privacy. Our experimental results confirm that these approaches achieve superior privacy protection and retain high utility, making them viable for practical applications where both aspects are crucial. The research contributes to the field by providing a flexible and effective strategy for deploying privacy-preserving algorithms across various data types and establishing new benchmarks for utility and confidentiality in data analytics.
[8] arXiv:2404.16251 [pdf, ps, other]: Title: Investigating the prompt leakage effect and black-box defenses for multi-turn LLM interactions

Authors: Divyansh Agarwal, Alexander R. Fabbri, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Prompt leakage in large language models (LLMs) poses a significant security and privacy threat, particularly in retrieval-augmented generation (RAG) systems. However, leakage in multi-turn LLM interactions along with mitigation strategies has not been studied in a standardized manner. This paper investigates LLM vulnerabilities against prompt leakage across 4 diverse domains and 10 closed- and open-source LLMs. Our unique multi-turn threat model leverages the LLM's sycophancy effect and our analysis dissects task instruction and knowledge leakage in the LLM response. In a multi-turn setting, our threat model elevates the average attack success rate (ASR) to 86.2%, including a 99% leakage with GPT-4 and claude-1.3. We find that some black-box LLMs like Gemini show variable susceptibility to leakage across domains - they are more likely to leak contextual knowledge in the news domain compared to the medical domain. Our experiments measure specific effects of 6 black-box defense strategies, including a query-rewriter in the RAG scenario. Our proposed multi-tier combination of defenses still has an ASR of 5.3% for black-box LLMs, indicating room for enhancement and future direction for LLM security research.
[9] arXiv:2404.16255 [pdf, other]: Title: Enhancing Privacy in Face Analytics Using Fully Homomorphic Encryption

Authors: Bharat Yalavarthi, Arjun Ramesh Kaushik, Arun Ross, Vishnu Boddeti, Nalini Ratha

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)

Modern face recognition systems utilize deep neural networks to extract salient features from a face. These features denote embeddings in latent space and are often stored as templates in a face recognition system. These embeddings are susceptible to data leakage and, in some cases, can even be used to reconstruct the original face image. To prevent compromising identities, template protection schemes are commonly employed. However, these schemes may still not prevent the leakage of soft biometric information such as age, gender and race. To alleviate this issue, we propose a novel technique that combines Fully Homomorphic Encryption (FHE) with an existing template protection scheme known as PolyProtect. We show that the embeddings can be compressed and encrypted using FHE and transformed into a secure PolyProtect template using polynomial transformation, for additional protection. We demonstrate the efficacy of the proposed approach through extensive experiments on multiple datasets. Our proposed approach ensures irreversibility and unlinkability, effectively preventing the leakage of soft biometric attributes from face embeddings without compromising recognition accuracy.
[10] arXiv:2404.16256 [pdf, other]: Title: Probabilistic Tracker Management Policies for Low-Cost and Scalable Rowhammer Mitigation

Authors: Aamer Jaleel, Stephen W. Keckler, Gururaj Saileshwar

Subjects: Cryptography and Security (cs.CR); Hardware Architecture (cs.AR)

This paper focuses on mitigating DRAM Rowhammer attacks. In recent years, solutions like TRR have been deployed in DDR4 DRAM to track aggressor rows and then issue a mitigative action by refreshing neighboring victim rows. Unfortunately, such in-DRAM solutions are resource-constrained (only able to provision few tens of counters to track aggressor rows) and are prone to thrashing based attacks, that have been used to fool them. Secure alternatives for in-DRAM trackers require tens of thousands of counters.
In this work, we demonstrate secure and scalable rowhammer mitigation using resource-constrained trackers. Our key idea is to manage such trackers with probabilistic management policies (PROTEAS). PROTEAS includes component policies like request-stream sampling and random evictions which enable thrash-resistance for resource-constrained trackers. We show that PROTEAS can secure small in-DRAM trackers (with 16 counters per DRAM bank) even when Rowhammer thresholds drop to 500 while incurring less than 3% slowdown. Moreover, we show that PROTEAS significantly outperforms a recent similar probabilistic proposal from Samsung (called DSAC) while achieving 11X - 19X the resilience against Rowhammer.
[11] arXiv:2404.16271 [pdf, ps, other]: Title: True random number generation using metastable 1T' molybdenum ditelluride

Authors: Yang Liu, Pengyu Liu, Yingyi Wen, Zihan Liang, Songwei Liu, Lekai Song, Jingfang Pei, Xiaoyue Fan, Teng Ma, Gang Wang, Shuo Gao, Kong-Pang Pun, Xiaolong Chen, Guohua Hu

Subjects: Cryptography and Security (cs.CR); Materials Science (cond-mat.mtrl-sci)

True random numbers play a critical role in secure cryptography. The generation relies on a stable and readily extractable entropy source. Here, from solution-processed structurally metastable 1T' MoTe2, we prove stable output of featureless, stochastic, and yet stable conductance noise at a broad temperature (down to 15 K) with minimal power consumption (down to 0.05 micro-W). Our characterizations and statistical analysis of the characteristics of the conductance noise suggest that the noise arises from the volatility of the stochastic polarization of the underlying ferroelectric dipoles in the 1T' MoTe2. Further, as proved in our experiments and indicated by our Monte Carlo simulation, the ferroelectric dipole polarization is a reliable entropy source with the stochastic polarization persistent and stable over time. Exploiting the conductance noise, we achieve the generation of true random numbers and demonstrate their use in common cryptographic applications, for example, password generation and data encryption. Besides, particularly, we show a privacy safeguarding approach to sensitive data that can be critical for the cryptography of neural networks. We believe our work will bring insights into the understanding of the metastable 1T' MoTe2 and, more importantly, underpin its great potential in secure cryptography.
[12] arXiv:2404.16362 [pdf, other]: Title: Feature graph construction with static features for malware detection

Authors: Binghui Zou, Chunjie Cao, Longjuan Wang, Yinan Cheng, Jingzhang Sun

Subjects: Cryptography and Security (cs.CR)

Malware can greatly compromise the integrity and trustworthiness of information and is in a constant state of evolution. Existing feature fusion-based detection methods generally overlook the correlation between features. And mere concatenation of features will reduce the model's characterization ability, lead to low detection accuracy. Moreover, these methods are susceptible to concept drift and significant degradation of the model. To address those challenges, we introduce a feature graph-based malware detection method, MFGraph, to characterize applications by learning feature-to-feature relationships to achieve improved detection accuracy while mitigating the impact of concept drift. In MFGraph, we construct a feature graph using static features extracted from binary PE files, then apply a deep graph convolutional network to learn the representation of the feature graph. Finally, we employ the representation vectors obtained from the output of a three-layer perceptron to differentiate between benign and malicious software. We evaluated our method on the EMBER dataset, and the experimental results demonstrate that it achieves an AUC score of 0.98756 on the malware detection task, outperforming other baseline models. Furthermore, the AUC score of MFGraph decreases by only 5.884% in one year, indicating that it is the least affected by concept drift.
[13] arXiv:2404.16363 [pdf, other]: Title: Byzantine Attacks Exploiting Penalties in Ethereum PoS

Authors: Ulysse Pavloff, Yackolley Amoussou-Genou, Sara Tucci-Piergiovanni

Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)

In May 2023, the Ethereum blockchain experienced its first inactivity leak, a mechanism designed to reinstate chain finalization amid persistent network disruptions. This mechanism aims to reduce the voting power of validators who are unreachable within the network, reallocating this power to active validators. This paper investigates the implications of the inactivity leak on safety within the Ethereum blockchain. Our theoretical analysis reveals scenarios where actions by Byzantine validators expedite the finalization of two conflicting branches, and instances where Byzantine validators reach a voting power exceeding the critical safety threshold of one-third. Additionally, we revisit the probabilistic bouncing attack, illustrating how the inactivity leak can result in a probabilistic breach of safety, potentially allowing Byzantine validators to exceed the one-third safety threshold. Our findings uncover how penalizing inactive nodes can compromise blockchain properties, particularly in the presence of Byzantine validators capable of coordinating actions.
[14] arXiv:2404.16504 [pdf, ps, other]: Title: Hardware Implementation of Double Pendulum Pseudo Random Number Generator

Authors: Jarrod Lim, Tom Manuel Opalla Piccio, Chua Min Jie Michelle, Maoyang Xiang, T. Hui Teo

Comments: 15 pages, 12 figure

Subjects: Cryptography and Security (cs.CR); Signal Processing (eess.SP)

The objective of this project is to utilize an FPGA board which is the CMOD A7 35t to obtain a pseudo random number which can be used for encryption. We aim to achieve this by leveraging the inherent randomness present in environmental data captured by sensors. This data will be used as a seed to initialize an algorithm implemented on the CMOD A7 35t FPGA board. The project will focus on interfacing the sensors with the FPGA and developing suitable algorithms to ensure the generated numbers exhibit strong randomness properties.
[15] arXiv:2404.16632 [pdf, ps, other]: Title: Introducing Systems Thinking as a Framework for Teaching and Assessing Threat Modeling Competency

Authors: Siddhant S. Joshi, Preeti Mukherjee, Kirsten A. Davis, James C. Davis

Comments: Presented at the Annual Conference of the American Society for Engineering Education (ASEE'24) 2024

Subjects: Cryptography and Security (cs.CR); Software Engineering (cs.SE)

Computing systems face diverse and substantial cybersecurity threats. To mitigate these cybersecurity threats, software engineers need to be competent in the skill of threat modeling. In industry and academia, there are many frameworks for teaching threat modeling, but our analysis of these frameworks suggests that (1) these approaches tend to be focused on component-level analysis rather than educating students to reason holistically about a system's cybersecurity, and (2) there is no rubric for assessing a student's threat modeling competency. To address these concerns, we propose using systems thinking in conjunction with popular and industry-standard threat modeling frameworks like STRIDE for teaching and assessing threat modeling competency. Prior studies suggest a holistic approach, like systems thinking, can help understand and mitigate cybersecurity threats. Thus, we developed and piloted two novel rubrics - one for assessing STRIDE threat modeling performance and the other for assessing systems thinking performance while conducting STRIDE.
To conduct this study, we piloted the two rubrics mentioned above to assess threat model artifacts of students enrolled in an upper-level software engineering course at Purdue University in Fall 2021, Spring 2023, and Fall 2023. Students who had both systems thinking and STRIDE instruction identified and attempted to mitigate component-level as well as systems-level threats. Students with only STRIDE instruction tended to focus on identifying and mitigating component-level threats and discounted system-level threats. We contribute to engineering education by: (1) describing a new rubric for assessing threat modeling based on systems thinking; (2) identifying trends and blindspots in students' threat modeling approach; and (3) envisioning the benefits of integrating systems thinking in threat modeling teaching and assessment.
[16] arXiv:2404.16651 [pdf, other]: Title: Evolutionary Large Language Models for Hardware Security: A Comparative Survey

Authors: Mohammad Akyash, Hadi Mardani Kamali

Subjects: Cryptography and Security (cs.CR)

Automating hardware (HW) security vulnerability detection and mitigation during the design phase is imperative for two reasons: (i) It must be before chip fabrication, as post-fabrication fixes can be costly or even impractical; (ii) The size and complexity of modern HW raise concerns about unknown vulnerabilities compromising CIA triad. While Large Language Models (LLMs) can revolutionize both HW design and testing processes, within the semiconductor context, LLMs can be harnessed to automatically rectify security-relevant vulnerabilities inherent in HW designs. This study explores the seeds of LLM integration in register transfer level (RTL) designs, focusing on their capacity for autonomously resolving security-related vulnerabilities. The analysis involves comparing methodologies, assessing scalability, interpretability, and identifying future research directions. Potential areas for exploration include developing specialized LLM architectures for HW security tasks and enhancing model performance with domain-specific knowledge, leading to reliable automated security measurement and risk mitigation associated with HW vulnerabilities.
[17] arXiv:2404.16744 [pdf, ps, other]: Title: JITScanner: Just-in-Time Executable Page Check in the Linux Operating System

Authors: Pasquale Caporaso, Giuseppe Bianchi, Francesco Quaglia

Subjects: Cryptography and Security (cs.CR)

Modern malware poses a severe threat to cybersecurity, continually evolving in sophistication. To combat this threat, researchers and security professionals continuously explore advanced techniques for malware detection and analysis. Dynamic analysis, a prevalent approach, offers advantages over static analysis by enabling observation of runtime behavior and detecting obfuscated or encrypted code used to evade detection. However, executing programs within a controlled environment can be resource-intensive, often necessitating compromises, such as limiting sandboxing to an initial period. In our article, we propose an alternative method for dynamic executable analysis: examining the presence of malicious signatures within executable virtual pages precisely when their current content, including any updates over time, is accessed for instruction fetching. Our solution, named JITScanner, is developed as a Linux-oriented package built upon a Loadable Kernel Module (LKM). It integrates a user-level component that communicates efficiently with the LKM using scalable multi-processor/core technology. JITScanner's effectiveness in detecting malware programs and its minimal intrusion in normal runtime scenarios have been extensively tested, with the experiment results detailed in this article. These experiments affirm the viability of our approach, showcasing JITScanner's capability to effectively identify malware while minimizing runtime overhead.

Cross-lists for Fri, 26 Apr 24

[18] arXiv:2404.15076 (cross-list from eess.SY) [pdf, other]: Title: Securing O-RAN Open Interfaces

Authors: Joshua Groen, Salvatore D'Oro, Utku Demir, Leonardo Bonati, Davide Villa, Michele Polese, Tommaso Melodia, Kaushik Chowdhury

Subjects: Systems and Control (eess.SY); Cryptography and Security (cs.CR); Networking and Internet Architecture (cs.NI)

The next generation of cellular networks will be characterized by openness, intelligence, virtualization, and distributed computing. The Open Radio Access Network (Open RAN) framework represents a significant leap toward realizing these ideals, with prototype deployments taking place in both academic and industrial domains. While it holds the potential to disrupt the established vendor lock-ins, Open RAN's disaggregated nature raises critical security concerns. Safeguarding data and securing interfaces must be integral to Open RAN's design, demanding meticulous analysis of cost/benefit tradeoffs.
In this paper, we embark on the first comprehensive investigation into the impact of encryption on two pivotal Open RAN interfaces: the E2 interface, connecting the base station with a near-real-time RAN Intelligent Controller, and the Open Fronthaul, connecting the Radio Unit to the Distributed Unit. Our study leverages a full-stack O-RAN ALLIANCE compliant implementation within the Colosseum network emulator and a production-ready Open RAN and 5G-compliant private cellular network. This research contributes quantitative insights into the latency introduced and throughput reduction stemming from using various encryption protocols. Furthermore, we present four fundamental principles for constructing security by design within Open RAN systems, offering a roadmap for navigating the intricate landscape of Open RAN security.
[19] arXiv:2404.16109 (cross-list from cs.LG) [pdf, other]: Title: zkLLM: Zero Knowledge Proofs for Large Language Models

Authors: Haochen Sun, Jason Li, Hongyang Zhang

Comments: Accepted to ACM CCS 2024, camera-ready version under preparation. This is the author's version of the work. It is posted here for your personal use. Not for redistribution

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)

The recent surge in artificial intelligence (AI), characterized by the prominence of large language models (LLMs), has ushered in fundamental transformations across the globe. However, alongside these advancements, concerns surrounding the legitimacy of LLMs have grown, posing legal challenges to their extensive applications. Compounding these concerns, the parameters of LLMs are often treated as intellectual property, restricting direct investigations.
In this study, we address a fundamental challenge within the realm of AI legislation: the need to establish the authenticity of outputs generated by LLMs. To tackle this issue, we present zkLLM, which stands as the inaugural specialized zero-knowledge proof tailored for LLMs to the best of our knowledge. Addressing the persistent challenge of non-arithmetic operations in deep learning, we introduce tlookup, a parallelized lookup argument designed for non-arithmetic tensor operations in deep learning, offering a solution with no asymptotic overhead. Furthermore, leveraging the foundation of tlookup, we introduce zkAttn, a specialized zero-knowledge proof crafted for the attention mechanism, carefully balancing considerations of running time, memory usage, and accuracy.
Empowered by our fully parallelized CUDA implementation, zkLLM emerges as a significant stride towards achieving efficient zero-knowledge verifiable computations over LLMs. Remarkably, for LLMs boasting 13 billion parameters, our approach enables the generation of a correctness proof for the entire inference process in under 15 minutes. The resulting proof, compactly sized at less than 200 kB, is designed to uphold the privacy of the model parameters, ensuring no inadvertent information leakage.
[20] arXiv:2404.16154 (cross-list from cs.LG) [pdf, other]: Title: A Comparative Analysis of Adversarial Robustness for Quantum and Classical Machine Learning Models

Authors: Maximilian Wendlinger, Kilian Tscharke, Pascal Debus

Comments: submitted to IEEE QCE24

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Quantum Physics (quant-ph)

Quantum machine learning (QML) continues to be an area of tremendous interest from research and industry. While QML models have been shown to be vulnerable to adversarial attacks much in the same manner as classical machine learning models, it is still largely unknown how to compare adversarial attacks on quantum versus classical models. In this paper, we show how to systematically investigate the similarities and differences in adversarial robustness of classical and quantum models using transfer attacks, perturbation patterns and Lipschitz bounds. More specifically, we focus on classification tasks on a handcrafted dataset that allows quantitative analysis for feature attribution. This enables us to get insight, both theoretically and experimentally, on the robustness of classification networks. We start by comparing typical QML model architectures such as amplitude and re-upload encoding circuits with variational parameters to a classical ConvNet architecture. Next, we introduce a classical approximation of QML circuits (originally obtained with Random Fourier Features sampling but adapted in this work to fit a trainable encoding) and evaluate this model, denoted Fourier network, in comparison to other architectures. Our findings show that this Fourier network can be seen as a "middle ground" on the quantum-classical boundary. While adversarial attacks successfully transfer across this boundary in both directions, we also show that regularization helps quantum networks to be more robust, which has direct impact on Lipschitz bounds and transfer attacks.
[21] arXiv:2404.16156 (cross-list from quant-ph) [pdf, other]: Title: Guardians of the Quantum GAN

Authors: Archisman Ghosh, Debarshi Kundu, Avimita Chatterjee, Swaroop Ghosh

Comments: 11 pages, 10 figures

Subjects: Quantum Physics (quant-ph); Hardware Architecture (cs.AR); Cryptography and Security (cs.CR); Machine Learning (cs.LG)

Quantum Generative Adversarial Networks (qGANs) are at the forefront of image-generating quantum machine learning models. To accommodate the growing demand for Noisy Intermediate-Scale Quantum (NISQ) devices to train and infer quantum machine learning models, the number of third-party vendors offering quantum hardware as a service is expected to rise. This expansion introduces the risk of untrusted vendors potentially stealing proprietary information from the quantum machine learning models. To address this concern we propose a novel watermarking technique that exploits the noise signature embedded during the training phase of qGANs as a non-invasive watermark. The watermark is identifiable in the images generated by the qGAN allowing us to trace the specific quantum hardware used during training hence providing strong proof of ownership. To further enhance the security robustness, we propose the training of qGANs on a sequence of multiple quantum hardware, embedding a complex watermark comprising the noise signatures of all the training hardware that is difficult for adversaries to replicate. We also develop a machine learning classifier to extract this watermark robustly, thereby identifying the training hardware (or the suite of hardware) from the images generated by the qGAN validating the authenticity of the model. We note that the watermark signature is robust against inferencing on hardware different than the hardware that was used for training. We obtain watermark extraction accuracy of 100% and ~90% for training the qGAN on individual and multiple quantum hardware setups (and inferencing on different hardware), respectively. Since parameter evolution during training is strongly modulated by quantum noise, the proposed watermark can be extended to other quantum machine learning models as well.
[22] arXiv:2404.16177 (cross-list from cs.IR) [pdf, other]: Title: Advancing Recommender Systems by mitigating Shilling attacks

Authors: Aditya Chichani, Juzer Golwala, Tejas Gundecha, Kiran Gawande

Comments: Published in IEEE, Proceedings of 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT)

Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)

Considering the premise that the number of products offered grow in an exponential fashion and the amount of data that a user can assimilate before making a decision is relatively small, recommender systems help in categorizing content according to user preferences. Collaborative filtering is a widely used method for computing recommendations due to its good performance. But, this method makes the system vulnerable to attacks which try to bias the recommendations. These attacks, known as 'shilling attacks' are performed to push an item or nuke an item in the system. This paper proposes an algorithm to detect such shilling profiles in the system accurately and also study the effects of such profiles on the recommendations.
[23] arXiv:2404.16220 (cross-list from cs.IT) [pdf, other]: Title: When does a bent concatenation not belong to the completed Maiorana-McFarland class?

Authors: Sadmir Kudin, Enes Pasalic, Alexandr Polujan, Fengrong Zhang

Comments: This is the authors' version of the camera-ready version to be presented at the 2024 IEEE International Symposium on Information Theory (ISIT 2024)

Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Discrete Mathematics (cs.DM); Combinatorics (math.CO)

Every Boolean bent function $f$ can be written either as a concatenation $f=f_1||f_2$ of two complementary semi-bent functions $f_1,f_2$; or as a concatenation $f=f_1||f_2||f_3||f_4$ of four Boolean functions $f_1,f_2,f_3,f_4$, all of which are simultaneously bent, semi-bent, or 5-valued spectra-functions. In this context, it is essential to ask: When does a bent concatenation $f$ (not) belong to the completed Maiorana-McFarland class $\mathcal{M}^\#$? In this article, we answer this question completely by providing a full characterization of the structure of $\mathcal{M}$-subspaces for the concatenation of the form $f=f_1||f_2$ and $f=f_1||f_2||f_3||f_4$, which allows us to specify the necessary and sufficient conditions so that $f$ is outside $\mathcal{M}^\#$. Based on these conditions, we propose several explicit design methods of specifying bent functions outside $\mathcal{M}^\#$ in the special case when $f=g||h||g||(h+1)$, where $g$ and $h$ are bent functions.
[24] arXiv:2404.16287 (cross-list from stat.ML) [pdf, other]: Title: Differentially Private Federated Learning: Servers Trustworthiness, Estimation, and Statistical Inference

Authors: Zhe Zhang, Ryumei Nakada, Linjun Zhang

Comments: 56 pages, 3 figures

Subjects: Machine Learning (stat.ML); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Statistics Theory (math.ST); Methodology (stat.ME)

Differentially private federated learning is crucial for maintaining privacy in distributed environments. This paper investigates the challenges of high-dimensional estimation and inference under the constraints of differential privacy. First, we study scenarios involving an untrusted central server, demonstrating the inherent difficulties of accurate estimation in high-dimensional problems. Our findings indicate that the tight minimax rates depends on the high-dimensionality of the data even with sparsity assumptions. Second, we consider a scenario with a trusted central server and introduce a novel federated estimation algorithm tailored for linear regression models. This algorithm effectively handles the slight variations among models distributed across different machines. We also propose methods for statistical inference, including coordinate-wise confidence intervals for individual parameters and strategies for simultaneous inference. Extensive simulation experiments support our theoretical advances, underscoring the efficacy and reliability of our approaches.
[25] arXiv:2404.16638 (cross-list from cs.LG) [pdf, other]: Title: Privacy-Preserving Statistical Data Generation: Application to Sepsis Detection

Authors: Eric Macias-Fassio, Aythami Morales, Cristina Pruenza, Julian Fierrez

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)

The biomedical field is among the sectors most impacted by the increasing regulation of Artificial Intelligence (AI) and data protection legislation, given the sensitivity of patient information. However, the rise of synthetic data generation methods offers a promising opportunity for data-driven technologies. In this study, we propose a statistical approach for synthetic data generation applicable in classification problems. We assess the utility and privacy implications of synthetic data generated by Kernel Density Estimator and K-Nearest Neighbors sampling (KDE-KNN) within a real-world context, specifically focusing on its application in sepsis detection. The detection of sepsis is a critical challenge in clinical practice due to its rapid progression and potentially life-threatening consequences. Moreover, we emphasize the benefits of KDE-KNN compared to current synthetic data generation methodologies. Additionally, our study examines the effects of incorporating synthetic data into model training procedures. This investigation provides valuable insights into the effectiveness of synthetic data generation techniques in mitigating regulatory constraints within the biomedical field.
[26] arXiv:2404.16706 (cross-list from cs.DS) [pdf, other]: Title: Efficient and Near-Optimal Noise Generation for Streaming Differential Privacy

Authors: Krishnamurthy (Dj) Dvijotham, H. Brendan McMahan, Krishna Pillutla, Thomas Steinke, Abhradeep Thakurta

Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC); Cryptography and Security (cs.CR); Machine Learning (cs.LG)

In the task of differentially private (DP) continual counting, we receive a stream of increments and our goal is to output an approximate running total of these increments, without revealing too much about any specific increment. Despite its simplicity, differentially private continual counting has attracted significant attention both in theory and in practice. Existing algorithms for differentially private continual counting are either inefficient in terms of their space usage or add an excessive amount of noise, inducing suboptimal utility.
The most practical DP continual counting algorithms add carefully correlated Gaussian noise to the values. The task of choosing the covariance for this noise can be expressed in terms of factoring the lower-triangular matrix of ones (which computes prefix sums). We present two approaches from this class (for different parameter regimes) that achieve near-optimal utility for DP continual counting and only require logarithmic or polylogarithmic space (and time).
Our first approach is based on a space-efficient streaming matrix multiplication algorithm for a class of Toeplitz matrices. We show that to instantiate this algorithm for DP continual counting, it is sufficient to find a low-degree rational function that approximates the square root on a circle in the complex plane. We then apply and extend tools from approximation theory to achieve this. We also derive efficient closed-forms for the objective function for arbitrarily many steps, and show direct numerical optimization yields a highly practical solution to the problem. Our second approach combines our first approach with a recursive construction similar to the binary tree mechanism.
[27] arXiv:2404.16751 (cross-list from quant-ph) [pdf, other]: Title: Efficient unitary designs and pseudorandom unitaries from permutations

Authors: Chi-Fang Chen, Adam Bouland, Fernando G.S.L. Brandão, Jordan Docter, Patrick Hayden, Michelle Xu

Comments: 70 pages, 11 figures

Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR); Mathematical Physics (math-ph)

In this work we give an efficient construction of unitary $k$-designs using $\tilde{O}(k\cdot poly(n))$ quantum gates, as well as an efficient construction of a parallel-secure pseudorandom unitary (PRU). Both results are obtained by giving an efficient quantum algorithm that lifts random permutations over $S(N)$ to random unitaries over $U(N)$ for $N=2^n$. In particular, we show that products of exponentiated sums of $S(N)$ permutations with random phases approximately match the first $2^{\Omega(n)}$ moments of the Haar measure. By substituting either $\tilde{O}(k)$-wise independent permutations, or quantum-secure pseudorandom permutations (PRPs) in place of the random permutations, we obtain the above results. The heart of our proof is a conceptual connection between the large dimension (large-$N$) expansion in random matrix theory and the polynomial method, which allows us to prove query lower bounds at finite-$N$ by interpolating from the much simpler large-$N$ limit. The key technical step is to exhibit an orthonormal basis for irreducible representations of the partition algebra that has a low-degree large-$N$ expansion. This allows us to show that the distinguishing probability is a low-degree rational polynomial of the dimension $N$.

Replacements for Fri, 26 Apr 24

[28] arXiv:2105.08899 (replaced) [pdf, other]: Title: FairCMS: Cloud Media Sharing with Fair Copyright Protection

Authors: Xiangli Xiao, Yushu Zhang, Leo Yu Zhang, Zhongyun Hua, Zhe Liu, Jiwu Huang

Comments: Accepted by IEEE Transactions on Computational Social Systems

Subjects: Cryptography and Security (cs.CR); Multimedia (cs.MM)
[29] arXiv:2204.00955 (replaced) [pdf, other]: Title: FIRST: FrontrunnIng Resilient Smart ConTracts

Authors: Emrah Sariboz, Gaurav Panwar, Roopa Vishwanathan, Satyajayant Misra

Comments: 16 pages, 5 figures

Subjects: Cryptography and Security (cs.CR)
[30] arXiv:2209.07936 (replaced) [pdf, other]: Title: PA-Boot: A Formally Verified Authentication Protocol for Multiprocessor Secure Boot

Authors: Zhuoruo Zhang, Chenyang Yu, Rui Chang, Mingshuai Chen, Bo Feng, He Huang, Qinming Dai, Wenbo Shen, Yongwang Zhao

Subjects: Cryptography and Security (cs.CR); Hardware Architecture (cs.AR)
[31] arXiv:2304.00083 (replaced) [pdf, other]: Title: A Generative Framework for Low-Cost Result Validation of Machine Learning-as-a-Service Inference

Authors: Abhinav Kumar, Miguel A. Guirao Aguilera, Reza Tourani, Satyajayant Misra

Comments: 15 pages, 12 figures

Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[32] arXiv:2310.14117 (replaced) [pdf, other]: Title: ZTD$_{JAVA}$: Mitigating Software Supply Chain Vulnerabilities via Zero-Trust Dependencies

Authors: Paschal C. Amusuo, Kyle A. Robinson, Tanmay Singla, Huiyun Peng, Aravind Machiry, Santiago Torres-Arias, Laurent Simon, James C. Davis

Comments: 15 pages, 5 figures, 5 tables

Subjects: Cryptography and Security (cs.CR); Software Engineering (cs.SE)
[33] arXiv:2312.04902 (replaced) [pdf, other]: Title: BELT: Old-School Backdoor Attacks can Evade the State-of-the-Art Defense with Backdoor Exclusivity Lifting

Authors: Huming Qiu, Junjie Sun, Mi Zhang, Xudong Pan, Min Yang

Comments: To Appear in the 45th IEEE Symposium on Security and Privacy, May 20-23, 2024

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
[34] arXiv:2404.05297 (replaced) [pdf, other]: Title: Automated Attack Synthesis for Constant Product Market Makers

Authors: Sujin Han, Jinseo Kim, Sung-Ju Lee, Insu Yun

Comments: 12 pages, 8 figures

Subjects: Cryptography and Security (cs.CR); Software Engineering (cs.SE)
[35] arXiv:2404.14246 (replaced) [pdf, other]: Title: Chain of trust: Unraveling references among Common Criteria certified products

Authors: Adam Janovsky, Łukasz Chmielewski, Petr Svenda, Jan Jancar, Vashek Matyas

Subjects: Cryptography and Security (cs.CR)
[36] arXiv:2404.15735 (replaced) [pdf, other]: Title: Replacing Cryptopuzzles with Useful Computation in Blockchain Proof-of-Work Protocols

Authors: Andrea Merlina, Thiago Garrett, Roman Vitenberg

Comments: Submitted to ACM Computing Surveys

Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)
[37] arXiv:2305.15792 (replaced) [pdf, other]: Title: IDEA: Invariant Defense for Graph Adversarial Robustness

Authors: Shuchang Tao, Qi Cao, Huawei Shen, Yunfan Wu, Bingbing Xu, Xueqi Cheng

Comments: Submitted to Information Sciences

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
[38] arXiv:2401.13516 (replaced) [pdf, other]: Title: Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces

Authors: Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou

Comments: arXiv admin note: substantial text overlap with arXiv:2308.09921, arXiv:2305.05943

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[39] arXiv:2404.10201 (replaced) [pdf, other]: Title: Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages

Authors: Hilal Asi, Vitaly Feldman, Jelani Nelson, Huy L. Nguyen, Kunal Talwar, Samson Zhou

Comments: Fixed author ordering

Subjects: Data Structures and Algorithms (cs.DS); Cryptography and Security (cs.CR); Information Theory (cs.IT); Machine Learning (cs.LG)

New submissions
Cross-lists
Replacements

[ total of 39 entries: 1-39 ]
[ showing up to 2000 entries per page: fewer | more ]

Disable MathJax (What is MathJax?)

Links to: arXiv, form interface, find, cs, recent, 2404, contact, help (Access key information)

> cs > cs.CR

Cryptography and Security

New submissions

New submissions for Fri, 26 Apr 24

Cross-lists for Fri, 26 Apr 24

Replacements for Fri, 26 Apr 24