Showing 1–10 of 10 results for author: Gulzar, M A

  1. arXiv:2405.18385  [pdf, other]

    cs.CR

    Blocking Tracking JavaScript at the Function Granularity

    Authors: Abdul Haddi Amjad, Shaoor Munir, Zubair Shafiq, Muhammad Ali Gulzar

    Abstract: Modern websites extensively rely on JavaScript to implement both functionality and tracking. Existing privacy enhancing content blocking tools struggle against mixed scripts, which simultaneously implement both functionality and tracking, because blocking the script would break functionality and not blocking it would allow tracking. We propose Not.js, a fine grained JavaScript blocking tool that o…

    Submitted 28 May, 2024; originally announced May 2024.

  2. arXiv:2404.18881  [pdf, other]

    cs.HC cs.LG cs.SE

    Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking

    Authors: Hong Jin Kang, Fabrice Harel-Canada, Muhammad Ali Gulzar, Violet Peng, Miryung Kim

    Abstract: Data augmentation techniques apply transformations to existing texts to generate additional data. The transformations may produce low-quality texts, where the meaning of the text is changed and the text may even be mangled beyond human comprehension. Analyzing the synthetically generated texts and their corresponding labels is slow and demanding. To winnow out texts with incorrect labels, we devel…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 Findings

  3. arXiv:2403.02694  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR cs.DC

    Privacy-Aware Semantic Cache for Large Language Models

    Authors: Waris Gill, Mohamed Elidrisi, Pallavi Kalapatapu, Ali Anwar, Muhammad Ali Gulzar

    Abstract: Large Language Models (LLMs) like ChatGPT and Llama2 have revolutionized natural language processing and search engine dynamics. However, these models incur exceptionally high computational costs. For instance, GPT-3 consists of 175 billion parameters where inference demands billions of floating-point operations. Caching is a natural solution to reduce LLM inference costs on repeated queries which…

    Submitted 3 April, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: This study presents the first privacy aware semantic cache for LLMs based on Federated Learning. Total pages 13

    ACM Class: I.2.7

  4. arXiv:2312.13632  [pdf, other]

    cs.LG cs.AI cs.CV cs.DC cs.SE

    ProvFL: Client-Driven Interpretability of Global Model Predictions in Federated Learning

    Authors: Waris Gill, Ali Anwar, Muhammad Ali Gulzar

    Abstract: Federated Learning (FL) trains a collaborative machine learning model by aggregating multiple privately trained clients' models over several training rounds. Such a long, continuous action of model aggregations poses significant challenges in reasoning about the origin and composition of such a global model. Regardless of the quality of the global model or if it has a fault, understanding the mode…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: 22 pages. For access to the source code used in this study, please contact the authors directly

  5. arXiv:2307.08672  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    FedDefender: Backdoor Attack Defense in Federated Learning

    Authors: Waris Gill, Ali Anwar, Muhammad Ali Gulzar

    Abstract: Federated Learning (FL) is a privacy-preserving distributed machine learning technique that enables individual clients (e.g., user participants, edge devices, or organizations) to train a model on their local data in a secure environment and then share the trained model with an aggregator to build a global model collaboratively. In this work, we propose FedDefender, a defense mechanism against tar…

    Submitted 22 February, 2024; v1 submitted 1 July, 2023; originally announced July 2023.

    Comments: Published in SE4SafeML 2023 (co-located with FSE 2023). See https://dl.acm.org/doi/abs/10.1145/3617574.3617858

  6. arXiv:2302.01182  [pdf, other]

    cs.CR cs.SE

    Blocking JavaScript without Breaking the Web: An Empirical Investigation

    Authors: Abdul Haddi Amjad, Zubair Shafiq, Muhammad Ali Gulzar

    Abstract: Modern websites heavily rely on JavaScript (JS) to implement legitimate functionality as well as privacy-invasive advertising and tracking. Browser extensions such as NoScript block any script not loaded by a trusted list of endpoints, thus hoping to block privacy-invasive scripts while avoiding breaking legitimate website functionality. In this paper, we investigate whether blocking JS on the web…

    Submitted 23 March, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Journal ref: petsymposium 2023

  7. arXiv:2301.03553  [pdf, other]

    cs.SE cs.CV cs.DC cs.LG

    FedDebug: Systematic Debugging for Federated Learning Applications

    Authors: Waris Gill, Ali Anwar, Muhammad Ali Gulzar

    Abstract: In Federated Learning (FL), clients independently train local models and share them with a central aggregator to build a global model. Impermissibility to access clients' data and collaborative training make FL appealing for applications with data-privacy concerns, such as medical imaging. However, these FL characteristics pose unprecedented challenges for debugging. When a global model's performa…

    Submitted 22 February, 2024; v1 submitted 9 January, 2023; originally announced January 2023.

    Comments: Published at ICSE 2023. Link https://ieeexplore.ieee.org/document/10172839

    Journal ref: In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE) (pp. 456-789). IEEE (2023)

  8. arXiv:2205.05137  [pdf, other]

    cs.CL cs.LG

    Sibylvariant Transformations for Robust Text Classification

    Authors: Fabrice Harel-Canada, Muhammad Ali Gulzar, Nanyun Peng, Miryung Kim

    Abstract: The vast majority of text transformation techniques in NLP are inherently limited in their ability to expand input space coverage due to an implicit constraint to preserve the original class label. In this work, we propose the notion of sibylvariance (SIB) to describe the broader set of transforms that relax the label-preserving constraint, knowably vary the expected class, and lead to significant…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: 9 pages, Findings of ACL 2022

  9. arXiv:2108.13923  [pdf, other]

    cs.NI

    TrackerSift: Untangling Mixed Tracking and Functional Web Resources

    Authors: Abdul Haddi Amjad, Danial Saleem, Fareed Zaffar, Muhammad Ali Gulzar, Zubair Shafiq

    Abstract: Trackers have recently started to mix tracking and functional resources to circumvent privacy-enhancing content blocking tools. Such mixed web resources put content blockers in a bind: risk breaking legitimate functionality if they act and risk missing privacy-invasive advertising and tracking if they do not. In this paper, we propose TrackerSift to progressively classify and untangle mixed web re…

    Submitted 29 September, 2021; v1 submitted 28 August, 2021; originally announced August 2021.

  10. arXiv:2103.05118  [pdf, other]

    cs.SE

    Efficient Fuzz Testing for Apache Spark Using Framework Abstraction

    Authors: Qian Zhang, Jiyuan Wang, Muhammad Ali Gulzar, Rohan Padhye, Miryung Kim

    Abstract: The emerging data-intensive applications are increasingly dependent on data-intensive scalable computing (DISC) systems, such as Apache Spark, to process large data. Despite their popularity, DISC applications are hard to test. In recent years, fuzz testing has been remarkably successful; however, it is nontrivial to apply such traditional fuzzing to big data analytics directly because: (1) the lo…

    Submitted 8 March, 2021; originally announced March 2021.