Showing 1–50 of 357 results for author: Qin, Z

Searching in archive cs.
  1. arXiv:2405.04825  [pdf, other]

    cs.CR cs.AI cs.LG

    Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution

    Authors: Shuo Shao, Yiming Li, Hongwei Yao, Yiling He, Zhan Qin, Kui Ren

    Abstract: Ownership verification is currently the most critical and widely adopted post-hoc method to safeguard model copyright. In general, model owners exploit it to identify whether a given suspicious third-party model is stolen from them by examining whether it has particular properties 'inherited' from their released models. Currently, backdoor-based model watermarks are the primary and cutting-edge me…

    Submitted 8 May, 2024; originally announced May 2024.

  2. arXiv:2405.04180  [pdf, other]

    cs.LG cs.CV

    Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models

    Authors: Zhixuan Chu, Lei Zhang, Yichen Sun, Siqiao Xue, Zhibo Wang, Zhan Qin, Kui Ren

    Abstract: The rapid advancement in text-to-video (T2V) generative models has enabled the synthesis of high-fidelity video content guided by textual descriptions. Despite this significant progress, these models are often susceptible to hallucination, generating contents that contradict the input text, which poses a challenge to their reliability and practical deployment. To address this critical issue, we in…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2306.08302, arXiv:2403.05131 by other authors

  3. arXiv:2405.04160  [pdf, other]

    cs.CL

    A Causal Explainable Guardrails for Large Language Models

    Authors: Zhixuan Chu, Yan Wang, Longfei Li, Zhibo Wang, Zhan Qin, Kui Ren

    Abstract: Large Language Models (LLMs) have shown impressive performance in natural language tasks, but their outputs can exhibit undesirable attributes or biases. Existing methods for steering LLMs towards desired attributes often assume unbiased representations and rely solely on steering prompts. However, the representations learned from pre-training can introduce semantic biases that influence the steer…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 23 pages

  4. arXiv:2405.04095  [pdf, other]

    cs.CR cs.AI

    Going Proactive and Explanatory Against Malware Concept Drift

    Authors: Yiling He, Junchi Lei, Zhan Qin, Kui Ren

    Abstract: Deep learning-based malware classifiers face significant challenges due to concept drift. The rapid evolution of malware, especially with new families, can depress classification accuracy to near-random levels. Previous research has primarily focused on detecting drift samples, relying on expert-led analysis and labeling for model retraining. However, these methods often lack a comprehensive under…

    Submitted 7 May, 2024; originally announced May 2024.

  5. arXiv:2405.03873  [pdf, other]

    cs.AI cs.HC

    Investigating Personalized Driving Behaviors in Dilemma Zones: Analysis and Prediction of Stop-or-Go Decisions

    Authors: Ziye Qin, Siyan Li, Guoyuan Wu, Matthew J. Barth, Amr Abdelraouf, Rohit Gupta, Kyungtae Han

    Abstract: Dilemma zones at signalized intersections present a commonly occurring but unsolved challenge for both drivers and traffic operators. Onsets of the yellow lights prompt varied responses from different drivers: some may brake abruptly, compromising the ride comfort, while others may accelerate, increasing the risk of red-light violations and potential safety hazards. Such diversity in drivers' stop…

    Submitted 6 May, 2024; originally announced May 2024.

  6. arXiv:2405.00507  [pdf, other]

    cs.CV

    NeRF-Guided Unsupervised Learning of RGB-D Registration

    Authors: Zhinan Yu, Zheng Qin, Yijie Tang, Yongjun Wang, Renjiao Yi, Chenyang Zhu, Kai Xu

    Abstract: This paper focuses on training a robust RGB-D registration model without ground-truth pose supervision. Existing methods usually adopt a pairwise training strategy based on differentiable rendering, which enforces the photometric and the geometric consistency between the two registered frames as supervision. However, this frame-to-frame framework suffers from poor multi-view consistency due to fac…

    Submitted 1 May, 2024; originally announced May 2024.

  7. arXiv:2404.17867  [pdf, other]

    cs.CV eess.IV

    Are Watermarks Bugs for Deepfake Detectors? Rethinking Proactive Forensics

    Authors: Xiaoshuai Wu, Xin Liao, Bo Ou, Yuling Liu, Zheng Qin

    Abstract: AI-generated content has accelerated the topic of media synthesis, particularly Deepfake, which can manipulate our portraits for positive or malicious purposes. Before releasing these threatening face images, one promising forensics solution is the injection of robust watermarks to track their own provenance. However, we argue that current watermarking models, originally devised for genuine images…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  8. arXiv:2404.14381  [pdf, other]

    cs.CV cs.MM

    TAVGBench: Benchmarking Text to Audible-Video Generation

    Authors: Yuxin Mao, Xuyang Shen, Jing Zhang, Zhen Qin, Jinxing Zhou, Mochu Xiang, Yiran Zhong, Yuchao Dai

    Abstract: The Text to Audible-Video Generation (TAVG) task involves generating videos with accompanying audio based on text descriptions. Achieving this requires skillful alignment of both audio and video elements. To support research in this field, we have developed a comprehensive Text to Audible-Video Generation Benchmark (TAVGBench), which contains over 1.7 million clips with a total duration of 11.8 th…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Technical Report. Project page: https://github.com/OpenNLPLab/TAVGBench

  9. arXiv:2404.13493  [pdf, other]

    cs.CV

    Authentic Emotion Mapping: Benchmarking Facial Expressions in Real News

    Authors: Qixuan Zhang, Zhifeng Wang, Yang Liu, Zhenyue Qin, Kaihao Zhang, Sabrina Caldwell, Tom Gedeon

    Abstract: In this paper, we present a novel benchmark for Emotion Recognition using facial landmarks extracted from realistic news videos. Traditional methods relying on RGB images are resource-intensive, whereas our approach with Facial Landmark Emotion Recognition (FLER) offers a simplified yet effective alternative. By leveraging Graph Neural Networks (GNNs) to analyze the geometric and spatial relations…

    Submitted 20 April, 2024; originally announced April 2024.

  10. arXiv:2404.11865  [pdf, other]

    cs.CV

    From Image to Video, what do we need in multimodal LLMs?

    Authors: Suyuan Huang, Haoxin Zhang, Yan Gao, Yao Hu, Zengchang Qin

    Abstract: Multimodal Large Language Models (MLLMs) have demonstrated profound capabilities in understanding multimodal information, covering from Image LLMs to the more complex Video LLMs. Numerous studies have illustrated their exceptional cross-modal comprehension. Recently, integrating video foundation models with large language models to build a comprehensive video understanding system has been proposed…

    Submitted 17 April, 2024; originally announced April 2024.

  11. arXiv:2404.11791  [pdf, other]

    cs.IR

    Consolidating Ranking and Relevance Predictions of Large Language Models through Post-Processing

    Authors: Le Yan, Zhen Qin, Honglei Zhuang, Rolf Jagerman, Xuanhui Wang, Michael Bendersky, Harrie Oosterhuis

    Abstract: The powerful generative abilities of large language models (LLMs) show potential in generating relevance labels for search applications. Previous work has found that directly asking about relevancy, such as "How relevant is document A to query Q?", results in sub-optimal ranking. Instead, the pairwise ranking prompting (PRP) approach produces promising ranking performance through asking about pai…

    Submitted 17 April, 2024; originally announced April 2024.

  12. arXiv:2404.11121  [pdf, other]

    cs.CR cs.AI

    TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge Deployment

    Authors: Qinfeng Li, Zhiqiang Shen, Zhenghan Qin, Yangfan Xie, Xuhong Zhang, Tianyu Du, Jianwei Yin

    Abstract: Proprietary large language models (LLMs) have been widely applied in various scenarios. Additionally, deploying LLMs on edge devices is trending for efficiency and privacy reasons. However, edge deployment of proprietary LLMs introduces new security challenges: edge-deployed models are exposed as white-box accessible to users, enabling adversaries to conduct effective model stealing (MS) attacks.…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.07152 by other authors

  13. arXiv:2404.10499  [pdf, other]

    cs.CV cs.AI

    Robust Noisy Label Learning via Two-Stream Sample Distillation

    Authors: Sihan Bai, Sanping Zhou, Zheng Qin, Le Wang, Nanning Zheng

    Abstract: Noisy label learning aims to learn robust networks under the supervision of noisy labels, which plays a critical role in deep learning. Existing work either conducts sample selection or label correction to deal with noisy labels during the model training process. In this paper, we design a simple yet effective sample selection framework, termed Two-Stream Sample Distillation (TSSD), for noisy labe…

    Submitted 16 April, 2024; originally announced April 2024.

  14. arXiv:2404.07904  [pdf, other]

    cs.CL

    HGRN2: Gated Linear RNNs with State Expansion

    Authors: Zhen Qin, Songlin Yang, Weixuan Sun, Xuyang Shen, Dong Li, Weigao Sun, Yiran Zhong

    Abstract: Hierarchically gated linear RNN (HGRN, Qin et al. 2023) has demonstrated competitive training speed and performance in language modeling, while offering efficient inference. However, the recurrent state size of HGRN remains relatively small, which limits its expressiveness. To address this issue, inspired by linear attention, we introduce a simple outer-product-based state expansion mechanism so tha…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Technical Report. Yiran Zhong is the corresponding author. The source code is available at https://github.com/OpenNLPLab/HGRN2

  15. arXiv:2404.07413  [pdf, other]

    cs.CL cs.AI

    JetMoE: Reaching Llama2 Performance with 0.1M Dollars

    Authors: Yikang Shen, Zhen Guo, Tianle Cai, Zengyi Qin

    Abstract: Large Language Models (LLMs) have achieved remarkable results, but their increasing resource demand has become a major obstacle to the development of powerful and accessible super-human intelligence. This report introduces JetMoE-8B, a new LLM trained with less than $0.1 million, using 1.25T tokens from carefully mixed open-source corpora and 30,000 H100 GPU hours. Despite its low cost, the JetMoE…

    Submitted 10 April, 2024; originally announced April 2024.

  16. arXiv:2404.04557  [pdf, other]

    cs.CV

    Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes

    Authors: Zhiyuan Yu, Zheng Qin, Lintao Zheng, Kai Xu

    Abstract: Multi-instance point cloud registration estimates the poses of multiple instances of a model point cloud in a scene point cloud. Extracting accurate point correspondences is at the center of the problem. Existing approaches usually treat the scene point cloud as a whole, overlooking the separation of instances. Therefore, point features could be easily polluted by other points from the background o…

    Submitted 6 April, 2024; originally announced April 2024.

  17. arXiv:2404.02882  [pdf, other]

    cs.LG cs.CL

    Linear Attention Sequence Parallelism

    Authors: Weigao Sun, Zhen Qin, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong

    Abstract: Sequence Parallel (SP) serves as a prevalent strategy to handle long sequences that exceed the memory limit of a single GPU. However, existing SP methods do not take advantage of linear attention features, resulting in sub-optimal parallelism efficiency and usability for linear attention-based language models. In this paper, we introduce Linear Attention Sequence Parallel (LASP), an efficient SP m…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Technical Report. Weigao Sun and Zhen Qin contribute equally to this paper. Yiran Zhong is the corresponding author. The code is available at https://github.com/OpenNLPLab/LASP

  18. arXiv:2404.02733  [pdf, other]

    cs.CV

    InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation

    Authors: Haofan Wang, Matteo Spinelli, Qixun Wang, Xu Bai, Zekui Qin, Anthony Chen

    Abstract: Tuning-free diffusion-based models have demonstrated significant potential in the realm of image personalization and customization. However, despite this notable progress, current models continue to grapple with several complex challenges in producing style-consistent image generation. Firstly, the concept of style is inherently underdetermined, encompassing a multitude of elements such as color,…

    Submitted 4 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: Technical Report

  19. arXiv:2403.13310  [pdf, other]

    cs.IR cs.LG cs.LO

    A Semantic Search Engine for Mathlib4

    Authors: Guoxiong Gao, Haocheng Ju, Jiedong Jiang, Zihan Qin, Bin Dong

    Abstract: The interactive theorem prover, Lean, enables the verification of formal mathematical proofs and is backed by an expanding community. Central to this ecosystem is its mathematical library, mathlib4, which lays the groundwork for the formalization of an expanding range of mathematical theories. However, searching for theorems in mathlib4 can be challenging. To successfully search in mathlib4, users…

    Submitted 20 March, 2024; originally announced March 2024.

  20. arXiv:2403.11570  [pdf, other]

    cs.CV

    LogicalDefender: Discovering, Extracting, and Utilizing Common-Sense Knowledge

    Authors: Yuhe Liu, Mengxue Kang, Zengchang Qin, Xiangxiang Chu

    Abstract: Large text-to-image models have achieved astonishing performance in synthesizing diverse and high-quality images guided by texts. With detail-oriented conditioning control, even finer-grained spatial control can be achieved. However, some generated images still appear unreasonable, even with plentiful object features and a harmonious style. In this paper, we delve into the underlying causes and fi…

    Submitted 18 March, 2024; originally announced March 2024.

  21. arXiv:2403.09969  [pdf, other]

    cs.LG

    Prediction of Vessel Arrival Time to Pilotage Area Using Multi-Data Fusion and Deep Learning

    Authors: Xiaocai Zhang, Xiuju Fu, Zhe Xiao, Haiyan Xu, Xiaoyang Wei, Jimmy Koh, Daichi Ogawa, Zheng Qin

    Abstract: This paper investigates the prediction of vessels' arrival time to the pilotage area using multi-data fusion and deep learning approaches. Firstly, the vessel arrival contour is extracted based on Multivariate Kernel Density Estimation (MKDE) and clustering. Secondly, multiple data sources, including Automatic Identification System (AIS), pilotage booking information, and meteorological data, are…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: The 26th IEEE International Conference on Intelligent Transportation Systems (ITSC 2023)

  22. arXiv:2403.05026  [pdf, other]

    cs.LG cs.AI

    Spectral Invariant Learning for Dynamic Graphs under Distribution Shifts

    Authors: Zeyang Zhang, Xin Wang, Ziwei Zhang, Zhou Qin, Weigao Wen, Hui Xue, Haoyang Li, Wenwu Zhu

    Abstract: Dynamic graph neural networks (DyGNNs) currently struggle with handling distribution shifts that are inherent in dynamic graphs. Existing work on DyGNNs with out-of-distribution settings only focuses on the time domain, failing to handle cases involving distribution shifts in the spectral domain. In this paper, we discover that there exist cases with distribution shifts unobservable in the time do…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: NeurIPS'23

  23. arXiv:2402.15759  [pdf]

    cs.CV cs.AI

    Increasing SAM Zero-Shot Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation

    Authors: Zekun Jiang, Dongjie Cheng, Ziyuan Qin, Jun Gao, Qicheng Lao, Kang Li, Le Zhang

    Abstract: This study develops and evaluates a novel multimodal medical image zero-shot segmentation algorithm named Text-Visual-Prompt SAM (TV-SAM) without any manual annotations. TV-SAM incorporates and integrates large language model GPT-4, Vision Language Model GLIP, and Segment Anything Model (SAM), to autonomously generate descriptive text prompts and visual bounding box prompts from medical images, th…

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: 12 pages, 4 figures, 4 tables

  24. arXiv:2402.07818  [pdf, other]

    cs.LG cs.AI cs.CL

    Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning

    Authors: Z Liu, J Lou, W Bao, Y Hu, B Li, Z Qin, K Ren

    Abstract: Fine-tuning on task-specific datasets is a widely-embraced paradigm of harnessing the powerful capability of pretrained LLMs for various downstream tasks. Due to the popularity of LLM fine-tuning and its accompanying privacy concerns, differentially private (DP) fine-tuning of pretrained LLMs has been widely used to safeguard the privacy of task-specific datasets. Lying at the design core of D…

    Submitted 9 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  25. arXiv:2402.07610  [pdf, other]

    cs.CL cs.AI

    Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping

    Authors: Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Zhong Zhang, Bingzhe Wu, Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao

    Abstract: Self-alignment is an effective way to reduce the cost of human annotation while ensuring promising model capability. However, most current methods complete the data collection and training steps in a single round, which may overlook the continuously improving ability of self-aligned models. This gives rise to a key query: What if we do multi-time bootstrapping self-alignment? Does this strategy en…

    Submitted 21 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  26. arXiv:2402.05135  [pdf, other]

    cs.AI cs.CL cs.IR

    CADReN: Contextual Anchor-Driven Relational Network for Controllable Cross-Graphs Node Importance Estimation

    Authors: Zijie Zhong, Yunhui Zhang, Ziyi Chang, Zengchang Qin

    Abstract: Node Importance Estimation (NIE) is crucial for integrating external information into Large Language Models through Retriever-Augmented Generation. Traditional methods, focusing on static, single-graph characteristics, lack adaptability to new graphs and user-specific requirements. CADReN, our proposed method, addresses these limitations by introducing a Contextual Anchor (CA) mechanism. This appr…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 8 pages, 6 figures

    MSC Class: 68T07

  27. arXiv:2402.01878  [pdf, other]

    cs.CL cs.LG

    LiPO: Listwise Preference Optimization through Learning-to-Rank

    Authors: Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, Rishabh Joshi, Yao Zhao, Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang

    Abstract: Aligning language models (LMs) with curated human feedback is critical to control their behaviors in real-world applications. Several recent policy optimization methods, such as DPO and SLiC, serve as promising alternatives to the traditional Reinforcement Learning from Human Feedback (RLHF) approach. In practice, human feedback often comes in a format of a ranked list over multiple responses to a…

    Submitted 2 February, 2024; originally announced February 2024.

  28. arXiv:2401.16265  [pdf, other]

    cs.CL cs.DC

    CO2: Efficient Distributed Training with Full Communication-Computation Overlap

    Authors: Weigao Sun, Zhen Qin, Weixuan Sun, Shidi Li, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong

    Abstract: The fundamental success of large language models hinges upon the efficacious implementation of large-scale distributed training techniques. Nevertheless, building a vast, high-performance cluster featuring high-speed communication interconnectivity is prohibitively costly, and accessible only to prominent entities. In this work, we aim to lower this barrier and democratize large-scale training wit…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: ICLR 2024 Spotlight. Yiran Zhong is the corresponding author. Code is available at: https://github.com/OpenNLPLab/CO2

  29. arXiv:2401.13516  [pdf, other]

    cs.CV cs.CR

    Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces

    Authors: Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou

    Abstract: Deepfake videos are becoming increasingly realistic, showing few tampering traces on facial areas that vary between frames. Consequently, existing Deepfake detection methods struggle to detect unknown domain Deepfake videos while accurately locating the tampered region. To address this limitation, we propose Delocate, a novel Deepfake detection model that can both recognize and localize unknown domai…

    Submitted 5 May, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2308.09921, arXiv:2305.05943

  30. arXiv:2401.09490  [pdf, other]

    q-bio.QM cs.IR

    Gene-associated Disease Discovery Powered by Large Language Models

    Authors: Jiayu Chang, Shiyu Wang, Chen Ling, Zhaohui Qin, Liang Zhao

    Abstract: The intricate relationship between genetic variation and human diseases has been a focal point of medical research, evidenced by the identification of risk genes regarding specific diseases. The advent of advanced genome sequencing techniques has significantly improved the efficiency and cost-effectiveness of detecting these genetic markers, playing a crucial role in disease diagnosis and forming…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: This is the official paper accepted by AAAI 2024 Workshop on Large Language Models for Biological Discoveries

  31. arXiv:2401.08217  [pdf, other]

    cs.IR

    LLM-Guided Multi-View Hypergraph Learning for Human-Centric Explainable Recommendation

    Authors: Zhixuan Chu, Yan Wang, Qing Cui, Longfei Li, Wenqing Chen, Zhan Qin, Kui Ren

    Abstract: As personalized recommendation systems become vital in the age of information overload, traditional methods relying solely on historical user interactions often fail to fully capture the multifaceted nature of human interests. To enable more human-centric modeling of user preferences, this work proposes a novel explainable recommendation framework, i.e., LLMHG, synergizing the reasoning capabiliti…

    Submitted 29 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 14 pages, 5 figures

  32. arXiv:2401.07519  [pdf, other]

    cs.CV cs.AI

    InstantID: Zero-shot Identity-Preserving Generation in Seconds

    Authors: Qixun Wang, Xu Bai, Haofan Wang, Zekui Qin, Anthony Chen, Huaxia Li, Xu Tang, Yao Hu

    Abstract: There has been significant progress in personalized image synthesis with methods such as Textual Inversion, DreamBooth, and LoRA. Yet, their real-world applicability is hindered by high storage demands, lengthy fine-tuning processes, and the need for multiple reference images. Conversely, existing ID embedding-based methods, while requiring only a single forward inference, face challenges: they ei…

    Submitted 2 February, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

    Comments: Technical Report, project page available at https://instantid.github.io/

  33. arXiv:2401.04812  [pdf, other]

    cs.AI

    Sample-and-Bound for Non-Convex Optimization

    Authors: Yaoguang Zhai, Zhizhen Qin, Sicun Gao

    Abstract: Standard approaches for global optimization of non-convex functions, such as branch-and-bound, maintain partition trees to systematically prune the domain. The tree size grows exponentially in the number of dimensions. We propose new sampling-based methods for non-convex optimization that adapt Monte Carlo Tree Search (MCTS) to improve efficiency. Instead of the standard use of visitation count i…

    Submitted 19 February, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Published at AAAI 2024. Code is available at https://github.com/aaucsd/MCIR

  34. arXiv:2401.04658  [pdf, other]

    cs.CL cs.AI

    Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

    Authors: Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong

    Abstract: Linear attention is an efficient attention mechanism that has recently emerged as a promising alternative to conventional softmax attention. With its ability to process tokens in linear computational complexities, linear attention, in theory, can handle sequences of unlimited length without sacrificing speed, i.e., maintaining a constant training speed for various sequence lengths with a fixed mem…

    Submitted 15 January, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Technical Report. Yiran Zhong is the corresponding author. The source code is available at https://github.com/OpenNLPLab/lightning-attention

  35. arXiv:2401.02592  [pdf, other]

    stat.ML cs.LG eess.SP math.OC

    Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery

    Authors: Zhen Qin, Michael B. Wakin, Zhihui Zhu

    Abstract: In this paper, we provide the first convergence guarantee for the factorization approach. Specifically, to avoid the scaling ambiguity and to facilitate theoretical analysis, we optimize over the so-called left-orthogonal TT format which enforces orthonormality among most of the factors. To ensure the orthonormal structure, we utilize the Riemannian gradient descent (RGD) for optimizing those fact…

    Submitted 4 January, 2024; originally announced January 2024.

  36. arXiv:2401.00859  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Federated Multi-View Synthesizing for Metaverse

    Authors: Yiyu Guo, Zhijin Qin, Xiaoming Tao, Geoffrey Ye Li

    Abstract: The metaverse is expected to provide immersive entertainment, education, and business applications. However, virtual reality (VR) transmission over wireless networks is data- and computation-intensive, making it critical to introduce novel solutions that meet stringent quality-of-service requirements. With recent advances in edge intelligence and deep learning, we have developed a novel multi-view…

    Submitted 18 December, 2023; originally announced January 2024.

  37. arXiv:2312.16909  [pdf, other]

    cs.IT

    A GAN-based Semantic Communication for Text without CSI

    Authors: Jin Mao, Ke Xiong, Ming Liu, Zhijin Qin, Wei Chen, Pingyi Fan, Khaled Ben Letaief

    Abstract: Recently, semantic communication (SC) has been regarded as one of the potential paradigms of 6G. Current SC frameworks require channel state information (CSI) to handle severe signal distortion induced by channel fading. Since the channel estimation overhead for obtaining CSI cannot be neglected, we therefore propose a generative adversarial network (GAN) based SC framework (Ti-GSC) that doesn't r…

    Submitted 28 December, 2023; originally announced December 2023.

  38. arXiv:2312.16797  [pdf, other]

    cs.CV

    Multi-Prompts Learning with Cross-Modal Alignment for Attribute-based Person Re-Identification

    Authors: Yajing Zhai, Yawen Zeng, Zhiyong Huang, Zheng Qin, Xin Jin, Da Cao

    Abstract: Fine-grained attribute descriptions can significantly supplement the valuable semantic information of person images, which is vital to the success of the person re-identification (ReID) task. However, current ReID algorithms typically fail to effectively leverage the rich contextual information available, primarily due to their reliance on simplistic and coarse utilization of image attributes. R…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  39. arXiv:2312.10336  [pdf, ps, other]

    cs.LG

    Certified Minimax Unlearning with Generalization Rates and Deletion Capacity

    Authors: Jiaqi Liu, Jian Lou, Zhan Qin, Kui Ren

    Abstract: We study the problem of $(ε,δ)$-certified machine unlearning for minimax models. Most of the existing works focus on unlearning from standard statistical learning models that have a single variable and their unlearning steps hinge on the direct Hessian-based conventional Newton update. We develop a new $(ε,δ)$-certified machine unlearning algorithm for minimax models. It proposes a minimax unlearn…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  40. arXiv:2312.06353  [pdf, other]

    cs.LG cs.DC

    Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes

    Authors: Zhen Qin, Daoyuan Chen, Bingchen Qian, Bolin Ding, Yaliang Li, Shuiguang Deng

    Abstract: Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions. Federated learning offers a way to fine-tune LLMs using the abundant data on end devices without compromising data privacy. Most existing federated fine-tuning methods for LLMs rely on parameter-efficient fine-tuning techniques, which may not reach the performance height poss…

    Submitted 31 January, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Codes are available at https://github.com/alibaba/FederatedScope/tree/FedKSeed. We will continuously update the codebase and arXiv version

  41. arXiv:2312.04738  [pdf, other]

    cs.CR

    DPI: Ensuring Strict Differential Privacy for Infinite Data Streaming

    Authors: Shuya Feng, Meisam Mohammady, Han Wang, Xiaochen Li, Zhan Qin, Yuan Hong

    Abstract: Streaming data, crucial for applications like crowdsourcing analytics, behavior studies, and real-time monitoring, faces significant privacy risks due to the large and diverse data linked to individuals. In particular, recent efforts to release data streams, using the rigorous privacy notion of differential privacy (DP), have encountered issues with unbounded privacy leakage. This challenge limits…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: To appear in IEEE S&P 2024

  42. arXiv:2312.04584  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    Towards Sample-specific Backdoor Attack with Clean Labels via Attribute Trigger

    Authors: Yiming Li, Mingyan Zhu, Junfeng Guo, Tao Wei, Shu-Tao Xia, Zhan Qin

    Abstract: Currently, sample-specific backdoor attacks (SSBAs) are the most advanced and malicious methods since they can easily circumvent most of the current backdoor defenses. In this paper, we reveal that SSBAs are not sufficiently stealthy due to their poisoned-label nature, where users can discover anomalies if they check the image-label relationship. In particular, we demonstrate that it is ineffectiv…

    Submitted 10 December, 2023; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: 14 pages

  43. arXiv:2312.04236  [pdf, other]

    cs.CV cs.AI

    Detecting and Restoring Non-Standard Hands in Stable Diffusion Generated Images

    Authors: Yiqun Zhang, Zhenyue Qin, Yang Liu, Dylan Campbell

    Abstract: We introduce a pipeline to address anatomical inaccuracies in Stable Diffusion generated hand images. The initial step involves constructing a specialized dataset, focusing on hand anomalies, to train our models effectively. A finetuned detection model is pivotal for precise identification of these anomalies, ensuring targeted correction. Body pose estimation aids in understanding hand orientation…

    Submitted 7 December, 2023; originally announced December 2023.

  44. arXiv:2312.02175  [pdf, other]

    cs.IT

    Wavefront Transformation-based Near-field Channel Prediction for Extremely Large Antenna Array with Mobility

    Authors: Weidong Li, Haifan Yin, Ziao Qin, Merouane Debbah

    Abstract: This paper addresses the mobility problem in extremely large antenna array (ELAA) communication systems. In order to account for the performance loss caused by the spherical wavefront of ELAA in the mobility scenario, we propose a wavefront transformation-based matrix pencil (WTMP) channel prediction method. In particular, we design a matrix to transform the spherical wavefront into a new wavefron…

    Submitted 17 November, 2023; originally announced December 2023.

  45. arXiv:2312.01479  [pdf, other]

    cs.SD cs.LG eess.AS

    OpenVoice: Versatile Instant Voice Cloning

    Authors: Zengyi Qin, Wenliang Zhao, Xumin Yu, Xin Sun

    Abstract: We introduce OpenVoice, a versatile voice cloning approach that requires only a short audio clip from the reference speaker to replicate their voice and generate speech in multiple languages. OpenVoice represents a significant advancement in addressing the following open challenges in the field: 1) Flexible Voice Style Control. OpenVoice enables granular control over voice styles, including emotio…

    Submitted 2 January, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: Technical Report

  46. Graph Coordinates and Conventional Neural Networks -- An Alternative for Graph Neural Networks

    Authors: Zheyi Qin, Randy Paffenroth, Anura P. Jayasumana

    Abstract: Graph-based data present unique challenges and opportunities for machine learning. Graph Neural Networks (GNNs), and especially those algorithms that capture graph topology through message passing for neighborhood aggregation, have been a leading solution. However, these networks often require substantial computational resources and may not optimally leverage the information contained in the graph…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: This paper is submitted and will be published on Big Data Conference 2023, Data-driven Science for Graphs: Algorithms, Architectures, and Application workshop

  47. arXiv:2312.01081  [pdf, other]

    cs.IT cs.AI cs.LG

    Adaptive Resource Allocation for Semantic Communication Networks

    Authors: Lingyi Wang, Wei Wu, Fuhui Zhou, Zhaohui Yang, Zhijin Qin

    Abstract: Semantic communication, recognized as a promising technology for future intelligent applications, has received widespread research attention. Despite the potential of semantic communication to enhance transmission reliability, especially in low signal-to-noise (SNR) environments, the critical issue of resource allocation and compatibility in the dynamic wireless environment remains largely unexplo…

    Submitted 2 December, 2023; originally announced December 2023.

  48. arXiv:2312.00740  [pdf, ps, other]

    cs.IT

    Computing Networks Enabled Semantic Communications

    Authors: Zhijin Qin, Jingkai Ying, Dingxi Yang, Hengjiang Wang, Xiaoming Tao

    Abstract: Semantic communication has shown great potential in boosting the effectiveness and reliability of communications. However, its systems to date are mostly enabled by deep learning, which requires demanding computing resources. This article proposes a framework for the computing networks enabled semantic communication system, aiming to offer sufficient computing resources for semantic processing and…

    Submitted 1 December, 2023; originally announced December 2023.

  49. arXiv:2312.00508  [pdf, other]

    cs.CR

    PyraTrans: Attention-Enriched Pyramid Transformer for Malicious URL Detection

    Authors: Ruitong Liu, Yanbin Wang, Zhenhao Guo, Haitao Xu, Zhan Qin, Wenrui Ma, Fan Zhang

    Abstract: Although advancements in machine learning have driven the development of malicious URL detection technology, current techniques still face significant challenges in their capacity to generalize and their resilience against evolving threats. In this paper, we propose PyraTrans, a novel method that integrates pretrained Transformers with pyramid feature learning to detect malicious URL. PyraTrans ut…

    Submitted 6 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: 12 pages, 7 figures

  50. arXiv:2311.16191  [pdf, other]

    cs.LG cs.AI

    Learning Multi-Pattern Normalities in the Frequency Domain for Efficient Time Series Anomaly Detection

    Authors: Feiyi Chen, Yingying Zhang, Zhen Qin, Lunting Fan, Renhe Jiang, Yuxuan Liang, Qingsong Wen, Shuiguang Deng

    Abstract: Anomaly detection significantly enhances the robustness of cloud systems. While neural network-based methods have recently demonstrated strong advantages, they encounter practical challenges in cloud environments: the contradiction between the impracticality of maintaining a unique model for each service and the limited ability to deal with diverse normal patterns by a unified model, as well as is…

    Submitted 18 March, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: Accepted by IEEE 40th International Conference on Data Engineering (ICDE 2024)