Skip to main content

Showing 1–50 of 505 results for author: Zhao, Q

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.16928  [pdf

    cs.SI cs.GT

    TopoLa: a novel embedding framework for understanding complex networks

    Authors: Kai Zheng, Qilong Feng, Yaohang Li, Qichang Zhao, Jinhui Xu, Jianxin Wang

    Abstract: Complex networks, which are the abstractions of many real-world systems, present a persistent challenge across disciplines for people to decipher their underlying information. Recently, hyperbolic geometry of latent spaces has gained traction in network analysis, due to its ability to preserve certain local intrinsic properties of the nodes. In this study, we explore the problem from a much broade… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 85 pages, 17 figures

  2. arXiv:2405.16219  [pdf, other

    cs.LG stat.ML

    Deep Causal Generative Models with Property Control

    Authors: Qilong Zhao, Shiyu Wang, Guangji Bai, Bo Pan, Zhaohui Qin, Liang Zhao

    Abstract: Generating data with properties of interest by external users while following the right causation among its intrinsic factors is important yet has not been well addressed jointly. This is due to the long-lasting challenge of jointly identifying key latent variables, their causal relations, and their correlation with properties of interest, as well as how to leverage their discoveries toward causal… ▽ More

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 13 pages, 6 figures

  3. arXiv:2405.08779  [pdf, other

    cs.LG

    Jacobian Regularizer-based Neural Granger Causality

    Authors: Wanqi Zhou, Shuanghao Bai, Shujian Yu, Qibin Zhao, Badong Chen

    Abstract: With the advancement of neural networks, diverse methods for neural Granger causality have emerged, which demonstrate proficiency in handling complex data, and nonlinear relationships. However, the existing framework of neural Granger causality has several limitations. It requires the construction of separate predictive models for each target variable, and the relationship depends on the sparsity… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 20 pages, 7 figures, ICML 2024

  4. arXiv:2405.07459  [pdf, other

    cs.CV

    DualFocus: A Unified Framework for Integrating Positive and Negative Descriptors in Text-based Person Retrieval

    Authors: Yuchuan Deng, Zhanpeng Hu, Jiakun Han, Chuang Deng, Qijun Zhao

    Abstract: Text-based person retrieval (TPR) aims to retrieve images of a person from an extensive array of candidates based on a given textual description. The core challenge lies in mapping visual and textual data into a unified latent space. While existing TPR methods concentrate on recognizing explicit and positive characteristics, they often neglect the critical influence of negative descriptors, result… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

  5. arXiv:2405.05508  [pdf, other

    cs.IR cs.AI

    Redefining Information Retrieval of Structured Database via Large Language Models

    Authors: Mingzhu Wang, Yuzhe Zhang, Qihang Zhao, Juanyi Yang, Hong Zhang

    Abstract: Retrieval augmentation is critical when Language Models (LMs) exploit non-parametric knowledge related to the query through external knowledge bases before reasoning. The retrieved information is incorporated into LMs as context alongside the query, enhancing the reliability of responses towards factual questions. Prior researches in retrieval augmentation typically follow a retriever-generator pa… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

  6. arXiv:2405.04128  [pdf, other

    cs.CL cs.SD eess.AS

    Fine-grained Speech Sentiment Analysis in Chinese Psychological Support Hotlines Based on Large-scale Pre-trained Model

    Authors: Zhonglong Chen, Changwei Song, Yining Chen, Jianqiang Li, Guanghui Fu, Yongsheng Tong, Qing Zhao

    Abstract: Suicide and suicidal behaviors remain significant challenges for public policy and healthcare. In response, psychological support hotlines have been established worldwide to provide immediate help to individuals in mental crises. The effectiveness of these hotlines largely depends on accurately identifying callers' emotional states, particularly underlying negative emotions indicative of increased… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

  7. arXiv:2405.03419  [pdf, other

    cs.NE cs.LG

    Automated Metaheuristic Algorithm Design with Autoregressive Learning

    Authors: Qi Zhao, Tengfei Liu, Bai Yan, Qiqi Duan, Jian Yang, Yuhui Shi

    Abstract: Automated design of metaheuristic algorithms offers an attractive avenue to reduce human effort and gain enhanced performance beyond human intuition. Current automated methods design algorithms within a fixed structure and operate from scratch. This poses a clear gap towards fully discovering potentials over the metaheuristic family and fertilizing from prior design experience. To bridge the gap,… ▽ More

    Submitted 6 May, 2024; originally announced May 2024.

  8. arXiv:2405.03239  [pdf, other

    cs.LG cs.AI

    Deep Learning for Detecting and Early Predicting Chronic Obstructive Pulmonary Disease from Spirogram Time Series: A UK Biobank Study

    Authors: Shuhao Mei, Yuxi Zhou, Jiahao Xu, Yuxuan Wan, Shan Cao, Qinghao Zhao, Shijia Geng, Junqing Xie, Shenda Hong

    Abstract: Chronic Obstructive Pulmonary Disease (COPD) is a chronic inflammatory lung condition that causes airflow obstruction. The existing methods can only detect patients who already have COPD based on obvious features shown in the spirogram (In this article, the spirogram specifically involves measuring Volume-Flow curve time series). Early prediction of COPD risk is vital for monitoring COPD disease p… ▽ More

    Submitted 6 May, 2024; originally announced May 2024.

  9. arXiv:2405.02373  [pdf, other

    math.OC cs.LG stat.ML

    Exponentially Weighted Algorithm for Online Network Resource Allocation with Long-Term Constraints

    Authors: Ahmed Sid-Ali, Ioannis Lambadaris, Yiqiang Q. Zhao, Gennady Shaikhet, Amirhossein Asgharnia

    Abstract: This paper studies an online optimal resource reservation problem in communication networks with job transfers where the goal is to minimize the reservation cost while maintaining the blocking cost under a certain budget limit. To tackle this problem, we propose a novel algorithm based on a randomized exponentially weighted method that encompasses long-term constraints. We then analyze the perform… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2305.15558

  10. arXiv:2405.00468  [pdf, other

    cs.CV cs.AI

    Feature-Aware Noise Contrastive Learning For Unsupervised Red Panda Re-Identification

    Authors: Jincheng Zhang, Qijun Zhao, Tie Liu

    Abstract: To facilitate the re-identification (Re-ID) of individual animals, existing methods primarily focus on maximizing feature similarity within the same individual and enhancing distinctiveness between different individuals. However, most of them still rely on supervised learning and require substantial labeled data, which is challenging to obtain. To avoid this issue, we propose a Feature-Aware Noise… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures, IJCNN2024

  11. arXiv:2404.19287  [pdf, other

    cs.CV

    Revisiting the Adversarial Robustness of Vision Language Models: a Multimodal Perspective

    Authors: Wanqi Zhou, Shuanghao Bai, Qibin Zhao, Badong Chen

    Abstract: Pretrained vision-language models (VLMs) like CLIP have shown impressive generalization performance across various downstream tasks, yet they remain vulnerable to adversarial attacks. While prior research has primarily concentrated on improving the adversarial robustness of image encoders to guard against attacks on images, the exploration of text-based and multimodal attacks has largely been over… ▽ More

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 16 pages, 14 figures

  12. arXiv:2404.18047  [pdf, other

    cs.RO

    LIKO: LiDAR, Inertial, and Kinematic Odometry for Bipedal Robots

    Authors: Qingrui Zhao, Mingyuan Li, Yongliang Shi, Xuechao Chen, Zhangguo Yu, Lianqiang Han, Zhenyuan Fu, Jintao Zhang, Chao Li, Yuanxi Zhang, Qiang Huang

    Abstract: High-frequency and accurate state estimation is crucial for biped robots. This paper presents a tightly-coupled LiDAR-Inertial-Kinematic Odometry (LIKO) for biped robot state estimation based on an iterated extended Kalman filter. Beyond state estimation, the foot contact position is also modeled and estimated. This allows for both position and velocity updates from kinematic measurement. Addition… ▽ More

    Submitted 27 April, 2024; originally announced April 2024.

  13. arXiv:2404.15000  [pdf, other

    cs.CR

    EarPass: Secure and Implicit Call Receiver Authentication Using Ear Acoustic Sensing

    Authors: Xiping Sun, Jing Chen, Kun He, Zhixiang He, Ruiying Du, Yebo Feng, Qingchuan Zhao, Cong Wu

    Abstract: Private voice communication often contains sensitive information, making it critical to ensure that only authorized users have access to such calls. Unfortunately, current authentication mechanisms, such as PIN-based passwords, fingerprint recognition, and face recognition, fail to authenticate the call receiver, leaving a gap in security. To fill the gap, we present EarPass, a secure and implicit… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

  14. arXiv:2404.14443  [pdf

    cs.CL cs.AI

    Evaluation of Machine Translation Based on Semantic Dependencies and Keywords

    Authors: Kewei Yuan, Qiurong Zhao, Yang Xu, Xiao Zhang, Huansheng Ning

    Abstract: In view of the fact that most of the existing machine translation evaluation algorithms only consider the lexical and syntactic information, but ignore the deep semantic information contained in the sentence, this paper proposes a computational method for evaluating the semantic correctness of machine translations based on reference translations and incorporating semantic dependencies and sentence… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

  15. arXiv:2404.13798  [pdf, other

    cs.CV

    Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation

    Authors: Jensen Hwa, Qingyu Zhao, Aditya Lahiri, Adnan Masood, Babak Salimi, Ehsan Adeli

    Abstract: Conditional independence (CI) constraints are critical for defining and evaluating fairness in machine learning, as well as for learning unconfounded or causal representations. Traditional methods for ensuring fairness either blindly learn invariant features with respect to a protected variable (e.g., race when classifying sex from face images) or enforce CI relative to the protected attribute onl… ▽ More

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: To appear at the 2024 IEEE CVPR Workshop on Fair, Data-Efficient, and Trusted Computer Vision

  16. arXiv:2404.13430  [pdf, other

    physics.chem-ph cs.LG

    React-OT: Optimal Transport for Generating Transition State in Chemical Reactions

    Authors: Chenru Duan, Guan-Horng Liu, Yuanqi Du, Tianrong Chen, Qiyuan Zhao, Haojun Jia, Carla P. Gomes, Evangelos A. Theodorou, Heather J. Kulik

    Abstract: Transition states (TSs) are transient structures that are key in understanding reaction mechanisms and designing catalysts but challenging to be captured in experiments. Alternatively, many optimization algorithms have been developed to search for TSs computationally. Yet the cost of these algorithms driven by quantum chemistry methods (usually density functional theory) is still high, posing chal… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 5 figures, 1 table

  17. arXiv:2404.12659  [pdf, ps, other

    cs.CL

    SOS-1K: A Fine-grained Suicide Risk Classification Dataset for Chinese Social Media Analysis

    Authors: Hongzhi Qi, Hanfei Liu, Jianqiang Li, Qing Zhao, Wei Zhai, Dan Luo, Tian Yu He, Shuo Liu, Bing Xiang Yang, Guanghui Fu

    Abstract: In the social media, users frequently express personal emotions, a subset of which may indicate potential suicidal tendencies. The implicit and varied forms of expression in internet language complicate accurate and rapid identification of suicidal intent on social media, thus creating challenges for timely intervention efforts. The development of deep learning models for suicide risk detection is… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

  18. arXiv:2404.12235  [pdf, other

    cs.CV

    Beyond Average: Individualized Visual Scanpath Prediction

    Authors: Xianyu Chen, Ming Jiang, Qi Zhao

    Abstract: Understanding how attention varies across individuals has significant scientific and societal impacts. However, existing visual scanpath models treat attention uniformly, neglecting individual differences. To bridge this gap, this paper focuses on individualized scanpath prediction (ISP), a new attention modeling task that aims to accurately predict how different individuals shift their attention… ▽ More

    Submitted 18 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: To appear in CVPR2024

  19. arXiv:2404.11449  [pdf, other

    cs.CL cs.LG

    AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts

    Authors: Meng Jiang, Yi Jing Yu, Qing Zhao, Jianqiang Li, Changwei Song, Hongzhi Qi, Wei Zhai, Dan Luo, Xiaoqin Wang, Guanghui Fu, Bing Xiang Yang

    Abstract: Cognitive Behavioral Therapy (CBT) is an effective technique for addressing the irrational thoughts stemming from mental illnesses, but it necessitates precise identification of cognitive pathways to be successfully implemented in patient care. In current society, individuals frequently express negative emotions on social media on specific topics, often exhibiting cognitive distortions, including… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.

  20. arXiv:2404.09790  [pdf, other

    cs.CV

    NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results

    Authors: Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou , et al. (63 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained. The challenge involves generating corresponding high-resolution (HR) images, magnified by a factor of four, from low-resolution (LR) inputs using prior information. The LR images originate from bicubic downsampling degradation. The aim of the challenge i… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: NTIRE 2024 webpage: https://cvlai.net/ntire/2024. Code: https://github.com/zhengchen1999/NTIRE2024_ImageSR_x4

  21. arXiv:2404.09149  [pdf, other

    eess.SY cs.NE math.NA

    Heuristic Solution to Joint Deployment and Beamforming Design for STAR-RIS Aided Networks

    Authors: Bai Yan, Qi Zhao, Jin Zhang, J. Andrew Zhang

    Abstract: This paper tackles the deployment challenges of Simultaneous Transmitting and Reflecting Reconfigurable Intelligent Surface (STAR-RIS) in communication systems. Unlike existing works that use fixed deployment setups or solely optimize the location, this paper emphasizes the joint optimization of the location and orientation of STAR-RIS. This enables searching across all user grouping possibilities… ▽ More

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 30 pages

  22. arXiv:2404.08921  [pdf, other

    cs.CV

    PNeRV: Enhancing Spatial Consistency via Pyramidal Neural Representation for Videos

    Authors: Qi Zhao, M. Salman Asif, Zhan Ma

    Abstract: The primary focus of Neural Representation for Videos (NeRV) is to effectively model its spatiotemporal consistency. However, current NeRV systems often face a significant issue of spatial inconsistency, leading to decreased perceptual quality. To address this issue, we introduce the Pyramidal Neural Representation for Videos (PNeRV), which is built on a multi-scale information connection and comp… ▽ More

    Submitted 13 April, 2024; originally announced April 2024.

  23. arXiv:2404.08917  [pdf, other

    cs.CV

    MAProtoNet: A Multi-scale Attentive Interpretable Prototypical Part Network for 3D Magnetic Resonance Imaging Brain Tumor Classification

    Authors: Binghua Li, Jie Mao, Zhe Sun, Chao Li, Qibin Zhao, Toshihisa Tanaka

    Abstract: Automated diagnosis with artificial intelligence has emerged as a promising area in the realm of medical imaging, while the interpretability of the introduced deep neural networks still remains an urgent concern. Although contemporary works, such as XProtoNet and MProtoNet, has sought to design interpretable prediction models for the issue, the localization precision of their resulting attribution… ▽ More

    Submitted 13 April, 2024; originally announced April 2024.

  24. arXiv:2404.06477  [pdf, other

    cs.PL cs.LO

    Mechanised Hypersafety Proofs about Structured Data: Extended Version

    Authors: Vladimir Gladshtein, Qiyuan Zhao, Willow Ahrens, Saman Amarasinghe, Ilya Sergey

    Abstract: Arrays are a fundamental abstraction to represent collections of data. It is often possible to exploit structural properties of the data stored in an array (e.g., repetition or sparsity) to develop a specialised representation optimised for space efficiency. Formally reasoning about correctness of manipulations with such structured data is challenging, as they are often composed of multiple loops… ▽ More

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: Extended version of the paper accepted at PLDI'24

  25. arXiv:2404.05892  [pdf, other

    cs.CL cs.AI

    Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

    Authors: Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Xingjian Du, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao , et al. (3 additional authors not shown)

    Abstract: We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture. Our architectural design advancements include multi-headed matrix-valued states and a dynamic recurrence mechanism that improve expressivity while maintaining the inference efficiency characteristics of RNNs. We introduce a new multilingual corpus with 1.12 trillion tokens and a fast tokeni… ▽ More

    Submitted 10 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  26. arXiv:2404.04421  [pdf, other

    cs.GR cs.CV

    PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations

    Authors: Yang Zheng, Qingqing Zhao, Guandao Yang, Wang Yifan, Donglai Xiang, Florian Dubost, Dmitry Lagun, Thabo Beeler, Federico Tombari, Leonidas Guibas, Gordon Wetzstein

    Abstract: Modeling and rendering photorealistic avatars is of crucial importance in many applications. Existing methods that build a 3D avatar from visual observations, however, struggle to reconstruct clothed humans. We introduce PhysAvatar, a novel framework that combines inverse rendering with inverse physics to automatically estimate the shape and appearance of a human from multi-view video data along w… ▽ More

    Submitted 9 April, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: Project Page: https://qingqing-zhao.github.io/PhysAvatar

  27. arXiv:2404.03741  [pdf, other

    cs.RO

    A High-Fidelity Simulation Framework for Grasping Stability Analysis in Human Casualty Manipulation

    Authors: Qianwen Zhao, Rajarshi Roy, Chad Spurlock, Kevin Lister, Long Wang

    Abstract: Recently, there has been a growing interest in rescue robots due to their vital role in addressing emergency scenarios and providing crucial support in challenging or hazardous situations where human intervention is difficult. However, very few of these robots are capable of actively engaging with humans and undertaking physical manipulation tasks. This limitation is largely attributed to the abse… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 8 pages, revision submitted to IEEE RA-L, under review

  28. arXiv:2404.03162  [pdf, other

    cs.CR

    LTRDetector: Exploring Long-Term Relationship for Advanced Persistent Threats Detection

    Authors: Xiaoxiao Liu, Fan Xu, Nan Wang, Qinxin Zhao, Dalin Zhang, Xibin Zhao, Jiqiang Liu

    Abstract: Advanced Persistent Threat (APT) is challenging to detect due to prolonged duration, infrequent occurrence, and adept concealment techniques. Existing approaches primarily concentrate on the observable traits of attack behaviors, neglecting the intricate relationships formed throughout the persistent attack lifecycle. Thus, we present an innovative APT detection framework named LTRDetector, implem… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

  29. arXiv:2404.03066  [pdf, other

    cs.MA cs.NI math.DS

    Traffic Divergence Theory: An Analysis Formalism for Dynamic Networks

    Authors: Matin Macktoobian, Zhan Shu, Qing Zhao

    Abstract: Traffic dynamics is universally crucial in analyzing and designing almost any network. This article introduces a novel theoretical approach to analyzing network traffic dynamics. This theory's machinery is based on the notion of traffic divergence, which captures the flow (im)balance of network nodes and links. It features various analytical probes to investigate both spatial and temporal traffic… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

    Journal ref: IEEE Access, 2024

  30. arXiv:2404.01754  [pdf, other

    cs.SE cs.AI

    Peer-aided Repairer: Empowering Large Language Models to Repair Advanced Student Assignments

    Authors: Qianhui Zhao, Fang Liu, Li Zhang, Yang Liu, Zhen Yan, Zhenghao Chen, Yufei Zhou, Jing Jiang, Ge Li

    Abstract: Automated generation of feedback on programming assignments holds significant benefits for programming education, especially when it comes to advanced assignments. Automated Program Repair techniques, especially Large Language Model based approaches, have gained notable recognition for their potential to fix introductory assignments. However, the programs used for evaluation are relatively simple.… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: On-going work

  31. arXiv:2403.17712  [pdf, other

    cs.CV

    Invisible Gas Detection: An RGB-Thermal Cross Attention Network and A New Benchmark

    Authors: Jue Wang, Yuxiang Lin, Qi Zhao, Dong Luo, Shuaibao Chen, Wei Chen, Xiaojiang Peng

    Abstract: The widespread use of various chemical gases in industrial processes necessitates effective measures to prevent their leakage during transportation and storage, given their high toxicity. Thermal infrared-based computer vision detection techniques provide a straightforward approach to identify gas leakage areas. However, the development of high-quality algorithms has been challenging due to the lo… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

  32. arXiv:2403.17297  [pdf, other

    cs.CL cs.AI

    InternLM2 Technical Report

    Authors: Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang , et al. (75 additional authors not shown)

    Abstract: The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, long-context m… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

  33. arXiv:2403.16649  [pdf, other

    cs.AI

    CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment

    Authors: Feiteng Fang, Liang Zhu, Min Yang, Xi Feng, Jinchang Hou, Qixuan Zhao, Chengming Li, Xiping Hu, Ruifeng Xu

    Abstract: Reinforcement learning from human feedback (RLHF) is a crucial technique in aligning large language models (LLMs) with human preferences, ensuring these LLMs behave in beneficial and comprehensible ways to users. However, a longstanding challenge in human alignment techniques based on reinforcement learning lies in their inherent complexity and difficulty in training. To address this challenge, we… ▽ More

    Submitted 26 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  34. arXiv:2403.16067  [pdf, other

    cs.CV cs.AI

    Robust Diffusion Models for Adversarial Purification

    Authors: Guang Lin, Zerui Tao, Jianhai Zhang, Toshihisa Tanaka, Qibin Zhao

    Abstract: Diffusion models (DMs) based adversarial purification (AP) has shown to be the most powerful alternative to adversarial training (AT). However, these methods neglect the fact that pre-trained diffusion models themselves are not robust to adversarial attacks as well. Additionally, the diffusion process can easily destroy semantic information and generate a high quality image but totally different f… ▽ More

    Submitted 24 May, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  35. arXiv:2403.15574  [pdf, other

    cs.AI

    SensoryT5: Infusing Sensorimotor Norms into T5 for Enhanced Fine-grained Emotion Classification

    Authors: Yuhan Xia, Qingqing Zhao, Yunfei Long, Ge Xu, Jia Wang

    Abstract: In traditional research approaches, sensory perception and emotion classification have traditionally been considered separate domains. Yet, the significant influence of sensory experiences on emotional responses is undeniable. The natural language processing (NLP) community has often missed the opportunity to merge sensory knowledge with emotion classification. To address this gap, we propose Sens… ▽ More

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Accepted by CogALex 2024 conference

  36. arXiv:2403.11473  [pdf, other

    cs.CL cs.AI

    Word Order's Impacts: Insights from Reordering and Generation Analysis

    Authors: Qinghua Zhao, Jiaang Li, Lei Li, Zenghui Zhou, Junfeng Liu

    Abstract: Existing works have studied the impacts of the order of words within natural text. They usually analyze it by destroying the original order of words to create a scrambled sequence, and then comparing the models' performance between the original and scrambled sequences. The experimental results demonstrate marginal drops. Considering this findings, different hypothesis about word order is proposed,… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

  37. arXiv:2403.11101  [pdf, other

    cs.CV

    Hierarchical Generative Network for Face Morphing Attacks

    Authors: Zuyuan He, Zongyong Deng, Qiaoyun He, Qijun Zhao

    Abstract: Face morphing attacks circumvent face recognition systems (FRSs) by creating a morphed image that contains multiple identities. However, existing face morphing attack methods either sacrifice image quality or compromise the identity preservation capability. Consequently, these attacks fail to bypass FRSs verification well while still managing to deceive human observers. These methods typically rel… ▽ More

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by FG2024

  38. arXiv:2403.10831  [pdf, other

    cs.CV

    DUE: Dynamic Uncertainty-Aware Explanation Supervision via 3D Imputation

    Authors: Qilong Zhao, Yifei Zhang, Mengdan Zhu, Siyi Gu, Yuyang Gao, Xiaofeng Yang, Liang Zhao

    Abstract: Explanation supervision aims to enhance deep learning models by integrating additional signals to guide the generation of model explanations, showcasing notable improvements in both the predictability and explainability of the model. However, the application of explanation supervision to higher-dimensional data, such as 3D medical images, remains an under-explored domain. Challenges associated wit… ▽ More

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: 9 pages,6 figures

  39. arXiv:2403.09037  [pdf, other

    cs.CV cs.CL

    The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?

    Authors: Qinyu Zhao, Ming Xu, Kartik Gupta, Akshay Asthana, Liang Zheng, Stephen Gould

    Abstract: Large vision-language models (LVLMs), designed to interpret and respond to human instructions, occasionally generate hallucinated or harmful content due to inappropriate instructions. This study uses linear probing to shed light on the hidden knowledge at the output layer of LVLMs. We demonstrate that the logit distributions of the first tokens contain sufficient information to determine whether t… ▽ More

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Under review. Project page: https://github.com/Qinyu-Allen-Zhao/LVLM-LP

  40. arXiv:2403.08334  [pdf, other

    cs.CR

    DONAPI: Malicious NPM Packages Detector using Behavior Sequence Knowledge Mapping

    Authors: Cheng Huang, Nannan Wang, Ziyan Wang, Siqi Sun, Lingzi Li, Junren Chen, Qianchong Zhao, Jiaxuan Han, Zhen Yang, Lei Shi

    Abstract: With the growing popularity of modularity in software development comes the rise of package managers and language ecosystems. Among them, npm stands out as the most extensive package manager, hosting more than 2 million third-party open-source packages that greatly simplify the process of building code. However, this openness also brings security risks, as evidenced by numerous package poisoning i… ▽ More

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 18 pages, accepted for publication at USENIX Security 2024

  41. arXiv:2403.06942  [pdf, other

    eess.SY cs.LG stat.ML

    Grid Monitoring and Protection with Continuous Point-on-Wave Measurements and Generative AI

    Authors: Lang Tong, Xinyi Wang, Qing Zhao

    Abstract: Purpose This article presents a case for a next-generation grid monitoring and control system, leveraging recent advances in generative artificial intelligence (AI), machine learning, and statistical inference. Advancing beyond earlier generations of wide-area monitoring systems built upon supervisory control and data acquisition (SCADA) and synchrophasor technologies, we argue for a monitoring an… ▽ More

    Submitted 11 March, 2024; originally announced March 2024.

  42. arXiv:2403.05854  [pdf, other

    cs.CV

    LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content

    Authors: Qihao Zhao, Yalun Dai, Hao Li, Wei Hu, Fan Zhang, Jun Liu

    Abstract: Long-tail recognition is challenging because it requires the model to learn good representations from tail categories and address imbalances across all categories. In this paper, we propose a novel generative and fine-tuning framework, LTGC, to handle long-tail recognition via leveraging generated content. Firstly, inspired by the rich implicit knowledge in large-scale models (e.g., large language… ▽ More

    Submitted 26 May, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

    Comments: CVPR 2024, Oral

  43. arXiv:2403.05808  [pdf, other

    cs.CV eess.IV

    Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution

    Authors: Junxiong Lin, Yan Wang, Zeng Tao, Boyang Wang, Qing Zhao, Haorang Wang, Xuan Tong, Xinji Mai, Yuxuan Lin, Wei Song, Jiawen Yu, Shaoqi Yan, Wenqiang Zhang

    Abstract: Pre-trained diffusion models utilized for image generation encapsulate a substantial reservoir of a priori knowledge pertaining to intricate textures. Harnessing the potential of leveraging this a priori knowledge in the context of image super-resolution presents a compelling avenue. Nonetheless, prevailing diffusion-based methodologies presently overlook the constraints imposed by degradation inf… ▽ More

    Submitted 9 March, 2024; originally announced March 2024.

  44. arXiv:2403.05743  [pdf, ps, other

    eess.SP cs.LG econ.GN

    Forecasting Electricity Market Signals via Generative AI

    Authors: Xinyi Wang, Qing Zhao, Lang Tong

    Abstract: This paper presents a generative artificial intelligence approach to probabilistic forecasting of electricity market signals, such as real-time locational marginal prices and area control error signals. Inspired by the Wiener-Kallianpur innovation representation of nonparametric time series, we propose a weak innovation autoencoder architecture and a novel deep learning algorithm that extracts the… ▽ More

    Submitted 24 April, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Journal ref: Under review for IEEE Transactions on Power Systems, March 2024

  45. arXiv:2403.04299  [pdf, other

    cs.RO cs.AI

    LitSim: A Conflict-aware Policy for Long-term Interactive Traffic Simulation

    Authors: Haojie Xin, Xiaodong Zhang, Renzhi Tang, Songyang Yan, Qianrui Zhao, Chunze Yang, Wen Cui, Zijiang Yang

    Abstract: Simulation is pivotal in evaluating the performance of autonomous driving systems due to the advantages of high efficiency and low cost compared to on-road testing. Bridging the gap between simulation and the real world requires realistic agent behaviors. However, the existing works have the following shortcomings in achieving this goal: (1) log replay offers realistic scenarios but often leads to… ▽ More

    Submitted 1 May, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: 9 pages, 6 figures, under review

  46. arXiv:2403.04294  [pdf, other

    cs.CV

    A$^{3}$lign-DFER: Pioneering Comprehensive Dynamic Affective Alignment for Dynamic Facial Expression Recognition with CLIP

    Authors: Zeng Tao, Yan Wang, Junxiong Lin, Haoran Wang, Xinji Mai, Jiawen Yu, Xuan Tong, Ziheng Zhou, Shaoqi Yan, Qing Zhao, Liyuan Han, Wenqiang Zhang

    Abstract: The performance of CLIP in dynamic facial expression recognition (DFER) task doesn't yield exceptional results as observed in other CLIP-based classification tasks. While CLIP's primary objective is to achieve alignment between images and text in the feature space, DFER poses challenges due to the abstract nature of text and the dynamic nature of video, making label representation limited and perf… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

  47. arXiv:2403.02871  [pdf, other

    quant-ph cs.LG

    Quantum Mixed-State Self-Attention Network

    Authors: Fu Chen, Qinglin Zhao, Li Feng, Chuangtao Chen, Yangbin Lin, Jianhong Lin

    Abstract: The rapid advancement of quantum computing has increasingly highlighted its potential in the realm of machine learning, particularly in the context of natural language processing (NLP) tasks. Quantum machine learning (QML) leverages the unique capabilities of quantum computing to offer novel perspectives and methodologies for complex data processing and pattern recognition challenges. This paper i… ▽ More

    Submitted 5 March, 2024; originally announced March 2024.

  48. arXiv:2403.01968  [pdf, other

    cs.CV

    Explicit Motion Handling and Interactive Prompting for Video Camouflaged Object Detection

    Authors: Xin Zhang, Tao Xiao, Gepeng Ji, Xuan Wu, Keren Fu, Qijun Zhao

    Abstract: Camouflage poses challenges in distinguishing a static target, whereas any movement of the target can break this disguise. Existing video camouflaged object detection (VCOD) approaches take noisy motion estimation as input or model motion implicitly, restricting detection performance in complex dynamic scenes. In this paper, we propose a novel Explicit Motion handling and Interactive Prompting fra… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 9 pages, 6 figures

  49. arXiv:2403.00876  [pdf, other

    cs.CL cs.AI

    Word Order and World Knowledge

    Authors: Qinghua Zhao, Vinit Ravishankar, Nicolas Garneau, Anders Søgaard

    Abstract: Word order is an important concept in natural language, and in this work, we study how word order affects the induction of world knowledge from raw text using language models. We use word analogies to probe for such knowledge. Specifically, in addition to the natural word order, we first respectively extract texts of six fixed word orders from five languages and then pretrain the language models o… ▽ More

    Submitted 1 March, 2024; originally announced March 2024.

  50. arXiv:2403.00510  [pdf, other

    cs.CL cs.AI

    ROME: Memorization Insights from Text, Probability and Hidden State in Large Language Models

    Authors: Bo Li, Qinghua Zhao, Lijie Wen

    Abstract: Probing the memorization of large language models holds significant importance. Previous works have established metrics for quantifying memorization, explored various influencing factors, such as data duplication, model size, and prompt length, and evaluated memorization by comparing model outputs with training corpora. However, the training corpora are of enormous scale and its pre-processing is… ▽ More

    Submitted 4 March, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: Submitted to ACL, 2024