Showing 1–50 of 405 results for author: Peng, H

  1. arXiv:2405.05008  [pdf, other]

    cs.CL

    ADELIE: Aligning Large Language Models on Information Extraction

    Authors: Yunjia Qi, Hao Peng, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li

    Abstract: Large language models (LLMs) usually fall short on information extraction (IE) tasks and struggle to follow the complex instructions of IE tasks. This primarily arises from LLMs not being aligned with humans, as mainstream alignment datasets typically do not include IE data. In this paper, we introduce ADELIE (Aligning large language moDELs on Information Extraction), an aligned LLM that effective…

    Submitted 8 May, 2024; originally announced May 2024.

  2. arXiv:2405.03188  [pdf, other]

    cs.LG

    Hyperbolic Geometric Latent Diffusion Model for Graph Generation

    Authors: Xingcheng Fu, Yisen Gao, Yuecen Wei, Qingyun Sun, Hao Peng, Jianxin Li, Xianxian Li

    Abstract: Diffusion models have made significant contributions to computer vision, sparking a growing interest in the community recently regarding the application of them to graph generation. Existing discrete graph diffusion models exhibit heightened computational complexity and diminished training efficiency. A preferable and natural way is to directly diffuse the graph within the latent space. However, d…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted by the 41st International Conference on Machine Learning (ICML 2024)

  3. arXiv:2405.02686  [pdf, other]

    cs.CV cs.AI

    Boosting 3D Neuron Segmentation with 2D Vision Transformer Pre-trained on Natural Images

    Authors: Yik San Cheng, Runkai Zhao, Heng Wang, Hanchuan Peng, Weidong Cai

    Abstract: Neuron reconstruction, one of the fundamental tasks in neuroscience, rebuilds neuronal morphology from 3D light microscope imaging data. It plays a critical role in analyzing the structure-function relationship of neurons in the nervous system. However, due to the scarcity of neuron datasets and high-quality SWC annotations, it is still challenging to develop robust segmentation methods for single…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 3 pages

  4. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  5. arXiv:2404.15574  [pdf, other]

    cs.CL

    Retrieval Head Mechanistically Explains Long-Context Factuality

    Authors: Wenhao Wu, Yizhong Wang, Guangxuan Xiao, Hao Peng, Yao Fu

    Abstract: Despite the recent progress in long-context language models, it remains elusive how transformer-based models exhibit the capability to retrieve relevant information from arbitrary locations within the long context. This paper aims to address this question. Our systematic investigation across a wide spectrum of models reveals that a special type of attention heads are largely responsible for retrie…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Preprint

  6. arXiv:2404.15381  [pdf, other]

    cs.LG cs.AI

    Advances and Open Challenges in Federated Learning with Foundation Models

    Authors: Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Anran Li, Yulan Gao, Alysa Ziying Tan, Bo Zhao, Xiaoxiao Li, Zengxiang Li, Qiang Yang

    Abstract: The integration of Foundation Models (FMs) with Federated Learning (FL) presents a transformative paradigm in Artificial Intelligence (AI), offering enhanced capabilities while addressing concerns of privacy, data decentralization, and computational efficiency. This paper provides a comprehensive survey of the emerging field of Federated Foundation Models (FedFM), elucidating their synergistic rel…

    Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Survey of Federated Foundation Models (FedFM)

  7. arXiv:2404.15070  [pdf, other]

    cs.SI cs.AI

    BotDGT: Dynamicity-aware Social Bot Detection with Dynamic Graph Transformers

    Authors: Buyun He, Yingguang Yang, Qi Wu, Hao Liu, Renyu Yang, Hao Peng, Xiang Wang, Yong Liao, Pengyuan Zhou

    Abstract: Detecting social bots has evolved into a pivotal yet intricate task, aimed at combating the dissemination of misinformation and preserving the authenticity of online interactions. While earlier graph-based approaches, which leverage topological structure of social networks, yielded notable outcomes, they overlooked the inherent dynamicity of social networks. In reality, they largely depicted the…

    Submitted 24 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: IJCAI 2024

  8. Unsupervised Social Bot Detection via Structural Information Theory

    Authors: Hao Peng, Jingyun Zhang, Xiang Huang, Zhifeng Hao, Angsheng Li, Zhengtao Yu, Philip S. Yu

    Abstract: Research on social bot detection plays a crucial role in maintaining the order and reliability of information dissemination while increasing trust in social interactions. The current mainstream social bot detection models rely on black-box neural network technology, e.g., Graph Neural Network, Transformer, etc., which lacks interpretability. In this work, we present UnDBot, a novel unsupervised, i…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 42 pages, 12 figures, accepted for publication in Transactions on Information Systems

  9. arXiv:2404.12852  [pdf, other]

    cs.CR cs.CV cs.LG

    LSP Framework: A Compensatory Model for Defeating Trigger Reverse Engineering via Label Smoothing Poisoning

    Authors: Beichen Li, Yuanfang Guo, Heqi Peng, Yangxi Li, Yunhong Wang

    Abstract: Deep neural networks are vulnerable to backdoor attacks. Among the existing backdoor defense methods, trigger reverse engineering based approaches, which reconstruct the backdoor triggers via optimizations, are the most versatile and effective ones compared to other types of methods. In this paper, we summarize and construct a generic paradigm for the typical trigger reverse engineering process. B…

    Submitted 19 April, 2024; originally announced April 2024.

  10. arXiv:2404.12635  [pdf, other]

    cs.CV cs.CR cs.LG

    AED-PADA: Improving Generalizability of Adversarial Example Detection via Principal Adversarial Domain Adaptation

    Authors: Heqi Peng, Yunhong Wang, Ruijie Yang, Beichen Li, Rui Wang, Yuanfang Guo

    Abstract: Adversarial example detection, which can be conveniently applied in many scenarios, is important in the area of adversarial defense. Unfortunately, existing detection methods suffer from poor generalization performance, because their training process usually relies on the examples generated from a single known adversarial attack and there exists a large discrepancy between the training and unseen…

    Submitted 19 April, 2024; originally announced April 2024.

  11. arXiv:2404.09760  [pdf, other]

    cs.LG cs.AI

    Effective Reinforcement Learning Based on Structural Information Principles

    Authors: Xianghua Zeng, Hao Peng, Dingli Su, Angsheng Li

    Abstract: Although Reinforcement Learning (RL) algorithms acquire sequential behavioral patterns through interactions with the environment, their effectiveness in noisy and high-dimensional scenarios typically relies on specific structural priors. In this paper, we propose a novel and general Structural Information principles-based framework for effective Decision-Making, namely SIDM, approached from an inf…

    Submitted 15 April, 2024; originally announced April 2024.

  12. arXiv:2404.09285  [pdf, other]

    cs.DC

    Egret: Reinforcement Mechanism for Sequential Computation Offloading in Edge Computing

    Authors: Haosong Peng, Yufeng Zhan, DiHua Zhai, Xiaopu Zhang, Yuanqing Xia

    Abstract: As an emerging computing paradigm, edge computing offers computing resources closer to the data sources, helping to improve the service quality of many real-time applications. A crucial problem is designing a rational pricing mechanism to maximize the revenue of the edge computing service provider (ECSP). However, prior works have considerable limitations: clients are static and are required to di…

    Submitted 29 April, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE TSC

  13. arXiv:2404.09267  [pdf, other]

    cs.DC

    Tangram: High-resolution Video Analytics on Serverless Platform with SLO-aware Batching

    Authors: Haosong Peng, Yufeng Zhan, Peng Li, Yuanqing Xia

    Abstract: Cloud-edge collaborative computing paradigm is a promising solution to high-resolution video analytics systems. The key lies in reducing redundant data and managing fluctuating inference workloads effectively. Previous work has focused on extracting regions of interest (RoIs) from videos and transmitting them to the cloud for processing. However, a naive Infrastructure as a Service (IaaS) resource…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE International Conference on Distributed Computing Systems (ICDCS) 2024

  14. arXiv:2404.09245  [pdf, other]

    cs.MM cs.CV

    Arena: A Patch-of-Interest ViT Inference Acceleration System for Edge-Assisted Video Analytics

    Authors: Haosong Peng, Wei Feng, Hao Li, Yufeng Zhan, Qihua Zhou, Yuanqing Xia

    Abstract: The advent of edge computing has made real-time intelligent video analytics feasible. Previous works, based on traditional model architecture (e.g., CNN, RNN, etc.), employ various strategies to filter out non-region-of-interest content to minimize bandwidth and computation consumption but show inferior performance in adverse environments. Recently, visual foundation models based on transformers h…

    Submitted 14 April, 2024; originally announced April 2024.

  15. arXiv:2404.08263  [pdf, other]

    cs.CL cs.AI cs.LG cs.SI

    Relational Prompt-based Pre-trained Language Models for Social Event Detection

    Authors: Pu Li, Xiaoyan Yu, Hao Peng, Yantuan Xian, Linqin Wang, Li Sun, Jingyun Zhang, Philip S. Yu

    Abstract: Social Event Detection (SED) aims to identify significant events from social streams, and has a wide application ranging from public opinion analysis to risk management. In recent years, Graph Neural Network (GNN) based solutions have achieved state-of-the-art performance. However, GNN-based methods often struggle with noisy and missing edges between messages, affecting the quality of learned mess…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: ACM TOIS Under Review

  16. arXiv:2404.02078  [pdf, other]

    cs.AI cs.CL cs.LG

    Advancing LLM Reasoning Generalists with Preference Trees

    Authors: Lifan Yuan, Ganqu Cui, Hanbin Wang, Ning Ding, Xingyao Wang, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, Bowen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun

    Abstract: We introduce Eurus, a suite of large language models (LLMs) optimized for reasoning. Finetuned from Mistral-7B and CodeLlama-70B, Eurus models achieve state-of-the-art results among open-source models on a diverse set of benchmarks covering mathematics, code generation, and logical reasoning problems. Notably, Eurus-70B beats GPT-3.5 Turbo in reasoning through a comprehensive benchmarking across 1…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Models and data are available at https://github.com/OpenBMB/Eurus

  17. arXiv:2404.01019  [pdf, other]

    cs.CL cs.AI

    Source-Aware Training Enables Knowledge Attribution in Language Models

    Authors: Muhammad Khalifa, David Wadden, Emma Strubell, Honglak Lee, Lu Wang, Iz Beltagy, Hao Peng

    Abstract: Large language models (LLMs) learn a vast amount of knowledge during pretraining, but they are often oblivious to the source(s) of such knowledge. We investigate the problem of intrinsic source citation, where LLMs are required to cite the pretraining source supporting a generated response. Intrinsic source citation can enhance LLM transparency, interpretability, and verifiability. To give LLMs su…

    Submitted 11 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  18. arXiv:2403.19063  [pdf, other]

    cs.IR

    Instruction-based Hypergraph Pretraining

    Authors: Mingdai Yang, Zhiwei Liu, Liangwei Yang, Xiaolong Liu, Chen Wang, Hao Peng, Philip S. Yu

    Abstract: Pretraining has been widely explored to augment the adaptability of graph learning models to transfer knowledge from large datasets to a downstream task, such as link prediction or classification. However, the gap between training objectives and the discrepancy between data distributions in pretraining and downstream tasks hinders the transfer of the pretrained knowledge. Inspired by instruction-b…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted by SIGIR'24

  19. SolderlessPCB: Reusing Electronic Components in PCB Prototyping through Detachable 3D Printed Housings

    Authors: Zeyu Yan, Jiasheng Li, Zining Zhang, Huaishu Peng

    Abstract: The iterative prototyping process for printed circuit boards (PCBs) frequently employs surface-mounted device (SMD) components, which are often discarded rather than reused due to the challenges associated with desoldering, leading to unnecessary electronic waste. This paper introduces SolderlessPCB, a collection of techniques for solder-free PCB prototyping, specifically designed to promote the r…

    Submitted 27 March, 2024; originally announced March 2024.

    Journal ref: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems

  20. arXiv:2403.18540  [pdf, other]

    stat.ML cs.LG stat.CO

    skscope: Fast Sparsity-Constrained Optimization in Python

    Authors: Zezhi Wang, Jin Zhu, Peng Chen, Huiyang Peng, Xiaoke Zhang, Anran Wang, Yu Zheng, Junxian Zhu, Xueqin Wang

    Abstract: Applying iterative solvers on sparsity-constrained optimization (SCO) requires tedious mathematical deduction and careful programming/debugging that hinders these solvers' broad impact. In the paper, the library skscope is introduced to overcome such an obstacle. With skscope, users can solve the SCO by just programming the objective function. The convenience of skscope is demonstrated through two…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 4 pages

  21. arXiv:2403.14733  [pdf]

    cs.AI cs.CL cs.LG

    Open Knowledge Base Canonicalization with Multi-task Learning

    Authors: Bingchen Liu, Huang Peng, Weixin Zeng, Xiang Zhao, Shijun Liu, Li Pan

    Abstract: The construction of large open knowledge bases (OKBs) is integral to many knowledge-driven applications on the world wide web such as web search. However, noun phrases and relational phrases in OKBs often suffer from redundancy and ambiguity, which calls for the investigation on OKB canonicalization. Current solutions address OKB canonicalization by devising advanced clustering algorithms and usin…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2310.16419

  22. arXiv:2403.10361  [pdf, other]

    cs.CR

    Unveiling Wash Trading in Popular NFT Markets

    Authors: Yuanzheng Niu, Xiaoqi Li, Hongli Peng, Wenkai Li

    Abstract: As emerging digital assets, NFTs are susceptible to anomalous trading behaviors due to the lack of stringent regulatory mechanisms, potentially causing economic losses. In this paper, we conduct the first systematic analysis of four non-fungible tokens (NFT) markets. Specifically, we analyze more than 25 million transactions within these markets, to explore the evolution of wash trade activities.…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: This paper has been accepted by WWW 2024

  23. arXiv:2403.04706  [pdf, other]

    cs.CL cs.AI

    Common 7B Language Models Already Possess Strong Math Capabilities

    Authors: Chen Li, Weiqi Wang, Jingcheng Hu, Yixuan Wei, Nanning Zheng, Han Hu, Zheng Zhang, Houwen Peng

    Abstract: Mathematical capabilities were previously believed to emerge in common language models only at a very large scale or require extensive math-related pre-training. This paper shows that the LLaMA-2 7B model with common pre-training already exhibits strong mathematical abilities, as evidenced by its impressive accuracy of 97.7% and 72.0% on the GSM8K and MATH benchmarks, respectively, when selecting…

    Submitted 7 March, 2024; originally announced March 2024.

  24. arXiv:2403.04175  [pdf]

    physics.med-ph cs.AI

    Understanding the PULSAR Effect in Combined Radiotherapy and Immunotherapy through Attention Mechanisms with a Transformer Model

    Authors: Hao Peng, Casey Moore, Debabrata Saha, Steve Jiang, Robert Timmerman

    Abstract: PULSAR (personalized, ultra-fractionated stereotactic adaptive radiotherapy) is the adaptation of stereotactic ablative radiotherapy towards personalized cancer management. For the first time, we applied a transformer-based attention mechanism to investigate the underlying interactions between combined PULSAR and PD-L1 blockade immunotherapy based on a murine cancer model (Lewis Lung Carcinoma, LL…

    Submitted 6 March, 2024; originally announced March 2024.

  25. arXiv:2402.17214  [pdf, other]

    cs.CV

    CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization

    Authors: Hao-Yang Peng, Jia-Peng Zhang, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu

    Abstract: In the field of digital content creation, generating high-quality 3D characters from single images is challenging, especially given the complexities of various body poses and the issues of self-occlusion and pose ambiguity. In this paper, we present CharacterGen, a framework developed to efficiently generate 3D characters. CharacterGen introduces a streamlined generation pipeline along with an ima…

    Submitted 28 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  26. arXiv:2402.16294  [pdf, other]

    cs.CR cs.AI

    Decentralized Federated Unlearning on Blockchain

    Authors: Xiao Liu, Mingyuan Li, Xu Wang, Guangsheng Yu, Wei Ni, Lixiang Li, Haipeng Peng, Renping Liu

    Abstract: Blockchained Federated Learning (FL) has been gaining traction for ensuring the integrity and traceability of FL processes. Blockchained FL involves participants training models locally with their data and subsequently publishing the models on the blockchain, forming a Directed Acyclic Graph (DAG)-like inheritance structure that represents the model relationship. However, this particular DAG-based…

    Submitted 25 February, 2024; originally announced February 2024.

  27. arXiv:2402.13717  [pdf, other]

    cs.CL

    Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

    Authors: Xiaoyan Yu, Tongxu Luo, Yifan Wei, Fangyu Lei, Yiming Huang, Hao Peng, Liehuang Zhu

    Abstract: Large Language Models (LLMs) have revolutionized open-domain dialogue agents but encounter challenges in multi-character role-playing (MCRP) scenarios. To address the issue, we present Neeko, an innovative framework designed for efficient multiple characters imitation. Unlike existing methods, Neeko employs a dynamic low-rank adapter (LoRA) strategy, enabling it to adapt seamlessly to diverse char…

    Submitted 1 March, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  28. arXiv:2402.13093  [pdf, other]

    cs.CL cs.AI

    Event-level Knowledge Editing

    Authors: Hao Peng, Xiaozhi Wang, Chunyang Li, Kaisheng Zeng, Jiangshan Duo, Yixin Cao, Lei Hou, Juanzi Li

    Abstract: Knowledge editing aims at updating knowledge of large language models (LLMs) to prevent them from becoming outdated. Existing work edits LLMs at the level of factual knowledge triplets. However, natural knowledge updates in the real world come from the occurrences of new events rather than direct changes in factual triplets. In this paper, we propose a new task setting: event-level knowledge editi…

    Submitted 21 April, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: 18 pages, 2 figures

  29. arXiv:2402.10171  [pdf, other]

    cs.CL cs.AI

    Data Engineering for Scaling Language Models to 128K Context

    Authors: Yao Fu, Rameswar Panda, Xinyao Niu, Xiang Yue, Hannaneh Hajishirzi, Yoon Kim, Hao Peng

    Abstract: We study the continual pretraining recipe for scaling language models' context lengths to 128K, with a focus on data engineering. We hypothesize that long context modeling, in particular the ability to utilize information at arbitrary input locations, is a capability that is mostly already acquired through large-scale pretraining, and that this capability can be readily extended to contex…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: Code at https://github.com/FranxYao/Long-Context-Data-Engineering

  30. arXiv:2402.09747  [pdf, other]

    eess.IV cs.CV cs.LG

    Less is more: Ensemble Learning for Retinal Disease Recognition Under Limited Resources

    Authors: Jiahao Wang, Hong Peng, Shengchao Chen, Sufen Ren

    Abstract: Retinal optical coherence tomography (OCT) images provide crucial insights into the health of the posterior ocular segment. Therefore, the advancement of automated image analysis methods is imperative to equip clinicians and researchers with quantitative data, thereby facilitating informed decision-making. The application of deep learning (DL)-based approaches has gained extensive traction for exe…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: Ongoing work

  31. arXiv:2402.02769  [pdf, other]

    cs.LG cs.AI

    Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate

    Authors: Can Jin, Tong Che, Hongwu Peng, Yiyuan Li, Marco Pavone

    Abstract: Generalization remains a central challenge in machine learning. In this work, we propose Learning from Teaching (LoT), a novel regularization technique for deep neural networks to enhance generalization. Inspired by the human ability to capture concise and abstract patterns, we hypothesize that generalizable correlations are expected to be easier to teach. LoT operationalizes this concept to impro…

    Submitted 5 February, 2024; originally announced February 2024.

  32. arXiv:2402.01030  [pdf, other]

    cs.CL cs.AI

    Executable Code Actions Elicit Better LLM Agents

    Authors: Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji

    Abstract: Large Language Model (LLM) agents, capable of performing a broad range of actions, such as invoking tools and controlling robots, show great potential in tackling real-world challenges. LLM agents are typically prompted to produce actions by generating JSON or text in a pre-defined format, which is usually limited by constrained action space (e.g., the scope of pre-defined tools) and restricted fl…

    Submitted 18 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Code, data, model, and demo are available at https://github.com/xingyaoww/code-act

  33. arXiv:2401.14024  [pdf]

    cs.CV

    PLCNet: Patch-wise Lane Correction Network for Automatic Lane Correction in High-definition Maps

    Authors: Haiyang Peng, Yi Zhan, Benkang Wang, Hongtao Zhang

    Abstract: In High-definition (HD) maps, lane elements constitute the majority of components and demand stringent localization requirements to ensure safe vehicle navigation. Vision lane detection with LiDAR position assignment is a prevalent method to acquire initial lanes for HD maps. However, due to incorrect vision detection and coarse camera-LiDAR calibration, initial lanes may deviate from their true p…

    Submitted 25 January, 2024; originally announced January 2024.

  34. arXiv:2401.12780  [pdf, other]

    cs.LG

    DeepRicci: Self-supervised Graph Structure-Feature Co-Refinement for Alleviating Over-squashing

    Authors: Li Sun, Zhenhao Huang, Hua Wu, Junda Ye, Hao Peng, Zhengtao Yu, Philip S. Yu

    Abstract: Graph Neural Networks (GNNs) have shown great power for learning and mining on graphs, and Graph Structure Learning (GSL) plays an important role in boosting GNNs with a refined graph. In the literature, most GSL solutions either primarily focus on structure refinement with task-specific supervision (i.e., node classification), or overlook the inherent weakness of GNNs themselves (e.g., over-squas…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted by IEEE ICDM 2023, Full paper, 10 pages

  35. arXiv:2401.11664  [pdf, other]

    cs.LG cs.AI cs.AR

    Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM

    Authors: Bingbing Li, Geng Yuan, Zigeng Wang, Shaoyi Huang, Hongwu Peng, Payman Behnam, Wujie Wen, Hang Liu, Caiwen Ding

    Abstract: Resistive Random Access Memory (ReRAM) has emerged as a promising platform for deep neural networks (DNNs) due to its support for parallel in-situ matrix-vector multiplication. However, hardware failures, such as stuck-at-fault defects, can result in significant prediction errors during model inference. While additional crossbars can be used to address these failures, they come with storage overhe…

    Submitted 21 January, 2024; originally announced January 2024.

  36. arXiv:2401.10774  [pdf, other]

    cs.LG cs.CL

    Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

    Authors: Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, Deming Chen, Tri Dao

    Abstract: The inference process in Large Language Models (LLMs) is often limited due to the absence of parallelism in the auto-regressive decoding process, resulting in most operations being restricted by the memory bandwidth of accelerators. While methods such as speculative decoding have been suggested to address this issue, their implementation is impeded by the challenges associated with acquiring and m…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: The code for this implementation is available at https://github.com/FasterDecoding/Medusa

  37. arXiv:2401.05806  [pdf, other]

    cs.CV

    CLIP-Driven Semantic Discovery Network for Visible-Infrared Person Re-Identification

    Authors: Xiaoyan Yu, Neng Dong, Liehuang Zhu, Hao Peng, Dapeng Tao

    Abstract: Visible-infrared person re-identification (VIReID) primarily deals with matching identities across person images from different modalities. Due to the modality gap between visible and infrared images, cross-modality identity matching poses significant challenges. Recognizing that high-level semantics of pedestrian appearance, such as gender, shape, and clothing style, remain consistent across moda…

    Submitted 12 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  38. arXiv:2401.01232  [pdf, other]

    cs.LG

    Motif-aware Riemannian Graph Neural Network with Generative-Contrastive Learning

    Authors: Li Sun, Zhenhao Huang, Zixi Wang, Feiyang Wang, Hao Peng, Philip Yu

    Abstract: Graphs are typical non-Euclidean data of complex structures. In recent years, Riemannian graph representation learning has emerged as an exciting alternative to Euclidean ones. However, Riemannian methods are still in an early stage: most of them present a single curvature (radius) regardless of structural complexity, suffer from numerical instability due to the exponential/logarithmic map, and la…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI24

  39. arXiv:2401.00343  [pdf, other]

    cs.CV

    SHARE: Single-view Human Adversarial REconstruction

    Authors: Shreelekha Revankar, Shijia Liao, Yu Shen, Junbang Liang, Huaishu Peng, Ming Lin

    Abstract: The accuracy of 3D Human Pose and Shape reconstruction (HPS) from an image is progressively improving. Yet, no known method is robust across all image distortion. To address issues due to variations of camera poses, we introduce SHARE, a novel fine-tuning method that utilizes adversarial data augmentation to enhance the robustness of existing HPS techniques. We perform a comprehensive analysis on…

    Submitted 30 December, 2023; originally announced January 2024.

  40. arXiv:2312.15946  [pdf, other]

    cs.SD cs.GR eess.AS

    EnchantDance: Unveiling the Potential of Music-Driven Dance Movement

    Authors: Bo Han, Yi Ren, Hao Peng, Teng Zhang, Zeyu Ling, Xiang Yin, Feilin Han

    Abstract: The task of music-driven dance generation involves creating coherent dance movements that correspond to the given music. While existing methods can produce physically plausible dances, they often struggle to generalize to out-of-set data. The challenge arises from three aspects: 1) the high diversity of dance movements and significant differences in the distribution of music modalities, which make…

    Submitted 26 December, 2023; originally announced December 2023.

  41. arXiv:2312.12804  [pdf, other]

    cs.CV

    Multi-stages attention Breast cancer classification based on nonlinear spiking neural P neurons with autapses

    Authors: Bo Yang, Hong Peng, Xiaohui Luo, Jun Wang

    Abstract: Breast cancer (BC) is a prevalent type of malignant tumor in women. Early diagnosis and treatment are vital for enhancing the patients' survival rate. Downsampling in deep networks may lead to loss of information, so for compensating the detail and edge information and allowing convolutional neural networks to pay more attention to seek the lesion region, we propose a multi-stages attention archite…

    Submitted 4 January, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  42. arXiv:2312.12789  [pdf, other]

    eess.IV cs.CV cs.LG

    SLP-Net: An efficient lightweight network for segmentation of skin lesions

    Authors: Bo Yang, Hong Peng, Chenggang Guo, Xiaohui Luo, Jun Wang, Xianzhong Long

    Abstract: Prompt treatment for melanoma is crucial. To assist physicians in identifying lesion areas precisely in a quick manner, we propose a novel skin lesion segmentation technique namely SLP-Net, an ultra-lightweight segmentation network based on the spiking neural P (SNP) systems type mechanism. Most existing convolutional neural networks achieve high segmentation accuracy while neglecting the high hard…

    Submitted 4 January, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  43. arXiv:2312.12183  [pdf, other]

    cs.LG cs.CR

    Poincaré Differential Privacy for Hierarchy-Aware Graph Embedding

    Authors: Yuecen Wei, Haonan Yuan, Xingcheng Fu, Qingyun Sun, Hao Peng, Xianxian Li, Chunming Hu

    Abstract: Hierarchy is an important and commonly observed topological property in real-world graphs that indicates the relationships between supervisors and subordinates or the organizational behavior of human groups. As hierarchy is introduced as a new inductive bias into the Graph Neural Networks (GNNs) in various tasks, it implies latent topological relations for attackers to improve their inference attac…

    Submitted 29 February, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted by the Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24)

  44. arXiv:2312.11891  [pdf, other]

    cs.SI cs.LG

    Hierarchical and Incremental Structural Entropy Minimization for Unsupervised Social Event Detection

    Authors: Yuwei Cao, Hao Peng, Zhengtao Yu, Philip S. Yu

    Abstract: As a trending approach for social event detection, graph neural network (GNN)-based methods enable a fusion of natural language semantics and the complex social network structural information, thus showing SOTA performance. However, GNN-based methods can miss useful message correlations. Moreover, they require manual labeling for training and predetermining the number of events for prediction. In…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024

  45. arXiv:2312.11152  [pdf, other]

    cs.CL cs.AI

    Prompt Based Tri-Channel Graph Convolution Neural Network for Aspect Sentiment Triplet Extraction

    Authors: Kun Peng, Lei Jiang, Hao Peng, Rui Liu, Zhengtao Yu, Jiaqian Ren, Zhifeng Hao, Philip S. Yu

    Abstract: Aspect Sentiment Triplet Extraction (ASTE) is an emerging task to extract a given sentence's triplets, which consist of aspects, opinions, and sentiments. Recent studies tend to address this task with a table-filling paradigm, wherein word relations are encoded in a two-dimensional table, and the process involves clarifying all the individual cells to extract triples. However, these studies ignore…

    Submitted 24 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted in SIAM International Conference on Data Mining (SDM24)

  46. arXiv:2312.10917  [pdf, other]

    cs.LG

    Semi-Supervised Clustering via Structural Entropy with Different Constraints

    Authors: Guangjie Zeng, Hao Peng, Angsheng Li, Zhiwei Liu, Runze Yang, Chunyang Liu, Lifang He

    Abstract: Semi-supervised clustering techniques have emerged as valuable tools for leveraging prior information in the form of constraints to improve the quality of clustering outcomes. Despite the proliferation of such methods, the ability to seamlessly integrate various types of constraints remains limited. While structural entropy has proven to be a powerful clustering approach with wide-ranging applicat…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: 9 pages, 3 figures, accepted by SDM 2024

  47. arXiv:2312.10253  [pdf, other]

    cs.CL

    Catwalk: A Unified Language Model Evaluation Framework for Many Datasets

    Authors: Dirk Groeneveld, Anas Awadalla, Iz Beltagy, Akshita Bhagia, Ian Magnusson, Hao Peng, Oyvind Tafjord, Pete Walsh, Kyle Richardson, Jesse Dodge

    Abstract: The success of large language models has shifted the evaluation paradigms in natural language processing (NLP). The community's interest has drifted towards comparing NLP models across many tasks, domains, and datasets, often at an extreme scale. This imposes new engineering challenges: efforts in constructing datasets and models have been fragmented, and their formats and interfaces are incompati…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: technical report, work in progress

  48. arXiv:2312.08656  [pdf, other]

    cs.LG cs.AI cs.DC

    MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training

    Authors: Hongwu Peng, Xi Xie, Kaustubh Shivdikar, MD Amit Hasan, Jiahui Zhao, Shaoyi Huang, Omer Khan, David Kaeli, Caiwen Ding

    Abstract: In the acceleration of deep neural network training, the GPU has become the mainstream platform. GPUs face substantial challenges on GNNs, such as workload imbalance and memory access irregularities, leading to underutilized hardware. Existing solutions such as PyG, DGL with cuSPARSE, and GNNAdvisor frameworks partially address these challenges but memory traffic is still significant. We argue t…

    Submitted 18 March, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: ASPLOS 2024 accepted publication

    ACM Class: I.2; C.5

  49. arXiv:2312.08098  [pdf, other]

    cs.SI cs.AI

    Adversarial Socialbots Modeling Based on Structural Information Principles

    Authors: Xianghua Zeng, Hao Peng, Angsheng Li

    Abstract: The importance of effective detection is underscored by the fact that socialbots imitate human behavior to propagate misinformation, leading to an ongoing competition between socialbots and detectors. Despite the rapid advancement of reactive detectors, the exploration of adversarial socialbot modeling remains incomplete, significantly hindering the development of proactive detectors. To address t…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: 9 pages, 5 figures, 2 tables

  50. arXiv:2312.01022  [pdf, other]

    cs.LG

    Advanced Large Language Model (LLM)-Driven Verilog Development: Enhancing Power, Performance, and Area Optimization in Code Synthesis

    Authors: Kiran Thorat, Jiahui Zhao, Yaotian Liu, Hongwu Peng, Xi Xie, Bin Lei, Jeff Zhang, Caiwen Ding

    Abstract: The increasing use of Advanced Language Models (ALMs) in diverse sectors, particularly due to their impressive capability to generate top-tier content following linguistic instructions, forms the core of this investigation. This study probes into ALMs' deployment in electronic hardware design, with a specific emphasis on the synthesis and enhancement of Verilog programming. We introduce an innovat…

    Submitted 9 January, 2024; v1 submitted 1 December, 2023; originally announced December 2023.