Skip to main content

Showing 1–50 of 378 results for author: Pan, Z

Searching in archive cs. Search in all archives.
.
  1. Node Injection Attack Based on Label Propagation Against Graph Neural Network

    Authors: Peican Zhu, Zechen Pan, Keke Tang, Xiaodong Cui, Jinhuan Wang, Qi Xuan

    Abstract: Graph Neural Network (GNN) has achieved remarkable success in various graph learning tasks, such as node classification, link prediction and graph classification. The key to the success of GNN lies in its effective structure information representation through neighboring aggregation. However, the attacker can easily perturb the aggregation process through injecting fake nodes, which reveals that G… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by TCSS;DOI:10.1109/TCSS.2024.3395794

  2. arXiv:2405.17822  [pdf, other

    cs.CL cs.AI

    Conv-CoA: Improving Open-domain Question Answering in Large Language Models via Conversational Chain-of-Action

    Authors: Zhenyu Pan, Haozheng Luo, Manling Li, Han Liu

    Abstract: We present a Conversational Chain-of-Action (Conv-CoA) framework for Open-domain Conversational Question Answering (OCQA). Compared with literature, Conv-CoA addresses three major challenges: (i) unfaithful hallucination that is inconsistent with real-time or domain facts, (ii) weak reasoning performance in conversational scenarios, and (iii) unsatisfying performance in conversational information… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  3. arXiv:2405.15984  [pdf, other

    cs.CL cs.AI

    Evaluating the Adversarial Robustness of Retrieval-Based In-Context Learning for Large Language Models

    Authors: Simon Chi Lok Yu, Jie He, Pasquale Minervini, Jeff Z. Pan

    Abstract: With the emergence of large language models, such as LLaMA and OpenAI GPT-3, In-Context Learning (ICL) gained significant attention due to its effectiveness and efficiency. However, ICL is very sensitive to the choice, order, and verbaliser used to encode the demonstrations in the prompt. Retrieval-Augmented ICL methods try to address this problem by leveraging retrievers to extract semantically r… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 29 pages, 6 figures

  4. arXiv:2405.14918  [pdf, other

    cs.LG cs.ET

    AnalogCoder: Analog Circuit Design via Training-Free Code Generation

    Authors: Yao Lai, Sungyoung Lee, Guojin Chen, Souradip Poddar, Mengkang Hu, David Z. Pan, Ping Luo

    Abstract: Analog circuit design is a significant task in modern chip technology, focusing on the selection of component types, connectivity, and parameters to ensure proper circuit functionality. Despite advances made by Large Language Models (LLMs) in digital circuit design, the complexity and scarcity of data in analog circuitry pose significant challenges. To mitigate these issues, we introduce AnalogCod… ▽ More

    Submitted 30 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  5. arXiv:2405.14366  [pdf, other

    cs.CL cs.AI cs.LG

    MiniCache: KV Cache Compression in Depth Dimension for Large Language Models

    Authors: Akide Liu, Jing Liu, Zizheng Pan, Yefei He, Gholamreza Haffari, Bohan Zhuang

    Abstract: A critical approach for efficiently deploying computationally demanding large language models (LLMs) is Key-Value (KV) caching. The KV cache stores key-value states of previously generated tokens, significantly reducing the need for repetitive computations and thereby lowering latency in autoregressive generation. However, the size of the KV cache grows linearly with sequence length, posing challe… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Tech report

  6. arXiv:2405.13915  [pdf, other

    cs.LG cs.SI

    HeteGraph-Mamba: Heterogeneous Graph Learning via Selective State Space Model

    Authors: Zhenyu Pan, Yoonsung Jeong, Xiaoda Liu, Han Liu

    Abstract: We propose a heterogeneous graph mamba network (HGMN) as the first exploration in leveraging the selective state space models (SSSMs) for heterogeneous graph learning. Compared with the literature, our HGMN overcomes two major challenges: (i) capturing long-range dependencies among heterogeneous nodes and (ii) adapting SSSMs to heterogeneous graph data. Our key contribution is a general graph arch… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  7. arXiv:2405.13602  [pdf, other

    cs.AI cs.CL cs.LG

    COTET: Cross-view Optimal Transport for Knowledge Graph Entity Typing

    Authors: Zhiwei Hu, Víctor Gutiérrez-Basulto, Zhiliang Xiang, Ru Li, Jeff Z. Pan

    Abstract: Knowledge graph entity typing (KGET) aims to infer missing entity type instances in knowledge graphs. Previous research has predominantly centered around leveraging contextual information associated with entities, which provides valuable clues for inference. However, they have long ignored the dual nature of information inherent in entities, encompassing both high-level coarse-grained cluster know… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  8. arXiv:2405.08815  [pdf, other

    cs.CV

    Efficient Vision-Language Pre-training by Cluster Masking

    Authors: Zihao Wei, Zixuan Pan, Andrew Owens

    Abstract: We propose a simple strategy for masking image patches during visual-language contrastive learning that improves the quality of the learned representations and the training speed. During each iteration of training, we randomly mask clusters of visually similar image patches, as measured by their raw pixel intensities. This provides an extra learning signal, beyond the contrastive training itself,… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: CVPR 2024, Project page: https://zxp46.github.io/cluster-masking/ , Code: https://github.com/Zi-hao-Wei/Efficient-Vision-Language-Pre-training-by-Cluster-Masking

  9. arXiv:2405.07425   

    cs.CV

    Sakuga-42M Dataset: Scaling Up Cartoon Research

    Authors: Zhenglin Pan

    Abstract: Hand-drawn cartoon animation employs sketches and flat-color segments to create the illusion of motion. While recent advancements like CLIP, SVD, and Sora show impressive results in understanding and generating natural video by scaling large models with extensive datasets, they are not as effective for cartoons. Through our empirical experiments, we argue that this ineffectiveness stems from a not… ▽ More

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

  10. arXiv:2405.06758  [pdf, other

    cs.LG

    Scalable and Effective Arithmetic Tree Generation for Adder and Multiplier Designs

    Authors: Yao Lai, Jinxin Liu, David Z. Pan, Ping Luo

    Abstract: Across a wide range of hardware scenarios, the computational efficiency and physical size of the arithmetic units significantly influence the speed and footprint of the overall hardware system. Nevertheless, the effectiveness of prior arithmetic design techniques proves inadequate, as it does not sufficiently optimize speed and area, resulting in a reduced processing rate and larger module size. T… ▽ More

    Submitted 10 May, 2024; originally announced May 2024.

  11. arXiv:2405.06524  [pdf, other

    cs.CL

    Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts

    Authors: Wenyu Huang, Guancheng Zhou, Mirella Lapata, Pavlos Vougiouklis, Sebastien Montella, Jeff Z. Pan

    Abstract: Although Large Language Models (LLMs) are effective in performing various NLP tasks, they still struggle to handle tasks that require extensive, real-world knowledge, especially when dealing with long-tail facts (facts related to long-tail entities). This limitation highlights the need to supplement LLMs with non-parametric knowledge. To address this issue, we analysed the effects of different typ… ▽ More

    Submitted 10 May, 2024; originally announced May 2024.

  12. arXiv:2405.03959  [pdf, other

    cs.CV

    Joint Identity Verification and Pose Alignment for Partial Fingerprints

    Authors: Xiongjun Guan, Zhiyu Pan, Jianjiang Feng, Jie Zhou

    Abstract: Currently, portable electronic devices are becoming more and more popular. For lightweight considerations, their fingerprint recognition modules usually use limited-size sensors. However, partial fingerprints have few matchable features, especially when there are differences in finger pressing posture or image quality, which makes partial fingerprint verification challenging. Most existing methods… ▽ More

    Submitted 21 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  13. arXiv:2405.01199  [pdf, other

    cs.CV

    Latent Fingerprint Matching via Dense Minutia Descriptor

    Authors: Zhiyu Pan, Yongjie Duan, Xiongjun Guan, Jianjiang Feng, Jie Zhou

    Abstract: Latent fingerprint matching is a daunting task, primarily due to the poor quality of latent fingerprints. In this study, we propose a deep-learning based dense minutia descriptor (DMD) for latent fingerprint matching. A DMD is obtained by extracting the fingerprint patch aligned by its central minutia, capturing detailed minutia information and texture information. Our dense descriptor takes the f… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 10 pages, 6 figures

  14. arXiv:2405.01112  [pdf, other

    cs.CV

    Sports Analysis and VR Viewing System Based on Player Tracking and Pose Estimation with Multimodal and Multiview Sensors

    Authors: Wenxuan Guo, Zhiyu Pan, Ziheng Xi, Alapati Tuerxun, Jianjiang Feng, Jie Zhou

    Abstract: Sports analysis and viewing play a pivotal role in the current sports domain, offering significant value not only to coaches and athletes but also to fans and the media. In recent years, the rapid development of virtual reality (VR) and augmented reality (AR) technologies have introduced a new platform for watching games. Visualization of sports competitions in VR/AR represents a revolutionary tec… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2312.06409

  15. arXiv:2404.18528  [pdf, other

    cs.LG

    Generation of Uncorrelated Residual Variables for Chemical Process Fault Diagnosis via Transfer Learning-based Input-Output Decoupled Network

    Authors: Zhuofu Pan, Qingkai Sui, Yalin Wang, Jiang Luo, Jie Chen, Hongtian Chen

    Abstract: Structural decoupling has played an essential role in model-based fault isolation and estimation in past decades, which facilitates accurate fault localization and reconstruction thanks to the diagonal transfer matrix design. However, traditional methods exhibit limited effectiveness in modeling high-dimensional nonlinearity and big data, and the decoupling idea has not been well-valued in data-dr… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

  16. arXiv:2404.18407  [pdf, other

    cs.CR cs.AR

    ICMarks: A Robust Watermarking Framework for Integrated Circuit Physical Design IP Protection

    Authors: Ruisi Zhang, Rachel Selina Rajarathnam, David Z. Pan, Farinaz Koushanfar

    Abstract: Physical design watermarking on contemporary integrated circuit (IC) layout encodes signatures without considering the dense connections and design constraints, which could lead to performance degradation on the watermarked products. This paper presents ICMarks, a quality-preserving and robust watermarking framework for modern IC physical design. ICMarks embeds unique watermark signatures during t… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

  17. arXiv:2404.17590  [pdf, other

    cs.IR cs.AI

    Leveraging Intra-modal and Inter-modal Interaction for Multi-Modal Entity Alignment

    Authors: Zhiwei Hu, Víctor Gutiérrez-Basulto, Zhiliang Xiang, Ru Li, Jeff Z. Pan

    Abstract: Multi-modal entity alignment (MMEA) aims to identify equivalent entity pairs across different multi-modal knowledge graphs (MMKGs). Existing approaches focus on how to better encode and aggregate information from different modalities. However, it is not trivial to leverage multi-modal knowledge in entity alignment due to the modal heterogeneity. In this paper, we propose a Multi-Grained Interactio… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

  18. arXiv:2404.17528  [pdf, other

    cs.CV

    Geometry-aware Reconstruction and Fusion-refined Rendering for Generalizable Neural Radiance Fields

    Authors: Tianqi Liu, Xinyi Ye, Min Shi, Zihao Huang, Zhiyu Pan, Zhan Peng, Zhiguo Cao

    Abstract: Generalizable NeRF aims to synthesize novel views for unseen scenes. Common practices involve constructing variance-based cost volumes for geometry reconstruction and encoding 3D descriptors for decoding novel views. However, existing methods show limited generalization ability in challenging conditions due to inaccurate geometry, sub-optimal descriptors, and decoding strategies. We address these… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024. Project page: https://gefucvpr24.github.io

  19. arXiv:2404.17456  [pdf, other

    cs.NE

    Converting High-Performance and Low-Latency SNNs through Explicit Modelling of Residual Error in ANNs

    Authors: Zhipeng Huang, Jianhao Ding, Zhiyu Pan, Haoran Li, Ying Fang, Zhaofei Yu, Jian K. Liu

    Abstract: Spiking neural networks (SNNs) have garnered interest due to their energy efficiency and superior effectiveness on neuromorphic chips compared with traditional artificial neural networks (ANNs). One of the mainstream approaches to implementing deep SNNs is the ANN-SNN conversion, which integrates the efficient training strategy of ANNs with the energy-saving potential and fast inference capability… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

  20. arXiv:2404.15744  [pdf, other

    cs.LG cs.AI cs.CR

    A General Black-box Adversarial Attack on Graph-based Fake News Detectors

    Authors: Peican Zhu, Zechen Pan, Yang Liu, Jiwei Tian, Keke Tang, Zhen Wang

    Abstract: Graph Neural Network (GNN)-based fake news detectors apply various methods to construct graphs, aiming to learn distinctive news embeddings for classification. Since the construction details are unknown for attackers in a black-box scenario, it is unrealistic to conduct the classical adversarial attacks that require a specific adjacency matrix. In this paper, we propose the first general black-box… ▽ More

    Submitted 25 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI2024

  21. arXiv:2404.13194  [pdf, other

    cs.LG cs.AI cs.CV

    Privacy-Preserving Debiasing using Data Augmentation and Machine Unlearning

    Authors: Zhixin Pan, Emma Andrews, Laura Chang, Prabhat Mishra

    Abstract: Data augmentation is widely used to mitigate data bias in the training dataset. However, data augmentation exposes machine learning models to privacy attacks, such as membership inference attacks. In this paper, we propose an effective combination of data augmentation and machine unlearning, which can reduce data bias while providing a provable defense against known attacks. Specifically, we maint… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

  22. arXiv:2404.12142  [pdf, other

    cs.CV cs.LG eess.IV

    SDIP: Self-Reinforcement Deep Image Prior Framework for Image Processing

    Authors: Ziyu Shu, Zhixin Pan

    Abstract: Deep image prior (DIP) proposed in recent research has revealed the inherent trait of convolutional neural networks (CNN) for capturing substantial low-level image statistics priors. This framework efficiently addresses the inverse problems in image processing and has induced extensive applications in various domains. However, as the whole algorithm is initialized randomly, the DIP algorithm often… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.

  23. arXiv:2404.09848  [pdf, other

    cs.AI cs.LG

    HyperMono: A Monotonicity-aware Approach to Hyper-Relational Knowledge Representation

    Authors: Zhiwei Hu, Víctor Gutiérrez-Basulto, Zhiliang Xiang, Ru Li, Jeff Z. Pan

    Abstract: In a hyper-relational knowledge graph (HKG), each fact is composed of a main triple associated with attribute-value qualifiers, which express additional factual knowledge. The hyper-relational knowledge graph completion (HKGC) task aims at inferring plausible missing links in a HKG. Most existing approaches to HKGC focus on enhancing the communication between qualifier pairs and main triples, whil… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

  24. arXiv:2404.02148  [pdf, other

    cs.CV

    Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models

    Authors: Zeyu Yang, Zijie Pan, Chun Gu, Li Zhang

    Abstract: Recent advancements in 3D generation are predominantly propelled by improvements in 3D-aware image diffusion models which are pretrained on Internet-scale image data and fine-tuned on massive 3D data, offering the capability of producing highly consistent multi-view images. However, due to the scarcity of synchronized multi-view video data, it is impractical to adapt this paradigm to 4D generation… ▽ More

    Submitted 22 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Technical Report

  25. arXiv:2404.01253  [pdf, other

    cs.CL

    UniArk: Improving Generalisation and Consistency for Factual Knowledge Extraction through Debiasing

    Authors: Yijun Yang, Jie He, Pinzhen Chen, Víctor Gutiérrez-Basulto, Jeff Z. Pan

    Abstract: Several recent papers have investigated the potential of language models as knowledge bases as well as the existence of severe biases when extracting factual knowledge. In this work, we focus on the factual probing performance over unseen prompts from tuning, and using a probabilistic view we show the inherent misalignment between pre-training and downstream tuning objectives in language models fo… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: NAACL 2024

  26. arXiv:2403.17359  [pdf, other

    cs.CL

    Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models

    Authors: Zhenyu Pan, Haozheng Luo, Manling Li, Han Liu

    Abstract: We present a Chain-of-Action (CoA) framework for multimodal and retrieval-augmented Question-Answering (QA). Compared to the literature, CoA overcomes two major challenges of current QA applications: (i) unfaithful hallucination that is inconsistent with real-time or domain facts and (ii) weak reasoning performance over compositional information. Our key contribution is a novel reasoning-retrieval… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

  27. arXiv:2403.16437  [pdf, other

    cs.SE cs.CL

    Evaluating Large Language Models with Runtime Behavior of Program Execution

    Authors: Junkai Chen, Zhiyuan Pan, Xing Hu, Zhenhao Li, Ge Li, Xin Xia

    Abstract: Large language models for code (i.e., code LLMs) have shown strong code understanding and generation capabilities. To evaluate the capabilities of code LLMs in various aspects, many benchmarks have been proposed (e.g., HumanEval and ClassEval). Code reasoning is one of the most essential abilities of code LLMs, but existing benchmarks for code reasoning are not sufficient. Typically, they focus on… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

  28. arXiv:2403.14806  [pdf, other

    cs.ET physics.app-ph physics.optics

    Photonic-Electronic Integrated Circuits for High-Performance Computing and AI Accelerator

    Authors: Shupeng Ning, Hanqing Zhu, Chenghao Feng, Jiaqi Gu, Zhixing Jiang, Zhoufeng Ying, Jason Midkiff, Sourabh Jain, May H. Hlaing, David Z. Pan, Ray T. Chen

    Abstract: In recent decades, the demand for computational power has surged, particularly with the rapid expansion of artificial intelligence (AI). As we navigate the post-Moore's law era, the limitations of traditional electrical digital computing, including process bottlenecks and power consumption issue, are propelling the search for alternative computing paradigms. Among various emerging technologies, in… ▽ More

    Submitted 21 March, 2024; originally announced March 2024.

  29. arXiv:2403.13261  [pdf, other

    cs.CV

    Self-Supervised Class-Agnostic Motion Prediction with Spatial and Temporal Consistency Regularizations

    Authors: Kewei Wang, Yizheng Wu, Jun Cen, Zhiyu Pan, Xingyi Li, Zhe Wang, Zhiguo Cao, Guosheng Lin

    Abstract: The perception of motion behavior in a dynamic environment holds significant importance for autonomous driving systems, wherein class-agnostic motion prediction methods directly predict the motion of the entire point cloud. While most existing methods rely on fully-supervised learning, the manual labeling of point cloud data is laborious and time-consuming. Therefore, several annotation-efficient… ▽ More

    Submitted 21 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  30. arXiv:2403.12968  [pdf, other

    cs.CL cs.LG

    LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

    Authors: Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Rühle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Dongmei Zhang

    Abstract: This paper focuses on task-agnostic prompt compression for better generalizability and efficiency. Considering the redundancy in natural language, existing approaches compress prompts by removing tokens or lexical units according to their information entropy obtained from a causal language model such as LLaMa-7B. The challenge is that information entropy may be a suboptimal compression metric: (i)… ▽ More

    Submitted 19 March, 2024; originally announced March 2024.

  31. arXiv:2403.05798  [pdf, other

    cs.LG

    $\textbf{S}^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting

    Authors: Zijie Pan, Yushan Jiang, Sahil Garg, Anderson Schneider, Yuriy Nevmyvaka, Dongjin Song

    Abstract: Recently, there has been a growing interest in leveraging pre-trained large language models (LLMs) for various time series applications. However, the semantic space of LLMs, established through the pre-training, is still underexplored and may help yield more distinctive and informative representations to facilitate time series forecasting. To this end, we propose Semantic Space Informed Prompt lea… ▽ More

    Submitted 9 March, 2024; originally announced March 2024.

  32. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry, Lepikhin, Timothy Lillicrap, Jean-baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, Ioannis Antonoglou, Rohan Anil, Sebastian Borgeaud, Andrew Dai, Katie Millican, Ethan Dyer, Mia Glaese, Thibault Sottiaux, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, James Molloy , et al. (683 additional authors not shown)

    Abstract: In this report, we present the latest model of the Gemini family, Gemini 1.5 Pro, a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. Gemini 1.5 Pro achieves near-perfect recall on long-context retrieval tasks across modalit… ▽ More

    Submitted 25 April, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  33. arXiv:2403.03655  [pdf, other

    cs.CR

    Kronos: A Secure and Generic Sharding Blockchain Consensus with Optimized Overhead

    Authors: Yizhong Liu, Andi Liu, Yuan Lu, Zhuocheng Pan, Yinuo Li, Jianwei Liu, Song Bian, Mauro Conti

    Abstract: Sharding enhances blockchain scalability by dividing the network into shards, each managing specific unspent transaction outputs or accounts. As an introduced new transaction type, cross-shard transactions pose a critical challenge to the security and efficiency of sharding blockchains. Currently, there is a lack of a generic sharding consensus pattern that achieves both security and low overhead.… ▽ More

    Submitted 5 May, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

  34. arXiv:2403.03391  [pdf, other

    stat.ML cond-mat.stat-mech cs.LG

    CoRMF: Criticality-Ordered Recurrent Mean Field Ising Solver

    Authors: Zhenyu Pan, Ammar Gilani, En-Jui Kuo, Zhuo Liu

    Abstract: We propose an RNN-based efficient Ising model solver, the Criticality-ordered Recurrent Mean Field (CoRMF), for forward Ising problems. In its core, a criticality-ordered spin sequence of an $N$-spin Ising model is introduced by sorting mission-critical edges with greedy algorithm, such that an autoregressive mean-field factorization can be utilized and optimized with Recurrent Neural Networks (RN… ▽ More

    Submitted 7 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  35. arXiv:2402.18331  [pdf, other

    cs.CV

    FineDiffusion: Scaling up Diffusion Models for Fine-grained Image Generation with 10,000 Classes

    Authors: Ziying Pan, Kun Wang, Gang Li, Feihong He, Xiwang Li, Yongxuan Lai

    Abstract: The class-conditional image generation based on diffusion models is renowned for generating high-quality and diverse images. However, most prior efforts focus on generating images for general categories, e.g., 1000 classes in ImageNet-1k. A more challenging task, large-scale fine-grained image generation, remains the boundary to explore. In this work, we present a parameter-efficient strategy, cal… ▽ More

    Submitted 6 April, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  36. arXiv:2402.17907  [pdf, other

    eess.AS cs.SD

    NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

    Authors: Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux

    Abstract: Head-related transfer functions (HRTFs) are important for immersive audio, and their spatial interpolation has been studied to upsample finite measurements. Recently, neural fields (NFs) which map from sound source direction to HRTF have gained attention. Existing NF-based methods focused on estimating the magnitude of the HRTF from a given sound source direction, and the magnitude is converted to… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024

  37. arXiv:2402.14901  [pdf, other

    cs.CL cs.AI

    A Usage-centric Take on Intent Understanding in E-Commerce

    Authors: Wendi Zhou, Tianyi Li, Pavlos Vougiouklis, Mark Steedman, Jeff Z. Pan

    Abstract: Identifying and understanding user intents is a pivotal task for E-Commerce. Despite its popularity, intent understanding has not been consistently defined or accurately benchmarked. In this paper, we focus on predicative user intents as "how a customer uses a product", and pose intent understanding as a natural language reasoning task, independent of product ontologies. We identify two weaknesses… ▽ More

    Submitted 22 February, 2024; originally announced February 2024.

  38. arXiv:2402.14167  [pdf, other

    cs.CV cs.LG

    T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

    Authors: Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar

    Abstract: Sampling from diffusion probabilistic models (DPMs) is often expensive for high-quality image generation and typically requires many steps with a large model. In this paper, we introduce sampling Trajectory Stitching T-Stitch, a simple yet efficient technique to improve the sampling efficiency with little or no generation degradation. Instead of solely using a large DPM for the entire sampling tra… ▽ More

    Submitted 21 February, 2024; originally announced February 2024.

  39. arXiv:2402.12722  [pdf, other

    cs.LG

    Structural Knowledge Informed Continual Multivariate Time Series Forecasting

    Authors: Zijie Pan, Yushan Jiang, Dongjin Song, Sahil Garg, Kashif Rasul, Anderson Schneider, Yuriy Nevmyvaka

    Abstract: Recent studies in multivariate time series (MTS) forecasting reveal that explicitly modeling the hidden dependencies among different time series can yield promising forecasting performance and reliable explanations. However, modeling variable dependencies remains underexplored when MTS is continuously accumulated under different regimes (stages). Due to the potential distribution and dependency di… ▽ More

    Submitted 20 February, 2024; originally announced February 2024.

  40. arXiv:2402.12554  [pdf, other

    cs.CL

    Archer: A Human-Labeled Text-to-SQL Dataset with Arithmetic, Commonsense and Hypothetical Reasoning

    Authors: Danna Zheng, Mirella Lapata, Jeff Z. Pan

    Abstract: We present Archer, a challenging bilingual text-to-SQL dataset specific to complex reasoning, including arithmetic, commonsense and hypothetical reasoning. It contains 1,042 English questions and 1,042 Chinese questions, along with 521 unique SQL queries, covering 20 English databases across 20 domains. Notably, this dataset demonstrates a significantly higher level of complexity compared to exist… ▽ More

    Submitted 24 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: EACL 2024

  41. arXiv:2402.12545  [pdf, other

    cs.CL

    TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness

    Authors: Danna Zheng, Danyang Liu, Mirella Lapata, Jeff Z. Pan

    Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, prompting a surge in their practical applications. However, concerns have arisen regarding the trustworthiness of LLMs outputs, particularly in closed-book question-answering tasks, where non-experts may struggle to identify inaccuracies due to the absence of contextual or ground truth information. This… ▽ More

    Submitted 6 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  42. BiasEye: A Bias-Aware Real-time Interactive Material Screening System for Impartial Candidate Assessment

    Authors: Qianyu Liu, Haoran Jiang, Zihao Pan, Qiushi Han, Zhenhui Peng, Quan Li

    Abstract: In the process of evaluating competencies for job or student recruitment through material screening, decision-makers can be influenced by inherent cognitive biases, such as the screening order or anchoring information, leading to inconsistent outcomes. To tackle this challenge, we conducted interviews with seven experts to understand their challenges and needs for support in the screening process.… ▽ More

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: IUI' 24, March 18-21, 2024, Greenville, SC, USA

  43. arXiv:2402.08916  [pdf, other

    eess.SP cs.IT

    Lightweight Deep Learning Based Channel Estimation for Extremely Large-Scale Massive MIMO Systems

    Authors: Shen Gao, Peihao Dong, Zhiwen Pan, Xiaohu You

    Abstract: Extremely large-scale massive multiple-input multiple-output (XL-MIMO) systems introduce the much higher channel dimensionality and incur the additional near-field propagation effect, aggravating the computation load and the difficulty to acquire the prior knowledge for channel estimation. In this article, an XL-MIMO channel network (XLCNet) is developed to estimate the high-dimensional channel, w… ▽ More

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Transactions on Vehicular Technology

  44. arXiv:2402.05391  [pdf, other

    cs.AI cs.CV cs.IR cs.LG

    Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

    Authors: Zhuo Chen, Yichi Zhang, Yin Fang, Yuxia Geng, Lingbing Guo, Xiang Chen, Qian Li, Wen Zhang, Jiaoyan Chen, Yushan Zhu, Jiaqi Li, Xiaoze Liu, Jeff Z. Pan, Ningyu Zhang, Huajun Chen

    Abstract: Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the semantic web community's exploration into multi-modal dimensions unlocking new avenues for innovation. In this survey, we carefully review over 300 articles, focusing on KG-aware research in two principal aspects: KG-driven Multi-Modal (KG4MM) learning, where KGs support multi-modal tasks, and Multi-Modal Kno… ▽ More

    Submitted 26 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Ongoing work; 41 pages (Main Text), 55 pages (Total), 11 Tables, 13 Figures, 619 citations; Paper list is available at https://github.com/zjukg/KG-MM-Survey

  45. arXiv:2402.03182  [pdf, other

    cs.LG

    Empowering Time Series Analysis with Large Language Models: A Survey

    Authors: Yushan Jiang, Zijie Pan, Xikun Zhang, Sahil Garg, Anderson Schneider, Yuriy Nevmyvaka, Dongjin Song

    Abstract: Recently, remarkable progress has been made over large language models (LLMs), demonstrating their unprecedented capability in varieties of natural language tasks. However, completely training a large general-purpose model from the scratch is challenging for time series analysis, due to the large volumes and varieties of time series data, as well as the non-stationarity that leads to concept drift… ▽ More

    Submitted 5 February, 2024; originally announced February 2024.

  46. arXiv:2402.00411  [pdf, other

    cs.NE cs.AI cs.CV

    LM-HT SNN: Enhancing the Performance of SNN to ANN Counterpart through Learnable Multi-hierarchical Threshold Model

    Authors: Zecheng Hao, Xinyu Shi, Zhiyu Pan, Yujia Liu, Zhaofei Yu, Tiejun Huang

    Abstract: Compared to traditional Artificial Neural Network (ANN), Spiking Neural Network (SNN) has garnered widespread academic interest for its intrinsic ability to transmit information in a more biological-inspired and energy-efficient manner. However, despite previous efforts to optimize the learning gradients and model structure of SNNs through various methods, SNNs still lag behind ANNs in terms of pe… ▽ More

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 15 pages, 2 figures

  47. arXiv:2401.17618  [pdf, other

    cs.CR cs.OS

    Beyond Control: Exploring Novel File System Objects for Data-Only Attacks on Linux Systems

    Authors: Jinmeng Zhou, Jiayi Hu, Ziyue Pan, Jiaxun Zhu, Guoren Li, Wenbo Shen, Yulei Sui, Zhiyun Qian

    Abstract: The widespread deployment of control-flow integrity has propelled non-control data attacks into the mainstream. In the domain of OS kernel exploits, by corrupting critical non-control data, local attackers can directly gain root access or privilege escalation without hijacking the control flow. As a result, OS kernels have been restricting the availability of such non-control data. This forces att… ▽ More

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 14 pages, in submission of the 31th ACM Conference on Computer and Communications Security (CCS), 2024

  48. Ambush from All Sides: Understanding Security Threats in Open-Source Software CI/CD Pipelines

    Authors: Ziyue Pan, Wenbo Shen, Xingkai Wang, Yutian Yang, Rui Chang, Yao Liu, Chengwei Liu, Yang Liu, Kui Ren

    Abstract: The continuous integration and continuous deployment (CI/CD) pipelines are widely adopted on Internet hosting platforms, such as GitHub. With the popularity, the CI/CD pipeline faces various security threats. However, current CI/CD pipelines suffer from malicious code and severe vulnerabilities. Even worse, people have not been fully aware of its attack surfaces and the corresponding impacts. Th… ▽ More

    Submitted 31 January, 2024; originally announced January 2024.

    Journal ref: IEEE Transactions on Dependable and Secure Computing (Volume: 21, Issue: 1, Jan.-Feb. 2024)

  49. arXiv:2401.15820  [pdf, other

    cs.CV cs.AI

    Knowledge-Aware Neuron Interpretation for Scene Classification

    Authors: Yong Guan, Freddy Lecue, Jiaoyan Chen, Ru Li, Jeff Z. Pan

    Abstract: Although neural models have achieved remarkable performance, they still encounter doubts due to the intransparency. To this end, model prediction explanation is attracting more and more attentions. However, current methods rarely incorporate external knowledge and still suffer from three limitations: (1) Neglecting concept completeness. Merely selecting concepts may not sufficient for prediction.… ▽ More

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: Accepted to AAAI2024

  50. arXiv:2401.14640  [pdf, other

    cs.CL

    Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs

    Authors: Nan Hu, Jiaoyan Chen, Yike Wu, Guilin Qi, Sheng Bi, Tongtong Wu, Jeff Z. Pan

    Abstract: The attribution of question answering is to provide citations for supporting generated statements, and has attracted wide research attention. The current methods for automatically evaluating the attribution, which are often based on Large Language Models (LLMs), are still inadequate, particularly in recognizing subtle differences between attributions, and complex relationships between citations an… ▽ More

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 13 pages, 5 figures