Showing 1–50 of 214 results for author: Mao, Q

  1. arXiv:2405.03435  [pdf, other]

    cond-mat.dis-nn cs.AI cs.LG

    A method for quantifying the generalization capabilities of generative models for solving Ising models

    Authors: Qunlong Ma, Zhi Ma, Ming Gao

    Abstract: For Ising models with complex energy landscapes, whether the ground state can be found by neural networks depends heavily on the Hamming distance between the training datasets and the ground state. Despite the fact that various recently proposed generative models have shown good performance in solving Ising models, there is no adequate discussion on how to quantify their generalization capabilitie…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 10 pages, 7 figures

    Journal ref: Mach. Learn.: Sci. Technol. 5 (2024) 025011

  2. arXiv:2404.18262  [pdf, other]

    cs.AI

    Generating Situated Reflection Triggers about Alternative Solution Paths: A Case Study of Generative AI for Computer-Supported Collaborative Learning

    Authors: Atharva Naik, Jessica Ruhan Yin, Anusha Kamath, Qianou Ma, Sherry Tongshuang Wu, Charles Murray, Christopher Bogart, Majd Sakr, Carolyn P. Rose

    Abstract: An advantage of Large Language Models (LLMs) is their contextualization capability - providing different responses based on student inputs like solution strategy or prior discussion, to potentially better engage students than standard feedback. We present a design and evaluation of a proof-of-concept LLM application to offer students dynamic and contextualized feedback. Specifically, we augment an…

    Submitted 28 April, 2024; originally announced April 2024.

  3. arXiv:2404.16831  [pdf, other]

    cs.CV

    The Third Monocular Depth Estimation Challenge

    Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, YiPing Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora, et al. (16 additional authors not shown)

    Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 su…

    Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: To appear in CVPRW2024

  4. arXiv:2404.12006  [pdf, other]

    cs.CL

    Variational Multi-Modal Hypergraph Attention Network for Multi-Modal Relation Extraction

    Authors: Qian Li, Cheng Ji, Shu Guo, Yong Zhao, Qianren Mao, Shangguang Wang, Yuntao Wei, Jianxin Li

    Abstract: Multi-modal relation extraction (MMRE) is a challenging task that aims to identify relations between entities in text leveraging image information. Existing methods are limited by their neglect of the multiple entity pairs in one sentence sharing very similar contextual information (i.e., the same text and image), resulting in increased difficulty in the MMRE task. To address this limitation, we pro…

    Submitted 18 April, 2024; originally announced April 2024.

  5. arXiv:2404.08951  [pdf, other]

    cs.CV cs.LG

    Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation

    Authors: Qinghe Ma, Jian Zhang, Lei Qi, Qian Yu, Yinghuan Shi, Yang Gao

    Abstract: Both limited annotation and domain shift are prevalent challenges in medical image segmentation. Traditional semi-supervised segmentation and unsupervised domain adaptation methods address one of these issues separately. However, the coexistence of limited annotation and domain shift is quite common, which motivates us to introduce a novel and challenging scenario: Mixed Domain Semi-supervised med…

    Submitted 13 April, 2024; originally announced April 2024.

  6. arXiv:2404.06225  [pdf, other]

    cond-mat.stat-mech cond-mat.dis-nn cs.LG

    Message Passing Variational Autoregressive Network for Solving Intractable Ising Models

    Authors: Qunlong Ma, Zhi Ma, Jinlong Xu, Hairui Zhang, Ming Gao

    Abstract: Many deep neural networks have been used to solve Ising models, including autoregressive neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks. Learning a probability distribution of energy configuration or finding the ground states of a disordered, fully connected Ising model is essential for statistical mechanics and NP-hard problems. Despite tremen…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 18 pages, 14 figures

  7. arXiv:2404.01343  [pdf, other]

    cs.CL cs.AI

    CHOPS: CHat with custOmer Profile Systems for Customer Service with LLMs

    Authors: Jingzhe Shi, Jialuo Li, Qinwei Ma, Zaiwen Yang, Huan Ma, Lei Li

    Abstract: Businesses and software platforms are increasingly turning to Large Language Models (LLMs) such as GPT-3.5, GPT-4, GLM-3, and LLaMa-2 for chat assistance with file access or as reasoning agents for customer service. However, current LLM-based customer service models have limited integration with customer profiles and lack the operational capabilities necessary for effective service. Moreover, exis…

    Submitted 15 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: 14 pages

  8. arXiv:2403.19826   

    cs.AI

    Segmentation Re-thinking Uncertainty Estimation Metrics for Semantic Segmentation

    Authors: Qitian Ma, Shyam Nanda Rai, Carlo Masone, Tatiana Tommasi

    Abstract: In the domain of computer vision, semantic segmentation emerges as a fundamental application within machine learning, wherein individual pixels of an image are classified into distinct semantic categories. This task transcends traditional accuracy metrics by incorporating uncertainty quantification, a critical measure for assessing the reliability of each segmentation prediction. Such quantificati…

    Submitted 8 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Premature Submission: accidentally submitted before it was ready

  9. arXiv:2403.12865  [pdf, other]

    cs.RO

    PE-Planner: A Performance-Enhanced Quadrotor Motion Planner for Autonomous Flight in Complex and Dynamic Environments

    Authors: Jiaxin Qiu, Qingchen Liu, Jiahu Qin, Dewang Cheng, Yawei Tian, Qichao Ma

    Abstract: The role of a motion planner is pivotal in quadrotor applications, yet existing methods often struggle to adapt to complex environments, limiting their ability to achieve fast, safe, and robust flight. In this letter, we introduce a performance-enhanced quadrotor motion planner designed for autonomous flight in complex environments including dense obstacles, dynamic obstacles, and unknown disturba…

    Submitted 19 March, 2024; originally announced March 2024.

  10. arXiv:2403.03736  [pdf, other]

    cs.CV cs.LG eess.IV

    Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer

    Authors: Naifu Xue, Qi Mao, Zijian Wang, Yuan Zhang, Siwei Ma

    Abstract: Recent progress in generative compression technology has significantly improved the perceptual quality of compressed data. However, these advancements primarily focus on producing high-frequency details, often overlooking the ability of generative models to capture the prior distribution of image content, thus impeding further bitrate reduction in extreme compression scenarios (<0.05 bpp). Motivat…

    Submitted 6 March, 2024; originally announced March 2024.

  11. arXiv:2402.19371  [pdf]

    cs.CL cs.AI cs.IR

    OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models

    Authors: Jenish Maharjan, Anurag Garikipati, Navan Preet Singh, Leo Cyrus, Mayank Sharma, Madalina Ciobanu, Gina Barnes, Rahul Thapa, Qingqing Mao, Ritankar Das

    Abstract: LLMs have become increasingly capable at accomplishing a range of specialized tasks and can be utilized to expand equitable access to medical knowledge. Most medical LLMs have involved extensive fine-tuning, leveraging specialized medical data and significant, thus costly, amounts of computational power. Many of the top performing LLMs are proprietary and their access is limited to very few resear…

    Submitted 29 February, 2024; originally announced February 2024.

  12. arXiv:2402.15680  [pdf, other]

    cs.LG

    Overcoming Pitfalls in Graph Contrastive Learning Evaluation: Toward Comprehensive Benchmarks

    Authors: Qian Ma, Hongliang Chi, Hengrui Zhang, Kay Liu, Zhiwei Zhang, Lu Cheng, Suhang Wang, Philip S. Yu, Yao Ma

    Abstract: The rise of self-supervised learning, which operates without the need for labeled data, has garnered significant interest within the graph learning community. This enthusiasm has led to the development of numerous Graph Contrastive Learning (GCL) techniques, all aiming to create a versatile graph encoder that leverages the wealth of unlabeled data for various downstream tasks. However, the current…

    Submitted 23 February, 2024; originally announced February 2024.

  13. arXiv:2402.15153  [pdf, other]

    cs.CL cs.LG

    Self-Adaptive Reconstruction with Contrastive Learning for Unsupervised Sentence Embeddings

    Authors: Junlong Liu, Xichen Shang, Huawen Feng, Junhao Zheng, Qianli Ma

    Abstract: The unsupervised sentence embedding task aims to convert sentences to semantic vector representations. Most previous works directly use the sentence representations derived from pretrained language models. However, due to the token bias in pretrained language models, the models cannot capture the fine-grained semantics in sentences, which leads to poor predictions. To address this issue, we propose…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 8 pages, 3 figures

  14. arXiv:2402.14609  [pdf, other]

    cs.LG cs.AI cs.CR cs.DB

    FedCQA: Answering Complex Queries on Multi-Source Knowledge Graphs via Federated Learning

    Authors: Qi Hu, Weifeng Jiang, Haoran Li, Zihao Wang, Jiaxin Bai, Qianren Mao, Yangqiu Song, Lixin Fan, Jianxin Li

    Abstract: Complex logical query answering is a challenging task in knowledge graphs (KGs) that has been widely studied. The ability to perform complex logical reasoning is essential and supports various graph reasoning-based downstream tasks, such as search engines. Recent approaches are proposed to represent KG entities and logical queries into embedding vectors and find answers to logical queries from the…

    Submitted 25 February, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  15. arXiv:2402.14145  [pdf, other]

    stat.ML cs.LG stat.ME

    Multiply Robust Estimation for Local Distribution Shifts with Multiple Domains

    Authors: Steven Wilkins-Reeves, Xu Chen, Qi Ma, Christine Agarwal, Aude Hofleitner

    Abstract: Distribution shifts are ubiquitous in real-world machine learning applications, posing a challenge to the generalization of models trained on one data distribution to another. We focus on scenarios where data distributions vary across multiple segments of the entire population and only make local assumptions about the differences between training and test (deployment) distributions within each seg…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 9 pages, 4 figures

  16. arXiv:2402.12954  [pdf, other]

    cs.LG cs.AI cs.LO

    Conditional Logical Message Passing Transformer for Complex Query Answering

    Authors: Chongzhi Zhang, Zhiping Peng, Junhao Zheng, Qianli Ma

    Abstract: Complex Query Answering (CQA) over Knowledge Graphs (KGs) is a challenging task. Given that KGs are usually incomplete, neural models are proposed to solve CQA by performing multi-hop logical reasoning. However, most of them cannot perform well on both one-hop and multi-hop queries simultaneously. Recent work proposes a logical message passing mechanism based on the pre-trained neural link predict…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 13 pages, 3 figures, and 12 tables

  17. arXiv:2402.10447  [pdf, other]

    cs.CL cs.LG

    Incremental Sequence Labeling: A Tale of Two Shifts

    Authors: Shengjie Qiu, Junhao Zheng, Zhen Liu, Yicheng Luo, Qianli Ma

    Abstract: The incremental sequence labeling task involves continuously learning new classes over time while retaining knowledge of the previous ones. Our investigation identifies two significant semantic shifts: E2O (where the model mislabels an old entity as a non-entity) and O2E (where the model labels a non-entity or old entity as a new entity). Previous research has predominantly focused on addressing t…

    Submitted 15 February, 2024; originally announced February 2024.

  18. arXiv:2402.10063  [pdf, other]

    cs.LG

    Balancing the Causal Effects in Class-Incremental Learning

    Authors: Junhao Zheng, Ruiyan Wang, Chongzhi Zhang, Huawen Feng, Qianli Ma

    Abstract: Class-Incremental Learning (CIL) is a practical and challenging problem for achieving general artificial intelligence. Recently, Pre-Trained Models (PTMs) have led to breakthroughs in both visual and natural language processing tasks. Despite recent studies showing PTMs' potential ability to learn sequentially, a plethora of work indicates the necessity of alleviating the catastrophic forgetting o…

    Submitted 15 February, 2024; originally announced February 2024.

  19. arXiv:2402.08526  [pdf, other]

    cs.LG cs.CL

    Concept-1K: A Novel Benchmark for Instance Incremental Learning

    Authors: Junhao Zheng, Shengjie Qiu, Qianli Ma

    Abstract: Incremental learning (IL) is essential to realize the human-level intelligence in the neural network. However, existing IL scenarios and datasets are unqualified for assessing forgetting in PLMs, giving an illusion that PLMs do not suffer from catastrophic forgetting. To this end, we propose a challenging IL scenario called instance-incremental learning (IIL) and a novel dataset called Concept-1K,…

    Submitted 13 February, 2024; originally announced February 2024.

  20. EmoWear: Exploring Emotional Teasers for Voice Message Interaction on Smartwatches

    Authors: Pengcheng An, Jiawen Zhu, Zibo Zhang, Yifei Yin, Qingyuan Ma, Che Yan, Linghao Du, Jian Zhao

    Abstract: Voice messages, by nature, prevent users from gauging the emotional tone without fully diving into the audio content. This hinders the shared emotional experience at the pre-retrieval stage. Research has scarcely explored "Emotional Teasers": pre-retrieval cues offering a glimpse into an awaiting message's emotional tone without disclosing its content. We introduce EmoWear, a smartwatch voice messaging…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: To appear at ACM CHI '24

  21. arXiv:2402.05952  [pdf, other]

    cs.LG cs.AI cs.CL

    Advancing Graph Representation Learning with Large Language Models: A Comprehensive Survey of Techniques

    Authors: Qiheng Mao, Zemin Liu, Chenghao Liu, Zhuo Li, Jianling Sun

    Abstract: The integration of Large Language Models (LLMs) with Graph Representation Learning (GRL) marks a significant evolution in analyzing complex data structures. This collaboration harnesses the sophisticated linguistic capabilities of LLMs to improve the contextual understanding and adaptability of graph models, thereby broadening the scope and potential of GRL. Despite a growing body of research dedi…

    Submitted 4 February, 2024; originally announced February 2024.

  22. arXiv:2402.05410  [pdf, other]

    cs.CV

    SpirDet: Towards Efficient, Accurate and Lightweight Infrared Small Target Detector

    Authors: Qianchen Mao, Qiang Li, Bingshu Wang, Yongjun Zhang, Tao Dai, C. L. Philip Chen

    Abstract: In recent years, the detection of infrared small targets using deep learning methods has garnered substantial attention due to notable advancements. To improve the detection capability of small targets, these methods commonly maintain a pathway that preserves high-resolution features of sparse and tiny targets. However, it can result in redundant and expensive computations. To tackle this challeng…

    Submitted 8 February, 2024; originally announced February 2024.

  23. arXiv:2402.03492  [pdf, other]

    eess.IV cs.CV

    Beyond Strong labels: Weakly-supervised Learning Based on Gaussian Pseudo Labels for The Segmentation of Ellipse-like Vascular Structures in Non-contrast CTs

    Authors: Qixiang Ma, Antoine Lucas, Huazhong Shu, Adrien Kaladji, Pascal Haigron

    Abstract: Deep-learning-based automated segmentation of vascular structures in preoperative CT scans contributes to computer-assisted diagnosis and intervention procedure in vascular diseases. While CT angiography (CTA) is the common standard, non-contrast CT imaging is significant as a contrast-risk-free alternative, avoiding complications associated with contrast agents. However, the challenges of labor-i…

    Submitted 5 February, 2024; originally announced February 2024.

  24. arXiv:2402.02514  [pdf, other]

    eess.IV cs.CV cs.LG

    Deep Supervision by Gaussian Pseudo-label-based Morphological Attention for Abdominal Aorta Segmentation in Non-Contrast CTs

    Authors: Qixiang Ma, Antoine Lucas, Adrien Kaladji, Pascal Haigron

    Abstract: The segmentation of the abdominal aorta in non-contrast CT images is a non-trivial task for computer-assisted endovascular navigation, particularly in scenarios where contrast agents are unsuitable. While state-of-the-art deep learning segmentation models have been proposed recently for this task, they are trained on manually annotated strong labels. However, the inherent ambiguity in the boundary…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted by 21st IEEE International Symposium on Biomedical Imaging

  25. arXiv:2402.02425  [pdf, other]

    cs.LG physics.flu-dyn

    EuLagNet: Eulerian Fluid Prediction with Lagrangian Dynamics

    Authors: Qilong Ma, Haixu Wu, Lanxiang Xing, Jianmin Wang, Mingsheng Long

    Abstract: Accurately predicting the future fluid is important to extensive areas, such as meteorology, oceanology and aerodynamics. However, since the fluid is usually observed from an Eulerian perspective, its active and intricate dynamics are seriously obscured and confounded in static grids, bringing thorny challenges to the prediction. This paper introduces a new Lagrangian-guided paradigm to tackle the…

    Submitted 4 February, 2024; originally announced February 2024.

  26. arXiv:2402.00262  [pdf]

    cs.AI

    Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective

    Authors: Qun Ma, Xiao Xue, Deyu Zhou, Xiangning Yu, Donghua Liu, Xuwen Zhang, Zihan Zhao, Yifan Shen, Peilin Ji, Juanjuan Li, Gang Wang, Wanpeng Ma

    Abstract: Computational experiments have emerged as a valuable method for studying complex systems, involving the algorithmization of counterfactuals. However, accurately representing real social systems in Agent-based Modeling (ABM) is challenging due to the diverse and intricate characteristics of humans, including bounded rationality and heterogeneity. To address this limitation, the integration of Large…

    Submitted 31 January, 2024; originally announced February 2024.

  27. arXiv:2401.17812  [pdf, other]

    cs.NI cs.AI

    Deterministic Computing Power Networking: Architecture, Technologies and Prospects

    Authors: Qingmin Jia, Yujiao Hu, Xiaomao Zhou, Qianpiao Ma, Kai Guo, Huayu Zhang, Renchao Xie, Tao Huang, Yunjie Liu

    Abstract: With the development of new Internet services such as computation-intensive and delay-sensitive tasks, the traditional "Best Effort" network transmission mode has been greatly challenged. The network system is urgently required to provide end-to-end transmission determinacy and computing determinacy for new applications to ensure the safe and efficient operation of services. Based on the research…

    Submitted 31 January, 2024; originally announced January 2024.

  28. arXiv:2401.09181  [pdf, other]

    cs.LG

    Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer

    Authors: Junhao Zheng, Qianli Ma, Zhen Liu, Binquan Wu, Huawen Feng

    Abstract: Multimodal Continual Instruction Tuning (MCIT) enables Multimodal Large Language Models (MLLMs) to meet continuously emerging requirements without expensive retraining. MCIT faces two major obstacles: catastrophic forgetting (where old knowledge is forgotten) and negative forward transfer (where the performance of future tasks is degraded). Although existing methods have greatly alleviated catastr…

    Submitted 28 February, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  29. arXiv:2401.05507  [pdf, other]

    cs.CL cs.AI

    InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

    Authors: Xueyu Hu, Ziyu Zhao, Shuang Wei, Ziwei Chai, Qianli Ma, Guoyin Wang, Xuwu Wang, Jing Su, Jingjing Xu, Ming Zhu, Yao Cheng, Jianbo Yuan, Jiwei Li, Kun Kuang, Yang Yang, Hongxia Yang, Fei Wu

    Abstract: In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks. These tasks require agents to solve complex tasks end-to-end by interacting with an execution environment. This benchmark contains DAEval, a dataset consisting of 257 data analysis questions derived from 52 CSV files, and an agent framework which incorpora…

    Submitted 11 March, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: 27 pages, 7 figures, work in progress

  30. arXiv:2401.04507  [pdf, other]

    cs.CL cs.AI

    TechGPT-2.0: A large language model project to solve the task of knowledge graph construction

    Authors: Jiaqi Wang, Yuying Chang, Zhong Li, Ning An, Qi Ma, Lei Hei, Haibo Luo, Yifei Lu, Feiliang Ren

    Abstract: Large language models have exhibited robust performance across diverse natural language processing tasks. This report introduces TechGPT-2.0, a project designed to enhance the capabilities of large language models specifically in knowledge graph construction tasks, including named entity recognition (NER) and relationship triple extraction (RTE) tasks in NLP applications. Additionally, it serves a…

    Submitted 9 January, 2024; originally announced January 2024.

  31. arXiv:2312.15622  [pdf, other]

    cs.CV cs.AI cs.MM

    Scalable Face Image Coding via StyleGAN Prior: Towards Compression for Human-Machine Collaborative Vision

    Authors: Qi Mao, Chongyu Wang, Meng Wang, Shiqi Wang, Ruijie Chen, Libiao Jin, Siwei Ma

    Abstract: The accelerated proliferation of visual content and the rapid development of machine vision technologies bring significant challenges in delivering visual data on a gigantic scale, which shall be effectively represented to satisfy both human and machine requirements. In this work, we investigate how hierarchical representations derived from the advanced generative prior facilitate constructing an…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: Accepted by IEEE TIP

  32. arXiv:2312.11396  [pdf, other]

    cs.CV cs.AI

    MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance

    Authors: Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou

    Abstract: Recent diffusion-based image editing approaches have exhibited impressive editing capabilities in images with simple compositions. However, localized editing in complex scenarios has not been well-studied in the literature, despite its growing real-world demands. Existing mask-based inpainting methods fall short of retaining the underlying structure within the edit region. Meanwhile, mask-free att…

    Submitted 21 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: for project page, see https://mag-edit.github.io/

  33. arXiv:2312.07887  [pdf, other]

    cs.CL cs.LG

    Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models

    Authors: Junhao Zheng, Shengjie Qiu, Qianli Ma

    Abstract: Incremental Learning (IL) has been a long-standing problem in both vision and Natural Language Processing (NLP) communities. In recent years, as Pre-trained Language Models (PLMs) have achieved remarkable progress in various NLP downstream tasks, utilizing PLMs as backbones has become a common practice in recent research of IL in NLP. Most assume that catastrophic forgetting is the biggest obstacl…

    Submitted 12 December, 2023; originally announced December 2023.

  34. arXiv:2312.07248  [pdf, ps, other]

    cs.LG cs.AI

    Multi-Granularity Framework for Unsupervised Representation Learning of Time Series

    Authors: Chengyang Ye, Qiang Ma

    Abstract: Representation learning plays a critical role in the analysis of time series data and has high practical value across a wide range of applications, including trend analysis, time series data retrieval and forecasting. In practice, data confusion is a significant issue as it can considerably impact the effectiveness and accuracy of data analysis, machine learning models and decision-making processe…

    Submitted 12 December, 2023; originally announced December 2023.

  35. arXiv:2312.01919  [pdf, other]

    cs.CV

    COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction

    Authors: Qihang Ma, Xin Tan, Yanyun Qu, Lizhuang Ma, Zhizhong Zhang, Yuan Xie

    Abstract: The autonomous driving community has shown significant interest in 3D occupancy prediction, driven by its exceptional geometric perception and general object recognition capabilities. To achieve this, current works try to construct a Tri-Perspective View (TPV) or Occupancy (OCC) representation extending from the Bird-Eye-View perception. However, compressed views like TPV representation lose 3D ge…

    Submitted 11 April, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: CVPR2024. Code is available at https://github.com/NotACracker/COTR

  36. arXiv:2311.17119  [pdf, other]

    cs.CV

    Continuous Pose for Monocular Cameras in Neural Implicit Representation

    Authors: Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: In this paper, we showcase the effectiveness of optimizing monocular camera poses as a continuous function of time. The camera poses are represented using an implicit neural function which maps the given time to the corresponding camera pose. The mapped camera poses are then used for the downstream tasks where joint camera pose optimization is also required. While doing so, the network parameters…

    Submitted 2 March, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  37. arXiv:2311.06952  [pdf, other]

    cs.LG math.OC

    A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets

    Authors: Jiayang Ren, Valentín Osuna-Enciso, Morimasa Okamoto, Qiangqiang Mao, Chaojie Ji, Liang Cao, Kaixun Hua, Yankai Cao

    Abstract: Decision trees are essential yet NP-complete to train, prompting the widespread use of heuristic methods such as CART, which suffers from sub-optimal performance due to its greedy nature. Recently, breakthroughs in finding optimal decision trees have emerged; however, these methods still face significant computational costs and struggle with continuous features in large-scale datasets and deep tre…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: 36 pages (13 pages for the main body, 23 pages for the appendix), 7 figures

  38. arXiv:2311.04760  [pdf, other]

    cs.IR cs.LG

    Towards Open-world Cross-Domain Sequential Recommendation: A Model-Agnostic Contrastive Denoising Approach

    Authors: Wujiang Xu, Xuying Ning, Wenfang Lin, Mingming Ha, Qiongxu Ma, Qianqiao Liang, Xuewen Tao, Linxun Chen, Bing Han, Minnan Luo

    Abstract: Cross-domain sequential recommendation (CDSR) aims to address the data sparsity problems that exist in traditional sequential recommendation (SR) systems. The existing approaches aim to design a specific cross-domain unit that can transfer and propagate information across multiple domains by relying on overlapping users with abundant behaviors. However, in real-world recommender systems, CDSR sc…

    Submitted 23 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

  39. Rethinking Cross-Domain Sequential Recommendation under Open-World Assumptions

    Authors: Wujiang Xu, Qitian Wu, Runzhong Wang, Mingming Ha, Qiongxu Ma, Linxun Chen, Bing Han, Junchi Yan

    Abstract: Cross-Domain Sequential Recommendation (CDSR) methods aim to tackle the data sparsity and cold-start problems present in Single-Domain Sequential Recommendation (SDSR). Existing CDSR works design their elaborate structures relying on overlapping users to propagate the cross-domain information. However, current CDSR methods make closed-world assumptions, assuming fully overlapping users across mult…

    Submitted 12 April, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Journal ref: Proceedings of the ACM Web Conference 2024 (WWW '24)

  40. arXiv:2311.03127  [pdf, other]

    cs.CL cs.AI

    Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs

    Authors: Longyue Wang, Zhaopeng Tu, Yan Gu, Siyou Liu, Dian Yu, Qingsong Ma, Chenyang Lyu, Liting Zhou, Chao-Hong Liu, Yufeng Ma, Weiyu Chen, Yvette Graham, Bonnie Webber, Philipp Koehn, Andy Way, Yulin Yuan, Shuming Shi

    Abstract: Translating literary works has perennially stood as an elusive dream in machine translation (MT), a journey steeped in intricate challenges. To foster progress in this domain, we hold a new shared task at WMT 2023, the first edition of the Discourse-Level Literary Translation. First, we (Tencent AI Lab and China Literature Ltd.) release a copyrighted and document-level Chinese-English web novel co…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: WMT2023 Discourse-Level Literary Translation Shared Task Overview Paper

  41. arXiv:2311.02775  [pdf, other]

    cs.LG cs.AI cs.CL

    AI-TA: Towards an Intelligent Question-Answer Teaching Assistant using Open-Source LLMs

    Authors: Yann Hicke, Anmol Agarwal, Qianou Ma, Paul Denny

    Abstract: Responding to the thousands of student questions on online QA platforms each semester has a considerable human cost, particularly in computing courses with rapidly growing enrollments. To address the challenges of scalable and intelligent question-answering (QA), we introduce an innovative solution that leverages open-source Large Language Models (LLMs) from the LLaMA-2 family to ensure data priva…

    Submitted 18 December, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: Updates for camera-ready submission

    Journal ref: NeurIPS Workshop on Generative AI for Education (GAIED), 2023

  42. arXiv:2311.01282  [pdf, other]

    cs.LG cs.CL

    FlashDecoding++: Faster Large Language Model Inference on GPUs

    Authors: Ke Hong, Guohao Dai, Jiaming Xu, Qiuli Mao, Xiuhong Li, Jun Liu, Kangdi Chen, Yuhan Dong, Yu Wang

    Abstract: The Large Language Model (LLM) is becoming increasingly important in various domains. However, the following challenges still remain unsolved in accelerating LLM inference: (1) Synchronized partial softmax update. The softmax operation requires a synchronized update operation among each partial softmax result, leading to ~20% overheads for the attention computation in LLMs. (2) Under-utilized compu…

    Submitted 5 January, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

  43. arXiv:2311.00921  [pdf, other]

    math.NA cs.MS

    $O(N)$ distributed direct factorization of structured dense matrices using runtime systems

    Authors: Sameer Deshmukh, Qinxiang Ma, Rio Yokota, George Bosilca

    Abstract: Structured dense matrices result from boundary integral problems in electrostatics and geostatistics, and also Schur complements in sparse preconditioners such as multi-frontal methods. Exploiting the structure of such matrices can reduce the time for dense direct factorization from $O(N^3)$ to $O(N)$. The Hierarchically Semi-Separable (HSS) matrix is one such low rank matrix format that can be fa…

    Submitted 1 November, 2023; originally announced November 2023.

  44. arXiv:2310.19347  [pdf, other]

    cs.CL cs.AI

    Improving Factual Consistency of Text Summarization by Adversarially Decoupling Comprehension and Embellishment Abilities of LLMs

    Authors: Huawen Feng, Yan Fan, Xiong Liu, Ting-En Lin, Zekun Yao, Yuchuan Wu, Fei Huang, Yongbin Li, Qianli Ma

    Abstract: Despite the recent progress in text summarization made by large language models (LLMs), they often generate summaries that are factually inconsistent with original articles, known as "hallucinations" in text generation. Unlike previous small models (e.g., BART, T5), current LLMs make fewer silly mistakes but more sophisticated ones, such as imposing cause and effect, adding false details, overgene…

    Submitted 14 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

  45. arXiv:2310.18992  [pdf, other]

    cs.CL cs.AI

    Bipartite Graph Pre-training for Unsupervised Extractive Summarization with Graph Convolutional Auto-Encoders

    Authors: Qianren Mao, Shaobo Zhao, Jiarui Li, Xiaolei Gu, Shizhu He, Bo Li, Jianxin Li

    Abstract: Pre-trained sentence representations are crucial for identifying significant sentences in unsupervised document extractive summarization. However, the traditional two-step paradigm of pre-training and sentence-ranking creates a gap due to differing optimization objectives. To address this issue, we argue that utilizing pre-trained embeddings derived from a process specifically designed to optimiz…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted by the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)

  46. arXiv:2310.14845  [pdf, other]

    cs.LG

    ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt

    Authors: Mouxiang Chen, Zemin Liu, Chenghao Liu, Jundong Li, Qiheng Mao, Jianling Sun

    Abstract: Recent research has demonstrated the efficacy of pre-training graph neural networks (GNNs) to capture the transferable graph semantics and enhance the performance of various downstream tasks. However, the semantic knowledge learned from pretext tasks might be unrelated to the downstream task, leading to a semantic gap that limits the application of graph pre-training. To reduce this gap, tradition…

    Submitted 17 December, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

  47. arXiv:2310.13008  [pdf, other]

    cs.LG cs.AI cs.CL

    LoBaSS: Gauging Learnability in Supervised Fine-tuning Data

    Authors: Haotian Zhou, Tingkai Liu, Qianli Ma, Jianbo Yuan, Pengfei Liu, Yang You, Hongxia Yang

    Abstract: Supervised Fine-Tuning (SFT) serves as a crucial phase in aligning Large Language Models (LLMs) to specific task prerequisites. The selection of fine-tuning data profoundly influences the model's performance, whose principle is traditionally grounded in data quality and distribution. In this paper, we introduce a new dimension in SFT data selection: learnability. This new dimension is motivated by…

    Submitted 16 October, 2023; originally announced October 2023.

  48. arXiv:2310.11908  [pdf, other]

    cs.GT cs.MA

    Edge Manipulations for the Maximum Vertex-Weighted Bipartite b-matching

    Authors: Gennaro Auricchio, Qun Ma, Jie Zhang

    Abstract: In this paper, we explore the Mechanism Design aspects of the Maximum Vertex-weighted $b$-Matching (MVbM) problem on bipartite graphs $(A\cup T, E)$. The set $A$ comprises agents, while $T$ represents tasks. The set $E$ is the private information of either agents or tasks. In this framework, we investigate three mechanisms - $\MB$, $\MD$, and $\MG$ - that, given an MVbM problem as input, return a…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 37 pages, 6 figures. arXiv admin note: substantial text overlap with arXiv:2307.12305

    MSC Class: 91A35 91A68

  49. arXiv:2310.10080  [pdf, other]

    cs.CL

    Let's reward step by step: Step-Level reward model as the Navigators for Reasoning

    Authors: Qianli Ma, Haotian Zhou, Tingkai Liu, Jianbo Yuan, Pengfei Liu, Yang You, Hongxia Yang

    Abstract: Recent years have seen considerable advancements in multi-step reasoning with Large Language Models (LLMs). Previous studies have elucidated the merits of integrating feedback or search mechanisms during model inference to improve the reasoning accuracy. The Process-Supervised Reward Model (PRM) typically furnishes LLMs with step-by-step feedback during the training phase, akin to Proximal Po…

    Submitted 16 October, 2023; originally announced October 2023.

  50. arXiv:2310.10022  [pdf, other]

    cs.CV

    An Empirical Study of Super-resolution on Low-resolution Micro-expression Recognition

    Authors: Ling Zhou, Mingpei Wang, Xiaohua Huang, Wenming Zheng, Qirong Mao, Guoying Zhao

    Abstract: Micro-expression recognition (MER) in low-resolution (LR) scenarios presents an important and complex challenge, particularly for practical applications such as group MER in crowded environments. Despite considerable advancements in super-resolution techniques for enhancing the quality of LR images and videos, few studies have focused on investigating super-resolution for improving LR MER. The scarcity…

    Submitted 15 October, 2023; originally announced October 2023.