Showing 1–50 of 2,493 results for author: Liu, S

Searching in archive cs.
  1. arXiv:2405.05767  [pdf]

    cs.NE

    Large Language Model-Aided Evolutionary Search for Constrained Multiobjective Optimization

    Authors: Zeyi Wang, Songbai Liu, Jianyong Chen, Kay Chen Tan

    Abstract: Evolutionary algorithms excel in solving complex optimization problems, especially those with multiple objectives. However, their stochastic nature can sometimes hinder rapid convergence to the global optima, particularly in scenarios involving constraints. In this study, we employ a large language model (LLM) to enhance evolutionary search for solving constrained multi-objective optimization prob…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 15 pages, 6 figures, 2024 International Conference on Intelligent Computing

  2. arXiv:2405.04966  [pdf, other]

    cs.IT cs.CV cs.MA

    Communication-Efficient Collaborative Perception via Information Filling with Codebook

    Authors: Yue Hu, Juntong Peng, Sifei Liu, Junhao Ge, Si Liu, Siheng Chen

    Abstract: Collaborative perception empowers each agent to improve its perceptual ability through the exchange of perceptual messages with other agents. It inherently results in a fundamental trade-off between perception ability and communication cost. To address this bottleneck issue, our core idea is to optimize the collaborative messages from two key aspects: representation and selection. The proposed cod…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 10 pages, Accepted by CVPR 2024

  3. arXiv:2405.04416  [pdf, other]

    cs.CV

    DistGrid: Scalable Scene Reconstruction with Distributed Multi-resolution Hash Grid

    Authors: Sidun Liu, Peng Qiao, Zongxin Ye, Wenyu Li, Yong Dou

    Abstract: Neural Radiance Field (NeRF) achieves extremely high quality in object-scaled and indoor scene reconstruction. However, there exist some challenges when reconstructing large-scale scenes. MLP-based NeRFs suffer from limited network capacity, while volume-based NeRFs are heavily memory-consuming when the scene resolution increases. Recent approaches propose to geographically partition the scene and…

    Submitted 8 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: Originally submitted to Siggraph Asia 2023

  4. arXiv:2405.04233  [pdf, other]

    cs.CV cs.LG

    Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models

    Authors: Fan Bao, Chendong Xiang, Gang Yue, Guande He, Hongzhou Zhu, Kaiwen Zheng, Min Zhao, Shilong Liu, Yaole Wang, Jun Zhu

    Abstract: We introduce Vidu, a high-performance text-to-video generator that is capable of producing 1080p videos up to 16 seconds in a single generation. Vidu is a diffusion model with U-ViT as its backbone, which unlocks the scalability and the capability for handling long videos. Vidu exhibits strong coherence and dynamism, and is capable of generating both realistic and imaginative videos, as well as un…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Project page at https://www.shengshu-ai.com/vidu

  5. arXiv:2405.03905  [pdf, other]

    cs.AR cs.CV cs.SD eess.AS

    A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM

    Authors: Qinyu Chen, Kwantae Kim, Chang Gao, Sheng Zhou, Taekwang Jang, Tobi Delbruck, Shih-Chii Liu

    Abstract: This paper introduces, to the best of the authors' knowledge, the first fine-grained temporal sparsity-aware keyword spotting (KWS) IC leveraging temporal similarities between neighboring feature vectors extracted from input frames and network hidden states, eliminating unnecessary operations and memory accesses. This KWS IC, featuring a bio-inspired delta-gated recurrent neural network (ΔRNN) cla…

    Submitted 6 May, 2024; originally announced May 2024.

  6. arXiv:2405.03481  [pdf, other]

    cs.LG

    AnchorGT: Efficient and Flexible Attention Architecture for Scalable Graph Transformers

    Authors: Wenhao Zhu, Guojie Song, Liang Wang, Shaoguo Liu

    Abstract: Graph Transformers (GTs) have significantly advanced the field of graph representation learning by overcoming the limitations of message-passing graph neural networks (GNNs) and demonstrating promising performance and expressive power. However, the quadratic complexity of the self-attention mechanism in GTs has limited their scalability, and previous approaches to address this issue often suffer from…

    Submitted 6 May, 2024; originally announced May 2024.

  7. arXiv:2405.02801  [pdf, other]

    cs.SD cs.AI eess.AS

    Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models

    Authors: Tianze Xu, Jiajun Li, Xuesong Chen, Xinrui Yao, Shuchang Liu

    Abstract: In recent years, AI-Generated Content (AIGC) has witnessed rapid advancements, facilitating the generation of music, images, and other forms of artistic expression across various industries. However, research on general multi-modal music generation models remains scarce. To fill this gap, we propose a multi-modal music generation framework, Mozart's Touch. It could generate aligned music with the c…

    Submitted 7 May, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: 7 pages, 2 figures, submitted to ACM MM 2024

  8. arXiv:2405.01851  [pdf, other]

    cs.LG cs.AI

    Deep Learning Inference on Heterogeneous Mobile Processors: Potentials and Pitfalls

    Authors: Sicong Liu, Wentao Zhou, Zimu Zhou, Bin Guo, Minfan Wang, Cheng Fang, Zheng Lin, Zhiwen Yu

    Abstract: There is a growing demand to deploy computation-intensive deep learning (DL) models on resource-constrained mobile devices for real-time intelligent applications. Equipped with a variety of processing units such as CPUs, GPUs, and NPUs, the mobile devices hold potential to accelerate DL inference via parallel execution across heterogeneous processors. Various efficient parallel methods have been e…

    Submitted 3 May, 2024; originally announced May 2024.

  9. arXiv:2405.01688  [pdf, other]

    cs.CV

    Adapting Self-Supervised Learning for Computational Pathology

    Authors: Eric Zimmermann, Neil Tenenholtz, James Hall, George Shaikovski, Michal Zelechowski, Adam Casson, Fausto Milletari, Julian Viret, Eugene Vorontsov, Siqi Liu, Kristen Severson

    Abstract: Self-supervised learning (SSL) has emerged as a key technique for training networks that can generalize well to diverse tasks without task-specific supervision. This property makes SSL desirable for computational pathology, the study of digitized images of tissues, as there are many target applications and often limited labeled training samples. However, SSL algorithms and models have been primari…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Presented at DCA in MI Workshop, CVPR 2024

  10. arXiv:2405.01411  [pdf, other]

    cs.CR

    IDPFilter: Mitigating Interdependent Privacy Issues in Third-Party Apps

    Authors: Shuaishuai Liu, Gergely Biczók

    Abstract: Third-party applications have become an essential part of today's online ecosystem, enhancing the functionality of popular platforms. However, the intensive data exchange underlying their proliferation has increased concerns about interdependent privacy (IDP). This paper provides a comprehensive investigation into the previously underinvestigated IDP issues of third-party apps. Specifically, first…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 36 pages, 12 figures

  11. arXiv:2405.01333  [pdf, other]

    cs.RO cs.CV

    NeRF in Robotics: A Survey

    Authors: Guangming Wang, Lei Pan, Songyou Peng, Shaohui Liu, Chenfeng Xu, Yanzi Miao, Wei Zhan, Masayoshi Tomizuka, Marc Pollefeys, Hesheng Wang

    Abstract: Meticulous 3D environment representations have been a longstanding goal in computer vision and robotics fields. The recent emergence of neural implicit representations has introduced radical innovation to this field as implicit representations enable numerous capabilities. Among these, the Neural Radiance Field (NeRF) has sparked a trend because of the huge representational advantages, such as sim…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 21 pages, 19 figures

  12. arXiv:2405.01000  [pdf, other]

    cs.IT eess.SP

    Low-Complexity Near-Field Localization with XL-MIMO Sectored Uniform Circular Arrays

    Authors: Shicong Liu, Xianghao Yu

    Abstract: Rapid advancement of antenna technology catalyses the popularization of extremely large-scale multiple-input multiple-output (XL-MIMO) antenna arrays, which pose unique challenges for localization with the inescapable near-field effect. In this paper, we propose an efficient near-field localization algorithm by leveraging a sectored uniform circular array (sUCA). In particular, we first customize…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 6 pages, 6 figures

  13. arXiv:2404.19534  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results

    Authors: Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu Jin, Guanqun Liu, Chen Change Loy, Lize Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Shuan Chen, Guangqi Shao, Xiaotao Wang, Lei Lei, Qirui Yang, Qihua Cheng, Zhiqiang Xu, Yihao Liu, Huanjing Yue, Jingyu Yang, et al. (38 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Nighttime Flare Removal Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  14. arXiv:2404.19282  [pdf, other]

    cs.MM

    Dual Dynamic Threshold Adjustment Strategy for Deep Metric Learning

    Authors: Xiruo Jiang, Yazhou Yao, Sheng Liu, Fumin Shen, Liqiang Nie, Xiansheng Hua

    Abstract: Loss functions and sample mining strategies are essential components in deep metric learning algorithms. However, existing loss functions or mining strategies often necessitate the incorporation of additional hyperparameters, notably the threshold, which defines whether the sample pair is informative. The threshold provides a stable numerical standard for determining whether to retain the pairs.…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: accepted by ACM Transactions on Multimedia Computing, Communications, and Applications

  15. arXiv:2404.19209  [pdf, other]

    cs.DC

    AdaOper: Energy-efficient and Responsive Concurrent DNN Inference on Mobile Devices

    Authors: Zheng Lin, Bin Guo, Sicong Liu, Wentao Zhou, Yasan Ding, Yu Zhang, Zhiwen Yu

    Abstract: Deep neural networks (DNNs) have driven extensive applications in mobile technology. However, for long-running mobile apps like voice assistants or video applications on smartphones, energy efficiency is critical for battery-powered devices. The rise of heterogeneous processors in mobile devices today has introduced new challenges for optimizing energy efficiency. Our key insight is that partitioning…

    Submitted 29 April, 2024; originally announced April 2024.

  16. arXiv:2404.18890  [pdf, other]

    cs.CV

    Hide and Seek: How Does Watermarking Impact Face Recognition?

    Authors: Yuguang Yao, Steven Grosz, Sijia Liu, Anil Jain

    Abstract: The recent progress in generative models has revolutionized the synthesis of highly realistic images, including face images. This technological development has undoubtedly helped face recognition, such as training data augmentation for higher recognition accuracy and data privacy. However, it has also introduced novel challenges concerning the responsible use and proper attribution of computer gen…

    Submitted 29 April, 2024; originally announced April 2024.

  17. arXiv:2404.18656  [pdf, other]

    cs.IT

    Symmetric Entropy Regions of Degrees Six and Seven

    Authors: Zihan Li, Shaocheng Liu, Qi Chen

    Abstract: In this paper, we classify all G-symmetric almost entropic regions according to their Shannon-tightness, that is, whether they can be fully characterized by Shannon-type inequalities, where G is a permutation group of degree 6 or 7.

    Submitted 29 April, 2024; originally announced April 2024.

    Journal ref: 2024 IEEE International Symposium on Information Theory

  18. M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework

    Authors: Zijian Zhang, Shuchang Liu, Jiaao Yu, Qingpeng Cai, Xiangyu Zhao, Chunxu Zhang, Ziru Liu, Qidong Liu, Hongwei Zhao, Lantao Hu, Peng Jiang, Kun Gai

    Abstract: Multi-domain recommendation and multi-task recommendation have demonstrated their effectiveness in leveraging common information from different domains and objectives for comprehensive user modeling. Nonetheless, the practical recommendation usually faces multiple domains and tasks simultaneously, which cannot be well-addressed by current methods. To this end, we introduce M3oE, an adaptive multi-…

    Submitted 7 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  19. arXiv:2404.18239  [pdf, other]

    cs.LG cs.CL

    SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning

    Authors: Jinghan Jia, Yihua Zhang, Yimeng Zhang, Jiancheng Liu, Bharat Runwal, James Diffenderfer, Bhavya Kailkhura, Sijia Liu

    Abstract: Large Language Models (LLMs) have highlighted the necessity of effective unlearning mechanisms to comply with data regulations and ethical AI practices. LLM unlearning aims at removing undesired data influences and associated model capabilities without compromising utility out of the scope of unlearning. While interest in studying LLM unlearning is growing, the impact of the optimizer choice for LL…

    Submitted 28 April, 2024; originally announced April 2024.

  20. arXiv:2404.18133  [pdf, other]

    cs.GT

    Fair Division of Indivisible Goods with Comparison-Based Queries

    Authors: Xiaolin Bu, Zihao Li, Shengxin Liu, Jiaxin Song, Biaoshuai Tao

    Abstract: We study the problem of fairly allocating $m$ indivisible goods to $n$ agents, where agents may have different preferences over the goods. In the traditional setting, agents' valuations are provided as inputs to the algorithm. In this paper, we study a new comparison-based query model where the algorithm presents two bundles of goods to an agent and the agent responds by telling the algorithm whic…

    Submitted 28 April, 2024; originally announced April 2024.

  21. arXiv:2404.18132  [pdf, other]

    cs.GT

    Allocating Mixed Goods with Customized Fairness and Indivisibility Ratio

    Authors: Bo Li, Zihao Li, Shengxin Liu, Zekai Wu

    Abstract: We consider the problem of fairly allocating a combination of divisible and indivisible goods. While fairness criteria like envy-freeness (EF) and proportionality (PROP) can always be achieved for divisible goods, only their relaxed versions, such as the "up to one" relaxations EF1 and PROP1, can be satisfied when the goods are indivisible. The "up to one" relaxations require the fairness cond…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Appears in the 33rd International Joint Conference on Artificial Intelligence (IJCAI), 2024

  22. arXiv:2404.18058  [pdf, other]

    eess.IV cs.CV

    Joint Reference Frame Synthesis and Post Filter Enhancement for Versatile Video Coding

    Authors: Weijie Bao, Yuantong Zhang, Jianghao Jia, Zhenzhong Chen, Shan Liu

    Abstract: This paper presents the joint reference frame synthesis (RFS) and post-processing filter enhancement (PFE) for Versatile Video Coding (VVC), aiming to explore the combination of different neural network-based video coding (NNVC) tools to better utilize the hierarchical bi-directional coding structure of VVC. Both RFS and PFE utilize the Space-Time Enhancement Network (STENet), which receives two i…

    Submitted 27 April, 2024; originally announced April 2024.

  23. arXiv:2404.17974  [pdf, other]

    cs.RO cs.CV

    HVOFusion: Incremental Mesh Reconstruction Using Hybrid Voxel Octree

    Authors: Shaofan Liu, Junbo Chen, Jianke Zhu

    Abstract: Incremental scene reconstruction is essential to navigation in robotics. Most of the conventional methods typically make use of either TSDF (truncated signed distance functions) volume or neural networks to implicitly represent the surface. Due to the voxel representation or the time-consuming sampling involved, they have difficulty in balancing speed, memory storage, and surface quality. In…

    Submitted 27 April, 2024; originally announced April 2024.

  24. arXiv:2404.17883  [pdf, other]

    cs.CV

    Underwater Variable Zoom: Depth-Guided Perception Network for Underwater Image Enhancement

    Authors: Zhixiong Huang, Xinying Wang, Jinjiang Li, Shenglan Liu, Lin Feng

    Abstract: Underwater scenes intrinsically involve degradation problems owing to heterogeneous ocean elements. Prevailing underwater image enhancement (UIE) methods stick to straightforward feature modeling to learn the mapping function, which leads to limited vision gain as it lacks more explicit physical cues (e.g., depth). In this work, we investigate injecting the depth prior into the deep UIE model for…

    Submitted 1 May, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

  25. arXiv:2404.17100  [pdf, other]

    cs.CV

    Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting

    Authors: Yuanyuan Liu, Yuxuan Huang, Shuyang Liu, Yibing Zhan, Zijing Chen, Zhe Chen

    Abstract: In Video-based Facial Expression Recognition (V-FER), models are typically trained on closed-set datasets with a fixed number of known classes. However, these V-FER models cannot deal with unknown classes that are prevalent in real-world scenarios. In this paper, we introduce a challenging Open-set Video-based Facial Expression Recognition (OV-FER) task, aiming at identifying not only known classe…

    Submitted 25 April, 2024; originally announced April 2024.

  26. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  27. arXiv:2404.16271  [pdf]

    cs.CR cond-mat.mtrl-sci

    True random number generation using metastable 1T' molybdenum ditelluride

    Authors: Yang Liu, Pengyu Liu, Yingyi Wen, Zihan Liang, Songwei Liu, Lekai Song, Jingfang Pei, Xiaoyue Fan, Teng Ma, Gang Wang, Shuo Gao, Kong-Pang Pun, Xiaolong Chen, Guohua Hu

    Abstract: True random numbers play a critical role in secure cryptography. The generation relies on a stable and readily extractable entropy source. Here, from solution-processed structurally metastable 1T' MoTe2, we prove stable output of featureless, stochastic, and yet stable conductance noise at a broad temperature (down to 15 K) with minimal power consumption (down to 0.05 micro-W). Our characterizatio…

    Submitted 24 April, 2024; originally announced April 2024.

  28. arXiv:2404.16205  [pdf, other]

    cs.CV cs.MM

    AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results

    Authors: Marcos V. Conde, Saman Zadtootaghaj, Nabajeet Barman, Radu Timofte, Chenlong He, Qi Zheng, Ruoxi Zhu, Zhengzhong Tu, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Zicheng Zhang, Haoning Wu, Yingjie Zhou, Chunyi Li, Xiaohong Liu, Weisi Lin, Guangtao Zhai, et al. (11 additional authors not shown)

    Abstract: This paper reviews the AIS 2024 Video Quality Assessment (VQA) Challenge, focused on User-Generated Content (UGC). The aim of this challenge is to gather deep learning-based methods capable of estimating the perceptual quality of UGC videos. The user-generated videos from the YouTube UGC Dataset include diverse content (sports, games, lyrics, anime, etc.), quality and resolutions. The proposed met…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Workshop -- AI for Streaming (AIS) Video Quality Assessment Challenge

  29. arXiv:2404.16006  [pdf, other]

    cs.CV

    MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

    Authors: Kaining Ying, Fanqing Meng, Jin Wang, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Runjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao

    Abstract: Large Vision-Language Models (LVLMs) show significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation. However, existing multimodal evaluation benchmarks cover a limited number of multimodal tasks testing rudimentary capabilities, falling short in tracking LVLM development. In this study, we present MMT-Bench, a comprehensive benchmark designed to…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 77 pages, 41 figures

  30. arXiv:2404.15714  [pdf, other]

    cs.CV cs.AI

    Ada-DF: An Adaptive Label Distribution Fusion Network For Facial Expression Recognition

    Authors: Shu Liu, Yan Xu, Tongming Wan, Xiaoyan Kui

    Abstract: Facial expression recognition (FER) plays a significant role in our daily life. However, annotation ambiguity in the datasets could greatly hinder the performance. In this paper, we address the FER task via the label distribution learning paradigm, and develop a dual-branch Adaptive Distribution Fusion (Ada-DF) framework. One auxiliary branch is constructed to obtain the label distributions of samples. Th…

    Submitted 24 April, 2024; originally announced April 2024.

  31. arXiv:2404.14946  [pdf, other]

    cs.SD cs.CL eess.AS

    StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations

    Authors: Sen Liu, Yiwei Guo, Xie Chen, Kai Yu

    Abstract: While acoustic expressiveness has long been studied in expressive text-to-speech (ETTS), the inherent expressiveness in text lacks sufficient attention, especially for ETTS of artistic works. In this paper, we introduce StoryTTS, a highly expressive TTS dataset that contains rich expressiveness from both acoustic and textual perspectives, from the recording of a Mandarin storytelling show. A systematic and com…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by ICASSP 2024

    Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 11521-11525

  32. arXiv:2404.14712  [pdf, other]

    physics.ao-ph cs.AI cs.DC eess.IV physics.geo-ph

    ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability

    Authors: Xiao Wang, Aristeidis Tsaris, Siyan Liu, Jong-Youl Choi, Ming Fan, Wei Zhang, Junqi Yin, Moetasim Ashfaq, Dan Lu, Prasanna Balaprakash

    Abstract: Earth system predictability is challenged by the complexity of environmental dynamics and the multitude of variables involved. Current AI foundation models, although advanced by leveraging large and heterogeneous data, are often constrained by their size and data integration, limiting their effectiveness in addressing the full range of Earth system prediction challenges. To overcome these limitati…

    Submitted 22 April, 2024; originally announced April 2024.

  33. arXiv:2404.14709  [pdf, ps, other]

    cs.CV eess.IV

    SC-HVPPNet: Spatial and Channel Hybrid-Attention Video Post-Processing Network with CNN and Transformer

    Authors: Tong Zhang, Wenxue Cui, Shaohui Liu, Feng Jiang

    Abstract: Convolutional Neural Network (CNN) and Transformer have attracted much attention recently for video post-processing (VPP). However, the interaction between CNN and Transformer in existing VPP methods is not fully explored, leading to inefficient communication between the local and global extracted features. In this paper, we explore the interaction between CNN and Transformer in the task of VPP, a…

    Submitted 22 April, 2024; originally announced April 2024.

  34. arXiv:2404.14646  [pdf, other]

    cs.SE cs.AI

    Exploring and Unleashing the Power of Large Language Models in Automated Code Translation

    Authors: Zhen Yang, Fang Liu, Zhongxing Yu, Jacky Wai Keung, Jia Li, Shuo Liu, Yifan Hong, Xiaoxue Ma, Zhi Jin, Ge Li

    Abstract: Code translation tools are developed for automatic source-to-source translation. Although learning-based transpilers have shown impressive enhancement against rule-based counterparts, owing to their task-specific pre-training on extensive monolingual corpora, their current performance still remains unsatisfactory for practical deployment, and the associated training resources are also prohibitivel…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 23 pages, 7 figures, accepted by FSE'24 (2024 ACM International Conference on the Foundations of Software Engineering)

  35. arXiv:2404.14006  [pdf, other]

    cs.LG cs.CV

    Distilled Datamodel with Reverse Gradient Matching

    Authors: Jingwen Ye, Ruonan Yu, Songhua Liu, Xinchao Wang

    Abstract: The proliferation of large-scale AI models trained on extensive datasets has revolutionized machine learning. With these models taking on increasingly central roles in various applications, the need to understand their behavior and enhance interpretability has become paramount. To investigate the impact of changes in training data on a pre-trained model, a common approach is leave-one-out retraini…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024

  36. arXiv:2404.12759  [pdf, other]

    cs.LG

    decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points

    Authors: Yi Guo, Fanliu Kong, Xiaoyang Li, Hui Li, Wei Chen, Xiaogang Tian, Jinping Cai, Yang Zhang, Shouda Liu

    Abstract: Quantization has emerged in recent years as one of the most promising compression technologies for deploying efficient large models in various real-time applications. Considering that the storage and IO of weights take up the vast majority of the overhead inside a large model, weight-only quantization can lead to large gains. However, existing quantization schemes suffer from significant accuracy degr…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: quantization for deep models

  37. arXiv:2404.12659  [pdf, ps, other]

    cs.CL

    SOS-1K: A Fine-grained Suicide Risk Classification Dataset for Chinese Social Media Analysis

    Authors: Hongzhi Qi, Hanfei Liu, Jianqiang Li, Qing Zhao, Wei Zhai, Dan Luo, Tian Yu He, Shuo Liu, Bing Xiang Yang, Guanghui Fu

    Abstract: On social media, users frequently express personal emotions, a subset of which may indicate potential suicidal tendencies. The implicit and varied forms of expression in internet language complicate accurate and rapid identification of suicidal intent on social media, thus creating challenges for timely intervention efforts. The development of deep learning models for suicide risk detection is…

    Submitted 19 April, 2024; originally announced April 2024.

  38. arXiv:2404.12274  [pdf, other]

    cs.CL cs.AI

    Advancing the Robustness of Large Language Models through Self-Denoised Smoothing

    Authors: Jiabao Ji, Bairu Hou, Zhen Zhang, Guanhua Zhang, Wenqi Fan, Qing Li, Yang Zhang, Gaowen Liu, Sijia Liu, Shiyu Chang

    Abstract: Although large language models (LLMs) have achieved significant success, their vulnerability to adversarial perturbations, including recent jailbreak attacks, has raised considerable concerns. However, the increasing size of these models and their limited access make improving their robustness a challenging task. Among various defense strategies, randomized smoothing has shown great potential for…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: Accepted by NAACL 2024. Jiabao, Bairu, Zhen, Guanhua contributed equally. This is an updated version of the paper: arXiv:2307.07171

  39. arXiv:2404.11770  [pdf, other]

    cs.CV cs.AI

    Event-Based Eye Tracking. AIS 2024 Challenge Survey

    Authors: Zuowen Wang, Chang Gao, Zongwei Wu, Marcos V. Conde, Radu Timofte, Shih-Chii Liu, Qinyu Chen, Zheng-jun Zha, Wei Zhai, Han Han, Bohao Liao, Yuliang Wu, Zengyu Wan, Zhong Wang, Yang Cao, Ganchao Tan, Jinze Chen, Yan Ru Pei, Sasskia Brüers, Sébastien Crouzet, Douglas McLelland, Oliver Coenen, Baoheng Zhang, Yizhao Gao, Jingyuan Li, et al. (14 additional authors not shown)

    Abstract: This survey reviews the AIS 2024 Event-Based Eye Tracking (EET) Challenge. The task of the challenge focuses on processing eye movement recorded with event cameras and predicting the pupil center of the eye. The challenge emphasizes efficient eye tracking with event cameras to achieve good task accuracy and efficiency trade-off. During the challenge period, 38 participants registered for the Kaggl…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Qinyu Chen is the corresponding author

  40. arXiv:2404.11313  [pdf, other]

    eess.IV cs.AI

    NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

    Authors: Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, et al. (43 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from a popular short-form video platform, i.e., Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge

  41. Inductive Cognitive Diagnosis for Fast Student Learning in Web-Based Online Intelligent Education Systems

    Authors: Shuo Liu, Junhao Shen, Hong Qian, Aimin Zhou

    Abstract: Cognitive diagnosis aims to gauge students' mastery levels based on their response logs. Serving as a pivotal module in web-based online intelligent education systems (WOIESs), it plays an upstream and fundamental role in downstream tasks like learning item recommendation and computerized adaptive testing. WOIESs are open learning environments where numerous new students constantly register and com…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: WWW 2024

  42. arXiv:2404.11256  [pdf, other]

    cs.CV

    MMCBE: Multi-modality Dataset for Crop Biomass Estimation and Beyond

    Authors: Xuesong Li, Zeeshan Hayder, Ali Zia, Connor Cassidy, Shiming Liu, Warwick Stiller, Eric Stone, Warren Conaty, Lars Petersson, Vivien Rolland

    Abstract: Crop biomass, a critical indicator of plant growth, health, and productivity, is invaluable for crop breeding programs and agronomic research. However, the accurate and scalable quantification of crop biomass remains inaccessible due to limitations in existing measurement methods. One of the obstacles impeding the advancement of current crop biomass prediction methodologies is the scarcity of publ…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 10 pages, 10 figures, 3 tables

  43. arXiv:2404.11013  [pdf, ps, other]

    cs.LG math.OC

    Control Theoretic Approach to Fine-Tuning and Transfer Learning

    Authors: Erkan Bayram, Shenyu Liu, Mohamed-Ali Belabbas, Tamer Başar

    Abstract: Given a training set in the form of a paired set $(\mathcal{X},\mathcal{Y})$, we say that the control system $\dot{x} = f(x,u)$ has learned the paired set via the control $u^*$ if the system steers each point of $\mathcal{X}$ to its corresponding target in $\mathcal{Y}$. Most existing methods for finding a control function $u^*$ require learning of a new control function if the training set is updated…

    Submitted 16 April, 2024; originally announced April 2024.

  44. arXiv:2404.10484  [pdf, other]

    cs.CV

    AbsGS: Recovering Fine Details for 3D Gaussian Splatting

    Authors: Zongxin Ye, Wenyu Li, Sidun Liu, Peng Qiao, Yong Dou

    Abstract: The 3D Gaussian Splatting (3D-GS) technique couples 3D Gaussian primitives with differentiable rasterization to achieve high-quality novel view synthesis results while providing advanced real-time rendering performance. However, due to a flaw in its adaptive density control strategy, 3D-GS frequently suffers from over-reconstruction in intricate scenes containing high-frequency details,…

    Submitted 16 April, 2024; originally announced April 2024.

  45. arXiv:2404.10358  [pdf, other]

    cs.CV

    Improving Bracket Image Restoration and Enhancement with Flow-guided Alignment and Enhanced Feature Aggregation

    Authors: Wenjie Lin, Zhen Liu, Chengzhi Jiang, Mingyan Han, Ting Jiang, Shuaicheng Liu

    Abstract: In this paper, we propose a novel framework for the Bracket Image Restoration and Enhancement (BracketIRE) task, which requires restoring a high-quality high dynamic range (HDR) image from a sequence of noisy, blurred, and low dynamic range (LDR) multi-exposure RAW inputs. To overcome this challenge, we present IREANet, which improves multi-exposure alignment and aggregation with a Fl…

    Submitted 16 April, 2024; originally announced April 2024.

  46. arXiv:2404.10209  [pdf, other]

    cs.AI cs.LG

    Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models

    Authors: Siqiao Xue, Danrui Qi, Caigao Jiang, Wenhui Shi, Fangyin Cheng, Keting Chen, Hongjun Yang, Zhiping Zhang, Jianshan He, Hongyang Zhang, Ganglin Wei, Wang Zhao, Fan Zhou, Hong Yi, Shaodong Liu, Hongjun Yang, Faqiang Chen

    Abstract: The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. The technologies of interacting with data particularly have an important entanglement with LLMs as efficient and intuitive data interactions are paramount. In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interact…

    Submitted 24 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  47. arXiv:2404.09599  [pdf, other]

    cs.CR

    Enhancing Code Vulnerability Detection via Vulnerability-Preserving Data Augmentation

    Authors: Shangqing Liu, Wei Ma, Jian Wang, Xiaofei Xie, Ruitao Feng, Yang Liu

    Abstract: Source code vulnerability detection aims to identify inherent vulnerabilities to safeguard software systems from potential attacks. Many prior studies overlook diverse vulnerability characteristics, simplifying the problem into a binary (0-1) classification task, for example, determining whether the code is vulnerable or not. This poses a challenge for a single deep learning-based model to effectively lea…

    Submitted 15 April, 2024; originally announced April 2024.

  48. arXiv:2404.09526  [pdf, other]

    cs.DC cs.LG

    LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism

    Authors: Bingyang Wu, Shengyu Liu, Yinmin Zhong, Peng Sun, Xuanzhe Liu, Xin Jin

    Abstract: The context window of large language models (LLMs) is rapidly increasing, leading to a huge variance in resource usage between different requests as well as between different phases of the same request. Restricted by static parallelism strategies, existing LLM serving systems cannot efficiently utilize the underlying resources to serve variable-length requests in different phases. To address this…

    Submitted 15 April, 2024; originally announced April 2024.

  49. arXiv:2404.08760  [pdf, other]

    cs.CL cs.AI

    The Generation Gap: Exploring Age Bias in Large Language Models

    Authors: Siyang Liu, Trish Maturi, Siqi Shen, Rada Mihalcea

    Abstract: In this paper, we explore the alignment of values in Large Language Models (LLMs) with specific age groups, leveraging data from the World Value Survey across thirteen categories. Through a diverse set of prompts tailored to ensure response robustness, we find a general inclination of LLM values towards younger demographics. Additionally, we explore the impact of incorporating age identity informa…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 4 pages

  50. arXiv:2404.08227  [pdf, other]

    cs.RO

    A Passively Bendable, Compliant Tactile Palm with RObotic Modular Endoskeleton Optical (ROMEO) Fingers

    Authors: Sandra Q. Liu, Edward H. Adelson

    Abstract: Many robotic hands currently rely on extremely dexterous robotic fingers and a thumb joint to envelop themselves around an object. Few hands focus on the palm even though human hands greatly benefit from their central fold and soft surface. As such, we develop a novel structurally compliant soft palm, which enables more surface area contact for the objects that are pressed into it. Moreover, this…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Accepted to ICRA 2024