Skip to main content

Showing 1–50 of 152 results for author: An, S

Searching in archive cs. Search in all archives.
.
  1. arXiv:2507.02057  [pdf, ps, other

    cs.CR cs.AI

    MGC: A Compiler Framework Exploiting Compositional Blindness in Aligned LLMs for Malware Generation

    Authors: Lu Yan, Zhuo Zhang, Xiangzhe Xu, Shengwei An, Guangyu Shen, Zhou Xuan, Xuan Chen, Xiangyu Zhang

    Abstract: Large language models (LLMs) have democratized software development, reducing the expertise barrier for programming complex applications. This accessibility extends to malicious software development, raising significant security concerns. While LLM providers have implemented alignment mechanisms to prevent direct generation of overtly malicious code, these safeguards predominantly evaluate individ… ▽ More

    Submitted 2 July, 2025; originally announced July 2025.

  2. arXiv:2506.10424  [pdf, ps, other

    cs.CR cs.AI

    SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks

    Authors: Kaiyuan Zhang, Siyuan Cheng, Hanxi Guo, Yuetian Chen, Zian Su, Shengwei An, Yuntao Du, Charles Fleming, Ashish Kundu, Xiangyu Zhang, Ninghui Li

    Abstract: Large language models (LLMs) have achieved remarkable success and are widely adopted for diverse applications. However, fine-tuning these models often involves private or sensitive information, raising critical privacy concerns. In this work, we conduct the first comprehensive study evaluating the vulnerability of fine-tuned LLMs to membership inference attacks (MIAs). Our empirical analysis demon… ▽ More

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: Accepted by the 34th USENIX Security Symposium 2025. Code is available at https://github.com/KaiyuanZh/SOFT

  3. arXiv:2506.03195  [pdf, ps, other

    cs.CV cs.AI cs.LG

    Unlabeled Data Improves Fine-Grained Image Zero-shot Classification with Multimodal LLMs

    Authors: Yunqi Hong, Sohyun An, Andrew Bai, Neil Y. C. Lin, Cho-Jui Hsieh

    Abstract: Despite Multimodal Large Language Models (MLLMs) showing promising results on general zero-shot image classification tasks, fine-grained image classification remains challenging. It demands precise attention to subtle visual details to distinguish between visually similar subcategories--details that MLLMs may easily overlook without explicit guidance. To address this, we introduce AutoSEP, an iter… ▽ More

    Submitted 1 June, 2025; originally announced June 2025.

  4. arXiv:2505.21765  [pdf, ps, other

    cs.AI

    Don't Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models

    Authors: Sohyun An, Ruochen Wang, Tianyi Zhou, Cho-Jui Hsieh

    Abstract: While recent success of large reasoning models (LRMs) significantly advanced LLMs' reasoning capability by optimizing the final answer accuracy using reinforcement learning, they may also drastically increase the output length due to overthinking, characterized by unnecessarily complex reasoning paths that waste computation and potentially degrade the performance. We hypothesize that such ineffici… ▽ More

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: Work In Progress

  5. arXiv:2505.11769  [pdf, ps, other

    cs.CV

    Technical Report for ICRA 2025 GOOSE 2D Semantic Segmentation Challenge: Boosting Off-Road Segmentation via Photometric Distortion and Exponential Moving Average

    Authors: Wonjune Kim, Lae-kyoung Lee, Su-Yong An

    Abstract: We report on the application of a high-capacity semantic segmentation pipeline to the GOOSE 2D Semantic Segmentation Challenge for unstructured off-road environments. Using a FlashInternImage-B backbone together with a UPerNet decoder, we adapt established techniques, rather than designing new ones, to the distinctive conditions of off-road scenes. Our training recipe couples strong photometric di… ▽ More

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: Winners of the GOOSE 2D Semantic Segmentation Challenge at the IEEE ICRA Workshop on Field Robotics 2025

  6. arXiv:2504.15431  [pdf, other

    cs.CL cs.AI cs.LG

    Trillion 7B Technical Report

    Authors: Sungjun Han, Juyoung Suk, Suyeong An, Hyungguk Kim, Kyuseok Kim, Wonsuk Yang, Seungtaek Choi, Jamin Shin

    Abstract: We introduce Trillion-7B, the most token-efficient Korean-centric multilingual LLM available. Our novel Cross-lingual Document Attention (XLDA) mechanism enables highly efficient and effective knowledge transfer from English to target languages like Korean and Japanese. Combined with optimized data mixtures, language-specific filtering, and tailored tokenizer construction, Trillion-7B achieves com… ▽ More

    Submitted 21 April, 2025; originally announced April 2025.

    Comments: Preview version

  7. arXiv:2504.06534  [pdf, other

    cs.DS cs.CG

    Single-Source Shortest Path Problem in Weighted Disk Graphs

    Authors: Shinwoo An, Eunjin Oh, Jie Xue

    Abstract: In this paper, we present efficient algorithms for the single-source shortest path problem in weighted disk graphs. A disk graph is the intersection graph of a family of disks in the plane. Here, the weight of an edge is defined as the Euclidean distance between the centers of the disks corresponding to the endpoints of the edge. Given a family of $n$ disks in the plane whose radii lie in $[1,Ψ]$… ▽ More

    Submitted 8 April, 2025; originally announced April 2025.

    Comments: In SoCG'25

  8. arXiv:2504.03515  [pdf, other

    cs.RO cs.LG

    Dexterous Manipulation through Imitation Learning: A Survey

    Authors: Shan An, Ziyu Meng, Chao Tang, Yuning Zhou, Tengyu Liu, Fangqiang Ding, Shufang Zhang, Yao Mu, Ran Song, Wei Zhang, Zeng-Guang Hou, Hong Zhang

    Abstract: Dexterous manipulation, which refers to the ability of a robotic hand or multi-fingered end-effector to skillfully control, reorient, and manipulate objects through precise, coordinated finger movements and adaptive force modulation, enables complex interactions similar to human hand dexterity. With recent advances in robotics and machine learning, there is a growing demand for these systems to op… ▽ More

    Submitted 17 May, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

    Comments: 22pages, 5 figures

  9. arXiv:2503.06863  [pdf, other

    cs.RO cs.CV

    HIF: Height Interval Filtering for Efficient Dynamic Points Removal

    Authors: Shufang Zhang, Tao Jiang, Jiazheng Wu, Ziyu Meng, Ziyang Zhang, Shan An

    Abstract: 3D point cloud mapping plays a essential role in localization and autonomous navigation. However, dynamic objects often leave residual traces during the map construction process, which undermine the performance of subsequent tasks. Therefore, dynamic object removal has become a critical challenge in point cloud based map construction within dynamic scenarios. Existing approaches, however, often in… ▽ More

    Submitted 9 March, 2025; originally announced March 2025.

  10. arXiv:2503.05995  [pdf, other

    cs.RO

    ReJSHand: Efficient Real-Time Hand Pose Estimation and Mesh Reconstruction Using Refined Joint and Skeleton Features

    Authors: Shan An, Shipeng Dai, Mahrukh Ansari, Yu Liang, Ming Zeng, Konstantinos A. Tsintotas, Changhong Fu, Hong Zhang

    Abstract: Accurate hand pose estimation is vital in robotics, advancing dexterous manipulation in human-computer interaction. Toward this goal, this paper presents ReJSHand (which stands for Refined Joint and Skeleton Features), a cutting-edge network formulated for real-time hand pose estimation and mesh reconstruction. The proposed framework is designed to accurately predict 3D hand gestures under real-ti… ▽ More

    Submitted 7 March, 2025; originally announced March 2025.

  11. arXiv:2503.05117  [pdf, other

    cs.RO cs.OS

    HyperGraph ROS: An Open-Source Robot Operating System for Hybrid Parallel Computing based on Computational HyperGraph

    Authors: Shufang Zhang, Jiazheng Wu, Jiacheng He, Kaiyi Wang, Shan An

    Abstract: This paper presents HyperGraph ROS, an open-source robot operating system that unifies intra-process, inter-process, and cross-device computation into a computational hypergraph for efficient message passing and parallel execution. In order to optimize communication, HyperGraph ROS dynamically selects the optimal communication mechanism while maintaining a consistent API. For intra-process message… ▽ More

    Submitted 6 March, 2025; originally announced March 2025.

  12. arXiv:2502.11387  [pdf, other

    cs.CL

    RoleMRC: A Fine-Grained Composite Benchmark for Role-Playing and Instruction-Following

    Authors: Junru Lu, Jiazheng Li, Guodong Shen, Lin Gui, Siyu An, Yulan He, Di Yin, Xing Sun

    Abstract: Role-playing is important for Large Language Models (LLMs) to follow diverse instructions while maintaining role identity and the role's pre-defined ability limits. Existing role-playing datasets mostly contribute to controlling role style and knowledge boundaries, but overlook role-playing in instruction-following scenarios. We introduce a fine-grained role-playing and instruction-following compo… ▽ More

    Submitted 16 February, 2025; originally announced February 2025.

  13. arXiv:2502.06139  [pdf, other

    cs.CL

    LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs

    Authors: Sumin An, Junyoung Sung, Wonpyo Park, Chanjun Park, Paul Hongsuck Seo

    Abstract: While large language models (LLMs) excel in generating coherent and contextually rich outputs, their capacity to efficiently handle long-form contexts is limited by fixed-length position embeddings. Additionally, the computational cost of processing long sequences increases quadratically, making it challenging to extend context length. To address these challenges, we propose Long-form Context Inje… ▽ More

    Submitted 22 May, 2025; v1 submitted 9 February, 2025; originally announced February 2025.

    Comments: Accepted to NAACL 2025. Project Page: https://ssuminan.github.io/LCIRC/

  14. CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling

    Authors: Kaiyuan Zhang, Siyuan Cheng, Guangyu Shen, Bruno Ribeiro, Shengwei An, Pin-Yu Chen, Xiangyu Zhang, Ninghui Li

    Abstract: Federated learning collaboratively trains a neural network on a global server, where each local client receives the current global model weights and sends back parameter updates (gradients) based on its local private data. The process of sending these model updates may leak client's private data information. Existing gradient inversion attacks can exploit this vulnerability to recover private trai… ▽ More

    Submitted 26 January, 2025; originally announced January 2025.

    Comments: Accepted by 32nd Annual Network and Distributed System Security Symposium (NDSS 2025). Code is available at https://censor-gradient.github.io

  15. arXiv:2412.19031  [pdf, other

    cs.SE cs.AI

    Repository Structure-Aware Training Makes SLMs Better Issue Resolver

    Authors: Zexiong Ma, Shengnan An, Zeqi Lin, Yanzhen Zou, Bing Xie

    Abstract: Language models have been applied to various software development tasks, but the performance varies according to the scale of the models. Large Language Models (LLMs) outperform Small Language Models (SLMs) in complex tasks like repository-level issue resolving, but raise concerns about privacy and cost. In contrast, SLMs are more accessible but under-perform in complex tasks. In this paper, we in… ▽ More

    Submitted 25 December, 2024; originally announced December 2024.

  16. arXiv:2412.14905  [pdf, other

    cs.CL cs.AI

    Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation

    Authors: Zexiong Ma, Shengnan An, Zeqi Lin, Yanzhen Zou, Jian-Guang Lou, Bing Xie

    Abstract: Large language models (LLMs) are susceptible to generating hallucinated information, despite the integration of retrieval-augmented generation (RAG). Parallel context extension (PCE) is a line of research attempting to effectively integrating parallel (unordered) contexts, while it still suffers from hallucinations when adapted to RAG scenarios. In this paper, we propose DePaC (Dehallucinating Par… ▽ More

    Submitted 19 December, 2024; originally announced December 2024.

  17. arXiv:2412.11787  [pdf, other

    cs.CL cs.AI cs.IR cs.LG

    A Method for Detecting Legal Article Competition for Korean Criminal Law Using a Case-augmented Mention Graph

    Authors: Seonho An, Young Yik Rhim, Min-Soo Kim

    Abstract: As social systems become increasingly complex, legal articles are also growing more intricate, making it progressively harder for humans to identify any potential competitions among them, particularly when drafting new laws or applying existing laws. Despite this challenge, no method for detecting such competitions has been proposed so far. In this paper, we propose a new legal AI task called Lega… ▽ More

    Submitted 16 December, 2024; originally announced December 2024.

    Comments: under review

    ACM Class: I.2.7

  18. arXiv:2412.05825  [pdf, other

    cs.LG cs.CV

    Self-Supervised Learning with Probabilistic Density Labeling for Rainfall Probability Estimation

    Authors: Junha Lee, Sojung An, Sujeong You, Namik Cho

    Abstract: Numerical weather prediction (NWP) models are fundamental in meteorology for simulating and forecasting the behavior of various atmospheric variables. The accuracy of precipitation forecasts and the acquisition of sufficient lead time are crucial for preventing hazardous weather events. However, the performance of NWP models is limited by the nonlinear and unpredictable patterns of extreme weather… ▽ More

    Submitted 8 December, 2024; originally announced December 2024.

    Comments: Accepted by WACV 2025

  19. arXiv:2412.04862  [pdf, other

    cs.CL

    EXAONE 3.5: Series of Large Language Models for Real-world Use Cases

    Authors: LG AI Research, Soyoung An, Kyunghoon Bae, Eunbi Choi, Kibong Choi, Stanley Jungkyu Choi, Seokhee Hong, Junwon Hwang, Hyojin Jeon, Gerrard Jeongwon Jo, Hyunjik Jo, Jiyeon Jung, Yountae Jung, Hyosang Kim, Joonkee Kim, Seonghwan Kim, Soyeon Kim, Sunkyoung Kim, Yireun Kim, Yongil Kim, Youchul Kim, Edward Hwayoung Lee, Haeju Lee, Honglak Lee, Jinsik Lee , et al. (8 additional authors not shown)

    Abstract: This technical report introduces the EXAONE 3.5 instruction-tuned language models, developed and released by LG AI Research. The EXAONE 3.5 language models are offered in three configurations: 32B, 7.8B, and 2.4B. These models feature several standout capabilities: 1) exceptional instruction following capabilities in real-world scenarios, achieving the highest scores across seven benchmarks, 2) ou… ▽ More

    Submitted 9 December, 2024; v1 submitted 6 December, 2024; originally announced December 2024.

    Comments: arXiv admin note: text overlap with arXiv:2408.03541

  20. arXiv:2412.01471  [pdf, other

    cs.CV

    Multi-Granularity Video Object Segmentation

    Authors: Sangbeom Lim, Seongchan Kim, Seungjun An, Seokju Cho, Paul Hongsuck Seo, Seungryong Kim

    Abstract: Current benchmarks for video segmentation are limited to annotating only salient objects (i.e., foreground instances). Despite their impressive architectural designs, previous works trained on these benchmarks have struggled to adapt to real-world scenarios. Thus, developing a new video segmentation dataset aimed at tracking multi-granularity segmentation target in the video scene is necessary. In… ▽ More

    Submitted 3 December, 2024; v1 submitted 2 December, 2024; originally announced December 2024.

    Comments: Project Page: https://cvlab-kaist.github.io/MUG-VOS

  21. arXiv:2411.05214  [pdf, other

    cs.CL

    STAND-Guard: A Small Task-Adaptive Content Moderation Model

    Authors: Minjia Wang, Pingping Lin, Siqi Cai, Shengnan An, Shengjie Ma, Zeqi Lin, Congrui Huang, Bixiong Xu

    Abstract: Content moderation, the process of reviewing and monitoring the safety of generated content, is important for development of welcoming online platforms and responsible large language models. Content moderation contains various tasks, each with its unique requirements tailored to specific scenarios. Therefore, it is crucial to develop a model that can be easily adapted to novel or customized conten… ▽ More

    Submitted 7 November, 2024; originally announced November 2024.

    Comments: 20 pages, 1 figure

  22. arXiv:2411.00813  [pdf, other

    cs.MM cs.AI cs.CL cs.CV cs.CY cs.LG cs.SI eess.AS

    Personality Analysis from Online Short Video Platforms with Multi-domain Adaptation

    Authors: Sixu An, Xiangguo Sun, Yicong Li, Yu Yang, Guandong Xu

    Abstract: Personality analysis from online short videos has gained prominence due to its applications in personalized recommendation systems, sentiment analysis, and human-computer interaction. Traditional assessment methods, such as questionnaires based on the Big Five Personality Framework, are limited by self-report biases and are impractical for large-scale or real-time analysis. Leveraging the rich, mu… ▽ More

    Submitted 25 October, 2024; originally announced November 2024.

  23. arXiv:2410.07701  [pdf, other

    cs.RO

    Autonomous Driving in Unstructured Environments: How Far Have We Come?

    Authors: Chen Min, Shubin Si, Xu Wang, Hanzhang Xue, Weizhong Jiang, Yang Liu, Juan Wang, Qingtian Zhu, Qi Zhu, Lun Luo, Fanjie Kong, Jinyu Miao, Xudong Cai, Shuai An, Wei Li, Jilin Mei, Tong Sun, Heng Zhai, Qifeng Liu, Fangzhou Zhao, Liang Chen, Shuai Wang, Erke Shang, Linzhi Shang, Kunlong Zhao , et al. (13 additional authors not shown)

    Abstract: Research on autonomous driving in unstructured outdoor environments is less advanced than in structured urban settings due to challenges like environmental diversities and scene complexity. These environments-such as rural areas and rugged terrains-pose unique obstacles that are not common in structured urban areas. Despite these difficulties, autonomous driving in unstructured outdoor environment… ▽ More

    Submitted 31 October, 2024; v1 submitted 10 October, 2024; originally announced October 2024.

    Comments: Survey paper; 38 pages

  24. arXiv:2410.02465  [pdf, other

    cs.CL cs.AI

    Revealing the Inherent Instructability of Pre-Trained Language Models

    Authors: Seokhyun An, Minji Kim, Hyounghun Kim

    Abstract: Instruction tuning -- supervised fine-tuning using instruction-response pairs -- is a key step in making pre-trained large language models (LLMs) instructable. Meanwhile, LLMs perform multitask learning during their pre-training, acquiring extensive knowledge and capabilities. We hypothesize that the pre-training stage can enable them to develop the ability to comprehend and address instructions.… ▽ More

    Submitted 16 February, 2025; v1 submitted 3 October, 2024; originally announced October 2024.

    Comments: 31 pages

  25. arXiv:2409.18164  [pdf

    cs.AI cs.CL cs.LG

    Data-Prep-Kit: getting your data ready for LLM application development

    Authors: David Wood, Boris Lublinsky, Alexy Roytman, Shivdeep Singh, Constantin Adam, Abdulhamid Adebayo, Sungeun An, Yuan Chi Chang, Xuan-Hong Dang, Nirmit Desai, Michele Dolfi, Hajar Emami-Gohari, Revital Eres, Takuya Goto, Dhiraj Joshi, Yan Koyfman, Mohammad Nassar, Hima Patel, Paramesvaran Selvam, Yousaf Shah, Saptha Surendran, Daiki Tsuzuku, Petros Zerfos, Shahrokh Daijavad

    Abstract: Data preparation is the first and a very important step towards any Large Language Model (LLM) development. This paper introduces an easy-to-use, extensible, and scale-flexible open-source data preparation toolkit called Data Prep Kit (DPK). DPK is architected and designed to enable users to scale their data preparation to their needs. With DPK they can prepare data on a local machine or effortles… ▽ More

    Submitted 12 November, 2024; v1 submitted 26 September, 2024; originally announced September 2024.

    Comments: 10 pages, 7 figures

  26. arXiv:2409.16913  [pdf, ps, other

    cs.AI

    Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing

    Authors: Wenhao Liu, Siyu An, Junru Lu, Muling Wu, Tianlong Li, Xiaohua Wang, Changze lv, Xiaoqing Zheng, Di Yin, Xing Sun, Xuanjing Huang

    Abstract: Role-Playing Agents (RPAs) have shown remarkable performance in various applications, yet they often struggle to recognize and appropriately respond to hard queries that conflict with their role-play knowledge. To investigate RPAs' performance when faced with different types of conflicting requests, we develop an evaluation benchmark that includes contextual knowledge conflicting requests, paramet… ▽ More

    Submitted 13 June, 2025; v1 submitted 25 September, 2024; originally announced September 2024.

    Journal ref: Annual Meeting of the Association for Computational Linguistics (ACL), 2025, Findings

  27. arXiv:2409.16202  [pdf, other

    cs.AI

    CJEval: A Benchmark for Assessing Large Language Models Using Chinese Junior High School Exam Data

    Authors: Qian-Wen Zhang, Haochen Wang, Fang Li, Siyu An, Lingfeng Qiao, Liangcai Gao, Di Yin, Xing Sun

    Abstract: Online education platforms have significantly transformed the dissemination of educational resources by providing a dynamic and digital infrastructure. With the further enhancement of this transformation, the advent of Large Language Models (LLMs) has elevated the intelligence levels of these platforms. However, current academic benchmarks provide limited guidance for real-world industry scenarios… ▽ More

    Submitted 24 September, 2024; v1 submitted 24 September, 2024; originally announced September 2024.

  28. arXiv:2409.13403  [pdf, other

    cs.DS cs.CG

    Dynamic parameterized problems on unit disk graphs

    Authors: Shinwoo An, Kyungjin Cho, Leo Jang, Byeonghyeon Jung, Yudam Lee, Eunjin Oh, Donghun Shin, Hyeonjun Shin, Chanho Song

    Abstract: In this paper, we study fundamental parameterized problems such as $k$-Path/Cycle, Vertex Cover, Triangle Hitting Set, Feedback Vertex Set, and Cycle Packing for dynamic unit disk graphs. Given a vertex set $V$ changing dynamically under vertex insertions and deletions, our goal is to maintain data structures so that the aforementioned parameterized problems on the unit disk graph induced by $V$ c… ▽ More

    Submitted 20 September, 2024; originally announced September 2024.

    Comments: To appear in ISAAC 2024

  29. arXiv:2408.09591  [pdf, other

    cs.DS

    Pre-assignment problem for unique minimum vertex cover on bounded clique-width graphs

    Authors: Shinwoo An, Yeonsu Chang, Kyungjin Cho, O-joung Kwon, Myounghwan Lee, Eunjin Oh, Hyeonjun Shin

    Abstract: Horiyama et al. (AAAI 2024) considered the problem of generating instances with a unique minimum vertex cover under certain conditions. The Pre-assignment for Uniquification of Minimum Vertex Cover problem (shortly PAU-VC) is the problem, for given a graph $G$, to find a minimum set $S$ of vertices in $G$ such that there is a unique minimum vertex cover of $G$ containing $S$. We show that PAU-VC i… ▽ More

    Submitted 22 August, 2024; v1 submitted 18 August, 2024; originally announced August 2024.

    Comments: 19 pages, 3 figures

  30. arXiv:2408.03541  [pdf, ps, other

    cs.CL cs.AI

    EXAONE 3.0 7.8B Instruction Tuned Language Model

    Authors: LG AI Research, :, Soyoung An, Kyunghoon Bae, Eunbi Choi, Stanley Jungkyu Choi, Yemuk Choi, Seokhee Hong, Yeonjung Hong, Junwon Hwang, Hyojin Jeon, Gerrard Jeongwon Jo, Hyunjik Jo, Jiyeon Jung, Yountae Jung, Euisoon Kim, Hyosang Kim, Joonkee Kim, Seonghwan Kim, Soyeon Kim, Sunkyoung Kim, Yireun Kim, Youchul Kim, Edward Hwayoung Lee, Haeju Lee , et al. (14 additional authors not shown)

    Abstract: We introduce EXAONE 3.0 instruction-tuned language model, the first open model in the family of Large Language Models (LLMs) developed by LG AI Research. Among different model sizes, we publicly release the 7.8B instruction-tuned model to promote open research and innovations. Through extensive evaluations across a wide range of public and in-house benchmarks, EXAONE 3.0 demonstrates highly compet… ▽ More

    Submitted 13 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

  31. Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning

    Authors: Xuri Ge, Junchen Fu, Fuhai Chen, Shan An, Nicu Sebe, Joemon M. Jose

    Abstract: Facial action units (AUs), as defined in the Facial Action Coding System (FACS), have received significant research interest owing to their diverse range of applications in facial state analysis. Current mainstream FAU recognition models have a notable limitation, i.e., focusing only on the accuracy of AU recognition and overlooking explanations of corresponding AU states. In this paper, we propos… ▽ More

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: 10 pages, 5 figures, 4 tables

    Journal ref: ACM Multimedia 2024

  32. arXiv:2408.00611  [pdf, other

    cs.NE cs.LG

    Using CSNNs to Perform Event-based Data Processing & Classification on ASL-DVS

    Authors: Ria Patel, Sujit Tripathy, Zachary Sublett, Seoyoung An, Riya Patel

    Abstract: Recent advancements in bio-inspired visual sensing and neuromorphic computing have led to the development of various highly efficient bio-inspired solutions with real-world applications. One notable application integrates event-based cameras with spiking neural networks (SNNs) to process event-based sequences that are asynchronous and sparse, making them difficult to handle. In this project, we de… ▽ More

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: 8 pages, 14 figures

  33. arXiv:2408.00359  [pdf, other

    cs.LG stat.ML

    Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks

    Authors: Jy-yong Sohn, Dohyun Kwon, Seoyeon An, Kangwook Lee

    Abstract: Fine-tuning large pre-trained models is a common practice in machine learning applications, yet its mathematical analysis remains largely unexplored. In this paper, we study fine-tuning through the lens of memorization capacity. Our new measure, the Fine-Tuning Capacity (FTC), is defined as the maximum number of samples a neural network can fine-tune, or equivalently, as the minimum number of neur… ▽ More

    Submitted 19 August, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

    Comments: 10 pages, 9 figures, UAI 2024

  34. arXiv:2407.13808  [pdf, other

    cs.CV

    CoAPT: Context Attribute words for Prompt Tuning

    Authors: Gun Lee, Subin An, Sungyong Baik, Soochahn Lee

    Abstract: We propose a novel prompt tuning method called CoAPT(Context Attribute words in Prompt Tuning) for few/zero-shot image classification. The core motivation is that attributes are descriptive words with rich information about a given concept. Thus, we aim to enrich text queries of existing prompt tuning methods, improving alignment between text and image embeddings in CLIP embedding space. To do so,… ▽ More

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 14 pages, 4 figures

  35. arXiv:2407.11372  [pdf, other

    cs.CR cs.CV

    UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening

    Authors: Siyuan Cheng, Guangyu Shen, Kaiyuan Zhang, Guanhong Tao, Shengwei An, Hanxi Guo, Shiqing Ma, Xiangyu Zhang

    Abstract: Deep neural networks (DNNs) have demonstrated effectiveness in various fields. However, DNNs are vulnerable to backdoor attacks, which inject a unique pattern, called trigger, into the input to cause misclassification to an attack-chosen target label. While existing works have proposed various methods to mitigate backdoor effects in poisoned models, they tend to be less effective against recent ad… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: The 18th European Conference on Computer Vision ECCV 2024

  36. arXiv:2407.02536  [pdf, other

    cs.LG cs.IR econ.GN stat.AP

    Reducing False Discoveries in Statistically-Significant Regional-Colocation Mining: A Summary of Results

    Authors: Subhankar Ghosh, Jayant Gupta, Arun Sharma, Shuai An, Shashi Shekhar

    Abstract: Given a set \emph{S} of spatial feature types, its feature instances, a study area, and a neighbor relationship, the goal is to find pairs $<$a region ($r_{g}$), a subset \emph{C} of \emph{S}$>$ such that \emph{C} is a statistically significant regional-colocation pattern in $r_{g}$. This problem is important for applications in various domains including ecology, economics, and sociology. The prob… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    ACM Class: E.m; F.2; E.1; H.3; I.5; J.0

  37. arXiv:2407.00256  [pdf, other

    cs.AI cs.CL cs.LG stat.ML

    One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts

    Authors: Ruochen Wang, Sohyun An, Minhao Cheng, Tianyi Zhou, Sung Ju Hwang, Cho-Jui Hsieh

    Abstract: Large Language Models (LLMs) exhibit strong generalization capabilities to novel tasks when prompted with language instructions and in-context demos. Since this ability sensitively depends on the quality of prompts, various methods have been explored to automate the instruction design. While these methods demonstrated promising results, they also restricted the searched prompt to one instruction.… ▽ More

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: ICML 2024. code available at https://github.com/ruocwang/mixture-of-prompts

    MSC Class: 68T01

    Journal ref: Proceedings of the 41st International Conference on Machine Learning (ICML), Vienna, Austria, 2024

  38. arXiv:2406.17424  [pdf, other

    cs.CG cs.DS

    Sparse Outerstring Graphs Have Logarithmic Treewidth

    Authors: Shinwoo An, Eunjin Oh, Jie Xue

    Abstract: An outerstring graph is the intersection graph of curves lying inside a disk with one endpoint on the boundary of the disk. We show that an outerstring graph with $n$ vertices has treewidth $O(α\log n)$, where $α$ denotes the arboricity of the graph, with an almost matching lower bound of $Ω(α\log (n/α))$. As a corollary, we show that a $t$-biclique-free outerstring graph has treewidth… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 17pages, In ESA'24

  39. arXiv:2406.12430  [pdf, other

    cs.CL cs.AI cs.LG

    PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers

    Authors: Myeonghwa Lee, Seonho An, Min-Soo Kim

    Abstract: In this paper, we conduct a study to utilize LLMs as a solution for decision making that requires complex data analysis. We define Decision QA as the task of answering the best decision, $d_{best}$, for a decision-making question $Q$, business rules $R$ and a database $D$. Since there is no benchmark that can examine Decision QA, we propose Decision QA benchmark, DQA. It has two scenarios, Locatin… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: NAACL 2024

    ACM Class: I.2.7

  40. arXiv:2406.10957  [pdf, other

    cs.CL

    Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence

    Authors: Junru Lu, Jiazheng Li, Siyu An, Meng Zhao, Yulan He, Di Yin, Xing Sun

    Abstract: Direct Preference Optimization (DPO) has emerged as a prominent algorithm for the direct and robust alignment of Large Language Models (LLMs) with human preferences, offering a more straightforward alternative to the complex Reinforcement Learning from Human Feedback (RLHF). Despite its promising efficacy, DPO faces a notable drawback: "verbosity", a common over-optimization phenomenon also observ… ▽ More

    Submitted 9 December, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: EMNLP 2024 Main, Final Version

  41. arXiv:2406.10744  [pdf, other

    cs.CV

    Technique Report of CVPR 2024 PBDL Challenges

    Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu , et al. (75 additional authors not shown)

    Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a… ▽ More

    Submitted 12 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 PBDL Challenges: https://pbdl-ws.github.io/pbdl2024/challenge/index.html

  42. arXiv:2406.06379  [pdf, other

    cs.CE

    FinVerse: An Autonomous Agent System for Versatile Financial Analysis

    Authors: Siyu An, Qin Li, Junru Lu, Di Yin, Xing Sun

    Abstract: With the significant advancements in cognitive intelligence driven by LLMs, autonomous agent systems have attracted extensive attention. Despite this growing interest, the development of stable and efficient agent systems poses substantial practical challenges. In this paper, we introduce FinVerse, a meticulously crafted agent system designed for a broad range of financial topics. FinVerse integra… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  43. arXiv:2406.04867  [pdf, other

    cs.LG cs.AI cs.CV

    Deep learning for precipitation nowcasting: A survey from the perspective of time series forecasting

    Authors: Sojung An, Tae-Jin Oh, Eunha Sohn, Donghyun Kim

    Abstract: Deep learning-based time series forecasting has dominated the short-term precipitation forecasting field with the help of its ability to estimate motion flow in high-resolution datasets. The growing interest in precipitation nowcasting offers substantial opportunities for the advancement of current forecasting technologies. Nevertheless, there has been a scarcity of in-depth surveys of time series… ▽ More

    Submitted 13 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: 21 pages, 7 figures, 5 tables

  44. arXiv:2405.20602  [pdf, other

    cs.LG cs.CL

    Masked Language Modeling Becomes Conditional Density Estimation for Tabular Data Synthesis

    Authors: Seunghwan An, Gyeongdong Woo, Jaesung Lim, ChangHyun Kim, Sungchul Hong, Jong-June Jeon

    Abstract: In this paper, our goal is to generate synthetic data for heterogeneous (mixed-type) tabular datasets with high machine learning utility (MLu). Since the MLu performance depends on accurately approximating the conditional distributions, we focus on devising a synthetic data generation method based on conditional distribution estimation. We introduce MaCoDE by redefining the consecutive multi-class… ▽ More

    Submitted 18 August, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  45. Improving SMOTE via Fusing Conditional VAE for Data-adaptive Noise Filtering

    Authors: Sungchul Hong, Seunghwan An, Jong-June Jeon

    Abstract: Recent advances in a generative neural network model extend the development of data augmentation methods. However, the augmentation methods based on the modern generative models fail to achieve notable performance for class imbalance data compared to the conventional model, Synthetic Minority Oversampling Technique (SMOTE). We investigate the problem of the generative model for imbalanced classifi… ▽ More

    Submitted 26 August, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Journal ref: Appl Intell 55, 841 (2025)

  46. arXiv:2405.19346  [pdf, other

    eess.SP cs.AI cs.LG

    Subject-Adaptive Transfer Learning Using Resting State EEG Signals for Cross-Subject EEG Motor Imagery Classification

    Authors: Sion An, Myeongkyun Kang, Soopil Kim, Philip Chikontwe, Li Shen, Sang Hyun Park

    Abstract: Electroencephalography (EEG) motor imagery (MI) classification is a fundamental, yet challenging task due to the variation of signals between individuals i.e., inter-subject variability. Previous approaches try to mitigate this using task-specific (TS) EEG signals from the target subject in training. However, recording TS EEG signals requires time and limits its applicability in various fields. In… ▽ More

    Submitted 9 July, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: Early Accepted at MICCAI 2024

  47. arXiv:2405.16013  [pdf, other

    cs.LG

    Convergence Behavior of an Adversarial Weak Supervision Method

    Authors: Steven An, Sanjoy Dasgupta

    Abstract: Labeling data via rules-of-thumb and minimal label supervision is central to Weak Supervision, a paradigm subsuming subareas of machine learning such as crowdsourced learning and semi-supervised ensemble learning. By using this labeled data to train modern machine learning methods, the cost of acquiring large amounts of hand labeled data can be ameliorated. Approaches to combining the rules-of-thu… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 49 pages, 16 figures, to be published in UAI 2024

  48. 3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting

    Authors: Xuri Ge, Songpei Xu, Fuhai Chen, Jie Wang, Guoxin Wang, Shan An, Joemon M. Jose

    Abstract: In this paper, we propose a novel visual Semantic-Spatial Self-Highlighting Network (termed 3SHNet) for high-precision, high-efficiency and high-generalization image-sentence retrieval. 3SHNet highlights the salient identification of prominent objects and their spatial locations within the visual modality, thus allowing the integration of visual semantics-spatial interactions and maintaining indep… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted Information Processing and Management (IP&M), 10 pages, 9 figures and 8 tables

    Journal ref: Information Processing & Management, Volume 61, Issue 4, July 2024, 103716

  49. arXiv:2404.16898  [pdf, other

    cs.LG cs.AI

    How to Parameterize Asymmetric Quantization Ranges for Quantization-Aware Training

    Authors: Jaeseong You, Minseop Park, Kyunggeun Lee, Seokjun An, Chirag Patel, Markus Nage

    Abstract: This paper investigates three different parameterizations of asymmetric uniform quantization for quantization-aware training: (1) scale and offset, (2) minimum and maximum, and (3) beta and gamma. We perform a comprehensive comparative analysis of these parameterizations' influence on quantization-aware training, using both controlled experiments and real-world large language models. Our particula… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  50. arXiv:2404.16811  [pdf, other

    cs.CL cs.AI

    Make Your LLM Fully Utilize the Context

    Authors: Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou

    Abstract: While many contemporary large language models (LLMs) can process lengthy input, they still struggle to fully utilize information within the long context, known as the lost-in-the-middle challenge. We hypothesize that it stems from insufficient explicit supervision during the long-context training, which fails to emphasize that any position in a long context can hold crucial information. Based on t… ▽ More

    Submitted 26 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 19 pages, 7 figures, 3 tables, 9 examples