
Showing 1–50 of 2,401 results for author: Kim, J

Searching in archive cs.
  1. arXiv:2405.05678  [pdf, ps, other]

    cs.HC cs.CL

    Beyond Prompts: Learning from Human Communication for Enhanced AI Intent Alignment

    Authors: Yoonsu Kim, Kihoon Son, Seoyoung Kim, Juho Kim

    Abstract: AI intent alignment, ensuring that AI produces outcomes as intended by users, is a critical challenge in human-AI interaction. The emergence of generative AI, including LLMs, has intensified the significance of this problem, as interactions increasingly involve users specifying desired results for AI systems. In order to support better AI intent alignment, we aim to explore human strategies for in…

    Submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2405.05581  [pdf, other]

    cs.HC cs.AI cs.CL

    One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations

    Authors: Yoonjoo Lee, Kihoon Son, Tae Soo Kim, Jisu Kim, John Joon Young Chung, Eytan Adar, Juho Kim

    Abstract: As Large Language Models (LLMs) are nondeterministic, the same input can generate different outputs, some of which may be incorrect or hallucinated. If run again, the LLM may correct itself and produce the correct answer. Unfortunately, most LLM-powered systems resort to single results which, correct or not, users accept. Having the LLM produce multiple outputs may help identify disagreements or a…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted to FAccT 2024

  3. arXiv:2405.04537  [pdf, other]

    cs.CV cs.AI cs.GR

    An intuitive multi-frequency feature representation for SO(3)-equivariant networks

    Authors: Dongwon Son, Jaehyung Kim, Sanghyeon Son, Beomjoon Kim

    Abstract: The usage of 3D vision algorithms, such as shape reconstruction, remains limited because they require inputs to be at a fixed canonical rotation. Recently, a simple equivariant network, Vector Neuron (VN) has been proposed that can be easily used with the state-of-the-art 3D neural network (NN) architectures. However, its performance is limited because it is designed to use only three-dimensional…

    Submitted 15 March, 2024; originally announced May 2024.

    Comments: ICLR 2024

  4. arXiv:2405.04497  [pdf, other]

    cs.HC

    Unveiling Disparities in Web Task Handling Between Human and Web Agent

    Authors: Kihoon Son, Jinhyeon Kwon, DaEun Choi, Tae Soo Kim, Young-Ho Kim, Sangdoo Yun, Juho Kim

    Abstract: With the advancement of Large-Language Models (LLMs) and Large Vision-Language Models (LVMs), agents have shown significant capabilities in various tasks, such as data analysis, gaming, or code generation. Recently, there has been a surge in research on web agents, capable of performing tasks within the web environment. However, the web poses unforeseeable scenarios, challenging the generalizabili…

    Submitted 8 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  5. arXiv:2405.04356  [pdf, other]

    cs.CV

    Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation

    Authors: Jihyun Kim, Changjae Oh, Hoseok Do, Soohyun Kim, Kwanghoon Sohn

    Abstract: We present a new multi-modal face image generation method that converts a text prompt and a visual input, such as a semantic mask or scribble map, into a photo-realistic face image. To do this, we combine the strengths of Generative Adversarial networks (GANs) and diffusion models (DMs) by employing the multi-modal features in the DM into the latent space of the pre-trained GANs. We present a simp…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR 2024

  6. arXiv:2405.03945  [pdf, other]

    cs.CV cs.NI

    Role of Sensing and Computer Vision in 6G Wireless Communications

    Authors: Seungnyun Kim, Jihoon Moon, Jinhong Kim, Yongjun Ahn, Donghoon Kim, Sunwoo Kim, Kyuhong Shim, Byonghyo Shim

    Abstract: Recently, we are witnessing the remarkable progress and widespread adoption of sensing technologies in autonomous driving, robotics, and metaverse. Considering the rapid advancement of computer vision (CV) technology to analyze the sensing information, we anticipate a proliferation of wireless applications exploiting the sensing and CV technologies in 6G. In this article, we provide a holistic ove…

    Submitted 6 May, 2024; originally announced May 2024.

  7. arXiv:2405.03732  [pdf]

    eess.IV cs.AI cs.CV cs.LG

    Accelerated MR Cholangiopancreatography with Deep Learning-based Reconstruction

    Authors: Jinho Kim, Marcel Dominik Nickel, Florian Knoll

    Abstract: This study accelerates MR cholangiopancreatography (MRCP) acquisitions using deep learning-based (DL) reconstruction at 3T and 0.55T. Thirty healthy volunteers underwent conventional two-fold MRCP scans at field strengths of 3T or 0.55T. We trained a variational network (VN) using retrospectively six-fold undersampled data obtained at 3T. We then evaluated our method against standard techniques su…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 20 pages, 6 figures, 2 tables

  8. arXiv:2405.03083  [pdf, other]

    stat.ME cs.LG stat.ML

    Causal K-Means Clustering

    Authors: Kwangho Kim, Jisu Kim, Edward H. Kennedy

    Abstract: Causal effects are often characterized with population summaries. These might provide an incomplete picture when there are heterogeneous treatment effects across subgroups. Since the subgroup structure is typically unknown, it is more challenging to identify and evaluate subgroup effects than population effects. We propose a new solution to this problem: Causal k-Means Clustering, which harnesses…

    Submitted 5 May, 2024; originally announced May 2024.

  9. arXiv:2405.02996  [pdf, other]

    cs.SD cs.AI eess.AS

    RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification

    Authors: June-Woo Kim, Miika Toikkanen, Sangmin Bae, Minseok Kim, Ho-Young Jung

    Abstract: Recent advancements in AI have democratized its deployment as a healthcare assistant. While pretrained models from large-scale visual and audio datasets have demonstrably generalized to this task, surprisingly, no studies have explored pretrained speech models, which, as human-originated sounds, intuitively would share closer resemblance to lung sounds. This paper explores the efficacy of pretrain…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted EMBC 2024

  10. arXiv:2405.02569  [pdf, other]

    cs.LG cs.AI

    Decoupling Exploration and Exploitation for Unsupervised Pre-training with Successor Features

    Authors: JaeYoon Kim, Junyu Xuan, Christy Liang, Farookh Hussain

    Abstract: Unsupervised pre-training has been on the lookout for the virtue of a value function representation referred to as successor features (SFs), which decouples the dynamics of the environment from the rewards. It has a significant impact on the process of task-specific fine-tuning due to the decomposition. However, existing approaches struggle with local optima due to the unified intrinsic reward of…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: IJCNN 2024

  11. arXiv:2405.02568  [pdf, other]

    cs.CV cs.AI

    ActiveNeuS: Active 3D Reconstruction using Neural Implicit Surface Uncertainty

    Authors: Hyunseo Kim, Hyeonseo Yang, Taekyung Kim, YoonSung Kim, Jin-Hwa Kim, Byoung-Tak Zhang

    Abstract: Active learning in 3D scene reconstruction has been widely studied, as selecting informative training views is critical for the reconstruction. Recently, Neural Radiance Fields (NeRF) variants have shown performance increases in active 3D reconstruction using image rendering or geometric uncertainty. However, the simultaneous consideration of both uncertainties in selecting informative views remai…

    Submitted 4 May, 2024; originally announced May 2024.

  12. arXiv:2405.02499  [pdf, other]

    cs.CR cs.AR

    DRAMScope: Uncovering DRAM Microarchitecture and Characteristics by Issuing Memory Commands

    Authors: Hwayong Nam, Seungmin Baek, Minbok Wi, Michael Jaemin Kim, Jaehyun Park, Chihun Song, Nam Sung Kim, Jung Ho Ahn

    Abstract: The demand for precise information on DRAM microarchitectures and error characteristics has surged, driven by the need to explore processing in memory, enhance reliability, and mitigate security vulnerability. Nonetheless, DRAM manufacturers have disclosed only a limited amount of information, making it difficult to find specific information on their DRAM microarchitectures. This paper addresses t…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: To appear at the 51st IEEE/ACM International Symposium on Computer Architecture (ISCA)

  13. arXiv:2405.02066  [pdf, other]

    cs.CV eess.IV

    WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights

    Authors: Youngdong Jang, Dong In Lee, MinHyuk Jang, Jong Wook Kim, Feng Yang, Sangpil Kim

    Abstract: The advances in the Neural Radiance Fields (NeRF) research offer extensive applications in diverse domains, but protecting their copyrights has not yet been researched in depth. Recently, NeRF watermarking has been considered one of the pivotal solutions for safely deploying NeRF-based 3D representations. However, existing methods are designed to apply only to implicit or explicit NeRF representat…

    Submitted 3 May, 2024; originally announced May 2024.

  14. TinySeg: Model Optimizing Framework for Image Segmentation on Tiny Embedded Systems

    Authors: Byungchul Chae, Jiae Kim, Seonyeong Heo

    Abstract: Image segmentation is one of the major computer vision tasks, which is applicable in a variety of domains, such as autonomous navigation of an unmanned aerial vehicle. However, image segmentation cannot easily materialize on tiny embedded systems because image segmentation models generally have high peak memory usage due to their architectural characteristics. This work finds that image segmentati…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: LCTES 2024

  15. arXiv:2405.01531  [pdf, other]

    cs.LG cs.AI cs.CV

    Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models

    Authors: Nishad Singhi, Jae Myung Kim, Karsten Roth, Zeynep Akata

    Abstract: Concept Bottleneck Models (CBMs) ground image classification on human-understandable concepts to allow for interpretable model decisions. Crucially, the CBM design inherently allows for human interventions, in which expert users are given the ability to modify potentially misaligned concept choices to influence the decision behavior of the model in an interpretable fashion. However, existing appro…

    Submitted 2 May, 2024; originally announced May 2024.

  16. arXiv:2405.01361  [pdf, other]

    cs.RO

    Haptic-Based Bilateral Teleoperation of Aerial Manipulator for Extracting Wedged Object with Compensation of Human Reaction Time

    Authors: Jeonghyun Byun, Dohyun Eom, H. Jin Kim

    Abstract: Bilateral teleoperation of an aerial manipulator facilitates the execution of industrial missions thanks to the combination of the aerial platform's maneuverability and the ability to conduct complex tasks with human supervision. Heretofore, research on such operations has focused on flying without any physical interaction or exerting a pushing force on a contact surface that does not involve abru…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: to be presented in 2024 IEEE International Conference on Unmanned Aircraft Systems (ICUAS), Chania, Crete, Greece, 2024

  17. arXiv:2405.00229  [pdf, other]

    cs.HC cs.AI cs.PL

    Aptly: Making Mobile Apps from Natural Language

    Authors: Evan W. Patton, David Y. J. Kim, Ashley Granquist, Robin Liu, Arianna Scott, Jennet Zamanova, Harold Abelson

    Abstract: We present Aptly, an extension of the MIT App Inventor platform enabling mobile app development via natural language powered by code-generating large language models (LLMs). Aptly complements App Inventor's block language with a text language designed to allow visual code generation via text-based LLMs. We detail the technical aspects of how the Aptly server integrates LLMs with a realtime collabo…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, 7 figures, 2 tables

  18. arXiv:2404.19336  [pdf]

    cs.AI cs.PL

    Improving LLM Classification of Logical Errors by Integrating Error Relationship into Prompts

    Authors: Yanggyu Lee, Suchae Jeong, Jihie Kim

    Abstract: LLMs trained in the understanding of programming syntax are now providing effective assistance to developers and are being used in programming education such as in generation of coding problem examples or providing code explanations. A key aspect of programming education is understanding and dealing with error message. However, 'logical errors' in which the program operates against the programmer'…

    Submitted 1 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: Accepted in ITS 2024

  19. arXiv:2404.18516  [pdf, ps, other]

    eess.SP cs.IT

    Downlink Pilots are Essential for Cell-Free Massive MIMO with Multi-Antenna Users

    Authors: Eren Berk Kama, Junbeom Kim, Emil Björnson

    Abstract: We consider a cell-free massive MIMO system with multiple antennas on the users and access points. In previous works, the downlink spectral efficiency (SE) has been evaluated using the hardening bound that requires no downlink pilots. This approach works well when having single-antenna users. In this paper, we show that much higher SEs can be achieved if downlink pilots are sent since the effectiv…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  20. arXiv:2404.18423  [pdf, other]

    cs.CV cs.AI

    Unsupervised Dynamics Prediction with Object-Centric Kinematics

    Authors: Yeon-Ji Song, Suhyung Choi, Jaein Kim, Jin-Hwa Kim, Byoung-Tak Zhang

    Abstract: Human perception involves discerning complex multi-object scenes into time-static object appearance (ie, size, shape, color) and time-varying object motion (ie, location, velocity, acceleration). This innate ability to unconsciously understand the environment is the motivation behind the success of dynamics modeling. Object-centric representations have emerged as a promising tool for dynamics pred…

    Submitted 6 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: 15 pages, 6 figures, 4 tables

  21. LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing

    Authors: Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-Hsun Chen, Shaowei Wang

    Abstract: Logs are important in modern software development with runtime information. Log parsing is the first step in many log-based analyses, that involve extracting structured information from unstructured log data. Traditional log parsers face challenges in accurately parsing logs due to the diversity of log formats, which directly impacts the performance of downstream log-analysis tasks. In this paper,…

    Submitted 27 April, 2024; originally announced April 2024.

  22. arXiv:2404.17686  [pdf, other]

    cs.NI cs.IT

    On the Benefits of Coding for Network Slicing

    Authors: Homa Esfahanizadeh, Vipindev Adat Vasudevan, Benjamin D. Kim, Shruti Siva, Jennifer Kim, Alejandro Cohen, Muriel Médard

    Abstract: Network slicing has emerged as an integral concept in 5G, aiming to partition the physical network infrastructure into isolated slices, customized for specific applications. We theoretically formulate the key performance metrics of an application, in terms of goodput and delivery delay, at a cost of network resources in terms of bandwidth. We explore an un-coded communication protocol that uses fe…

    Submitted 26 April, 2024; originally announced April 2024.

  23. arXiv:2404.17674  [pdf, other]

    cs.LG cs.AI cs.CR

    Center-Based Relaxed Learning Against Membership Inference Attacks

    Authors: Xingli Fang, Jung-Eun Kim

    Abstract: Membership inference attacks (MIAs) are currently considered one of the main privacy attack strategies, and their defense mechanisms have also been extensively explored. However, there is still a gap between the existing defense approaches and ideal models in performance and deployment costs. In particular, we observed that the privacy vulnerability of the model is closely correlated with the gap…

    Submitted 26 April, 2024; originally announced April 2024.

  24. arXiv:2404.17140  [pdf, other]

    cs.CL

    Small Language Models Need Strong Verifiers to Self-Correct Reasoning

    Authors: Yunxiang Zhang, Muhammad Khalifa, Lajanugen Logeswaran, Jaekyeom Kim, Moontae Lee, Honglak Lee, Lu Wang

    Abstract: Self-correction has emerged as a promising solution to boost the reasoning performance of large language models (LLMs), where LLMs refine their solutions using self-generated critiques that pinpoint the errors. This work explores whether smaller-size (<= 13B) language models (LMs) have the ability of self-correction on reasoning tasks with minimal inputs from stronger LMs. We propose a novel pipel…

    Submitted 25 April, 2024; originally announced April 2024.

  25. arXiv:2404.16831  [pdf, other]

    cs.CV

    The Third Monocular Depth Estimation Challenge

    Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, YiPing Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora, et al. (16 additional authors not shown)

    Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 su…

    Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: To appear in CVPRW2024

  26. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  27. arXiv:2404.16168  [pdf, other]

    cs.LG cs.AI stat.ML

    The Over-Certainty Phenomenon in Modern UDA Algorithms

    Authors: Fin Amin, Jung-Eun Kim

    Abstract: When neural networks are confronted with unfamiliar data that deviate from their training set, this signifies a domain shift. While these networks output predictions on their inputs, they typically fail to account for their level of familiarity with these novel observations. This challenge becomes even more pronounced in resource-constrained settings, such as embedded systems or edge devices. To a…

    Submitted 24 April, 2024; originally announced April 2024.

  28. arXiv:2404.15882  [pdf, ps, other]

    cs.CV cs.AI

    Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains

    Authors: Eunsu Baek, Keondo Park, Jiyoon Kim, Hyung-Sin Kim

    Abstract: Computer vision applications predict on digital images acquired by a camera from physical scenes through light. However, conventional robustness benchmarks rely on perturbations in digitized images, diverging from distribution shifts occurring in the image acquisition process. To bridge this gap, we introduce a new distribution shift dataset, ImageNet-ES, comprising variations in environmental and…

    Submitted 25 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: Published as a conference paper at CVPR 2024

  29. arXiv:2404.15510  [pdf, other]

    cs.AR cs.DC cs.LG cs.NE

    NeuraChip: Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator

    Authors: Kaustubh Shivdikar, Nicolas Bohm Agostini, Malith Jayaweera, Gilbert Jonatan, Jose L. Abellan, Ajay Joshi, John Kim, David Kaeli

    Abstract: Graph Neural Networks (GNNs) are emerging as a formidable tool for processing non-euclidean data across various domains, ranging from social network analysis to bioinformatics. Despite their effectiveness, their adoption has not been pervasive because of scalability challenges associated with large-scale graph datasets, particularly when leveraging message passing. To tackle these challenges, we…

    Submitted 26 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Visit https://neurachip.us for WebGUI based simulations

  30. arXiv:2404.15333  [pdf, other]

    eess.SP cs.LG

    EB-GAME: A Game-Changer in ECG Heartbeat Anomaly Detection

    Authors: JuneYoung Park, Da Young Kim, Yunsoo Kim, Jisu Yoo, Tae Joon Kim

    Abstract: Cardiologists use electrocardiograms (ECG) for the detection of arrhythmias. However, continuous monitoring of ECG signals to detect cardiac abnormalities requires significant time and human resources. As a result, several deep learning studies have been conducted in advance for the automatic detection of arrhythmia. These models show relatively high performance in supervised learning, but are no…

    Submitted 8 April, 2024; originally announced April 2024.

  31. arXiv:2404.15190  [pdf, other]

    cs.AI cs.CL cs.CV cs.RO

    Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following

    Authors: Suyeon Shin, Sujin jeon, Junghyun Kim, Gi-Cheon Kang, Byoung-Tak Zhang

    Abstract: Embodied Instruction Following (EIF) is the task of executing natural language instructions by navigating and interacting with objects in 3D environments. One of the primary challenges in EIF is compositional task planning, which is often addressed with supervised or in-context learning with labeled data. To this end, we introduce the Socratic Planner, the first zero-shot planning method that infe…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 14 pages, 6 figures

    MSC Class: 68T01 (Primary) 68T40; 68T50; 68T45 (Secondary)

  32. arXiv:2404.14687  [pdf, other]

    cs.MM cs.AI cs.CL cs.CV

    Pegasus-v1 Technical Report

    Authors: Raehyuk Jung, Hyojun Go, Jaehyuk Yi, Jiho Jang, Daniel Kim, Jay Suh, Aiden Lee, Cooper Han, Jae Lee, Jeff Kim, Jin-Young Kim, Junwan Kim, Kyle Park, Lucas Lee, Mars Ha, Minjoon Seo, Abraham Jo, Ed Park, Hassan Kianinejad, SJ Kim, Tony Moon, Wade Jeong, Andrei Popescu, Esther Kim, EK Yoon, et al. (19 additional authors not shown)

    Abstract: This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's archi…

    Submitted 22 April, 2024; originally announced April 2024.

  33. arXiv:2404.14202  [pdf, other]

    cs.LG stat.ML

    Rotting Infinitely Many-armed Bandits beyond the Worst-case Rotting: An Adaptive Approach

    Authors: Jung-hun Kim, Milan Vojnovic, Se-Young Yun

    Abstract: In this study, we consider the infinitely many armed bandit problems in rotting environments, where the mean reward of an arm may decrease with each pull, while otherwise, it remains unchanged. We explore two scenarios capturing problem-dependent characteristics regarding the decay of rewards: one in which the cumulative amount of rotting is bounded by $V_T$, referred to as the slow-rotting scenar…

    Submitted 22 April, 2024; originally announced April 2024.

  34. arXiv:2404.13808  [pdf, other]

    cs.IR cs.LG cs.MM

    General Item Representation Learning for Cold-start Content Recommendations

    Authors: Jooeun Kim, Jinri Kim, Kwangeun Yeo, Eungi Kim, Kyoung-Woon On, Jonghwan Mun, Joonseok Lee

    Abstract: Cold-start item recommendation is a long-standing challenge in recommendation systems. A common remedy is to use a content-based approach, but rich information from raw contents in various forms has not been fully utilized. In this paper, we propose a domain/data-agnostic item representation learning framework for cold-start recommendations, naturally equipped with multimodal alignment among vario…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 14 pages

  35. arXiv:2404.13318  [pdf, other]

    cs.LG

    EHRFL: Federated Learning Framework for Heterogeneous EHRs and Precision-guided Selection of Participating Clients

    Authors: Jiyoun Kim, Junu Kim, Kyunghoon Hur, Edward Choi

    Abstract: In this study, we provide solutions to two practical yet overlooked scenarios in federated learning for electronic health records (EHRs): firstly, we introduce EHRFL, a framework that facilitates federated learning across healthcare institutions with distinct medical coding systems and database schemas using text-based linearization of EHRs. Secondly, we focus on a scenario where a single healthca…

    Submitted 20 April, 2024; originally announced April 2024.

  36. arXiv:2404.13081  [pdf, other]

    cs.CL cs.AI cs.LG

    SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs

    Authors: Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, Jinwoo Shin

    Abstract: Large language models (LLMs) have made significant advancements in various natural language processing tasks, including question answering (QA) tasks. While incorporating new information with the retrieval of relevant passages is a promising way to improve QA with LLMs, the existing methods often require additional fine-tuning which becomes infeasible with recent LLMs. Augmenting retrieved passage…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted at ICLR 2024

  37. arXiv:2404.12404  [pdf, other]

    cs.LG cs.AI

    Group-wise Prompting for Synthetic Tabular Data Generation using Large Language Models

    Authors: Jinhee Kim, Taesung Kim, Jaegul Choo

    Abstract: Generating realistic synthetic tabular data presents a critical challenge in machine learning. This study introduces a simple yet effective method employing Large Language Models (LLMs) tailored to generate synthetic data, specifically addressing data imbalance problems. We propose a novel group-wise prompting method in CSV-style formatting that leverages the in-context learning capabilities of LL…

    Submitted 15 April, 2024; originally announced April 2024.

  38. arXiv:2404.11972  [pdf, other]

    cs.CL

    Aligning Language Models to Explicitly Handle Ambiguity

    Authors: Hyuhng Joon Kim, Youna Kim, Cheonbok Park, Junyeob Kim, Choonghyun Park, Kang Min Yoo, Sang-goo Lee, Taeuk Kim

    Abstract: In spoken languages, utterances are often shaped to be incomplete or vague for efficiency. This can lead to varying interpretations of the same input, based on different assumptions about the context. To ensure reliable user-model interactions in such scenarios, it is crucial for models to adeptly handle the inherent ambiguity in user queries. However, conversational agents built upon even the mos…

    Submitted 18 April, 2024; originally announced April 2024.

  39. arXiv:2404.11925  [pdf, other]

    cs.LG cs.AI cs.CV

    EdgeFusion: On-Device Text-to-Image Generation

    Authors: Thibault Castells, Hyoung-Kyu Song, Tairen Piao, Shinkook Choi, Bo-Kyeong Kim, Hanyoung Yim, Changgwun Lee, Jae Gon Kim, Tae-Ho Kim

    Abstract: The intensive computational burden of Stable Diffusion (SD) for text-to-image generation poses a significant hurdle for its practical application. To tackle this challenge, recent research focuses on methods to reduce sampling steps, such as Latent Consistency Model (LCM), and on employing architectural optimizations, including pruning and knowledge distillation. Diverging from existing approaches…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 4 pages, accepted to CVPR24 First Workshop on Efficient and On-Device Generation (EDGE)

  40. arXiv:2404.11916  [pdf, other]

    cs.CL cs.AI

    SKIP: Skill-Localized Prompt Tuning for Inference Speed Boost-Up

    Authors: Nakyeong Yang, Junseok Kim, Jiwon Moon, Yunah Jang, Kyomin Jung

    Abstract: Prompt-tuning methods have shown comparable performance as parameter-efficient fine-tuning (PEFT) methods in various natural language understanding tasks. However, existing prompt tuning methods still utilize the entire model architecture; thus, they fail to accelerate inference speed in the application. In this paper, we propose a novel approach called SKIll-localized Prompt tuning (SKIP), which…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 6 pages

  41. arXiv:2404.11320  [pdf, other]

    cs.RO

    Saturated RISE control for considering rotor thrust saturation of fully actuated multirotor

    Authors: Dongjae Lee, H. Jin Kim

    Abstract: This work proposes a saturated robust controller for a fully actuated multirotor that takes disturbance rejection and rotor thrust saturation into account. A disturbance rejection controller is required to prevent performance degradation in the presence of parametric uncertainty and external disturbance. Furthermore, rotor saturation should be properly addressed in a controller to avoid performanc…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 6 pages, 5 figures, 2024 International Conference on Unmanned Aircraft Systems (ICUAS) accepted

  42. arXiv:2404.11310  [pdf, other]

    cs.RO

    Autonomous aerial perching and unperching using omnidirectional tiltrotor and switching controller

    Authors: Dongjae Lee, Sunwoo Hwang, Jeonghyun Byun, Seung Jae Lee, H. Jin Kim

    Abstract: Aerial unperching of multirotors has received little attention as opposed to perching that has been investigated to elongate operation time. This study presents a new aerial robot capable of both perching and unperching autonomously on/from a ferromagnetic surface during flight, and a switching controller to avoid rotor saturation and mitigate overshoot during transition between free-flight and pe…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 7 pages, 10 figures, 2024 IEEE International Conference on Robotics and Automation (ICRA) accepted

  43. arXiv:2404.11104  [pdf, other]

    cs.CV

    Object Remover Performance Evaluation Methods using Class-wise Object Removal Images

    Authors: Changsuk Oh, Dongseok Shim, Taekbeom Lee, H. Jin Kim

    Abstract: Object removal refers to the process of erasing designated objects from an image while preserving the overall appearance, and it is one area where image inpainting is widely used in real-world applications. The performance of an object remover is quantitatively evaluated by measuring the quality of object removal results, similar to how the performance of an image inpainter is gauged. Current work…

    Submitted 17 April, 2024; originally announced April 2024.

  44. arXiv:2404.10938  [pdf, other]

    cs.RO

    Safety-critical Autonomous Inspection of Distillation Columns using Quadrupedal Robots Equipped with Roller Arms

    Authors: Jaemin Lee, Jeeseop Kim, Aaron D. Ames

    Abstract: This paper proposes a comprehensive framework designed for the autonomous inspection of complex environments, with a specific focus on multi-tiered settings such as distillation column trays. Leveraging quadruped robots equipped with roller arms, and through the use of onboard perception, we integrate essential motion components including: locomotion, safe and dynamic transitions between trays, an…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 8 pages, 8 figures

  45. arXiv:2404.10936  [pdf, other]

    eess.SP cs.LG

    Beam Training in mmWave Vehicular Systems: Machine Learning for Decoupling Beam Selection

    Authors: Ibrahim Kilinc, Ryan M. Dreifuerst, Junghoon Kim, Robert W. Heath Jr

    Abstract: Codebook-based beam selection is one approach for configuring millimeter wave communication links. The overhead required to reconfigure the transmit and receive beam pair, though, increases in highly dynamic vehicular communication systems. Location information coupled with machine learning (ML) beam recommendation is one way to reduce the overhead of beam pair selection. In this paper, we develop…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE BlackSeaCom 2024, 6 pages, 5 figures

  46. arXiv:2404.10436  [pdf, other]

    cs.LG stat.CO stat.ME

    Tree Bandits for Generative Bayes

    Authors: Sean O'Hagan, Jungeum Kim, Veronika Rockova

    Abstract: In generative models with obscured likelihood, Approximate Bayesian Computation (ABC) is often the tool of last resort for inference. However, ABC demands many prior parameter trials to keep only a small fraction that passes an acceptance test. To accelerate ABC rejection sampling, this paper develops a self-aware framework that learns from past trials and errors. We apply recursive partitioning c…

    Submitted 16 April, 2024; originally announced April 2024.

  47. arXiv:2404.10308  [pdf, other]

    cs.LG cs.AI

    Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs

    Authors: Woomin Song, Seunghyuk Oh, Sangwoo Mo, Jaehyung Kim, Sukmin Yun, Jung-Woo Ha, Jinwoo Shin

    Abstract: Large language models (LLMs) have shown remarkable performance in various natural language processing tasks. However, a primary constraint they face is the context limit, i.e., the maximum number of tokens they can process. Previous works have explored architectural changes and modifications in positional encoding to relax the constraint, but they often require expensive training or do not address…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted to ICLR 2024. The first two authors contributed equally

  48. arXiv:2404.10179  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG

    Scaling Instructable Agents Across Many Simulated Worlds

    Authors: SIMA Team, Maria Abi Raad, Arun Ahuja, Catarina Barros, Frederic Besse, Andrew Bolt, Adrian Bolton, Bethanie Brownfield, Gavin Buttimore, Max Cant, Sarah Chakera, Stephanie C. Y. Chan, Jeff Clune, Adrian Collister, Vikki Copeman, Alex Cullum, Ishita Dasgupta, Dario de Cesare, Julia Di Trapani, Yani Donchev, Emma Dunleavy, Martin Engelcke, Ryan Faulkner, Frankie Garcia, Charles Gbadamosi, et al. (68 additional authors not shown)

    Abstract: Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground language in perception and embodied actions, in order to accomplish complex tasks. The Scalable, Instructable, Multiworld Agent (SIMA) project tackles this by training agents to follow free-form instructio…

    Submitted 17 April, 2024; v1 submitted 13 March, 2024; originally announced April 2024.

  49. arXiv:2404.09886  [pdf, other]

    cs.LG cs.CV

    ReffAKD: Resource-efficient Autoencoder-based Knowledge Distillation

    Authors: Divyang Doshi, Jung-Eun Kim

    Abstract: In this research, we propose an innovative method to boost Knowledge Distillation efficiency without the need for resource-heavy teacher models. Knowledge Distillation trains a smaller "student" model with guidance from a larger "teacher" model, which is computationally costly. However, the main benefit comes from the soft labels provided by the teacher, helping the student grasp nuanced class…

    Submitted 15 April, 2024; originally announced April 2024.

  50. arXiv:2404.08871  [pdf, other]

    cs.DC cs.AR

    PID-Comm: A Fast and Flexible Collective Communication Framework for Commodity Processing-in-DIMM Devices

    Authors: Si Ung Noh, Junguk Hong, Chaemin Lim, Seongyeon Park, Jeehyun Kim, Hanjun Kim, Youngsok Kim, Jinho Lee

    Abstract: Recent dual in-line memory modules (DIMMs) are starting to support processing-in-memory (PIM) by associating their memory banks with processing elements (PEs), allowing applications to overcome the data movement bottleneck by offloading memory-intensive operations to the PEs. Many highly parallel applications have been shown to benefit from these PIM-enabled DIMMs, but further speedup is often lim…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted to ISCA 2024