Showing 1–50 of 1,846 results for author: Lee, S

Searching in archive cs.

  1. arXiv:2405.05923  [pdf]

    cs.HC

    Darkverse -- A New DarkWeb?

    Authors: Raymond Chan, Benjamin W. J. Kwok, Adriel Yeo, Kan Chen, Jeannie S. Lee

    Abstract: The "Darkverse" could be the negative harmful area of the Metaverse; a new virtual immersive environment for the facilitation of illicit activity such as misinformation, fraud, harassment, and illegal marketplaces. This paper explores the potential for inappropriate activities within the Metaverse, and the similarities between the Darkverse and the Dark Web. Challenges and future directions for in…

    Submitted 23 April, 2024; originally announced May 2024.

    Comments: This is an accepted position statement of CHI 2024 Workshop (Novel Approaches for Understanding and Mitigating Emerging New Harms in Immersive and Embodied Virtual Spaces: A Workshop at CHI 2024)

  2. Exploiting Autoencoder's Weakness to Generate Pseudo Anomalies

    Authors: Marcella Astrid, Muhammad Zaigham Zaheer, Djamila Aouada, Seung-Ik Lee

    Abstract: Due to the rare occurrence of anomalous events, a typical approach to anomaly detection is to train an autoencoder (AE) with normal data only so that it learns the patterns or representations of the normal training data. At test time, the trained AE is expected to well reconstruct normal but to poorly reconstruct anomalous data. However, contrary to the expectation, anomalous data is often well re…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: SharedIt link: https://rdcu.be/dGOrh

    Journal ref: Astrid, M., Zaheer, M.Z., Aouada, D. and Lee, S.I., 2024. Exploiting autoencoder's weakness to generate pseudo anomalies. Neural Computing and Applications, pp.1-17

  3. arXiv:2405.03821  [pdf, other]

    cs.HC cs.AI cs.SE

    Thoughtful Things: Building Human-Centric Smart Devices with Small Language Models

    Authors: Evan King, Haoxiang Yu, Sahil Vartak, Jenna Jacob, Sangsu Lee, Christine Julien

    Abstract: Everyday devices like light bulbs and kitchen appliances are now embedded with so many features and automated behaviors that they have become complicated to actually use. While such "smart" capabilities can better support users' goals, the task of learning the "ins and outs" of different devices is daunting. Voice assistants aim to solve this problem by providing a natural language interface to de…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 24 pages (3 pages of references)

  4. arXiv:2405.03183  [pdf, other]

    cs.DC cs.CR math.NA

    Impact of EIP-4844 on Ethereum: Consensus Security, Ethereum Usage, Rollup Transaction Dynamics, and Blob Gas Fee Markets

    Authors: Seongwan Park, Bosul Mun, Seungyun Lee, Woojin Jeong, Jaewook Lee, Hyeonsang Eom, Huisu Jang

    Abstract: On March 13, 2024, Ethereum implemented EIP-4844, designed to enhance its role as a data availability layer. While this upgrade reduces data posting costs for rollups, it also raises concerns about its impact on the consensus layer due to increased propagation sizes. Moreover, the broader effects on the overall Ethereum ecosystem remain largely unexplored. In this paper, we conduct an empirical an…

    Submitted 6 May, 2024; originally announced May 2024.

  5. arXiv:2405.01974  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Multitask Extension of Geometrically Aligned Transfer Encoder

    Authors: Sung Moon Ko, Sumin Lee, Dae-Woong Jeong, Hyunseung Kim, Chanhui Lee, Soorin Yim, Sehui Han

    Abstract: Molecular datasets often suffer from a lack of data. It is well-known that gathering data is difficult due to the complexity of experimentation or simulation involved. Here, we leverage mutual information across different tasks in molecular data to address this issue. We extend an algorithm that utilizes the geometric characteristics of the encoding space, known as the Geometrically Aligned Transf…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 7 pages, 3 figures, 2 tables

  6. arXiv:2405.01714  [pdf, other]

    cs.LG cs.AI

    Interpretable Vital Sign Forecasting with Model Agnostic Attention Maps

    Authors: Yuwei Liu, Chen Dan, Anubhav Bhatti, Bingjie Shen, Divij Gupta, Suraj Parmar, San Lee

    Abstract: Sepsis is a leading cause of mortality in intensive care units (ICUs), representing a substantial medical challenge. The complexity of analyzing diverse vital signs to predict sepsis further aggravates this issue. While deep learning techniques have been advanced for early sepsis prediction, their 'black-box' nature obscures the internal logic, impairing interpretability in critical settings like…

    Submitted 6 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 8 pages, 4 figures

  7. arXiv:2405.01554  [pdf, other]

    cs.LG cs.AI q-bio.NC

    Early-stage detection of cognitive impairment by hybrid quantum-classical algorithm using resting-state functional MRI time-series

    Authors: Junggu Choi, Tak Hur, Daniel K. Park, Na-Young Shin, Seung-Koo Lee, Hakbae Lee, Sanghoon Han

    Abstract: Following the recent development of quantum machine learning techniques, the literature has reported several quantum machine learning algorithms for disease detection. This study explores the application of a hybrid quantum-classical algorithm for classifying region-of-interest time-series data obtained from resting-state functional magnetic resonance imaging in patients with early-stage cognitive…

    Submitted 16 March, 2024; originally announced May 2024.

    Comments: 28 pages, 10 figures

  8. arXiv:2405.01113  [pdf, other]

    cs.CV cs.AI eess.IV

    Domain-Transferred Synthetic Data Generation for Improving Monocular Depth Estimation

    Authors: Seungyeop Lee, Knut Peterson, Solmaz Arezoomandan, Bill Cai, Peihan Li, Lifeng Zhou, David Han

    Abstract: A major obstacle to the development of effective monocular depth estimation algorithms is the difficulty in obtaining high-quality depth data that corresponds to collected RGB images. Collecting this data is time-consuming and costly, and even data collected by modern sensors has limited range or resolution, and is subject to inconsistencies and noise. To combat this, we propose a method of data g…

    Submitted 2 May, 2024; originally announced May 2024.

  9. arXiv:2405.00236  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    STT: Stateful Tracking with Transformers for Autonomous Driving

    Authors: Longlong Jing, Ruichi Yu, Xu Chen, Zhengli Zhao, Shiwei Sheng, Colin Graber, Qi Chen, Qinru Li, Shangxuan Wu, Han Deng, Sangjin Lee, Chris Sweeney, Qiurui He, Wei-Chih Hung, Tong He, Xingyi Zhou, Farshid Moussavi, Zijian Guo, Yin Zhou, Mingxing Tan, Weilong Yang, Congcong Li

    Abstract: Tracking objects in three-dimensional space is critical for autonomous driving. To ensure safety while driving, the tracker must be able to reliably track objects across frames and accurately estimate their states such as velocity and acceleration in the present. Existing works frequently focus on the association task while either neglecting the model performance on state estimation or deploying c…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: ICRA 2024

  10. arXiv:2404.18926  [pdf, other]

    cs.RO cs.CV cs.LG

    Point Cloud Models Improve Visual Robustness in Robotic Learners

    Authors: Skand Peri, Iain Lee, Chanho Kim, Li Fuxin, Tucker Hermans, Stefan Lee

    Abstract: Visual control policies can encounter significant performance degradation when visual conditions like lighting or camera position differ from those seen during training -- often exhibiting sharp declines in capability even for minor differences. In this work, we examine robustness to a suite of these types of visual changes for RGB-D and point cloud based visual control policies. To perform these…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted at International Conference on Robotics and Automation, 2024

  11. arXiv:2404.18448  [pdf, other]

    cs.CV

    MFP: Making Full Use of Probability Maps for Interactive Image Segmentation

    Authors: Chaewon Lee, Seon-Ho Lee, Chang-Su Kim

    Abstract: In recent interactive segmentation algorithms, previous probability maps are used as network input to help predictions in the current segmentation round. However, despite the utilization of previous masks, useful information contained in the probability maps is not well propagated to the current predictions. In this paper, to overcome this limitation, we propose a novel and effective algorithm for…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  12. arXiv:2404.17592  [pdf, other]

    cs.IR cs.LG stat.ML

    Low-Rank Online Dynamic Assortment with Dual Contextual Information

    Authors: Seong Jin Lee, Will Wei Sun, Yufeng Liu

    Abstract: As e-commerce expands, delivering real-time personalized recommendations from vast catalogs poses a critical challenge for retail platforms. Maximizing revenue requires careful consideration of both individual customer characteristics and available item features to optimize assortments over time. In this paper, we consider the dynamic assortment problem with dual contexts -- user and item features…

    Submitted 19 April, 2024; originally announced April 2024.

  13. arXiv:2404.17563  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    An exactly solvable model for emergence and scaling laws

    Authors: Yoonsoo Nam, Nayara Fonseca, Seok Hyeong Lee, Ard Louis

    Abstract: Deep learning models can exhibit what appears to be a sudden ability to solve a new problem as training time ($T$), training data ($D$), or model size ($N$) increases, a phenomenon known as emergence. In this paper, we present a framework where each new ability (a skill) is represented as a basis function. We solve a simple multi-linear model in this skill-basis, finding analytic expressions for t…

    Submitted 26 April, 2024; originally announced April 2024.

  14. arXiv:2404.17047  [pdf, other]

    cs.LG

    Near to Mid-term Risks and Opportunities of Open Source Generative AI

    Authors: Francisco Eiras, Aleksandar Petrov, Bertie Vidgen, Christian Schroeder de Witt, Fabio Pizzati, Katherine Elkins, Supratik Mukhopadhyay, Adel Bibi, Botos Csaba, Fabro Steibel, Fazl Barez, Genevieve Smith, Gianluca Guadagni, Jon Chun, Jordi Cabot, Joseph Marvin Imperial, Juan A. Nolazco-Flores, Lori Landay, Matthew Jackson, Paul Röttger, Philip H. S. Torr, Trevor Darrell, Yong Suk Lee, Jakob Foerster

    Abstract: In the next few years, applications of Generative AI are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about potential risks and resulted in calls for tighter regulation, in particular from some of the major tech companies who are leading in AI development. This regulation i…

    Submitted 25 April, 2024; originally announced April 2024.

  15. arXiv:2404.16804  [pdf, other]

    cs.CV cs.AI cs.LG

    AAPL: Adding Attributes to Prompt Learning for Vision-Language Models

    Authors: Gahyeon Kim, Sohee Kim, Seokju Lee

    Abstract: Recent advances in large pre-trained vision-language models have demonstrated remarkable performance on zero-shot downstream tasks. Building upon this, recent studies, such as CoOp and CoCoOp, have proposed the use of prompt learning, where context within a prompt is replaced with learnable vectors, leading to significant improvements over manually crafted prompts. However, the performance improve…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024 Workshop on Prompting in Vision, Project Page: https://github.com/Gahyeonkim09/AAPL

  16. Cost-Driven Data Replication with Predictions

    Authors: Tianyu Zuo, Xueyan Tang, Bu Sung Lee

    Abstract: This paper studies an online replication problem for distributed data access. The goal is to dynamically create and delete data copies in a multi-server system as time passes to minimize the total storage and network cost of serving access requests. We study the problem in the emergent learning-augmented setting, assuming simple binary predictions about inter-request times at individual servers. W…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: The formal version of this draft will appear in ACM SPAA'24 conference

  17. arXiv:2404.16257  [pdf, other]

    cs.CL cs.AI

    Translation of Multifaceted Data without Re-Training of Machine Translation Systems

    Authors: Hyeonseok Moon, Seungyoon Lee, Seongtae Hong, Seungjun Lee, Chanjun Park, Heuiseok Lim

    Abstract: Translating major language resources to build minor language resources becomes a widely-used approach. Particularly in translating complex data points composed of multiple components, it is common to translate each component separately. However, we argue that this practice often overlooks the interrelation between components within the same data point. To address this limitation, we propose a nove…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 19 pages

  18. arXiv:2404.16123  [pdf, other]

    cs.CV cs.AI cs.CL

    FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication

    Authors: Eric Slyman, Stefan Lee, Scott Cohen, Kushal Kafle

    Abstract: Recent dataset deduplication techniques have demonstrated that content-aware dataset pruning can dramatically reduce the cost of training Vision-Language Pretrained (VLP) models without significant performance losses compared to training on the original dataset. These results have been based on pruning commonly used image-caption datasets collected from the web -- datasets that are known to harbor…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Conference paper at CVPR 2024. 6 pages, 8 figures. Project Page: https://ericslyman.com/fairdedup/

    ACM Class: I.4.10; I.2.7; E.0

  19. arXiv:2404.16069  [pdf, other]

    cs.HC cs.AI

    Interactive Visual Learning for Stable Diffusion

    Authors: Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Polo Chau

    Abstract: Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex internal structures and operations often pose challenges for non-experts to grasp. We introduce Diffusion Explainer, the first interactive visualization tool designed to elucidate how Stable Diffusion transforms text prompts into images. It tightly integrates a vi…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 4 pages, 3 figures. arXiv admin note: substantial text overlap with arXiv:2305.03509

  20. arXiv:2404.15305  [pdf, other]

    eess.SP cs.LG

    ADAPT^2: Adapting Pre-Trained Sensing Models to End-Users via Self-Supervision Replay

    Authors: Hyungjun Yoon, Jaehyun Kwak, Biniyam Aschalew Tolera, Gaole Dai, Mo Li, Taesik Gong, Kimin Lee, Sung-Ju Lee

    Abstract: Self-supervised learning has emerged as a method for utilizing massive unlabeled data for pre-training models, providing an effective feature extractor for various mobile sensing applications. However, when deployed to end-users, these models encounter significant domain shifts attributed to user diversity. We investigate the performance degradation that occurs when self-supervised models are fine…

    Submitted 29 March, 2024; originally announced April 2024.

  21. arXiv:2404.14687  [pdf, other]

    cs.MM cs.AI cs.CL cs.CV

    Pegasus-v1 Technical Report

    Authors: Raehyuk Jung, Hyojun Go, Jaehyuk Yi, Jiho Jang, Daniel Kim, Jay Suh, Aiden Lee, Cooper Han, Jae Lee, Jeff Kim, Jin-Young Kim, Junwan Kim, Kyle Park, Lucas Lee, Mars Ha, Minjoon Seo, Abraham Jo, Ed Park, Hassan Kianinejad, SJ Kim, Tony Moon, Wade Jeong, Andrei Popescu, Esther Kim, EK Yoon, et al. (19 additional authors not shown)

    Abstract: This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's archi…

    Submitted 22 April, 2024; originally announced April 2024.

  22. arXiv:2404.13286  [pdf, other]

    cs.SD cs.IR eess.AS

    Track Role Prediction of Single-Instrumental Sequences

    Authors: Changheon Han, Suhyun Lee, Minsam Ko

    Abstract: In the composition process, selecting appropriate single-instrumental music sequences and assigning their track-role is an indispensable task. However, manually determining the track-role for a myriad of music samples can be time-consuming and labor-intensive. This study introduces a deep learning model designed to automatically predict the track-role of single-instrumental music sequences. Our ev…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: ISMIR LBD 2023

  23. arXiv:2404.13081  [pdf, other]

    cs.CL cs.AI cs.LG

    SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs

    Authors: Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, Jinwoo Shin

    Abstract: Large language models (LLMs) have made significant advancements in various natural language processing tasks, including question answering (QA) tasks. While incorporating new information with the retrieval of relevant passages is a promising way to improve QA with LLMs, the existing methods often require additional fine-tuning which becomes infeasible with recent LLMs. Augmenting retrieved passage…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted at ICLR 2024

  24. arXiv:2404.12563  [pdf, other]

    cs.HC cs.ET

    Teaching Linguistic Justice through Augmented Reality

    Authors: Ashvini Varatharaj, Abigail Welch, Mary Bucholtz, Jin Sook Lee

    Abstract: This position paper presents the AR Language Map, a speculative artifact designed to enhance understanding of linguistic justice among middle and high school students through augmented reality (AR) that allows students to map their linguistic experiences. Through a social justice-oriented academic outreach program aimed at linguistically, economically, and racially minoritized students, academic c…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: Presented at CHI 2024 (arXiv:2404.05889)

    Report number: ARSJ/2024/05

  25. arXiv:2404.11972  [pdf, other]

    cs.CL

    Aligning Language Models to Explicitly Handle Ambiguity

    Authors: Hyuhng Joon Kim, Youna Kim, Cheonbok Park, Junyeob Kim, Choonghyun Park, Kang Min Yoo, Sang-goo Lee, Taeuk Kim

    Abstract: In spoken languages, utterances are often shaped to be incomplete or vague for efficiency. This can lead to varying interpretations of the same input, based on different assumptions about the context. To ensure reliable user-model interactions in such scenarios, it is crucial for models to adeptly handle the inherent ambiguity in user queries. However, conversational agents built upon even the mos…

    Submitted 18 April, 2024; originally announced April 2024.

  26. arXiv:2404.11310  [pdf, other]

    cs.RO

    Autonomous aerial perching and unperching using omnidirectional tiltrotor and switching controller

    Authors: Dongjae Lee, Sunwoo Hwang, Jeonghyun Byun, Seung Jae Lee, H. Jin Kim

    Abstract: Aerial unperching of multirotors has received little attention as opposed to perching that has been investigated to elongate operation time. This study presents a new aerial robot capable of both perching and unperching autonomously on/from a ferromagnetic surface during flight, and a switching controller to avoid rotor saturation and mitigate overshoot during transition between free-flight and pe…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 7 pages, 10 figures, 2024 IEEE International Conference on Robotics and Automation (ICRA) accepted

  27. arXiv:2404.11041  [pdf, other]

    cs.AI cs.LG

    On the Empirical Complexity of Reasoning and Planning in LLMs

    Authors: Liwei Kang, Zirui Zhao, David Hsu, Wee Sun Lee

    Abstract: Large Language Models (LLMs) work surprisingly well for some complex reasoning problems via chain-of-thought (CoT) or tree-of-thought (ToT), but the underlying reasons remain unclear. We seek to understand the performance of these methods by conducting experimental case studies and linking the outcomes to sample and computational complexity in machine learning. We found that if problems can be dec…

    Submitted 16 April, 2024; originally announced April 2024.

  28. arXiv:2404.06727  [pdf, other]

    cs.CV

    Bayesian NeRF: Quantifying Uncertainty with Volume Density in Neural Radiance Fields

    Authors: Sibeak Lee, Kyeongsu Kang, Hyeonwoo Yu

    Abstract: We present the Bayesian Neural Radiance Field (NeRF), which explicitly quantifies uncertainty in geometric volume structures without the need for additional networks, making it adept for challenging observations and uncontrolled images. NeRF diverges from traditional geometric methods by offering an enriched scene representation, rendering color and density in 3D space from various viewpoints. How…

    Submitted 10 April, 2024; originally announced April 2024.

  29. arXiv:2404.05431  [pdf, other]

    cs.CR

    Simplifying MBA Expression Using E-Graphs

    Authors: Seoksu Lee, Hyeongchang Jeon, Eun-Sun Cho

    Abstract: Code obfuscation involves the addition of meaningless code or the complication of existing code in order to make a program difficult to reverse engineer. In recent years, MBA (Mixed Boolean Arithmetic) obfuscation has been applied to virus and malware code to impede expert analysis. Among the various obfuscation techniques, Mixed Boolean Arithmetic (MBA) obfuscation is considered the most challeng…

    Submitted 8 April, 2024; originally announced April 2024.

  30. arXiv:2404.05297  [pdf, other]

    cs.CR cs.SE

    Automated Attack Synthesis for Constant Product Market Makers

    Authors: Sujin Han, Jinseo Kim, Sung-Ju Lee, Insu Yun

    Abstract: Decentralized Finance enables many novel applications that were impossible in traditional finances. However, it also introduces new types of vulnerabilities, such as composability bugs. The composability bugs refer to issues that lead to erroneous behaviors when multiple smart contracts operate together. One typical example of composability bugs is those between token contracts and Constant Produc…

    Submitted 24 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: 12 pages, 8 figures

  31. arXiv:2404.04941  [pdf, other]

    cs.CL

    Prompting Large Language Models for Zero-shot Essay Scoring via Multi-trait Specialization

    Authors: Sanwoo Lee, Yida Cai, Desong Meng, Ziyang Wang, Yunfang Wu

    Abstract: Advances in automated essay scoring (AES) have traditionally relied on labeled essays, requiring tremendous cost and expertise for their acquisition. Recently, large language models (LLMs) have achieved great success in various tasks, but their potential is less explored in AES. In this paper, we propose Multi Trait Specialization (MTS), a zero-shot prompting framework to elicit essay scoring capa…

    Submitted 7 April, 2024; originally announced April 2024.

  32. arXiv:2404.04376  [pdf, other]

    cs.CV cs.AI

    ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing

    Authors: Alec Helbling, Seongmin Lee, Polo Chau

    Abstract: Recently, researchers have proposed powerful systems for generating and manipulating images using natural language instructions. However, it is difficult to precisely specify many common classes of image transformations with text alone. For example, a user may wish to change the location and breed of a particular dog in an image with several similar dogs. This task is quite difficult with natural…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.07925

  33. arXiv:2404.04096  [pdf, other]

    cs.IT eess.SP

    Machine Learning-Aided Cooperative Localization under Dense Urban Environment

    Authors: Hoon Lee, Hong Ki Kim, Seung Hyun Oh, Sang Hyun Lee

    Abstract: Future wireless network technology provides automobiles with the connectivity feature to consolidate the concept of vehicular networks that collaborate on conducting cooperative driving tasks. The full potential of connected vehicles, which promises road safety and quality driving experience, can be leveraged if machine learning models guarantee the robustness in performing core functions includin…

    Submitted 5 April, 2024; originally announced April 2024.

  34. arXiv:2404.03138  [pdf, other]

    cs.CV cs.GR

    Discontinuity-preserving Normal Integration with Auxiliary Edges

    Authors: Hyomin Kim, Yucheol Jung, Seungyong Lee

    Abstract: Many surface reconstruction methods incorporate normal integration, which is a process to obtain a depth map from surface gradients. In this process, the input may represent a surface with discontinuities, e.g., due to self-occlusion. To reconstruct an accurate depth map from the input normal map, hidden surface gradients occurring from the jumps must be handled. To model these jumps correctly, we…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: To appear at CVPR 2024. For supplementary video, see https://youtu.be/MTTcW5kAOFE

    ACM Class: I.4.5

  35. arXiv:2404.03105  [pdf, other]

    cs.LG math.OC

    Methodology for Interpretable Reinforcement Learning for Optimizing Mechanical Ventilation

    Authors: Joo Seung Lee, Malini Mahendra, Anil Aswani

    Abstract: Mechanical ventilation is a critical life-support intervention that uses a machine to deliver controlled air and oxygen to a patient's lungs, assisting or replacing spontaneous breathing. While several data-driven approaches have been proposed to optimize ventilator control strategies, they often lack interpretability and agreement with general domain knowledge. This paper proposes a methodology f…

    Submitted 3 April, 2024; originally announced April 2024.

  36. arXiv:2404.02772  [pdf, other]

    cs.CL

    FPT: Feature Prompt Tuning for Few-shot Readability Assessment

    Authors: Ziyang Wang, Sanwoo Lee, Hsiu-Yuan Huang, Yunfang Wu

    Abstract: Prompt-based methods have achieved promising results in most few-shot text classification tasks. However, for readability assessment tasks, traditional prompt methods lack crucial linguistic knowledge, which has already been proven to be essential. Moreover, previous studies on utilizing linguistic features have shown non-robust performance in few-shot settings and may even impair model performance…

    Submitted 10 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: NAACL-2024 main conference

  37. arXiv:2404.02754  [pdf, ps, other]

    cs.LG

    Continual Learning of Numerous Tasks from Long-tail Distributions

    Authors: Liwei Kang, Wee Sun Lee

    Abstract: Continual learning, an important aspect of artificial intelligence and machine learning research, focuses on developing models that learn and adapt to new tasks while retaining previously acquired knowledge. Existing continual learning algorithms usually involve a small number of tasks with uniform sizes and may not accurately represent real-world learning scenarios. In this paper, we investigate…

    Submitted 3 April, 2024; originally announced April 2024.

  38. arXiv:2404.02405  [pdf, other]

    cs.CV

    TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression

    Authors: Ho-Joong Kim, Jung-Ho Hong, Heejo Kong, Seong-Whan Lee

    Abstract: In this paper, we investigate that the normalized coordinate expression is a key factor as reliance on hand-crafted components in query-based detectors for temporal action detection (TAD). Despite significant advancements towards an end-to-end framework in object detection, query-based detectors have been limited in achieving full end-to-end modeling in TAD. To address this issue, we propose \mode…

    Submitted 3 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  39. arXiv:2404.02157  [pdf, other]

    cs.CV cs.AI

    Segment Any 3D Object with Language

    Authors: Seungjun Lee, Yuyang Zhao, Gim Hee Lee

    Abstract: In this paper, we investigate Open-Vocabulary 3D Instance Segmentation (OV-3DIS) with free-form language instructions. Earlier works that rely on only annotated base categories for training suffer from limited generalization to unseen novel categories. Recent works mitigate poor generalizability to novel categories by generating class-agnostic masks or projecting generalized masks from 2D to 3D, b…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Project Page: https://cvrp-sole.github.io

  40. arXiv:2404.02135  [pdf]

    cs.CV eess.IV

    Enhancing Ship Classification in Optical Satellite Imagery: Integrating Convolutional Block Attention Module with ResNet for Improved Performance

    Authors: Ryan Donghan Kwon, Gangjoo Robin Nam, Jisoo Tak, Junseob Shin, Hyerin Cha, Yeom Hyeok, Seung Won Lee

    Abstract: This study presents an advanced Convolutional Neural Network (CNN) architecture for ship classification from optical satellite imagery, significantly enhancing performance through the integration of the Convolutional Block Attention Module (CBAM) and additional architectural innovations. Building upon the foundational ResNet50 model, we first incorporated a standard CBAM to direct the model's focu…

    Submitted 8 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  41. arXiv:2404.01984  [pdf, other]

    cs.CV

    Fashion Style Editing with Generative Human Prior

    Authors: Chaerin Kong, Seungyong Lee, Soohyeok Im, Wonsuk Yang

    Abstract: Image editing has been a long-standing challenge in the research community with its far-reaching impact on numerous applications. Recently, text-driven methods started to deliver promising results in domains like human faces, but their applications to more complex domains have been relatively limited. In this work, we explore the task of fashion style editing, where we aim to manipulate the fashio…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 5 pages

  42. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  43. arXiv:2404.01842  [pdf, other]

    cs.CV

    Semi-Supervised Domain Adaptation for Wildfire Detection

    Authors: JooYoung Jang, Youngseo Cha, Jisu Kim, SooHyung Lee, Geonu Lee, Minkook Cho, Young Hwang, Nojun Kwak

    Abstract: Recently, both the frequency and intensity of wildfires have increased worldwide, primarily due to climate change. In this paper, we propose a novel protocol for wildfire detection, leveraging semi-supervised Domain Adaptation for object detection, accompanied by a corresponding dataset designed for use by both academics and industries. Our dataset encompasses 30 times more diverse labeled scenes…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 16 pages, 5 figures, 22 tables

  44. arXiv:2404.01361  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    LLM Attributor: Interactive Visual Attribution for LLM Generation

    Authors: Seongmin Lee, Zijie J. Wang, Aishwarya Chakravarthy, Alec Helbling, ShengYun Peng, Mansi Phute, Duen Horng Chau, Minsuk Kahng

    Abstract: While large language models (LLMs) have shown remarkable capability to generate convincing text across diverse domains, concerns around its potential risks have highlighted the importance of understanding the rationale behind text generation. We present LLM Attributor, a Python library that provides interactive visualizations for training data attribution of an LLM's text generation. Our library o…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 8 pages, 3 figures, For a video demo, see https://youtu.be/mIG2MDQKQxM

  45. arXiv:2404.01351  [pdf, other]

    cs.LG cs.AI cs.CV

    AETTA: Label-Free Accuracy Estimation for Test-Time Adaptation

    Authors: Taeckyung Lee, Sorn Chottananurak, Taesik Gong, Sung-Ju Lee

    Abstract: Test-time adaptation (TTA) has emerged as a viable solution to adapt pre-trained models to domain shifts using unlabeled test data. However, TTA faces challenges of adaptation failures due to its reliance on blind adaptation to unknown test samples in dynamic scenarios. Traditional methods for out-of-distribution performance estimation are limited by unrealistic assumptions in the TTA context, suc…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  46. arXiv:2404.01140  [pdf, other]

    cs.CL

    KoCoNovel: Annotated Dataset of Character Coreference in Korean Novels

    Authors: Kyuhee Kim, Surin Lee, Sangah Lee

    Abstract: In this paper, we present KoCoNovel, a novel character coreference dataset derived from Korean literary texts, complete with detailed annotation guidelines. Comprising 178K tokens from 50 modern and contemporary novels, KoCoNovel stands as one of the largest public coreference resolution corpora in Korean, and the first to be based on literary texts. KoCoNovel offers four distinct versions to acco…

    Submitted 11 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 12 pages

  47. arXiv:2404.01104  [pdf, other]

    cs.CL

    SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity

    Authors: Jaemin Kim, Yohan Na, Kangmin Kim, Sang Rak Lee, Dong-Kyu Chae

    Abstract: Recently, sentiment-aware pre-trained language models (PLMs) demonstrate impressive results in downstream sentiment analysis tasks. However, they neglect to evaluate the quality of their constructed sentiment representations; they just focus on improving the fine-tuning performance, which overshadows the representation quality. We argue that without guaranteeing the representation quality, their d…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 14 pages, 8 figures

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: LREC-COLING2024

  48. arXiv:2404.01039  [pdf, other]

    cs.LG

    A Survey on Hypergraph Neural Networks: An In-Depth and Step-By-Step Guide

    Authors: Sunwoo Kim, Soo Yong Lee, Yue Gao, Alessia Antelmi, Mirko Polato, Kijung Shin

    Abstract: Higher-order interactions (HOIs) are ubiquitous in real-world complex systems and applications, and thus investigation of deep learning for HOIs has become a valuable agenda for the data mining and machine learning communities. As networks of HOIs are expressed mathematically as hypergraphs, hypergraph neural networks (HNNs) have emerged as a powerful tool for representation learning on hypergraph…

    Submitted 1 April, 2024; originally announced April 2024.

  49. arXiv:2404.00916  [pdf, other]

    cs.CV

    Gyro-based Neural Single Image Deblurring

    Authors: Heemin Yang, Jaesung Rim, Seungyong Lee, Seung-Hwan Baek, Sunghyun Cho

    Abstract: In this paper, we present GyroDeblurNet, a novel single image deblurring method that utilizes a gyro sensor to effectively resolve the ill-posedness of image deblurring. The gyro sensor provides valuable information about camera motion during exposure time that can significantly improve deblurring quality. However, effectively exploiting real-world gyro data is challenging due to significant error…

    Submitted 8 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 14 pages, 11 figures

  50. arXiv:2404.00638  [pdf, other]

    cs.LG

    HypeBoy: Generative Self-Supervised Representation Learning on Hypergraphs

    Authors: Sunwoo Kim, Shinhwan Kang, Fanchen Bu, Soo Yong Lee, Jaemin Yoo, Kijung Shin

    Abstract: Hypergraphs are marked by complex topology, expressing higher-order interactions among multiple nodes with hyperedges, and better capturing the topology is essential for effective representation learning. Recent advances in generative self-supervised learning (SSL) suggest that hypergraph neural networks learned from generative self supervision have the potential to effectively encode the complex…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Published as a conference paper at ICLR 2024