Showing 1–50 of 1,095 results for author: Kim, D

Searching in archive cs.
  1. arXiv:2405.05107  [pdf, other]

    cs.ET cs.AR eess.SY

    Leveraging AES Padding: dBs for Nothing and FEC for Free in IoT Systems

    Authors: Jongchan Woo, Vipindev Adat Vasudevan, Benjamin D. Kim, Rafael G. L. D'Oliveira, Alejandro Cohen, Thomas Stahlbuhk, Ken R. Duffy, Muriel Médard

    Abstract: The Internet of Things (IoT) represents a significant advancement in digital technology, with its rapidly growing network of interconnected devices. This expansion, however, brings forth critical challenges in data security and reliability, especially under the threat of increasing cyber vulnerabilities. Addressing the security concerns, the Advanced Encryption Standard (AES) is commonly employed…

    Submitted 8 May, 2024; originally announced May 2024.

  2. arXiv:2405.04907  [pdf, other]

    cs.NI

    Empowering Wireless Networks with Artificial Intelligence Generated Graph

    Authors: Jiacheng Wang, Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Haibo Zhou, Dong In Kim

    Abstract: In wireless communications, transforming network into graphs and processing them using deep learning models, such as Graph Neural Networks (GNNs), is one of the mainstream network optimization approaches. While effective, the generative AI (GAI) shows stronger capabilities in graph analysis, processing, and generation, than conventional methods such as GNN, offering a broader exploration space for…

    Submitted 8 May, 2024; originally announced May 2024.

  3. arXiv:2405.04198  [pdf, other]

    cs.CR

    Enhancing Physical Layer Communication Security through Generative AI with Mixture of Experts

    Authors: Changyuan Zhao, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim, Xuemin Shen, Khaled B. Letaief

    Abstract: AI technologies have become more widely adopted in wireless communications. As an emerging type of AI technologies, the generative artificial intelligence (GAI) gains lots of attention in communication security. Due to its powerful learning ability, GAI models have demonstrated superiority over conventional AI methods. However, GAI still has several limitations, including high computational comple…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  4. arXiv:2405.03945  [pdf, other]

    cs.CV cs.NI

    Role of Sensing and Computer Vision in 6G Wireless Communications

    Authors: Seungnyun Kim, Jihoon Moon, Jinhong Kim, Yongjun Ahn, Donghoon Kim, Sunwoo Kim, Kyuhong Shim, Byonghyo Shim

    Abstract: Recently, we are witnessing the remarkable progress and widespread adoption of sensing technologies in autonomous driving, robotics, and metaverse. Considering the rapid advancement of computer vision (CV) technology to analyze the sensing information, we anticipate a proliferation of wireless applications exploiting the sensing and CV technologies in 6G. In this article, we provide a holistic ove…

    Submitted 6 May, 2024; originally announced May 2024.

  5. arXiv:2405.00260  [pdf, other]

    cs.CV

    CREPE: Coordinate-Aware End-to-End Document Parser

    Authors: Yamato Okamoto, Youngmin Baek, Geewook Kim, Ryota Nakao, DongHyun Kim, Moon Bin Yim, Seunghyun Park, Bado Lee

    Abstract: In this study, we formulate an OCR-free sequence generation model for visual document understanding (VDU). Our model not only parses text from document images but also extracts the spatial coordinates of the text based on the multi-head architecture. Named as Coordinate-aware End-to-end Document Parser (CREPE), our method uniquely integrates these capabilities by introducing a special token for OC…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: Accepted at the International Conference on Document Analysis and Recognition (ICDAR 2024) main conference

  6. arXiv:2405.00229  [pdf, other]

    cs.HC cs.AI cs.PL

    Aptly: Making Mobile Apps from Natural Language

    Authors: Evan W. Patton, David Y. J. Kim, Ashley Granquist, Robin Liu, Arianna Scott, Jennet Zamanova, Harold Abelson

    Abstract: We present Aptly, an extension of the MIT App Inventor platform enabling mobile app development via natural language powered by code-generating large language models (LLMs). Aptly complements App Inventor's block language with a text language designed to allow visual code generation via text-based LLMs. We detail the technical aspects of how the Aptly server integrates LLMs with a realtime collabo…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, 7 figures, 2 tables

  7. arXiv:2404.19248  [pdf, other]

    cs.CV

    Transition Rate Scheduling for Quantization-Aware Training

    Authors: Junghyup Lee, Dohyung Kim, Jeimin Jeon, Bumsub Ham

    Abstract: Quantization-aware training (QAT) simulates a quantization process during training to lower bit-precision of weights/activations. It learns quantized weights indirectly by updating latent weights, i.e., full-precision inputs to a quantizer, using gradient-based optimizers. We claim that coupling a user-defined learning rate (LR) with these optimizers is sub-optimal for QAT. Quantized weights trans…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE TPAMI on Apr. 03, 2023

  8. arXiv:2404.18705  [pdf, other]

    cs.IT eess.SP

    Wireless Information and Energy Transfer in the Era of 6G Communications

    Authors: Constantinos Psomas, Konstantinos Ntougias, Nikita Shanin, Dongfang Xu, Kenneth MacSporran Mayer, Nguyen Minh Tran, Laura Cottatellucci, Kae Won Choi, Dong In Kim, Robert Schober, Ioannis Krikidis

    Abstract: Wireless information and energy transfer (WIET) represents an emerging paradigm which employs controllable transmission of radio-frequency signals for the dual purpose of data communication and wireless charging. As such, WIET is widely regarded as an enabler of envisioned 6G use cases that rely on energy-sustainable Internet-of-Things (IoT) networks, such as smart cities and smart grids. Meeting…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Proceedings of the IEEE, 36 pages, 33 figures

  9. arXiv:2404.18459  [pdf, other]

    cs.CV

    Chameleon: A Data-Efficient Generalist for Dense Visual Prediction in the Wild

    Authors: Donggyun Kim, Seongwoong Cho, Semin Kim, Chong Luo, Seunghoon Hong

    Abstract: Large language models have evolved data-efficient generalists, benefiting from the universal language interface and large-scale pre-training. However, constructing a data-efficient generalist for dense visual prediction presents a distinct challenge due to the variation in label structures across different tasks. Consequently, generalization to unseen dense prediction tasks in the low-data regime…

    Submitted 29 April, 2024; originally announced April 2024.

  10. LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing

    Authors: Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-Hsun Chen, Shaowei Wang

    Abstract: Logs are important in modern software development with runtime information. Log parsing is the first step in many log-based analyses, that involve extracting structured information from unstructured log data. Traditional log parsers face challenges in accurately parsing logs due to the diversity of log formats, which directly impacts the performance of downstream log-analysis tasks. In this paper,…

    Submitted 27 April, 2024; originally announced April 2024.

  11. arXiv:2404.17686  [pdf, other]

    cs.NI cs.IT

    On the Benefits of Coding for Network Slicing

    Authors: Homa Esfahanizadeh, Vipindev Adat Vasudevan, Benjamin D. Kim, Shruti Siva, Jennifer Kim, Alejandro Cohen, Muriel Médard

    Abstract: Network slicing has emerged as an integral concept in 5G, aiming to partition the physical network infrastructure into isolated slices, customized for specific applications. We theoretically formulate the key performance metrics of an application, in terms of goodput and delivery delay, at a cost of network resources in terms of bandwidth. We explore an un-coded communication protocol that uses fe…

    Submitted 26 April, 2024; originally announced April 2024.

  12. arXiv:2404.17585  [pdf, other]

    cs.HC cs.AI cs.LG eess.SP

    NeuroNet: A Novel Hybrid Self-Supervised Learning Framework for Sleep Stage Classification Using Single-Channel EEG

    Authors: Cheol-Hui Lee, Hakseung Kim, Hyun-jee Han, Min-Kyung Jung, Byung C. Yoon, Dong-Joo Kim

    Abstract: The classification of sleep stages is a pivotal aspect of diagnosing sleep disorders and evaluating sleep quality. However, the conventional manual scoring process, conducted by clinicians, is time-consuming and prone to human bias. Recent advancements in deep learning have substantially propelled the automation of sleep stage classification. Nevertheless, challenges persist, including the need fo…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 14 pages, 4 figures

  13. arXiv:2404.17179  [pdf, other]

    cs.HC cs.ET

    Meta-Object: Interactive and Multisensory Virtual Object Learned from the Real World for the Post-Metaverse

    Authors: Dooyoung Kim, Taewook Ha, Jinseok Hong, Seonji Kim, Selin Choi, Heejeong Ko, Woontack Woo

    Abstract: With the proliferation of wearable Augmented Reality/Virtual Reality (AR/VR) devices, ubiquitous virtual experiences seamlessly integrate into daily life through metaverse platforms. To support immersive metaverse experiences akin to reality, we propose a next-generation virtual object, a meta-object, a property-embedded virtual object that contains interactive and multisensory characteristics lea…

    Submitted 28 April, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: 12 pages, 4 figures, under review in the IEEE CG&A magazine

  14. arXiv:2404.16831  [pdf, other]

    cs.CV

    The Third Monocular Depth Estimation Challenge

    Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, YiPing Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora, et al. (16 additional authors not shown)

    Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 su…

    Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: To appear in CVPRW2024

  15. arXiv:2404.16356  [pdf, other]

    cs.NI cs.AI cs.LG

    Integration of Mixture of Experts and Multimodal Generative AI in Internet of Vehicles: A Survey

    Authors: Minrui Xu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Yuguang Fang, Dong In Kim, Xuemin Shen

    Abstract: Generative AI (GAI) can enhance the cognitive, reasoning, and planning capabilities of intelligent modules in the Internet of Vehicles (IoV) by synthesizing augmented datasets, completing sensor data, and making sequential decisions. In addition, the mixture of experts (MoE) can enable the distributed and collaborative execution of AI models without performance degradation between connected vehicl…

    Submitted 25 April, 2024; originally announced April 2024.

  16. arXiv:2404.15516  [pdf, other]

    cs.CV cs.AI

    Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval

    Authors: Young Kyun Jang, Donghyun Kim, Zihang Meng, Dat Huynh, Ser-Nam Lim

    Abstract: Composed Image Retrieval (CIR) is a task that retrieves images similar to a query, based on a provided textual modification. Current techniques rely on supervised learning for CIR models using labeled triplets of the reference image, text, target image. These specific triplets are not as commonly available as simple image-text pairs, limiting the widespread use of CIR and its scalability. On the o…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 15 pages

  17. arXiv:2404.15333  [pdf, other]

    eess.SP cs.LG

    EB-GAME: A Game-Changer in ECG Heartbeat Anomaly Detection

    Authors: JuneYoung Park, Da Young Kim, Yunsoo Kim, Jisu Yoo, Tae Joon Kim

    Abstract: Cardiologists use electrocardiograms (ECG) for the detection of arrhythmias. However, continuous monitoring of ECG signals to detect cardiac abnormalities requires significant time and human resources. As a result, several deep learning studies have been conducted in advance for the automatic detection of arrhythmia. These models show relatively high performance in supervised learning, but are no…

    Submitted 8 April, 2024; originally announced April 2024.

  18. arXiv:2404.15096  [pdf, other]

    cs.RO cs.LG

    Impedance Matching: Enabling an RL-Based Running Jump in a Quadruped Robot

    Authors: Neil Guan, Shangqun Yu, Shifan Zhu, Donghyun Kim

    Abstract: Replicating the remarkable athleticism seen in animals has long been a challenge in robotics control. Although Reinforcement Learning (RL) has demonstrated significant progress in dynamic legged locomotion control, the substantial sim-to-real gap often hinders the real-world demonstration of truly dynamic movements. We propose a new framework to mitigate this gap through frequency-domain analysis-…

    Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by Ubiquitous Robots 2024

  19. arXiv:2404.14687  [pdf, other]

    cs.MM cs.AI cs.CL cs.CV

    Pegasus-v1 Technical Report

    Authors: Raehyuk Jung, Hyojun Go, Jaehyuk Yi, Jiho Jang, Daniel Kim, Jay Suh, Aiden Lee, Cooper Han, Jae Lee, Jeff Kim, Jin-Young Kim, Junwan Kim, Kyle Park, Lucas Lee, Mars Ha, Minjoon Seo, Abraham Jo, Ed Park, Hassan Kianinejad, SJ Kim, Tony Moon, Wade Jeong, Andrei Popescu, Esther Kim, EK Yoon, et al. (19 additional authors not shown)

    Abstract: This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's archi…

    Submitted 22 April, 2024; originally announced April 2024.

  20. arXiv:2404.14527  [pdf, other]

    cs.DC cs.LG

    Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

    Authors: Tyler Griggs, Xiaoxuan Liu, Jiaxiang Yu, Doyoung Kim, Wei-Lin Chiang, Alvin Cheung, Ion Stoica

    Abstract: Large language models (LLMs) are increasingly integrated into many online services. However, a major challenge in deploying LLMs is their high cost, due primarily to the use of expensive GPU instances. To address this problem, we find that the significant heterogeneity of GPU types presents an opportunity to increase GPU cost efficiency and reduce deployment costs. The broad and growing market of…

    Submitted 22 April, 2024; originally announced April 2024.

  21. arXiv:2404.14219  [pdf, other]

    cs.CL cs.AI

    Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, et al. (62 additional authors not shown)

    Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset…

    Submitted 23 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 12 pages

  22. arXiv:2404.13274  [pdf, other]

    cs.HC cs.AI

    Augmented Object Intelligence: Making the Analog World Interactable with XR-Objects

    Authors: Mustafa Doga Dogan, Eric J. Gonzalez, Andrea Colaco, Karan Ahuja, Ruofei Du, Johnny Lee, Mar Gonzalez-Franco, David Kim

    Abstract: Seamless integration of physical objects as interactive digital entities remains a challenge for spatial computing. This paper introduces Augmented Object Intelligence (AOI), a novel XR interaction paradigm designed to blur the lines between digital and physical by equipping real-world objects with the ability to interact as if they were digital, where every object has the potential to serve as a…

    Submitted 22 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  23. arXiv:2404.13127  [pdf, other]

    cs.SI physics.soc-ph

    Uncovering large inconsistencies between machine learning derived gridded settlement datasets

    Authors: Vedran Sekara, Andrea Martini, Manuel Garcia-Herranz, Do-Hyung Kim

    Abstract: High-resolution human settlement maps provide detailed delineations of where people live and are vital for scientific and practical purposes, such as rapid disaster response, allocation of humanitarian resources, and international development. The increased availability of high-resolution satellite imagery, combined with powerful techniques from machine learning and artificial intelligence, has sp…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 14 pages, 4 figures

  24. arXiv:2404.11835  [pdf, other]

    cs.AI

    CAUS: A Dataset for Question Generation based on Human Cognition Leveraging Large Language Models

    Authors: Minjung Shin, Donghyun Kim, Jeh-Kwang Ryu

    Abstract: We introduce the CAUS (Curious About Uncertain Scene) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. Leveraging this dataset, we investigate the potential of LLMs to engage in questioning effectively. Our approach involves providing scene descriptions embedded with uncertainties to stimulate the generation of…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 8 pages, 4 figures and 3 tables. This work has been accepted for presentation at CogSci 2024 and is currently under revision

  25. arXiv:2404.11810  [pdf, other]

    cs.GR

    Holographic Parallax Improves 3D Perceptual Realism

    Authors: Dongyeon Kim, Seung-Woo Nam, Suyeon Choi, Jong-Mo Seo, Gordon Wetzstein, Yoonchan Jeong

    Abstract: Holographic near-eye displays are a promising technology to solve long-standing challenges in virtual and augmented reality display systems. Over the last few years, many different computer-generated holography (CGH) algorithms have been proposed that are supervised by different types of target content, such as 2.5D RGB-depth maps, 3D focal stacks, and 4D light fields. It is unclear, however, what…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 33 pages, 34 figures

  26. arXiv:2404.11343  [pdf, other]

    cs.IR cs.AI

    Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System

    Authors: Sein Kim, Hongseok Kang, Seungyoon Choi, Donghyun Kim, Minchul Yang, Chanyoung Park

    Abstract: Collaborative filtering recommender systems (CF-RecSys) have shown successive results in enhancing the user experience on social media and e-commerce platforms. However, as CF-RecSys struggles under cold scenarios with sparse user-item interactions, recent strategies have focused on leveraging modality information of user/items (e.g., text or images) based on pre-trained modality encoders and Larg…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Under review

  27. arXiv:2404.10346  [pdf, other]

    cs.CL

    Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards

    Authors: Hyeonbin Hwang, Doyoung Kim, Seungone Kim, Seonghyeon Ye, Minjoon Seo

    Abstract: Training on large amounts of rationales (i.e., CoT Fine-tuning) is effective at improving the reasoning capabilities of large language models (LLMs). However, acquiring human-authored rationales or augmenting rationales from proprietary models is costly and not scalable. In this paper, we study the problem of whether LLMs could self-improve their reasoning capabilities. To this end, we propose Sel…

    Submitted 6 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: Preprint Under Review

  28. arXiv:2404.09228  [pdf, other]

    cs.RO

    A Survey on Integration of Large Language Models with Intelligent Robots

    Authors: Yeseung Kim, Dohyun Kim, Jieun Choi, Jisang Park, Nayoung Oh, Daehyung Park

    Abstract: In recent years, the integration of large language models (LLMs) has revolutionized the field of robotics, enabling robots to communicate, understand, and reason with human-like proficiency. This paper explores the multifaceted impact of LLMs on robotics, addressing key challenges and opportunities for leveraging these models across various domains. By categorizing and analyzing LLM applications w…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 24 pages, 1 figure, Submitted to Intelligent Service Robotics (ISR)

  29. arXiv:2404.09134  [pdf, ps, other]

    cs.NI cs.LG

    Interactive Generative AI Agents for Satellite Networks through a Mixture of Experts Transmission

    Authors: Ruichen Zhang, Hongyang Du, Yinqiu Liu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Dong In Kim

    Abstract: In response to the needs of 6G global communications, satellite communication networks have emerged as a key solution. However, the large-scale development of satellite communication networks is constrained by the complex system models, whose modeling is challenging for massive users. Moreover, transmission interference between satellites and users seriously affects communication performance. To s…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures

  30. arXiv:2404.08396  [pdf, other]

    cs.IT

    Joint Computation Offloading and Target Tracking in Integrated Sensing and Communication Enabled UAV Networks

    Authors: Trinh Van Chien, Mai Dinh Cong, Nguyen Cong Luong, Tri Nhu Do, Dong In Kim, Symeon Chatzinotas

    Abstract: In this paper, we investigate a joint computation offloading and target tracking in Integrated Sensing and Communication (ISAC)-enabled unmanned aerial vehicle (UAV) network. Therein, the UAV has a computing task that is partially offloaded to the ground UE for execution. Meanwhile, the UAV uses the offloading bit sequence to estimate the velocity of a ground target based on an autocorrelation fun…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 5 pages, 3 figures, 1 table. Accepted by IEEE Communications Letters

  31. arXiv:2404.05039  [pdf, other]

    cs.RO

    StaccaToe: A Single-Leg Robot that Mimics the Human Leg and Toe

    Authors: Nisal Perera, Shangqun Yu, Daniel Marew, Mack Tang, Ken Suzuki, Aidan McCormack, Shifan Zhu, Yong-Jae Kim, Donghyun Kim

    Abstract: We introduce StaccaToe, a human-scale, electric motor-powered single-leg robot designed to rival the agility of human locomotion through two distinctive attributes: an actuated toe and a co-actuation configuration inspired by the human leg. Leveraging the foundational design of HyperLeg's lower leg mechanism, we develop a stand-alone robot by incorporating new link designs, custom-designed power e…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Submitted to 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  32. arXiv:2404.04698  [pdf, other]

    cs.RO

    EAGLE: The First Event Camera Dataset Gathered by an Agile Quadruped Robot

    Authors: Shifan Zhu, Zixun Xiong, Donghyun Kim

    Abstract: When legged robots perform agile movements, traditional RGB cameras often produce blurred images, posing a challenge for accurate state estimation. Event cameras, inspired by biological vision mechanisms, have emerged as a promising solution for capturing high-speed movements and coping with challenging lighting conditions, owing to their significant advantages, such as low latency, high temporal…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 8 pages, 7 figures

  33. arXiv:2404.04496  [pdf, other]

    cs.SE

    Towards Better Graph Neural Network-based Fault Localization Through Enhanced Code Representation

    Authors: Md Nakhla Rafi, Dong Jae Kim, An Ran Chen, Tse-Hsun Chen, Shaowei Wang

    Abstract: Automatic software fault localization plays an important role in software quality assurance by pinpointing faulty locations for easier debugging. Coverage-based fault localization, a widely used technique, employs statistics on coverage spectra to rank code based on suspiciousness scores. However, the rigidity of statistical approaches calls for learning-based techniques. Amongst all, Grace, a gra…

    Submitted 30 April, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

  34. arXiv:2404.03321  [pdf, other]

    cs.NI

    Fusion of Mixture of Experts and Generative Artificial Intelligence in Mobile Edge Metaverse

    Authors: Guangyuan Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Shiwen Mao, Dong In Kim

    Abstract: In the digital transformation era, Metaverse offers a fusion of virtual reality (VR), augmented reality (AR), and web technologies to create immersive digital experiences. However, the evolution of the Metaverse is slowed down by the challenges of content creation, scalability, and dynamic user interaction. Our study investigates an integration of Mixture of Experts (MoE) models with Generative Ar…

    Submitted 4 April, 2024; originally announced April 2024.

  35. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  36. arXiv:2404.01661  [pdf, other]

    cs.RO eess.SY

    Interaction-Aware Vehicle Motion Planning with Collision Avoidance Constraints in Highway Traffic

    Authors: Dongryul Kim, Hyeonjeong Kim, Kyoungseok Han

    Abstract: This paper proposes collision-free optimal trajectory planning for autonomous vehicles in highway traffic, where vehicles need to deal with the interaction among each other. To address this issue, a novel optimal control framework is suggested, which couples the trajectory of surrounding vehicles with collision avoidance constraints. Additionally, we describe a trajectory optimization technique un…

    Submitted 2 April, 2024; originally announced April 2024.

  37. arXiv:2404.01583  [pdf, other]

    cs.NI

    Defining Problem from Solutions: Inverse Reinforcement Learning (IRL) and Its Applications for Next-Generation Networking

    Authors: Yinqiu Liu, Ruichen Zhang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim

    Abstract: Performance optimization is a critical concern in networking, on which Deep Reinforcement Learning (DRL) has achieved great success. Nonetheless, DRL training relies on precisely defined reward functions, which formulate the optimization objective and indicate the positive/negative progress towards the optimal. With the ever-increasing environmental complexity and human participation in Next-Gener…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 9 pages

  38. arXiv:2404.01397  [pdf, other]

    cs.CV cs.AI cs.RO

    Object-conditioned Bag of Instances for Few-Shot Personalized Instance Recognition

    Authors: Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay

    Abstract: Nowadays, users demand for increased personalization of vision systems to localize and identify personal instances of objects (e.g., my dog rather than dog) from a few-shot dataset only. Despite outstanding results of deep networks on classical label-abundant benchmarks (e.g., those of the latest YOLOv8 model for standard object detection), they struggle to maintain within-class variability to rep…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: ICASSP 2024. Copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other

  39. arXiv:2404.00943  [pdf, other]

    cs.CL cs.AI

    Evalverse: Unified and Accessible Library for Large Language Model Evaluation

    Authors: Jihoo Kim, Wonho Song, Dahyun Kim, Yunsu Kim, Yungi Kim, Chanjun Park

    Abstract: This paper introduces Evalverse, a novel library that streamlines the evaluation of Large Language Models (LLMs) by unifying disparate evaluation tools into a single, user-friendly framework. Evalverse enables individuals with limited knowledge of artificial intelligence to easily request LLM evaluations and receive detailed reports, facilitated by an integration with communication platforms like…

    Submitted 1 April, 2024; originally announced April 2024.

  40. arXiv:2404.00928  [pdf, other]

    cs.CV cs.LG

    Instance-Aware Group Quantization for Vision Transformers

    Authors: Jaehyeon Moon, Dohyung Kim, Junyong Cheon, Bumsub Ham

    Abstract: Post-training quantization (PTQ) is an efficient model compression technique that quantizes a pretrained full-precision model using only a small calibration set of unlabeled samples without retraining. PTQ methods for convolutional neural networks (CNNs) provide quantization results comparable to full-precision counterparts. Directly applying them to vision transformers (ViTs), however, incurs sev…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  41. arXiv:2404.00918  [pdf, other]

    cs.CV

    Rethinking Saliency-Guided Weakly-Supervised Semantic Segmentation

    Authors: Beomyoung Kim, Donghyun Kim, Sung Ju Hwang

    Abstract: This paper presents a fresh perspective on the role of saliency maps in weakly-supervised semantic segmentation (WSSS) and offers new insights and research directions based on our empirical findings. We conduct comprehensive experiments and observe that the quality of the saliency map is a critical factor in saliency-guided WSSS approaches. Nonetheless, we find that the saliency maps used in previ…

    Submitted 2 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Preprint, 17 pages, 7 figures

  42. arXiv:2404.00670  [pdf, other]

    cs.CV q-bio.QM stat.AP

    Statistical Analysis by Semiparametric Additive Regression and LSTM-FCN Based Hierarchical Classification for Computer Vision Quantification of Parkinsonian Bradykinesia

    Authors: Youngseo Cho, In Hee Kwak, Dohyeon Kim, Jinhee Na, Hanjoo Sung, Jeongjae Lee, Young Eun Kim, Hyeo-il Ma

    Abstract: Bradykinesia, characterized by involuntary slowing or decrement of movement, is a fundamental symptom of Parkinson's Disease (PD) and is vital for its clinical diagnosis. Despite various methodologies explored to quantify bradykinesia, computer vision-based approaches have shown promising results. However, these methods often fall short in adequately addressing key bradykinesia characteristics in…

    Submitted 31 March, 2024; originally announced April 2024.

  43. arXiv:2404.00626  [pdf, other]

    cs.CV

    Domain Generalizable Person Search Using Unreal Dataset

    Authors: Minyoung Oh, Duhyun Kim, Jae-Young Sim

    Abstract: Collecting and labeling real datasets to train the person search networks not only requires a lot of time and effort, but also accompanies privacy issues. The weakly-supervised and unsupervised domain adaptation methods have been proposed to alleviate the labeling burden for target datasets, however, their generalization capability is limited. We introduce a novel person search method based on the…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted at AAAI 2024

  44. arXiv:2404.00376  [pdf, other]

    cs.CL

    Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks

    Authors: Hyunjae Kim, Hyeon Hwang, Jiwoo Lee, Sihyeon Park, Dain Kim, Taewhoo Lee, Chanwoong Yoon, Jiwoong Sohn, Donghee Choi, Jaewoo Kang

    Abstract: While recent advancements in commercial large language models (LM) have shown promising results in medical tasks, their closed-source nature poses significant privacy and security concerns, hindering their widespread use in the medical field. Despite efforts to create open-source models, their limited parameters often result in insufficient multi-step reasoning capabilities required for solving co…

    Submitted 30 March, 2024; originally announced April 2024.

  45. arXiv:2404.00285  [pdf, other]

    cs.CV cs.AI

    Long-Tailed Recognition on Binary Networks by Calibrating A Pre-trained Model

    Authors: Jihun Kim, Dahyun Kim, Hyungrok Jung, Taeil Oh, Jonghyun Choi

    Abstract: Deploying deep models in real-world scenarios entails a number of challenges, including computational efficiency and real-world (e.g., long-tailed) data distributions. We address the combined challenge of learning long-tailed distributions using highly resource-efficient binary neural networks as backbones. Specifically, we propose a calibrate-and-distill framework that uses off-the-shelf pretrain…

    Submitted 30 March, 2024; originally announced April 2024.

  46. arXiv:2403.19758  [pdf, other]

    quant-ph cs.AI cs.CL

    Quantum Natural Language Processing

    Authors: Dominic Widdows, Willie Aboumrad, Dohun Kim, Sayonee Ray, Jonathan Mei

    Abstract: Language processing is at the heart of current developments in artificial intelligence, and quantum computers are becoming available at the same time. This has led to great interest in quantum natural language processing, and several early proposals and experiments. This paper surveys the state of this area, showing how NLP-related techniques have been used in quantum language processing. We exa…

    Submitted 26 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  47. arXiv:2403.19588  [pdf, other]

    cs.CV cs.LG cs.NE

    DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

    Authors: Donghyun Kim, Byeongho Heo, Dongyoon Han

    Abstract: This paper revives Densely Connected Convolutional Networks (DenseNets) and reveals the underrated effectiveness over predominant ResNet-style architectures. We believe DenseNets' potential was overlooked due to untouched training methods and traditional design elements not fully revealing their capabilities. Our pilot study shows dense connections through concatenation are strong, demonstrating t…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Code at https://github.com/naver-ai/rdnet

  48. arXiv:2403.19340  [pdf, other]

    cs.CL cs.AI

    Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models

    Authors: Hyunbyung Park, Sukyung Lee, Gyoungjin Gim, Yungi Kim, Dahyun Kim, Chanjun Park

    Abstract: To address the challenges associated with data processing at scale, we propose Dataverse, a unified open-source Extract-Transform-Load (ETL) pipeline for large language models (LLMs) with a user-friendly design at its core. Easy addition of custom processors with block-based interface in Dataverse allows users to readily and efficiently use Dataverse to build their own ETL pipeline. We hope that D…

    Submitted 28 March, 2024; originally announced March 2024.

  49. arXiv:2403.19270  [pdf, other]

    cs.CL cs.AI

    sDPO: Don't Use Your Data All at Once

    Authors: Dahyun Kim, Yungi Kim, Wonho Song, Hyeonwoo Kim, Yunsu Kim, Sanghoon Kim, Chanjun Park

    Abstract: As development of large language models (LLM) progresses, aligning them with human preferences has become increasingly important. We propose stepwise DPO (sDPO), an extension of the recently popularized direct preference optimization (DPO) for alignment tuning. This approach involves dividing the available preference datasets and utilizing them in a stepwise manner, rather than employing it all at…

    Submitted 28 March, 2024; originally announced March 2024.

  50. arXiv:2403.19254  [pdf, other]

    cs.CV

    Imperceptible Protection against Style Imitation from Diffusion Models

    Authors: Namhyuk Ahn, Wonhyuk Ahn, KiYoon Yoo, Daesik Kim, Seung-Hun Nam

    Abstract: Recent progress in diffusion models has profoundly enhanced the fidelity of image generation. However, this has raised concerns about copyright infringements. While prior methods have introduced adversarial perturbations to prevent style imitation, most are accompanied by the degradation of artworks' visual quality. Recognizing the importance of maintaining this, we develop a visually improved pro…

    Submitted 28 March, 2024; originally announced March 2024.