-
A Symmetric Regressor for MRI-Based Assessment of Striatal Dopamine Transporter Uptake in Parkinson's Disease
Authors:
Walid Abdullah Al,
Il Dong Yun,
Yun Jung Bae
Abstract:
Dopamine transporter (DAT) imaging is commonly used for monitoring Parkinson's disease (PD), where the striatal DAT uptake amount is computed to assess PD severity. However, DAT imaging is costly, carries a risk of radiation exposure, and is not available in general clinics. Recently, an MRI patch of the nigral region has been proposed as a safer and easier alternative. This paper proposes a symmetric regressor for predicting the DAT uptake amount from the nigral MRI patch. Acknowledging the symmetry between the right and left nigrae, the proposed regressor incorporates a paired input-output model that simultaneously predicts the DAT uptake amounts for both the right and left striata. Moreover, it employs a symmetric loss that imposes a constraint on the difference between the right and left predictions, reflecting the high correlation in DAT uptake amounts between the two sides. Additionally, we propose a symmetric Monte-Carlo (MC) dropout method, which utilizes the above symmetry, for providing a useful uncertainty estimate of the DAT uptake prediction. We evaluated the proposed approach on 734 nigral patches; the symmetric regressor demonstrated significantly improved performance compared with the standard regressors while giving better explainability and feature representation. The symmetric MC dropout also gave precise uncertainty ranges with a high probability of including the true DAT uptake amounts within the range.
Submitted 18 April, 2024;
originally announced April 2024.
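For readers of the entry above, the idea of a paired prediction with a symmetric loss can be sketched roughly as follows. This is an illustrative guess, not the authors' formulation: the function name `symmetric_loss`, the squared-error form, the choice to penalize the mismatch between predicted and true right-minus-left gaps, and the `weight` parameter are all assumptions made for this sketch.

```python
from typing import Sequence


def symmetric_loss(
    pred_right: Sequence[float],
    pred_left: Sequence[float],
    true_right: Sequence[float],
    true_left: Sequence[float],
    weight: float = 0.5,  # hypothetical trade-off weight, not taken from the paper
) -> float:
    """Per-side squared error plus a penalty on the right-minus-left gap (assumed form)."""
    n = len(pred_right)
    rows = list(zip(pred_right, pred_left, true_right, true_left))
    # Ordinary regression error on each side, averaged over samples.
    fit = sum((pr - tr) ** 2 + (pl - tl) ** 2 for pr, pl, tr, tl in rows) / n
    # Symmetry term: the predicted right-left gap should match the true gap.
    sym = sum(((pr - pl) - (tr - tl)) ** 2 for pr, pl, tr, tl in rows) / n
    return fit + weight * sym


if __name__ == "__main__":
    # One sample whose right-side prediction is off by 1.0.
    print(symmetric_loss([3.0], [1.0], [2.0], [1.0]))
```

A perfect paired prediction gives a loss of zero; an error on one side is counted once in the fit term and again, scaled by `weight`, in the symmetry term.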
-
Automated Attack Synthesis for Constant Product Market Makers
Authors:
Sujin Han,
Jinseo Kim,
Sung-Ju Lee,
Insu Yun
Abstract:
Decentralized Finance enables many novel applications that were impossible in traditional finance. However, it also introduces new types of vulnerabilities, such as composability bugs. Composability bugs are issues that lead to erroneous behavior when multiple smart contracts operate together. One typical example is bugs between token contracts and Constant Product Market Makers (CPMM), the most widely used model for Decentralized Exchanges. Since 2022, 23 exploits of this kind have resulted in a total loss of 2.2M USD. BlockSec, a smart contract auditing company, once reported that 138 exploits of this kind occurred in February 2023 alone. We propose CPMM-Exploiter, which automatically detects and generates end-to-end exploits for CPMM composability bugs. Generating such end-to-end exploits is challenging due to the large search space of multiple contracts and the various fees involved in financial services. To tackle this, we investigated real-world exploits of these vulnerabilities and identified that they arise from violations of two safety invariants. Based on this observation, we implemented CPMM-Exploiter, a new grammar-based fuzzer targeting the detection of these bugs. CPMM-Exploiter uses fuzzing to find transactions that break the invariants. It then refines these transactions to make them profitable for the attacker. We evaluated CPMM-Exploiter on two real-world exploit datasets. CPMM-Exploiter obtained recalls of 0.91 and 0.89, respectively, while five baselines achieved maximum recalls of 0.36 and 0.58, respectively. We further evaluated CPMM-Exploiter by running it on the latest blocks of the Ethereum and Binance networks. It successfully generated 18 new exploits, which can result in 12.9K USD profit in total.
Submitted 24 April, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
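As background for the entry above, the constant-product rule behind a CPMM can be sketched in a few lines. This only illustrates the general x * y = k pricing rule with a swap fee; it is not CPMM-Exploiter and does not claim to reproduce the two safety invariants the abstract mentions. The function names and the 30 basis point default fee are assumptions for illustration.

```python
def swap_output(reserve_in: int, reserve_out: int, amount_in: int, fee_bps: int = 30) -> int:
    """Tokens paid out by a constant-product pool for amount_in, after a fee in basis points."""
    amount_in_after_fee = amount_in * (10_000 - fee_bps)
    # Largest integer payout that keeps the reserve product from decreasing.
    return (amount_in_after_fee * reserve_out) // (reserve_in * 10_000 + amount_in_after_fee)


def product_preserved(reserve_in: int, reserve_out: int, amount_in: int, amount_out: int) -> bool:
    """Check that a swap does not shrink the product of the two reserves."""
    before = reserve_in * reserve_out
    after = (reserve_in + amount_in) * (reserve_out - amount_out)
    return after >= before


if __name__ == "__main__":
    out = swap_output(1_000, 1_000, 100)
    print(out, product_preserved(1_000, 1_000, 100, out))
```

A transaction sequence that lets a trader withdraw more than `swap_output` allows would make `product_preserved` return `False`, which is the kind of broken pool accounting a fuzzer could search for.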
-
EGformer: Equirectangular Geometry-biased Transformer for 360 Depth Estimation
Authors:
Ilwi Yun,
Chanyong Shin,
Hyunku Lee,
Hyuk-Jae Lee,
Chae Eun Rhee
Abstract:
Estimating the depths of equirectangular (i.e., 360) images (EIs) is challenging given the distorted 180 x 360 field-of-view, which is hard to address with a convolutional neural network (CNN). Although a transformer with global attention achieves significant improvements over CNNs for the EI depth estimation task, it is computationally inefficient, which raises the need for a transformer with local attention. However, to apply local attention successfully to EIs, a specific strategy that addresses the distorted equirectangular geometry and the limited receptive field simultaneously is required. Prior works have addressed only one of the two, occasionally resulting in unsatisfactory depths. In this paper, we propose an equirectangular geometry-biased transformer termed EGformer. While limiting the computational cost and the number of network parameters, EGformer enables the extraction of equirectangular geometry-aware local attention with a large receptive field. To achieve this, we actively utilize the equirectangular geometry as the bias for the local attention instead of struggling to reduce the distortion of EIs. Compared with the most recent EI depth estimation studies, the proposed approach yields the best depth outcomes overall with the lowest computational cost and the fewest parameters, demonstrating the effectiveness of the proposed methods.
Submitted 7 September, 2023; v1 submitted 16 April, 2023;
originally announced April 2023.
-
Extraction of Coronary Vessels in Fluoroscopic X-Ray Sequences Using Vessel Correspondence Optimization
Authors:
Seung Yeon Shin,
Soochahn Lee,
Kyoung Jin Noh,
Il Dong Yun,
Kyoung Mu Lee
Abstract:
We present a method to extract coronary vessels from fluoroscopic x-ray sequences. Given the vessel structure for the source frame, vessel correspondence candidates in the subsequent frame are generated by a novel hierarchical search scheme to overcome the aperture problem. Optimal correspondences are determined within a Markov random field optimization framework. Post-processing is performed to extract vessel branches newly visible due to the inflow of contrast agent. Quantitative and qualitative evaluation conducted on a dataset of 18 sequences demonstrates the effectiveness of the proposed method.
Submitted 27 July, 2022;
originally announced July 2022.
-
Improving 360 Monocular Depth Estimation via Non-local Dense Prediction Transformer and Joint Supervised and Self-supervised Learning
Authors:
Ilwi Yun,
Hyuk-Jae Lee,
Chae Eun Rhee
Abstract:
Due to difficulties in acquiring ground truth depth of equirectangular (360) images, the quality and quantity of equirectangular depth data today are insufficient to represent the various scenes in the world. Therefore, 360 depth estimation studies that rely solely on supervised learning are destined to produce unsatisfactory results. Although self-supervised learning methods focusing on equirectangular images (EIs) have been introduced, they often have incorrect or non-unique solutions, causing unstable performance. In this paper, we propose 360 monocular depth estimation methods which improve on the areas that limited previous studies. First, we introduce a self-supervised 360 depth learning method that only utilizes gravity-aligned videos, which has the potential to eliminate the need for depth data during the training procedure. Second, we propose a joint learning scheme realized by combining supervised and self-supervised learning. The weakness of each learning method is compensated by the other, thus leading to more accurate depth estimation. Third, we propose a non-local fusion block, which can further retain the global information encoded by the vision transformer when reconstructing the depths. With the proposed methods, we successfully apply the transformer to 360 depth estimation, which, to the best of our knowledge, has not been tried before. On several benchmarks, our approach achieves significant improvements over previous works and establishes a state of the art.
Submitted 29 December, 2021; v1 submitted 22 September, 2021;
originally announced September 2021.
-
EPSR: Edge Profile Super resolution
Authors:
Jiun Lee,
Jaekwang Kim,
Inyong Yun
Abstract:
In this paper, we propose the Edge Profile Super Resolution (EPSR) method to preserve structure information and restore texture. We build EPSR by stacking modified Fractal Residual Network (mFRN) structures hierarchically and repeatedly. An mFRN is made up of many Residual Edge Profile Blocks (REPBs), each consisting of three modules: a Residual Efficient Channel Attention Block (RECAB) module, an Edge Profile (EP) module, and a Context Network (CN) module. RECAB produces more informative features with high-frequency components. From these features, the EP module produces structure-informed features by generating the edge profile itself. Finally, the CN module captures details by exploiting high-frequency information such as texture and structure with proper sharpness. By repeating this procedure in the mFRN structure, EPSR extracts high-fidelity features, which prevents texture loss and preserves structure with appropriate sharpness. Experimental results show that EPSR achieves competitive performance against state-of-the-art methods in the PSNR and SSIM evaluation metrics as well as in visual results.
Submitted 12 May, 2021; v1 submitted 9 November, 2020;
originally announced November 2020.
-
Edge and Identity Preserving Network for Face Super-Resolution
Authors:
Jonghyun Kim,
Gen Li,
Inyong Yun,
Cheolkon Jung,
Joongkyu Kim
Abstract:
Face super-resolution (SR) has become an indispensable function in security solutions such as video surveillance and identification systems, but the distortion in facial components remains a great challenge. Most state-of-the-art methods have utilized facial priors with deep neural networks. These methods require extra labels, longer training time, and larger computation memory. In this paper, we propose a novel Edge and Identity Preserving Network for face SR, named EIPNet, to minimize the distortion by utilizing a lightweight edge block and identity information. We present an edge block to extract perceptual edge information and concatenate it to the original feature maps at multiple scales. This structure progressively provides edge information during reconstruction to aggregate local and global structural information. Moreover, we define an identity loss function to preserve the identity of SR images. The identity loss function compares feature distributions between SR images and their ground truth to recover identities in SR images. In addition, we provide a luminance-chrominance error (LCE) to separately infer brightness and color information in SR images. The LCE method not only reduces the dependency on color information by separating brightness and color components but also enables our network to reflect differences between SR images and their ground truth in two color spaces, RGB and YUV. The proposed method enables the SR network to restore facial components in detail and generate high quality 8x scaled SR images with a lightweight network structure. Furthermore, our network is able to reconstruct a 128x128 SR image at 215 fps on a GTX 1080Ti GPU. Extensive experiments demonstrate that our network qualitatively and quantitatively outperforms state-of-the-art methods on two challenging datasets: CelebA and VGGFace2.
Submitted 30 March, 2021; v1 submitted 27 August, 2020;
originally announced August 2020.
-
Gradually Applying Weakly Supervised and Active Learning for Mass Detection in Breast Ultrasound Images
Authors:
JooYeol Yun,
JungWoo Oh,
IlDong Yun
Abstract:
We propose a method for effectively utilizing weakly annotated image data in an object detection task on breast ultrasound images. Given the problem setting where a small, strongly annotated dataset and a large, weakly annotated dataset with no bounding box information are available, training an object detection model becomes a non-trivial problem. We suggest a controlled weight for handling the effect of weakly annotated images in a two-stage object detection model. We also present a subsequent active learning scheme for safely assigning weakly annotated images a strong annotation using the trained model. Experimental results showed a 24 percentage point increase in the correct localization (CorLoc) measure, which is the ratio of correctly localized and classified images, by assigning the properly controlled weight. Performing active learning after a model is trained showed an additional increase in CorLoc. We tested the proposed method on the Stanford Dog datasets to confirm that it can be applied to general cases, where strong annotations are insufficient to obtain similar results. The presented method showed that higher performance is achievable with less annotation effort.
Submitted 19 August, 2020;
originally announced August 2020.
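The CorLoc measure named in the entry above (the ratio of correctly localized and classified images) could be computed along these lines. This is a minimal sketch under stated assumptions: one top prediction per image, boxes given as `(x_min, y_min, x_max, y_max)`, and an IoU threshold of 0.5. The threshold and all names here are illustrative choices, not details taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout
Labeled = Tuple[str, Box]                # (class label, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def corloc(predictions: List[Labeled], ground_truth: List[Labeled],
           iou_threshold: float = 0.5) -> float:
    """Fraction of images whose top prediction has the right class and enough overlap."""
    hits = sum(
        1
        for (p_label, p_box), (t_label, t_box) in zip(predictions, ground_truth)
        if p_label == t_label and iou(p_box, t_box) >= iou_threshold
    )
    return hits / len(ground_truth)


if __name__ == "__main__":
    preds = [("mass", (0, 0, 10, 10)), ("mass", (50, 50, 60, 60))]
    truth = [("mass", (0, 0, 10, 10)), ("mass", (0, 0, 10, 10))]
    print(corloc(preds, truth))
```

Under this reading, a 24 percentage point increase means 24 more images out of every 100 satisfy both the class and overlap conditions.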
-
The CNN-based Coronary Occlusion Site Localization with Effective Preprocessing Method
Authors:
YeongHyeon Park,
Il Dong Yun,
Si-Hyuck Kang
Abstract:
Coronary Artery Occlusion (CAO) occurs acutely and is highly life-threatening. When CAO is detected, Percutaneous Coronary Intervention (PCI) should be conducted in a timely manner. Before PCI, the CAO must first be localized, because the heart is covered with various arteries. We handle three kinds of CAO in this paper, and our purpose is not only to localize CAO but also to improve localization performance via a preprocessing method. We improve localization performance from a minimum of 0.150 to a maximum of 0.372 via our method based on noise reduction and pulse extraction.
Submitted 18 December, 2019; v1 submitted 17 December, 2019;
originally announced December 2019.
-
Reinforcing Medical Image Classifier to Improve Generalization on Small Datasets
Authors:
Walid Abdullah Al,
Il Dong Yun
Abstract:
With the advent of deep learning, improved image classification with complex discriminative models has been made possible. However, such deep models with increased complexity require a huge set of labeled samples to generalize the training. Such classification models can easily overfit when applied to medical images because of limited training data, which is a common problem in the field of medical image analysis. This paper proposes and investigates a reinforced classifier for improving generalization when little training data is available. Partially following the idea of reinforcement learning, the proposed classifier uses a generalization-feedback from a subset of the training data to update its parameters instead of only using the conventional cross-entropy loss on the training data. We evaluate the improvement of the proposed classifier by applying it to three different classification problems against the standard deep classifiers equipped with existing overfitting-prevention techniques. Besides an overall improvement in classification performance, the proposed classifier showed remarkable characteristics of generalized learning, which can have great potential in medical classification tasks.
Submitted 7 October, 2019; v1 submitted 2 September, 2019;
originally announced September 2019.
-
Reinforcement Learning-based Automatic Diagnosis of Acute Appendicitis in Abdominal CT
Authors:
Walid Abdullah Al,
Il Dong Yun,
Kyong Joon Lee
Abstract:
Acute appendicitis characterized by a painful inflammation of the vermiform appendix is one of the most common surgical emergencies. Localizing the appendix is challenging due to its unclear anatomy amidst the complex colon-structure as observed in the conventional CT views, resulting in a time-consuming diagnosis. End-to-end learning of a convolutional neural network (CNN) is also not likely to be useful because of the negligible size of the appendix compared with the abdominal CT volume. With no prior computational approaches to the best of our knowledge, we propose the first computerized automation for acute appendicitis diagnosis. In our approach, we utilize a reinforcement learning agent deployed in the lower abdominal region to obtain the appendix location first to reduce the search space for diagnosis. Then, we obtain the classification scores (i.e., the likelihood of acute appendicitis) for the local neighborhood around the localized position, using a CNN trained only on a small appendix patch per volume. From the spatial representation of the resultant scores, we finally define a region of low-entropy (RLE) to choose the optimal diagnosis score, which helps improve the classification accuracy showing robustness even under high appendix localization error cases. In our experiment with 319 abdominal CT volumes, the proposed RLE-based decision with prior localization showed significant improvement over the standard CNN-based diagnosis approaches.
Submitted 2 September, 2019;
originally announced September 2019.
-
DABNet: Depth-wise Asymmetric Bottleneck for Real-time Semantic Segmentation
Authors:
Gen Li,
Inyoung Yun,
Jonghyun Kim,
Joongkyu Kim
Abstract:
As a pixel-level prediction task, semantic segmentation needs large computational cost with enormous parameters to obtain high performance. Recently, due to the increasing demand for autonomous systems and robots, it is significant to make a tradeoff between accuracy and inference speed. In this paper, we propose a novel Depthwise Asymmetric Bottleneck (DAB) module to address this dilemma, which efficiently adopts depth-wise asymmetric convolution and dilated convolution to build a bottleneck structure. Based on the DAB module, we design a Depth-wise Asymmetric Bottleneck Network (DABNet) especially for real-time semantic segmentation, which creates sufficient receptive field and densely utilizes the contextual information. Experiments on Cityscapes and CamVid datasets demonstrate that the proposed DABNet achieves a balance between speed and precision. Specifically, without any pretrained model and postprocessing, it achieves 70.1% Mean IoU on the Cityscapes test dataset with only 0.76 million parameters and a speed of 104 FPS on a single GTX 1080Ti card.
Submitted 30 September, 2019; v1 submitted 25 July, 2019;
originally announced July 2019.
-
Centerline Depth World Reinforcement Learning-based Left Atrial Appendage Orifice Localization
Authors:
Walid Abdullah Al,
Il Dong Yun,
Eun Ju Chun
Abstract:
Left atrial appendage (LAA) closure (LAAC) is a minimally invasive implant-based method to prevent cardiovascular stroke in patients with non-valvular atrial fibrillation. Assessing the LAA orifice in preoperative CT angiography plays a crucial role in choosing an appropriate LAAC implant size and a proper C-arm angulation. However, accurate orifice localization is hard because of the high anatomic variation of the LAA and the unclear position and orientation of the orifice in available CT views. Deep localization models also yield high error in localizing the orifice in a CT image because the orifice is tiny compared to the vastness of the CT image. In this paper, we propose a centerline depth-based reinforcement learning (RL) world for effective orifice localization in a small search space. In our scheme, an RL agent observes the centerline-to-surface distance and navigates along the LAA centerline to localize the orifice. Thus, the search space is significantly reduced, facilitating improved localization. The proposed formulation resulted in high localization accuracy compared with the expert annotations in 98 CT images. Moreover, the localization process takes about 8 seconds, which is 18 times more efficient than the existing method. Therefore, this can be a useful aid to physicians during the preprocedural planning of LAAC.
Submitted 17 December, 2020; v1 submitted 2 April, 2019;
originally announced April 2019.
-
Automatic Techniques to Systematically Discover New Heap Exploitation Primitives
Authors:
Insu Yun,
Dhaval Kapil,
Taesoo Kim
Abstract:
Heap exploitation techniques to abuse the metadata of allocators have been widely studied since they are application independent and can be used in restricted environments that corrupt only metadata. Although prior work has found several interesting exploitation techniques, they are ad-hoc and manual, which cannot effectively handle changes or a variety of allocators.
In this paper, we present a new naming scheme for heap exploitation techniques that systematically organizes them to discover the unexplored space in finding the techniques and ArcHeap, the tool that finds heap exploitation techniques automatically and systematically regardless of their underlying implementations. For that, ArcHeap generates a set of heap actions (e.g. allocation or deallocation) by leveraging fuzzing, which exploits common designs of modern heap allocators. Then, ArcHeap checks whether the actions result in impact of exploitations such as arbitrary write or overlapped chunks that efficiently determine if the actions can be converted into the exploitation technique. Finally, from these actions, ArcHeap generates Proof-of-Concept code automatically for an exploitation technique.
We evaluated ArcHeap with real-world allocators --- ptmalloc, jemalloc, and tcmalloc --- and custom allocators from the DARPA Cyber Grand Challenge. ArcHeap successfully found 14 out of 16 known exploitation techniques and found five new exploitation techniques in ptmalloc. Moreover, ArcHeap found several exploitation techniques for jemalloc, tcmalloc, and even for the custom allocators. Further, ArcHeap can automatically show changes in exploitation techniques along with version change in ptmalloc using differential testing.
Submitted 1 March, 2019;
originally announced March 2019.
-
Learning Bone Suppression from Dual Energy Chest X-rays using Adversarial Networks
Authors:
Dong Yul Oh,
Il Dong Yun
Abstract:
Suppressing bones such as the ribs and clavicles on chest X-rays is often expected to improve pathology classification. These bones can interfere with a broad range of diagnostic tasks on pulmonary disease, other than those concerning the musculoskeletal system. The current conventional method for acquiring bone-suppressed X-rays is dual energy imaging, which captures two radiographs at a very short interval with different energy levels; however, the patient is exposed to radiation twice, and artifacts arise due to heartbeats between the two shots. In this paper, we introduce a deep generative model trained to predict bone-suppressed images from single energy chest X-rays by analyzing a finite set of previously acquired dual energy chest X-rays. Since only a relatively small amount of data is available, such an approach relies on a methodology that maximizes data utilization. Here we integrate the following two approaches. First, we use a conditional generative adversarial network that complements the traditional regression method minimizing the pairwise image difference. Second, we use Haar 2D wavelet decomposition to offer a perceptual guideline in frequency details to allow the model to converge quickly and efficiently. As a result, we achieve state-of-the-art performance on bone suppression as compared to the existing approaches with dual energy chest X-rays.
Submitted 4 November, 2018;
originally announced November 2018.
-
Part-Level Convolutional Neural Networks for Pedestrian Detection Using Saliency and Boundary Box Alignment
Authors:
Inyong Yun,
Cheolkon Jung,
Xinran Wang,
Alfred O Hero,
Joongkyu Kim
Abstract:
Pedestrians in videos have a wide range of appearances due to body poses, occlusions, and complex backgrounds, and pedestrian detection suffers from the proposal shift problem, which causes the loss of body parts such as the head and legs. To address it, we propose part-level convolutional neural networks (CNN) for pedestrian detection using saliency and boundary box alignment. The proposed network consists of two sub-networks: detection and alignment. We use saliency in the detection sub-network to remove false positives such as lamp posts and trees. We adopt bounding box alignment on detection proposals in the alignment sub-network to address the proposal shift problem. First, we combine FCN and CAM to extract deep features for pedestrian detection. Then, we perform part-level CNN to recall the lost body parts. Experimental results on various datasets demonstrate that the proposed method remarkably improves accuracy in pedestrian detection and outperforms existing state-of-the-art methods in terms of log-average miss rate at false positives per image (FPPI).
Submitted 1 October, 2018;
originally announced October 2018.
-
Comparison of RNN Encoder-Decoder Models for Anomaly Detection
Authors:
YeongHyeon Park,
Il Dong Yun
Abstract:
In this paper, we compare different types of Recurrent Neural Network (RNN) Encoder-Decoders from an anomaly detection viewpoint. We focused on finding the model that learns the same data more effectively. We compared multiple models under the same conditions, such as the number of parameters, optimizer, and learning rate. The difference between them is whether the model predicts the future sequence or restores the current sequence. We constructed the dataset with simple vectors and used it for the experiment. Finally, we experimentally confirmed that the model performs better when it restores the current sequence than when it predicts the future sequence.
Submitted 19 July, 2018; v1 submitted 17 July, 2018;
originally announced July 2018.
-
Partial Policy-based Reinforcement Learning for Anatomical Landmark Localization in 3D Medical Images
Authors:
Walid Abdullah Al,
Il Dong Yun
Abstract:
Deploying the idea of long-term cumulative return, reinforcement learning has shown remarkable performance in various fields. We formulate landmark localization in 3D medical images as a reinforcement learning problem. Whereas value-based methods have been widely used to solve similar problems, we adopt an actor-critic based direct policy search method framed in a temporal difference learning approach. Successful behavior learning is challenging in large state and/or action spaces, requiring many trials. We introduce partial policy-based reinforcement learning, which makes the large localization problem tractable by learning the optimal policy on smaller partial domains. Independent actors efficiently learn the corresponding partial policies, each using its own independent critic. The proposed policy reconstruction from the partial policies ensures robust and efficient localization, with the sub-agents solving simple binary decision problems in their corresponding partial action spaces. The proposed approach requires fewer trials to learn the optimal behavior than the original behavior learning scheme.
Submitted 31 December, 2018; v1 submitted 8 July, 2018;
originally announced July 2018.
-
Deep Vessel Segmentation By Learning Graphical Connectivity
Authors:
Seung Yeon Shin,
Soochahn Lee,
Il Dong Yun,
Kyoung Mu Lee
Abstract:
We propose a novel deep-learning-based system for vessel segmentation. Existing methods using CNNs have mostly relied on local appearances learned on the regular image grid, without considering the graphical structure of vessel shape. To address this, we incorporate a graph convolutional network into a unified CNN architecture, where the final segmentation is inferred by combining the different types of features. The proposed method can be applied to extend any type of CNN-based vessel segmentation method and enhance its performance. Experiments show that the proposed method outperforms the current state-of-the-art methods on two retinal image datasets as well as a coronary artery X-ray angiography dataset.
Submitted 6 June, 2018;
originally announced June 2018.
-
Joint Weakly and Semi-Supervised Deep Learning for Localization and Classification of Masses in Breast Ultrasound Images
Authors:
Seung Yeon Shin,
Soochahn Lee,
Il Dong Yun,
Sun Mi Kim,
Kyoung Mu Lee
Abstract:
We propose a framework for localization and classification of masses in breast ultrasound (BUS) images. We have experimentally found that training convolutional neural network based mass detectors with large, weakly annotated datasets presents a non-trivial problem, while detectors trained with small, strongly annotated datasets may overfit. To overcome these problems, we use a weakly annotated dataset together with a smaller strongly annotated dataset in a hybrid manner. We propose a systematic weakly and semi-supervised training scenario with appropriate training loss selection. Experimental results show that the proposed method can successfully localize and classify masses with less annotation effort. Results obtained by training with only 10 strongly annotated images along with weakly annotated images were comparable to results obtained by training with 800 strongly annotated images, with a 95% confidence interval of the difference of -3.00% to 5.00%, in terms of the correct localization (CorLoc) measure, which is the ratio of images whose intersection over union with the ground truth is higher than 0.5. With the same number of strongly annotated images, additional weakly annotated images can be incorporated to give a 4.5 percentage point increase in CorLoc, from 80.00% to 84.50% (with 95% confidence intervals of 76.00% to 83.75% and 81.00% to 88.00%). The effects of different algorithmic details and varied amounts of data are presented through ablative analysis.
Submitted 22 January, 2019; v1 submitted 10 October, 2017;
originally announced October 2017.
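The CorLoc measure mentioned in the abstract above (the ratio of images whose intersection over union with the ground truth is higher than 0.5) can be illustrated with a minimal sketch. It assumes exactly one predicted box and one ground-truth box per image, each given as (x_min, y_min, x_max, y_max); the function names `iou` and `corloc` are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(preds: Sequence[Box], truths: Sequence[Box],
           threshold: float = 0.5) -> float:
    """Fraction of images whose predicted box has IoU strictly above
    the threshold with the ground-truth box (one box pair per image,
    which is an assumption of this sketch)."""
    if len(preds) != len(truths):
        raise ValueError("preds and truths must have the same length")
    if not preds:
        return 0.0
    hits = sum(1 for p, t in zip(preds, truths) if iou(p, t) > threshold)
    return hits / len(preds)
```

For example, `corloc([(0, 0, 10, 10), (0, 0, 10, 10)], [(0, 0, 10, 10), (20, 20, 30, 30)])` counts the first image as correctly localized and the second as missed.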