Showing 1–50 of 57 results for author: Zou, B

  1. arXiv:2405.02363  [pdf, other]

    cs.CV cs.CL

    LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model

    Authors: Yulin Luo, Ruichuan An, Bocheng Zou, Yiming Tang, Jiaming Liu, Shanghang Zhang

    Abstract: The distribution of subpopulations is an important property hidden within a dataset. Uncovering and analyzing the subpopulation distribution within datasets provides a comprehensive understanding of the datasets, standing as a powerful tool beneficial to various downstream tasks, including Dataset Subpopulation Organization, Subpopulation Shift, and Slice Discovery. Despite its importance, there h…

    Submitted 3 May, 2024; originally announced May 2024.

  2. arXiv:2404.16362  [pdf, other]

    cs.CR

    Feature graph construction with static features for malware detection

    Authors: Binghui Zou, Chunjie Cao, Longjuan Wang, Yinan Cheng, Jingzhang Sun

    Abstract: Malware can greatly compromise the integrity and trustworthiness of information and is in a constant state of evolution. Existing feature fusion-based detection methods generally overlook the correlation between features. Mere concatenation of features reduces the model's characterization ability, leading to low detection accuracy. Moreover, these methods are susceptible to concept drift and…

    Submitted 25 April, 2024; originally announced April 2024.

  3. arXiv:2404.08916  [pdf, other]

    cs.CV cs.LG

    Meply: A Large-scale Dataset and Baseline Evaluations for Metastatic Perirectal Lymph Node Detection and Segmentation

    Authors: Weidong Guo, Hantao Zhang, Shouhong Wan, Bingbing Zou, Wanqin Wang, Chenyang Qiu, Jun Li, Peiquan Jin

    Abstract: Accurate segmentation of metastatic lymph nodes in rectal cancer is crucial for the staging and treatment of rectal cancer. However, existing segmentation approaches face challenges due to the absence of pixel-level annotated datasets tailored for lymph nodes around the rectum. Additionally, metastatic lymph nodes are characterized by their relatively small size, irregular shapes, and lower contra…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: 13 pages

  4. arXiv:2404.06483  [pdf, other]

    cs.CV

    RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos

    Authors: Bochao Zou, Zizheng Guo, Xiaocheng Hu, Huimin Ma

    Abstract: Remote photoplethysmography (rPPG) is a non-contact method for detecting physiological signals from facial videos, holding great potential in various applications such as healthcare, affective computing, and anti-spoofing. Existing deep learning methods struggle to address two core issues of rPPG simultaneously: extracting weak rPPG signals from video segments with large spatiotemporal redundancy…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.12788

  5. arXiv:2404.01013  [pdf, other]

    cs.CV cs.AI

    Teeth-SEG: An Efficient Instance Segmentation Framework for Orthodontic Treatment based on Anthropic Prior Knowledge

    Authors: Bo Zou, Shaofeng Wang, Hao Liu, Gaoyue Sun, Yajie Wang, FeiFei Zuo, Chengbin Quan, Youjian Zhao

    Abstract: Teeth localization, segmentation, and labeling in 2D images have great potential in modern dentistry to enhance dental diagnostics, treatment planning, and population-based studies on oral health. However, general instance segmentation frameworks are incompetent due to 1) the subtle differences between some teeth' shapes (e.g., maxillary first premolar and second premolar), 2) the teeth's position…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted by CVPR 2024

  6. arXiv:2404.00973  [pdf, other]

    cs.CV

    VideoDistill: Language-aware Vision Distillation for Video Question Answering

    Authors: Bo Zou, Chao Yang, Yu Qiao, Chengbin Quan, Youjian Zhao

    Abstract: Significant advancements in video question answering (VideoQA) have been made thanks to thriving large image-language pretraining frameworks. Although these image-language models can efficiently represent both video and language branches, they typically employ a goal-free vision perception process and do not interact vision with language well during the answer generation, thus omitting crucial vis…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: This paper is accepted by CVPR 2024

  7. arXiv:2404.00913  [pdf, other]

    cs.CV cs.AI cs.CL

    LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction

    Authors: Bo Zou, Chao Yang, Yu Qiao, Chengbin Quan, Youjian Zhao

    Abstract: Existing methods to fine-tune LLMs, like Adapter, Prefix-tuning, and LoRA, which introduce extra modules or additional input sequences to inject new skills or knowledge, may compromise the innate abilities of LLMs. In this paper, we propose LLaMA-Excitor, a lightweight method that stimulates the LLMs' potential to better follow instructions by gradually paying more attention to worthwhile informat…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: This paper is accepted by CVPR 2024

  8. arXiv:2403.20271  [pdf, other]

    cs.CV

    Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want

    Authors: Weifeng Lin, Xinyu Wei, Ruichuan An, Peng Gao, Bocheng Zou, Yulin Luo, Siyuan Huang, Shanghang Zhang, Hongsheng Li

    Abstract: The interaction between humans and artificial intelligence (AI) is a crucial factor that reflects the effectiveness of multimodal large language models (MLLMs). However, current MLLMs primarily focus on image-level comprehension and limit interaction to textual instructions, thereby constraining their flexibility in usage and depth of response. In this paper, we introduce the Draw-and-Understand p…

    Submitted 31 March, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: 16 pages, 7 figures

  9. arXiv:2403.16358  [pdf, other]

    cs.CV

    ChebMixer: Efficient Graph Representation Learning with MLP Mixer

    Authors: Xiaoyan Kui, Haonan Yan, Qinsong Li, Liming Chen, Beiji Zou

    Abstract: Graph neural networks have achieved remarkable success in learning graph representations, especially graph Transformer, which has recently shown superior performance on various graph mining tasks. However, graph Transformer generally treats nodes as tokens, which results in quadratic complexity regarding the number of nodes during self-attention computation. The graph MLP Mixer addresses this chal…

    Submitted 24 March, 2024; originally announced March 2024.

  10. arXiv:2402.17233  [pdf, other]

    cs.LG stat.AP stat.ME

    Hybrid Square Neural ODE Causal Modeling

    Authors: Bob Junyi Zou, Matthew E. Levine, Dessi P. Zaharieva, Ramesh Johari, Emily B. Fox

    Abstract: Hybrid models combine mechanistic ODE-based dynamics with flexible and expressive neural network components. Such models have grown rapidly in popularity, especially in scientific domains where such ODE-based modeling offers important interpretability and validated causal grounding (e.g., for counterfactual reasoning). The incorporation of mechanistic models also provides inductive bias in standar…

    Submitted 27 February, 2024; originally announced February 2024.

  11. arXiv:2402.12788  [pdf, other]

    cs.CV

    RhythmFormer: Extracting rPPG Signals Based on Hierarchical Temporal Periodic Transformer

    Authors: Bochao Zou, Zizheng Guo, Jiansheng Chen, Huimin Ma

    Abstract: Remote photoplethysmography (rPPG) is a non-contact method for detecting physiological signals based on facial videos, holding high potential in various applications such as healthcare, affective computing, anti-spoofing, etc. Due to the periodic nature of rPPG, the long-range dependency capturing capacity of the Transformer was assumed to be advantageous for such signals. However, existing app…

    Submitted 20 February, 2024; originally announced February 2024.

  12. arXiv:2401.00496  [pdf, other]

    cs.CV cs.AI cs.LG

    SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge

    Authors: Dimitrios Psychogyios, Emanuele Colleoni, Beatrice Van Amsterdam, Chih-Yang Li, Shu-Yu Huang, Yuchong Li, Fucang Jia, Baosheng Zou, Guotai Wang, Yang Liu, Maxence Boels, Jiayu Huo, Rachel Sparks, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin, Mengya Xu, An Wang, Yanan Wu, Long Bai, Hongliang Ren, Atsushi Yamada, Yuriko Harai, Yuto Ishikawa, Kazuyuki Hayashi, et al. (25 additional authors not shown)

    Abstract: Surgical tool segmentation and action recognition are fundamental building blocks in many computer-assisted intervention applications, ranging from surgical skills assessment to decision support systems. Nowadays, learning-based action recognition and segmentation approaches outperform classical methods, relying, however, on large, annotated datasets. Furthermore, action recognition and tool segme…

    Submitted 23 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

  13. arXiv:2312.14705  [pdf, other]

    eess.IV cs.CV cs.LG

    SCUNet++: Swin-UNet and CNN Bottleneck Hybrid Architecture with Multi-Fusion Dense Skip Connection for Pulmonary Embolism CT Image Segmentation

    Authors: Yifei Chen, Binfeng Zou, Zhaoxin Guo, Yiyu Huang, Yifan Huang, Feiwei Qin, Qinhai Li, Changmiao Wang

    Abstract: Pulmonary embolism (PE) is a prevalent lung disease that can lead to right ventricular hypertrophy and failure in severe cases, ranking second in severity only to myocardial infarction and sudden death. Pulmonary artery CT angiography (CTPA) is a widely used diagnostic method for PE. However, PE detection presents challenges in clinical practice due to limitations in imaging technology. CTPA can p…

    Submitted 2 January, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: 10 pages, 7 figures, accepted to WACV 2024

    Journal ref: WACV 2024

  14. arXiv:2312.05834  [pdf, other]

    cs.CL cs.AI

    Evidence-based Interpretable Open-domain Fact-checking with Large Language Models

    Authors: Xin Tan, Bowei Zou, Ai Ti Aw

    Abstract: Universal fact-checking systems for real-world claims face significant challenges in gathering valid and sufficient real-time evidence and making reasoned decisions. In this work, we introduce the Open-domain Explainable Fact-checking (OE-Fact) system for claim-checking in real-world scenarios. The OE-Fact system can leverage the powerful understanding and reasoning capabilities of large language…

    Submitted 10 December, 2023; originally announced December 2023.

  15. arXiv:2312.02923  [pdf, other]

    cs.CV

    MoSA: Mixture of Sparse Adapters for Visual Efficient Tuning

    Authors: Qizhe Zhang, Bocheng Zou, Ruichuan An, Jiaming Liu, Shanghang Zhang

    Abstract: With the rapid growth in the scale of pre-trained foundation models, parameter-efficient fine-tuning techniques have gained significant attention, among which Adapter Tuning is the most widely used. Despite achieving efficiency, it still underperforms full fine-tuning, and the performance improves at the cost of an increase in parameters. Recent efforts have either focused on training multiple ada…

    Submitted 23 March, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: 16 pages, 7 figures. Official code: https://github.com/Theia-4869/MoSA

  16. arXiv:2310.15105  [pdf, other]

    cs.CV

    FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning

    Authors: Kun Song, Huimin Ma, Bochao Zou, Huishuai Zhang, Weiran Huang

    Abstract: Due to the limited availability of data, existing few-shot learning methods trained from scratch fail to achieve satisfactory performance. In contrast, large-scale pre-trained models such as CLIP demonstrate remarkable few-shot and zero-shot capabilities. To enhance the performance of pre-trained models for downstream tasks, fine-tuning the model on downstream data is frequently necessary. However…

    Submitted 17 November, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  17. arXiv:2308.08283  [pdf, other]

    eess.IV cs.CV cs.LG

    CARE: A Large Scale CT Image Dataset and Clinical Applicable Benchmark Model for Rectal Cancer Segmentation

    Authors: Hantao Zhang, Weidong Guo, Chenyang Qiu, Shouhong Wan, Bingbing Zou, Wanqin Wang, Peiquan Jin

    Abstract: Rectal cancer segmentation in CT images plays a crucial role in timely clinical diagnosis, radiotherapy treatment, and follow-up. Although current segmentation methods have shown promise in delineating cancerous tissues, they still encounter challenges in achieving high segmentation precision. These obstacles arise from the intricate anatomical structures of the rectum and the difficulties in perfo…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: 8 pages

  18. arXiv:2305.16048  [pdf, other]

    cs.CL cs.AI

    UFO: Unified Fact Obtaining for Commonsense Question Answering

    Authors: Zhifeng Li, Yifan Fan, Bowei Zou, Yu Hong

    Abstract: Leveraging external knowledge to enhance the reasoning ability is crucial for commonsense question answering. However, the existing knowledge bases heavily rely on manual annotation which unavoidably causes deficiency in coverage of world-wide commonsense knowledge. Accordingly, the knowledge bases fail to be flexible enough to support the reasoning over diverse questions. Recently, large-scale la…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: IJCNN 2023

  19. arXiv:2305.08348  [pdf, other]

    cs.CL

    Coreference-aware Double-channel Attention Network for Multi-party Dialogue Reading Comprehension

    Authors: Yanling Li, Bowei Zou, Yifan Fan, Mengxing Dong, Yu Hong

    Abstract: We tackle Multi-party Dialogue Reading Comprehension (abbr., MDRC). MDRC stands for an extractive reading comprehension task grounded on a batch of dialogues among multiple interlocutors. It is challenging due to the requirement of understanding cross-utterance contexts and relationships in a multi-turn multi-party conversation. Previous studies have made great efforts on the utterance profiling o…

    Submitted 22 May, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: IJCNN 2023

  20. arXiv:2305.08347  [pdf, other]

    cs.CL cs.AI

    KEPR: Knowledge Enhancement and Plausibility Ranking for Generative Commonsense Question Answering

    Authors: Zhifeng Li, Bowei Zou, Yifan Fan, Yu Hong

    Abstract: Generative commonsense question answering (GenCQA) is a task of automatically generating a list of answers given a question. The answer list is required to cover all reasonable answers. This presents the considerable challenges of producing diverse answers and ranking them properly. Incorporating a variety of closely-related background knowledge into the encoding of questions enables the generatio…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: IJCNN 2023

  21. arXiv:2305.03088  [pdf, other]

    cs.CL cs.AI

    Modeling What-to-ask and How-to-ask for Answer-unaware Conversational Question Generation

    Authors: Xuan Long Do, Bowei Zou, Shafiq Joty, Anh Tai Tran, Liangming Pan, Nancy F. Chen, Ai Ti Aw

    Abstract: Conversational Question Generation (CQG) is a critical task for machines to assist humans in fulfilling their information needs through conversations. The task is generally cast into two different settings: answer-aware and answer-unaware. While the former facilitates the models by exposing the expected answer, the latter is more realistic and has received growing attention recently. What-to-ask and…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: 17 pages, ACL 2023

  22. arXiv:2304.05218  [pdf, ps, other]

    cs.CV

    Improving Neural Radiance Fields with Depth-aware Optimization for Novel View Synthesis

    Authors: Shu Chen, Junyao Li, Yang Zhang, Beiji Zou

    Abstract: With dense inputs, Neural Radiance Fields (NeRF) is able to render photo-realistic novel views under static conditions. Although the synthesis quality is excellent, existing NeRF-based methods fail to obtain moderate three-dimensional (3D) structures. The novel view synthesis quality drops dramatically given sparse input due to the implicitly reconstructed inaccurate 3D-scene structure. We propose…

    Submitted 19 February, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

  23. arXiv:2302.14229  [pdf, other]

    cs.CL cs.AI

    Zero-Shot Cross-Lingual Summarization via Large Language Models

    Authors: Jiaan Wang, Yunlong Liang, Fandong Meng, Beiqi Zou, Zhixu Li, Jianfeng Qu, Jie Zhou

    Abstract: Given a document in a source language, cross-lingual summarization (CLS) aims to generate a summary in a different target language. Recently, the emergence of Large Language Models (LLMs), such as GPT-3.5, ChatGPT and GPT-4, has attracted wide attention from the computational linguistics community. However, the performance of LLMs on CLS is not yet known. In this report, we empirically use vari…

    Submitted 24 October, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: Both first authors contributed equally. Technical Report, 12 pages. Accepted to the 4th New Frontiers in Summarization Workshop (NewSumm@EMNLP 2023)

  24. arXiv:2302.10185  [pdf]

    cs.CV cs.AI cs.LG

    Active Learning in Brain Tumor Segmentation with Uncertainty Sampling, Annotation Redundancy Restriction, and Data Initialization

    Authors: Daniel D Kim, Rajat S Chandra, Jian Peng, Jing Wu, Xue Feng, Michael Atalay, Chetan Bettegowda, Craig Jones, Haris Sair, Wei-hua Liao, Chengzhang Zhu, Beiji Zou, Li Yang, Anahita Fathi Kazerooni, Ali Nabavizadeh, Harrison X Bai, Zhicheng Jiao

    Abstract: Deep learning models have demonstrated great potential in medical 3D imaging, but their development is limited by the expensive, large volume of annotated data required. Active learning (AL) addresses this by training a model on a subset of the most informative data samples without compromising performance. We compared different AL strategies and propose a framework that minimizes the amount of da…

    Submitted 4 February, 2023; originally announced February 2023.

    Comments: 22 pages, 3 figures, 3 tables, 1 supplementary data document. Submitted to Medical Physics in Jan 2023

  25. arXiv:2212.08568  [pdf, other]

    cs.CV cs.LG

    Biomedical image analysis competitions: The state of current participation practice

    Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, et al. (331 additional authors not shown)

    Abstract: The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis,…

    Submitted 12 September, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

  26. arXiv:2210.11153  [pdf, other]

    eess.IV cs.CV

    Reversed Image Signal Processing and RAW Reconstruction. AIM 2022 Challenge Report

    Authors: Marcos V. Conde, Radu Timofte, Yibin Huang, Jingyang Peng, Chang Chen, Cheng Li, Eduardo Pérez-Pellitero, Fenglong Song, Furui Bai, Shuai Liu, Chaoyu Feng, Xiaotao Wang, Lei Lei, Yu Zhu, Chenghua Li, Yingying Jiang, Yong A, Peisong Wang, Cong Leng, Jian Cheng, Xiaoyu Liu, Zhicun Yin, Zhilu Zhang, Junyi Li, Ming Liu, et al. (18 additional authors not shown)

    Abstract: Cameras capture sensor RAW images and transform them into pleasant RGB images, suitable for the human eyes, using their integrated Image Signal Processor (ISP). Numerous low-level vision tasks operate in the RAW domain (e.g. image denoising, white balance) due to its linear relationship with the scene irradiance, wide range of information at 12 bits, and sensor designs. Despite this, RAW image data…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: ECCV 2022 Advances in Image Manipulation (AIM) workshop

  27. arXiv:2210.00183  [pdf, ps, other]

    cs.CV

    Structure-Aware NeRF without Posed Camera via Epipolar Constraint

    Authors: Shu Chen, Yang Zhang, Yaxin Xu, Beiji Zou

    Abstract: The neural radiance field (NeRF) for realistic novel view synthesis requires camera poses to be pre-acquired by a structure-from-motion (SfM) approach. This two-stage strategy is not convenient to use and degrades the performance because the error in the pose extraction can propagate to the view synthesis. We integrate the pose extraction and view synthesis into a single end-to-end procedure so th…

    Submitted 30 September, 2022; originally announced October 2022.

  28. arXiv:2209.06652  [pdf, other]

    cs.CL

    CoHS-CQG: Context and History Selection for Conversational Question Generation

    Authors: Xuan Long Do, Bowei Zou, Liangming Pan, Nancy F. Chen, Shafiq Joty, Ai Ti Aw

    Abstract: Conversational question generation (CQG) serves as a vital task for machines to assist humans, such as interactive reading comprehension, through conversations. Compared to traditional single-turn question generation (SQG), CQG is more challenging in the sense that the generated question is required not only to be meaningful, but also to align with the occurred conversation history. While previous…

    Submitted 10 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: Accepted by 29th International Conference on Computational Linguistics (COLING 2022)

  29. arXiv:2208.12459  [pdf, other]

    cs.LG

    Meta Objective Guided Disambiguation for Partial Label Learning

    Authors: Bo-Shi Zou, Ming-Kun Xie, Sheng-Jun Huang

    Abstract: Partial label learning (PLL) is a typical weakly supervised learning framework, where each training instance is associated with a candidate label set, among which only one label is valid. To solve PLL problems, methods typically try to perform disambiguation for candidate sets by either using prior knowledge, such as structure information of training data, or refining model outputs in a self-train…

    Submitted 22 December, 2023; v1 submitted 26 August, 2022; originally announced August 2022.

    Comments: 10 pages

  30. arXiv:2207.06695  [pdf, other]

    cs.CV

    DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding

    Authors: Liang Qiao, Hui Jiang, Ying Chen, Can Li, Pengfei Li, Zaisheng Li, Baorui Zou, Dashan Guo, Yingda Xu, Yunlu Xu, Zhanzhan Cheng, Yi Niu

    Abstract: This paper presents DavarOCR, an open-source toolbox for OCR and document understanding tasks. DavarOCR currently implements 19 advanced algorithms, covering 9 different task forms. DavarOCR provides detailed usage instructions and the trained models for each algorithm. Compared with the previous open-source OCR toolbox, DavarOCR has relatively more complete support for the sub-tasks of the cutting…

    Submitted 14 July, 2022; originally announced July 2022.

    Comments: Short paper, accepted by ACM MM 2022

  31. arXiv:2205.14833  [pdf, other]

    cs.LG cs.DC eess.SY

    Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning

    Authors: Chengfei Lv, Chaoyue Niu, Renjie Gu, Xiaotang Jiang, Zhaode Wang, Bin Liu, Ziqi Wu, Qiulin Yao, Congyu Huang, Panos Huang, Tao Huang, Hui Shu, Jinde Song, Bin Zou, Peng Lan, Guohuan Xu, Fei Wu, Shaojie Tang, Fan Wu, Guihai Chen

    Abstract: To break the bottlenecks of the mainstream cloud-based machine learning (ML) paradigm, we adopt device-cloud collaborative ML and build the first end-to-end and general-purpose system, called Walle, as the foundation. Walle consists of a deployment platform, distributing ML tasks to billion-scale devices in time; a data pipeline, efficiently preparing task input; and a compute container, providing a c…

    Submitted 29 May, 2022; originally announced May 2022.

    Comments: Accepted by OSDI 2022

  32. arXiv:2205.12633  [pdf, other]

    cs.CV eess.IV

    NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

    Authors: Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, et al. (68 additional authors not shown)

    Abstract: This paper reviews the challenge on constrained high dynamic range (HDR) imaging that was part of the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2022. This manuscript focuses on the competition set-up, datasets, the proposed methods and their results. The challenge aims at estimating an HDR image from multiple respective low dynamic range (LDR)…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: CVPR Workshops 2022. 15 pages, 21 figures, 2 tables

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022

  33. Encoding of direct 4D printing of isotropic single-material system for double-curvature and multimodal morphing

    Authors: Bihui Zou, Chao Song, Zipeng He, Jaehyung Ju

    Abstract: The ability to morph flat sheets into complex 3D shapes is extremely useful for fast manufacturing and saving materials while also allowing volumetrically efficient storage and shipment and a functional use. Direct 4D printing is a compelling method to morph complex 3D shapes out of as-printed 2D plates. However, most direct 4D printing methods require multi-material systems involving costly machi…

    Submitted 5 May, 2022; originally announced May 2022.

    Journal ref: Extreme Mech. Lett. 54 (2022) 101779

  34. arXiv:2203.02102  [pdf, other]

    eess.SP cs.HC eess.SY

    BEATS: An Open-Source, High-Precision, Multi-Channel EEG Acquisition Tool System

    Authors: Bing Zou, Yubo Zheng, Mu Shen, Yingying Luo, Lei Li, Lin Zhang

    Abstract: Stable and accurate electroencephalogram (EEG) signal acquisition is fundamental in non-invasive brain-computer interface (BCI) technology. The hardware and software of commonly used EEG acquisition systems are usually closed-source. Their inability to support flexible expansion and secondary development is a major obstacle to real-time BCI research. This paper presents the Beijing University of Posts and Telecom…

    Submitted 19 December, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  35. arXiv:2202.13636  [pdf, other]

    cs.CL

    Improving Lexical Embeddings for Robust Question Answering

    Authors: Weiwen Xu, Bowei Zou, Wai Lam, Ai Ti Aw

    Abstract: Recent techniques in Question Answering (QA) have achieved remarkable performance improvements, with some QA models even surpassing human performance. However, the ability of these models to truly understand language remains dubious, and the models reveal limitations when facing adversarial examples. To strengthen the robustness of QA models and their generalization ability, we propo…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: 7 pages, 3 tables

  36. arXiv:2201.12538  [pdf, other]

    cs.CL cs.AI

    Incorporating Commonsense Knowledge into Story Ending Generation via Heterogeneous Graph Networks

    Authors: Jiaan Wang, Beiqi Zou, Zhixu Li, Jianfeng Qu, Pengpeng Zhao, An Liu, Lei Zhao

    Abstract: Story ending generation is an interesting and challenging task, which aims to generate a coherent and reasonable ending given a story context. The key challenges of the task lie in how to comprehend the story context sufficiently and handle the implicit knowledge behind story clues effectively, which are still under-explored by previous work. In this paper, we propose a Story Heterogeneous Graph N…

    Submitted 29 January, 2022; originally announced January 2022.

    Comments: DASFAA 2022

  37. arXiv:2109.13828  [pdf, other]

    cs.LG cs.CR

    An Automated Data Engineering Pipeline for Anomaly Detection of IoT Sensor Data

    Authors: Xinze Li, Baixi Zou

    Abstract: The rapid development in the field of System on Chip (SoC) technology, Internet of Things (IoT), cloud computing, and artificial intelligence has brought more possibilities of improving and solving the current problems. With data analytics and the use of machine learning/deep learning, it is possible to learn the underlying patterns and make decisions based on what was learned from massive da…

    Submitted 28 September, 2021; originally announced September 2021.

    MSC Class: 68T01 ACM Class: I.2.6; C.2.0

  38. arXiv:2107.11919  [pdf, other]

    cs.CV

    ICDAR 2021 Competition on Scene Video Text Spotting

    Authors: Zhanzhan Cheng, Jing Lu, Baorui Zou, Shuigeng Zhou, Fei Wu

    Abstract: Scene video text spotting (SVTS) is a very important research topic because of many real-life applications. However, only a little effort has been put into spotting scene video text, in contrast to massive studies of scene text spotting in static images. Due to various environmental interferences like motion blur, spotting scene video text becomes very challenging. To promote this research area, this com…

    Submitted 25 July, 2021; originally announced July 2021.

    Comments: SVTS Technical Report for ICDAR 2021 competition

  39. arXiv:2105.13845  [pdf]

    cs.DC math.OC

    Multi-Tier Adaptive Memory Programming and Cluster- and Job-based Relocation for Distributed On-demand Crowdshipping

    Authors: Tanvir Ahamed, Bo Zou

    Abstract: With rapid e-commerce growth, on-demand urban delivery is having a high time especially for food, grocery, and retail, often requiring delivery in a very short amount of time after an order is placed. This imposes significant financial and operational challenges for traditional vehicle-based delivery methods. Crowdshipping, which employs ordinary people with a low pay rate and limited time availab…

    Submitted 14 May, 2021; originally announced May 2021.

    Comments: 46 pages, 9 figures, 4 algorithms

  40. arXiv:2105.03807  [pdf]

    cs.CV

    Estimation of 3D Human Pose Using Prior Knowledge

    Authors: Shu Chen, Lei Zhang, Beiji Zou

    Abstract: Estimating three-dimensional human poses from the positions of two-dimensional joints has shown promising results. However, using two-dimensional joint coordinates as input loses more information than image-based approaches and results in ambiguity. In order to overcome this problem, we combine bone length and camera parameters with two-dimensional joint coordinates for input. This combination is mor…

    Submitted 8 May, 2021; originally announced May 2021.

    Comments: letter

  41. arXiv:2101.03690  [pdf

    cs.LG

    Modeling Household Online Shopping Demand in the U.S.: A Machine Learning Approach and Comparative Investigation between 2009 and 2017

    Authors: Limon Barua, Bo Zou, Yan Zhou, Yulin Liu

    Abstract: Despite the rapid growth of online shopping and research interest in the relationship between online and in-store shopping, national-level modeling and investigation of the demand for online shopping with a prediction focus remain limited in the literature. This paper differs from prior work and leverages two recent releases of the U.S. National Household Travel Survey (NHTS) data for 2009 and 201…

    Submitted 10 January, 2021; originally announced January 2021.

  42. arXiv:2012.15575  [pdf, other

    cs.CV

    A Deep Retinal Image Quality Assessment Network with Salient Structure Priors

    Authors: Ziwen Xu, Beiji Zou, Qing Liu

    Abstract: Retinal image quality assessment is an essential prerequisite for diagnosis of retinal diseases. Its goal is to identify retinal images in which the anatomic structures and lesions that most attract ophthalmologists' attention are exhibited clearly and definitely, while rejecting poor-quality fundus images. Motivated by this, we mimic the way that ophthalmologists assess the quality of retinal images and pr…

    Submitted 31 December, 2020; originally announced December 2020.

  43. arXiv:2011.14430  [pdf

    cs.AI cs.LG

    Deep Reinforcement Learning for Crowdsourced Urban Delivery: System States Characterization, Heuristics-guided Action Choice, and Rule-Interposing Integration

    Authors: Tanvir Ahamed, Bo Zou, Nahid Parvez Farazi, Theja Tulabandhula

    Abstract: This paper investigates the problem of assigning shipping requests to ad hoc couriers in the context of crowdsourced urban delivery. The shipping requests are spatially distributed each with a limited time window between the earliest time for pickup and latest time for delivery. The ad hoc couriers, termed crowdsourcees, also have limited time availability and carrying capacity. We propose a new d…

    Submitted 29 November, 2020; originally announced November 2020.

    Comments: 50 pages, 17 figures

  44. arXiv:2010.13313  [pdf, other

    eess.IV cs.CV

    A Dark and Bright Channel Prior Guided Deep Network for Retinal Image Quality Assessment

    Authors: Ziwen Xu, Beiji Zou, Qing Liu

    Abstract: Retinal image quality assessment is an essential task in the diagnosis of retinal diseases. Recently, deep models have emerged to grade the quality of retinal images. Current state-of-the-art methods either directly transfer classification networks originally designed for natural images to quality classification of retinal images or introduce extra image quality priors via multiple CNN branches or inde…

    Submitted 20 April, 2021; v1 submitted 25 October, 2020; originally announced October 2020.

  45. arXiv:2010.06187  [pdf

    cs.LG cs.AI

    Deep Reinforcement Learning and Transportation Research: A Comprehensive Review

    Authors: Nahid Parvez Farazi, Tanvir Ahamed, Limon Barua, Bo Zou

    Abstract: Deep reinforcement learning (DRL) is an emerging methodology that is transforming the way many complicated transportation decision-making problems are tackled. Researchers have been increasingly turning to this powerful learning-based methodology to solve challenging problems across transportation fields. While many promising applications have been reported in the literature, there remains a lack…

    Submitted 13 October, 2020; originally announced October 2020.

  46. Unsupervised Deep Representation Learning and Few-Shot Classification of PolSAR Images

    Authors: Lamei Zhang, Siyu Zhang, Bin Zou, Hongwei Dong

    Abstract: Deep learning and convolutional neural networks (CNNs) have made progress in polarimetric synthetic aperture radar (PolSAR) image classification over the past few years. However, a crucial issue has not been addressed, i.e., the requirement of CNNs for abundant labeled samples versus the insufficient human annotations of PolSAR images. It is well-known that following the supervised learning paradi…

    Submitted 24 December, 2020; v1 submitted 27 June, 2020; originally announced June 2020.

    Comments: 16 pages, 16 figures

  47. arXiv:2005.13116  [pdf, other

    cs.CV

    Object-QA: Towards High Reliable Object Quality Assessment

    Authors: Jing Lu, Baorui Zou, Zhanzhan Cheng, Shiliang Pu, Shuigeng Zhou, Yi Niu, Fei Wu

    Abstract: In object recognition applications, object images usually appear with different quality levels. Practically, it is very important to indicate object image qualities for better application performance, e.g. filtering out low-quality object image frames to maintain robust video object recognition results and speed up inference. However, no previous works are explicitly proposed for addressing the pr…

    Submitted 26 May, 2020; originally announced May 2020.

  48. arXiv:2002.12418  [pdf, other

    cs.CV cs.DC cs.LG

    MNN: A Universal and Efficient Inference Engine

    Authors: Xiaotang Jiang, Huan Wang, Yiliu Chen, Ziqi Wu, Lichuan Wang, Bin Zou, Yafeng Yang, Zongyang Cui, Yu Cai, Tianhang Yu, Chengfei Lv, Zhihua Wu

    Abstract: Deploying deep learning models on mobile devices has drawn increasing attention recently. However, designing an efficient on-device inference engine faces the major challenges of model compatibility, device diversity, and resource limitation. To deal with these challenges, we propose Mobile Neural Network (MNN), a universal and efficient inference engine tailored to mobile applications. In this…

    Submitted 27 February, 2020; originally announced February 2020.

    Comments: Accepted by MLSys 2020

  49. Automatic Design of CNNs via Differentiable Neural Architecture Search for PolSAR Image Classification

    Authors: Hongwei Dong, Siyu Zhang, Bin Zou, Lamei Zhang

    Abstract: Convolutional neural networks (CNNs) have shown good performance in polarimetric synthetic aperture radar (PolSAR) image classification due to the automation of feature engineering. Excellent hand-crafted architectures of CNNs incorporated the wisdom of human experts, which is an important reason for CNN's success. However, the design of the architectures is a difficult problem, which needs a lot…

    Submitted 19 November, 2019; v1 submitted 16 November, 2019; originally announced November 2019.

    Journal ref: IEEE Transactions on Geoscience and Remote Sensing, Volume 58, Issue 9, September 2020, Pages 6362-6375

  50. arXiv:1911.01648  [pdf, other

    cs.CV

    A Deep Gradient Boosting Network for Optic Disc and Cup Segmentation

    Authors: Qing Liu, Beiji Zou, Yang Zhao, Yixiong Liang

    Abstract: Segmentation of the optic disc (OD) and optic cup (OC) is critical in automated fundus image analysis systems. Existing state-of-the-art methods focus on designing deep neural networks with one or multiple dense prediction branches. Such designs ignore connections among prediction branches, and their learning capacity is limited. To build connections among prediction branches, this paper introduces gra…

    Submitted 5 November, 2019; originally announced November 2019.