Skip to main content

Showing 1–50 of 2,889 results for author: Zhang, C

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.05155  [pdf, other

    cs.CE

    An efficient truncation scheme for Eulerian and total Lagrangian SPH methods

    Authors: Zhentong Wang, Chi Zhang, Oskar J. Haidn, Xiangyu Hu

    Abstract: In smoothed particle hydrodynamics (SPH) method, the particle-based approximations are implemented via kernel functions, and the evaluation of performance involves two key criteria: numerical accuracy and computational efficiency. In the SPH community, the Wendland kernel reigns as the prevailing choice due to its commendable accuracy and reasonable computational efficiency. Nevertheless, there ex… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 38 pages and 14 figures

  2. arXiv:2405.04840  [pdf, other

    cs.IR

    Federated Adaptation for Foundation Model-based Recommendations

    Authors: Chunxu Zhang, Guodong Long, Hongkuan Guo, Xiao Fang, Yang Song, Zhaojie Liu, Guorui Zhou, Zijian Zhang, Yang Liu, Bo Yang

    Abstract: With the recent success of large language models, particularly foundation models with generalization abilities, applying foundation models for recommendations becomes a new paradigm to improve existing recommendation systems. It becomes a new open challenge to enable the foundation model to capture user preference changes in a timely manner with reasonable communication and computation costs while… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted as a regular paper of IJCAI'24

  3. Teacher-Student Network for Real-World Face Super-Resolution with Progressive Embedding of Edge Information

    Authors: Zhilei Liu, Chenggong Zhang

    Abstract: Traditional face super-resolution (FSR) methods trained on synthetic datasets usually have poor generalization ability for real-world face images. Recent work has utilized complex degradation models or training networks to simulate the real degradation process, but this limits the performance of these methods due to the domain differences that still exist between the generated low-resolution image… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by ICIP 2023

  4. arXiv:2405.04404  [pdf, other

    cs.CV cs.AI cs.CL cs.LG

    Vision Mamba: A Comprehensive Survey and Taxonomy

    Authors: Xiao Liu, Chenxu Zhang, Lei Zhang

    Abstract: State Space Model (SSM) is a mathematical model used to describe and analyze the behavior of dynamic systems. This model has witnessed numerous applications in several fields, including control theory, signal processing, economics and machine learning. In the field of deep learning, state space models are used to process sequence data, such as time series analysis, natural language processing (NLP… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: https://github.com/lx6c78/Vision-Mamba-A-Comprehensive-Survey-and-Taxonomy

  5. arXiv:2405.04064  [pdf, other

    cs.AI

    MFA-Net: Multi-Scale feature fusion attention network for liver tumor segmentation

    Authors: Yanli Yuan, Bingbing Wang, Chuan Zhang, Jingyi Xu, Ximeng Liu, Liehuang Zhu

    Abstract: Segmentation of organs of interest in medical CT images is beneficial for diagnosis of diseases. Though recent methods based on Fully Convolutional Neural Networks (F-CNNs) have shown success in many segmentation tasks, fusing features from images with different scales is still a challenge: (1) Due to the lack of spatial awareness, F-CNNs share the same weights at different spatial locations. (2)… ▽ More

    Submitted 9 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: Paper accepted in Human-Centric Representation Learning workshop at AAAI 2024

  6. arXiv:2405.03718  [pdf, other

    cs.LG cs.AI cs.GT cs.MA

    A Single Online Agent Can Efficiently Learn Mean Field Games

    Authors: Chenyu Zhang, Xu Chen, Xuan Di

    Abstract: Mean field games (MFGs) are a promising framework for modeling the behavior of large-population systems. However, solving MFGs can be challenging due to the coupling of forward population evolution and backward agent dynamics. Typically, obtaining mean field Nash equilibria (MFNE) involves an iterative approach where the forward and backward processes are solved alternately, known as fixed-point i… ▽ More

    Submitted 5 May, 2024; originally announced May 2024.

  7. arXiv:2405.03520  [pdf, other

    cs.CV

    Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

    Authors: Zheng Zhu, Xiaofeng Wang, Wangbo Zhao, Chen Min, Nianchen Deng, Min Dou, Yuqi Wang, Botian Shi, Kai Wang, Chi Zhang, Yang You, Zhaoxiang Zhang, Dawei Zhao, Liang Xiao, Jian Zhao, Jiwen Lu, Guan Huang

    Abstract: General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual environments to decision-making systems. Recently, the emergence of the Sora model has attained significant attention due to its remarkable simulation capabilities, which exhibits an incipient comprehension of physical law… ▽ More

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: This survey will be regularly updated at: https://github.com/GigaAI-research/General-World-Models-Survey

  8. arXiv:2405.02784  [pdf, other

    eess.IV cs.CV

    MR-Transformer: Vision Transformer for Total Knee Replacement Prediction Using Magnetic Resonance Imaging

    Authors: Chaojie Zhang, Shengjia Chen, Ozkan Cigdem, Haresh Rengaraj Rajamohan, Kyunghyun Cho, Richard Kijowski, Cem M. Deniz

    Abstract: A transformer-based deep learning model, MR-Transformer, was developed for total knee replacement (TKR) prediction using magnetic resonance imaging (MRI). The model incorporates the ImageNet pre-training and captures three-dimensional (3D) spatial correlation from the MR images. The performance of the proposed model was compared to existing state-of-the-art deep learning models for knee injury dia… ▽ More

    Submitted 4 May, 2024; originally announced May 2024.

  9. arXiv:2405.02759  [pdf, other

    cs.GR

    Region-Aware Color Smudging

    Authors: Ying Jiang, Pengfei Xu, Congyi Zhang, Hongbo Fu, Henry Lau, Wenping Wang

    Abstract: Color smudge operations from digital painting software enable users to create natural shading effects in high-fidelity paintings by interactively mixing colors. To precisely control results in traditional painting software, users tend to organize flat-filled color regions in multiple layers and smudge them to generate different color gradients. However, the requirement to carefully deal with regio… ▽ More

    Submitted 4 May, 2024; originally announced May 2024.

  10. arXiv:2405.02466  [pdf

    cs.CR cs.LG

    ProFLingo: A Fingerprinting-based Copyright Protection Scheme for Large Language Models

    Authors: Heng Jin, Chaoyu Zhang, Shanghao Shi, Wenjing Lou, Y. Thomas Hou

    Abstract: Large language models (LLMs) have attracted significant attention in recent years. Due to their "Large" nature, training LLMs from scratch consumes immense computational resources. Since several major players in the artificial intelligence (AI) field have open-sourced their original LLMs, an increasing number of individual researchers and smaller companies are able to build derivative LLMs based o… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: This is the author's pre-print version of the work. It is posted here for your personal use. Not for redistribution

  11. arXiv:2405.02412  [pdf, other

    cs.LG

    Deep Learning and Transfer Learning Architectures for English Premier League Player Performance Forecasting

    Authors: Daniel Frees, Pranav Ravella, Charlie Zhang

    Abstract: This paper presents a groundbreaking model for forecasting English Premier League (EPL) player performance using convolutional neural networks (CNNs). We evaluate Ridge regression, LightGBM and CNNs on the task of predicting upcoming player FPL score based on historical FPL data over the previous weeks. Our baseline models, Ridge regression and LightGBM, achieve solid performance and emphasize the… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 10 pages

    MSC Class: 68T07

  12. arXiv:2405.02299  [pdf, other

    cs.CE cs.LG

    Deep Reinforcement Learning for Modelling Protein Complexes

    Authors: Ziqi Gao, Tao Feng, Jiaxuan You, Chenyi Zi, Yan Zhou, Chen Zhang, Jia Li

    Abstract: AlphaFold can be used for both single-chain and multi-chain protein structure prediction, while the latter becomes extremely challenging as the number of chains increases. In this work, by taking each chain as a node and assembly actions as edges, we show that an acyclic undirected connected graph can be used to predict the structure of multi-chain protein complexes (a.k.a., protein complex modell… ▽ More

    Submitted 6 May, 2024; v1 submitted 11 March, 2024; originally announced May 2024.

    Comments: International Conference on Learning Representations (ICLR 2024)

  13. arXiv:2405.01920  [pdf

    cs.CV

    Lightweight Change Detection in Heterogeneous Remote Sensing Images with Online All-Integer Pruning Training

    Authors: Chengyang Zhang, Weiming Li, Gang Li, Huina Song, Zhaohui Song, Xueqian Wang, Antonio Plaza

    Abstract: Detection of changes in heterogeneous remote sensing images is vital, especially in response to emergencies like earthquakes and floods. Current homogenous transformation-based change detection (CD) methods often suffer from high computation and memory costs, which are not friendly to edge-computation devices like onboard CD devices at satellites. To address this issue, this paper proposes a new l… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

  14. arXiv:2405.01884  [pdf, other

    cs.CL

    Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction

    Authors: Wanlong Liu, Li Zhou, Dingyi Zeng, Yichen Xiao, Shaohuan Cheng, Chen Zhang, Grandee Lee, Malu Zhang, Wenyu Chen

    Abstract: Recent mainstream event argument extraction methods process each event in isolation, resulting in inefficient inference and ignoring the correlations among multiple events. To address these limitations, here we propose a multiple-event argument extraction model DEEIA (Dependency-guided Encoding and Event-specific Information Aggregation), capable of extracting arguments from all events within a do… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

  15. arXiv:2405.01047  [pdf, ps, other

    math.OC cs.GT

    Optimal Pricing for Linear-Quadratic Games with Nonlinear Interaction Between Agents

    Authors: Jiamin Cai, Chenyue Zhang, Hoi-To Wai

    Abstract: This paper studies a class of network games with linear-quadratic payoffs and externalities exerted through a strictly concave interaction function. This class of game is motivated by the diminishing marginal effects with peer influences. We analyze the optimal pricing strategy for this class of network game. First, we prove the existence of a unique Nash Equilibrium (NE). Second, we study the opt… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 7 pages, 2 figures, revisions under IEEE Control Systems Letters

  16. arXiv:2405.00549  [pdf, ps, other

    cs.DC

    A Confirmation Rule for the Ethereum Consensus Protocol

    Authors: Aditya Asgaonkar, Francesco D'Amato, Roberto Saltini, Luca Zanolini, Chenyi Zhang

    Abstract: A Confirmation Rule, within blockchain networks, refers to an algorithm implemented by network nodes that determines (either probabilistically or deterministically) the permanence of certain blocks on the blockchain. An example of Confirmation Rule is the Bitcoin's longest chain Confirmation Rule where a block is confirmed (with high probability) when it has a sufficiently long chain of successors… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

  17. arXiv:2405.00181  [pdf, other

    cs.CV cs.AI

    Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly

    Authors: Hang Du, Sicheng Zhang, Binzhu Xie, Guoshun Nan, Jiayang Zhang, Junrui Xu, Hangyu Liu, Sicong Leng, Jiangming Liu, Hehe Fan, Dajiu Huang, Jing Feng, Linli Chen, Can Zhang, Xuhuan Li, Hao Zhang, Jianhang Chen, Qimei Cui, Xiaofeng Tao

    Abstract: Video anomaly understanding (VAU) aims to automatically comprehend unusual occurrences in videos, thereby enabling various applications such as traffic surveillance and industrial manufacturing. While existing VAU benchmarks primarily concentrate on anomaly detection and localization, our focus is on more practicality, prompting us to raise the following crucial questions: "what anomaly occurred?"… ▽ More

    Submitted 6 May, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

    Comments: Accepted in CVPR2024, Codebase: https://github.com/fesvhtr/CUVA

  18. arXiv:2404.18947  [pdf, other

    cs.LG cs.AI

    Multimodal Fusion on Low-quality Data: A Comprehensive Survey

    Authors: Qingyang Zhang, Yake Wei, Zongbo Han, Huazhu Fu, Xi Peng, Cheng Deng, Qinghua Hu, Cai Xu, Jie Wen, Di Hu, Changqing Zhang

    Abstract: Multimodal fusion focuses on integrating information from multiple modalities with the goal of more accurate prediction, which has achieved remarkable progress in a wide range of scenarios, including autonomous driving and medical diagnosis. However, the reliability of multimodal fusion remains largely unexplored especially under low-quality data settings. This paper surveys the common challenges… ▽ More

    Submitted 5 May, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

    Comments: Feel free to comment on our manuscript: qingyangzhang@tju.edu.cn

  19. arXiv:2404.18886  [pdf, other

    cs.LG cs.AI

    A Survey on Diffusion Models for Time Series and Spatio-Temporal Data

    Authors: Yiyuan Yang, Ming Jin, Haomin Wen, Chaoli Zhang, Yuxuan Liang, Lintao Ma, Yi Wang, Chenghao Liu, Bin Yang, Zenglin Xu, Jiang Bian, Shirui Pan, Qingsong Wen

    Abstract: The study of time series data is crucial for understanding trends and anomalies over time, enabling predictive insights across various sectors. Spatio-temporal data, on the other hand, is vital for analyzing phenomena in both space and time, providing a dynamic perspective on complex system interactions. Recently, diffusion models have seen widespread application in time series and spatio-temporal… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Ongoing work; 27 pages, 8 figures, 2 tables; Github Repo: https://github.com/yyysjz1997/Awesome-TimeSeries-SpatioTemporal-Diffusion-Model

  20. M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework

    Authors: Zijian Zhang, Shuchang Liu, Jiaao Yu, Qingpeng Cai, Xiangyu Zhao, Chunxu Zhang, Ziru Liu, Qidong Liu, Hongwei Zhao, Lantao Hu, Peng Jiang, Kun Gai

    Abstract: Multi-domain recommendation and multi-task recommendation have demonstrated their effectiveness in leveraging common information from different domains and objectives for comprehensive user modeling. Nonetheless, the practical recommendation usually faces multiple domains and tasks simultaneously, which cannot be well-addressed by current methods. To this end, we introduce M3oE, an adaptive multi-… ▽ More

    Submitted 7 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  21. arXiv:2404.18443  [pdf, other

    cs.CL cs.AI cs.IR q-bio.QM

    BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers

    Authors: Ran Xu, Wenqi Shi, Yue Yu, Yuchen Zhuang, Yanqiao Zhu, May D. Wang, Joyce C. Ho, Chao Zhang, Carl Yang

    Abstract: Developing effective biomedical retrieval models is important for excelling at knowledge-intensive biomedical tasks but still challenging due to the deficiency of sufficient publicly annotated biomedical data and computational resources. We present BMRetriever, a series of dense retrievers for enhancing biomedical retrieval via unsupervised pre-training on large biomedical corpora, followed by ins… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Work in progress. The model and data will be uploaded to \url{https://github.com/ritaranx/BMRetriever}

  22. Domain Reasoning in TopKAT

    Authors: Cheng Zhang, Arthur Azevedo de Amorim, Marco Gaboardi

    Abstract: TopKAT is the algebraic theory of Kleene algebra with tests (KAT) extended with a top element. Compared to KAT, one pleasant feature of TopKAT is that, in relational models, the top element allows us to express the domain and codomain of a relation. This enables several applications in program logics, such as proving under-approximate specifications or reachability properties of imperative program… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: A version of this article is accepted at ICALP 2024

  23. arXiv:2404.18392  [pdf, other

    cs.DC

    Dflow, a Python framework for constructing cloud-native AI-for-Science workflows

    Authors: Xinzijian Liu, Yanbo Han, Zhuoyuan Li, Jiahao Fan, Chengqian Zhang, Jinzhe Zeng, Yifan Shan, Yannan Yuan, Wei-Hong Xu, Yun-Pei Liu, Yuzhi Zhang, Tongqi Wen, Darrin M. York, Zhicheng Zhong, Hang Zheng, Jun Cheng, Linfeng Zhang, Han Wang

    Abstract: In the AI-for-science era, scientific computing scenarios such as concurrent learning and high-throughput computing demand a new generation of infrastructure that supports scalable computing resources and automated workflow management on both cloud and high-performance supercomputers. Here we introduce Dflow, an open-source Python toolkit designed for scientists to construct workflows with simple… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

  24. arXiv:2404.18130  [pdf, other

    cs.AI cs.CL

    Logic Agent: Enhancing Validity with Logic Rule Invocation

    Authors: Hanmeng Liu, Zhiyang Teng, Chaoli Zhang, Yue Zhang

    Abstract: Chain-of-Thought (CoT) prompting has emerged as a pivotal technique for augmenting the inferential capabilities of language models during reasoning tasks. Despite its advancements, CoT often grapples with challenges in validating reasoning validity and ensuring informativeness. Addressing these limitations, this paper introduces the Logic Agent (LA), an agent-based framework aimed at enhancing the… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

  25. arXiv:2404.17381  [pdf, other

    cs.CV

    Frequency-Guided Multi-Level Human Action Anomaly Detection with Normalizing Flows

    Authors: Shun Maeda, Chunzhi Gu, Jun Yu, Shogo Tokai, Shangce Gao, Chao Zhang

    Abstract: We introduce the task of human action anomaly detection (HAAD), which aims to identify anomalous motions in an unsupervised manner given only the pre-determined normal category of training action samples. Compared to prior human-related anomaly detection tasks which primarily focus on unusual events from videos, HAAD involves the learning of specific action labels to recognize semantically anomalo… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

  26. arXiv:2404.17022  [pdf

    cs.SD eess.AS

    Investigating differences in lab-quality and remote recording methods with dynamic acoustic measures

    Authors: Cong Zhang, Kathleen Jepson, Yu-Ying Chuang

    Abstract: Increasingly, phonetic research utilizes data collected from participants who record themselves on readily available devices. Though such recordings are convenient, their suitability for acoustic analysis remains an open question, especially regarding how the individual methods affect acoustic measures over time. We used Quantile Generalized Additive Mixed Models (QGAMMs) to analyze measures of F0… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  27. arXiv:2404.16847  [pdf, other

    cs.CR cs.AI

    State-of-the-Art Approaches to Enhancing Privacy Preservation of Machine Learning Datasets: A Survey

    Authors: Chaoyu Zhang

    Abstract: This paper examines the evolving landscape of machine learning (ML) and its profound impact across various sectors, with a special focus on the emerging field of Privacy-preserving Machine Learning (PPML). As ML applications become increasingly integral to industries like telecommunications, financial technology, and surveillance, they raise significant privacy concerns, necessitating the developm… ▽ More

    Submitted 25 February, 2024; originally announced April 2024.

  28. arXiv:2404.16820  [pdf, other

    cs.CV

    Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings

    Authors: Olivia Wiles, Chuhan Zhang, Isabela Albuquerque, Ivana Kajić, Su Wang, Emanuele Bugliarello, Yasumasa Onoe, Chris Knutsen, Cyrus Rashtchian, Jordi Pont-Tuset, Aida Nematzadeh

    Abstract: While text-to-image (T2I) generative models have become ubiquitous, they do not necessarily generate images that align with a given prompt. While previous work has evaluated T2I alignment by proposing metrics, benchmarks, and templates for collecting human judgements, the quality of these components is not systematically measured. Human-rated prompt sets are generally small and the reliability of… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Data and code will be released at: https://github.com/google-deepmind/gecko_benchmark_t2i

  29. arXiv:2404.16573  [pdf, other

    cs.CV

    Multi-Scale Representations by Varying Window Attention for Semantic Segmentation

    Authors: Haotian Yan, Ming Wu, Chuang Zhang

    Abstract: Multi-scale learning is central to semantic segmentation. We visualize the effective receptive field (ERF) of canonical multi-scale representations and point out two risks in learning them: scale inadequacy and field inactivation. A novel multi-scale learner, varying window attention (VWA), is presented to address these issues. VWA leverages the local window attention (LWA) and disentangles LWA in… ▽ More

    Submitted 26 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: ICLR2024 Poster

  30. arXiv:2404.16137  [pdf, ps, other

    cs.IT cs.LG eess.SP

    Learned Pulse Shaping Design for PAPR Reduction in DFT-s-OFDM

    Authors: Fabrizio Carpi, Soheil Rostami, Joonyoung Cho, Siddharth Garg, Elza Erkip, Charlie Jianzhong Zhang

    Abstract: High peak-to-average power ratio (PAPR) is one of the main factors limiting cell coverage for cellular systems, especially in the uplink direction. Discrete Fourier transform spread orthogonal frequency-domain multiplexing (DFT-s-OFDM) with spectrally-extended frequency-domain spectrum shaping (FDSS) is one of the efficient techniques deployed to lower the PAPR of the uplink waveforms. In this wor… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 5 pages, under review

  31. arXiv:2404.16037  [pdf, other

    cs.CV cs.LG physics.ao-ph

    VN-Net: Vision-Numerical Fusion Graph Convolutional Network for Sparse Spatio-Temporal Meteorological Forecasting

    Authors: Yutong Xiong, Xun Zhu, Ming Wu, Weiqing Li, Fanbin Mo, Chuang Zhang, Bin Zhang

    Abstract: Sparse meteorological forecasting is indispensable for fine-grained weather forecasting and deserves extensive attention. Recent studies have highlighted the potential of spatio-temporal graph convolutional networks (ST-GCNs) in predicting numerical data from ground weather stations. However, as one of the highest fidelity and lowest latency data, the application of the vision data from satellites… ▽ More

    Submitted 26 January, 2024; originally announced April 2024.

  32. arXiv:2404.16023  [pdf, other

    stat.AP cs.LG

    Learning Car-Following Behaviors Using Bayesian Matrix Normal Mixture Regression

    Authors: Chengyuan Zhang, Kehua Chen, Meixin Zhu, Hai Yang, Lijun Sun

    Abstract: Learning and understanding car-following (CF) behaviors are crucial for microscopic traffic simulation. Traditional CF models, though simple, often lack generalization capabilities, while many data-driven methods, despite their robustness, operate as "black boxes" with limited interpretability. To bridge this gap, this work introduces a Bayesian Matrix Normal Mixture Regression (MNMR) model that s… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 6 pages, Accepted by the 35th IEEE Intelligent Vehicles Symposium

  33. HTAP Databases: A Survey

    Authors: Chao Zhang, Guoliang Li, Jintao Zhang, Xinning Zhang, Jianhua Feng

    Abstract: Since Gartner coined the term, Hybrid Transactional and Analytical Processing (HTAP), numerous HTAP databases have been proposed to combine transactions with analytics in order to enable real-time data analytics for various data-intensive applications. HTAP databases typically process the mixed workloads of transactions and analytical queries in a unified system by leveraging both a row store and… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: IEEE Transactions on Knowledge and Data Engineering, 2024

  34. arXiv:2404.15582  [pdf, other

    cs.CR

    Armored Core of PKI: Remove Signing Keys for CA via Physically Unclonable Function

    Authors: Xiaolin Zhang, Chenghao Chen, Kailun Qin, Chi Zhang, Shipei Qu, Tengfei Wang, Yuxuan Wang, Dawu Gu

    Abstract: The protection of CA's signing keys is one of the most crucial security concerns in PKI. However, these keys can still be exposed today by human errors or various carefully designed attacks. Traditional protections like TEE and HSM fail to eliminate this risk since they can be bypassed by skilled attackers. This dilemma motivates us to consider removing CA' signing keys and propose Armored Core, a… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Initial version of the paper

  35. arXiv:2404.15506  [pdf, other

    cs.CV

    Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation

    Authors: Mu Hu, Wei Yin, Chi Zhang, Zhipeng Cai, Xiaoxiao Long, Hao Chen, Kaixuan Wang, Gang Yu, Chunhua Shen, Shaojie Shen

    Abstract: We introduce Metric3D v2, a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image, which is crucial for metric 3D recovery. While depth and normal are geometrically related and highly complimentary, they present distinct challenges. SoTA monocular depth methods achieve zero-shot generalization by learning affine-invariant depths, which cannot recov… ▽ More

    Submitted 21 March, 2024; originally announced April 2024.

    Comments: Our project page is at https://JUGGHM.github.io/Metric3Dv2. arXiv admin note: substantial text overlap with arXiv:2307.10984

  36. arXiv:2404.15103  [pdf, other

    cs.CL

    Multi-view Content-aware Indexing for Long Document Retrieval

    Authors: Kuicai Dong, Derrick Goh Xin Deik, Yi Quan Lee, Hao Zhang, Xiangyang Li, Cong Zhang, Yong Liu

    Abstract: Long document question answering (DocQA) aims to answer questions from long documents over 10k words. They usually contain content structures such as sections, sub-sections, and paragraph demarcations. However, the indexing methods of long documents remain under-explored, while existing systems generally employ fixed-length chunking. As they do not consider content structures, the resultant chunks… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

  37. arXiv:2404.14983  [pdf, other

    cs.CR

    Zero-Knowledge Location Privacy via Accurate Floating Point SNARKs

    Authors: Jens Ernstberger, Chengru Zhang, Luca Ciprian, Philipp Jovanovic, Sebastian Steinhorst

    Abstract: This paper introduces Zero-Knowledge Location Privacy (ZKLP), enabling users to prove to third parties that they are within a specified geographical region while not disclosing their exact location. ZKLP supports varying levels of granularity, allowing for customization depending on the use case. To realize ZKLP, we introduce the first set of Zero-Knowledge Proof (ZKP) circuits that are fully comp… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

  38. arXiv:2404.14897  [pdf, other

    cs.CL cs.AI

    Beyond the Speculative Game: A Survey of Speculative Execution in Large Language Models

    Authors: Chen Zhang, Zhuorui Liu, Dawei Song

    Abstract: With the increasingly giant scales of (causal) large language models (LLMs), the inference efficiency comes as one of the core concerns along the improved performance. In contrast to the memory footprint, the latency bottleneck seems to be of greater importance as there can be billions of requests to a LLM (e.g., GPT-4) per day. The bottleneck is mainly due to the autoregressive innateness of LLMs… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 10 pages, 4 figures, 1 table, rejected from IJCAI 2024, revision in progress

  39. arXiv:2404.14716  [pdf, other

    cs.CL cs.AI cs.CV cs.SD eess.AS

    Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities

    Authors: Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang

    Abstract: Large language models (LLMs) can adapt to new tasks through in-context learning (ICL) based on a few examples presented in dialogue history without any model parameter update. Despite such convenience, the performance of ICL heavily depends on the quality of the in-context examples presented, which makes the in-context example selection approach a critical choice. This paper proposes a novel Bayes… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 16 pages, 6 figures

  40. arXiv:2404.14219  [pdf, other

    cs.CL cs.AI

    Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg , et al. (62 additional authors not shown)

    Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset… ▽ More

    Submitted 23 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 12 pages

  41. arXiv:2404.13972  [pdf, other

    cs.CV

    Non-Uniform Exposure Imaging via Neuromorphic Shutter Control

    Authors: Mingyuan Lin, Jian Liu, Chi Zhang, Zibo Zhao, Chu He, Lei Yu

    Abstract: By leveraging the blur-noise trade-off, imaging with non-uniform exposures largely extends the image acquisition flexibility in harsh environments. However, the limitation of conventional cameras in perceiving intra-frame dynamic information prevents existing methods from being implemented in the real-world frame acquisition for real-time adaptive camera shutter control. To address this challenge,… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  42. arXiv:2404.13924  [pdf, other

    cs.HC cs.ET

    ActSonic: Recognizing Everyday Activities from Inaudible Acoustic Waves Around the Body

    Authors: Saif Mahmud, Vineet Parikh, Qikang Liang, Ke Li, Ruidong Zhang, Ashwin Ajit, Vipin Gunda, Devansh Agarwal, François Guimbretière, Cheng Zhang

    Abstract: We present ActSonic, an intelligent, low-power active acoustic sensing system integrated into eyeglasses that can recognize 27 different everyday activities (e.g., eating, drinking, toothbrushing) from inaudible acoustic waves around the body with a time resolution of one second. It only needs a pair of miniature speakers and microphones mounted on each hinge of eyeglasses to emit ultrasonic waves… ▽ More

    Submitted 8 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 29 pages, 11 figures

  43. arXiv:2404.13701  [pdf, other

    cs.CV cs.LG

    Semantic-Rearrangement-Based Multi-Level Alignment for Domain Generalized Segmentation

    Authors: Guanlong Jiao, Chenyangguang Zhang, Haonan Yin, Yu Mo, Biqing Huang, Hui Pan, Yi Luo, Jingxian Liu

    Abstract: Domain generalized semantic segmentation is an essential computer vision task, for which models only leverage source data to learn the capability of generalized semantic segmentation towards the unseen target domains. Previous works typically address this challenge by global style randomization or feature regularization. In this paper, we argue that given the observation that different local seman… ▽ More

    Submitted 21 April, 2024; originally announced April 2024.

  44. arXiv:2404.13388  [pdf

    eess.IV cs.CV cs.LG

    Diagnosis of Multiple Fundus Disorders Amidst a Scarcity of Medical Experts Via Self-supervised Machine Learning

    Authors: Yong Liu, Mengtian Kang, Shuo Gao, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Arokia Nathan, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Luigi Occhipinti

    Abstract: Fundus diseases are major causes of visual impairment and blindness worldwide, especially in underdeveloped regions, where the shortage of ophthalmologists hinders timely diagnosis. AI-assisted fundus image analysis has several advantages, such as high accuracy, reduced workload, and improved accessibility, but it requires a large amount of expert-annotated data to build reliable models. To addres… ▽ More

    Submitted 23 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  45. arXiv:2404.13386  [pdf

    eess.IV cs.CV cs.LG

    SSVT: Self-Supervised Vision Transformer For Eye Disease Diagnosis Based On Fundus Images

    Authors: Jiaqi Wang, Mengtian Kang, Yong Liu, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Shuo Gao, Luigi G. Occhipinti

    Abstract: Machine learning-based fundus image diagnosis technologies trigger worldwide interest owing to their benefits such as reducing medical resource power and providing objective evaluation results. However, current methods are commonly based on supervised methods, bringing in a heavy workload to biomedical staff and hence suffering in expanding effective databases. To address this issue, in this artic… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: ISBI 2024

  46. arXiv:2404.12980  [pdf, other

    cs.HC

    Ring-a-Pose: A Ring for Continuous Hand Pose Tracking

    Authors: Tianhong Catherine Yu, Guilin Hu, Ruidong Zhang, Hyunchul Lim, Saif Mahmud, Chi-Jung Lee, Ke Li, Devansh Agarwal, Shuyang Nie, Jinseok Oh, François Guimbretière, Cheng Zhang

    Abstract: We present Ring-a-Pose, a single untethered ring that tracks continuous 3D hand poses. Located in the center of the hand, the ring emits an inaudible acoustic signal that each hand pose reflects differently. Ring-a-Pose imposes minimal obtrusions on the hand, unlike multi-ring or glove systems. It is not affected by the choice of clothing that may cover wrist-worn systems. In a series of three use… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

  47. arXiv:2404.12867  [pdf, other

    cs.CV cs.RO

    FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving

    Authors: Xingtai Gui, Tengteng Huang, Haonan Shao, Haotian Yao, Chi Zhang

    Abstract: The future instance prediction from a Bird's Eye View(BEV) perspective is a vital component in autonomous driving, which involves future instance segmentation and instance motion prediction. Existing methods usually rely on a redundant and complex pipeline which requires multiple auxiliary outputs and post-processing procedures. Moreover, estimated errors on each of the auxiliary predictions will… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

  48. arXiv:2404.12618  [pdf

    cs.CL cs.AI cs.LG

    CORI: CJKV Benchmark with Romanization Integration -- A step towards Cross-lingual Transfer Beyond Textual Scripts

    Authors: Hoang H. Nguyen, Chenwei Zhang, Ye Liu, Natalie Parde, Eugene Rohrbaugh, Philip S. Yu

    Abstract: Naively assuming English as a source language may hinder cross-lingual transfer for many languages by failing to consider the importance of language contact. Some languages are more well-connected than others, and target languages can benefit from transferring from closely related languages; for many languages, the set of closely related languages does not include English. In this work, we study t… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted at LREC-COLING 2024

  49. arXiv:2404.12210  [pdf, other

    cs.CV

    Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training

    Authors: Jin Gao, Shubo Lin, Shaoru Wang, Yutong Kou, Zeming Li, Liang Li, Congxuan Zhang, Xiaoqin Zhang, Yizheng Wang, Weiming Hu

    Abstract: Masked image modeling (MIM) pre-training for large-scale vision transformers (ViTs) in computer vision has enabled promising downstream performance on top of the learned self-supervised ViT features. In this paper, we question if the extremely simple ViTs' fine-tuning performance with a small-scale architecture can also benefit from this pre-training paradigm, which is considerably less studied ye… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

  50. arXiv:2404.11155  [pdf, other

    cs.CV

    HybriMap: Hybrid Clues Utilization for Effective Vectorized HD Map Construction

    Authors: Chi Zhang, Qi Song, Feifei Li, Yongquan Chen, Rui Huang

    Abstract: Constructing vectorized high-definition maps from surround-view cameras has garnered significant attention in recent years. However, the commonly employed multi-stage sequential workflow in prevailing approaches often leads to the loss of early-stage information, particularly in perspective-view features. Usually, such loss is observed as an instance missing or shape mismatching in the final birds… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.