
arXiv:2405.12676 [pdf]

Experimental investigation of trans-scale displacement responses of wrinkle defects in fiber reinforced composite laminates

Authors: Li Ma, Shoulong Wang, Changchen Liu, Ange Wen, Kaidi Ying, Jing Guo

Abstract: Wrinkle defects are found widely in industrial products, e.g. wind turbine blades and filament-wound composite pressure vessels. The wrinkle wavelength varies from several millimeters to over one hundred millimeters. Locating the wrinkle defects and measuring their responses are very important to the assessment of structures that contain wrinkle defects. A meso-mechanical model is presented based on the homogenization method to obtain the effective stiffness of a graded wrinkle. The finite element simulation predicts the trans-scale response of out-of-plane displacement of wrinkled laminates, where the maximum displacement ranges from nanoscale to millimeter scale. Such a trans-scale effect requires different measurement approaches to observe the displacement responses. Here we employ Shearography (Speckle Pattern Shearing Interferometry) and fringe projection profilometry (FPP), respectively, according to the magnitude of the displacement. For the FPP method, a displacement extraction algorithm is presented to obtain the out-of-plane displacement. The measurement sensitivity and accuracy of Shearography and FPP are compared, which provides a quantitative reference for industrial non-destructive testing.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.02823 [pdf, other]

Reconfigurable Massive MIMO: Precoding Design and Channel Estimation in the Electromagnetic Domain

Authors: Keke Ying, Zhen Gao, Yu Su, Tong Qin, Michail Matthaiou, Robert Schober

Abstract: Reconfigurable massive multiple-input multiple-output (RmMIMO) technology offers increased flexibility for future communication systems by exploiting previously untapped degrees of freedom in the electromagnetic (EM) domain. The representation of the traditional spatial domain channel state information (sCSI) limits the insights into the potential of EM domain channel properties, constraining the base station's (BS) utmost capability for precoding design. This paper leverages the EM domain channel state information (eCSI) for radiation pattern design at the BS. We develop an orthogonal decomposition method based on spherical harmonic functions to decompose the radiation pattern into a linear combination of orthogonal bases. By formulating the radiation pattern design as an optimization problem for the projection coefficients over these bases, we develop a manifold optimization-based method for iterative radiation pattern and digital precoder design. To address the eCSI estimation problem, we capitalize on the inherent structure of the channel. Specifically, we propose a subspace-based scheme to reduce the pilot overhead for wideband sCSI estimation. Given the estimated full-band sCSI, we further employ parameterized methods for angle of arrival estimation. Subsequently, the complete eCSI can be reconstructed after estimating the equivalent channel gain via the least squares method.
Simulation results demonstrate that, in comparison to traditional mMIMO systems with fixed antenna radiation patterns, the proposed RmMIMO architecture offers significant throughput gains for multi-user transmission at a low channel estimation overhead.

Submitted 5 May, 2024; originally announced May 2024.

Comments: This work is being submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2404.16006 [pdf, other]

MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Authors: Kaining Ying, Fanqing Meng, Jin Wang, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Runjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao

Abstract: Large Vision-Language Models (LVLMs) show significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation. However, existing multimodal evaluation benchmarks cover a limited number of multimodal tasks testing rudimentary capabilities, falling short in tracking LVLM development. In this study, we present MMT-Bench, a comprehensive benchmark designed to assess LVLMs across massive multimodal tasks requiring expert knowledge and deliberate visual recognition, localization, reasoning, and planning. MMT-Bench comprises $31,325$ meticulously curated multi-choice visual questions from various multimodal scenarios such as vehicle driving and embodied navigation, covering $32$ core meta-tasks and $162$ subtasks in multimodal understanding. Due to its extensive task coverage, MMT-Bench enables the evaluation of LVLMs using a task map, facilitating the discovery of in- and out-of-domain tasks. Evaluation results involving $30$ LVLMs such as the proprietary GPT-4V, GeminiProVision, and open-sourced InternVL-Chat, underscore the significant challenges posed by MMT-Bench. We anticipate that MMT-Bench will inspire the community to develop next-generation multimodal foundation models aimed at achieving general-purpose multimodal intelligence.

Submitted 24 April, 2024; originally announced April 2024.

Comments: 77 pages, 41 figures

arXiv:2403.20194 [pdf, other]

ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models

Authors: Shuo Liu, Kaining Ying, Hao Zhang, Yue Yang, Yuqi Lin, Tianle Zhang, Chuanhao Li, Yu Qiao, Ping Luo, Wenqi Shao, Kaipeng Zhang

Abstract: This paper presents ConvBench, a novel multi-turn conversation evaluation benchmark tailored for Large Vision-Language Models (LVLMs). Unlike existing benchmarks that assess individual capabilities in single-turn dialogues, ConvBench adopts a three-level multimodal capability hierarchy, mimicking human cognitive processes by stacking up perception, reasoning, and creativity. Each level focuses on a distinct capability, mirroring the cognitive progression from basic perception to logical reasoning and ultimately to advanced creativity. ConvBench comprises 577 meticulously curated multi-turn conversations encompassing 215 tasks reflective of real-world demands. Automatic evaluations quantify response performance at each turn and overall conversation level. Leveraging the capability hierarchy, ConvBench enables precise attribution of conversation mistakes to specific levels. Experimental results reveal a performance gap between multi-modal models, including GPT4-V, and human performance in multi-turn conversations. Additionally, weak fine-grained perception in multi-modal models contributes to reasoning and creation failures. ConvBench serves as a catalyst for further research aimed at enhancing visual dialogues.

Submitted 25 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

arXiv:2401.00283 [pdf, other]

Near-Space Communications: the Last Piece of 6G Space-Air-Ground-Sea Integrated Network Puzzle

Authors: Hongshan Liu, Tong Qin, Zhen Gao, Tianqi Mao, Keke Ying, Ziwei Wan, Li Qiao, Rui Na, Zhongxiang Li, Chun Hu, Yikun Mei, Tuan Li, Guanghui Wen, Lei Chen, Zhonghuai Wu, Ruiqi Liu, Gaojie Chen, Shuo Wang, Dezhi Zheng

Abstract: This article presents a comprehensive study on the emerging near-space communications (NS-COM) within the context of space-air-ground-sea integrated network (SAGSIN). Specifically, we firstly explore the recent technical developments of NS-COM, followed by the discussions about motivations behind integrating NS-COM into SAGSIN. To further demonstrate the necessity of NS-COM, a comparative analysis between the NS-COM network and other counterparts in SAGSIN is conducted, covering aspects of deployment, coverage, channel characteristics and unique problems of NS-COM network. Afterwards, the technical aspects of NS-COM, including channel modeling, random access, channel estimation, array-based beam management and joint network optimization, are examined in detail. Furthermore, we explore the potential applications of NS-COM, such as structural expansion in SAGSIN communication, civil aviation communication, remote and urgent communication, weather monitoring and carbon neutrality. Finally, some promising research avenues are identified, including stratospheric satellite (StratoSat) -to-ground direct links for mobile terminals, reconfigurable multiple-input multiple-output (MIMO) and holographic MIMO, federated learning in NS-COM networks, maritime communication, electromagnetic spectrum sensing and adversarial game, integrated sensing and communications, StratoSat-based radar detection and imaging, NS-COM assisted enhanced global navigation system, NS-COM assisted intelligent unmanned system and free space optical (FSO) communication.
Overall, this paper highlights that the NS-COM plays an indispensable role in the SAGSIN puzzle, providing substantial performance and coverage enhancement to the traditional SAGSIN architecture.

Submitted 4 March, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

Comments: 28 pages, 8 figures, 2 tables

arXiv:2307.12616 [pdf, other]

CTVIS: Consistent Training for Online Video Instance Segmentation

Authors: Kaining Ying, Qing Zhong, Weian Mao, Zhenhua Wang, Hao Chen, Lin Yuanbo Wu, Yifan Liu, Chengxiang Fan, Yunzhi Zhuge, Chunhua Shen

Abstract: The discrimination of instance embeddings plays a vital role in associating instances across time for online video instance segmentation (VIS). Instance embedding learning is directly supervised by the contrastive loss computed upon the contrastive items (CIs), which are sets of anchor/positive/negative embeddings. Recent online VIS methods leverage CIs sourced from one reference frame only, which we argue is insufficient for learning highly discriminative embeddings. Intuitively, a possible strategy to enhance CIs is replicating the inference phase during training. To this end, we propose a simple yet effective training strategy, called Consistent Training for Online VIS (CTVIS), which is devoted to aligning the training and inference pipelines in terms of building CIs. Specifically, CTVIS constructs CIs by mirroring the momentum-averaged embedding and memory bank storage mechanisms used at inference, and by adding noise to the relevant embeddings. Such an extension allows a reliable comparison between embeddings of current instances and the stable representations of historical instances, thereby conferring an advantage in modeling VIS challenges such as occlusion, re-identification, and deformation. Empirically, CTVIS outstrips the SOTA VIS models by up to +5.0 points on three VIS benchmarks, including YTVIS19 (55.1% AP), YTVIS21 (50.1% AP) and OVIS (35.5% AP). Furthermore, we find that pseudo-videos transformed from images can train robust models surpassing fully-supervised ones.

Submitted 24 July, 2023; originally announced July 2023.

Comments: Accepted by ICCV 2023. The code is available at https://github.com/KainingYing/CTVIS

arXiv:2307.00464 [pdf, other]

Human-to-Human Interaction Detection

Authors: Zhenhua Wang, Kaining Ying, Jiajun Meng, Jifeng Ning

Abstract: A comprehensive understanding of human-to-human interactions of interest in video streams, such as queuing, handshaking, fighting and chasing, is of immense importance to the surveillance of public security in regions like campuses, squares and parks. Different from conventional human interaction recognition, which uses choreographed videos as inputs, neglects concurrent interactive groups, and performs detection and recognition in separate stages, we introduce a new task named human-to-human interaction detection (HID). HID is devoted to detecting subjects, recognizing person-wise actions, and grouping people according to their interactive relations, in one model. First, based on the popular AVA dataset created for action detection, we establish a new HID benchmark, termed AVA-Interaction (AVA-I), by adding annotations on interactive relations in a frame-by-frame manner. AVA-I consists of 85,254 frames and 86,338 interactive groups, and each image includes up to 4 concurrent interactive groups. Second, we present a novel baseline approach SaMFormer for HID, containing a visual feature extractor, a split stage which leverages a Transformer-based model to decode action instances and interactive groups, and a merging stage which reconstructs the relationship between instances and groups. All SaMFormer components are jointly trained in an end-to-end manner. Extensive experiments on AVA-I validate the superiority of SaMFormer over representative methods. The dataset and code will be made public to encourage more follow-up studies.

Submitted 11 August, 2023; v1 submitted 1 July, 2023; originally announced July 2023.

arXiv:2304.04484 [pdf, other]

Quasi-Synchronous Random Access for Massive MIMO-Based LEO Satellite Constellations

Authors: Keke Ying, Zhen Gao, Sheng Chen, Mingyu Zhou, Dezhi Zheng, Symeon Chatzinotas, Björn Ottersten, H. Vincent Poor

Abstract: Low earth orbit (LEO) satellite constellation-enabled communication networks are expected to be an important part of many Internet of Things (IoT) deployments due to their unique advantage of providing seamless global coverage. In this paper, we investigate the random access problem in massive multiple-input multiple-output-based LEO satellite systems, where the multi-satellite cooperative processing mechanism is considered. Specifically, at edge satellite nodes, we conceive a training sequence padded multi-carrier system to overcome the issue of imperfect synchronization, where the training sequence is utilized to detect the devices' activity and estimate their channels. Considering the inherent sparsity of terrestrial-satellite links and the sporadic traffic feature of IoT terminals, we utilize the orthogonal approximate message passing-multiple measurement vector algorithm to estimate the delay coefficients and user terminal activity. To further utilize the structure of the receive array, a two-dimensional estimation of signal parameters via rotational invariance technique is performed for enhancing channel estimation. Finally, at the central server node, we propose a majority voting scheme to enhance activity detection by aggregating backhaul information from multiple satellites. Moreover, multi-satellite cooperative linear data detection and multi-satellite cooperative Bayesian dequantization data detection are proposed to cope with perfect and quantized backhaul, respectively.
Simulation results verify the effectiveness of our proposed schemes in terms of channel estimation, activity detection, and data detection for quasi-synchronous random access in satellite systems.

Submitted 10 April, 2023; originally announced April 2023.

Comments: 38 pages, 16 figures. This paper has been accepted by IEEE JSAC SI on 3GPP Technologies: 5G-Advanced and Beyond. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2302.11385 [pdf, other]

Reconfigurable Massive MIMO: Harnessing the Power of the Electromagnetic Domain for Enhanced Information Transfer

Authors: Keke Ying, Zhen Gao, Sheng Chen, Xinyu Gao, Michail Matthaiou, Rui Zhang, Robert Schober

Abstract: The capacity of commercial massive multiple-input multiple-output (mMIMO) systems is constrained by the limited array aperture at the base station, and cannot meet the ever-increasing traffic demands of wireless networks. Given the array aperture, holographic MIMO with infinitesimal antenna spacing can maximize the capacity, but is physically unrealizable. As a promising alternative, reconfigurable mMIMO is proposed to harness the unexploited power of the electromagnetic (EM) domain for enhanced information transfer. Specifically, the reconfigurable pixel antenna technology provides each antenna with an adjustable EM radiation (EMR) pattern, introducing extra degrees of freedom for information transfer in the EM domain. In this article, we present the concept and benefits of availing the EMR domain for mMIMO transmission. Moreover, we propose a viable architecture for reconfigurable mMIMO systems, and the associated system model and downlink precoding are also discussed. In particular, a three-level precoding scheme is proposed, and simulation results verify its considerable spectral and energy efficiency advantages compared to traditional mMIMO systems. Finally, we further discuss the challenges, insights, and prospects of deploying reconfigurable mMIMO, along with the associated hardware, algorithms, and fundamental theory.

Submitted 22 February, 2023; originally announced February 2023.

Comments: 7 pages, 3 figures. This paper is accepted by IEEE Wireless Communications Magazine. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2301.12829 [pdf, other]

doi 10.1109/TSC.2023.3330175

Identifying the Key Attributes in an Unlabeled Event Log for Automated Process Discovery

Authors: Kentaroh Toyoda, Rachel Gan Kai Ying, Allan NengSheng Zhang, Tan Puay Siew

Abstract: Process mining discovers and analyzes a process model from historical event logs. The prior art methods use the key attributes of case-id, activity, and timestamp hidden in an event log as clues to discover a process model. However, a user needs to specify them manually, and this can be an exhausting task. In this paper, we propose a two-stage key attribute identification method to avoid such a manual investigation, and thus this is a step toward fully automated process discovery. One of the challenging tasks is how to avoid exhaustive computation due to combinatorial explosion. For this, we narrow down candidates for each key attribute by using supervised machine learning in the first stage and identify the best combination of the key attributes by discovering process models and evaluating them in the second stage. Our computational complexity can be reduced from $\mathcal{O}(N^3)$ to $\mathcal{O}(k^3)$ where $N$ and $k$ are the numbers of columns and candidates we keep in the first stage, respectively, and usually $k$ is much smaller than $N$. We evaluated our method with 14 open datasets and showed that our method could identify the key attributes even with $k = 2$ in about 20 seconds for many datasets.

Submitted 16 November, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

Comments: IEEE Transactions on Services Computing (Early Access version)

arXiv:2212.05578 [pdf, other]

A Formalization of Doob's Martingale Convergence Theorems in mathlib

Authors: Kexing Ying, Rémy Degenne

Abstract: We present the formalization of Doob's martingale convergence theorems in the mathlib library for the Lean theorem prover. These theorems give conditions under which (sub)martingales converge, almost everywhere or in $L^1$. In order to formalize those results, we build a definition of the conditional expectation in Banach spaces and develop the theory of stochastic processes, stopping times and martingales. As an application of the convergence theorems, we also present the formalization of Lévy's generalized Borel-Cantelli lemma. This work on martingale theory is one of the first developments of probability theory in mathlib, and it builds upon diverse parts of that library such as topology, analysis and most importantly measure theory.

Submitted 11 December, 2022; originally announced December 2022.

arXiv:2202.12251 [pdf, other]

ISDA: Position-Aware Instance Segmentation with Deformable Attention

Authors: Kaining Ying, Zhenhua Wang, Cong Bai, Pengfei Zhou

Abstract: Most instance segmentation models are not end-to-end trainable due to either the incorporation of proposal estimation (RPN) as a pre-processing step or non-maximum suppression (NMS) as a post-processing step. Here we propose a novel end-to-end instance segmentation method termed ISDA. It reshapes the task into predicting a set of object masks, which are generated via traditional convolution operation with learned position-aware kernels and features of objects. Such kernels and features are learned by leveraging a deformable attention network with multi-scale representation. Thanks to the introduced set-prediction mechanism, the proposed method is NMS-free. Empirically, ISDA outperforms Mask R-CNN (the strong baseline) by 2.6 points on MS-COCO, and achieves leading performance compared with recent models. Code will be available soon.

Submitted 23 February, 2022; originally announced February 2022.

Comments: Accepted to ICASSP 2022

arXiv:2201.02084 [pdf, other]

Active Terminal Identification, Channel Estimation, and Signal Detection for Grant-Free NOMA-OTFS in LEO Satellite Internet-of-Things

Authors: Xingyu Zhou, Keke Ying, Zhen Gao, Yongpeng Wu, Zhenyu Xiao, Symeon Chatzinotas, Jinhong Yuan, Björn Ottersten

Abstract: This paper investigates the massive connectivity of low Earth orbit (LEO) satellite-based Internet-of-Things (IoT) for seamless global coverage. We propose to integrate the grant-free non-orthogonal multiple access (GF-NOMA) paradigm with the emerging orthogonal time frequency space (OTFS) modulation to accommodate the massive IoT access, and mitigate the long round-trip latency and severe Doppler effect of terrestrial-satellite links (TSLs). On this basis, we put forward a two-stage successive active terminal identification (ATI) and channel estimation (CE) scheme as well as a low-complexity multi-user signal detection (SD) method. Specifically, at the first stage, the proposed training sequence aided OTFS (TS-OTFS) data frame structure facilitates the joint ATI and coarse CE, whereby both the traffic sparsity of terrestrial IoT terminals and the sparse channel impulse response are leveraged for enhanced performance. Moreover, based on the single Doppler shift property for each TSL and sparsity of delay-Doppler domain channel, we develop a parametric approach to further refine the CE performance. Finally, a least square based parallel time domain SD method is developed to detect the OTFS signals with relatively low complexity. Simulation results demonstrate the superiority of the proposed methods over the state-of-the-art solutions in terms of ATI, CE, and SD performance confronted with the long round-trip latency and severe Doppler effect.

Submitted 9 October, 2022; v1 submitted 6 January, 2022; originally announced January 2022.

Comments: 20 pages, 9 figures, accepted by IEEE Transactions on Wireless Communications

arXiv:2001.05763 [pdf, ps, other]

GMD-Based Hybrid Beamforming for Large Reconfigurable Intelligent Surface Assisted Millimeter-Wave Massive MIMO

Authors: Keke Ying, Zhen Gao, Shanxiang Lyu, Yongpeng Wu, Hua Wang, Mohamed-Slim Alouini

Abstract: Reconfigurable intelligent surface (RIS) is considered to be an energy-efficient approach to reshape the wireless environment for improved throughput. Its passive feature greatly reduces the energy consumption, which makes RIS a promising technique for enabling the future smart city. Existing beamforming designs for RIS mainly focus on optimizing the spectral efficiency for single carrier systems. To avoid the complicated bit allocation on different spatial domain subchannels in MIMO systems, in this paper, we propose a geometric mean decomposition-based beamforming for RIS-assisted millimeter wave (mmWave) hybrid MIMO systems so that multiple parallel data streams in the spatial domain can be considered to have the same channel gain. Specifically, by exploiting the common angular-domain sparsity of mmWave massive MIMO channels over different subcarriers, a simultaneous orthogonal matching pursuit algorithm is utilized to obtain the optimal multiple beams from an oversampling 2D-DFT codebook. Moreover, by only leveraging the angle of arrival and angle of departure associated with the line of sight (LoS) channels, we further design the phase shifters for RIS by maximizing the array gain for LoS channel. Simulation results show that the proposed scheme can achieve better BER performance than conventional approaches. Our work is an initial attempt to discuss the broadband hybrid beamforming for RIS-assisted mmWave hybrid MIMO systems.

Submitted 16 January, 2020; v1 submitted 16 January, 2020; originally announced January 2020.

Comments: 8 pages, 6 figures, accepted by IEEE Access. This is an initial attempt to discuss the broadband hybrid beamforming for RIS-assisted mmWave hybrid MIMO systems

arXiv:1907.04500 [pdf, other]

Fetal Pose Estimation in Volumetric MRI using a 3D Convolution Neural Network

Authors: Junshen Xu, Molin Zhang, Esra Abaci Turk, Larry Zhang, Ellen Grant, Kui Ying, Polina Golland, Elfar Adalsteinsson

Abstract: The performance and diagnostic utility of magnetic resonance imaging (MRI) in pregnancy is fundamentally constrained by fetal motion. Motion of the fetus, which is unpredictable and rapid on the scale of conventional imaging times, limits the set of viable acquisition techniques to single-shot imaging with severe compromises in signal-to-noise ratio and diagnostic contrast, and frequently results in unacceptable image quality. Surprisingly little is known about the characteristics of fetal motion during MRI and here we propose and demonstrate methods that exploit a growing repository of MRI observations of the gravid abdomen that are acquired at low spatial resolution but relatively high temporal resolution and over long durations (10-30 minutes). We estimate fetal pose per frame in MRI volumes of the pregnant abdomen via deep learning algorithms that detect key fetal landmarks. Evaluation of the proposed method shows that our framework achieves quantitatively an average error of 4.47 mm and 96.4% accuracy (with error less than 10 mm). Fetal pose estimation in MRI time series yields novel means of quantifying fetal movements in health and disease, and enables the learning of kinematic models that may enhance prospective mitigation of fetal motion artifacts during MRI acquisition.

Submitted 9 July, 2019; originally announced July 2019.

Comments: MICCAI 2019

arXiv:1903.03913 [pdf, ps, other]

Towards Ultra-Reliable Low-Latency Communications: Typical Scenarios, Possible Solutions, and Open Issues

Authors: Daquan Feng, Changyang She, Kai Ying, Lifeng Lai, Zhanwei Hou, Tony Q. S. Quek, Yonghui Li, Branka Vucetic

Abstract: Ultra-reliable low-latency communications (URLLC) has been considered as one of the three new application scenarios in the 5th Generation (5G) New Radio (NR), where the physical layer design aspects have been specified. With the 5G NR, we can guarantee the reliability and latency in radio access networks. However, for communication scenarios where the transmission involves both radio access and wide area core networks, the delay in radio access networks only contributes to part of the end-to-end (E2E) delay. In this paper, we outline the delay components and packet loss probabilities in typical communication scenarios of URLLC, and formulate the constraints on E2E delay and overall packet loss probability. Then, we summarize possible solutions in the physical layer, the link layer, the network layer, and the cross-layer design, respectively. Finally, we discuss the open issues in prediction and communication co-design for URLLC in wide area large scale networks.

Submitted 9 March, 2019; originally announced March 2019.

Comments: 8 pages, 7 figures. Accepted by IEEE Vehicular Technology Magazine

Journal ref: IEEE Vehicular Technology Magazine, 2019

arXiv:1408.0826 [pdf, ps, other]

doi 10.1016/j.dsp.2014.10.002

Optimization of Signal-to-Noise-and-Distortion Ratio for Dynamic Range Limited Nonlinearities

Authors: Kai Ying, Zhenhua Yu, Robert J. Baxley, G. Tong Zhou

Abstract: Many components used in signal processing and communication applications, such as power amplifiers and analog-to-digital converters, are nonlinear and have a finite dynamic range. The nonlinearity associated with these devices distorts the input, which can degrade the overall system performance. Signal-to-noise-and-distortion ratio (SNDR) is a common metric to quantify the performance degradation. One way to mitigate nonlinear distortions is by maximizing the SNDR. In this paper, we analyze how to maximize the SNDR of the nonlinearities in optical wireless communication (OWC) systems. Specifically, we answer the question of how to optimally predistort a double-sided memory-less nonlinearity that has both a "turn-on" value and a maximum "saturation" value. We show that the SNDR-maximizing response given the constraints is a double-sided limiter with a certain linear gain and a certain bias value. Both the gain and the bias are functions of the probability density function (PDF) of the input signal and the noise power. We also find a lower bound of the nonlinear system capacity, which is given by the SNDR, and an upper bound determined by the dynamic signal-to-noise ratio (DSNR). An application of the results herein is to design predistortion linearization of nonlinear devices like light emitting diodes (LEDs).

Submitted 4 August, 2014; originally announced August 2014.

Showing 1–17 of 17 results for author: Ying, K