-
Improving Zero-shot Generalization and Robustness of Multi-modal Models
Authors:
Yunhao Ge,
Jie Ren,
Andrew Gallagher,
Yuxiao Wang,
Ming-Hsuan Yang,
Hartwig Adam,
Laurent Itti,
Balaji Lakshminarayanan,
Jiaping Zhao
Abstract:
Multi-modal image-text models such as CLIP and LiT have demonstrated impressive performance on image classification benchmarks and their zero-shot generalization ability is particularly exciting. While the top-5 zero-shot accuracies of these models are very high, the top-1 accuracies are much lower (over 25% gap in some cases). We investigate the reasons for this performance gap and find that many of the failure cases are caused by ambiguity in the text prompts. First, we develop a simple and efficient zero-shot post-hoc method to identify images whose top-1 prediction is likely to be incorrect, by measuring consistency of the predictions w.r.t. multiple prompts and image transformations. We show that our procedure better predicts mistakes, outperforming the popular max logit baseline on selective prediction tasks. Next, we propose a simple and efficient way to improve accuracy on such uncertain images by making use of the WordNet hierarchy; specifically we augment the original class by incorporating its parent and children from the semantic label hierarchy, and plug the augmentation into text prompts. We conduct experiments on both CLIP and LiT models with five different ImageNet-based datasets. For CLIP, our method improves the top-1 accuracy by 17.13% on the uncertain subset and 3.6% on the entire ImageNet validation set. We also show that our method improves across ImageNet shifted datasets, four other datasets, and other model architectures such as LiT. The proposed method is hyperparameter-free, requires no additional model training and can be easily scaled to other large multi-modal architectures. Code is available at https://github.com/gyhandy/Hierarchy-CLIP.
Submitted 25 May, 2023; v1 submitted 4 December, 2022;
originally announced December 2022.
-
Trace the Accretion Geometry of H 1743--322 with Type C Quasi-periodic Oscillations in Multiple Outbursts
Authors:
Qing-Cang Shui,
Shu Zhang,
Yu-Peng P. Chen,
Shuang-Nan Zhang,
Ling-Da Kong,
Peng-Ju Wang,
Long Ji,
Hong-Xing Yin,
J. L. Qu,
L. Tao,
M. Y. Ge,
Jing-Qiang Peng,
Zhi Chang,
Jian Li,
Peng Zhang
Abstract:
We present a systematic analysis of type C quasi-periodic oscillation (QPO) observations of H 1743--322 throughout the Rossi X-ray Timing Explorer (RXTE) era. We find that, while different outbursts have significant flux differences, they show consistent positive correlations between the QPO fractional root-mean-square (rms) amplitude and non-thermal fraction of the emission, which indicate an independence of the intrinsic QPO rms on individual outburst brightness in H 1743--322. However, the dependence of the QPO rms on frequency is different between the outburst rise and decay phases, where QPO fractional rms of the decay phase is significantly lower than that of the rise phase at low frequencies. The spectral analysis also reveals different ranges of coronal temperature between the two outburst stages. A semi-quantitative analysis shows that the Lense-Thirring precession model could be responsible for the QPO rms differences, requiring a variable coronal geometric shape. However, the variable-Comptonization model could also account for the findings. The fact that the rms differences and the hysteresis traces in the hardness-intensity diagram (HID) accompany each other indicates a connection between the two phenomena. By correlating the findings with QPO phase lags and the quasi-simultaneous radio flux previously published, we propose there could be corona-jet transitions in H 1743--322 similar to those that have been recently reported in GRS 1915+105.
Submitted 29 November, 2022;
originally announced November 2022.
-
MmWave Mapping and SLAM for 5G and Beyond
Authors:
Yu Ge,
Ossi Kaltiokallio,
Hyowon Kim,
Jukka Talvitie,
Sunwoo Kim,
Lennart Svensson,
Mikko Valkama,
Henk Wymeersch
Abstract:
Device localization and radar-like mapping are at the heart of integrated sensing and communication, enabling not only new services and applications but also improved communication quality with reduced overheads. These forms of sensing are, however, susceptible to data association problems, due to the unknown relation between measurements and detected objects or targets. In this chapter, we provide an overview of the fundamental tools used to solve mapping, tracking, and simultaneous localization and mapping (SLAM) problems. We distinguish the different types of sensing problems and then focus on mapping and SLAM as running examples. Starting from the applicable models and definitions, we describe the different algorithmic approaches, with a particular focus on how to deal with data association problems. In particular, methods based on random finite set theory and Bayesian graphical models are introduced in detail. A numerical study with synthetic and experimental data is then used to compare these approaches in a variety of scenarios.
Submitted 29 November, 2022;
originally announced November 2022.
-
Monte-Carlo simulations on possible collimation effects of outflows to fan-beamed emission of ultraluminous accreting X-ray pulsars
Authors:
X. Hou,
Y. You,
L. Ji,
R. Soria,
S. N. Zhang,
M. Y. Ge,
L. Tao,
S. Zhang,
H. Feng,
M. Zhou,
Y. L. Tuo,
L. M. Song,
J. C. Wang
Abstract:
Pulsating ultraluminous X-ray sources (PULXs) are accreting pulsars with apparent X-ray luminosity exceeding $10^{39}\, \rm erg\ s^{-1}$. We perform Monte-Carlo simulations to investigate whether high collimation effect (or strong beaming effect) is dominant in the presence of accretion outflows, for the fan beam emission of the accretion column of the neutron stars in PULXs. We show that the three nearby PULXs (RX J0209.6$-$7427, Swift J0243.6+6124 and SMC X-3), namely the three musketeers here, have their main pulsed emission not strongly collimated even if strong outflows exist. This conclusion can be extended to the current sample of extragalactic PULXs, if accretion outflows are commonly produced from them. This means that the observed high luminosity of PULXs is indeed intrinsic, which can be used to infer the existence of very strong surface magnetic fields of $\sim10^{13-14}$ G, possibly multipole fields. However, if strong outflows are launched from the accretion disks in PULXs as a consequence of disk spherization by radiation pressure, regular dipole magnetic fields of $\sim10^{12}$ G may be required, comparable to that of the three musketeers, which have experienced large luminosity changes from well below their Eddington limit ($2\times10^{38}\, \rm erg\ s^{-1}$ for a NS) to super-Eddington and their maximum luminosity fills the luminosity gap between Galactic pulsars and extragalactic PULXs.
Submitted 15 November, 2022;
originally announced November 2022.
-
Reliable Extraction of Semantic Information and Rate of Innovation Estimation for Graph Signals
Authors:
Mert Kalfa,
Sadik Yagiz Yetim,
Arda Atalik,
Mehmetcan Gok,
Yiqun Ge,
Rong Li,
Wen Tong,
Tolga Mete Duman,
Orhan Arikan
Abstract:
Semantic signal processing and communications are poised to play a central part in developing the next generation of sensor devices and networks. A crucial component of a semantic system is the extraction of semantic signals from the raw input signals, which has become increasingly tractable with the recent advances in machine learning (ML) and artificial intelligence (AI) techniques. The accurate extraction of semantic signals using the aforementioned ML and AI methods, and the detection of semantic innovation for scheduling transmission and/or storage events are critical tasks for reliable semantic signal processing and communications. In this work, we propose a reliable semantic information extraction framework based on our previous work on semantic signal representations in a hierarchical graph-based structure. The proposed framework includes a time integration method to increase fidelity of ML outputs in a class-aware manner, a graph-edit-distance based metric to detect innovation events at the graph-level and filter out sporadic errors, and a Hidden Markov Model (HMM) to produce smooth and reliable graph signals. The proposed methods within the framework are demonstrated individually and collectively through simulations and case studies based on real-world computer vision examples.
Submitted 10 November, 2022;
originally announced November 2022.
-
Refined magnetic structure of VI$_3$
Authors:
Ola Kenji Forslund,
Yuqing Ge,
Hiroto Ohta,
Chennan Wang,
Mahmoud Abdel-Hafiez,
Jun Sugiyama,
Martin Månsson,
Yasmine Sassa
Abstract:
The van der Waals ferromagnet (FM), VI$_3$, was studied by muon spin relaxation ($μ^+$SR) and first-principles calculations based on density functional theory (DFT). Temperature-dependent zero-field $μ^+$SR measurements confirm the onset of long-range FM order, and the time spectra exhibit clear muon spin precession frequencies for $T<T_{\rm C}=50.03(1)$~K. The calculated internal magnetic fields at the predicted muon sites, based on the established magnetic structure from neutron diffraction, are inconsistent with the measured ones. This inconsistency is because of strong incoherent neutron scattering and absorption originating from the elements V and I. Instead, a new and more accurate magnetic structure is derived based on a combined study using $μ^+$SR and DFT. These results suggest strong contributions from orbital angular momentum, providing experimental evidence for the existence of unquenched orbital angular momentum of V$^{3+}$ in VI$_3$. Finally, an unusual form of short-range ordering is present above $T_{\rm C}$. Its temperature dependence is unlike previously reported cases in other layered compounds and its microscopic origin is discussed.
Submitted 31 October, 2022;
originally announced October 2022.
-
Analysis of V2X Sidelink Positioning in sub-6 GHz
Authors:
Yu Ge,
Maximilian Stark,
Musa Furkan Keskin,
Frank Hofmann,
Thomas Hansen,
Henk Wymeersch
Abstract:
Radio positioning is an important part of joint communication and sensing in beyond-5G communication systems. Existing works mainly focus on the mmWave bands and under-utilize the sub-6 GHz bands, even though they are promising for accurate positioning, especially when the multipath is uncomplicated, and meaningful in several important use cases. In this paper, we analyze V2X sidelink positioning and propose a new performance bound that can predict the positioning performance in the presence of severe multipath. Simulation results using ray-tracing data demonstrate the possibility of sidelink positioning, and the efficacy of the new performance bound and its relation with the complexity of the multipath.
Submitted 27 October, 2022;
originally announced October 2022.
-
A Novel Block-Wise Index Modulation Scheme for High-Mobility OTFS Communications
Authors:
Mi Qian,
Yao Ge,
Miaowen Wen,
Fei Ji
Abstract:
As a promising technique for high-mobility wireless communications, orthogonal time frequency space (OTFS) has been proved to enjoy excellent advantages with respect to traditional orthogonal frequency division multiplexing (OFDM). However, a challenging problem is to design efficient systems to further improve the performance. In this paper, we propose a novel block-wise index modulation (IM) scheme for OTFS systems, named Doppler-IM with OTFS (DoIM-OTFS), where a block of Doppler resource bins are activated simultaneously. For practical implementation, we develop a low complexity customized message passing (CMP) algorithm for our proposed DoIM-OTFS scheme. Simulation results demonstrate our proposed DoIM-OTFS system outperforms traditional OTFS system without IM. The proposed CMP algorithm can achieve desired performance and robustness to the imperfect channel state information (CSI).
Submitted 25 October, 2022;
originally announced October 2022.
-
First-principles calculations on the mechanical, electronic, magnetic and optical properties of two-dimensional Janus Cr$_2$TeX (X= P, As, Sb) monolayers
Authors:
Qiuyue Ma,
Wenhui Wan,
Yanfeng Ge,
Yingmei Li,
Yong Liu
Abstract:
Janus materials possess extraordinary physical, chemical, and mechanical properties caused by symmetry breaking. Here, the mechanical properties, electronic structure, magnetic properties, and optical properties of Janus Cr$_2$TeX (X= P, As, Sb) monolayers are systematically investigated by density functional theory. Janus Cr$_2$TeP, Cr$_2$TeAs, and Cr$_2$TeSb are intrinsic ferromagnetic (FM) half-metals with wide spin gaps and half-metallic gaps. Monte Carlo simulations based on the Heisenberg model estimate the Curie temperatures (\emph{T}$_c$) of these monolayers to be about 583, 608, and 597 K, respectively. Additionally, it is found that Cr$_2$TeX (X= P, As, Sb) monolayers still exhibit FM half-metallic properties under biaxial strain from -6% to 6%. Finally, the Cr$_2$TeP monolayer has a higher absorption coefficient than the Cr$_2$TeAs and Cr$_2$TeSb monolayers in the visible region. The results predict that Janus Cr$_2$TeX (X= P, As, Sb) monolayers with novel properties have good potential for applications in future nanodevices.
Submitted 6 May, 2023; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Darwinian Model Upgrades: Model Evolving with Selective Compatibility
Authors:
Binjie Zhang,
Shupeng Su,
Yixiao Ge,
Xuyuan Xu,
Yexin Wang,
Chun Yuan,
Mike Zheng Shou,
Ying Shan
Abstract:
The traditional model upgrading paradigm for retrieval requires recomputing all gallery embeddings before deploying the new model (dubbed as "backfilling"), which is quite expensive and time-consuming considering billions of instances in industrial applications. BCT presents the first step towards backward-compatible model upgrades to get rid of backfilling. It is workable but leaves the new model in a dilemma between new feature discriminativeness and new-to-old compatibility due to the undifferentiated compatibility constraints. In this work, we propose Darwinian Model Upgrades (DMU), which disentangle the inheritance and variation in the model evolving with selective backward compatibility and forward adaptation, respectively. The old-to-new heritable knowledge is measured by old feature discriminativeness, and the gallery features, especially those of poor quality, are evolved in a lightweight manner to become more adaptive in the new latent space. We demonstrate the superiority of DMU through comprehensive experiments on large-scale landmark retrieval and face recognition benchmarks. DMU effectively alleviates the new-to-new degradation and improves new-to-old compatibility, rendering a more proper model upgrading paradigm in large-scale retrieval systems.
Submitted 13 October, 2022;
originally announced October 2022.
-
Super resolution dual-energy cone-beam CT imaging with dual-layer flat-panel detector
Authors:
Ting Su,
Jiongtao Zhu,
Xin Zhang,
Dong Zeng,
Yuhang Tan,
Han Cui,
Hairong Zheng,
Jianhua Ma,
Dong Liang,
Yongshuai Ge
Abstract:
For medical cone-beam computed tomography (CBCT) imaging, the native receptor array of the flat-panel detector (FPD) is usually binned into a reduced matrix size. By doing so, the signal readout speed can be increased by over 4-9 times at the expense of sacrificing the spatial resolution by at least 50%-67%. Clearly, this practice poses a main bottleneck in generating high spatial resolution and high temporal resolution CBCT images at the same time. In addition, the conventional FPD also has difficulty generating dual-energy CBCT images. In this paper, we propose an innovative super resolution dual-energy CBCT imaging method, named suRi, based on dual-layer FPD (DL-FPD) to overcome these difficulties at once. With suRi, specifically, a 1D or 2D sub-pixel (half pixel in this study) shifted binning is applied to replace the conventionally aligned binning to double the spatial sampling rate during the dual-energy data acquisition. As a result, the suRi approach provides a new strategy to enable high signal readout speed and high spatial resolution CBCT imaging with FPD. Moreover, a penalized likelihood material decomposition algorithm is developed to directly reconstruct the high resolution bases from the dual-energy CBCT projections containing spatial sub-pixel shifts. Experiments based on the single-layer FPD and DL-FPD are performed with physical phantoms and biological specimens to validate this newly developed suRi method. The synthesized monochromatic CT imaging results demonstrate that suRi can significantly improve the spatial image resolution by 46.15%. We believe the developed suRi method could greatly enhance the imaging performance of DL-FPD based dual-energy CBCT systems in the future.
Submitted 17 October, 2022; v1 submitted 11 October, 2022;
originally announced October 2022.
-
CMOS based high-resolution dynamic X-ray imaging with inorganic perovskite
Authors:
Yanliang Liu,
Chaosong Gao,
Jiongtao Zhu,
Xin Zhang,
Meng Wu,
Ting Su,
Jiahong Wang,
Zonghai Sheng,
Wenjun Liu,
Tongyu Shi,
Xingchen He,
Dong Liang,
Hairong Zheng,
Xue-Feng Yu,
Xiangming Sun,
Yongshuai Ge
Abstract:
High-resolution dynamic X-ray detectors are crucial for time-resolved digital radiography (DR) imaging and fast 3D medical computed tomography (CT) imaging. Recently, perovskites have become promising alternatives to conventional semiconductor materials, e.g., Si, a-Se and CdTe, for direct X-ray detection. However, the feasibility of their combination with high-speed pixelated complementary metal-oxide-semiconductor (CMOS) arrays remains unknown. This work reports an innovative direct-conversion X-ray detector fabricated with a 300 micrometer thick inorganic perovskite film printed on a tailored CMOS array. In-house measurements demonstrate that the CsPbBr3 film has excellent optoelectric properties, with an electron mobility-lifetime product of 3.40x10$^{-5}$ cm$^2$ V$^{-1}$, and that the X-ray detector exhibits a high sensitivity of 9341 μC Gy$_{\rm air}^{-1}$ cm$^{-2}$ and a low detection limit of 588 nGy$_{\rm air}^{-1}$. This CMOS X-ray imaging detector achieves a high spatial resolution of up to 5.5 lp/mm (close to the resolution limit of 6.0 lp/mm) and a readout speed of >300 frames per second (fps). DR images of a resolution pattern phantom and an anesthetized mouse, and CT images of a biological specimen, are acquired for the first time.
Submitted 5 October, 2022;
originally announced October 2022.
-
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Authors:
Ziyun Zeng,
Yuying Ge,
Xihui Liu,
Bin Chen,
Ping Luo,
Shu-Tao Xia,
Yixiao Ge
Abstract:
Pre-training on large-scale video data has become a common recipe for learning transferable spatiotemporal representations in recent years. Despite some progress, existing methods are mostly limited to highly curated datasets (e.g., K400) and exhibit unsatisfactory out-of-the-box representations. We argue that it is due to the fact that they only capture pixel-level knowledge rather than spatiotemporal semantics, which hinders further progress in video understanding. Inspired by the great success of image-text pre-training (e.g., CLIP), we take the first step to exploit language semantics to boost transferable spatiotemporal representation learning. We introduce a new pretext task, Turning to Video for Transcript Sorting (TVTS), which sorts shuffled ASR scripts by attending to learned video representations. We do not rely on descriptive captions and learn purely from video, i.e., leveraging the natural transcribed speech knowledge to provide noisy but useful semantics over time. Our method enforces the vision model to contextualize what is happening over time so that it can re-organize the narrative transcripts, and can seamlessly apply to large-scale uncurated video data in the real world. Our method demonstrates strong out-of-the-box spatiotemporal representations on diverse benchmarks, e.g., +13.6% gains over VideoMAE on SSV2 via linear probing. The code is available at https://github.com/TencentARC/TVTS.
Submitted 12 March, 2023; v1 submitted 30 September, 2022;
originally announced September 2022.
-
Magnetic nature of wolframite MgReO$_4$
Authors:
Elisabetta Nocerino,
Ola K. Forslund,
Chennan Wang,
Hiroya Sakurai,
Frank Elson,
Rasmus Palm,
Ugne Miniotaite,
Yuqing Ge,
Yasmine Sassa,
Jun Sugiyama,
Martin Månsson
Abstract:
Rhenium oxides belonging to the family $A$ReO$_4$ where $A$ is a metal cation, exhibit interesting electronic and magnetic properties. In this study we have utilized the muon spin rotation/relaxation ($μ^+$SR) technique to study the magnetic properties of the MgReO$_4$ compound. To the best of our knowledge, this is the first investigation reported on this interesting material, that is stabilized in a wolframite crystal structure using a special high-pressure synthesis technique. Bulk magnetic studies show the onset of an antiferromagnetic (AF) long range order, or a possible singlet spin state at $T_{\rm C1}\approx90$~K, with a subtle second high-temperature transition at $T_{\rm C2}\approx280$~K. Both transitions are also confirmed by heat capacity ($C_p$) measurements. From our $μ^+$SR measurements, it is clear that the sample enters an AF order below $T_{\rm C1}=T_{\rm N}\approx85$~K. We find no evidence of magnetic signal above $T_{\rm N}$, which indicates that $T_{\rm C2}$ is likely linked to a structural transition. Further, via sensitive zero field (ZF) $μ^+$SR measurements we find evidence of a spin reorientation at $T_{\rm Cant}\approx65$~K. This points towards a transition from a collinear AF into a canted AF order at low temperature, which is proposed to be driven by competing magnetic interactions.
Submitted 24 September, 2022;
originally announced September 2022.
-
Third-order charge transport in a magnetic topological semimetal
Authors:
Ziming Zhu,
Huiying Liu,
Yongheng Ge,
Zeying Zhang,
Weikang Wu,
Cong Xiao,
Shengyuan A. Yang
Abstract:
Magnetic topological materials and their physical signatures are a focus of current research. Here, by first-principles calculations and symmetry analysis, we reveal topological semimetal states in an existing antiferromagnet ThMn2Si2. Depending on the Néel vector orientation, the topological band crossings near the Fermi level form either a double-nodal loop or two pairs of Dirac points, which are all fourfold degenerate and robust under spin-orbit coupling. These topological features produce large Berry connection polarizability, which leads to enhanced nonlinear transport effects. Particularly, we evaluate the third order current response, which dominates the transverse charge current. We show that the nonlinear response can be much more sensitive to topological phase transitions than linear response, which offers a powerful tool for characterizing magnetic topological semimetals.
Submitted 13 September, 2022;
originally announced September 2022.
-
Equivariant Filter Design for Discrete-time systems
Authors:
Yixiao Ge,
Pieter van Goor,
Robert Mahony
Abstract:
The kinematics of many nonlinear control systems, especially in the robotics field, admit a transitive Lie-group symmetry, which is useful in high performance observer design. The recently proposed equivariant filter (EqF) exploits equivariance to generate high performance filters for a wide range of real-world systems. However, existing work on the equivariant filter, and equivariance of control systems in general, is based on a continuous-time formulation. In this paper, we first present the equivariant structure of a discrete-time system. We then use this to propose a discrete-time version of the equivariant filter. A novelty of the approach is that the geometry of the symmetry group naturally appears as parallel transport in the reset step of the filter. Preliminary results for linear second order kinematics with separate bearing and range measurements indicate that the discrete EqF significantly outperforms both a discretized version of the continuous EqF and a classical discrete EKF.
Submitted 11 September, 2022;
originally announced September 2022.
-
Fan beamed X-ray emission from 1 keV to above 130 keV from the ultraluminous X-ray pulsar RX J0209.6-7427 in the Small Magellanic Cloud
Authors:
X. Hou,
M. Y. Ge,
L. Ji,
S. N. Zhang,
Y. You,
L. Tao,
S. Zhang,
R. Soria,
H. Feng,
M. Zhou,
Y. L. Tuo,
L. M. Song,
J. C. Wang
Abstract:
We present detailed timing and spectral analyses of the transient X-ray pulsar RX J0209.6$-$7427 in the Small Magellanic Cloud during its 2019 giant outburst. With a better known distance than most Galactic X-ray pulsars, its peak luminosity is determined to be $(1.11\pm0.06)\times 10^{39}\, \rm erg\ s^{-1}$; it is thus a {\it bona fide} pulsating ultraluminous X-ray source (PULX). Owing to the broad energy band of \textit{Insight}-HXMT, its pulsed X-ray emission was detected from 1 keV up to the 130$-$180 keV band, which is the highest energy emission detected from any PULX outside the Milky Way. This allows us to conclude that its main pulsed X-ray emission is from the "fan beam" of the accretion column, and its luminosity is thus intrinsic. We also estimate its magnetic field of (4.8$-$8.6)$\times10^{12}$ G or (1.7$-$2.2)$\times10^{13}$ G, from its spin evolution or transition in the accretion column structure during the outburst; we suggest that the two values of the magnetic field strength correspond to the dipole and multipole magnetic fields of the neutron star, similar to the recent discovery in the Galactic PULX Swift J0243.6+6124. Therefore, the nature of the neutron star and its ULX emission can be understood within the current theoretical frame of accreting neutron stars. This may have implications for understanding the nature of the more distant extragalactic PULXs.
Submitted 31 August, 2022;
originally announced August 2022.
-
Timing properties of the X-ray accreting pulsar 1A 0535+262 studied with Insight-HXMT
Authors:
P. J. Wang,
L. D. Kong,
S. Zhang,
V. Doroshenko,
A. Santangelo,
L. Ji,
E. S. Yorgancioglu,
Y. P. Chen,
S. N. Zhang,
J. L. Qu,
M. Y. Ge,
J. Li,
Z. Chang,
L. Tao,
J. Q. Peng,
Q. C. Shui
Abstract:
We report results on the timing analysis of the 2020 giant outburst of 1A 0535+262, using broadband data from Insight-HXMT. The analysis of the pulse profile evolution from the sub-critical to the super-critical luminosity regime is presented for the first time. We found that the observed pulse profile exhibits a complex dependence on both energy and luminosity. A dip structure at the energy of the cyclotron resonant scattering features (CRSFs) is found for the first time in the pulse fraction-energy relation of 1A 0535+262, when the outburst evolves in a luminosity range from 4.8 $\times 10^{37}$ erg s$^{-1}$ to 1.0 $\times 10^{38}$ erg s$^{-1}$. The observed structure is luminosity dependent and appears around the source critical luminosity ($\sim$ 6.7 $\times 10^{37}$ erg s$^{-1}$).
Submitted 28 August, 2022;
originally announced August 2022.
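The pulse fraction-energy relation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common peak-to-peak definition, PF = (max - min) / (max + min), applied to a folded pulse profile in each energy band; the abstract does not state which definition the authors use. The function names and the toy profiles are hypothetical and are not data from the paper.

```python
from typing import Dict, Sequence


def pulse_fraction(profile: Sequence[float]) -> float:
    """Peak-to-peak pulse fraction of one folded pulse profile.

    Assumed definition (not confirmed by the abstract):
    PF = (max - min) / (max + min).
    """
    hi, lo = max(profile), min(profile)
    if hi + lo <= 0:
        raise ValueError("profile must contain positive count rates")
    return (hi - lo) / (hi + lo)


def pulse_fraction_vs_energy(
    profiles: Dict[str, Sequence[float]]
) -> Dict[str, float]:
    """Map each energy-band label to the pulse fraction of its profile."""
    return {band: pulse_fraction(p) for band, p in profiles.items()}


if __name__ == "__main__":
    # Made-up count rates per phase bin, for illustration only.
    toy_profiles = {
        "low": [100, 150, 200, 150],
        "mid": [100, 120, 140, 120],
        "high": [100, 175, 300, 175],
    }
    for band, pf in pulse_fraction_vs_energy(toy_profiles).items():
        print(f"{band}: PF = {pf:.3f}")
```

In this toy input the middle band has a lower pulse fraction than its neighbours, which is the kind of local dip the abstract describes near the CRSF energy.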
-
Global RTK Positioning in Graphical State Space
Authors:
Yihong Ge,
Sudan Yan,
Shaolin Lü,
Cong Li
Abstract:
This paper proposes a new method for RTK post-processing. Unlike the traditional forward-backward Kalman filter, our method builds the whole system equation on a graphical state space model and solves it by factor graph optimization. The position solution provided by the forward Kalman filter is used as the linearization points of the graphical state space model. Constant variables, such as the double-difference ambiguity, exist as constants in the graphical state space model, not as time-series variables. Experimental results show that factor graph optimization with a graphical state space model is more effective than a Kalman filter with a traditional discrete-time state space model for the RTK post-processing problem.
Submitted 8 November, 2022; v1 submitted 26 August, 2022;
originally announced August 2022.
-
Doppler Exploitation in Bistatic mmWave Radio SLAM
Authors:
Yu Ge,
Ossi Kaltiokallio,
Hui Chen,
Fan Jiang,
Jukka Talvitie,
Mikko Valkama,
Lennart Svensson,
Henk Wymeersch
Abstract:
Networks in 5G and beyond utilize millimeter wave (mmWave) radio signals, large bandwidths, and large antenna arrays, which bring opportunities for jointly localizing the user equipment and mapping the propagation environment, termed simultaneous localization and mapping (SLAM). Existing approaches mainly rely on delays and angles and ignore Doppler, although it contains geometric information. In this paper, we study the benefits of exploiting Doppler in SLAM by deriving the posterior Cramér-Rao bounds (PCRBs) and formulating the extended Kalman-Poisson multi-Bernoulli sequential filtering solution with Doppler as one of the involved measurements. Both theoretical PCRB analysis and simulation results demonstrate the efficacy of utilizing Doppler.
Submitted 22 August, 2022;
originally announced August 2022.
-
Transitions and Origin of the Type-B Quasi-Periodic Oscillation in the Black Hole X-ray Binary MAXI J1348-630
Authors:
H. X. Liu,
Y. Huang,
Q. C. Bu,
W. Yu,
Z. X. Yang,
L. Zhang,
L. D. Kong,
G. C. Xiao,
J. L. Qu,
S. N. Zhang,
S. Zhang,
L. M. Song,
S. M. Jia,
X. Ma,
L. Tao,
M. Y. Ge,
Q. Z. Liu,
J. Z. Yan,
R. C. Ma,
X. Q. Ren,
D. K. Zhou,
T. M. Li,
B. Y. Wu,
Y. C. Xu,
Y. F. Du
, et al. (4 additional authors not shown)
Abstract:
Fast transitions between different types of quasi-periodic oscillations (QPOs) are generally observed in black hole transient sources (BHTs). We present a detailed study of the timing and spectral properties of the transitions of type-B QPOs in MAXI J1348-630, observed by \emph{Insight}-HXMT. The fractional rms variability-energy relationship and energy spectra reveal that type-B QPOs probably originate from jet precession. Compared to the weak power-law dominated power spectrum, when a type-B QPO is present, the corresponding energy spectrum shows an increase in the Comptonization component and the need for a xillverCp component, and a slight increase in the height of the corona when using the relxilllp model. Therefore, we suggest that a coupled inner disk-jet region is responsible for the observed type-B QPO transitions. The time scale for the appearance/disappearance of type-B QPOs is either long or short (seconds), which may indicate an instability of the disk-jet structure. For these phenomena, we hypothesize that the Bardeen-Petterson effect causes the disk-jet structure to align with the BH spin axis, or that the disappearance of small-scale jets bound by the magnetic flux tubes leads to the disappearance of type-B QPOs. We observed three events regarding the B/C transitions, one of which occurred in a short time from $\sim 9.2$ Hz (C) to $\sim 4.8$ Hz (B). The energy spectral analysis for the other two transitions shows that when a type-C QPO is present, the Comptonization flux is higher, the spectrum is harder and the inner radius of the disk changes insignificantly. We suggest that type-C QPOs probably originate from relatively stronger jets or corona.
Submitted 15 August, 2022;
originally announced August 2022.
-
Joint Channel Estimation and Data Detection for Hybrid RIS aided Millimeter Wave OTFS Systems
Authors:
Muye Li,
Shun Zhang,
Yao Ge,
Feifei Gao,
Pingzhi Fan
Abstract:
For high-mobility communication scenarios, the recently emerged orthogonal time frequency space (OTFS) modulation introduces a new delay-Doppler domain signal space, and can provide better communication performance than traditional orthogonal frequency division multiplexing systems. This article focuses on the joint channel estimation and data detection (JCEDD) for hybrid reconfigurable intelligent surface (HRIS) aided millimeter wave (mmWave) OTFS systems. Firstly, a new transmission structure is designed. Within the pilot durations of the designed structure, partial HRIS elements are alternately activated. The time domain channel model is then presented. Secondly, the received signal models for both the HRIS over the time domain and the base station over the delay-Doppler domain are studied. Thirdly, by utilizing channel parameters acquired at the HRIS, an HRIS beamforming design strategy is proposed. For the OTFS transmission, we propose a JCEDD scheme over the delay-Doppler domain. In this scheme, a message passing (MP) algorithm is designed to simultaneously obtain the equivalent channel gain and the data symbols. Meanwhile, the channel parameters, i.e., the Doppler shift, the channel sparsity, and the channel variance, are updated through an expectation-maximization (EM) algorithm. By iteratively executing the MP and EM algorithms, both the channel and the unknown data symbols can be accurately acquired. Finally, simulation results are provided to validate the effectiveness of our proposed JCEDD scheme.
Submitted 14 August, 2022;
originally announced August 2022.
-
2D-XY ferromagnetism with high transition temperature in Janus monolayer V$_{2}$XN (X = P, As)
Authors:
Wenhui Wan,
Botao Fu,
Chang Liu,
Rui Guo,
Yanfeng Ge,
Yong Liu
Abstract:
Two-dimensional (2D) XY magnets with easy magnetization planes support nontrivial topological spin textures whose dissipationless transport is highly desirable for 2D spintronic devices. Here, we predict by first-principles calculations that Janus monolayers V$_{2}$XN (X = P, As) with a square lattice are 2D-XY ferromagnets. Both the magnetocrystalline anisotropy and magnetic shape anisotropy favor an in-plane magnetization, leading to an easy magnetization $xy$-plane in Janus monolayer V$_{2}$XN. Based on Monte Carlo simulations, we observe the Berezinskii-Kosterlitz-Thouless (BKT) phase transition in monolayer V$_{2}$XN with the transition temperature $T_{\rm BKT}$ being above room temperature. In particular, monolayer V$_{2}$AsN has a magnetic anisotropy energy (MAE) of 292.0 $μ$eV per V atom and a $T_{\rm BKT}$ of 434 K, which is larger than that of monolayer V$_{2}$PN. Moreover, a tensile strain of 5\% can further raise the $T_{\rm BKT}$ of monolayer V$_{2}$XN to above 500 K. Our results indicate that Janus monolayers V$_{2}$XN (X = P, As) are candidate materials to realize high-temperature 2D-XY ferromagnetism for spintronics applications.
Submitted 22 February, 2023; v1 submitted 13 August, 2022;
originally announced August 2022.
-
Automatic Hybrid-Precision Quantization for MIMO Detectors
Authors:
Yingmeng Ge,
Zhenhao Ji,
Yongming Huang,
Zaichen Zhang,
Xiaohu You,
Chuan Zhang
Abstract:
In the design of wireless systems, quantization plays a critical role in hardware, which directly affects both area efficiency and energy efficiency. As an enabling technique, the wide application of multiple-input multiple-output (MIMO) relies heavily on efficient implementations balancing both performance and complexity. However, most of the existing detectors uniformly quantize all variables, resulting in high redundancy and low flexibility. Requiring both expertise and effort, an in-depth tailored quantization usually incurs prohibitive costs and is not considered by conventional MIMO detectors. In this paper, a general framework named the automatic hybrid-precision quantization (AHPQ) is proposed with two parts: integral quantization determined by the probability density function (PDF), and fractional quantization by deep reinforcement learning (DRL). Being automatic, AHPQ demonstrates high efficiency in figuring out good quantizations for a set of algorithmic parameters. For the approximate message passing (AMP) detector, AHPQ achieves up to $58.7\%$ lower average bitwidth than the unified quantization (UQ) one with almost no performance sacrifice. The feasibility of AHPQ has been verified by implementation with $65$ nm CMOS technology. Compared with its UQ counterpart, AHPQ exhibits $2.97\times$ higher throughput-to-area ratio (TAR) with $19.3\%$ lower energy dissipation. Moreover, by node compression and strength reduction, the AHPQ detector outperforms the state-of-the-art (SOA) in both throughput ($17.92$ Gb/s) and energy efficiency ($7.93$ pJ/b). The proposed AHPQ framework is also applicable to other digital signal processing algorithms.
Submitted 11 August, 2022;
originally announced August 2022.
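To make the notion of hybrid precision in the abstract above concrete, the following sketch shows signed fixed-point quantization in which each variable has its own integer and fractional bitwidth, plus the resulting average bitwidth. It only illustrates the number format; it does not reproduce AHPQ's PDF-based choice of integer bits or its DRL-based choice of fractional bits. All function names, variable names, and bitwidths are hypothetical.

```python
from typing import Dict, Tuple


def quantize_fixed_point(x: float, int_bits: int, frac_bits: int) -> float:
    """Round x to a signed fixed-point grid and saturate at the range limits.

    The format is assumed to be 1 sign bit + int_bits + frac_bits
    (two's-complement style range).
    """
    step = 2.0 ** -frac_bits            # resolution set by the fractional bits
    max_val = 2.0 ** int_bits - step    # largest representable value
    min_val = -(2.0 ** int_bits)        # smallest representable value
    q = round(x / step) * step          # snap to the nearest grid point
    return min(max(q, min_val), max_val)


def average_bitwidth(formats: Dict[str, Tuple[int, int]]) -> float:
    """Average total bitwidth (sign + integer + fraction) over all variables."""
    return sum(1 + i + f for i, f in formats.values()) / len(formats)


if __name__ == "__main__":
    # Hypothetical per-variable formats: (integer bits, fractional bits).
    hybrid = {"mean": (2, 4), "variance": (1, 6), "residual": (3, 2)}
    uniform = {name: (3, 6) for name in hybrid}  # one format for everything
    print("hybrid average bits :", average_bitwidth(hybrid))
    print("uniform average bits:", average_bitwidth(uniform))
    print("quantized 0.3 as 'mean':", quantize_fixed_point(0.3, *hybrid["mean"]))
```

A uniform scheme must give every variable the widest integer part and the finest fractional part that any variable needs, which is why per-variable formats can lower the average bitwidth.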
-
Triangle singularity in $B^0\to π^- K^+ X(3872)$ via the $D_{s1}\bar{D} D^*$ loop and possible precise measurement of the $X(3872)$ mass
Authors:
Mao-Jun Yan,
Ying-Hui Ge,
Xiao-Hai Liu
Abstract:
We investigate the $B^0\to π^- K^+ X(3872)$ decay via the $D_{s1}(2536)\bar{D} D^*$ rescattering diagram. The line shape of the $K^+X(3872)$ distribution curve around the $D_{s1}(2536)\bar{D}$ threshold is very sensitive to the $X(3872)$ mass because the triangle singularity (TS) can be generated from the loop. By means of this characteristic, we can determine whether the $X(3872)$ mass is below or above the $D^{\ast 0}\bar{D}^0$ threshold with high precision. The narrowness of $D_{s1}(2536)$ in the loop is one of the key reasons why the TS mechanism of measuring the $X(3872)$ mass may work. The impact of the $X(3872)$ width on the $K^+X(3872)$ line shape is also crucial in the TS mechanism. If the width is as large as 1 MeV, the proposed method of measuring the $X(3872)$ mass would be ruined.
Submitted 8 August, 2022;
originally announced August 2022.
-
AlphaVC: High-Performance and Efficient Learned Video Compression
Authors:
Yibo Shi,
Yunying Ge,
Jing Wang,
Jue Mao
Abstract:
Recently, learned video compression has drawn a lot of attention and shown rapid development with promising results. However, previous works still suffer from some critical issues and have a performance gap with traditional compression standards in terms of the widely used PSNR metric. In this paper, we propose several techniques to effectively improve the performance. First, to address the problem of accumulative error, we introduce a conditional-I-frame as the first frame in the GoP, which stabilizes the reconstructed quality and saves bit-rate. Second, to efficiently improve the accuracy of inter prediction without increasing the complexity of the decoder, we propose a pixel-to-feature motion prediction method at the encoder side that helps us obtain high-quality motion information. Third, we propose a probability-based entropy skipping method, which not only brings performance gain, but also greatly reduces the runtime of entropy coding. With these techniques, this paper proposes AlphaVC, a high-performance and efficient learned video compression scheme. To the best of our knowledge, AlphaVC is the first E2E AI codec that exceeds the latest compression standard VVC on all common test datasets for both PSNR (-28.2% BD-rate saving) and MSSSIM (-52.2% BD-rate saving), and has very fast encoding (0.001x VVC) and decoding (1.69x VVC) speeds.
Submitted 29 July, 2022;
originally announced July 2022.
-
Spinful topological phases in acoustic crystals with projective PT symmetry
Authors:
Yan Meng,
Shuxin Lin,
Bin-jie Shi,
Bin Wei,
Linyun Yang,
Bei Yan,
Zhenxiao Zhu,
Xiang Xi,
Yin Wang,
Yong Ge,
Shou-qi Yuan,
Jingming Chen,
Guigeng Liu,
Hongxiang Sun,
Hongsheng Chen,
Yihao Yang,
Zhen Gao
Abstract:
For the classification of topological phases of matter, an important consideration is whether a system is spinless or spinful, as these two classes have distinct symmetry algebras that give rise to fundamentally different topological phases. However, only recently has it been realized theoretically that in the presence of gauge symmetry, the algebraic structure of symmetries can be projectively represented, which possibly enables the switch between spinless and spinful topological phases. Here, we report the first experimental demonstration of this idea by realizing spinful topological phases in "spinless" acoustic crystals with projective space-time inversion symmetry. In particular, we realize a DIII-class one-dimensional topologically gapped phase characterized by a 2Z winding number, which features Kramers degenerate bands and Kramers pairs of topological boundary modes. Our work thus overcomes a fundamental constraint on topological phases by spin classes.
Submitted 26 July, 2022;
originally announced July 2022.
-
A Survey on Trustworthy Recommender Systems
Authors:
Yingqiang Ge,
Shuchang Liu,
Zuohui Fu,
Juntao Tan,
Zelong Li,
Shuyuan Xu,
Yunqi Li,
Yikun Xian,
Yongfeng Zhang
Abstract:
Recommender systems (RS), serving at the forefront of Human-centered AI, are widely deployed in almost every corner of the web and facilitate the human decision-making process. However, despite their enormous capabilities and potential, RS may also lead to undesired effects on users, items, producers, platforms, or even society at large, such as compromised user trust due to non-transparency, unfair treatment of different consumers or producers, and privacy concerns due to extensive use of users' private data for personalization, to name a few. All of these create an urgent need for Trustworthy Recommender Systems (TRS) so as to mitigate or avoid such adverse impacts and risks. In this survey, we introduce techniques related to trustworthy recommendation, including but not limited to explainable recommendation, fairness in recommendation, privacy-aware recommendation, robustness in recommendation, and user-controllable recommendation, as well as the relationship between these different perspectives in terms of trustworthy recommendation. Through this survey, we hope to provide readers with a comprehensive view of the research area and raise the community's awareness of the importance, existing research achievements, and future research directions of trustworthy recommendation.
Submitted 21 February, 2024; v1 submitted 25 July, 2022;
originally announced July 2022.
-
Neural-Sim: Learning to Generate Training Data with NeRF
Authors:
Yunhao Ge,
Harkirat Behl,
Jiashu Xu,
Suriya Gunasekar,
Neel Joshi,
Yale Song,
Xin Wang,
Laurent Itti,
Vibhav Vineet
Abstract:
Training computer vision models usually requires collecting and labeling vast amounts of imagery under a diverse set of scene configurations and properties. This process is incredibly time-consuming, and it is challenging to ensure that the captured data distribution maps well to the target domain of an application scenario. Recently, synthetic data has emerged as a way to address both of these issues. However, existing approaches either require human experts to manually tune each scene property or use automatic methods that provide little to no control; this requires rendering large amounts of random data variations, which is slow and is often suboptimal for the target domain. We present the first fully differentiable synthetic data pipeline that uses Neural Radiance Fields (NeRFs) in a closed-loop with a target application's loss function. Our approach generates data on-demand, with no human labor, to maximize accuracy for a target task. We illustrate the effectiveness of our method on synthetic and real-world object detection tasks. We also introduce a new "YCB-in-the-Wild" dataset and benchmark that provides a test scenario for object detection with varied poses in real-world environments.
Submitted 22 July, 2022;
originally announced July 2022.
-
Delay-Doppler Reversal for OTFS System in Doubly-selective Fading Channels
Authors:
Xiangxiang Li,
Haiyan Wang,
Yao Ge,
Xiaohong Shen,
Yuanyuan Lei
Abstract:
The recently proposed orthogonal time frequency space (OTFS) modulation shows significant advantages over conventional orthogonal frequency division multiplexing (OFDM) for high-mobility wireless communications. However, a challenging problem is the development of efficient receivers for practical OTFS systems with low complexity. In this paper, we propose a novel delay-Doppler reversal (DDR) technology for OTFS systems with desired performance and low complexity. We present the DDR technology from the perspective of a two-dimensional cascaded channel model, and analyze both its computational complexity and its performance gain compared to the direct processing (DP) receiver without DDR. Simulation results demonstrate that our proposed DDR receiver outperforms traditional receivers in doubly-selective fading channels.
Submitted 22 July, 2022;
originally announced July 2022.
-
Possibility of $T_{c\bar{s}}(2900)$ as the resonance-like structure induced by threshold effects
Authors:
Ying-Hui Ge,
Xiao-Hai Liu,
Hong-Wei Ke
Abstract:
We investigate the process $B\to \bar{D}D_s π$ via several rescattering processes. It is shown that the triangle singularity (TS) peak around the $D^*K^*$ threshold generated from the $χ_{c1}K^* D^*$ loop is relatively narrow, which may simulate the resonance-like structure $T_{c\bar{s}}(2900)$ recently observed by LHCb in the $D_sπ$ spectrum. However, the TS peak around the $D_s^*ρ$ threshold generated from the $D^{**} D_s^* ρ$ loop is smoothed by the broad width of $ρ$, which itself can hardly describe the $T_{c\bar{s}}(2900)$ structure. A TS signal around the $DK$ threshold generated from the $χ_{c0}K D $ loop is also predicted.
Submitted 20 July, 2022;
originally announced July 2022.
-
Contributions of Shape, Texture, and Color in Visual Recognition
Authors:
Yunhao Ge,
Yao Xiao,
Zhi Xu,
Xingrui Wang,
Laurent Itti
Abstract:
We investigate the contributions of three important features of the human visual system (HVS), namely shape, texture, and color, to object classification. We build a humanoid vision engine (HVE) that explicitly and separately computes shape, texture, and color features from images. The resulting feature vectors are then concatenated to support the final classification. We show that HVE can summarize and rank-order the contributions of the three features to object recognition. We use human experiments to confirm that both HVE and humans predominantly use some specific features to support the classification of specific classes (e.g., texture is the dominant feature to distinguish a zebra from other quadrupeds, both for humans and HVE). With the help of HVE, given any environment (dataset), we can summarize the most important features for the whole task (task-specific; e.g., color is the most important feature overall for classification with the CUB dataset), and for each class (class-specific; e.g., shape is the most important feature to recognize boats in the iLab-20M dataset). To further demonstrate the usefulness of HVE, we use it to simulate the open-world zero-shot learning ability of humans with no attribute labeling. Finally, we show that HVE can also simulate human imagination ability with the combination of different features. We will open-source the HVE engine and corresponding datasets.
Submitted 19 July, 2022;
originally announced July 2022.
-
Forcing the Whole Video as Background: An Adversarial Learning Strategy for Weakly Temporal Action Localization
Authors:
Ziqiang Li,
Yongxin Ge,
Jiaruo Yu,
Zhongming Chen
Abstract:
With video-level labels, weakly supervised temporal action localization (WTAL) applies a localization-by-classification paradigm to detect and classify the action in untrimmed videos. Due to the characteristics of classification, class-specific background snippets are inevitably mis-activated to improve the discriminability of the classifier in WTAL. To alleviate the disturbance of background, existing methods try to enlarge the discrepancy between action and background by modeling background snippets with pseudo-snippet-level annotations, which largely rely on artificial hypotheses. Distinct from previous works, we present an adversarial learning strategy to break the limitation of mining pseudo background snippets. Concretely, the background classification loss forces the whole video to be regarded as the background by a background gradient reinforcement strategy, confusing the recognition model. Conversely, the foreground (action) loss guides the model to focus on action snippets under such conditions. As a result, competition between the two classification losses drives the model to boost its ability for action modeling. Simultaneously, a novel temporal enhancement network is designed to help the model construct temporal relations among affinity snippets based on the proposed strategy, further improving the performance of action localization. Finally, extensive experiments conducted on THUMOS14 and ActivityNet1.2 demonstrate the effectiveness of the proposed method.
Submitted 14 July, 2022;
originally announced July 2022.
-
Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space
Authors:
Wenqi Shao,
Xun Zhao,
Yixiao Ge,
Zhaoyang Zhang,
Lei Yang,
Xiaogang Wang,
Ying Shan,
Ping Luo
Abstract:
This paper addresses an important problem of ranking the pre-trained deep neural networks and screening the most transferable ones for downstream tasks. It is challenging because the ground-truth model ranking for each task can only be generated by fine-tuning the pre-trained models on the target dataset, which is brute-force and computationally expensive. Recent advanced methods proposed several lightweight transferability metrics to predict the fine-tuning results. However, these approaches only capture static representations but neglect the fine-tuning dynamics. To this end, this paper proposes a new transferability metric, called Self-challenging Fisher Discriminant Analysis (SFDA), which has many appealing benefits that existing works do not have. First, SFDA can embed the static features into a Fisher space and refine them for better separability between classes. Second, SFDA uses a self-challenging mechanism to encourage different pre-trained models to differentiate on hard examples. Third, SFDA can easily select multiple pre-trained models for the model ensemble. Extensive experiments on $33$ pre-trained models of $11$ downstream tasks show that SFDA is efficient, effective, and robust when measuring the transferability of pre-trained models. For instance, compared with the state-of-the-art method NLEEP, SFDA demonstrates an average of $59.1$\% gain while bringing $22.5$x speedup in wall-clock time. The code will be available at https://github.com/TencentARC/SFDA.
Submitted 19 July, 2022; v1 submitted 6 July, 2022;
originally announced July 2022.
-
BYHE: A Simple Framework for Boosting End-to-end Video-based Heart Rate Measurement Network
Authors:
Weiyu Sun,
Xinyu Zhang,
Ying Chen,
Yun Ge,
Chunyu Ji,
Xiaolin Huang
Abstract:
Heart rate measurement based on remote photoplethysmography (rPPG) plays an important role in health care; it estimates heart rate from facial video in a non-contact, less-constrained way. End-to-end neural networks are a main branch of rPPG-based heart rate estimation methods; their defining trait is recovering an rPPG signal containing sufficient heart rate information directly from the original facial video. However, the relevant datasets have some easily neglected problems that thwart the efficient training of end-to-end methods, such as uncertain temporal delay and indefinite envelope shape of label waves. Although many novel and powerful networks have been proposed, so far no systematic research has dug into these problems. In this paper, from the perspective of the common intrinsic rhythm and periodic self-similarity that result from cardiac activity, we propose a comprehensive methodology, Boost Your Heartbeat Estimation (BYHE), including new label representations, corresponding network adjustments and loss functions. BYHE can be easily grafted onto current end-to-end networks and boost their training efficiency. By applying our methodology, we can save tremendous time by avoiding laborious manual work, such as the label wave alignment required by previous end-to-end methods, while also improving the utilization of datasets. According to our experiments, BYHE enables a classical end-to-end network to reach performance competitive with state-of-the-art methods on the most commonly used datasets. This improvement indicates that selecting a perspicuous and efficient label representation is also a promising direction towards better remote physiological signal measurement.
Submitted 27 September, 2022; v1 submitted 4 July, 2022;
originally announced July 2022.
-
Knowledge-aware Neural Collective Matrix Factorization for Cross-domain Recommendation
Authors:
Li Zhang,
Yan Ge,
Jun Ma,
Jianmo Ni,
Haiping Lu
Abstract:
Cross-domain recommendation (CDR) can help customers find more satisfying items in different domains. Existing CDR models mainly use common users or mapping functions as bridges between domains but have very limited exploration in fully utilizing extra knowledge across domains. In this paper, we propose to incorporate the knowledge graph (KG) for CDR, which enables items in different domains to share knowledge. To this end, we first construct a new dataset AmazonKG4CDR from the Freebase KG and a subset (two domain pairs: movies-music, movie-book) of Amazon Review Data. This new dataset facilitates linking knowledge to bridge within- and cross-domain items for CDR. Then we propose a new framework, KG-aware Neural Collective Matrix Factorization (KG-NeuCMF), leveraging KG to enrich item representations. It first learns item embeddings by graph convolutional autoencoder to capture both domain-specific and domain-general knowledge from adjacent and higher-order neighbours in the KG. Then, we maximize the mutual information between item embeddings learned from the KG and user-item matrix to establish cross-domain relationships for better CDR. Finally, we conduct extensive experiments on the newly constructed dataset and demonstrate that our model significantly outperforms the best-performing baselines.
Submitted 27 June, 2022;
originally announced June 2022.
-
DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection
Authors:
Yunhao Ge,
Jiashu Xu,
Brian Nlong Zhao,
Neel Joshi,
Laurent Itti,
Vibhav Vineet
Abstract:
We propose a new paradigm to automatically generate training data with accurate labels at scale using text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object mask generation and background (context) image generation. For foreground object mask generation, we use a simple textual template with the object class name as input to DALL-E to generate a diverse set of foreground images. A foreground-background segmentation algorithm is then used to generate foreground object masks. Next, in order to generate context images, a language description of the context is first generated by applying an image captioning method to a small set of images representing the context. These language descriptions are then used to generate diverse sets of context images using the DALL-E framework. These are then composited with the object masks generated in the first step to provide an augmented training set for a classifier. We demonstrate the advantages of our approach on four object detection datasets, including the Pascal VOC and COCO object detection tasks. Furthermore, we also highlight the compositional nature of our data generation approach in out-of-distribution and zero-shot data generation scenarios.
Submitted 21 December, 2022; v1 submitted 20 June, 2022;
originally announced June 2022.
-
Boosting Factorization Machines via Saliency-Guided Mixup
Authors:
Chenwang Wu,
Defu Lian,
Yong Ge,
Min Zhou,
Enhong Chen,
Dacheng Tao
Abstract:
Factorization machines (FMs) are widely used in recommender systems due to their adaptability and ability to learn from sparse data. However, for the ubiquitous non-interactive features in sparse data, existing FMs can only estimate the parameters corresponding to these features via the inner product of their embeddings. As a result, they cannot learn the direct interactions of these features, which limits the model's expressive power. To this end, we first present MixFM, inspired by Mixup, to generate auxiliary training data to boost FMs. Unlike existing augmentation strategies that require labor costs and expertise to collect additional information such as position and fields, the extra data are generated by MixFM solely through convex combinations of the raw samples, without any professional knowledge support. More importantly, if the parent samples to be mixed have non-interactive features, MixFM will establish their direct interactions. Second, considering that MixFM may generate redundant or even detrimental instances, we further put forward a novel Factorization Machine powered by Saliency-guided Mixup (denoted as SMFM). Guided by the customized saliency, SMFM can generate more informative neighbor data. Through theoretical analysis, we prove that the proposed methods minimize the upper bound of the generalization error, which has a beneficial effect on enhancing FMs. Significantly, we give the first generalization bound of FM, implying that generalization requires more data and a smaller embedding size under sufficient representation capability. Finally, extensive experiments on five datasets confirm that our approaches are superior to baselines. In addition, the results show that "poisoning" mixed data is likewise beneficial to the FM variants.
Submitted 17 June, 2022;
originally announced June 2022.
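The convex combination mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' MixFM implementation: the function name `mixup`, the toy feature vectors, and the fixed mixing weight `lam` are all assumptions made for illustration (the original Mixup method samples the weight from a Beta distribution).

```python
def mixup(x_a, x_b, lam):
    """Return the convex combination lam * x_a + (1 - lam) * x_b.

    Hypothetical helper for illustration only; it is not taken from
    the MixFM code base.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(x_a) != len(x_b):
        raise ValueError("feature vectors must have the same length")
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


# Two sparse samples: feature 0 is active only in x_a, feature 1 only in x_b,
# so these two features never co-occur in the raw data.
x_a = [1.0, 0.0, 0.0, 1.0]
x_b = [0.0, 1.0, 0.0, 1.0]

# Assumed fixed weight for the example.
x_mix = mixup(x_a, x_b, lam=0.5)
print(x_mix)
```

In the mixed sample both feature 0 and feature 1 are non-zero, which is the sense in which mixing can expose a model to feature pairs that never co-occur in the raw samples.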
-
READ: Aggregating Reconstruction Error into Out-of-distribution Detection
Authors:
Wenyu Jiang,
Yuxin Ge,
Hao Cheng,
Mingcai Chen,
Shuai Feng,
Chongjun Wang
Abstract:
Detecting out-of-distribution (OOD) samples is crucial to the safe deployment of a classifier in the real world. However, deep neural networks are known to be overconfident for abnormal data. Existing works directly design score function by mining the inconsistency from classifier for in-distribution (ID) and OOD. In this paper, we further complement this inconsistency with reconstruction error, based on the assumption that an autoencoder trained on ID data can not reconstruct OOD as well as ID. We propose a novel method, READ (Reconstruction Error Aggregated Detector), to unify inconsistencies from classifier and autoencoder. Specifically, the reconstruction error of raw pixels is transformed to latent space of classifier. We show that the transformed reconstruction error bridges the semantic gap and inherits detection performance from the original. Moreover, we propose an adjustment strategy to alleviate the overconfidence problem of autoencoder according to a fine-grained characterization of OOD data. Under two scenarios of pre-training and retraining, we respectively present two variants of our method, namely READ-MD (Mahalanobis Distance) only based on pre-trained classifier and READ-ED (Euclidean Distance) which retrains the classifier. Our methods do not require access to test time OOD data for fine-tuning hyperparameters. Finally, we demonstrate the effectiveness of the proposed methods through extensive comparisons with state-of-the-art OOD detection algorithms. On a CIFAR-10 pre-trained WideResNet, our method reduces the average FPR@95TPR by up to 9.8% compared with previous state-of-the-art.
Submitted 5 January, 2023; v1 submitted 15 June, 2022;
originally announced June 2022.
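The FPR@95TPR figure quoted in the abstract above is a standard OOD detection metric: the fraction of OOD samples wrongly accepted as in-distribution at the threshold where 95% of ID samples are accepted. A minimal sketch of how it can be computed follows. The function name, the toy scores, and the convention that a higher score means "more in-distribution" are assumptions; distance-based detectors would need their scores negated first.

```python
import math


def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """False positive rate on OOD data at a fixed true positive rate on ID data.

    Assumption: higher score = more in-distribution.
    """
    ranked = sorted(id_scores, reverse=True)
    # Number of ID samples that must be accepted to reach the target TPR.
    # The small epsilon guards against floating-point round-up in tpr * n.
    k = max(1, math.ceil(tpr * len(ranked) - 1e-9))
    threshold = ranked[k - 1]
    # OOD samples scoring at or above the threshold are false positives.
    false_pos = sum(1 for s in ood_scores if s >= threshold)
    return false_pos / len(ood_scores)


# Toy scores, made up for the example.
id_scores = list(range(1, 21))      # 20 ID samples with scores 1..20
ood_scores = [0, 1, 2, 3, 5]        # 5 OOD samples

fpr = fpr_at_tpr(id_scores, ood_scores)
print(f"FPR@95TPR = {fpr:.2f}")
```

Lower values are better: a detector that separates ID from OOD perfectly would score 0 on this metric.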
-
An Extractive-and-Abstractive Framework for Source Code Summarization
Authors:
Weisong Sun,
Chunrong Fang,
Yuchen Chen,
Quanjun Zhang,
Guanhong Tao,
Tingxu Han,
Yifei Ge,
Yudu You,
Bin Luo
Abstract:
(Source) Code summarization aims to automatically generate summaries/comments for a given code snippet in the form of natural language. Such summaries play a key role in helping developers understand and maintain source code. Existing code summarization techniques can be categorized into extractive methods and abstractive methods. The extractive methods extract a subset of important statements and keywords from the code snippet using retrieval techniques, and generate a summary that preserves factual details in important statements and keywords. However, such a subset may miss identifier or entity naming, and consequently, the naturalness of generated summary is usually poor. The abstractive methods can generate human-written-like summaries leveraging encoder-decoder models from the neural machine translation domain. The generated summaries however often miss important factual details.
To generate human-written-like summaries with preserved factual details, we propose a novel extractive-and-abstractive framework. The extractive module in the framework performs a task of extractive code summarization, which takes in the code snippet and predicts important statements containing key factual details. The abstractive module in the framework performs a task of abstractive code summarization, which takes in the entire code snippet and important statements in parallel and generates a succinct and human-written-like natural language summary. We evaluate the effectiveness of our technique, called EACS, by conducting extensive experiments on three datasets involving six programming languages. Experimental results show that EACS significantly outperforms state-of-the-art techniques in terms of all three widely used metrics, including BLEU, METEOR, and ROUGE-L.
Submitted 4 November, 2023; v1 submitted 14 June, 2022;
originally announced June 2022.
-
Invariant Structure Learning for Better Generalization and Causal Explainability
Authors:
Yunhao Ge,
Sercan Ö. Arik,
Jinsung Yoon,
Ao Xu,
Laurent Itti,
Tomas Pfister
Abstract:
Learning the causal structure behind data is invaluable for improving generalization and obtaining high-quality explanations. We propose a novel framework, Invariant Structure Learning (ISL), that is designed to improve causal structure discovery by utilizing generalization as an indication. ISL splits the data into different environments, and learns a structure that is invariant to the target across different environments by imposing a consistency constraint. An aggregation mechanism then selects the optimal classifier based on a graph structure that reflects the causal mechanisms in the data more accurately compared to the structures learnt from individual environments. Furthermore, we extend ISL to a self-supervised learning setting where accurate causal structure discovery does not rely on any labels. This self-supervised ISL utilizes invariant causality proposals by iteratively setting different nodes as targets. On synthetic and real-world datasets, we demonstrate that ISL accurately discovers the causal structure, outperforms alternative methods, and yields superior generalization for datasets with significant distribution shifts.
Submitted 13 June, 2022;
originally announced June 2022.
-
The peculiar spectral evolution of the new X-ray transient MAXI J0637-430
Authors:
R. C. Ma,
R. Soria,
L. Tao,
W. Zhang,
J. L. Qu,
S. N. Zhang,
L. Zhang,
E. L. Qiao,
S. J. Zhao,
M. Y. Ge,
X. B. Li,
Y. Huang,
L. M. Song,
S. Zhang,
Q. C. Bu,
Y. N. Wang,
X. Ma,
S. M. Jia
Abstract:
We studied the transient Galactic black hole candidate MAXI J0637-430 with data from Insight-HXMT, Swift and XMM-Newton. The broad-band X-ray observations from Insight-HXMT help us constrain the power-law component. MAXI J0637-430 is located at unusually high Galactic latitude; if it belongs to the Galactic thick disk, we suggest a most likely distance <7 kpc. Compared with other black hole transients, MAXI J0637-430 is also unusual for other reasons: a fast transition to the thermal dominant state at the start of the outburst; a low peak temperature and luminosity (we estimate them at ~ 0.7 keV and <0.1 times Eddington, respectively); a short decline timescale; a low soft-to-hard transition luminosity (<0.01 times Eddington). We argue that such properties are consistent with a small binary separation, short binary period (P ~ 2 hr), and low-mass donor star (M2 ~ 0.2 M_sun). Moreover, spectral modelling shows that a single disk-blackbody component is not a good fit to the thermal emission. Soft spectral residuals, and deviations from the standard L_disk ~ T_in^4 relation, suggest the need for a second thermal component. We propose and discuss various scenarios for such a component, in addition to those presented in previous studies of this source. One example is a gap in the accretion disk between a hotter inner ring near the innermost stable orbit and a cooler outer disk. Another possibility is that the second thermal component is the thermal plasma emission from an ionized outflow.
Submitted 7 June, 2022;
originally announced June 2022.
-
Enhanced thermally-activated skyrmion diffusion with tunable effective gyrotropic force
Authors:
Takaaki Dohi,
Markus Weißenhofer,
Nico Kerber,
Fabian Kammerbauer,
Yuqing Ge,
Klaus Raab,
Jakub Zàzvorka,
Maria-Andromachi Syskaki,
Aga Shahee,
Moritz Ruhwedel,
Tobias Böttcher,
Philipp Pirro,
Gerhard Jakob,
Ulrich Nowak,
Mathias Kläui
Abstract:
Magnetic skyrmions, topologically-stabilized spin textures that emerge in magnetic systems, have garnered considerable interest due to a variety of electromagnetic responses that are governed by the topology. The topology that creates a microscopic gyrotropic force also causes detrimental effects, such as the skyrmion Hall effect, which is a well-studied phenomenon highlighting the influence of topology on the deterministic dynamics and drift motion. Furthermore, the gyrotropic force is anticipated to have a substantial impact on stochastic diffusive motion; however, the predicted repercussions have yet to be demonstrated, even qualitatively. Here we demonstrate enhanced thermally-activated diffusive motion of skyrmions in a specifically designed synthetic antiferromagnet. Suppressing the effective gyrotropic force by tuning the angular momentum compensation leads to a more than 10 times enhanced diffusion coefficient compared to that of ferromagnetic skyrmions. Consequently, our findings not only demonstrate the gyro-force dependence of the diffusion coefficient but also enable ultimately energy-efficient unconventional stochastic computing.
Submitted 11 September, 2023; v1 submitted 1 June, 2022;
originally announced June 2022.
-
Point-Teaching: Weakly Semi-Supervised Object Detection with Point Annotations
Authors:
Yongtao Ge,
Qiang Zhou,
Xinlong Wang,
Zhibin Wang,
Hao Li,
Chunhua Shen
Abstract:
Point annotations are considerably more time-efficient than bounding box annotations. However, how to use cheap point annotations to boost the performance of semi-supervised object detection remains largely unsolved. In this work, we present Point-Teaching, a weakly semi-supervised object detection framework to fully exploit the point annotations. Specifically, we propose a Hungarian-based point matching method to generate pseudo labels for point annotated images. We further propose multiple instance learning (MIL) approaches at the level of images and points to supervise the object detector with point annotations. Finally, we propose a simple-yet-effective data augmentation, termed point-guided copy-paste, to reduce the impact of the unmatched points. Experiments demonstrate the effectiveness of our method on a few datasets and various data regimes.
Submitted 24 October, 2022; v1 submitted 1 June, 2022;
originally announced June 2022.
-
Doppler-Enabled Single-Antenna Localization and Mapping Without Synchronization
Authors:
Hui Chen,
Fan Jiang,
Yu Ge,
Hyowon Kim,
Henk Wymeersch
Abstract:
Radio localization is a key enabler for joint communication and sensing in the fifth/sixth generation (5G/6G) communication systems. With the help of multipath components (MPCs), localization and mapping tasks can be done with a single base station (BS) and single unsynchronized user equipment (UE) if both of them are equipped with an antenna array. However, the antenna array at the UE side increases the hardware and computational cost, preventing localization functionality. In this work, we show that with Doppler estimation and MPCs, localization and mapping tasks can be performed even with a single-antenna mobile UE. Furthermore, we show that the localization and mapping performance will improve and then saturate at a certain level with an increased UE speed. Both theoretical Cramér-Rao bound analysis and simulation results show the potential of localization under mobility and the effectiveness of the proposed localization algorithm.
Submitted 30 May, 2022;
originally announced May 2022.
-
Fairness in Recommendation: Foundations, Methods and Applications
Authors:
Yunqi Li,
Hanxiong Chen,
Shuyuan Xu,
Yingqiang Ge,
Juntao Tan,
Shuchang Liu,
Yongfeng Zhang
Abstract:
As one of the most pervasive applications of machine learning, recommender systems play an important role in assisting human decision making. The satisfaction of users and the interests of platforms are closely related to the quality of the generated recommendation results. However, as highly data-driven systems, recommender systems can be affected by data or algorithmic bias and thus generate unfair results, which could weaken trust in the systems. As a result, it is crucial to address the potential unfairness problems in recommendation settings. Recently, there has been growing attention to fairness considerations in recommender systems, with more and more literature on approaches to promote fairness in recommendation. However, the studies are rather fragmented and lack a systematic organization, making the domain difficult for new researchers to enter. This motivates us to provide a systematic survey of existing works on fairness in recommendation. This survey focuses on the foundations of the fairness in recommendation literature. It first presents a brief introduction to fairness in basic machine learning tasks such as classification and ranking in order to provide a general overview of fairness research, and introduces the more complex situations and challenges that need to be considered when studying fairness in recommender systems. After that, the survey introduces fairness in recommendation with a focus on the taxonomies of current fairness definitions, the typical techniques for improving fairness, and the datasets for fairness studies in recommendation. The survey also discusses the challenges and opportunities in fairness research, with the hope of promoting the fair recommendation research area and beyond.
Submitted 26 July, 2023; v1 submitted 26 May, 2022;
originally announced May 2022.
-
What should I Ask: A Knowledge-driven Approach for Follow-up Questions Generation in Conversational Surveys
Authors:
Yubin Ge,
Ziang Xiao,
Jana Diesner,
Heng Ji,
Karrie Karahalios,
Hari Sundaram
Abstract:
Generating follow-up questions on the fly could significantly improve conversational survey quality and user experiences by enabling a more dynamic and personalized survey structure. In this paper, we proposed a novel task for knowledge-driven follow-up question generation in conversational surveys. We constructed a new human-annotated dataset of human-written follow-up questions with dialogue history and labeled knowledge in the context of conversational surveys. Along with the dataset, we designed and validated a set of reference-free Gricean-inspired evaluation metrics to systematically evaluate the quality of generated follow-up questions. We then propose a two-staged knowledge-driven model for the task, which generates informative and coherent follow-up questions by using knowledge to steer the generation process. The experiments demonstrate that compared to GPT-based baseline models, our two-staged model generates more informative, coherent, and clear follow-up questions.
Submitted 13 October, 2023; v1 submitted 22 May, 2022;
originally announced May 2022.
-
Masked Image Modeling with Denoising Contrast
Authors:
Kun Yi,
Yixiao Ge,
Xiaotong Li,
Shusheng Yang,
Dian Li,
Jianping Wu,
Ying Shan,
Xiaohu Qie
Abstract:
Since the development of self-supervised visual representation learning from contrastive learning to masked image modeling (MIM), there is no significant difference in essence, that is, how to design proper pretext tasks for vision dictionary look-up. MIM recently dominates this line of research with state-of-the-art performance on vision Transformers (ViTs), where the core is to enhance the patch-level visual context capturing of the network via denoising auto-encoding mechanism. Rather than tailoring image tokenizers with extra training stages as in previous works, we unleash the great potential of contrastive learning on denoising auto-encoding and introduce a pure MIM method, ConMIM, to produce simple intra-image inter-patch contrastive constraints as the sole learning objectives for masked patch prediction. We further strengthen the denoising mechanism with asymmetric designs, including image perturbations and model progress rates, to improve the network pre-training. ConMIM-pretrained models with various scales achieve competitive results on downstream image classification, semantic segmentation, object detection, and instance segmentation tasks, e.g., on ImageNet-1K classification, we achieve 83.9% top-1 accuracy with ViT-Small and 85.3% with ViT-Base without extra data for pre-training.
Submitted 29 January, 2023; v1 submitted 19 May, 2022;
originally announced May 2022.
-
New pulse profile variability associated with a glitch of PSR J0738-4042
Authors:
S. Q. Zhou,
E. Gügercinoğlu,
J. P. Yuan,
M. Y. Ge,
C. Yu,
C. M. Zhang,
J. Zhang,
Z. W. Feng,
C. Q. Ye
Abstract:
The close correlation observed between emission state and spin-down rate change of pulsars has many implications both for the magnetospheric physics and the neutron star interior. The middle-aged pulsar PSR J0738$-$4042, which had been observed to display variations in the pulse profile associated with its spin-down rate change due to external effects, is a remarkable example. In this study, based on the 12.5-yr combined public timing data from UTMOST and Parkes, we have detected a new emission-rotation correlation in PSR J0738$-$4042 concurrent with a glitch. A glitch that occurred at MJD 57359(5) (December 3, 2015) with $\Delta\nu/\nu \sim 0.36(4)\times 10^{-9}$ is the first glitch event observed in this pulsar and is probably the underlying cause of the emission-rotation correlation. Unlike the usual post-glitch behaviours, the braking torque on the pulsar has continued to increase over 1380 d, corresponding to a significant decrease in $\ddot{\nu}$. As for changes in the pulse profile after the glitch, the relative amplitude of the leading component weakens drastically, while the middle component becomes stronger. A combined model of crustquake induced platelet movement and vortex creep response is invoked to account for this rare correlation. In this scenario, magnetospheric state-change is naturally linked to the pulsar-intrinsic processes that give rise to a glitch.
Submitted 22 November, 2022; v1 submitted 17 May, 2022;
originally announced May 2022.
-
de Haas-van Alphen effect and the first-principles study of the possible topological stannide Cu$_3$Sn
Authors:
Chengxu Liu,
Bin Li,
Yongheng Ge,
Wen-He Jiao,
Chuanying Xi,
Yi Liu,
Chunqiang Xu,
Qi Lu,
Yunlong Li,
Hang-Qiang Qiu,
Qin-Qing Zhu,
Zhi Ren,
Ziming Zhu,
Dong Qian,
Xianglin Ke,
Xiaofeng Xu
Abstract:
The quest for quantum materials with diverse symmetry-protected topological states has been the focus of recent research interest, primarily due to their fascinating physical properties and the potential technological utility. In this work, we report on the magnetotransport, de Haas-van Alphen (dHvA) oscillations, and the first-principles calculations of the stannide Cu$_3$Sn that is isostructural with the recently reported topological semimetal Ag$_3$Sn. The magnetoresistance was found to vary quasi-linearly in field. Clear dHvA oscillations were observed under a field as low as 1 Tesla at 2 K, with three major oscillation frequencies $F_\alpha$=8.74 T, $F_\beta$=150.19 T and $F_\gamma$=229.66 T and extremely small effective masses. The analysis of dHvA quantum oscillations revealed a possible nonzero Berry phase, suggestive of the nontrivial band topology. The corroborating evidence for the nontrivial electronic topology also comes from the first-principles calculations which yield a nonzero $\mathbb{Z}_2$ topological index. These results collectively suggest that Cu$_3$Sn, in analogy to its homologue Ag$_3$Sn, may be another intermetallic stannide hosting topological Dirac fermions.
Submitted 9 May, 2022;
originally announced May 2022.
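For readers unfamiliar with dHvA analysis, the oscillation frequencies quoted in the abstract above map onto extremal Fermi-surface cross-sections through the Onsager relation F = (ħ / 2πe) A. The following is a minimal sketch of that conversion using the three reported frequencies. The function names are hypothetical, and the Fermi wave vector estimate assumes a circular cross-section, which the abstract does not state.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def extremal_area(freq_tesla):
    """Extremal Fermi-surface cross-section in m^-2 from the Onsager relation."""
    return 2.0 * math.pi * E_CHARGE * freq_tesla / HBAR


def fermi_wavevector(freq_tesla):
    """Fermi wave vector in m^-1, assuming a circular cross-section (assumption)."""
    return math.sqrt(extremal_area(freq_tesla) / math.pi)


# Frequencies as reported in the abstract, in tesla.
frequencies = {"alpha": 8.74, "beta": 150.19, "gamma": 229.66}

for label, freq in frequencies.items():
    area_inv_ang2 = extremal_area(freq) * 1e-20  # convert m^-2 to angstrom^-2
    k_f_inv_ang = fermi_wavevector(freq) * 1e-10  # convert m^-1 to angstrom^-1
    print(f"F_{label} = {freq} T: A = {area_inv_ang2:.3e} A^-2, k_F = {k_f_inv_ang:.4f} A^-1")
```

The small area obtained for the lowest frequency corresponds to a very small Fermi pocket; the sketch makes no claim about effective masses or the Berry phase, which require fitting the oscillation amplitudes and phases.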