-
Geometry-Guided Ray Augmentation for Neural Surface Reconstruction with Sparse Views
Authors:
Jiawei Yao,
Chen Wang,
Tong Wu,
Chuming Li
Abstract:
In this paper, we propose a novel method for 3D scene and object reconstruction from sparse multi-view images. Different from previous methods that leverage extra information such as depth or generalizable features across scenes, our approach leverages the scene properties embedded in the multi-view inputs to create precise pseudo-labels for optimization without any prior training. Specifically, we introduce a geometry-guided approach that improves surface reconstruction accuracy from sparse views by leveraging spherical harmonics to predict the novel radiance while holistically considering all color observations for a point in the scene. Also, our pipeline exploits proxy geometry and correctly handles the occlusion in generating the pseudo-labels of radiance, which previous image-warping methods fail to avoid. Our method, dubbed Ray Augmentation (RayAug), achieves superior results on DTU and Blender datasets without requiring prior training, demonstrating its effectiveness in addressing the problem of sparse view reconstruction. Our pipeline is flexible and can be integrated into other implicit neural reconstruction methods for sparse views.
Submitted 7 March, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging
Authors:
Qianou Ma,
Hua Shen,
Kenneth Koedinger,
Tongshuang Wu
Abstract:
Large Language Models (LLMs) now excel at generative skills and can create content at impeccable speeds. However, they are imperfect and still make various mistakes. In a Computer Science education context, as these models are widely recognized as "AI pair programmers," it becomes increasingly important to train students on evaluating and debugging the LLM-generated code. In this work, we introduce HypoCompass, a novel system to facilitate deliberate practice on debugging, where human novices play the role of Teaching Assistants and help LLM-powered teachable agents debug code. We enable effective task delegation between students and LLMs in this learning-by-teaching environment: students focus on hypothesizing the cause of code errors, while adjacent skills like code completion are offloaded to LLM-agents. Our evaluations demonstrate that HypoCompass generates high-quality training materials (e.g., bugs and fixes), outperforming human counterparts fourfold in efficiency, and significantly improves student performance on debugging by 12% in the pre-to-post test.
Submitted 1 April, 2024; v1 submitted 8 October, 2023;
originally announced October 2023.
-
Reach-avoid Analysis for Sampled-data Systems with Measurement Uncertainties
Authors:
Taoran Wu,
Dejin Ren,
Shuyuan Zhang,
Lei Wang,
Bai Xue
Abstract:
Digital control has become increasingly prevalent in modern systems, making continuous-time plants controlled by discrete-time (digital) controllers ubiquitous and crucial across industries, including aerospace, automotive, and manufacturing. This paper focuses on investigating the reach-avoid problem in such systems, where the objective is to reach a goal set while avoiding unsafe states, especially in the presence of state measurement uncertainties. We propose an approach that builds upon the concept of exponential control guidance barrier functions, originally used for synthesizing continuous-time feedback controllers. We introduce a sufficient condition that, if met by a given continuous-time feedback controller, ensures the safe guidance of the system into the goal set in its sampled-data implementation, despite state measurement uncertainties. The event of reaching the goal set is determined based on state measurements obtained at the sampling time instants. Numerical examples are provided to demonstrate the validity of our theoretical developments, showcasing successful implementation in solving the reach-avoid problem in sampled-data systems with state measurement uncertainties.
Submitted 7 October, 2023;
originally announced October 2023.
-
1st Place Solution of Egocentric 3D Hand Pose Estimation Challenge 2023 Technical Report: A Concise Pipeline for Egocentric Hand Pose Reconstruction
Authors:
Zhishan Zhou,
Zhi Lv,
Shihao Zhou,
Minqiang Zou,
Tong Wu,
Mochen Yu,
Yao Tang,
Jiajun Liang
Abstract:
This report introduces our work for the Egocentric 3D Hand Pose Estimation workshop. Using AssemblyHands, this challenge focuses on egocentric 3D hand pose estimation from a single-view image. In the competition, we adopt ViT-based backbones and a simple regressor for 3D keypoint prediction, which provide strong model baselines. We noticed that hand-object occlusions and self-occlusions lead to performance degradation, and thus proposed a non-model method to merge multi-view results in the post-processing stage. Moreover, we utilized test-time augmentation and model ensembling for further improvement. We also found that public datasets and rational preprocessing are beneficial. Our method achieved 12.21mm MPJPE on the test dataset, achieving first place in the Egocentric 3D Hand Pose Estimation challenge.
Submitted 9 October, 2023; v1 submitted 7 October, 2023;
originally announced October 2023.
-
From Nuisance to News Sense: Augmenting the News with Cross-Document Evidence and Context
Authors:
Jeremiah Milbauer,
Ziqi Ding,
Zhijin Wu,
Tongshuang Wu
Abstract:
Reading and understanding the stories in the news is increasingly difficult. Reporting on stories evolves rapidly, politicized news venues offer different perspectives (and sometimes different facts), and misinformation is rampant. However, existing solutions merely aggregate an overwhelming amount of information from heterogeneous sources, such as different news outlets, social media, and news bias rating agencies. We present NEWSSENSE, a novel sensemaking tool and reading interface designed to collect and integrate information from multiple news articles on a central topic, using a form of reference-free fact verification. NEWSSENSE augments a central, grounding article of the user's choice by linking it to related articles from different sources, providing inline highlights on how specific claims in the chosen article are either supported or contradicted by information from other articles. Using NEWSSENSE, users can seamlessly digest and cross-check multiple information sources without disturbing their natural reading flow. Our pilot study shows that NEWSSENSE has the potential to help users identify key information, verify the credibility of news articles, and explore different perspectives.
Submitted 6 October, 2023;
originally announced October 2023.
-
Hermes: Unlocking Security Analysis of Cellular Network Protocols by Synthesizing Finite State Machines from Natural Language Specifications
Authors:
Abdullah Al Ishtiaq,
Sarkar Snigdha Sarathi Das,
Syed Md Mukit Rashid,
Ali Ranjbar,
Kai Tu,
Tianwei Wu,
Zhezheng Song,
Weixuan Wang,
Mujtahid Akon,
Rui Zhang,
Syed Rafiul Hussain
Abstract:
In this paper, we present Hermes, an end-to-end framework to automatically generate formal representations from natural language cellular specifications. We first develop a neural constituency parser, NEUTREX, to process transition-relevant texts and extract transition components (i.e., states, conditions, and actions). We also design a domain-specific language to translate these transition components to logical formulas by leveraging dependency parse trees. Finally, we compile these logical formulas to generate transitions and create the formal model as finite state machines. To demonstrate the effectiveness of Hermes, we evaluate it on 4G NAS, 5G NAS, and 5G RRC specifications and obtain an overall accuracy of 81-87%, which is a substantial improvement over the state-of-the-art. Our security analysis of the extracted models uncovers 3 new vulnerabilities and identifies 19 previous attacks in 4G and 5G specifications, and 7 deviations in commercial 4G basebands.
Submitted 11 October, 2023; v1 submitted 6 October, 2023;
originally announced October 2023.
-
WLST: Weak Labels Guided Self-training for Weakly-supervised Domain Adaptation on 3D Object Detection
Authors:
Tsung-Lin Tsou,
Tsung-Han Wu,
Winston H. Hsu
Abstract:
In the field of domain adaptation (DA) on 3D object detection, most of the work is dedicated to unsupervised domain adaptation (UDA). Yet, without any target annotations, the performance gap between the UDA approaches and the fully-supervised approach is still noticeable, which is impractical for real-world applications. On the other hand, weakly-supervised domain adaptation (WDA) is an underexplored yet practical task that requires only a little labeling effort on the target domain. To improve the DA performance in a cost-effective way, we propose a general weak labels guided self-training framework, WLST, designed for WDA on 3D object detection. By incorporating an autolabeler, which can generate 3D pseudo labels from 2D bounding boxes, into the existing self-training pipeline, our method is able to generate more robust and consistent pseudo labels that benefit the training process on the target domain. Extensive experiments demonstrate the effectiveness, robustness, and detector-agnosticism of our WLST framework. Notably, it outperforms previous state-of-the-art methods on all evaluation tasks.
Submitted 7 February, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
What do we learn from a large-scale study of pre-trained visual representations in sim and real environments?
Authors:
Sneha Silwal,
Karmesh Yadav,
Tingfan Wu,
Jay Vakil,
Arjun Majumdar,
Sergio Arnaud,
Claire Chen,
Vincent-Pierre Berges,
Dhruv Batra,
Aravind Rajeswaran,
Mrinal Kalakrishnan,
Franziska Meier,
Oleksandr Maksymets
Abstract:
We present a large empirical investigation on the use of pre-trained visual representations (PVRs) for training downstream policies that execute real-world tasks. Our study spans five different PVRs, two different policy-learning paradigms (imitation and reinforcement learning), and three different robots for 5 distinct manipulation and indoor navigation tasks. From this effort, we can arrive at three insights: 1) the performance trends of PVRs in the simulation are generally indicative of their trends in the real world, 2) the use of PVRs enables a first-of-its-kind result with indoor ImageNav (zero-shot transfer to a held-out scene in the real world), and 3) the benefits from variations in PVRs, primarily data-augmentation and fine-tuning, also transfer to the real-world performance. See project website for additional details and visuals.
Submitted 3 October, 2023;
originally announced October 2023.
-
Selenite: Scaffolding Online Sensemaking with Comprehensive Overviews Elicited from Large Language Models
Authors:
Michael Xieyang Liu,
Tongshuang Wu,
Tianying Chen,
Franklin Mingzhe Li,
Aniket Kittur,
Brad A. Myers
Abstract:
Sensemaking in unfamiliar domains can be challenging, demanding considerable user effort to compare different options with respect to various criteria. Prior research and our formative study found that people would benefit from reading an overview of an information space upfront, including the criteria others previously found useful. However, existing sensemaking tools struggle with the "cold-start" problem -- it not only requires significant input from previous users to generate and share these overviews, but such overviews may also turn out to be biased and incomplete. In this work, we introduce a novel system, Selenite, which leverages Large Language Models (LLMs) as reasoning machines and knowledge retrievers to automatically produce a comprehensive overview of options and criteria to jumpstart users' sensemaking processes. Subsequently, Selenite also adapts as people use it, helping users find, read, and navigate unfamiliar information in a systematic yet personalized manner. Through three studies, we found that Selenite produced accurate and high-quality overviews reliably, significantly accelerated users' information processing, and effectively improved their overall comprehension and sensemaking experience.
Submitted 28 January, 2024; v1 submitted 3 October, 2023;
originally announced October 2023.
-
Evolution of self-organized structure, the internal transport barrier in the ion-temperature-gradient driven gyrokinetic turbulence
Authors:
S. Wang,
Z. Wang,
T. Wu
Abstract:
Understanding the self-organization of the most promising internal transport barrier in fusion plasmas requires a long-time nonlinear gyrokinetic global simulation. The Neighboring Equilibrium Update method solves the secularity problem in a perturbative simulation and speeds up the numerical computation by more than 10 times. It is found that the internal transport barrier emerges at the magnetic axis due to an inward-propagating turbulence avalanche, and that its outward expansion is the catastrophe of the self-organized structure induced by an outward-propagating avalanche.
Submitted 16 November, 2023; v1 submitted 2 October, 2023;
originally announced October 2023.
-
OceanNet: A principled neural operator-based digital twin for regional oceans
Authors:
Ashesh Chattopadhyay,
Michael Gray,
Tianning Wu,
Anna B. Lowe,
Ruoying He
Abstract:
While data-driven approaches demonstrate great potential in atmospheric modeling and weather forecasting, ocean modeling poses distinct challenges due to complex bathymetry, land, vertical structure, and flow non-linearity. This study introduces OceanNet, a principled neural operator-based digital twin for ocean circulation. OceanNet uses a Fourier neural operator and predictor-evaluate-corrector integration scheme to mitigate autoregressive error growth and enhance stability over extended time scales. A spectral regularizer counteracts spectral bias at smaller scales. OceanNet is applied to the northwest Atlantic Ocean western boundary current (the Gulf Stream), focusing on the task of seasonal prediction for Loop Current eddies and the Gulf Stream meander. Trained using historical sea surface height (SSH) data, OceanNet demonstrates competitive forecast skill by outperforming SSH predictions by an uncoupled, state-of-the-art dynamical ocean model forecast, reducing computation by 500,000 times. These accomplishments demonstrate the potential of physics-inspired deep neural operators as cost-effective alternatives to high-resolution numerical ocean models.
Submitted 1 October, 2023;
originally announced October 2023.
-
Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment
Authors:
Tianhao Wu,
Banghua Zhu,
Ruoyu Zhang,
Zhaojin Wen,
Kannan Ramchandran,
Jiantao Jiao
Abstract:
Large Language Models (LLMs) can acquire extensive world knowledge through pre-training on large corpora. However, due to exposure to low-quality data, LLMs may exhibit harmful behavior without aligning with human values. The dominant approach for steering LLMs towards beneficial behavior involves Reinforcement Learning with Human Feedback (RLHF), with Proximal Policy Optimization (PPO) serving as the default RL optimizer. Despite its effectiveness, PPO has limitations when optimizing rewards trained from comparison-based loss. Primarily, PPO is not invariant to equivalent reward functions containing identical preference information due to the need to calibrate the reward scale. Additionally, PPO's necessity for token-wise updates introduces complexity in both function approximation and algorithm design compared to trajectory-wise optimization. This paper proposes a new framework, reinforcement learning with relative feedback, and a novel trajectory-wise policy gradient algorithm, Pairwise Proximal Policy Optimization (P3O) that operates directly on comparative rewards. We show theoretically that P3O is invariant to equivalent rewards and avoids the complexity of PPO. Empirical evaluations demonstrate that P3O outperforms PPO in the KL-Reward trade-off and can align with human preferences as well as or better than prior methods. In summary, this work introduces a simpler yet effective approach for aligning LLMs to human preferences through relative feedback.
Submitted 9 October, 2023; v1 submitted 29 September, 2023;
originally announced October 2023.
-
Reusability report: Prostate cancer stratification with diverse biologically-informed neural architectures
Authors:
Christian Pedersen,
Tiberiu Tesileanu,
Tinghui Wu,
Siavash Golkar,
Miles Cranmer,
Zijun Zhang,
Shirley Ho
Abstract:
In Elmarakeby et al., "Biologically informed deep neural network for prostate cancer discovery", a feedforward neural network with biologically informed, sparse connections (P-NET) was presented to model the state of prostate cancer. We verified the reproducibility of the study conducted by Elmarakeby et al., using both their original codebase and our own re-implementation using more up-to-date libraries. We quantified the contribution of network sparsification by Reactome biological pathways, and confirmed its importance to P-NET's superior performance. Furthermore, we explored alternative neural architectures and approaches to incorporating biological information into the networks. We experimented with three types of graph neural networks on the same training data, and investigated the clinical prediction agreement between different models. Our analyses demonstrated that deep neural networks with distinct architectures make incorrect predictions for individual patients that are persistent across different initializations of a specific neural architecture. This suggests that different neural architectures are sensitive to different aspects of the data, an important yet under-explored challenge for clinical prediction tasks.
Submitted 30 October, 2023; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Capping the positivity cone: dimension-8 Higgs operators in the SMEFT
Authors:
Qing Chen,
Ken Mimasu,
Tong Arthur Wu,
Guo-Dong Zhang,
Shuang-Yong Zhou
Abstract:
SMEFT Wilson coefficients are subject to various positivity bounds in order to be consistent with the fundamental principles of the S-matrix. Previous bounds on dimension-8 SMEFT operators have been obtained using the positivity part of UV partial wave unitarity and form a (projective) convex cone. We derive a set of linear UV unitarity conditions that go beyond positivity and are easy to implement in an optimization scheme with dispersion relations in a multi-field EFT. Using Higgs scattering as an example, we demonstrate how to obtain closed bounds in the space of the three relevant dimension-8 coefficients, making use of the UV unitarity conditions as well as so-called null constraints that arise from full crossing symmetry. Specifically, we show that they are bounded by inequalities schematically going like $C<O\left((4π)^2\right)$. We compare the newly obtained upper bounds with the traditional perturbative unitarity bounds from within the EFT, and discuss some phenomenological implications of the two-sided positivity bounds in the context of experimental probes of Vector Boson Scattering.
Submitted 9 March, 2024; v1 submitted 27 September, 2023;
originally announced September 2023.
-
Vectorial ground state solutions for a class of Hartree-Fock type systems with the double coupled feature
Authors:
Juntao Sun,
Tsung-fang Wu
Abstract:
In this paper we study the Hartree-Fock type system as follows: \begin{equation*} \left\{ \begin{array}{ll} -Δu+u+λφ_{u,v}u=|u|^{p-2}u+β|v|^{\frac{p}{2}}|u|^{\frac{p}{2}-2}u & \text{ in }\mathbb{R}^{3}, \\ -Δv+v+λφ_{u,v}v=|v|^{p-2}v+β|u|^{\frac{p}{2}}|v|^{\frac{p}{2}-2}v & \text{ in }\mathbb{R}^{3}, \end{array} \right. \end{equation*} where $φ_{u,v}(x)=\int_{\mathbb{R}^{3}}\frac{u^{2}(y)+v^{2}(y)}{|x-y|}dy$, the parameters $λ,β>0$ and $2<p<4$. Such a system is viewed as an approximation of the Coulomb system with two particles that appears in quantum mechanics, taking into account the Pauli principle. Its characteristic feature lies in the presence of the double coupled terms. When $2<p<3$, we establish the existence and multiplicity of nontrivial radial solutions, including vectorial ones, in the radial space $H_{r}$ by describing the internal relationship between the coupling constants $λ$ and $β$. When $2<p<4$, we study the existence of vectorial solutions in the non-radial space $H$ by developing a novel constraint method, together with some new analysis techniques. In particular, when $3\leq p<4$, a vectorial ground state solution is found in $H$, which is new, as it was not discussed in any previous results. Our study can be regarded as a complete supplement to d'Avenia et al. [J. Differential Equations 335 (2022) 580--614].
Submitted 27 September, 2023;
originally announced September 2023.
-
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models
Authors:
Yaohui Wang,
Xinyuan Chen,
Xin Ma,
Shangchen Zhou,
Ziqi Huang,
Yi Wang,
Ceyuan Yang,
Yinan He,
Jiashuo Yu,
Peiqing Yang,
Yuwei Guo,
Tianxing Wu,
Chenyang Si,
Yuming Jiang,
Cunjian Chen,
Chen Change Loy,
Bo Dai,
Dahua Lin,
Yu Qiao,
Ziwei Liu
Abstract:
This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task to simultaneously a) accomplish the synthesis of visually realistic and temporally coherent videos while b) preserving the strong creative generation nature of the pre-trained T2I model. To this end, we propose LaVie, an integrated video generation framework that operates on cascaded video latent diffusion models, comprising a base T2V model, a temporal interpolation model, and a video super-resolution model. Our key insights are two-fold: 1) We reveal that the incorporation of simple temporal self-attentions, coupled with rotary positional encoding, adequately captures the temporal correlations inherent in video data. 2) Additionally, we validate that the process of joint image-video fine-tuning plays a pivotal role in producing high-quality and creative outcomes. To enhance the performance of LaVie, we contribute a comprehensive and diverse video dataset named Vimeo25M, consisting of 25 million text-video pairs that prioritize quality, diversity, and aesthetic appeal. Extensive experiments demonstrate that LaVie achieves state-of-the-art performance both quantitatively and qualitatively. Furthermore, we showcase the versatility of pre-trained LaVie models in various long video generation and personalized video synthesis applications.
Submitted 26 September, 2023; v1 submitted 26 September, 2023;
originally announced September 2023.
-
Robust Sequential DeepFake Detection
Authors:
Rui Shao,
Tianxing Wu,
Ziwei Liu
Abstract:
Since photorealistic faces can be readily generated by facial manipulation technologies nowadays, potential malicious abuse of these technologies has drawn great concern. Numerous deepfake detection methods have thus been proposed. However, existing methods only focus on detecting one-step facial manipulation. With the emergence of easily accessible facial editing applications, people can easily manipulate facial components using multi-step operations in a sequential manner. This new threat requires us to detect a sequence of facial manipulations, which is vital for both detecting deepfake media and recovering original faces afterwards. Motivated by this observation, we emphasize the need and propose a novel research problem called Detecting Sequential DeepFake Manipulation (Seq-DeepFake). Unlike the existing deepfake detection task, which only demands a binary label prediction, detecting Seq-DeepFake manipulation requires correctly predicting a sequential vector of facial manipulation operations. To support a large-scale investigation, we construct the first Seq-DeepFake dataset, where face images are manipulated sequentially with corresponding annotations of sequential facial manipulation vectors. Based on this new dataset, we cast detecting Seq-DeepFake manipulation as a specific image-to-sequence task and propose a concise yet effective Seq-DeepFake Transformer (SeqFakeFormer). To better reflect real-world deepfake data distributions, we further apply various perturbations on the original Seq-DeepFake dataset and construct the more challenging Sequential DeepFake dataset with perturbations (Seq-DeepFake-P). To exploit deeper correlation between images and sequences when facing Seq-DeepFake-P, a dedicated Seq-DeepFake Transformer with Image-Sequence Reasoning (SeqFakeFormer++) is devised, which builds stronger correspondence between image-sequence pairs for more robust Seq-DeepFake detection.
Submitted 26 September, 2023;
originally announced September 2023.
-
Detecting and Grounding Multi-Modal Media Manipulation and Beyond
Authors:
Rui Shao,
Tianxing Wu,
Jianlong Wu,
Liqiang Nie,
Ziwei Liu
Abstract:
Misinformation has become a pressing issue. Fake media, in both visual and textual forms, is widespread on the web. While various deepfake detection and text fake news detection methods have been proposed, they are designed only for single-modality forgery detection based on binary classification and cannot analyze or reason about subtle forgery traces across different modalities. In this paper, we highlight a new research problem for multi-modal fake media, namely Detecting and Grounding Multi-Modal Media Manipulation (DGM^4). DGM^4 aims to not only detect the authenticity of multi-modal media, but also ground the manipulated content, which requires deeper reasoning of multi-modal media manipulation. To support a large-scale investigation, we construct the first DGM^4 dataset, where image-text pairs are manipulated by various approaches, with rich annotation of diverse manipulations. Moreover, we propose a novel HierArchical Multi-modal Manipulation rEasoning tRansformer (HAMMER) to fully capture the fine-grained interaction between different modalities. HAMMER performs 1) manipulation-aware contrastive learning between two uni-modal encoders as shallow manipulation reasoning, and 2) modality-aware cross-attention by a multi-modal aggregator as deep manipulation reasoning. Dedicated manipulation detection and grounding heads are integrated from shallow to deep levels based on the interacted multi-modal information. To exploit more fine-grained contrastive learning for cross-modal semantic alignment, we further integrate Manipulation-Aware Contrastive Loss with Local View and construct a more advanced model, HAMMER++. Finally, we build an extensive benchmark and set up rigorous evaluation metrics for this new research problem. Comprehensive experiments demonstrate the superiority of HAMMER and HAMMER++.
Submitted 25 September, 2023;
originally announced September 2023.
-
A Novel Approach for Effective Multi-View Clustering with Information-Theoretic Perspective
Authors:
Chenhang Cui,
Yazhou Ren,
Jingyu Pu,
Jiawei Li,
Xiaorong Pu,
Tianyi Wu,
Yutao Shi,
Lifang He
Abstract:
Multi-view clustering (MVC) is a popular technique for improving clustering performance using various data sources. However, existing methods primarily focus on acquiring consistent information while often neglecting the issue of redundancy across multiple views. This study presents a new approach called Sufficient Multi-View Clustering (SUMVC) that examines the multi-view clustering framework from an information-theoretic standpoint. Our proposed method consists of two parts. Firstly, we develop a simple and reliable multi-view clustering method SCMVC (simple consistent multi-view clustering) that employs variational analysis to generate consistent information. Secondly, we propose a sufficient representation lower bound to enhance consistent information and minimise unnecessary information among views. The proposed SUMVC method offers a promising solution to the problem of multi-view clustering and provides a new perspective for analyzing multi-view data.
To verify the effectiveness of our model, we conducted a theoretical analysis based on the Bayes Error Rate, and experiments on multiple multi-view datasets demonstrate the superior performance of SUMVC.
Submitted 25 September, 2023;
originally announced September 2023.
-
InSpaceType: Reconsider Space Type in Indoor Monocular Depth Estimation
Authors:
Cho-Ying Wu,
Quankai Gao,
Chin-Cheng Hsu,
Te-Lin Wu,
Jing-Wen Chen,
Ulrich Neumann
Abstract:
Indoor monocular depth estimation has attracted increasing research interest. Most previous work has focused on methodology, primarily experimenting with the NYU-Depth-V2 (NYUv2) dataset, and has concentrated only on overall performance over the test set. However, little is known regarding robustness and generalization when it comes to applying monocular depth estimation methods to real-world scenarios where highly varying and diverse functional space types are present, such as a library or kitchen. A breakdown of performance by space type is essential for understanding a pretrained model's performance variance. To facilitate our investigation of robustness and address limitations of previous works, we collect InSpaceType, a high-quality and high-resolution RGBD dataset for general indoor environments. We benchmark 12 recent methods on InSpaceType and find they severely suffer from performance imbalance concerning space types, which reveals their underlying bias. We extend our analysis to 4 other datasets, 3 mitigation approaches, and the ability to generalize to unseen space types. Our work marks the first in-depth investigation of performance imbalance across space types for indoor monocular depth estimation, drawing attention to potential safety concerns for model deployment without considering space types, and further shedding light on potential ways to improve robustness. See https://depthcomputation.github.io/DepthPublic for data and the supplementary document. The benchmark list on the GitHub project page is kept up to date with the latest monocular depth estimation methods.
Submitted 30 January, 2024; v1 submitted 23 September, 2023;
originally announced September 2023.
-
Enhanced Sensitivity in Rydberg Atom Electric Field Sensors through Autler-Townes Effect and Two-Photon Absorption: A Theoretical Analysis Using Many-Mode Floquet Theory
Authors:
Tianhao Wu
Abstract:
In this paper, we present a comprehensive investigation into the sensitivity of a Rydberg atom electric field sensor, with a specific focus on the minimum detectable field (MDF) as a key metric. The study utilizes one-mode Floquet theory to calculate the Stark shift for selected Rydberg states when exposed to a signal electric field. The results are compared to those obtained using the rotating wave approximation (RWA). To enhance the sensor's sensitivity when the frequency of the signal electric field deviates from resonance frequencies between Rydberg states, we propose incorporating an extra coupling electric field and using many-mode Floquet theory, a generalization of one-mode Floquet theory, to theoretically analyze this kind of Rydberg atom electric field sensor. The Autler-Townes effect resulting from this coupling electric field causes Rydberg states to split into dressed states, effectively increasing sensitivity by modulating the frequencies of resonance peaks. Moreover, the phenomenon of two-photon absorption in the presence of the coupling electric field is explored. We demonstrate that by appropriately adjusting the coupling electric field's amplitude or frequency, one can control the occurrence of two-photon resonances, providing additional sensitivity enhancement for the Rydberg sensor within the significantly extended off-resonance domain. The study underscores the significance of coupling fields in enhancing the sensitivity of Rydberg atom electric field sensors. These insights hold promising implications for the development of more robust and versatile electric field sensing devices, applicable in diverse fields such as precision measurements and quantum information processing.
Submitted 19 September, 2023;
originally announced September 2023.
-
PyPose v0.6: The Imperative Programming Interface for Robotics
Authors:
Zitong Zhan,
Xiangfu Li,
Qihang Li,
Haonan He,
Abhinav Pandey,
Haitao Xiao,
Yangmengfei Xu,
Xiangyu Chen,
Kuan Xu,
Kun Cao,
Zhipeng Zhao,
Zihan Wang,
Huan Xu,
Zihang Fang,
Yutian Chen,
Wentao Wang,
Xu Fang,
Yi Du,
Tianhao Wu,
Xiao Lin,
Yuheng Qiu,
Fan Yang,
Jingnan Shi,
Shaoshu Su,
Yiren Lu
, et al. (11 additional authors not shown)
Abstract:
PyPose is an open-source library for robot learning. It combines a learning-based approach with physics-based optimization, which enables seamless end-to-end robot learning. It has been used in many tasks due to its meticulously designed application programming interface (API) and efficient implementation. Since its initial launch in early 2022, PyPose has experienced significant enhancements, incorporating a wide variety of new features into its platform. To satisfy the growing demand for understanding and utilizing the library and to reduce the learning curve for new users, we present the fundamental design principle of the imperative programming interface and showcase the flexible usage of diverse functionalities and modules using an extremely simple Dubins car example. We also demonstrate that PyPose can easily be used to navigate a real quadruped robot with a few lines of code.
Submitted 22 September, 2023;
originally announced September 2023.
-
PDPCRN: Parallel Dual-Path CRN with Bi-directional Inter-Branch Interactions for Multi-Channel Speech Enhancement
Authors:
Jiahui Pan,
Shulin He,
Tianci Wu,
Hui Zhang,
Xueliang Zhang
Abstract:
Multi-channel speech enhancement seeks to utilize spatial information to distinguish target speech from interfering signals. While deep learning approaches like the dual-path convolutional recurrent network (DPCRN) have made strides, challenges persist in effectively modeling inter-channel correlations and amalgamating multi-level information. In response, we introduce the Parallel Dual-Path Convolutional Recurrent Network (PDPCRN). This acoustic modeling architecture has two key innovations. First, a parallel design with separate branches extracts complementary features. Second, bi-directional modules enable cross-branch communication. Together, these facilitate diverse representation fusion and enhanced modeling. Experimental validation on TIMIT datasets underscores the prowess of PDPCRN. Notably, against baseline models like the standard DPCRN, PDPCRN not only outperforms in PESQ and STOI metrics but also boasts a leaner computational footprint with reduced parameters.
Submitted 19 September, 2023;
originally announced September 2023.
-
CDDM: Channel Denoising Diffusion Models for Wireless Semantic Communications
Authors:
Tong Wu,
Zhiyong Chen,
Dazhi He,
Liang Qian,
Yin Xu,
Meixia Tao,
Wenjun Zhang
Abstract:
Diffusion models (DMs), which gradually learn to remove noise, have been widely used in artificial intelligence generated content (AIGC) in recent years. This noise-eliminating property of DMs leads us to wonder whether they can be applied to wireless communications to help the receiver mitigate channel noise. To address this, we propose channel denoising diffusion models (CDDM) for semantic communications over wireless channels in this paper. CDDM can be applied as a new physical layer module after channel equalization to learn the distribution of the channel input signal, and then use this learned knowledge to remove the channel noise. We derive corresponding training and sampling algorithms of CDDM according to a forward diffusion process specially designed to adapt to the channel models, and theoretically prove that a well-trained CDDM can effectively reduce the conditional entropy of the received signal under small sampling steps. Moreover, we apply CDDM to a semantic communications system based on joint source-channel coding (JSCC) for image transmission. Extensive experimental results demonstrate that CDDM can further reduce the mean square error (MSE) after the minimum mean square error (MMSE) equalizer, and the joint CDDM and JSCC system achieves better performance than the JSCC system and the traditional JPEG2000 with low-density parity-check (LDPC) code approach.
Submitted 16 September, 2023;
originally announced September 2023.
-
"Merge Conflicts!" Exploring the Impacts of External Distractors to Parametric Knowledge Graphs
Authors:
Cheng Qian,
Xinran Zhao,
Sherry Tongshuang Wu
Abstract:
Large language models (LLMs) acquire extensive knowledge during pre-training, known as their parametric knowledge. However, in order to remain up-to-date and align with human instructions, LLMs inevitably require external knowledge during their interactions with users. This raises a crucial question: How will LLMs respond when external knowledge interferes with their parametric knowledge? To investigate this question, we propose a framework that systematically elicits LLM parametric knowledge and introduces external knowledge. Specifically, we uncover the impacts by constructing a parametric knowledge graph to reveal the different knowledge structures of LLMs, and introduce external knowledge through distractors of varying degrees, methods, positions, and formats. Our experiments on both black-box and open-source models demonstrate that LLMs tend to produce responses that deviate from their parametric knowledge, particularly when they encounter direct conflicts or confounding changes of information within detailed contexts. We also find that while LLMs are sensitive to the veracity of external knowledge, they can still be distracted by unrelated information. These findings highlight the risk of hallucination when integrating external knowledge, even indirectly, during interactions with current LLMs. All the data and results are publicly available.
Submitted 15 September, 2023;
originally announced September 2023.
-
ECEA: Extensible Co-Existing Attention for Few-Shot Object Detection
Authors:
Zhimeng Xin,
Tianxu Wu,
Shiming Chen,
Yixiong Zou,
Ling Shao,
Xinge You
Abstract:
Few-shot object detection (FSOD) identifies objects from extremely few annotated samples. Most recent FSOD methods apply the two-stage learning paradigm, which transfers the knowledge learned from abundant base classes to assist the few-shot detectors by learning global features. However, such FSOD approaches seldom consider the localization of objects from local to global. Limited by the scarce training data in FSOD, the training samples of novel classes typically capture parts of objects, so such FSOD methods cannot detect the completely unseen object during testing. To tackle this problem, we propose an Extensible Co-Existing Attention (ECEA) module to enable the model to infer the global object according to the local parts. Essentially, the proposed module continuously learns the extensible ability on the base stage with abundant samples and transfers it to the novel stage, which can assist the few-shot model to quickly adapt in extending local regions to co-existing regions. Specifically, we first devise an extensible attention mechanism that starts with a local region and extends attention to co-existing regions that are similar and adjacent to the given local region. We then implement the extensible attention mechanism in different feature scales to progressively discover the full object in various receptive fields. Extensive experiments on the PASCAL VOC and COCO datasets show that our ECEA module can assist the few-shot detector to completely predict the object despite some regions failing to appear in the training samples, and achieves a new state of the art compared with existing FSOD methods.
Submitted 15 September, 2023;
originally announced September 2023.
-
Detail Reinforcement Diffusion Model: Augmentation Fine-Grained Visual Categorization in Few-Shot Conditions
Authors:
Tianxu Wu,
Shuo Ye,
Shuhuang Chen,
Qinmu Peng,
Xinge You
Abstract:
The challenge in fine-grained visual categorization lies in how to explore the subtle differences between different subclasses and achieve accurate discrimination. Previous research has relied on large-scale annotated data and pre-trained deep models to achieve the objective. However, when only a limited number of samples is available, similar methods may become less effective. Diffusion models have been widely adopted in data augmentation due to their outstanding diversity in data generation. However, the high level of detail required for fine-grained images makes it challenging for existing methods to be directly employed. To address this issue, we propose a novel approach termed the detail reinforcement diffusion model (DRDM), which leverages the rich knowledge of large models for fine-grained data augmentation and comprises two key components: discriminative semantic recombination (DSR) and spatial knowledge reference (SKR). Specifically, DSR is designed to extract implicit similarity relationships from the labels and reconstruct the semantic mapping between labels and instances, which enables better discrimination of subtle differences between different subclasses. Furthermore, we introduce the SKR module, which incorporates the distributions of different datasets as references in the feature space. This allows the SKR to aggregate the high-dimensional distribution of subclass features in few-shot FGVC tasks, thus expanding the decision boundary. Through these two critical components, we effectively utilize the knowledge from large models to address the issue of data scarcity, resulting in improved performance for fine-grained visual recognition tasks. Extensive experiments demonstrate the consistent performance gain offered by our DRDM.
Submitted 15 May, 2024; v1 submitted 14 September, 2023;
originally announced September 2023.
-
Large-Vocabulary 3D Diffusion Model with Transformer
Authors:
Ziang Cao,
Fangzhou Hong,
Tong Wu,
Liang Pan,
Ziwei Liu
Abstract:
Creating diverse and high-quality 3D assets with an automatic generative model is highly desirable. Despite extensive efforts on 3D generation, most existing works focus on the generation of a single category or a few categories. In this paper, we introduce a diffusion-based feed-forward framework for synthesizing massive categories of real-world 3D objects with a single generative model. Notably, there are three major challenges for this large-vocabulary 3D generation: a) the need for expressive yet efficient 3D representation; b) large diversity in geometry and texture across categories; c) complexity in the appearances of real-world objects. To this end, we propose a novel triplane-based 3D-aware Diffusion model with TransFormer, DiffTF, for handling challenges via three aspects. 1) Considering efficiency and robustness, we adopt a revised triplane representation and improve the fitting speed and accuracy. 2) To handle the drastic variations in geometry and texture, we regard the features of all 3D objects as a combination of generalized 3D knowledge and specialized 3D features. To extract generalized 3D knowledge from diverse categories, we propose a novel 3D-aware transformer with shared cross-plane attention. It learns the cross-plane relations across different planes and aggregates the generalized 3D knowledge with specialized 3D features. 3) In addition, we devise the 3D-aware encoder/decoder to enhance the generalized 3D knowledge in the encoded triplanes for handling categories with complex appearances. Extensive experiments on ShapeNet and OmniObject3D (over 200 diverse real-world categories) convincingly demonstrate that a single DiffTF model achieves state-of-the-art large-vocabulary 3D object generation performance with large diversity, rich semantics, and high quality.
Submitted 15 September, 2023; v1 submitted 14 September, 2023;
originally announced September 2023.
-
On Performance of Fluid Antenna System using Maximum Ratio Combining
Authors:
Xiazhi Lai,
Tuo Wu,
Junteng Yao,
Cunhua Pan,
Maged Elkashlan,
Kai-Kit Wong
Abstract:
This letter investigates a fluid antenna system (FAS) where multiple ports can be activated for signal combining to enhance receiver performance. Given $M$ ports at the FAS, the best $K$ ports out of the $M$ available ports are selected before maximum ratio combining (MRC) is used to combine the received signals from the selected ports. The aim of this letter is to study the achievable performance of FAS when more than one port can be activated. We do so by analyzing the outage probability of this setup in Rayleigh fading channels through the utilization of Gauss-Chebyshev integration, lower bound estimation, and high signal-to-noise ratio (SNR) asymptotic approximations. Our analytical results demonstrate that FAS can harness rich spatial diversity, which is confirmed by computer simulations.
Submitted 14 September, 2023;
originally announced September 2023.
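Illustrative sketch (not from the letter): the best-$K$-of-$M$ port selection followed by MRC described in the abstract above can be simulated with a short Monte Carlo estimate of the outage probability. The function name and all parameters below are hypothetical. The sketch also assumes independent Rayleigh fading across ports, whereas ports in a real FAS are spatially correlated, so the numbers it produces are only indicative of the diversity trend.

```python
import numpy as np


def mrc_outage_probability(num_ports, num_selected, snr_threshold,
                           avg_snr=1.0, trials=20000, seed=0):
    """Monte Carlo outage estimate for best-K-of-M selection plus MRC.

    Assumption (not from the letter): ports fade independently, so each
    per-port SNR is exponential with mean avg_snr (Rayleigh fading).
    """
    rng = np.random.default_rng(seed)
    # Per-port instantaneous SNR: avg_snr * |h|^2 with |h|^2 ~ Exp(1).
    port_snr = avg_snr * rng.exponential(1.0, size=(trials, num_ports))
    # Keep the K strongest ports in each trial.
    best = np.sort(port_snr, axis=1)[:, -num_selected:]
    # The MRC output SNR is the sum of the selected per-port SNRs.
    combined = best.sum(axis=1)
    # Outage: the combined SNR falls below the threshold.
    return float(np.mean(combined < snr_threshold))


if __name__ == "__main__":
    for k in (1, 2, 4):
        p_out = mrc_outage_probability(num_ports=8, num_selected=k,
                                       snr_threshold=2.0)
        print(f"K={k}: estimated outage probability {p_out:.4f}")
```

Under these assumptions, activating more ports can only lower the estimated outage probability for a fixed threshold, since the combined SNR in every trial is a sum over a superset of the ports used for a smaller $K$.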
-
The Hodge-Dirac operator and Dabrowski-Sitarz-Zalecki type theorems for manifolds with boundary
Authors:
Tong Wu,
Yong Wang
Abstract:
In [10], Dabrowski et al. gave spectral Einstein bilinear functionals of differential forms for the Hodge-Dirac operator $d+δ$ on an oriented even-dimensional Riemannian manifold. In this paper, we generalize the results of Dabrowski et al. to the case of 4-dimensional oriented Riemannian manifolds with boundary. Furthermore, we give the proof of Dabrowski-Sitarz-Zalecki type theorems associated with the Hodge-Dirac operator for manifolds with boundary.
Submitted 14 September, 2023;
originally announced September 2023.
-
GraspGF: Learning Score-based Grasping Primitive for Human-assisting Dexterous Grasping
Authors:
Tianhao Wu,
Mingdong Wu,
Jiyao Zhang,
Yunchong Gan,
Hao Dong
Abstract:
The use of anthropomorphic robotic hands for assisting individuals in situations where human hands may be unavailable or unsuitable has gained significant importance. In this paper, we propose a novel task called human-assisting dexterous grasping that aims to train a policy for controlling a robotic hand's fingers to assist users in grasping objects. Unlike conventional dexterous grasping, this task presents a more complex challenge as the policy needs to adapt to diverse user intentions, in addition to the object's geometry. We address this challenge by proposing an approach consisting of two sub-modules: a hand-object-conditional grasping primitive called Grasping Gradient Field (GraspGF), and a history-conditional residual policy. GraspGF learns 'how' to grasp by estimating the gradient from a set of successful grasping examples, while the residual policy determines 'when' and at what speed the grasping action should be executed based on the trajectory history. Experimental results demonstrate the superiority of our proposed method compared to baselines, highlighting its user-awareness and practicality in real-world applications. The code and demonstrations can be viewed at "https://sites.google.com/view/graspgf".
Submitted 14 November, 2023; v1 submitted 12 September, 2023;
originally announced September 2023.
-
MFPNet: Multi-scale Feature Propagation Network For Lightweight Semantic Segmentation
Authors:
Guoan Xu,
Wenjing Jia,
Tao Wu,
Ligeng Chen
Abstract:
In contrast to the abundant research focusing on large-scale models, the progress in lightweight semantic segmentation appears to be advancing at a comparatively slower pace. However, existing compact methods often suffer from limited feature representation capability due to the shallowness of their networks. In this paper, we propose a novel lightweight segmentation architecture, called Multi-scale Feature Propagation Network (MFPNet), to address this dilemma. Specifically, we design a robust Encoder-Decoder structure featuring symmetrical residual blocks that consist of flexible bottleneck residual modules (BRMs) to explore deep and rich multi-scale semantic context. Furthermore, we leverage Graph Convolutional Networks (GCNs), which can model latent long-range contextual relationships, to facilitate multi-scale feature propagation between the BRM blocks. When evaluated on benchmark datasets, our proposed approach shows superior segmentation results.
Submitted 12 September, 2023; v1 submitted 9 September, 2023;
originally announced September 2023.
-
Analytical Modeling of Acoustic Exponential Materials and Physical Mechanism of Broadband Anti-Reflection
Authors:
Sichao Qu,
Min Yang,
Tenglong Wu,
Yunfei Xu,
Nicholas Fang,
Shuyu Chen
Abstract:
Spatially exponential distributions of material properties are ubiquitous in many natural and engineered systems, from the vertical distribution of the atmosphere to acoustic horns and anti-reflective coatings. These media seamlessly interface different impedances, enhancing wave transmission and reducing internal reflections. This work advances traditional transfer matrix theory by integrating analytical solutions for acoustic exponential materials, which possess exponential density and/or bulk modulus, offering a more accurate predictive tool and revealing the physical mechanism of broadband anti-reflection for sound propagation in such non-uniform materials. Leveraging this method, we designed an acoustic dipole array that effectively mimics exponential mass distribution. Through experiments with precisely engineered micro-perforated plates, we demonstrate an ultra-low reflection rate of about 0.86% across a wide frequency range from 420 Hz to 10,000 Hz. Our modified transfer matrix approach underpins the design of exponential materials, and our layering strategy for stacking acoustic dipoles suggests a pathway to more functional gradient acoustic metamaterials.
Submitted 7 April, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
-
Automatic Data Transformation Using Large Language Model: An Experimental Study on Building Energy Data
Authors:
Ankita Sharma,
Xuanmao Li,
Hong Guan,
Guoxin Sun,
Liang Zhang,
Lanjun Wang,
Kesheng Wu,
Lei Cao,
Erkang Zhu,
Alexander Sim,
Teresa Wu,
Jia Zou
Abstract:
Existing approaches to automatic data transformation are insufficient to meet the requirements in many real-world scenarios, such as the building sector. First, there is no convenient interface for domain experts to provide domain knowledge easily. Second, they require significant training data collection overheads. Third, the accuracy suffers from complicated schema changes. To bridge this gap, we present a novel approach that leverages the unique capabilities of large language models (LLMs) in coding, complex reasoning, and zero-shot learning to generate SQL code that transforms the source datasets into the target datasets. We demonstrate the viability of this approach by designing an LLM-based framework, termed SQLMorpher, which comprises a prompt generator that integrates the initial prompt with optional domain knowledge and historical patterns in external databases. It also implements an iterative prompt optimization mechanism that automatically improves the prompt based on flaw detection. The key contributions of this work include (1) pioneering an end-to-end LLM-based solution for data transformation, (2) developing a benchmark dataset of 105 real-world building energy data transformation problems, and (3) conducting an extensive empirical evaluation where our approach achieved 96% accuracy in all 105 problems. SQLMorpher demonstrates the effectiveness of utilizing LLMs in complex, domain-specific challenges, highlighting their potential to drive sustainable solutions.
Submitted 6 September, 2023; v1 submitted 5 September, 2023;
originally announced September 2023.
-
Gravitational Lensing by Transparent Janis-Newman-Winicour Naked Singularities
Authors:
Deyou Chen,
Yiqian Chen,
Peng Wang,
Tianshu Wu,
Houwen Wu
Abstract:
The Janis-Newman-Winicour (JNW) spacetime can describe a naked singularity with a photon sphere that smoothly transforms into a Schwarzschild black hole. Our analysis reveals that photons, upon entering the photon sphere, converge to the singularity in a finite coordinate time. Furthermore, if the singularity is subjected to some regularization, these photons can traverse the regularized singularity. Subsequently, we investigate the gravitational lensing of distant sources and show that new images emerge within the critical curve formed by light rays escaping from the photon sphere. These newfound images offer a powerful tool for the detection and study of JNW naked singularities.
Submitted 18 September, 2023; v1 submitted 2 September, 2023;
originally announced September 2023.
-
An Isotropic Discretization with Semi-implicit Approach for Phase Field Model of Alloy Solidification
Authors:
Chao Tang,
David Taiyen Wu,
Siu Sin Quek
Abstract:
Quantitative phase field models have been extensively used to study the solidification behavior of alloys under different conditions. However, a longstanding challenge of phase field models is the directional bias caused by the discretization-induced lattice effects. In particular, widely used discretization methods may introduce significant spurious anisotropy for simulations of polycrystalline solidification. In this paper, we demonstrate a feasible 2D discretization strategy utilizing a hexagonal mesh to reduce the lattice-induced anisotropy of the phase field model. The leading differential terms of the 2D discretization methods are analyzed by using known methods in Fourier space. Using Taylor expansion of discrete Fourier Transform up to sixth order, we found that the proposed discretization strategy is more accurate and isotropic than other methods, including the isotropic discretization recently proposed by Ji et al.[1]. Additionally, the proposed 2D discretization method can be easily incorporated into a semi-implicit algorithm to solve phase field equations, thereby greatly reducing time step constraints and improving computational efficiency compared to explicit approaches. To prove the accuracy and efficiency of the proposed isotropic discretization with semi-implicit algorithm, 2D simulations of alloy solidification with different discretization schemes were performed and compared. We show that the proposed discretization using a hexagonal mesh can drastically reduce grid-induced anisotropy compared to conventional methods.
Submitted 2 September, 2023;
originally announced September 2023.
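As a rough illustration of why a hexagonal mesh reduces lattice anisotropy, the textbook 7-point hexagonal Laplacian stencil is sketched below. It is an assumption that this stencil resembles the discretization used in the paper; the function name and test field are hypothetical. For six neighbours at distance h, the leading error term of this stencil is proportional to the biharmonic operator, which has no preferred direction, and direction-dependent terms appear only at higher order.

```python
import math


def hex_laplacian(u, x, y, h):
    """Approximate the 2D Laplacian of u at (x, y) from six hexagonal neighbours.

    Uses the standard stencil (2 / (3 h^2)) * (sum of neighbours - 6 * centre),
    with neighbours at distance h and angles 0, 60, ..., 300 degrees.
    """
    total = 0.0
    for j in range(6):
        angle = j * math.pi / 3
        total += u(x + h * math.cos(angle), y + h * math.sin(angle))
    return 2.0 * (total - 6.0 * u(x, y)) / (3.0 * h * h)


# Example: for u = x^2 + y^2 the exact Laplacian is 4 everywhere.
approx = hex_laplacian(lambda x, y: x * x + y * y, 0.3, -0.2, 0.1)
```

The stencil is exact for quadratic fields, so `approx` matches 4 up to floating-point rounding.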
-
Amplifying Non-Resonant Production of Dark Sector Particles in Scattering Dominance Regime
Authors:
Mingxuan Du,
Jia Liu,
Xiao-Ping Wang,
Tianhao Wu
Abstract:
We investigate the enhancement of dark sector particle production within the scattering dominant regime. These particles typically exhibit a slight mixing with Standard Model particles through various portals, allowing for their generation through in-medium oscillation from Standard Model particle sources. Our analysis reveals that in the scattering dominance regime, with a significantly smaller scattering mean free path $λ_{\rm sca}$ compared to the absorption mean free path $λ_{\rm abs}$, the nonresonant production of sterile states can experience an enhancement by a factor of $λ_{\rm abs}/λ_{\rm sca}$. This phenomenon is demonstrated within the context of kinetic mixing dark photon production at a reactor, precisely satisfying this condition. By incorporating this collisional enhancement, we find that the current sensitivity to the mixing parameter $ε$ for dark photons in the TEXONO experiment can be significantly improved across a range spanning from tens of eV to MeV. This advancement establishes the most stringent laboratory constraint within this mass spectrum for the dark photon. Sterile neutrino production, however, does not exhibit such enhancement, either due to the failure to meet the scattering dominance criterion or the neutrino damping in resonant production.
Submitted 12 March, 2024; v1 submitted 31 August, 2023;
originally announced September 2023.
-
Spatio-temporal boundary dissipation measurement in Taylor-Couette flow using Diffusing-Wave Spectroscopy
Authors:
Enzo Francisco,
Vincent Bouillaut,
Tong Wu,
Sébastien Aumaître
Abstract:
Diffusing-Wave Spectroscopy (DWS) allows for the direct measurement of the squared strain-rate tensor. When combined with commonly available high-speed cameras, we show that DWS gives direct access to the spatio-temporal variations of the viscous dissipation rate of a Newtonian fluid flow. The method is demonstrated using a Taylor-Couette (TC) cell filled with a lipid emulsion or a TiO2 suspension. We image the boundary dissipation rate in a quantitative and time-resolved fashion by shining coherent light at the experimental cell and measuring the local correlation time of the speckle pattern. The results are validated by comparison with the theoretical prediction for an ideal TC flow and with global measurements using a photomultiplier tube and a photon correlator. We illustrate the method by characterizing the spatial organization of the boundary dissipation rate past the Taylor-Couette instability threshold, and its spatio-temporal dynamics in the wavy vortex flow that arises beyond a secondary instability threshold. This study paves the way for direct imaging of the dissipation rate in a large variety of flows, including turbulent ones.
Submitted 31 August, 2023;
originally announced August 2023.
-
A general Dabrowski-Sitarz-Zalecki type theorems for manifold with boundary
Authors:
Tong Wu,
Yong Wang
Abstract:
In [17], we obtained the spectral Einstein functional associated with the Dirac operator for n-dimensional manifolds without boundary. In this paper, we give the proof of general Dabrowski-Sitarz-Zalecki type theorems for the spectral Einstein functional associated with the Dirac operator on even and odd dimensional manifolds with boundary.
Submitted 30 August, 2023;
originally announced August 2023.
-
ARF-Plus: Controlling Perceptual Factors in Artistic Radiance Fields for 3D Scene Stylization
Authors:
Wenzhao Li,
Tianhao Wu,
Fangcheng Zhong,
Cengiz Oztireli
Abstract:
Radiance field style transfer is an emerging field that has recently gained popularity as a means of 3D scene stylization, thanks to the outstanding performance of neural radiance fields in 3D reconstruction and view synthesis. We highlight a research gap in radiance field style transfer, the lack of sufficient perceptual controllability, motivated by the existing concept in 2D image style transfer. In this paper, we present ARF-Plus, a 3D neural style transfer framework offering manageable control over perceptual factors, to systematically explore the perceptual controllability in 3D scene stylization. Four distinct types of controls - color preservation control, (style pattern) scale control, spatial (selective stylization area) control, and depth enhancement control - are proposed and integrated into this framework. Results from real-world datasets, both quantitative and qualitative, show that the four types of controls in our ARF-Plus framework successfully accomplish their corresponding perceptual controls when stylizing 3D scenes. These techniques work well for individual style inputs as well as for the simultaneous application of multiple styles within a scene. This unlocks a realm of limitless possibilities, allowing customized modifications of stylization effects and flexible merging of the strengths of different styles, ultimately enabling the creation of novel and eye-catching stylistic effects on 3D scenes.
Submitted 6 September, 2023; v1 submitted 23 August, 2023;
originally announced August 2023.
-
Prompt2Model: Generating Deployable Models from Natural Language Instructions
Authors:
Vijay Viswanathan,
Chenyang Zhao,
Amanda Bertsch,
Tongshuang Wu,
Graham Neubig
Abstract:
Large language models (LLMs) enable system builders today to create competent NLP systems through prompting, where they only need to describe the task in natural language and provide a few examples. However, in other ways, LLMs are a step backward from traditional special-purpose NLP models; they require extensive computational resources for deployment and can be gated behind APIs. In this paper, we propose Prompt2Model, a general-purpose method that takes a natural language task description like the prompts provided to LLMs, and uses it to train a special-purpose model that is conducive to deployment. This is done through a multi-step process of retrieval of existing datasets and pretrained models, dataset generation using LLMs, and supervised fine-tuning on these retrieved and generated datasets. Over three tasks, we demonstrate that given the same few-shot prompt as input, Prompt2Model trains models that outperform the results of a strong LLM, gpt-3.5-turbo, by an average of 20% while being up to 700 times smaller. We also show that this data can be used to obtain reliable estimates of model performance, enabling model developers to assess model reliability before deployment. Prompt2Model is available open-source at https://github.com/neulab/prompt2model.
Submitted 23 August, 2023;
originally announced August 2023.
-
Improving Depth Gradient Continuity in Transformers: A Comparative Study on Monocular Depth Estimation with CNN
Authors:
Jiawei Yao,
Tong Wu,
Xiaofeng Zhang
Abstract:
Monocular depth estimation is an ongoing challenge in computer vision. Recent progress with Transformer models has demonstrated notable advantages over conventional CNNs in this area. However, there's still a gap in understanding how these models prioritize different regions in 2D images and how these regions affect depth estimation performance. To explore the differences between Transformers and CNNs, we employ a sparse pixel approach to contrastively analyze the distinctions between the two. Our findings suggest that while Transformers excel in handling global context and intricate textures, they lag behind CNNs in preserving depth gradient continuity. To further enhance the performance of Transformer models in monocular depth estimation, we propose the Depth Gradient Refinement (DGR) module that refines depth estimation through high-order differentiation, feature fusion, and recalibration. Additionally, we leverage optimal transport theory, treating depth maps as spatial probability distributions, and employ the optimal transport distance as a loss function to optimize our model. Experimental results demonstrate that models integrated with the plug-and-play Depth Gradient Refinement (DGR) module and the proposed loss function enhance performance without increasing complexity and computational costs on both outdoor KITTI and indoor NYU-Depth-v2 datasets. This research not only offers fresh insights into the distinctions between Transformers and CNNs in depth estimation but also paves the way for novel depth estimation methodologies.
Submitted 5 December, 2023; v1 submitted 16 August, 2023;
originally announced August 2023.
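To make the idea of an optimal transport distance between distributions concrete, here is a minimal one-dimensional sketch. It is a simplification chosen for illustration, not the loss used in the paper: the abstract treats depth maps as spatial distributions, whereas this hypothetical `wasserstein_1d` helper compares two histograms over the same bins using the fact that, in 1D, the Wasserstein-1 distance equals the area between the two cumulative distributions.

```python
def wasserstein_1d(p, q, bin_width=1.0):
    """Wasserstein-1 distance between two non-negative histograms on shared bins.

    Both histograms are normalised to unit mass; the distance is the summed
    absolute gap between their running cumulative sums, scaled by bin width.
    """
    if len(p) != len(q):
        raise ValueError("histograms must share the same bins")
    mass_p, mass_q = float(sum(p)), float(sum(q))
    cdf_gap = 0.0
    distance = 0.0
    for a, b in zip(p, q):
        cdf_gap += a / mass_p - b / mass_q
        distance += abs(cdf_gap) * bin_width
    return distance


# Moving all mass from the first bin to the third costs two bin widths.
shift_cost = wasserstein_1d([1, 0, 0], [0, 0, 1])
```

Unlike a per-pixel difference, this cost grows with how far mass has to move, which is the property that makes transport distances attractive as a loss.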
-
Synergi: A Mixed-Initiative System for Scholarly Synthesis and Sensemaking
Authors:
Hyeonsu B. Kang,
Sherry Tongshuang Wu,
Joseph Chee Chang,
Aniket Kittur
Abstract:
Efficiently reviewing scholarly literature and synthesizing prior art are crucial for scientific progress. Yet, the growing scale of publications and the burden of knowledge make synthesis of research threads more challenging than ever. While significant research has been devoted to helping scholars interact with individual papers, building research threads scattered across multiple papers remains a challenge. Most top-down synthesis (and LLMs) make it difficult to personalize and iterate on the output, while bottom-up synthesis is costly in time and effort. Here, we explore a new design space of mixed-initiative workflows. In doing so we develop a novel computational pipeline, Synergi, that ties together user input of relevant seed threads with citation graphs and LLMs, to expand and structure them, respectively. Synergi allows scholars to start with an entire threads-and-subthreads structure generated from papers relevant to their interests, and to iterate on and customize it as they wish. In our evaluation, we find that Synergi helps scholars efficiently make sense of relevant threads, broaden their perspectives, and increase their curiosity. We discuss future design implications for thread-based, mixed-initiative scholarly synthesis support tools.
Submitted 14 August, 2023;
originally announced August 2023.
-
Auditory cueing strategy for stride length and cadence modification: a feasibility study with healthy adults
Authors:
Tina LY Wu,
Anna Murphy,
Chao Chen,
Dana Kulic
Abstract:
People with Parkinson's Disease experience gait impairments that significantly impact their quality of life. Visual, auditory, and tactile cues can alleviate gait impairments, but they can become less effective due to the progressive nature of the disease and changes in people's motor capability. In this study, we develop a human-in-the-loop (HIL) framework that monitors two key gait parameters, stride length and cadence, and continuously learns a person-specific model of how the parameters change in response to the feedback. The model is then used in an optimization algorithm to improve the gait parameters. This feasibility study examines whether auditory cues can be used to influence stride length in people without gait impairments. The results demonstrate the benefits of the HIL framework in maintaining people's stride length in the presence of a secondary task.
Submitted 14 August, 2023;
originally announced August 2023.
-
Antisymmetric Planar Hall Effect in Rutile Oxide Films Induced by the Lorentz Force
Authors:
Yongwei Cui,
Zhaoqing Li,
Haoran Chen,
Yue Chen,
Yunzhuo Wu,
Ke Pei,
Tong Wu,
Nian Xie,
Renchao Che,
Xuepeng Qiu,
Yi Liu,
Zhe Yuan,
Yizheng Wu
Abstract:
The conventional Hall effect is linearly proportional to the field component or magnetization component perpendicular to a film. Despite a growing number of theoretical proposals for a Berry-curvature-induced Hall response to the in-plane field or magnetization in various special systems, such an unconventional Hall effect has only been experimentally reported in Weyl semimetals and in a heterodimensional superlattice. Here, we report an unambiguous experimental observation of the antisymmetric planar Hall effect (APHE) with respect to the in-plane magnetic field in centrosymmetric rutile RuO2 and IrO2 single-crystal films. The measured Hall resistivity is found to be linearly proportional to the component of the applied in-plane magnetic field along a particular crystal axis and to be independent of the current direction or temperature. Both the experimental observations and theoretical calculations confirm that the APHE in rutile oxide films is induced by the Lorentz force. Our findings can be generalized to ferromagnetic materials for the discovery of anomalous Hall effects and quantum anomalous Hall effects induced by in-plane magnetization. In addition to significantly expanding knowledge of the Hall effect, this work opens the door to exploring new members of the Hall effect family.
Submitted 18 February, 2024; v1 submitted 12 August, 2023;
originally announced August 2023.
-
CGBA: Curvature-aware Geometric Black-box Attack
Authors:
Md Farhamdur Reza,
Ali Rahmati,
Tianfu Wu,
Huaiyu Dai
Abstract:
Decision-based black-box attacks often necessitate a large number of queries to craft an adversarial example. Moreover, decision-based attacks based on querying boundary points in the estimated normal vector direction often suffer from inefficiency and convergence issues. In this paper, we propose a novel query-efficient curvature-aware geometric decision-based black-box attack (CGBA) that conducts boundary search along a semicircular path on a restricted 2D plane to ensure finding a boundary point successfully irrespective of the boundary curvature. While the proposed CGBA attack can work effectively for an arbitrary decision boundary, it is particularly efficient in exploiting the low curvature to craft high-quality adversarial examples, which is widely seen and experimentally verified in commonly used classifiers under non-targeted attacks. In contrast, the decision boundaries often exhibit higher curvature under targeted attacks. Thus, we develop a new query-efficient variant, CGBA-H, that is adapted for the targeted attack. In addition, we further design an algorithm to obtain a better initial boundary point at the expense of some extra queries, which considerably enhances the performance of the targeted attack. Extensive experiments are conducted to evaluate the performance of our proposed methods against some well-known classifiers on the ImageNet and CIFAR10 datasets, demonstrating the superiority of CGBA and CGBA-H over state-of-the-art non-targeted and targeted attacks, respectively. The source code is available at https://github.com/Farhamdur/CGBA.
Submitted 6 August, 2023;
originally announced August 2023.
-
Dirac operators with torsion, spectral Einstein functionals and the noncommutative residue
Authors:
Jian Wang,
Yong Wang,
Tong Wu
Abstract:
Recently, Dabrowski et al. \cite{DL} obtained the metric and Einstein functionals by two vector fields and Laplace-type operators over vector bundles, giving an interesting example of the spinor connection and square of the Dirac operator. Pfäffle and Stephan \cite{PS1} considered orthogonal connections with arbitrary torsion on compact Riemannian manifolds and computed the spectral action. Motivated by the spectral functionals and Dirac operators with torsion, we give some new spectral functionals that extend the spectral functionals to the noncommutative realm with torsion, and we relate them to the noncommutative residue for manifolds with boundary. Our method of producing these spectral functionals relies on the noncommutative residue and Dirac operators with torsion.
Submitted 29 July, 2023;
originally announced August 2023.
-
AsdKB: A Chinese Knowledge Base for the Early Screening and Diagnosis of Autism Spectrum Disorder
Authors:
Tianxing Wu,
Xudong Cao,
Yipeng Zhu,
Feiyue Wu,
Tianling Gong,
Yuxiang Wang,
Shenqi Jing
Abstract:
To make knowledge about autism spectrum disorder easy to obtain and to support its early screening and diagnosis, we create AsdKB, a Chinese knowledge base on autism spectrum disorder. The knowledge base is built on top of various sources, including 1) the disease knowledge from SNOMED CT and ICD-10 clinical descriptions on mental and behavioural disorders, 2) the diagnostic knowledge from DSM-5 and different screening tools recommended by social organizations and medical institutes, and 3) the expert knowledge on professional physicians and hospitals from the Web. AsdKB contains both ontological and factual knowledge, and is accessible as Linked Data at https://w3id.org/asdkb/. The potential applications of AsdKB are question answering, auxiliary diagnosis, and expert recommendation, and we illustrate them with a prototype which can be accessed at http://asdkb.org.cn/.
Submitted 2 August, 2023; v1 submitted 31 July, 2023;
originally announced July 2023.
-
Blockchain-empowered Federated Learning for Healthcare Metaverses: User-centric Incentive Mechanism with Optimal Data Freshness
Authors:
Jiawen Kang,
Jinbo Wen,
Dongdong Ye,
Bingkun Lai,
Tianhao Wu,
Zehui Xiong,
Jiangtian Nie,
Dusit Niyato,
Yang Zhang,
Shengli Xie
Abstract:
Given the revolutionary role of metaverses, healthcare metaverses are emerging as a transformative force, creating intelligent healthcare systems that offer immersive and personalized services. The healthcare metaverses allow for effective decision-making and data analytics for users. However, there still exist critical challenges in building healthcare metaverses, such as the risk of sensitive data leakage and issues with sensing data security and freshness, as well as concerns around incentivizing data sharing. In this paper, we first design a user-centric privacy-preserving framework based on decentralized Federated Learning (FL) for healthcare metaverses. To further improve the privacy protection of healthcare metaverses, a cross-chain empowered FL framework is utilized to enhance sensing data security. This framework utilizes a hierarchical cross-chain architecture with a main chain and multiple subchains to perform decentralized, privacy-preserving, and secure data training in both virtual and physical spaces. Moreover, we utilize Age of Information (AoI) as an effective data-freshness metric and propose an AoI-based contract theory model under Prospect Theory (PT) to motivate sensing data sharing in a user-centric manner. This model exploits PT to better capture the subjective utility of the service provider. Finally, our numerical results demonstrate the effectiveness of the proposed schemes for healthcare metaverses.
Submitted 29 July, 2023;
originally announced July 2023.
-
One-forms, spectral Einstein functionals and the noncommutative residue
Authors:
Jian Wang,
Yong Wang,
Tong Wu,
Yuchen Yang
Abstract:
For two one-forms and the Dirac operator, Dabrowski et al. recovered the spectral Einstein functionals by computing their noncommutative residue in Theorem 4.1 of \cite{DL}. In this paper, we generalize the results of Dabrowski et al. to the case of four-dimensional spin manifolds with boundary.
Submitted 29 July, 2023;
originally announced July 2023.