Showing 1–50 of 1,011 results for author: Kumar, A

Searching in archive cs.

  1. arXiv:2405.05243  [pdf, other]

    quant-ph cs.LG physics.comp-ph

    Deep learning-based variational autoencoder for classification of quantum and classical states of light

    Authors: Mahesh Bhupati, Abhishek Mall, Anshuman Kumar, Pankaj K. Jha

    Abstract: Advancements in optical quantum technologies have been enabled by the generation, manipulation, and characterization of light, with identification based on its photon statistics. However, characterizing light and its sources through single photon measurements often requires efficient detectors and longer measurement times to obtain high-quality photon statistics. Here we introduce a deep learning-…

    Submitted 8 May, 2024; originally announced May 2024.

  2. arXiv:2405.03948  [pdf, other]

    cs.IR cs.HC

    The Fault in Our Recommendations: On the Perils of Optimizing the Measurable

    Authors: Omar Besbes, Yash Kanoria, Akshit Kumar

    Abstract: Recommendation systems are widespread, and through customized recommendations, promise to match users with options they will like. To that end, data on engagement is collected and used. Most recommendation systems are ranking-based, where they rank and recommend items based on their predicted engagement. However, the engagement signals are often only a crude proxy for utility, as data on the latte…

    Submitted 6 May, 2024; originally announced May 2024.

  3. arXiv:2405.03005  [pdf, other]

    cs.LG cs.AI

    Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints

    Authors: Siow Meng Low, Akshat Kumar

    Abstract: In safe Reinforcement Learning (RL), safety cost is typically defined as a function dependent on the immediate state and actions. In practice, safety constraints can often be non-Markovian due to the insufficient fidelity of state representation, and safety cost may not be known. We therefore address a general setting where safety labels (e.g., safe or unsafe) are associated with state-action traj…

    Submitted 5 May, 2024; originally announced May 2024.

  4. arXiv:2405.01572  [pdf, other]

    cs.SE cs.AI cs.AR

    A Semi-Formal Verification Methodology for Efficient Configuration Coverage of Highly Configurable Digital Designs

    Authors: Aman Kumar, Sebastian Simon

    Abstract: Nowadays, a majority of System-on-Chips (SoCs) make use of Intellectual Property (IP) in order to shorten development cycles. When such IPs are developed, one of the main focuses lies in the high configurability of the design. This flexibility on the design side introduces the challenge of covering a huge state space of IP configurations on the verification side to ensure the functional correctnes…

    Submitted 20 April, 2024; originally announced May 2024.

    Comments: Published in DVCon U.S. 2021

  5. arXiv:2405.01040  [pdf, other]

    cs.CV cs.CL eess.IV

    Few Shot Class Incremental Learning using Vision-Language models

    Authors: Anurag Kumar, Chinmay Bharti, Saikat Dutta, Srikrishna Karanam, Biplab Banerjee

    Abstract: Recent advancements in deep learning have demonstrated remarkable performance comparable to human capabilities across various supervised computer vision tasks. However, the prevalent assumption of having an extensive pool of training data encompassing all classes prior to model training often diverges from real-world scenarios, where limited data availability for novel classes is the norm. The cha…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: under review at Pattern Recognition Letters

  6. arXiv:2405.00130  [pdf, other]

    eess.IV cs.CV cs.LG

    A Flexible 2.5D Medical Image Segmentation Approach with In-Slice and Cross-Slice Attention

    Authors: Amarjeet Kumar, Hongxu Jiang, Muhammad Imran, Cyndi Valdes, Gabriela Leon, Dahyun Kang, Parvathi Nataraj, Yuyin Zhou, Michael D. Weiss, Wei Shao

    Abstract: Deep learning has become the de facto method for medical image segmentation, with 3D segmentation models excelling in capturing complex 3D structures and 2D models offering high computational efficiency. However, segmenting 2.5D images, which have high in-plane but low through-plane resolution, is a relatively unexplored challenge. While applying 2D models to individual slices of a 2.5D image is f…

    Submitted 30 April, 2024; originally announced May 2024.

  7. arXiv:2404.18270  [pdf, other]

    cs.AI cs.LO

    Pragmatic Formal Verification of Sequential Error Detection and Correction Codes (ECCs) used in Safety-Critical Design

    Authors: Aman Kumar

    Abstract: Error Detection and Correction Codes (ECCs) are often used in digital designs to protect data integrity. Especially in safety-critical systems such as automotive electronics, ECCs are widely used and the verification of such complex logic becomes more critical considering the ISO 26262 safety standards. Exhaustive verification of ECC using formal methods has been a challenge given the high number…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Published in DVCon U.S. 2023

  8. arXiv:2404.16896  [pdf, other]

    cs.GR cs.LG

    A Neural-Network-Based Approach for Loose-Fitting Clothing

    Authors: Yongxu Jin, Dalton Omens, Zhenglin Geng, Joseph Teran, Abishek Kumar, Kenji Tashiro, Ronald Fedkiw

    Abstract: Since loose-fitting clothing contains dynamic modes that have proven to be difficult to predict via neural networks, we first illustrate how to coarsely approximate these modes with a real-time numerical algorithm specifically designed to mimic the most important ballistic features of a classical numerical simulation. Although there is some flexibility in the choice of the numerical algorithm used…

    Submitted 25 April, 2024; originally announced April 2024.

  9. arXiv:2404.16893  [pdf, other]

    cs.LG cs.AI cs.RO

    Automatic AI controller that can drive with confidence: steering vehicle with uncertainty knowledge

    Authors: Neha Kumari, Sumit Kumar, Sneha Priya, Ayush Kumar, Akash Fogla

    Abstract: In safety-critical systems that interface with the real world, the role of uncertainty in decision-making is pivotal, particularly in the context of machine learning models. For the secure functioning of Cyber-Physical Systems (CPS), it is imperative to manage such uncertainty adeptly. In this research, we focus on the development of a vehicle's lateral control system using a machine learning fram…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.08187

  10. arXiv:2404.16060  [pdf]

    cs.HC physics.ed-ph physics.optics

    Pocket Schlieren: a background oriented schlieren imaging platform on a smartphone

    Authors: Diganta Rabha, Vimod Kumar, Akshay Kumar, Dinesh Saini, Manish Kumar

    Abstract: Background-oriented schlieren (BOS) is a powerful technique for flow visualization. Nevertheless, the widespread dissemination of BOS is impeded by its dependence on scientific cameras, computing hardware, and dedicated analysis software. In this work, we aim to democratize BOS by providing a smartphone based scientific tool called "Pocket Schlieren". Pocket Schlieren enables users to directly cap…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 24 pages, 6 figures, 4 Supplementary figures

  11. arXiv:2404.15371  [pdf, other]

    eess.SP cs.AI

    Efficient Verification of a RADAR SoC Using Formal and Simulation-Based Methods

    Authors: Aman Kumar, Mark Litterick, Samuele Candido

    Abstract: As the demand for Internet of Things (IoT) and Human-to-Machine Interaction (HMI) increases, modern System-on-Chips (SoCs) offering such solutions are becoming increasingly complex. This intricate design poses significant challenges for verification, particularly when time-to-market is a crucial factor for consumer electronics products. This paper presents a case study based on our work to verify…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Published in DVCon Europe 2023

  12. arXiv:2404.14372  [pdf, other]

    cs.CL cs.AI

    Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph

    Authors: Xiaochen Kev Gao, Feng Yao, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, Jingbo Shang

    Abstract: Model scaling is becoming the default choice for many language tasks due to the success of large language models (LLMs). However, it can fall short in specific scenarios where simple customized methods excel. In this paper, we delve into the patent approval prediction task and unveil that simple domain-specific graph methods outperform enlarging the model, using the intrinsic dependencies within…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 17 Pages, Under Review

  13. arXiv:2404.14367  [pdf, other]

    cs.LG

    Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

    Authors: Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar

    Abstract: Learning from preference labels plays a crucial role in fine-tuning large language models. There are several distinct approaches for preference fine-tuning, including supervised learning, on-policy reinforcement learning (RL), and contrastive learning. Different methods come with different implementation tradeoffs and performance differences, and existing empirical findings present different concl…

    Submitted 23 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  14. Visualizing Intelligent Tutor Interactions for Responsive Pedagogy

    Authors: Grace Guo, Aishwarya Mudgal Sunil Kumar, Adit Gupta, Adam Coscia, Chris MacLellan, Alex Endert

    Abstract: Intelligent tutoring systems leverage AI models of expert learning and student knowledge to deliver personalized tutoring to students. While these intelligent tutors have demonstrated improved student learning outcomes, it is still unclear how teachers might integrate them into curriculum and course planning to support responsive pedagogy. In this paper, we conducted a design study with five teach…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 9 pages, 5 figures, ACM AVI 2024

  15. arXiv:2404.07981  [pdf, other]

    cs.IR cs.AI cs.CL

    Manipulating Large Language Models to Increase Product Visibility

    Authors: Aounon Kumar, Himabindu Lakkaraju

    Abstract: Large language models (LLMs) are increasingly being integrated into search engines to provide natural language responses tailored to user queries. Customers and end-users are also becoming more dependent on these models for quick and easy purchase decisions. In this work, we investigate whether recommendations from LLMs can be manipulated to enhance a product's visibility. We demonstrate that addi…

    Submitted 11 April, 2024; originally announced April 2024.

  16. arXiv:2404.07225  [pdf]

    q-fin.ST cs.AI cs.LG

    Unveiling the Impact of Macroeconomic Policies: A Double Machine Learning Approach to Analyzing Interest Rate Effects on Financial Markets

    Authors: Anoop Kumar, Suresh Dodda, Navin Kamuni, Rajeev Kumar Arora

    Abstract: This study examines the effects of macroeconomic policies on financial markets using a novel approach that combines Machine Learning (ML) techniques and causal inference. It focuses on the effect of interest rate changes made by the US Federal Reserve System (FRS) on the returns of fixed income and equity funds between January 1986 and December 2021. The analysis makes a distinction between active…

    Submitted 30 March, 2024; originally announced April 2024.

  17. arXiv:2404.06715  [pdf, other]

    cs.CV

    Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data

    Authors: Aakash Kumar, Chen Chen, Ajmal Mian, Neils Lobo, Mubarak Shah

    Abstract: 3D detection is a critical task that enables machines to identify and locate objects in three-dimensional space. It has a broad range of applications in several fields, including autonomous driving, robotics and augmented reality. Monocular 3D detection is attractive as it requires only a single camera, however, it lacks the accuracy and robustness required for real world applications. High resolu…

    Submitted 9 April, 2024; originally announced April 2024.

  18. arXiv:2404.04392  [pdf, other]

    cs.CR cs.AI

    Increased LLM Vulnerabilities from Fine-tuning and Quantization

    Authors: Divyanshu Kumar, Anurakt Kumar, Sahil Agarwal, Prashanth Harshangi

    Abstract: Large Language Models (LLMs) have become very popular and have found use cases in many domains, such as chatbots, auto-task completion agents, and much more. However, LLMs are vulnerable to different types of attacks, such as jailbreaking, prompt injection attacks, and privacy leakage attacks. Foundational LLMs undergo adversarial and alignment training to learn not to generate malicious and toxic…

    Submitted 5 April, 2024; originally announced April 2024.

  19. arXiv:2404.03995  [pdf, other]

    cs.SE cs.AI

    Balancing Progress and Responsibility: A Synthesis of Sustainability Trade-Offs of AI-Based Systems

    Authors: Apoorva Nalini Pradeep Kumar, Justus Bogner, Markus Funke, Patricia Lago

    Abstract: Recent advances in artificial intelligence (AI) capabilities have increased the eagerness of companies to integrate AI into software systems. While AI can be used to have a positive impact on several dimensions of sustainability, this is often overshadowed by its potential negative influence. While many studies have explored sustainability factors in isolation, there is insufficient holistic cover…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: Accepted for publication at the 8th International Workshop on Green and Sustainable Software (GREENS'24), collocated with ICSA'24

  20. arXiv:2404.00526  [pdf]

    cs.HC cs.AI

    The Emotional Impact of Game Duration: A Framework for Understanding Player Emotions in Extended Gameplay Sessions

    Authors: Anoop Kumar, Suresh Dodda, Navin Kamuni, Venkata Sai Mahesh Vuppalapati

    Abstract: Video games have played a crucial role in entertainment since their development in the 1970s, becoming even more prominent during the lockdown period when people were looking for ways to entertain them. However, at that time, players were unaware of the significant impact that playtime could have on their feelings. This has made it challenging for designers and developers to create new games since…

    Submitted 30 March, 2024; originally announced April 2024.

  21. arXiv:2403.20318  [pdf, other]

    cs.CV cs.AI

    SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects

    Authors: Abhinav Kumar, Yuliang Guo, Xinyu Huang, Liu Ren, Xiaoming Liu

    Abstract: Monocular 3D detectors achieve remarkable performance on cars and smaller objects. However, their performance drops on larger objects, leading to fatal accidents. Some attribute the failures to training data scarcity or their receptive field requirements of large objects. In this paper, we highlight this understudied problem of generalization to large objects. We find that modern frontal detectors…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  22. arXiv:2403.18821  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

    Authors: Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard

    Abstract: We present a new dataset called Real Acoustic Fields (RAF) that captures real acoustic room data from multiple modalities. The dataset includes high-quality and densely captured room impulse response data paired with multi-view images, and precise 6DoF pose tracking data for sound emitters and listeners in the rooms. We used this dataset to evaluate existing methods for novel-view acoustic synthes…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024. Project site: https://facebookresearch.github.io/real-acoustic-fields/

  23. arXiv:2403.17116  [pdf]

    cs.HC cs.CY

    Behind the Counter: Exploring the Motivations and Barriers of Online Counterspeech Writing

    Authors: Kaike Ping, Anisha Kumar, Xiaohan Ding, Eugenia Rho

    Abstract: Current research mainly explores the attributes and impact of online counterspeech, leaving a gap in understanding of who engages in online counterspeech or what motivates or deters users from participating. To investigate this, we surveyed 458 English-speaking U.S. participants, analyzing key motivations and barriers underlying online counterspeech engagement. We presented each participant with t…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 39 pages, 2 figures

  24. arXiv:2403.16750  [pdf, other]

    cs.AI

    All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification

    Authors: Deepak Narayan Gadde, Aman Kumar, Thomas Nalapat, Evgenii Rezunov, Fabio Cappellini

    Abstract: Modern hardware designs have grown increasingly efficient and complex. However, they are often susceptible to Common Weakness Enumerations (CWEs). This paper is focused on the formal verification of CWEs in a dataset of hardware designs written in SystemVerilog from Regenerative Artificial Intelligence (AI) powered by Large Language Models (LLMs). We applied formal verification to categorize each…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Published in DVCon U.S. 2024

  25. arXiv:2403.16514  [pdf, ps, other]

    cs.HC cs.SI

    Linguistically Differentiating Acts and Recalls of Racial Microaggressions on Social Media

    Authors: Uma Sushmitha Gunturi, Anisha Kumar, Xiaohan Ding, Eugenia H. Rho

    Abstract: In this work, we examine the linguistic signature of online racial microaggressions (acts) and how it differs from that of personal narratives recalling experiences of such aggressions (recalls) by Black social media users. We manually curate and annotate a corpus of acts and recalls from in-the-wild social media discussions, and verify labels with Black workshop participants. We leverage Natural…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 36 pages

  26. arXiv:2403.15705  [pdf, other]

    cs.CV

    UPNeRF: A Unified Framework for Monocular 3D Object Reconstruction and Pose Estimation

    Authors: Yuliang Guo, Abhinav Kumar, Cheng Zhao, Ruoyu Wang, Xinyu Huang, Liu Ren

    Abstract: Monocular 3D reconstruction for categorical objects heavily relies on accurately perceiving each object's pose. While gradient-based optimization within a NeRF framework updates initially given poses, this paper highlights that such a scheme fails when the initial pose even moderately deviates from the true pose. Consequently, existing methods often depend on a third-party 3D object to provide an…

    Submitted 22 March, 2024; originally announced March 2024.

  27. arXiv:2403.10088  [pdf, other]

    cs.CL cs.AI

    Intent-conditioned and Non-toxic Counterspeech Generation using Multi-Task Instruction Tuning with RLAIF

    Authors: Amey Hengle, Aswini Kumar, Sahajpreet Singh, Anil Bandhakavi, Md Shad Akhtar, Tanmoy Chakroborty

    Abstract: Counterspeech, defined as a response to mitigate online hate speech, is increasingly used as a non-censorial solution. Addressing hate speech effectively involves dispelling the stereotypes, prejudices, and biases often subtly implied in brief, single-sentence statements or abuses. These implicit expressions challenge language models, especially in seq2seq tasks, as model performance typically exc…

    Submitted 15 March, 2024; originally announced March 2024.

  28. arXiv:2403.09362  [pdf, other]

    cs.CL

    Komodo: A Linguistic Expedition into Indonesia's Regional Languages

    Authors: Louis Owen, Vishesh Tripathi, Abhay Kumar, Biddwan Ahmed

    Abstract: The recent breakthroughs in Large Language Models (LLMs) have mostly focused on languages with easily available and sufficient resources, such as English. However, there remains a significant gap for languages that lack sufficient linguistic resources in the public domain. Our work introduces Komodo-7B, 7-billion-parameter Large Language Models designed to address this gap by seamlessly operating…

    Submitted 19 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: 30 Pages, 8 Figures, 4 Tables

  29. arXiv:2403.08384   

    cs.CV

    AADNet: Attention aware Demoiréing Network

    Authors: M Rakesh Reddy, Shubham Mandloi, Aman Kumar

    Abstract: Moire pattern frequently appears in photographs captured with mobile devices and digital cameras, potentially degrading image quality. Despite recent advancements in computer vision, image demoire'ing remains a challenging task due to the dynamic textures and variations in colour, shape, and frequency of moire patterns. Most existing methods struggle to generalize to unseen datasets, limiting thei…

    Submitted 6 May, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Due to unauthorized access and upload, this paper has been withdrawn. It does not reflect the contributions or approval

  30. arXiv:2403.08261  [pdf, other]

    cs.CV cs.AI eess.IV

    CoroNetGAN: Controlled Pruning of GANs via Hypernetworks

    Authors: Aman Kumar, Khushboo Anand, Shubham Mandloi, Ashutosh Mishra, Avinash Thakur, Neeraj Kasera, Prathosh A P

    Abstract: Generative Adversarial Networks (GANs) have proven to exhibit remarkable performance and are widely used across many generative computer vision applications. However, the unprecedented demand for the deployment of GANs on resource-constrained edge devices still poses a challenge due to huge number of parameters involved in the generation process. This has led to focused attention on the area of co…

    Submitted 13 March, 2024; originally announced March 2024.

  31. arXiv:2403.08176  [pdf]

    cs.DL cs.CY

    Sentiment-aware Enhancements of PageRank-based Citation Metric, Impact Factor, and H-index for Ranking the Authors of Scholarly Articles

    Authors: Shikha Gupta, Animesh Kumar

    Abstract: Heretofore, the only way to evaluate an author has been frequency-based citation metrics that assume citations to be of a neutral sentiment. However, considering the sentiment behind citations aids in a better understanding of the viewpoints of fellow researchers for the scholarly output of an author.

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: The paper has been accepted for publication in Computer Science journal: http://journals.agh.edu.pl/csci

  32. arXiv:2403.08121  [pdf, other]

    cs.LG math.OC stat.ML

    Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations

    Authors: Akshay Kumar, Jarvis Haupt

    Abstract: This paper studies the gradient flow dynamics that arise when training deep homogeneous neural networks, starting with small initializations. The present work considers neural networks that are assumed to have locally Lipschitz gradients and an order of homogeneity strictly greater than two. This paper demonstrates that for sufficiently small initializations, during the early stages of training, t…

    Submitted 12 March, 2024; originally announced March 2024.

  33. arXiv:2403.07958  [pdf, other]

    cs.LG cs.AI

    Temporal Decisions: Leveraging Temporal Correlation for Efficient Decisions in Early Exit Neural Networks

    Authors: Max Sponner, Lorenzo Servadei, Bernd Waschneck, Robert Wille, Akash Kumar

    Abstract: Deep Learning is becoming increasingly relevant in Embedded and Internet-of-things applications. However, deploying models on embedded devices poses a challenge due to their resource limitations. This can impact the model's inference accuracy and latency. One potential solution are Early Exit Neural Networks, which adjust model depth dynamically through additional classifiers attached between thei…

    Submitted 12 March, 2024; originally announced March 2024.

  34. arXiv:2403.07957  [pdf, other]

    cs.LG cs.AI

    Efficient Post-Training Augmentation for Adaptive Inference in Heterogeneous and Distributed IoT Environments

    Authors: Max Sponner, Lorenzo Servadei, Bernd Waschneck, Robert Wille, Akash Kumar

    Abstract: Early Exit Neural Networks (EENNs) present a solution to enhance the efficiency of neural network deployments. However, creating EENNs is challenging and requires specialized domain knowledge, due to the large amount of additional design choices. To address this issue, we propose an automated augmentation flow that focuses on converting an existing model into an EENN. It performs all required desi…

    Submitted 12 March, 2024; originally announced March 2024.

  35. arXiv:2403.05612  [pdf, other]

    cs.LG cs.AI cs.CL

    Unfamiliar Finetuning Examples Control How Language Models Hallucinate

    Authors: Katie Kang, Eric Wallace, Claire Tomlin, Aviral Kumar, Sergey Levine

    Abstract: Large language models (LLMs) have a tendency to generate plausible-sounding yet factually incorrect responses, especially when queried on unfamiliar concepts. In this work, we explore the underlying mechanisms that govern how finetuned LLMs hallucinate. Our investigation reveals an interesting pattern: as inputs become more unfamiliar, LLM outputs tend to default towards a "hedged" prediction, w…

    Submitted 8 March, 2024; originally announced March 2024.

  36. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, Ioannis Antonoglou, Rohan Anil, Sebastian Borgeaud, Andrew Dai, Katie Millican, Ethan Dyer, Mia Glaese, Thibault Sottiaux, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, James Molloy, et al. (683 additional authors not shown)

    Abstract: In this report, we present the latest model of the Gemini family, Gemini 1.5 Pro, a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. Gemini 1.5 Pro achieves near-perfect recall on long-context retrieval tasks across modalit…

    Submitted 25 April, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  37. arXiv:2403.04781  [pdf]

    cs.CR cs.CV cs.LG eess.IV

    Selective Encryption using Segmentation Mask with Chaotic Henon Map for Multidimensional Medical Images

    Authors: S Arut Prakash, Aditya Ganesh Kumar, Prabhu Shankar K. C., Lithicka Anandavel, Aditya Lakshmi Narayanan

    Abstract: A user-centric design and resource optimization should be at the center of any technology or innovation. The user-centric perspective gives the developer the opportunity to develop with task-based optimization. The user in the medical image field is a medical professional who analyzes the medical images and gives their diagnosis results to the patient. This scheme, having the medical professional…

    Submitted 2 March, 2024; originally announced March 2024.

  38. arXiv:2403.03950  [pdf, other]

    cs.LG cs.AI stat.ML

    Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

    Authors: Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taïga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal

    Abstract: Value functions are a central component of deep reinforcement learning (RL). These functions, parameterized by neural networks, are trained using a mean squared error regression objective to match bootstrapped target values. However, scaling value-based RL methods that use regression to large networks, such as high-capacity Transformers, has proven challenging. This difficulty is in stark contrast…

    Submitted 6 March, 2024; originally announced March 2024.

  39. arXiv:2403.03744  [pdf, other]

    cs.AI

    Towards Safe Large Language Models for Medicine

    Authors: Tessa Han, Aounon Kumar, Chirag Agarwal, Himabindu Lakkaraju

    Abstract: As large language models (LLMs) develop ever-improving capabilities and are applied in real-world settings, it is important to understand their safety. While initial steps have been taken to evaluate the safety of general-knowledge LLMs, exposing some weaknesses, the safety of medical LLMs has not been sufficiently evaluated despite their high risks to personal health and safety, public health and…

    Submitted 1 May, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

  40. arXiv:2403.01369  [pdf, other]

    eess.AS cs.AI cs.LG

    A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement

    Authors: Ravi Shankar, Ke Tan, Buye Xu, Anurag Kumar

    Abstract: Self-supervised learned models have been found to be very effective for certain speech tasks such as automatic speech recognition, speaker identification, keyword spotting and others. While the features are undeniably useful in speech recognition and associated tasks, their utility in speech enhancement systems is yet to be firmly established, and perhaps not properly understood. In this paper, we…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 8 pages; Shorter form accepted in ICASSP 2024

  41. arXiv:2403.00975  [pdf, other]

    cs.LG cs.AI math.FA stat.AP

    Equipment Health Assessment: Time Series Analysis for Wind Turbine Performance

    Authors: Jana Backhus, Aniruddha Rajendra Rao, Chandrasekar Venkatraman, Abhishek Padmanabhan, A. Vinoth Kumar, Chetan Gupta

    Abstract: In this study, we leverage SCADA data from diverse wind turbines to predict power output, employing advanced time series methods, specifically Functional Neural Networks (FNN) and Long Short-Term Memory (LSTM) networks. A key innovation lies in the ensemble of FNN and LSTM models, capitalizing on their collective learning. This ensemble approach outperforms individual models, ensuring stable and a…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 19 Pages, 17 Figures, 3 Tables, Submitted at Applied Sciences (MDPI)

  42. arXiv:2403.00199  [pdf, other]

    cs.CL cs.CY cs.LG

    Improving Socratic Question Generation using Data Augmentation and Preference Optimization

    Authors: Nischal Ashok Kumar, Andrew Lan

    Abstract: The Socratic method is a way of guiding students toward solving a problem independently without directly revealing the solution to the problem. Although this method has been shown to significantly improve student learning outcomes, it remains a complex labor-intensive task for instructors. Large language models (LLMs) can be used to augment human effort by automatically generating Socratic questio…

    Submitted 18 April, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: Published at the 19th BEA Workshop co-located with NAACL-2024

  43. arXiv:2402.19446  [pdf, other]

    cs.LG cs.AI cs.CL

    ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL

    Authors: Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar

    Abstract: A broad use case of large language models (LLMs) is in goal-directed decision-making tasks (or "agent" tasks), where an LLM needs to not just generate completions for a given prompt, but rather make intelligent decisions over a multi-turn interaction to accomplish a task (e.g., when interacting with the web, using tools, or providing customer support). Reinforcement learning (RL) provides a genera…

    Submitted 29 February, 2024; originally announced February 2024.

  44. arXiv:2402.18968  [pdf, other]

    eess.AS cs.SD

    Ambisonics Networks -- The Effect Of Radial Functions Regularization

    Authors: Bar Shaybet, Anurag Kumar, Vladimir Tourbabin, Boaz Rafaely

    Abstract: Ambisonics, a popular format of spatial audio, is the spherical harmonic (SH) representation of the plane wave density function of a sound field. Many algorithms operate in the SH domain and utilize the Ambisonics as their input signal. The process of encoding Ambisonics from a spherical microphone array involves dividing by the radial functions, which may amplify noise at low frequencies. This ca…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: To be published in ICASSP 2024

  45. arXiv:2402.16142  [pdf]

    cs.CL cs.AI

    From Text to Transformation: A Comprehensive Review of Large Language Models' Versatility

    Authors: Pravneet Kaur, Gautam Siddharth Kashyap, Ankit Kumar, Md Tabrez Nafis, Sandeep Kumar, Vikrant Shokeen

    Abstract: This groundbreaking study explores the expanse of Large Language Models (LLMs), such as Generative Pre-Trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT) across varied domains ranging from technology, finance, healthcare to education. Despite their established prowess in Natural Language Processing (NLP), these LLMs have not been systematically examined fo…

    Submitted 25 February, 2024; originally announced February 2024.

  46. arXiv:2402.15833  [pdf, other]

    cs.CL cs.LG

    Prompt Perturbation Consistency Learning for Robust Language Models

    Authors: Yao Qiang, Subhrangshu Nandi, Ninareh Mehrabi, Greg Ver Steeg, Anoop Kumar, Anna Rumshisky, Aram Galstyan

    Abstract: Large language models (LLMs) have demonstrated impressive performance on a number of natural language processing tasks, such as question answering and text summarization. However, their performance on sequence labeling tasks such as intent classification and slot filling (IC-SF), which is a central component in personal assistant systems, lags significantly behind discriminative models. Furthermor…

    Submitted 24 February, 2024; originally announced February 2024.

  47. arXiv:2402.14591  [pdf, other]

    cs.CV cs.RO

    High-Speed Detector For Low-Powered Devices In Aerial Grasping

    Authors: Ashish Kumar, Laxmidhar Behera

    Abstract: Autonomous aerial harvesting is a highly complex problem because it requires numerous interdisciplinary algorithms to be executed on mini low-powered computing devices. Object detection is one such algorithm that is compute-hungry. In this context, we make the following contributions: (i) Fast Fruit Detector (FFD), a resource-efficient, single-stage, and postprocessing-free object detector based o…

    Submitted 1 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: 8 Pages, 9 Figures, 8 Tables, IEEE Robotics and Automation Letters (IEEE RA-L)

  48. Fog enabled distributed training architecture for federated learning

    Authors: Aditya Kumar, Satish Narayana Srirama

    Abstract: The amount of data being produced at every epoch of second is increasing every moment. Various sensors, cameras and smart gadgets produce continuous data throughout its installation. Processing and analyzing raw data at a cloud server faces several challenges such as bandwidth, congestion, latency, privacy and security. Fog computing brings computational resources closer to IoT that addresses some…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: Conference paper accepted at BDA 2021

    Journal ref: Big Data Analytics 9th International Conference, BDA 2021, Virtual Event, December 15-18, 2021

  49. arXiv:2402.10985  [pdf, other]

    cs.CR cs.AI

    Leveraging AI Planning For Detecting Cloud Security Vulnerabilities

    Authors: Mikhail Kazdagli, Mohit Tiwari, Akshat Kumar

    Abstract: Cloud computing services provide scalable and cost-effective solutions for data storage, processing, and collaboration. Alongside their growing popularity, concerns related to their security vulnerabilities leading to data breaches and sophisticated attacks such as ransomware are growing. To address these, first, we propose a generic framework to express relations between different cloud objects s…

    Submitted 15 February, 2024; originally announced February 2024.

  50. arXiv:2402.09226  [pdf, other]

    cs.LG math.OC stat.ML

    Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks

    Authors: Akshay Kumar, Jarvis Haupt

    Abstract: This paper examines gradient flow dynamics of two-homogeneous neural networks for small initializations, where all weights are initialized near the origin. For both square and logistic losses, it is shown that for sufficiently small initializations, the gradient flow dynamics spend sufficient time in the neighborhood of the origin to allow the weights of the neural network to approximately converg…

    Submitted 14 February, 2024; originally announced February 2024.