Showing 1–50 of 56 results for author: Talukdar, P

Searching in archive cs.
  1. arXiv:2404.16816

    cs.CL

    IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages

    Authors: Harman Singh, Nitish Gupta, Shikhar Bharadwaj, Dinesh Tewari, Partha Talukdar

    Abstract: As large language models (LLMs) see increasing adoption across the globe, it is imperative for LLMs to be representative of the linguistic diversity of the world. India is a linguistically diverse country of 1.4 billion people. To facilitate research on multilingual LLM evaluation, we release IndicGenBench - the largest benchmark for evaluating LLMs on user-facing generation tasks across a diverse…

    Submitted 25 April, 2024; originally announced April 2024.

  2. arXiv:2401.02412

    cs.LG cs.AI cs.CL cs.CV

    LLM Augmented LLMs: Expanding Capabilities through Composition

    Authors: Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar

    Abstract: Foundational models with billions of parameters which have been trained on large corpora of data have demonstrated non-trivial skills in a variety of domains. However, due to their monolithic structure, it is challenging and expensive to augment them or impart new skills. On the other hand, due to their adaptation abilities, several new instances of these models are being trained towards new domai…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: 17 pages, 2 figures, 8 tables

  3. arXiv:2311.00913

    cs.CL

    Self-Influence Guided Data Reweighting for Language Model Pre-training

    Authors: Megh Thakkar, Tolga Bolukbasi, Sriram Ganapathy, Shikhar Vashishth, Sarath Chandar, Partha Talukdar

    Abstract: Language Models (LMs) pre-trained with self-supervision on large text corpora have become the default starting point for developing models for various NLP tasks. Once the pre-training corpus has been assembled, all data samples in the corpus are treated with equal importance during LM pre-training. However, due to varying levels of relevance and quality of data, equal importance to all the data sa…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted to EMNLP 2023

  4. arXiv:2309.10567

    cs.CL cs.LG cs.SD eess.AS

    Multimodal Modeling For Spoken Language Identification

    Authors: Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa

    Abstract: Spoken language identification refers to the task of automatically predicting the spoken language in a given utterance. Conventionally, it is modeled as a speech-based language identification task. Prior techniques have been constrained to a single modality; however, in the case of video data there is a wealth of other metadata that may be beneficial for this task. In this work, we propose MuSeLI,…

    Submitted 19 September, 2023; originally announced September 2023.

  5. arXiv:2306.04374

    cs.CL cs.LG cs.SD eess.AS

    Label Aware Speech Representation Learning For Language Identification

    Authors: Shikhar Vashishth, Shikhar Bharadwaj, Sriram Ganapathy, Ankur Bapna, Min Ma, Wei Han, Vera Axelrod, Partha Talukdar

    Abstract: Speech representation learning approaches for non-semantic tasks such as language recognition have either explored supervised embedding extraction methods using a classifier model or self-supervised representation learning approaches using raw data. In this paper, we propose a novel framework of combining self-supervised representation learning with the language label information for the pre-train…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: Accepted at Interspeech 2023

  6. XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages

    Authors: Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean-Michel A. Sarr, Xinyi Wang, John Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L. Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David I. Adelani, Vera Axelrod, Isaac Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, et al. (2 additional authors not shown)

    Abstract: Data scarcity is a crucial issue for the development of highly multilingual NLP systems. Yet for many under-represented languages (ULs) -- languages for which NLP research is particularly far behind in meeting user needs -- it is feasible to annotate small amounts of data. Motivated by this, we propose XTREME-UP, a benchmark defined by: its focus on the scarce-data scenario rather than zero-shot;…

    Submitted 24 May, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

  7. arXiv:2303.12860

    cs.CL cs.AI

    Salient Span Masking for Temporal Understanding

    Authors: Jeremy R. Cole, Aditi Chaudhary, Bhuwan Dhingra, Partha Talukdar

    Abstract: Salient Span Masking (SSM) has shown itself to be an effective strategy to improve closed-book question answering performance. SSM extends general masked language model pretraining by creating additional unsupervised training sentences that mask a single entity or date span, thus oversampling factual information. Despite the success of this paradigm, the span types and sampling strategies are rela…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: 5 pages, 1 figure, to appear in EACL 2023

  8. arXiv:2211.11206

    cs.CL cs.AI cs.CY

    Cultural Re-contextualization of Fairness Research in Language Technologies in India

    Authors: Shaily Bhatt, Sunipa Dev, Partha Talukdar, Shachi Dave, Vinodkumar Prabhakaran

    Abstract: Recent research has revealed undesirable biases in NLP data and models. However, these efforts largely focus on social disparities in the West, and are not directly portable to other geo-cultural contexts. In this position paper, we outline a holistic research agenda to re-contextualize NLP fairness research for the Indian context, accounting for Indian societal context, bridging technological gap…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS Workshop on "Cultures in AI/AI in Culture". This is a non-archival short version, to cite please refer to our complete paper: arXiv:2209.12226

  9. arXiv:2211.07615

    cs.CL

    UGIF: UI Grounded Instruction Following

    Authors: Sagar Gubbi Venkatesh, Partha Talukdar, Srini Narayanan

    Abstract: Smartphone users often find it difficult to navigate myriad menus to perform common tasks such as "How to block calls from unknown numbers?". Currently, help documents with step-by-step instructions are manually written to aid the user. The user experience can be further enhanced by grounding the instructions in the help document to the UI and overlaying a tutorial on the phone UI. To build such t…

    Submitted 23 May, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

  10. arXiv:2210.07313

    cs.CL cs.LG

    Bootstrapping Multilingual Semantic Parsers using Large Language Models

    Authors: Abhijeet Awasthi, Nitish Gupta, Bidisha Samanta, Shachi Dave, Sunita Sarawagi, Partha Talukdar

    Abstract: Despite cross-lingual generalization demonstrated by pre-trained multilingual models, the translate-train paradigm of transferring English datasets across multiple languages remains a key mechanism for training task-specific multilingual models. However, for many low-resource languages, the availability of a reliable translation service entails significant amounts of costly human-annotated t…

    Submitted 11 February, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: EACL-23

  11. TwiRGCN: Temporally Weighted Graph Convolution for Question Answering over Temporal Knowledge Graphs

    Authors: Aditya Sharma, Apoorv Saxena, Chitrank Gupta, Seyed Mehran Kazemi, Partha Talukdar, Soumen Chakrabarti

    Abstract: Recent years have witnessed much interest in temporal reasoning over knowledge graphs (KG) for complex question answering (QA), but there remains a substantial gap in human capabilities. We explore how to generalize relational graph convolutional networks (RGCN) for temporal KGQA. Specifically, we propose a novel, intuitive and interpretable scheme to modulate the messages passed through a KG edge…

    Submitted 5 October, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: 9 pages + references + appendix

    Journal ref: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) pages 2049 to 2060

  12. arXiv:2209.12226

    cs.CL cs.CY

    Re-contextualizing Fairness in NLP: The Case of India

    Authors: Shaily Bhatt, Sunipa Dev, Partha Talukdar, Shachi Dave, Vinodkumar Prabhakaran

    Abstract: Recent research has revealed undesirable biases in NLP data and models. However, these efforts focus on social disparities in the West, and are not directly portable to other geo-cultural contexts. In this paper, we focus on NLP fairness in the context of India. We start with a brief account of the prominent axes of social disparities in India. We build resources for fairness evaluation in the Indian…

    Submitted 21 November, 2022; v1 submitted 25 September, 2022; originally announced September 2022.

    Comments: Accepted to AACL-IJCNLP 2022

  13. arXiv:2209.06767

    cs.CL

    Parameter-Efficient Finetuning for Robust Continual Multilingual Learning

    Authors: Kartikeya Badola, Shachi Dave, Partha Talukdar

    Abstract: We introduce and study the problem of Continual Multilingual Learning (CML) where a previously trained multilingual model is periodically updated using new data arriving in stages. If the new data is present only in a subset of languages, we find that the resulting model shows improved performance only on the languages included in the latest update (and a few closely related languages) while its p…

    Submitted 28 August, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: Published at ACL Findings 2023

  14. arXiv:2205.12676

    cs.CL

    Evaluating the Diversity, Equity and Inclusion of NLP Technology: A Case Study for Indian Languages

    Authors: Simran Khanuja, Sebastian Ruder, Partha Talukdar

    Abstract: In order for NLP technology to be widely applicable, fair, and useful, it needs to serve a diverse set of speakers across the world's languages, be equitable, i.e., not unduly biased towards any particular language, and be inclusive of all users, particularly in low-resource settings where compute constraints are common. In this paper, we propose an evaluation paradigm that assesses NLP technologi…

    Submitted 12 April, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to EACL Findings, 2023

  15. arXiv:2203.01976

    cs.CL

    Overlap-based Vocabulary Generation Improves Cross-lingual Transfer Among Related Languages

    Authors: Vaidehi Patil, Partha Talukdar, Sunita Sarawagi

    Abstract: Pre-trained multilingual language models such as mBERT and XLM-R have demonstrated great potential for zero-shot cross-lingual transfer to low web-resource languages (LRL). However, due to limited model capacity, the large difference in the sizes of available monolingual corpora between high web-resource languages (HRL) and LRLs does not provide enough scope of co-embedding the LRL with the HRL, t…

    Submitted 23 March, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: Accepted to appear at the ACL 2022 Main conference

  16. arXiv:2110.14782

    cs.CL cs.LG

    When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer

    Authors: Ameet Deshpande, Partha Talukdar, Karthik Narasimhan

    Abstract: While recent work on multilingual language models has demonstrated their capacity for cross-lingual zero-shot transfer on downstream tasks, there is a lack of consensus in the community as to what shared properties between languages enable such transfer. Analyses involving pairs of natural languages are often inconclusive and contradictory since languages simultaneously differ in many linguistic a…

    Submitted 3 May, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: Accepted at NAACL 2022

  17. arXiv:2110.07385

    cs.CL cs.LG

    Few-shot Controllable Style Transfer for Low-Resource Multilingual Settings

    Authors: Kalpesh Krishna, Deepak Nathani, Xavier Garcia, Bidisha Samanta, Partha Talukdar

    Abstract: Style transfer is the task of rewriting a sentence into a target style while approximately preserving content. While most prior literature assumes access to a large style-labelled corpus, recent work (Riley et al. 2021) has attempted "few-shot" style transfer using only 3-10 sentences at inference for style extraction. In this work we study a relevant low-resource setting: style transfer for langu…

    Submitted 11 March, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: ACL 2022 camera ready, 30 pages

  18. arXiv:2109.14364

    cs.CL

    Multilingual Fact Linking

    Authors: Keshav Kolluru, Martin Rezk, Pat Verga, William W. Cohen, Partha Talukdar

    Abstract: Knowledge-intensive NLP tasks can benefit from linking natural language text with facts from a Knowledge Graph (KG). Although facts themselves are language-agnostic, the fact labels (i.e., language-specific representation of the fact) in the KG are often present only in a few languages. This makes it challenging to link KG facts to sentences in languages other than the limited set of languages. To…

    Submitted 30 September, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: AKBC 2021

  19. arXiv:2106.12806

    cs.CL

    OKGIT: Open Knowledge Graph Link Prediction with Implicit Types

    Authors: Chandrahas, Partha Pratim Talukdar

    Abstract: Open Knowledge Graphs (OpenKG) refer to a set of (head noun phrase, relation phrase, tail noun phrase) triples such as (tesla, return to, new york) extracted from a corpus using OpenIE tools. While OpenKGs are easy to bootstrap for a domain, they are very sparse and far from being directly usable in an end task. Therefore, the task of predicting new facts, i.e., link prediction, becomes an importa…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: Findings of the ACL: ACL-IJCNLP 2021

  20. arXiv:2106.03958

    cs.CL

    Exploiting Language Relatedness for Low Web-Resource Language Model Adaptation: An Indic Languages Study

    Authors: Yash Khemchandani, Sarvesh Mehtani, Vaidehi Patil, Abhijeet Awasthi, Partha Talukdar, Sunita Sarawagi

    Abstract: Recent research in multilingual language models (LM) has demonstrated their ability to effectively handle multiple languages in a single model. This holds promise for low web-resource languages (LRL) as multilingual models can enable transfer of supervision from high resource languages to LRLs. However, incorporating a new language in an LM still remains a challenge, particularly for languages wit…

    Submitted 9 June, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Accepted to ACL-IJCNLP 2021

  21. arXiv:2106.02834

    cs.CL

    MergeDistill: Merging Pre-trained Language Models using Distillation

    Authors: Simran Khanuja, Melvin Johnson, Partha Talukdar

    Abstract: Pre-trained multilingual language models (LMs) have achieved state-of-the-art results in cross-lingual transfer, but they often lead to an inequitable representation of languages due to limited capacity, skewed pre-training data, and sub-optimal vocabularies. This has prompted the creation of an ever-growing pre-trained model universe, where each model is trained on large amounts of language or do…

    Submitted 5 June, 2021; originally announced June 2021.

    Comments: ACL 2021 Findings

  22. arXiv:2106.01751

    cs.CL

    Reordering Examples Helps during Priming-based Few-Shot Learning

    Authors: Sawan Kumar, Partha Talukdar

    Abstract: The ability to learn from limited data, or few-shot learning, is a desirable and often critical requirement for NLP systems. While many existing methods do poorly at learning from a handful of examples, large pretrained language models have recently been shown to be efficient few-shot learners. One approach to few-shot learning, which does not require finetuning of model parameters, is to augment…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: 12 pages, 1 figure, Accepted to Findings of ACL 2021

  23. arXiv:2106.01515

    cs.LG

    Question Answering Over Temporal Knowledge Graphs

    Authors: Apoorv Saxena, Soumen Chakrabarti, Partha Talukdar

    Abstract: Temporal Knowledge Graphs (Temporal KGs) extend regular Knowledge Graphs by providing temporal scopes (start and end times) on each edge in the KG. While Question Answering over KG (KGQA) has received some attention from the research community, QA over Temporal KGs (Temporal KGQA) is a relatively unexplored area. Lack of broad coverage datasets has been another factor limiting progress in this are…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: ACL 2021

  24. arXiv:2103.10730

    cs.CL

    MuRIL: Multilingual Representations for Indian Languages

    Authors: Simran Khanuja, Diksha Bansal, Sarvesh Mehtani, Savya Khosla, Atreyee Dey, Balaji Gopalan, Dilip Kumar Margam, Pooja Aggarwal, Rajiv Teja Nagipogu, Shachi Dave, Shruti Gupta, Subhash Chandra Bose Gali, Vish Subramanian, Partha Talukdar

    Abstract: India is a multilingual society with 1369 rationalized languages and dialects being spoken across the country (INDIA, 2011). Of these, the 22 scheduled languages have a staggering total of 1.17 billion speakers and 121 languages have more than 10,000 speakers (INDIA, 2011). India also has the second largest (and an ever growing) digital footprint (Statista, 2020). Despite this, today's state-of-th…

    Submitted 2 April, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

  25. arXiv:2012.13693

    cs.RO cs.CL cs.LG

    Spatial Reasoning from Natural Language Instructions for Robot Manipulation

    Authors: Sagar Gubbi Venkatesh, Anirban Biswas, Raviteja Upadrashta, Vikram Srinivasan, Partha Talukdar, Bharadwaj Amrutur

    Abstract: Robots that can manipulate objects in unstructured environments and collaborate with humans can benefit immensely by understanding natural language. We propose a pipelined architecture of two stages to perform spatial reasoning on the text input. All the objects in the scene are first localized, and then the instruction for the robot in natural language and the localized co-ordinates are mapped to…

    Submitted 26 March, 2021; v1 submitted 26 December, 2020; originally announced December 2020.

    Comments: Accepted for ICRA 2021

  26. arXiv:2005.12116

    cs.CL

    NILE : Natural Language Inference with Faithful Natural Language Explanations

    Authors: Sawan Kumar, Partha Talukdar

    Abstract: The recent growth in the popularity and success of deep learning models on NLP classification tasks has accompanied the need for generating some form of natural language explanation of the predicted labels. Such generated natural language (NL) explanations are expected to be faithful, i.e., they should correlate well with the model's internal decision making. In this work, we focus on the task of…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: 13 pages, 3 figures, Accepted to ACL 2020

  27. arXiv:2005.09069

    cs.CL cs.LG

    P-SIF: Document Embeddings Using Partition Averaging

    Authors: Vivek Gupta, Ankit Saw, Pegah Nokhiz, Praneeth Netrapalli, Piyush Rai, Partha Talukdar

    Abstract: Simple weighted averaging of word vectors often yields effective representations for sentences which outperform sophisticated seq2seq neural models in many tasks. While it is desirable to use the same method to represent documents as well, unfortunately, the effectiveness is lost when representing long documents involving multiple sentences. One of the key reasons is that a longer document is like…

    Submitted 18 May, 2020; originally announced May 2020.

    Comments: 15 Pages, 3 Figures, 13 Tables, AAAI 2020, Blog : http://vivgupt.blogspot.com/2019/06/document-vector-estimation-using.html

  28. arXiv:2005.08417

    cs.CL

    Syntax-guided Controlled Generation of Paraphrases

    Authors: Ashutosh Kumar, Kabir Ahuja, Raghuram Vadapalli, Partha Talukdar

    Abstract: Given a sentence (e.g., "I like mangoes") and a constraint (e.g., sentiment flip), the goal of controlled text generation is to produce a sentence that adapts the input sentence to meet the requirements of the constraint (e.g., "I hate mangoes"). Going beyond such simple constraints, recent works have started exploring the incorporation of complex syntactic-guidance as constraints in the task of c…

    Submitted 17 May, 2020; originally announced May 2020.

    Comments: 16 pages, 3 figures, Accepted to TACL 2020

  29. arXiv:1911.07979

    cs.LG stat.ML

    ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations

    Authors: Ekagra Ranjan, Soumya Sanyal, Partha Pratim Talukdar

    Abstract: Graph Neural Networks (GNN) have been shown to work effectively for modeling graph structured data to solve tasks such as node classification, link prediction and graph classification. There has been some recent progress in defining the notion of pooling in graphs whereby the model tries to generate a graph level representation by downsampling and summarizing the information present in the nodes.…

    Submitted 2 February, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

    Comments: The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020)

  30. arXiv:1911.07918

    cs.CL cs.IR cs.LG

    Improving Document Classification with Multi-Sense Embeddings

    Authors: Vivek Gupta, Ankit Saw, Pegah Nokhiz, Harshit Gupta, Partha Talukdar

    Abstract: Efficient representation of text documents is an important building block in many NLP tasks. Research on long text categorization has shown that simple weighted averaging of word vectors for sentence representation often outperforms more sophisticated neural models. Recently proposed Sparse Composite Document Vector (SCDV) (Mekala et al., 2017) extends this approach from sentences to documents usi…

    Submitted 18 November, 2019; originally announced November 2019.

    Comments: 8 Pages, 7 Figures, 12 Tables, under review at ECAI 2020

  31. arXiv:1911.03903

    cs.CL

    A Re-evaluation of Knowledge Graph Completion Methods

    Authors: Zhiqing Sun, Shikhar Vashishth, Soumya Sanyal, Partha Talukdar, Yiming Yang

    Abstract: Knowledge Graph Completion (KGC) aims at automatically predicting missing links for large-scale knowledge graphs. A vast number of state-of-the-art KGC techniques have been published at top conferences in several research fields, including data mining, machine learning, and natural language processing. However, we notice that several recent papers report very high performance, which largely outperf…

    Submitted 8 July, 2020; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: Accepted at ACL 2020

  32. arXiv:1911.03082

    cs.LG stat.ML

    Composition-based Multi-Relational Graph Convolutional Networks

    Authors: Shikhar Vashishth, Soumya Sanyal, Vikram Nitin, Partha Talukdar

    Abstract: Graph Convolutional Networks (GCNs) have recently been shown to be quite successful in modeling graph-structured data. However, the primary focus has been on handling simple undirected graphs. Multi-relational graphs are a more general and prevalent form of graphs where each edge has a label and direction associated with it. Most of the existing approaches to handle such graphs suffer from over-pa…

    Submitted 18 January, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: In Proceedings of ICLR 2020

  33. arXiv:1911.00219

    cs.LG stat.ML

    InteractE: Improving Convolution-based Knowledge Graph Embeddings by Increasing Feature Interactions

    Authors: Shikhar Vashishth, Soumya Sanyal, Vikram Nitin, Nilesh Agrawal, Partha Talukdar

    Abstract: Most existing knowledge graphs suffer from incompleteness, which can be alleviated by inferring missing links based on known facts. One popular way to accomplish this is to generate low-dimensional embeddings of entities and relations, and use these to make inferences. ConvE, a recently proposed approach, applies convolutional filters on 2D reshapings of entity and relation embeddings in order to…

    Submitted 24 September, 2020; v1 submitted 1 November, 2019; originally announced November 2019.

    Comments: Accepted at AAAI 2020

  34. arXiv:1906.11861

    cs.CL

    Relating Simple Sentence Representations in Deep Neural Networks and the Brain

    Authors: Sharmistha Jat, Hao Tang, Partha Talukdar, Tom Mitchell

    Abstract: What is the relationship between sentence representations learned by deep recurrent models and those encoded by the brain? Is there any correspondence between hidden layers of these recurrent models and brain regions when processing sentences? Can these deep models be used to synthesize brain data which can then be utilized in other extrinsic tasks? We investigate these questions using sentenc…

    Submitted 27 June, 2019; originally announced June 2019.

    Comments: Association for Computational Linguistics (ACL) 2019

  35. arXiv:1902.02161

    cs.CL

    AD3: Attentive Deep Document Dater

    Authors: Swayambhu Nath Ray, Shib Sankar Dasgupta, Partha Talukdar

    Abstract: Knowledge of the creation date of documents facilitates several tasks such as summarization, event extraction, temporally focused information extraction etc. Unfortunately, for most of the documents on the Web, the time-stamp metadata is either missing or can't be trusted. Thus, predicting creation time from document content itself is an important task. In this paper, we propose Attentive Deep Doc…

    Submitted 21 January, 2019; originally announced February 2019.

    Journal ref: DBLP:conf/emnlp/RayDT18 (2018)

  36. arXiv:1902.00175

    cs.CL cs.AI cs.LG

    Dating Documents using Graph Convolution Networks

    Authors: Shikhar Vashishth, Shib Sankar Dasgupta, Swayambhu Nath Ray, Partha Talukdar

    Abstract: Document date is essential for many important tasks, such as document retrieval, summarization, event detection, etc. While existing approaches for these tasks assume accurate knowledge of the document date, this is not always available, especially for arbitrary documents from the Web. Document Dating is a challenging problem which requires inference over the temporal structure of the document. Pr…

    Submitted 31 January, 2019; originally announced February 2019.

    Comments: Accepted at ACL 2018

    Journal ref: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics 2018

  37. CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information

    Authors: Shikhar Vashishth, Prince Jain, Partha Talukdar

    Abstract: Open Information Extraction (OpenIE) methods extract (noun phrase, relation phrase, noun phrase) triples from text, resulting in the construction of large Open Knowledge Bases (Open KBs). The noun phrases (NPs) and relation phrases in such Open KBs are not canonicalized, leading to the storage of redundant and ambiguous facts. Recent research has posed canonicalization of Open KBs as clustering ov…

    Submitted 31 January, 2019; originally announced February 2019.

    Comments: Accepted at WWW 2018

    Journal ref: International World Wide Web Conferences Steering Committee 2018

  38. arXiv:1901.08255

    cs.LG cs.SI stat.ML

    Confidence-based Graph Convolutional Networks for Semi-Supervised Learning

    Authors: Shikhar Vashishth, Prateek Yadav, Manik Bhandari, Partha Talukdar

    Abstract: Predicting properties of nodes in a graph is an important problem with applications in a variety of domains. Graph-based Semi-Supervised Learning (SSL) methods aim to address this problem by labeling a small subset of the nodes as seeds and then utilizing the graph structure to predict label scores for the rest of the nodes in the graph. Recently, Graph Convolutional Networks (GCNs) have achieved…

    Submitted 11 February, 2019; v1 submitted 24 January, 2019; originally announced January 2019.

    Comments: Accepted at AISTATS 2019

  39. arXiv:1812.04361

    cs.CL

    RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information

    Authors: Shikhar Vashishth, Rishabh Joshi, Sai Suman Prayaga, Chiranjib Bhattacharyya, Partha Talukdar

    Abstract: Distantly-supervised Relation Extraction (RE) methods train an extractor by automatically aligning relation instances in a Knowledge Base (KB) with unstructured text. In addition to relation instances, KBs often contain other relevant side information, such as aliases of relations (e.g., founded and co-founded are aliases for the relation founderOfCompany). RE models usually ignore such readily av…

    Submitted 11 February, 2019; v1 submitted 11 December, 2018; originally announced December 2018.

    Comments: 10 pages, 6 figures, EMNLP 2018

    Journal ref: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

  40. arXiv:1811.05660

    cs.LG cond-mat.mtrl-sci stat.ML

    MT-CGCNN: Integrating Crystal Graph Convolutional Neural Network with Multitask Learning for Material Property Prediction

    Authors: Soumya Sanyal, Janakiraman Balachandran, Naganand Yadati, Abhishek Kumar, Padmini Rajagopalan, Suchismita Sanyal, Partha Talukdar

    Abstract: Developing accurate, transferable and computationally inexpensive machine learning models can rapidly accelerate the discovery and development of new materials. Some of the major challenges involved in developing such models are (i) limited availability of materials data as compared to other fields, (ii) lack of a universal descriptor of materials to predict their various properties. The limited avai…

    Submitted 14 November, 2018; originally announced November 2018.

    Comments: NIPS Workshop on Machine Learning for Molecules and Materials

  41. arXiv:1809.04283

    cs.CL cs.LG

    Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks

    Authors: Shikhar Vashishth, Manik Bhandari, Prateek Yadav, Piyush Rai, Chiranjib Bhattacharyya, Partha Talukdar

    Abstract: Word embeddings have been widely adopted across several NLP applications. Most existing word embedding methods utilize sequential context of a word to learn its embedding. While there have been some attempts at utilizing syntactic context of a word, such methods result in an explosion of the vocabulary size. In this paper, we overcome this problem by proposing SynGCN, a flexible Graph Convolution…

    Submitted 20 July, 2019; v1 submitted 12 September, 2018; originally announced September 2018.

    Comments: 11 pages, 2 figures

    Journal ref: 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019)

  42. arXiv:1809.02589  [pdf, other]

    cs.LG stat.ML

    HyperGCN: A New Method of Training Graph Convolutional Networks on Hypergraphs

    Authors: Naganand Yadati, Madhav Nimishakavi, Prateek Yadav, Vikram Nitin, Anand Louis, Partha Talukdar

    Abstract: In many real-world network datasets such as co-authorship, co-citation, email communication, etc., relationships are complex and go beyond pairwise. Hypergraphs provide a flexible and natural modeling tool to model such complex relationships. The obvious existence of such complex relationships in many real-world networks naturally motivates the problem of learning with hypergraphs. A popular learni…

    Submitted 22 May, 2019; v1 submitted 7 September, 2018; originally announced September 2018.

  43. arXiv:1805.11365  [pdf, other]

    cs.LG stat.ML

    Lovasz Convolutional Networks

    Authors: Prateek Yadav, Madhav Nimishakavi, Naganand Yadati, Shikhar Vashishth, Arun Rajkumar, Partha Talukdar

    Abstract: Semi-supervised learning on graph structured data has received significant attention with the recent introduction of Graph Convolution Networks (GCN). While traditional methods have focused on optimizing a loss augmented with Laplacian regularization framework, GCNs perform an implicit Laplacian type regularization to capture local graph structure. In this work, we propose Lovasz Convolutional Net…

    Submitted 3 January, 2019; v1 submitted 29 May, 2018; originally announced May 2018.

    Comments: Accepted at AISTATS 2019

  44. arXiv:1804.06987  [pdf, other]

    cs.CL

    Improving Distantly Supervised Relation Extraction using Word and Entity Based Attention

    Authors: Sharmistha Jat, Siddhesh Khandelwal, Partha Talukdar

    Abstract: Relation extraction is the problem of classifying the relationship between two entities in a given sentence. Distant Supervision (DS) is a popular technique for developing relation extractors starting with limited supervision. We note that most of the sentences in the distant supervision relation extraction setting are very long and may benefit from word attention for better sentence representatio…

    Submitted 18 April, 2018; originally announced April 2018.

  45. arXiv:1802.06371  [pdf, other]

    cs.LG

    Inductive Framework for Multi-Aspect Streaming Tensor Completion with Side Information

    Authors: Madhav Nimishakavi, Bamdev Mishra, Manish Gupta, Partha Talukdar

    Abstract: Low-rank tensor completion is a well-studied problem and has applications in various fields. However, in many real-world applications the data is dynamic, i.e., new data arrives at different time intervals. As a result, the tensors used to represent the data grow in size. Besides the tensors, in many real-world scenarios, side information is also available in the form of matrices which also grow i…

    Submitted 1 September, 2018; v1 submitted 18 February, 2018; originally announced February 2018.

    Comments: Accepted to International Conference on Information and Knowledge Management (CIKM), 2018

  46. arXiv:1712.03547  [pdf, ps, other]

    cs.CL

    Inducing Interpretability in Knowledge Graph Embeddings

    Authors: Chandrahas, Tathagata Sengupta, Cibi Pragadeesh, Partha Pratim Talukdar

    Abstract: We study the problem of inducing interpretability in KG embeddings. Specifically, we explore the Universal Schema (Riedel et al., 2013) and propose a method to induce interpretability. Many vector space models have been proposed for the problem; however, most of these methods don't address the interpretability (semantics) of individual dimensions. In this work, we study this problem and prop…

    Submitted 10 December, 2017; originally announced December 2017.

  47. arXiv:1711.05401  [pdf, other]

    cs.AI cs.LG stat.ML

    Revisiting Simple Neural Networks for Learning Representations of Knowledge Graphs

    Authors: Srinivas Ravishankar, Chandrahas, Partha Pratim Talukdar

    Abstract: We address the problem of learning vector representations for entities and relations in Knowledge Graphs (KGs) for Knowledge Base Completion (KBC). This problem has received significant attention in the past few years and multiple methods have been proposed. Most of the existing methods in the literature use a predefined characteristic scoring function for evaluating the correctness of KG triples.…

    Submitted 8 January, 2018; v1 submitted 14 November, 2017; originally announced November 2017.

    Comments: 7 pages, submitted to and accepted in Automated Knowledge Base Construction (AKBC) Workshop 2017, at NIPS 2017

  48. arXiv:1710.09942  [pdf, other]

    cs.CL

    CANDiS: Coupled & Attention-Driven Neural Distant Supervision

    Authors: Tushar Nagarajan, Sharmistha, Partha Talukdar

    Abstract: Distant Supervision for Relation Extraction uses heuristically aligned text data with an existing knowledge base as training data. The unsupervised nature of this technique allows it to scale to web-scale relation extraction tasks, at the expense of noise in the training data. Previous work has explored relationships among instances of the same entity-pair to reduce this noise, but relationships a…

    Submitted 26 October, 2017; originally announced October 2017.

    Comments: WiNLP 2017

  49. arXiv:1707.01917  [pdf, other]

    cs.CL cs.IR

    Higher-order Relation Schema Induction using Tensor Factorization with Back-off and Aggregation

    Authors: Madhav Nimishakavi, Partha Talukdar

    Abstract: Relation Schema Induction (RSI) is the problem of identifying type signatures of arguments of relations from unlabeled text. Most of the previous work in this area has focused only on binary RSI, i.e., inducing only the subject and object type signatures per relation. However, in practice, many relations are high-order, i.e., they have more than two arguments and inducing type signatures of all a…

    Submitted 29 May, 2018; v1 submitted 6 July, 2017; originally announced July 2017.

  50. arXiv:1610.06912  [pdf, other]

    cs.AI

    KGEval: Estimating Accuracy of Automatically Constructed Knowledge Graphs

    Authors: Prakhar Ojha, Partha Talukdar

    Abstract: Automatic construction of large knowledge graphs (KG) by mining web-scale text datasets has received considerable attention recently. Estimating accuracy of such automatically constructed KGs is a challenging problem due to their size and diversity. This important problem has largely been ignored in prior research; we fill this gap and propose KGEval. KGEval binds facts of a KG using coupling const…

    Submitted 1 December, 2016; v1 submitted 21 October, 2016; originally announced October 2016.