Search | arXiv e-print repository

The Third Monocular Depth Estimation Challenge

Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, YiPing Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora , et al. (16 additional authors not shown)

Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 su… ▽ More This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 submissions outperforming the baseline on the test set: 10 among them submitted a report describing their approach, highlighting a diffused use of foundational models such as Depth Anything at the core of their method. The challenge winners drastically improved 3D F-Score performance, from 17.51% to 23.72%. △ Less

Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

Comments: To appear in CVPRW2024

arXiv:2404.13222 [pdf, other]

Vim4Path: Self-Supervised Vision Mamba for Histopathology Images

Authors: Ali Nasiri-Sarvi, Vincent Quoc-Huy Trinh, Hassan Rivaz, Mahdi S. Hosseini

Abstract: Representation learning from Gigapixel Whole Slide Images (WSI) poses a significant challenge in computational pathology due to the complicated nature of tissue structures and the scarcity of labeled data. Multi-instance learning methods have addressed this challenge, leveraging image patches to classify slides utilizing pretrained models using Self-Supervised Learning (SSL) approaches. The perfor… ▽ More Representation learning from Gigapixel Whole Slide Images (WSI) poses a significant challenge in computational pathology due to the complicated nature of tissue structures and the scarcity of labeled data. Multi-instance learning methods have addressed this challenge, leveraging image patches to classify slides utilizing pretrained models using Self-Supervised Learning (SSL) approaches. The performance of both SSL and MIL methods relies on the architecture of the feature encoder. This paper proposes leveraging the Vision Mamba (Vim) architecture, inspired by state space models, within the DINO framework for representation learning in computational pathology. We evaluate the performance of Vim against Vision Transformers (ViT) on the Camelyon16 dataset for both patch-level and slide-level classification. Our findings highlight Vim's enhanced performance compared to ViT, particularly at smaller scales, where Vim achieves an 8.21 increase in ROC AUC for models of similar size. An explainability analysis further highlights Vim's capabilities, which reveals that Vim uniquely emulates the pathologist workflow-unlike ViT. This alignment with human expert analysis highlights Vim's potential in practical diagnostic settings and contributes significantly to developing effective representation-learning algorithms in computational pathology. We release the codes and pretrained weights at \url{https://github.com/AtlasAnalyticsLab/Vim4Path}. △ Less

Submitted 19 April, 2024; originally announced April 2024.

Comments: Accepted in CVPR2023 (9th Workshop on Computer Vision for Microscopy Image Analysis)

arXiv:2404.02348 [pdf, other]

COVID-19 Detection Based on Blood Test Parameters using Various Artificial Intelligence Methods

Authors: Kavian Khanjani, Seyed Rasoul Hosseini, Shahrzad Shashaani, Mohammad Teshnehlab

Abstract: In 2019, the world faced a new challenge: a COVID-19 disease caused by the novel coronavirus, SARS-CoV-2. The virus rapidly spread across the globe, leading to a high rate of mortality, which prompted health organizations to take measures to control its transmission. Early disease detection is crucial in the treatment process, and computer-based automatic detection systems have been developed to a… ▽ More In 2019, the world faced a new challenge: a COVID-19 disease caused by the novel coronavirus, SARS-CoV-2. The virus rapidly spread across the globe, leading to a high rate of mortality, which prompted health organizations to take measures to control its transmission. Early disease detection is crucial in the treatment process, and computer-based automatic detection systems have been developed to aid in this effort. These systems often rely on artificial intelligence (AI) approaches such as machine learning, neural networks, fuzzy systems, and deep learning to classify diseases. This study aimed to differentiate COVID-19 patients from others using self-categorizing classifiers and employing various AI methods. This study used two datasets: the blood test samples and radiography images. The best results for the blood test samples obtained from San Raphael Hospital, which include two classes of individuals, those with COVID-19 and those with non-COVID diseases, were achieved through the use of the Ensemble method (a combination of a neural network and two machines learning methods). The results showed that this approach for COVID-19 diagnosis is cost-effective and provides results in a shorter amount of time than other methods. The proposed model achieved an accuracy of 94.09% on the dataset used. Secondly, the radiographic images were divided into four classes: normal, viral pneumonia, ground glass opacity, and COVID-19 infection. These were used for segmentation and classification. The lung lobes were extracted from the images and then categorized into specific classes. We achieved an accuracy of 91.1% on the image dataset. Generally, this study highlights the potential of AI in detecting and managing COVID-19 and underscores the importance of continued research and development in this field. △ Less

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2403.19782 [pdf, other]

ENet-21: An Optimized light CNN Structure for Lane Detection

Authors: Seyed Rasoul Hosseini, Mohammad Teshnehlab

Abstract: Lane detection for autonomous vehicles is an important concept, yet it is a challenging issue of driver assistance systems in modern vehicles. The emergence of deep learning leads to significant progress in self-driving cars. Conventional deep learning-based methods handle lane detection problems as a binary segmentation task and determine whether a pixel belongs to a line. These methods rely on t… ▽ More Lane detection for autonomous vehicles is an important concept, yet it is a challenging issue of driver assistance systems in modern vehicles. The emergence of deep learning leads to significant progress in self-driving cars. Conventional deep learning-based methods handle lane detection problems as a binary segmentation task and determine whether a pixel belongs to a line. These methods rely on the assumption of a fixed number of lanes, which does not always work. This study aims to develop an optimal structure for the lane detection problem, offering a promising solution for driver assistance features in modern vehicles by utilizing a machine learning method consisting of binary segmentation and Affinity Fields that can manage varying numbers of lanes and lane change scenarios. In this approach, the Convolutional Neural Network (CNN), is selected as a feature extractor, and the final output is obtained through clustering of the semantic segmentation and Affinity Field outputs. Our method uses less complex CNN architecture than exi △ Less

Submitted 28 March, 2024; originally announced March 2024.

Comments: The paper is under review by Soft Computing journal

arXiv:2403.14671 [pdf]

Understanding the Transit Gap: A Comparative Study of On-Demand Bus Services and Urban Climate Resilience in South End, Charlotte, NC and Avondale, Chattanooga, TN

Authors: Sanaz Sadat Hosseini, Babak Rahimi Ardabili, Mona Azarbayjani, Srinivas Pulugurtha, Hamed Tabkhi

Abstract: Urban design significantly impacts sustainability, particularly in the context of public transit efficiency and carbon emissions reduction. This study explores two neighborhoods with distinct urban designs: South End, Charlotte, NC, featuring a dynamic mixed-use urban design pattern, and Avondale, Chattanooga, TN, with a residential suburban grid layout. Using the TRANSIT-GYM tool, we assess the i… ▽ More Urban design significantly impacts sustainability, particularly in the context of public transit efficiency and carbon emissions reduction. This study explores two neighborhoods with distinct urban designs: South End, Charlotte, NC, featuring a dynamic mixed-use urban design pattern, and Avondale, Chattanooga, TN, with a residential suburban grid layout. Using the TRANSIT-GYM tool, we assess the impact of increased bus utilization in these different urban settings on traffic and CO2 emissions. Our results highlight the critical role of urban design and planning in transit system efficiency. In South End, the mixed-use design led to more substantial emission reductions, indicating that urban layout can significantly influence public transit outcomes. Tailored strategies that consider the unique urban design elements are essential for climate resilience. Notably, doubling bus utilization decreased daily emissions by 10.18% in South End and 8.13% in Avondale, with a corresponding reduction in overall traffic. A target of 50% bus utilization saw emissions drop by 21.45% in South End and 14.50% in Avondale. At an idealistic goal of 70% bus utilization, South End and Avondale witnessed emission reductions of 37.22% and 27.80%, respectively. These insights are crucial for urban designers and policymakers in developing sustainable urban landscapes. △ Less

Submitted 30 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

Comments: 6 pages, 4 figures, Recently accepted by the PLEA 2024 Conference for Sustainable Architecture and Urban Design (Re-Thinking Resilience), Wroclaw, Poland

arXiv:2312.01351 [pdf]

Deep learning and traditional-based CAD schemes for the pulmonary embolism diagnosis: A survey

Authors: Seyed Hesamoddin Hosseini, Amir Hossein Taherinia, Mahdi Saadatmand

Abstract: Nowadays, pulmonary Computed Tomography Angiography (CTA) is the main tool for detecting Pulmonary Embolism (PE). However, manual interpretation of CTA volume requires a radiologist, which is time-consuming and error-prone due to the specific conditions of lung tissue, large volume of data, lack of experience, and eye fatigue. Therefore, Computer-Aided Design (CAD) systems are used as a second opi… ▽ More Nowadays, pulmonary Computed Tomography Angiography (CTA) is the main tool for detecting Pulmonary Embolism (PE). However, manual interpretation of CTA volume requires a radiologist, which is time-consuming and error-prone due to the specific conditions of lung tissue, large volume of data, lack of experience, and eye fatigue. Therefore, Computer-Aided Design (CAD) systems are used as a second opinion for the diagnosis of PE. The purpose of this article is to review, evaluate, and compare the performance of deep learning and traditional-based CAD system for diagnosis PE and to help physicians and researchers in this field. In this study, all articles available in databases such as IEEE, ScienceDirect, Wiley, Springer, Nature, and Wolters Kluwer in the field of PE diagnosis were examined using traditional and deep learning methods. From 2002 to 2023, 23 papers were studied to extract the articles with the considered limitations. Each paper presents an automatic PE detection system that we evaluate using criteria such as sensitivity, False Positives (FP), and the number of datasets. This research work includes recent studies, state-of-the-art research works, and a more comprehensive overview compared to previously published review articles in this research area. △ Less

Submitted 3 December, 2023; originally announced December 2023.

Comments: 22 pages, 6 figures, 5 tables

arXiv:2311.18140 [pdf, other]

ROBBIE: Robust Bias Evaluation of Large Generative Language Models

Authors: David Esiobu, Xiaoqing Tan, Saghar Hosseini, Megan Ung, Yuchen Zhang, Jude Fernandes, Jane Dwivedi-Yu, Eleonora Presani, Adina Williams, Eric Michael Smith

Abstract: As generative large language models (LLMs) grow more performant and prevalent, we must develop comprehensive enough tools to measure and improve their fairness. Different prompt-based datasets can be used to measure social bias across multiple text domains and demographic axes, meaning that testing LLMs on more datasets can potentially help us characterize their biases more fully, and better ensur… ▽ More As generative large language models (LLMs) grow more performant and prevalent, we must develop comprehensive enough tools to measure and improve their fairness. Different prompt-based datasets can be used to measure social bias across multiple text domains and demographic axes, meaning that testing LLMs on more datasets can potentially help us characterize their biases more fully, and better ensure equal and equitable treatment of marginalized demographic groups. In this work, our focus is two-fold: (1) Benchmarking: a comparison of 6 different prompt-based bias and toxicity metrics across 12 demographic axes and 5 families of generative LLMs. Out of those 6 metrics, AdvPromptSet and HolisticBiasR are novel datasets proposed in the paper. The comparison of those benchmarks gives us insights about the bias and toxicity of the compared models. Therefore, we explore the frequency of demographic terms in common LLM pre-training corpora and how this may relate to model biases. (2) Mitigation: we conduct a comprehensive study of how well 3 bias/toxicity mitigation techniques perform across our suite of measurements. ROBBIE aims to provide insights for practitioners while deploying a model, emphasizing the need to not only measure potential harms, but also understand how they arise by characterizing the data, mitigate harms once found, and balance any trade-offs. We open-source our analysis code in hopes of encouraging broader measurements of bias in future LLMs. △ Less

Submitted 29 November, 2023; originally announced November 2023.

Comments: EMNLP 2023

arXiv:2310.13996 [pdf, other]

Emulating the Human Mind: A Neural-symbolic Link Prediction Model with Fast and Slow Reasoning and Filtered Rules

Authors: Mohammad Hossein Khojasteh, Najmeh Torabian, Ali Farjami, Saeid Hosseini, Behrouz Minaei-Bidgoli

Abstract: Link prediction is an important task in addressing the incompleteness problem of knowledge graphs (KG). Previous link prediction models suffer from issues related to either performance or explanatory capability. Furthermore, models that are capable of generating explanations, often struggle with erroneous paths or reasoning leading to the correct answer. To address these challenges, we introduce a… ▽ More Link prediction is an important task in addressing the incompleteness problem of knowledge graphs (KG). Previous link prediction models suffer from issues related to either performance or explanatory capability. Furthermore, models that are capable of generating explanations, often struggle with erroneous paths or reasoning leading to the correct answer. To address these challenges, we introduce a novel Neural-Symbolic model named FaSt-FLiP (stands for Fast and Slow Thinking with Filtered rules for Link Prediction task), inspired by two distinct aspects of human cognition: "commonsense reasoning" and "thinking, fast and slow." Our objective is to combine a logical and neural model for enhanced link prediction. To tackle the challenge of dealing with incorrect paths or rules generated by the logical model, we propose a semi-supervised method to convert rules into sentences. These sentences are then subjected to assessment and removal of incorrect rules using an NLI (Natural Language Inference) model. Our approach to combining logical and neural models involves first obtaining answers from both the logical and neural models. These answers are subsequently unified using an Inference Engine module, which has been realized through both algorithmic implementation and a novel neural model architecture. To validate the efficacy of our model, we conducted a series of experiments. The results demonstrate the superior performance of our model in both link prediction metrics and the generation of more reliable explanations. △ Less

Submitted 21 October, 2023; originally announced October 2023.

arXiv:2310.04855 [pdf, other]

Epsilon non-Greedy: A Bandit Approach for Unbiased Recommendation via Uniform Data

Authors: S. M. F. Sani, Seyed Abbas Hosseini, Hamid R. Rabiee

Abstract: Often, recommendation systems employ continuous training, leading to a self-feedback loop bias in which the system becomes biased toward its previous recommendations. Recent studies have attempted to mitigate this bias by collecting small amounts of unbiased data. While these studies have successfully developed less biased models, they ignore the crucial fact that the recommendations generated by… ▽ More Often, recommendation systems employ continuous training, leading to a self-feedback loop bias in which the system becomes biased toward its previous recommendations. Recent studies have attempted to mitigate this bias by collecting small amounts of unbiased data. While these studies have successfully developed less biased models, they ignore the crucial fact that the recommendations generated by the model serve as the training data for subsequent training sessions. To address this issue, we propose a framework that learns an unbiased estimator using a small amount of uniformly collected data and focuses on generating improved training data for subsequent training iterations. To accomplish this, we view recommendation as a contextual multi-arm bandit problem and emphasize on exploring items that the model has a limited understanding of. We introduce a new offline sequential training schema that simulates real-world continuous training scenarios in recommendation systems, offering a more appropriate framework for studying self-feedback bias. We demonstrate the superiority of our model over state-of-the-art debiasing methods by conducting extensive experiments using the proposed training schema. △ Less

Submitted 7 October, 2023; originally announced October 2023.

arXiv:2310.03248 [pdf, other]

Xcrum: A Synergistic Approach Integrating Extreme Programming with Scrum

Authors: Siavash Hosseini

Abstract: In today's modern world, software plays a pivotal role. Software development is a highly complex and time-consuming process, demanding multidimensional efforts. Companies continually adapt their requirements to align with the evolving environment, with a specific emphasis on rapid delivery and the acceptance of changing requirements. Traditional models, such as plan-driven development, often fall… ▽ More In today's modern world, software plays a pivotal role. Software development is a highly complex and time-consuming process, demanding multidimensional efforts. Companies continually adapt their requirements to align with the evolving environment, with a specific emphasis on rapid delivery and the acceptance of changing requirements. Traditional models, such as plan-driven development, often fall short in meeting these demands. In the realm of software development, Agile has been the focal point of global discourse for both researchers and developers. Agile development is better suited to customize and streamline the development process, offering a highly flexible, early, and rapid delivery lifecycle conducive to efficient software development. This article aims to provide an overview of two prominent Agile methodologies: Scrum and Extreme Programming (XP). It achieves this by reviewing relevant publications, analyzing their impact on software development, exploring the distinctive features of each methodology, and conducting a comparative assessment. Furthermore, the article offers personal insights and recommendations. Notably, the integration of XP practices into Scrum has given rise to a novel hybrid methodology known as "Xcrum," which retains its agility. It should be highlighted that, given this new approach's incorporation of the strengths of both methods, it holds the potential to outperform the original frameworks. △ Less

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2310.02473 [pdf, other]

Prompting-based Temporal Domain Generalization

Authors: Sepidehsadat Hosseini, Mengyao Zhai, Hossein Hajimirsadegh, Frederick Tung

Abstract: Machine learning traditionally assumes that the training and testing data are distributed independently and identically. However, in many real-world settings, the data distribution can shift over time, leading to poor generalization of trained models in future time periods. This paper presents a novel prompting-based approach to temporal domain generalization that is parameter-efficient, time-effi… ▽ More Machine learning traditionally assumes that the training and testing data are distributed independently and identically. However, in many real-world settings, the data distribution can shift over time, leading to poor generalization of trained models in future time periods. This paper presents a novel prompting-based approach to temporal domain generalization that is parameter-efficient, time-efficient, and does not require access to future data during training. Our method adapts a trained model to temporal drift by learning global prompts, domain-specific prompts, and drift-aware prompts that capture underlying temporal dynamics. Experiments on classification, regression, and time series forecasting tasks demonstrate the generality of the proposed approach. The code repository will be publicly shared. △ Less

Submitted 15 February, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

arXiv:2309.15770 [pdf, other]

Generating Transferable Adversarial Simulation Scenarios for Self-Driving via Neural Rendering

Authors: Yasasa Abeysirigoonawardena, Kevin Xie, Chuhan Chen, Salar Hosseini, Ruiting Chen, Ruiqi Wang, Florian Shkurti

Abstract: Self-driving software pipelines include components that are learned from a significant number of training examples, yet it remains challenging to evaluate the overall system's safety and generalization performance. Together with scaling up the real-world deployment of autonomous vehicles, it is of critical importance to automatically find simulation scenarios where the driving policies will fail.… ▽ More Self-driving software pipelines include components that are learned from a significant number of training examples, yet it remains challenging to evaluate the overall system's safety and generalization performance. Together with scaling up the real-world deployment of autonomous vehicles, it is of critical importance to automatically find simulation scenarios where the driving policies will fail. We propose a method that efficiently generates adversarial simulation scenarios for autonomous driving by solving an optimal control problem that aims to maximally perturb the policy from its nominal trajectory. Given an image-based driving policy, we show that we can inject new objects in a neural rendering representation of the deployment scene, and optimize their texture in order to generate adversarial sensor inputs to the policy. We demonstrate that adversarial scenarios discovered purely in the neural renderer (surrogate scene) can often be successfully transferred to the deployment scene, without further optimization. We demonstrate this transfer occurs both in simulated and real environments, provided the learned surrogate scene is sufficiently close to the deployment scene. △ Less

Submitted 23 January, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

Comments: Conference paper submitted to CoRL 23

arXiv:2308.16105 [pdf, other]

Advanced Deep Regression Models for Forecasting Time Series Oil Production

Authors: Siavash Hosseini, Thangarajah Akilan

Abstract: Global oil demand is rapidly increasing and is expected to reach 106.3 million barrels per day by 2040. Thus, it is vital for hydrocarbon extraction industries to forecast their production to optimize their operations and avoid losses. Big companies have realized that exploiting the power of deep learning (DL) and the massive amount of data from various oil wells for this purpose can save a lot of… ▽ More Global oil demand is rapidly increasing and is expected to reach 106.3 million barrels per day by 2040. Thus, it is vital for hydrocarbon extraction industries to forecast their production to optimize their operations and avoid losses. Big companies have realized that exploiting the power of deep learning (DL) and the massive amount of data from various oil wells for this purpose can save a lot of operational costs and reduce unwanted environmental impacts. In this direction, researchers have proposed models using conventional machine learning (ML) techniques for oil production forecasting. However, these techniques are inappropriate for this problem as they can not capture historical patterns found in time series data, resulting in inaccurate predictions. This research aims to overcome these issues by developing advanced data-driven regression models using sequential convolutions and long short-term memory (LSTM) units. Exhaustive analyses are conducted to select the optimal sequence length, model hyperparameters, and cross-well dataset formation to build highly generalized robust models. A comprehensive experimental study on Volve oilfield data validates the proposed models. It reveals that the LSTM-based sequence learning model can predict oil production better than the 1-D convolutional neural network (CNN) with mean absolute error (MAE) and R2 score of 111.16 and 0.98, respectively. It is also found that the LSTM-based model performs better than all the existing state-of-the-art solutions and achieves a 37% improvement compared to a standard linear regression, which is considered the baseline model in this work. △ Less

Submitted 30 August, 2023; originally announced August 2023.

arXiv:2307.09288 [pdf, other]

Llama 2: Open Foundation and Fine-Tuned Chat Models

Authors: Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini , et al. (43 additional authors not shown)

Abstract: In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be… ▽ More In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs. △ Less

Submitted 19 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

arXiv:2307.03967 [pdf, other]

End-to-End Supervised Multilabel Contrastive Learning

Authors: Ahmad Sajedi, Samir Khaki, Konstantinos N. Plataniotis, Mahdi S. Hosseini

Abstract: Multilabel representation learning is recognized as a challenging problem that can be associated with either label dependencies between object categories or data-related issues such as the inherent imbalance of positive/negative samples. Recent advances address these challenges from model- and data-centric viewpoints. In model-centric, the label correlation is obtained by an external model designs… ▽ More Multilabel representation learning is recognized as a challenging problem that can be associated with either label dependencies between object categories or data-related issues such as the inherent imbalance of positive/negative samples. Recent advances address these challenges from model- and data-centric viewpoints. In model-centric, the label correlation is obtained by an external model designs (e.g., graph CNN) to incorporate an inductive bias for training. However, they fail to design an end-to-end training framework, leading to high computational complexity. On the contrary, in data-centric, the realistic nature of the dataset is considered for improving the classification while ignoring the label dependencies. In this paper, we propose a new end-to-end training framework -- dubbed KMCL (Kernel-based Mutlilabel Contrastive Learning) -- to address the shortcomings of both model- and data-centric designs. The KMCL first transforms the embedded features into a mixture of exponential kernels in Gaussian RKHS. It is then followed by encoding an objective loss that is comprised of (a) reconstruction loss to reconstruct kernel representation, (b) asymmetric classification loss to address the inherent imbalance problem, and (c) contrastive loss to capture label correlation. The KMCL models the uncertainty of the feature encoder while maintaining a low computational footprint. Extensive experiments are conducted on image classification tasks to showcase the consistent improvements of KMCL over the SOTA methods. PyTorch implementation is provided in \url{https://github.com/mahdihosseini/KMCL}. △ Less

Submitted 8 July, 2023; originally announced July 2023.

arXiv:2304.06467 [pdf]

Towards Understanding the Benefits and Challenges of Demand Responsive Public Transit- A Case Study in the City of Charlotte, NC

Authors: Sanaz Sadat Hosseini, Mona Azarbayjani, Jason Lawrence, Hamed Tabkhi

Abstract: Access to adequate public transportation plays a critical role in inequity and socio-economic mobility, particularly in low-income communities. Low-income workers who rely heavily on public transportation face a spatial disparity between home and work, which leads to higher unemployment, longer job searches, and longer commute times. The overarching goal of this study is to get initial data that w… ▽ More Access to adequate public transportation plays a critical role in inequity and socio-economic mobility, particularly in low-income communities. Low-income workers who rely heavily on public transportation face a spatial disparity between home and work, which leads to higher unemployment, longer job searches, and longer commute times. The overarching goal of this study is to get initial data that would result in creating a connected, coordinated, demand-responsive, and efficient public bus system that minimizes transit gaps for low-income, transit-dependent communities. To create equitable metropolitan public transportation, this paper evaluates existing CATS mobile applications that assist passengers in finding bus routes and arrival times. Our community survey methodology includes filling out questionnaires on Charlotte's current bus system on specific bus lines and determining user acceptance for a future novel smart technology. We have also collected data on the demand and transit gap for a real-world pilot study, Sprinter bus line, Bus line 7, Bus line 9, and Bus lines 97-99. These lines connect all of Charlotte City's main areas and are the most important bus lines in the system. On the studied routes, the primary survey results indicate that the current bus system has many flaws, the major one being the lack of proper timing to meet the needs of passengers. The most common problems are long commutes and long waiting times at stations. Moreover, the existing application provides inaccurate information, and on average, 80 percent of travelers and respondents are inclined to use new technology. △ Less

Submitted 8 April, 2023; originally announced April 2023.

Comments: 22 pages, 54 figures

arXiv:2304.05482 [pdf, other]

Computational Pathology: A Survey Review and The Way Forward

Authors: Mahdi S. Hosseini, Babak Ehteshami Bejnordi, Vincent Quoc-Huy Trinh, Danial Hasan, Xingwen Li, Taehyo Kim, Haochen Zhang, Theodore Wu, Kajanan Chinniah, Sina Maghsoudlou, Ryan Zhang, Stephen Yang, Jiadai Zhu, Lyndon Chan, Samir Khaki, Andrei Buin, Fatemeh Chaji, Ala Salehi, Bich Ngoc Nguyen, Dimitris Samaras, Konstantinos N. Plataniotis

Abstract: Computational Pathology CPath is an interdisciplinary science that augments developments of computational approaches to analyze and model medical histopathology images. The main objective for CPath is to develop infrastructure and workflows of digital diagnostics as an assistive CAD system for clinical pathology, facilitating transformational changes in the diagnosis and treatment of cancer that a… ▽ More Computational Pathology CPath is an interdisciplinary science that augments developments of computational approaches to analyze and model medical histopathology images. The main objective for CPath is to develop infrastructure and workflows of digital diagnostics as an assistive CAD system for clinical pathology, facilitating transformational changes in the diagnosis and treatment of cancer that are mainly address by CPath tools. With evergrowing developments in deep learning and computer vision algorithms, and the ease of the data flow from digital pathology, currently CPath is witnessing a paradigm shift. Despite the sheer volume of engineering and scientific works being introduced for cancer image analysis, there is still a considerable gap of adopting and integrating these algorithms in clinical practice. This raises a significant question regarding the direction and trends that are undertaken in CPath. In this article we provide a comprehensive review of more than 800 papers to address the challenges faced in problem design all-the-way to the application and implementation viewpoints. We have catalogued each paper into a model-card by examining the key works and challenges faced to layout the current landscape in CPath. We hope this helps the community to locate relevant works and facilitate understanding of the field's future directions. In a nutshell, we oversee the CPath developments in cycle of stages which are required to be cohesively linked together to address the challenges associated with such multidisciplinary science. We overview this cycle from different perspectives of data-centric, model-centric, and application-centric problems. We finally sketch remaining challenges and provide directions for future technical developments and clinical integration of CPath (https://github.com/AtlasAnalyticsLab/CPath_Survey). △ Less

Submitted 27 January, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: Accepted in Elsevier Journal of Pathology Informatics (JPI) 2024

arXiv:2303.16733 [pdf, other]

doi 10.1145/3578503.3583611

Emotional Framing in the Spreading of False and True Claims

Authors: Akram Sadat Hosseini, Steffen Staab

Abstract: The explosive growth of online misinformation, such as false claims, has affected the social behavior of online users. In order to be persuasive and mislead the audience, false claims are made to trigger emotions in their audience. This paper contributes to understanding how misinformation in social media is shaped by investigating the emotional framing that authors of the claims try to create for… ▽ More The explosive growth of online misinformation, such as false claims, has affected the social behavior of online users. In order to be persuasive and mislead the audience, false claims are made to trigger emotions in their audience. This paper contributes to understanding how misinformation in social media is shaped by investigating the emotional framing that authors of the claims try to create for their audience. We investigate how, firstly, the existence of emotional framing in the claims depends on the topic and credibility of the claims. Secondly, we explore how emotionally framed content triggers emotional response posts by social media users, and how emotions expressed in claims and corresponding users' response posts affect their sharing behavior on social media. Analysis of four data sets covering different topics (politics, health, Syrian war, and COVID-19) reveals that authors shape their claims depending on the topic area to pass targeted emotions to their audience. By analysing responses to claims, we show that the credibility of the claim influences the distribution of emotions that the claim incites in its audience. Moreover, our analysis shows that emotions expressed in the claims are repeated in the users' responses. Finally, the analysis of users' sharing behavior shows that negative emotional framing such as anger, fear, and sadness of false claims leads to more interaction among users than positive emotions. This analysis also reveals that in the claims that trigger happy responses, true claims result in more sharing compared to false claims. △ Less

Submitted 29 March, 2023; originally announced March 2023.

arXiv:2303.15495 [pdf, other]

Real-Time Bus Arrival Prediction: A Deep Learning Approach for Enhanced Urban Mobility

Authors: Narges Rashvand, Sanaz Sadat Hosseini, Mona Azarbayjani, Hamed Tabkhi

Abstract: In urban settings, bus transit stands as a significant mode of public transportation, yet faces hurdles in delivering accurate and reliable arrival times. This discrepancy often culminates in delays and a decline in ridership, particularly in areas with a heavy reliance on bus transit. A prevalent challenge is the mismatch between actual bus arrival times and their scheduled counterparts, leading… ▽ More In urban settings, bus transit stands as a significant mode of public transportation, yet faces hurdles in delivering accurate and reliable arrival times. This discrepancy often culminates in delays and a decline in ridership, particularly in areas with a heavy reliance on bus transit. A prevalent challenge is the mismatch between actual bus arrival times and their scheduled counterparts, leading to disruptions in fixed schedules. Our study, utilizing New York City bus data, reveals an average delay of approximately eight minutes between scheduled and actual bus arrival times. This research introduces an innovative, AI-based, data-driven methodology for predicting bus arrival times at various transit points (stations), offering a collective prediction for all bus lines within large metropolitan areas. Through the deployment of a fully connected neural network, our method elevates the accuracy and efficiency of public bus transit systems. Our comprehensive evaluation encompasses over 200 bus lines and 2 million data points, showcasing an error margin of under 40 seconds for arrival time estimates. Additionally, the inference time for each data point in the validation set is recorded at below 0.006 ms, demonstrating the potential of our Neural-Net-based approach in substantially enhancing the punctuality and reliability of bus transit systems. △ Less

Submitted 4 March, 2024; v1 submitted 27 March, 2023; originally announced March 2023.

arXiv:2301.09211 [pdf, other]

An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models

Authors: Saghar Hosseini, Hamid Palangi, Ahmed Hassan Awadallah

Abstract: Large-scale Pre-Trained Language Models (PTLMs) capture knowledge from massive human-written data which contains latent societal biases and toxic contents. In this paper, we leverage the primary task of PTLMs, i.e., language modeling, and propose a new metric to quantify manifested implicit representational harms in PTLMs towards 13 marginalized demographics. Using this metric, we conducted an emp… ▽ More Large-scale Pre-Trained Language Models (PTLMs) capture knowledge from massive human-written data which contains latent societal biases and toxic contents. In this paper, we leverage the primary task of PTLMs, i.e., language modeling, and propose a new metric to quantify manifested implicit representational harms in PTLMs towards 13 marginalized demographics. Using this metric, we conducted an empirical analysis of 24 widely used PTLMs. Our analysis provides insights into the correlation between the proposed metric in this work and other related metrics for representational harm. We observe that our metric correlates with most of the gender-specific metrics in the literature. Through extensive experiments, we explore the connections between PTLMs architectures and representational harms across two dimensions: depth and width of the networks. We found that prioritizing depth over width, mitigates representational harms in some PTLMs. Our code and data can be found at https://github.com/microsoft/SafeNLP. △ Less

Submitted 22 January, 2023; originally announced January 2023.

Comments: 17 pages,

ACM Class: I.2.7

arXiv:2301.01286 [pdf, other]

Pseudo-Inverted Bottleneck Convolution for DARTS Search Space

Authors: Arash Ahmadian, Louis S. P. Liu, Yue Fei, Konstantinos N. Plataniotis, Mahdi S. Hosseini

Abstract: Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based neural architecture search method. Since the introduction of DARTS, there has been little work done on adapting the action space based on state-of-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-desig… ▽ More Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based neural architecture search method. Since the introduction of DARTS, there has been little work done on adapting the action space based on state-of-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. We introduce the Pseudo-Inverted Bottleneck Conv (PIBConv) block intending to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and outperforms a DARTS network with similar size significantly, at layer counts as small as 2. Furthermore, with less layers, not only does it achieve higher accuracy with lower computational footprint (measured in GMACs) and parameter count, GradCAM comparisons show that our network can better detect distinctive features of target objects compared to DARTS. Code is available from https://github.com/mahdihosseini/PIBConv. △ Less

Submitted 18 March, 2023; v1 submitted 31 December, 2022; originally announced January 2023.

Comments: 5 pages

arXiv:2301.01240 [pdf, other]

Modeling Effective Lifespan of Payment Channels

Authors: Soheil Zibakhsh Shabgahi, Seyed Mahdi Hosseini, Seyed Pooya Shariatpanahi, Behnam Bahrak

Abstract: While being decentralized, secure, and reliable, Bitcoin and many other blockchain-based cryptocurrencies suffer from scalability issues. One of the promising proposals to address this problem is off-chain payment channels. Since, not all nodes are connected directly to each other, they can use a payment network to route their payments. Each node allocates a balance that is frozen during the chann… ▽ More While being decentralized, secure, and reliable, Bitcoin and many other blockchain-based cryptocurrencies suffer from scalability issues. One of the promising proposals to address this problem is off-chain payment channels. Since, not all nodes are connected directly to each other, they can use a payment network to route their payments. Each node allocates a balance that is frozen during the channel's lifespan. Spending and receiving transactions will shift the balance to one side of the channel. A channel becomes unbalanced when there is not sufficient balance in one direction. In this case, we say the effective lifespan of the channel has ended. In this paper, we develop a mathematical model to predict the expected effective lifespan of a channel based on the network's topology. We investigate the impact of channel unbalancing on the payment network and individual channels. We also discuss the effect of certain characteristics of payment channels on their lifespan. Our case study on a snapshot of the Lightning Network shows how the effective lifespan is distributed, and how it is correlated with other network characteristics. Our results show that central unbalanced channels have a drastic effect on the network performance. △ Less

Submitted 11 September, 2022; originally announced January 2023.

arXiv:2211.13785 [pdf, other]

PuzzleFusion: Unleashing the Power of Diffusion Models for Spatial Puzzle Solving

Authors: Sepidehsadat Hosseini, Mohammad Amin Shabani, Saghar Irandoust, Yasutaka Furukawa

Abstract: This paper presents an end-to-end neural architecture based on Diffusion Models for spatial puzzle solving, particularly jigsaw puzzle and room arrangement tasks. In the latter task, for instance, the proposed system "PuzzleFusion" takes a set of room layouts as polygonal curves in the top-down view and aligns the room layout pieces by estimating their 2D translations and rotations, akin to solvin… ▽ More This paper presents an end-to-end neural architecture based on Diffusion Models for spatial puzzle solving, particularly jigsaw puzzle and room arrangement tasks. In the latter task, for instance, the proposed system "PuzzleFusion" takes a set of room layouts as polygonal curves in the top-down view and aligns the room layout pieces by estimating their 2D translations and rotations, akin to solving the jigsaw puzzle of room layouts. A surprising discovery of the paper is that the simple use of a Diffusion Model effectively solves these challenging spatial puzzle tasks as a conditional generation process. To enable learning of an end-to-end neural system, the paper introduces new datasets with ground-truth arrangements: 1) 2D Voronoi jigsaw dataset, a synthetic one where pieces are generated by Voronoi diagram of 2D pointset; and 2) MagicPlan dataset, a real one offered by MagicPlan from its production pipeline, where pieces are room layouts constructed by augmented reality App by real-estate consumers. The qualitative and quantitative evaluations demonstrate that our approach outperforms the competing methods by significant margins in all the tasks. △ Less

Submitted 3 October, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

arXiv:2211.13287 [pdf, other]

HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising

Authors: Mohammad Amin Shabani, Sepidehsadat Hosseini, Yasutaka Furukawa

Abstract: The paper presents a novel approach for vector-floorplan generation via a diffusion model, which denoises 2D coordinates of room/door corners with two inference objectives: 1) a single-step noise as the continuous quantity to precisely invert the continuous forward process; and 2) the final 2D coordinate as the discrete quantity to establish geometric incident relationships such as parallelism, or… ▽ More The paper presents a novel approach for vector-floorplan generation via a diffusion model, which denoises 2D coordinates of room/door corners with two inference objectives: 1) a single-step noise as the continuous quantity to precisely invert the continuous forward process; and 2) the final 2D coordinate as the discrete quantity to establish geometric incident relationships such as parallelism, orthogonality, and corner-sharing. Our task is graph-conditioned floorplan generation, a common workflow in floorplan design. We represent a floorplan as 1D polygonal loops, each of which corresponds to a room or a door. Our diffusion model employs a Transformer architecture at the core, which controls the attention masks based on the input graph-constraint and directly generates vector-graphics floorplans via a discrete and continuous denoising process. We have evaluated our approach on RPLAN dataset. The proposed approach makes significant improvements in all the metrics against the state-of-the-art with significant margins, while being capable of generating non-Manhattan structures and controlling the exact number of corners per room. A project website with supplementary video and document is here https://aminshabani.github.io/housediffusion. △ Less

Submitted 23 November, 2022; originally announced November 2022.

arXiv:2208.10919 [pdf, other]

Cluster Based Secure Multi-Party Computation in Federated Learning for Histopathology Images

Authors: S. Maryam Hosseini, Milad Sikaroudi, Morteza Babaei, H. R. Tizhoosh

Abstract: Federated learning (FL) is a decentralized method enabling hospitals to collaboratively learn a model without sharing private patient data for training. In FL, participant hospitals periodically exchange training results rather than training samples with a central server. However, having access to model parameters or gradients can expose private training data samples. To address this challenge, we… ▽ More Federated learning (FL) is a decentralized method enabling hospitals to collaboratively learn a model without sharing private patient data for training. In FL, participant hospitals periodically exchange training results rather than training samples with a central server. However, having access to model parameters or gradients can expose private training data samples. To address this challenge, we adopt secure multiparty computation (SMC) to establish a privacy-preserving federated learning framework. In our proposed method, the hospitals are divided into clusters. After local training, each hospital splits its model weights among other hospitals in the same cluster such that no single hospital can retrieve other hospitals' weights on its own. Then, all hospitals sum up the received weights, sending the results to the central server. Finally, the central server aggregates the results, retrieving the average of models' weights and updating the model without having access to individual hospitals' weights. We conduct experiments on a publicly available repository, The Cancer Genome Atlas (TCGA). We compare the performance of the proposed framework with differential privacy and federated averaging as the baseline. The results reveal that compared to differential privacy, our framework can achieve higher accuracy with no privacy leakage risk at a cost of higher communication overhead. △ Less

Submitted 21 August, 2022; originally announced August 2022.

Comments: Accepted at MICCAI 2022 Workshop on Distributed, Collaborative and Federated Learning

arXiv:2207.08018 [pdf]

Extend the lifetime of wireless sensor networks by modifying cluster-based data collection

Authors: Nahid Ebrahimi, Ali Taghavirashidizadeh, Seyyed Saeed Hosseini

Abstract: Wireless sensor networks have significant potential to increase our ability to view and control the physical environment, but the issue of power consumption in these networks has become an important parameter in their reliability and since in many applications of networks. Wireless sensor needs to guarantee end-to-end quality parameters, support for quality of service in these networks is of utmos… ▽ More Wireless sensor networks have significant potential to increase our ability to view and control the physical environment, but the issue of power consumption in these networks has become an important parameter in their reliability and since in many applications of networks. Wireless sensor needs to guarantee end-to-end quality parameters, support for quality of service in these networks is of utmost importance. In wireless sensor networks, one of the most important issues is the lifetime of each network, which is directly related to the energy consumption of the network sensors. Increasing network lifetime is one of the most challenging requirements in these types of networks. This paper presents a clustering approach to reduce power consumption in wireless sensor networks, which is the most effective method for scalability and energy consumption reduction in sensor networks. The simulation results show that the proposed algorithm can reduce the energy consumption of the wireless sensor network and dramatically increase the lifetime of the network. △ Less

Submitted 16 July, 2022; originally announced July 2022.

Comments: 9 pages, 7 figures, 3rd International Congress On Science And Engineering, Germany, Hamburg

arXiv:2206.00645 [pdf, other]

Floorplan Restoration by Structure Hallucinating Transformer Cascades

Authors: Sepidehsadat Hosseini, Yasutaka Furukawa

Abstract: This paper presents an extreme floorplan reconstruction task, a new benchmark for the task, and a neural architecture as a solution. Given a partial floorplan reconstruction inferred or curated from panorama images, the task is to reconstruct a complete floorplan including invisible architectural structures. The proposed neural network 1) encodes an input partial floorplan into a set of latent vec… ▽ More This paper presents an extreme floorplan reconstruction task, a new benchmark for the task, and a neural architecture as a solution. Given a partial floorplan reconstruction inferred or curated from panorama images, the task is to reconstruct a complete floorplan including invisible architectural structures. The proposed neural network 1) encodes an input partial floorplan into a set of latent vectors by convolutional neural networks and a Transformer; and 2) reconstructs an entire floorplan while hallucinating invisible rooms and doors by cascading Transformer decoders. Qualitative and quantitative evaluations demonstrate effectiveness of our approach over the benchmark of 701 houses, outperforming the state-of-the-art reconstruction techniques. We will share our code, models, and data. △ Less

Submitted 3 October, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

Comments: Published at BMVC 2023

arXiv:2205.01533 [pdf, other]

Average Age of Information Minimization in Reliable Covert Communication on Time-Varying Channels

Authors: Shima Salar Hosseini, Paeiz Azmi, Nader Mokari

Abstract: In this letter, we propose reliable covert communications with the aim of minimizing age of information (AoI) in the time-varying channels. We named the time duration that channel state information (CSI) is valid as a new metric, as age of channel variation (AoC). To find reliable covert communication in a fresh manner in dynamic environments, this work considers a new constraint that shows a rela… ▽ More In this letter, we propose reliable covert communications with the aim of minimizing age of information (AoI) in the time-varying channels. We named the time duration that channel state information (CSI) is valid as a new metric, as age of channel variation (AoC). To find reliable covert communication in a fresh manner in dynamic environments, this work considers a new constraint that shows a relation between AoI and AoC. With the aid of the proposed constraint, this paper addresses two main challenges of reliable covert communication with the aim of minimizing AoI: 1) users packets with desirable size; 2) guaranteeing the negligible probability of Willies detection, in time-varying networks. In the simulation results, we compare the performance of the proposed constraint in reliable covert communication with the aim of minimizing AoI with conventional optimization of the requirement of information freshness in covert communications. △ Less

Submitted 3 May, 2022; originally announced May 2022.

arXiv:2203.16723 [pdf, other]

Exploiting Explainable Metrics for Augmented SGD

Authors: Mahdi S. Hosseini, Mathieu Tuli, Konstantinos N. Plataniotis

Abstract: Explaining the generalization characteristics of deep learning is an emerging topic in advanced machine learning. There are several unanswered questions about how learning under stochastic optimization really works and why certain strategies are better than others. In this paper, we address the following question: \textit{can we probe intermediate layers of a deep neural network to identify and qu… ▽ More Explaining the generalization characteristics of deep learning is an emerging topic in advanced machine learning. There are several unanswered questions about how learning under stochastic optimization really works and why certain strategies are better than others. In this paper, we address the following question: \textit{can we probe intermediate layers of a deep neural network to identify and quantify the learning quality of each layer?} With this question in mind, we propose new explainability metrics that measure the redundant information in a network's layers using a low-rank factorization framework and quantify a complexity measure that is highly correlated with the generalization performance of a given optimizer, network, and dataset. We subsequently exploit these metrics to augment the Stochastic Gradient Descent (SGD) optimizer by adaptively adjusting the learning rate in each layer to improve in generalization performance. Our augmented SGD -- dubbed RMSGD -- introduces minimal computational overhead compared to SOTA methods and outperforms them by exhibiting strong generalization characteristics across application, architecture, and dataset. △ Less

Submitted 30 March, 2022; originally announced March 2022.

Comments: Accepted in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR2022)

arXiv:2201.11246 [pdf, other]

HistoKT: Cross Knowledge Transfer in Computational Pathology

Authors: Ryan Zhang, Jiadai Zhu, Stephen Yang, Mahdi S. Hosseini, Angelo Genovese, Lina Chen, Corwyn Rowsell, Savvas Damaskinos, Sonal Varma, Konstantinos N. Plataniotis

Abstract: The lack of well-annotated datasets in computational pathology (CPath) obstructs the application of deep learning techniques for classifying medical images. %Since pathologist time is expensive, dataset curation is intrinsically difficult. Many CPath workflows involve transferring learned knowledge between various image domains through transfer learning. Currently, most transfer learning research… ▽ More The lack of well-annotated datasets in computational pathology (CPath) obstructs the application of deep learning techniques for classifying medical images. %Since pathologist time is expensive, dataset curation is intrinsically difficult. Many CPath workflows involve transferring learned knowledge between various image domains through transfer learning. Currently, most transfer learning research follows a model-centric approach, tuning network parameters to improve transfer results over few datasets. In this paper, we take a data-centric approach to the transfer learning problem and examine the existence of generalizable knowledge between histopathological datasets. First, we create a standardization workflow for aggregating existing histopathological data. We then measure inter-domain knowledge by training ResNet18 models across multiple histopathological datasets, and cross-transferring between them to determine the quantity and quality of innate shared knowledge. Additionally, we use weight distillation to share knowledge between models without additional training. We find that hard to learn, multi-class datasets benefit most from pretraining, and a two stage learning framework incorporating a large source domain such as ImageNet allows for better utilization of smaller datasets. Furthermore, we find that weight distillation enables models trained on purely histopathological features to outperform models using external natural image data. △ Less

Submitted 26 January, 2022; originally announced January 2022.

Comments: Accepted in ICASSP2022

arXiv:2112.05209 [pdf, other]

Compositional Generalization for Natural Language Interfaces to Web APIs

Authors: Saghar Hosseini, Ahmed Hassan Awadallah, Yu Su

Abstract: This paper presents Okapi, a new dataset for Natural Language to executable web Application Programming Interfaces (NL2API). This dataset is in English and contains 22,508 questions and 9,019 unique API calls, covering three domains. We define new compositional generalization tasks for NL2API which explore the models' ability to extrapolate from simple API calls in the training set to new and more… ▽ More This paper presents Okapi, a new dataset for Natural Language to executable web Application Programming Interfaces (NL2API). This dataset is in English and contains 22,508 questions and 9,019 unique API calls, covering three domains. We define new compositional generalization tasks for NL2API which explore the models' ability to extrapolate from simple API calls in the training set to new and more complex API calls in the inference phase. Also, the models are required to generate API calls that execute correctly as opposed to the existing approaches which evaluate queries with placeholder values. Our dataset is different than most of the existing compositional semantic parsing datasets because it is a non-synthetic dataset studying the compositional generalization in a low-resource setting. Okapi is a step towards creating realistic datasets and benchmarks for studying compositional generalization alongside the existing datasets and tasks. We report the generalization capabilities of sequence-to-sequence baseline models trained on a variety of the SCAN and Okapi datasets tasks. The best model achieves 15\% exact match accuracy when generalizing from simple API calls to more complex API calls. This highlights some challenges for future research. Okapi dataset and tasks are publicly available at https://aka.ms/nl2api/data. △ Less

Submitted 9 December, 2021; originally announced December 2021.

arXiv:2112.04599 [pdf, ps, other]

Quotients of span categories that are allegories and the representation of regular categories

Authors: S. Naser Hosseini, Amir R. Shir Ali Nasab, Walter Tholen, Leila Yeganeh

Abstract: We consider the ordinary category Span(C) of (isomorphism classes of) spans of morphisms in a category C with finite limits as needed, composed horizontally via pullback, and give a general criterion for a quotient of Span(C) to be an allegory. In particular, when C carries a pullback-stable, but not necessarily proper, (E, M)-factorization system, we establish a quotient category Span_E(C) that i… ▽ More We consider the ordinary category Span(C) of (isomorphism classes of) spans of morphisms in a category C with finite limits as needed, composed horizontally via pullback, and give a general criterion for a quotient of Span(C) to be an allegory. In particular, when C carries a pullback-stable, but not necessarily proper, (E, M)-factorization system, we establish a quotient category Span_E(C) that is isomorphic to the category Rel_M(C) of M-relations in C, and show that it is a (unitary and tabular) allegory precisely when M is a class of monomorphisms in C. Without this restriction, one can still find a least pullback-stable and composition-closed class E. containing E such that Span_E.(C) is a unitary and tabular allegory. In this way one obtains a left adjoint to the 2-functor that assigns to every unitary and tabular allegory the regular category of its Lawverian maps. With the Freyd-Scedrov Representation Theorem for regular categories, we conclude that every finitely complete category with a stable factorization system has a reflection into the huge 2-category of all regular categories. △ Less

Submitted 8 December, 2021; originally announced December 2021.

MSC Class: 18B10; 18A32; 18E08

arXiv:2111.14062 [pdf, other]

P4AI: Approaching AI Ethics through Principlism

Authors: Andre Fu, Elisa Ding, Mahdi S. Hosseini, Konstantinos N. Plataniotis

Abstract: The field of computer vision is rapidly evolving, particularly in the context of new methods of neural architecture design. These models contribute to (1) the Climate Crisis - increased CO2 emissions and (2) the Privacy Crisis - data leakage concerns. To address the often overlooked impact the Computer Vision (CV) community has on these crises, we outline a novel ethical framework, \textit{P4AI}:… ▽ More The field of computer vision is rapidly evolving, particularly in the context of new methods of neural architecture design. These models contribute to (1) the Climate Crisis - increased CO2 emissions and (2) the Privacy Crisis - data leakage concerns. To address the often overlooked impact the Computer Vision (CV) community has on these crises, we outline a novel ethical framework, \textit{P4AI}: Principlism for AI, an augmented principlistic view of ethical dilemmas within AI. We then suggest using P4AI to make concrete recommendations to the community to mitigate the climate and privacy crises. △ Less

Submitted 28 November, 2021; originally announced November 2021.

Comments: Human-Centered AI workshop at NeurIPS 2021

arXiv:2111.14059 [pdf, other]

NoFADE: Analyzing Diminishing Returns on CO2 Investment

Authors: Andre Fu, Justin Tran, Andy Xie, Jonathan Spraggett, Elisa Ding, Chang-Won Lee, Kanav Singla, Mahdi S. Hosseini, Konstantinos N. Plataniotis

Abstract: Climate change continues to be a pressing issue that currently affects society at-large. It is important that we as a society, including the Computer Vision (CV) community take steps to limit our impact on the environment. In this paper, we (a) analyze the effect of diminishing returns on CV methods, and (b) propose a \textit{``NoFADE''}: a novel entropy-based metric to quantify model--dataset--co… ▽ More Climate change continues to be a pressing issue that currently affects society at-large. It is important that we as a society, including the Computer Vision (CV) community take steps to limit our impact on the environment. In this paper, we (a) analyze the effect of diminishing returns on CV methods, and (b) propose a \textit{``NoFADE''}: a novel entropy-based metric to quantify model--dataset--complexity relationships. We show that some CV tasks are reaching saturation, while others are almost fully saturated. In this light, NoFADE allows the CV community to compare models and datasets on a similar basis, establishing an agnostic platform. △ Less

Submitted 28 November, 2021; originally announced November 2021.

Comments: Climate Change with Machine Learning workshop at 35th Conference on Neural Information Processing Systems (NeurIPS2021-CCAI)

arXiv:2111.14056 [pdf, other]

Towards Robust and Automatic Hyper-Parameter Tunning

Authors: Mathieu Tuli, Mahdi S. Hosseini, Konstantinos N. Plataniotis

Abstract: The task of hyper-parameter optimization (HPO) is burdened with heavy computational costs due to the intractability of optimizing both a model's weights and its hyper-parameters simultaneously. In this work, we introduce a new class of HPO method and explore how the low-rank factorization of the convolutional weights of intermediate layers of a convolutional neural network can be used to define an… ▽ More The task of hyper-parameter optimization (HPO) is burdened with heavy computational costs due to the intractability of optimizing both a model's weights and its hyper-parameters simultaneously. In this work, we introduce a new class of HPO method and explore how the low-rank factorization of the convolutional weights of intermediate layers of a convolutional neural network can be used to define an analytical response surface for optimizing hyper-parameters, using only training data. We quantify how this surface behaves as a surrogate to model performance and can be solved using a trust-region search algorithm, which we call autoHyper. The algorithm outperforms state-of-the-art such as Bayesian Optimization and generalizes across model, optimizer, and dataset selection. Our code can be found at \url{https://github.com/MathieuTuli/autoHyper}. △ Less

Submitted 12 December, 2021; v1 submitted 28 November, 2021; originally announced November 2021.

Comments: NeurIPS-OPT2021: 13th Annual Workshop on Optimization for Machine Learning

ACM Class: I.2.6; I.2.8; I.2.2

arXiv:2111.02570 [pdf, other]

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

Authors: Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Greg Yang, Christopher Meek, Ahmed Hassan Awadallah, Jianfeng Gao

Abstract: Most recent progress in natural language understanding (NLU) has been driven, in part, by benchmarks such as GLUE, SuperGLUE, SQuAD, etc. In fact, many NLU models have now matched or exceeded "human-level" performance on many tasks in these benchmarks. Most of these benchmarks, however, give models access to relatively large amounts of labeled data for training. As such, the models are provided fa… ▽ More Most recent progress in natural language understanding (NLU) has been driven, in part, by benchmarks such as GLUE, SuperGLUE, SQuAD, etc. In fact, many NLU models have now matched or exceeded "human-level" performance on many tasks in these benchmarks. Most of these benchmarks, however, give models access to relatively large amounts of labeled data for training. As such, the models are provided far more data than required by humans to achieve strong performance. That has motivated a line of work that focuses on improving few-shot learning performance of NLU models. However, there is a lack of standardized evaluation benchmarks for few-shot NLU resulting in different experimental settings in different papers. To help accelerate this line of work, we introduce CLUES (Constrained Language Understanding Evaluation Standard), a benchmark for evaluating the few-shot learning capabilities of NLU models. We demonstrate that while recent models reach human performance when they have access to large amounts of labeled data, there is a huge gap in performance in the few-shot setting for most tasks. We also demonstrate differences between alternative model families and adaptation techniques in the few shot setting. Finally, we discuss several principles and choices in designing the experimental settings for evaluating the true few-shot learning performance and suggest a unified standardized approach to few-shot learning evaluation. We aim to encourage research on NLU models that can generalize to new tasks with a small number of examples. Code and data for CLUES are available at https://github.com/microsoft/CLUES. △ Less

Submitted 3 November, 2021; originally announced November 2021.

Comments: NeurIPS 2021 Datasets and Benchmarks Track

arXiv:2110.12259 [pdf, other]

In Search of Probeable Generalization Measures

Authors: Jonathan Jaegerman, Khalil Damouni, Mahdi S. Hosseini, Konstantinos N. Plataniotis

Abstract: Understanding the generalization behaviour of deep neural networks is a topic of recent interest that has driven the production of many studies, notably the development and evaluation of generalization "explainability" measures that quantify model generalization ability. Generalization measures have also proven useful in the development of powerful layer-wise model tuning and optimization algorith… ▽ More Understanding the generalization behaviour of deep neural networks is a topic of recent interest that has driven the production of many studies, notably the development and evaluation of generalization "explainability" measures that quantify model generalization ability. Generalization measures have also proven useful in the development of powerful layer-wise model tuning and optimization algorithms, though these algorithms require specific kinds of generalization measures which can probe individual layers. The purpose of this paper is to explore the neglected subtopic of probeable generalization measures; to establish firm ground for further investigations, and to inspire and guide the development of novel model tuning and optimization algorithms. We evaluate and compare measures, demonstrating effectiveness and robustness across model variations, dataset complexities, training hyperparameters, and training stages. We also introduce a new dataset of trained models and performance metrics, GenProb, for testing generalization measures, model tuning algorithms and optimization algorithms. △ Less

Submitted 23 October, 2021; originally announced October 2021.

Comments: Accepted in ICMLA2021

ACM Class: I.2; I.2.8; I.2.10

arXiv:2110.06830 [pdf, other]

CONetV2: Efficient Auto-Channel Size Optimization for CNNs

Authors: Yi Ru Wang, Samir Khaki, Weihang Zheng, Mahdi S. Hosseini, Konstantinos N. Plataniotis

Abstract: Neural Architecture Search (NAS) has been pivotal in finding optimal network configurations for Convolution Neural Networks (CNNs). While many methods explore NAS from a global search-space perspective, the employed optimization schemes typically require heavy computational resources. This work introduces a method that is efficient in computationally constrained environments by examining the micro… ▽ More Neural Architecture Search (NAS) has been pivotal in finding optimal network configurations for Convolution Neural Networks (CNNs). While many methods explore NAS from a global search-space perspective, the employed optimization schemes typically require heavy computational resources. This work introduces a method that is efficient in computationally constrained environments by examining the micro-search space of channel size. In tackling channel-size optimization, we design an automated algorithm to extract the dependencies within different connected layers of the network. In addition, we introduce the idea of knowledge distillation, which enables preservation of trained weights, admist trials where the channel sizes are changing. Further, since the standard performance indicators (accuracy, loss) fail to capture the performance of individual network components (providing an overall network evaluation), we introduce a novel metric that highly correlates with test accuracy and enables analysis of individual network layers. Combining dependency extraction, metrics, and knowledge distillation, we introduce an efficient searching algorithm, with simulated annealing inspired stochasticity, and demonstrate its effectiveness in finding optimal architectures that outperform baselines by a large margin. △ Less

Submitted 13 October, 2021; originally announced October 2021.

ACM Class: I.2; I.2.8; I.2.10

arXiv:2109.05681 [pdf, other]

Multi-User Scheduling in Hybrid Millimeter Wave Massive MIMO Systems

Authors: Seyedeh Maryam Hosseini, Shahram Shahsavari, Catherine Rosenberg

Abstract: While mmWave bands provide a large bandwidth for mobile broadband services, they suffer from severe path loss and shadowing. Multiple-antenna techniques such as beamforming (BF) can be applied to compensate the signal attenuation. We consider a special case of hybrid BF called per-stream hybrid BF (PSHBF) which is easier to implement than the general hybrid BF because it circumvents the need for j… ▽ More While mmWave bands provide a large bandwidth for mobile broadband services, they suffer from severe path loss and shadowing. Multiple-antenna techniques such as beamforming (BF) can be applied to compensate the signal attenuation. We consider a special case of hybrid BF called per-stream hybrid BF (PSHBF) which is easier to implement than the general hybrid BF because it circumvents the need for joint analog-digital beamformer optimization. Employing BF at the base station enables the transmission of multiple data streams to several users in the same resource block. In this paper, we provide an offline study of proportional fair multi-user scheduling in a mmWave system with PSHBF to understand the impact of various system parameters on the performance. We formulate multi-user scheduling as an optimization problem. To tackle the non-convexity, we provide a feasible solution and show through numerical examples that the performance of the provided solution is very close to an upper-bound. Using this framework, we provide extensive numerical investigations revealing several engineering insights. △ Less

Submitted 12 September, 2021; originally announced September 2021.

arXiv:2108.07689 [pdf]

A multimodal sensor dataset for continuous stress detection of nurses in a hospital

Authors: Seyedmajid Hosseini, Satya Katragadda, Ravi Teja Bhupatiraju, Ziad Ashkar, Christoph W. Borst, Kenneth Cochran, Raju Gottumukkala

Abstract: Advances in wearable technologies provide the opportunity to monitor many physiological variables continuously. Stress detection has gained increased attention in recent years, mainly because early stress detection can help individuals better manage health to minimize the negative impacts of long-term stress exposure. This paper provides a unique stress detection dataset created in a natural worki… ▽ More Advances in wearable technologies provide the opportunity to monitor many physiological variables continuously. Stress detection has gained increased attention in recent years, mainly because early stress detection can help individuals better manage health to minimize the negative impacts of long-term stress exposure. This paper provides a unique stress detection dataset created in a natural working environment in a hospital. This dataset is a collection of biometric data of nurses during the COVID-19 outbreak. Studying stress in a work environment is complex due to many social, cultural, and psychological factors in dealing with stressful conditions. Therefore, we captured both the physiological data and associated context pertaining to the stress events. We monitored specifc physiological variables such as electrodermal activity, Heart Rate, and skin temperature of the nurse subjects. A periodic smartphone-administered survey also captured the contributing factors for the detected stress events. A database containing the signals, stress events, and survey responses is publicly available on Dryad. △ Less

Submitted 1 June, 2022; v1 submitted 25 July, 2021; originally announced August 2021.

Comments: 14 pages, 9 images

arXiv:2108.06859 [pdf, other]

Probeable DARTS with Application to Computational Pathology

Authors: Sheyang Tang, Mahdi S. Hosseini, Lina Chen, Sonal Varma, Corwyn Rowsell, Savvas Damaskinos, Konstantinos N. Plataniotis, Zhou Wang

Abstract: AI technology has made remarkable achievements in computational pathology (CPath), especially with the help of deep neural networks. However, the network performance is highly related to architecture design, which commonly requires human experts with domain knowledge. In this paper, we combat this challenge with the recent advance in neural architecture search (NAS) to find an optimal network for… ▽ More AI technology has made remarkable achievements in computational pathology (CPath), especially with the help of deep neural networks. However, the network performance is highly related to architecture design, which commonly requires human experts with domain knowledge. In this paper, we combat this challenge with the recent advance in neural architecture search (NAS) to find an optimal network for CPath applications. In particular, we use differentiable architecture search (DARTS) for its efficiency. We first adopt a probing metric to show that the original DARTS lacks proper hyperparameter tuning on the CIFAR dataset, and how the generalization issue can be addressed using an adaptive optimization strategy. We then apply our searching framework on CPath applications by searching for the optimum network architecture on a histological tissue type dataset (ADP). Results show that the searched network outperforms state-of-the-art networks in terms of prediction accuracy and computation complexity. We further conduct extensive experiments to demonstrate the transferability of the searched network to new CPath applications, the robustness against downscaled inputs, as well as the reliability of predictions. △ Less

Submitted 15 August, 2021; originally announced August 2021.

arXiv:2108.06822 [pdf, other]

CONet: Channel Optimization for Convolutional Neural Networks

Authors: Mahdi S. Hosseini, Jia Shu Zhang, Zhe Liu, Andre Fu, Jingxuan Su, Mathieu Tuli, Sepehr Hosseini, Arsh Kadakia, Haoran Wang, Konstantinos N. Plataniotis

Abstract: Neural Architecture Search (NAS) has shifted network design from using human intuition to leveraging search algorithms guided by evaluation metrics. We study channel size optimization in convolutional neural networks (CNN) and identify the role it plays in model accuracy and complexity. Current channel size selection methods are generally limited by discrete sample spaces while suffering from manu… ▽ More Neural Architecture Search (NAS) has shifted network design from using human intuition to leveraging search algorithms guided by evaluation metrics. We study channel size optimization in convolutional neural networks (CNN) and identify the role it plays in model accuracy and complexity. Current channel size selection methods are generally limited by discrete sample spaces while suffering from manual iteration and simple heuristics. To solve this, we introduce an efficient dynamic scaling algorithm -- CONet -- that automatically optimizes channel sizes across network layers for a given CNN. Two metrics -- "\textit{Rank}" and "\textit{Rank Average Slope}" -- are introduced to identify the information accumulated in training. The algorithm dynamically scales channel sizes up or down over a fixed searching phase. We conduct experiments on CIFAR10/100 and ImageNet datasets and show that CONet can find efficient and accurate architectures searched in ResNet, DARTS, and DARTS+ spaces that outperform their baseline models. This document supersedes previously published paper in ICCV2021-NeurArch workshop. An additional section is included on manual scaling of channel size in CNNs to numerically validate of the metrics used in searching optimum channel configurations in CNNs. △ Less

Submitted 7 April, 2022; v1 submitted 15 August, 2021; originally announced August 2021.

arXiv:2106.14174 [pdf, other]

Transfer-based adaptive tree for multimodal sentiment analysis based on user latent aspects

Authors: Sana Rahmani, Saeid Hosseini, Raziyeh Zall, Mohammad Reza Kangavari, Sara Kamran, Wen Hua

Abstract: Multimodal sentiment analysis benefits various applications such as human-computer interaction and recommendation systems. It aims to infer the users' bipolar ideas using visual, textual, and acoustic signals. Although researchers affirm the association between cognitive cues and emotional manifestations, most of the current multimodal approaches in sentiment analysis disregard user-specific aspec… ▽ More Multimodal sentiment analysis benefits various applications such as human-computer interaction and recommendation systems. It aims to infer the users' bipolar ideas using visual, textual, and acoustic signals. Although researchers affirm the association between cognitive cues and emotional manifestations, most of the current multimodal approaches in sentiment analysis disregard user-specific aspects. To tackle this issue, we devise a novel method to perform multimodal sentiment prediction using cognitive cues, such as personality. Our framework constructs an adaptive tree by hierarchically dividing users and trains the LSTM-based submodels, utilizing an attention-based fusion to transfer cognitive-oriented knowledge within the tree. Subsequently, the framework consumes the conclusive agglomerative knowledge from the adaptive tree to predict final sentiments. We also devise a dynamic dropout method to facilitate data sharing between neighboring nodes, reducing data sparsity. The empirical results on real-world datasets determine that our proposed model for sentiment prediction can surpass trending rivals. Moreover, compared to other ensemble approaches, the proposed transfer-based algorithm can better utilize the latent cognitive cues and foster the prediction outcomes. Based on the given extrinsic and intrinsic analysis results, we note that compared to other theoretical-based techniques, the proposed hierarchical clustering approach can better group the users within the adaptive tree. △ Less

Submitted 27 June, 2021; originally announced June 2021.

Comments: Under Review on IEEE Transactions on Pattern Analysis and Machine Intelligence

arXiv:2106.14150 [pdf]

Image content dependent semi-fragile watermarking with localized tamper detection

Authors: Samira Hosseini, Mojtaba Mahdavi

Abstract: Content-independent watermarks and block-wise independency can be considered as vulnerabilities in semi-fragile watermarking methods. In this paper to achieve the objectives of semi-fragile watermarking techniques, a method is proposed to not have the mentioned shortcomings. In the proposed method, the watermark is generated by relying on image content and a key. Furthermore, the embedding scheme… ▽ More Content-independent watermarks and block-wise independency can be considered as vulnerabilities in semi-fragile watermarking methods. In this paper to achieve the objectives of semi-fragile watermarking techniques, a method is proposed to not have the mentioned shortcomings. In the proposed method, the watermark is generated by relying on image content and a key. Furthermore, the embedding scheme causes the watermarked blocks to become dependent on each other, using a key. In the embedding phase, the image is partitioned into non-overlapping blocks. In order to detect and separate the different types of attacks more precisely, the proposed method embeds three copies of each watermark bit into LWT coefficients of each 4x4 block. In the authentication phase, by voting between the extracted bits the error maps are created; these maps indicate image authenticity and reveal the modified regions. Also, in order to automate the authentication, the images are classified into four categories using seven features. Classification accuracy in the experiments is 97.97 percent. It is noted that our experiments demonstrate that the proposed method is robust against JPEG compression and is competitive with a state-of-the-art semi-fragile watermarking method, in terms of robustness and semi-fragility. △ Less

Submitted 27 June, 2021; originally announced June 2021.

Comments: 32 pages, 11 figures, 5 tables

arXiv:2106.07363 [pdf, other]

Cognitive-aware Short-text Understanding for Inferring Professions

Authors: Sayna Esmailzadeh, Saeid Hosseini, Mohammad Reza Kangavari, Wen Hua

Abstract: Leveraging short-text contents to estimate the occupation of microblog authors has significant gains in many applications. Yet challenges abound. Firstly brief textual contents come with excessive lexical noise that makes the inference problem challenging. Secondly, cognitive-semantics are not evident, and important linguistic features are latent in short-text contents. Thirdly, it is hard to meas… ▽ More Leveraging short-text contents to estimate the occupation of microblog authors has significant gains in many applications. Yet challenges abound. Firstly brief textual contents come with excessive lexical noise that makes the inference problem challenging. Secondly, cognitive-semantics are not evident, and important linguistic features are latent in short-text contents. Thirdly, it is hard to measure the correlation between the cognitive short-text semantics and the features pertaining various occupations. We argue that the multi-aspect cognitive features are needed to correctly associate short-text contents to a particular job and discover suitable people for the careers. To this end, we devise a novel framework that on the one hand, can infer short-text contents and exploit cognitive features, and on the other hand, fuses various adopted novel algorithms, such as curve fitting, support vector, and boosting modules to better predict the occupation of the authors. The final estimation module manufactures the $R^w$-tree via coherence weight to tune the best outcome in the inferring process. We conduct comprehensive experiments on real-life Twitter data. The experimental results show that compared to other rivals, our cognitive multi-aspect model can achieve a higher performance in the career estimation procedure, where it is inevitable to neglect the contextual semantics of users. △ Less

Submitted 4 June, 2021; originally announced June 2021.

arXiv:2106.01706 [pdf, other]

EmoDNN: Understanding emotions from short texts through a deep neural network ensemble

Authors: Sara Kamran, Raziyeh Zall, Mohammad Reza Kangavari, Saeid Hosseini, Sana Rahmani, Wen Hua

Abstract: The latent knowledge in the emotions and the opinions of the individuals that are manifested via social networks are crucial to numerous applications including social management, dynamical processes, and public security. Affective computing, as an interdisciplinary research field, linking artificial intelligence to cognitive inference, is capable to exploit emotion-oriented knowledge from brief co… ▽ More The latent knowledge in the emotions and the opinions of the individuals that are manifested via social networks are crucial to numerous applications including social management, dynamical processes, and public security. Affective computing, as an interdisciplinary research field, linking artificial intelligence to cognitive inference, is capable to exploit emotion-oriented knowledge from brief contents. The textual contents convey hidden information such as personality and cognition about corresponding authors that can determine both correlations and variations between users. Emotion recognition from brief contents should embrace the contrast between authors where the differences in personality and cognition can be traced within emotional expressions. To tackle this challenge, we devise a framework that, on the one hand, infers latent individual aspects, from brief contents and, on the other hand, presents a novel ensemble classifier equipped with dynamic dropout convnets to extract emotions from textual context. To categorize short text contents, our proposed method conjointly leverages cognitive factors and exploits hidden information. We utilize the outcome vectors in a novel embedding model to foster emotion-pertinent features that are collectively assembled by lexicon inductions. Experimental results show that compared to other competitors, our proposed model can achieve a higher performance in recognizing emotion from noisy contents. △ Less

Submitted 3 June, 2021; originally announced June 2021.

arXiv:2104.08702 [pdf, other]

Reconsidering CO2 emissions from Computer Vision

Authors: Andre Fu, Mahdi S. Hosseini, Konstantinos N. Plataniotis

Abstract: Climate change is a pressing issue that is currently affecting and will affect every part of our lives. It's becoming incredibly vital we, as a society, address the climate crisis as a universal effort, including those in the Computer Vision (CV) community. In this work, we analyze the total cost of CO2 emissions by breaking it into (1) the architecture creation cost and (2) the life-time evaluati… ▽ More Climate change is a pressing issue that is currently affecting and will affect every part of our lives. It's becoming incredibly vital we, as a society, address the climate crisis as a universal effort, including those in the Computer Vision (CV) community. In this work, we analyze the total cost of CO2 emissions by breaking it into (1) the architecture creation cost and (2) the life-time evaluation cost. We show that over time, these costs are non-negligible and are having a direct impact on our future. Importantly, we conduct an ethical analysis of how the CV-community is unintentionally overlooking its own ethical AI principles by emitting this level of CO2. To address these concerns, we propose adding "enforcement" as a pillar of ethical AI and provide some recommendations for how architecture designers and broader CV community can curb the climate crisis. △ Less

Submitted 18 April, 2021; originally announced April 2021.

Comments: Accepted for publication in CVPR 2021 Workshop

arXiv:2104.01024 [pdf, other]

doi 10.1145/3412841.3442020

A Comparison of Similarity Based Instance Selection Methods for Cross Project Defect Prediction

Authors: Seyedrebvar Hosseini, Burak Turhan

Abstract: Context: Previous studies have shown that training data instance selection based on nearest neighborhood (NN) information can lead to better performance in cross project defect prediction (CPDP) by reducing heterogeneity in training datasets. However, neighborhood calculation is computationally expensive and approximate methods such as Locality Sensitive Hashing (LSH) can be as effective as exact… ▽ More Context: Previous studies have shown that training data instance selection based on nearest neighborhood (NN) information can lead to better performance in cross project defect prediction (CPDP) by reducing heterogeneity in training datasets. However, neighborhood calculation is computationally expensive and approximate methods such as Locality Sensitive Hashing (LSH) can be as effective as exact methods. Aim: We aim at comparing instance selection methods for CPDP, namely LSH, NN-filter, and Genetic Instance Selection (GIS). Method: We conduct experiments with five base learners, optimizing their hyper parameters, on 13 datasets from PROMISE repository in order to compare the performance of LSH with benchmark instance selection methods NN-Filter and GIS. Results: The statistical tests show six distinct groups for F-measure performance. The top two group contains only LSH and GIS benchmarks whereas the bottom two groups contain only NN-Filter variants. LSH and GIS favor recall more than precision. In fact, for precision performance only three significantly distinct groups are detected by the tests where the top group is comprised of NN-Filter variants only. Recall wise, 16 different groups are identified where the top three groups contain only LSH methods, four of the next six are GIS only and the bottom five contain only NN-Filter. Finally, NN-Filter benchmarks never outperform the LSH counterparts with the same base learner, tuned or non-tuned. Further, they never even belong to the same rank group, meaning that LSH is always significantly better than NN-Filter with the same learner and settings. Conclusions: The increase in performance and the decrease in computational overhead and runtime make LSH a promising approach. However, the performance of LSH is based on high recall and in environments where precision is considered more important NN-Filter should be considered. △ Less

Submitted 2 April, 2021; originally announced April 2021.

Comments: The 36th ACM/SIGAPP Symposium on Applied Computing (SAC'21), 10 pages

arXiv:2103.02574 [pdf, other]

House-GAN++: Generative Adversarial Layout Refinement Networks

Authors: Nelson Nauata, Sepidehsadat Hosseini, Kai-Hung Chang, Hang Chu, Chin-Yi Cheng, Yasutaka Furukawa

Abstract: This paper proposes a novel generative adversarial layout refinement network for automated floorplan generation. Our architecture is an integration of a graph-constrained relational GAN and a conditional GAN, where a previously generated layout becomes the next input constraint, enabling iterative refinement. A surprising discovery of our research is that a simple non-iterative training process, d… ▽ More This paper proposes a novel generative adversarial layout refinement network for automated floorplan generation. Our architecture is an integration of a graph-constrained relational GAN and a conditional GAN, where a previously generated layout becomes the next input constraint, enabling iterative refinement. A surprising discovery of our research is that a simple non-iterative training process, dubbed component-wise GT-conditioning, is effective in learning such a generator. The iterative generator also creates a new opportunity in further improving a metric of choice via meta-optimization techniques by controlling when to pass which input constraints during iterative layout refinement. Our qualitative and quantitative evaluation based on the three standard metrics demonstrate that the proposed system makes significant improvements over the current state-of-the-art, even competitive against the ground-truth floorplans, designed by professional architects. △ Less

Submitted 3 March, 2021; originally announced March 2021.

arXiv:2102.07737 [pdf, other]

Zero-Shot Self-Supervised Learning for MRI Reconstruction

Authors: Burhaneddin Yaman, Seyed Amir Hossein Hosseini, Mehmet Akçakaya

Abstract: Deep learning (DL) has emerged as a powerful tool for accelerated MRI reconstruction, but often necessitates a database of fully-sampled measurements for training. Recent self-supervised and unsupervised learning approaches enable training without fully-sampled data. However, a database of undersampled measurements may not be available in many scenarios, especially for scans involving contrast or… ▽ More Deep learning (DL) has emerged as a powerful tool for accelerated MRI reconstruction, but often necessitates a database of fully-sampled measurements for training. Recent self-supervised and unsupervised learning approaches enable training without fully-sampled data. However, a database of undersampled measurements may not be available in many scenarios, especially for scans involving contrast or translational acquisitions in development. Moreover, recent studies show that database-trained models may not generalize well when the unseen measurements differ in terms of sampling pattern, acceleration rate, SNR, image contrast, and anatomy. Such challenges necessitate a new methodology to enable subject-specific DL MRI reconstruction without external training datasets, since it is clinically imperative to provide high-quality reconstructions that can be used to identify lesions/disease for \emph{every individual}. In this work, we propose a zero-shot self-supervised learning approach to perform subject-specific accelerated DL MRI reconstruction to tackle these issues. The proposed approach partitions the available measurements from a single scan into three disjoint sets. Two of these sets are used to enforce data consistency and define loss during training for self-supervision, while the last set serves to self-validate, establishing an early stopping criterion. In the presence of models pre-trained on a database with different image characteristics, we show that the proposed approach can be combined with transfer learning for faster convergence time and reduced computational complexity. The code is available at \url{https://github.com/byaman14/ZS-SSL}. △ Less

Submitted 28 November, 2023; v1 submitted 15 February, 2021; originally announced February 2021.

Journal ref: International Conference on Learning Representations (ICLR), 2022

Showing 1–50 of 118 results for author: Hosseini, S