Showing 1–36 of 36 results for author: Davis, J C

  1. arXiv:2404.18801  [pdf, other]

    cs.CV cs.LG cs.SE

    A Partial Replication of MaskFormer in TensorFlow on TPUs for the TensorFlow Model Garden

    Authors: Vishal Purohit, Wenxin Jiang, Akshath R. Ravikiran, James C. Davis

    Abstract: This paper undertakes the task of replicating the MaskFormer model, a universal image segmentation model originally developed using the PyTorch framework, within the TensorFlow ecosystem, specifically optimized for execution on Tensor Processing Units (TPUs). Our implementation exploits the modular constructs available within the TensorFlow Model Garden (TFMG), encompassing elements such as the dat…

    Submitted 29 April, 2024; originally announced April 2024.

  2. arXiv:2404.16688  [pdf, other]

    cs.SE

    Reusing Deep Learning Models: Challenges and Directions in Software Engineering

    Authors: James C. Davis, Purvish Jajal, Wenxin Jiang, Taylor R. Schorlemmer, Nicholas Synovic, George K. Thiruvathukal

    Abstract: Deep neural networks (DNNs) achieve state-of-the-art performance in many areas, including computer vision, system configuration, and question-answering. However, DNNs are expensive to develop, both in intellectual effort (e.g., devising new architectures) and computational costs (e.g., training). Reusing DNNs is a promising direction to amortize costs within a company and across the computing indu…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Proceedings of the IEEE John Vincent Atanasoff Symposium on Modern Computing (JVA'23) 2023

  3. arXiv:2404.16632  [pdf]

    cs.CR cs.SE

    Introducing Systems Thinking as a Framework for Teaching and Assessing Threat Modeling Competency

    Authors: Siddhant S. Joshi, Preeti Mukherjee, Kirsten A. Davis, James C. Davis

    Abstract: Computing systems face diverse and substantial cybersecurity threats. To mitigate these cybersecurity threats, software engineers need to be competent in the skill of threat modeling. In industry and academia, there are many frameworks for teaching threat modeling, but our analysis of these frameworks suggests that (1) these approaches tend to be focused on component-level analysis rather than edu…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Presented at the Annual Conference of the American Society for Engineering Education (ASEE'24) 2024

  4. arXiv:2403.18679  [pdf]

    cs.SE cs.HC

    An Exploratory Study on Upper-Level Computing Students' Use of Large Language Models as Tools in a Semester-Long Project

    Authors: Ben Arie Tanay, Lexy Arinze, Siddhant S. Joshi, Kirsten A. Davis, James C. Davis

    Abstract: Background: Large Language Models (LLMs) such as ChatGPT and CoPilot are influencing software engineering practice. Software engineering educators must teach future software engineers how to use such tools well. As of yet, there have been few studies that report on the use of LLMs in the classroom. It is, therefore, important to evaluate students' perception of LLMs and possible ways of adapting t…

    Submitted 16 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted to the 2024 General Conference of the American Society for Engineering Education (ASEE)

  5. arXiv:2402.12252  [pdf, other]

    cs.CR cs.SE

    An Interview Study on Third-Party Cyber Threat Hunting Processes in the U.S. Department of Homeland Security

    Authors: William P. Maxam III, James C. Davis

    Abstract: Cybersecurity is a major challenge for large organizations. Traditional cybersecurity defense is reactive. Cybersecurity operations centers keep out adversaries and incident response teams clean up after break-ins. Recently a proactive stage has been introduced: Cyber Threat Hunting (TH) looks for potential compromises missed by other cyber defenses. TH is mandated for federal executive agencies a…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: Technical report accompanying a paper at USENIX Security 2024

  6. arXiv:2402.00699  [pdf, other]

    cs.SE cs.AI cs.DB cs.LG

    PeaTMOSS: A Dataset and Initial Analysis of Pre-Trained Models in Open-Source Software

    Authors: Wenxin Jiang, Jerin Yasmin, Jason Jones, Nicholas Synovic, Jiashen Kuo, Nathaniel Bielanski, Yuan Tian, George K. Thiruvathukal, James C. Davis

    Abstract: The development and training of deep learning models have become increasingly costly and complex. Consequently, software engineers are adopting pre-trained models (PTMs) for their downstream applications. The dynamics of the PTM supply chain remain largely unexplored, signaling a clear need for structured datasets that document not only the metadata but also the subsequent applications of these mo…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted at MSR'24

  7. arXiv:2401.14635  [pdf, other]

    cs.CR cs.SE

    Signing in Four Public Software Package Registries: Quantity, Quality, and Influencing Factors

    Authors: Taylor R Schorlemmer, Kelechi G Kalu, Luke Chigges, Kyung Myung Ko, Eman Abu Isghair, Saurabh Baghi, Santiago Torres-Arias, James C Davis

    Abstract: Many software applications incorporate open-source third-party packages distributed by public package registries. Guaranteeing authorship along this supply chain is a challenge. Package maintainers can guarantee package authorship through software signing. However, it is unclear how common this practice is, and whether the resulting signatures are created properly. Prior work has provided raw data…

    Submitted 14 April, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted at IEEE Security & Privacy 2024 (S&P'24)

  8. arXiv:2401.14629  [pdf, ps, other]

    cs.SE cs.CY

    A First Look at the General Data Protection Regulation (GDPR) in Open-Source Software

    Authors: Lucas Franke, Huayu Liang, Aaron Brantly, James C Davis, Chris Brown

    Abstract: This poster describes work on the General Data Protection Regulation (GDPR) in open-source software. Although open-source software is commonly integrated into regulated software, and thus must be engineered or adapted for compliance, we do not know how such laws impact open-source software development. We surveyed open-source developers (N=47) to understand their experiences and perceptions of G…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 2 page extended abstract for ICSE-Poster 2024

  9. arXiv:2310.14117  [pdf, other]

    cs.CR cs.SE

    ZTD$_{JAVA}$: Mitigating Software Supply Chain Vulnerabilities via Zero-Trust Dependencies

    Authors: Paschal C. Amusuo, Kyle A. Robinson, Tanmay Singla, Huiyun Peng, Aravind Machiry, Santiago Torres-Arias, Laurent Simon, James C. Davis

    Abstract: Third-party software components like Log4J accelerate software application development but introduce substantial risk. These components have led to many software supply chain attacks. These attacks succeed because third-party software components are implicitly trusted in an application. Although several security defenses exist to reduce the risks from third-party software components, none of them…

    Submitted 25 April, 2024; v1 submitted 21 October, 2023; originally announced October 2023.

    Comments: 15 pages, 5 figures, 5 tables

    ACM Class: K.6.5; D.4.6

  10. arXiv:2310.03620  [pdf, other]

    cs.SE cs.AI

    PeaTMOSS: Mining Pre-Trained Models in Open-Source Software

    Authors: Wenxin Jiang, Jason Jones, Jerin Yasmin, Nicholas Synovic, Rajeev Sashti, Sophie Chen, George K. Thiruvathukal, Yuan Tian, James C. Davis

    Abstract: Developing and training deep learning models is expensive, so software engineers have begun to reuse pre-trained deep learning models (PTMs) and fine-tune them for downstream tasks. Despite the wide-spread use of PTMs, we know little about the corresponding software engineering behaviors and challenges. To enable the study of software engineering with PTMs, we present the PeaTMOSS dataset: Pre-T…

    Submitted 5 October, 2023; originally announced October 2023.

  11. arXiv:2310.01653  [pdf]

    cs.SE

    A Unified Taxonomy and Evaluation of IoT Security Guidelines

    Authors: Jesse Chen, Dharun Anandayuvaraj, James C Davis, Sazzadur Rahaman

    Abstract: Cybersecurity concerns about Internet of Things (IoT) devices and infrastructure are growing each year. In response, organizations worldwide have published IoT cybersecurity guidelines to protect their citizens and customers. These guidelines constrain the development of IoT systems, which include substantial software components both on-device and in the Cloud. While these guidelines are being wid…

    Submitted 3 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  12. arXiv:2310.01642  [pdf, other]

    cs.SE cs.AI

    Naming Practices of Pre-Trained Models in Hugging Face

    Authors: Wenxin Jiang, Chingwo Cheung, Mingyu Kim, Heesoo Kim, George K. Thiruvathukal, James C. Davis

    Abstract: As innovation in deep learning continues, many engineers seek to adopt Pre-Trained Models (PTMs) as components in computer systems. Researchers publish PTMs, which engineers adapt for quality or performance prior to deployment. PTM authors should choose appropriate names for their PTMs, which would facilitate model discovery and reuse. However, prior research has reported that model names are not…

    Submitted 28 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 21 pages

  13. arXiv:2310.00205  [pdf, other]

    cs.SE cs.CR

    An Empirical Study on the Use of Static Analysis Tools in Open Source Embedded Software

    Authors: Mingjie Shen, Akul Pillai, Brian A. Yuan, James C. Davis, Aravind Machiry

    Abstract: This paper performs the first study to understand the prevalence, challenges, and effectiveness of using Static Application Security Testing (SAST) tools on Open-Source Embedded Software (EMBOSS) repositories. We collect a corpus of 258 of the most popular EMBOSS projects, representing 13 distinct categories such as real-time operating systems, network stacks, and applications. To understand the c…

    Submitted 29 September, 2023; originally announced October 2023.

  14. arXiv:2308.12387  [pdf, other]

    cs.SE

    Reflecting on the Use of the Policy-Process-Product Theory in Empirical Software Engineering

    Authors: Kelechi G. Kalu, Taylor R. Schorlemmer, Sophie Chen, Kyle Robinson, Erik Kocinare, James C. Davis

    Abstract: The primary theory of software engineering is that an organization's Policies and Processes influence the quality of its Products. We call this the PPP Theory. Although empirical software engineering research has grown common, it is unclear whether researchers are trying to evaluate the PPP Theory. To assess this, we analyzed half (33) of the empirical works published over the last two years in th…

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: 5 pages, published in the proceedings of the 2023 ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering in the Ideas-Visions-Reflections track (ESEC/FSE-IVR'23)

  15. arXiv:2308.10965  [pdf, other]

    cs.SE

    Systematically Detecting Packet Validation Vulnerabilities in Embedded Network Stacks

    Authors: Paschal C. Amusuo, Ricardo Andrés Calvo Méndez, Zhongwei Xu, Aravind Machiry, James C. Davis

    Abstract: Embedded Network Stacks (ENS) enable low-resource devices to communicate with the outside world, facilitating the development of the Internet of Things and Cyber-Physical Systems. Some defects in ENS are thus high-severity cybersecurity vulnerabilities: they are remotely triggerable and can impact the physical world. While prior research has shed light on the characteristics of defects in many cla…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: 12 pages, 3 figures, to be published in the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023)

    ACM Class: D.2.5

  16. arXiv:2308.04898  [pdf, other]

    cs.CR cs.LG cs.SE

    An Empirical Study on Using Large Language Models to Analyze Software Supply Chain Security Failures

    Authors: Tanmay Singla, Dharun Anandayuvaraj, Kelechi G. Kalu, Taylor R. Schorlemmer, James C. Davis

    Abstract: As we increasingly depend on software systems, the consequences of breaches in the software supply chain become more severe. High-profile cyber attacks like those on SolarWinds and ShadowHammer have resulted in significant financial and data losses, underlining the need for stronger cybersecurity. One way to prevent future breaches is by studying past failures. However, traditional methods of anal…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: 22 pages, 9 figures

  17. arXiv:2303.17708  [pdf, other]

    cs.SE cs.LG

    Analysis of Failures and Risks in Deep Learning Model Converters: A Case Study in the ONNX Ecosystem

    Authors: Purvish Jajal, Wenxin Jiang, Arav Tewari, Erik Kocinare, Joseph Woo, Anusha Sarraf, Yung-Hsiang Lu, George K. Thiruvathukal, James C. Davis

    Abstract: Software engineers develop, fine-tune, and deploy deep learning (DL) models using a variety of development frameworks and runtime environments. DL model converters move models between frameworks and to runtime environments. Conversion errors compromise model quality and disrupt deployment. However, the failure characteristics of DL model converters are unknown, adding risk when using DL interopera…

    Submitted 24 April, 2024; v1 submitted 30 March, 2023; originally announced March 2023.

  18. arXiv:2303.08934  [pdf, other]

    cs.SE

    PTMTorrent: A Dataset for Mining Open-source Pre-trained Model Packages

    Authors: Wenxin Jiang, Nicholas Synovic, Purvish Jajal, Taylor R. Schorlemmer, Arav Tewari, Bhavesh Pareek, George K. Thiruvathukal, James C. Davis

    Abstract: Due to the cost of developing and training deep learning models from scratch, machine learning engineers have begun to reuse pre-trained models (PTMs) and fine-tune them for downstream tasks. PTM registries known as "model hubs" support engineers in distributing and reusing deep learning models. PTM packages include pre-trained weights, documentation, model architectures, datasets, and metadata. M…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: 5 pages, 2 figures, Accepted to MSR'23

  19. arXiv:2303.07476  [pdf, other]

    cs.SE cs.AI

    Challenges and Practices of Deep Learning Model Reengineering: A Case Study on Computer Vision

    Authors: Wenxin Jiang, Vishnu Banna, Naveen Vivek, Abhinav Goel, Nicholas Synovic, George K. Thiruvathukal, James C. Davis

    Abstract: Many engineering organizations are reimplementing and extending deep neural networks from the research community. We describe this process as deep learning model reengineering. Deep learning model reengineering - reusing, reproducing, adapting, and enhancing state-of-the-art deep learning approaches - is challenging for reasons including under-documented reference models, changing requirements, an…

    Submitted 25 August, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: Under submission to EMSE

  20. arXiv:2303.02555  [pdf, other]

    cs.SE

    Regexes are Hard: Decision-making, Difficulties, and Risks in Programming Regular Expressions

    Authors: Louis G. Michael IV, James Donohue, James C. Davis, Dongyoon Lee, Francisco Servant

    Abstract: Regular expressions (regexes) are a powerful mechanism for solving string-matching problems. They are supported by all modern programming languages, and have been estimated to appear in more than a third of Python and JavaScript projects. Yet existing studies have focused mostly on one aspect of regex programming: readability. We know little about how developers perceive and program regexes, nor t…

    Submitted 4 March, 2023; originally announced March 2023.

    Comments: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2019

  21. arXiv:2303.02552  [pdf, other]

    cs.SE cs.AI cs.LG

    An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry

    Authors: Wenxin Jiang, Nicholas Synovic, Matt Hyatt, Taylor R. Schorlemmer, Rohan Sethi, Yung-Hsiang Lu, George K. Thiruvathukal, James C. Davis

    Abstract: Deep Neural Networks (DNNs) are being adopted as components in software systems. Creating and specializing DNNs from scratch has grown increasingly difficult as state-of-the-art architectures grow more complex. Following the path of traditional software engineering, machine learning engineers have begun to reuse large-scale pre-trained models (PTMs) and fine-tune these models for downstream tasks.…

    Submitted 4 March, 2023; originally announced March 2023.

    Comments: Proceedings of the ACM/IEEE 45th International Conference on Software Engineering (ICSE) 2023

  22. arXiv:2303.02551  [pdf, other]

    cs.SE cs.AI cs.LG

    Discrepancies among Pre-trained Deep Neural Networks: A New Threat to Model Zoo Reliability

    Authors: Diego Montes, Pongpatapee Peerapatanapokin, Jeff Schultz, Chengjun Gun, Wenxin Jiang, James C. Davis

    Abstract: Training deep neural networks (DNNs) takes significant time and resources. A practice for expedited deployment is to use pre-trained deep neural networks (PTNNs), often from model zoos -- collections of PTNNs; yet, the reliability of model zoos remains unexamined. In the absence of an industry standard for the implementation and performance of PTNNs, engineers cannot confidently incorporate them in…

    Submitted 4 March, 2023; originally announced March 2023.

    Comments: Proceedings of the 30th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering: Ideas, Visions, and Reflections track (ESEC/FSE-IVR) 2022

  23. arXiv:2303.01996  [pdf, other]

    cs.CR cs.SE

    Exploiting Input Sanitization for Regex Denial of Service

    Authors: Efe Barlas, Xin Du, James C. Davis

    Abstract: Web services use server-side input sanitization to guard against harmful input. Some web services publish their sanitization logic to make their client interface more usable, e.g., allowing clients to debug invalid requests locally. However, this usability practice poses a security risk. Specifically, services may share the regexes they use to sanitize input strings -- and regex-based denial of se…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: Proceedings of the ACM/IEEE 44th International Conference on Software Engineering (ICSE) 2022

  24. arXiv:2212.07979  [pdf, other]

    cs.SE cs.CR cs.HC cs.PL

    Improving Developers' Understanding of Regex Denial of Service Tools through Anti-Patterns and Fix Strategies

    Authors: Sk Adnan Hassan, Zainab Aamir, Dongyoon Lee, James C. Davis, Francisco Servant

    Abstract: Regular expressions are used for diverse purposes, including input validation and firewalls. Unfortunately, they can also lead to a security vulnerability called ReDoS (Regular Expression Denial of Service), caused by a super-linear worst-case execution time during regex matching. Due to the severity and prevalence of ReDoS, past work proposed automatic tools to detect and fix regexes. Although th…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: IEEE Security & Privacy 2023

  25. Reflections on Software Failure Analysis

    Authors: Paschal C. Amusuo, Aishwarya Sharma, Siddharth R. Rao, Abbey Vincent, James C. Davis

    Abstract: Failure studies are important in revealing the root causes, behaviors, and life cycle of defects in software systems. These studies either focus on understanding the characteristics of defects in specific classes of systems or the characteristics of a specific type of defect in the systems it manifests in. Failure studies have influenced various software engineering research directions, especially…

    Submitted 21 September, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: 6 pages, 4 figures. To be published in: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE '22)

  26. arXiv:2207.11767  [pdf, other]

    cs.SE

    Snapshot Metrics Are Not Enough: Analyzing Software Repositories with Longitudinal Metrics

    Authors: Nicholas Synovic, Matt Hyatt, Rohan Sethi, Sohini Thota, Shilpika, Allan J. Miller, Wenxin Jiang, Emmanuel S. Amobi, Austin Pinderski, Konstantin Läufer, Nicholas J. Hayward, Neil Klingensmith, James C. Davis, George K. Thiruvathukal

    Abstract: Software metrics capture information about software development processes and products. These metrics support decision-making, e.g., in team management or dependency selection. However, existing metrics tools measure only a snapshot of a software project. Little attention has been given to enabling engineers to reason about metric trends over time -- longitudinal metrics that give insight about pr…

    Submitted 24 July, 2022; originally announced July 2022.

    Comments: Accepted at ASE 2022 Tool Demonstrations

  27. arXiv:2206.13562  [pdf]

    cs.SE

    Incorporating Failure Knowledge into Design Decisions for IoT Systems: A Controlled Experiment on Novices

    Authors: Dharun Anandayuvaraj, Pujita Thulluri, Justin Figueroa, Harshit Shandilya, James C. Davis

    Abstract: Internet of Things (IoT) systems allow software to directly interact with the physical world. Recent IoT failures can be attributed to recurring software design flaws, suggesting IoT software engineers may not be learning from past failures. We examine the use of failure stories to improve IoT system designs. We conducted an experiment to evaluate the influence of failure-related learning treatmen…

    Submitted 20 March, 2023; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted at the Software Engineering Research & Practices for the Internet of Things (SERP4IoT) workshop at The International Conference on Software Engineering (ICSE) 2023

  28. Reflecting on Recurring Failures in IoT Development

    Authors: Dharun Anandayuvaraj, James C. Davis

    Abstract: As IoT systems are given more responsibility and autonomy, they offer greater benefits, but also carry greater risks. We believe this trend invigorates an old challenge of software engineering: how to develop high-risk software-intensive systems safely and securely under market pressures? As a first step, we conducted a systematic analysis of recent IoT failures to identify engineering challenges.…

    Submitted 19 September, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted at the New Ideas and Emerging Results Track (NIER) at The 37th IEEE/ACM International Conference on Automated Software Engineering (ASE 2022)

  29. arXiv:2109.13356  [pdf, other]

    cs.CV cs.DC

    Efficient Computer Vision on Edge Devices with Pipeline-Parallel Hierarchical Neural Networks

    Authors: Abhinav Goel, Caleb Tung, Xiao Hu, George K. Thiruvathukal, James C. Davis, Yung-Hsiang Lu

    Abstract: Computer vision on low-power edge devices enables applications including search-and-rescue and security. State-of-the-art computer vision algorithms, such as Deep Neural Networks (DNNs), are too large for inference on low-power edge devices. To improve efficiency, some existing approaches parallelize DNN inference across multiple edge devices. However, these techniques introduce significant commun…

    Submitted 4 November, 2021; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: Accepted for publication in ASPDAC 2022

  30. arXiv:2107.00821  [pdf, other]

    cs.SE cs.AI cs.LG

    An Experience Report on Machine Learning Reproducibility: Guidance for Practitioners and TensorFlow Model Garden Contributors

    Authors: Vishnu Banna, Akhil Chinnakotla, Zhengxin Yan, Anirudh Vegesana, Naveen Vivek, Kruthi Krishnappa, Wenxin Jiang, Yung-Hsiang Lu, George K. Thiruvathukal, James C. Davis

    Abstract: Machine learning techniques are becoming a fundamental tool for scientific and engineering progress. These techniques are applied in contexts as diverse as astronomy and spam filtering. However, correctly applying these techniques requires careful engineering. Much attention has been paid to the technical potential; relatively little attention has been paid to the software engineering process requ…

    Submitted 29 July, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

    Comments: Technical Report

  31. arXiv:2106.10588  [pdf, other]

    cs.CV eess.IV

    Low-Power Multi-Camera Object Re-Identification using Hierarchical Neural Networks

    Authors: Abhinav Goel, Caleb Tung, Xiao Hu, Haobo Wang, James C. Davis, George K. Thiruvathukal, Yung-Hsiang Lu

    Abstract: Low-power computer vision on embedded devices has many applications. This paper describes a low-power technique for the object re-identification (reID) problem: matching a query image against a gallery of previously seen images. State-of-the-art techniques rely on large, computationally-intensive Deep Neural Networks (DNNs). We propose a novel hierarchical DNN architecture that uses attribute labe…

    Submitted 19 June, 2021; originally announced June 2021.

    Comments: Accepted to ISLPED 2021

  32. arXiv:2105.04397  [pdf, other]

    cs.SE cs.PL

    Why Aren't Regular Expressions a Lingua Franca? An Empirical Study on the Re-use and Portability of Regular Expressions

    Authors: James C. Davis, Louis G. Michael IV, Christy A. Coghlan, Francisco Servant, Dongyoon Lee

    Abstract: This paper explores the extent to which regular expressions (regexes) are portable across programming languages. Many languages offer similar regex syntaxes, and it would be natural to assume that regexes can be ported across language boundaries. But can regexes be copy/pasted across language boundaries while retaining their semantic and performance characteristics? In our survey of 158 professi…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: ESEC/FSE 2019

  33. arXiv:2009.12156  [pdf, other]

    cs.SE

    An Empirical Study on the Impact of Deep Parameters on Mobile App Energy Usage

    Authors: Qiang Xu, James C. Davis, Y. Charlie Hu, Abhilash Jindal

    Abstract: Improving software performance through configuration parameter tuning is a common activity during software maintenance. Beyond traditional performance metrics like latency, mobile app developers are interested in reducing app energy usage. Some mobile apps have centralized locations for parameter tuning, similar to databases and operating systems, but it is common for mobile apps to have hundreds…

    Submitted 16 January, 2022; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: 12 pages, 9 figures, to be published in SANER 2022, camera-ready

  34. arXiv:2009.05632  [pdf, other]

    cs.SE cs.PL

    A Principled Approach to GraphQL Query Cost Analysis

    Authors: Alan Cha, Erik Wittern, Guillaume Baudart, James C. Davis, Louis Mandel, Jim A. Laredo

    Abstract: The landscape of web APIs is evolving to meet new client requirements and to facilitate how providers fulfill them. A recent web API model is GraphQL, which is both a query language and a runtime. Using GraphQL, client queries express the data they want to retrieve or mutate, and servers respond with exactly those data or changes. GraphQL's expressiveness is risky for service providers because cli…

    Submitted 11 September, 2020; originally announced September 2020.

    Comments: Published at the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) 2020

  35. arXiv:1907.13012  [pdf, other]

    cs.SE

    An Empirical Study of GraphQL Schemas

    Authors: Erik Wittern, Alan Cha, James C. Davis, Guillaume Baudart, Louis Mandel

    Abstract: GraphQL is a query language for APIs and a runtime to execute queries. Using GraphQL queries, clients define precisely what data they wish to retrieve or mutate on a server, leading to fewer round trips and reduced response sizes. Although interest in GraphQL is on the rise, with increasing adoption at major organizations, little is known about what GraphQL interfaces look like in practice. This l…

    Submitted 30 July, 2019; originally announced July 2019.

  36. The IceProd Framework: Distributed Data Processing for the IceCube Neutrino Observatory

    Authors: M. G. Aartsen, R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, D. Altmann, C. Arguelles, J. Auffenberg, X. Bai, M. Baker, S. W. Barwick, V. Baum, R. Bay, J. J. Beatty, J. Becker Tjus, K.-H. Becker, S. BenZvi, P. Berghaus, D. Berley, E. Bernardini, A. Bernhard, D. Z. Besson, G. Binder, D. Bindig, et al. (262 additional authors not shown)

    Abstract: IceCube is a one-gigaton instrument located at the geographic South Pole, designed to detect cosmic neutrinos, identify the particle nature of dark matter, and study high-energy neutrinos themselves. Simulation of the IceCube detector and processing of data require a significant amount of computational resources. IceProd is a distributed management system based on Python, XML-RPC and GridFTP. It…

    Submitted 22 August, 2014; v1 submitted 22 November, 2013; originally announced November 2013.

    Journal ref: Journal of Parallel & Distributed Computing 75:198,2015