
Showing 1–25 of 25 results for author: Rashtchian, C

Searching in archive cs.
  1. arXiv:2404.16820  [pdf, other]

    cs.CV

    Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings

    Authors: Olivia Wiles, Chuhan Zhang, Isabela Albuquerque, Ivana Kajić, Su Wang, Emanuele Bugliarello, Yasumasa Onoe, Chris Knutsen, Cyrus Rashtchian, Jordi Pont-Tuset, Aida Nematzadeh

    Abstract: While text-to-image (T2I) generative models have become ubiquitous, they do not necessarily generate images that align with a given prompt. While previous work has evaluated T2I alignment by proposing metrics, benchmarks, and templates for collecting human judgements, the quality of these components is not systematically measured. Human-rated prompt sets are generally small and the reliability of…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Data and code will be released at: https://github.com/google-deepmind/gecko_benchmark_t2i

  2. arXiv:2311.17946  [pdf, other]

    cs.CV cs.AI cs.CL

    DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback

    Authors: Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian

    Abstract: Despite their wide-spread success, Text-to-Image models (T2I) still struggle to produce images that are both aesthetically pleasing and faithful to the user's input text. We introduce DreamSync, a model-agnostic training algorithm by design that improves T2I models to be faithful to the text input. DreamSync builds off a recent insight from TIFA's evaluation framework -- that large vision-language…

    Submitted 28 November, 2023; originally announced November 2023.

  3. arXiv:2307.05610  [pdf, other]

    cs.LG cs.AI cs.CV

    Substance or Style: What Does Your Image Embedding Know?

    Authors: Cyrus Rashtchian, Charles Herrmann, Chun-Sung Ferng, Ayan Chakrabarti, Dilip Krishnan, Deqing Sun, Da-Cheng Juan, Andrew Tomkins

    Abstract: Probes are small networks that predict properties of underlying data from embeddings, and they provide a targeted, effective way to illuminate the information contained in embeddings. While analysis through the use of probes has become standard in NLP, there has been much less exploration in vision. Image foundation models have primarily been evaluated for semantic content. Better understanding th…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: 27 pages, 9 figures

  4. arXiv:2301.12993  [pdf, other]

    cs.CV cs.LG

    Benchmarking Robustness to Adversarial Image Obfuscations

    Authors: Florian Stimberg, Ayan Chakrabarti, Chun-Ta Lu, Hussein Hazimeh, Otilia Stretcu, Wei Qiao, Yintao Liu, Merve Kaya, Cyrus Rashtchian, Ariel Fuxman, Mehmet Tek, Sven Gowal

    Abstract: Automated content filtering and moderation is an important tool that allows online platforms to build striving user communities that facilitate cooperation and prevent abuse. Unfortunately, resourceful actors try to bypass automated filters in a bid to post content that violate platform policies and codes of conduct. To reach this goal, these malicious actors may obfuscate policy violating images…

    Submitted 29 November, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    ACM Class: I.2.10; I.4.0

  5. arXiv:2208.04461  [pdf, other]

    cs.LG stat.ML

    A Theoretical View on Sparsely Activated Networks

    Authors: Cenk Baykal, Nishanth Dikkala, Rina Panigrahy, Cyrus Rashtchian, Xin Wang

    Abstract: Deep and wide neural networks successfully fit very complex functions today, but dense models are starting to be prohibitively expensive for inference. To mitigate this, one promising direction is networks that activate a sparse subgraph of the network. The subgraph is chosen by a data-dependent routing function, enforcing a fixed mapping of inputs to subnetworks (e.g., the Mixture of Experts (MoE…

    Submitted 8 August, 2022; originally announced August 2022.

    Comments: 18 pages, 7 figures

  6. arXiv:2111.06105  [pdf, other]

    cs.IT cs.CC cs.DM math.CO

    Multivariate Analytic Combinatorics for Cost Constrained Channels and Subsequence Enumeration

    Authors: Andreas Lenz, Stephen Melczer, Cyrus Rashtchian, Paul H. Siegel

    Abstract: Analytic combinatorics in several variables is a powerful tool for deriving the asymptotic behavior of combinatorial quantities by analyzing multivariate generating functions. We study information-theoretic questions about sequences in a discrete noiseless channel under cost and forbidden substring constraints. Our main contributions involve the relationship between the graph structure of the chan…

    Submitted 14 November, 2021; v1 submitted 11 November, 2021; originally announced November 2021.

    ACM Class: E.4

  7. arXiv:2109.01064  [pdf, other]

    math.PR cs.IT cs.LG math.ST

    Lower Bounds on the Total Variation Distance Between Mixtures of Two Gaussians

    Authors: Sami Davies, Arya Mazumdar, Soumyabrata Pal, Cyrus Rashtchian

    Abstract: Mixtures of high dimensional Gaussian distributions have been studied extensively in statistics and learning theory. While the total variation distance appears naturally in the sample complexity of distribution learning, it is analytically difficult to obtain tight lower bounds for mixtures. Exploiting a connection between total variation distance and the characteristic function of the mixture, we…

    Submitted 9 March, 2022; v1 submitted 2 September, 2021; originally announced September 2021.

    Comments: 22 pages, 1 figure; Accepted to ALT 2022

  8. arXiv:2107.01335  [pdf, other]

    cs.CC cs.DS cs.LG

    Average-Case Communication Complexity of Statistical Problems

    Authors: Cyrus Rashtchian, David P. Woodruff, Peng Ye, Hanlin Zhu

    Abstract: We study statistical problems, such as planted clique, its variants, and sparse principal component analysis in the context of average-case communication complexity. Our motivation is to understand the statistical-computational trade-offs in streaming, sketching, and query-based models. Communication complexity is the main tool for proving lower bounds in these models, yet many prior results do no…

    Submitted 2 July, 2021; originally announced July 2021.

    Comments: 28 pages. Conference on Learning Theory (COLT), 2021

  9. arXiv:2012.06713  [pdf, ps, other]

    cs.DS cs.CC cs.IT cs.LG math.PR

    Approximate Trace Reconstruction

    Authors: Sami Davies, Miklos Z. Racz, Cyrus Rashtchian, Benjamin G. Schiffer

    Abstract: In the usual trace reconstruction problem, the goal is to exactly reconstruct an unknown string of length $n$ after it passes through a deletion channel many times independently, producing a set of traces (i.e., random subsequences of the string). We consider the relaxed problem of approximate reconstruction. Here, the goal is to output a string that is close to the original one in edit distance w…

    Submitted 16 December, 2020; v1 submitted 11 December, 2020; originally announced December 2020.

  10. arXiv:2011.14532  [pdf, other]

    cs.DS cs.IT math.CO math.PR q-bio.QM

    Batch Optimization for DNA Synthesis

    Authors: Konstantin Makarychev, Miklos Z. Racz, Cyrus Rashtchian, Sergey Yekhanin

    Abstract: Large pools of synthetic DNA molecules have been recently used to reliably store significant volumes of digital data. While DNA as a storage medium has enormous potential because of its high storage density, its practical use is currently severely limited because of the high cost and low throughput of available DNA synthesis technologies. We study the role of batch optimization in reducing the cos…

    Submitted 23 February, 2021; v1 submitted 29 November, 2020; originally announced November 2020.

    Comments: Improved Theorem 1.2 and its proof

  11. arXiv:2011.08485  [pdf, other]

    cs.LG stat.ML

    Probing Predictions on OOD Images via Nearest Categories

    Authors: Yao-Yuan Yang, Cyrus Rashtchian, Ruslan Salakhutdinov, Kamalika Chaudhuri

    Abstract: We study out-of-distribution (OOD) prediction behavior of neural networks when they classify images from unseen classes or corrupted images. To probe the OOD behavior, we introduce a new measure, nearest category generalization (NCG), where we compute the fraction of OOD inputs that are classified with the same label as their nearest neighbor in the training set. Our motivation stems from understa…

    Submitted 8 March, 2023; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: Accepted by Transactions on Machine Learning Research

  12. arXiv:2010.16055  [pdf, other]

    cs.LG stat.ML

    Unsupervised Embedding of Hierarchical Structure in Euclidean Space

    Authors: Jinyu Zhao, Yi Hao, Cyrus Rashtchian

    Abstract: Deep embedding methods have influenced many areas of unsupervised learning. However, the best methods for learning hierarchical structure use non-Euclidean representations, whereas Euclidean geometry underlies the theory behind many hierarchical clustering algorithms. To bridge the gap between these two areas, we consider learning a non-linear embedding of data into Euclidean space as a way to imp…

    Submitted 29 October, 2020; originally announced October 2020.

  13. arXiv:2010.06083  [pdf, other]

    cs.CC cs.IT cs.LG q-bio.GN

    Trace Reconstruction Problems in Computational Biology

    Authors: Vinnu Bhardwaj, Pavel A. Pevzner, Cyrus Rashtchian, Yana Safonova

    Abstract: The problem of reconstructing a string from its error-prone copies, the trace reconstruction problem, was introduced by Vladimir Levenshtein two decades ago. While there has been considerable theoretical work on trace reconstruction, practical solutions have only recently started to emerge in the context of two rapidly developing research areas: immunogenomics and DNA data storage. In immunogenomi…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: 20 pages, 8 figures. Accepted to the Special Issue of IEEE Transactions on Information Theory Dedicated to the Memory of Vladimir I. Levenshtein (copyright of journal version transferred to IEEE)

    ACM Class: F.2.3; G.3; E.4; J.2

  14. arXiv:2006.14015  [pdf, other]

    cs.DS cs.CC cs.LG

    Vector-Matrix-Vector Queries for Solving Linear Algebra, Statistics, and Graph Problems

    Authors: Cyrus Rashtchian, David P. Woodruff, Hanlin Zhu

    Abstract: We consider the general problem of learning about a matrix through vector-matrix-vector queries. These queries provide the value of $\boldsymbol{u}^{\mathrm{T}}\boldsymbol{M}\boldsymbol{v}$ over a fixed field $\mathbb{F}$ for a specified pair of vectors $\boldsymbol{u},\boldsymbol{v} \in \mathbb{F}^n$. To motivate these queries, we observe that they generalize many previously studied models, such…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: 26 pages, to be published in RANDOM 2020

  15. arXiv:2006.02399  [pdf, other]

    cs.LG cs.CG cs.DS stat.ML

    ExKMC: Expanding Explainable $k$-Means Clustering

    Authors: Nave Frost, Michal Moshkovitz, Cyrus Rashtchian

    Abstract: Despite the popularity of explainable AI, there is limited work on effective methods for unsupervised learning. We study algorithms for $k$-means clustering, focusing on a trade-off between explainability and accuracy. Following prior work, we use a small decision tree to partition a dataset into $k$ clusters. This enables us to explain each cluster assignment by a short sequence of single-feature…

    Submitted 1 July, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

  16. arXiv:2003.02972  [pdf, other]

    cs.DS cs.DC cs.LG

    LSF-Join: Locality Sensitive Filtering for Distributed All-Pairs Set Similarity Under Skew

    Authors: Cyrus Rashtchian, Aneesh Sharma, David P. Woodruff

    Abstract: All-pairs set similarity is a widely used data mining task, even for large and high-dimensional datasets. Traditionally, similarity search has focused on discovering very similar pairs, for which a variety of efficient algorithms are known. However, recent work highlights the importance of finding pairs of sets with relatively small intersection sizes. For example, in a recommender system, two use…

    Submitted 5 March, 2020; originally announced March 2020.

    Comments: WWW (The Web Conference) 2020

  17. arXiv:2003.02460  [pdf, other]

    cs.LG cs.CR stat.ML

    A Closer Look at Accuracy vs. Robustness

    Authors: Yao-Yuan Yang, Cyrus Rashtchian, Hongyang Zhang, Ruslan Salakhutdinov, Kamalika Chaudhuri

    Abstract: Current methods for training robust networks lead to a drop in test accuracy, which has led prior works to posit that a robustness-accuracy tradeoff may be inevitable in deep learning. We take a closer look at this phenomenon and first show that real image datasets are actually separated. With this property in mind, we then prove that robustness and accuracy should both be achievable for benchmark…

    Submitted 12 July, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

  18. arXiv:2002.12538  [pdf, other]

    cs.LG cs.CG cs.DS stat.ML

    Explainable $k$-Means and $k$-Medians Clustering

    Authors: Sanjoy Dasgupta, Nave Frost, Michal Moshkovitz, Cyrus Rashtchian

    Abstract: Clustering is a popular form of unsupervised learning for geometric data. Unfortunately, many clustering algorithms lead to cluster assignments that are hard to explain, partially because they depend on all the features of the data in a complicated way. To improve interpretability, we consider using a small decision tree to partition a data set into clusters, so that clusters can be characterized…

    Submitted 21 September, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

  19. arXiv:1911.09944  [pdf, ps, other]

    cs.IT cs.CC math.CO

    Covering Codes using Insertions or Deletions

    Authors: Andreas Lenz, Cyrus Rashtchian, Paul H. Siegel, Eitan Yaakobi

    Abstract: A covering code is a set of codewords with the property that the union of balls, suitably defined, around these codewords covers an entire space. Generally, the goal is to find the covering code with the minimum size codebook. While most prior work on covering codes has focused on the Hamming metric, we consider the problem of designing covering codes defined in terms of either insertions or delet…

    Submitted 25 May, 2020; v1 submitted 22 November, 2019; originally announced November 2019.

    Comments: 13 pages

  20. arXiv:1910.11921  [pdf, ps, other]

    cs.CC cs.DS math.CO

    Equivalence of Systematic Linear Data Structures and Matrix Rigidity

    Authors: Sivaramakrishnan Natarajan Ramamoorthy, Cyrus Rashtchian

    Abstract: Recently, Dvir, Golovnev, and Weinstein have shown that sufficiently strong lower bounds for linear data structures would imply new bounds for rigid matrices. However, their result utilizes an algorithm that requires an $NP$ oracle, and hence, the rigid matrices are not explicit. In this work, we derive an equivalence between rigidity and the systematic linear model of data structures. For the…

    Submitted 25 October, 2019; originally announced October 2019.

    Comments: 23 pages, 1 table

  21. arXiv:1906.03310  [pdf, other]

    cs.LG cs.CR cs.DS stat.ML

    Robustness for Non-Parametric Classification: A Generic Attack and Defense

    Authors: Yao-Yuan Yang, Cyrus Rashtchian, Yizhen Wang, Kamalika Chaudhuri

    Abstract: Adversarially robust machine learning has received much recent attention. However, prior attacks and defenses for non-parametric classifiers have been developed in an ad-hoc or classifier-specific basis. In this work, we take a holistic look at adversarial examples for non-parametric classifiers, including nearest neighbors, decision trees, and random forests. We provide a general defense method,…

    Submitted 24 February, 2020; v1 submitted 7 June, 2019; originally announced June 2019.

    Comments: AISTATS 2020

  22. arXiv:1902.05101  [pdf, other]

    cs.CC cs.DS math.PR

    Reconstructing Trees from Traces

    Authors: Sami Davies, Miklos Z. Racz, Cyrus Rashtchian

    Abstract: We study the problem of learning a node-labeled tree given independent traces from an appropriately defined deletion channel. This problem, tree trace reconstruction, generalizes string trace reconstruction, which corresponds to the tree being a path. For many classes of trees, including complete trees and spiders, we provide algorithms that reconstruct the labels using only a polynomial number of…

    Submitted 18 September, 2020; v1 submitted 13 February, 2019; originally announced February 2019.

    Comments: Major revisions in the new version including algorithm descriptions, more details in section 3.1, and several new figures

  23. arXiv:1711.07567  [pdf, other]

    cs.DS

    Edge Estimation with Independent Set Oracles

    Authors: Paul Beame, Sariel Har-Peled, Sivaramakrishnan Natarajan Ramamoorthy, Cyrus Rashtchian, Makrand Sinha

    Abstract: We study the task of estimating the number of edges in a graph with access to only an independent set oracle. Independent set queries draw motivation from group testing and have applications to the complexity of decision versus counting problems. We give two algorithms to estimate the number of edges in an $n$-vertex graph, using (i) $\mathrm{polylog}(n)$ bipartite independent set queries, or (ii)…

    Submitted 11 March, 2020; v1 submitted 20 November, 2017; originally announced November 2017.

    Comments: A preliminary version appeared in the proceedings of ITCS 2018

    ACM Class: F.1.1; F.2

  24. arXiv:1611.04999  [pdf, other]

    cs.DS cs.CC cs.DC

    Massively-Parallel Similarity Join, Edge-Isoperimetry, and Distance Correlations on the Hypercube

    Authors: Paul Beame, Cyrus Rashtchian

    Abstract: We study distributed protocols for finding all pairs of similar vectors in a large dataset. Our results pertain to a variety of discrete metrics, and we give concrete instantiations for Hamming distance. In particular, we give improved upper bounds on the overhead required for similarity defined by Hamming distance $r>1$ and prove a lower bound showing qualitative optimality of the overhead requir…

    Submitted 15 November, 2016; originally announced November 2016.

    Comments: 23 pages, plus references and appendix. To appear in SODA 2017

  25. arXiv:1511.08245  [pdf, other]

    math.CO cs.CC cs.DM

    Shattered Sets and the Hilbert Function

    Authors: Shay Moran, Cyrus Rashtchian

    Abstract: We study complexity measures on subsets of the boolean hypercube and exhibit connections between algebra (the Hilbert function) and combinatorics (VC theory). These connections yield results in both directions. Our main complexity-theoretic result proves that most linear program feasibility problems cannot be computed by polynomial-sized constant-depth circuits. Moreover, our result applies to a s…

    Submitted 21 May, 2020; v1 submitted 25 November, 2015; originally announced November 2015.

    Comments: 19 pages, 2 figures. Fixed typo in Theorem 2.3
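
Illustrative note on entry 11 (arXiv:2011.08485): the abstract defines nearest category generalization (NCG) as the fraction of OOD inputs that are classified with the same label as their nearest neighbor in the training set. The sketch below is one way to compute that fraction. It is not the authors' code: the function names, the `predict` callable, and the use of Euclidean distance are assumptions made for illustration only, since the truncated abstract does not specify a distance metric.

```python
import math
from typing import Callable, Hashable, Sequence

Point = Sequence[float]


def nearest_training_label(
    x: Point,
    train_points: Sequence[Point],
    train_labels: Sequence[Hashable],
) -> Hashable:
    """Return the label of the training point closest to x.

    Assumption: closeness is measured with Euclidean distance.
    """
    best_index = min(
        range(len(train_points)),
        key=lambda i: math.dist(x, train_points[i]),
    )
    return train_labels[best_index]


def ncg_fraction(
    ood_points: Sequence[Point],
    train_points: Sequence[Point],
    train_labels: Sequence[Hashable],
    predict: Callable[[Point], Hashable],
) -> float:
    """Fraction of OOD inputs whose predicted label matches the label
    of their nearest neighbor in the training set."""
    if not ood_points:
        raise ValueError("need at least one OOD input")
    if not train_points or len(train_points) != len(train_labels):
        raise ValueError("training points and labels must be non-empty and aligned")
    matches = sum(
        1
        for x in ood_points
        if predict(x) == nearest_training_label(x, train_points, train_labels)
    )
    return matches / len(ood_points)


# Toy usage with a hypothetical threshold classifier standing in for a network.
train_points = [(0.0, 0.0), (10.0, 10.0)]
train_labels = ["a", "b"]
ood_points = [(1.0, 1.0), (9.0, 9.0), (2.0, 0.0), (8.0, 10.0)]


def toy_predict(x: Point) -> str:
    return "a" if x[0] < 8.5 else "b"


ncg = ncg_fraction(ood_points, train_points, train_labels, toy_predict)
```

In this toy setup three of the four OOD points receive the label of their nearest training point, so `ncg` is 0.75. The brute-force neighbor search is linear in the training set size per query, which is fine for a sketch but would need an index structure for image-scale data.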