Conference Proceedings | Yuichi Yoshida

Conference Proceedings

2024

  • Lipschitz Continuous Allocations for Optimization Games. Soh Kumabe and Yuichi Yoshida. ICALP 2024.

    In cooperative game theory, the primary focus is the equitable allocation of payoffs or costs among agents. However, in the practical applications of cooperative games, accurately representing games is challenging. In such cases, using an allocation method sensitive to small perturbations in the game can lead to various problems, including dissatisfaction among agents and the potential for manipulation by agents seeking to maximize their own benefits. Therefore, the allocation method must be robust against game perturbations.

    In this study, we explore optimization games, in which the value of the characteristic function is provided as the optimal value of an optimization problem. To assess the robustness of the allocation methods, we use the Lipschitz constant, which quantifies the extent of change in the allocation vector in response to a unit perturbation in the weight vector of the underlying problem. Thereafter, we provide an algorithm for the matching game that returns an allocation belonging to the \((\frac{1}{2}-\epsilon)\)-approximate core with Lipschitz constant \(O(\epsilon^{-1})\). Additionally, we provide an algorithm for a minimum spanning tree game that returns an allocation belonging to the \(4\)-approximate core with a constant Lipschitz constant.

    The Shapley value is a popular allocation that satisfies several desirable properties. Therefore, we investigate the robustness of the Shapley value. We demonstrate that the Lipschitz constant of the Shapley value for the minimum spanning tree game is constant, whereas that for the matching game is \(\Omega(\log n)\), where \(n\) denotes the number of vertices.
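
    The sketch below is a toy numerical probe of this notion, not the paper's algorithm: it computes the exact Shapley value of a small matching game by enumerating player orders, perturbs one edge weight, and reports the ratio of the \(\ell_1\) change in the allocation to the size of the perturbation. The graph, weights, and perturbation size are arbitrary illustrative choices.

```python
# Toy probe of the Lipschitz behavior of the Shapley value of a matching game.
# Not the paper's algorithm; graph, weights, and perturbation are illustrative.
from itertools import combinations, permutations

def max_matching_value(vertices, edges):
    """Brute-force maximum weight matching restricted to the given vertex set."""
    usable = [(u, v, w) for (u, v, w) in edges if u in vertices and v in vertices]
    best = 0.0
    for k in range(len(usable) + 1):
        for subset in combinations(usable, k):
            endpoints = [x for (u, v, _) in subset for x in (u, v)]
            if len(endpoints) == len(set(endpoints)):   # no shared endpoints
                best = max(best, sum(w for (_, _, w) in subset))
    return best

def shapley(players, edges):
    """Exact Shapley value of the matching game by enumerating all player orders."""
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        seen = set()
        for p in order:
            before = max_matching_value(seen, edges)
            seen.add(p)
            phi[p] += max_matching_value(seen, edges) - before
    return {p: val / len(orders) for p, val in phi.items()}

players = [0, 1, 2, 3]
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.5), (0, 3, 0.5)]

phi = shapley(players, edges)
delta = 1e-3                                   # perturb one edge weight by delta
edges_pert = [(0, 1, 1.0 + delta)] + edges[1:]
phi_pert = shapley(players, edges_pert)
change = sum(abs(phi_pert[p] - phi[p]) for p in players)
print("empirical Lipschitz ratio:", change / delta)
```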

    Sensitivity Analysis,Game Theory
  • Testing Spreading Behavior in Networks with Arbitrary Topologies. Augusto Modanese and Yuichi Yoshida. ICALP 2024. [arXiv]

    Given the full topology of a network, how hard is it to test if it is evolving according to a local rule or is far from doing so? Inspired by the works of Goldreich and Ron (J. ACM, 2017) and Nakar and Ron (ICALP, 2021), we initiate the study of property testing in dynamic environments with arbitrary topologies. Our focus is on the simplest non-trivial rule that can be tested, which corresponds to the \(\mathbf{BP}\) rule of bootstrap percolation and models a simple spreading behavior: Every "infected" node stays infected forever, and each "healthy" node becomes infected if and only if it has at least one infected neighbor. Our results are subdivided into two main groups:

    • If we are testing a single time step of evolution, then the query complexity is \(O(\Delta/\epsilon)\) or \(\tilde{O}(\sqrt{n}/\epsilon)\) (whichever is smaller), where \(\Delta\) and \(n\) are the maximum degree of a node and the number of vertices in the underlying graph, respectively. We also give lower bounds for both one- and two-sided error testers that match our upper bounds up to \(\Delta = o(\sqrt{n})\) and \(\Delta = O(n^{1/3})\), respectively. If \(\epsilon\) is constant, then the first of these also holds against adaptive testers.
    • For the setting of testing the environment over \(T\) time steps, we give two algorithms that need \(O(\Delta^{T-1}/\epsilon T)\) and \(\tilde{O}(|E|/\epsilon T)\) queries, respectively, where \(E\) is the set of edges of the underlying graph.

    All of our algorithms are one-sided error, and all of them are also non-adaptive, with the single exception of the more complex \(\tilde{O}(\sqrt{n}/\epsilon)\)-query tester for the case \(T = 2\).
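
    As a toy illustration of the rule being tested, not of the paper's testers or their query bounds, the sketch below simulates one synchronous step of the BP rule and uses naive random sampling with one-sided error to look for a node that violates the rule between two observed configurations; the path graph and sample counts are illustrative only.

```python
# Minimal sketch: one step of the BP spreading rule plus a naive one-sided sampling check.
# Not the paper's testers; graph and sample counts are arbitrary illustrative choices.
import random

def bp_step(adj, state):
    """One synchronous step: infected (1) nodes stay infected, and a healthy (0)
    node becomes infected iff it has at least one infected neighbor."""
    return [1 if state[v] == 1 or any(state[u] == 1 for u in adj[v]) else 0
            for v in range(len(adj))]

def violates(adj, before, after, v):
    """Check the rule at a single node, reading only its neighborhood."""
    expected = 1 if before[v] == 1 or any(before[u] == 1 for u in adj[v]) else 0
    return after[v] != expected

def sampling_tester(adj, before, after, num_samples, rng=random):
    """One-sided error: accept unless a sampled node witnesses a violation."""
    n = len(adj)
    return not any(violates(adj, before, after, rng.randrange(n))
                   for _ in range(num_samples))

# A path on 5 nodes with node 0 initially infected.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
before = [1, 0, 0, 0, 0]
after = bp_step(adj, before)                    # evolves according to the rule
print(sampling_tester(adj, before, after, 10))  # always True
broken = [1, 1, 0, 1, 0]                        # node 3 infected with no infected neighbor
print(sampling_tester(adj, before, broken, 50)) # False with high probability
```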

    Sublinear-Time Algorithms
  • Online Algorithms for Spectral Hypergraph Sparsification. Tasuku Soma, Kam Chuen Tung, and Yuichi Yoshida. IPCO 2024. [arXiv]

    We provide the first online algorithm for spectral hypergraph sparsification. In the online setting, hyperedges with positive weights arrive in a stream, and upon the arrival of each hyperedge, we must irrevocably decide whether or not to include it in the sparsifier. Our algorithm produces an \((\epsilon,\delta)\)-spectral sparsifier with multiplicative error \(\epsilon\) and additive error \(\delta\) that has \(O(\epsilon^{-2}n \log n \log r \log(1+\epsilon W/\delta n))\) hyperedges with high probability, where \(\epsilon,\delta \in (0,1)\), \(n\) is the number of nodes, and \(W\) is the sum of edge weights. The space complexity of our algorithm is \(O(n^2)\), while previous algorithms require \(\Omega(m)\) space, where \(m\) is the number of hyperedges. This provides an exponential improvement in the space complexity since \(m\) can be exponential in \(n\).

    Spectral Graph Theory,Online Algorithms

2023

  • A Batch-to-Online Transformation under Random-Order Model. Jing Dong and Yuichi Yoshida. NeurIPS 2023. [arXiv]

    We introduce a transformation framework that can be utilized to develop online algorithms with low \(\epsilon\)-approximate regret in the random-order model from offline approximation algorithms. We first give a general reduction theorem that transforms an offline approximation algorithm with low average sensitivity to an online algorithm with low \(\epsilon\)-approximate regret. We then demonstrate that offline approximation algorithms can be transformed into a low-sensitivity version using a coreset construction method. To showcase the versatility of our approach, we apply it to various problems, including online \((k,z)\)-clustering, online matrix approximation, and online regression, and successfully achieve polylogarithmic \(\epsilon\)-approximate regret for each problem. Moreover, we show that in all three cases, our algorithm also enjoys low inconsistency, which may be desired in some online applications.

    Online Algorithms,Sensitivity Analysis
  • Lipschitz Continuous Algorithms for Graph Problems. Soh Kumabe and Yuichi Yoshida. FOCS 2023. [arXiv]

    Graph algorithms are widely used for decision making and knowledge discovery. To ensure their effectiveness, it is essential that their output remains stable even when subjected to small perturbations to the input because frequent output changes can result in costly decisions, reduced user trust, potential security concerns, and lack of replicability. In this study, we consider the Lipschitz continuity of algorithms as a stability measure and initiate a systematic study of the Lipschitz continuity of algorithms for (weighted) graph problems. Depending on how we embed the output solution into a metric space, we can think of several Lipschitzness notions. We mainly consider the one that is invariant under scaling of weights, and we provide Lipschitz continuous algorithms and lower bounds for the minimum spanning tree problem, the shortest path problem, and the maximum weight matching problem. In particular, our shortest path algorithm is obtained by first designing an algorithm for unweighted graphs that is robust against edge contractions and then applying it to the unweighted graph constructed from the original weighted graph. Then, we consider another Lipschitzness notion induced by a natural mapping that maps the output solution to its characteristic vector. It turns out that no Lipschitz continuous algorithm exists for this Lipschitzness notion, and we instead design algorithms with bounded pointwise Lipschitz constants for the minimum spanning tree problem and the maximum weight bipartite matching problem. Our algorithm for the latter problem is based on an LP relaxation with entropy regularization.

    Sensitivity Analysis,Combinatorial Optimization
  • Controlling Posterior Collapse by an Inverse Lipschitz Constraint on the Decoder Network. Yuri Kinoshita, Kenta Oono, Kenji Fukumizu, Yuichi Yoshida, and Shin-ichi Maeda. ICML 2023. [arXiv]

    Variational autoencoders (VAEs) are one of the deep generative models that have experienced enormous success over the past decades. However, in practice, they suffer from a problem called posterior collapse, which occurs when the posterior distribution coincides, or collapses, with the prior, taking no information from the latent structure of the input data into consideration. In this work, we introduce an inverse Lipschitz neural network into the decoder and, based on this architecture, provide a new method, equipped with a concrete theoretical guarantee, that can control in a simple and clear manner the degree of posterior collapse for a wide range of VAE models. We also illustrate the effectiveness of our method through several numerical experiments.

    Machine Learning
  • Average Sensitivity of Decision Tree Learning. Satoshi Hara and Yuichi Yoshida. ICLR 2023.

    A decision tree is a fundamental model used in data mining and machine learning. In practice, the training data used to construct a decision tree may change over time or contain noise. A drastic change in the learned tree structure owing to such data perturbation is unfavorable in practice. For example, in data mining, a change in the tree implies that the extracted knowledge can be unstable, which raises the question of whether the extracted knowledge is truly reliable or is only a noisy artifact. To alleviate this issue, we design decision tree learning algorithms that are stable against insignificant perturbations in the training data. Specifically, we adopt the notion of average sensitivity as a stability measure, and design an algorithm with low average sensitivity that outputs a decision tree whose accuracy is nearly equal to the optimal decision tree. The experimental results on real-world datasets demonstrate that the proposed algorithm achieves a low average sensitivity with an insignificant decrease in accuracy.

    Sensitivity Analysis
  • Low Degree Testing over the Reals. Vipul Arora, Arnab Bhattacharyya, Noah Fleming, Esty Kelman, and Yuichi Yoshida. SODA 2023. [arXiv]

    We study the problem of testing whether a function \(f\colon \mathbb{R}^n \to \mathbb{R}\) is a polynomial of degree at most \(d\) in the distribution-free testing model. Here, the distance between functions is measured with respect to an unknown distribution \(\mathcal{D}\) over \(\mathbb{R}^n\) from which we can draw samples. In contrast to previous work, we do not assume that \(\mathcal{D}\) has finite support.

    We design a tester that given query access to \(f\), and sample access to \(\mathcal{D}\), makes \(\mathrm{poly}(d/\epsilon)\) queries to \(f\), accepts with probability \(1\) if \(f\) is a polynomial of degree \(d\), and rejects with probability at least \(2/3\) if every degree-\(d\) polynomial \(P\) disagrees with \(f\) on a set of mass at least \(\epsilon\) with respect to \(\mathcal{D}\). Our result also holds under mild assumptions when we receive only a polynomial number of bits of precision for each query to \(f\), or when \(f\) can only be queried on rational points representable using a logarithmic number of bits. Along the way, we prove a new stability theorem for multivariate polynomials that may be of independent interest.

    Sublinear-Time Algorithms
  • We show sublinear-time algorithms for Max Cut and Max E2Lin\((q)\) on expanders in the adjacency list model that distinguish instances with the optimal value more than \(1-\varepsilon\) from those with the optimal value less than \(1-\rho\) for \(\rho \gg \varepsilon\). The time complexities for Max Cut and Max E2Lin\((q)\) are \(\tilde{O}(\frac{1}{\phi^2\rho} \cdot m^{1/2+O(\varepsilon/(\phi^2\rho))})\) and \(\widetilde{O}(\mathrm{poly}(\frac{q}{\phi\rho})\cdot {(mq)}^{1/2+O(q^6\varepsilon/\phi^2\rho^2)})\), respectively, where \(m\) is the number of edges in the underlying graph and \(\phi\) is its conductance. Then, we show a sublinear-time algorithm for Unique Label Cover on expanders with \(\phi \gg \varepsilon\) in the bounded-degree model. The time complexity of our algorithm is \(\widetilde{O}_d(2^{q^{O(1)}\cdot\phi^{1/q}\cdot \varepsilon^{-1/2}}\cdot n^{1/2+q^{O(q)}\cdot \varepsilon^{4^{1.5-q}}\cdot \phi^{-2}})\), where \(n\) is the number of variables. We complement these algorithmic results by showing that testing \(3\)-colorability requires \(\Omega(n)\) queries even on expanders.

    Sublinear-Time Algorithms

2022

  • Average Sensitivity of Euclidean k-Clustering. Yuichi Yoshida and Shinji Ito. NeurIPS 2022.

    Given a set of \(n\) points in \(\mathbb{R}^d\), the goal of Euclidean \((k,\ell)\)-clustering is to find \(k\) centers that minimize the sum of the \(\ell\)-th powers of the Euclidean distance of each point to the closest center. In practical situations, the clustering result must be stable against points missing in the input data so that we can make trustworthy and consistent decisions. To address this issue, we consider the average sensitivity of Euclidean \((k,\ell)\)-clustering, which measures the stability of the output in total variation distance against deleting a random point from the input data. We first show that a popular algorithm k-means++ and its variant called \(D^\ell\)-sampling have low average sensitivity. Next, we show that any approximation algorithm for Euclidean \((k,\ell)\)-clustering can be transformed to an algorithm with low average sensitivity while almost preserving the approximation guarantee. As byproducts of our results, we provide several algorithms for consistent \((k,\ell)\)-clustering and dynamic \((k,\ell)\)-clustering in the random-order model, where the input points are randomly permuted and given in an online manner. The goal of the consistent setting is to maintain a good solution while minimizing the number of changes to the solution during the process, and that of the dynamic setting is to maintain a good solution while minimizing the (amortized) update time.
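
    For reference, the sketch below implements the standard \(D^\ell\)-sampling seeding rule analyzed in the paper (it reduces to the k-means++ rule when \(\ell = 2\)); it is the plain seeding procedure, not the paper's sensitivity-reducing transformation, and the data and parameters are illustrative.

```python
# Minimal sketch of D^l-sampling seeding (k-means++ when l = 2).
# Plain seeding only; data and parameters are illustrative.
import numpy as np

def d_ell_sampling(points, k, ell, rng):
    """Pick k centers: the first uniformly at random, each subsequent one with
    probability proportional to (distance to the nearest chosen center)^ell."""
    n = len(points)
    centers = [points[rng.integers(n)]]
    for _ in range(k - 1):
        dists = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        probs = dists ** ell
        probs = probs / probs.sum()
        centers.append(points[rng.choice(n, p=probs)])
    return np.array(centers)

rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
centers = d_ell_sampling(points, k=3, ell=2, rng=rng)
print(centers)
```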

    Sensitivity Analysis
  • Average sensitivity of the Knapsack Problem. Soh Kumabe and Yuichi Yoshida. ESA 2022.

    In resource allocation, we often require that the output allocation of an algorithm is stable against input perturbation because frequent reallocation is costly and untrustworthy. Varma and Yoshida (SODA'21) formalized this requirement for algorithms as the notion of average sensitivity. Here, the average sensitivity of an algorithm on an input instance is, roughly speaking, the average size of the symmetric difference of the output for the instance and that for the instance with one item deleted, where the average is taken over the deleted item.

    In this work, we consider the average sensitivity of the knapsack problem, a representative example of a resource allocation problem. We first show a \((1-\epsilon)\)-approximation algorithm for the knapsack problem with average sensitivity \(O(\epsilon^{-1}\log \epsilon^{-1})\). Then, we complement this result by showing that any \((1-\epsilon)\)-approximation algorithm has average sensitivity \(\Omega(\epsilon^{-1})\). As an application of our algorithm, we consider the incremental knapsack problem in the random-order setting, where the goal is to maintain a good solution while items arrive one by one in a random order. Specifically, we show that for any \(\epsilon > 0\), there exists a \((1-\epsilon)\)-approximation algorithm with amortized recourse \(O(\epsilon^{-1}\log \epsilon^{-1})\) and amortized update time \(O(\log n+f_\epsilon)\), where \(n\) is the total number of items and \(f_\epsilon>0\) is a value depending on \(\epsilon\).
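
    The sketch below illustrates the average-sensitivity notion itself rather than the paper's algorithm: it runs a plain greedy knapsack heuristic, deletes each item in turn, and averages the size of the symmetric difference between the outputs; the instance is an arbitrary toy example.

```python
# Average sensitivity of a plain greedy knapsack heuristic on a toy instance.
# Not the paper's O(eps^{-1} log eps^{-1})-sensitivity algorithm.
def greedy_knapsack(items, capacity):
    """Greedy by value/weight ratio; returns the set of chosen item ids."""
    chosen, used = set(), 0.0
    for item_id, value, weight in sorted(items, key=lambda t: t[1] / t[2], reverse=True):
        if used + weight <= capacity:
            chosen.add(item_id)
            used += weight
    return chosen

def average_sensitivity(items, capacity):
    """Average symmetric-difference size over a uniformly deleted item."""
    base = greedy_knapsack(items, capacity)
    total = 0
    for deleted_id, _, _ in items:
        reduced = [it for it in items if it[0] != deleted_id]
        total += len(base ^ greedy_knapsack(reduced, capacity))
    return total / len(items)

items = [(0, 10.0, 5.0), (1, 9.0, 5.0), (2, 6.0, 4.0), (3, 3.0, 3.0)]  # (id, value, weight)
print(average_sensitivity(items, capacity=9.0))
```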

    Sensitivity Analysis
  • Downsampling for Testing and Learning in Product Distributions. Nathaniel Harms, Yuichi Yoshida. ICALP 2022. [arXiv]

    We study distribution-free property testing and learning problems where the unknown probability distribution is a product distribution over \(\mathbb{R}^d\). For many important classes of functions, such as intersections of halfspaces, polynomial threshold functions, convex sets, and \(k\)-alternating functions, the known algorithms either have complexity that depends on the support size of the distribution, or are proven to work only for specific examples of product distributions. We introduce a general method, which we call downsampling, that resolves these issues. Downsampling uses a notion of "rectilinear isoperimetry" for product distributions, which further strengthens the connection between isoperimetry, testing, and learning. Using this technique, we attain new efficient distribution-free algorithms under product distributions on \(\mathbb{R}^d\):

    • A simpler proof for non-adaptive, one-sided monotonicity testing of functions \([n]^d \to \{0,1\}\), and improved sample complexity for testing monotonicity over unknown product distributions, from \(O(d^7)\) [Black, Chakrabarty, & Seshadhri, SODA 2020] to \(\widetilde O(d^3)\);
    • Polynomial-time agnostic learning algorithms for functions of a constant number of halfspaces, and constant-degree polynomial threshold functions;
    • An \(\exp(O(d\log(dk)))\)-time agnostic learning algorithm, and an \(\exp(O(d\log(dk)))\)-sample tolerant tester, for functions of \(k\) convex sets; and a \(2^{\widetilde O(d)}\) sample-based one-sided tester for convex sets;
    • An \(\exp(\widetilde O(k\sqrt{d}))\)-time agnostic learning algorithm for \(k\)-alternating functions, and a sample-based tolerant tester with the same complexity.
    Sublinear-Time Algorithms
  • Sparsification of Decomposable Submodular Functions. Akbar Rafiey and Yuichi Yoshida. AAAI 2022. [arXiv]

    Submodular functions are at the core of many machine learning and data mining tasks. The underlying submodular functions for many of these tasks are decomposable, i.e., they are sums of several simple submodular functions. In many data-intensive applications, however, the number of underlying submodular functions in the original function is so large that we need a prohibitively large amount of time to process it and/or it does not even fit in the main memory. To overcome this issue, we introduce the notion of sparsification for decomposable submodular functions whose objective is to obtain an accurate approximation of the original function that is a (weighted) sum of only a few submodular functions. Our main result is a polynomial-time randomized sparsification algorithm such that the expected number of functions used in the output is independent of the number of underlying submodular functions in the original function. We also study the effectiveness of our algorithm under various constraints such as matroid and cardinality constraints. We complement our theoretical analysis with an empirical study of the performance of our algorithm.

    Submodular Functions
  • Average Sensitivity of Dynamic Programming. Soh Kumabe and Yuichi Yoshida. SODA 2022. [arXiv]

    When processing data with uncertainty, it is desirable that the output of the algorithm is stable against small perturbations in the input. Varma and Yoshida [SODA’21] recently formalized this idea and proposed the notion of average sensitivity of algorithms, which is, roughly speaking, the average Hamming distance between the solution for the original input and that obtained by deleting one element from the input, where the average is taken over the deleted element.

    In this work, we consider average sensitivity of algorithms for problems that can be solved by dynamic programming. We first present a \((1 - \delta)\)-approximation algorithm for finding a maximum weight chain (MWC) in a transitive directed acyclic graph with average sensitivity \(O(\delta^{-1} \log^3 n)\), where \(n\) is the number of vertices in the graph. We then show algorithms with small average sensitivity for various dynamic programming problems by reducing them to MWC while preserving average sensitivity, including the longest increasing subsequence problem, the interval scheduling problem, the longest common subsequence problem, the knapsack problem with integral weight, and the RNA folding problem. For the RNA folding problem, our reduction is highly nontrivial because a naive reduction generates an exponentially large graph, which only provides a trivial average sensitivity bound.

    Sensitivity Analysis

2021

  • Spectral Hypergraph Sparsifiers of Nearly Linear Size. Michael Kapralov, Robert Krauthgamer, Jakab Tardos, and Yuichi Yoshida. FOCS 2021. [arXiv]

    Graph sparsification has been studied extensively over the past two decades, culminating in spectral sparsifiers of optimal size (up to constant factors). Spectral hypergraph sparsification is a natural analogue of this problem, for which optimal bounds on the sparsifier size are not known, mainly because the hypergraph Laplacian is non-linear, and thus lacks the linear-algebraic structure and tools that have been so effective for graphs. Our main contribution is the first algorithm for constructing \(\epsilon\)-spectral sparsifiers for hypergraphs with \(O^*(n)\) hyperedges, where \(O^*\) suppresses \((\epsilon^{-1}\log n)^{O(1)}\) factors. This bound is independent of the rank \(r\) (maximum cardinality of a hyperedge), and is essentially best possible due to a recent bit complexity lower bound of \(\Omega(nr)\) for hypergraph sparsification.

    This result is obtained by introducing two new tools. First, we give a new proof of spectral concentration bounds for sparsifiers of graphs; it avoids linear-algebraic methods, replacing e.g. the usual application of the matrix Bernstein inequality and therefore applies to the (non-linear) hypergraph setting. To achieve the result, we design a new sequence of hypergraph-dependent \(\epsilon\)-nets on the unit sphere in \(\mathbb{R}^n\). Second, we extend the weight assignment technique of Chen, Khanna and Nagda [FOCS'20] to the spectral sparsification setting. Surprisingly, the number of spanning trees after the weight assignment can serve as a potential function guiding the reweighting process in the spectral setting.

    Spectral Graph Theory,Network Algorithms
  • Local Algorithms for Estimating Effective Resistance. Daniel Lopatta, Pan Peng, Yuichi Yoshida, and Gramoz Goranci. KDD 2021.

    Effective resistance is an important metric that measures the similarity of two vertices in a graph. It has found applications in graph clustering, recommendation systems, network reliability, among others. In spite of the importance of the effective resistances, we still lack efficient algorithms to compute or approximate them on massive graphs.

    In this work, we design several local algorithms for estimating effective resistances, which are algorithms that only read a small portion of the input while still having provable performance guarantees. To illustrate, our main algorithm approximates the effective resistance between any vertex pair \(s,t\) with an arbitrarily small additive error \(\varepsilon\) in time \(O(\mathrm{poly}(\log n/\varepsilon))\), whenever the underlying graph has bounded mixing time. We perform an extensive empirical study on several benchmark datasets, validating the performance of our algorithms.
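
    As a point of comparison for what the local algorithms approximate, the sketch below computes effective resistances exactly from the Laplacian pseudoinverse via \(R(s,t) = (\mathbf{e}_s - \mathbf{e}_t)^\top L^{+} (\mathbf{e}_s - \mathbf{e}_t)\); this global computation reads the whole graph, unlike the local algorithms in the paper.

```python
# Exact effective resistance via the Laplacian pseudoinverse (global, not local).
import numpy as np

def effective_resistance(edges, n, s, t):
    """R(s, t) = (e_s - e_t)^T L^+ (e_s - e_t) for a weighted undirected graph."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    L_pinv = np.linalg.pinv(L)
    chi = np.zeros(n); chi[s], chi[t] = 1.0, -1.0
    return chi @ L_pinv @ chi

# A 4-cycle with unit weights: resistance between adjacent vertices is 3/4.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
print(effective_resistance(edges, n=4, s=0, t=1))   # ~0.75
```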

    Sublinear-Time Algorithms,Network Algorithms,Spectral Graph Theory
  • Online Risk-Averse Submodular Maximization. Tasuku Soma and Yuichi Yoshida. IJCAI 2021.

    We present a polynomial-time online algorithm for maximizing the conditional value at risk (CVaR) of a monotone stochastic submodular function. Given \(T\) i.i.d. samples from an underlying distribution arriving online, our algorithm produces a sequence of solutions that converges to a \((1-1/e)\)-approximate solution with a convergence rate of \(O(T^{-1/4})\) for monotone continuous DR-submodular functions. Compared with previous offline algorithms, which require \(\Omega(T)\) space, our online algorithm only requires \(O(\sqrt{T})\) space. We extend our online algorithm to portfolio optimization for monotone submodular set functions under a matroid constraint. Experiments conducted on real-world datasets demonstrate that our algorithm can rapidly achieve CVaRs that are comparable to those obtained by existing offline algorithms.
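
    The sketch below only illustrates the risk measure being maximized, not the online algorithm: it estimates the CVaR of a reward at level \(\alpha\) as the average of the worst \(\alpha\)-fraction of samples (the lower-tail convention for rewards); the sample distribution is an arbitrary placeholder.

```python
# Empirical CVaR of a reward at level alpha: mean of the worst alpha-fraction of samples.
# Illustrates the risk measure only; conventions for CVaR vary across the literature.
import numpy as np

def empirical_cvar(samples, alpha):
    samples = np.sort(np.asarray(samples))
    k = max(1, int(np.ceil(alpha * len(samples))))
    return samples[:k].mean()

rng = np.random.default_rng(0)
rewards = rng.normal(loc=10.0, scale=2.0, size=10_000)
print(empirical_cvar(rewards, alpha=0.1))   # noticeably below the mean of ~10
```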

    Submodular Functions,Online Algorithms
  • Towards Tight Bounds for Spectral Sparsification of Hypergraphs. Michael Kapralov, Robert Krauthgamer, Jakab Tardos, and Yuichi Yoshida. STOC 2021. [arXiv]

    Cut and spectral sparsification of graphs have numerous applications, including e.g. speeding up algorithms for cuts and Laplacian solvers. These powerful notions have recently been extended to hypergraphs, which are much richer and may offer new applications. However, the current bounds on the size of hypergraph sparsifiers are not as tight as the corresponding bounds for graphs.

    Our first result is a polynomial-time algorithm that, given a hypergraph on \(n\) vertices with maximum hyperedge size \(r\), outputs an \(\epsilon\)-spectral sparsifier with \(O^*(nr)\) hyperedges, where \(O^*\) suppresses \((\epsilon^{-1} \log n)^{O(1)}\) factors. This size bound improves the two previous bounds: \(O^*(n^3)\) [Soma and Yoshida, SODA'19] and \(O^*(nr^3)\) [Bansal, Svensson and Trevisan, FOCS'19]. Our main technical tool is a new method for proving concentration of the nonlinear analogue of the quadratic form of the Laplacians for hypergraph expanders.

    We complement this with lower bounds on the bit complexity of any compression scheme that \((1+\epsilon)\)-approximates all the cuts in a given hypergraph, and hence also on the bit complexity of every \(\epsilon\)-cut/spectral sparsifier. These lower bounds are based on Ruzsa-Szemerédi graphs, and a particular instantiation yields an \(\Omega(nr)\) lower bound on the bit complexity even for fixed constant \(\epsilon\). This is tight up to polylogarithmic factors in \(n\), due to recent hypergraph cut sparsifiers of [Chen, Khanna and Nagda, FOCS'20].

    Finally, for directed hypergraphs, we present an algorithm that computes an \(\epsilon\)-spectral sparsifier with \(O^*(n^2r^3)\) hyperarcs, where \(r\) is the maximum size of a hyperarc. For small \(r\), this improves over \(O^*(n^3)\) known from [Soma and Yoshida, SODA'19], and is getting close to the trivial lower bound of \(\Omega(n^2)\) hyperarcs.

    Spectral Graph Theory
  • RelWalk -- A Latent Variable Model Approach to Knowledge Graph Embedding. Danushka Bollegala, Huda Hakami, Yuichi Yoshida, and Ken-ichi Kawarabayashi. EACL 2021. [arXiv]

    Knowledge Graph Embedding (KGE) is the task of jointly learning entity and relation embeddings for a given knowledge graph. Existing methods for learning KGEs can be seen as a two-stage process where (a) entities and relations in the knowledge graph are represented using some linear algebraic structures (embeddings), and (b) a scoring function is defined that evaluates the strength of a relation that holds between two entities using the corresponding relation and entity embeddings. Unfortunately, prior proposals for the scoring functions in the first step have been heuristically motivated, and it is unclear as to how the scoring functions in KGEs relate to the generation process of the underlying knowledge graph. To address this issue, we propose a generative account of the KGE learning task. Specifically, given a knowledge graph represented by a set of relational triples (h, R, t), where the semantic relation R holds between the two entities h (head) and t (tail), we extend the random walk model (Arora et al., 2016a) of word embeddings to KGE. We derive a theoretical relationship between the joint probability p(h, R, t) and the embeddings of h, R and t. Moreover, we show that marginal loss minimisation, a popular objective used by much prior work in KGE, follows naturally from the log-likelihood ratio maximisation under the probabilities estimated from the KGEs according to our theoretical relationship. We propose a learning objective motivated by the theoretical analysis to learn KGEs from a given knowledge graph. The KGEs learnt by our proposed method obtain state-of-the-art performance on FB15K237 and WN18RR benchmark datasets, providing empirical evidence in support of the theory.

    Natural Language Processing
  • One-Tape Turing Machine and Branching Program Lower Bounds for MCSP. Mahdi Cheraghchi, Shuichi Hirahara, Dimitrios Myrisiotis, and Yuichi Yoshida. STACS 2021.

    For a size parameter \(s\colon\mathbb{N}\to\mathbb{N}\), the Minimum Circuit Size Problem (denoted by \({\rm MCSP}[s(n)]\)) is the problem of deciding whether the minimum circuit size of a given function \(f \colon \{0,1\}^n \to \{0,1\}\) (represented by a string of length \(N := 2^n\)) is at most a threshold \(s(n)\). A recent line of work exhibited "hardness magnification" phenomena for MCSP: A very weak lower bound for MCSP implies a breakthrough result in complexity theory. For example, McKay, Murray, and Williams (STOC 2019) implicitly showed that, for some constant \(\mu_1 > 0\), if \({\rm MCSP}[2^{\mu_1\cdot n}]\) cannot be computed by a one-tape Turing machine (with an additional one-way read-only input tape) running in time \(N^{1.01}\), then \({\rm P}\neq{\rm NP}\).

    In this paper, we present the following new lower bounds against one-tape Turing machines and branching programs:

    • A randomized two-sided error one-tape Turing machine (with an additional one-way read-only input tape) cannot compute \({\rm MCSP}[2^{\mu_2\cdot n}]\) in time \(N^{1.99}\), for some constant \(\mu_2 > \mu_1\).
    • A non-deterministic (or parity) branching program of size \(o(N^{1.5}/\log N)\) cannot compute MKTP, which is a time-bounded Kolmogorov complexity analogue of MCSP. This is shown by directly applying the Nečiporuk method to MKTP, which previously appeared to be difficult.
    • The size of any non-deterministic, co-non-deterministic, or parity branching program computing MCSP is at least \(N^{1.5-o(1)}\).

    These results are the first non-trivial lower bounds for MCSP and MKTP against one-tape Turing machines and non-deterministic branching programs, and essentially match the best-known lower bounds for any explicit functions against these computational models.

    The first result is based on recent constructions of pseudorandom generators for read-once oblivious branching programs (ROBPs) and combinatorial rectangles (Forbes and Kelley, FOCS 2018; Viola 2019). En route, we obtain several related results:

    • There exists a (local) hitting set generator with seed length \(\widetilde{O}(\sqrt{N})\) secure against read-once polynomial-size non-deterministic branching programs on \(N\)-bit inputs.
    • Any read-once co-non-deterministic branching program computing MCSP must have size at least \(2^{\widetilde{\Omega}(N)}\).
    Computational Complexity
  • Average Sensitivity of Graph Algorithms. Nithin Varma and Yuichi Yoshida. SODA 2021. [arXiv] [slides]

    In modern applications of graph algorithms, where the graphs of interest are large and dynamic, it is unrealistic to assume that an input representation contains the full information of a graph being studied. Hence, it is desirable to use algorithms that, even when only a (large) subgraph is available, output solutions that are close to the solutions output when the whole graph is available. We formalize this idea by introducing the notion of average sensitivity of graph algorithms, which is the average earth mover's distance between the output distributions of an algorithm on a graph and its subgraph obtained by removing an edge, where the average is over the edges removed and the distance between two outputs is the Hamming distance.

    In this work, we initiate a systematic study of average sensitivity. After deriving basic properties of average sensitivity such as composition, we provide efficient approximation algorithms with low average sensitivities for concrete graph problems, including the minimum spanning forest problem, the global minimum cut problem, the minimum s-t cut problem, and the maximum matching problem. In addition, we prove that the average sensitivity of our global minimum cut algorithm is almost optimal, by showing a nearly matching lower bound. We also show that every algorithm for the 2-coloring problem has average sensitivity linear in the number of vertices. One of the main ideas involved in designing our algorithms with low average sensitivity is the following fact: if the presence of a vertex or an edge in the solution output by an algorithm can be decided locally, then the algorithm has a low average sensitivity, allowing us to reuse the analyses of known sublinear-time algorithms and local computation algorithms (LCAs). Using this connection, we show that every LCA for 2-coloring has linear query complexity, thereby answering an open question.
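
    The sketch below is a toy measurement of the definition above, not one of the paper's low-sensitivity algorithms: for a deterministic algorithm (here, plain Kruskal for the minimum spanning forest) the earth mover's distance reduces to a Hamming distance, so it averages the symmetric-difference size of the output edge sets over the removed edge; the graph is illustrative.

```python
# Average sensitivity of plain Kruskal's algorithm on a toy weighted graph.
# For a deterministic algorithm, the earth mover's distance reduces to Hamming distance.
def kruskal(n, edges):
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    forest = set()
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.add((u, v))
    return forest

def average_sensitivity(n, edges):
    base = kruskal(n, edges)
    total = 0
    for removed in edges:
        rest = [e for e in edges if e != removed]
        total += len(base ^ kruskal(n, rest))   # symmetric difference of edge sets
    return total / len(edges)

edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0), (3, 0, 4.0), (0, 2, 5.0)]
print(average_sensitivity(4, edges))
```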

    Sublinear-Time Algorithms,Sensitivity Analysis
  • Sensitivity Analysis of the Maximum Matching Problem. Yuichi Yoshida and Samson Zhou. ITCS 2021. [arXiv]

    We consider the sensitivity of algorithms for the maximum matching problem against edge and vertex modifications. When an algorithm \(A\) for the maximum matching problem is deterministic, the sensitivity of \(A\) on \(G\) is defined as \(\max_{e \in E(G)}|A(G) \triangle A(G - e)|\), where \(G-e\) is the graph obtained from \(G\) by removing an edge \(e \in E(G)\) and \(\triangle\) denotes the symmetric difference. When \(A\) is randomized, the sensitivity is defined as \(\max_{e \in E(G)}d_{\mathrm{EM}}(A(G),A(G-e))\), where \(d_{\mathrm{EM}}(\cdot,\cdot)\) denotes the earth mover's distance between two distributions. Thus the sensitivity measures the difference between the output of an algorithm after the input is slightly perturbed. Algorithms with low sensitivity, or stable algorithms, are desirable because they are robust to edge failure or attack.

    In this work, we show a randomized \((1-\epsilon)\)-approximation algorithm with worst-case sensitivity \(O_{\epsilon}(1)\), which substantially improves upon the \((1-\epsilon)\)-approximation algorithm of Varma and Yoshida (SODA 2021), whose average sensitivity is \(n^{O(1/(1+\epsilon^2))}\), and show a deterministic \(1/2\)-approximation algorithm with sensitivity \(\exp(O(\log^*n))\) for bounded-degree graphs. We then show that any deterministic constant-factor approximation algorithm must have sensitivity \(\Omega(\log^* n)\). Our results imply that randomized algorithms are strictly more powerful than deterministic ones in that the former can achieve sensitivity independent of \(n\) whereas the latter cannot. We also show analogous results for vertex sensitivity, where we remove a vertex instead of an edge. As an application of our results, we give an algorithm for the online maximum matching with \(O_{\epsilon}(n)\) total replacements in the vertex-arrival model. By comparison, Bernstein et al. (J. ACM 2019) gave an online algorithm that always outputs the maximum matching, but only for bipartite graphs and with \(O(n\log n)\) total replacements.

    Finally, we introduce the notion of normalized weighted sensitivity, a natural generalization of sensitivity that accounts for the weights of deleted edges. For a graph with weight function \(w\), the normalized weighted sensitivity is defined to be the total weight of the edges in the symmetric difference of the algorithm's outputs, normalized by the weight of the altered edge, i.e., \(\max_{e \in E(G)}\frac{1}{w(e)}w\left(A(G) \triangle A(G - e)\right)\). Hence the normalized weighted sensitivity measures the weighted difference between the output of an algorithm after the input is slightly perturbed, normalized by the weight of the perturbation. We show that if all edges in a graph have polynomially bounded weight, then given a trade-off parameter \(\alpha>2\), there exists an algorithm that outputs a \(\frac{1}{4\alpha}\)-approximation to the maximum weighted matching in \(O(m\log_{\alpha} n)\) time, with normalized weighted sensitivity \(O(1)\).

    Sensitivity Analysis,Online Algorithms
  • Ordered Graph Limits and Their Applications. Omri Ben-Eliezer, Eldar Fischer, Amit Levi, and Yuichi Yoshida. ITCS 2021. [arXiv]

    The emerging theory of graph limits exhibits an analytic perspective on graphs, showing that many important concepts and tools in graph theory and its applications can be described more naturally (and sometimes proved more easily) in analytic language. We extend the theory of graph limits to the ordered setting, presenting a limit object for dense vertex-ordered graphs, which we call an orderon. As a special case, this yields limit objects for matrices whose rows and columns are ordered, and for dynamic graphs that expand (via vertex insertions) over time. Along the way, we devise an ordered locality-preserving variant of the cut distance between ordered graphs, showing that two graphs are close with respect to this distance if and only if they are similar in terms of their ordered subgraph frequencies. We show that the space of orderons is compact with respect to this distance notion, which is key to a successful analysis of combinatorial objects through their limits.

    We derive several applications of the ordered limit theory in extremal combinatorics, sampling, and property testing in ordered graphs. In particular, we prove a new ordered analogue of the well-known result by Alon and Stav [RS&A'08] on the furthest graph from a hereditary property; this is the first known result of this type in the ordered setting. Unlike the unordered regime, here the random graph model \(G(n,p)\) with an ordering over the vertices is not always asymptotically the furthest from the property for some p. However, using our ordered limit theory, we show that random graphs generated by a stochastic block model, where the blocks are consecutive in the vertex ordering, are (approximately) the furthest. Additionally, we describe an alternative analytic proof of the ordered graph removal lemma [Alon et al., FOCS'17].

    Sublinear-Time Algorithms

2020

  • Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits. Shinji Ito, Shuichi Hirahara, Tasuku Soma, and Yuichi Yoshida. NeurIPS 2020.

    We propose novel algorithms with first- and second-order regret bounds for adversarial linear bandits. These regret bounds imply that our algorithms perform well when there is an action achieving a small cumulative loss or the loss has a small variance. In addition, we need only assumptions weaker than those of existing algorithms; our algorithms work on discrete action sets as well as continuous ones without a priori knowledge about losses, and they run efficiently if a linear optimization oracle for the action set is available. These results are obtained by combining optimistic online optimization, continuous multiplicative weight update methods, and a novel technique that we refer to as distribution truncation. We also show that the regret bounds of our algorithms are tight up to polylogarithmic factors.

    Online Algorithms
  • Weak submodularity is a natural relaxation of the diminishing return property, which is equivalent to submodularity. Weak submodularity has been used to show that many (monotone) functions that arise in practice can be efficiently maximized with provable guarantees. In this work we introduce two natural generalizations of weak submodularity for non-monotone functions. We show that an efficient randomized greedy algorithm has provable approximation guarantees for maximizing these functions subject to a cardinality constraint. We then provide a more refined analysis that takes into account that the weak submodularity parameter may change (sometimes improving) throughout the execution of the algorithm. This leads to improved approximation guarantees in some settings. We provide applications of our results for monotone and non-monotone maximization problems.

    Submodular Functions
    The problem of maximizing nonnegative monotone submodular functions under a certain constraint has been intensively studied in the last decade, and a wide range of efficient approximation algorithms have been developed for this problem. Many machine learning problems, including data summarization and influence maximization, can be naturally modeled as the problem of maximizing monotone submodular functions. However, when such applications involve sensitive data about individuals, their privacy concerns should be addressed. In this paper, we study the problem of maximizing monotone submodular functions subject to matroid constraints in the framework of differential privacy. We provide a \((1-\frac{1}{\mathrm{e}})\)-approximation algorithm which improves upon the previous results in terms of approximation guarantee. This is done with an almost cubic number of function evaluations in our algorithm.

    Moreover, we study \(k\)-submodularity, a natural generalization of submodularity. We give the first \(\frac{1}{2}\)-approximation algorithm that preserves differential privacy for maximizing monotone \(k\)-submodular functions subject to matroid constraints. The approximation ratio is asymptotically tight and is obtained with an almost linear number of function evaluations.

    Submodular Functions
  • Average Sensitivity of Spectral Clustering. Pan Peng and Yuichi Yoshida. KDD 2020. [arXiv]

    Spectral clustering is one of the most popular clustering methods for finding clusters in a graph, which has found many applications in data mining. However, the input graph in those applications may have many missing edges due to error in measurement, withholding for a privacy reason, or arbitrariness in data conversion. To make reliable and efficient decisions based on spectral clustering, we assess the stability of spectral clustering against edge perturbations in the input graph using the notion of average sensitivity, which is the expected size of the symmetric difference of the output clusters before and after we randomly remove edges. We first prove that the average sensitivity of spectral clustering is proportional to \(\lambda_2/\lambda_3^2\), where \(\lambda_i\) is the \(i\)-th smallest eigenvalue of the (normalized) Laplacian. We also prove an analogous bound for \(k\)-way spectral clustering, which partitions the graph into \(k\) clusters. Then, we empirically confirm our theoretical bounds by conducting experiments on synthetic and real networks. Our results suggest that spectral clustering is stable against edge perturbations when there is a cluster structure in the input graph.
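
    The sketch below evaluates the eigenvalue ratio \(\lambda_2/\lambda_3^2\) that, per the result above, governs the average sensitivity of spectral clustering; the normalized Laplacian is computed with dense linear algebra on an arbitrary toy graph (two triangles joined by an edge), which is fine for illustration but not for large networks.

```python
# Eigenvalue ratio lambda_2 / lambda_3^2 of the normalized Laplacian on a toy graph.
import numpy as np

def normalized_laplacian_eigvals(adj):
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    lap = np.eye(len(adj)) - d_inv_sqrt @ adj @ d_inv_sqrt
    return np.sort(np.linalg.eigvalsh(lap))

# Two triangles joined by a single edge: a clear 2-cluster structure.
adj = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]:
    adj[u, v] = adj[v, u] = 1.0
eig = normalized_laplacian_eigvals(adj)
print("lambda_2 / lambda_3^2 =", eig[1] / eig[2] ** 2)
```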

    Network Algorithms,Spectral Graph Theory,Sensitivity Analysis
  • Hypergraph Clustering Based on PageRank. Yuuki Takai, Atsushi Miyauchi, Masahiro Ikeda, and Yuichi Yoshida. KDD 2020.

    A hypergraph is a useful combinatorial object to model ternary or higher-order relations among entities. Clustering hypergraphs is a fundamental task in network analysis. In this study, we develop two clustering algorithms based on personalized PageRank on hypergraphs. The first one is local in the sense that its goal is to find a tightly connected vertex set with a bounded volume including a specified vertex. The second one is global in the sense that its goal is to find a tightly connected vertex set. For both algorithms, we discuss theoretical guarantees on the conductance of the output vertex set. Also, we experimentally demonstrate that our clustering algorithms outperform existing methods in terms of both the solution quality and running time. To the best of our knowledge, ours are the first practical algorithms for hypergraphs with theoretical guarantees on the conductance of the output set.
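
    For orientation, the sketch below runs personalized PageRank on an ordinary graph by power iteration; the paper's algorithms operate on hypergraphs, where the diffusion is non-linear, so this only illustrates the graph case, and the graph, seed, and teleport probability are illustrative.

```python
# Personalized PageRank on an ordinary graph by power iteration (graph case only).
import numpy as np

def personalized_pagerank(adj, seed, alpha=0.15, iters=200):
    """Iterate pr = alpha * e_seed + (1 - alpha) * pr @ P, with P the random-walk matrix."""
    n = len(adj)
    P = adj / adj.sum(axis=1, keepdims=True)
    e = np.zeros(n); e[seed] = 1.0
    pr = e.copy()
    for _ in range(iters):
        pr = alpha * e + (1 - alpha) * pr @ P
    return pr

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
print(personalized_pagerank(adj, seed=0))
```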

    Network Algorithms,Spectral Graph Theory
  • On Random Subsampling of Gaussian Process Regression: A Graphon-Based Analysis. Kohei Hayashi, Masaaki Imaizumi, and Yuichi Yoshida. AISTATS 2020.

    In this paper, we study random subsampling of Gaussian process regression, one of the simplest approximation baselines, from a theoretical perspective. Although subsampling discards a large part of the training data, we show provable guarantees on the accuracy of the predictive mean/variance and its generalization ability. For analysis, we consider embedding kernel matrices into graphons, which encapsulate the difference of the sample size and enable us to evaluate the approximation and generalization errors in a unified manner. The experimental results show that the subsampling approximation achieves a better trade-off regarding accuracy and runtime than the Nyström and random Fourier expansion methods.
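
    The sketch below is a minimal version of the baseline analyzed here: exact Gaussian process regression with an RBF kernel fitted on a uniformly random subsample of the training data; the kernel, noise level, and subsample size are arbitrary choices, and no claim is made about matching the paper's error bounds.

```python
# GP regression with an RBF kernel on a random subsample of the training data.
import numpy as np

def rbf(X1, X2, length_scale=1.0):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_predict(X_train, y_train, X_test, noise=0.1):
    """Standard GP predictive mean and (latent) variance with observation noise."""
    K = rbf(X_train, X_train) + noise ** 2 * np.eye(len(X_train))
    K_star = rbf(X_test, X_train)
    mean = K_star @ np.linalg.solve(K, y_train)
    var = rbf(X_test, X_test).diagonal() - np.einsum(
        "ij,ji->i", K_star, np.linalg.solve(K, K_star.T))
    return mean, var

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
idx = rng.choice(500, size=50, replace=False)     # random subsampling of the training set
X_test = np.linspace(-3, 3, 5)[:, None]
mean, var = gp_predict(X[idx], y[idx], X_test)
print(mean, var)
```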

    Machine Learning,Sublinear-Time Algorithms
  • We study the problem of testing whether a function \(f\colon \mathbb{R}^n \rightarrow \mathbb{R}\) is linear (i.e., both additive and homogeneous) in the distribution-free property testing model, where the distance between functions is measured with respect to an unknown probability distribution over \(\mathbb{R}^n\). We show that, given query access to \(f\), sampling access to the unknown distribution as well as the standard Gaussian, and \(\varepsilon > 0\), we can distinguish additive functions from functions that are \(\varepsilon\)-far from additive functions with \(O\left(\frac{1}{\varepsilon}\log\frac{1}{\varepsilon}\right)\) queries, independent of \(n\). Furthermore, under the assumption that \(f\) is a continuous function, the additivity tester can be extended to a distribution-free tester for linearity using the same number of queries. On the other hand, we show that if we are only allowed to get values of \(f\) on sampled points, then any distribution-free tester requires \(\Omega(n)\) samples, even if the underlying distribution is the standard Gaussian.

    Sublinear-Time Algorithms

2019

  • Variational Inference of Penalized Regression with Submodular Functions. Koh Takeuchi, Yuichi Yoshida, and Yoshinobu Kawahara. UAI 2019.

    Various regularizers inducing structured-sparsity are constructed as Lovász extensions of submodular functions. In this paper, we consider a hierarchical probabilistic model of linear regression and its kernel extension with this type of regularization, and develop a variational inference scheme for the posterior estimate on this model. We derive an upper bound on the partition function with an approximation guarantee, and then show that minimizing this bound is equivalent to the minimization of a quadratic function over the polyhedron determined by the corresponding submodular function, which can be solved efficiently by the proximal gradient algorithm. Our scheme gives a natural extension of the Bayesian Lasso model for the maximum a posteriori (MAP) estimation to a variety of regularizers inducing structured sparsity, and thus this work provides a principled way to transfer the advantages of the Bayesian formulation into those models. Finally, we investigate the empirical performance of our scheme with several Bayesian variants of widely known models such as Lasso, generalized fused Lasso, and non-overlapping group Lasso.

    Submodular Functions,Machine Learning
  • In this paper, we utilize the perspective of property testing to consider the testability of relational database queries. A primary motivation is the desire to avoid reading an entire database to decide a property thereof. We focus on conjunctive queries, which are the most basic and heavily studied database queries. Each conjunctive query can be represented as a relational structure A such that deciding if the conjunctive query is satisfied by a relational structure B is equivalent to deciding if there exists a homomorphism from A to B. We phrase our results in terms of homomorphisms. Precisely, we study, for each relational structure A, the testability of homomorphism inadmissibility from A. We consider algorithms that have oracle access to an input relational structure B and that distinguish, with high probability, the case where there is no homomorphism from A to B, from the case where one needs to remove a constant fraction of tuples from B in order to suppress all such homomorphisms. We provide a complete characterization of the structures A from which one can test homomorphism inadmissibility with one-sided error by making a constant number of queries to B. Our characterization shows that homomorphism inadmissibility from A is constant-query testable with one-sided error if and only if the core of A is \(\alpha\)-acyclic. We also show that the injective version of the problem is constant-query testable with one-sided error if A is \(\alpha\)-acyclic; this result generalizes existing results for testing subgraph-freeness in the general graph model.

    Sublinear-Time Algorithms,Constraint Satisfaction Problems
  • Sensitivity Analysis of Centralities on Unweighted Networks. Shogo Murai and Yuichi Yoshida. WWW 2019.

    Revealing important vertices is a fundamental task in network analysis. As such, many indicators have been proposed for doing so, which are collectively called centralities. However, the abundance of studies on centralities blurs their differences. In this work, we compare centralities based on their sensitivity to modifications in the graph. Specifically, we introduce a quantitative measure called (average-case) edge sensitivity, which measures how much the centrality value of a uniformly chosen vertex (or an edge) changes when we remove a uniformly chosen edge. Edge sensitivity is applicable to unweighted graphs, for which, to our knowledge, there has been no theoretical analysis of the centralities. We conducted a theoretical analysis of the edge sensitivities of six major centralities: the closeness centrality, harmonic centrality, betweenness centrality, endpoint betweenness centrality, PageRank, and spanning tree centrality. Our experimental results on synthetic and real graphs confirm the tendency predicted by the theoretical analysis. We also discuss an extension of edge sensitivity to the setting in which we remove a uniformly chosen set of edges of size \(k\) for an integer \(k \geq 1\).

    Network Algorithms
  • Estimating Walk-Based Similarities Using Random Walk. Shogo Murai and Yuichi Yoshida. WWW 2019.

    Measuring similarities between vertices is an important task in network analysis, which has numerous applications. One major approach to define a similarity between vertices is by accumulating weights of walks between them, which encompasses personalized PageRank (PPR) and Katz similarity. Although many effective methods for PPR based on efficient simulation of random walks have been proposed, these techniques cannot be applied to other walk-based similarity notions because the random walk interpretation is only valid for PPR. In this paper, we propose a random-walk reduction method that reduces the computation of any walk-based similarity to the computation of a stationary distribution of a random walk. With this reduction, we can exploit techniques for PPR to compute other walk-based similarities. As a concrete application, we design an indexing method for walk-based similarities with which we can quickly estimate the similarity value of queried pairs of vertices, and theoretically analyze its approximation error. Our experimental results demonstrate that the instantiation of our method for Katz similarity is two orders of magnitude faster than existing methods on large real networks, without any deterioration in solution quality.
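
    As a concrete example of a walk-based similarity (evaluated in closed form rather than by the paper's random-walk reduction), the sketch below computes Katz similarity as \(\sum_{k\ge 1}\beta^k (A^k)_{s,t} = ((I-\beta A)^{-1} - I)_{s,t}\), which converges when \(\beta\) is below the reciprocal of the spectral radius of \(A\); the graph and \(\beta\) are illustrative.

```python
# Katz similarity in closed form: sum_{k>=1} beta^k (A^k)_{s,t} = ((I - beta*A)^{-1} - I)_{s,t}.
import numpy as np

def katz_similarity(adj, beta):
    n = len(adj)
    assert beta * max(abs(np.linalg.eigvals(adj))) < 1, "beta too large to converge"
    return np.linalg.inv(np.eye(n) - beta * adj) - np.eye(n)

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
K = katz_similarity(adj, beta=0.2)
print(K[0, 3])     # similarity between vertices 0 and 3
```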

    Network Algorithms
  • The Cheeger inequality for undirected graphs, which relates the conductance of an undirected graph and the second smallest eigenvalue of its normalized Laplacian, is a cornerstone of spectral graph theory. The Cheeger inequality has been extended to directed graphs and hypergraphs using normalized Laplacians for them, which are no longer linear but piecewise linear transformations. In this paper, we introduce the notion of a submodular transformation \(F:\{0,1\}^n\to\mathbb{R}^m\), which applies \(m\) submodular functions to the \(n\)-dimensional input vector, and then introduce the notions of its Laplacian and normalized Laplacian. With these notions, we unify and generalize the existing Cheeger inequalities by showing a Cheeger inequality for submodular transformations, which relates the conductance of a submodular transformation and the smallest non-trivial eigenvalue of its normalized Laplacian. This result recovers the Cheeger inequalities for undirected graphs, directed graphs, and hypergraphs, and derives novel Cheeger inequalities for mutual information and directed information. Computing the smallest non-trivial eigenvalue of a normalized Laplacian of a submodular transformation is NP-hard under the small set expansion hypothesis. In this paper, we present a polynomial-time \(O(\log n)\)-approximation algorithm for the symmetric case, which is tight, and a polynomial-time \(O(\log^2 n+\log n \cdot \log m)\)-approximation algorithm for the general case. We expect the algebra concerned with submodular transformations, or submodular algebra, to be useful in the future not only for generalizing spectral graph theory but also for analyzing other problems that involve piecewise linear transformations, e.g., deep learning.

    Submodular Functions,Spectral Graph Theory
  • Spectral Sparsification of Hypergraphs. Tasuku Soma and Yuichi Yoshida. SODA 2019. [arXiv]

    For an undirected/directed hypergraph \(G=(V,E)\), its Laplacian \(L_G:\mathbb{R}^V\to\mathbb{R}^V\) is defined such that its "quadratic form" \(\mathbf{x}^\top L_G(\mathbf{x})\) captures the cut information of \(G\). In particular, \(\mathbf{1}^\top_S L_G(\mathbf{1}_S)\) coincides with the cut size of \(S\subseteq V\), where \(\mathbf{1}_S \in \mathbb{R}^V\) is the characteristic vector of \(S\). A weighted subgraph \(H\) of a hypergraph \(G\) on a vertex set \(V\) is said to be an \(\epsilon\)-spectral sparsifier of \(G\) if \((1-\epsilon)\mathbf{x}^\top L_H(\mathbf{x})\leq \mathbf{x}^\top L_G(\mathbf{x}) \leq (1+\epsilon)\mathbf{x}^\top L_H(\mathbf{x})\) holds for every \(\mathbf{x} \in \mathbb{R}^V\). In this paper, we present a polynomial-time algorithm that, given an undirected/directed hypergraph \(G\) on \(n\) vertices, constructs an \(\epsilon\)-spectral sparsifier of \(G\) with \(O(n^3\log n/\epsilon^2)\) hyperedges/hyperarcs. The proposed spectral sparsification can be used to improve the time and space complexities of algorithms for solving problems that involve the quadratic form, such as computing the eigenvalues of \(L_G\), computing the effective resistance between a pair of vertices in \(G\), semi-supervised learning based on \(L_G\), and cut problems on \(G\). In addition, our sparsification result implies that any submodular function \(f:2^V\to \mathbb{R}_+\) with \(f(\emptyset)=f(V)=0\) can be concisely represented by a directed hypergraph. Accordingly, we show that, for any distribution, we can properly and agnostically learn submodular functions \(f:2^V\to [0,1]\) with \(f(\emptyset)=f(V)=0\), with \(O(n^4\log(n/\epsilon)/\epsilon^4)\) samples.
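
    The sketch below spells out the quadratic form used above for undirected hypergraphs, \(\mathbf{x}^\top L_G(\mathbf{x}) = \sum_{e} w_e \max_{u,v\in e}(x_u - x_v)^2\), and numerically checks the \(\epsilon\)-spectral-sparsifier condition on random test vectors; the hypergraphs below are toy data, not the output of the proposed algorithm, and passing random checks is only a sanity test, not a proof.

```python
# Quadratic form of an undirected hypergraph and a random-vector sparsifier check.
import numpy as np

def quadratic_form(hyperedges, weights, x):
    """x^T L_G(x) = sum_e w_e * (max_{v in e} x_v - min_{v in e} x_v)^2."""
    return sum(w * (max(x[v] for v in e) - min(x[v] for v in e)) ** 2
               for e, w in zip(hyperedges, weights))

def looks_like_sparsifier(G, H, eps, n, trials=1000, rng=None):
    """Sanity check of (1-eps) Q_H(x) <= Q_G(x) <= (1+eps) Q_H(x) on random vectors."""
    rng = rng or np.random.default_rng(0)
    for _ in range(trials):
        x = rng.normal(size=n)
        qg, qh = quadratic_form(*G, x), quadratic_form(*H, x)
        if not ((1 - eps) * qh <= qg <= (1 + eps) * qh):
            return False
    return True

n = 4
G = ([(0, 1, 2), (1, 2, 3), (0, 3)], [1.0, 1.0, 1.0])
H = ([(0, 1, 2), (1, 2, 3), (0, 3)], [1.0, 1.0, 1.0])   # trivially, G sparsifies itself
print(looks_like_sparsifier(G, H, eps=0.1, n=n))        # True
H_bad = ([(0, 1, 2), (1, 2, 3)], [1.5, 1.5])            # drops a hyperedge
print(looks_like_sparsifier(G, H_bad, eps=0.1, n=n))    # likely False
```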

    Submodular Functions,Spectral Graph Theory

2018

  • 0/1/all CSPs, Half-Integral A-path Packing, and Linear-Time FPT Algorithms. Yoichi Iwata, Yutaro Yamaguchi, and Yuichi Yoshida. FOCS 2018. [arXiv]

    A recent trend in the design of FPT algorithms is exploiting the half-integrality of LP relaxations. In other words, starting with a half-integral optimal solution to an LP relaxation, we assign integral values to variables one-by-one by branch and bound. This technique is general and the resulting time complexity has a low dependency on the parameter. However, the time complexity often becomes a large polynomial in the input size because we need to compute half-integral optimal LP solutions. In this paper, we address this issue by providing an \(O(km)\)-time algorithm for solving the LPs arising from various FPT problems, where \(k\) is the optimal value and \(m\) is the number of edges/constraints. Our algorithm is based on interesting connections among 0/1/all constraints, which have been studied in the field of constraint satisfaction, \(A\)-path packing, which has been studied in the field of combinatorial optimization, and the LPs used in FPT algorithms. With the aid of this algorithm, we obtain improved FPT algorithms for various problems, including Group Feedback Vertex Set, Subset Feedback Vertex Set, Node Multiway Cut, Node Unique Label Cover, and Non-monochromatic Cycle Transversal. The obtained running time for each of these problems is linear in the input size and has the current smallest dependency on the parameter. In particular, these algorithms are the first linear-time FPT algorithms for problems including Group Feedback Vertex Set and Non-monochromatic Cycle Transversal.

    Constraint Satisfaction Problems,FPT
  • We design a sublinear-time approximation algorithm for quadratic function minimization problems with a better error bound than the previous algorithm by Hayashi and Yoshida (NIPS’16). Our approximation algorithm can be modified to handle the case where the minimization is done over a sphere. The analysis of our algorithms is obtained by combining results from graph limit theory, along with a novel spectral decomposition of matrices. Specifically, we prove that a matrix A can be decomposed into a structured part and a pseudorandom part, where the structured part is a block matrix with a polylogarithmic number of blocks, such that in each block all the entries are the same, and the pseudorandom part has a small spectral norm, achieving better error bound than the existing decomposition theorem of Frieze and Kannan (FOCS’96). As an additional application of the decomposition theorem, we give a sublinear-time approximation algorithm for computing the top singular values of a matrix.

    Sublinear-Time Algorithms
  • In a cooperative game, the utility of a coalition of players is given by the characteristic function, and the goal is to find a stable value division of the total utility to the players. In real-world applications, however, multiple scenarios could exist, each of which determines a characteristic function, and which scenario is more important is unknown. To handle such situations, the notion of multi-scenario cooperative games and several solution concepts have been proposed. However, computing the value divisions in those solution concepts is intractable in general.

    To resolve this issue, we focus on supermodular two-scenario cooperative games, in which the number of scenarios is two and the characteristic functions are supermodular, and study the computational aspects of a major solution concept called the preference core. First, we show that we can compute the value division in the preference core of a supermodular two-scenario game in polynomial time. Then, we reveal the relations among preference cores with different parameters. Finally, we provide more efficient algorithms for deciding the non-emptiness of the preference core for several specific supermodular two-scenario cooperative games such as the generalized induced subgraph game, airport game, and multicast tree game.

    Submodular Functions,Game Theory
  • In monotone submodular function maximization, approximation guarantees based on the curvature of the objective function have been extensively studied in the literature. However, the notion of curvature is often pessimistic, and we rarely obtain improved approximation guarantees, even for very simple objective functions. In this paper, we provide a novel approximation guarantee by extracting an \(M^\natural \)-concave function \(h:2^E \to \mathbb{R}_+\), a notion in discrete convex analysis, from the objective function \(f:2^E\to \mathbb{R}_+\). We introduce the notion of \(h\)-curvature, which measures how much \(f\) deviates from \(h\), and show that we can obtain a \((1-\gamma/e-\epsilon)\)-approximation to the problem of maximizing \(f\) under a cardinality constraint in polynomial time for any constant \(\epsilon>0\). Then, we show that we can obtain nontrivial approximation guarantees for various problems by applying the proposed algorithm.

    Submodular Functions
  • Spectral Normalization for Generative Adversarial Networks. Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. ICLR 2018 (oral). [code] [article by Christian Cosgrove] [Available in pytorch]

    One of the challenges in the study of generative adversarial networks is the instability of their training. In this paper, we propose a novel weight normalization technique called spectral normalization to stabilize the training of the discriminator. Our new normalization technique is computationally light and easy to incorporate into existing implementations. We tested the efficacy of spectral normalization on the CIFAR10, STL-10, and ILSVRC2012 datasets, and we experimentally confirmed that spectrally normalized GANs (SN-GANs) are capable of generating images of better or equal quality relative to previous training stabilization techniques.

    Machine Learning
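
    For readers who want the gist in code, here is a minimal NumPy sketch of the core operation: estimating the largest singular value \(\sigma(W)\) of a weight matrix by power iteration and dividing by it. Layer integration, convolutional reshaping, and training details are in the linked [code]; the names below are illustrative.

      import numpy as np

      def spectral_normalize(W, u, n_power_iterations=1, eps=1e-12):
          # Estimate sigma(W), the largest singular value, by power iteration and return W / sigma.
          for _ in range(n_power_iterations):
              v = W.T @ u
              v /= (np.linalg.norm(v) + eps)
              u = W @ v
              u /= (np.linalg.norm(u) + eps)
          sigma = u @ W @ v  # Rayleigh-quotient estimate of the largest singular value
          return W / sigma, u

      rng = np.random.default_rng(0)
      W = rng.normal(size=(64, 128))
      u = rng.normal(size=64)
      W_sn, u = spectral_normalize(W, u, n_power_iterations=50)
      print(np.linalg.svd(W_sn, compute_uv=False)[0])  # close to 1 after enough iterations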
  • Statistically Efficient Estimation for Non-Smooth Probability Densities. Masaaki Imaizumi, Takanori Maehara, and Yuichi Yoshida. AISTATS 2018 (best paper award, oral).

    We investigate statistical efficiency of estimators for non-smooth density functions. The density estimation problem appears in various situations, and it is intensively used in statistics and machine learning. The statistical efficiencies of estimators, i.e., their convergence rates, play a central role in advanced statistical analysis. Although estimators and their convergence rates for smooth density functions are well investigated in the literature, those for non-smooth density functions remain elusive despite their importance in application fields. In this paper, we propose new estimators for non-smooth density functions by employing the notion of Szemerédi partitions from graph theory. We derive convergence rates of the proposed estimators. One of them has the optimal convergence rate in the minimax sense, and the other has a slightly worse convergence rate but runs in polynomial time. Experimental results support the theoretical performance of our estimators.

    Machine Learning
  • Guaranteed Sufficient Decrease for Stochastic Variance Reduced Gradient Optimization. Fanhua Shang, Yuanyuan Liu, Kaiwen Zhou, James Cheng, Kelvin Kai Wing Ng, and Yuichi Yoshida. AISTATS 2018.

    In this paper, we propose a novel sufficient decrease technique for stochastic variance reduced gradient descent methods such as SVRG and SAGA. In order to make sufficient decrease for stochastic optimization, we design a new sufficient decrease criterion, which yields sufficient decrease versions of stochastic variance reduction algorithms such as SVRG-SD and SAGA-SD as a byproduct. We introduce a coefficient to scale the current iterate and to satisfy the sufficient decrease property, which takes the decision to shrink, expand, or even move in the opposite direction, and then give two specific update rules of the coefficient for Lasso and ridge regression. Moreover, we analyze the convergence properties of our algorithms for strongly convex problems, which show that both of our algorithms attain linear convergence rates. We also provide the convergence guarantees of our algorithms for non-strongly convex problems. Our experimental results further verify that our algorithms achieve significantly better performance than their counterparts.

    Machine Learning
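
    For context, the sketch below shows plain SVRG (the base method that SVRG-SD augments) on a small ridge-regression example in NumPy; the sufficient-decrease coefficient and its update rules from the paper are deliberately omitted, and all names are illustrative.

      import numpy as np

      def svrg(grad_i, n, w0, step=0.01, epochs=20, inner=None, rng=None):
          # Plain SVRG: a full gradient at a snapshot, then variance-reduced stochastic steps.
          rng = rng or np.random.default_rng(0)
          inner = inner or 2 * n
          w_tilde = w0.copy()
          for _ in range(epochs):
              mu = np.mean([grad_i(w_tilde, i) for i in range(n)], axis=0)  # full gradient at snapshot
              w = w_tilde.copy()
              for _ in range(inner):
                  i = rng.integers(n)
                  w -= step * (grad_i(w, i) - grad_i(w_tilde, i) + mu)      # variance-reduced step
              w_tilde = w
          return w_tilde

      # ridge regression: f_i(w) = 0.5 * (x_i . w - y_i)^2 + 0.5 * lam * ||w||^2
      rng = np.random.default_rng(1)
      X, w_true = rng.normal(size=(200, 10)), rng.normal(size=10)
      y, lam = X @ w_true, 1e-3
      grad_i = lambda w, i: (X[i] @ w - y[i]) * X[i] + lam * w
      print(svrg(grad_i, len(X), np.zeros(10)))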
  • Using k-way Co-occurrences for Learning Word Embeddings. Danushka Bollegala, Yuichi Yoshida, and Ken-ichi Kawarabayashi. AAAI 2018. [arXiv]

    Co-occurrences between two words provide useful insights into the semantics of those words. Consequently, numerous prior works on word embedding learning have used co-occurrences between two words as the training signal for learning word embeddings. However, in natural language texts it is common for multiple words to be related and co-occurring in the same context. We extend the notion of co-occurrences to cover \(k(\geq 2)\)-way co-occurrences among a set of \(k\) words. Specifically, we prove a theoretical relationship between the joint probability of \(k(\geq 2)\) words, and the sum of \(\ell_2\) norms of their embeddings. Next, we propose a learning objective motivated by our theoretical result that utilises \(k\)-way co-occurrences for learning word embeddings. Our experimental results show that the derived theoretical relationship does indeed hold empirically, and despite data sparsity, for some smaller \(k(\leq 5)\) values, \(k\)-way embeddings perform comparably or better than 2-way embeddings in a range of tasks.

    Natural Language Processing

2017

  • Fitting Low-Rank Tensors in Constant Time. Kohei Hayashi and Yuichi Yoshida. NIPS 2017 (spotlight).

    In this paper, we develop an algorithm that approximates the residual error of Tucker decomposition, one of the most popular tensor decomposition methods, with a provable guarantee. Given an order-\(K\) tensor \(X\in\mathbb{R}^{N_1\times\cdots\times N_K}\), our algorithm randomly samples a constant number \(s\) of indices for each mode and creates a "mini" tensor \(\tilde{X}\in\mathbb{R}^{s\times\cdots\times s}\), whose elements are given by the intersection of the sampled indices on \(X\). Then, we show that the residual error of the Tucker decomposition of \(\tilde{X}\) is sufficiently close to that of \(X\) with high probability. This result implies that we can figure out how much we can fit a low-rank tensor to \(X\) in constant time, regardless of the size of \(X\). This is useful for guessing the favorable rank of Tucker decomposition. Finally, we demonstrate how the sampling method works quickly and accurately using multiple real datasets.

    Sublinear-Time Algorithms,Machine Learning
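
    The sampling step is simple enough to sketch in NumPy: draw \(s\) indices per mode, restrict \(X\) to their Cartesian product, and measure the residual of a low-rank fit of the mini-tensor. The sketch below uses a plain HOSVD truncation as a stand-in for a full Tucker/ALS fit and makes no attempt to reproduce the paper's guarantee; all names are illustrative.

      import numpy as np

      def unfold(T, mode):
          return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

      def mode_product(T, M, mode):
          return np.moveaxis(np.tensordot(M, np.moveaxis(T, mode, 0), axes=(1, 0)), 0, mode)

      def hosvd_residual(T, rank):
          # Relative residual of a rank-(rank, ..., rank) HOSVD approximation of T.
          factors = [np.linalg.svd(unfold(T, k), full_matrices=False)[0][:, :rank]
                     for k in range(T.ndim)]
          core = T
          for k, U in enumerate(factors):
              core = mode_product(core, U.T, k)
          approx = core
          for k, U in enumerate(factors):
              approx = mode_product(approx, U, k)
          return np.linalg.norm(T - approx) / np.linalg.norm(T)

      def sampled_residual(X, s, rank, rng=None):
          # Sample s indices per mode, build the mini-tensor, and fit it instead of X.
          rng = rng or np.random.default_rng(0)
          idx = [rng.choice(n, size=min(s, n), replace=False) for n in X.shape]
          X_mini = X[np.ix_(*idx)]
          return hosvd_residual(X_mini, rank)

      X = np.random.default_rng(2).normal(size=(80, 90, 100))
      print(sampled_residual(X, s=30, rank=5))  # compare with hosvd_residual(X, 5)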
  • Streaming Algorithms for Maximizing Monotone Submodular Functions under a Knapsack Constraint. Chien-Chung Huang, Naonori Kakimura and Yuichi Yoshida. APPROX 2017.

    In this paper, we consider the problem of maximizing a monotone submodular function subject to a knapsack constraint in the streaming setting. In particular, the elements arrive sequentially and at any point of time, the algorithm has access only to a small fraction of the data stored in primary memory. For this problem, we propose a \((0.363 - \epsilon)\)-approximation algorithm, requiring only a single pass through the data; moreover, we propose a \((0.4 - \epsilon)\)-approximation algorithm requiring a constant number of passes through the data. The required memory space of both algorithms depends only on the size of the knapsack capacity and \(\epsilon\).

    Submodular Functions
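
    To make the streaming model concrete, here is a toy single-pass heuristic that keeps an element only if it fits the remaining budget and its marginal gain per unit cost clears a density threshold. It is emphatically not the paper's \((0.363-\epsilon)\)-approximation algorithm, and the coverage function below is only an illustration.

      def single_pass_threshold(stream, cost, f, budget, density):
          # Keep an arriving element if it fits and its gain per unit cost clears the threshold;
          # memory holds only the current solution, as in the streaming setting.
          S, used = [], 0.0
          for e in stream:
              gain = f(S + [e]) - f(S)
              if used + cost[e] <= budget and gain >= density * cost[e]:
                  S.append(e)
                  used += cost[e]
          return S

      # toy monotone submodular function: coverage
      coverage = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 5, 6, 7}, 4: {2}}
      cost = {0: 2.0, 1: 1.0, 2: 0.5, 3: 3.0, 4: 1.0}
      f = lambda S: len(set().union(*(coverage[e] for e in S))) if S else 0
      print(single_pass_threshold([0, 1, 2, 3, 4], cost, f, budget=4.0, density=1.0))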
  • Landmark indexing for Evaluation of Label-Constrained Reachability Queries. Lucien Valstar, George Fletcher and Yuichi Yoshida. SIGMOD 2017. [slides] [code]

    Consider a directed edge-labeled graph, such as a social network or a citation network. A fundamental query on such data is to determine if there is a path in the graph from a given source vertex to a given target vertex, using only edges with labels in a restricted subset of the edge labels in the graph. Such label-constrained reachability (LCR) queries play an important role in graph analytics, for example, as a core fragment of the so-called regular path queries which are supported in practical graph query languages such as the W3C’s SPARQL 1.1, Neo4j’s Cypher, and Oracle’s PGQL. Current solutions for LCR evaluation, however, do not scale to large graphs which are increasingly common in a broad range of application domains. In this paper we present the first practical solution for efficient LCR evaluation, leveraging landmark-based indexes for large graphs. We show through extensive experiments that our indexes are significantly smaller than state-of-the-art LCR indexing techniques, while supporting up to orders of magnitude faster query evaluation times. Our complete C++ codebase is available as open source for further research.

    Network Algorithms
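
    As a point of reference for what an LCR query asks, the snippet below answers a single query online with a label-restricted BFS; the paper's landmark index is designed precisely to avoid this per-query traversal. The graph encoding and names are illustrative.

      from collections import deque

      def lcr_bfs(adj, source, target, allowed_labels):
          # Online BFS that only follows edges whose label lies in allowed_labels.
          if source == target:
              return True
          seen, queue = {source}, deque([source])
          while queue:
              u = queue.popleft()
              for v, label in adj.get(u, []):
                  if label in allowed_labels and v not in seen:
                      if v == target:
                          return True
                      seen.add(v)
                      queue.append(v)
          return False

      # directed edge-labeled toy graph: u -> [(v, label), ...]
      adj = {0: [(1, 'knows'), (2, 'cites')], 1: [(3, 'cites')], 2: [(3, 'knows')]}
      print(lcr_bfs(adj, 0, 3, {'knows'}))           # False: every 0-3 path uses a 'cites' edge
      print(lcr_bfs(adj, 0, 3, {'knows', 'cites'}))  # True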
  • Portfolio Optimization for Influence Spread. Naoto Ohsaka and Yuichi Yoshida. WWW 2017. [slides (by Ohsaka)]

    Motivated by viral marketing, stochastic diffusion processes that model influence spread on a network have been studied intensively. The primary interest in such models has been to find a seed set of a fixed size that maximizes the expected size of the cascade from it. Practically, however, it is not desirable to have the risk of ending with a small cascade, even if the expected size of the cascade is large. To address this issue, we adopt conditional value at risk (CVaR) as a risk measure, and propose an algorithm that computes a portfolio over seed sets with a provable guarantee on its CVaR. Using real-world social networks, we demonstrate that the portfolio computed by our algorithm has a significantly better CVaR than seed sets computed by other baseline methods.

    Network Algorithms,Submodular Functions
  • Submodular function maximization has numerous applications in machine learning and artificial intelligence. Many real applications require multiple submodular objective functions to be maximized, and which function is regarded as important by a user is not known in advance. In such cases, it is desirable to have a small family of representative solutions that would satisfy any user’s preference. A traditional approach for solving such a problem is to enumerate the Pareto optimal solutions. However, owing to the massive number of Pareto optimal solutions (possibly exponentially many), it is difficult for a user to select a solution. In this paper, we propose two efficient methods for finding a small family of representative solutions, based on the notion of regret ratio. The first method outputs a family of fixed size with a nontrivial regret ratio. The second method enables us to choose the size of the output family, and in the biobjective case, it has a provable trade-off between the size and the regret ratio. Using real and synthetic data, we empirically demonstrate that our methods achieve a small regret ratio.

    Submodular Functions
  • Non-monotone DR-Submodular Function Maximization. Tasuku Soma and Yuichi Yoshida. AAAI 2017. [arXiv] [poster]

    We consider non-monotone DR-submodular function maximization, where DR-submodularity (diminishing return submodularity) is an extension of submodularity for functions over the integer lattice based on the concept of the diminishing return property. Maximizing non-monotone DR-submodular functions has many applications in machine learning that cannot be captured by submodular set functions. In this paper, we present a \(1/(2+\epsilon)\)-approximation algorithm with a running time of roughly \(O(n/\epsilon \log^2 B)\), where \(n\) is the size of the ground set, \(B\) is the maximum value of a coordinate, and \(\epsilon > 0\) is a parameter. The approximation ratio is almost tight and the dependency of running time on \(B\) is exponentially smaller than that of the naive greedy algorithm. Experiments on synthetic and real-world datasets demonstrate that our algorithm outputs almost the best solution compared to other baseline algorithms, whereas its running time is several orders of magnitude faster.

    Submodular Functions
  • Computing Least Cores of Supermodular Cooperative Games. Daisuke Hatano and Yuichi Yoshida. AAAI 2017.

    One of the goals of a cooperative game is to compute a value division to the players from which they have no incentive to deviate. This concept is formalized as the notion of the core. To obtain a value division that motivates players to cooperate to a greater extent or that is more robust under noise, the notions of the strong least core and the weak least core have been considered. In this paper, we characterize the strong and the weak least cores of supermodular cooperative games using the theory of minimizing crossing submodular functions. We then apply our characterizations to two representative supermodular cooperative games, namely, the induced subgraph game generalized to hypergraphs and the airport game. For these games, we derive explicit forms of the strong and weak least core values, and provide polynomial-time algorithms that compute value divisions in the strong and weak least cores.

    Submodular Functions,Game Theory
  • Random-Radius Ball Method for Estimating Closeness Centrality. Wataru Inariba, Takuya Akiba and Yuichi Yoshida. AAAI 2017.

    In the analysis of real-world complex networks, identifying important vertices is one of the most fundamental operations. A variety of centrality measures have been proposed and extensively studied in various research areas. Many distance-based centrality measures have issues in treating disconnected networks, which are resolved by the recently emerged harmonic centrality. This paper focuses on a family of centrality measures including the harmonic centrality and its variants, and addresses their computational difficulty on very large graphs by presenting a new estimation algorithm named the random-radius ball (RRB) method. The RRB method is easy to implement, and a theoretical analysis, which includes the time complexity and error bounds, is also provided. The effectiveness of the RRB method over existing algorithms is demonstrated through experiments on real-world networks.

    Network Algorithms
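
    For reference, harmonic centrality itself is \(C(v)=\sum_{u\neq v}1/d(v,u)\) (with \(1/\infty=0\)); the exact but non-scalable computation below, one BFS per vertex, is the quantity the RRB method is designed to estimate on graphs far too large for this approach. Names are illustrative.

      from collections import deque

      def harmonic_centrality(adj):
          # Exact harmonic centrality on an unweighted graph via one BFS per vertex.
          def bfs(src):
              dist = {src: 0}
              queue = deque([src])
              while queue:
                  u = queue.popleft()
                  for w in adj[u]:
                      if w not in dist:
                          dist[w] = dist[u] + 1
                          queue.append(w)
              return dist
          return {v: sum(1.0 / d for u, d in bfs(v).items() if u != v) for v in adj}

      adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
      print(harmonic_centrality(adj))  # e.g. vertex 0: 1/1 + 1/1 + 1/2 = 2.5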

2016

  • Minimizing Quadratic Functions in Constant Time. Kohei Hayashi and Yuichi Yoshida. NIPS 2016. [arXiv] [poster]

    A sampling-based optimization method for quadratic functions is proposed. Our method approximately solves the following \(n\)-dimensional quadratic minimization problem in constant time, which is independent of \(n\): \(z^*=\min_{\mathbf{v} \in \mathbb{R}^n}\langle \mathbf{v},A\mathbf{v}\rangle+n\langle \mathbf{v},\mathrm{diag}(\mathbf{d})\mathbf{v}\rangle+n\langle \mathbf{b},\mathbf{v}\rangle \), where \(A \in \mathbb{R}^{n \times n}\) is a matrix and \(\mathbf{d},\mathbf{b} \in \mathbb{R}^n\) are vectors. Our theoretical analysis specifies the number of samples \(k(\delta,\epsilon)\) such that the approximated solution \(z\) satisfies \(|z-z^*|=O(\epsilon n^2)\) with probability \(1-\delta\). The empirical performance (accuracy and runtime) is positively confirmed by numerical experiments.

    Sublinear-Time Algorithms,Machine Learning
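
    The flavor of the sampling approach can be sketched as follows (under simplifying assumptions, and not as a faithful reproduction of the paper's estimator or its guarantee): restrict the objective to \(k\) sampled indices with \(n\) replaced by \(k\), solve the restricted problem assuming it is convex, and rescale the optimal value by \((n/k)^2\). All names below are illustrative.

      import numpy as np

      def sampled_quadratic_min(A, d, b, k, rng=None):
          # Rough sketch: minimize the objective restricted to k sampled indices, then rescale.
          # Assumes the restricted problem is convex (positive-definite Hessian).
          rng = rng or np.random.default_rng(0)
          n = A.shape[0]
          S = rng.choice(n, size=k, replace=False)
          A_S, d_S, b_S = A[np.ix_(S, S)], d[S], b[S]
          H = A_S + A_S.T + 2 * k * np.diag(d_S)  # Hessian of the restricted objective
          u = np.linalg.solve(H, -k * b_S)        # stationary point
          z_hat = u @ A_S @ u + k * u @ (d_S * u) + k * b_S @ u
          return (n / k) ** 2 * z_hat

      n = 2000
      rng = np.random.default_rng(3)
      A = rng.normal(size=(n, n)) / n
      d = np.ones(n)                              # keeps the toy problem convex
      b = rng.normal(size=n)
      print(sampled_quadratic_min(A, d, b, k=60))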
  • Testing Assignments to Constraint Satisfaction Problems. Hubie Chen, Matt Valeriote and Yuichi Yoshida. FOCS 2016. [arXiv] [slides]

    For a finite relational structure \(A\), let \(\mathrm{CSP}(A)\) denote the CSP instances whose constraint relations are taken from \(A\). The resulting family of problems \(\mathrm{CSP}(A)\) has been considered heavily in a variety of computational contexts. In this article, we consider this family from the perspective of property testing: given an instance of a CSP and query access to an assignment, one wants to decide whether the assignment satisfies the instance, or is far from so doing. While previous work on this scenario studied concrete templates or restricted classes of structures, this article presents comprehensive classification theorems. Our first contribution is a dichotomy theorem completely characterizing the structures \(A\) such that \(\mathrm{CSP}(A)\) is constant-query testable: (i) If \(A\) has a majority polymorphism and a Maltsev polymorphism, then \(\mathrm{CSP}(A)\) is constant-query testable with one-sided error. (ii) Else, testing \(\mathrm{CSP}(A)\) requires a super-constant number of queries. Let \(\exists\mathrm{CSP}(A)\) denote the extension of \(\mathrm{CSP}(A)\) to instances which may include existentially quantified variables. Our second contribution is to classify all structures \(A\) in terms of the number of queries needed to test assignments to instances of \(\exists\mathrm{CSP}(A)\), with one-sided error. More specifically, we show the following trichotomy: (i) If \(A\) has a majority polymorphism and a Maltsev polymorphism, then \(\exists\mathrm{CSP}(A)\) is constant-query testable with one-sided error. (ii) Else, if \(A\) has a \((k + 1)\)-ary near-unanimity polymorphism for some \(k \geq 2\) and no Maltsev polymorphism, then \(\exists\mathrm{CSP}(A)\) is not constant-query testable (even with two-sided error) but is sublinear-query testable with one-sided error. (iii) Else, testing \(\exists\mathrm{CSP}(A)\) with one-sided error requires a linear number of queries.

    Constraint Satisfaction Problems,Sublinear-Time Algorithms
  • Dynamic Influence Analysis in Evolving Networks. Naoto Ohsaka, Takuya Akiba, Yuichi Yoshida and Ken-ichi Kawarabayashi. Proceedings of the VLDB Endowment 2016. [slides (by Ohsaka)] [code]

    We propose the first real-time fully-dynamic index data structure designed for influence analysis on evolving networks. With this aim, we carefully redesign the data structure of the state-of-the-art sketching method introduced by Borgs et al., and construct corresponding update algorithms. Using this index, we present algorithms for two kinds of queries, influence estimation and influence maximization, which are strongly motivated by practical applications, such as viral marketing. We provide a thorough theoretical analysis, which guarantees the non-degeneracy of the solution accuracy after an arbitrary number of updates. Furthermore, we introduce a reachability-tree-based technique and a skipping method, which greatly reduce the time consumption required for edge/vertex deletions and vertex additions, respectively, and counter-based random number generators, which improve the space efficiency. Experimental evaluations using real dynamic networks with tens of millions of edges demonstrate the efficiency, scalability, and accuracy of our proposed indexing scheme. Specifically, it can reflect a graph modification within a time of several orders of magnitude smaller than that required to reconstruct an index from scratch, estimate the influence spread of a vertex set accurately within a millisecond, and select highly influential vertices at least ten times faster than state-of-the-art static algorithms.

    Network Algorithms
  • Efficient Algorithms for Spanning Tree Centrality. Takanori Hayashi, Takuya Akiba and Yuichi Yoshida. IJCAI 2016.

    In a connected graph, spanning tree centralities of a vertex and an edge measure how crucial they are for the graph to be connected. In this paper, we propose efficient algorithms for estimating spanning tree centralities with theoretical guarantees on their accuracy. We experimentally demonstrate that our methods are orders of magnitude faster than previous methods. Then, we propose a novel centrality notion of a vertex, called aggregated spanning tree centrality, which also considers the number of connected components obtained by removing the vertex. We also give an efficient algorithm for estimating aggregated spanning tree centrality. Finally, we experimentally show that those spanning tree centralities are useful to identify vulnerable edges and vertices in infrastructure networks.

    Network Algorithms
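
    For background, the spanning tree centrality of an edge, i.e., the fraction of spanning trees that contain it, equals the effective resistance between its endpoints in an unweighted connected graph (by Kirchhoff's matrix-tree theorem), so on small graphs it can be computed exactly with the Laplacian pseudoinverse as below. The paper's sampling algorithms exist precisely because this dense computation does not scale; names are illustrative.

      import numpy as np

      def spanning_tree_edge_centrality(n, edges):
          # Effective resistance of each edge = probability it appears in a uniform spanning tree.
          L = np.zeros((n, n))
          for u, v in edges:
              L[u, u] += 1; L[v, v] += 1
              L[u, v] -= 1; L[v, u] -= 1
          Lp = np.linalg.pinv(L)  # Moore-Penrose pseudoinverse of the graph Laplacian
          return {(u, v): Lp[u, u] + Lp[v, v] - 2 * Lp[u, v] for u, v in edges}

      # a triangle plus a pendant edge: the pendant edge lies in every spanning tree (value 1),
      # and each triangle edge lies in 2 of the 3 spanning trees (value 2/3)
      print(spanning_tree_edge_centrality(4, [(0, 1), (1, 2), (2, 0), (2, 3)]))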
  • The problem of maximizing non-negative monotone submodular functions under a certain constraint has been intensively studied in the last decade. In this paper, we address the problem for functions defined over the integer lattice. Suppose that a non-negative monotone submodular function \(f:\mathbb{Z}^n_+ \to \mathbb{R}_+\) is given via an evaluation oracle. Assume further that \(f\) satisfies the diminishing return property, which is not an immediate consequence of the submodularity when the domain is the integer lattice. Then, we show polynomial-time \((1-1/e-\epsilon)\)-approximation algorithms for cardinality constraints, polymatroid constraints, and knapsack constraints. For a cardinality constraint, we also show a \((1-1/e-\epsilon)\)-approximation algorithm with slightly worse time complexity that does not rely on the diminishing return property. Our algorithms for a polymatroid constraint and a knapsack constraint first extend the domain of the objective function to the Euclidean space and then run the continuous greedy algorithm. We give two different kinds of continuous extensions, one for polymatroid constraints and the other for knapsack constraints, which might be of independent interest.

    Submodular Functions
  • In this work, we introduce a new Markov operator associated with a digraph, which we refer to as a nonlinear Laplacian. Unlike previous Laplacians for digraphs, the nonlinear Laplacian does not rely on the stationary distribution of the random walk process and is well defined on digraphs that are not strongly connected. We show that the nonlinear Laplacian has nontrivial eigenvalues and give a Cheeger-like inequality, which relates the conductance of a digraph and the smallest non-zero eigenvalue of its nonlinear Laplacian. Finally, we apply the nonlinear Laplacian to the analysis of real-world networks and obtain encouraging results.

    Network Algorithms,Submodular Functions,Spectral Graph Theory
  • We consider stable signal recovery in \(\ell_q\) quasi-norm for \(0 < q \leq 1\). In this problem, given a measurement vector \(y = Ax\) for some unknown signal vector \(x \in \mathbb{R}^n\) and a known matrix \(A \in \mathbb{R}^{m \times n}\), we want to recover \(z \in \mathbb{R}^n\) with \(\|x - z\|_q = O(\|x - x^*\|_q)\) from the measurement vector, where \(x^*\) is the \(s\)-sparse vector closest to \(x\) in \(\ell_q\) quasi-norm. Although a small value of \(q\) is favorable for measuring the distance to sparse vectors, previous methods for \(q < 1\) involve \(\ell_q\) quasi-norm minimization which is computationally intractable. In this paper, we overcome this issue by using the sum-of-squares method, and give the first polynomial-time stable recovery scheme for a large class of matrices \(A\) in \(\ell_q\) quasi-norm for any fixed constant \(0 < q \leq 1\).

  • Let \(\{f_i:\mathbb{F}_p^i\to \{0,1\}\}\) be a sequence of functions, where \(p\) is a fixed prime and \(\mathbb{F}_p\) is the finite field of order \(p\). The limit of the sequence can be syntactically defined using the notion of ultralimit. Inspired by the Gowers norm, we introduce a metric over limits of function sequences, and study properties of it. One application of this metric is that it provides a simpler characterization of affine-invariant parameters of functions that are constant-query estimable than the previous one obtained by Yoshida (STOC'14). Using this characterization, we show that the property of being a function of a constant number of low-degree polynomials and a constant number of factored polynomials (of arbitrary degrees) is constant-query testable if it is closed under blowing-up. Examples of this property include the property of having a constant spectral norm and degree-structural properties with rank conditions.

    Boolean Functions,Sublinear-Time Algorithms
  • Improved Approximation Algorithms for k-Submodular Function Maximization. Satoru Iwata, Shin-ichi Tanigawa and Yuichi Yoshida. SODA 2016. [arXiv]

    This paper presents a polynomial-time \(1/2\)-approximation algorithm for maximizing nonnegative \(k\)-submodular functions. This improves upon the previous \(\max\{1/3, 1/(1 + a)\}\)-approximation by Ward and Živný [18], where \(a = \max\{1,\sqrt{(k-1)/4}\}\). We also show that for monotone \(k\)-submodular functions there is a polynomial-time \(k/(2k-1)\)-approximation algorithm while for any \(\epsilon > 0\) a \(((k + 1)/2k + \epsilon)\)-approximation algorithm for maximizing monotone \(k\)-submodular functions would require exponentially many queries. In particular, our hardness result implies that our algorithms are asymptotically tight. We also extend the approach to provide constant factor approximation algorithms for maximizing skew bisubmodular functions, which were recently introduced as generalizations of bisubmodular functions.

    Submodular Functions
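
    To make the value-oracle model concrete, the sketch below runs a simple greedy baseline for the monotone case: it repeatedly commits the (element, part) pair with the largest marginal gain. This is not the paper's algorithms (whose guarantees rely on randomization), and the toy objective, a sum of per-part coverage functions, which is monotone \(k\)-submodular, is only illustrative.

      def greedy_k_submodular(elements, k, f):
          # Greedy baseline: always take the (element, part) pair with the largest marginal gain.
          parts = [set() for _ in range(k)]
          remaining = set(elements)
          while remaining:
              base, best = f(parts), None
              for e in remaining:
                  for i in range(k):
                      parts[i].add(e)
                      gain = f(parts) - base
                      parts[i].remove(e)
                      if best is None or gain > best[0]:
                          best = (gain, e, i)
              _, e, i = best
              parts[i].add(e)
              remaining.remove(e)
          return parts

      # element e placed in part i covers cover[i][e]; f = items covered, summed over parts
      cover = [{0: {1, 2}, 1: {2, 3}, 2: {4}},
               {0: {5}, 1: {5, 6}, 2: {6, 7}}]

      def f(parts):
          total = 0
          for i, part in enumerate(parts):
              covered = set()
              for e in part:
                  covered |= cover[i][e]
              total += len(covered)
          return total

      print(greedy_k_submodular([0, 1, 2], 2, f))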

2015

  • We consider a generalization of the submodular cover problem based on the concept of the diminishing return property on the integer lattice. We are motivated by real scenarios in machine learning that cannot be captured by (traditional) submodular set functions. We show that the generalized submodular cover problem can be applied to various problems and devise a bicriteria approximation algorithm. Our algorithm is guaranteed to output a log-factor approximate solution that satisfies the constraints with the desired accuracy. The running time of our algorithm is roughly \(O(n\log(nr)\log r)\), where \(n\) is the size of the ground set and \(r\) is the maximum value of a coordinate. The dependency on \(r\) is exponentially better than that of the naive reduction algorithms. Several experiments on real and artificial datasets demonstrate that the solution quality of our algorithm is comparable to naive algorithms, while the running time is several orders of magnitude faster.

    Submodular Functions
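
    For intuition about the underlying problem, here is the classic greedy for the set-domain special case (submodular cover): keep adding the element with the largest marginal gain until the requirement is met, which yields the well-known logarithmic approximation factor. The paper's contribution, the generalization to the integer lattice with an exponentially better dependence on \(r\), is not reproduced here; the coverage instance is illustrative.

      def greedy_submodular_cover(elements, f, target):
          # Classic greedy for submodular cover: add the best element until f(S) >= target.
          S = []
          while f(S) < target:
              gains = {e: f(S + [e]) - f(S) for e in elements if e not in S}
              best = max(gains, key=gains.get)
              if gains[best] <= 0:
                  raise ValueError("target unreachable")
              S.append(best)
          return S

      cover = {0: {1, 2, 3}, 1: {3, 4, 5}, 2: {5, 6}, 3: {7}}
      f = lambda S: len(set().union(*(cover[e] for e in S))) if S else 0
      print(greedy_submodular_cover(list(cover), f, target=7))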
  • A \(k\)-submodular function is a generalization of a submodular function, where the input consists of \(k\) disjoint subsets, instead of a single subset, of the domain. Many machine learning problems, including influence maximization with \(k\) kinds of topics and sensor placement with \(k\) kinds of sensors, can be naturally modeled as the problem of maximizing monotone \(k\)-submodular functions. In this paper, we give constant-factor approximation algorithms for maximizing monotone \(k\)-submodular functions subject to several size constraints. The running time of our algorithms is almost linear in the domain size. We experimentally demonstrate that our algorithms outperform baseline algorithms in terms of the solution quality.

    Submodular Functions
  • Fully Dynamic Betweenness Centrality Maintenance on Massive Networks. Takanori Hayashi, Takuya Akiba and Yuichi Yoshida. Proceedings of the VLDB Endowment 2015. [code]

    Measuring the relative importance of each vertex in a network is one of the most fundamental building blocks in network analysis. Among several importance measures, betweenness centrality, in particular, plays key roles in many real applications. Considerable effort has been made for developing algorithms for static settings. However, real networks today are highly dynamic and are evolving rapidly, and scalable dynamic methods that can instantly reflect graph changes into centrality values are required. In this paper, we present the first fully dynamic method for managing betweenness centrality of all vertices in a large dynamic network. Its main data structure is the weighted hyperedge representation of shortest paths called hypergraph sketch. We carefully design dynamic update procedure with theoretical accuracy guarantee. To accelerate updates, we further propose two auxiliary data structures called two-ball index and special-purpose reachability index. Experimental results using real networks demonstrate its high scalability and efficiency. In particular, it can reflect a graph change in less than a millisecond on average for a large-scale web graph with 106M vertices and 3.7B edges, which is several orders of magnitude larger than the limits of previous dynamic methods.

    Network Algorithms
  • On the Equivalence among Problems of Bounded Width. Yoichi Iwata and Yuichi Yoshida. ESA 2015. [arXiv]

    In this paper, we introduce a methodology, called decomposition-based reductions, for showing the equivalence among various problems of bounded width. First, we show that the following are equivalent for any \(\alpha > 0\):

    • SAT can be solved in \(O^*(2^{\alpha \mathrm{tw}})\) time,
    • 3-SAT can be solved in \(O^*(2^{\alpha \mathrm{tw}})\) time,
    • Max 2-SAT can be solved in \(O^*(2^{\alpha\mathrm{tw}})\) time,
    • Independent Set can be solved in \(O^*(2^{\alpha\mathrm{tw}})\) time, and
    • Independent Set can be solved in \(O^*(2^{\alpha\mathrm{cw}})\) time,

    where \(\mathrm{tw}\) and \(\mathrm{cw}\) are the tree-width and clique-width of the instance, respectively. Then, we introduce a new parameterized complexity class EPNL, which includes Set Cover and TSP, and show that SAT, 3-SAT, Max 2-SAT, and Independent Set parameterized by path-width are EPNL-complete. This implies that if one of these EPNL-complete problems can be solved in \(O^*(c^k)\) time, then any problem in EPNL can be solved in \(O^*(c^k)\) time.

    Combinatorial Optimization
  • Learning Word Representations from Relational Graphs. Danushka Bollegala, Takanori Maehara, Yuichi Yoshida and Ken-ichi Kawarabayashi. AAAI 2015. [arXiv]

    Attributes of words and relations between two words are central to numerous tasks in Artificial Intelligence such as knowledge representation, similarity measurement, and analogy detection. Often when two words share one or more attributes in common, they are connected by some semantic relations. On the other hand, if there are numerous semantic relations between two words, we can expect some of the attributes of one of the words to be inherited by the other. Motivated by this close connection between attributes and relations, given a relational graph in which words are inter-connected via numerous semantic relations, we propose a method to learn a latent representation for the individual words. The proposed method considers not only the co-occurrences of words as done by existing approaches for word representation learning, but also the semantic relations in which two words co-occur. To evaluate the accuracy of the word representations learnt using the proposed method, we use the learnt word representations to solve semantic word analogy problems. Our experimental results show that it is possible to learn better word representations by using the semantic relations between words.

    Natural Language Processing
  • Distributed Multiplicative Weights Methods for DCOP. Daisuke Hatano and Yuichi Yoshida. AAAI 2015.

    We introduce a new framework for solving distributed constraint optimization problems that extends the domain of each variable into a simplex. We propose two methods for searching the extended domain for good assignments. The first one relaxes the problem using linear programming, finds the optimum LP solution, and rounds it to an assignment. The second one plays a cost-minimization game, finds a certain kind of equilibrium, and rounds it to an assignment. Both methods are realized by performing the multiplicative weights method in a distributed manner. We experimentally demonstrate that our methods have good scalability, and in particular, the second method outperforms existing algorithms in terms of solution quality and efficiency.

    Constraint Satisfaction Problems
  • Efficient Top-k Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling. Takuya Akiba, Takanori Hayashi, Nozomi Nori, Yoichi Iwata and Yuichi Yoshida. AAAI 2015. [code]

    We propose an indexing scheme for top-\(k\) shortest-path distance queries on graphs, which is useful in a wide range of important applications such as network-aware search and link prediction. While considerable effort has been made for efficiently answering standard (top-1) distance queries, none of the previous methods can be directly extended to top-\(k\) distance queries. We propose a new framework for top-\(k\) distance queries based on 2-hop cover and then present an efficient indexing algorithm based on the simple but effective recent notion of pruned landmark labeling. Extensive experimental results on real social and web graphs show the scalability, efficiency and robustness of our method. Moreover, we demonstrate the usefulness of top-\(k\) distance queries through an application to link prediction.

    Network Algorithms

2014

  • Robust Approximation of Temporal CSP. Suguru Tamaki and Yuichi Yoshida. APPROX 2014.

    A temporal constraint language \(\Gamma\) is a set of relations with first-order definitions in \((\mathbb{Q};<)\). Let \(\mathrm{CSP}(\Gamma)\) denote the set of constraint satisfaction problem instances with relations from \(\Gamma\). \(\mathrm{CSP}(\Gamma)\) admits robust approximation if, for any \(\varepsilon \geq 0\), given a \((1-\varepsilon)\)-satisfiable instance of \(\mathrm{CSP}(\Gamma)\), we can compute an assignment that satisfies at least a \((1-f(\varepsilon))\)-fraction of constraints in polynomial time. Here, \(f(\varepsilon)\) is some function satisfying \(f(0)=0\) and \(\lim_{\varepsilon \to 0}f(\varepsilon)=0\).

    Firstly, we give a qualitative characterization of robust approximability: Assuming the Unique Games Conjecture, we give a necessary and sufficient condition on \(\Gamma\) under which \(\mathrm{CSP}(\Gamma)\) admits robust approximation. Secondly, we give a quantitative characterization of robust approximability: Assuming the Unique Games Conjecture, we precisely characterize how \(f(\varepsilon)\) depends on \(\varepsilon\) for each \(\Gamma\). We show that our robust approximation algorithms can be run in almost linear time.

    Constraint Satisfaction Problems
  • Betweenness centrality measures the importance of a vertex by quantifying the number of times it acts as a midpoint of the shortest paths between other vertices. This measure is widely used in network analysis. In many applications, we wish to choose the \(k\) vertices with the maximum adaptive betweenness centrality, which is the betweenness centrality without considering the shortest paths that have been taken into account by already-chosen vertices.

    All previous methods are designed to compute the betweenness centrality in a fixed graph. Thus, to solve the above task, we have to run these methods \(k\) times. In this paper, we present a method that directly solves the task, with an almost linear runtime no matter how large the value of \(k\) is. Our method first constructs a hypergraph that encodes the betweenness centrality, and then computes the adaptive betweenness centrality by examining this hypergraph. Our technique can be utilized to handle other centrality measures.

    We theoretically prove that our method is very accurate, and experimentally confirm that it is three orders of magnitude faster than previous methods. Relying on the scalability of our method, we experimentally demonstrate that strategies based on adaptive betweenness centrality are effective in important applications studied in the network science and database communities.

    Network Algorithms
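
    A much-simplified sketch of the sampling idea: each sampled vertex pair contributes a "hyperedge" consisting of the vertices lying on some shortest path between them, and adaptive selection becomes greedy coverage of these hyperedges (a chosen vertex removes the hyperedges it covers). Sample sizes, weighting, and the accuracy guarantee from the paper are omitted, and all names are illustrative.

      from collections import deque
      import random

      def bfs_dist(adj, src):
          dist, queue = {src: 0}, deque([src])
          while queue:
              u = queue.popleft()
              for v in adj[u]:
                  if v not in dist:
                      dist[v] = dist[u] + 1
                      queue.append(v)
          return dist

      def sample_hyperedges(adj, num_samples, rng):
          # Each hyperedge = vertices on some shortest path between a random reachable pair.
          vertices, hyperedges = list(adj), []
          for _ in range(num_samples):
              s, t = rng.sample(vertices, 2)
              ds, dt = bfs_dist(adj, s), bfs_dist(adj, t)
              if t in ds:
                  hyperedges.append({v for v in adj if v in ds and v in dt
                                     and ds[v] + dt[v] == ds[t]})
          return hyperedges

      def adaptive_topk(adj, k, num_samples=200, seed=0):
          rng = random.Random(seed)
          hyperedges = sample_hyperedges(adj, num_samples, rng)
          chosen = []
          for _ in range(k):
              counts = {v: sum(v in h for h in hyperedges) for v in adj if v not in chosen}
              v = max(counts, key=counts.get)
              chosen.append(v)
              hyperedges = [h for h in hyperedges if v not in h]  # adaptivity: drop covered paths
          return chosen

      adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}  # an undirected path graph
      print(adaptive_topk(adj, k=2))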
  • Fast and Accurate Influence Maximization on Large Networks with Pruned Monte-Carlo Simulations. Naoto Ohsaka, Takuya Akiba, Yuichi Yoshida and Ken-Ichi Kawarabayashi. AAAI 2014. [slides (by Ohsaka)] [code]

    Influence maximization is a problem to find small sets of highly influential individuals in a social network to maximize the spread of influence under stochastic cascade models of propagation. Although the problem has been well-studied, it is still highly challenging to find solutions of high quality in large-scale networks of the day. While Monte-Carlo-simulation-based methods produce near-optimal solutions with a theoretical guarantee, they are prohibitively slow for large graphs. As a result, many heuristic methods without any theoretical guarantee have been developed, but all of them substantially compromise solution quality. To address this issue, we propose a new method for the influence maximization problem. Unlike other recent heuristic methods, the proposed method is a Monte-Carlo-simulation-based method, and thus it consistently produces solutions of high quality with the theoretical guarantee. On the other hand, unlike other previous Monte-Carlo-simulation-based methods, it runs as fast as other state-of-the-art methods, and can be applied to large networks of the day. Through our extensive experiments, we demonstrate the scalability and the solution quality of the proposed method.

    Network Algorithms
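
    For reference, the textbook baseline whose simulation cost the proposed pruning reduces: greedy seed selection with plain Monte-Carlo estimates of the expected spread under the independent cascade model. The pruning techniques themselves are not shown, and the parameters below are illustrative.

      import random

      def ic_cascade(adj, seeds, p, rng):
          # One independent-cascade simulation: each newly active node infects each out-neighbor w.p. p.
          active, frontier = set(seeds), list(seeds)
          while frontier:
              nxt = []
              for u in frontier:
                  for v in adj.get(u, []):
                      if v not in active and rng.random() < p:
                          active.add(v)
                          nxt.append(v)
              frontier = nxt
          return len(active)

      def greedy_im(adj, k, p=0.1, sims=200, seed=0):
          # Greedy influence maximization with Monte-Carlo estimates of the expected spread.
          rng = random.Random(seed)
          est = lambda S: sum(ic_cascade(adj, S, p, rng) for _ in range(sims)) / sims
          S = []
          for _ in range(k):
              gains = {v: est(S + [v]) for v in adj if v not in S}
              S.append(max(gains, key=gains.get))
          return S

      adj = {0: [1, 2], 1: [2, 3], 2: [3], 3: [4], 4: []}
      print(greedy_im(adj, k=2, p=0.3, sims=500))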
  • We consider the problem of testing if two input forests are isomorphic or are far from being so. An algorithm is called an \(\epsilon\)-tester for forest-isomorphism if given an oracle access to two forests \(G\) and \(H\) in the adjacency list model, with high probability, accepts if \(G\) and \(H\) are isomorphic and rejects if we must modify at least \(\epsilon n\) edges to make \(G\) isomorphic to \(H\). We show an \(\epsilon\)-tester for forest-isomorphism with a query complexity \(\mathrm{polylog}(n)\) and a lower bound of \(\Omega(\sqrt{\log n/\log \log n})\). Further, with the aid of the tester, we show that every graph property is testable in the adjacency list model with \(\mathrm{polylog}(n)\) queries if the input graph is a forest.

    Sublinear-Time Algorithms
  • Let \(P\) be a property of functions \(\mathbb{F}^n_p \to \{0, 1\}\) for a fixed prime \(p\). An algorithm is called a tester for \(P\) if, given a query access to the input function \(f\), with high probability, it accepts when \(f\) satisfies \(P\) and rejects when \(f\) is "far" from satisfying \(P\). In this paper, we give a characterization of affine-invariant properties that are (two-sided error) testable with a constant number of queries. The characterization is stated in terms of decomposition theorems, which roughly claim that any function can be decomposed into a structured part that is a function of a constant number of polynomials, and a pseudo-random part whose Gowers norm is small. We first give an algorithm that tests whether the structured part of the input function has a specific form. Then we show that an affine-invariant property is testable with a constant number of queries if and only if it can be reduced to the problem of testing whether the structured part of the input function is close to one of a constant number of candidates.

    Boolean Functions,Sublinear-Time Algorithms
  • We propose two dynamic indexing schemes for shortest-path and distance queries on large time-evolving graphs, which are useful in a wide range of important applications such as real-time network-aware search and network evolution analysis. To the best of our knowledge, these methods are the first practical exact indexing methods to efficiently process distance queries and dynamic graph updates.

    We first propose a dynamic indexing scheme for queries on the last snapshot. The scalability and efficiency of its offline indexing algorithm and query algorithm are competitive even with previous static methods. Meanwhile, the method is dynamic, that is, it can incrementally update indices as the graph changes over time. Then, we further design another dynamic indexing scheme that can also answer two kinds of historical queries with regard to not only the latest snapshot but also previous snapshots.

    Through extensive experiments on real and synthetic evolving networks, we show the scalability and efficiency of our methods. Specifically, they can construct indices from large graphs with millions of vertices, answer queries in microseconds, and update indices in milliseconds.

    Network Algorithms
  • We consider approximation schemes for the maximum constraint satisfaction problems and the maximum assignment problems. Though they are NP-Hard in general, if the instance is “dense” or “locally dense”, then they are known to have approximation schemes that run in polynomial time or quasi-polynomial time. In this paper, we give a unified method of showing these approximation schemes based on the Sherali-Adams linear programming relaxation hierarchy. We also use our linear programming-based framework to show new algorithmic results on the optimization version of the hypergraph isomorphism problem.

    Constraint Satisfaction Problems
  • Parameterized Testability. Kazuo Iwama and Yuichi Yoshida. ITCS 2014. [slides]

    This paper studies property testing for NP optimization problems with parameter \(k\) under the general graph model with an augmentation of random edge sampling capability. It is shown that a variety of such problems, including \(k\)-Vertex Cover, \(k\)-Feedback Vertex Set, \(k\)-Multicut, \(k\)-path-freeness and \(k\)-Dominating Set, are constant-time testable if \(k\) is constant. It should be noted that the first four problems are fixed parameter tractable (FPT) and it turns out that algorithmic techniques for their FPT algorithms (branch-and-bound search, color coding, etc.) are also useful for our testers. \(k\)-Dominating Set is \(W[2]\)-hard, but we can still test the property in constant time since the definition of \(\epsilon\)-farness makes the problem trivial for non-sparse graphs that are the source of hardness for the original optimization problem. We also consider \(k\)-Odd Cycle Transversal, which is another well-known FPT problem, but we only give a sublinear-time tester when \(k\) is a constant.

    Sublinear-Time Algorithms,FPT
  • Linear Time FPT Algorithms via Network Flow. Yoichi Iwata, Keigo Oka and Yuichi Yoshida. SODA 2014. [arXiv] [video (talk by Iwata)]

    In the area of parameterized complexity, to cope with NP-Hard problems, we introduce a parameter \(k\) besides the input size \(n\), and we aim to design algorithms (called FPT algorithms) that run in \(O(f(k)n^d)\) time for some function \(f(k)\) and constant \(d\). Though FPT algorithms have been successfully designed for many problems, typically they are not sufficiently fast because of huge \(f(k)\) and \(d\).

    In this paper, we give FPT algorithms with small \(f(k)\) and \(d\) for many important problems including Odd Cycle Transversal and Almost 2-SAT. More specifically, we can choose \(f(k)\) as a single exponential \((4^k)\) and \(d\) as one, that is, linear in the input size. To the best of our knowledge, our algorithms achieve linear time complexity for the first time for these problems.

    To obtain our algorithms for these problems, we consider a large class of integer programs, called BIP2. Then we show that, in linear time, we can reduce BIP2 to Vertex Cover Above LP preserving the parameter \(k\), and we can compute an optimal LP solution for Vertex Cover Above LP using network flow.

    Then, we perform an exhaustive search by fixing half-integral values in the optimal LP solution for Vertex Cover Above LP. A bottleneck here is that we need to recompute an LP optimal solution after branching. To address this issue, we exploit network flow to update the optimal LP solution in linear time.

    FPT

2013

  • Answering reachability queries on directed graphs is one of the most fundamental and important operations, and it is ubiquitous in applications involving graph-shaped data. However, it is still highly challenging to efficiently process such queries on large-scale graphs. Transitive-closure-based methods consume prohibitively large index space, and online-search-based methods answer queries too slowly. Labeling-based methods attain both small index size and query time, but previous indexing algorithms are not scalable at all for processing large graphs of the day.

    In this paper, we propose new labeling-based methods for reachability queries, referred to as pruned landmark labeling and pruned path labeling. They follow the frameworks of 2-hop cover and 3-hop cover, but their indexing algorithms are based on the recent notion of pruned labeling and improve the indexing time by several orders of magnitude, resulting in applicability to large graphs with tens of millions of vertices and edges. Our experimental results show that they attain remarkable trade-offs between fast query time, small index size and scalability, which previous methods have never been able to achieve. Furthermore, we also discuss the ingredients of the efficiency of our methods by a novel theoretical analysis based on the graph minor theory.

    Network Algorithms
  • Capturing sets of closely related vertices from large networks is an essential task in many applications such as social network analysis, bioinformatics, and web link research. Decomposing a graph into \(k\)-core components is a standard and efficient method for this task, but obtained clusters might not be well-connected. The idea of using maximal \(k\)-edge-connected subgraphs was recently proposed to address this issue. Although we can obtain better clusters with this idea, the state-of-the-art method is not efficient enough to process large networks with millions of vertices.

    In this paper, we propose a new method to decompose a graph into maximal \(k\)-edge-connected components, based on random contraction of edges. Our method is simple to implement but improves performance drastically. We experimentally show that our method can successfully decompose large networks and is thousands of times faster than the previous method. Also, we theoretically explain why our method is efficient in practice. To see the importance of maximal \(k\)-edge-connected subgraphs, we also conduct experiments using real-world networks to show that many \(k\)-core components have small edge-connectivity and can be decomposed into many maximal \(k\)-edge-connected subgraphs.

    Network Algorithms
  • Mining for Analogous Tuples from an Entity-Relation Graph. Danushka Bollegala, Mitsuru Kushimoto, Yuichi Yoshida and Ken-ichi Kawarabayashi. IJCAI 2013.

    The ability to recognize analogies is an important factor that is closely related to human intelligence. Verbal analogies have been used for evaluating both examinees at university entrance exams and algorithms for measuring relational similarity. However, relational similarity measures proposed so far are confined to measuring the similarity between pairs of words. Unfortunately, such pairwise approaches ignore the rich relational structure that exists in real-world knowledge bases containing millions of entities and semantic relations. We propose a method to efficiently identify analogous entity tuples from a given entity-relation graph. First, we present an efficient approach for extracting potential analogous tuples from a given entity-relation graph. Second, to measure the structural similarity between two tuples, we propose two types of kernel functions: vertex-feature kernels, and edge-feature kernels. Moreover, we combine those kernels to construct composite kernels that simultaneously consider both vertex and edge features. Experimental results show that our proposed method accurately identifies analogous tuples and significantly outperforms a state-of-the-art pairwise relational similarity measure, extended to tuples.

    Natural Language Processing
  • Testing Linear-Invariant Function Isomorphism. Karl Wimmer and Yuichi Yoshida. ICALP 2013.

    A function \(f:\mathbb{F}^n_2 \to \{-1,1\}\) is called linear-isomorphic to \(g\) if \(f = g \circ A\) for some non-singular matrix \(A\). In the \(g\)-isomorphism problem, we want a randomized algorithm that distinguishes whether an input function \(f\) is linear-isomorphic to \(g\) or far from being so.

    We show that the query complexity to test \(g\)-isomorphism is essentially determined by the spectral norm of \(g\). That is, if \(g\) is close to having spectral norm \(s\), then we can test \(g\)-isomorphism with \(\mathrm{poly}(s)\) queries, and if \(g\) is far from having spectral norm \(s\), then we cannot test \(g\)-isomorphism with \(o(\log s)\) queries. The upper bound is almost tight since there is indeed a function \(g\) close to having spectral norm \(s\) for which testing \(g\)-isomorphism requires \(\Omega(s)\) queries. As far as we know, our result is the first characterization of this type for functions. Our upper bound is essentially the Kushilevitz-Mansour learning algorithm, modified for use in the implicit setting.

    Exploiting our upper bound, we show that any property is testable if it can be well-approximated by functions with small spectral norm. We also extend our algorithm to the setting where \(A\) is allowed to be singular.

    Boolean Functions,Sublinear-Time Algorithms
  • An Algebraic Characterization of Testable CSPs. Arnab Bhattacharyya and Yuichi Yoshida. ICALP 2013. [slides]

    Given an instance \(\mathcal{I}\) of a CSP, a tester for \(\mathcal{I}\) distinguishes assignments satisfying \(\mathcal{I}\) from those which are far from any assignment satisfying \(\mathcal{I}\). The efficiency of a tester is measured by its query complexity, the number of variable assignments queried by the algorithm. In this paper, we characterize the hardness of testing Boolean CSPs in terms of the algebra generated by the relations used to form constraints. In terms of computational complexity, we show that if a non-trivial Boolean CSP is sublinear-query testable (resp., not sublinear-query testable), then the CSP is in NL (resp., P-complete, ⊕L-complete or NL-complete) and that if a sublinear-query testable Boolean CSP is constant-query testable (resp., not constant-query testable), then counting the number of solutions of the CSP is in P (resp., #P-complete).

    Also, we conjecture that a CSP instance is testable in sublinear time if its Gaifman graph has bounded treewidth. We confirm the conjecture when a near-unanimity operation is a polymorphism of the CSP.

    Constraint Satisfaction Problems,Sublinear-Time Algorithms
  • We propose a new exact method for shortest-path distance queries on large-scale networks. Our method precomputes distance labels for vertices by performing a breadth-first search from every vertex. Seemingly too obvious and too inefficient at first glance, the key ingredient introduced here is pruning during breadth-first searches. While we can still answer the correct distance for any pair of vertices from the labels, it surprisingly reduces the search space and sizes of labels. Moreover, we show that we can perform 32 or 64 breadth-first searches simultaneously exploiting bitwise operations. We experimentally demonstrate that the combination of these two techniques is efficient and robust on various kinds of large-scale real-world networks. In particular, our method can handle social networks and web graphs with hundreds of millions of edges, which are two orders of magnitude larger than the limits of previous exact methods, with comparable query time to those of previous methods.

    Network Algorithms
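
    A compact sketch of the labeling idea described above: process vertices in a fixed order (here by decreasing degree), run a BFS from each, and prune a vertex whenever the labels built so far already certify its current distance; a query then takes the minimum over common landmarks. The bit-parallel BFS and other engineering from the paper are omitted, and names are illustrative.

      from collections import deque

      def build_labels(adj):
          # 2-hop labels via pruned BFS, processing vertices in decreasing-degree order.
          order = sorted(adj, key=lambda v: -len(adj[v]))
          labels = {v: {} for v in adj}  # labels[v] = {landmark: distance}

          def query(u, v):
              common = labels[u].keys() & labels[v].keys()
              return min((labels[u][w] + labels[v][w] for w in common), default=float('inf'))

          for root in order:
              queue, seen = deque([(root, 0)]), {root}
              while queue:
                  u, d = queue.popleft()
                  if query(root, u) <= d:  # pruning: existing labels already explain d(root, u)
                      continue
                  labels[u][root] = d
                  for w in adj[u]:
                      if w not in seen:
                          seen.add(w)
                          queue.append((w, d + 1))
          return labels, query

      adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4], 4: [2, 3]}
      labels, query = build_labels(adj)
      print(query(3, 0))  # shortest-path distance, here 2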
  • Testing a property \(P\) of graphs in the bounded-degree model deals with the following problem: given a graph \(G\) of bounded degree \(d\), we should distinguish (with probability \(2/3\), say) between the case that \(G\) satisfies \(P\) and the case that one should add/remove at least \(\epsilon dn\) edges of \(G\) to make it satisfy \(P\). In sharp contrast to property testing of dense graphs, which is relatively well understood, only a few properties are known to be testable with a constant number of queries in the bounded-degree model. In particular, no global monotone (i.e., closed under edge deletions) property that expander graphs can satisfy has been shown to be testable in constant time so far.

    In this paper, we identify for the first time a natural family of global monotone properties that expander graphs can satisfy and that can be efficiently tested in the bounded-degree model. Specifically, we show that, for any integer \(t \geq 1\), \(K_t\)-subdivision-freeness is testable with a constant number of queries in the bounded-degree model. This property was not previously known to be testable even with \(o(n)\) queries. Note that an expander graph with all degrees less than \(t-1\) does not have a \(K_t\)-subdivision.

    The proof is based on a novel combination of some results that develop the framework of partitioning oracles, together with structural graph theory results that develop the seminal graph minor theory by Robertson and Seymour. As far as we are aware, this is the first result that bridges property testing and structural graph theory. Although we know a rough structure for graphs without \(H\)-minors from the famous graph minor theory by Robertson and Seymour, there is no corresponding structure theorem for graphs without \(H\)-subdivisions so far, even for \(K_5\)-subdivision-free graphs. Therefore, subdivisions and minors are very different in a graph structural sense.

    Sublinear-Time Algorithms
  • We study the constraint satisfaction problem over the point algebra. In this problem, an instance consists of a set of variables and a set of binary constraints of forms \((x < y)\), \((x \leq y)\), \((x \neq y)\) or \((x = y)\). Then, the objective is to assign integers to variables so as to satisfy as many constraints as possible. This problem contains many important problems such as Correlation Clustering, Maximum Acyclic Subgraph, and Feedback Arc Set. We first give an exact algorithm that runs in \(O^*(3^{\frac{\log 5}{\log 6}n})\) time, which improves the previous best \(O^*(3^n)\) obtained by a standard dynamic programming. Our algorithm combines the dynamic programming with the split-and-list technique. The split-and-list technique involves matrix products and we make use of sparsity of matrices to speed up the computation. As for approximation, we give a \(0.4586\)-approximation algorithm when the objective is maximizing the number of satisfied constraints, and give an \(O(\log n \log \log n)\)-approximation algorithm when the objective is minimizing the number of unsatisfied constraints.

    Constraint Satisfaction Problems

2012

  • We propose a constant-time algorithm for approximating the weight of the maximum weight branching in the general graph model. A directed graph is called a branching if it is acyclic and each vertex has at most one incoming edge. An edge-weighted digraph \(G\) of average degree \(d\) whose weights are real values in \([0, 1]\) is given as an oracle access, and we are allowed to ask degrees and incoming edges for any vertex through the oracle. Then, with high probability, our algorithm estimates the weight of the maximum weight branching in \(G\) with an absolute error of at most \(\epsilon n\) with query complexity \(O(d/\epsilon^3)\), where \(n\) is the number of vertices. We also show a lower bound of \(\Omega(d/\epsilon^2)\). Additionally, our algorithm can be modified to run with query complexity \(O(1/\epsilon^4)\) for unweighted digraphs, i.e., it runs in time independent of the input size even for digraphs with \(d = \Omega(n)\) edges. In contrast, we show that it requires \(\Omega(n)\) queries to approximate the weight of the minimum (or maximum) spanning arborescence in a weighted digraph.

    Sublinear-Time Algorithms
  • Given a Boolean function \(f\), the \(f\)-isomorphism testing problem requires a randomized algorithm to distinguish functions that are identical to \(f\) up to relabeling of the input variables from functions that are far from being so. An important open question in property testing is to determine for which functions \(f\) we can test \(f\)-isomorphism with a constant number of queries. Despite much recent attention to this question, essentially only two classes of functions were known to be efficiently isomorphism testable: symmetric functions and juntas. We unify and extend these results by showing that all partially symmetric functions -- functions invariant to the reordering of all but a constant number of their variables -- are efficiently isomorphism-testable. This class of functions, first introduced by Shannon, includes symmetric functions, juntas, and many other functions as well. We conjecture that these functions are essentially the only functions efficiently isomorphism-testable. To prove our main result, we also show that partial symmetry is efficiently testable. In turn, to prove this result we had to revisit the junta testing problem. We provide a new proof of correctness of the nearly-optimal junta tester. Our new proof replaces the Fourier machinery of the original proof with a purely combinatorial argument that exploits the connection between sets of variables with low influence and intersecting families. Another important ingredient in our proofs is a new notion of symmetric influence. We use this measure of influence to prove that partial symmetry is efficiently testable and also to construct an efficient sample extractor for partially symmetric functions. We then combine the sample extractor with the testing-by-implicit-learning approach to complete the proof that partially symmetric functions are efficiently isomorphism-testable.

    Boolean Functions,Sublinear-Time Algorithms
  • Given an undirected graph \(G = (V,E)\) and positive edge weights \(\{w_e\}_{e \in E}\), a linear arrangement is a permutation \(\pi: V \to [n]\). The value of the arrangement is \(\mathrm{val}(G,\pi):=\frac{1}{n}\sum_{e=\{u,v\}}w_e|\pi(u)-\pi(v)|\). In the minimum linear arrangement problem (MLA), the goal is to find a linear arrangement \(\pi^*\) that achieves \(\mathrm{val}(G,\pi^*) = \mathrm{MLA}(G): =  \min_{\pi}\mathrm{val}(G,\pi)\).

    In this paper, we show that for any \(\epsilon > 0\) and positive integer \(r\), there is an \(O(nr/\epsilon)\)-time randomized algorithm which, given a graph \(G\), returns a permutation \(\pi\) such that \[ \mathrm{val}(G,\pi) \leq \left(1 + \frac{2}{(1-\epsilon)\lambda_{r+1}(\mathcal{L})}\right)\mathrm{MLA}(G) + O\left(\frac{\log n}{\sqrt{n}}\sum_{e \in E}w_e\right) \] with high probability. Here \(\mathcal{L}\) is the normalized Laplacian of \(G\) and \(\lambda_r(\mathcal{L})\) is the \(r\)-th eigenvalue of \(\mathcal{L}\). Our algorithm gives a constant factor approximation for regular graphs that are weak expanders. (A small sketch that simply evaluates the objective \(\mathrm{val}(G,\pi)\) for a given permutation follows this entry.)

    Constraint Satisfaction Problems
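
    A minimal sketch, assuming the graph is given as a list of weighted edges; it simply evaluates \(\mathrm{val}(G,\pi)\) for a given permutation and is not the spectral algorithm of the paper. Names and example values are illustrative.

        # Evaluate the linear-arrangement objective val(G, pi) = (1/n) * sum_e w_e * |pi(u) - pi(v)|.
        def val(n, weighted_edges, pi):
            return sum(w * abs(pi[u] - pi[v]) for u, v, w in weighted_edges) / n

        # Example: a weighted path on 3 vertices laid out in its natural order.
        edges = [(0, 1, 1.0), (1, 2, 2.0)]
        print(val(3, edges, {0: 1, 1: 2, 2: 3}))  # -> (1.0*1 + 2.0*1)/3 = 1.0
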
  • Constant-Time Algorithms for Sparsity Matroids. Hiro Ito, Shin-ichi Tanigawa and Yuichi Yoshida. ICALP 2012. [slides]

    A graph \(G = (V, E)\) is called \((k, \ell)\)-sparse if \(|F| \leq k|V(F)| - \ell\) for any \(F \subseteq E\) with \(F \neq \emptyset\). Here, \(V(F)\) denotes the set of vertices incident to \(F\). A graph \(G = (V,E)\) is called \((k,\ell)\)-full if \(G\) contains a \((k,\ell)\)-sparse subgraph with \(|V|\) vertices and \(k|V| - \ell\) edges. The family of edge sets of \((k,\ell)\)-sparse subgraphs forms the family of independent sets of a matroid on \(E\), known as the sparsity matroid of \(G\). In this paper, we give a constant-time algorithm that approximates the rank of the sparsity matroid associated with a degree-bounded undirected graph. This algorithm leads to a constant-time tester for \((k,\ell)\)-fullness in the bounded-degree model (i.e., we can decide with high probability whether the input graph satisfies the property or is far from it). Depending on the values of \(k\) and \(\ell\), our algorithm can test, in a unified manner, various properties of graphs such as connectivity, rigidity, and how many spanning trees can be packed.

    Based on this result, we also propose a constant-time tester for \((k,\ell)\)-edge-connected-orientability in the bounded-degree model, where an undirected graph \(G\) is called \((k,\ell)\)-edge-connected-orientable if there exists an orientation \(\vec{G}\) of \(G\) with a vertex \(r \in V\) such that \(\vec{G}\) contains \(k\) arc-disjoint dipaths from \(r\) to each vertex \(v \in V\) and \(\ell\) arc-disjoint dipaths from each vertex \(v \in V\) to \(r\).

    A tester is called a one-sided error tester for \(P\) if it always accepts a graph satisfying \(P\). We show that, for any \(k \geq 2\) and (proper) \(\ell \geq 0\), every one-sided error tester for \((k,\ell)\)-fullness and \((k,\ell)\)-edge-connected-orientability requires \(\Omega(n)\) queries. (A brute-force check of the \((k,\ell)\)-sparsity condition follows this entry.)

    Sublinear-Time Algorithms,Submodular Functions
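
    A brute-force check of the \((k,\ell)\)-sparsity condition above, included only to make the definition concrete; it enumerates all non-empty edge subsets and is therefore exponential, unlike the constant-time algorithm of the paper.

        from itertools import combinations

        # Return True iff |F| <= k|V(F)| - l for every non-empty F, i.e., G is (k, l)-sparse.
        def is_sparse(edges, k, l):
            for r in range(1, len(edges) + 1):
                for F in combinations(edges, r):
                    vertices = {v for e in F for v in e}
                    if len(F) > k * len(vertices) - l:
                        return False
            return True

        # Example: the triangle violates (1, 1)-sparsity only when F is all three edges.
        print(is_sparse([(0, 1), (1, 2), (0, 2)], 1, 1))  # -> False
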
  • Testing List H-Homomorphisms. Yuichi Yoshida. CCC 2012. [slides]

    Let \(H\) be an undirected graph. In the List \(H\)-Homomorphism Problem, given an undirected graph \(G\) with a list constraint \(L(v) \subseteq V(H)\) for each variable \(v \in V(G)\), the objective is to find a list \(H\)-homomorphism \(f:V(G)\to V(H)\), that is, \(f(v)\in L(v)\) for every \(v\in V(G)\) and \((f(u),f(v)) \in E(H)\) whenever \((u,v)\in E(G)\). We consider the following problem: given oracle access to a map \(f:V(G) \to V(H)\), the objective is to decide with high probability whether \(f\) is a list \(H\)-homomorphism or far from any list \(H\)-homomorphism. The efficiency of an algorithm is measured by the number of accesses to \(f\). In this paper, we classify graphs \(H\) with respect to the query complexity of testing list \(H\)-homomorphisms and show that the following trichotomy holds: (i) List \(H\)-homomorphisms are testable with a constant number of queries if and only if \(H\) is a reflexive complete graph or an irreflexive complete bipartite graph. (ii) List \(H\)-homomorphisms are testable with a sublinear number of queries if and only if \(H\) is a bi-arc graph. (iii) Testing list \(H\)-homomorphisms requires a linear number of queries if \(H\) is not a bi-arc graph. (A direct check of the list \(H\)-homomorphism definition follows this entry.)

    Constraint Satisfaction Problems,Sublinear-Time Algorithms
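
    A direct check of the definition above, assuming full access to \(f\); the tester in the paper instead queries \(f\) at only a few points. Graphs are given as edge lists and the lists \(L(v)\) as a dict of allowed images; names are illustrative.

        # Return True iff f is a list H-homomorphism: f(v) in L(v) for every v,
        # and {f(u), f(v)} is an edge of H whenever {u, v} is an edge of G.
        def is_list_homomorphism(G_edges, H_edges, lists, f):
            H = {frozenset(e) for e in H_edges}
            if any(f[v] not in lists[v] for v in f):
                return False
            return all(frozenset((f[u], f[v])) in H for u, v in G_edges)

        # Example: G and H are both a single edge.
        G_edges, H_edges = [(0, 1)], [('a', 'b')]
        lists = {0: {'a'}, 1: {'a', 'b'}}
        print(is_list_homomorphism(G_edges, H_edges, lists, {0: 'a', 1: 'b'}))  # -> True
        print(is_list_homomorphism(G_edges, H_edges, lists, {0: 'a', 1: 'a'}))  # -> False
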
  • Constant-Time Approximation Algorithms for the Knapsack Problem. Hiro Ito, Susumu Kiyoshima and Yuichi Yoshida. TAMC 2012.

    In this paper, we give a constant-time approximation algorithm for the knapsack problem. Using weighted sampling, with which we can sample items with probability proportional to their profits, our algorithm runs with query complexity \(O(\epsilon^{-4} \log \epsilon^{-1})\), and, with probability at least \(2/3\), it approximates the optimal profit to within an additive error of an \(\epsilon\)-fraction of the total profit. For the subset sum problem, which is a special case of the knapsack problem, we can improve the query complexity to \(O(\epsilon^{-1} \log \epsilon^{-1})\). (A sketch of the weighted-sampling oracle follows this entry.)

    Sublinear-Time Algorithms
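
    A sketch of the weighted-sampling oracle assumed above, which returns an item with probability proportional to its profit; the constant-time approximation algorithm itself is not reproduced here. Names and example values are illustrative.

        import random

        # Draw an item index with probability proportional to its profit.
        def weighted_sample(profits):
            return random.choices(range(len(profits)), weights=profits, k=1)[0]

        # Example: item 1 is drawn roughly twice as often as item 0.
        profits = [1.0, 2.0]
        draws = [weighted_sample(profits) for _ in range(10000)]
        print(draws.count(1) / len(draws))  # -> about 2/3
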
  • Algorithms and Complexity of Generalized River Crossing Problems. Stefan Langerman, Hiro Ito and Yuichi Yoshida. FUN 2012.

    Three men, each with a sister, must cross a river using a boat which can carry only two people, so that a woman whose brother is not present is never left in the company of another man. This very famous problem appeared in the Latin book “Problems to Sharpen the Young,” one of the earliest collections of recreational mathematics. This paper considers a generalization of such “River-Crossing Problems.” It shows that the problem is NP-hard if the boat size is three, and that a large class of subproblems can be solved in polynomial time if the boat size is two. It is also conjectured that determining whether a river-crossing problem has a solution, without bounding the number of transportations, can be solved in polynomial time even when the size of the boat is large.

  • Linear programming, width-1 CSPs, and robust satisfaction. Gabor Kun, Ryan O’Donnell, Suguru Tamaki, Yuichi Yoshida and Yuan Zhou. ITCS 2012.

    We say that an algorithm robustly decides a constraint satisfaction problem \(\Pi\) if it distinguishes at-least-\((1-\epsilon)\)-satisfiable instances from less-than-\((1 - r(\epsilon))\)-satisfiable instances for some function \(r(\epsilon)\) with \(r(\epsilon) \to 0\) as \(\epsilon \to 0\). In this paper we show that the canonical linear programming relaxation robustly decides \(\Pi\) if and only if \(\Pi\) has "width 1" (in the sense of Feder and Vardi).

    Constraint Satisfaction Problems
  • Studies on Constant-Time Algorithms for Bounded-Degree Graphs and Constraint Satisfaction Problems. Yuichi Yoshida. Ph.D. thesis, Kyoto University, 2012.

    We are concerned with constant-time algorithms, which aim to solve decision or optimization problems in time independent of the input size. To obtain such efficient algorithms, we relax the notions of decision and optimization as follows. An algorithm is said to test a property if, with high probability, it distinguishes inputs that satisfy the property from inputs for which an \(\epsilon\)-fraction must be modified to make them satisfy the property. An algorithm is said to \((\alpha, \epsilon n)\)-approximate an optimization problem if, with high probability, it outputs an \((\alpha, \epsilon n)\)-approximation to the optimal value. Here, a value \(x\) is called an \((\alpha, \epsilon n)\)-approximation to \(x^*\) if \(\alpha x^* - \epsilon n \leq x \leq x^*\), where \(n\) is the size of the input. With these relaxations, various problems become solvable in constant time.

    Constant-time algorithms were first studied for algebraic properties of functions, with applications to computational complexity. However, because of their theoretical richness and potential for practical applications, they have been studied in many different areas over the last decade. In this thesis, we concentrate on problems on bounded-degree graphs and constraint satisfaction problems (CSPs). In this setting, each vertex (or variable) is incident to a constant number of edges (or constraints). Bounded-degree instances are practically important since real-world problems often give rise to such instances, and they are also theoretically important since they allow us to study various fundamental problems that are trivial for general instances.

    We first give a constant-time \((1, \epsilon n)\)-approximation algorithm for the maximum matching problem. Compared to previous works, our algorithm runs exponentially faster with respect to \(1/\epsilon\). Using the algorithm as a building block, we show that various related problems are also testable in constant time. A graph \(G = (V, E)\) is called \((k,l)\)-sparse if \(|F| \leq k|V(F)| - l\) for any non-empty \(F \subseteq E\). Here, \(V(F)\) denotes the set of vertices incident to \(F\). We give a constant-time algorithm that tests whether a graph \(G = (V, E)\) has a \((k, l)\)-sparse subgraph with \(|V|\) vertices and \(k|V| - l\) edges. Depending on the values of \(k\) and \(l\), our algorithm can test various properties of graphs such as connectivity, rigidity, and how many spanning trees can be packed. Based on this result, we also give a constant-time testing algorithm for \((k,l)\)-edge-connected-orientability, where a graph \(G\) is called \((k,l)\)-edge-connected-orientable if there exists an orientation \(\vec{G}\) of \(G\) with a vertex \(r \in V\) such that \(\vec{G}\) contains \(k\) arc-disjoint dipaths from \(r\) to each vertex \(v \in V\) and \(l\) arc-disjoint dipaths from each vertex \(v \in V\) to \(r\).

    Secondly, we consider constant-time algorithms for CSPs. Let \(\Lambda\) be a set of relations. An instance of \(\mathrm{CSP}(\Lambda)\) is a pair of a set of variables and a set of constraints over those variables such that each constraint belongs to \(\Lambda\). In the problem \(\mathrm{MaxCSP}(\Lambda)\), given an instance of \(\mathrm{CSP}(\Lambda)\), we want to compute an assignment that maximizes the number of satisfied constraints. Here, we give the “best” constant-time approximation algorithm for every \(\mathrm{MaxCSP}(\Lambda)\). Specifically, for each \(\Lambda\), there exists a constant \(\alpha_\Lambda\) with the following properties: (1) For any \(\epsilon > 0\), there exists a constant-time \((\alpha_\Lambda - \epsilon, \epsilon n)\)-approximation algorithm for \(\mathrm{MaxCSP}(\Lambda)\). (2) For any \(\epsilon > 0\), there exists \(\delta > 0\) such that no constant-time \((\alpha_\Lambda + \epsilon, \delta n)\)-approximation algorithm for \(\mathrm{MaxCSP}(\Lambda)\) exists. Here, \(\alpha_\Lambda\) coincides with the integrality gap of a standard LP relaxation for \(\mathrm{MaxCSP}(\Lambda)\).

    We also consider testing satisfiability of CSP instances. Using the result above, we can derive a characterization of the constant-time testability of \(\mathrm{CSP}(\Lambda)\). Let \(\mathrm{opt}(I)\) and \(\mathrm{lp}(I)\) denote the optimal value and the LP value of an instance \(I\), respectively. Then, the following holds. (i) If every instance \(I\) of \(\mathrm{CSP}(\Lambda)\) with \(\mathrm{lp}(I) \geq 1 - \epsilon\) satisfies \(\mathrm{opt}(I) \geq 1 - r(\epsilon)\), where \(\lim_{\epsilon \to 0} r(\epsilon) = 0\), then \(\mathrm{CSP}(\Lambda)\) is testable in constant time. (ii) If there is an instance \(I\) of \(\mathrm{CSP}(\Lambda)\) with \(\mathrm{lp}(I) = 1\) and \(\mathrm{opt}(I) < 1\), then \(\mathrm{CSP}(\Lambda)\) is not testable in constant time.

    To achieve a combinatorial characterization of the testability of \(\mathrm{CSP}(\Lambda)\), we show a close connection between constant-time testability and the notion of width 1. We say that \(\mathrm{CSP}(\Lambda)\) has width 1 if satisfiability of instances of \(\mathrm{CSP}(\Lambda)\) can be decided by a certain simple propagation algorithm. Then, we prove that any CSP of width 1 satisfies the condition in (i), which means that any such CSP is testable in constant time.

    In a subsequent work (Kun et al., ITCS 2012), which includes part of the present thesis, we have shown that any CSP of width more than 1 satisfies the condition in (ii). Thus, a CSP is constant-time testable if and only if it has width 1.

    Sublinear-Time Algorithms

2011

  • Algorithms for Finding a Maximum Non-k-Linked Graph. Yusuke Kobayashi and Yuichi Yoshida. ESA 2011.

    A graph with at least \(2k\) vertices is said to be \(k\)-linked if for any ordered \(k\)-tuples \((s_1, \ldots,s_k)\) and \((t_1, \ldots, t_k)\) of \(2k\) distinct vertices, there exist pairwise vertex-disjoint paths \(P_1,\ldots, P_k\) such that \(P_i\) connects \(s_i\) and \(t_i\) for \(i = 1, \ldots, k\). For a given graph \(G\), we consider the problem of finding a maximum induced subgraph of \(G\) that is not \(k\)-linked. This problem is a common generalization of computing the vertex-connectivity and testing the \(k\)-linkedness of \(G\), and it is closely related to the concept of \(H\)-linkedness. In this paper, we give the first polynomial-time algorithm for the case of \(k = 2\), whereas a similar problem that finds a maximum induced subgraph without 2-vertex-disjoint paths connecting fixed terminal pairs is NP-hard. For the case of general \(k\), we give an \((8k - 2)\)-additive approximation algorithm. We also investigate the computational complexities of the edge-disjoint case and the directed case.

    Combinatorial Optimization
  • Property Testing for Cyclic Groups and Beyond. Francois Le Gall and Yuichi Yoshida. COCOON 2011.

    This paper studies the problem of testing if an input \((\Gamma, \circ)\), where \(\Gamma\) is a finite set of unknown size and \(\circ\) is a binary operation over \(\Gamma\) given as an oracle, is close to a specified class of groups. Friedl et al. (Proc. of STOC, 2005) have constructed an efficient tester using \(\mathrm{poly}(\log|\Gamma|)\) queries for the class of abelian groups. We focus in this paper on subclasses of abelian groups, and show that these problems are much harder: \(\Omega(|\Gamma|^{1/6})\) queries are necessary to test if the input is close to a cyclic group, and \(\Omega(|\Gamma|^{c})\) queries for some constant \(c\) are necessary to test more generally if the input is close to an abelian group generated by \(k\) elements, for any fixed integer \(k \geq 1\). We also show that knowledge of the size of the ground set \(\Gamma\) helps only for \(k = 1\), in which case we construct an efficient tester using \(\mathrm{poly}(\log|\Gamma|)\) queries; for any other value \(k \geq 2\) the query complexity remains \(\Omega(|\Gamma|^{c})\). All our upper and lower bounds hold for both the edit distance and the Hamming distance. These are, to the best of our knowledge, the first nontrivial lower bounds for such group-theoretic problems in the property testing model and, in particular, they imply the first exponential separations between the classical and quantum query complexities of testing closeness to classes of groups.

    Sublinear-Time Algorithms
  • In this paper, we consider lower bounds on the query complexity for testing CSPs in the bounded-degree model. We mainly consider Boolean CSPs allowing literals. First, for any "symmetric" predicate \(P: \{0,1\}^k \to \{0,1\}\) except EQU, where \(k \geq 3\), we show that every (randomized) algorithm that distinguishes satisfiable instances of CSP(\(P\)) from instances \((|P^{-1}(0)|/2^k - \epsilon)\)-far from satisfiability requires \(\Omega(n^{1/2+\delta})\) queries, where \(n\) is the number of variables and \(\delta > 0\) is a constant that depends on \(P\) and \(\epsilon\). This breaks a natural lower bound \(\Omega(n^{1/2})\), which is obtained by the birthday paradox. We also show that every one-sided error tester requires \(\Omega(n)\) queries for such \(P\). These results are hereditary in the sense that the same results hold for any predicate \(Q\) such that \(P^{-1}(1) \subseteq Q^{-1}(1)\). For EQU, we give a one-sided error tester whose query complexity is \(\tilde{O}(n^{1/2})\). Also, for 2-XOR (or, equivalently, E2LIN2), we show an \(\Omega(n^{1/2+\delta})\) lower bound for distinguishing instances that are \(\epsilon\)-close to satisfiability from instances that are \((1/2 - \epsilon)\)-far from it. Next, for the general \(k\)-CSP over the binary domain, we show that every algorithm that distinguishes satisfiable instances from instances \((1 - 2k/2^k - \epsilon)\)-far from satisfiability requires \(\Omega(n)\) queries. The matching NP-hardness is not known, even assuming the Unique Games Conjecture or the \(d\)-to-1 Conjecture. As a corollary, for Maximum Independent Set on graphs with \(n\) vertices and a degree bound \(d\), we show that every approximation algorithm within a factor of \(d/\mathrm{poly}\log d\) and an additive error of \(\epsilon n\) requires \(\Omega(n)\) queries. Previously, only super-constant lower bounds were known.

    Sublinear-Time Algorithms,Constraint Satisfaction Problems
  • Raghavendra (STOC 2008) gave an elegant and surprising result: if Khot's Unique Games Conjecture (STOC 2002) is true, then for every constraint satisfaction problem (CSP), the best approximation ratio is attained by a certain simple semidefinite programming relaxation and a rounding scheme for it.

    In this paper, we show that similar results hold for constant-time approximation algorithms in the bounded-degree model. Specifically, we present the following: (i) For every CSP, we construct an oracle that provides access, in constant time, to a nearly optimal solution of a basic LP relaxation of the CSP. (ii) Using the oracle, we give a constant-time rounding scheme that achieves an approximation ratio coinciding with the integrality gap of the basic LP. (iii) Finally, we give a generic conversion from integrality gaps of basic LPs to hardness results. All of these results are unconditional. Therefore, for every bounded-degree CSP, we obtain the best possible constant-time approximation algorithm.

    A CSP instance is called \(\epsilon\)-far from satisfiability if we must remove at least an \(\epsilon\)-fraction of constraints to make it satisfiable. A CSP is called testable if there is a constant-time algorithm that distinguishes satisfiable instances from \(\epsilon\)-far instances with probability at least \(2/3\). Using the results above, we also derive, under a technical assumption, an equivalent condition under which a CSP is testable in the bounded-degree model.

    Sublinear-Time Algorithms,Constraint Satisfaction Problems

2010

  • A Query Efficient Non-Adaptive Long Code Test with Perfect Completeness. Suguru Tamaki and Yuichi Yoshida. RANDOM 2010.

    Long Code testing is a fundamental problem in the area of property testing and hardness of approximation. In this paper, we study how small the soundness \(s\) of the Long Code test with perfect completeness can be made using \(q\) non-adaptive queries. We show that \(s = (2q + 3)/2^q\) is possible, where \(q\) is of the form \(2^k - 1\) for any integer \(k > 2\).

    Sublinear-Time Algorithms,Boolean Functions
  • Testing Outerplanarity of Bounded Degree Graphs. Yuichi Yoshida and Hiro Ito. RANDOM 2010.

    We present an efficient algorithm for testing outerplanarity of graphs in the bounded-degree model. In this model, given a graph \(G\) with degree bound \(d\), we should distinguish with high probability the case that \(G\) is outerplanar from the case that modifying at least an \(\epsilon\)-fraction of the edge set is necessary to make \(G\) outerplanar. Our algorithm runs in time polynomial in \(d\) and \(1/\epsilon\) only. To achieve the time complexity, we exploit the tree-like structure inherent to an outerplanar graph using the microtree/macrotree decomposition of a tree. As a corollary, we show that the property of being a cactus is testable in time polynomial in \(d\) and \(1/\epsilon\).

    Sublinear-Time Algorithms
  • Conjunctive Filter: Breaking the Entropy Barrier. Daisuke Okanohara and Yuichi Yoshida. ALENEX 2010.

    We consider the problem of storing a map that associates a key with a set of values. To store \(n\) values from a universe of size \(m\), \(\log_2 \binom{m}{n}\) bits of space are required, which can be approximated as \(n(1.44 + \log_2 \frac{m}{n})\) bits when \(n \ll m\). If we allow an \(\epsilon\) fraction of errors in outputs, we can store it with roughly \(n \log_2 \frac{1}{\epsilon}\) bits, which matches the entropy bound. The Bloom filter is a well-known example of such a data structure. Our objective is to break this entropy bound and construct more space-efficient data structures. In this paper, we propose a novel data structure called a conjunctive filter, which supports conjunctive queries on \(k\) distinct keys for fixed \(k\). Although a conjunctive filter cannot return the set of values itself associated with a queried key, it can perform conjunctive queries with an \(O(1/\sqrt{m})\) fraction of errors. Also, the consumed space is \(\frac{n}{k}\log_2 m\) bits, which is significantly smaller than the entropy bound \(\frac{n}{2}\log_2 m\) when \(k \geq 3\). We show that many problems, such as full-text search and database join queries, can be solved using a conjunctive filter. We also conducted experiments on a real-world data set and show that a conjunctive filter answers conjunctive queries almost correctly using about \(1/2\) to \(1/4\) of the space of the entropy bound. (The arithmetic behind these space expressions is sketched below.)
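
    Rough arithmetic behind the space expressions quoted above, with illustrative values of \(n\), \(m\), and \(k\); this is only a numerical sanity check, not the paper's analysis.

        from math import comb, log2

        n, m, k = 100, 10**4, 3
        exact = log2(comb(m, n))           # log2(m choose n): bits needed to store the set exactly
        approx = n * (1.44 + log2(m / n))  # the approximation quoted above (valid when n << m)
        filt = (n / k) * log2(m)           # space of the conjunctive filter as quoted above
        print(round(exact), round(approx), round(filt))  # the first two values are close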

  • In this paper, we study lower bounds on the query complexity of testing algorithms for various problems. Given an oracle that returns information about an input object, a testing algorithm distinguishes the case that the object has a given property \(P\) from the case that it is far from having \(P\), with probability at least \(2/3\). The query complexity of an algorithm is measured by the number of accesses to the oracle. We introduce two reductions that preserve the query complexity. One is derived from the gap-preserving local reduction and the other from the L-reduction. By using the former reduction, we show linear lower bounds on the query complexity for testing basic NP-complete properties, i.e., 3-edge-colorability, directed Hamiltonian path/cycle, undirected Hamiltonian path/cycle, 3-dimensional matching and NP-complete generalized satisfiability problems. Using the second reduction, we also show a linear lower bound on the query complexity of approximating the size of the maximum 3-dimensional matching.

    Sublinear-Time Algorithms

2009

  • An Improved Constant-Time Approximation Algorithm for Maximum Matchings. Yuichi Yoshida, Masaki Yamamoto and Hiro Ito. STOC 2009.

    This paper studies approximation algorithms for problems on degree-bounded graphs. Let \(n\) and \(d\) be the number of vertices and the degree bound, respectively. This paper presents an algorithm that approximates the size of some maximal independent set with additive error \(\epsilon n\) and whose running time is \(O(d^2)\). Using this algorithm, it also shows that there are approximation algorithms for many other problems, e.g., the maximum matching problem, the minimum vertex cover problem, and the minimum set cover problem, that run exponentially faster than existing algorithms with respect to \(d\) and \(1/\epsilon\). Its approximation algorithm for the maximum matching problem can be transformed into a testing algorithm for the property of having a perfect matching with two-sided error. In contrast, it also shows that every one-sided error tester for the property requires at least \(\Omega(n)\) queries. (A sketch of a local simulation of the greedy maximal independent set follows this entry.)

    Sublinear-Time Algorithms
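
    A hedged sketch in the same spirit as the entry above (not necessarily the algorithm of the paper): locally simulate the greedy maximal independent set under a random vertex ordering, and estimate its size by sampling vertices. Names and example values are illustrative.

        import random
        from functools import lru_cache

        def estimate_mis_size(adj, num_samples=200, seed=0):
            """adj: dict vertex -> list of neighbours of a degree-bounded graph."""
            rng = random.Random(seed)
            rank = {v: rng.random() for v in adj}  # random ordering of the vertices

            @lru_cache(maxsize=None)
            def in_greedy_mis(v):
                # v joins the greedy MIS iff none of its lower-ranked neighbours joined.
                return all(not in_greedy_mis(u) for u in adj[v] if rank[u] < rank[v])

            vertices = list(adj)
            sample = [rng.choice(vertices) for _ in range(num_samples)]
            return len(vertices) * sum(in_greedy_mis(v) for v in sample) / num_samples

        # Example: on a 4-cycle the greedy maximal independent set always has size 2.
        cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
        print(estimate_mis_size(cycle))  # -> close to 2.0
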
  • Fast Calculation of Levenshtein Distance on the Cell Processor. Shun Sakuraba and Yuichi Yoshida. SACSIS 2009. (in Japanese)

    For the designated problem of Cell Challenge 2009, contestants were asked to write a program that computes the edit distance between two strings using the Cell Broadband Engine. This paper reports the optimization techniques we devised for this problem. Our base algorithm is the bit-parallel method, and we achieved a speedup of roughly 315 times over the sample program provided by the executive committee.

2008

  • Property Testing on k-Vertex Connectivity of Graphs. Yuichi Yoshida and Hiro Ito. ICALP 2008.

    We present an algorithm for testing the \(k\)-vertex-connectivity of graphs with a given maximum degree. The time complexity of the algorithm is independent of the number of vertices and edges of the graph. A graph \(G\) with \(n\) vertices and maximum degree at most \(d\) is called \(\epsilon\)-far from \(k\)-vertex-connectivity when at least \(\frac{\epsilon dn}{2}\) edges must be added to or removed from \(G\) to obtain a \(k\)-vertex-connected graph with maximum degree at most \(d\). The algorithm always accepts every graph that is \(k\)-vertex-connected and rejects every graph that is \(\epsilon\)-far from \(k\)-vertex-connectivity with probability at least \(2/3\). The algorithm runs in \(O\left(d \left(\frac{c}{\epsilon d}\right)^k \log \frac{1}{\epsilon d}\right)\) time (\(c > 1\) is a constant) for given \((k - 1)\)-vertex-connected graphs, and in \(O\left( d \left(\frac{ck}{\epsilon d}\right)^k \log \frac{k}{\epsilon d}\right)\) time (\(c > 1\) is a constant) for general graphs. It is the first constant-time \(k\)-vertex-connectivity testing algorithm for general \(k \geq 4\).

    Sublinear-Time Algorithms