Journal Papers | Yuichi Yoshida

Journal Papers

2017

  • Parameterized Testability. Kazuo Iwama and Yuichi Yoshida. ACM Transactions on Computation Theory 2017.

    This paper studies property testing for NP optimization problems with parameter \(k\) under the general graph model with an augmentation of random edge sampling capability. It is shown that a variety of such problems, including \(k\)-Vertex Cover, \(k\)-Feedback Vertex Set, \(k\)-Multicut, \(k\)-Path-Free and \(k\)-Dominating Set, are constant-query testable if \(k\) is constant. It should be noted that the first four problems are fixed parameter tractable (FPT) and it turns out that algorithmic techniques for their FPT algorithms (branch-and-bound search, color coding, etc.) are also useful for our testers. \(k\)-Dominating Set is \(W[2]\)-hard, but we can still test the property with a constant number of queries since the definition of \(\epsilon\)-farness makes the problem trivial for non-sparse graphs that are the source of hardness for the original optimization problem. We also consider \koct, which is another well-known FPT problem, but we only give a sublinear-query tester when \(k\) is a constant.

    Default
  • An Iterative Approach for the Global Estimation of Sentence Similarity. Tomoyuki Kajiwara, Danushka Bollegala, Yuichi Yoshida and Ken-ichi Kawarabayashi. PLOS ONE 2017.

    Measuring the similarity between two sentences is often difficult due to their small lexical overlap. Instead of focusing on the sets of features in two given sentences between which we must measure similarity, we propose a sentence similarity method that considers two types of constraints that must be satisfied by all pairs of sentences in a given corpus. Namely, (a) if two sentences share many features in common, then it is likely that the remaining features in each sentence are also related, and (b) if two sentences contain many related features, then those two sentences are themselves similar. The two constraints are utilized in an iterative bootstrapping procedure that simultaneously updates both word and sentence similarity scores. Experimental results on SemEval 2015 Task 2 dataset show that the proposed iterative approach for measuring sentence semantic similarity is significantly better than the non-iterative counterparts.

    Natural Language Processing

2016

  • Testing List H-Homomorphisms. Yuichi Yoshida. Computational Complexity 2016. [slides]

    In the List \(H\)-Homomorphism Problem, for a graph \(H\) that is a parameter of the problem, an instance consists of an undirected graph \(G\) with a list constraint \(L(v) \subseteq V(H)\) for each variable \(v \in V(G)\), and the objective is to determine whether there is a list \(H\)-homomorphism \(f:V(G) \to V(H)\), that is, \(f(v) \in L(v)\) for every \(v \in V(G)\) and \((f(u),f(v)) \in E(H)\) whenever \((u,v) \in E(G)\). We consider the problem of testing list \(H\)-homomorphisms in the following weighted setting: An instance consists of an undirected graph \(G\), list constraints \(L\), weights imposed on the vertices of \(G\), and a map \(f:V(G) \to V(H)\) given as an oracle access. The objective is to determine whether \(f\) is a list \(H\)-homomorphism or far from any list \(H\)-homomorphism. The farness is measured by the total weight of vertices \(v \in V(G)\) for which \(f(v)\) must be changed so as to make \(f\) a list \(H\)-homomorphism. In this paper, we classify graphs H with respect to the number of queries to \(f\) required to test the list \(H\)-homomorphisms. Specifically, we show that (i) list \(H\)-homomorphisms are testable with a constant number of queries if and only if \(H\) is a reflexive complete graph or an irreflexive complete bipartite graph and (ii) list \(H\)-homomorphisms are testable with a sublinear number of queries if and only if \(H\) is a bi-arc graph.

    Sublinear-Time Algorithms,Constraint Satisfaction Problems
  • Cycle and Flow Trusses in Directed Networks. Taro Takaguchi and Yuichi Yoshda. Royal Society Open Science 2016. [arXiv] [code]

    When we represent real-world systems as networks, the directions of links often convey valuable information. Finding module structures that respect link directions is one of the most important tasks for analysing directed networks. Although many notions of a directed module have been proposed, no consensus has been reached. This lack of consensus results partly because there might exist distinct types of modules in a single directed network, whereas most previous studies focused on an independent criterion for modules. To address this issue, we propose a generic notion of the so-called truss structures in directed networks. Our definition of truss is able to extract two distinct types of trusses, named the cycle truss and the flow truss, from a unified framework. By applying the method for finding trusses to empirical networks obtained from a wide range of research fields, we find that most real networks contain both cycle and flow trusses. In addition, the abundance of (and the overlap between) the two types of trusses may be useful to characterize module structures in a wide variety of empirical networks. Our findings shed light on the importance of simultaneously considering different types of modules in directed networks.

    Network Algorithms
  • Testing Properties of Functions on Finite Groups. Kenta Oono and Yuichi Yoshida. Random Structures & Algorithms 2016. [arXiv]

    We study testing properties of functions on finite groups. First we consider functions of the form math formula, where \(G\) is a finite group. We show that conjugate invariance, homomorphism, and the property of being proportional to an irreducible character is testable with a constant number of queries to \(f\), where a character is a crucial notion in representation theory. Our proof relies on representation theory and harmonic analysis on finite groups. Next we consider functions of the form math formula, where \(d\) is a fixed constant and math formula is the family of \(d\) by \(d\) matrices with each element in math formula. For a function math formula, we show that the unitary isomorphism to \(g\) is testable with a constant number of queries to \(f\), where we say that \(f\) and \(g\) are unitary isomorphic if there exists a unitary matrix \(U\) such that math formula for any math formula.

    Sublinear-Time Algorithms,Boolean Functions
  • Half-integrality, LP-branching and FPT Algorithms. Yoichi Iwata, Magnus Wahlström and Yuichi Yoshida. SIAM Journal on Computing 2016. [arXiv]

    A recent trend in parameterized algorithms is the application of polytope tools to fixed-parameter tractable (FPT) algorithms [e.g., Cygan et al., 2011; Narayanaswamy et al., 2012]. Although this approach has yielded significant speedups for a range of important problems, it requires the underlying polytope to have very restrictive properties, including half-integrality and Nemhauser--Trotter-style persistence properties. To date, these properties are essentially known to hold only for two classes of polytopes, covering the cases of Vertex Cover [Nemhauser and Trotter, 1975] and Node Multiway Cut [Garg et al., 2004]. Taking a slightly different approach, we view half-integrality as a discrete relaxation of a problem, e.g., a relaxation of the search space from \(\{0,1\}^V\) to \(\{0,1/2,1\}^V\) such that the new problem admits a polynomial-time exact solution. Using tools from constraint satisfaction problems [in particular Thapper and Živný, 2012] to study the existence of such relaxations, we are able to provide a much broader class of half-integral polytopes with the required properties. Our results unify and significantly extend the previously known cases, and yield a range of new and improved FPT algorithms, including an \(O^*(|\Sigma|^{2k})\)-time algorithm for node-deletion Unique Label Cover and an \(O^*(4^k)\)-time algorithm for Group Feedback Vertex Set where the group is given by oracle access. The latter result also implies the first single-exponential time FPT algorithm for Subset Feedback Vertex Set, answering an open question of Cygan et al., 2016]. Additionally, we propose a network-flow-based approach to solve several cases of the relaxation problem. This gives the first linear-time FPT algorithm to edge-deletion Unique Label Cover.

    Combinatorial Optimization and FPT
  • Coverage Centralities for Temporal Networks. Taro Takaguchi, Yosuke Yano and Yuichi Yoshida. The European Physical Journal B 2016. [arXiv]

    Structure of real networked systems, such as social relationship, can be modeled as temporal networks in which each edge appears only at the prescribed time. Understanding the structure of temporal networks requires quantifying the importance of a temporal vertex, which is a pair of vertex index and time. In this paper, we define two centrality measures of a temporal vertex based on the fastest temporal paths which use the temporal vertex. The definition is free from parameters and robust against the change in time scale on which we focus. In addition, we can efficiently compute these centrality values for all temporal vertices. Using the two centrality measures, we reveal that distributions of these centrality values of real-world temporal networks are heterogeneous. For various datasets, we also demonstrate that a majority of the highly central temporal vertices are located within a narrow time window around a particular time. In other words, there is a bottleneck time at which most information sent in the temporal network passes through a small number of temporal vertices, which suggests an important role of these temporal vertices in spreading phenomena.

    Network Algorithms

2015

  • Testing Outerplanarity of Bounded Degree Graphs. Hiro Ito and Yuichi Yoshida. Algorithmica 2015.

    We present an efficient algorithm for testing outerplanarity of graphs in the bounded-degree model. In this model, given a graph \(G\) with degree bound \(d\), we should distinguish with high probability the case that \(G\) is outerplanar from the case that modifying at least an \(\epsilon\)-fraction of the edge set is necessary to make \(G\) outerplanar. Our algorithm runs in time polynomial in \(d\) and \(1/\epsilon\) only. To achieve the time complexity, we exploit the tree-like structure inherent to an outerplanar graph using the microtree/macrotree decomposition of a tree. As a corollary, we show that the property of being a cactus is testable in time polynomial in \(d\) and \(1/\epsilon\).

    Sublinear-Time Algorithms
  • A Query Efficient Non-Adaptive Long Code Test with Perfect Completeness. Suguru Tamaki and Yuichi Yoshida. Random Structures & Algorithms 2015.

    Long Code testing is a fundamental problem in the area of property testing and hardness of approximation. In this paper, we study how small the soundness s of the Long Code test with perfect completeness can be by using non-adaptive \(q\) queries. We show that \(s = (2q + 3)/2^q\) is possible, where \(q\) is of the form \(2^k-1\) for any integer \(k > 2\).

    Sublinear-Time Algorithms,Constraint Satisfaction Problems
  • Partially Symmetric Functions are Efficiently Isomorphism-Testable. Eric Blais, Amit Weinstein and Yuichi Yoshida. SIAM Journal on Computing 2015. [arXiv] [slides (by Weinstein)] [video (talk by Weinstein)]

    Given a Boolean function \(f\), the \(f\)-isomorphism testing problem requires a randomized algorithm to distinguish functions that are identical to \(f\) up to relabeling of the input variables from functions that are far from being so. An important open question in property testing is to determine for which functions \(f\) we can test \(f\)-isomorphism with a constant number of queries. Despite much recent attention to this question, essentially only two classes of functions were known to be efficiently isomorphism testable: symmetric functions and juntas. We unify and extend these results by showing that all partially symmetric functions---functions invariant to the reordering of all but a constant number of their variables---are efficiently isomorphism testable. This class of functions, first introduced by Shannon, includes symmetric functions, juntas, and many other functions as well. We conjecture that these functions are essentially the only functions efficiently isomorphism-testable. To prove our main result, we also show that partial symmetry is efficiently testable. In turn, to prove this result we had to revisit the junta testing problem. We provide a new proof of correctness of the nearly optimal junta tester. Our new proof replaces the Fourier machinery of the original proof with a purely combinatorial argument that exploits the connection between sets of variables with low influence and intersecting families. Another important ingredient in our proofs is a new notion of symmetric influence. We use this measure of influence to prove that partial symmetry is efficiently testable and also to construct an efficient sample extractor for partially symmetric functions. We then combine the sample extractor with the testing-by-implicit-learning approach to complete the proof that partially symmetric functions are efficiently isomorphism testable.

    Sublinear-Time Algorithms,Boolean Functions
  • Testing Supermodular-cut Condition. Shin-ichi Tanigawa and Yuichi Yoshida. Algorithmica 2015.

    For a function \(f:2^V \to \mathbb{Z}_+\) on a finite set \(V\) with \(f(\emptyset)=f(V)=0\), a digraph \(D=(V,A)\) is called \(f\)-connected if it satisfies the \(f\)-cut condition, that is, \(\delta_D(X) \geq f(X)\) for any \(X \subseteq V\), where \(\delta_D(X)\) is the number of arcs from \(X\) to \(V \setminus X\). We show that, for any crossing supermodular function \(f\), the \(f\)-connectivity can be tested with a constant number of queries in the general digraph model with average degree bound. As immediate corollaries, we obtain constant-time testers for \(k\)-edge-connectivity, rooted-\((k,l)\)-edge-connectivity, and the property of having \(k\) arc-disjoint arborescences. We also give a corresponding result for the undirected case.

    Sublinear-Time Algorithms,Submodular Functions
  • Generalized River Crossing Problems. Hiro Ito, Stefan Langerman and Yuichi Yoshida. Theory of Computing Systems 2015.

    Three men, each with a sister, must cross a river using a boat that can carry only two people in such a way that a sister is never left in the company of another man if her brother is not present. This very famous problem appeared in the Latin book “Problems to Sharpen the Young,” one of the earliest collections of recreational mathematics. This paper considers a generalization of such “river crossing problems” and provides a new formulation that can treat wide variations. The main result is that, if there is no upper bound on the number of transportations (river crossings), a large class of subproblems can be solved in polynomial time even when the passenger capacity of the boat is arbitrarily large. The authors speculated this fact at FUN 2012. On the other hand, this paper also demonstrates that, if an upper bound on the number of transportations is given, the problem is NP-hard even when the boat capacity is three, although a large class of subproblems can be solved in polynomial time if the boat capacity is two.

2014

  • Suppressing Epidemics on Networks by Exploiting Observer Nodes. Taro Takaguchi, Takehisa Hasegawa and Yuichi Yoshida. Physical Review E 2014. [arXiv]

    To control infection spreading on networks, we investigate the effect of observer nodes that recognize infection in a neighboring node and make the rest of the neighbor nodes immune. We numerically show that random placement of observer nodes works better on networks with clustering than on locally treelike networks, implying that our model is promising for realistic social networks. The efficiency of several heuristic schemes for observer placement is also examined for synthetic and empirical networks. In parallel with numerical simulations of epidemic dynamics, we also show that the effect of observer placement can be assessed by the size of the largest connected component of networks remaining after removing observer nodes and links between their neighboring nodes.

    Network Algorithms
  • Generalized Skew Bisubmodularity: A Characterization and a Min-Max Theorem. Satoru Fujishige, Shin-ichi Tanigawa and Yuichi Yoshida. Discrete Optimization 2014.

    Huber, Krokhin, and Powell (2013) introduced a concept of skew bisubmodularity, as a generalization of bisubmodularity, in their complexity dichotomy theorem for valued constraint satisfaction problems over the three-value domain. In this paper we consider a natural generalization of the concept of skew bisubmodularity and show a connection between the generalized skew bisubmodularity and a convex extension over rectangles. We also analyze the dual polyhedra, called skew bisubmodular polyhedra, associated with generalized skew bisubmodular functions and derive a min-max theorem that characterizes the minimum value of a generalized skew bisubmodular function in terms of a minimum-norm point in the associated skew bisubmodular polyhedron.

    Submodular Functions

2013

  • Property Testing for Cyclic Groups and Beyond. Francois Le Gall and Yuichi Yoshida. Journal of Combinatorial Optimization 2013.

    This paper studies the problem of testing if an input \((\Gamma,\circ)\), where \(\Gamma\) is a finite set of unknown size and \(\circ\) is a binary operation over \(\Gamma\) given as an oracle, is close to a specified class of groups. Friedl et al. (Proc. of STOC, 2005) have constructed an efficient tester using \(\mathrm{poly}(\log|\Gamma|)\) queries for the class of abelian groups. We focus in this paper on subclasses of abelian groups, and show that these problems are much harder: \(\Omega(|\Gamma|^{1/6})\) queries are necessary to test if the input is close to a cyclic group, and \(\Omega(|\Gamma|^c)\) queries for some constant c are necessary to test more generally if the input is close to an abelian group generated by k elements, for any fixed integer k≥1. We also show that knowledge of the size of the ground set \(\Gamma\) helps only for \(k=1\), in which case we construct an efficient tester using \(\mathrm{poly}(\log|\Gamma|)\) queries; for any other value \(k\geq 2\) the query complexity remains \(\Omega(|\Gamma|^c)\). All our upper and lower bounds hold for both the edit distance and the Hamming distance. These are, to the best of our knowledge, the first nontrivial lower bounds for such group-theoretic problems in the property testing model and, in particular, they imply the first exponential separations between the classical and quantum query complexities of testing closeness to classes of groups.

    Sublinear-Time Algorithms
  • Semi-strong Coloring of Intersecting Hypergraphs. Eric Blais, Amit Weinstein and Yuichi Yoshida. Combinatorics Probability and Computing 2013.

    For any \(c ≥ 2\), a \(c\)-strong colouring of the hypergraph \(G\) is an assignment of colours to the vertices of \(G\) such that, for every edge e of \(G\), the vertices of e are coloured by at least \(\min{c,|e|}\) distinct colours. The hypergraph \(G\) is \(t\)-intersecting if every two edges of \(G\) have at least \(t\) vertices in common.

    A natural variant of a question of Erdős and Lovász is: For fixed \(c \geq 2\) and \(t \geq 1\), what is the minimum number of colours that is sufficient to \(c\)-strong colour any \(t\)-intersecting hypergraphs? The purpose of this note is to describe some open problems related to this question.

  • Constant-Time Approximation Algorithms for the Optimum Branching Problem on Sparse Graphs. Mitsuru Kusumoto, Yuichi Yoshida and Hiro Ito. International Journal of Networking and Computing 2013.

    We propose a constant-time algorithm for approximating the weight of the maximum weight branching in the general graph model. A directed graph is called a branching if it is acyclic and each vertex has at most one incoming edge. An edge-weighted digraph \(G\) of average degree \(d\) whose weights are real values in \([0, 1]\) is given as an oracle access, and we are allowed to ask degrees and incoming edges for any vertex through the oracle. Then, with high probability, our algorithm estimates the weight of the maximum weight branching in \(G\) with an absolute error of at most εn with query complexity \(O(d/\epsilon^3)\), where \(n\) is the number of vertices. We also show a lower bound of \(\Omega(d/\epsilon^2)\). Additionally, our algorithm can be modified to run with query complexity \(O(1/\epsilon^4)\) for unweighted digraphs, i.e., it runs in time independent of the input size even for digraphs with \(d = \Omega(n)\) edges. In contrast, we show that it requires \(\Omega(n)\) queries to approximate the weight of the minimum (or maximum) spanning arborescence in a weighted digraph.

    Sublinear-Time Algorithms

2012

  • An Almost Optimal Algorithm for Winkler’s Sorting Pairs in Bins. Hiro Ito, Junichi Teruyama and Yuichi Yoshida. Progress in Informatics 2012.

    We investigate the following sorting problem: We are given n bins with two balls in each bin. Balls in the \(i\)th bin are numbered \(n + 1 - i\). We can swap two balls from adjacent bins. How many number of swaps are needed in order to sort balls, i.e., move every ball to the bin with the same number. For this problem the best known solution requires almost \(\frac{4n^2}{5}\) swaps. In this paper, we show an algorithm which solves this problem using less than \(\frac{2n^2}{3}\) swaps. Since it is known that the lower bound of the number of swapsi s \(\lceil {2n \choose 2}/3 \rceil = \lceil \frac{2n^2}{3} - \frac{n}{3}\rceil\), our resultis is almost tight. Furthermore, we show that for \(n = 2^m + 1 (m \geq 0)\) the algorithm is optimal.

  • Testing the (s, t)-Disconnectivity of Graphs and Digraphs. Yusuke Kobayashi and Yuichi Yoshida. Theoretical Computer Science 2012.

    Property testing is concerned with constant-time algorithms for deciding whether a given object satisfies a predetermined property or is far from satisfying it. In this paper, we consider testing properties related to the connectivity of two vertices in sparse graphs.

    We present one-sided error testers for \((s,t)\)-disconnectivity with query complexity \(2^{O(1/\epsilon}\) for digraphs and \(O(1/\epsilon^2)\) for graphs, where \(\epsilon\) is an error parameter. Furthermore, we show that these algorithms are the best possible in view of query complexity, i.e., we give matching lower bounds for two-sided error testers for both cases.

    We also give a constant-time algorithm for testing the \((s,t)\)-disconnectivity of a directed bounded-degree hypergraph, which can be used to test the satisfiability of Horn SAT.

    Sublinear-Time Algorithms
  • Property Testing on k-Vertex Connectivity of Graphs. Yuichi Yoshida and Hiro Ito. Algorithmica, 2012.

    We present an algorithm for testing the \(k\)-vertex-connectivity of graphs with the given maximum degree. The time complexity of the algorithm is independent of the number of vertices and edges of graphs. Fixed degree bound \(d\), a graph \(G\) with \(n\) vertices and a maximum degree at most \(d\) is called \(\epsilon\)-far from \(k\)-vertex-connectivity when at least \(\frac{\epsilon dn}{2}\) edges must be added to or removed from \(G\) to obtain a \(k\)-vertex-connected graph with a maximum degree at most \(d\). The algorithm always accepts every graph that is \(k\)-vertex-connected and rejects every graph that is \(\epsilon\)-far from \(k\)-vertex-connectivity with a probability of at least \(2/3\). The algorithm runs in \(O(d(\frac{c}{\epsilon d})^k \log\frac{1}{\epsilon d})\) time (\(c>1\) is a constant) for \((k−1)\)-vertex-connected graphs, and in \(O(d(\frac{ck}{\epsilon d})^k \log\frac{k}{\epsilon d})\) time (\(c>1\) is a constant) for general graphs. It is the first constant-time \(k\)-vertex-connectivity testing algorithm for general \(k \geq 4\).

    Sublinear-Time Algorithms
  • Improved Constant-Time Approximation Algorithms for Maximum Matchings and Other Optimization Problems. Yuichi Yoshida, Masaki Yamamoto and Hiro Ito. SIAM Journal on Computing 2012.

    We study constant-time approximation algorithms for bounded-degree graphs, which run in time independent of the number of vertices \(n\). We present an algorithm that decides whether a vertex is contained in a some fixed maximal independent set with expected query complexity \(O(d^2)\), where \(d\) is the degree bound. Using this algorithm, we show constant-time approximation algorithms with certain multiplicative error and additive error \(\epsilon n\) for many other problems, e.g., the maximum matching problem, the minimum vertex cover problem, and the minimum set cover problem, that run exponentially faster than existing algorithms with respect to \(d\) and \(\frac{1}{\epsilon}\). Our approximation algorithm for the maximum matching problem can be transformed to a two-sided error tester for the property of having a perfect matching. On the contrary, we show that every one-sided error tester for the property requires at least \(\Omega(n)\) queries.

    Sublinear-Time Algorithms
  • On the Distance Between Non-isomorphic Groups. Gábor Ivanyos, François Le Gall and Yuichi Yoshida. European Journal of Combinatorics 2012. [arXiv]

    A result of Ben-Or, Coppersmith, Luby and Rubinfeld on testing whether a map be two groups is close to a homomorphism implies a tight lower bound on the distance between the multiplication tables of two non-isomorphic groups.

2010

  • Testing k-Edge-Connectivity of Digraphs. Yuichi Yoshida and Hiro Ito. Journal of System Science and Complexity 2010.

    This paper presents an algorithm that tests whether a given degree-bounded digraph is k-edge-connected or \(\epsilon\)-far from \(k\)-edge-connectivity. This is the first testing algorithm for \(k\)-edgeconnectivity of digraphs whose running time is independent of the number of vertices and edges. A digraph of \(n\) vertices with degree bound \(d\) is \(\epsilon\)-far from \(k\)-edge-connectivity if at least \(\epsilon dn\) edges have to be added or deleted to make the digraph \(k\)-edge-connected, preserving the degree bound. Given a constant error parameter \(\epsilon\) and a degree bound \(d\), our algorithm always accepts all \(k\)-edge-connected digraphs and rejects all digraphs that is \(\epsilon\)-far from \(k\)-edge-connectivity with probability at least \(2/3\). It runs in \(O\left(d\left(\frac{c}{εd}\right)^k \log\frac{1}{\epsilon d}\right)\) (\(c > 1\) is a constant) time when input digraphs are restricted to be \((k-1)\)-edge connected and runs in \(O\left(d\left(\frac{ck}{\epsilon d}\right)^k\log\frac{k}{\epsilon d}\right)\) (\(c > 1\) is a constant) time for general digraphs.

    Sublinear-Time Algorithms