### September 16

#### Invited Talks

- 13:00–14:00 Thomas Zeugmann, "Learning Theory and Complexity – Ultrametric Algorithms"
- 14:00–14:15 Break
- 14:15–15:00 Nir Ailon, "Lower Bound for Fourier Transform in the Well-Conditioned Linear Model (Or: If You Want a Faster Fourier Transform You'd Need to Sacrifice Accuracy)"
- 15:00–15:45 Rémi Eyraud, "Efficiency bounds in the identification in the limit paradigm"

#### ELC Session: short talks and posters

16:00–19:00

- Kei Uchizawa, "Lower Bounds for Linear Decision Trees with Bounded Weights"
- Takayoshi Shoudai, "Polynomial time learning of formal graph systems via queries"
- Ayumi Shinohara, "A note on minimum consistent DFA from prefix samples"
- Takahiro Fujita, "Online Prediction over Permutations under Precedence Constraints"
- Marco Cuturi, "Entropy Regularized Optimal Transport and Applications"

## Abstracts

• Thomas Zeugmann "Learning Theory and Complexity – Ultrametric Algorithms"

The talk starts with a short survey of inductive inference of recursive functions and its complexity-theoretic characterizations. Then we turn our attention to active learning of recursive functions via value queries. We are given a class $$U$$ of recursive functions and have to learn every target function $$f$$ in $$U$$: a query is a natural number $$x$$, and the answer to the query is $$f(x)$$. After asking finitely many queries, the learner has to output a program computing the target function $$f$$. The complexity measure is the worst-case number of queries asked. We compare deterministic, nondeterministic, probabilistic, and ultrametric learning algorithms performing this task. Ultrametric algorithms are a new concept recently introduced by Rusins Freivalds. We prove that for some classes of recursive functions, ultrametric active learning algorithms can achieve the learning goal by asking significantly fewer queries than deterministic, probabilistic, and even nondeterministic active learning algorithms.
Note: The second part of the talk presents joint work with Rusins Freivalds.

• Nir Ailon "Lower Bound for Fourier Transform in the Well-Conditioned Linear Model (Or: If You Want a Faster Fourier Transform You'd Need to Sacrifice Accuracy)"

The Fourier transform is one of the most important linear transformations used in science and technology. Cooley and Tukey's Fast Fourier Transform (FFT) from 1965 computes this transformation in time $$\mathrm{O}(n \log n)$$. In spite of its importance, no nontrivial (super-linear) lower bounds were known without making very strong assumptions. Morgenstern's result from 1974 provides an $$\Omega(n \log n)$$ lower bound for the *unnormalized* Fourier transform (of determinant $$n^{n/2}$$), assuming that only constants of bounded modulus are used. [Ailon 2013] shows an $$\Omega(n \log n)$$ bound for the *normalized* Fourier transform (of determinant 1), assuming that only unitary operations on two coordinates are allowed. In this work we show that, as long as the computation is well conditioned, *any scaling* of the Fourier transform requires $$\Omega((n \log n)/R)$$ operations, where $$R$$ is the condition number. This means that, on a given machine, a faster Fourier transform is less accurate.
The main new technique is the definition of a matrix entropy function based on "quasi-probabilities" (which can be negative or greater than 1). I will discuss the result and present some open questions.
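For readers unfamiliar with the $$\mathrm{O}(n \log n)$$ algorithm the abstract refers to, here is an illustrative radix-2 Cooley-Tukey sketch (not code from the talk), assuming the input length is a power of two:

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT.

    Splits the input into even- and odd-indexed halves, transforms each
    recursively, and combines them with "twiddle factors" e^{-2*pi*i*k/n},
    giving the O(n log n) running time. len(x) must be a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t            # butterfly: top half
        out[k + n // 2] = even[k] - t   # butterfly: bottom half
    return out
```

Production libraries such as `numpy.fft` implement the same transform with much better constants and numerical care; the point of the talk's result is that any well-conditioned implementation faces the $$\Omega((n \log n)/R)$$ barrier.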

• Rémi Eyraud "Efficiency bounds in the identification in the limit paradigm"

The most widely used learning paradigm in Grammatical Inference was introduced in 1967 and is known as identification in the limit. An important issue that has been raised with respect to the original definition is the absence of efficiency bounds. Nearly fifty years after its introduction, it remains an open problem how best to incorporate a notion of efficiency and tractability into this framework. This talk focuses mainly on the set-based refinements of the paradigm that have been developed and studied, and on the challenges they face.

• Kei Uchizawa "Lower Bounds for Linear Decision Trees with Bounded Weights"

We consider a linear decision tree $$T$$ such that the sum of the absolute values of the integer weights of the linear threshold function at each internal node is at most $$w$$, and prove that if $$T$$ has size (i.e., number of leaves) $$s$$ and rank $$r$$, and computes a Boolean function $$f$$, then there exists a depth-2 threshold circuit that computes $$f$$ and has $$s(2w+1)^r$$ threshold gates with weights at most $$(2w+1)^{r+1}$$ in the bottom level. Combining this with a known lower bound on the size of depth-2 threshold circuits, we obtain a $$2^{\Omega (n/\log w)}$$ lower bound on the size of linear decision trees computing the Inner-Product function modulo 2, which improves on the previous bound of $$2^{\sqrt{n}}$$ whenever $$w=2^{o(\sqrt{n})}$$.

• Takayoshi Shoudai "Polynomial time learning of formal graph systems via queries"

A formal graph system (FGS) is a logic program that deals with term graphs instead of the terms of first-order predicate logic. We investigate the polynomial-time learnability of FGSs in the frameworks of PAC learning, query learning, and inductive inference. In this presentation, we introduce context-deterministic (c-deterministic, for short) regular FGSs as a subclass of FGSs and propose a polynomial-time algorithm for learning the class of c-deterministic regular FGSs using membership and equivalence queries.
Joint work with Seiya Hara and Tomoyuki Uchida.

• Ayumi Shinohara "A note on minimum consistent DFA from prefix samples"

The minimum consistent DFA problem asks for a deterministic finite automaton (DFA) of minimum size that is consistent with given positive and negative examples. The computational hardness of the problem, as well as tractable and intractable variants, has been reported in the literature. In this presentation, we focus on the case in which all examples are prefixes of a single string. We prove that the problem is still NP-hard and cannot be approximated within ratio $$n^{1/21}$$, with respect to the input size $$n$$, unless P=NP, when the alphabet size is $$|\Sigma | \geq 2$$. Moreover, we give a polynomial-time algorithm for the case $$|\Sigma| = 1$$.
Joint work with Kaori Ueno, Shinichi Shimozono, and Kazuyuki Narisawa
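To make the problem statement concrete, here is a minimal consistency check for a candidate DFA against positive and negative samples (an illustrative sketch with my own names, not code from the talk; the NP-hard part is *finding* the smallest consistent DFA, not checking one):

```python
def consistent(delta, accept, pos, neg, start=0):
    """Check whether a DFA is consistent with labeled samples.

    delta: transition function as a dict mapping (state, symbol) -> state;
    accept: set of accepting states; pos/neg: lists of strings that the
    DFA must accept / reject, respectively.
    """
    def accepts(word):
        state = start
        for ch in word:
            state = delta[(state, ch)]
        return state in accept

    return all(accepts(w) for w in pos) and not any(accepts(w) for w in neg)
```

For example, the two-state DFA over $$\Sigma = \{a\}$$ that accepts words of even length is consistent with positives $$\{\varepsilon, aa\}$$ and negative $$\{a\}$$, which are all prefixes of the single string $$aa$$, matching the setting the abstract studies.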

• Takahiro Fujita "Online Prediction over Permutations under Precedence Constraints"

We consider an online linear optimization problem over the set of permutations under precedence constraints. At each trial, the player must predict a permutation of $$n$$ fixed objects, subject to constraints that some objects have higher priority than others in each permutation. This problem is naturally motivated by a scheduling problem whose objective is to minimize the sum of completion times of $$n$$ sequential tasks under precedence constraints. We propose a polynomial-time online linear optimization algorithm and show that it predicts almost as well as the best known offline approximation algorithms do in hindsight.
Joint work with Kohei Hatano, Shuji Kijima, and Eiji Takimoto

• Marco Cuturi "Entropy Regularized Optimal Transport and Applications"

Optimal transport distances (a.k.a. Wasserstein or EMD) define a fundamental geometry for probability measures. Despite their intuitive theoretical properties and excellent performance in practical retrieval tasks that compare histograms of features, their usage in machine learning has been hampered by their heavy computational price tag: their computation involves solving a linear program whose cost quickly becomes prohibitive. We show in this talk how a redefinition of these distances can change this situation. We propose to smooth the classical optimal transport problem with an entropic regularization term, and show that the resulting optimum is also a distance that can be computed with Sinkhorn-Knopp's matrix scaling algorithm at a speed several orders of magnitude faster than that of transport solvers. We show how and why this approach is effective regardless of the properties of the ground metric, and how it can be applied directly to provide new computational approaches to variational problems that involve optimal transport distances, such as the Wasserstein barycenter problem recently proposed by Agueh and Carlier (2010).
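As a rough illustration of the matrix scaling the abstract mentions (a sketch with my own variable names, not code from the talk), Sinkhorn-Knopp alternately rescales the rows and columns of the Gibbs kernel $$e^{-C/\varepsilon}$$ until both marginal constraints of the transport polytope hold:

```python
import numpy as np

def sinkhorn(a, b, C, reg, n_iter=200):
    """Entropy-regularized optimal transport via Sinkhorn-Knopp scaling.

    a, b: source/target histograms (nonnegative, summing to 1);
    C: ground cost matrix; reg: regularization strength epsilon
    (larger = smoother, more diffuse plan).
    Returns the approximate transport plan P = diag(u) K diag(v).
    """
    K = np.exp(-C / reg)       # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)      # rescale columns to match marginal b
        u = a / (K @ v)        # rescale rows to match marginal a
    return u[:, None] * K * v[None, :]
```

Each iteration costs only a pair of matrix-vector products, which is the source of the speedup over exact linear-programming solvers; mature implementations (e.g. the POT library) add log-domain stabilization for small $$\varepsilon$$.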