# Machine Learning Seminar Series

From 2016 to 2018, the Department of Computing Science hosted a Machine Learning Seminar Series. The series consisted of (guest) lectures on machine learning topics related to the research interests of the department.

Please contact Andrii Dmytryshyn (andrii@cs.umu.se) if you have any questions.

List of the seminars:

______________________________________________________________________

September 14, 2016, at 10:30, room N420

**Ramon Lopez de Mantaras**

Artificial Intelligence Research Institute of the Spanish National Research Council, Barcelona, Spain

Title: **Past, Present, and Future of AI: A Fascinating Journey**

Abstract: In my talk I will start by recalling the origins of AI, and I will comment on the difference between today’s specific AI and the general human-like AI that the founding fathers had in mind. Then I will briefly mention some of the most impressive results achieved over the past years. Next I will discuss some of the difficulties involved in trying to achieve general AI systems, particularly the problem of dealing with common sense knowledge. I will also argue for the need to develop integrated AI systems, and for the importance of embodiment as a prerequisite for progress towards general human-like AI. Finally I will mention the importance of paying attention to developments in other fields, such as biology, materials science, or nanotechnology, that might have wide-ranging influences on our ideas about AI and on the machines we will build.

______________________________________________________________________

November 30, 2016, at 10:30, room N360

**Shay Cohen**

University of Edinburgh, Edinburgh, UK

Title: **Canonical Correlation Analysis for Making Predictions in Natural Language Processing**

Abstract: How can we learn from data when we have several modalities or views available in the data? How would we define a "view" to begin with? These questions can be answered in the framework of canonical correlation analysis (CCA). CCA is an old statistical learning technique introduced by Hotelling (1935). It performs multi-view dimensionality reduction, and is now being revitalized and used for a number of problems in natural language processing, computer vision and machine learning.

In this talk, I will describe the idea behind canonical correlation and show how it can be applied to an array of problems in natural language processing, ranging from automatic image captioning and deriving word embeddings to sequence labeling and syntactic parsing. Some of these applications of CCA make creative use of the notion of a "view" and enable learning structured statistical models, such as latent-variable context-free grammars and hidden Markov models with strong guarantees on the learning.
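
For readers unfamiliar with CCA, the following minimal NumPy sketch illustrates the basic computation on two synthetic views. It is not taken from the talk: the function name `cca`, the QR-based formulation, and the toy data (two views sharing one latent signal `z`) are assumptions made purely for illustration, and the sketch assumes both centered views have full column rank.

```python
import numpy as np

def cca(X, Y, k):
    """Top-k canonical correlations and projection matrices for two views.

    Illustrative helper (hypothetical name); assumes the centered views
    have full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered views
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations
    U, s, Vt = np.linalg.svd(Qx.T @ Qy)
    # Map the singular vectors back to weights on the original features
    A = np.linalg.solve(Rx, U[:, :k])
    B = np.linalg.solve(Ry, Vt.T[:, :k])
    return s[:k], A, B

# Synthetic two-view data: both views contain a noisy copy of the same signal z
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(n, 1)), rng.normal(size=(n, 2))])
Y = np.hstack([rng.normal(size=(n, 2)), z + 0.1 * rng.normal(size=(n, 1))])

corrs, A, B = cca(X, Y, k=2)
print(corrs)  # first value is large (shared signal), second is small (noise)
```

The first pair of projections picks out the shared signal in each view, which is the sense in which CCA performs multi-view dimensionality reduction.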

______________________________________________________________________

December 7, 2016, at 13:15, room N360

**Bart De Moor**

KU Leuven, Belgium

Title: **Linear Algebra in Machine Learning**

Abstract: In this presentation, we will focus on least squares support vector machines. These are machine learning algorithms that, in their so-called primal formulation, start from a constrained optimization problem with a quadratic (least squares) objective function. The ingredients are a function that maps given data points to a higher dimensional (possibly infinite dimensional) feature space, a vector of unknown weights, and a vector of error variables that, in the case of a classification problem, tolerate misclassifications.

First, we will show how standard (Gaussian) linear regression problems with linear constraints and a priori information on the distribution of the vector of unknowns and residuals lead to constrained weighted least squares problems. These can be solved for the vector of unknowns, the vector of residuals, or the Lagrange multipliers introduced for the constraints. This leads to three different yet equivalent solutions.

We then use this result to solve a given least-squares optimization problem in its so-called dual form, which ultimately leads to a (large) set of linear equations (in the literature called a Karush-Kuhn-Tucker system). The feature mapping need not be known explicitly, but leads (via the so-called ‘kernel trick’) to a kernel function that one can choose depending on the requirements and a priori information on the data and the problem. Examples are linear kernels, polynomial kernels, Radial Basis Functions, etc.

So in the first part of the talk, we will elaborate on how plain linear algebra allows us to formulate non-linear classification and regression problems through clever exploitation of the ‘kernel trick’.
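
The dual linear system described above can be made concrete with a small example. The sketch below is not taken from the talk: it assumes the standard LS-SVM classifier formulation with an RBF kernel, and the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`), the regularization parameter `gamma`, the kernel width `sigma`, and the XOR toy data are all illustrative choices.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM dual (KKT) linear system for the bias b and multipliers alpha.

        [ 0   y^T             ] [ b     ]   [ 0 ]
        [ y   Omega + I/gamma ] [ alpha ] = [ 1 ]

    with Omega_ij = y_i * y_j * K(x_i, x_j).
    """
    n = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    M[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, y, b, alpha, X_new, sigma=1.0):
    """Classify new points with sign(sum_i alpha_i * y_i * K(x, x_i) + b)."""
    return np.sign(rbf_kernel(X_new, X_train, sigma) @ (alpha * y) + b)

# XOR: not linearly separable in the input space, but separable with an RBF kernel
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

b, alpha = lssvm_fit(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
print(pred)
```

Training reduces to a single call to a linear solver, and the feature map never appears explicitly; only kernel evaluations between data points are needed.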

In the second part of the talk, we turn the problem around. We will show how the machinery of least squares support vector machines can be deployed to formulate kernel extensions of Principal Component Analysis (SVD), Canonical Correlation Analysis (angles between subspaces) and Partial Least Squares. This allows us to generalize these well-known ‘linear’ statistical techniques to non-linear extensions, while using only insights and algorithms from linear algebra.

We will comment on issues of sparsity and robustness, and illustrate our presentation with an abundance of applications from industrial and medical data sources.

______________________________________________________________________

September 1, 2017, at 10:30, room N430

**Sharma Chakravarthy**

The University of Texas at Arlington, USA

Title: **Improving Answer Quality Prediction in Q/A Social Networks**

Abstract: As we all know, searching for good answers is a general problem, and Google provides a good mechanism for it by ranking pages on the internet. Most users do not view more than the top 2 or 3 pages returned by Google. An alternative to search is to ask specific questions on Question/Answer (Q/A) networks and get answers from different individuals (possibly experts). Here you are likely to get a more focused answer to your question, hence the proliferation of Q/A networks since 2000. However, because the quality of answers varies widely, identifying the best answer, or ranking answers in general, is a challenge.

In this talk, we address the problem of answer quality prediction. I will present our approach to ranking answers using question and answer features. We feel that traditional features, classification, and ranking approaches are not well-suited for this problem. Unlike previous approaches, we take into account temporal aspects that characterize user behavior on Q/A networks. We have been able to establish that a smaller number of features can provide better accuracy with respect to ranking of answers. This talk is based on the speaker’s research and projects, and on the experience gained in the process over the last 30+ years.

______________________________________________________________________

April 25, 2018, at 13:15, room MA136

**Lili Jiang**

Department of Computing Science, Umeå University

Title: **Machine Learning and Privacy**

Abstract: With the growth of user-generated content on the internet, advanced machine learning techniques have brought more and more innovation and change to our lives (e.g., product recommendation, image processing, etc.). However, the dawn of big (personal) data presents an entirely new threat to privacy, opening up volumes of data for analysis on a whole new scale. As a result, a series of privacy leakage incidents has occurred. For instance, two researchers de-anonymized Netflix users based on the published Netflix Competition data with movie ratings, and a Target store in the US sent coupons for baby items to a teenage girl based on her shopping records, even before her parents knew she was pregnant. In this talk, I would like to present two parts: the first is about how machine learning is used to federate personal information and the potential risk of privacy leakage; the second is about how privacy is guaranteed in/by applying machine learning.

______________________________________________________________________

May 23, 2018, at 13:15, room N420

**Zhiyong Zhou**

Department of Mathematics and Mathematical Statistics, Umeå University

Title: **Sparsity measures and sparse signal recovery**

Abstract: Sparse signal recovery, particularly compressive sensing, aims to reconstruct a sparse or compressible signal from noisy underdetermined linear measurements. If the measurement matrix satisfies the null space property (NSP) or the restricted isometry property (RIP), stable and robust recovery can be guaranteed. Although probabilistic results show that the NSP and RIP are fulfilled with high probability for some specific random matrices, it is computationally hard to verify the NSP or to compute the restricted isometry constant (RIC) for a given measurement matrix. In this talk, I’d like to introduce a new kind of stable sparsity measure, and present verifiable sufficient conditions and computable performance bounds for sparse recovery algorithms such as Basis Pursuit, the Dantzig selector and the LASSO estimator, in terms of a newly defined family of quality measures for the measurement matrices.
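
To make the recovery problem concrete, the sketch below solves a small LASSO instance with iterative soft-thresholding (ISTA). It does not implement the sparsity measure or the bounds from the talk: the solver choice, the function names (`soft_threshold`, `ista_lasso`), the regularization weight `lam`, and the random Gaussian measurement matrix are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage towards zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_lasso(A, b, lam, n_iter=5000):
    """Minimize 0.5 * ||A x - b||_2^2 + lam * ||x||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)           # gradient of the least squares term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Underdetermined system: 50 measurements of a 100-dimensional, 3-sparse signal
rng = np.random.default_rng(0)
m, n = 50, 100
A = rng.normal(size=(m, n)) / np.sqrt(m)   # random Gaussian measurement matrix
x_true = np.zeros(n)
x_true[[5, 37, 80]] = [2.0, -1.5, 1.0]
b = A @ x_true                             # noise-free measurements

x_hat = ista_lasso(A, b, lam=0.01)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(rel_err)
```

Although the system has more unknowns than equations, the sparse signal is recovered up to a small bias introduced by the regularization term; such random matrices are the setting in which the NSP and RIP hold with high probability.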

______________________________________________________________________