Machine Learning Seminar Series
In the fall terms of 2016 and 2017, the Department of Computing Science hosts a Machine Learning Seminar Series. The series consists of (guest) lectures on machine learning topics related to the research interests of the department.
Please contact Andrii Dmytryshyn (email@example.com) if you have any questions.
List of the seminars:
September 14, 2016, at 10:30, room N420
Ramon Lopez de Mantaras
Artificial Intelligence Research Institute of the Spanish National Research Council, Barcelona, Spain
Title: Past, Present, and Future of AI: A Fascinating Journey
Abstract: I will start by recalling the origins of AI and commenting on the difference between today’s specific AI and the general human-like AI that the founding fathers had in mind. I will then briefly mention some of the most impressive results achieved over the past years. Next I will discuss some of the difficulties involved in trying to achieve general AI systems, particularly the problem of dealing with common-sense knowledge. I will also advocate the need to develop integrated AI systems, as well as the importance of embodiment, as a prerequisite for progress towards general human-like AI. Finally, I will mention the importance of paying attention to developments in other fields, such as biology, materials science, and nanotechnology, that might have wide-ranging influences on our ideas about AI and on the machines we will build.
November 30, 2016, at 10:30, room N360
University of Edinburgh, Edinburgh, UK
Title: Canonical Correlation Analysis for Making Predictions in Natural Language Processing
Abstract: How can we learn from data when we have several modalities or views available in the data? How would we define a "view" to begin with? These questions can be answered in the framework of canonical correlation analysis (CCA). CCA is an old statistical learning technique introduced by Hotelling (1935). It performs multi-view dimensionality reduction, and is now being revitalized and used for a number of problems in natural language processing, computer vision and machine learning.
In this talk, I will describe the idea behind canonical correlation and show how it can be applied to an array of problems in natural language processing, ranging from automatic image captioning and deriving word embeddings to sequence labeling and syntactic parsing. Some of these applications of CCA make creative use of the notion of a "view" and enable learning structured statistical models, such as latent-variable context-free grammars and hidden Markov models with strong guarantees on the learning.
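As a rough illustration of the idea behind canonical correlation described above, the following NumPy sketch finds maximally correlated directions in two views by whitening each view and taking an SVD of the whitened cross-covariance. The function name `cca`, the small ridge term `reg`, and the synthetic two-view data are assumptions made for this example and are not taken from the talk.

```python
import numpy as np

def cca(X, Y, k, reg=1e-8):
    """Return projections A, B and the top-k canonical correlations of views X and Y."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Within-view covariances (with a tiny ridge for stability) and cross-covariance.
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    # Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the singular vectors back to the original coordinates of each view.
    A = np.linalg.solve(Lx.T, U[:, :k])
    B = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return A, B, s[:k]

# Hypothetical data: two views that share one latent signal z plus unrelated noise.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X_view = np.hstack([z + 0.1 * rng.normal(size=(500, 1)), rng.normal(size=(500, 2))])
Y_view = np.hstack([rng.normal(size=(500, 2)), -z + 0.1 * rng.normal(size=(500, 1))])

A, B, corr = cca(X_view, Y_view, k=1)
print("top canonical correlation:", corr[0])
```

The singular values of the whitened cross-covariance are the canonical correlations, so the shared signal shows up as a leading value close to 1 even though it sits in different coordinates of the two views.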
December 7, 2016, at 13:15, room N360
Bart De Moor
KU Leuven, Belgium
Title: Linear Algebra in Machine Learning
Abstract: In this presentation, we will focus on least squares support vector machines. In their so-called primal formulation, these machine learning algorithms start from a constrained optimization problem with a quadratic (least squares) objective function. The ingredients are a function that maps the given data points to a higher-dimensional (possibly infinite-dimensional) feature space, a vector of unknown weights, and a vector of error variables that, in the case of a classification problem, tolerate misclassifications.
First, we will show how standard (Gaussian) linear regression problems with linear constraints and a priori information on the distribution of the vector of unknowns and residuals lead to constrained weighted least squares problems. These can be solved for the vector of unknowns, the vector of residuals, or the Lagrange multipliers introduced for the constraints. This leads to three different, yet equivalent, solutions.
We then use this result to solve a given least-squares optimization problem in its so-called dual form, which ultimately leads to a (large) set of linear equations (called a Karush-Kuhn-Tucker system in the literature). The feature mapping need not be known explicitly, but leads (via the so-called ‘kernel trick’) to a kernel function that one can choose depending on the requirements and a priori information on the data and the problem. Examples are linear kernels, polynomial kernels, radial basis functions, etc.
So in the first part of the talk, we will elaborate on how plain linear algebra allows us to formulate non-linear classification and regression problems through clever exploitation of the ‘kernel trick’.
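A minimal sketch of the dual route described above, assuming the commonly cited LS-SVM classifier formulation in which the Karush-Kuhn-Tucker conditions reduce to one linear system in the bias `b` and the Lagrange multipliers `alpha`. The RBF kernel, the hyperparameters `gamma` and `sigma`, the function names, and the XOR toy data are illustrative choices and not details from the talk.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma**2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the KKT system [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]."""
    n = len(y)
    # Omega_ij = y_i y_j K(x_i, x_j): the feature map appears only through the kernel.
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    KKT = np.zeros((n + 1, n + 1))
    KKT[0, 1:] = y
    KKT[1:, 0] = y
    KKT[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(KKT, rhs)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(X_train, y, b, alpha, X_new, sigma=1.0):
    """Classify with sign(sum_i alpha_i y_i K(x, x_i) + b)."""
    return np.sign(rbf_kernel(X_new, X_train, sigma) @ (alpha * y) + b)

# XOR-style toy problem: not separable by any linear classifier in the input space.
X_xor = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y_xor = np.array([1.0, 1.0, -1.0, -1.0])

b, alpha = lssvm_fit(X_xor, y_xor, gamma=10.0, sigma=0.5)
pred = lssvm_predict(X_xor, y_xor, b, alpha, X_xor, sigma=0.5)
print("predictions:", pred)
```

Training here is a single call to a linear solver, and swapping `rbf_kernel` for a linear or polynomial kernel changes the model without changing that solve.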
In the second part of the talk, we turn the problem around. We will show how the machinery of least squares support vector machines can be deployed to formulate kernel extensions of Principal Component Analysis (SVD), Canonical Correlation Analysis (angles between subspaces), and Partial Least Squares. This allows us to generalize these well-known ‘linear’ statistical techniques to non-linear extensions, using only insights and algorithms from linear algebra.
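To make the link between a ‘linear’ technique and its kernel extension concrete, here is a small sketch of standard kernel PCA as an eigenvalue problem on the centered kernel matrix. It is not the LS-SVM derivation from the talk. The function name `kernel_pca` and the random data are assumptions for this example. With a linear kernel the result should coincide with ordinary PCA computed by SVD, which the final lines check up to sign.

```python
import numpy as np

def kernel_pca(K, k):
    """Return top-k component scores of the training points and the eigenvalues."""
    n = K.shape[0]
    # Center the kernel matrix, i.e. center the data in feature space.
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:k]
    vals, vecs = vals[order], vecs[:, order]
    # Scores are Kc @ v / sqrt(lambda) = sqrt(lambda) * v for unit eigenvectors v.
    scores = vecs * np.sqrt(np.maximum(vals, 0.0))
    return scores, vals

# Hypothetical data with clearly different variances along each coordinate.
rng = np.random.default_rng(1)
X_data = rng.normal(size=(30, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

# Linear kernel: kernel PCA should reproduce ordinary PCA.
scores, eigvals = kernel_pca(X_data @ X_data.T, k=2)

# Ordinary PCA through the SVD of the centered data matrix.
Xc = X_data - X_data.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
print(np.allclose(np.abs(scores), np.abs(U[:, :2] * S[:2]), atol=1e-6))
```

Replacing `X_data @ X_data.T` with any other positive semidefinite kernel matrix gives a non-linear variant while the linear algebra stays the same.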
We will comment on issues of sparsity and robustness, and illustrate our presentation with an abundance of applications from industrial and medical data sources.
September 1, 2017, at 10:30, room N430
The University of Texas at Arlington, USA
Title: Improving Answer Quality Prediction in Q/A Social Networks
Abstract: As we all know, searching for good answers is a general problem, and Google provides a good mechanism for it by ranking pages on the internet. Most users do not view more than the top 2 or 3 pages returned by Google. An alternative to search is to ask specific questions on Question/Answer (Q/A) networks and get answers from different individuals (possibly experts). Here you are likely to get a more focused answer to your question, hence the proliferation of Q/A networks since 2000. However, because the quality of answers varies widely, identifying the best answer, or ranking answers in general, is a challenge.
In this talk, we address the problem of answer quality prediction. I will present our approach to ranking answers using question and answer features. We feel that traditional features, classification, and ranking approaches are not well-suited for this problem. Unlike previous approaches, we take into account temporal aspects that characterize user behavior on Q/A networks. We have been able to establish that a smaller number of features can provide better accuracy in ranking answers. This talk is based on the speaker’s research, projects, and experience gained in the process over the last 30+ years.