In contrast to ordinary classification, multilabel classification (MLC) allows more than one class label to be assigned to each example. Consider, for example, the popular task of assigning topic categories, such as economics or politics, to news articles. Apart from the obvious challenge of a potentially large number of possible labels, it is widely accepted that a major issue in learning from multilabel data is the exploitation of label dependencies. However, the first attempts to provide a formal understanding of dependencies in MLC have been made only recently. This talk will give an overview of the foundations of multilabel classification and of the issues that arise when dependencies are taken into account.
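As a rough illustration of what a label dependency means in the news example, the following sketch encodes label sets as 0/1 indicator vectors and compares the joint frequency of two labels with the product of their marginal frequencies. The articles, the label list, and the helper names (`to_indicator`, `marginal`, `joint`) are invented for this illustration and are not taken from the talk.

```python
# Toy multilabel data: each article carries a *set* of topic labels.
# The label sets below are made up purely for illustration.
LABELS = ["economics", "politics", "sports"]

articles = [
    {"economics", "politics"},
    {"economics", "politics"},
    {"politics"},
    {"sports"},
    {"economics"},
    {"sports"},
]


def to_indicator(label_set):
    """Encode a label set as a 0/1 vector ordered like LABELS."""
    return [1 if label in label_set else 0 for label in LABELS]


# One indicator row per article.
Y = [to_indicator(s) for s in articles]


def marginal(j):
    """Empirical frequency of label j."""
    return sum(row[j] for row in Y) / len(Y)


def joint(j, k):
    """Empirical frequency of labels j and k occurring together."""
    return sum(row[j] * row[k] for row in Y) / len(Y)


p_econ = marginal(0)
p_pol = marginal(1)
p_both = joint(0, 1)

# If the two labels were independent, p_both would be close to
# p_econ * p_pol. Here the labels co-occur more often than that.
print(p_both, p_econ * p_pol)
```

In this toy data, economics and politics co-occur more often than independence would predict, while economics and sports never co-occur. A learner that predicts each label in isolation cannot make use of either regularity.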
In the second part of the talk, I will introduce our recent attempts to apply deep neural network techniques to multilabel text classification problems. Text classification is a field in which neural networks commonly show only weak performance. I will introduce our approach, which combines recent advances from deep learning to overcome this problem. In addition, I will present our current line of research, which tries to deal with dependencies originating from hierarchical label structures by using a joint embedding for both documents and labels. Embedding is a technique that learns a projection of words or documents into a low-dimensional space in which semantic relationships are preserved or even revealed.
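The idea of placing documents and labels in one shared space can be sketched as follows. This is only a minimal illustration under strong assumptions: the 2-dimensional vectors are hand-picked rather than learned, cosine similarity and the threshold of 0.8 are arbitrary choices, and nothing here models a label hierarchy or reflects the specific approach presented in the talk.

```python
import math

# Hypothetical 2-d embeddings, hand-picked for illustration (not learned).
label_vecs = {
    "economics": [1.0, 0.0],
    "politics": [0.8, 0.6],
    "sports": [0.0, 1.0],
}
doc_vec = [0.9, 0.2]  # hypothetical embedding of one news article


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Score every label against the document in the shared space.
scores = {label: cosine(doc_vec, vec) for label, vec in label_vecs.items()}

# Rank labels by similarity, then keep those above an assumed threshold.
ranking = sorted(scores, key=scores.get, reverse=True)
threshold = 0.8  # arbitrary cut-off chosen for this example
predicted = [label for label in ranking if scores[label] >= threshold]

print(ranking)
print(predicted)
```

Because related labels such as economics and politics lie close together in the space, a document near one of them also scores highly for the other, which is one way a shared embedding can carry label relationships into the prediction.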
Eneldo Loza Mencia is a postdoc in the Knowledge Engineering Group at the Technische Universität Darmstadt (Germany), where he also defended his thesis on "Efficient pairwise multilabel classification". Apart from multilabel classification and learning by pairwise comparison, his research interests include large-scale classification, perceptron learning, text classification, and information extraction.
Home Page: http://www.ke.tu-darmstadt.de/bibtex/authors/show/3450