Invited Speakers

  • Nico Beerenwinkel, ETH Zürich, Switzerland
    Title: "Analyzing molecular tumor profiles for precision oncology"

  • James Berger, Duke University, USA
    Title: "Gaussian Process Emulation of Computer Models with Massive Output"

  • Andreas Christmann, University of Bayreuth, Germany
    Title: "Kernel based methods in machine learning"

  • Jurgen Doornik, University of Oxford, UK
    Title: "Model building in a changing world"

  • Iryna Gurevych, TU Darmstadt, Germany
    Title: "Latest News in Computational Argumentation: Surfing on the Deep Learning Wave, Scuba Diving in the Abyss of Fundamental Questions"

  • Barbara Hammer, Bielefeld University, Germany
    Title: "Transfer Learning and Learning with Concept Drift"

  • Johannes Hartig, DIPF, Frankfurt, Germany
    Title: "Analysis of data from educational achievement tests with generalized linear mixed models"

  • Luc De Raedt, KU Leuven, Belgium
    Title: TBA

  • Roberto Rocci, University of Rome Tor Vergata, Italy
    Title: "Finite mixtures for simultaneous clustering and reduction"

  • Elena Simperl, University of Southampton, UK
    Title: "Data analysis with humans in the loop"

  • Yuan Sun, National Institute of Informatics, Japan
    Title: "Learning oriented diagnostic assessment based on students' response data"

SPECIAL SESSIONS (confirmed so far)

  • Advances in Recursive Partitioning and Related Methods
    Organization: Claudio Conversano

    Recursive Partitioning Methods (RPM) are computer-intensive data-mining tools originally designed for analyzing large databases characterized by mixed (numerical and categorical) variables and nonlinear relationships. Searching for unknown patterns, predicting the distribution of homogeneous sub-groups of outcomes and identifying factors that distinguish particular sub-groups are among the main features that have made these methods increasingly popular in Data Science applications. The track focuses on specific issues related to the use of RPM, such as model-based RPM, stability and visualization of RPM outcomes, boosted RPM and the use of RPM in semi-supervised clustering.

  • Applications in Digital Humanities
    Organization: Michaela Geierhos

    In this special session, we would like to discuss possible applications in digital humanities that go beyond theoretical questions. What do digital humanities offer for practice? Which tools and projects exist, and what should be considered when analyzing data for an interdisciplinary research question? Presenting projects and software tools, their development history and the challenges faced helps to uncover new aspects of the practical side of digital humanities.

  • Big Data and Complex Network Analytics
    Organization: Martin Atzmüller

  • Bioinformatics and Biostatistics
    Organization: Dominik Heider

    This special session invites contributions from all aspects of biostatistics and bioinformatics, with a special emphasis on machine learning and statistical learning for biomedical problems, ranging from biotechnological applications to medical diagnostics and prognostics. We further encourage work describing novel methods for preprocessing biomedical data, e.g., feature selection methods.

  • Comparison and Benchmarking of Cluster Analysis Methods
    Organization: Christian Hennig

    There is already a huge and still growing variety of cluster analysis methods. The situation cries out for systematic comparison and benchmarking of such methods using simulations and benchmark datasets. Some such comparisons exist, but they are mostly patchy and unsystematic. Benchmarking cluster analysis methods is inherently more difficult than benchmarking supervised classification. In cluster analysis, there is usually no single true grouping; rather, different aims of clustering may lead to different clusterings of the same dataset, each of which could be optimal according to a different criterion. Benchmarking cluster analysis needs to take this into account. The track encourages researchers to share their work, experiences and thoughts on the systematic evaluation of clustering methods.

  • Consumer Preferences and Marketing Analytics
    Organization: Friederike Paetz, Daniel Guhl

    This track invites methodological, theoretical or empirical papers that aim to contribute to the general understanding of consumer preferences. The focus is on methods such as conjoint analysis or discrete choice analysis in a marketing context that aim at informing and improving management decisions. However, we also encourage work that deals with other quantitative techniques for extracting preference information from consumer data.

  • Data Analysis Models in Economics and Business
    Organization: Jozef Pociecha

    This session is intended for those who work on applications of data analysis and classification models, machine learning procedures, multivariate time series and other multivariate methods in various areas of economic and business research. We welcome examples of new approaches to this type of empirical investigation that are useful for both analytics and practice.

  • Data Analysis in Finance
    Organization: Krzysztof Jajuga

  • Data Science for Mental Health
    Organization: Fionn Murtagh

    Mental health and mental well-being are the main themes of this session. Also relevant are poor health due to dementia and Alzheimer's disease, and the consequences and repercussions of the state of mental health. Developments in methodology will be relevant, especially new developments relating to Big Data analytics. Data Science has, and will have, much to offer for health and well-being, for psychoanalysis, for mental capital in the social sciences, and for cognitive science and the neurosciences.

  • Dimension Reduction and Visualisation for Classification
    Organization: Niel le Roux

    Visualization of multidimensional data requires some form of data reduction technique. These low-dimensional visualizations are useful in supervised and unsupervised learning situations. They are invaluable tools for revealing cluster structure, selecting an appropriate model and quantifying the separation or overlap of predefined groups. In general, more than a single visualization is needed for a particular data set. Attention is focused on recently developed methods such as computationally efficient procedures for finding optimal hyperplane separators; the δ-machine, which uses dissimilarity measures for classification; and correlation-based distance measures and component analysis techniques similar to canonical correlation analysis for quantifying neural reliability. In addition, the visualization of incomplete data, categorical and continuous, is considered using generalized orthogonal Procrustes analysis and biplots related to correspondence analysis, as is the summarizing of multivariate binary data using latent variables.

  • Mining Streaming and Time-Evolving Data
    Organization: Barbara Hammer, Georg Krempl, Jerzy Stefanowski

    Recent years have seen a steep increase in the availability of data, which is often generated sequentially and by non-stationary processes. Mining such streaming and time-evolving data often requires considering the ordering and temporal context of instances, which has been the subject of different lines of research. In data stream mining, the focus of analysis is mostly on the recently observed instances, and instances have a limited life cycle due to adaptation and forgetting mechanisms. In time series analysis, on the other hand, the internal structure of data points taken over time is considered. While the focus of this special session is on the problem of data stream mining, its aim is to present novel ideas from both lines of research that stimulate exchange between their communities.

  • Recent Developments in Longitudinal Data Analysis in Psychology
    Organization: Casper Albers

    With the recent advancement of technologies such as smartphones and other smart devices, psychological research is more and more often based on intensive longitudinal data. In these data, individuals fill in items about their experiences and affect in daily life. Such data have higher ecological validity than data from retrospective studies. These rich data are then analyzed to study the dynamics of, for example, emotion and psychopathology. One object of study is temporal dynamics. When studying temporal dynamics, the focus is not on detecting a gross underlying trend, as is often the case in developmental research, but rather on the intricate temporal dependence of and between variables such as emotions, or on how variables within an individual influence each other or themselves over time. Although these studies are rising in popularity, techniques to study the dynamics are lagging behind. Because psychological time series are quite often of a different nature than, e.g., economic time series, models from other fields of science cannot be imported without adaptation. The aim of this special session is to bring together scientists working on the methodology behind psychological longitudinal data and to discuss state-of-the-art methods for analysing such data.

  • Statistical Aspects of Machine Learning Methods
    Organization: Florian Dumpert

  • Statistical Learning with Imprecision
    Organization: Thomas Augustin, Sebastian Destercke

    In this session, we are interested in learning situations where imprecision, or indecision, plays a key role in the data analysis process. This may be due to partial data whose missingness process is atypical, to the need to provide robust conclusions when probabilities are only partially specified, etc. We welcome contributions of a theoretical, methodological or applied nature in which imprecision is processed explicitly. This imprecision may occur in the data themselves, in the model or in the predictions produced by the model.

  • Statistical Visualization for Data Science
    Organization: Koji Kurihara, Adalbert Wilhelm

  • Time Series Analysis and Online Algorithms
    Organization: Wolfgang Konen

    With the increase of data streams in economics and industry, the data science of time series is becoming more and more important. In this special session we would like to deal with new developments in time series analysis, with a special focus on online algorithms, that is, algorithms that can adapt while the time series are being processed. Tasks, topics and application areas include, but are not limited to: time series classification, prediction, motif discovery, anomaly detection, predictive maintenance, online algorithms for the preceding tasks, multivariate time series and the internet of things.

  • Web Science
    Organization: Axel Ngonga