Human Behavior Understanding Using Mobile Phone and Social Media Data

Gianni Barlacchi

Machine Learning, Kernel Methods, Data Analysis, Natural Language Processing

Human behavior understanding is a challenging task that aims at creating automatic models to capture how interactions among people affect the happiness, health and economic well-being of our society. The global growth of mobile phone usage has reinforced the need to study the psychological and social implications of this technology. Almost all of the resulting data is owned by telecommunication companies, which collect Call Detail Records (CDRs) from the millions of mobile phones used around the world. Furthermore, the recent growth of geo-located social networks, e.g., Twitter and Foursquare, has introduced the possibility of combining social media texts with CDRs. This research aims at investigating novel machine learning models that jointly use CDRs and social media data (mainly textual information) for human behavior understanding.




Machine Learning for Neuroscience

Danilo Benozzo

Machine Learning, Signal Processing, Neuroscience

Assessing causal interactions among brain time series




Machine Learning in Diffusion MRI Data

Methods for automatic segmentation of white matter tracts

Giulia Bertò

Diffusion MRI data, Tractography, White Matter Tract Segmentation, Brain Connectivity

Diffusion MRI tractography has become of great importance for studying in vivo the anatomical connectivity of the human brain, both in healthy and in diseased conditions. From the whole-brain tractography it is possible to extract individual white matter tracts with a procedure called segmentation. Our work aims to propose new methods to automatically segment specific white matter tracts in one subject given existing data coming from other subjects. To validate our results, we will take advantage of the anatomical ex-vivo dissection performed by neurosurgeons. A further step will be to develop a tool that can be used to support an interactive automatic segmentation.





Machine Learning for Investigating Post-Transcriptional Regulation of Gene Expression

Gianluca Corrado

Machine Learning, Bioinformatics, Gene Expression

RNA binding proteins (RBPs) and non-coding RNAs (ncRNAs) are key actors in post-transcriptional gene regulation. By being able to bind messenger RNA (mRNA), they modulate many regulatory processes. In recent years, the increasing interest in this level of regulation has favored the development of many NGS-based experimental techniques to detect RNA-protein interactions, and the consequent release of a considerable amount of interaction data on a growing number of eukaryotic RBPs. Despite the continuous advances in the experimental procedures, these techniques are still far from fully uncovering, on their own, the global RNA-protein interaction system. For instance, the available interaction data still covers a small fraction (less than 10%) of the known human RBPs. Moreover, experimentally determined interactions are often noisy and cell-line dependent. Importantly, obtaining genome-wide experimental evidence of combinatorial interactions of RBPs is still an experimental challenge. Machine learning approaches are able to learn from the data and generalize the information it contains. This might give useful insights to help the investigation of post-transcriptional regulation. In this work, we propose three machine learning contributions, aimed at addressing the three above-mentioned shortcomings of the experimental techniques, to help researchers unveil some yet uncharacterized aspects of post-transcriptional gene regulation. The first contribution is RNAcommender, a tool capable of suggesting RNA targets for unexplored RBPs at a genome-wide level. RNAcommender is a recommender system that propagates the available interaction data, considering biologically relevant aspects of the RNA-protein interactions, such as protein domains and RNA predicted secondary structure. The second contribution is ProtScan, a tool that models RNA-protein interactions at a single-nucleotide resolution. Learning models from experimentally determined interactions makes it possible to denoise the data and to predict the RBP binding preferences in conditions that are different from those of the experiment. The third and last contribution is PTRcombiner, a tool that unveils the combinatorial aspects of post-transcriptional gene regulation. It extracts clusters of mRNA co-regulators from the interaction annotations, and it automatically provides a biological analysis that might supply a functional characterization of the set of mRNAs targeted by a cluster of co-regulators, as well as of the binding dynamics of different RBPs belonging to the same cluster.




Constructive Recommendation

Machine Learning

Paolo Dragone

Recommendation, machine learning

Recommendation systems are a well-established technology. Content-based recommender systems use machine learning algorithms to estimate the user's preferences over the features of the items, using her feedback on previously examined ones. The underlying assumption of most modern recommendation systems is the availability of an explicit set of items to choose from and a set of features to represent them. However, this assumption does not hold when the set of possible items is exponentially large. One such case is when the recommendation task is constructive, i.e. the recommended items are "created" on the fly to satisfy the interest of the user. Possible applications of constructive recommendation range from personalized shopping basket creation to automated travel planning systems. The field of constructive machine learning is rather new and its application to recommendation is still largely unexplored. This research proposal aims at addressing the unsolved problems in the field of constructive machine learning oriented to recommendation. Moreover, we plan to develop applications of constructive recommendation suitable for real users. This document outlines the general research trend in constructive recommendation, the objectives of this research line and a set of promising preliminary results.





Machine learning for spatiotemporal prediction in geomarketing and environment

Gabriele Franch

Deep Learning, Machine Learning, GIS

Development of a predictive analytic approach based on machine learning to model the association between mobility, environment and human activity. In particular, my research will focus on novel methods for the prediction of spatiotemporal patterns based on the integration of deep learning with geospatial technologies over high-frequency environmental and telecommunications data (Call Detail Records or location data from mobile apps). The research will address the application to real-time analytics of high-frequency retail or public service data, and to the characterization of cities and territories at diverse temporal, geospatial, and social scales. The work requires a highly multidisciplinary approach, leading to improvements in extracting patterns from mobile phone data and radar patterns, automatic synthesis of features in machine learning, and actionable graphics within dashboards, with expected impact on fields such as marketing, urban science, agriculture and civil protection. This project also has clear business applications, which will be explored on retail data streams in the pharma domain and other areas to demonstrate the approach. The project is developed during an internship at FBK in its Complex Data Analytics research line, with the support of experts in Predictive Models for Big Data, GIS, and Mobile and Social Computing.




Structured Output Learning

Structured Learning for Coreference Resolution

Iryna Haponchyk

Complex prediction tasks such as coreference resolution in NLP are tackled with learning algorithms that perform inference in structured output spaces. These algorithms produce the complex output object all at once, globally. The aim of the project is to develop methods for integrating structured learning with task-specific knowledge.





Brain Decoding for Brain Mapping

Definition, Heuristic Quantification, and Improvement of Interpretability in Group MEG Decoding

Seyed Mostafa Kia

Brain Decoding; Brain Mapping; Neuroimaging; Machine Learning; Magnetoencephalography; Interpretability; Reproducibility

Over the last century, a huge multi-disciplinary scientific endeavor has been devoted to answering long-standing questions about how the brain functions. Among the statistical methods used for this purpose, brain decoding provides a tool to predict the mental state of a human subject based on the recorded brain signal. Brain decoding is widely applied in the contexts of brain-computer interfacing, medical diagnosis, and multivariate hypothesis testing on neuroimaging data. In the latter case, linear classifiers are generally employed to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of brain maps to further study the spatio-temporal patterns of the underlying neurophysiological activity. It is well known that the brain maps derived from the weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal-to-noise ratio, across-subject variability, and the high dimensionality of the neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies.





Smart Feature Selection

Developing a robust and scalable framework for feature and hyperparameter selection

Andrea Mariello

Machine Learning, Optimization, Feature Selection, Information Theory

Nowadays, we are experiencing a growing interest in data science, a relatively new discipline at the intersection of Statistical Learning, Engineering and Operations Research in which practitioners develop and use techniques and algorithms to extract useful insights from an increasing number of huge collections of data. The majority of the datasets are characterized by a large number of high-dimensional patterns, such as those found in genetics, chemistry and finance. Others are also characterized by a high level of noise or by missing values. The objective of our research is therefore to develop techniques for automatically extracting from those datasets the information that provides the highest value. This also translates to better generalization and human interpretability of machine learning models. Our theoretical framework includes concepts from information theory (entropy and mutual information), heuristics based on nearest neighbors, and optimization techniques for automatically tuning the hyperparameters of a model.
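
As an illustration of how entropy and mutual information can be used to rank features, the following is a minimal sketch and not the framework developed in this research. It assumes discrete features and a discrete target; the function names (entropy, mutual_information, rank_features) and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two discrete sequences of equal length."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def rank_features(columns, target):
    """Return feature names sorted by decreasing mutual information with the target."""
    scores = {name: mutual_information(col, target) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy data (hypothetical): one feature duplicates the target, one is independent of it.
    target = [0, 0, 1, 1, 0, 0, 1, 1]
    columns = {
        "independent": [0, 1, 0, 1, 0, 1, 0, 1],
        "copy_of_target": [0, 0, 1, 1, 0, 0, 1, 1],
    }
    for name in rank_features(columns, target):
        print(name, mutual_information(columns[name], target))
```

In this toy example the feature that duplicates the target carries one bit of information about it, while the independent feature carries none, so the former is ranked first. Continuous features would need discretization or a different estimator, which this sketch does not cover.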




Managing the Scarcity of Monitoring Data through Machine Learning in Healthcare Domain

Alban Maxhuni

Intermediate Models, Semi-supervised Learning, Transfer Learning

In the field of Ubiquitous Computing, a significant obstacle to building accurate machine learning models is the effort and time required to gather labeled data for the learning algorithm. In healthcare, classification tasks require a ground truth normally provided by an expert physician, which results in a small set of labeled data alongside a larger set of unlabeled data. We propose using our novel Intermediate Models to predict the mood variables associated with the questionnaire using data acquired from smartphones. To address data scarcity, we propose applying a semi-supervised learning setting, which takes advantage of the available unlabeled data. In addition, we propose using transfer learning to improve learning performance, with the aim of avoiding expensive data labeling efforts.




Machine Learning for Body Computing

Applying Deep Learning to Stereotypical Motor Movement Detection in Autism Spectrum Disorder

Nastaran Mohammadian Rad

Deep Learning, Machine Learning, Ubiquitous Computing, Stereotypical Motor Movement, Autism Spectrum Disorder

Stereotypical Motor Movements (SMMs) are abnormal postural or motor behaviors that interfere with learning and social interaction in patients with autism spectrum disorder. In recent years, wireless inertial sensing technology has offered a valid infrastructure for abnormal behavior detection. My research goal is to use such technologies in combination with machine learning algorithms to develop a real-time SMM detection system.




Learning and Inference in Hybrid Domains

Paolo Morettin

Machine Learning, Logics, Optimization

My research is concerned with structured learning and inference in hybrid domains, characterized by both continuous/discrete variables and logical/algebraic constraints.




Video Understanding through machine learning and computer vision approaches

Negar Rostamzadeh

video analysis, action recognition, semantic segmentation, deep learning, computer vision, machine learning

Automatic video scene understanding and activity analysis are active research topics in computer vision. The interest in video analysis is motivated by the promise of important applications in several fields, such as patient monitoring and ambient assisted living. My research focuses on deep-learning and machine-learning applications in computer vision for video understanding. One of the scenarios to which my research is applied is the analysis of a single person's behavior over a long period of time in order to (i) summarize the activities of the person over time to understand her/his daily routine; (ii) detect subtle activities and sudden changes in her/his life. To do so, we both recognize and detect activities at a fine-grained temporal scale by exploiting the spatio-temporal patterns in the video.





Towards fully unsupervised methods

Emanuele Sansone

Understanding how to exploit unlabeled data to learn good representations that reduce the amount of supervision required in classification tasks.




Machine Learning in Diffusion MRI Data

Correspondence among Connectomes as Combinatorial Optimization

Nusrat Sharmin

The white matter pathways of the brain can be reconstructed as 3D polylines, called streamlines, through the analysis of diffusion magnetic resonance imaging (dMRI). The whole set of streamlines is called a tractogram and represents the structural connectome of the brain. The statistical analysis of diffusion data is used in multiple applications, e.g. to observe age-related changes and to find correlations with diseases. Studies of this kind require the analysis of groups of subjects, which creates two important problems: aligning tractograms across subjects, a problem called tractogram alignment, and extracting the streamlines of interest, called the tract segmentation problem. Due to the anatomical variability across subjects, these two problems are difficult to solve. In our work, we mainly investigate these two problems.
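
To make the idea of streamline correspondence as a combinatorial problem concrete, the following is a minimal sketch under strong assumptions, not the method developed in this work. It assumes that both tractograms contain the same number of streamlines, that every streamline is resampled to the same number of 3D points, and that a brute-force search over assignments is acceptable, which only holds for tiny inputs. The function names and the distance choice (mean point-wise distance, taking the smaller of the direct and flipped orderings) are illustrative choices.

```python
import math
from itertools import permutations


def streamline_distance(a, b):
    """Mean point-wise Euclidean distance between two polylines with equal point counts.

    A streamline has no preferred direction, so the smaller of the direct
    and the flipped comparison is returned.
    """
    def mean_dist(p, q):
        return sum(math.dist(u, v) for u, v in zip(p, q)) / len(p)

    return min(mean_dist(a, b), mean_dist(a, b[::-1]))


def best_correspondence(tractogram_a, tractogram_b):
    """Return the one-to-one assignment that minimizes the total distance.

    The result is a list where position i holds the index in tractogram_b
    matched to streamline i of tractogram_a. Brute force: factorial cost.
    """
    n = len(tractogram_a)

    def total_cost(perm):
        return sum(
            streamline_distance(tractogram_a[i], tractogram_b[j])
            for i, j in enumerate(perm)
        )

    return list(min(permutations(range(n)), key=total_cost))


if __name__ == "__main__":
    # Two toy tractograms (hypothetical): the second stores the same two
    # pathways in swapped order, slightly shifted, one of them reversed.
    subject_a = [
        [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
        [(0, 5, 0), (1, 5, 0), (2, 5, 0)],
    ]
    subject_b = [
        [(2, 5.1, 0), (1, 5.1, 0), (0, 5.1, 0)],
        [(0, 0.1, 0), (1, 0.1, 0), (2, 0.1, 0)],
    ]
    print(best_correspondence(subject_a, subject_b))
```

For the toy data the assignment found is [1, 0]: each streamline of the first subject is matched to its shifted counterpart in the second, regardless of storage order or point direction. Real tractograms contain a very large number of streamlines, so exhaustive search is infeasible and an efficient assignment solver or an approximation would be needed.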