An Online Peer-Assessment Methodology for Improved Student Engagement and Early Intervention

Michael Mogessie Ashenafi

Peer-assessment, student engagement, early intervention, prediction model, learning analytics, higher education

Student performance at all levels of education is commonly measured using summative assessment methods such as midterm and final exams, as well as high-stakes testing. Although not as popular as summative assessment, other methods of gauging student performance exist. Formative assessment is a continuous, student-oriented form of assessment that focuses on helping students improve their performance by promoting continuous engagement and by measuring progress constantly. One assessment practice that has been used in this manner for decades is peer-assessment, which relies on having students evaluate the work of their peers. The work that is evaluated, as well as the level of education, may vary across practices; the work discussed in this thesis, however, focuses on peer-assessment in higher education. Despite its cross-domain adoption and its longevity, peer-assessment has been difficult to apply to courses with a high number of students. This stems directly from the fact that it has been used in traditional classes, where assessment is usually carried out with pen and paper. Such a manual form of peer-assessment requires a significant amount of time and ultimately adds to both instructor and student burden. Automated peer-assessment, on the other hand, has the advantage of reducing, if not eliminating, many efficiency and effectiveness issues of traditional peer-assessment. Moreover, its potential to scale up easily makes it a promising option for conducting new experiments or replicating existing ones. The aim of this research is to demonstrate how the potential of automated peer-assessment can be exploited to improve student engagement, and to show how a well-designed peer-assessment methodology may give teachers early insight into students who may have difficulty completing the course successfully.
A methodology is developed to demonstrate how online peer-assessment may elicit continuous student engagement. Data collected from a web-based implementation of this methodology are then used to construct several models that predict performance and monitor student progress throughout a course, highlighting the practice's capacity to serve as a tool for early intervention. Finally, a promising role of the methodology in measuring student proficiency and test item difficulty is demonstrated by applying a generic Item Response Theory model to the peer-assessment data.
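As an illustration of the kind of Item Response Theory model mentioned above, the following is a minimal sketch of fitting a one-parameter (Rasch) model to a binary student-by-item response matrix by gradient ascent on the log-likelihood. This is not the thesis's actual implementation; all names are hypothetical.

```python
import numpy as np

def fit_rasch(responses, n_iter=500, lr=0.05):
    """Fit a 1PL (Rasch) model: P(correct) = sigmoid(ability - difficulty).

    responses: binary matrix, rows = students, cols = assessed items.
    Returns estimated student abilities and item difficulties.
    """
    n_students, n_items = responses.shape
    ability = np.zeros(n_students)
    difficulty = np.zeros(n_items)
    for _ in range(n_iter):
        logits = ability[:, None] - difficulty[None, :]
        p = 1.0 / (1.0 + np.exp(-logits))
        resid = responses - p             # gradient of the log-likelihood
        ability += lr * resid.sum(axis=1)
        difficulty -= lr * resid.sum(axis=0)
        difficulty -= difficulty.mean()   # anchor the scale (identifiability)
    return ability, difficulty
```

Students who answer more items correctly receive higher ability estimates, and items answered correctly by fewer students receive higher difficulty estimates, which mirrors the proficiency/difficulty decomposition the abstract describes.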





Bring the Algorithm to the Data

Lorenzo Candeago

open algorithm, multi-party computation, blockchain

OPAL (Open Algorithms) is a project developed at MIT that aims to build a platform for privacy-preserving data sharing. Instead of sending the data out for analysis, the algorithm is sent to the data and run behind the data owner's firewall; only the result of the computation is returned. Data can also be distributed in encrypted form over a peer-to-peer network, and analytics can be performed on the encrypted data through secure multi-party computation. A blockchain is used as a ledger to log the queries sent to the OPAL engine.
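One standard building block of the secure multi-party computation mentioned above is additive secret sharing. The toy sketch below (not OPAL's actual protocol; all names are illustrative) shows how several data owners can reveal the sum of their values without any party ever seeing an individual value.

```python
import random

MODULUS = 2**31 - 1  # shared prime modulus (illustrative choice)

def share(secret, n_parties, modulus=MODULUS):
    """Split an integer into n additive shares that sum to the secret mod p."""
    shares = [random.randrange(modulus) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % modulus)
    return shares

def reconstruct(shares, modulus=MODULUS):
    """Recombine shares; any proper subset reveals nothing about the secret."""
    return sum(shares) % modulus

def secure_sum(secrets, n_parties=3, modulus=MODULUS):
    """Each owner splits its value across parties; parties add share-wise,
    so only the total is ever reconstructed, never an individual value."""
    all_shares = [share(s, n_parties, modulus) for s in secrets]
    partials = [sum(col) % modulus for col in zip(*all_shares)]
    return reconstruct(partials, modulus)
```

Each individual share is uniformly random, which is what makes the scheme privacy-preserving: only the aggregate leaves the parties, matching the "only the result of the computation is returned" principle.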





Distributed streaming analysis from a neuroscientific perspective

A cognitive approach to stream analysis

Alessandro Ercolani

big data, machine learning, computational neuroscience, distributed streaming, streaming analysis

The analysis of data streaming from online instruments, large-scale simulations, and distributed sensors now enables near-real-time steering and control of complex systems such as scientific experiments, transportation systems, and urban environments. As data volumes grow exponentially and the Moore's law coefficient decreases, a cognitive approach to data engineering and analysis is establishing itself as an emerging field of research. The brain's processing of parallel input coming from body sensors is extraordinarily efficient. My research will focus on studying new algorithms for orchestrating and analysing parallel streams, taking inspiration from computational neuroscience.




Contextualizing Data Quality Evaluation

Daniele Foroni

Data quality, data analysis, data analytics, data mining, big data, data management

Data quality is one of the main issues that arise in database management. It has been studied in depth in the literature, and many characteristics of the data have been analyzed to estimate the goodness of the data itself. However, the road to a complete understanding of the quality of a dataset remains unfinished. In fact, it has been shown that the quality of the data depends on the context in which the data is used. Thus, we propose an analysis of the correlation between a set of data quality characteristics and the quality of the output of a task. This is performed through an injection of noise into the data, in order to evaluate how much each characteristic affects each task.
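The noise-injection idea can be sketched as follows, assuming a hypothetical record-based dataset and a task expressed as a function of the data; the names and the corruption strategy are illustrative, not the actual framework.

```python
import random

def inject_noise(records, field, noise_rate, corrupt):
    """Return a copy of `records` where `field` is corrupted with
    probability `noise_rate`, leaving the original data untouched."""
    noisy = []
    for rec in records:
        rec = dict(rec)
        if random.random() < noise_rate:
            rec[field] = corrupt(rec[field])
        noisy.append(rec)
    return noisy

def quality_impact(records, field, task, corrupt, rates=(0.0, 0.2, 0.5)):
    """Run `task` on increasingly noisy versions of the data to measure
    how much this one characteristic affects the task's output quality."""
    return {r: task(inject_noise(records, field, r, corrupt))
            for r in rates}
```

Repeating this per quality characteristic (completeness, accuracy, consistency, ...) and per task yields the correlation analysis the abstract proposes: each (characteristic, task) pair gets a degradation curve.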




Diffusion Network Inference and Deep Cascade Analysis

Zekarias Tilahun Kefato

Network Inference, Influence Analysis, Cascade prediction

The diffusion of a contagion is a common phenomenon in both the cyber and natural spaces. Irrespective of the contagion (a meme, a hashtag, a biological virus), the process is always the same: a diffusion, or cascade, occurs as a result of interaction between agents over a diffusion network. Unfortunately, the diffusion network is often unknown: one can observe when agents are infected by a given contagion (e.g., when a piece of information arrives, when a product is adopted, when a virus is caught), but not how the infection was transmitted. The goal of this research is to infer such a network from the contagion events and their relative ordering. Towards this end, we devise a neural network model to learn a representation of nodes that captures their context in terms of which other nodes are infected with them. Ultimately, we use the learned representations to infer possible links between pairs of nodes. Once the latent network is inferred, one can apply different kinds of network analysis to it, such as cascade prediction and influence maximization.
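The abstract's model is neural, but the underlying intuition (nodes that are frequently infected together likely share a latent edge) can be illustrated with a much simpler co-occurrence baseline. This sketch is not the proposed model; names and the 0.5 threshold are arbitrary choices for illustration.

```python
from collections import defaultdict
from math import sqrt

def infer_links(cascades, threshold=0.5):
    """Score candidate edges by how often two nodes appear in the same
    cascade, normalized by how often each node appears at all (cosine
    similarity on cascade-incidence vectors). Pairs scoring at or above
    `threshold` become inferred edges of the latent network."""
    count = defaultdict(int)   # cascades containing node u
    co = defaultdict(int)      # cascades containing both u and v
    for cascade in cascades:
        nodes = set(cascade)
        for u in nodes:
            count[u] += 1
        for u in nodes:
            for v in nodes:
                if u < v:
                    co[(u, v)] += 1
    score = {pair: c / sqrt(count[pair[0]] * count[pair[1]])
             for pair, c in co.items()}
    edges = {pair for pair, s in score.items() if s >= threshold}
    return edges, score
```

A learned embedding replaces the raw incidence vectors with dense representations, which generalizes to node pairs that never co-occur directly, but the scoring step (similarity between node representations) is analogous.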




Extraction & Exploitation of Goals and Intentions

For Item Retrieval, Recommendations, and Querying

Dimitra Papadimitriou

Data Management, Big Data, and Databases

Extraction and exploitation of user goals and intentions in data management and data mining tasks with a focus on item retrieval, recommendations, and querying.




Activity Analytics Based on Mobility and Social Media Data

Pavlos Paraskevopoulos

mobile data, social networks, geolocalize, tweets

My name is Pavlos Paraskevopoulos and I am a fifth-year PhD student at the University of Trento, under the supervision of Prof. Themis Palpanas (currently a full professor at Paris Descartes University, France). My current work is in the field of social data analytics, with a special focus on human behavior analysis based on data deriving from diverse streaming sources, such as mobile phone usage (e.g., number of calls, SMS, internet) and social media activity (e.g., tweets). I have developed a novel method for analyzing and geolocalizing non-geotagged Twitter posts. The proposed method is the first to do so at the fine grain of neighborhoods (i.e., squares of 1x1 km) while being both effective and time-efficient. Geolocalizing more posts could give us a better picture of an event's impact, while revealing actionable insights hidden in the non-geotagged posts.
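A plausible pre-processing step for neighborhood-level work like this is mapping coordinates onto a roughly 1x1 km grid. The sketch below is only an illustration of that discretization, not the author's geolocalization method; the function name and constants are assumptions.

```python
import math

def grid_cell(lat, lon, cell_km=1.0):
    """Map a (lat, lon) coordinate to an integer grid-cell id of roughly
    cell_km x cell_km. One degree of latitude is ~111.32 km, while a
    degree of longitude shrinks with cos(latitude)."""
    km_per_deg_lat = 111.32
    km_per_deg_lon = 111.32 * math.cos(math.radians(lat))
    row = int(lat * km_per_deg_lat // cell_km)
    col = int(lon * km_per_deg_lon // cell_km)
    return row, col
```

Posts (geotagged or geolocalized) that fall into the same cell can then be aggregated per neighborhood, which is what enables event-impact analysis at that spatial resolution.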




Real-time Anomaly Detection on Heterogeneous Data Streams


Sivam Pasupathipillai

Anomaly Detection, Stream Mining, Big Data, Data Management

Massive amounts of data are being collected every day all over the world. The ability to extract novel and useful information from these data has become a key success factor in data-driven companies. Moreover, traditional batch processing approaches are reaching their limits, due to the explosive growth of data volume and velocity. The purpose of this research is to define and implement software architectures that enable real-time anomaly detection on Big Data streams, where an anomaly is defined as a rare and interesting pattern. For example, the identified solutions should enable the identification and characterization of events in a city, in order to evaluate their actual impact.
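A minimal single-stream baseline for "rare pattern" detection is an online z-score: flag a value far from the running mean, maintained with Welford's algorithm so memory stays O(1) regardless of stream size. This is a toy illustration of streaming (as opposed to batch) processing, not the architecture under research; the class name and thresholds are assumptions.

```python
import math

class StreamAnomalyDetector:
    """Flags values far from the running mean via an online z-score."""

    def __init__(self, z_threshold=3.0, warmup=10):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0                  # sum of squared deviations (Welford)
        self.z_threshold = z_threshold
        self.warmup = warmup           # observations before flagging starts

    def observe(self, x):
        """Return True if x looks anomalous, then fold it into the stats."""
        anomalous = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.z_threshold:
                anomalous = True
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return anomalous
```

Handling heterogeneous streams then becomes a question of architecture (one detector per source, plus correlation across sources), which is where the research questions above lie.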




Human behavior understanding using mobile data and web usage patterns

Investigating a new user-centric paradigm in privacy and personal data management

Christos Perentis

personal data, privacy, personal data management, human behavior, living labs

The wide adoption of mobile devices and their capability of collecting personal and contextual information have resulted in a massive production of personal data (PD). The availability of such a huge amount of data represents an invaluable resource and opportunity for organizations and individuals to enable new applications and move toward Personal Big Data scenarios. However, this also raises new and significant privacy concerns, since users are currently excluded from the life-cycle of their data, relegated to the role of PD producers with limited possibility to control and exploit them. This situation calls for a shift toward a new paradigm for privacy and PD management, in order to unlock the value of PD and to provide effective and transparent control over privacy, enabling virtuous exploitation opportunities. We will face the many challenges and research questions arising from this scenario by (i) understanding the parameters affecting users' attitudes toward privacy; (ii) monitoring and analysing their behaviour in relation to their traits (personality, privacy concerns, risk perception, etc.) and their social network influence; and (iii) developing interventions to investigate behavioural change in response to different stimuli. Our study will be supported by continuous analysis of a large community of real users. The results of this work will serve as the basis for the development of novel user-centric paradigms for next-generation PD management services.




Preference Graph Mining

Giulia Preti

Graph Mining, User Preferences, Pattern Mining

Graphs are widely used to easily model complex relationships in various domains. We consider graphs whose edges are associated with weights that encode specific user preferences, investigate three major data mining tasks, namely graph clustering, graph pattern mining, and subgraph matching, and underline issues and possible applications. The dynamic nature of the user interests and their multiplicity pose unique challenges to solving these problems, and therefore we study how to implement efficient and effective algorithms that overcome them.




Community Evolution Detection in Temporal Attributed Graphs

Nasrullah Sheikh

Community Detection, Temporal Attributed Graphs

The communities formed by nodes in a graph provide helpful insights to better understand the graph itself. Real-world graphs such as social networks show dynamic behavior over time in both their structure and their attributes. We call such networks temporal attributed graphs, or TAGs. The problem dealt with in this proposal is the study of community evolution in TAGs, with the goal of identifying major events in the community life-cycle (e.g., births, merges, splits). Existing work in this area presents two shortcomings: first, dynamic community detection is based on static techniques applied to a sequence of snapshots; second, structure and attributes are often considered separately. Due to these issues, the pattern of community evolution is lost, and the quality of the detected communities is poor because both sources of information are not exploited. Therefore, in this work we propose a framework for community evolution in TAGs aimed at achieving a more precise evaluation of community evolution patterns.
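For concreteness, the snapshot-based, structure-only baseline that the proposal aims to improve on can be sketched as follows: communities in consecutive snapshots are matched by Jaccard overlap, and each current community is labeled with a life-cycle event. All names and the 0.3 threshold are illustrative.

```python
def jaccard(a, b):
    """Overlap between two node sets in [0, 1]."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def community_events(prev, curr, match_threshold=0.3):
    """Label each community in snapshot `curr` against snapshot `prev`:
    birth (no sufficiently overlapping predecessor), merge (several
    predecessors), or continuation (exactly one)."""
    events = []
    for c in curr:
        parents = [p for p in prev if jaccard(p, c) >= match_threshold]
        if not parents:
            events.append(("birth", c))
        elif len(parents) > 1:
            events.append(("merge", c))
        else:
            events.append(("continue", c))
    return events
```

This baseline exhibits exactly the two shortcomings the abstract identifies: it compares discrete snapshots rather than modeling temporal dynamics, and it ignores node attributes entirely.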




Large-Scale Entity Linkage in Evolving Datasets

Paolo Sottovia

Entity resolution, data evolution, data mining, knowledge extraction

The ability to recognize that two data structures represent the same real-world entity is of paramount importance in the database community. This problem has been studied extensively for decades, but only a few approaches consider data that evolve over time, i.e., data entries that describe aspects of real-world entities valid at a specific time. Previous approaches assume that an entity can evolve only by changing its attribute values over time; however, an entity can also disappear or dissolve into several parts, which may then join other entities or create new ones. The goal is to use the movement of attributes across entities to determine how they evolve.