Understanding and Exploiting Language Diversity

Khuyagbaatar Batsuren

language diversity, computational lexical semantics

Languages are well known to be diverse on all structural levels, from the smallest (phonemic) to the broadest (pragmatic). We propose a formal, quantitative method for evaluating the degree of generality or locality of linguistic phenomena. The mainstay of our approach is a multi-faceted measure of diversity in language sets. We apply our method to lexical semantics, where we show how evidence of a high degree of universality within a given language set can be used to extend lexico-semantic resources in a precise, diversity-aware manner. We demonstrate our approach through a case study on polysemes and homographs among cases of lexical ambiguity. Contrary to past research, which focused solely on exploiting systematic polysemy, the notion of universality provides us with an automated method also capable of predicting irregular polysemes.
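To make the idea of a diversity-aware universality measure concrete, here is a minimal illustrative sketch (not the author's actual metric): a phenomenon scores highly only when it is attested across language families, so that many closely related languages cannot inflate the score on their own.

```python
from collections import defaultdict

def universality_score(attesting, families):
    """Score in [0, 1]: fraction of language families in the sample
    in which at least one language attests the phenomenon.

    attesting: set of language codes showing the phenomenon
    families:  dict mapping language code -> family name
    """
    by_family = defaultdict(set)
    for lang, fam in families.items():
        by_family[fam].add(lang)
    attested_families = {fam for fam, langs in by_family.items()
                         if langs & attesting}
    return len(attested_families) / len(by_family)

# Hypothetical sample: a polysemy pattern attested in three languages
families = {"ita": "Indo-European", "fra": "Indo-European",
            "hun": "Uralic", "mon": "Mongolic"}
attesting = {"ita", "fra", "hun"}
print(universality_score(attesting, families))  # 2 of 3 families
```

A real instantiation would of course use larger samples and weight further facets of diversity (geographic, typological), but the family-level grouping already shows how the measure discounts genealogically redundant evidence.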




A Semantic Framework for Enabling Personal Data

Enrico Bignotti

Nowadays, it is difficult to exploit personal data for providing services to users, since sense must first be made of them. Different communities focus on either the representational or the sensor-data aspect of the problem. Our proposed solution is a methodological framework combining a reference ontology and sensor data to model people, their environment, and their everyday contexts, linking them to sensor data via statistical models. As evaluation points, we aim to ground the reference ontology in real-life scenarios while maintaining generality and interoperability, in addition to injecting semantics to improve sensor accuracy and provide tailored services to users.




Geospatial Data Management in an Open Data Context

Subhashis Das

data integration, knowledge organization, open data, ontology, spatial data

Due to several open data initiatives, free geographical data are available from both public and private organizations. Every day, huge amounts of geographical data are captured by mobile devices and stored in servers. Since data generation involves various types of user communities and organizations, the data become available in different formats, languages, and schemas or standards. The main challenge in this scenario lies in the integration of such diverse geographical datasets and in the generation of knowledge from the existing sources. The aim of this research is to present a scalable, interoperable, and effective model and methodology for geographical data integration. The European directive on spatial data, INSPIRE, is used to formalize the geographical model, which we have named Geo eTypes. A few aspects that have been especially emphasized are: modeling a flexible data integration application, dealing with messy data sources, and alignment with an upper ontology.




Semantic Image Interpretation

Let's teach machines to understand images

Ivan Donadello

machine learning, deep learning, ontologies, computer vision

Semantic Image Interpretation (SII) is the process of automatically generating meaningful descriptions of the content of images. An example of a meaningful image description is a graph whose nodes correspond to image objects and whose arcs correspond to semantic relations between objects. Background knowledge (BK), in the form of logical theories, is extremely useful for SII. Many state-of-the-art algorithms for SII adopt a bottom-up approach, which generates semantic interpretations of images starting from their low-level features; BK is used only at a late stage for enriching the semantic descriptions. In this work, we show how BK also plays an important role during the early phase of SII if integrated with low-level image features. To this aim, we propose: (i) a reference framework in which a semantic image interpretation is a partial model of the BK, whose elements are grounded (linked) to a (set of) image segment(s); and (ii) the use of Logic Tensor Networks to predict an approximation of a partial model of a picture according to the BK. Logic Tensor Networks is a framework that integrates learning from numerical data with logical reasoning over the BK. This integration allows us to reason over the low-level features of objects in a picture and to express facts between them consistently with the constraints of the BK. We evaluate our method on the task of classifying objects and their parts in the images of the PASCAL-Part dataset. In this experiment we combine the features extracted with a state-of-the-art object detector with BK expressed as a part-whole ontology. Our approach outperforms the state of the art on object classification and improves performance on part-whole relation detection with respect to a rule-based baseline.
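The graph-shaped image description and its part-whole constraints can be illustrated with a small, hypothetical data structure (the names and the BK fragment are ours, not the thesis's formalization): each node is grounded to image segments, arcs carry semantic relations, and the BK rules out part-whole arcs it does not license.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectNode:
    label: str       # e.g. a PASCAL-Part class
    segments: list   # bounding boxes grounding the node in the image

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)   # (subject, relation, object)

    def add_node(self, name, label, segments):
        self.nodes[name] = ObjectNode(label, segments)

    def add_arc(self, subj, rel, obj):
        self.arcs.append((subj, rel, obj))

    def violates_part_whole(self, bk_parts):
        """Return 'partOf' arcs that the background knowledge does not allow.
        bk_parts maps a part label to the set of whole labels it may belong to."""
        return [(s, r, o) for (s, r, o) in self.arcs
                if r == "partOf"
                and self.nodes[o].label not in bk_parts.get(self.nodes[s].label, set())]

# Hypothetical BK fragment: a wheel may be part of a car
bk = {"wheel": {"car"}}
g = SceneGraph()
g.add_node("o1", "wheel", [(10, 40, 30, 60)])
g.add_node("o2", "car", [(0, 0, 120, 80)])
g.add_arc("o1", "partOf", "o2")
print(g.violates_part_whole(bk))  # [] -> consistent with the BK
```

In the actual approach such constraints are not checked post hoc as here but enter the learning objective of Logic Tensor Networks; the sketch only shows the shape of the interpretation being predicted.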





Adaptive Knowledge Representation

Mattia Fumagalli

Concepts, recognition ability, ontology, cognition, foundations

To date, many knowledge representation (KR) formalisms have been developed, providing useful and application-ready solutions for digitally representing and processing the meaning of information. Although one of the main goals of these information artifacts is to promote a common understanding of information, the way in which they are used to describe the world is completely arbitrary. This raises the issue of semantic heterogeneity. The focus of my research is to provide a new knowledge representation formalism, grounded in teleosemantics theory, that can be used to address semantic heterogeneity through adaptivity. This will be done by introducing and formalizing the notion of a concept as a recognition ability.




Contextualizing Bookmarks

Leveraging contextual meta-data for organization and retrieval of bookmarks

Hyeon Kyeong Hwang

Personal Information Management, Context-based Information Search and Retrieval, Knowledge Management, Bookmarking

The pervasive nature of the Web has changed our lives in such a way that we constantly have to process and manage an overwhelming amount of information. One task that is well observed and has proven difficult is re-finding information seen before. The most popular way of keeping information for later re-use is bookmarking sources with meta-data such as titles and tags, or painstakingly organizing them into hierarchical folders. However, several research results show that people often resort to alternative methods because of the difficulty of managing their bookmark collections. Recent studies have shown that success in re-finding depends more on human memory than on the organization of bookmarks or other history mechanisms. Various commercial and prototypical PIM and Web tools have demonstrated that context plays an important role in memorizing cues for retrieval. While these results are encouraging, there is not yet sufficient knowledge about which types of context are useful for the situations in which users wish to reuse previously found resources. For this purpose, we developed a prototypical tool, MemoryLane, which systematically supports and encourages users to attach various types of contextual meta-data to their bookmarks. We carried out an experimental study in three stages. First, an online survey revealed that most users use bookmarks for keeping found things found, mainly organized in folders. Second, an analysis of bookmarks created by participants who used our prototypical bookmarking tool showed that category, query, tags, and goal (purpose) were frequently added; surprisingly, emotions were added more often than location. Finally, the success of retrieval was found to depend largely on the specificity and accuracy of users' recall of the items to be found, and the provision of contextual cues significantly improved success rates. These results imply that contextual indicators, preferably gathered implicitly, should become part of both the bookmarking and the retrieval process.
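A bookmark record enriched with the contextual meta-data types studied above (category, query, tags, goal, emotion, location) can be sketched as a simple structure; the field names are illustrative, not MemoryLane's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ContextualBookmark:
    url: str
    title: str
    category: Optional[str] = None   # user-assigned category
    query: Optional[str] = None      # search query that led to the page
    tags: list = field(default_factory=list)
    goal: Optional[str] = None       # purpose of keeping the bookmark
    emotion: Optional[str] = None    # added more often than location in the study
    location: Optional[str] = None   # where the page was found

    def matches(self, cue: str) -> bool:
        """Retrieval by any contextual cue, not just the title."""
        cue = cue.lower()
        fields = [self.title, self.category, self.query, self.goal,
                  self.emotion, self.location, *self.tags]
        return any(cue in f.lower() for f in fields if f)

bm = ContextualBookmark("https://example.org", "Trento travel guide",
                        query="trento hiking", goal="plan summer trip")
print(bm.matches("hiking"))  # True: the query field carries the cue
```

The point of the design is exactly the study's finding: any remembered contextual fragment, not only the title or folder, can serve as an entry point for re-finding.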




Ontological Foundations for Strategic Analysis

Tiago Prince Sales

Ontology, Business Informatics, Value Proposition, Business Model

In competitive markets, companies need well-designed business strategies if they seek profitability. A carefully formulated strategy is based on evidence and considers issues arising from both the internal condition of the company, in terms of performance and capabilities, and the external environment where the company operates, in terms of industry competition and societal forces. In order to drive the identification of such strategic issues, various frameworks have been proposed, such as Porter’s Five Forces Analysis and SWOT. These frameworks, however, lack computational support for representing and reasoning with strategic information, for their concepts are informally defined and they rely exclusively on people to gather, represent and analyze strategic issues. In my doctoral research project, we address this limitation through an interdisciplinary approach that leverages theories and techniques from Formal Ontology, Enterprise Modelling and Strategic Management. Our goal is to develop a consistent conceptual foundation for the development of computational tools that can automate and reduce the complexity of performing strategic analysis, helping organizations to formulate and evaluate strategies.




Towards a New Generation of Digital Library Services

An entity-centric approach to document search

Amit Kumar Sarangi

Knowledge representation, entity-centric search

Digital libraries adhere to knowledge organization (KO) principles, where documents are accessed by querying their metadata. This limits expressivity in document search, since KO does not explicitly describe the related entities (authors, editors, publishers, and those in the subject). IFLA and DCMI have included entities in their abstract models, but these lack description. In knowledge representation (KR), by contrast, entities are described with properties and relations, allowing us to represent them explicitly. We will define schemas for mind-products and other relevant entities; provide language and vocabulary control for attributes; and validate the schema and language over use cases.




Human Machine Symbiosis

Mattia Zeni

semantics, big data, pervasive computing, smartphones, social computing

The aim of this work is to propose a novel approach to activity recognition that accounts for both low-level environmental data collected by sensors and high-level human knowledge to infer the activity performed by the user. The final goal is to create a framework that is aware of the context and provides the correct service. Each human being performs daily routines composed of activities that can be considered accidental attributes of the person. These routines are used to dynamically generate semantic knowledge about the person, which is then used in the activity recognition phase in combination with sensor data. The way we merge the two types of information is the crucial point of this approach, while the evaluation point will be an improvement in the final recognition accuracy. Low-level data are collected by general-purpose mobile devices such as smartphones and wearables, generating a totally unobtrusive user experience.
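One simple way to realize the merging step described above (an illustrative sketch under our own assumptions, not the thesis's actual method) is to combine the sensor classifier's activity scores with a context prior derived from the user's routines, multiplying and renormalizing in Bayesian style.

```python
def fuse(sensor_scores, routine_prior):
    """Combine sensor-level activity scores with semantic priors
    derived from the user's daily routines.

    sensor_scores: dict activity -> probability from the sensor classifier
    routine_prior: dict activity -> prior probability given time/context
    """
    combined = {a: sensor_scores[a] * routine_prior.get(a, 0.0)
                for a in sensor_scores}
    z = sum(combined.values())
    if z == 0.0:              # context rules everything out: fall back to sensors
        return dict(sensor_scores)
    return {a: p / z for a, p in combined.items()}

# Hypothetical example: the accelerometer confuses walking with commuting,
# but at 8:30 on a weekday the routine strongly suggests commuting.
sensor = {"walking": 0.5, "commuting": 0.4, "sleeping": 0.1}
prior = {"walking": 0.2, "commuting": 0.7, "sleeping": 0.1}
posterior = fuse(sensor, prior)
print(max(posterior, key=posterior.get))  # commuting
```

The sketch shows why the merge is the crucial point: the semantic prior is what lets ambiguous low-level evidence be resolved by what the person habitually does at that time and place.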