Consensus-based Approach for Keyword Extraction from Urban Events Collections

  • Ana Oliveira Alves
    Centre of Informatics and Systems, University of Coimbra, Portugal & Polytechnic Institute of Coimbra, Portugal ana[at]
  • Bernardete Ribeiro
    Department of Informatics Engineering, University of Coimbra, Portugal


Automatic keyword extraction (AKE) from textual sources is a valuable step towards the efficient scanning of large document collections. Particularly in the context of urban mobility, where the most relevant events in the city are advertised on-line, it becomes difficult to know exactly what is happening in a place. In this paper we tackle this problem by extracting a set of keywords from different kinds of textual sources, focusing on the urban events context. We propose an ensemble of three automatic keyword extraction systems: KEA (Key-phrase Extraction Algorithm), KUSCO (Knowledge Unsupervised Search for instantiating Concepts on lightweight Ontologies), and Conditional Random Fields (CRF). Unlike KEA and KUSCO, which are well-known tools for automatic keyword extraction, CRF needs further pre-processing. Therefore, a tool for handling AKE from documents using CRF is developed. The architecture of the AKE ensemble system is designed, and an efficient integration of the component applications is presented in which a consensus among the classifiers is achieved. Finally, we empirically show that our AKE ensemble system significantly succeeds on baseline sources and urban events collections.
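The abstract does not specify how the consensus among the three extractors is computed. As a rough illustration only, the sketch below assumes a simple voting rule: a keyword is kept when at least a given number of extractors propose it. The function name `consensus_keywords`, the `min_votes` parameter, and the sample keyword lists are hypothetical and are not taken from the paper or from the KEA, KUSCO, or CRF tools.

```python
from collections import Counter


def consensus_keywords(extractor_outputs, min_votes=2):
    """Return keywords proposed by at least `min_votes` extractors.

    Assumption: simple voting stands in for the paper's consensus step,
    whose actual rule is not described in the abstract.
    """
    votes = Counter()
    for keywords in extractor_outputs.values():
        # Normalise case and whitespace; the set ensures each extractor
        # votes at most once per keyword.
        votes.update({" ".join(k.lower().split()) for k in keywords})
    # Sort by descending vote count, then alphabetically for a stable order.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [kw for kw, n in ranked if n >= min_votes]


# Hypothetical outputs for one event description (made-up data).
outputs = {
    "KEA": ["jazz festival", "Coimbra", "concert"],
    "KUSCO": ["Jazz Festival", "live music", "concert"],
    "CRF": ["jazz festival", "tickets"],
}

print(consensus_keywords(outputs))
```

Raising `min_votes` to the number of extractors keeps only keywords on which all systems agree, while `min_votes=1` reduces to the union of their outputs.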
Centre of Informatics and Systems, University of Coimbra, Portugal & Polytechnic Institute of Coimbra, Portugal
Ana Alves is a full-member researcher at the Center for Informatics and Systems of the University of Coimbra in Portugal. She received an M.Sc. and a PhD degree in Informatics Engineering, both from the Informatics Engineering Department, University of Coimbra, Coimbra, Portugal. She is an Assistant Professor at the Polytechnic Institute of Coimbra. Her main research interests and publications relate to Ambient Intelligence, Information Extraction and Semantics, and Natural Language Processing.

Bernardete Ribeiro

Department of Informatics Engineering, University of Coimbra, Portugal
Bernardete Ribeiro is a Professor at the Informatics Engineering Department, Faculty of Science and Technology, University of Coimbra in Portugal. She received an MSc degree in Computer Science and a PhD in Informatics Engineering, both from the Informatics Engineering Department, University of Coimbra. Her main publications are in the areas of neural networks and their applications to engineering systems, computational intelligence, and support vector machines. She is a member of ACM and IEEE.