ArchivIA - Institutional Archive of the University of Catania (Doctoral theses)
|Issue Date: ||30-Jan-2014|
|Authors: ||Di Salvo, Roberto|
|Title: ||Large scale ground truth generation for performance evaluation of computer vision methods|
|Abstract: ||This thesis proposes a set of novel video annotation methods for the performance evaluation of object detection, tracking and recognition applications.
Large-scale labeled datasets are of key importance for the development of automatic video analysis tools: on the one hand they allow multi-class classifiers to be trained, and on the other they support the evaluation of algorithms. This is widely recognized by the multimedia and computer vision communities, as witnessed by the growing number of available datasets; however, the field still lacks usable and effective annotation tools, since a lot of human effort is needed to generate high-quality ground truth data. Moreover, isolated research groups cannot feasibly collect large video ground truths, covering as many scenarios and object categories as possible, through their own effort alone.
For these reasons, in this thesis we first present a semi-automatic stand-alone tool for gathering ground truth data. It aims to improve the user experience by providing editing shortcuts such as hotkeys and drag-and-drop, and by integrating computer vision algorithms that automate the process with little intervention by end users. In this context we also present a collaborative web-based platform for video ground truthing, which integrates the stand-alone tools and provides an easy and intuitive user interface that allows plain video annotation and instant sharing and integration of the generated ground truths, in order not only to reduce much of the effort and time needed, but also to increase the quality of the generated annotations.
These tools are specifically designed to help users collect annotations through simple interfaces that considerably ease their work, and they integrate novel methods for quality control. Annotation nevertheless remains a burdensome task in terms of the attention and time needed to obtain good records.
To motivate users and relieve them of the tiresome task of annotating manually, we devised strategies to create annotations automatically by processing data from the crowd. To this end we first develop an approach based on an online game to collect large amounts of noisy data. By exploiting this information, we then propose data-driven approaches, mainly based on image segmentation and statistical methods, which allow us to obtain reliable video annotations from the low-quality, noisy data gathered quickly and easily from the game. We also demonstrate that the quality of the obtained annotations increases as more users play the game, making it an effective and valid application for the collection of consistent ground truth data.|
|Appears in Collections:||Area 09 - Ingegneria industriale e dell'informazione|
Files in This Item:
|DSLRRT82S25C351F-TesiDottorato_VERSIONE_FINALE.pdf||PhDThesisRobertoDiSalvo||2.91 MB||Adobe PDF|
Items in ArchivIA are protected by copyright, with all rights reserved, unless otherwise indicated.