Recognizing 3D Objects from a Limited Number of Views using Temporal Ensembles of Shape Functions
Abstract
We consider the problem of 3D object recognition, assuming an application scenario involving a mobile robot equipped with an RGB-D camera. In order to simulate this scenario, we use a database of 3D objects and render partial point clouds representing depth views of an object. Using the rendered point clouds, we represent each object with an object descriptor called temporal ensemble of shape functions (TESF). We investigate leave-one-out 1-NN classification performance on the considered dataset depending on the number of views used to build TESF descriptors, as well as the possibility of matching the descriptors built using varying numbers of views. We establish the baseline by classifying individual view ESF descriptors. Our experiments suggest that classifying TESF descriptors outperforms individual ESF classification, and that TESF descriptors offer reasonable descriptivity even when very few views are used. The performance remains very good even if the query TESF and the nearest TESF are built using a differing number of views.
DOI
10.20532/ccvw.2014.0013
https://doi.org/10.20532/ccvw.2014.0013
BibTeX
@InProceedings{10.20532/ccvw.2014.0013,
author = {Karla Brki{\' c} and Sini{\v s}a {\v S}egvi{\' c}
and Zoran Kalafati{\' c} and Aitor Aldoma and Markus
Vincze},
title = {Recognizing {3D} Objects from a Limited Number of
Views using Temporal Ensembles of Shape Functions},
booktitle = {Proceedings of the Croatian Computer Vision Workshop,
Year 2},
pages = {44--49},
year = 2014,
editor = {Lon{\v c}ari{\' c}, Sven and Suba{\v s}i{\' c},
Marko},
address = {Zagreb},
month = {September},
organization = {Center of Excellence for Computer Vision},
publisher = {University of Zagreb},
abstract = {We consider the problem of 3D object recognition,
assuming an application scenario involving a mobile
robot equipped with an RGB-D camera. In order to
simulate this scenario, we use a database of 3D
objects and render partial point clouds representing
depth views of an object. Using the rendered point
clouds, we represent each object with an object
descriptor called temporal ensemble of shape
functions (TESF). We investigate leave-one-out 1-NN
classification performance on the considered dataset
depending on the number of views used to build TESF
descriptors, as well as the possibility of matching
the descriptors built using varying numbers of
views. We establish the baseline by classifying
individual view ESF descriptors. Our experiments
suggest that classifying TESF descriptors
outperforms individual ESF classification, and that
TESF descriptors offer reasonable descriptivity even
when very few views are used. The performance
remains very good even if the query TESF and the
nearest TESF are built using a differing number of
views.},
doi = {10.20532/ccvw.2014.0013},
url = {https://doi.org/10.20532/ccvw.2014.0013}
}