Recognizing 3D Objects from a Limited Number of Views using Temporal Ensembles of Shape Functions

Karla Brkić, Siniša Šegvić, Zoran Kalafatić, Aitor Aldomà and Markus Vincze

Abstract

We consider the problem of 3D object recognition, assuming an application scenario involving a mobile robot equipped with an RGB-D camera. In order to simulate this scenario, we use a database of 3D objects and render partial point clouds representing depth views of an object. Using the rendered point clouds, we represent each object with an object descriptor called temporal ensemble of shape functions (TESF). We investigate leave-one-out 1-NN classification performance on the considered dataset depending on the number of views used to build TESF descriptors, as well as the possibility of matching the descriptors built using varying numbers of views. We establish the baseline by classifying individual view ESF descriptors. Our experiments suggest that classifying TESF descriptors outperforms individual ESF classification, and that TESF descriptors offer reasonable descriptivity even when very few views are used. The performance remains very good even if the query TESF and the nearest TESF are built using a differing number of views.
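The leave-one-out 1-NN evaluation mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of plain Python lists as descriptors, and the choice of L1 distance are all assumptions, and the construction of ESF or TESF descriptors themselves is not shown.

```python
from typing import Hashable, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences (an assumed metric, not taken from the paper)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def leave_one_out_1nn_accuracy(
    descriptors: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
) -> float:
    """Classify each descriptor by its nearest neighbour among all the others.

    Each descriptor is held out in turn, matched against the remaining ones,
    and counted as correct if the nearest neighbour carries the same label.
    """
    if len(descriptors) != len(labels):
        raise ValueError("descriptors and labels must have the same length")
    if len(descriptors) < 2:
        raise ValueError("leave-one-out needs at least two descriptors")

    correct = 0
    for i, query in enumerate(descriptors):
        # Search every descriptor except the held-out query itself.
        nearest = min(
            (j for j in range(len(descriptors)) if j != i),
            key=lambda j: l1_distance(query, descriptors[j]),
        )
        if labels[nearest] == labels[i]:
            correct += 1
    return correct / len(descriptors)


if __name__ == "__main__":
    # Toy descriptors standing in for per-object histograms (hypothetical data).
    toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
    toy_labels = ["mug", "mug", "chair", "chair"]
    print(leave_one_out_1nn_accuracy(toy, toy_labels))
```

Under these assumptions, the same routine would apply whether each row is a single-view ESF descriptor or a TESF descriptor built from several views, since only the descriptor vectors and class labels are needed.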

DOI

10.20532/ccvw.2014.0013

https://doi.org/10.20532/ccvw.2014.0013

BibTeX

@InProceedings{10.20532/ccvw.2014.0013,
  author =       {Karla Brki{\' c} and Sini{\v s}a {\v S}egvi{\' c}
                  and Zoran Kalafati{\' c} and Aitor Aldom{\`a} and Markus
                  Vincze},
  title =        {Recognizing {3D} Objects from a Limited Number of
                  Views using Temporal Ensembles of Shape Functions},
  booktitle =    {Proceedings of the Croatian Computer Vision Workshop,
                  Year 2},
  pages =        {44--49},
  year =         2014,
  editor =       {Lon{\v c}ari{\' c}, Sven and Suba{\v s}i{\' c},
                  Marko},
  address =      {Zagreb},
  month =        {September},
  organization = {Center of Excellence for Computer Vision},
  publisher =    {University of Zagreb},
  abstract =     {We consider the problem of 3D object recognition,
                  assuming an application scenario involving a mobile
                  robot equipped with an RGB-D camera. In order to
                  simulate this scenario, we use a database of 3D
                  objects and render partial point clouds representing
                  depth views of an object. Using the rendered point
                  clouds, we represent each object with an object
                  descriptor called temporal ensemble of shape
                  functions (TESF). We investigate leave-one-out 1-NN
                  classification performance on the considered dataset
                  depending on the number of views used to build TESF
                  descriptors, as well as the possibility of matching
                  the descriptors built using varying numbers of
                  views. We establish the baseline by classifying
                  individual view ESF descriptors. Our experiments
                  suggest that classifying TESF descriptors
                  outperforms individual ESF classification, and that
                  TESF descriptors offer reasonable descriptivity even
                  when very few views are used. The performance
                  remains very good even if the query TESF and the
                  nearest TESF are built using a differing number of
                  views.},
  doi =          {10.20532/ccvw.2014.0013},
  url =          {https://doi.org/10.20532/ccvw.2014.0013}
}