A group of FER researchers from the Department of Electronics, Microelectronics, Computer and Intelligent Systems (ZEMRIS), led by Professor Siniša Šegvić, PhD, won first place in the semantic segmentation discipline of the international Robust Vision Challenge. The method that the ZEMRIS team used in the competition was developed as part of the doctoral dissertation of department associate Marin Oršić; assistant Petra Bevandić, junior researcher Ivan Grubišić and department associate Josip Šarić also took part in preparing the submission.
The competition rules required training a single model that could make predictions on multiple datasets. Such models are particularly interesting for industrial applications because they cope best with unexpected scenes. FER's model took first place overall on the WildDash 2 dataset. It is the most demanding dataset for semantic segmentation of automotive scenes because it contains images that experts specifically selected as difficult for machine-learned models. Model training and evaluation were performed on an NVIDIA DGX computer, a system with eight NVIDIA Tesla V100 GPUs that simplifies running the software for training large models.
Our winning team’s method, titled "Effective Semantic Segmentation by Pyramid Fusion", was published at last year’s CVPR conference, which Google Scholar lists as the 5th most prestigious scientific source according to the h5-index metric. It was also published in Pattern Recognition, a journal that Scopus ranks 10th out of 220 sources in the field of artificial intelligence.
"There are three main motivations for our participation in the competition. First, such ventures force us to step out of our comfort zone and 'pick up' knowledge that we would otherwise gain much more slowly. Second, we wanted to increase the scientific impact of our publications. Third, this participation showed that our expertise is competitive on a global scale, so we hope to find projects that will continue to fund our research. We are interested in industrial projects, mainly because they confront us very effectively with the real limitations of our science. In addition, industrial research is important because public sources of funding in Croatia are not sufficient to conduct research at the global level", explains the team leader, Professor Šegvić.
Professor Šegvić's group of researchers began studying semantic segmentation in early 2015 as part of the Croatian Science Foundation's MULTICLOD project (2014-2017). Understanding road scenes was of particular interest because of its applications in automating road safety assessment. This is an industrial niche with great opportunities for synergy between traffic science and computer science, and it was the research subject of Professor Šegvić's earlier semi-industrial project, MASTIF (2008-2011).
Winning team at the Robust Vision Challenge 2020: Josip Šarić, Ivan Grubišić, Petra Bevandić, Marin Oršić and Professor Siniša Šegvić, PhD
Cityscapes, the first large dataset for semantic segmentation of road scenes, appeared in 2015 and was funded by the German company Daimler. The accuracy of methods on that dataset grew rapidly, but researchers soon realized that models trained on Cityscapes performed poorly “in the wild”. The problem was that all Cityscapes images were taken with the same camera and from the same perspective, causing machine-learned models to adapt to that setup so closely that they became unusable in other contexts.
Therefore, in 2018, the WildDash dataset was designed at the Austrian Institute of Technology (AIT). Colleagues from AIT proposed systematizing the hazards, that is, situations that could lead to wrong predictions. Examples of such hazards are blur, underexposed parts of the image, rain, fog and snow. The first Robust Vision Challenge (RVC) was launched in parallel with WildDash. The task was to train models that could understand multiple types of scenes (e.g., interiors, photographs, car driving), which would be verified by evaluating the models on multiple datasets at once, something akin to an all-around event in athletics. RVC covers seven computer vision tasks (stereoscopic reconstruction, optical flow, etc.).
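To make the all-around analogy concrete, here is a minimal Python sketch of one way to compare models across several datasets: rank the models on each dataset, then average the ranks. The model names, dataset names and scores are hypothetical, and the averaging rule is an illustrative assumption, not the ranking procedure actually used by RVC.

```python
# Hypothetical scores (higher is better); these are NOT actual RVC results.
scores = {
    "model_a": {"road": 0.70, "indoor": 0.50, "synthetic": 0.60},
    "model_b": {"road": 0.80, "indoor": 0.40, "synthetic": 0.55},
    "model_c": {"road": 0.60, "indoor": 0.45, "synthetic": 0.50},
}


def average_ranks(scores):
    """Return each model's mean rank over all datasets (1 = best).

    Assumes every model has a score on every dataset and that there
    are no ties within a dataset.
    """
    models = list(scores)
    datasets = list(next(iter(scores.values())))
    totals = {m: 0 for m in models}
    for d in datasets:
        # Order the models from best to worst on this dataset.
        ordered = sorted(models, key=lambda m: scores[m][d], reverse=True)
        for rank, m in enumerate(ordered, start=1):
            totals[m] += rank
    return {m: totals[m] / len(datasets) for m in models}


ranks = average_ranks(scores)
best = min(ranks, key=ranks.get)
print(best, ranks)
```

In this toy example, model_b is the best on the road dataset but the weakest on interiors, so the more balanced model_a ends up with the best average rank. That is the kind of robustness a multi-dataset evaluation is meant to reward.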
The segmentation competition at the first RVC, held in 2018, consisted of the datasets Cityscapes, WildDash, KITTI and ScanNet (interiors). In that competition, FER took second place behind Mapillary, which won every head-to-head comparison. This year’s RVC included an upgraded WildDash, the other three datasets used in 2018, and three new datasets: VIPER (artificially generated images), ADE20K (diverse photos), and Vistas (road driving, 5 continents). There is only one paper in the scientific literature that has considered problems this difficult. That paper was published at this year’s CVPR, and its authors were FER’s rivals in the competition. We believe that the main reason for the small number of entrants this year is that very few teams have enough experience and sufficiently powerful equipment to take part in such an endeavour.
The figure shows the results for three scenes from the WildDash dataset and for the five methods that submitted results on it. The columns correspond to the scenes and the rows to the methods. Of course, the actual evaluation is performed on hundreds of such images. The advantage of our method (SN_RN152pyrx8) over the second-best method (MSEG_1080) is best illustrated in the night scene with motorcycles. You can see that MSEG_1080 does not recognize the lane markings on the road and in some places classifies the light reflection on the lens as the pole class.
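Errors like the ones visible in the figure show up in the numbers as a lower intersection-over-union (IoU) per class. The Python sketch below computes mean IoU, a common accuracy measure for semantic segmentation, on a toy pair of label maps. The class ids and the tiny 2x4 "images" are hypothetical, and whether this matches the exact evaluation protocol of the benchmark is an assumption.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                inter += (p == c and t == c)  # pixel is class c in both maps
                union += (p == c or t == c)   # pixel is class c in either map
        if union > 0:  # skip classes that appear in neither map
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Hypothetical class ids: 0 = road, 1 = lane marking, 2 = pole.
target = [[0, 1, 0, 0],
          [0, 1, 0, 0]]
# The prediction misses one lane-marking pixel and invents one pole pixel.
pred = [[0, 0, 0, 2],
        [0, 1, 0, 0]]

print(round(mean_iou(pred, target, num_classes=3), 3))
```

Only two of the eight pixels are wrong, yet the mean IoU drops well below 1 because the small lane-marking and pole classes count as much as the large road class.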
Cooperation with Industry
"My group is working on a number of industrial projects. With Rimac Automobili, we are researching the forecasting of future road scenes. With Microblink, we are exploring monocular reconstruction in order to detect fake authentication attempts in videos, an area in which I worked extensively as a postdoctoral researcher in France and Austria. With RoMB, we are researching perception for autonomous warehouse robots. With the Faculty of Transport and Traffic Sciences in Zagreb, we are researching the automatic recognition of safety attributes according to the EuroRAP protocol. We occasionally cooperate with Xylon, and we have also cooperated with Končar and Dok-ING within EU structural projects", pointed out Professor Siniša Šegvić, head of FER's research team.