Croatian scientists develop a new...

The first prize in the Data Challenge competition, held as part of the Extreme Value Analysis 2021 conference, was won by Assistant Professor Domagoj Vlah, PhD, from the Department of Applied Mathematics and Tomislav Ivek, PhD, from the Institute of Physics.

The CMIWAE (Conditional Missing data Importance Weighted Auto-Encoder) method developed by the Croatian researchers, which secured their victory in the Data Challenge competition, is based on an advanced variant of so-called variational autoencoders, artificial neural networks used to reconstruct data in the presence of noise, damage, or other difficulties. In contrast to competing methods, the CMIWAE method enables the neural network to independently find important regularities in the available data, so that neither expert information nor manual data adjustment is required.

Below you will find more information about the competition and a short interview with Assistant Professor Domagoj Vlah.

The EVA2021 conference took place from 28 June to 2 July 2021 and was organised by the University of Edinburgh. The Data Challenge competition required a reconstruction of the frequency of wildfires and the area burned by wildfires in the United States over a 23-year period. Wildfires are the uncontrolled burning of combustible material from natural vegetation and pose an extraordinary risk to human life, the natural environment, and property. Their frequency is expected to increase dramatically with global climate change. The task in this year's competition was therefore to predict wildfire frequency and burnt area at geographical and weather-related points where the data were missing or corrupted, using only the remaining available data.

"In this competition, we have very directly used AI methods historically developed in computer vision and adapted them to the field of mathematical statistics where they have never been used before. This is interesting because in a way it is a reversal of the usual flow of information, where innovations in mathematics are later applied in technology," says Assistant Professor Vlah, explaining the specificity of their methodology.

The technique developed by Croatian scientists Domagoj Vlah and Tomislav Ivek can also be applied in other research areas that rely on data with space-time extremes. The authors will present their approach in a scientific article that is currently being prepared.

Friends since primary school

Domagoj Vlah and Tomislav Ivek have been friends since primary school, when they met at a maths competition. "We are both interested in computers and have been working with modern deep learning techniques, i.e. modern approaches to artificial intelligence, for quite some time. A little over two and a half years ago, purely by chance, I saw that there was a competition as part of the EVA international conference, which was held in Zagreb in 2019. That was the first time Tomislav Ivek and I took part in the competition, simply out of curiosity to see how our artificial intelligence techniques would perform in comparison to the classical techniques of mathematical statistics. We were thrilled when we managed to win second place that year," said Domagoj Vlah.

Encouraged by this success, the two friends and science enthusiasts continued to study the field of deep learning and started writing scientific articles on the subject. The logical continuation was the competition at this year's edition of the EVA2021 conference, which they won.

Modern deep learning techniques

The participants of the Data Challenge are mostly experts in the field of mathematical statistics called extreme value analysis. They also apply machine learning techniques in their solutions, combined with a lot of expertise from their field, but use slightly older and more basic machine learning techniques, around 20 years old.

"We used very modern deep learning techniques that are only a year or two old, which we have further developed and adapted for the field of extreme value analysis. Unlike other solutions, ours requires almost no expert knowledge, and none of us are really experts in mathematical statistics. This makes me all the more pleased with our result because we have shown empirically that instead of modelling and using expert knowledge, such a competitive task can successfully be solved just by meta-modelling. Put simply, we have modelled an artificial intelligence system that has implicitly "learned" all the necessary expert knowledge to solve a given problem at the competition, and it has done so even more successfully than scientists in the field could," explains Assistant Professor Vlah and continues: "Such results are not uncommon for modern applications of this type of artificial intelligence. For example, computer programmes that play chess much better than the best human players have been around for many years, and there are other niches where this kind of artificial intelligence clearly outperforms human abilities."

A completely new approach to the "most mathematical" method

The method used by our researchers is based on a newer and more advanced variant of variational autoencoders, which Vlah and Ivek have further adapted and generalised so that it could be applied to the specific problem posed in the Data Challenge competition.

"This is a variant called Missing data Importance Weighted Auto-Encoder (MIWAE), which we have improved to work with additional conditional data and called Conditional Missing data Importance Weighted Auto-Encoder (CMIWAE). As far as we know, nothing like this has been tested before in the field of extreme value analysis. This is a completely new approach in the field of mathematical statistics," explains Domagoj Vlah.

According to Domagoj Vlah, one could say that the MIWAE method is one of the "most mathematical" methods used in deep learning. This is because it is grounded in mathematical probability theory, whereas many other methods used in deep learning, while very successful in practice, still lack a satisfactory mathematical explanation for why they work so well. Variational autoencoders were originally developed and applied in computer vision, a branch of artificial intelligence. The concept of the variational autoencoder was introduced in 2014, while MIWAE, its more advanced variant, was introduced in 2019.
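To give a rough idea of the probabilistic basis mentioned above, the following is a minimal sketch of an importance-weighted lower bound on the log-likelihood of the observed entries of a data point, which is the kind of quantity MIWAE-style methods maximise. It is not the authors' CMIWAE: the neural networks are replaced by a hypothetical toy model (one latent variable z, a linear-Gaussian "decoder" with weights `w` and noise `sigma`), and all function names are assumptions made for illustration. Missing entries are simply left out of the likelihood through a mask.

```python
import numpy as np


def log_normal(x, mean, std):
    """Log-density of a univariate normal distribution."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2


def iw_bound(x, mask, w, sigma, q_mean, q_std, K, rng):
    """Importance-weighted bound on log p(x_observed) using K samples.

    Toy model (an assumption for this sketch): z ~ N(0, 1) and
    x_j | z ~ N(w_j * z, sigma^2). The proposal q(z) is N(q_mean, q_std^2).
    """
    z = q_mean + q_std * rng.standard_normal(K)        # K samples from q
    x_filled = np.where(mask, x, 0.0)                  # placeholder for missing entries
    ll = log_normal(x_filled[None, :], z[:, None] * w[None, :], sigma)
    log_lik = np.where(mask[None, :], ll, 0.0).sum(axis=1)  # observed entries only
    log_w = log_lik + log_normal(z, 0.0, 1.0) - log_normal(z, q_mean, q_std)
    m = log_w.max()                                    # stable log-mean-exp
    return m + np.log(np.mean(np.exp(log_w - m)))


def exact_posterior(x, mask, w, sigma):
    """Mean and std of p(z | x_observed), available in closed form for the toy model."""
    wo, xo = w[mask], x[mask]
    precision = 1.0 + np.sum(wo ** 2) / sigma ** 2
    return np.sum(wo * xo) / sigma ** 2 / precision, precision ** -0.5


def exact_log_marginal(x, mask, w, sigma):
    """Exact log p(x_observed): a Gaussian with covariance w w^T + sigma^2 I."""
    wo, xo = w[mask], x[mask]
    cov = np.outer(wo, wo) + sigma ** 2 * np.eye(wo.size)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (wo.size * np.log(2 * np.pi) + logdet + xo @ np.linalg.solve(cov, xo))


# Illustrative values only: one data point whose second entry is missing.
w = np.array([1.0, -0.5, 2.0])
sigma = 0.5
x = np.array([0.8, np.nan, 1.7])
mask = ~np.isnan(x)

rng = np.random.default_rng(0)
print("exact log-likelihood:", exact_log_marginal(x, mask, w, sigma))
for K in (1, 10, 100):
    # Using the prior N(0, 1) as a deliberately poor proposal.
    print(f"K={K:3d} bound:", iw_bound(x, mask, w, sigma, 0.0, 1.0, K, rng))
```

In this toy setting the bound can be compared with the exact log-likelihood. On average it lies below the exact value, and it matches it when the proposal equals the true posterior. In a real MIWAE-type model both the decoder and the proposal are neural networks trained by maximising this bound, and a conditional variant would additionally feed known covariates into both.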



Advantages of the CMIWAE method compared to other deep learning methods

The CMIWAE method, like the existing MIWAE method on which it builds, can potentially work well with far less data than is required by some older deep learning methods, and by machine learning in general.

The general problem with deep learning is that very large amounts of data often need to be available for such systems to successfully learn something. In theory, MIWAE drastically reduces the need for large amounts of data, which in the end was probably decisive for the Croatian scientists' victory in this year's EVA2021 competition, where the amount of data available for learning was relatively small.

Original research at FER and collaboration with colleagues

Assistant Professor Vlah's scientific field is theoretical mathematics, but he has also been intensively involved in researching new methods of deep learning for several years.

Encouraged by his successful participation in the first EVA2019 competition, Assistant Professor Domagoj Vlah started to collaborate on the topic of deep learning with some colleagues from FER who had not previously used such techniques in their work. "Last year I collaborated with Associate Professor Hrvoje Pandžić, PhD, and his doctoral student Karlo Šepetanac, and we used deep learning to create an excellent solution to a problem in the field of energy. I have also recently started working on a project with the Vice Dean for Science, Professor Nikola Mišković, PhD. As part of this collaboration, we will try to use the CMIWAE technique as part of the control system of an autonomous marine vehicle," Vlah concludes.

Author: Petra Škaberna