Learning from Deep Learning
Machine learning methods are revolutionizing many areas of science, including astronomy. One may say, not without reason, that too many articles have already appeared with "deep learning" in their title. We not only present a deep-learning-based method that estimates cosmological parameters significantly better than state-of-the-art techniques, but also demonstrate that we can learn from machine learning. Based on hints extracted from the neural network's internal representations, we could design a novel, human-understandable, robust, and efficient method that estimates cosmological parameters from weak lensing maps more precisely.
My story with machine learning started back in the late '80s. Indeed, many of the basic concepts used today were invented at that time. Beyond the theoretical interest of analyzing the workings of neural networks, I was always intrigued by applying them to real problems. In my thesis work we had a couple of applications of neural nets, e.g. for signal peptide identification and for separating quark and gluon jets, but the approach did not become mainstream in those early times. The main reasons were the lack of sufficiently powerful computers and of large enough training sets. These problems have now mostly been solved in the age of high-throughput instruments and many-core GPUs. One problem remained that left many potential "users" suspicious and kept them from adopting the technique: neural networks are "black boxes"; they may work efficiently, but we cannot explain how they achieve their results. I was always bothered by this, and for a long time I preferred better-understood methods for analyzing scientific data sets. As the new wave of deep learning methods emerged, I tried to catch up quickly, partly out of nostalgia for my thesis topic and partly because students were crazy about it, so there was a need to put modern machine learning techniques into the curriculum.
Two of my students, Dezso Ribli and Balint Pataki, who are also coauthors of this article, were not only enthusiastic but also very clever, and quickly reached the cutting edge of this research field. They won first and second prizes among a thousand contestants at the high-profile DREAM Challenges, which invite participants to propose solutions to fundamental biomedical questions. Balint used machine learning techniques to diagnose Parkinson's disease from accelerometer sensor data. Dezso developed a deep-learning-based mammography diagnosis method that reached accuracy comparable to trained radiologists in lesion detection (Ribli et al., Scientific Reports 8 (1), 4165 (2018)).
This March there was an interesting talk at our Institute by Zoltan Haiman of Columbia University about a study they had just submitted for publication. They used deep learning for cosmological parameter estimation from weak gravitational lensing maps. Their results were already better than the state-of-the-art traditional methods, but I challenged my students to do even better.
Weak lensing maps can be imagined as topographic maps, but instead of the heights of hills and valleys they trace the distribution of matter in the universe. The overall density of matter (Ωm) and the amplitude of the primordial fluctuations (σ8) are two crucial parameters that influence the size and distribution of peaks and valleys on a lensing map. Since individual maps cannot be directly compared, researchers use various statistics, such as the power spectrum or peak count histograms, to recover the original cosmological parameters. These methods are known to have limited capacity to extract all the information contained in the maps.
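To illustrate one such statistic, a peak count histogram locates local maxima on the convergence map and bins their heights. Here is a minimal sketch in Python; the function name, binning, and the strict 8-neighbour definition of a peak are my illustrative choices, not the exact convention of any particular study:

```python
import numpy as np

def peak_count_histogram(kappa, bins=10, value_range=(-0.05, 0.25)):
    """Count local maxima on a lensing convergence map and
    histogram their heights (illustrative binning)."""
    c = kappa[1:-1, 1:-1]  # interior pixels
    # stack the 8 neighbours of every interior pixel
    neigh = np.stack([
        kappa[:-2, :-2], kappa[:-2, 1:-1], kappa[:-2, 2:],
        kappa[1:-1, :-2],                  kappa[1:-1, 2:],
        kappa[2:, :-2],  kappa[2:, 1:-1],  kappa[2:, 2:],
    ])
    is_peak = (c > neigh).all(axis=0)  # strictly higher than all 8 neighbours
    hist, _ = np.histogram(c[is_peak], bins=bins, range=value_range)
    return hist
```

Comparing such a histogram against those from simulated maps with known parameters is one way the traditional analysis proceeds.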
The Haiman group was kind enough to share their lensing map data, so we could quickly start working on the problem. It did not take more than a couple of days to put together a proper network architecture, and Balint came back with results that significantly surpassed the improved estimates of the Haiman group. Reducing the error bars was a nice feat in itself, but the old question remained: what is the magic behind the results? Can we look into the "black box"? Investigating the "kernels" of neurons at various levels of the network, Dezso found one that was very "active" during the prediction and at the same time quite symmetric and simple looking. It was similar to the Roberts cross operator, which is routinely used in image processing for edge detection. We set up a method that, instead of the large intricate structure of the neural network, used only the Roberts cross operator to create a peak steepness histogram. This histogram could then be used to estimate cosmological parameters by comparing it to histograms computed from simulated weak lensing maps whose parameters are known. This simple method was as good as our advanced neural network and much better than any previous method.
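The core of the steepness idea can be sketched in a few lines. This is a hypothetical reconstruction, assuming the Roberts cross is applied as its usual pair of diagonal 2×2 difference kernels and the resulting gradient magnitudes are simply binned; the function name, bin settings, and normalization are my own illustrative choices, not the exact recipe of the paper:

```python
import numpy as np

def roberts_steepness_histogram(kappa, bins=20, value_range=(0.0, 0.2)):
    """Histogram of local gradient magnitudes computed with the
    Roberts cross operator (applied here via array slicing)."""
    # Roberts cross: differences along the two diagonals
    gx = kappa[:-1, :-1] - kappa[1:, 1:]
    gy = kappa[:-1, 1:] - kappa[1:, :-1]
    steepness = np.hypot(gx, gy)  # local gradient magnitude
    hist, _ = np.histogram(steepness, bins=bins, range=value_range)
    return hist / hist.sum()  # normalize to compare maps of different sizes
```

Parameter estimation then amounts to comparing such a histogram against those from simulations with known (Ωm, σ8), e.g. via a simple distance measure between histograms.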
It is quite interesting that a simple idea, well known among image processing researchers, never crossed the minds of cosmologists, who prefer the power spectrum or correlation functions for such analyses. Scientists often borrow successful techniques from other disciplines, but in this case it took an artificial neural network (though with some human help) to make this cross-disciplinary discovery.
Our paper in Nature Astronomy is here.