ALEPH Workshop @ NIPS 2015

Applying (machine) Learning to Experimental Physics (ALEPH) and «Flavours of Physics» challenge

When: 11th of December 2015, 8:30 - 18:30
Where: room 515 bc, NIPS, Montreal, Canada

Experimental physics actively pushes the frontiers of our knowledge of the Universe and ranges from macroscopic objects observed through telescopes to the micro-world of particle interactions. In each field of study, scientists go from raw measurements (spectra of celestial objects or energies of particles detected inside collider detectors) to higher-level representations that are more suitable for further analysis and for human perception. Each measurement can be used to support or refute theories that compete in predictive power and completeness.

Many areas of experimental physics assimilated computational paradigms a long time ago: both simulators and semi-automatic data analysis techniques have been applied widely for decades. In particular, nonparametric classification and regression are now routinely used as parts of the reconstruction (inference) chain. More recently, state-of-the-art budgeted learning techniques have also started to be used for real-time event selection at the LHC, and last year saw the first use of deep learning techniques in astrophysics. Nevertheless, most of these applications have gone largely unnoticed by the machine learning (ML) community.

The proposed workshop is the second of its kind, continuing the series that started with the HEPML workshop held at NIPS 2014. Its goals are to improve the dialog between the two communities, to continue introducing exciting scientific questions and open problems in computational statistics and machine learning to the NIPS community, and to facilitate the transfer of state-of-the-art techniques developed in ML to the Experimental Physics community.

The main incentive for proposing this workshop is the «Flavours of Physics» machine learning challenge, run by the workshop organizers. Like the similar HiggsML challenge, it deals with physics being studied at the Large Hadron Collider. However, there are a number of considerable differences:

Unlike last year, when the Higgs boson had already been discovered, the aim of this year's challenge is to find a phenomenon that is not yet known to exist: charged lepton flavour violation. Its observation would be a sign of long-sought “new physics”.
Unlike last year, when only simulated data was provided, this challenge uses real data from the LHCb experiment at the LHC, mixed with simulated datasets of the decay under study.
It is the first time such a challenge has been designed to be directly useful to physicists in future studies. That is why the metric introduced in the challenge includes the checks that physicists always perform in real analyses to make sure their results are unbiased, which helps the results of the challenge to be used in further high energy physics research.

Although the “Flavours of Physics” challenge is our main topic, the workshop will cover a broader interface between Physics and ML. The following topics will be discussed in the workshop.

Lessons learned at the HEPML workshop and the HiggsML challenge. Balázs Kégl, one of the organizers of the previous workshop on a similar topic, will give an overview of the outcomes of that workshop and the challenge, showing how they influenced both the machine learning and the HEP communities and which similar lessons can be learned from the results of the “Flavours of Physics” challenge.

Systematic uncertainties. Unlike in classical applications of nonparametric classification, the training data for classifiers in HEP is usually simulated, and learning is used for approximation/interpolation rather than for estimation. At the same time, known model uncertainties cause shifts in the generating distribution that have to be quantified both when evaluating the trained classifiers and during training. The main challenge is to design objective functions and automated methods that take model uncertainties into consideration and are robust to model shifts. The invited talks by Balázs Kégl and Kyle Cranmer will touch on these issues and raise them in the follow-up panel discussion.

Deep learning. We anticipate that the application of deep learning to Experimental Physics problems will remain a hot topic, as it was at the HEPML workshop, so we will continue raising it at our workshop as well. Feature extraction in Physics data analysis tasks is usually manual and requires painstaking work based on prior knowledge. It typically includes reconstructing particle tracks in various detectors, estimating physical quantities (directions, momenta, particle types), and extracting observables with high discriminative power. One of the most exciting questions, still very relevant today, is whether deep learning, successful in similarly structured vision and speech recognition tasks, could be used to automatically learn representations and features from raw detector data. The invited talks by Jürgen Schmidhuber and Michael Williams will be devoted to these opportunities and will initiate relevant discussions after their keynotes and during the planned panel.

Open problems. Experimental Physics is a very rich field in terms of open problems and issues. Daniel Whiteson will cover problems of Experimental Physics from the point of view of data scientists and describe novel challenges that require novel solutions from the modern machine learning community.