Research Software Training

The Framework for Integrated Research Software Training in High Energy Physics (FIRST-HEP) project aims to develop a community framework for software training to prepare the scientific and engineering workforce needed for the computing challenges of HEP experiments.

High energy physics (HEP) research aims to make discoveries which provide a deeper understanding of the fundamental building blocks of nature and their interactions. The most successful theory to date is known as the “Standard Model” of particle physics. Although it describes an impressive number of experimental observations, it is known to be an incomplete description of nature. The search for what lies “Beyond the Standard Model” is the scientific driver for many ongoing experiments as well as for the design of large, next-generation facilities such as the High-Luminosity LHC (HL-LHC) at CERN and the Long-Baseline Neutrino Facility (LBNF) at Fermilab. Doing science with these facilities typically requires the development of sophisticated software capable of analyzing the enormous samples of experimental data collected, and people are the key to developing this software. To be successful, they require a mix of deep physics domain knowledge and advanced software skills.

Today, software-related training support for the HEP workforce is uneven: a patchwork of training activities with some significant holes. Although many universities do provide some relevant computer science, software engineering and introductory “data science” courses, many graduate students are not required to take these classes as part of their graduate curriculum. Large HEP experiments often provide some training for collaborators to learn the specific software tools used and/or developed by the experiments. In these cases the goal is primarily to make new collaborators effective users of the complex experiment software ecosystems, rather than empowered contributors to those ecosystems. Some contributor-focused training activities do exist, but typically as an “add-on” activity of other projects or as a volunteer side effort by members of the community. Even when the individual standalone training efforts are successful, the fragmentation of the ensemble limits the impact, long-term sustainability and vitality of the activities. The patchwork nature of existing training activities also fails to provide a clear progression path from basic to more advanced skills. Researchers often try to “run before they can walk”.

How can such training be improved? One key insight is the need to think of training not as a set of individual, disconnected activities, but as part of a larger community framework, as shown here:

[Figure: Training Framework]

The goals of the FIRST-HEP project are:

  • To work with the HEP community to build a consensus around this vision and a collaborative framework among national and international entities for implementing the vision
  • To develop a Carpentries-style introductory HEP software training curriculum, and a community of instructors, which is seen nationally and internationally as the entrance point into the overall training vision
  • To develop a more advanced training school within the U.S. for the relevant software skills (building on the existing Computational and Data Science for High Energy Physics (CoDaS-HEP) school) as well as an international federation of similar advanced training schools in HEP and beyond