Structure of McMule

McMule is written in Fortran 95 with helper and analysis tools written in python [1]. The code is written with two kinds of applications in mind. First, several processes are implemented, some at NLO, some at NNLO. For these, the user can define an arbitrary (infrared safe), fully differential observable and compute cross sections and distributions. Second, the program is set up such that additional processes can be implemented by supplying the relevant matrix elements.

The main part of McMule is the libmcmule library which consists of several modules with a simple, mostly hierarchic structure. The mcmule executable loads this library which in turn loads the measurement function defined by the user.

The user function

For a user of the code who wants to run an already implemented process, this is the only relevant part. The user function is defined in a (usually dynamically loaded) library and provides the measurement function.

At the beginning of the module, the user has to specify the number of quantities to be computed, user/nq, the number of bins in the histogram, user/nbins, as well as their lower and upper boundaries, user/min_val and user/max_val. The last three quantities are arrays of length user/nq. The quantities themselves, i.e. the measurement function, are to be defined by the user in terms of the momenta of the particles in user/quant(). Cuts can be applied by setting the logical variable mcmule/pass_cut to false [2]. Some auxiliary functions like (pseudo)rapidity, transverse momentum etc. are predefined in mcmule as documented in Section libmcmule user guide. Each quantity has to be given a name through the array mcmule/names.

Further, user contains a subroutine called quant/inituser(). This allows the user to read additional input at runtime, for example which of multiple cuts should be applied. It also allows the user to print some information on the configuration implemented. Needless to say, it is a good idea to do this for documentation purposes.
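Putting these pieces together, a minimal user module might look as follows. This is a sketch based on the structure described above; the number of momentum arguments of quant(), the default values, and the cut are illustrative and have to be adapted to the process at hand (here we assume the energy is stored in the fourth component of each momentum).

module user
  use mcmule  ! provides prec, pass_cut, names, and auxiliary functions
  implicit none
  integer, parameter :: nq = 1                     ! one quantity / histogram
  integer, parameter :: nbins(nq) = (/ 90 /)       ! 90 bins
  real(kind=prec) :: min_val(nq) = (/  0._prec /)  ! lower histogram edge
  real(kind=prec) :: max_val(nq) = (/ 53._prec /)  ! upper histogram edge

contains

  subroutine inituser()
    ! read extra input at runtime and document the configuration
    print*, "example: electron energy spectrum, cut at Ee > 10"
  end subroutine inituser

  function quant(q1, q2, q3, q4, q5)
    real(kind=prec), intent(in) :: q1(4), q2(4), q3(4), q4(4), q5(4)
    real(kind=prec) :: quant(nq)

    pass_cut = .true.
    if (q2(4) < 10._prec) pass_cut = .false.  ! illustrative cut on the electron energy

    names(1) = 'Ee'     ! name of the quantity
    quant(1) = q2(4)    ! energy of the outgoing electron
  end function quant

end module user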

Once written (or more likely adapted from a template), the user function is compiled into a shared library so it can be loaded by libmcmule:

$ gfortran -fPIC --shared user.f95 -o user.so

It is possible to write a user function in languages other than Fortran, and an mcmule.h header file is provided for C and C++. However, many of the quality-of-life functions of mcmule are only available in Fortran (cf. Section Symbols exposed to C/C++).

Modular structure of libmcmule

The relation between the most important Fortran modules is depicted in Figure 5. A solid arrow indicates “using” the full module, whereas a dashed arrow is indicative of partial use. In what follows we give a brief description of the various modules and mention some variables that play a prominent role in the interplay between the modules.

graph TD
  subgraph "libmcmule"
    openloops(openloops) --> olinterface
    global_def --> olinterface
    collier(collier) --> functions
    global_def --> functions
    functions --> phase_space
    olinterface --> pgmatel
    subgraph " "
      pgmatel["{pg}_mat_el"] --> pg["{pg}"]
    end
    pg --> mat_el
    mat_el --> integrands
    functions --> user_dummy
    phase_space --> integrands
    user_dummy --> integrands
    integrands --> mcmule_header
    vegas --> mcmule_header
    phase_space -.->|"ksoft"| pgmatel
    user_dummy -.-> |"metadata"| vegas
    vegas -.-> |"bin_it"| integrands
    integrands --> test
    vegas --> test
  end
  mcmule_header --> mcmule
  user_dummy <--> |"loads and calls"| user["Measurement function"]

Figure 5 The structure of McMule


global_def

This module simply provides some parameters such as fermion masses that are needed throughout the code. It also defines real(kind=prec) as a generic type for the precision used. [3] Currently, this simply corresponds to double precision.


functions

This module is a library of basic functions that are needed at various points in the code. This includes dot products, eikonal factors, the integrated eikonal, and an interface for scalar integral functions, among others.


collier

This is an external module [3, 4, 5, 6]. It will be linked to McMule during compilation and provides the numerical evaluations of the scalar and in some cases tensor integral functions in functions.


phase_space

The routines for generating phase-space points and their weights are collected in this module. Phase-space routines ending in FKS are prepared for the FKS subtraction procedure with a single unresolved photon. In the weight of such routines a factor \(\xi_1\) is omitted to allow the implementation of the distributions in the FKS method; the omitted factor is made available through the global variable xiout1 and has to be included in the integrand of the module integrands. The module also provides the variable qsoft, which corresponds to the photon momentum without the (vanishing) energy factor \(\xi_1\). Routines ending in FKSS are routines with two unresolved photons. Correspondingly, a factor \(\xi_1\,\xi_2\) is missing in the weight, and xioutA and xioutB, as well as qsoftA and qsoftB, are provided. To ensure numerical stability it is often required to tune the phase-space routine to a particular kinematic situation.
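In formulas, the bookkeeping for a single unresolved photon is as follows (schematic notation of ours, not McMule identifiers except xiout1): if \(w\) denotes the full phase-space weight, the FKS routines return

\[ \tilde w = w/\xi_1\,, \qquad \xi_1 = {\tt xiout1}\,, \]

and the integrand has to restore the factor \(\xi_1\) (or the corresponding distributional prescription) when assembling the event weight. For FKSS routines the same holds with \(\xi_1 \to \xi_1\,\xi_2\).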


openloops

This is the external OpenLoops library [1, 2] that we use for some real-virtual matrix elements. It is pulled as a git submodule and linked to McMule during compilation.


olinterface

This connects openloops to the rest of McMule by initialising OpenLoops for the process under consideration and converting to and from the OpenLoops conventions, which are slightly different from the ones used by McMule.


{pg}_mat_el

Matrix elements are grouped into process groups such as muon decay (mudec) or \(\mu\)-\(e\) and \(\mu\)-\(p\) scattering (mue). Each process group contains a mat_el module that provides all matrix elements for its group. Simple matrix elements are coded directly in this module. More complicated results are imported from sub-modules not shown in Figure 5. A matrix element starting with P contains a polarised initial state. A matrix element ending in av is averaged over a neutrino pair in the final state.


{pg}

In this module the soft limits of all applicable matrix elements of a process group are provided to allow for the soft subtractions required in the FKS scheme. These limits are simply the eikonal factor evaluated with qsoft from phase_space times the reduced matrix element, provided through mat_el.
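Schematically (in our notation, not McMule's), the quantities provided here are the limits

\[ \lim_{\xi_1\to 0}\,\xi_1^2\,\mathcal{M}_{n+1} \;=\; \mathcal{E}(q_{\rm soft})\times\mathcal{M}_{n}\,, \]

with \(\mathcal{E}\) the eikonal factor evaluated with qsoft and \(\mathcal{M}_{n}\) the reduced matrix element from mat_el.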

This module also functions as the interface of the process group, exposing all necessary functions that are imported by mat_el.

mat_el

This module collects all matrix elements of all process groups, as well as their particle labelling or PID.


user_dummy

This interfaces to the (usually dynamically loaded) measurement function defined by the user.


vegas

As the name suggests, this module contains the adaptive Monte Carlo routine vegas [15]. The binning routine bin_it is also in this module, hence the need for the binning metadata, i.e. the number of bins and histograms as well as their bounds (min_val and max_val) and names, from user_dummy.
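The logic of such a binning routine can be sketched as follows; this is an illustration of the idea, not McMule's actual implementation, with the under- and overflow bins stored at the ends of the histogram array.

! Illustrative sketch of histogram filling with under-/overflow bins
! (not McMule's actual code).
subroutine bin_sketch(x, wgt, minv, maxv, nbin, hist)
  implicit none
  real, intent(in) :: x, wgt, minv, maxv
  integer, intent(in) :: nbin
  real, intent(inout) :: hist(0:nbin+1)  ! 0: underflow, nbin+1: overflow
  integer :: i
  i = int(floor((x - minv) / (maxv - minv) * nbin)) + 1
  if (i < 1) i = 0            ! x below minv: underflow bin
  if (i > nbin) i = nbin + 1  ! x above maxv: overflow bin
  hist(i) = hist(i) + wgt
end subroutine bin_sketch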


integrands

In this module the functions that are to be integrated by vegas are coded. There are three types of integrands: non-subtracted, single-subtracted, and double-subtracted integrands, corresponding to the three parts of the \({\rm FKS}^2\) scheme [8, 25]. The matrix elements to be evaluated and the phase-space routines used are set using function pointers through a subroutine initpiece. The factors \(\xi_i\) that were omitted in the phase-space weight have to be included here for the single- and double-subtracted integrands.


mcmule_header

This is the main part of the library; it initialises and runs McMule with configurations passed by the main executable. It also defines the module mcmule which contains runmcmule().


test

For development purposes, a separate main program exists that is used to validate the code after each change. Reference values for matrix elements and results of short integrations are stored here and compared against.


mcmule

This is the main executable, which does little more than read the inputs and call mcmule_header.

The library of matrix elements deserves a few comments. As matrix elements quickly become very large, we store them separately from the main code. This also makes it easy to extend the program, since it minimises the code that needs to be changed.

We group matrix elements into process groups, generic processes, and generic pieces as indicated in Appendix Available processes and which_piece. The generic process is a prototype for the physical process such as \(\ell p\to\ell p\) where the flavour of the lepton \(\ell\) is left open. The generic piece describes a part of the calculation such as the real or virtual corrections, i.e. the different pieces of (8) (or correspondingly (14) at NNLO), that themselves may be further subdivided as is convenient. In particular, in some cases a generic piece is split into various partitions (cf. Section Phase-space generation for details on why that is important).

What happens when running

In the following we discuss what happens behind the scenes when asking McMule to perform the calculation of m2enng0 in Section Simple runs at LO.

  1. When started, the first thing mcmule does is load the measurement function from a shared library using the load_quant() routine.

  2. Next, mcmule reads options from stdin as specified in Table 1. It then passes these values to the runmcmule() function of mcmule.

  3. Once McMule knows its configuration, it assigns the numerical values of the masses according to the flavour specified by the user. In particular, we set the generic masses Mm and Me to Mtau and Mel. This is done in initflavour(), defined in global_def. For other processes this might also involve setting e.g. centre-of-mass energies scms to default values.

  4. Next, the function to be integrated by vegas is determined. This is a function stored in integrands. There are basically three types of integrands: a standard, non-subtracted integrand, sigma_0, a single-subtracted integrand needed beyond LO, sigma_1, and a double-subtracted integrand needed beyond NLO, sigma_2. Which integrand is needed and which matrix elements and phase-space routine it depends on is determined by calling the function init_piece, which uses the variable which_piece to point function pointers at the necessary procedures. For our LO case, init_piece sets the integrand to sigma_0 and fixes the dimension of the integration to \({\tt ndim}=8\).

  5. The matrix element pointer is assigned to the matrix element that needs to be called, Pm2enngAV(q1,n1,q2,q3,q4,q5). The name of the function suggests we compute \(\mu(q_1,n_1)\to [\nu(q_3)\,\bar\nu(q_4)]\,e(q_2)\,\gamma(q_5)\) with the polarisation vector n1 of the initial lepton. Even though we average over the neutrinos, their momenta are still given for completeness.

  6. The interplay between the function sigma_0(x,wgt,ndim) and vegas is as usual, through an array of random numbers x of length ndim that corresponds to the dimension of the integration. In addition there is the vegas weight of the event, wgt, due to the Jacobian introduced by the importance sampling. The function sigma_0 simply evaluates the complete weight wg of a particular event by combining wgt with the matrix element supplemented by symmetry, flux, and phase-space factors.

    1. In a first step a phase-space routine of phase_space is called. For our LO calculation, init_piece pointed the corresponding pointer to psd5_25(), a phase-space routine optimised for radiative lepton decays (cf. Section Phase-space generation). It generates the momenta with correct masses as well as the phase-space weight weight.

    2. With these momenta, the observables to be computed are evaluated with a call to user/quant(). If the event passes the cuts, i.e. the variable pass_cut was not set to false, the event is kept.

    3. This triggers the computation of the matrix element and the assembly of the full weight.

    4. In a last step, the routine bin_it, stored in vegas, is called to fill the weight into the correct bins of the various distributions. If a quantity under- or overshoots the bounds specified by user/min_val and user/max_val, the event is placed into dedicated, infinitely large under- and overflow bins.

    These steps are repeated for all events, and those after preconditioning are used to obtain the final distributions.

  7. After preconditioning the state of the integrator is reset, as is usual.

  8. During the main run, the code generates a statefile from which the full state of the integrator can be reconstructed should the integration be interrupted (cf. Section Differential distributions and intermediary state files for details). This makes the statefile ideal to also store results in a compact format.

  9. The value and error estimate of the integration are printed to stdout.
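The interplay of steps 4 to 6 can be sketched as follows. This is pseudocode for a non-subtracted integrand; the pointer names (gen_mom, mat_el_ptr) and the variable prefactor are illustrative, not McMule's actual identifiers.

! Sketch of the flow of a non-subtracted integrand such as sigma_0
! (illustrative names, not McMule's actual code).
function sigma_0(x, wgt, ndim)
  integer :: ndim
  real(kind=prec) :: x(ndim), wgt, sigma_0
  real(kind=prec) :: q1(4), n1(4), q2(4), q3(4), q4(4), q5(4)
  real(kind=prec) :: weight, quants(nq)

  ! step 6.1: phase-space point and weight from the routine set by init_piece
  call gen_mom(x, q1, n1, q2, q3, q4, q5, weight)

  ! step 6.2: user observables and cuts
  quants = quant(q1, q2, q3, q4, q5)

  sigma_0 = 0._prec
  if (pass_cut) then
    ! step 6.3: full weight = matrix element x phase-space weight x prefactors
    sigma_0 = mat_el_ptr(q1, n1, q2, q3, q4, q5) * weight * prefactor

    ! step 6.4: fill the distributions, including the vegas weight
    call bin_it(sigma_0 * wgt, quants)
  end if
end function sigma_0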

To analyse these results, we provide a python tool, pymule, in addition to the main code of McMule. pymule uses numpy [26] for data storage and matplotlib [13] for plotting. While pymule works with any python interpreter, IPython [22] is recommended. We will encounter pymule in Section Analysing the results when we discuss how to use it to analyse results. A full list of functions provided can be found in Appendix pymule user guide.