General aspects of using McMule

In this section, we will collect a few general points of interest regarding McMule. In particular, we will discuss heuristics on how much statistics is necessary for different contributions in Section Statistics. This is followed by a more in-depth discussion of the analysis strategy in Section Analysis.

Flavour

McMule implements generic processes that are connected to physical processes by setting the masses of the particles. For example, m2enng0 can refer to \(\mu\to e\nu\bar\nu\gamma\), \(\tau\to e\nu\bar\nu\gamma\), or \(\tau\to\mu\nu\bar\nu\gamma\), depending on the values of Mm and Me. These parameters are controlled by flavour as set during initflavour(). Further, initflavour() also sets the centre-of-mass energy scms for \(2\to n\) processes. We provide flavours for all relevant combinations as well as a few presets for scms as shown in Table 3. For processes that support it, appending a + or - can switch the charge of one of the invloved flavours.

Table 3 The different `flavour` in McMule
Flavour	Mapping			Default centre-of-mass energy
Flavour	`t`	`m`	`e`	Default centre-of-mass energy
`mu-e`		μ	e	no default
`tau-mu`		τ	μ	no default
`tau-e`		τ	e	no default
`tau-mue`	τ	μ	e	no default
`tau-emu`	τ	e	μ	no default
`mu-0`		μ	0	no default
`tau-0`		τ	0	no default
`e-p`		p	e	no default
`mu-p`		p	μ	no default
`muone`		μ	e	\(m_e^2+m_\mu^2+2m_e\times150\,{\rm GeV}\)
`mesa`		p	e	\(m_e^2+m_p^2+2m_p\times155\,{\rm MeV}\)
`belleTau`		τ	e	\(M_{\Upsilon(4S)}^2\)

Statistics

McMule is a Monte Carlo program. This means it samples the integrand at \(N\) (pseudo-)random points to get an estimate for the integral. However, because it uses the adaptive Monte Carlo integration routine vegas [15], we split \(N=i\times n\) into \(i\) iterations (itmx), each with \(n\) points (ncall). After each iteration, vegas changes the way it will sample the next iteration based on the results of the previous one. Hence, the performance of the integration is a subtle interplay between \(i\) and \(n\) – it is not sufficient any more to consider their product \(N\).

Further, we always perform the integration in two steps: a pre-conditioning with \(i_{\text{ad}}\times n_{\text{ad}}\) (ncall_ad and itmx_ad, respectively), that is used to optimise the integration strategy and after which the result is discarded, and a main integration that benefits from the integrator’s understanding of the integrand.

Of course there are no one-size-fits-all rules of how to choose the \(i\) and \(n\) for pre-conditioning and main run. However, the following heuristics have proven helpful:

\(n\) is always much larger than \(i\). For very simple integrands, \(n=\mathcal{O}(10\cdot 10^3)\) and \(i=\mathcal{O}(10)\).
Increasing \(n\) reduces errors that can be thought of as systematic because it allows the integrator to ‘discover’ new features of the integrand. Increasing \(i\) on the other hand will rarely have that effect and only improves the statistical error. This is especially true for distributions
There is no real limit on \(n\), except that it has to fit into the datatype used – integrations with \(n=\mathcal{O}(2^{31}-1)\) are not too uncommon – while \(i\) is rarely (much) larger than 100.
For very stringent cuts it can happen that that typical values of \(n_{\text{ad}}\) are too small for any point to pass the cuts. In this case vegas will return NaN, indicating that no events were found. Barring mistakes in the definition of the cuts, a pre-pre-conditioning with extremely large \(n\) but \(i=1\!-\!2\) can be helpful.
\(n\) also needs to be large enough for vegas to reliably find all features of the integrand. It is rarely obvious that it did, though sometimes it becomes clear when increasing \(n\) or looking at intermediary results as a function of the already-completed iterations.
The main run should always have larger \(i\) and \(n\) than the pre-conditioning. Judging how much more is a delicate game though \(i/i_{\text{ad}} = \mathcal{O}(5)\) and \(n/n_{\text{ad}} = \mathcal{O}(10\!-\!50)\) have been proven helpful.
If, once the integration is completed, the result is unsatisfactory, take into account the following strategies
- A large \(\chi^2/\rm{d.o.f.}\) indicates a too small \(n\). Try to increase \(n_{\text{ad}}\) and, to a perhaps lesser extent, \(n\).
- Increase \(i\). Often it is a good idea to consciously set \(i\) to a value so large that the integrator will never reach it and to keep looking at ‘intermediary’ results.
- If the error is small enough for the application but the result seems incorrect (for example because the \(\xi_{c}\) dependence does not vanish), massively increase \(n\).
Real corrections need much more statistics in both \(i\) and \(n\) (\(\mathcal{O}(10)\) times more for \(n\), \(\mathcal{O}(2)\) for \(i\)) than the corresponding LO calculations because of the higher-dimensional phase-space.
Virtual corrections have the same number of dimensions as the LO calculation and can go by with only a modest increase to account for the added functional complexity.
vegas tends to underestimate the numerical error.

These guidelines are often helpful but should not be considered infallible as they are just that – guidelines.

McMule is not parallelised; however, because Monte Carlo integrations require a random seed anyway, it is possible to calculate multiple estimates of the same integral using different random seeds \(z_1\) and combining the results obtained this way. This also allows to for a better, more reliable understanding of the error estimate.

Analysis

Once the Monte Carlo has run, an offline analysis of the results is required. This entails loading, averaging, and combining the data. This is automatised in pymule but the basic steps are

Load the data into a suitable analysis framework such as python.
Combine the different random seeds into one result per contribution and \(\xi_{c}\). The \(\chi^2/{\rm d.o.f.}\) of this merging must be small. Otherwise, try to increase the statistics or choose of different phase-space parametrisation.
Add all contributions that combine into one of the physical contributions (18). This includes any partitioning done in Section Phase-space generation.
(optional) At N\(^\ell\)LO, perform a fit[1]

(2)\[\sigma_{n+j}^{(\ell)} = c_0^{(j)} + c_1^{(j)} \log\xi_{c} + c_2^{(j)} \log^2\xi_{c} + \cdots + c_\ell^{(j)} \log^\ell = \sum_{i=0}^\ell c_i^{(j)}\log^i\xi_{c}\]

This has the advantage that it very clearly quantifies any residual \(\xi_{c}\) dependence. We will come back to this issue in Section Study of \xi_{c} dependence.
Combine all physical contributions of (17) into \(\sigma^{(\ell)}(\xi_{c})\) which has to be \(\xi_{c}\) independent.
Perform detailed checks on \(\xi_{c}\) independence. This is especially important on the first time a particular configuration is run. Beyond NLO, it is also extremely helpful to check whether the sum of the fits (2) is compatible with a constant, i.e. whether for all \(1\le i\le\ell\)

(3)\[\Bigg| \frac{\sum_{j=0}^\ell c_i^{(j)} } {\sum_{j=0}^\ell \delta c_i^{(j)} } \Bigg| < 1\]

where \(\delta c_i^{(j)}\) is the error estimate on the coefficient \(c_i^{(j)}\).[2] pymule’s mergefkswithplot() can be helpful here.

If (3) is not satisfied or only very poorly, try to run the Monte Carlo again with an increased \(n\).
Merge the different estimates of (17) from the different \(\xi_{c}\) into one final number \(\sigma^{(\ell)}\). The \(\chi^2/{\rm d.o.f.}\) of this merging must be small.
Repeat the above for any distributions produced, though often bin-wise fitting as in Point 3 is rarely necessary or helpful.

If a total cross section is \(\xi_{c}\) independent but the distributions (or a cross section obtained after applying cuts) are not, this is a hint that the distribution (or the applied cuts) is not IR safe.

These steps have been almost completely automatised in pymule and Mathematica. Though all steps of this pipeline could be easily implemented in any other language by following the specification of the file format below (Section Differential distributions and intermediary state files).

Manual compilation

You might need to compile McMule manually if you are not using a sufficiently recent Linux distribution or want to work it on yourself. In this case, you first need to obtain a copy of the McMule source code. We recommend the following approach

$ git clone --recursive https://gitlab.com/mule-tools/mcmule

To build McMule, you will need

Python 3.8 or newer

Meson 0.64.0 or newer

ninja 1.8.2 or newer

gcc 4.8 or newer incl. GFortran

You can install these dependencies for example using

$ sudo dnf install python3 meson ninja-build gcc gcc-gfortran # on RHEL derivates
$ sudo apt install python3 python3-venv python3-pip meson ninja-build gcc gfortran # on Debian derivates incl. Ubuntu
$ sudo brew install meson ninja gcc gfortran  # on macOS, requires homebrew

Next, you will need to install the pymule python package which is used during the building, running, and analysing of McMule. It can be used from the tools/ folder of the McMule repository but it is recommended that the user installs it

$ pip3 install pymule --index-url https://gitlab.com/api/v4/projects/19995364/packages/pypi/simple

Please note that most modern linux distributions have implemented PEP 668 and will refuse to install packages using pip3 outside of a virtual environment. If the above command fails, you should create a virtual environment to work in

$ python3 -m venv mcmule-environment
$ source mcmule-environment/bin/activate
$ pip3 install pymule --index-url https://gitlab.com/api/v4/projects/19995364/packages/pypi/simple

For the development version of pymule, you can run

$ pip3 install pymule --index-url https://gitlab.com/api/v4/projects/19995364/packages/pypi/simple --pre -U

instead. You will have to run the source command every time you start working with McMule. We apologise for the inconvenience this causes.

Now you need to configure and build McMule using meson and ninja

$ meson setup build
$ ninja -C build

Note that this will distribute the build on as many CPUs as your machine has which can cause memory issues. If you do not want to do that, add -j <number of jobs> flag to the ninja command. Despite the parallelisation, a full build of McMule is can take up to 1h, depending on your machine. If you only need to compile some parts of McMule (such as Bhabha scattering), you can control which process groups are build

$ meson setup build -Dgroups=mue,ee

If you need debug symbols, you can disable optimisation

$ meson setup build --buildtype=debug