Likelihoods

Nafi includes a few simple likelihoods for testing and demonstration purposes.

In more complex situations you will want to evaluate likelihoods and enumerate/simulate outcomes outside of nafi, then feed these into nafi’s inference functions.

Single-bin counting experiment

Methods for producing likelihood ratios for a counting experiment with background.

nafi.likelihoods.counting.lnl_and_weights(mu_sig, mu_bg, n_max=None, return_outcomes=False)

Return (lnl, toy_weight) for a counting experiment with background.

Both are (n_outcomes, hypotheses) arrays: lnl contains the log likelihood at each hypothesis, and toy_weight contains P(outcome | hypotheses), normalized over outcomes.

Parameters:

mu_sig – Array with signal rate hypotheses
mu_bg – Background rate (scalar)
n_max – Largest number of events to consider. If None, will be determined automatically from mu_sig and mu_bg.
return_outcomes – If True, return a third array of shape (n_outcomes,) containing the number of events for each outcome.
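As a rough illustration of the two returned arrays, the following NumPy sketch builds a Poisson log likelihood and outcome weights by hand. It is not nafi's implementation: the function name poisson_lnl_and_weights is hypothetical, and n_max is passed explicitly rather than determined automatically.

```python
import numpy as np

def poisson_lnl_and_weights(mu_sig, mu_bg, n_max):
    """Illustrative stand-in (hypothetical name), not nafi's own code."""
    mu_sig = np.asarray(mu_sig, dtype=float)
    n = np.arange(n_max + 1)                      # outcomes 0, 1, ..., n_max
    # log(n!) from a cumulative sum of logs; log(0!) = 0
    log_fact = np.concatenate([[0.0], np.cumsum(np.log(np.arange(1, n_max + 1)))])
    mu_tot = mu_sig + mu_bg                       # (n_hyp,), must be > 0
    # Poisson log pmf, shape (n_outcomes, n_hyp)
    lnl = n[:, None] * np.log(mu_tot[None, :]) - mu_tot[None, :] - log_fact[:, None]
    # P(outcome | hypothesis), renormalized over the truncated outcome range
    weights = np.exp(lnl)
    weights /= weights.sum(axis=0, keepdims=True)
    return lnl, weights

lnl, weights = poisson_lnl_and_weights(mu_sig=[0.0, 1.0, 5.0], mu_bg=2.0, n_max=50)
print(lnl.shape, weights.sum(axis=0))
```

Each column of weights sums to one, so n_max only needs to be large enough that the truncated tail is negligible for the largest hypothesis.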

nafi.likelihoods.counting.single_lnl(*, n, mu_sig, mu_bg)

Return log likelihood for a single counting experiment observation

Parameters:

n – Observed number of events
mu_sig – Signal rate hypothesis (array)
mu_bg – Background rate (scalar)

Two-bin counting experiment

A two-bin counting experiment without uncertainties.

The first bin expects f_sig_1 * mu_sig signal and mu_bg_1 background. The second bin expects (1 - f_sig_1) * mu_sig signal and mu_bg_2 background.

nafi.likelihoods.twobin.lnl_and_weights(mu_sig, f_sig_1, mu_bg_1, mu_bg_2, n_max=None, m_max=None, return_outcomes=False)

Return (lnl, toy_weight) for a two-bin counting experiment with background.

Both are (n_outcomes, hypotheses) arrays: lnl contains the log likelihood at each hypothesis, and toy_weight contains P(outcome | hypotheses), normalized over outcomes.

Parameters:

mu_sig – Array with signal rate hypotheses
f_sig_1 – Fraction of signal events in first bin
mu_bg_1 – Expected background in first bin
mu_bg_2 – Expected background in second bin
n_max – Largest number of events in the first bin to consider. If None, will be determined automatically from parameters.
m_max – Largest number of events in the second bin to consider. If None, will be determined automatically from parameters.
return_outcomes – If True, return a third value, a tuple of (n, m) arrays with the number of observed events for each possible outcome.

nafi.likelihoods.twobin.outcomes(n_max, m_max, ravel=True)

Return a 2-tuple of flat arrays, each of length (n_max + 1) * (m_max + 1), with the possible experimental outcomes of a two-bin experiment.

Parameters:

n_max – Maximum number of events in first bin to consider
m_max – Maximum number of events in second bin to consider
ravel – If False (default is True), return two 2D arrays of shape (n_max + 1, m_max + 1) instead.
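The shapes described above can be reproduced with a plain NumPy grid. This is a sketch only: the function name two_bin_outcomes is hypothetical, and the ordering of the flattened outcomes (first-bin count varying slowest) is an assumption that may differ from nafi's.

```python
import numpy as np

def two_bin_outcomes(n_max, m_max, ravel=True):
    """Enumerate all (n, m) count pairs; hypothetical helper, not nafi's code."""
    # indexing="ij" puts the first-bin count n on axis 0 and m on axis 1
    n, m = np.meshgrid(np.arange(n_max + 1), np.arange(m_max + 1), indexing="ij")
    if ravel:
        return n.ravel(), m.ravel()   # two flat arrays of equal length
    return n, m                       # two (n_max + 1, m_max + 1) arrays

n, m = two_bin_outcomes(2, 1)
print(n, m)
```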

nafi.likelihoods.twobin.total_mus(mu_sig, f_sig_1, mu_bg_1, mu_bg_2): Return (mu_1, mu_2) total expected number of events in both bins
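The expected totals follow directly from the per-bin expectations stated at the top of this section. A minimal sketch, with the hypothetical name total_mus_sketch standing in for the library function:

```python
import numpy as np

def total_mus_sketch(mu_sig, f_sig_1, mu_bg_1, mu_bg_2):
    """Expected counts per bin; illustrative only, not nafi's code."""
    mu_sig = np.asarray(mu_sig, dtype=float)
    mu_1 = f_sig_1 * mu_sig + mu_bg_1          # first bin: signal share + background
    mu_2 = (1 - f_sig_1) * mu_sig + mu_bg_2    # second bin: remaining signal + background
    return mu_1, mu_2

mu_1, mu_2 = total_mus_sketch(mu_sig=[0.0, 10.0], f_sig_1=0.8, mu_bg_1=1.0, mu_bg_2=3.0)
print(mu_1, mu_2)
```

The two totals always add up to mu_sig + mu_bg_1 + mu_bg_2, which is a quick sanity check on any choice of f_sig_1.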

Gaussian measurement

nafi.likelihoods.gaussian.lnl_and_weights(mu_hyp, n_x=1000, x_transform=<function _logit>, return_outcomes=False)

Return (lnl, toy_weight) for a Gaussian measurement of a parameter mu constrained to >= 0.

That is, the true value is at mu >= 0, and the experiment observes a single value X:

\[X \sim \mathrm{Norm}(\mu, 1)\]

Both results are (n_outcomes, hypotheses) arrays: lnl contains the log likelihood at each hypothesis, and toy_weight contains P(outcome | hypotheses), normalized over outcomes.

Parameters:

mu_hyp – Array with hypotheses
n_x – Number of x outcomes to consider
x_transform – Function to transform a uniform 0-1 grid (minus endpoints) to x-values to consider as possible outcomes. Defaults to logit.
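To make the role of x_transform concrete, the sketch below maps a uniform grid on (0, 1) through a logit and evaluates the unit-variance Gaussian log density at each grid point. It covers only the outcome grid and lnl; how nafi turns the grid into toy weights is not shown here, and the local logit function is a stand-in for the library's private _logit.

```python
import numpy as np

def logit(p):
    """Map (0, 1) to the real line; stand-in for the default x_transform."""
    return np.log(p) - np.log1p(-p)

n_x = 5
# Uniform grid on [0, 1] with the endpoints dropped, so the logit stays finite
p = np.linspace(0, 1, n_x + 2)[1:-1]
x = logit(p)                                   # candidate outcomes, dense near 0
mu_hyp = np.array([0.0, 1.0, 2.0])
# Log density of Norm(mu, 1) at each x, shape (n_x, n_hyp)
lnl = -0.5 * (x[:, None] - mu_hyp[None, :]) ** 2 - 0.5 * np.log(2 * np.pi)
print(x)
print(lnl.shape)
```

A logit grid concentrates points near x = 0 while still reaching into both tails, which is one plausible reason for using it as the default.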

Extended unbinned likelihood

Classes for modelling likelihood ratios for unbinned likelihoods without nuisance parameters.

class nafi.likelihoods.unbinned.TwoGaussians

Simulation and (extended) unbinned log likelihood for a signal and background that are both Gaussians with unit variance, but different means.

Takes a single parameter, sigma_sep, the distance between the signal and background means.

differential_rate(x, mu_sig, mu_bg, params): Return differential rate for events with observed x

lnl_and_weights(mu_sig_hyp, mu_bg, key=None, n_sig_max=None, n_bg_max=None, trials_per_n=10000, return_outcomes=False, **params)

Return (lnl, toy_weight) for an unbinned signal/background likelihood.

Both are (n_outcomes, hypotheses) arrays: lnl contains the log likelihood at each hypothesis, and toy_weight contains P(outcome | hypotheses), normalized over outcomes.

Parameters:

mu_sig_hyp – Array with expected signal event hypotheses
mu_bg – Expected background events (scalar)
key – JAX PRNG key to use. If not provided, will choose a random one according to the numpy global random state.
n_sig_max – Largest number of signal events to consider. If None, will be determined automatically from mu_sig_hyp.
n_bg_max – Largest number of background events to consider. If None, will be determined automatically from mu_bg.
trials_per_n – Number of MC trials to use per signal event count.
return_outcomes – If True, return a third array of shape (n_outcomes,) containing a summary statistic of the outcome for each toy. The default (set by the summary method) is the total number of observed events.

simulate_background(n_trials, n_max, key, params): Simulate (n_trials, n_max) background events

simulate_signal(n_trials, n_max, key, params): Simulate (n_trials, n_max) signal events

single_lnl(x, mu_sig, mu_bg, **params)

Return log likelihood for a single observation

Parameters:

x – Observed values (n_events,) array
mu_sig – Rate hypotheses, (n_hypotheses,) array
mu_bg – Background rate (scalar)

Returns:

(n_hypotheses,) array of log likelihoods
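For orientation, the standard extended unbinned log likelihood is -(mu_sig + mu_bg) plus the sum over events of the log differential rate. The NumPy sketch below evaluates it for two unit-variance Gaussians. It is not nafi's implementation: the function names are hypothetical, and placing the background at 0 and the signal at sigma_sep is an assumption about the convention.

```python
import numpy as np

def norm_pdf(x, mean):
    """Unit-variance Gaussian density."""
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2 * np.pi)

def extended_unbinned_lnl(x, mu_sig, mu_bg, sigma_sep):
    """Extended unbinned lnL; assumes background mean 0, signal mean sigma_sep."""
    x = np.asarray(x, dtype=float)             # (n_events,)
    mu_sig = np.asarray(mu_sig, dtype=float)   # (n_hypotheses,)
    # Differential rate at each event for each hypothesis, (n_events, n_hypotheses)
    rate = (mu_sig[None, :] * norm_pdf(x, sigma_sep)[:, None]
            + mu_bg * norm_pdf(x, 0.0)[:, None])
    # Poisson term for the total expectation plus the per-event log rates
    return -(mu_sig + mu_bg) + np.log(rate).sum(axis=0)

lnl = extended_unbinned_lnl(x=[-0.3, 0.4, 2.1], mu_sig=[0.0, 1.0, 3.0],
                            mu_bg=2.0, sigma_sep=2.0)
print(lnl)
```

With no observed events the sum is empty and the result reduces to -(mu_sig + mu_bg), the log probability of seeing zero events.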

summary(x, present, params)

Return a summary statistic of the data.

The number must be additive over events. This function is called once for signal events and once for background events, then the results are added.

By default, returns the total number of events.

class nafi.likelihoods.unbinned.UnbinnedSignalBackground

Simulation and (extended) unbinned log likelihood for two Poisson processes (signal and background) distinguished by a single observable per event (e.g. energy or position).

To specify your own signal and background distributions, inherit from this class and override the simulate_signal, simulate_background, and differential_rate methods. See the TwoGaussians class for an example.

differential_rate(x, mu_sig, mu_bg, params): Return differential rate for events with observed x

lnl_and_weights(mu_sig_hyp, mu_bg, key=None, n_sig_max=None, n_bg_max=None, trials_per_n=10000, return_outcomes=False, **params)

Return (lnl, toy_weight) for an unbinned signal/background likelihood.

Both are (n_outcomes, hypotheses) arrays: lnl contains the log likelihood at each hypothesis, and toy_weight contains P(outcome | hypotheses), normalized over outcomes.

Parameters:

mu_sig_hyp – Array with expected signal event hypotheses
mu_bg – Expected background events (scalar)
key – JAX PRNG key to use. If not provided, will choose a random one according to the numpy global random state.
n_sig_max – Largest number of signal events to consider. If None, will be determined automatically from mu_sig_hyp.
n_bg_max – Largest number of background events to consider. If None, will be determined automatically from mu_bg.
trials_per_n – Number of MC trials to use per signal event count.
return_outcomes – If True, return a third array of shape (n_outcomes,) containing a summary statistic of the outcome for each toy. The default (set by the summary method) is the total number of observed events.

simulate_background(n_trials, n_max, key, params): Simulate (n_trials, n_max) background events

simulate_signal(n_trials, n_max, key, params): Simulate (n_trials, n_max) signal events

single_lnl(x, mu_sig, mu_bg, **params)

Return log likelihood for a single observation

Parameters:

x – Observed values (n_events,) array
mu_sig – Rate hypotheses, (n_hypotheses,) array
mu_bg – Background rate (scalar)

Returns:

(n_hypotheses,) array of log likelihoods

summary(x, present, params)

Return a summary statistic of the data.

The number must be additive over events. This function is called once for signal events and once for background events, then the results are added.

By default, returns the total number of events.

On-off problem

A two-bin counting experiment where the second bin (‘ancilla’) has no signal, and its expected background is tau times the background in the first bin.

The background rate is a nuisance parameter that is profiled over.

Notation:

n events are observed in the main experiment, m in the ancilla
The main experiment has signal mu_sig and background mu_bg
The ancilla only has background mu_bg * tau

nafi.likelihoods.onoff.conditional_bestfit_bg(mu_sig_hyp, n, m, *, tau)

Return the conditional best fit background rate for an on-off experiment. See lnl_and_weights for details.

Parameters:

mu_sig_hyp – Array with signal rate hypotheses
n – Number of observed events in the main experiment
m – Number of observed events in the ancillary experiment
tau – Ratio of background rates in the two experiments

See e.g. Rolke, Lopez, Conrad (2005), arXiv:physics/0403059, or Sen, Walker and Woodroofe (2009).
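For the on-off model in the notation above, setting the derivative of the log likelihood with respect to mu_bg to zero gives a quadratic with a single non-negative root. The sketch below evaluates that textbook closed form. The function name is hypothetical and this is not necessarily how nafi computes the value.

```python
import numpy as np

def conditional_bestfit_bg_sketch(mu_sig_hyp, n, m, *, tau):
    """Closed-form conditional MLE of mu_bg; illustrative, not nafi's code."""
    s = np.asarray(mu_sig_hyp, dtype=float)
    # d lnL / d mu_bg = n / (s + b) + m / b - (1 + tau) = 0 is equivalent to
    #   (1 + tau) b^2 + ((1 + tau) s - n - m) b - m s = 0
    a = 1.0 + tau
    h = n + m - a * s
    # Take the non-negative root of the quadratic
    return (h + np.sqrt(h ** 2 + 4 * a * m * s)) / (2 * a)

b_hat = conditional_bestfit_bg_sketch([0.0, 1.5, 4.0], n=5, m=4, tau=2.0)
print(b_hat)
```

At mu_sig = 0 the expression reduces to (n + m) / (1 + tau), the pooled background estimate from both bins.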

nafi.likelihoods.onoff.profile_lnl(mu_sig_hyp, n, m, *, tau)

Return (n_outcomes, n_hyp) log profile likelihood.

That is, mu_bg is replaced with its conditional best-fit value for each tested hypothesis.

nafi.likelihoods.onoff.profile_weights(mu_sig_hyp, n, m, n_obs, m_obs, *, tau)

Return (n_outcomes, n_hyp) weights of outcomes using the profile construction for the observed outcome n_obs, m_obs.

That is, the weights correspond to generating toy datasets with a different mu_bg for each hypothesis: mu_bg is set to its conditional best-fit value given the observed data and that hypothesis.

nafi.likelihoods.onoff.true_weights(mu_sig_hyp, mu_bg_hyp, n, m, *, tau)

Return (outcomes, signal hypotheses, background hypotheses) array with weights of outcomes given the true signal and background hypotheses.

(i.e. P(outcome | sig, bg), normalized to sum to 1 over outcomes)

Parameters:

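mu_sig_hyp – Array with signal rate hypotheses
mu_bg_hyp – Array with background rate hypotheses
n – Number of events in the main experiment (see Notation above)
m – Number of events in the ancilla (see Notation above)
tau – Ratio of background rates in the two experiments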
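Under the notation of this section, the probability of an outcome factorizes as Pois(n | mu_sig + mu_bg) * Pois(m | tau * mu_bg). The sketch below builds the corresponding three-dimensional weight array with NumPy. The function names are hypothetical, the outcome grid is an assumed truncation at 40 events per bin, and the code requires integer counts and strictly positive background rates.

```python
import numpy as np

def log_factorial(k_max):
    """log(k!) for k = 0..k_max via a cumulative sum of logs."""
    return np.concatenate([[0.0], np.cumsum(np.log(np.arange(1, k_max + 1)))])

def poisson_logpmf(k, mu, log_fact):
    return k * np.log(mu) - mu - log_fact[k]

def true_weights_sketch(mu_sig_hyp, mu_bg_hyp, n, m, *, tau):
    """P(n, m | sig, bg), shape (outcomes, n_sig, n_bg); not nafi's code."""
    s = np.asarray(mu_sig_hyp, dtype=float)[None, :, None]
    b = np.asarray(mu_bg_hyp, dtype=float)[None, None, :]
    n = np.asarray(n)[:, None, None]           # integer counts, main experiment
    m = np.asarray(m)[:, None, None]           # integer counts, ancilla
    log_fact = log_factorial(max(n.max(), m.max()))
    log_p = (poisson_logpmf(n, s + b, log_fact)
             + poisson_logpmf(m, tau * b, log_fact))
    w = np.exp(log_p)
    return w / w.sum(axis=0, keepdims=True)    # normalize over outcomes

# Assumed outcome grid: all (n, m) pairs with at most 40 events per bin
n_grid, m_grid = np.meshgrid(np.arange(41), np.arange(41), indexing="ij")
w = true_weights_sketch([0.0, 2.0], [1.0, 3.0], n_grid.ravel(), m_grid.ravel(), tau=2.0)
print(w.shape)
```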

Uncertain-background counting experiment

TODO: this still needs work. See onoff for a more complete profile likelihood example.

nafi.likelihoods.counting_uncbg.lnl_and_weights(mu_sig_hyp, mu_bg_true, mu_bg_model, sigma_bg, n_max=None)

Return (logl, toy_weight) for a counting experiment with a background that has a Gaussian absolute theoretical uncertainty on mu_bg of sigma_bg.

Here lnl and toy_weight are both (n_outcomes, hypotheses) arrays, where lnl contains the log likelihood at each hypothesis with the background rate profiled out, and toy_weight contains P(outcome | hypotheses), normalized over outcomes.

Since the background cannot be negative, this model can be ill-defined. Consider using nafi.likelihoods.onoff instead.

Parameters:

mu_sig_hyp – Array with signal rate hypotheses
mu_bg_true – True background rate to assume for toy data, array of same shape as mu_sig_hyp.
mu_bg_model – Expected/modelled mu_bg, scalar
sigma_bg – Gaussian absolute uncertainty on mu_bg_model (i.e. in number of events, not a percentage)