Likelihoods
Nafi includes a few simple likelihoods for testing and demonstration purposes.
In more complex situations you will want to evaluate likelihoods and enumerate/simulate outcomes outside of nafi, then feed these into nafi’s inference functions.
Single-bin counting experiment
Methods for producing likelihood ratios for a counting experiment with background.
- nafi.likelihoods.counting.lnl_and_weights(mu_sig, mu_bg, n_max=None, return_outcomes=False)
Return (logl, toy weight) for a counting experiment with background.
- Both are (n_outcomes, hypotheses) arrays:
lnl contains the log likelihood at each hypothesis, toy_weight contains P(outcome | hypotheses), normalized over outcomes
- Parameters:
mu_sig – Array with signal rate hypotheses
mu_bg – Background rate (scalar)
n_max – Largest number of events to consider. If None, will be determined automatically from mu_sig and mu_bg.
return_outcomes – If True, return a third array of shape (n_outcomes,) containing the number of events for each outcome.
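The two arrays described above can be sketched in plain Python, without importing nafi. This is a minimal illustration under stated assumptions, not nafi's implementation: the helper names `poisson_logpmf` and `counting_lnl_and_weights` are hypothetical, n_max is passed explicitly rather than determined automatically, mu_bg is assumed to be positive, and whether nafi keeps the constant log(n!) term in lnl is not stated in this reference.

```python
import math

def poisson_logpmf(n, mu):
    """Log of the Poisson probability of n events given an expectation mu > 0."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)

def counting_lnl_and_weights(mu_sig, mu_bg, n_max):
    """Return (lnl, weights), both nested lists indexed [outcome][hypothesis]."""
    # One row per outcome n = 0..n_max, one column per signal hypothesis
    lnl = [[poisson_logpmf(n, s + mu_bg) for s in mu_sig]
           for n in range(n_max + 1)]
    weights = [[math.exp(v) for v in row] for row in lnl]
    # Renormalize each hypothesis column so the truncated outcomes sum to 1
    for j in range(len(mu_sig)):
        total = sum(row[j] for row in weights)
        for row in weights:
            row[j] /= total
    return lnl, weights

mu_sig = [0.0, 1.0, 5.0]
lnl, weights = counting_lnl_and_weights(mu_sig, mu_bg=2.0, n_max=40)
```

Here each column of `weights` is P(n | mu_sig + mu_bg) over the outcomes n = 0..n_max, which is the sense in which the weights are "normalized over outcomes".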
- nafi.likelihoods.counting.single_lnl(*, n, mu_sig, mu_bg)
Return log likelihood for a single counting experiment observation
- Parameters:
n – Observed number of events
mu_sig – Signal rate hypothesis (array)
mu_bg – Background rate (scalar)
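As a sketch of how a single-observation log likelihood turns into a likelihood ratio, the snippet below subtracts the log likelihood at the best-fit signal rate. It assumes a plain Poisson likelihood with the signal constrained to mu_sig >= 0 and a positive mu_bg; `poisson_lnl` and `log_likelihood_ratio` are illustrative names, not part of nafi.

```python
import math

def poisson_lnl(n, mu_sig, mu_bg):
    """Poisson log likelihood of observing n events, one value per signal hypothesis."""
    return [n * math.log(s + mu_bg) - (s + mu_bg) - math.lgamma(n + 1)
            for s in mu_sig]

def log_likelihood_ratio(n, mu_sig, mu_bg):
    """lnl at each hypothesis minus lnl at the best fit with mu_sig >= 0."""
    # Analytic best fit for a Poisson count, clipped at zero signal
    mu_best = max(n - mu_bg, 0.0)
    best = poisson_lnl(n, [mu_best], mu_bg)[0]
    return [v - best for v in poisson_lnl(n, mu_sig, mu_bg)]

ratios = log_likelihood_ratio(n=5, mu_sig=[0.0, 3.0, 10.0], mu_bg=2.0)
```

With n = 5 and mu_bg = 2 the best fit is mu_sig = 3, so the middle entry of `ratios` is zero and the other two are negative.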
Two-bin counting experiment
A two-bin counting experiment without uncertainties.
The first bin expects f_sig_1 * mu_sig signal and mu_bg_1 background. The second bin expects (1 - f_sig_1) * mu_sig signal and mu_bg_2 background.
- nafi.likelihoods.twobin.lnl_and_weights(mu_sig, f_sig_1, mu_bg_1, mu_bg_2, n_max=None, m_max=None, return_outcomes=False)
Return (logl, toy weight) for a two-bin counting experiment with background.
- Both are (n_outcomes, hypotheses) arrays:
lnl contains the log likelihood at each hypothesis, toy_weight contains P(outcome | hypotheses), normalized over outcomes
- Parameters:
mu_sig – Array with signal rate hypotheses
f_sig_1 – Fraction of signal events in first bin
mu_bg_1 – Expected background in first bin
mu_bg_2 – Expected background in second bin
n_max – Largest number of events in the first bin to consider. If None, will be determined automatically from parameters.
m_max – Largest number of events in the second bin to consider. If None, will be determined automatically from parameters.
return_outcomes – If True, return a third value, a tuple of (n, m) arrays with the number of observed events for each possible outcome.
- nafi.likelihoods.twobin.outcomes(n_max, m_max, ravel=True)
Return 2-tuple of two ((n_max + 1) * (m_max + 1)) arrays with possible experimental outcomes of a two-bin experiment.
- Parameters:
n_max – Maximum number of events in first bin to consider
m_max – Maximum number of events in second bin to consider
ravel – If False (default is true), instead return two (n_max + 1, m_max + 1) 2d arrays.
- nafi.likelihoods.twobin.total_mus(mu_sig, f_sig_1, mu_bg_1, mu_bg_2)
Return (mu_1, mu_2) total expected number of events in both bins
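The expectations and the outcome grid of the two-bin model can be sketched as follows. This is an illustration in plain Python rather than nafi's code: the function bodies are reconstructed from the descriptions above, the row-major ordering of the flattened grid is an assumption, and `twobin_lnl` and `poisson_logpmf` are hypothetical helpers that require positive expectations.

```python
import math

def total_mus(mu_sig, f_sig_1, mu_bg_1, mu_bg_2):
    """Total expected events (mu_1, mu_2) in the first and second bin."""
    mu_1 = f_sig_1 * mu_sig + mu_bg_1
    mu_2 = (1 - f_sig_1) * mu_sig + mu_bg_2
    return mu_1, mu_2

def outcomes(n_max, m_max):
    """Flattened grid of all (n, m) pairs: two lists of length (n_max + 1) * (m_max + 1)."""
    n = [i for i in range(n_max + 1) for _ in range(m_max + 1)]
    m = [j for _ in range(n_max + 1) for j in range(m_max + 1)]
    return n, m

def poisson_logpmf(k, mu):
    """Log of the Poisson probability of k events given an expectation mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def twobin_lnl(n, m, mu_sig, f_sig_1, mu_bg_1, mu_bg_2):
    """Log likelihood of one (n, m) outcome: two independent Poisson bins."""
    mu_1, mu_2 = total_mus(mu_sig, f_sig_1, mu_bg_1, mu_bg_2)
    return poisson_logpmf(n, mu_1) + poisson_logpmf(m, mu_2)

n, m = outcomes(30, 30)
# Probability of each outcome under a single hypothesis, mu_sig = 1
probs = [math.exp(twobin_lnl(ni, mi, 1.0, 0.3, 1.0, 2.0)) for ni, mi in zip(n, m)]
```

Because the two bins are independent, the log likelihood of an outcome is the sum of the two per-bin Poisson terms, and the probabilities over a sufficiently large grid sum to approximately one.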
Gaussian measurement
- nafi.likelihoods.gaussian.lnl_and_weights(mu_hyp, n_x=1000, x_transform=<function _logit>, return_outcomes=False)
Return (logl, toy weight) for a Gaussian measurement of a parameter mu constrained to >= 0.
That is, the true value is at mu >= 0, and the experiment observes a single value X:
\[X \sim \mathrm{Norm}(\mu, 1)\]
- Both results are (n_outcomes, hypotheses) arrays:
lnl contains the log likelihood at each hypothesis, and toy_weight contains P(outcome | hypotheses), normalized over outcomes.
- Parameters:
mu_hyp – Array with hypotheses
n_x – Number of x outcomes to consider
x_transform – Function to transform a uniform 0-1 grid (minus endpoints) to x-values to consider as possible outcomes. Defaults to logit.
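A minimal sketch of the outcome grid and the resulting arrays follows. It assumes the default transform is the plain logit applied to a uniform grid on (0, 1) with the endpoints removed, and that each outcome's weight is the Gaussian density times the local grid spacing, renormalized over the grid. How nafi actually discretizes X is not specified in this reference, so the helper names and the weighting scheme are assumptions.

```python
import math

def logit(p):
    """Inverse of the logistic function; maps (0, 1) onto the real line."""
    return math.log(p / (1 - p))

def x_outcomes(n_x):
    """Uniform grid on (0, 1) without its endpoints, and its logit transform."""
    p = [i / (n_x + 1) for i in range(1, n_x + 1)]
    return p, [logit(pi) for pi in p]

def gaussian_lnl_and_weights(mu_hyp, n_x=1000):
    """Return (lnl, weights, x); lnl and weights are indexed [outcome][hypothesis]."""
    p, x = x_outcomes(n_x)
    # Log density of Norm(mu, 1) at each grid point
    lnl = [[-0.5 * (xi - mu) ** 2 - 0.5 * math.log(2 * math.pi) for mu in mu_hyp]
           for xi in x]
    # Density times local grid spacing; dx/dp = 1 / (p (1 - p)) for the logit
    weights = [[math.exp(v) / (pi * (1 - pi)) for v in row]
               for row, pi in zip(lnl, p)]
    # Renormalize each hypothesis column over the outcomes
    for j in range(len(mu_hyp)):
        total = sum(row[j] for row in weights)
        for row in weights:
            row[j] /= total
    return lnl, weights, x

lnl, weights, x = gaussian_lnl_and_weights([0.0, 1.0, 2.0], n_x=1000)
```

The logit places most grid points near x = 0 and spaces them out in the tails, so a fixed number of outcomes covers a wide range of x.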
Extended unbinned likelihood
Classes for modelling likelihood ratios for unbinned likelihoods without nuisance parameters.
- class nafi.likelihoods.unbinned.TwoGaussians
Simulation and (extended) unbinned log likelihood for a signal and background that are both Gaussians with unit variance, but different means.
Takes a single parameter, sigma_sep, the distance between the signal and background means.
- differential_rate(x, mu_sig, mu_bg, params)
Return differential rate for events with observed x
- lnl_and_weights(mu_sig_hyp, mu_bg, key=None, n_sig_max=None, n_bg_max=None, trials_per_n=10000, return_outcomes=False, **params)
Return (logl, toy weight) for an unbinned signal/background likelihood
- Both are (n_outcomes, hypotheses) arrays:
lnl contains the log likelihood at each hypothesis, toy_weight contains P(outcome | hypotheses), normalized over outcomes
- Parameters:
mu_sig_hyp – Array with expected signal event hypotheses
mu_bg – Expected background events (scalar)
key – JAX PRNG key to use. If not provided, a random one is chosen according to the numpy global random state.
n_sig_max – Largest number of signal events to consider. If None, will be determined automatically from mu_sig.
n_bg_max – Largest number of background events to consider. If None, will be determined automatically from mu_bg.
trials_per_n – Number of MC trials to use per signal event count.
return_outcomes – If True, return a third array of shape (n_outcomes,) containing a summary statistic of the outcome for each toy. The default (set by the summary_stat method) is the total number of observed events.
- simulate_background(n_trials, n_max, key, params)
Simulate (n_trials, n_max) background events
- simulate_signal(n_trials, n_max, key, params)
Simulate (n_trials, n_max) signal events
- single_lnl(x, mu_sig, mu_bg, **params)
Return log likelihood for a single observation
- Parameters:
x – Observed values (n_events,) array
mu_sig – Rate hypotheses, (n_hypotheses,) array
mu_bg – Background rate (scalar)
- Returns:
(n_hypotheses,) array of log likelihoods
- summary(x, present, params)
Return a summary statistic of the data.
The number must be additive over events. This function is called once for signal events and once for background events, then the results are added.
By default, returns the total number of events.
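The extended unbinned log likelihood for the two-Gaussian model can be sketched in plain Python as below. The placement of the means (background at 0, signal at sigma_sep) is an assumption made for illustration, and the functions are standalone stand-ins rather than the methods of the TwoGaussians class.

```python
import math

def normal_pdf(x, mean):
    """Density of a unit-variance Gaussian centred at mean."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2 * math.pi)

def differential_rate(x, mu_sig, mu_bg, sigma_sep):
    """Expected events per unit x. Assumption: background at 0, signal at sigma_sep."""
    return mu_sig * normal_pdf(x, sigma_sep) + mu_bg * normal_pdf(x, 0.0)

def extended_lnl(xs, mu_sig_hyp, mu_bg, sigma_sep):
    """Extended unbinned log likelihood, one value per signal hypothesis."""
    return [
        # Poisson term for the total rate, plus one log-rate term per observed event
        -(s + mu_bg) + sum(math.log(differential_rate(x, s, mu_bg, sigma_sep))
                           for x in xs)
        for s in mu_sig_hyp
    ]

# Three events at the assumed signal mean favour a nonzero signal
lnl = extended_lnl([2.0, 2.0, 2.0], mu_sig_hyp=[0.0, 3.0], mu_bg=1.0, sigma_sep=2.0)
```

With no observed events the sum is empty and the log likelihood reduces to minus the total expected number of events.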
- class nafi.likelihoods.unbinned.UnbinnedSignalBackground
Simulation and (extended) unbinned log likelihood for two Poisson processes (signal and background) distinguished by a single observable per event (e.g. energy or position).
To specify your own signal and background distributions, inherit from this class and override the simulate_signal, simulate_background, and differential_rate methods. See the TwoGaussians class for an example.
- differential_rate(x, mu_sig, mu_bg, params)
Return differential rate for events with observed x
- lnl_and_weights(mu_sig_hyp, mu_bg, key=None, n_sig_max=None, n_bg_max=None, trials_per_n=10000, return_outcomes=False, **params)
Return (logl, toy weight) for an unbinned signal/background likelihood
- Both are (n_outcomes, hypotheses) arrays:
lnl contains the log likelihood at each hypothesis, toy_weight contains P(outcome | hypotheses), normalized over outcomes
- Parameters:
mu_sig_hyp – Array with expected signal event hypotheses
mu_bg – Expected background events (scalar)
key – JAX PRNG key to use. If not provided, a random one is chosen according to the numpy global random state.
n_sig_max – Largest number of signal events to consider. If None, will be determined automatically from mu_sig.
n_bg_max – Largest number of background events to consider. If None, will be determined automatically from mu_bg.
trials_per_n – Number of MC trials to use per signal event count.
return_outcomes – If True, return a third array of shape (n_outcomes,) containing a summary statistic of the outcome for each toy. The default (set by the summary_stat method) is the total number of observed events.
- simulate_background(n_trials, n_max, key, params)
Simulate (n_trials, n_max) background events
- simulate_signal(n_trials, n_max, key, params)
Simulate (n_trials, n_max) signal events
- single_lnl(x, mu_sig, mu_bg, **params)
Return log likelihood for a single observation
- Parameters:
x – Observed values (n_events,) array
mu_sig – Rate hypotheses, (n_hypotheses,) array
mu_bg – Background rate (scalar)
- Returns:
(n_hypotheses,) array of log likelihoods
- summary(x, present, params)
Return a summary statistic of the data.
The number must be additive over events. This function is called once for signal events and once for background events, then the results are added.
By default, returns the total number of events.
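The additivity requirement on the summary statistic can be illustrated with a small sketch. It assumes that `present` is a boolean mask marking which slots of a fixed-size (n_trials, n_max) event table hold real events. That interpretation, and the helper names, are assumptions for illustration, not a statement of how nafi stores simulated events.

```python
def count_summary(x, present):
    """Additive summary per trial: the number of events marked as present."""
    return [sum(1 for keep in row if keep) for row in present]

def combined_summary(x_sig, present_sig, x_bg, present_bg):
    """Summary of the full dataset: signal and background summaries added per trial."""
    sig = count_summary(x_sig, present_sig)
    bg = count_summary(x_bg, present_bg)
    return [s + b for s, b in zip(sig, bg)]

# Two trials, padded to three event slots each (values in unused slots are ignored)
x_sig = [[0.1, 0.0, 0.0], [1.2, 0.7, 0.0]]
present_sig = [[True, False, False], [True, True, False]]
x_bg = [[-0.3, 0.4, 2.1], [0.0, 0.0, 0.0]]
present_bg = [[True, True, True], [False, False, False]]

totals = combined_summary(x_sig, present_sig, x_bg, present_bg)
```

Any statistic that is a sum of per-event contributions, such as the event count used here, gives the same result whether it is computed on the combined dataset or on signal and background separately and then added.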
On-off problem
A two-bin counting experiment where the second bin (‘ancilla’) has no signal, but tau times the background of the first bin.
The background rate is a nuisance parameter that is profiled over.
- Notation:
n events are observed in the main experiment, m in the ancilla.
The main experiment has signal mu_sig and background mu_bg.
The ancilla only has background mu_bg * tau.
- nafi.likelihoods.onoff.conditional_bestfit_bg(mu_sig_hyp, n, m, *, tau)
Return the conditional best fit background rate for an on-off experiment. See lnl_and_weights for details.
- Parameters:
n – Number of observed events
m – Number of observed events in the ancillary experiment
mu_sig_hyp – Array with signal rate hypotheses
tau – Ratio of background rates in the two experiments
See e.g. Rolke, Lopez, Conrad (2005) arXiv:physics/0403059, or Sen, Walker and Woodroofe (2009)
- nafi.likelihoods.onoff.profile_lnl(mu_sig_hyp, n, m, *, tau)
Return (n_outcomes, n_hyp) log profile likelihood.
That is, mu_bg is replaced with its conditional best-fit value for each tested hypothesis.
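For the on-off likelihood with n ~ Poisson(mu_sig + mu_bg) and m ~ Poisson(tau * mu_bg), the conditional best-fit background has a closed form: setting the derivative of the log likelihood with respect to mu_bg to zero gives a quadratic, and its non-negative root is the maximum. The sketch below works this out in plain Python. The function names mirror the ones documented here for readability, but the bodies are a reconstruction from that derivation, not nafi's source.

```python
import math

def poisson_logpmf(k, mu):
    """Log Poisson probability of k events, handling a zero expectation."""
    if mu == 0:
        return 0.0 if k == 0 else -math.inf
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def conditional_bestfit_bg(mu_sig, n, m, tau):
    """Background rate maximizing the on-off likelihood at fixed mu_sig."""
    # d lnL / d mu_bg = 0 gives, with b = mu_bg:
    #   (1 + tau) b^2 + ((1 + tau) mu_sig - n - m) b - m mu_sig = 0
    a = 1 + tau
    c = n + m - a * mu_sig
    return (c + math.sqrt(c ** 2 + 4 * a * m * mu_sig)) / (2 * a)

def onoff_lnl(mu_sig, mu_bg, n, m, tau):
    """Log likelihood of the main experiment plus the ancilla."""
    return poisson_logpmf(n, mu_sig + mu_bg) + poisson_logpmf(m, tau * mu_bg)

def profile_lnl(mu_sig_hyp, n, m, tau):
    """Log profile likelihood for one outcome (n, m), one value per hypothesis."""
    return [onoff_lnl(s, conditional_bestfit_bg(s, n, m, tau), n, m, tau)
            for s in mu_sig_hyp]

b_hat = conditional_bestfit_bg(1.5, n=5, m=8, tau=2.0)
lnl_prof = profile_lnl([0.0, 1.5, 4.0], n=5, m=8, tau=2.0)
```

At mu_sig = 0 the formula reduces to (n + m) / (1 + tau), the pooled background estimate from both bins.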
- nafi.likelihoods.onoff.profile_weights(mu_sig_hyp, n, m, n_obs, m_obs, *, tau)
Return (n_outcomes, n_hyp) weights of outcomes using the profile construction for the observed outcome n_obs, m_obs.
That is, the weights correspond to generating toy datasets with a different mu_bg for each hypothesis: mu_bg is set to its conditional best-fit value given the observed data.
- nafi.likelihoods.onoff.true_weights(mu_sig_hyp, mu_bg_hyp, n, m, *, tau)
Return (outcomes, signal hypotheses, background hypotheses) array with weights of outcomes given the true signal and background hypotheses.
(i.e. P(outcome | sig, bg), normalized to sum to 1 over outcomes)
- Parameters:
mu_sig_hyp – Array with signal rate hypotheses
mu_bg_hyp – Array with background rate hypotheses
n – Number of observed events in the main experiment
m – Number of observed events in the ancillary experiment
tau – Ratio of background rates in the two experiments
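The three-dimensional weights array can be sketched as a product of two Poisson probabilities per outcome, renormalized over the enumerated outcomes. This is a plain-Python illustration assuming strictly positive background hypotheses; `onoff_weights` and `poisson_pmf` are hypothetical names, and n and m are taken to be flattened lists of outcomes of equal length.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of k events given an expectation mu > 0."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def onoff_weights(mu_sig_hyp, mu_bg_hyp, n, m, tau):
    """Nested list indexed [outcome][signal hypothesis][background hypothesis]."""
    w = [[[poisson_pmf(ni, s + b) * poisson_pmf(mi, tau * b) for b in mu_bg_hyp]
          for s in mu_sig_hyp]
         for ni, mi in zip(n, m)]
    # Normalize over outcomes for every (signal, background) pair
    for j in range(len(mu_sig_hyp)):
        for k in range(len(mu_bg_hyp)):
            total = sum(w[i][j][k] for i in range(len(w)))
            for i in range(len(w)):
                w[i][j][k] /= total
    return w

# Enumerate all (n, m) outcomes up to 30 events in each experiment
n = [i for i in range(31) for _ in range(31)]
m = [j for _ in range(31) for j in range(31)]
w = onoff_weights([0.0, 2.0], [1.0, 3.0], n, m, tau=2.0)
```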
Uncertain-background counting experiment
TODO: this still needs work. See twobin for a more complete profile likelihood example.
- nafi.likelihoods.counting_uncbg.lnl_and_weights(mu_sig_hyp, mu_bg_true, mu_bg_model, sigma_bg, n_max=None)
Return (logl, toy_weight) for a counting experiment with a background that has a Gaussian absolute theoretical uncertainty on mu_bg of sigma_bg.
- Here lnl and toy_weight are both (n_outcomes, hypotheses) arrays,
where lnl contains the log likelihood at each hypothesis with the background rate profiled out, and toy_weight contains P(outcome | hypotheses), normalized over outcomes.
Since the background cannot be negative, this model can be ill-defined. Consider using nafi.likelihoods.onoff instead.
- Parameters:
mu_sig_hyp – Array with signal rate hypotheses
mu_bg_true – True background rate to assume for toy data, array of same shape as mu_sig_hyp.
mu_bg_model – Expected/modelled mu_bg, scalar
sigma_bg – Gaussian absolute uncertainty on mu_bg_model (i.e. in number of events, not a percentage)