binny.nz_tomo.between_sample_metrics module
Cross-bin comparison metrics for binned redshift distributions.
- binny.nz_tomo.between_sample_metrics.between_bin_overlap(z: Any, bins_a: Mapping[int, Any], bins_b: Mapping[int, Any], *, method: str = 'min', unit: Literal['fraction', 'percent'] = 'fraction', normalize: bool = False, rtol: float = 0.001, atol: float = 1e-06, decimal_places: int | None = 2) → dict[int, dict[int, float]]
Computes a rectangular pairwise comparison matrix between two bin sets.
This function compares all bin distributions from one tomographic sample against all bin distributions from another tomographic sample, assuming both are evaluated on a shared redshift grid. The output is generally rectangular rather than symmetric, since the two samples can contain different bin indices and different numbers of bins.
Supported methods:
"min": Integral of the pointwise minimum of the two curves. If curves are normalized, values lie in [0, 1].
"cosine": Cosine similarity under a continuous inner product. For nonnegative curves, values lie in [0, 1], with 1 meaning identical up to overall scaling.
"js": Jensen–Shannon distance computed on segment-mass probability vectors. With normalized curves, values lie in [0, 1], with 0 meaning identical and larger values meaning more distinct distributions.
"hellinger": Hellinger distance on segment-mass probability vectors (in [0, 1]).
"tv": Total variation distance on segment-mass probability vectors (in [0, 1]).
- Parameters:
z – One-dimensional redshift grid shared by both bin sets.
bins_a – Mapping from first-sample bin index to bin distributions evaluated on z.
bins_b – Mapping from second-sample bin index to bin distributions evaluated on z.
method – Pairwise metric to compute.
unit – Output units. If "percent", values are multiplied by 100.
normalize – Whether to normalize curves before comparison.
rtol – Relative tolerance for the normalization check.
atol – Absolute tolerance for the normalization check.
decimal_places – Rounding precision for output values.
- Returns:
Nested mapping mat[i][j] giving the pairwise value between first-sample bin i and second-sample bin j.
- Raises:
ValueError – If method is not supported.
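The "min" and "cosine" definitions above can be illustrated with a small NumPy sketch. This is not binny's implementation: the helper names (trapz, min_overlap, cosine_similarity, rectangular_matrix, gaussian_bin) and the toy Gaussian bins are assumptions made for illustration, and plain trapezoid integration on the shared grid is assumed throughout.

```python
import numpy as np


def trapz(y, x):
    """Trapezoid-rule integral of y over x, written out explicitly."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def min_overlap(z, f, g):
    """Integral of the pointwise minimum of two curves."""
    return trapz(np.minimum(f, g), z)


def cosine_similarity(z, f, g):
    """Cosine similarity under the inner product <f, g> = integral of f*g dz."""
    return trapz(f * g, z) / np.sqrt(trapz(f * f, z) * trapz(g * g, z))


def rectangular_matrix(z, bins_a, bins_b, metric):
    """Nested mapping mat[i][j] = metric(bins_a[i], bins_b[j])."""
    return {
        i: {j: metric(z, fa, fb) for j, fb in bins_b.items()}
        for i, fa in bins_a.items()
    }


def gaussian_bin(z, mu, sigma):
    """Hypothetical toy bin: Gaussian normalized to unit trapezoid integral."""
    curve = np.exp(-0.5 * ((z - mu) / sigma) ** 2)
    return curve / trapz(curve, z)


# Two samples on one shared grid, with different numbers of bins.
z = np.linspace(0.0, 3.0, 601)
bins_a = {0: gaussian_bin(z, 0.5, 0.1), 1: gaussian_bin(z, 1.0, 0.1)}
bins_b = {
    0: gaussian_bin(z, 0.5, 0.1),
    1: gaussian_bin(z, 0.8, 0.2),
    2: gaussian_bin(z, 2.0, 0.1),
}

mat_min = rectangular_matrix(z, bins_a, bins_b, min_overlap)    # 2 x 3
mat_cos = rectangular_matrix(z, bins_a, bins_b, cosine_similarity)
```

In this sketch the result has two rows and three columns, which is why the output is described as rectangular rather than symmetric: row keys come from the first sample and column keys from the second.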
- binny.nz_tomo.between_sample_metrics.between_interval_mass_matrix(z: Any, bins: Mapping[int, Any], target_edges: Mapping[int, tuple[float, float]] | Sequence[float] | ndarray, *, unit: Literal['fraction', 'percent'] = 'fraction', decimal_places: int | None = 2) → dict[int, dict[int, float]]
Computes a rectangular interval-mass matrix against target bin edges.
The interval-mass matrix mass[i][j] gives the fraction of the total mass in input bin i that lies within target interval j. This is the between-sample analogue of a leakage matrix and is useful, for example, when asking how much of a source bin falls inside a lens-bin interval.
- Parameters:
z – One-dimensional redshift grid shared by all input bins.
bins – Mapping from input bin index to bin distributions evaluated on z.
target_edges – Either a mapping from target bin index to (low, high) edges, or a sequence/array of edges where target bin j has edges (target_edges[j], target_edges[j+1]).
unit – Output units. If "percent", values are multiplied by 100.
decimal_places – Rounding precision for output values.
- Returns:
Nested mapping mass[i][j] giving the fraction of mass in input bin i that lies within target interval j.
- Raises:
ValueError – If a bin has non-positive total mass.
ValueError – If target edges are invalid (hi <= lo).
ValueError – If unit is not supported.
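The interval-mass definition can be sketched as follows. This is an illustration under stated assumptions, not binny's code: the names cumulative_mass and interval_mass_matrix are hypothetical, only the mapping form of the edges is handled, and the cumulative trapezoid mass is linearly interpolated at the interval edges, which is exact only where the curve is constant between grid points.

```python
import numpy as np


def cumulative_mass(z, f):
    """Cumulative trapezoid integral of f over z, starting at 0."""
    steps = 0.5 * (f[1:] + f[:-1]) * np.diff(z)
    return np.concatenate(([0.0], np.cumsum(steps)))


def interval_mass_matrix(z, bins, target_edges):
    """mass[i][j] = fraction of bin i's total mass inside target interval j."""
    z = np.asarray(z, dtype=float)
    out = {}
    for i, f in bins.items():
        cum = cumulative_mass(z, np.asarray(f, dtype=float))
        total = cum[-1]
        if total <= 0.0:
            raise ValueError(f"bin {i} has non-positive total mass")
        row = {}
        for j, (lo, hi) in target_edges.items():
            if hi <= lo:
                raise ValueError(f"invalid edges for target bin {j}")
            # np.interp clamps outside the grid, so mass beyond z counts as 0.
            inside = np.interp(hi, z, cum) - np.interp(lo, z, cum)
            row[j] = float(inside / total)
        out[i] = row
    return out


# Toy input: a flat curve and a linear ramp on the same grid.
z = np.linspace(0.0, 2.0, 201)
bins = {0: np.ones_like(z), 1: z.copy()}
target_edges = {0: (0.0, 0.5), 1: (0.5, 2.0)}

mass = interval_mass_matrix(z, bins, target_edges)
```

Because the two target intervals here tile the whole grid, each row sums to 1. With intervals that leave gaps or extend past the grid, rows would sum to less than 1.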
- binny.nz_tomo.between_sample_metrics.between_overlap_pairs(z: Any, bins_a: Mapping[int, Any], bins_b: Mapping[int, Any], *, threshold: float = 10.0, unit: Literal['fraction', 'percent'] = 'percent', method: str = 'min', direction: Literal['high', 'low'] = 'high', normalize: bool = False, rtol: float = 0.001, atol: float = 1e-06, decimal_places: int | None = 2) → list[tuple[int, int, float]]
Returns between-sample bin pairs passing a threshold in a chosen metric.
This is a convenience wrapper around between_bin_overlap(). It computes the rectangular pairwise matrix between two tomographic samples and returns all bin pairs (i, j) that pass the requested threshold.
- Parameters:
z – One-dimensional redshift grid shared by both bin sets.
bins_a – Mapping from first-sample bin index to bin distributions evaluated on z.
bins_b – Mapping from second-sample bin index to bin distributions evaluated on z.
threshold – Threshold to apply in the units specified by unit.
unit – Units used for both the metric calculation and the threshold. Accepted values are "fraction" and "percent".
method – Pairwise metric passed to between_bin_overlap().
direction – Whether to select values >= threshold ("high") or <= threshold ("low").
normalize – Passed to between_bin_overlap().
rtol – Relative tolerance for normalization check (if needed).
atol – Absolute tolerance for normalization check (if needed).
decimal_places – Rounding precision for output values.
- Returns:
List of (i, j, value) tuples, where i is a first-sample bin index and j is a second-sample bin index. Results are sorted by decreasing value for direction="high" and increasing value for direction="low".
- Raises:
ValueError – If direction is not "high" or "low".
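The selection and ordering rules described above can be sketched on a nested matrix directly. The function name select_pairs and the matrix values below are hypothetical, chosen only to show the behaviour; the matrix stands in for an output of between_bin_overlap() in percent units.

```python
def select_pairs(mat, threshold, direction="high"):
    """Return (i, j, value) tuples from mat[i][j] that pass the threshold."""
    if direction not in ("high", "low"):
        raise ValueError('direction must be "high" or "low"')
    if direction == "high":
        passes = lambda value: value >= threshold
    else:
        passes = lambda value: value <= threshold
    pairs = [
        (i, j, value)
        for i, row in mat.items()
        for j, value in row.items()
        if passes(value)
    ]
    # Decreasing value for "high", increasing value for "low".
    pairs.sort(key=lambda item: item[2], reverse=(direction == "high"))
    return pairs


# Made-up rectangular matrix (2 first-sample bins, 3 second-sample bins).
mat = {
    0: {0: 85.0, 1: 12.5, 2: 0.0},
    1: {0: 3.2, 1: 40.0, 2: 9.9},
}

high_pairs = select_pairs(mat, 10.0, direction="high")
low_pairs = select_pairs(mat, 10.0, direction="low")
```

Since the threshold is compared in the same units as the matrix, a threshold of 10.0 with percent values selects overlaps of at least 10% for "high" and at most 10% for "low".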
- binny.nz_tomo.between_sample_metrics.between_pearson_matrix(z: Any, bins_a: Mapping[int, Any], bins_b: Mapping[int, Any], *, normalize: bool = False, rtol: float = 0.001, atol: float = 1e-06, decimal_places: int | None = 2) → dict[int, dict[int, float]]
Computes a rectangular trapezoid-weighted Pearson matrix between two bin sets.
The Pearson correlation between two curves f(z) and g(z) is defined as
corr(f, g) = cov(f, g) / (std(f) * std(g))
where the covariance and standard deviations are computed using trapezoid integration weights over the redshift grid.
Unlike pearson_matrix(), this function compares two different tomographic samples and therefore returns a rectangular matrix corr[i][j], where i is from the first sample and j is from the second sample.
Note: if normalize=True, the comparison is in terms of shape correlations, since all curves are normalized to unit integral before computing the correlation. If normalize=False, the correlation reflects both shape and amplitude similarities.
- Parameters:
z – One-dimensional redshift grid shared by both bin sets.
bins_a – Mapping from first-sample bin index to bin distributions evaluated on z.
bins_b – Mapping from second-sample bin index to bin distributions evaluated on z.
normalize – Whether to normalize curves before computing correlations.
rtol – Relative tolerance for the normalization check.
atol – Absolute tolerance for the normalization check.
decimal_places – Rounding precision for output values.
- Returns:
Nested mapping corr[i][j] giving the Pearson correlation between first-sample bin i and second-sample bin j.
- Raises:
ValueError – If either bin set contains a bin with non-positive integral when normalization is checked or performed.
ValueError – If the two bin sets are not evaluated on the same z grid.
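One plausible reading of "trapezoid integration weights" is to give each grid point the weight it receives in the trapezoid rule and to use those weights for the means, covariance, and standard deviations. The sketch below follows that reading; it is an assumption about the weighting, not a copy of binny's code, and the names trapezoid_weights and weighted_pearson are hypothetical.

```python
import numpy as np


def trapezoid_weights(z):
    """Per-point weights w such that sum(w * f) is the trapezoid integral of f."""
    z = np.asarray(z, dtype=float)
    dz = np.diff(z)
    w = np.zeros_like(z)
    w[:-1] += 0.5 * dz  # left half of each segment
    w[1:] += 0.5 * dz   # right half of each segment
    return w


def weighted_pearson(z, f, g):
    """corr(f, g) = cov(f, g) / (std(f) * std(g)) with trapezoid weights."""
    w = trapezoid_weights(z)
    w = w / w.sum()  # turn the weights into an averaging measure
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    df = f - np.sum(w * f)
    dg = g - np.sum(w * g)
    cov = np.sum(w * df * dg)
    return float(cov / np.sqrt(np.sum(w * df * df) * np.sum(w * dg * dg)))


# Toy curves: two well-separated Gaussian bumps on a shared grid.
z = np.linspace(0.0, 2.0, 201)
f = np.exp(-0.5 * ((z - 0.5) / 0.1) ** 2)
g = np.exp(-0.5 * ((z - 1.5) / 0.1) ** 2)

r_same = weighted_pearson(z, f, f)
r_far = weighted_pearson(z, f, g)
```

In this toy case a curve correlates perfectly with itself, while the two separated bumps come out negatively correlated because each is below its own mean wherever the other peaks. A rectangular matrix follows by evaluating weighted_pearson for every (i, j) combination of the two bin sets.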
- binny.nz_tomo.between_sample_metrics.bin_overlap(z: Any, bins: Mapping[int, Any], *, method: str = 'min', unit: Literal['fraction', 'percent'] = 'fraction', normalize: bool = False, rtol: float = 0.001, atol: float = 1e-06, decimal_places: int | None = 2) → dict[int, dict[int, float]]
Computes a pairwise comparison matrix for binned redshift distributions.
This function compares all pairs of bin distributions evaluated on a shared redshift grid and returns a symmetric matrix of values.
Supported methods:
"min": Integral of the pointwise minimum of the two curves. If curves are normalized, values lie in [0, 1] and the diagonal is 1.
"cosine": Cosine similarity under a continuous inner product. For nonnegative curves, values lie in [0, 1], with 1 meaning identical up to overall scaling.
"js": Jensen–Shannon distance computed on segment-mass probability vectors. With normalized curves, values lie in [0, 1], with 0 meaning identical and 1 meaning maximally different under this metric.
"hellinger": Hellinger distance on segment-mass probability vectors (in [0, 1]).
"tv": Total variation distance on segment-mass probability vectors (in [0, 1]).
- Parameters:
z – One-dimensional redshift grid shared by all bins.
bins – Mapping from bin index to bin distributions evaluated on z.
method – Pairwise metric to compute.
unit – Output units. If "percent", values are multiplied by 100.
normalize – Whether to normalize curves before comparison.
rtol – Relative tolerance for the normalization check.
atol – Absolute tolerance for the normalization check.
decimal_places – Rounding precision for output values.
- Returns:
Nested mapping mat[i][j] giving the pairwise value between bins i and j.
- Raises:
ValueError – If method is not supported.
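The three distance methods operate on "segment-mass probability vectors". A reasonable interpretation, assumed here, is the trapezoid mass of each grid segment divided by the total mass. The sketch below uses that interpretation; the function names are hypothetical, and the base-2 logarithm in the Jensen–Shannon distance is an assumption made so that the value tops out at 1, matching the stated [0, 1] range.

```python
import numpy as np


def segment_masses(z, f):
    """Probability vector of trapezoid masses, one entry per grid segment."""
    z = np.asarray(z, dtype=float)
    f = np.asarray(f, dtype=float)
    mass = 0.5 * (f[1:] + f[:-1]) * np.diff(z)
    return mass / mass.sum()


def tv_distance(p, q):
    """Total variation distance: half the L1 difference."""
    return float(0.5 * np.sum(np.abs(p - q)))


def hellinger_distance(p, q):
    """Hellinger distance between two probability vectors."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


def _kl_bits(p, m):
    """KL divergence in bits; entries with p == 0 contribute nothing."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / m[mask])))


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base 2)."""
    m = 0.5 * (p + q)
    divergence = 0.5 * _kl_bits(p, m) + 0.5 * _kl_bits(q, m)
    return float(np.sqrt(max(divergence, 0.0)))


# Toy top-hat bins with disjoint support on a shared grid.
z = np.linspace(0.0, 2.0, 201)
p = segment_masses(z, np.where(z < 0.5, 1.0, 0.0))
q = segment_masses(z, np.where(z > 1.5, 1.0, 0.0))

identical = (tv_distance(p, p), hellinger_distance(p, p), js_distance(p, p))
disjoint = (tv_distance(p, q), hellinger_distance(p, q), js_distance(p, q))
```

With these definitions, identical vectors give 0 for all three distances and vectors with disjoint support give 1, which are the two ends of the [0, 1] range quoted in the method list.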
- binny.nz_tomo.between_sample_metrics.leakage_matrix(z: Any, bins: Mapping[int, Any], bin_edges: Mapping[int, tuple[float, float]] | Sequence[float] | ndarray, *, unit: Literal['fraction', 'percent'] = 'fraction', decimal_places: int | None = 2) → dict[int, dict[int, float]]
Computes a leakage/confusion matrix between bins based on nominal edges.
The leakage matrix leak[i][j] gives the fraction of the total mass in bin i that lies within the edges of bin j. The diagonal entries therefore give the completeness of each bin with respect to its nominal edges, while the off-diagonal entries give the mass of bin i that leaks into the nominal range of another bin j, that is, the contamination between bins.
- Parameters:
z – One-dimensional redshift grid shared by all bins.
bins – Mapping from bin index to bin distributions evaluated on z.
bin_edges – Either a mapping from bin index to (low, high) edges, or a sequence/array of edges where bin i has edges (bin_edges[i], bin_edges[i+1]).
unit – Output units. If "percent", values are multiplied by 100.
decimal_places – Rounding precision for output values.
- Returns:
Nested mapping leak[i][j] giving the fraction of mass in bin i that lies within the edges of bin j.
- Raises:
ValueError – If a bin has non-positive total mass.
ValueError – If bin edges are invalid (hi <= lo).
ValueError – If unit is not supported.
- binny.nz_tomo.between_sample_metrics.overlap_pairs(z: Any, bins: Mapping[int, Any], *, threshold: float = 10.0, unit: Literal['fraction', 'percent'] = 'percent', method: str = 'min', direction: Literal['high', 'low'] = 'high', normalize: bool = False, rtol: float = 0.001, atol: float = 1e-06, decimal_places: int | None = 2) → list[tuple[int, int, float]]
Returns bin-index pairs passing a threshold in a chosen pairwise metric.
This is a convenience wrapper around bin_overlap(). It computes the pairwise matrix and returns unique off-diagonal pairs (i, j) with i < j that pass the requested threshold.
- Parameters:
z – One-dimensional redshift grid shared by all bins.
bins – Mapping from bin index to bin distributions evaluated on z.
threshold – Threshold to apply in the units specified by unit.
unit – Units used for both the overlap calculation and the threshold. Accepted values are "fraction" and "percent".
method – Pairwise metric passed to bin_overlap().
direction – Whether to select values >= threshold ("high") or <= threshold ("low").
normalize – Passed to bin_overlap().
rtol – Relative tolerance for normalization check (if needed).
atol – Absolute tolerance for normalization check (if needed).
decimal_places – Rounding precision for output values.
- Returns:
List of (i, j, value) tuples with i < j, sorted by decreasing value for direction="high" and increasing value for direction="low".
- Raises:
ValueError – If direction is not "high" or "low".
- binny.nz_tomo.between_sample_metrics.pearson_matrix(z: Any, bins: Mapping[int, Any], *, normalize: bool = False, rtol: float = 0.001, atol: float = 1e-06, decimal_places: int | None = 2) → dict[int, dict[int, float]]
Computes a trapezoid-weighted Pearson correlation matrix between bin curves.
The Pearson correlation between two curves f(z) and g(z) is defined as
corr(f, g) = cov(f, g) / (std(f) * std(g))
where the covariance and standard deviations are computed using trapezoid integration weights over the redshift grid.
Note: if normalize=True, the comparison is in terms of shape correlations, since all curves are normalized to unit integral before computing the correlation. If normalize=False, the correlation reflects both shape and amplitude similarities.
- Parameters:
z – One-dimensional redshift grid shared by all bins.
bins – Mapping from bin index to bin distributions evaluated on z.
normalize – Control normalization behavior. If True, all bins are normalized before computing correlations.
rtol – Relative tolerance for the normalization check.
atol – Absolute tolerance for the normalization check.
decimal_places – Rounding precision for output values.
- Returns:
Nested mapping corr[i][j] giving the Pearson correlation between bins i and j.
- Raises:
ValueError – If a bin has non-positive integral when normalization is checked or performed.