harv.models package

Model components for Keplerian orbit inference.

This subpackage contains:

  • Parameterization objects (internal): declare parameter roles and design matrices.

  • Concrete model classes: RVModel, GaiaAstrometryModel.

  • JointModel: composition of multiple component models.

class harv.models.AbstractComponentModel

Bases: Module

Abstract base for single-data-type component models.

Component models are templates: they carry only parameterization and extensions (config). data and linear_prior are passed at evaluation time to the methods that need them. This means the same model instance can be re-used across multiple datasets without rebuilding.

Concrete subclasses must implement:

  • _param_infos: all parameter descriptors.

  • _base_design_matrix(nl_values, data): the base design matrix from runtime data + nonlinear values.

  • _strip_obs(data): return (obs, obs_err) as unit-stripped JAX arrays.

  • _obs_unit(data): the physical unit string of the observations.

Concrete subclasses must also declare:

  • extensions: tuple of AbstractExtension (model modifiers).

__init__()
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

chi_squared(nl_values, linear_values, data)

Goodness-of-fit \(\chi^2\) for one fully-specified parameter set.

Unlike log_prob() (which marginalizes the linear parameters and returns a marginal log-likelihood), this evaluates the model at the given linear-parameter values and returns the residual statistic

\[\chi^2 = r^\top C^{-1} r, \qquad r = y_\mathrm{obs} - X\,y,\]

where \(C\) is the (extension-modified) observation covariance. For a diagonal covariance this is \(\sum_i r_i^2 / \sigma_i^2\); a Gaussian-process extension promotes \(C\) to a full matrix and the Mahalanobis form is used. Jitter is included via the inflated \(C\).

Parameters:
  • nl_values (dict[str, Any]) – Nonlinear parameter values (orbital + any extension parameters), in the same form accepted by log_prob().

  • linear_values (dict[str, Array]) – Linear parameter values, unit-stripped to the model’s linear parameter units (see _linear_param_units()).

  • data (AbstractData) – Runtime observation data.

Return type:

Array

Returns:

Scalar \(\chi^2\).

log_prob(nl_values, data, *, linear_priors=None, linear_values=None, marginalized_names=None)

Compute the log-likelihood.

Three calling conventions are supported:

  1. Auto mode (recommended): pass linear_prior and let the model classify which linear params to marginalize. Non-Gaussian linear priors are expected as entries in nl_values.

  2. Manual marginalization: pass marginalized_names (and optionally linear_values) to control exactly which linear params are marginalized.

  3. Explicit evaluation: pass linear_values without marginalized_names (and linear_priors=None) to evaluate the Gaussian log-likelihood at fixed linear parameter values.

Parameters:
  • nl_values (dict[str, Any]) – Parameter values. In auto mode this may contain explicit linear parameter values alongside the nonlinear ones.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData / SystemData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for analytic marginalization. Required for auto and manual-marginalization modes.

  • linear_values (dict[str, Array] | None) – Explicit linear parameter values (unit-stripped). When given without marginalized_names, triggers explicit evaluation.

  • marginalized_names (tuple[str, ...] | None) – Which linear params to marginalize (manual mode).

Return type:

Array

Returns:

Scalar log-likelihood.

numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)

Build a numpyro model function for MCMC sampling.

Parameters:
  • nonlinear_priors (dict[str, Distribution | QuantityDistribution]) – Prior distributions for nonlinear parameters. Keys are parameter names (e.g. "period", "eccentricity"). Values are Distribution or QuantityDistribution.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for the linear parameters. Required when any marginalization happens (marginalized=True or full non-marginalized mode that still needs explicit linear priors).

  • marginalized (bool) – If True (default), linear parameters are analytically marginalized and only nonlinear parameters are sampled. If False, all parameters are sampled explicitly.

  • marginalized_names (tuple[str, ...] | None) – Optional subset of linear parameter names to analytically marginalize when marginalized=True. None means use the model’s automatic prior-based classification.

Return type:

Callable[[], None]

Returns:

A no-argument callable suitable for numpyro.infer.MCMC.

params_explicit(linear_priors)

Names of parameters that must be explicitly sampled.

Given a (resolved) linear_prior dict, returns all nonlinear parameters (orbital + extension, e.g. jitter) plus any linear parameters with non-Gaussian priors (e.g. parallax with a HalfNormal prior) that cannot be analytically marginalized.

These are the keys that must appear in the values dict passed to log_prob().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

params_marginalized(linear_priors)

Names of linear parameters analytically marginalized in log_prob.

Given a (resolved) linear_prior dict, returns the linear parameter names whose priors are Gaussian (or callable-returning-Normal) and so are integrated out analytically via the Woodbury identity rather than sampled. Their values are NOT required in the values dict; they are recovered afterward via sample_conditional_linear().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

predict(nl_values, linear_values, data)

Full predicted observable y_pred = X @ y at the data times.

X is the extension-augmented design matrix (_full_design_matrix()) and y is the ordered linear-parameter vector built from linear_values (missing entries default to 0). Used by _log_prob_explicit(), chi_squared(), and the plot functions so that every prediction path shares the same construction.

Parameters:
Return type:

Array

sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, explicit_linear=None, use_mean=False)

Sample linear parameters from the conditional posterior.

In auto mode (both marginalized_names and explicit_linear are None), the method classifies from linear_prior and extracts explicit linear values from nl_values.

Returns all linear parameter values (both sampled and explicit), unit-stripped.

When use_mean=True, the conditional posterior mean is returned for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. This is the appropriate choice when completing a MAP estimate (see optimize()); MCMC paths should keep the default use_mean=False.

Parameters:
Return type:

dict[str, Array]

extensions: AbstractVar[tuple[AbstractExtension, ...]]
class harv.models.AbstractParameterization

Bases: Module

Abstract base for model parameterizations.

Subclasses must implement params() (returning all parameter descriptors). The nonlinear_params() and linear_params() convenience methods are derived automatically.

Concrete parameterizations are also expected to override default_prior() with a type-narrow signature describing their required scale arguments.

__init__()
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

default_prior(**kwargs)

Build a HarvPrior with sensible defaults.

Each concrete parameterization overrides this with its own type-narrow signature for the required scale arguments (e.g. sigma_K0 for RV, sigma_a0 for astrometry). The base implementation raises NotImplementedError; default_prior is not declared @abstractmethod because subclass signatures legitimately differ.

Parameters:

**kwargs (Any) – Parameterization-specific keyword arguments (scale arguments and per-parameter prior overrides).

Returns:

A prior whose nonlinear_priors and linear_prior entries match the names declared by self.params().

Return type:

HarvPrior

linear_log_prior_correction(linear_map)

Optional log-prior correction added to the marginal log-likelihood.

Called with the conditional-mean linear-parameter values (unit-stripped, in the model’s observation units) after Gaussian marginalization. Returns None (no correction) by default.

Non-trivial only when the parameterization’s linear parameters are not the natural physical parameters — for example, Thiele-Innes constants (A, B, F, G) instead of the Campbell elements (a_0, ω, Ω, cos i). In that case the Jacobian of the change of variables must be applied to recover the correct posterior when priors are specified in the physical (Campbell) space.

Parameters:

linear_map (dict[str, Array]) – Conditional-mean values of the marginalized linear parameters, keyed by parameter name.

Returns:

Scalar log-correction to add to the marginal log-likelihood, or None if no correction is needed.

Return type:

Array or None

linear_params()

Return only the linear parameters.

Return type:

tuple[ParamInfo, ...]

nonlinear_params()

Return only the nonlinear parameters.

Return type:

tuple[ParamInfo, ...]

abstractmethod params()

All parameters declared by this parameterization (nonlinear first).

Return type:

tuple[ParamInfo, ...]

class harv.models.EcoswEsinwRV

Bases: AbstractParameterization

Alternative RV parameterization using e*cos(w) and e*sin(w).

Replaces the (eccentricity, arg_peri) pair with (ecosw, esinw) = (e*cos(omega), e*sin(omega)), which often has better sampling geometry for low eccentricities:

  • Nonlinear:
    • period - the orbital period

    • ecosw - the eccentricity times cosine of argument of periastron

    • esinw - the eccentricity times sine of argument of periastron

    • phase_peri - the phase at which the mean anomaly is zero (i.e. periastron passage), using a time system relative to the data’s reference time

  • Linear:
    • rv_semiamp - sometimes called “K”, the RV semi-amplitude

    • v_sys - the systemic velocity

Examples

>>> from harv.models.parameterizations.rv import EcoswEsinwRV
>>> p = EcoswEsinwRV()
>>> [pi.name for pi in p.params()]
['period', 'ecosw', 'esinw', 'phase_peri', 'rv_semiamp', 'v_sys']
__init__()
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

default_prior(*, period_min=None, period_max=None, sigma_K0=None, sigma_v0=None, P0=Quantity(Array(1., dtype=float32, weak_type=True), unit='yr'), **kwargs)

Build a HarvPrior with sensible defaults.

Nonlinear priors:

  • period: log-uniform on [period_min, period_max].

  • ecosw: Uniform(-1, 1).

  • esinw: Uniform(-1, 1).

  • phase_peri: Uniform(0, 1).

Linear priors are the same as StandardRV.default_prior().

Independent Uniform(-1, 1) priors on ecosw and esinw do not match the implicit prior under e ~ Beta(0.867, 3.03) with omega ~ Uniform(0, 2*pi). This is the simplest sensible default for this parameterization; for a matched prior, sample with StandardRV and convert.

Parameters:
Return type:

HarvPrior

design_matrix(sin_f, cos_f, nl_values)

Build (n_obs, 2) design matrix from ecosw/esinw.

Parameters:
  • sin_f (Array) – Sine of true anomaly (unit-stripped).

  • cos_f (Array) – Cosine of true anomaly (unit-stripped).

  • nl_values (dict[str, Any]) – Must contain "ecosw" and "esinw" (dimensionless scalars). The eccentricity and arg_peri are derived internally.

Return type:

Array

Returns:

Design matrix block, shape (n_obs, 2).

eccentricity(nl_values)

Return the orbital eccentricity derived from ecosw/esinw.

Parameters:

nl_values (dict[str, Any])

Return type:

Any

linear_log_prior_correction(linear_map)

Optional log-prior correction added to the marginal log-likelihood.

Called with the conditional-mean linear-parameter values (unit-stripped, in the model’s observation units) after Gaussian marginalization. Returns None (no correction) by default.

Non-trivial only when the parameterization’s linear parameters are not the natural physical parameters — for example, Thiele-Innes constants (A, B, F, G) instead of the Campbell elements (a_0, ω, Ω, cos i). In that case the Jacobian of the change of variables must be applied to recover the correct posterior when priors are specified in the physical (Campbell) space.

Parameters:

linear_map (dict[str, Array]) – Conditional-mean values of the marginalized linear parameters, keyed by parameter name.

Returns:

Scalar log-correction to add to the marginal log-likelihood, or None if no correction is needed.

Return type:

Array or None

linear_params()

Return only the linear parameters.

Return type:

tuple[ParamInfo, ...]

nonlinear_params()

Return only the nonlinear parameters.

Return type:

tuple[ParamInfo, ...]

params()

All parameters declared by this parameterization (nonlinear first).

Return type:

tuple[ParamInfo, ...]

strip_nl_for_design(nl_values)

Return nl_values with units stripped for design_matrix.

Parameters:

nl_values (dict[str, Any])

Return type:

dict[str, Any]

class harv.models.GaiaAstrometryModel

Bases: AbstractComponentModel

Gaia epoch astrometry component model (template).

Carries a parameterization and extensions. Data and the linear prior are passed at evaluation time so the same model instance can be reused across multiple datasets.

Parameters:
  • parameterization (StandardGaiaAstrometry | ThieleInnesGaiaAstrometry) – Declares parameter names/roles and builds the base design matrix.

  • extensions (tuple[AbstractExtension, ...]) – Model extensions (jitter, trends, …).

  • pm_time_unit (str) – If not None, override the proper motion units in the design matrix to use this unit instead of the default (obs_unit / yr).

Examples

>>> from harv.models.astrometry import GaiaAstrometryModel
>>> model = GaiaAstrometryModel()
>>> sorted(model._all_nonlinear_names())
['arg_peri', 'cos_i', 'eccentricity', 'lon_asc_node', 'period', 'phase_peri']
__init__(parameterization=StandardGaiaAstrometry(), extensions=(), *, pm_time_unit='yr')
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

chi_squared(nl_values, linear_values, data)

Goodness-of-fit \(\chi^2\) for one fully-specified parameter set.

Unlike log_prob() (which marginalizes the linear parameters and returns a marginal log-likelihood), this evaluates the model at the given linear-parameter values and returns the residual statistic

\[\chi^2 = r^\top C^{-1} r, \qquad r = y_\mathrm{obs} - X\,y,\]

where \(C\) is the (extension-modified) observation covariance. For a diagonal covariance this is \(\sum_i r_i^2 / \sigma_i^2\); a Gaussian-process extension promotes \(C\) to a full matrix and the Mahalanobis form is used. Jitter is included via the inflated \(C\).

Parameters:
  • nl_values (dict[str, Any]) – Nonlinear parameter values (orbital + any extension parameters), in the same form accepted by log_prob().

  • linear_values (dict[str, Array]) – Linear parameter values, unit-stripped to the model’s linear parameter units (see _linear_param_units()).

  • data (AbstractData) – Runtime observation data.

Return type:

Array

Returns:

Scalar \(\chi^2\).

extensions: tuple[AbstractExtension, ...] = ()
log_prob(nl_values, data, *, linear_priors=None, linear_values=None, marginalized_names=None)

Compute the log-likelihood.

Three calling conventions are supported:

  1. Auto mode (recommended): pass linear_prior and let the model classify which linear params to marginalize. Non-Gaussian linear priors are expected as entries in nl_values.

  2. Manual marginalization: pass marginalized_names (and optionally linear_values) to control exactly which linear params are marginalized.

  3. Explicit evaluation: pass linear_values without marginalized_names (and linear_priors=None) to evaluate the Gaussian log-likelihood at fixed linear parameter values.

Parameters:
  • nl_values (dict[str, Any]) – Parameter values. In auto mode this may contain explicit linear parameter values alongside the nonlinear ones.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData / SystemData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for analytic marginalization. Required for auto and manual-marginalization modes.

  • linear_values (dict[str, Array] | None) – Explicit linear parameter values (unit-stripped). When given without marginalized_names, triggers explicit evaluation.

  • marginalized_names (tuple[str, ...] | None) – Which linear params to marginalize (manual mode).

Return type:

Array

Returns:

Scalar log-likelihood.

numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)

Build a numpyro model function for MCMC sampling.

Parameters:
  • nonlinear_priors (dict[str, Distribution | QuantityDistribution]) – Prior distributions for nonlinear parameters. Keys are parameter names (e.g. "period", "eccentricity"). Values are Distribution or QuantityDistribution.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for the linear parameters. Required when any marginalization happens (marginalized=True or full non-marginalized mode that still needs explicit linear priors).

  • marginalized (bool) – If True (default), linear parameters are analytically marginalized and only nonlinear parameters are sampled. If False, all parameters are sampled explicitly.

  • marginalized_names (tuple[str, ...] | None) – Optional subset of linear parameter names to analytically marginalize when marginalized=True. None means use the model’s automatic prior-based classification.

Return type:

Callable[[], None]

Returns:

A no-argument callable suitable for numpyro.infer.MCMC.

parameterization: StandardGaiaAstrometry | ThieleInnesGaiaAstrometry = StandardGaiaAstrometry()
params_explicit(linear_priors)

Names of parameters that must be explicitly sampled.

Given a (resolved) linear_prior dict, returns all nonlinear parameters (orbital + extension, e.g. jitter) plus any linear parameters with non-Gaussian priors (e.g. parallax with a HalfNormal prior) that cannot be analytically marginalized.

These are the keys that must appear in the values dict passed to log_prob().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

params_marginalized(linear_priors)

Names of linear parameters analytically marginalized in log_prob.

Given a (resolved) linear_prior dict, returns the linear parameter names whose priors are Gaussian (or callable-returning-Normal) and so are integrated out analytically via the Woodbury identity rather than sampled. Their values are NOT required in the values dict; they are recovered afterward via sample_conditional_linear().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

pm_time_unit: str = 'yr'
predict(nl_values, linear_values, data)

Full predicted observable y_pred = X @ y at the data times.

X is the extension-augmented design matrix (_full_design_matrix()) and y is the ordered linear-parameter vector built from linear_values (missing entries default to 0). Used by _log_prob_explicit(), chi_squared(), and the plot functions so that every prediction path shares the same construction.

Parameters:
Return type:

Array

predict_orbit_sky(nl_values, linear_values, times)

Predicted sky-plane orbital offsets (dRA, dDec) at times.

Delegates to self.parameterization.sky_orbit(...). Returns the orbital contribution only — proper motion, parallax, and zero-point are not included. This is what the on-sky orbit ellipse panel of plot_gaia_sky_orbit() plots.

The returned arrays are in the same unit as the scalar semi-major-axis linear param ("semi_major_axis" for StandardGaiaAstrometry, or the TI constants for ThieleInnesGaiaAstrometry).

Parameters:
Return type:

tuple[Array, Array]

sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, explicit_linear=None, use_mean=False)

Sample linear parameters from the conditional posterior.

In auto mode (both marginalized_names and explicit_linear are None), the method classifies from linear_prior and extracts explicit linear values from nl_values.

Returns all linear parameter values (both sampled and explicit), unit-stripped.

When use_mean=True, the conditional posterior mean is returned for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. This is the appropriate choice when completing a MAP estimate (see optimize()); MCMC paths should keep the default use_mean=False.

Parameters:
Return type:

dict[str, Array]

class harv.models.GP

Bases: AbstractExtension

Gaussian Process covariance extension.

Adds the kernel matrix K(t, t') to the observation covariance via the modify_covariance hook, allowing analytic marginalization of linear parameters in the presence of correlated noise.

Plotting code may choose to visualize the GP contribution, but that support is handled privately on the plotting side rather than through the extension base API.

Parameters:
  • kernel_builder (Callable[[dict[str, Any]], Any]) – Receives the full nonlinear-parameter dict (unit-stripped) and returns a callable kernel object that can be called with (X, Xp) to produce the kernel matrix. For example, from tinygp.

  • hyperparams (tuple[ParamInfo, ...]) – Nonlinear hyperparameters declared by this extension (e.g. gp_amp, gp_length_scale). These are sampled alongside other nonlinear model parameters.

  • time_unit (str) – Unit to strip times to before building the coordinate array. Default "" (dimensionless / already stripped).

Examples

>>> from harv.models.extensions.base import ParamInfo
>>> from harv.models.extensions.gp import GP; GP(
...     kernel_builder=lambda hp: hp["gp_amp"] ** 2,
...     hyperparams=(ParamInfo("gp_amp", "km/s"),),
...     time_unit="day",
... ).extra_params()[0].name
'gp_amp'
__init__(kernel_builder, hyperparams, time_unit='')
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

conditional_mean(residuals, data_times, prediction_times, data_err, hp)

Conditional-mean GP prediction at prediction_times.

Conditions on observed residuals (already mean-subtracted by the deterministic model prediction) at data_times with measurement noise data_err, then returns the GP posterior mean evaluated at prediction_times. Used by the plotting layer to overlay structured noise; the same kernel and hyperparameter conventions as modify_covariance() apply.

All time arrays are assumed to be unit-stripped to the same unit (the kernel’s coordinate unit). hp is the per-sample hyperparameter dict (e.g. from a single posterior sample).

Parameters:
Return type:

Array

extra_params()

Parameters introduced by this extension.

Return type:

tuple[ParamInfo, ...]

modify_covariance(cov, data, nl_values)

Add the GP kernel matrix K(t, t') to the covariance.

If the incoming covariance is diagonal (1-d), it is promoted to a full (n_obs, n_obs) matrix first.

Parameters:
Return type:

Array

modify_design_matrix(X, data, nl_values)

Optionally append columns to the design matrix.

Parameters:
  • X (Array) – Current design matrix (base + earlier extensions).

  • data (AbstractData | None) – Observation data (unit-stripped times, etc. accessed via helpers).

  • nl_values (dict[str, Any]) – Current nonlinear parameter values (unit-stripped scalars).

Return type:

Array

Returns:

Updated design matrix, shape (n_obs, n_cols + n_extra). Return X unchanged if not applicable.

time_unit: str = ''
kernel_builder: Callable[[dict[str, Any]], Any]
hyperparams: tuple[ParamInfo, ...]
class harv.models.Jitter

Bases: AbstractExtension

Add excess variance (jitter / white noise) to the observation covariance.

The extension declares one nonlinear parameter, jitter, and modifies the covariance by adding jitter**2 in quadrature to the diagonal variances. The jitter unit must match the observation unit (e.g. "km/s" for RV, "mas" for astrometry). Unit stripping is the model’s responsibility.

Parameters:

param_unit (str) – Physical unit string for the jitter parameter metadata. Default "" (dimensionless / observation units).

Examples

>>> from harv.models.extensions import Jitter
>>> j = Jitter(param_unit="km/s")
>>> j.extra_params()[0].unit
'km/s'
__init__(param_unit='')
Parameters:

param_unit (str)

Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

extra_params()

Parameters introduced by this extension.

Return type:

tuple[ParamInfo, ...]

modify_covariance(cov, data, nl_values)

Add jitter**2 in quadrature to the diagonal variances.

Works on both 1-d (diagonal) and 2-d (full) covariance representations.

Parameters:
Return type:

Array

modify_design_matrix(X, data, nl_values)

Optionally append columns to the design matrix.

Parameters:
  • X (Array) – Current design matrix (base + earlier extensions).

  • data (AbstractData | None) – Observation data (unit-stripped times, etc. accessed via helpers).

  • nl_values (dict[str, Any]) – Current nonlinear parameter values (unit-stripped scalars).

Return type:

Array

Returns:

Updated design matrix, shape (n_obs, n_cols + n_extra). Return X unchanged if not applicable.

param_unit: str = ''
class harv.models.MonomialTrend

Bases: AbstractExtension

Append monomial trend columns to the design matrix.

Uses a standard monomial basis: dt^1, dt^2, ..., dt^order (RV) or scan-angle-projected monomials (astrometry).

Parameters:
  • order (int) – Polynomial order (number of trend terms). Must be >= 1.

  • time_unit (str) – Unit string for time – used in ParamInfo metadata.

  • obs_unit (str) – Unit string for the observations – used in ParamInfo metadata.

  • astrometry (bool) – If True, add two columns per order (RA + Dec, projected by scan angle) with exponents k + 1 to avoid degeneracy with the base proper-motion columns. Default False (single column per order).

Examples

>>> from harv.models.extensions import MonomialTrend
>>> trend = MonomialTrend(order=2)
>>> [p.name for p in trend.extra_params()]
['trend_1', 'trend_2']
__init__(order=1, time_unit='', obs_unit='', astrometry=False)
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

astrometry: bool = False
extra_params()

Parameters introduced by this extension.

Return type:

tuple[ParamInfo, ...]

modify_covariance(cov, data, nl_values)

Optionally modify the data covariance matrix.

Parameters:
  • cov (Array) – Current covariance. Diagonal (1-d) when only measurement errors are present; full (2-d) after a GP extension adds off-diagonal structure. shape (n_obs,) or (n_obs, n_obs)

  • data (AbstractData | None) – Observation data.

  • nl_values (dict[str, Any]) – Current nonlinear parameter values (unit-stripped scalars).

Return type:

Array

Returns:

Updated covariance (same or promoted shape). Return cov unchanged if not applicable.

modify_design_matrix(X, data, nl_values)

Append trend columns to the design matrix.

For RV (astrometry=False):

Columns are (t - t_ref)^k for k = 1..order.

For Gaia astrometry (astrometry=True):

Two columns per order: sin(psi) * dt^(k+1) and cos(psi) * dt^(k+1) where dt = (t - t_ref) and psi is the scan angle. The exponent k + 1 avoids degeneracy with the base proper-motion (dt^1) columns.

The data object must have time and t_ref attributes (and scan_angle for astrometry mode).

Parameters:
Return type:

Array

obs_unit: str = ''
order: int = 1
time_unit: str = ''
class harv.models.MultiSurveyOffset

Bases: AbstractExtension

Per-instrument additive offsets to account for instrumental systematics.

The extension stores a pre-computed indicator matrix and appends it as extra columns to the design matrix. Each column corresponds to a non-reference instrument; the reference instrument has no explicit offset (absorbed by the systemic velocity or baseline parameter).

Parameters:
  • indicator_matrix (Array) – Binary matrix: indicator_matrix[i, j] == 1 iff observation i belongs to the j-th non-reference instrument.

  • instrument_names (tuple[str, ...]) – Ordered names of the non-reference instruments (same column order as indicator_matrix). Used as parameter names.

  • obs_unit (str) – Physical unit string for the offset parameters (must match the observation unit of the model, e.g. "km/s").

Examples

>>> import jax.numpy as jnp
>>> from harv.models.extensions import MultiSurveyOffset
>>> indicator = jnp.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
>>> ext = MultiSurveyOffset(indicator, ("espresso", "keck"), "km/s")
>>> [p.name for p in ext.extra_params()]
['espresso', 'keck']
__init__(indicator_matrix, instrument_names, obs_unit='')
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

extra_params()

Parameters introduced by this extension.

Return type:

tuple[ParamInfo, ...]

modify_covariance(cov, data, nl_values)

Optionally modify the data covariance matrix.

Parameters:
  • cov (Array) – Current covariance. Diagonal (1-d) when only measurement errors are present; full (2-d) after a GP extension adds off-diagonal structure. shape (n_obs,) or (n_obs, n_obs)

  • data (AbstractData | None) – Observation data.

  • nl_values (dict[str, Any]) – Current nonlinear parameter values (unit-stripped scalars).

Return type:

Array

Returns:

Updated covariance (same or promoted shape). Return cov unchanged if not applicable.

modify_design_matrix(X, data, nl_values)

Append indicator columns to the design matrix.

Parameters:
Return type:

Array

obs_unit: str = ''
indicator_matrix: Array
instrument_names: tuple[str, ...]
class harv.models.HarvPrior

Bases: Module

Prior distribution for rejection sampling of Keplerian orbits.

This class encapsulates the prior distributions for both nonlinear and linear parameters. It is agnostic to data type - the sampler determines which parameters are required based on the provided data.

We recommend using the “default” factory constructors (e.g. default_rv(), default_gaia_astrometry(), etc.), which set up sensible priors for common use cases.

Nonlinear parameterization:

Parameter names in nonlinear_priors must match the field names of the parameterization, for example, period, eccentricity, phase_peri, etc. These parameters are sampled explicitly.

See the options available in harv.models.parameterizations.

Default parameterizations:

Radial Velocity:
  • Nonlinear: period, eccentricity, phase_peri, arg_peri

  • Linear: rv_semiamp, v_sys

Astrometry:
  • Nonlinear: period, eccentricity, phase_peri, cos_i, arg_peri, lon_asc_node

  • Linear params: ra0, dec0, pmra, pmdec, parallax, semi_major_axis

Parameters:
  • nonlinear_priors (dict[str, Distribution | QuantityDistribution]) – Mapping from parameter name to its prior distribution (a bare dist.Distribution for dimensionless parameters, or a harv.distributions.QuantityDistribution wrapper for parameters with physical units).

  • linear_priors (dict[str, Distribution | QuantityDistribution | Callable[..., QuantityDistribution | Normal]]) –

    Per-parameter priors for linear parameters. Each entry is classified:

    • dist.Normal or QD(Normal) – Gaussian, can be analytically marginalized.

    • LinearPriorCallable – called with nonlinear params to produce a Normal, can be marginalized.

    • dist.HalfNormal, dist.Delta, etc. – non-Gaussian, sampled explicitly alongside nonlinear params.

    When using default_rv() with offsets, the non-reference offset priors are automatically included as linear parameters.

  • extension_priors (dict[str, Distribution | QuantityDistribution]) – Priors for extension parameters declared via extra_params().

__init__(nonlinear_priors, linear_priors, extension_priors=<factory>)
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

classmethod from_parameterization(parameterization, **kwargs)

Build a default prior for any supported parameterization.

Delegates to parameterization.default_prior(**kwargs). Each concrete parameterization declares its own required scale arguments (e.g. sigma_K0 for RV, sigma_a0 for astrometry).

Parameters:
Return type:

HarvPrior

Examples

>>> from unxt import Q
>>> from harv.models.parameterizations.rv import EcoswEsinwRV
>>> from harv.samplers import HarvPrior
>>> prior = HarvPrior.from_parameterization(
...     EcoswEsinwRV(),
...     period_min=Q(2.0, "day"),
...     period_max=Q(1000.0, "day"),
...     sigma_K0=Q(30.0, "km/s"),
...     sigma_v0=Q(50.0, "km/s"),
... )
>>> sorted(prior.nonlinear_priors.keys())
['ecosw', 'esinw', 'period', 'phase_peri']
sample(key, n_samples, *, model, return_logprobs=False, marginalized_names=None)

Draw a complete prior sample for a given model.

Unlike sample_nonlinear(), this draws every parameter the rejection sampler would consume for model:

  • base nonlinear params from nonlinear_priors

  • extension nonlinear params (e.g. jitter, GP hypers) discovered by walking model.extensions

  • any linear params from linear_priors that are sampled explicitly rather than analytically marginalized. With the default marginalized_names=None this is just the non-Gaussian linear priors (e.g. HalfNormal-prior parallax); when marginalized_names is given it is every linear param not in that set (see below).

The returned Samples is the same container the rejection sampler produces for posteriors, with units restored from each QuantityDistribution. Its linear field is empty in the common Gaussian-linear case. ln_likelihood is always None (no data yet); ln_prior is populated when return_logprobs=True, summing the nonlinear (base + extension) prior log-densities — matching the convention used by RejectionSampler.run().

model is required because the set of extension and explicit-linear parameters depends on the extensions attached to the model.

Parameters:
  • key (Array) – JAX random key.

  • n_samples (int) – Number of prior draws.

  • model (AbstractComponentModel | JointModel) – Component or joint model template defining the extensions and linear-prior classification.

  • return_logprobs (bool) – If True, populate Samples.ln_prior with the per-sample nonlinear-prior log-density.

  • marginalized_names (tuple[str, ...] | None) – Which linear params are analytically marginalized (and therefore not drawn here), mirroring RejectionSampler.marginalized_names. Must match the consuming sampler so the explicit-linear keys agree. None (default) marginalizes every linear param that can be (the Gaussian ones), leaving only non-Gaussian priors explicit.

Returns:

A container holding the full prior draw, suitable for passing to RejectionSampler.run_with_samples() or for caching via harv.samplers.make_prior_cache().

Return type:

Samples

Examples

>>> import jax
>>> from unxt import Q
>>> from harv import HarvPrior, RVModel, StandardRV
>>> prior = StandardRV().default_prior(
...     period_min=Q(2.0, "day"),
...     period_max=Q(1000.0, "day"),
...     sigma_K0=Q(30.0, "km/s"),
...     sigma_v0=Q(50.0, "km/s"),
... )
>>> samples = prior.sample(jax.random.key(0), 100, model=RVModel())
>>> samples.n_samples
100
>>> samples.data_type
'RVModel'
sample_nonlinear(key, n_samples)

Sample nonlinear parameters from priors.

Parameters:
  • key (Array) – Random key for sampling.

  • n_samples (int) – Number of samples to draw.

Return type:

dict[str, Any]

Returns:

Dictionary mapping each parameter name to an array of shape (n_samples,). Values are bare JAX arrays regardless of whether the distribution is wrapped in QuantityDistribution.

Examples

>>> import jax
>>> from unxt import Q
>>> from harv.samplers import HarvPrior
>>> sorted(
...     HarvPrior.default_rv(
...         period_min=Q(2.0, "day"),
...         period_max=Q(1000.0, "day"),
...         sigma_K0=Q(30.0, "km/s"),
...         sigma_v0=Q(50.0, "km/s"),
...     ).sample_nonlinear(jax.random.key(0), 10).keys()
... )
['arg_peri', 'eccentricity', 'period', 'phase_peri']
>>> HarvPrior.default_rv(
...     period_min=Q(2.0, "day"),
...     period_max=Q(1000.0, "day"),
...     sigma_K0=Q(30.0, "km/s"),
...     sigma_v0=Q(50.0, "km/s"),
... ).sample_nonlinear(jax.random.key(0), 10)["period"].shape
(10,)
nonlinear_priors: dict[str, Distribution | QuantityDistribution]
linear_priors: dict[str, Distribution | QuantityDistribution | Callable[..., QuantityDistribution | Normal]]
extension_priors: dict[str, Distribution | QuantityDistribution]
class harv.models.JointModel

Bases: Module

Composition of component models that share orbital parameters.

We recommend using the JointModel factory methods like for_rv_and_gaia() or for_sb2().

Parameters:
  • components (dict[str, AbstractComponentModel]) – Named component models. Keys are used to namespace component-specific parameters (e.g. "rv.jitter").

  • shared_params (tuple[str, ...]) – Names of orbital parameters shared across all components (e.g. ("period", "eccentricity", "phase_peri", "arg_peri")).

  • shared_linear_params (tuple[str, ...])

__init__(components, shared_params, shared_linear_params=())
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

property component_names: tuple[str, ...]

Names of the components in this joint model.

classmethod for_rv_and_gaia(components, *, shared_params=None, shared_linear_params=None)

Build a JointModel for combined RV + Gaia astrometry.

TODO: if we want to support more complex shared parameters, look at for_sb2 to see what we can generalize

Parameters:
  • components (dict[str, AbstractComponentModel]) – RV and Gaia astrometry component models (e.g. {"rv": ..., "astro": ...}).

  • shared_params (tuple[str, ...] | None) – Override the default shared orbital parameters. Defaults to ("period", "eccentricity", "phase_peri", "arg_peri").

  • shared_linear_params (tuple[str, ...] | None) – Linear parameter names shared across components. Defaults to () (no shared linear params for heterogeneous joint models).

Return type:

JointModel

Returns:

The constructed JointModel.

classmethod for_sb2(prior, *, component_names=('primary', 'secondary'), extensions=(), shared_params=None, shared_linear_params=None)

Build an SB2 JointModel from a prior.

The prior is expected to follow the convention of default_sb2_prior(): linear-prior keys name1.rv_semiamp and name2.rv_semiamp map to the per-component rv_semiamp of the components, named “name1” and “name2” in this example, but they can be customized. Other linear-prior keys (e.g. v_sys) are automatically declared shared across components.

Data is not bound to the model; pass it at sampler.run(data, ...) time.

Parameters:
  • prior (Any) – SB2-style prior. Keys for named component-specific parameters (e.g., rv_semiamp) must correspond to the component names in component_names. For example, with the default names “primary” and “secondary”, the prior must have keys “primary.rv_semiamp” and “secondary.rv_semiamp”. Other linear parameters (e.g. “v_sys”) are automatically treated as shared across components.

  • component_names (tuple[str, ...]) – Names of the two SB2 components. Defaults to ("primary", "secondary").

  • extensions (tuple[Any, ...] | dict[str, tuple[Any, ...]]) –

    Extensions to attach to each component.

    • A bare tuple is applied to all components.

    • A dict is keyed by component name; missing keys yield no extensions for that component.

  • shared_params (tuple[str, ...] | None) – Defaults to the standard nonlinear shared orbital params. For example, “period”, “eccentricity”, “phase_peri”, and “arg_peri”.

  • shared_linear_params (tuple[str, ...] | None) – Defaults to every key in prior.linear_prior except the rv_semiamp keys.

Return type:

JointModel

Returns:

The constructed JointModel.

Examples

>>> from unxt import Q
>>> from harv.models.joint import JointModel
>>> from harv.models.priors import default_sb2_prior
>>> prior = default_sb2_prior(
...     period_min=Q(10., "day"), period_max=Q(1000., "day"),
...     sigma_K0=Q(30., "km/s"), sigma_v0=Q(50., "km/s"),
... )
>>> joint = JointModel.for_sb2(prior=prior)
>>> joint.shared_linear_params
('v_sys',)
log_prob(nl_values, data, *, linear_priors=None, marginalized_names=None)

Compute the joint log-likelihood.

When shared_linear_params contains names that are being analytically marginalized, a single joint MarginalizedLinear is built spanning all components (the joint path). Otherwise the per-component log-likelihoods are summed as before.

Parameters:
  • nl_values (dict[str, Any]) – Flat dict of parameter values. Shared orbital params use bare names ("period", "eccentricity", etc.). Component-specific nonlinear params use "component.param" convention (e.g. "rv.jitter").

  • data (AbstractDatasetContainer) – Per-component data, indexed by component name.

  • linear_priors (dict[str, Any] | None) – Flat merged linear-prior dict. None means treat all linear params as marginalizable.

  • marginalized_names (tuple[str, ...] | None) – Optional linear parameter names to marginalize. Component-qualified names are accepted (e.g. "rv.parallax" or just "parallax" if unambiguous).

Return type:

Array

Returns:

Scalar log-likelihood.

numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)

Build a numpyro model for MCMC sampling of the joint model.

Parameters:
  • nonlinear_priors (dict[str, Distribution | QuantityDistribution]) – Prior distributions for all nonlinear parameters. Shared orbital params use bare names. Component-specific params use "component.param" convention.

  • data (AbstractDatasetContainer) – Per-component data, indexed by component name.

  • linear_priors (dict[str, Any] | None) – Flat merged linear-prior dict.

  • marginalized (bool) – If True (default), linear parameters are marginalized per-component. If False, all parameters are sampled explicitly.

  • marginalized_names (tuple[str, ...] | None) – Optional linear parameter names to marginalize when marginalized=True. Component-qualified names are accepted.

Return type:

Callable[[], None]

Returns:

A numpyro model function suitable for use with a sampler.

params_explicit(linear_priors)

Names of parameters that must be explicitly sampled.

Shared nonlinear params use bare names (e.g. "period"). Component-specific nonlinear params use "comp.param" notation (e.g. "rv.jitter"). Explicit-linear params (non-Gaussian priors, e.g. "parallax") are listed flat without namespace prefix, matching how they appear in the log_prob values dict.

Parameters:

linear_priors (dict[str, Any] | None) – The merged linear-prior dict (with "comp.param" qualified keys for non-shared params). None means treat all linear params as marginalizable.

Return type:

tuple[str, ...]

params_marginalized(linear_priors)

Names of linear parameters analytically marginalized across all components.

De-duplicated; order follows component iteration order.

Parameters:

linear_priors (dict[str, Any] | None) – The merged linear-prior dict. None means treat all linear params as marginalizable.

Return type:

tuple[str, ...]

sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, use_mean=False)

Sample conditional linear params for each component.

When shared_linear_params are jointly marginalized, shared parameters appear at the top level of the returned dict (with bare names) rather than inside per-component sub-dicts.

Parameters:
  • nl_values (dict[str, Any]) – Flat parameter values dict.

  • key (Array) – JAX PRNG key.

  • data (AbstractDatasetContainer) – Per-component data, indexed by component name.

  • linear_priors (dict[str, Any] | None) – Flat merged linear-prior dict.

  • marginalized_names (tuple[str, ...] | None) – Optional linear parameter names to marginalize.

  • use_mean (bool) – When True, return the conditional posterior mean for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. Default False.

Return type:

dict[str, Any]

Returns:

Sampled parameter values. The structure depends on the path:

  • Default path (no shared marginalization): dict[comp_name, dict[param_name, array]].

  • Joint path (shared marginalization): mixed dict where shared params are top-level and per-component params are in sub-dicts keyed by component name.

shared_linear_params: tuple[str, ...] = ()
components: dict[str, AbstractComponentModel]
shared_params: tuple[str, ...]
class harv.models.RVModel

Bases: AbstractComponentModel

Radial-velocity component model (template).

Carries a parameterization and extensions. Data and the linear prior are passed at evaluation time so the same model instance can be reused across multiple datasets.

Parameters:

Examples

>>> from harv.models.rv import RVModel
>>> model = RVModel()
>>> sorted(model._all_nonlinear_names())
['arg_peri', 'eccentricity', 'period', 'phase_peri']
__init__(parameterization=StandardRV(), extensions=())
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

chi_squared(nl_values, linear_values, data)

Goodness-of-fit \(\chi^2\) for one fully-specified parameter set.

Unlike log_prob() (which marginalizes the linear parameters and returns a marginal log-likelihood), this evaluates the model at the given linear-parameter values and returns the residual statistic

\[\chi^2 = r^\top C^{-1} r, \qquad r = y_\mathrm{obs} - X\,y,\]

where \(C\) is the (extension-modified) observation covariance. For a diagonal covariance this is \(\sum_i r_i^2 / \sigma_i^2\); a Gaussian-process extension promotes \(C\) to a full matrix and the Mahalanobis form is used. Jitter is included via the inflated \(C\).

Parameters:
  • nl_values (dict[str, Any]) – Nonlinear parameter values (orbital + any extension parameters), in the same form accepted by log_prob().

  • linear_values (dict[str, Array]) – Linear parameter values, unit-stripped to the model’s linear parameter units (see _linear_param_units()).

  • data (AbstractData) – Runtime observation data.

Return type:

Array

Returns:

Scalar \(\chi^2\).

extensions: tuple[AbstractExtension, ...] = ()
log_prob(nl_values, data, *, linear_priors=None, linear_values=None, marginalized_names=None)

Compute the log-likelihood.

Three calling conventions are supported:

  1. Auto mode (recommended): pass linear_prior and let the model classify which linear params to marginalize. Non-Gaussian linear priors are expected as entries in nl_values.

  2. Manual marginalization: pass marginalized_names (and optionally linear_values) to control exactly which linear params are marginalized.

  3. Explicit evaluation: pass linear_values without marginalized_names (and linear_priors=None) to evaluate the Gaussian log-likelihood at fixed linear parameter values.

Parameters:
  • nl_values (dict[str, Any]) – Parameter values. In auto mode this may contain explicit linear parameter values alongside the nonlinear ones.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData / SystemData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for analytic marginalization. Required for auto and manual-marginalization modes.

  • linear_values (dict[str, Array] | None) – Explicit linear parameter values (unit-stripped). When given without marginalized_names, triggers explicit evaluation.

  • marginalized_names (tuple[str, ...] | None) – Which linear params to marginalize (manual mode).

Return type:

Array

Returns:

Scalar log-likelihood.

numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)

Build a numpyro model function for MCMC sampling.

Parameters:
  • nonlinear_priors (dict[str, Distribution | QuantityDistribution]) – Prior distributions for nonlinear parameters. Keys are parameter names (e.g. "period", "eccentricity"). Values are Distribution or QuantityDistribution.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for the linear parameters. Required when any marginalization happens (marginalized=True or full non-marginalized mode that still needs explicit linear priors).

  • marginalized (bool) – If True (default), linear parameters are analytically marginalized and only nonlinear parameters are sampled. If False, all parameters are sampled explicitly.

  • marginalized_names (tuple[str, ...] | None) – Optional subset of linear parameter names to analytically marginalize when marginalized=True. None means use the model’s automatic prior-based classification.

Return type:

Callable[[], None]

Returns:

A no-argument callable suitable for numpyro.infer.MCMC.

parameterization: StandardRV | EcoswEsinwRV = StandardRV()
params_explicit(linear_priors)

Names of parameters that must be explicitly sampled.

Given a (resolved) linear_prior dict, returns all nonlinear parameters (orbital + extension, e.g. jitter) plus any linear parameters with non-Gaussian priors (e.g. parallax with a HalfNormal prior) that cannot be analytically marginalized.

These are the keys that must appear in the values dict passed to log_prob().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

params_marginalized(linear_priors)

Names of linear parameters analytically marginalized in log_prob.

Given a (resolved) linear_prior dict, returns the linear parameter names whose priors are Gaussian (or callable-returning-Normal) and so are integrated out analytically via the Woodbury identity rather than sampled. Their values are NOT required in the values dict; they are recovered afterward via sample_conditional_linear().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

predict(nl_values, linear_values, data)

Full predicted observable y_pred = X @ y at the data times.

X is the extension-augmented design matrix (_full_design_matrix()) and y is the ordered linear-parameter vector built from linear_values (missing entries default to 0). Used by _log_prob_explicit(), chi_squared(), and the plot functions so that every prediction path shares the same construction.

Parameters:
Return type:

Array

predict_at_times(times, nl_values, linear_values, *, t_ref, obs_unit='km/s')

Predicted RV at arbitrary times, no observed-data object required.

Internally constructs an RVData shim at times with dummy rv / rv_err (zeros / ones) and delegates to predict(). The dummy obs are never read by the prediction path (_full_design_matrix only consumes data.time and data.t_ref; extensions read at most those fields too). The returned array is in the same units the model’s linear parameters are expressed in.

Raises TypeError if any MultiSurveyOffset is in self.extensions — its indicator_matrix is fixed to the original data’s row count and cannot apply on an arbitrary time grid. Callers building a smooth-curve overlay should filter it out before calling.

Parameters:
  • times (Real[Quantity[PhysicalType('time')], '*batch'])

  • nl_values (dict[str, Any])

  • linear_values (dict[str, Array])

  • t_ref (Real[Quantity[PhysicalType('time')], ''])

  • obs_unit (str)

Return type:

Array

sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, explicit_linear=None, use_mean=False)

Sample linear parameters from the conditional posterior.

In auto mode (both marginalized_names and explicit_linear are None), the method classifies from linear_prior and extracts explicit linear values from nl_values.

Returns all linear parameter values (both sampled and explicit), unit-stripped.

When use_mean=True, the conditional posterior mean is returned for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. This is the appropriate choice when completing a MAP estimate (see optimize()); MCMC paths should keep the default use_mean=False.

Parameters:
Return type:

dict[str, Array]

class harv.models.StandardGaiaAstrometry

Bases: AbstractParameterization

Standard Gaia astrometry parameterization.

The default harv parameterization for Gaia epoch astrometry modeling uses the following parameters:

  • Nonlinear:
    • period - orbital period

    • eccentricity - orbital eccentricity

    • phase_peri - phase at which the mean anomaly is zero (i.e. periastron passage), using a time system relative to the data’s reference time

    • arg_peri - argument of periastron

    • lon_asc_node - longitude of the ascending node

    • cos_i - cosine of the inclination

  • Linear:
    • ra0 - right ascension at the reference epoch

    • dec0 - declination at the reference epoch

    • pmra - proper motion in right ascension

    • pmdec - proper motion in declination

    • parallax - parallax

    • semi_major_axis - semi-major axis

The design matrix has shape (n_obs, 6) with columns [ra0, dec0, pmra, pmdec, parallax, a_orbit].

Parameters are defined in the Gaia local-plane coordinate (LPC) convention from Lindegren & Bastian (GAIA-C3-TN-LU-LL-061-08).

Examples

>>> from harv.models.parameterizations.gaia import StandardGaiaAstrometry
>>> p = StandardGaiaAstrometry()
>>> [pp.name for pp in p.nonlinear_params()]
['period', 'eccentricity', 'phase_peri', 'arg_peri', 'lon_asc_node', 'cos_i']
__init__()
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

default_prior(*, period_min=None, period_max=None, sigma_a0=None, sigma_parallax=None, sigma_pos=None, sigma_vtan=None, P0=Quantity(Array(1., dtype=float32, weak_type=True), unit='yr'), **kwargs)

Build a HarvPrior with sensible defaults.

Same defaults as harv.samplers.HarvPrior.default_gaia_astrometry() (and default_gaia_astrometry is a thin wrapper around this method).

Parameters:
  • period_min (Real[Quantity[PhysicalType('time')], ''] | None)

  • period_max (Real[Quantity[PhysicalType('time')], ''] | None)

  • sigma_a0 (Real[Quantity[PhysicalType('length')], ''] | None)

  • sigma_parallax (Real[Quantity[PhysicalType('angle')], ''] | None)

  • sigma_pos (Real[Quantity[PhysicalType('angle')], ''] | None)

  • sigma_vtan (Real[Quantity[PhysicalType({'speed', 'velocity'})], ''] | None)

  • P0 (Real[Quantity[PhysicalType('time')], ''])

  • kwargs (Distribution | QuantityDistribution | Callable[..., QuantityDistribution | Normal])

Return type:

HarvPrior

design_matrix(sin_f, cos_f, dt, sin_psi, cos_psi, parallax_factor, nl_values)

Build (n_obs, 6) along-scan design matrix.

Columns: [ra0, dec0, pmra, pmdec, parallax, semi_major_axis].

Parameters:
  • sin_f (Array) – Sine of true anomaly (unit-stripped).

  • cos_f (Array) – Cosine of true anomaly (unit-stripped).

  • dt (Array) – Time elapsed since reference epoch (unit-stripped, in internal time unit, typically years).

  • sin_psi (Array) – Sine of scan angle.

  • cos_psi (Array) – Cosine of scan angle.

  • parallax_factor (Array) – Parallax factor (unit-stripped).

  • nl_values (dict[str, Any]) – Must contain "eccentricity", "arg_peri", "lon_asc_node", "cos_i" (unit-stripped scalars).

Return type:

Array

Returns:

Design matrix block, shape (n_obs, 6).

linear_log_prior_correction(linear_map)

Optional log-prior correction added to the marginal log-likelihood.

Called with the conditional-mean linear-parameter values (unit-stripped, in the model’s observation units) after Gaussian marginalization. Returns None (no correction) by default.

Non-trivial only when the parameterization’s linear parameters are not the natural physical parameters — for example, Thiele-Innes constants (A, B, F, G) instead of the Campbell elements (a_0, ω, Ω, cos i). In that case the Jacobian of the change of variables must be applied to recover the correct posterior when priors are specified in the physical (Campbell) space.

Parameters:

linear_map (dict[str, Array]) – Conditional-mean values of the marginalized linear parameters, keyed by parameter name.

Returns:

Scalar log-correction to add to the marginal log-likelihood, or None if no correction is needed.

Return type:

Array or None

linear_params()

Return only the linear parameters.

Return type:

tuple[ParamInfo, ...]

nonlinear_params()

Return only the nonlinear parameters.

Return type:

tuple[ParamInfo, ...]

params()

All parameters declared by this parameterization (nonlinear first).

Return type:

tuple[ParamInfo, ...]

sky_orbit(times, nl_values, linear_values)

Sky-plane orbital offsets (dRA, dDec) from the photocentre orbit.

Computes (dRA, dDec) = a_0 * (B*X + G*Y, A*X + F*Y) using the same kepler primitives (thiele_innes_ABFG(), mean_anomaly(), true_anomaly_from_mean()) that the design matrix uses. Returns bare JAX arrays in the same unit as the scalar linear_values["semi_major_axis"] (unit-agnostic by construction).

Parameters:
Return type:

tuple[Array, Array]

class harv.models.StandardRV

Bases: AbstractParameterization

Standard RV parameterization.

The default harv parameterization for radial velocity modeling uses the following Keplerian parameters:

  • Nonlinear:
    • period - orbital period

    • eccentricity - orbital eccentricity

    • phase_peri - phase at which the mean anomaly is zero (i.e. periastron passage), using a time system relative to the data’s reference time

    • arg_peri - argument of periastron

  • Linear:
    • rv_semiamp - sometimes called “K”, the RV semi-amplitude

    • v_sys - systemic velocity

The design matrix has shape (n_obs, 2) with columns [zd(t), 1] where zd(t) = cos(omega + f(t)) + e * cos(omega) is the RV shape function.

Examples

>>> from harv.models.parameterizations.rv import StandardRV
>>> p = StandardRV()
>>> [pp.name for pp in p.params()]
['period', 'eccentricity', 'phase_peri', 'arg_peri', 'rv_semiamp', 'v_sys']
__init__()
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

default_prior(*, period_min=None, period_max=None, sigma_K0=None, sigma_v0=None, P0=Quantity(Array(1., dtype=float32, weak_type=True), unit='yr'), **kwargs)

Build a HarvPrior with sensible defaults.

Same defaults as harv.samplers.HarvPrior.default_rv() (and default_rv is a thin wrapper around this method).

Parameters:
  • period_min (Real[Quantity[PhysicalType('time')], ''] | None) – Log-uniform period bounds.

  • period_max (Real[Quantity[PhysicalType('time')], ''] | None) – Log-uniform period bounds.

  • sigma_K0 (Real[Quantity[PhysicalType({'speed', 'velocity'})], ''] | None) – RV semi-amplitude scale at reference period P0.

  • sigma_v0 (Real[Quantity[PhysicalType({'speed', 'velocity'})], ''] | None) – Systemic-velocity prior scale.

  • P0 (Real[Quantity[PhysicalType('time')], '']) – Reference period for the period-dependent rv_semiamp prior.

  • **kwargs (Distribution | QuantityDistribution | Callable[..., QuantityDistribution | Normal]) – Per-parameter prior overrides or extension priors.

Return type:

HarvPrior

design_matrix(sin_f, cos_f, nl_values)

Build (n_obs, 2) design matrix: columns [rv_shape, 1].

Parameters:
  • sin_f (Array) – Sine of true anomaly (unit-stripped).

  • cos_f (Array) – Cosine of true anomaly (unit-stripped).

  • nl_values (dict[str, Any]) – Must contain "eccentricity" and "arg_peri" (both unit-stripped scalars).

Return type:

Array

Returns:

Design matrix block, shape (n_obs, 2).

eccentricity(nl_values)

Return the orbital eccentricity from nonlinear values.

Parameters:

nl_values (dict[str, Any])

Return type:

Any

linear_log_prior_correction(linear_map)

Optional log-prior correction added to the marginal log-likelihood.

Called with the conditional-mean linear-parameter values (unit-stripped, in the model’s observation units) after Gaussian marginalization. Returns None (no correction) by default.

Non-trivial only when the parameterization’s linear parameters are not the natural physical parameters — for example, Thiele-Innes constants (A, B, F, G) instead of the Campbell elements (a_0, ω, Ω, cos i). In that case the Jacobian of the change of variables must be applied to recover the correct posterior when priors are specified in the physical (Campbell) space.

Parameters:

linear_map (dict[str, Array]) – Conditional-mean values of the marginalized linear parameters, keyed by parameter name.

Returns:

Scalar log-correction to add to the marginal log-likelihood, or None if no correction is needed.

Return type:

Array or None

linear_params()

Return only the linear parameters.

Return type:

tuple[ParamInfo, ...]

nonlinear_params()

Return only the nonlinear parameters.

Return type:

tuple[ParamInfo, ...]

params()

All parameters declared by this parameterization (nonlinear first).

Return type:

tuple[ParamInfo, ...]

strip_nl_for_design(nl_values)

Return nl_values with units stripped for design_matrix.

Parameters:

nl_values (dict[str, Any])

Return type:

dict[str, Any]

harv.models.default_sb2_prior(*, period_min=None, period_max=None, sigma_K0=None, sigma_v0=None, P0=Quantity(Array(1., dtype=float32, weak_type=True), unit='yr'), component_names=('primary', 'secondary'), **kwargs)

Create default prior for SB2 (double-lined) radial velocity data.

SB2 is a joint composition of two StandardRV components, not a single parameterization, so this lives as a module-level factory rather than a classmethod on HarvPrior. It pairs naturally with harv.models.JointModel.for_sb2() (JointModel.for_sb2(prior=...)).

Both semi-amplitudes use the same period-dependent scaling as HarvPrior.default_rv(). The systemic velocity prior is a fixed Gaussian.

The default names for the two components are “primary” and “secondary”, which means the linear priors for the semi-amplitudes must be keyed as “primary.rv_semiamp” and “secondary.rv_semiamp”. You can customize the component names via the component_names argument, but the linear prior keys must always be {component_name}.rv_semiamp.

Parameters:
  • period_min (Real[Quantity[PhysicalType('time')], ''] | None) – Lower bound for the log-uniform period prior.

  • period_max (Real[Quantity[PhysicalType('time')], ''] | None) – Upper bound for the log-uniform period prior.

  • sigma_K0 (Real[Quantity[PhysicalType({'speed', 'velocity'})], ''] | None) – RV semi-amplitude scale at the reference period P0.

  • sigma_v0 (Real[Quantity[PhysicalType({'speed', 'velocity'})], ''] | None) – Systemic velocity prior scale.

  • P0 (Real[Quantity[PhysicalType('time')], '']) – Reference period for the K prior scaling. Default: 1 yr.

  • component_names (tuple[str, str]) – Names of the two components. These are used to construct the linear prior keys for the semi-amplitudes (e.g. “primary.rv_semiamp” and “secondary.rv_semiamp”).

  • **kwargs (Distribution | QuantityDistribution | Callable[..., QuantityDistribution | Normal]) – Override any default nonlinear or linear prior by name.

Returns:

Prior configured for SB2 RV data.

Return type:

HarvPrior

Examples

>>> from unxt import Q
>>> from harv.samplers import default_sb2_prior
>>> sorted(
...     default_sb2_prior(
...         period_min=Q(2.0, "day"),
...         period_max=Q(1000.0, "day"),
...         sigma_K0=Q(30.0, "km/s"),
...         sigma_v0=Q(50.0, "km/s"),
...     ).nonlinear_priors.keys()
... )
['arg_peri', 'eccentricity', 'period', 'phase_peri']
>>> sorted(
...     default_sb2_prior(
...         period_min=Q(2.0, "day"),
...         period_max=Q(1000.0, "day"),
...         sigma_K0=Q(30.0, "km/s"),
...         sigma_v0=Q(50.0, "km/s"),
...     ).linear_priors
... )
['primary.rv_semiamp', 'secondary.rv_semiamp', 'v_sys']

Subpackages

Submodules

harv.models.astrometry module

Concrete Gaia epoch-astrometry component model.

GaiaAstrometryModel is a template carrying a parameterization (default StandardGaiaAstrometry) and extensions. Data and linear priors are passed at evaluation time.

final class harv.models.astrometry.GaiaAstrometryModel

Bases: AbstractComponentModel

Gaia epoch astrometry component model (template).

Carries a parameterization and extensions. Data and the linear prior are passed at evaluation time so the same model instance can be reused across multiple datasets.

Parameters:
  • parameterization (StandardGaiaAstrometry | ThieleInnesGaiaAstrometry) – Declares parameter names/roles and builds the base design matrix.

  • extensions (tuple[AbstractExtension, ...]) – Model extensions (jitter, trends, …).

  • pm_time_unit (str) – If not None, override the proper motion units in the design matrix to use this unit instead of the default (obs_unit / yr).

Examples

>>> from harv.models.astrometry import GaiaAstrometryModel
>>> model = GaiaAstrometryModel()
>>> sorted(model._all_nonlinear_names())
['arg_peri', 'cos_i', 'eccentricity', 'lon_asc_node', 'period', 'phase_peri']
parameterization: StandardGaiaAstrometry | ThieleInnesGaiaAstrometry = StandardGaiaAstrometry()
extensions: tuple[AbstractExtension, ...] = ()
pm_time_unit: str = 'yr'
predict_orbit_sky(nl_values, linear_values, times)

Predicted sky-plane orbital offsets (dRA, dDec) at times.

Delegates to self.parameterization.sky_orbit(...). Returns the orbital contribution only — proper motion, parallax, and zero-point are not included. This is what the on-sky orbit ellipse panel of plot_gaia_sky_orbit() plots.

The returned arrays are in the same unit as the scalar semi-major-axis linear param ("semi_major_axis" for StandardGaiaAstrometry, or the TI constants for ThieleInnesGaiaAstrometry).

Parameters:
Return type:

tuple[Array, Array]

__init__(parameterization=StandardGaiaAstrometry(), extensions=(), *, pm_time_unit='yr')
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

chi_squared(nl_values, linear_values, data)

Goodness-of-fit \(\chi^2\) for one fully-specified parameter set.

Unlike log_prob() (which marginalizes the linear parameters and returns a marginal log-likelihood), this evaluates the model at the given linear-parameter values and returns the residual statistic

\[\chi^2 = r^\top C^{-1} r, \qquad r = y_\mathrm{obs} - X\,y,\]

where \(C\) is the (extension-modified) observation covariance. For a diagonal covariance this is \(\sum_i r_i^2 / \sigma_i^2\); a Gaussian-process extension promotes \(C\) to a full matrix and the Mahalanobis form is used. Jitter is included via the inflated \(C\).

Parameters:
  • nl_values (dict[str, Any]) – Nonlinear parameter values (orbital + any extension parameters), in the same form accepted by log_prob().

  • linear_values (dict[str, Array]) – Linear parameter values, unit-stripped to the model’s linear parameter units (see _linear_param_units()).

  • data (AbstractData) – Runtime observation data.

Return type:

Array

Returns:

Scalar \(\chi^2\).

log_prob(nl_values, data, *, linear_priors=None, linear_values=None, marginalized_names=None)

Compute the log-likelihood.

Three calling conventions are supported:

  1. Auto mode (recommended): pass linear_prior and let the model classify which linear params to marginalize. Non-Gaussian linear priors are expected as entries in nl_values.

  2. Manual marginalization: pass marginalized_names (and optionally linear_values) to control exactly which linear params are marginalized.

  3. Explicit evaluation: pass linear_values without marginalized_names (and linear_priors=None) to evaluate the Gaussian log-likelihood at fixed linear parameter values.

Parameters:
  • nl_values (dict[str, Any]) – Parameter values. In auto mode this may contain explicit linear parameter values alongside the nonlinear ones.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData / SystemData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for analytic marginalization. Required for auto and manual-marginalization modes.

  • linear_values (dict[str, Array] | None) – Explicit linear parameter values (unit-stripped). When given without marginalized_names, triggers explicit evaluation.

  • marginalized_names (tuple[str, ...] | None) – Which linear params to marginalize (manual mode).

Return type:

Array

Returns:

Scalar log-likelihood.

numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)

Build a numpyro model function for MCMC sampling.

Parameters:
  • nonlinear_priors (dict[str, Distribution | QuantityDistribution]) – Prior distributions for nonlinear parameters. Keys are parameter names (e.g. "period", "eccentricity"). Values are Distribution or QuantityDistribution.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for the linear parameters. Required when any marginalization happens (marginalized=True or full non-marginalized mode that still needs explicit linear priors).

  • marginalized (bool) – If True (default), linear parameters are analytically marginalized and only nonlinear parameters are sampled. If False, all parameters are sampled explicitly.

  • marginalized_names (tuple[str, ...] | None) – Optional subset of linear parameter names to analytically marginalize when marginalized=True. None means use the model’s automatic prior-based classification.

Return type:

Callable[[], None]

Returns:

A no-argument callable suitable for numpyro.infer.MCMC.

params_explicit(linear_priors)

Names of parameters that must be explicitly sampled.

Given a (resolved) linear_prior dict, returns all nonlinear parameters (orbital + extension, e.g. jitter) plus any linear parameters with non-Gaussian priors (e.g. parallax with a HalfNormal prior) that cannot be analytically marginalized.

These are the keys that must appear in the values dict passed to log_prob().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

params_marginalized(linear_priors)

Names of linear parameters analytically marginalized in log_prob.

Given a (resolved) linear_prior dict, returns the linear parameter names whose priors are Gaussian (or callable-returning-Normal) and so are integrated out analytically via the Woodbury identity rather than sampled. Their values are NOT required in the values dict; they are recovered afterward via sample_conditional_linear().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

predict(nl_values, linear_values, data)

Full predicted observable y_pred = X @ y at the data times.

X is the extension-augmented design matrix (_full_design_matrix()) and y is the ordered linear-parameter vector built from linear_values (missing entries default to 0). Used by _log_prob_explicit(), chi_squared(), and the plot functions so that every prediction path shares the same construction.

Parameters:
Return type:

Array

sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, explicit_linear=None, use_mean=False)

Sample linear parameters from the conditional posterior.

In auto mode (both marginalized_names and explicit_linear are None), the method classifies from linear_prior and extracts explicit linear values from nl_values.

Returns all linear parameter values (both sampled and explicit), unit-stripped.

When use_mean=True, the conditional posterior mean is returned for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. This is the appropriate choice when completing a MAP estimate (see optimize()); MCMC paths should keep the default use_mean=False.

Parameters:
Return type:

dict[str, Array]

harv.models.component module

Abstract base for component models.

A component model generates predictions for a single data type (RV or astrometry) and evaluates the log-likelihood – either by analytically marginalizing over the linear parameters or by evaluating at explicit values.

class harv.models.component.AbstractComponentModel

Bases: Module

Abstract base for single-data-type component models.

Component models are templates: they carry only parameterization and extensions (config). data and linear_prior are passed at evaluation time to the methods that need them. This means the same model instance can be re-used across multiple datasets without rebuilding.

Concrete subclasses must implement:

  • _param_infos: all parameter descriptors.

  • _base_design_matrix(nl_values, data): the base design matrix from runtime data + nonlinear values.

  • _strip_obs(data): return (obs, obs_err) as unit-stripped JAX arrays.

  • _obs_unit(data): the physical unit string of the observations.

Concrete subclasses must also declare:

  • extensions: tuple of AbstractExtension (model modifiers).

extensions: AbstractVar[tuple[AbstractExtension, ...]]
params_explicit(linear_priors)

Names of parameters that must be explicitly sampled.

Given a (resolved) linear_prior dict, returns all nonlinear parameters (orbital + extension, e.g. jitter) plus any linear parameters with non-Gaussian priors (e.g. parallax with a HalfNormal prior) that cannot be analytically marginalized.

These are the keys that must appear in the values dict passed to log_prob().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

params_marginalized(linear_priors)

Names of linear parameters analytically marginalized in log_prob.

Given a (resolved) linear_prior dict, returns the linear parameter names whose priors are Gaussian (or callable-returning-Normal) and so are integrated out analytically via the Woodbury identity rather than sampled. Their values are NOT required in the values dict; they are recovered afterward via sample_conditional_linear().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

log_prob(nl_values, data, *, linear_priors=None, linear_values=None, marginalized_names=None)

Compute the log-likelihood.

Three calling conventions are supported:

  1. Auto mode (recommended): pass linear_prior and let the model classify which linear params to marginalize. Non-Gaussian linear priors are expected as entries in nl_values.

  2. Manual marginalization: pass marginalized_names (and optionally linear_values) to control exactly which linear params are marginalized.

  3. Explicit evaluation: pass linear_values without marginalized_names (and linear_priors=None) to evaluate the Gaussian log-likelihood at fixed linear parameter values.

Parameters:
  • nl_values (dict[str, Any]) – Parameter values. In auto mode this may contain explicit linear parameter values alongside the nonlinear ones.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData / SystemData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for analytic marginalization. Required for auto and manual-marginalization modes.

  • linear_values (dict[str, Array] | None) – Explicit linear parameter values (unit-stripped). When given without marginalized_names, triggers explicit evaluation.

  • marginalized_names (tuple[str, ...] | None) – Which linear params to marginalize (manual mode).

Return type:

Array

Returns:

Scalar log-likelihood.

predict(nl_values, linear_values, data)

Full predicted observable y_pred = X @ y at the data times.

X is the extension-augmented design matrix (_full_design_matrix()) and y is the ordered linear-parameter vector built from linear_values (missing entries default to 0). Used by _log_prob_explicit(), chi_squared(), and the plot functions so that every prediction path shares the same construction.

Parameters:
Return type:

Array

chi_squared(nl_values, linear_values, data)

Goodness-of-fit \(\chi^2\) for one fully-specified parameter set.

Unlike log_prob() (which marginalizes the linear parameters and returns a marginal log-likelihood), this evaluates the model at the given linear-parameter values and returns the residual statistic

\[\chi^2 = r^\top C^{-1} r, \qquad r = y_\mathrm{obs} - X\,y,\]

where \(C\) is the (extension-modified) observation covariance. For a diagonal covariance this is \(\sum_i r_i^2 / \sigma_i^2\); a Gaussian-process extension promotes \(C\) to a full matrix and the Mahalanobis form is used. Jitter is included via the inflated \(C\).

Parameters:
  • nl_values (dict[str, Any]) – Nonlinear parameter values (orbital + any extension parameters), in the same form accepted by log_prob().

  • linear_values (dict[str, Array]) – Linear parameter values, unit-stripped to the model’s linear parameter units (see _linear_param_units()).

  • data (AbstractData) – Runtime observation data.

Return type:

Array

Returns:

Scalar \(\chi^2\).

sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, explicit_linear=None, use_mean=False)

Sample linear parameters from the conditional posterior.

In auto mode (both marginalized_names and explicit_linear are None), the method classifies from linear_prior and extracts explicit linear values from nl_values.

Returns all linear parameter values (both sampled and explicit), unit-stripped.

When use_mean=True, the conditional posterior mean is returned for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. This is the appropriate choice when completing a MAP estimate (see optimize()); MCMC paths should keep the default use_mean=False.

Parameters:
Return type:

dict[str, Array]

numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)

Build a numpyro model function for MCMC sampling.

Parameters:
  • nonlinear_priors (dict[str, Distribution | QuantityDistribution]) – Prior distributions for nonlinear parameters. Keys are parameter names (e.g. "period", "eccentricity"). Values are Distribution or QuantityDistribution.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for the linear parameters. Required when any marginalization happens (marginalized=True or full non-marginalized mode that still needs explicit linear priors).

  • marginalized (bool) – If True (default), linear parameters are analytically marginalized and only nonlinear parameters are sampled. If False, all parameters are sampled explicitly.

  • marginalized_names (tuple[str, ...] | None) – Optional subset of linear parameter names to analytically marginalize when marginalized=True. None means use the model’s automatic prior-based classification.

Return type:

Callable[[], None]

Returns:

A no-argument callable suitable for numpyro.infer.MCMC.

__init__()
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

harv.models.joint module

JointModel: composition of component models with shared parameters.

A JointModel holds multiple AbstractComponentModel instances and sums their log-likelihoods. Shared orbital parameters (period, eccentricity, phase_peri, arg_peri) are passed once and forwarded to every component. Component-specific nonlinear parameters (e.g. per-component jitter) are prefixed with the component name ("{component_name}.{param_name}" becomes {param_name} when forwarded to the component).

final class harv.models.joint.JointModel

Bases: Module

Composition of component models that share orbital parameters.

We recommend using the JointModel factory methods like for_rv_and_gaia() or for_sb2().

Parameters:
  • components (dict[str, AbstractComponentModel]) – Named component models. Keys are used to namespace component-specific parameters (e.g. "rv.jitter").

  • shared_params (tuple[str, ...]) – Names of orbital parameters shared across all components (e.g. ("period", "eccentricity", "phase_peri", "arg_peri")).

  • shared_linear_params (tuple[str, ...])

components: dict[str, AbstractComponentModel]
shared_params: tuple[str, ...]
shared_linear_params: tuple[str, ...] = ()
classmethod for_rv_and_gaia(components, *, shared_params=None, shared_linear_params=None)

Build a JointModel for combined RV + Gaia astrometry.

TODO: if we want to support more complex shared parameters, look at for_sb2 to see what we can generalize

Parameters:
  • components (dict[str, AbstractComponentModel]) – RV and Gaia astrometry component models (e.g. {"rv": ..., "astro": ...}).

  • shared_params (tuple[str, ...] | None) – Override the default shared orbital parameters. Defaults to ("period", "eccentricity", "phase_peri", "arg_peri").

  • shared_linear_params (tuple[str, ...] | None) – Linear parameter names shared across components. Defaults to () (no shared linear params for heterogeneous joint models).

Return type:

JointModel

Returns:

The constructed JointModel.

classmethod for_sb2(prior, *, component_names=('primary', 'secondary'), extensions=(), shared_params=None, shared_linear_params=None)

Build an SB2 JointModel from a prior.

The prior is expected to follow the convention of default_sb2_prior(): linear-prior keys name1.rv_semiamp and name2.rv_semiamp map to the per-component rv_semiamp of the components, named “name1” and “name2” in this example, but they can be customized. Other linear-prior keys (e.g. v_sys) are automatically declared shared across components.

Data is not bound to the model; pass it at sampler.run(data, ...) time.

Parameters:
  • prior (Any) – SB2-style prior. Keys for named component-specific parameters (e.g., rv_semiamp) must correspond to the component names in component_names. For example, with the default names “primary” and “secondary”, the prior must have keys “primary.rv_semiamp” and “secondary.rv_semiamp”. Other linear parameters (e.g. “v_sys”) are automatically treated as shared across components.

  • component_names (tuple[str, ...]) – Names of the two SB2 components. Defaults to ("primary", "secondary").

  • extensions (tuple[Any, ...] | dict[str, tuple[Any, ...]]) –

    Extensions to attach to each component.

    • A bare tuple is applied to all components.

    • A dict is keyed by component name; missing keys yield no extensions for that component.

  • shared_params (tuple[str, ...] | None) – Defaults to the standard nonlinear shared orbital params. For example, “period”, “eccentricity”, “phase_peri”, and “arg_peri”.

  • shared_linear_params (tuple[str, ...] | None) – Defaults to every key in prior.linear_prior except the rv_semiamp keys.

Return type:

JointModel

Returns:

The constructed JointModel.

Examples

>>> from unxt import Q
>>> from harv.models.joint import JointModel
>>> from harv.models.priors import default_sb2_prior
>>> prior = default_sb2_prior(
...     period_min=Q(10., "day"), period_max=Q(1000., "day"),
...     sigma_K0=Q(30., "km/s"), sigma_v0=Q(50., "km/s"),
... )
>>> joint = JointModel.for_sb2(prior=prior)
>>> joint.shared_linear_params
('v_sys',)
property component_names: tuple[str, ...]

Names of the components in this joint model.

params_explicit(linear_priors)

Names of parameters that must be explicitly sampled.

Shared nonlinear params use bare names (e.g. "period"). Component-specific nonlinear params use "comp.param" notation (e.g. "rv.jitter"). Explicit-linear params (non-Gaussian priors, e.g. "parallax") are listed flat without namespace prefix, matching how they appear in the log_prob values dict.

Parameters:

linear_priors (dict[str, Any] | None) – The merged linear-prior dict (with "comp.param" qualified keys for non-shared params). None means treat all linear params as marginalizable.

Return type:

tuple[str, ...]

params_marginalized(linear_priors)

Names of linear parameters analytically marginalized across all components.

De-duplicated; order follows component iteration order.

Parameters:

linear_priors (dict[str, Any] | None) – The merged linear-prior dict. None means treat all linear params as marginalizable.

Return type:

tuple[str, ...]

log_prob(nl_values, data, *, linear_priors=None, marginalized_names=None)

Compute the joint log-likelihood.

When shared_linear_params contains names that are being analytically marginalized, a single joint MarginalizedLinear is built spanning all components (the joint path). Otherwise the per-component log-likelihoods are summed as before.

Parameters:
  • nl_values (dict[str, Any]) – Flat dict of parameter values. Shared orbital params use bare names ("period", "eccentricity", etc.). Component-specific nonlinear params use "component.param" convention (e.g. "rv.jitter").

  • data (AbstractDatasetContainer) – Per-component data, indexed by component name.

  • linear_priors (dict[str, Any] | None) – Flat merged linear-prior dict. None means treat all linear params as marginalizable.

  • marginalized_names (tuple[str, ...] | None) – Optional linear parameter names to marginalize. Component-qualified names are accepted (e.g. "rv.parallax" or just "parallax" if unambiguous).

Return type:

Array

Returns:

Scalar log-likelihood.

sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, use_mean=False)

Sample conditional linear params for each component.

When shared_linear_params are jointly marginalized, shared parameters appear at the top level of the returned dict (with bare names) rather than inside per-component sub-dicts.

Parameters:
  • nl_values (dict[str, Any]) – Flat parameter values dict.

  • key (Array) – JAX PRNG key.

  • data (AbstractDatasetContainer) – Per-component data, indexed by component name.

  • linear_priors (dict[str, Any] | None) – Flat merged linear-prior dict.

  • marginalized_names (tuple[str, ...] | None) – Optional linear parameter names to marginalize.

  • use_mean (bool) – When True, return the conditional posterior mean for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. Default False.

Return type:

dict[str, Any]

Returns:

Sampled parameter values. The structure depends on the path:

  • Default path (no shared marginalization): dict[comp_name, dict[param_name, array]].

  • Joint path (shared marginalization): mixed dict where shared params are top-level and per-component params are in sub-dicts keyed by component name.

numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)

Build a numpyro model for MCMC sampling of the joint model.

Parameters:
  • nonlinear_priors (dict[str, Distribution | QuantityDistribution]) – Prior distributions for all nonlinear parameters. Shared orbital params use bare names. Component-specific params use "component.param" convention.

  • data (AbstractDatasetContainer) – Per-component data, indexed by component name.

  • linear_priors (dict[str, Any] | None) – Flat merged linear-prior dict.

  • marginalized (bool) – If True (default), linear parameters are marginalized per-component. If False, all parameters are sampled explicitly.

  • marginalized_names (tuple[str, ...] | None) – Optional linear parameter names to marginalize when marginalized=True. Component-qualified names are accepted.

Return type:

Callable[[], None]

Returns:

A numpyro model function suitable for use with a sampler.

__init__(components, shared_params, shared_linear_params=())
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

harv.models.rv module

Concrete RV component model.

RVModel is a template carrying a parameterization (default StandardRV) and extensions. Data and linear priors are passed at evaluation time to RVModel.log_prob() / RVModel.sample_conditional_linear() / RVModel.numpyro_model().

final class harv.models.rv.RVModel

Bases: AbstractComponentModel

Radial-velocity component model (template).

Carries a parameterization and extensions. Data and the linear prior are passed at evaluation time so the same model instance can be reused across multiple datasets.

Parameters:

Examples

>>> from harv.models.rv import RVModel
>>> model = RVModel()
>>> sorted(model._all_nonlinear_names())
['arg_peri', 'eccentricity', 'period', 'phase_peri']
parameterization: StandardRV | EcoswEsinwRV = StandardRV()
extensions: tuple[AbstractExtension, ...] = ()
predict_at_times(times, nl_values, linear_values, *, t_ref, obs_unit='km/s')

Predicted RV at arbitrary times, no observed-data object required.

Internally constructs an RVData shim at times with dummy rv / rv_err (zeros / ones) and delegates to predict(). The dummy obs are never read by the prediction path (_full_design_matrix only consumes data.time and data.t_ref; extensions read at most those fields too). The returned array is in the same units the model’s linear parameters are expressed in.

Raises TypeError if any MultiSurveyOffset is in self.extensions — its indicator_matrix is fixed to the original data’s row count and cannot apply on an arbitrary time grid. Callers building a smooth-curve overlay should filter it out before calling.

Parameters:
  • times (Real[Quantity[PhysicalType('time')], '*batch'])

  • nl_values (dict[str, Any])

  • linear_values (dict[str, Array])

  • t_ref (Real[Quantity[PhysicalType('time')], ''])

  • obs_unit (str)

Return type:

Array

__init__(parameterization=StandardRV(), extensions=())
Parameters:
Return type:

None

static __new__(cls, *args, **kwargs)
Parameters:
Return type:

TypeVar(_ModuleT, bound= Module)

chi_squared(nl_values, linear_values, data)

Goodness-of-fit \(\chi^2\) for one fully-specified parameter set.

Unlike log_prob() (which marginalizes the linear parameters and returns a marginal log-likelihood), this evaluates the model at the given linear-parameter values and returns the residual statistic

\[\chi^2 = r^\top C^{-1} r, \qquad r = y_\mathrm{obs} - X\,y,\]

where \(C\) is the (extension-modified) observation covariance. For a diagonal covariance this is \(\sum_i r_i^2 / \sigma_i^2\); a Gaussian-process extension promotes \(C\) to a full matrix and the Mahalanobis form is used. Jitter is included via the inflated \(C\).

Parameters:
  • nl_values (dict[str, Any]) – Nonlinear parameter values (orbital + any extension parameters), in the same form accepted by log_prob().

  • linear_values (dict[str, Array]) – Linear parameter values, unit-stripped to the model’s linear parameter units (see _linear_param_units()).

  • data (AbstractData) – Runtime observation data.

Return type:

Array

Returns:

Scalar \(\chi^2\).

log_prob(nl_values, data, *, linear_priors=None, linear_values=None, marginalized_names=None)

Compute the log-likelihood.

Three calling conventions are supported:

  1. Auto mode (recommended): pass linear_prior and let the model classify which linear params to marginalize. Non-Gaussian linear priors are expected as entries in nl_values.

  2. Manual marginalization: pass marginalized_names (and optionally linear_values) to control exactly which linear params are marginalized.

  3. Explicit evaluation: pass linear_values without marginalized_names (and linear_priors=None) to evaluate the Gaussian log-likelihood at fixed linear parameter values.

Parameters:
  • nl_values (dict[str, Any]) – Parameter values. In auto mode this may contain explicit linear parameter values alongside the nonlinear ones.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData / SystemData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for analytic marginalization. Required for auto and manual-marginalization modes.

  • linear_values (dict[str, Array] | None) – Explicit linear parameter values (unit-stripped). When given without marginalized_names, triggers explicit evaluation.

  • marginalized_names (tuple[str, ...] | None) – Which linear params to marginalize (manual mode).

Return type:

Array

Returns:

Scalar log-likelihood.

numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)

Build a numpyro model function for MCMC sampling.

Parameters:
  • nonlinear_priors (dict[str, Distribution | QuantityDistribution]) – Prior distributions for nonlinear parameters. Keys are parameter names (e.g. "period", "eccentricity"). Values are Distribution or QuantityDistribution.

  • data (AbstractData) – Runtime observation data (RVData / GaiaAstrometryData).

  • linear_priors (dict[str, Any] | None) – Per-parameter priors for the linear parameters. Required when any marginalization happens (marginalized=True or full non-marginalized mode that still needs explicit linear priors).

  • marginalized (bool) – If True (default), linear parameters are analytically marginalized and only nonlinear parameters are sampled. If False, all parameters are sampled explicitly.

  • marginalized_names (tuple[str, ...] | None) – Optional subset of linear parameter names to analytically marginalize when marginalized=True. None means use the model’s automatic prior-based classification.

Return type:

Callable[[], None]

Returns:

A no-argument callable suitable for numpyro.infer.MCMC.

params_explicit(linear_priors)

Names of parameters that must be explicitly sampled.

Given a (resolved) linear_prior dict, returns all nonlinear parameters (orbital + extension, e.g. jitter) plus any linear parameters with non-Gaussian priors (e.g. parallax with a HalfNormal prior) that cannot be analytically marginalized.

These are the keys that must appear in the values dict passed to log_prob().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

params_marginalized(linear_priors)

Names of linear parameters analytically marginalized in log_prob.

Given a (resolved) linear_prior dict, returns the linear parameter names whose priors are Gaussian (or callable-returning-Normal) and so are integrated out analytically via the Woodbury identity rather than sampled. Their values are NOT required in the values dict; they are recovered afterward via sample_conditional_linear().

Parameters:

linear_priors (dict[str, Any] | None)

Return type:

tuple[str, ...]

predict(nl_values, linear_values, data)

Full predicted observable y_pred = X @ y at the data times.

X is the extension-augmented design matrix (_full_design_matrix()) and y is the ordered linear-parameter vector built from linear_values (missing entries default to 0). Used by _log_prob_explicit(), chi_squared(), and the plot functions so that every prediction path shares the same construction.

Parameters:
Return type:

Array

sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, explicit_linear=None, use_mean=False)

Sample linear parameters from the conditional posterior.

In auto mode (both marginalized_names and explicit_linear are None), the method classifies from linear_prior and extracts explicit linear values from nl_values.

Returns all linear parameter values (both sampled and explicit), unit-stripped.

When use_mean=True, the conditional posterior mean is returned for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. This is the appropriate choice when completing a MAP estimate (see optimize()); MCMC paths should keep the default use_mean=False.

Parameters:
Return type:

dict[str, Array]