harv package¶
harv: Tools for inferring Keplerian orbital parameters.
A JAX-based package for modeling binary-star and star-exoplanet systems with time series data, such as Gaia DR4 epoch astrometry and radial velocities. The package is units aware via unxt, supports probabilistic modeling with numpyro, and provides flexible Keplerian orbit frameworks for single and multi-body systems.
- class harv.GaiaAstrometryData¶
Bases:
AbstractAstrometryDataGaia epoch astrometry (along-scan measurements).
Examples
>>> import jax.numpy as jnp >>> from unxt import Q >>> from harv import GaiaAstrometryData >>> data = GaiaAstrometryData( ... time=Q([0.0, 100.0, 200.0], "day"), ... al_position=Q([0.1, -0.2, 0.05], "mas"), ... al_position_err=Q([0.01, 0.01, 0.01], "mas"), ... scan_angle=Q([0.5, 1.2, 2.8], "rad"), ... parallax_factor=jnp.array([0.3, -0.1, 0.4]), ... ) >>> data.n_times 3
- Parameters:
time (
Real[Quantity[PhysicalType('time')], 'n'])al_position (
Real[Quantity[PhysicalType('angle')], 'n'])al_position_err (
Real[Quantity[PhysicalType('angle')], 'n'])scan_angle (
Real[Quantity[PhysicalType('angle')], 'n'])parallax_factor (
Float[Array, 'n'])t_ref (
Real[Quantity[PhysicalType('time')], '']|None)
- __init__(time, al_position, al_position_err, scan_angle, parallax_factor, *, t_ref=None)¶
- Parameters:
time (
Real[Quantity[PhysicalType('time')], 'n'])al_position (
Real[Quantity[PhysicalType('angle')], 'n'])al_position_err (
Real[Quantity[PhysicalType('angle')], 'n'])scan_angle (
Real[Quantity[PhysicalType('angle')], 'n'])parallax_factor (
Float[Array, 'n'])t_ref (
Real[Quantity[PhysicalType('time')], '']|None)
- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- plot(ax=None, *, al_unit=None, add_labels=True, relative_to_t_ref=False, **kwargs)¶
Plot along-scan residuals vs time.
- Parameters:
ax (
Any) –matplotlib.axes.Axesinstance to draw on. IfNone, usesplt.gca().al_unit (
str|None) – Display unit for the along-scan position. Defaults to the data’s own unit.add_labels (
bool) – Add axis labels.relative_to_t_ref (
bool) – Plot time relative tot_ref.**kwargs (
Any) – Passed toax.errorbar(). Defaults can be overridden.
- Returns:
The
matplotlib.axes.Axesinstance.- Return type:
Examples
>>> import jax.numpy as jnp >>> import matplotlib.pyplot as plt >>> from unxt import Q >>> from harv import GaiaAstrometryData >>> data = GaiaAstrometryData( ... time=Q([0.0, 100.0, 200.0], "day"), ... al_position=Q([0.1, -0.2, 0.05], "mas"), ... al_position_err=Q([0.01, 0.01, 0.01], "mas"), ... scan_angle=Q([0.5, 1.2, 2.8], "rad"), ... parallax_factor=jnp.array([0.3, -0.1, 0.4]), ... ) >>> ax = data.plot() >>> plt.close("all")
-
t_ref:
Real[Quantity[PhysicalType('time')], '']|None= None¶ Reference epoch. If None, uses mean observation time.
-
al_position:
Real[Quantity[PhysicalType('angle')], 'n']¶ Along-scan position.
-
al_position_err:
Real[Quantity[PhysicalType('angle')], 'n']¶ Along-scan uncertainty.
-
scan_angle:
Real[Quantity[PhysicalType('angle')], 'n']¶ Per-CCD scan angle.
-
parallax_factor:
Float[Array, 'n']¶ AL parallax factors.
-
time:
Real[Quantity[PhysicalType('time')], 'n']¶ Barycentric TCB times.
- class harv.RVData¶
Bases:
AbstractDataRadial velocity measurements.
Examples
>>> from unxt import Q >>> from harv import RVData >>> data = RVData( ... time=Q([0.0, 50.0, 100.0], "day"), ... rv=Q([1.0, -2.0, 0.5], "km/s"), ... rv_err=Q([0.5, 0.5, 0.5], "km/s"), ... ) >>> data.n_times 3
- Parameters:
time (
Real[Quantity[PhysicalType('time')], 'n'])rv (
Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])rv_err (
Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])t_ref (
Real[Quantity[PhysicalType('time')], '']|None)
- __init__(time, rv, rv_err, *, t_ref=None)¶
- Parameters:
time (
Real[Quantity[PhysicalType('time')], 'n'])rv (
Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])rv_err (
Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])t_ref (
Real[Quantity[PhysicalType('time')], '']|None)
- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- plot(ax=None, *, rv_unit=None, add_labels=True, relative_to_t_ref=False, phase_fold=None, **kwargs)¶
Plot RV data as error bars.
- Parameters:
ax (
Any) – Thematplotlib.axes.Axesinstance to draw on. IfNone, usesplt.gca().rv_unit (
str|None) – Display unit for the RV axis. Defaults to the data’s own unit.add_labels (
bool) – Add axis labels.relative_to_t_ref (
bool) – Plot time relative tot_ref. Mutually exclusive withphase_fold.phase_fold (
Any|None) – If given, fold observations to orbital phase using this period: x = (time - t_ref) / phase_fold mod 1. Mutually exclusive withrelative_to_t_ref.**kwargs (
Any) – Passed toax.errorbar(). Defaults can be overridden.
- Returns:
The
matplotlib.axes.Axesinstance.- Return type:
Examples
>>> import matplotlib.pyplot as plt >>> from unxt import Q >>> data = RVData( ... time=Q([0.0, 50.0, 100.0], "day"), ... rv=Q([1.0, -2.0, 0.5], "km/s"), ... rv_err=Q([0.5, 0.5, 0.5], "km/s"), ... ) >>> ax = data.plot() # uses errorbar() with sensible defaults >>> ax = data.plot(color="C1", markersize=6) # override style >>> ax = data.plot(phase_fold=Q(50.0, "day")) # phase-folded >>> plt.close("all")
-
t_ref:
Real[Quantity[PhysicalType('time')], '']|None= None¶ Reference epoch. If None, uses mean observation time.
-
rv:
Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n']¶ Radial velocities.
-
rv_err:
Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n']¶ Radial velocity uncertainties.
-
time:
Real[Quantity[PhysicalType('time')], 'n']¶ Barycentric TCB times.
- class harv.SourceData¶
Bases:
AbstractDatasetContainerContainer for multiple named datasets for a single source.
Accepts arbitrary named datasets via keyword arguments. Names are user-defined and can be anything (e.g., gaia, keck_rv, hst_imaging).
- Parameters:
datasets (
AbstractAstrometryData|RVData)
- __init__(**datasets)¶
- Parameters:
datasets (
AbstractAstrometryData|RVData)- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- get_datasets_by_type(data_type)¶
Get all datasets/components of a specific data type.
- Parameters:
data_type (
type[TypeVar(_DT, bound=AbstractAstrometryData|RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by.- Return type:
dict[str,TypeVar(_DT, bound=AbstractAstrometryData|RVData)]
Examples
>>> from harv.data.datasets import RVData, GaiaAstrometryData >>> from harv.data.containers import SourceData >>> source_data = SourceData( ... keck_rv=RVData(...), ... gaia=GaiaAstrometryData(...), ... ) >>> source_data.get_datasets_by_type(RVData) {'keck_rv': RVData(...)} >>> source_data.get_datasets_by_type(GaiaAstrometryData) {'gaia': GaiaAstrometryData(...)}
- indicator_data_by_type(data_type, reference)¶
Return stacked data and indicator flags for one dataset type.
This is a convenience wrapper around get_datasets_by_type + build_indicator_matrix for use in extensions that need to build a kernel matrix across multiple datasets of the same type (e.g. multiple RV instruments).
- Parameters:
data_type (
type[TypeVar(_DT, bound=AbstractAstrometryData|RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.reference (
str) – Name of the reference dataset to use for time coordinates and metadata. Must be one of the keys in the returned dict from get_datasets_by_type(data_type).
- Return type:
tuple[TypeVar(_DT, bound=AbstractAstrometryData|RVData),Array|None,tuple[str,...] |None]
- plot(*args, **kwargs)¶
Plot all datasets on a single axes.
Only valid when every contained dataset shares the same concrete type; plotting heterogeneous types (e.g. RV in km/s and astrometry in mas) on a single axes would overlay incompatible y-axes. Use
get_datasets_by_type()to filter to a single type first when needed.Parameters mirror
AbstractDatasetContainer.plot().
- stacked_by_type(data_type)¶
Stack all datasets of the requested type.
- Parameters:
data_type (
type[TypeVar(_DT, bound=AbstractAstrometryData|RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.- Return type:
TypeVar(_DT, bound=AbstractAstrometryData|RVData)
Examples
>>> from harv.data.datasets import RVData, GaiaAstrometryData >>> from harv.data.containers import SourceData >>> source_data = SourceData( ... keck_rv=RVData(...), ... wiyn_rv=RVData(...), ... gaia=GaiaAstrometryData(...), ... ) >>> source_data.stacked_by_type(RVData) RVData(...)
- property t_ref: Real[Quantity[PhysicalType('time')], ''] | None¶
Reference epoch shared by all contained datasets.
Guaranteed to be consistent across components because every concrete subclass calls
_synchronize_t_refs()in its__init__.
- values()¶
Dataset/component values.
- Return type:
- harv.QD¶
alias of
QuantityDistribution
- class harv.QuantityDistribution¶
Bases:
ModulePairs a numpyro distribution with the physical unit of its samples.
For scalar distributions,
unitis a single string. For multivariate distributions (e.g.MultivariateNormalover parameters with mixed units),unitis a tuple of strings – one per dimension.- Parameters:
distribution (
Distribution) – The underlying numpyro distribution (works with bare floats).unit (
str|tuple[str,...]) – Physical unit of the samples. A single string for scalar distributions; a tuple for multivariate distributions where each element may have a different unit.
Examples
Scalar (period in days):
>>> import jax >>> import numpyro.distributions as dist >>> from harv import QuantityDistribution >>> qd = QuantityDistribution(dist.LogUniform(50., 2000.), "day") >>> sample = qd.sample(jax.random.key(0)) >>> sample.unit Unit("d")
Multivariate (astrometric linear parameters with mixed units):
qd = QuantityDistribution( dist.MultivariateNormal(loc=jnp.zeros(6), ...), ("mas", "mas", "mas/yr", "mas/yr", "mas", "mas"), ) sample = qd.sample(key) # -> raw jax.Array (consumer splits by name)
- __init__(distribution, unit)¶
- Parameters:
distribution (
Distribution)
- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- log_prob(value)¶
Evaluate log-probability, stripping units if present.
- sample(key, sample_shape=())¶
Sample from the distribution, attaching units when possible.
For scalar units (
str), returnsQuantity. For tuple units (multivariate with mixed units), returns a raw array – the consumer splits by parameter name and attaches per-element units.
-
distribution:
Distribution¶
- class harv.GaiaAstrometryModel¶
Bases:
AbstractComponentModelGaia epoch astrometry component model (template).
Carries a parameterization and extensions. Data and the linear prior are passed at evaluation time so the same model instance can be reused across multiple datasets.
- Parameters:
parameterization (
StandardGaiaAstrometry|ThieleInnesGaiaAstrometry) – Declares parameter names/roles and builds the base design matrix.extensions (
tuple[AbstractExtension,...]) – Model extensions (jitter, trends, …).pm_time_unit (
str) – If not None, override the proper motion units in the design matrix to use this unit instead of the default (obs_unit / yr).
Examples
>>> from harv.models.astrometry import GaiaAstrometryModel >>> model = GaiaAstrometryModel() >>> sorted(model._all_nonlinear_names()) ['arg_peri', 'cos_i', 'eccentricity', 'lon_asc_node', 'period', 'phase_peri']
- __init__(parameterization=StandardGaiaAstrometry(), extensions=(), *, pm_time_unit='yr')¶
- Parameters:
parameterization (
StandardGaiaAstrometry|ThieleInnesGaiaAstrometry)extensions (
tuple[AbstractExtension,...])pm_time_unit (
str)
- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- chi_squared(nl_values, linear_values, data)¶
Goodness-of-fit \(\chi^2\) for one fully-specified parameter set.
Unlike
log_prob()(which marginalizes the linear parameters and returns a marginal log-likelihood), this evaluates the model at the given linear-parameter values and returns the residual statistic\[\chi^2 = r^\top C^{-1} r, \qquad r = y_\mathrm{obs} - X\,y,\]where \(C\) is the (extension-modified) observation covariance. For a diagonal covariance this is \(\sum_i r_i^2 / \sigma_i^2\); a Gaussian-process extension promotes \(C\) to a full matrix and the Mahalanobis form is used. Jitter is included via the inflated \(C\).
- Parameters:
nl_values (
dict[str,Any]) – Nonlinear parameter values (orbital + any extension parameters), in the same form accepted bylog_prob().linear_values (
dict[str,Array]) – Linear parameter values, unit-stripped to the model’s linear parameter units (see_linear_param_units()).data (
AbstractData) – Runtime observation data.
- Return type:
- Returns:
Scalar \(\chi^2\).
-
extensions:
tuple[AbstractExtension,...] = ()¶
- log_prob(nl_values, data, *, linear_priors=None, linear_values=None, marginalized_names=None)¶
Compute the log-likelihood.
Three calling conventions are supported:
Auto mode (recommended): pass
linear_priorand let the model classify which linear params to marginalize. Non-Gaussian linear priors are expected as entries innl_values.Manual marginalization: pass
marginalized_names(and optionallylinear_values) to control exactly which linear params are marginalized.Explicit evaluation: pass
linear_valueswithoutmarginalized_names(andlinear_priors=None) to evaluate the Gaussian log-likelihood at fixed linear parameter values.
- Parameters:
nl_values (
dict[str,Any]) – Parameter values. In auto mode this may contain explicit linear parameter values alongside the nonlinear ones.data (
AbstractData) – Runtime observation data (RVData / GaiaAstrometryData / SystemData).linear_priors (
dict[str,Any] |None) – Per-parameter priors for analytic marginalization. Required for auto and manual-marginalization modes.linear_values (
dict[str,Array] |None) – Explicit linear parameter values (unit-stripped). When given withoutmarginalized_names, triggers explicit evaluation.marginalized_names (
tuple[str,...] |None) – Which linear params to marginalize (manual mode).
- Return type:
- Returns:
Scalar log-likelihood.
- numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)¶
Build a numpyro model function for MCMC sampling.
- Parameters:
nonlinear_priors (
dict[str,Distribution|QuantityDistribution]) – Prior distributions for nonlinear parameters. Keys are parameter names (e.g."period","eccentricity"). Values areDistributionorQuantityDistribution.data (
AbstractData) – Runtime observation data (RVData / GaiaAstrometryData).linear_priors (
dict[str,Any] |None) – Per-parameter priors for the linear parameters. Required when any marginalization happens (marginalized=Trueor full non-marginalized mode that still needs explicit linear priors).marginalized (
bool) – IfTrue(default), linear parameters are analytically marginalized and only nonlinear parameters are sampled. IfFalse, all parameters are sampled explicitly.marginalized_names (
tuple[str,...] |None) – Optional subset of linear parameter names to analytically marginalize whenmarginalized=True.Nonemeans use the model’s automatic prior-based classification.
- Return type:
- Returns:
A no-argument callable suitable for
numpyro.infer.MCMC.
-
parameterization:
StandardGaiaAstrometry|ThieleInnesGaiaAstrometry= StandardGaiaAstrometry()¶
- params_explicit(linear_priors)¶
Names of parameters that must be explicitly sampled.
Given a (resolved)
linear_priordict, returns all nonlinear parameters (orbital + extension, e.g. jitter) plus any linear parameters with non-Gaussian priors (e.g. parallax with a HalfNormal prior) that cannot be analytically marginalized.These are the keys that must appear in the
valuesdict passed tolog_prob().
- params_marginalized(linear_priors)¶
Names of linear parameters analytically marginalized in log_prob.
Given a (resolved)
linear_priordict, returns the linear parameter names whose priors are Gaussian (or callable-returning-Normal) and so are integrated out analytically via the Woodbury identity rather than sampled. Their values are NOT required in thevaluesdict; they are recovered afterward viasample_conditional_linear().
- predict(nl_values, linear_values, data)¶
Full predicted observable
y_pred = X @ yat the data times.Xis the extension-augmented design matrix (_full_design_matrix()) andyis the ordered linear-parameter vector built fromlinear_values(missing entries default to0). Used by_log_prob_explicit(),chi_squared(), and the plot functions so that every prediction path shares the same construction.
- predict_orbit_sky(nl_values, linear_values, times)¶
Predicted sky-plane orbital offsets
(dRA, dDec)at times.Delegates to
self.parameterization.sky_orbit(...). Returns the orbital contribution only — proper motion, parallax, and zero-point are not included. This is what the on-sky orbit ellipse panel ofplot_gaia_sky_orbit()plots.The returned arrays are in the same unit as the scalar semi-major-axis linear param (
"semi_major_axis"forStandardGaiaAstrometry, or the TI constants forThieleInnesGaiaAstrometry).
- sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, explicit_linear=None, use_mean=False)¶
Sample linear parameters from the conditional posterior.
In auto mode (both
marginalized_namesandexplicit_linearareNone), the method classifies fromlinear_priorand extracts explicit linear values fromnl_values.Returns all linear parameter values (both sampled and explicit), unit-stripped.
When
use_mean=True, the conditional posterior mean is returned for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. This is the appropriate choice when completing a MAP estimate (seeoptimize()); MCMC paths should keep the defaultuse_mean=False.
- class harv.JointModel¶
Bases:
ModuleComposition of component models that share orbital parameters.
We recommend using the
JointModelfactory methods likefor_rv_and_gaia()orfor_sb2().- Parameters:
components (
dict[str,AbstractComponentModel]) – Named component models. Keys are used to namespace component-specific parameters (e.g."rv.jitter").shared_params (
tuple[str,...]) – Names of orbital parameters shared across all components (e.g.("period", "eccentricity", "phase_peri", "arg_peri")).
- __init__(components, shared_params, shared_linear_params=())¶
- static __new__(cls, *args, **kwargs)¶
- classmethod for_rv_and_gaia(components, *, shared_params=None, shared_linear_params=None)¶
Build a JointModel for combined RV + Gaia astrometry.
TODO: if we want to support more complex shared parameters, look at for_sb2 to see what we can generalize
- Parameters:
components (
dict[str,AbstractComponentModel]) – RV and Gaia astrometry component models (e.g.{"rv": ..., "astro": ...}).shared_params (
tuple[str,...] |None) – Override the default shared orbital parameters. Defaults to("period", "eccentricity", "phase_peri", "arg_peri").shared_linear_params (
tuple[str,...] |None) – Linear parameter names shared across components. Defaults to()(no shared linear params for heterogeneous joint models).
- Return type:
- Returns:
The constructed JointModel.
- classmethod for_sb2(prior, *, component_names=('primary', 'secondary'), extensions=(), shared_params=None, shared_linear_params=None)¶
Build an SB2 JointModel from a prior.
The
prioris expected to follow the convention ofdefault_sb2_prior(): linear-prior keysname1.rv_semiampandname2.rv_semiampmap to the per-componentrv_semiampof the components, named “name1” and “name2” in this example, but they can be customized. Other linear-prior keys (e.g.v_sys) are automatically declared shared across components.Data is not bound to the model; pass it at
sampler.run(data, ...)time.- Parameters:
prior (
Any) – SB2-style prior. Keys for named component-specific parameters (e.g.,rv_semiamp) must correspond to the component names in component_names. For example, with the default names “primary” and “secondary”, the prior must have keys “primary.rv_semiamp” and “secondary.rv_semiamp”. Other linear parameters (e.g. “v_sys”) are automatically treated as shared across components.component_names (
tuple[str,...]) – Names of the two SB2 components. Defaults to("primary", "secondary").extensions (
tuple[Any,...] |dict[str,tuple[Any,...]]) –Extensions to attach to each component.
A bare
tupleis applied to all components.A
dictis keyed by component name; missing keys yield no extensions for that component.
shared_params (
tuple[str,...] |None) – Defaults to the standard nonlinear shared orbital params. For example, “period”, “eccentricity”, “phase_peri”, and “arg_peri”.shared_linear_params (
tuple[str,...] |None) – Defaults to every key inprior.linear_priorexcept therv_semiampkeys.
- Return type:
- Returns:
The constructed JointModel.
Examples
>>> from unxt import Q >>> from harv.models.joint import JointModel >>> from harv.models.priors import default_sb2_prior >>> prior = default_sb2_prior( ... period_min=Q(10., "day"), period_max=Q(1000., "day"), ... sigma_K0=Q(30., "km/s"), sigma_v0=Q(50., "km/s"), ... ) >>> joint = JointModel.for_sb2(prior=prior) >>> joint.shared_linear_params ('v_sys',)
- log_prob(nl_values, data, *, linear_priors=None, marginalized_names=None)¶
Compute the joint log-likelihood.
When
shared_linear_paramscontains names that are being analytically marginalized, a single jointMarginalizedLinearis built spanning all components (the joint path). Otherwise the per-component log-likelihoods are summed as before.- Parameters:
nl_values (
dict[str,Any]) – Flat dict of parameter values. Shared orbital params use bare names ("period","eccentricity", etc.). Component-specific nonlinear params use"component.param"convention (e.g."rv.jitter").data (
AbstractDatasetContainer) – Per-component data, indexed by component name.linear_priors (
dict[str,Any] |None) – Flat merged linear-prior dict.Nonemeans treat all linear params as marginalizable.marginalized_names (
tuple[str,...] |None) – Optional linear parameter names to marginalize. Component-qualified names are accepted (e.g."rv.parallax"or just"parallax"if unambiguous).
- Return type:
- Returns:
Scalar log-likelihood.
- numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)¶
Build a numpyro model for MCMC sampling of the joint model.
- Parameters:
nonlinear_priors (
dict[str,Distribution|QuantityDistribution]) – Prior distributions for all nonlinear parameters. Shared orbital params use bare names. Component-specific params use"component.param"convention.data (
AbstractDatasetContainer) – Per-component data, indexed by component name.linear_priors (
dict[str,Any] |None) – Flat merged linear-prior dict.marginalized (
bool) – IfTrue(default), linear parameters are marginalized per-component. IfFalse, all parameters are sampled explicitly.marginalized_names (
tuple[str,...] |None) – Optional linear parameter names to marginalize whenmarginalized=True. Component-qualified names are accepted.
- Return type:
- Returns:
A numpyro model function suitable for use with a sampler.
- params_explicit(linear_priors)¶
Names of parameters that must be explicitly sampled.
Shared nonlinear params use bare names (e.g.
"period"). Component-specific nonlinear params use"comp.param"notation (e.g."rv.jitter"). Explicit-linear params (non-Gaussian priors, e.g."parallax") are listed flat without namespace prefix, matching how they appear in thelog_probvalues dict.
- params_marginalized(linear_priors)¶
Names of linear parameters analytically marginalized across all components.
De-duplicated; order follows component iteration order.
- sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, use_mean=False)¶
Sample conditional linear params for each component.
When
shared_linear_paramsare jointly marginalized, shared parameters appear at the top level of the returned dict (with bare names) rather than inside per-component sub-dicts.- Parameters:
key (
Array) – JAX PRNG key.data (
AbstractDatasetContainer) – Per-component data, indexed by component name.linear_priors (
dict[str,Any] |None) – Flat merged linear-prior dict.marginalized_names (
tuple[str,...] |None) – Optional linear parameter names to marginalize.use_mean (
bool) – WhenTrue, return the conditional posterior mean for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. DefaultFalse.
- Return type:
- Returns:
Sampled parameter values. The structure depends on the path:
Default path (no shared marginalization):
dict[comp_name, dict[param_name, array]].Joint path (shared marginalization): mixed dict where shared params are top-level and per-component params are in sub-dicts keyed by component name.
-
components:
dict[str,AbstractComponentModel]¶
- class harv.ParamInfo¶
Bases:
ModuleImmutable descriptor for a single model parameter.
- Parameters:
name (
str) – Parameter name. Must not contain"."(reserved forJointModeltied-parameter paths).unit (
str) – Unit string (e.g."day","km/s",""for dimensionless).linear (
bool) – Whether the parameter enters the model linearly (defaultFalse).
Examples
>>> from harv.models.extensions.base import ParamInfo >>> p = ParamInfo("period", "time") >>> p.name 'period' >>> p.linear False
- __init__(name, unit, linear=False)¶
- static __new__(cls, *args, **kwargs)¶
- class harv.RVModel¶
Bases:
AbstractComponentModelRadial-velocity component model (template).
Carries a parameterization and extensions. Data and the linear prior are passed at evaluation time so the same model instance can be reused across multiple datasets.
- Parameters:
parameterization (
StandardRV|EcoswEsinwRV) – Declares parameter names/roles and builds the base design matrix.extensions (
tuple[AbstractExtension,...]) – Model extensions (jitter, trends, offsets, GP, …).
Examples
>>> from harv.models.rv import RVModel >>> model = RVModel() >>> sorted(model._all_nonlinear_names()) ['arg_peri', 'eccentricity', 'period', 'phase_peri']
- __init__(parameterization=StandardRV(), extensions=())¶
- Parameters:
parameterization (
StandardRV|EcoswEsinwRV)extensions (
tuple[AbstractExtension,...])
- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- chi_squared(nl_values, linear_values, data)¶
Goodness-of-fit \(\chi^2\) for one fully-specified parameter set.
Unlike
log_prob()(which marginalizes the linear parameters and returns a marginal log-likelihood), this evaluates the model at the given linear-parameter values and returns the residual statistic\[\chi^2 = r^\top C^{-1} r, \qquad r = y_\mathrm{obs} - X\,y,\]where \(C\) is the (extension-modified) observation covariance. For a diagonal covariance this is \(\sum_i r_i^2 / \sigma_i^2\); a Gaussian-process extension promotes \(C\) to a full matrix and the Mahalanobis form is used. Jitter is included via the inflated \(C\).
- Parameters:
nl_values (
dict[str,Any]) – Nonlinear parameter values (orbital + any extension parameters), in the same form accepted bylog_prob().linear_values (
dict[str,Array]) – Linear parameter values, unit-stripped to the model’s linear parameter units (see_linear_param_units()).data (
AbstractData) – Runtime observation data.
- Return type:
- Returns:
Scalar \(\chi^2\).
-
extensions:
tuple[AbstractExtension,...] = ()¶
- log_prob(nl_values, data, *, linear_priors=None, linear_values=None, marginalized_names=None)¶
Compute the log-likelihood.
Three calling conventions are supported:
Auto mode (recommended): pass
linear_priorand let the model classify which linear params to marginalize. Non-Gaussian linear priors are expected as entries innl_values.Manual marginalization: pass
marginalized_names(and optionallylinear_values) to control exactly which linear params are marginalized.Explicit evaluation: pass
linear_valueswithoutmarginalized_names(andlinear_priors=None) to evaluate the Gaussian log-likelihood at fixed linear parameter values.
- Parameters:
nl_values (
dict[str,Any]) – Parameter values. In auto mode this may contain explicit linear parameter values alongside the nonlinear ones.data (
AbstractData) – Runtime observation data (RVData / GaiaAstrometryData / SystemData).linear_priors (
dict[str,Any] |None) – Per-parameter priors for analytic marginalization. Required for auto and manual-marginalization modes.linear_values (
dict[str,Array] |None) – Explicit linear parameter values (unit-stripped). When given withoutmarginalized_names, triggers explicit evaluation.marginalized_names (
tuple[str,...] |None) – Which linear params to marginalize (manual mode).
- Return type:
- Returns:
Scalar log-likelihood.
- numpyro_model(nonlinear_priors, data, linear_priors, *, marginalized=True, marginalized_names=None)¶
Build a numpyro model function for MCMC sampling.
- Parameters:
nonlinear_priors (
dict[str,Distribution|QuantityDistribution]) – Prior distributions for nonlinear parameters. Keys are parameter names (e.g."period","eccentricity"). Values areDistributionorQuantityDistribution.data (
AbstractData) – Runtime observation data (RVData / GaiaAstrometryData).linear_priors (
dict[str,Any] |None) – Per-parameter priors for the linear parameters. Required when any marginalization happens (marginalized=Trueor full non-marginalized mode that still needs explicit linear priors).marginalized (
bool) – IfTrue(default), linear parameters are analytically marginalized and only nonlinear parameters are sampled. IfFalse, all parameters are sampled explicitly.marginalized_names (
tuple[str,...] |None) – Optional subset of linear parameter names to analytically marginalize whenmarginalized=True.Nonemeans use the model’s automatic prior-based classification.
- Return type:
- Returns:
A no-argument callable suitable for
numpyro.infer.MCMC.
-
parameterization:
StandardRV|EcoswEsinwRV= StandardRV()¶
- params_explicit(linear_priors)¶
Names of parameters that must be explicitly sampled.
Given a (resolved)
linear_priordict, returns all nonlinear parameters (orbital + extension, e.g. jitter) plus any linear parameters with non-Gaussian priors (e.g. parallax with a HalfNormal prior) that cannot be analytically marginalized.These are the keys that must appear in the
valuesdict passed tolog_prob().
- params_marginalized(linear_priors)¶
Names of linear parameters analytically marginalized in log_prob.
Given a (resolved)
linear_priordict, returns the linear parameter names whose priors are Gaussian (or callable-returning-Normal) and so are integrated out analytically via the Woodbury identity rather than sampled. Their values are NOT required in thevaluesdict; they are recovered afterward viasample_conditional_linear().
- predict(nl_values, linear_values, data)¶
Full predicted observable
y_pred = X @ yat the data times.Xis the extension-augmented design matrix (_full_design_matrix()) andyis the ordered linear-parameter vector built fromlinear_values(missing entries default to0). Used by_log_prob_explicit(),chi_squared(), and the plot functions so that every prediction path shares the same construction.
- predict_at_times(times, nl_values, linear_values, *, t_ref, obs_unit='km/s')¶
Predicted RV at arbitrary times, no observed-data object required.
Internally constructs an
RVDatashim attimeswith dummyrv/rv_err(zeros / ones) and delegates topredict(). The dummy obs are never read by the prediction path (_full_design_matrixonly consumesdata.timeanddata.t_ref; extensions read at most those fields too). The returned array is in the same units the model’s linear parameters are expressed in.Raises
TypeErrorif anyMultiSurveyOffsetis inself.extensions— itsindicator_matrixis fixed to the original data’s row count and cannot apply on an arbitrary time grid. Callers building a smooth-curve overlay should filter it out before calling.
- sample_conditional_linear(nl_values, key, data, *, linear_priors=None, marginalized_names=None, explicit_linear=None, use_mean=False)¶
Sample linear parameters from the conditional posterior.
In auto mode (both
marginalized_namesandexplicit_linearareNone), the method classifies fromlinear_priorand extracts explicit linear values fromnl_values.Returns all linear parameter values (both sampled and explicit), unit-stripped.
When
use_mean=True, the conditional posterior mean is returned for the marginalized linear parameters instead of a random draw. For a Gaussian conditional this is also the conditional MAP. This is the appropriate choice when completing a MAP estimate (seeoptimize()); MCMC paths should keep the defaultuse_mean=False.
- class harv.AbstractSampler¶
Bases:
ModuleAbstract base for harv samplers.
Every sampler holds a prior and a pre-built model. Use the bare constructor to combine them:
sampler = RejectionSampler(prior, RVModel(extensions=...)) samples = sampler.run(data, n_prior_samples=100_000)
Concrete subclasses are marked
@finalper the project’s abstract-final pattern.- Parameters:
prior (
HarvPrior)model (
AbstractComponentModel|JointModel)
- __init__(prior, model, marginalized_names=None)¶
- Parameters:
prior (
HarvPrior)model (
AbstractComponentModel|JointModel)
- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- get_extensions()¶
Return the extensions in effect for this sampler.
Walks the attached model: returns
model.extensionsfor a single component model, ordict[component_name, tuple[Extension, ...]]for aJointModelso the per-component association used for namespaced parameter names like"primary.jitter"is preserved. Use this in any downstream consumer (e.g.harv.plot.plot_rv()) that needs to know which extensions are in play.- Return type:
tuple[AbstractExtension,...] |dict[str,tuple[AbstractExtension,...]]
-
model:
AbstractComponentModel|JointModel¶
- harv.make_prior_cache(prior, model, n_samples, filename, *, key, batch_size=100000, return_logprobs=False, marginalized_names=None)¶
Write
n_samplesprior draws to an HDF5 cache, in batches.Each batch is drawn from an independent subkey via
jax.random.fold_in(), so the on-disk order is i.i.d. and sequential reads of the file are statistically equivalent to random reads.- Parameters:
prior (
HarvPrior) – Prior to draw from.model (
AbstractComponentModel|JointModel) – Component or joint model template defining which extension parameters need priors. SeeHarvPrior.sample().n_samples (
int) – Total number of samples to write.key (
Array) – JAX random key. Independent subkeys per batch are derived viajax.random.fold_in().batch_size (
int) – Number of samples generated and written per chunk. Memory usage scales withbatch_size, notn_samples. Default: 100,000.return_logprobs (
bool) – IfTrue, writeln_priorto the cache so it can be reused by downstream tools without recomputation.marginalized_names (
tuple[str,...] |None) – Forwarded toHarvPrior.sample(). Must match themarginalized_namesof theRejectionSamplerthat will consume the cache, so the explicit-linear keys written here match thoseRejectionSampler.run_with_samples()expects. Leave asNone(default) for the default marginalization.
- Return type:
Examples
>>> import jax >>> from unxt import Q >>> from harv import HarvPrior, RVModel, StandardRV >>> from harv.samplers import make_prior_cache >>> prior = StandardRV().default_prior( ... period_min=Q(2.0, "day"), ... period_max=Q(1000.0, "day"), ... sigma_K0=Q(30.0, "km/s"), ... sigma_v0=Q(50.0, "km/s"), ... ) >>> make_prior_cache( ... prior, RVModel(), 1_000_000, "cache.h5", ... key=jax.random.key(0), batch_size=100_000, ... )
- class harv.NumpyroSampler¶
Bases:
AbstractSamplerMCMC sampler for Keplerian orbital parameters using numpyro.
Builds a numpyro model from a component model (or joint model) and runs warm-started MCMC from rejection-sampler output, returning a
Samplescontainer. Data is passed torun()rather than at construction, so the same configured sampler can be applied to multiple datasets.- Parameters:
prior (
HarvPrior) – Prior distributions for nonlinear (and optionally linear) parameters.parameterization – Orbital parameterization. For RV data defaults to
StandardRV. Ignored for Gaia astrometry data.extensions – Model extensions (jitter, trends, offsets, GP) supplied at construction time.
Examples
>>> from unxt import Q >>> from harv.models import StandardRV >>> from harv.samplers import NumpyroSampler >>> prior = StandardRV().default_prior( ... period_min=Q(2.0, "day"), ... period_max=Q(1000.0, "day"), ... sigma_K0=Q(30.0, "km/s"), ... sigma_v0=Q(50.0, "km/s"), ... ) >>> mcmc = NumpyroSampler(prior) >>> mcmc_samples = mcmc.run( ... data, init_samples=rej_samples, num_warmup=500, num_samples=1000 ... )
- Parameters:
model (
AbstractComponentModel|JointModel)
- __init__(prior, model, marginalized_names=None)¶
- Parameters:
prior (
HarvPrior)model (
AbstractComponentModel|JointModel)
- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- get_extensions()¶
Return the extensions in effect for this sampler.
Walks the attached model: returns
model.extensionsfor a single component model, ordict[component_name, tuple[Extension, ...]]for aJointModelso the per-component association used for namespaced parameter names like"primary.jitter"is preserved. Use this in any downstream consumer (e.g.harv.plot.plot_rv()) that needs to know which extensions are in play.- Return type:
tuple[AbstractExtension,...] |dict[str,tuple[AbstractExtension,...]]
- optimize(samples, data, *, seed=None, max_passes=10, tol=0.0001)¶
Refine each input sample to the local posterior MAP via BFGS.
Uses
numpyro.optim.Minimize()(BFGS viajax.scipy.optimize.minimize()) with anAutoDeltaguide to find the local mode oflog_prior + marginal_log_likelihoodstarting from each sample in samples. The returnedSampleshas the refined nonlinear values, linear values set to the conditional posterior mean (equal to the conditional MAP since the conditional is Gaussian) at the refined nonlinear point, and populatedln_likelihood/ln_prior.Particularly useful when rejection sampling returns a single sample inside a posterior mode – this moves it from a random acceptance point to the mode peak.
- Parameters:
samples (
Samples) – Posterior samples (e.g. fromRejectionSampler.run()) used as warm starts. Each sample seeds an independent BFGS run.data (
AbstractData|AbstractDatasetContainer) – Observed data (same shape/type asrun()).seed (
int|None) – Random number seed. If not specified, picks a seed based on the current time.max_passes (
int) – Maximum number of BFGS restarts per sample. The wrappedjax.scipy.optimize.minimizeBFGS often quits early when its line search fails; restarting from the previous result frequently recovers further progress. Default 10.tol (
float) – Loss-change tolerance for the BFGS-restart loop. If the loss decreases by less thantolon a pass, optimization is considered converged. Default1e-4.
- Return type:
- Returns:
Refined samples with
ln_likelihoodandln_priorpopulated.
Notes
Only the marginalized path is supported (matches
run(marginalized=True)).extra_modelis not supported.Emits
UserWarningfor any sample that does not converge withinmax_passesBFGS restarts; the returned point for such a sample may not be the local MAP.
- run(data, *, init_samples=None, seed=None, marginalized=True, extra_model=None, extra_init_params=None, kernel=None, num_chains=4, num_warmup=500, num_samples=1000, chain_method='parallel', return_logprobs=False)¶
Run MCMC warm-started from rejection-sampler output.
- Parameters:
data (
AbstractData|AbstractDatasetContainer) – Observed data (RVData,GaiaAstrometryData, orSystemDatafor joint models).init_samples (
Samples|None) – Posterior samples produced by rejection sampling, used to set the initial positions for each MCMC chain.seed (
int|None) – Random number seed. If not specified, picks a seed based on the current time.marginalized (
bool) – IfTrue(default), linear parameters are analytically marginalized in the likelihood and conditionally sampled afterward. IfFalse, all parameters are sampled jointly.extra_model (
Callable[[dict[str,Any]],dict[str,Any]] |None) – A function(pars: dict) -> dictthat replaces specific linear parameters with deterministic functions of new physical parameters.extra_init_params (
dict[str,Any] |None) – Initial values for parameters introduced byextra_model.kernel (
type|None) – A numpyro MCMC kernel class. Defaults tonumpyro.infer.NUTS.num_chains (
int) – Number of independent MCMC chains. Default: 4.num_warmup (
int) – Number of warmup (burn-in) steps per chain. Default: 500.num_samples (
int) – Number of posterior samples per chain. Default: 1000.chain_method (
str) – How to run chains:"parallel","sequential", or"vectorized". Default:"parallel".return_logprobs (
bool) – IfTrue, store per-sample log-probabilities on the returnedSamples:ln_likelihood(the marginal log-likelihood, re-evaluated viamodel.log_prob) andln_prior(the summed nonlinear-prior log-density). EnablesSamples.map_sample()andSamples.ln_posterior. Requiresmarginalized=Trueand noextra_model. DefaultFalse.
- Return type:
- Returns:
Posterior samples container with nonlinear and linear parameters.
-
model:
AbstractComponentModel|JointModel¶
- class harv.HarvPrior¶
Bases:
ModulePrior distribution for rejection sampling of Keplerian orbits.
This class encapsulates the prior distributions for both nonlinear and linear parameters. It is agnostic to data type - the sampler determines which parameters are required based on the provided data.
We recommend using the “default” factory constructors (e.g.
default_rv(),default_gaia_astrometry(), etc.), which set up sensible priors for common use cases.Nonlinear parameterization:
Parameter names in
nonlinear_priorsmust match the field names of the parameterization, for example,period,eccentricity,phase_peri, etc. These parameters are sampled explicitly.See the options available in harv.models.parameterizations.
Default parameterizations:
- Radial Velocity:
Nonlinear:
period,eccentricity,phase_peri,arg_periLinear:
rv_semiamp,v_sys
- Astrometry:
Nonlinear:
period,eccentricity,phase_peri,cos_i,arg_peri,lon_asc_nodeLinear params:
ra0,dec0,pmra,pmdec,parallax,semi_major_axis
- Parameters:
nonlinear_priors (
dict[str,Distribution|QuantityDistribution]) – Mapping from parameter name to its prior distribution (a baredist.Distributionfor dimensionless parameters, or aharv.distributions.QuantityDistributionwrapper for parameters with physical units).linear_priors (
dict[str,Distribution|QuantityDistribution|Callable[...,QuantityDistribution|Normal]]) –Per-parameter priors for linear parameters. Each entry is classified:
dist.NormalorQD(Normal)– Gaussian, can be analytically marginalized.LinearPriorCallable– called with nonlinear params to produce a Normal, can be marginalized.dist.HalfNormal,dist.Delta, etc. – non-Gaussian, sampled explicitly alongside nonlinear params.
When using
default_rv()withoffsets, the non-reference offset priors are automatically included as linear parameters.extension_priors (
dict[str,Distribution|QuantityDistribution]) – Priors for extension parameters declared viaextra_params().
- __init__(nonlinear_priors, linear_priors, extension_priors=<factory>)¶
- Parameters:
nonlinear_priors (
dict[str,Distribution|QuantityDistribution])linear_priors (
dict[str,Distribution|QuantityDistribution|Callable[...,QuantityDistribution|Normal]])extension_priors (
dict[str,Distribution|QuantityDistribution])
- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- classmethod from_parameterization(parameterization, **kwargs)¶
Build a default prior for any supported parameterization.
Delegates to
parameterization.default_prior(**kwargs). Each concrete parameterization declares its own required scale arguments (e.g.sigma_K0for RV,sigma_a0for astrometry).- Parameters:
parameterization (
AbstractParameterization) – A concrete parameterization (e.g.StandardRV,EcoswEsinwRV,StandardGaiaAstrometry,ThieleInnesGaiaAstrometry).**kwargs (
Distribution|QuantityDistribution|Callable[...,QuantityDistribution|Normal] |Any) – Forwarded to the parameterization’sdefault_priormethod.
- Return type:
Examples
>>> from unxt import Q >>> from harv.models.parameterizations.rv import EcoswEsinwRV >>> from harv.samplers import HarvPrior >>> prior = HarvPrior.from_parameterization( ... EcoswEsinwRV(), ... period_min=Q(2.0, "day"), ... period_max=Q(1000.0, "day"), ... sigma_K0=Q(30.0, "km/s"), ... sigma_v0=Q(50.0, "km/s"), ... ) >>> sorted(prior.nonlinear_priors.keys()) ['ecosw', 'esinw', 'period', 'phase_peri']
- sample(key, n_samples, *, model, return_logprobs=False, marginalized_names=None)¶
Draw a complete prior sample for a given model.
Unlike
sample_nonlinear(), this draws every parameter the rejection sampler would consume formodel:base nonlinear params from
nonlinear_priorsextension nonlinear params (e.g.
jitter, GP hypers) discovered by walkingmodel.extensionsany linear params from
linear_priorsthat are sampled explicitly rather than analytically marginalized. With the defaultmarginalized_names=Nonethis is just the non-Gaussian linear priors (e.g.HalfNormal-prior parallax); whenmarginalized_namesis given it is every linear param not in that set (see below).
The returned
Samplesis the same container the rejection sampler produces for posteriors, with units restored from eachQuantityDistribution. Itslinearfield is empty in the common Gaussian-linear case.ln_likelihoodis alwaysNone(no data yet);ln_prioris populated whenreturn_logprobs=True, summing the nonlinear (base + extension) prior log-densities — matching the convention used byRejectionSampler.run().modelis required because the set of extension and explicit-linear parameters depends on the extensions attached to the model.- Parameters:
key (
Array) – JAX random key.n_samples (
int) – Number of prior draws.model (
AbstractComponentModel|JointModel) – Component or joint model template defining the extensions and linear-prior classification.return_logprobs (
bool) – IfTrue, populateSamples.ln_priorwith the per-sample nonlinear-prior log-density.marginalized_names (
tuple[str,...] |None) – Which linear params are analytically marginalized (and therefore not drawn here), mirroringRejectionSampler.marginalized_names. Must match the consuming sampler so the explicit-linear keys agree.None(default) marginalizes every linear param that can be (the Gaussian ones), leaving only non-Gaussian priors explicit.
- Returns:
A container holding the full prior draw, suitable for passing to
RejectionSampler.run_with_samples()or for caching viaharv.samplers.make_prior_cache().- Return type:
Examples
>>> import jax >>> from unxt import Q >>> from harv import HarvPrior, RVModel, StandardRV >>> prior = StandardRV().default_prior( ... period_min=Q(2.0, "day"), ... period_max=Q(1000.0, "day"), ... sigma_K0=Q(30.0, "km/s"), ... sigma_v0=Q(50.0, "km/s"), ... ) >>> samples = prior.sample(jax.random.key(0), 100, model=RVModel()) >>> samples.n_samples 100 >>> samples.data_type 'RVModel'
- sample_nonlinear(key, n_samples)¶
Sample nonlinear parameters from priors.
- Parameters:
- Return type:
- Returns:
Dictionary mapping each parameter name to an array of shape
(n_samples,). Values are bare JAX arrays regardless of whether the distribution is wrapped inQuantityDistribution.
Examples
>>> import jax >>> from unxt import Q >>> from harv.samplers import HarvPrior >>> sorted( ... HarvPrior.default_rv( ... period_min=Q(2.0, "day"), ... period_max=Q(1000.0, "day"), ... sigma_K0=Q(30.0, "km/s"), ... sigma_v0=Q(50.0, "km/s"), ... ).sample_nonlinear(jax.random.key(0), 10).keys() ... ) ['arg_peri', 'eccentricity', 'period', 'phase_peri'] >>> HarvPrior.default_rv( ... period_min=Q(2.0, "day"), ... period_max=Q(1000.0, "day"), ... sigma_K0=Q(30.0, "km/s"), ... sigma_v0=Q(50.0, "km/s"), ... ).sample_nonlinear(jax.random.key(0), 10)["period"].shape (10,)
-
nonlinear_priors:
dict[str,Distribution|QuantityDistribution]¶
-
linear_priors:
dict[str,Distribution|QuantityDistribution|Callable[...,QuantityDistribution|Normal]]¶
-
extension_priors:
dict[str,Distribution|QuantityDistribution]¶
- class harv.RejectionSampler¶
Bases:
AbstractSamplerRejection sampler for Keplerian orbital parameters.
Implements rejection sampling with analytic marginalization over linear parameters. Configure once, then call
run()with each dataset.- Parameters:
prior (
HarvPrior) – Prior distributions for nonlinear (and optionally linear) parameters.model (
AbstractComponentModel|JointModel) – Fully constructed model template (no data, no linear_priors).marginalized_names (
tuple[str,...] |None) – Linear parameter names to analytically marginalize. If None, all Gaussian linear parameters are auto-classified for marginalization.batch_size (
int) – Number of samples to process per batch. Smaller values use less memory but may be slower. Default: 100_000.
Examples
>>> from unxt import Q >>> import jax.numpy as jnp >>> from harv import HarvPrior, RejectionSampler, RVData >>> from harv.models.rv import RVModel >>> data = RVData( ... time=Q(jnp.linspace(0, 100, 5), "day"), ... rv=Q(jnp.zeros(5), "km/s"), ... rv_err=Q(jnp.full(5, 1.0), "km/s"), ... ) >>> prior = HarvPrior.default_rv( ... period_min=Q(2.0, "day"), ... period_max=Q(1000.0, "day"), ... sigma_K0=Q(30.0, "km/s"), ... sigma_v0=Q(50.0, "km/s"), ... ) >>> sampler = RejectionSampler(prior, RVModel()) >>> samples = sampler.run(data, n_prior_samples=100_000)
- __init__(prior, model, marginalized_names=None, batch_size=100000)¶
- Parameters:
prior (
HarvPrior)model (
AbstractComponentModel|JointModel)batch_size (
int)
- Return type:
None
- static __new__(cls, *args, **kwargs)¶
- get_extensions()¶
Return the extensions in effect for this sampler.
Walks the attached model: returns
model.extensionsfor a single component model, ordict[component_name, tuple[Extension, ...]]for aJointModelso the per-component association used for namespaced parameter names like"primary.jitter"is preserved. Use this in any downstream consumer (e.g.harv.plot.plot_rv()) that needs to know which extensions are in play.- Return type:
tuple[AbstractExtension,...] |dict[str,tuple[AbstractExtension,...]]
- run(data, *, n_prior_samples, max_posterior_samples=None, seed=None, ignore_non_finite=False, return_logprobs=False)¶
Run rejection sampling.
- Parameters:
data (
AbstractData|AbstractDatasetContainer) – Observed data: anAbstractDatasubclass (e.g.RVData,GaiaAstrometryData) for single-component models, or anAbstractDatasetContainer(e.g.SystemData,SourceData) forJointModel.n_prior_samples (
int) – Number of samples to draw from the prior.max_posterior_samples (
int|None) – Maximum number of posterior samples to return. If None, returns all accepted samples.seed (
int|None) – Random number seed. If not specified, picks a seed based on the current time.ignore_non_finite (
bool) – IfTrue, anyNaNor infinite log-likelihood values are treated as rejected samples by replacing them with-infbefore the rejection step. IfFalse(default), non-finite values are left unchanged.return_logprobs (
bool) – IfTrue, store per-sample log-probabilities on the returnedSamples:ln_likelihood(the marginal log-likelihood) andln_prior(the summed nonlinear prior log-density). These enableSamples.map_sample()and theSamples.ln_posteriorproperty. DefaultFalse.
- Return type:
- Returns:
Posterior samples container.
- run_with_samples(data, prior_samples, *, max_posterior_samples=None, seed=None, ignore_non_finite=False, return_logprobs=False, randomize_prior_order=True)¶
Run rejection against pre-computed prior samples.
Parallel to
run()but skips the prior-sampling step: the caller supplies the prior draws either in memory (aSamplesreturned byHarvPrior.sample()) or as a path to an HDF5 file produced bymake_prior_cache(). Useful when the same prior library is reused across many datasets — generate once, evaluate many times.- Parameters:
data (
AbstractData|AbstractDatasetContainer) – Observed data; same conventions asrun().prior_samples (
Samples|str|PathLike[str]) – Either aSamplesinstance (in-memory cache, e.g. fromprior.sample(...)) or anstr/os.PathLikepath to an HDF5 file. The HDF5 path is streamed batch by batch, so the file may be much larger than RAM.randomize_prior_order (
bool) – Disk-streaming branch only: whenTrue(default), batches are read from the HDF5 file in a random order (drawn fromseed). Each batch is still a single contiguous h5py slice, so disk I/O is unchanged. Set toFalsefor strictly sequential reads (reproducibility / debugging). Ignored for the in-memory branch.
- Returns:
Posterior samples, equivalent in shape and contents to
run()’s return.- Return type:
-
model:
AbstractComponentModel|JointModel¶
- class harv.Samples¶
Bases:
ModuleContainer for posterior samples.
Stores both nonlinear and linear parameter samples as
Qobjects with units baked in. Provides dict-like access, statistical summaries, and visualization tools.- Parameters:
nonlinear (
dict[str,Quantity]) – Nonlinear parameter samples, one Q per parameter. Keys:"period","eccentricity","phase_peri", and optionally"arg_peri","cos_i","lon_asc_node". Units: period has time units; angles have"rad"; dimensionless parameters have unit"".linear (
dict[str,Quantity]) – Linear parameter samples, one Q per parameter. Keys: e.g."rv_semiamp","v_sys"for RV;"ra0","dec0","pmra","pmdec","parallax","semi_major_axis"for astrometry. Units are data-driven (e.g."km/s"for RV).data_type (
str) – Informational label identifying the model that produced these samples (e.g."RVModel","GaiaAstrometryModel","JointModel"). Stored in HDF5 for round-tripping.metadata (
dict[str,Any]) – Additional metadata (t_ref,num_chains, acceptance rate, etc.).linear_extension_names (
tuple[str,...]) – Names of linear parameters introduced by extensions (instrument offsets, polynomial trends, etc.) beyond the base linear set.
Examples
Samplesis normally produced byrun(), but can be constructed directly for testing or manual use:>>> from unxt import Q >>> from harv.samplers.samples import Samples >>> samples = Samples( ... nonlinear={ ... "period": Q([100.0, 101.0, 99.5], "day"), ... "eccentricity": Q([0.1, 0.15, 0.12], ""), ... "phase_peri": Q([0.3, 0.31, 0.29], ""), ... "arg_peri": Q([1.0, 1.1, 0.9], "rad"), ... }, ... linear={ ... "rv_semiamp": Q([10.0, 11.0, 9.5], "km/s"), ... "v_sys": Q([5.0, 5.1, 4.9], "km/s"), ... }, ... data_type="rv", ... metadata={"t_ref": 0.0, "t_ref_unit": "day"}, ... ) >>> samples.n_samples 3 >>> "period" in samples True >>> samples.keys()[:4] ['period', 'eccentricity', 'phase_peri', 'arg_peri']
- __init__(nonlinear, linear, data_type='', metadata=<factory>, linear_extension_names=(), ln_likelihood=None, ln_prior=None)¶
- static __new__(cls, *args, **kwargs)¶
- binary_mass_function()¶
Binary mass function from the RV orbital elements.
- chi2(data, model)¶
Per-sample goodness-of-fit \(\chi^2\) against the data.
For each posterior sample, evaluates the model prediction at the stored parameter values and returns \(\chi^2 = r^\top C^{-1} r\) (see
chi_squared()). Jitter and other extension noise terms are included viaC.- Parameters:
data (
AbstractData) – The data the samples were fit to.model (
Any) – The component model used for the fit (RVModelorGaiaAstrometryModel); provides the prediction and noise model.
- Return type:
- Returns:
Array of shape
(n_samples,).
- companion_mass(m1, *, sini=None)¶
Companion mass \(m_2\) given the primary mass.
For RV samples the mass function is
binary_mass_function();sinidefaults to 1 (the minimum companion mass). For astrometry samples the dark-companion astrometric mass function is used andsiniis ignored (the inclination is already encoded in the physical orbit size).
- convert_parameterization(*, source, target)¶
Convert stored values between supported parameterizations.
Wraps
harv.samplers.convert_parameterization(), returning a newSampleswithmetadata,data_type, andlinear_extension_namespreserved. The initial implementation supports single-component RV and Gaia astrometry parameterizations only.- Parameters:
source (
AbstractParameterization)target (
AbstractParameterization)
- Return type:
- classmethod from_hdf5(filename)¶
Load samples from HDF5 file.
- Parameters:
- Return type:
- Returns:
Loaded samples object.
Examples
>>> samples = Samples.from_hdf5("posterior_samples.h5") >>> samples.n_samples 42 >>> samples.data_type 'rv'
- property ln_posterior: Array¶
Per-sample log-posterior density,
ln_prior + ln_likelihood.Both
ln_priorandln_likelihoodmust have been stored (run the sampler withreturn_logprobs=True); otherwiseValueErroris raised.
- map_sample(*, return_index=False)¶
Return the maximum a posteriori (MAP) sample.
Selects the single sample with the highest
ln_posterior.- Parameters:
return_index (
bool) – IfTrue, also return the integer index of the MAP sample.- Return type:
- Returns:
A length-1
Sampleswith the MAP sample, or(Samples, index)whenreturn_indexisTrue.- Raises:
ValueError – If per-sample log-probabilities were not stored (run the sampler with
return_logprobs=True).
- max_phase_gap(data)¶
Largest gap in orbital-phase coverage, per sample.
The maximum gap between consecutive observations on the (circular) phase axis – the ESA Gaia “maximum phase gap” statistic.
- Parameters:
data (
AbstractData) – The data the samples were fit to.- Return type:
- Returns:
Array of shape
(n_samples,); values in[0, 1].
- median(key=None)¶
Compute median values for parameters.
- Parameters:
key (
str|None) – If provided, return median for this parameter only. If None, return dict of medians for all parameters.- Return type:
dict[str,AbstractQuantity|Array] |AbstractQuantity|Array- Returns:
Median value(s).
Examples
>>> from unxt import Q >>> from harv.samplers.samples import Samples >>> samples = Samples( ... nonlinear={"period": Q([100.0, 102.0], "day"), ... "eccentricity": Q([0.1, 0.15], ""), ... "phase_peri": Q([0.3, 0.31], ""), ... "arg_peri": Q([1.0, 1.1], "rad")}, ... linear={"rv_semiamp": Q([10.0, 12.0], "km/s"), ... "v_sys": Q([5.0, 5.2], "km/s")}, ... data_type="rv", ... metadata={"t_ref": 0.0, "t_ref_unit": "day"}, ... ) >>> med = samples.median("period") >>> med.unit Unit("d") >>> all_medians = samples.median() >>> "period" in all_medians True
- property meta: _MetadataView¶
Q-aware read-only view over
metadata.metadataitself stores quantity-valued entries in split form (<name>value +<name>_unitstring) so the static field never holds a JAX array. Usesamples.meta[name]to read those entries back asQinstances without manually reassembling the two halves; plain scalar entries (e.g.num_chains) round-trip unchanged. See_MetadataViewfor the full read API.
- minimum_companion_mass(m1)¶
Minimum companion mass (edge-on,
sin i = 1).Convenience wrapper for
companion_mass()withsini=1.
- percentile(key, percentiles=(16, 50, 84))¶
Compute percentiles for a parameter.
- Parameters:
- Return type:
- Returns:
Percentile values with appropriate units.
Examples
>>> from unxt import Q >>> from harv.samplers.samples import Samples >>> samples = Samples( ... nonlinear={"period": Q([100.0, 102.0], "day"), ... "eccentricity": Q([0.1, 0.15], ""), ... "phase_peri": Q([0.3, 0.31], ""), ... "arg_peri": Q([1.0, 1.1], "rad")}, ... linear={"rv_semiamp": Q([10.0, 12.0], "km/s"), ... "v_sys": Q([5.0, 5.2], "km/s")}, ... data_type="rv", ... metadata={"t_ref": 0.0, "t_ref_unit": "day"}, ... ) >>> p16, p50, p84 = samples.percentile("eccentricity") >>> len(samples.percentile("period", [5, 50, 95])) 3
- period_modes(data, n_clusters=2)¶
Cluster the period samples and test each mode for unimodality.
Runs K-means on
log(period)(experimental; see thejoker’sis_P_Kmodal). Requires the optionalscikit-learndependency.- Parameters:
data (
AbstractData) – The data the samples were fit to.n_clusters (
int) – Number of period modes to cluster into. Default 2.
- Return type:
- Returns:
(all_unimodal, mode_periods, n_per_mode)– whether every mode is individually unimodal, the median period of each mode (aQ), and the sample count per mode.
- period_unimodal(data)¶
Whether the period samples lie within a single mode.
Uses the criterion
ptp(P) < 4 * P_min**2 / (2*pi*T), whereTis the data time span – the period spacing at which adjacent aliases become resolvable (see thejoker’sis_P_unimodal).- Parameters:
data (
AbstractData) – The data the samples were fit to (for the observed time span).- Return type:
- periods_spanned(data)¶
Number of orbital periods spanned by the data, per sample.
- Parameters:
data (
AbstractData) – The data the samples were fit to.- Return type:
- Returns:
Array of shape
(n_samples,).
- phase_coverage(data, n_bins=10)¶
Fraction of phase bins containing at least one observation.
The ESA Gaia “phase coverage” statistic, per sample.
- Parameters:
data (
AbstractData) – The data the samples were fit to.n_bins (
int) – Number of equal-width phase bins. Default 10.
- Return type:
- Returns:
Array of shape
(n_samples,); values in[0, 1].
- phase_coverage_per_period(data)¶
Maximum number of observations within any single period, per sample.
- Parameters:
data (
AbstractData) – The data the samples were fit to.- Return type:
- Returns:
Array of shape
(n_samples,).
- plot_corner(params=None, truths=None, labels=None, **plot_kwargs)¶
Create corner plot of posterior samples using arviz.
- Parameters:
params (
list[str] |None) – Parameters to include in corner plot. If None, selects a default set based on data_type.truths (
dict[str,Any] |None) – Dictionary of true parameter values to overplot as reference values.labels (
dict[str,str] |None) – Override display names for specific parameters, e.g.{"period": "period [day]", "rv_semiamp": "K [km/s]"}. Parameters not listed use their plain parameter name as the label.**plot_kwargs (
Any) – Additional keyword arguments passed to arviz.plot_pair().
- Return type:
- Returns:
Array of axes from
arviz.plot_pair.
Examples
Default corner plot (selects parameters based on data type):
>>> axes = samples.plot_corner()
Specify parameters and overplot true values:
>>> from unxt import Q >>> axes = samples.plot_corner( ... params=["period", "eccentricity"], ... truths={"period": Q(100, "day"), "eccentricity": 0.3}, ... )
- reduced_chi2(data, model, *, dof=None)¶
Per-sample reduced \(\chi^2\) (\(\chi^2 / \mathrm{dof}\)).
- Parameters:
data (
AbstractData) – The data the samples were fit to.model (
Any) – The component model used for the fit.dof (
int|None) – Degrees of freedom. Defaults ton_obs - n_params, wheren_paramscounts every fitted parameter (orbital + linear + extension). A well-fitting model gives reduced \(\chi^2\) near 1.
- Return type:
- Returns:
Array of shape
(n_samples,).- Raises:
ValueError – If
dof(default or supplied) is not positive.
- semi_major_axis_AU()¶
Physical semi-major axis (AU) from the angular orbit size + parallax.
- summary(params=None)¶
Compute summary statistics for parameters.
For each parameter, computes: - median - mean - std (standard deviation) - percentiles (16th, 84th) for +/-1-sigma equivalent
- Parameters:
params (
list[str] |None) – List of parameter names to summarize. If None, summarizes all.- Return type:
- Returns:
Dictionary mapping parameter names to their statistics.
Examples
>>> from unxt import Q >>> from harv.samplers.samples import Samples >>> samples = Samples( ... nonlinear={"period": Q([100.0, 102.0], "day"), ... "eccentricity": Q([0.1, 0.15], ""), ... "phase_peri": Q([0.3, 0.31], ""), ... "arg_peri": Q([1.0, 1.1], "rad")}, ... linear={"rv_semiamp": Q([10.0, 12.0], "km/s"), ... "v_sys": Q([5.0, 5.2], "km/s")}, ... data_type="rv", ... metadata={"t_ref": 0.0, "t_ref_unit": "day"}, ... ) >>> summary = samples.summary(["period", "eccentricity"]) >>> sorted(summary.keys()) ['eccentricity', 'period'] >>> sorted(summary["period"].keys()) ['mean', 'median', 'p16', 'p84', 'std']
- thiele_innes_to_campbell()¶
Convert Thiele-Innes linear parameters to Campbell orbital elements.
See
campbell_from_thiele_innes()for the mathematical details of the conversion.
- to_arviz(params=None, labels=None)¶
Export samples to an
arviz.InferenceDataobject.- Parameters:
params (
list[str] |None) – Parameters to include. IfNone, all parameters returned bykeys()are included.labels (
dict[str,str] |None) – Override display names for specific parameters, e.g.{"period": "period [day]", "rv_semiamp": "K [km/s]"}. Parameters not listed use their plain parameter name as the label.
- Return type:
- Returns:
Inference data suitable for
arviz.plot_pair,arviz.summary, etc.- Raises:
ImportError – If
arvizis not installed.
Examples
>>> idata = samples.to_arviz(["period", "eccentricity"])
- to_hdf5(filename)¶
Save samples to HDF5 file.
Examples
Save posterior samples and reload them:
>>> samples.to_hdf5("posterior_samples.h5") >>> reloaded = Samples.from_hdf5("posterior_samples.h5") >>> reloaded.n_samples == samples.n_samples True
- wrap_angles()¶
Wrap negative
rv_semiamp/semi_major_axisto positive.Negative
rv_semiamp(K) orsemi_major_axis(a) describes an orbit that is physically identical to the all-positive case under two symmetries of the model:arg_peri -> arg_peri + piflips the sign of bothKanda(the(omega, K, a) -> (omega + pi, -K, -a)symmetry);lon_asc_node -> lon_asc_node + piflips the sign ofaalone (the astrometric(Omega, a) -> (Omega + pi, -a)symmetry;lon_asc_nodedoes not enter the RV model).
This method returns a new
SamplesenforcingK >= 0anda >= 0in two steps:shift
arg_peribypi(flipping everyrv_semiampandsemi_major_axis) on the samples where the trigger amplitude is negative — enforcingK >= 0;shift
lon_asc_nodebypi(flipping everysemi_major_axis) on the samples whereais still negative — enforcinga >= 0.
Both shifted angles are wrapped to
[0, 2*pi). A singlearg_perishift cannot make bothKandapositive when their signs disagree (which routinely happens in joint RV + astrometry posteriors), so the secondlon_asc_nodeshift is required.Joint models (e.g. SB2) namespace per-component linear parameters as
"component.param_name"; this method discovers everyrv_semiamp- andsemi_major_axis-suffixed key. The firstrv_semiamp-suffixed key (insertion order) triggers step 1;semi_major_axistriggers step 1 only when norv_semiampis present (astrometry-only fits).No-op when
arg_periis absent fromnonlinear, when neitherrv_semiampnorsemi_major_axisis inlinear, or when no entries are negative.- Raises:
NotImplementedError – If the model has multiple per-component
arg_periorlon_asc_nodekeys (e.g."primary.arg_peri"and"secondary.arg_peri"). All current harv joint factories share both, so this is not expected to arise in practice.- Return type:
Examples
>>> from unxt import Q >>> from harv.samplers.samples import Samples >>> samples = Samples( ... nonlinear={"period": Q([100.0, 100.0], "day"), ... "eccentricity": Q([0.1, 0.1], ""), ... "phase_peri": Q([0.3, 0.3], ""), ... "arg_peri": Q([1.0, 1.0], "rad")}, ... linear={"rv_semiamp": Q([-10.0, 10.0], "km/s"), ... "v_sys": Q([0.0, 0.0], "km/s")}, ... data_type="rv", ... metadata={"t_ref": 0.0, "t_ref_unit": "day"}, ... ) >>> wrapped = samples.wrap_angles() >>> bool((wrapped["rv_semiamp"].value >= 0).all()) True
- Return type:
Subpackages¶
- harv.data package
AbstractAstrometryDataAbstractDataGaiaAstrometryDataRVDataSourceDataSystemDataSystemData.__init__()SystemData.__new__()SystemData.dataset_typeSystemData.get_datasets_by_type()SystemData.indicator_data()SystemData.indicator_data_by_type()SystemData.items()SystemData.keys()SystemData.plot()SystemData.stacked()SystemData.stacked_by_type()SystemData.t_refSystemData.values()
build_indicator_matrix()stack_datasets()- Submodules
- harv.data.containers module
AbstractDatasetContainerAbstractDatasetContainer.t_refAbstractDatasetContainer.keys()AbstractDatasetContainer.values()AbstractDatasetContainer.items()AbstractDatasetContainer.get_datasets_by_type()AbstractDatasetContainer.stacked_by_type()AbstractDatasetContainer.indicator_data_by_type()AbstractDatasetContainer.plot()AbstractDatasetContainer.__init__()AbstractDatasetContainer.__new__()
SourceDataSystemDataSystemData.__init__()SystemData.dataset_typeSystemData.stacked()SystemData.indicator_data()SystemData.__new__()SystemData.get_datasets_by_type()SystemData.indicator_data_by_type()SystemData.items()SystemData.keys()SystemData.plot()SystemData.stacked_by_type()SystemData.t_refSystemData.values()
- harv.data.datasets module
- harv.data.helpers module
- harv.kepler package
KeplerianBodyKeplerianBody.__init__()KeplerianBody.__new__()KeplerianBody.ecc_zero_tolKeplerianBody.from_masses()KeplerianBody.get_mass()KeplerianBody.get_position()KeplerianBody.get_velocity()KeplerianBody.orientationKeplerianBody.periodKeplerianBody.eccentricityKeplerianBody.semi_major_axisKeplerianBody.t_peri
KeplerianOrientationKeplerianOrientation.__init__()KeplerianOrientation.__new__()KeplerianOrientation.arg_periKeplerianOrientation.cos_arg_periKeplerianOrientation.cos_iKeplerianOrientation.cos_lon_asc_nodeKeplerianOrientation.from_angles()KeplerianOrientation.from_thiele_innes()KeplerianOrientation.inclinationKeplerianOrientation.lon_asc_nodeKeplerianOrientation.rotation_matrixKeplerianOrientation.sin_arg_periKeplerianOrientation.sin_iKeplerianOrientation.sin_lon_asc_nodeKeplerianOrientation.thiele_innes_constants()
AbstractNBodySystemTwoBodySystemTwoBodySystem.__init__()TwoBodySystem.__new__()TwoBodySystem.m_companionTwoBodySystem.m_totalTwoBodySystem.n_bodiesTwoBodySystem.position_barycentric()TwoBodySystem.position_relative()TwoBodySystem.velocity_barycentric()TwoBodySystem.velocity_relative()TwoBodySystem.m_primaryTwoBodySystem.companion
astrometric_mass_function()astrometric_orbit_at_times()binary_mass_function()companion_mass_from_mass_function()compute_true_anomaly_components()rv_at_times()mean_anomaly()rv_shape()semi_major_axis_physical()thiele_innes_ABFG()true_anomaly_from_mean()- Submodules
- harv.kepler.body module
KeplerianBodyKeplerianBody.periodKeplerianBody.eccentricityKeplerianBody.semi_major_axisKeplerianBody.t_periKeplerianBody.orientationKeplerianBody.ecc_zero_tolKeplerianBody.from_masses()KeplerianBody.get_mass()KeplerianBody.get_position()KeplerianBody.get_velocity()KeplerianBody.__init__()KeplerianBody.__new__()
- harv.kepler.constants module
- harv.kepler.masses module
- harv.kepler.nbody_system module
AbstractNBodySystemTwoBodySystemTwoBodySystem.m_primaryTwoBodySystem.companionTwoBodySystem.n_bodiesTwoBodySystem.m_totalTwoBodySystem.m_companionTwoBodySystem.position_barycentric()TwoBodySystem.position_relative()TwoBodySystem.velocity_barycentric()TwoBodySystem.velocity_relative()TwoBodySystem.__init__()TwoBodySystem.__new__()
- harv.kepler.orbits module
- harv.kepler.orientation module
KeplerianOrientationKeplerianOrientation.sin_arg_periKeplerianOrientation.cos_arg_periKeplerianOrientation.sin_lon_asc_nodeKeplerianOrientation.cos_lon_asc_nodeKeplerianOrientation.sin_iKeplerianOrientation.cos_iKeplerianOrientation.from_angles()KeplerianOrientation.from_thiele_innes()KeplerianOrientation.arg_periKeplerianOrientation.lon_asc_nodeKeplerianOrientation.inclinationKeplerianOrientation.rotation_matrixKeplerianOrientation.thiele_innes_constants()KeplerianOrientation.__init__()KeplerianOrientation.__new__()
- harv.models package
AbstractComponentModelAbstractComponentModel.__init__()AbstractComponentModel.__new__()AbstractComponentModel.chi_squared()AbstractComponentModel.log_prob()AbstractComponentModel.numpyro_model()AbstractComponentModel.params_explicit()AbstractComponentModel.params_marginalized()AbstractComponentModel.predict()AbstractComponentModel.sample_conditional_linear()AbstractComponentModel.extensions
AbstractParameterizationEcoswEsinwRVGaiaAstrometryModelGaiaAstrometryModel.__init__()GaiaAstrometryModel.__new__()GaiaAstrometryModel.chi_squared()GaiaAstrometryModel.extensionsGaiaAstrometryModel.log_prob()GaiaAstrometryModel.numpyro_model()GaiaAstrometryModel.parameterizationGaiaAstrometryModel.params_explicit()GaiaAstrometryModel.params_marginalized()GaiaAstrometryModel.pm_time_unitGaiaAstrometryModel.predict()GaiaAstrometryModel.predict_orbit_sky()GaiaAstrometryModel.sample_conditional_linear()
GPJitterMonomialTrendMultiSurveyOffsetHarvPriorJointModelJointModel.__init__()JointModel.__new__()JointModel.component_namesJointModel.for_rv_and_gaia()JointModel.for_sb2()JointModel.log_prob()JointModel.numpyro_model()JointModel.params_explicit()JointModel.params_marginalized()JointModel.sample_conditional_linear()JointModel.shared_linear_paramsJointModel.componentsJointModel.shared_params
RVModelStandardGaiaAstrometryStandardGaiaAstrometry.__init__()StandardGaiaAstrometry.__new__()StandardGaiaAstrometry.default_prior()StandardGaiaAstrometry.design_matrix()StandardGaiaAstrometry.linear_log_prior_correction()StandardGaiaAstrometry.linear_params()StandardGaiaAstrometry.nonlinear_params()StandardGaiaAstrometry.params()StandardGaiaAstrometry.sky_orbit()
StandardRVdefault_sb2_prior()- Subpackages
- Submodules
- harv.models.astrometry module
GaiaAstrometryModelGaiaAstrometryModel.parameterizationGaiaAstrometryModel.extensionsGaiaAstrometryModel.pm_time_unitGaiaAstrometryModel.predict_orbit_sky()GaiaAstrometryModel.__init__()GaiaAstrometryModel.__new__()GaiaAstrometryModel.chi_squared()GaiaAstrometryModel.log_prob()GaiaAstrometryModel.numpyro_model()GaiaAstrometryModel.params_explicit()GaiaAstrometryModel.params_marginalized()GaiaAstrometryModel.predict()GaiaAstrometryModel.sample_conditional_linear()
- harv.models.component module
AbstractComponentModelAbstractComponentModel.extensionsAbstractComponentModel.params_explicit()AbstractComponentModel.params_marginalized()AbstractComponentModel.log_prob()AbstractComponentModel.predict()AbstractComponentModel.chi_squared()AbstractComponentModel.sample_conditional_linear()AbstractComponentModel.numpyro_model()AbstractComponentModel.__init__()AbstractComponentModel.__new__()
- harv.models.joint module
JointModelJointModel.componentsJointModel.shared_paramsJointModel.shared_linear_paramsJointModel.for_rv_and_gaia()JointModel.for_sb2()JointModel.component_namesJointModel.params_explicit()JointModel.params_marginalized()JointModel.log_prob()JointModel.sample_conditional_linear()JointModel.numpyro_model()JointModel.__init__()JointModel.__new__()
- harv.models.rv module
- harv.samplers package
AbstractSamplerNumpyroSamplerconvert_parameterization()make_prior_cache()QDQuantityDistributionRejectionSamplerSamplesSamples.__init__()Samples.__new__()Samples.binary_mass_function()Samples.chi2()Samples.companion_mass()Samples.convert_parameterization()Samples.data_typeSamples.from_hdf5()Samples.keys()Samples.linear_extension_namesSamples.ln_likelihoodSamples.ln_posteriorSamples.ln_priorSamples.map_sample()Samples.max_phase_gap()Samples.median()Samples.metaSamples.minimum_companion_mass()Samples.n_samplesSamples.percentile()Samples.period_modes()Samples.period_unimodal()Samples.periods_spanned()Samples.phase_coverage()Samples.phase_coverage_per_period()Samples.plot_corner()Samples.reduced_chi2()Samples.semi_major_axis_AU()Samples.summary()Samples.thiele_innes_to_campbell()Samples.to_arviz()Samples.to_hdf5()Samples.wrap_angles()Samples.nonlinearSamples.linearSamples.metadata
- Submodules
- harv.samplers.base module
- harv.samplers.conversion module
- harv.samplers.numpyro module
- harv.samplers.prior_cache module
- harv.samplers.rejection module
- harv.samplers.samples module
SamplesSamples.nonlinearSamples.linearSamples.data_typeSamples.metadataSamples.linear_extension_namesSamples.ln_likelihoodSamples.ln_priorSamples.n_samplesSamples.metaSamples.ln_posteriorSamples.keys()Samples.wrap_angles()Samples.thiele_innes_to_campbell()Samples.convert_parameterization()Samples.median()Samples.percentile()Samples.summary()Samples.map_sample()Samples.period_unimodal()Samples.period_modes()Samples.max_phase_gap()Samples.phase_coverage()Samples.periods_spanned()Samples.phase_coverage_per_period()Samples.chi2()Samples.reduced_chi2()Samples.binary_mass_function()Samples.semi_major_axis_AU()Samples.companion_mass()Samples.minimum_companion_mass()Samples.to_hdf5()Samples.from_hdf5()Samples.__init__()Samples.__new__()Samples.to_arviz()Samples.plot_corner()
- harv.simulate package
simulate_gaia_epoch_astrometry()fake_parallax_factor()simulate_rv_multisurv_data()simulate_rv_sb1_data()- Submodules
- harv.simulate.astrometry module
- harv.simulate.rv module
- harv.simulate.scanlaw module
AbstractGaiaScanLawGaiaReducedCommandedScanLawGaiaReducedCommandedScanLaw.drGaiaReducedCommandedScanLaw.random_downsample_fractionGaiaReducedCommandedScanLaw.random_seedGaiaReducedCommandedScanLaw.file_pathGaiaReducedCommandedScanLaw.read_metadata()GaiaReducedCommandedScanLaw.load_scans_for_healpix()GaiaReducedCommandedScanLaw.query()GaiaReducedCommandedScanLaw.__init__()GaiaReducedCommandedScanLaw.__new__()
- harv.stats package
MarginalizedLinearMarginalizedLinear.__init__()MarginalizedLinear.arg_constraintsMarginalizedLinear.batch_shapeMarginalizedLinear.cdf()MarginalizedLinear.conditional()MarginalizedLinear.covariance_matrixMarginalizedLinear.entropy()MarginalizedLinear.enumerate_support()MarginalizedLinear.event_dimMarginalizedLinear.event_shapeMarginalizedLinear.expand()MarginalizedLinear.expand_by()MarginalizedLinear.gather_pytree_aux_fields()MarginalizedLinear.gather_pytree_data_fields()MarginalizedLinear.get_args()MarginalizedLinear.has_enumerate_supportMarginalizedLinear.has_rsampleMarginalizedLinear.icdf()MarginalizedLinear.infer_shapes()MarginalizedLinear.is_discreteMarginalizedLinear.log_prob()MarginalizedLinear.mask()MarginalizedLinear.meanMarginalizedLinear.pytree_aux_fieldsMarginalizedLinear.pytree_data_fieldsMarginalizedLinear.reparametrized_paramsMarginalizedLinear.rsample()MarginalizedLinear.sample()MarginalizedLinear.sample_with_intermediates()MarginalizedLinear.set_default_validate_args()MarginalizedLinear.shape()MarginalizedLinear.supportMarginalizedLinear.to_event()MarginalizedLinear.tree_flatten()MarginalizedLinear.tree_unflatten()MarginalizedLinear.validate_args()MarginalizedLinear.variance
- Submodules
- harv.stats.linear_op module
- harv.stats.numpyro_ext module
UnitDiskUnitDisk.supportUnitDisk.__init__()UnitDisk.sample()UnitDisk.log_prob()UnitDisk.meanUnitDisk.varianceUnitDisk.arg_constraintsUnitDisk.batch_shapeUnitDisk.cdf()UnitDisk.entropy()UnitDisk.enumerate_support()UnitDisk.event_dimUnitDisk.event_shapeUnitDisk.expand()UnitDisk.expand_by()UnitDisk.gather_pytree_aux_fields()UnitDisk.gather_pytree_data_fields()UnitDisk.get_args()UnitDisk.has_enumerate_supportUnitDisk.has_rsampleUnitDisk.icdf()UnitDisk.infer_shapes()UnitDisk.is_discreteUnitDisk.mask()UnitDisk.pytree_aux_fieldsUnitDisk.pytree_data_fieldsUnitDisk.reparametrized_paramsUnitDisk.rsample()UnitDisk.sample_with_intermediates()UnitDisk.set_default_validate_args()UnitDisk.shape()UnitDisk.to_event()UnitDisk.tree_flatten()UnitDisk.tree_unflatten()UnitDisk.validate_args()
AngleAngle.__init__()Angle.supportAngle.sample()Angle.log_prob()Angle.meanAngle.varianceAngle.cdf()Angle.icdf()Angle.arg_constraintsAngle.batch_shapeAngle.entropy()Angle.enumerate_support()Angle.event_dimAngle.event_shapeAngle.expand()Angle.expand_by()Angle.gather_pytree_aux_fields()Angle.gather_pytree_data_fields()Angle.get_args()Angle.has_enumerate_supportAngle.has_rsampleAngle.infer_shapes()Angle.is_discreteAngle.mask()Angle.pytree_aux_fieldsAngle.pytree_data_fieldsAngle.reparametrized_paramsAngle.rsample()Angle.sample_with_intermediates()Angle.set_default_validate_args()Angle.shape()Angle.to_event()Angle.tree_flatten()Angle.tree_unflatten()Angle.validate_args()
MarginalizedLinearMarginalizedLinear.arg_constraintsMarginalizedLinear.supportMarginalizedLinear.reparametrized_paramsMarginalizedLinear.__init__()MarginalizedLinear.sample_with_intermediates()MarginalizedLinear.sample()MarginalizedLinear.log_prob()MarginalizedLinear.conditional()MarginalizedLinear.tree_flatten()MarginalizedLinear.tree_unflatten()MarginalizedLinear.batch_shapeMarginalizedLinear.cdf()MarginalizedLinear.entropy()MarginalizedLinear.enumerate_support()MarginalizedLinear.event_dimMarginalizedLinear.event_shapeMarginalizedLinear.expand()MarginalizedLinear.expand_by()MarginalizedLinear.gather_pytree_aux_fields()MarginalizedLinear.gather_pytree_data_fields()MarginalizedLinear.get_args()MarginalizedLinear.has_enumerate_supportMarginalizedLinear.has_rsampleMarginalizedLinear.icdf()MarginalizedLinear.infer_shapes()MarginalizedLinear.is_discreteMarginalizedLinear.mask()MarginalizedLinear.meanMarginalizedLinear.pytree_aux_fieldsMarginalizedLinear.pytree_data_fieldsMarginalizedLinear.rsample()MarginalizedLinear.set_default_validate_args()MarginalizedLinear.shape()MarginalizedLinear.to_event()MarginalizedLinear.validate_args()MarginalizedLinear.varianceMarginalizedLinear.covariance_matrix
Submodules¶
harv.custom_types module¶
Custom types used in harv.
- harv.custom_types.float_converter(x)¶
Converter for dimensionless scalar float fields.
Strips units from a dimensionless quantity or passes through plain scalars, always producing a 0-d JAX array. Use as an
eqx.fieldconverter wherever a dimensionless scalar is stored internally as a plain JAX float.
harv.distributions module¶
Unit-aware wrapper for numpyro distributions.
This module provides QuantityDistribution, a thin wrapper that
pairs a numpyro distribution with a physical unit string so that samples
carry explicit units via unxt.Q.
- harv.distributions.QD¶
alias of
QuantityDistribution
- class harv.distributions.QuantityDistribution¶
Bases:
ModulePairs a numpyro distribution with the physical unit of its samples.
For scalar distributions,
unitis a single string. For multivariate distributions (e.g.MultivariateNormalover parameters with mixed units),unitis a tuple of strings – one per dimension.- Parameters:
distribution (
Distribution) – The underlying numpyro distribution (works with bare floats).unit (
str|tuple[str,...]) – Physical unit of the samples. A single string for scalar distributions; a tuple for multivariate distributions where each element may have a different unit.
Examples
Scalar (period in days):
>>> import jax >>> import numpyro.distributions as dist >>> from harv import QuantityDistribution >>> qd = QuantityDistribution(dist.LogUniform(50., 2000.), "day") >>> sample = qd.sample(jax.random.key(0)) >>> sample.unit Unit("d")
Multivariate (astrometric linear parameters with mixed units):
qd = QuantityDistribution( dist.MultivariateNormal(loc=jnp.zeros(6), ...), ("mas", "mas", "mas/yr", "mas/yr", "mas", "mas"), ) sample = qd.sample(key) # -> raw jax.Array (consumer splits by name)
-
distribution:
Distribution¶
- sample(key, sample_shape=())¶
Sample from the distribution, attaching units when possible.
For scalar units (
str), returnsQuantity. For tuple units (multivariate with mixed units), returns a raw array – the consumer splits by parameter name and attaches per-element units.
- log_prob(value)¶
Evaluate log-probability, stripping units if present.
- __init__(distribution, unit)¶
- Parameters:
distribution (
Distribution)
- Return type:
None
harv.io module¶
Serialization helpers for harv sampler objects.
Provides save_sampler() and load_sampler() for persisting a
configured sampler (prior + extensions) to disk so that the same setup can be
reloaded without re-specifying all prior parameters.
The format is a standard Python pickle file. Equinox modules (eqx.Module
subclasses) are picklable, so all prior distributions, extension objects, and
static fields round-trip exactly.
Examples
Save and reload a RejectionSampler:
>>> import harv
>>> sampler = harv.RejectionSampler(prior)
>>> harv.save_sampler("sampler.pkl", sampler)
>>> sampler2 = harv.load_sampler("sampler.pkl")
- harv.io.load_sampler(path)¶
Load a sampler from a pickle file written by
save_sampler().- Parameters:
- Returns:
Restored sampler with all prior distributions and extensions intact.
- Return type:
Examples
>>> import harv >>> sampler = load_sampler("sampler.pkl")
- harv.io.save_sampler(path, sampler)¶
Save a configured sampler to a pickle file.
All prior distributions, extensions, and static configuration are preserved. The file can be reloaded with
load_sampler()without needing to re-specify any prior parameters.- Parameters:
- Return type:
Examples
>>> import harv >>> save_sampler("sampler.pkl", sampler)
harv.plot module¶
Plotting utilities.
- harv.plot.get_t_grid(times, period, *, span_buffer_factor=0.1, n_points_per_period=256, max_t_grid=1000000)¶
Dense time grid spanning the observation baseline with a small buffer.
Generates a regular grid of times for plotting model orbits over data. The grid resolution adapts to the period so that fast orbits are well-resolved while long-period orbits don’t create excessive grids.
- Parameters:
times (
Real[Quantity[PhysicalType('time')], '*batch']) – Observation times.period (
Real[Quantity[PhysicalType('time')], '']) – Orbital period (scalar). Used to set the grid spacing asperiod / n_points_per_period.span_buffer_factor (
float) – Fractional buffer added to each side of the observation baseline. Default: 0.1 (10% on each side).n_points_per_period (
int) – Number of grid points per orbital period. Default: 256.max_t_grid (
int|None) – Maximum number of grid points. Default: 1e6. Set to None to disable.
- Return type:
Real[Quantity[PhysicalType('time')], 'n']- Returns:
Regular time grid spanning the buffered observation range.
Examples
>>> from unxt import Q >>> times = Q([0.0, 50.0, 100.0], "day") >>> t_grid = get_t_grid(times, Q(30.0, "day")) >>> len(t_grid) > 128 True
- harv.plot.plot_gaia_astrometry(samples, data, model=None, *, data_plot_kwargs=None, sky_orbit_kwargs=None, figsize=(10, 5), axes=None, **kwargs)¶
Plot the best-fit Gaia astrometry model for a single posterior sample.
Produces a two-panel goodness-of-fit figure for one posterior sample:
Panel 1 (left): on-sky photocentre orbital ellipse (delegated to
plot_gaia_sky_orbit()), with each Gaia epoch shown as a scan-direction segment at the model-predicted photocentre offset.Panel 2 (right): along-scan position residual vs time — the observed
al_positionminus the full predicted model for the sample. The prediction is built by delegating to the model (predict()), which folds in every design-matrix extension (e.g.MonomialTrend(astrometry=True),MultiSurveyOffset). Covariance-only extensions (jitter, GP) widen the residual error bars via_full_obs_err(); GP conditional-mean structured noise is additionally subtracted from the residual so only true noise remains.
samples must contain exactly one posterior sample. Select one beforehand with
samples[i]ormap_sample().- Parameters:
samples (
Samples) – Posterior samples from a Gaia astrometry or joint model. Must contain exactly one sample; otherwise aValueErroris raised.data (
GaiaAstrometryData) – The data conditioned on by the model. Required.model (
Any) – TheGaiaAstrometryModelwhose extensions and parameterization define the prediction. WhenNone(default), a bareGaiaAstrometryModel()is used — equivalent to the previous no-extension behavior.data_plot_kwargs (
dict[str,Any] |None) – Style overrides for the panel-2 residual error bars.sky_orbit_kwargs (
dict[str,Any] |None) – Forwarded toplot_gaia_sky_orbit()for panel 1.figsize (
tuple[float,float]) – Figure size when axes isNone. Default(10, 5).axes (
tuple[Any,Any] |None) – Two axes to draw into,(sky, residual).None(default) creates a new figure.**kwargs (
Any) – Forwarded tomatplotlib.pyplot.subplotswhen axes isNone.
- Return type:
- Returns:
The figure if axes was
None, elseNone.- Raises:
ImportError – If matplotlib is not installed.
ValueError – If samples does not contain exactly one posterior sample.
Examples
>>> fig = plot_gaia_astrometry(samples[0], data=gaia_data) >>> fig = plot_gaia_astrometry( ... samples.map_sample(), data=gaia_data, model=sampler.model ... )
- harv.plot.plot_gaia_sky_orbit(model, samples, *, data=None, n_grid=500, errorbar_scale=1.0, plot_kwargs=None, data_plot_kwargs=None, ax=None, **kwargs)¶
Plot a single astrometric orbit ellipse on the sky.
Draws the photocenter orbit projected onto the sky-plane (
ΔRAvsΔDec) for one posterior sample. When data is provided, each Gaia epoch is rendered as a short line segment in the scan direction at the model- predicted photocenter offset, with half-length equal to the along-scan measurement uncertainty (scaled by errorbar_scale).The orbit-only sky path (no PM, no parallax) is constructed by delegating to
predict_orbit_sky(), so this function automatically supports both Standard and Thiele-Innes parameterizations.- Parameters:
model (
Any) – TheGaiaAstrometryModelwhose parameterization defines the orbit.samples (
Samples) – Posterior samples containing exactly one sample. Select beforehand withsamples[i]orSamples.map_sample().data (
GaiaAstrometryData|None) – Gaia epoch astrometry data. WhenNone, only the orbit ellipse is drawn (no per-epoch markers).n_grid (
int) – Number of phase points used to draw the smooth orbit curve. Default 500.errorbar_scale (
float) – Scale factor on the half-length of each scan-direction line segment (default1.0= 1-sigma).plot_kwargs (
dict[str,Any] |None) – Style overrides for the orbit curve (forwarded toax.plot).data_plot_kwargs (
dict[str,Any] |None) – Style overrides for the per-epoch scan-direction segments.ax (
Any) – Axes to draw into.None(default) creates a new figure.**kwargs (
Any) – Forwarded tomatplotlib.pyplot.subplotswhen ax isNone.
- Return type:
- Returns:
The figure if ax was
None, elseNone.- Raises:
ImportError – If matplotlib is not installed.
ValueError – If samples does not contain exactly one posterior sample.
- harv.plot.plot_rv(samples, data=None, model=None, *, n_samples=128, time_grid=None, show_signal_components=False, relative_to_t_ref=False, relative_to_median_v_sys=False, phase_fold_median=False, apply_median_offsets=True, plot_kwargs=None, data_plot_kwargs=None, extra_err_plot_kwargs=None, color_cycler=None, ax=None, **kwargs)¶
Plot RV curves computed from (posterior) samples over data.
Draws
n_samplesposterior RV curves sampled from samples. The predicted curve and error model are obtained by delegating to the model:The smooth curve at the plotting grid uses
predict_at_times(), which folds in every design-matrix extension (e.g.MonomialTrend).Error bars are widened using
_full_obs_err()(jitter, GP diagonal).GP structured noise is overlaid via
conditional_mean().
MultiSurveyOffsetis subsetted out for the smooth-curve overlay (itsindicator_matrixis bound to the original data’s row count and cannot be evaluated on a plotting grid); per-instrument median offsets are still applied to data points so that instruments land in the reference frame.- Parameters:
samples (
Samples) – Posterior samples fromRejectionSamplerorNumpyroSampler.data (
RVData|SourceData|SystemData|None) – Observed RV data to overplot. WhenNone, only orbit model curves are drawn (no data points, no instrument-colour cycling).model (
Any) – The RV component model (orJointModelfor SB2) whose extensions and parameterization define the prediction. WhenNone(default), a bareRVModel()is used — the previous no-extension behavior.n_samples (
int|None) – Number of posterior curves to draw. Set to None to draw all samples. Default: 128.time_grid (
Real[Quantity[PhysicalType('time')], '*batch']|None) – Explicit time grid used to evaluate and plot the posterior orbit curves. When provided, this is used instead of the default phase grid orget_t_grid(). Ifphase_fold_median=True, the supplied time grid is converted to phase using the reference sample’s period and periastron time.show_signal_components (
bool) – Whether to plot the Keplerian signal and the combined extension-driven contribution as separate curves instead of plotting their sum. This decomposition view is only supported for time-domain RV plots with observed data. Default:False.relative_to_t_ref (
bool) – Whether to plot time relative to the reference epoch (t_ref) of the data.relative_to_median_v_sys (
bool) – Whether to shift all curves by the median systemic velocity (v_sys) of the samples, so that the curves show only the relative RV variations. Only applies when a “v_sys” parameter is present in the samples. Default: False.phase_fold_median (
bool) – IfTrue, fold data and model to orbital phase using the sample closest to the median period. Phase zero is set to that sample’st_perivalue. Only that single reference orbit curve is drawn — plotting multiple samples on a phase axis defined by one period is misleading when the posterior has period spread. When plot-aware extensions are present, the reference sample’s extension contribution is subtracted from the data before folding so the Keplerian orbit overlays the phase-folded points. Default:False.apply_median_offsets (
bool) – Shift non-reference instrument data by the posterior median offset so all instruments land in the reference frame. Only applies when aMultiSurveyOffsetextension is present. Default:True.plot_kwargs (
dict[str,Any] |None) – Style overrides for orbit model curves.data_plot_kwargs (
dict[str,Any] |None) – Style overrides for data points.extra_err_plot_kwargs (
dict[str,Any] |None) – Style overrides for the widened error bars drawn when a jitter extension is present. Keys override the defaults (marker=””, ecolor=”#6b2828”, alpha=0.5).color_cycler (
Any|None) – Colour cycler used to assign distinct colours to each instrument’s data points. WhenNone(default) the currentaxes.prop_cyclefrommatplotlib.rcParamsis used.ax (
Any) – Axes to draw into. IfNone(default), a new figure and axes are created and the axes object is returned.**kwargs (
Any) – Forwarded tomatplotlib.pyplot.subplotswhen ax isNone.
- Return type:
- Returns:
The axes plotted to.
- Raises:
ImportError – If matplotlib is not installed.
Examples
>>> ax = plot_rv(samples, rv_data) >>> ax = plot_rv(samples, rv_data, model=sampler.model) >>> ax = plot_rv(samples, rv_data, phase_fold_median=True)