harv.data package ¶

Examples

>>> import jax.numpy as jnp
>>> import matplotlib.pyplot as plt
>>> from unxt import Q
>>> from harv import GaiaAstrometryData
>>> data = GaiaAstrometryData(
...     time=Q([0.0, 100.0, 200.0], "day"),
...     al_position=Q([0.1, -0.2, 0.05], "mas"),
...     al_position_err=Q([0.01, 0.01, 0.01], "mas"),
...     scan_angle=Q([0.5, 1.2, 2.8], "rad"),
...     parallax_factor=jnp.array([0.3, -0.1, 0.4]),
... )
>>> ax = data.plot()
>>> plt.close("all")

t_ref: Real[Quantity[PhysicalType('time')], ''] | None = None¶: Reference epoch. If None, uses mean observation time.

al_position: Real[Quantity[PhysicalType('angle')], 'n']¶: Along-scan position.

al_position_err: Real[Quantity[PhysicalType('angle')], 'n']¶: Along-scan uncertainty.

scan_angle: Real[Quantity[PhysicalType('angle')], 'n']¶: Per-CCD scan angle.

parallax_factor: Float[Array, 'n']¶: AL parallax factors.

time: Real[Quantity[PhysicalType('time')], 'n']¶: Barycentric TCB times.

class harv.data.RVData¶

Bases: AbstractData

Radial velocity measurements.

Examples

>>> from unxt import Q
>>> from harv import RVData
>>> data = RVData(
...     time=Q([0.0, 50.0, 100.0], "day"),
...     rv=Q([1.0, -2.0, 0.5], "km/s"),
...     rv_err=Q([0.5, 0.5, 0.5], "km/s"),
... )
>>> data.n_times
3

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
rv (Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])
rv_err (Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

__init__(time, rv, rv_err, *, t_ref=None)¶

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
rv (Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])
rv_err (Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

Return type:

None

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

property n_times: int¶: Number of times / epochs / observations.

plot(ax=None, *, rv_unit=None, add_labels=True, relative_to_t_ref=False, phase_fold=None, **kwargs)¶

Plot RV data as error bars.

Parameters:

ax (Any) – The matplotlib.axes.Axes instance to draw on. If None, uses plt.gca().
rv_unit (str | None) – Display unit for the RV axis. Defaults to the data’s own unit.
add_labels (bool) – Add axis labels.
relative_to_t_ref (bool) – Plot time relative to t_ref. Mutually exclusive with phase_fold.
phase_fold (Any | None) – If given, fold observations to orbital phase using this period: x = ((time - t_ref) / phase_fold) mod 1. Mutually exclusive with relative_to_t_ref.
**kwargs (Any) – Passed to ax.errorbar(). Defaults can be overridden.

Returns:

The matplotlib.axes.Axes instance.

Examples

>>> import matplotlib.pyplot as plt
>>> from unxt import Q
>>> data = RVData(
...     time=Q([0.0, 50.0, 100.0], "day"),
...     rv=Q([1.0, -2.0, 0.5], "km/s"),
...     rv_err=Q([0.5, 0.5, 0.5], "km/s"),
... )
>>> ax = data.plot()  # uses errorbar() with sensible defaults
>>> ax = data.plot(color="C1", markersize=6)  # override style
>>> ax = data.plot(phase_fold=Q(50.0, "day"))  # phase-folded
>>> plt.close("all")
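
The phase-fold rule given for phase_fold, x = ((time - t_ref) / phase_fold) mod 1, can be sketched in plain Python. This is an illustration only, not harv's implementation: the helper name fold_phase is hypothetical, times and the period are bare floats in days rather than unxt quantities, and the fallback to the mean time is an assumption that mirrors the documented t_ref default.

```python
from statistics import fmean


def fold_phase(times, period, t_ref=None):
    """Hypothetical helper: map times to orbital phase in [0, 1)."""
    if period <= 0:
        raise ValueError("period must be positive")
    # Assumption: mirror the documented default, where a missing t_ref
    # falls back to the mean observation time.
    if t_ref is None:
        t_ref = fmean(times)
    # Python's % returns a non-negative result for a positive divisor,
    # so times before t_ref also land in [0, 1).
    return [((t - t_ref) / period) % 1.0 for t in times]


phases = fold_phase([0.0, 50.0, 100.0, 125.0], period=50.0, t_ref=0.0)
```

With a 50-day period, observations spaced by exactly one period collapse onto the same phase, which is why a phase-folded plot of the example data above would place all three points at the same x value.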

t_ref: Real[Quantity[PhysicalType('time')], ''] | None = None¶: Reference epoch. If None, uses mean observation time.

rv: Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n']¶: Radial velocities.

rv_err: Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n']¶: Radial velocity uncertainties.

time: Real[Quantity[PhysicalType('time')], 'n']¶: Barycentric TCB times.

class harv.data.SourceData¶

Container for multiple named datasets for a single source.

Accepts arbitrary named datasets via keyword arguments. Names are user-defined and can be anything (e.g., gaia, keck_rv, hst_imaging).

Parameters:: datasets (AbstractAstrometryData | RVData)

__init__(**datasets)¶

Parameters:: datasets (AbstractAstrometryData | RVData)
Return type:: None

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

get_datasets_by_type(data_type)¶

Get all datasets/components of a specific data type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by.
Return type:: dict[str, TypeVar(_DT, bound= AbstractAstrometryData | RVData)]

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.get_datasets_by_type(RVData)
{'keck_rv': RVData(...)}
>>> source_data.get_datasets_by_type(GaiaAstrometryData)
{'gaia': GaiaAstrometryData(...)}
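
The filtering step can be pictured as a dictionary comprehension over the stored datasets. The sketch below uses stand-in classes (FakeRV, FakeGaia) and a hypothetical filter_by_type function instead of the real harv classes, and it assumes isinstance() matching; the library may compare types differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeRV:
    """Stand-in for an RV dataset (not harv.data.RVData)."""

    n_times: int


@dataclass(frozen=True)
class FakeGaia:
    """Stand-in for an astrometry dataset (not GaiaAstrometryData)."""

    n_times: int


def filter_by_type(datasets, data_type):
    """Keep the entries whose value is an instance of data_type.

    Assumption: isinstance() matching, so subclasses would also pass.
    """
    return {name: ds for name, ds in datasets.items() if isinstance(ds, data_type)}


datasets = {"keck_rv": FakeRV(3), "gaia": FakeGaia(5), "wiyn_rv": FakeRV(2)}
rv_only = filter_by_type(datasets, FakeRV)
```

Because the comprehension walks the dict in insertion order, the filtered result keeps the names in the order they were given to the container.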

indicator_data_by_type(data_type, reference)¶

Return stacked data and indicator flags for one dataset type.

Convenience wrapper around get_datasets_by_type() and build_indicator_matrix(), for extensions that need to build a kernel matrix across multiple datasets of the same type (e.g. multiple RV instruments).

Parameters:

data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
reference (str) – Name of the reference dataset to use for time coordinates and metadata. Must be one of the keys of the dict returned by get_datasets_by_type(data_type).

Return type:

tuple[TypeVar(_DT, bound= AbstractAstrometryData | RVData), Array | None, tuple[str, ...] | None]

items()¶

(name, dataset) pairs.

Return type:: Iterator[tuple[str, AbstractAstrometryData | RVData]]

keys()¶

Dataset/component names.

Return type:: Iterator[str]

plot(*args, **kwargs)¶

Plot all datasets on a single axes.

Only valid when every contained dataset shares the same concrete type; plotting heterogeneous types (e.g. RV in km/s and astrometry in mas) on a single axes would overlay incompatible y-axes. Use get_datasets_by_type() to filter to a single type first when needed.

Parameters mirror AbstractDatasetContainer.plot().
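
A call-time homogeneity check of the kind described here could look like the following sketch. The function name require_single_type is hypothetical, and the exact comparison harv performs (and the wording of its error message) is not documented here, so treat this as one plausible shape rather than the library's code.

```python
def require_single_type(datasets):
    """Hypothetical check: return the shared concrete type or raise TypeError."""
    # Collect the exact class of every dataset (no subclass matching).
    types = {type(ds) for ds in datasets.values()}
    if len(types) != 1:
        # Also triggers for an empty mapping, where no type is available.
        names = sorted(t.__name__ for t in types)
        raise TypeError(f"expected one concrete dataset type, got {names}")
    return types.pop()


# Built-in types stand in for dataset classes in this illustration.
shared = require_single_type({"a": 1.0, "b": 2.0})

try:
    require_single_type({"rv": 1.0, "astrometry": "mas"})
    mixed_rejected = False
except TypeError:
    mixed_rejected = True
```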

Raises:

TypeError – If the contained datasets are not all of the same concrete type.

Parameters:

args (Any)
kwargs (Any)

stacked_by_type(data_type)¶

Stack all datasets of the requested type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
Return type:: TypeVar(_DT, bound= AbstractAstrometryData | RVData)

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     wiyn_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.stacked_by_type(RVData)
RVData(...)

property t_ref: Real[Quantity[PhysicalType('time')], ''] | None¶

Reference epoch shared by all contained datasets.

Guaranteed to be consistent across components because every concrete subclass calls _synchronize_t_refs() in its __init__.

values()¶

Dataset/component values.

Return type:: Iterator[AbstractAstrometryData | RVData]

class harv.data.SystemData¶

Container for a multi-component system.

Every named component must be an instance of the same concrete data class. Each component represents observations of a distinct physical body or photocenter in a gravitationally bound system.

Parameters:: datasets (AbstractAstrometryData | RVData)

__init__(**datasets)¶

Parameters:: datasets (AbstractAstrometryData | RVData)
Return type:: None

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

property dataset_type: type[AbstractData]¶: Concrete dataset class shared by all components.

get_datasets_by_type(data_type)¶

Get all datasets/components of a specific data type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by.
Return type:: dict[str, TypeVar(_DT, bound= AbstractAstrometryData | RVData)]

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.get_datasets_by_type(RVData)
{'keck_rv': RVData(...)}
>>> source_data.get_datasets_by_type(GaiaAstrometryData)
{'gaia': GaiaAstrometryData(...)}

indicator_data(reference)¶

Return stacked data and component-indicator flags.

Parameters:: reference (str)
Return type:: tuple[AbstractAstrometryData | RVData, Array | None, tuple[str, ...] | None]

indicator_data_by_type(data_type, reference)¶

Return stacked data and indicator flags for one dataset type.

Parameters:

data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
reference (str) – Name of the reference dataset to use for time coordinates and metadata. Must be one of the keys of the dict returned by get_datasets_by_type(data_type).

Return type:

tuple[TypeVar(_DT, bound= AbstractAstrometryData | RVData), Array | None, tuple[str, ...] | None]

items()¶

(name, dataset) pairs.

Return type:: Iterator[tuple[str, AbstractAstrometryData | RVData]]

keys()¶

Dataset/component names.

Return type:: Iterator[str]

plot(ax=None, *, add_legend=True, color_cycler=None, **kwargs)¶

Plot all contained datasets on the same axes.

Dispatches to each dataset’s .plot() method, drawing all components onto a single axes panel with a legend showing the names. Each dataset is assigned a distinct color from color_cycler (or the current axes.prop_cycle when not specified).

This base implementation does not check that the contained datasets share a concrete type; concrete subclasses are responsible for any preconditions (SystemData enforces homogeneity at construction; SourceData validates at call time).

Parameters:

ax (Any) – The matplotlib.axes.Axes instance to draw on. If None, a new figure is created.
add_legend (bool) – Whether to add a legend labelled by component name. Default: True.
color_cycler (Any) – A cycler.Cycler whose "color" key supplies per-component colors. When None (default), colors are taken from the current axes.prop_cycle rcParam.
**kwargs (Any) – Forwarded to each component’s .plot() method. A color keyword here overrides the cycler for all components.

Returns:

The matplotlib.axes.Axes instance.

Examples

>>> import matplotlib.pyplot as plt
>>> from unxt import Q
>>> from harv import RVData
>>> from harv.data.containers import SystemData
>>> sys_data = SystemData(
...     primary=RVData(
...         time=Q([0.0, 50.0], "day"),
...         rv=Q([10.0, -10.0], "km/s"),
...         rv_err=Q([0.5, 0.5], "km/s"),
...     ),
...     secondary=RVData(
...         time=Q([0.0, 50.0], "day"),
...         rv=Q([-10.0, 10.0], "km/s"),
...         rv_err=Q([0.5, 0.5], "km/s"),
...     ),
... )
>>> ax = sys_data.plot()
>>> plt.close("all")
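
The color assignment described for color_cycler and the color keyword can be sketched without matplotlib. The helper assign_colors and its arguments are hypothetical; the only behavior taken from the text above is that each component receives a distinct color from a cycle and that an explicit color overrides the cycle for all components.

```python
from itertools import cycle


def assign_colors(names, colors, override=None):
    """Hypothetical helper: map each component name to a color."""
    if override is not None:
        # A single explicit color wins over the cycle for every component.
        return {name: override for name in names}
    if not colors:
        raise ValueError("colors must not be empty")
    palette = cycle(colors)  # wraps around if there are more names than colors
    return {name: next(palette) for name in names}


# With matplotlib installed, the default list could be read via
# plt.rcParams["axes.prop_cycle"].by_key()["color"].
colors = assign_colors(["primary", "secondary", "tertiary"], ["C0", "C1"])
```

Note that with more components than colors the cycle repeats, so two components can end up sharing a color.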

stacked()¶

Stack all component datasets.

Return type:: AbstractAstrometryData | RVData

stacked_by_type(data_type)¶

Stack all datasets of the requested type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
Return type:: TypeVar(_DT, bound= AbstractAstrometryData | RVData)

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     wiyn_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.stacked_by_type(RVData)
RVData(...)

property t_ref: Real[Quantity[PhysicalType('time')], ''] | None¶

Reference epoch shared by all contained datasets.

Guaranteed to be consistent across components because every concrete subclass calls _synchronize_t_refs() in its __init__.

values()¶

Dataset/component values.

Return type:: Iterator[AbstractAstrometryData | RVData]

harv.data.build_indicator_matrix(datasets, reference)¶

Build indicator matrix for multi-survey data of the same type.

Parameters:

datasets (dict[str, TypeVar(DT, bound= AbstractData)]) – Ordered mapping of instrument name -> dataset. Dict order must match the order used when stacking (see stack_datasets()).
reference (str) – Name of the reference instrument (its observations get no offset column).

Return type:

tuple[TypeVar(DT, bound= AbstractData), Array | None, tuple[str, ...] | None]

Returns:

stacked (DT) – Stacked dataset containing all observations.
indicator (jax.Array | None) – Shape (n_obs_total, n_non_ref). indicator[i, j] = 1 when observation i belongs to non-reference instrument j.
instrument_names (tuple[str, …] | None) – Names of the non-reference instruments, in column order.

Examples

>>> from unxt import Q
>>> from harv.data import RVData
>>> from harv.data.helpers import build_indicator_matrix
>>> rv1 = RVData(
...     time=Q([0.0, 50.0], "day"),
...     rv=Q([1.0, -2.0], "km/s"),
...     rv_err=Q([0.5, 0.5], "km/s"),
... )
>>> rv2 = RVData(
...     time=Q([10.0, 60.0], "day"),
...     rv=Q([0.5, -1.5], "km/s"),
...     rv_err=Q([0.3, 0.3], "km/s"),
... )
>>> stacked, indicator, names = build_indicator_matrix(
...     {"survey1": rv1, "survey2": rv2}, reference="survey1",
... )
>>> stacked.n_times
4
>>> names
('survey2',)
>>> indicator.shape
(4, 1)
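
To make the layout of the indicator concrete, here is a pure-Python sketch that builds the same kind of matrix from observation counts alone. The function indicator_matrix is hypothetical and works on nested lists rather than jax arrays; it assumes entries are 0 wherever they are not 1, which the description above implies but does not state.

```python
def indicator_matrix(counts, reference):
    """Hypothetical sketch: counts maps instrument name -> number of observations."""
    if reference not in counts:
        raise KeyError(reference)
    # Columns follow dict order, with the reference instrument left out.
    names = tuple(name for name in counts if name != reference)
    rows = []
    for owner, n_obs in counts.items():  # dict order = stacking order
        # One column per non-reference instrument; reference rows stay all-zero.
        row = [1.0 if owner == name else 0.0 for name in names]
        rows.extend(list(row) for _ in range(n_obs))
    return rows, names


rows, names = indicator_matrix({"survey1": 2, "survey2": 2, "survey3": 1}, "survey1")
```

Each row has at most one non-zero entry, so multiplying the matrix by a vector of per-instrument offsets adds the matching offset to every non-reference observation and nothing to the reference ones.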

harv.data.stack_datasets(datasets)¶

Concatenate multiple datasets in dict order into a single one.

Parameters:: datasets (dict[str, TypeVar(DT, bound= AbstractData)]) – Ordered mapping of instrument name -> dataset. Dict order determines the row order in the stacked output; it must match the order used when building the indicator matrix (see build_indicator_matrix()).
Return type:: TypeVar(DT, bound= AbstractData)

Examples

>>> from unxt import Q
>>> from harv.data import RVData
>>> from harv.data.helpers import stack_datasets
>>> rv1 = RVData(
...     time=Q([0.0, 50.0], "day"),
...     rv=Q([1.0, -2.0], "km/s"),
...     rv_err=Q([0.5, 0.5], "km/s"),
... )
>>> rv2 = RVData(
...     time=Q([10.0, 60.0], "day"),
...     rv=Q([0.5, -1.5], "km/s"),
...     rv_err=Q([0.3, 0.3], "km/s"),
... )
>>> stacked = stack_datasets({"instr1": rv1, "instr2": rv2})
>>> stacked.n_times
4
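
The dependence on dict order can be illustrated with plain lists. In this sketch each dataset is a dict of columns and stack_columns is a hypothetical stand-in for the real function; it only demonstrates that rows appear in the order the datasets were inserted.

```python
def stack_columns(datasets):
    """Hypothetical sketch: datasets maps name -> {field: list of values}."""
    fields = None
    stacked = {}
    for name, columns in datasets.items():  # insertion order drives row order
        if fields is None:
            fields = tuple(columns)
            stacked = {field: [] for field in fields}
        elif set(columns) != set(fields):
            raise ValueError(f"dataset {name!r} has different fields")
        for field in fields:
            stacked[field].extend(columns[field])
    return stacked


rv1 = {"time": [0.0, 50.0], "rv": [1.0, -2.0]}
rv2 = {"time": [10.0, 60.0], "rv": [0.5, -1.5]}
forward = stack_columns({"instr1": rv1, "instr2": rv2})
backward = stack_columns({"instr2": rv2, "instr1": rv1})
```

Swapping the insertion order swaps the row blocks, which is why the same ordered mapping should be used for both stacking and building the indicator matrix.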

Submodules¶

harv.data.containers module¶

Dataset containers for multi-component and multi-instrument data.

class harv.data.containers.AbstractDatasetContainer¶

Bases: Module

Base class providing a dict-like interface over named datasets.

Subclasses (SystemData, SourceData) share this common interface but carry different semantic meaning.

Parameters:: _datasets (dict[str, AbstractAstrometryData | RVData])

property t_ref: Real[Quantity[PhysicalType('time')], ''] | None¶

Reference epoch shared by all contained datasets.

Guaranteed to be consistent across components because every concrete subclass calls _synchronize_t_refs() in its __init__.

keys()¶

Dataset/component names.

Return type:: Iterator[str]

values()¶

Dataset/component values.

Return type:: Iterator[AbstractAstrometryData | RVData]

items()¶

(name, dataset) pairs.

Return type:: Iterator[tuple[str, AbstractAstrometryData | RVData]]

get_datasets_by_type(data_type)¶

Get all datasets/components of a specific data type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by.
Return type:: dict[str, TypeVar(_DT, bound= AbstractAstrometryData | RVData)]

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.get_datasets_by_type(RVData)
{'keck_rv': RVData(...)}
>>> source_data.get_datasets_by_type(GaiaAstrometryData)
{'gaia': GaiaAstrometryData(...)}

stacked_by_type(data_type)¶

Stack all datasets of the requested type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
Return type:: TypeVar(_DT, bound= AbstractAstrometryData | RVData)

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     wiyn_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.stacked_by_type(RVData)
RVData(...)

indicator_data_by_type(data_type, reference)¶

Return stacked data and indicator flags for one dataset type.

Parameters:

data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
reference (str) – Name of the reference dataset to use for time coordinates and metadata. Must be one of the keys of the dict returned by get_datasets_by_type(data_type).

Return type:

tuple[TypeVar(_DT, bound= AbstractAstrometryData | RVData), Array | None, tuple[str, ...] | None]

plot(ax=None, *, add_legend=True, color_cycler=None, **kwargs)¶

Plot all contained datasets on the same axes.

Parameters:

ax (Any) – The matplotlib.axes.Axes instance to draw on. If None, a new figure is created.
add_legend (bool) – Whether to add a legend labelled by component name. Default: True.
color_cycler (Any) – A cycler.Cycler whose "color" key supplies per-component colors. When None (default), colors are taken from the current axes.prop_cycle rcParam.
**kwargs (Any) – Forwarded to each component’s .plot() method. A color keyword here overrides the cycler for all components.

Returns:

The matplotlib.axes.Axes instance.

Examples

>>> import matplotlib.pyplot as plt
>>> from unxt import Q
>>> from harv import RVData
>>> from harv.data.containers import SystemData
>>> sys_data = SystemData(
...     primary=RVData(
...         time=Q([0.0, 50.0], "day"),
...         rv=Q([10.0, -10.0], "km/s"),
...         rv_err=Q([0.5, 0.5], "km/s"),
...     ),
...     secondary=RVData(
...         time=Q([0.0, 50.0], "day"),
...         rv=Q([-10.0, 10.0], "km/s"),
...         rv_err=Q([0.5, 0.5], "km/s"),
...     ),
... )
>>> ax = sys_data.plot()
>>> plt.close("all")

__init__(_datasets)¶

Parameters:: _datasets (dict[str, AbstractAstrometryData | RVData])
Return type:: None

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

class harv.data.containers.SourceData¶

Container for multiple named datasets for a single source.

Accepts arbitrary named datasets via keyword arguments. Names are user-defined and can be anything (e.g., gaia, keck_rv, hst_imaging).

Parameters:: datasets (AbstractAstrometryData | RVData)

__init__(**datasets)¶

Parameters:: datasets (AbstractAstrometryData | RVData)
Return type:: None

plot(*args, **kwargs)¶

Plot all datasets on a single axes.

Parameters mirror AbstractDatasetContainer.plot().

Raises:

TypeError – If the contained datasets are not all of the same concrete type.

Parameters:

args (Any)
kwargs (Any)

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

get_datasets_by_type(data_type)¶

Get all datasets/components of a specific data type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by.
Return type:: dict[str, TypeVar(_DT, bound= AbstractAstrometryData | RVData)]

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.get_datasets_by_type(RVData)
{'keck_rv': RVData(...)}
>>> source_data.get_datasets_by_type(GaiaAstrometryData)
{'gaia': GaiaAstrometryData(...)}

indicator_data_by_type(data_type, reference)¶

Return stacked data and indicator flags for one dataset type.

Parameters:

data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
reference (str) – Name of the reference dataset to use for time coordinates and metadata. Must be one of the keys of the dict returned by get_datasets_by_type(data_type).

Return type:

tuple[TypeVar(_DT, bound= AbstractAstrometryData | RVData), Array | None, tuple[str, ...] | None]

items()¶

(name, dataset) pairs.

Return type:: Iterator[tuple[str, AbstractAstrometryData | RVData]]

keys()¶

Dataset/component names.

Return type:: Iterator[str]

stacked_by_type(data_type)¶

Stack all datasets of the requested type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
Return type:: TypeVar(_DT, bound= AbstractAstrometryData | RVData)

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     wiyn_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.stacked_by_type(RVData)
RVData(...)

property t_ref: Real[Quantity[PhysicalType('time')], ''] | None¶

Reference epoch shared by all contained datasets.

Guaranteed to be consistent across components because every concrete subclass calls _synchronize_t_refs() in its __init__.

values()¶

Dataset/component values.

Return type:: Iterator[AbstractAstrometryData | RVData]

class harv.data.containers.SystemData¶

Container for a multi-component system.

Every named component must be an instance of the same concrete data class. Each component represents observations of a distinct physical body or photocenter in a gravitationally bound system.

Parameters:: datasets (AbstractAstrometryData | RVData)

__init__(**datasets)¶

Parameters:: datasets (AbstractAstrometryData | RVData)
Return type:: None

property dataset_type: type[AbstractData]¶: Concrete dataset class shared by all components.

stacked()¶

Stack all component datasets.

Return type:: AbstractAstrometryData | RVData

indicator_data(reference)¶

Return stacked data and component-indicator flags.

Parameters:: reference (str)
Return type:: tuple[AbstractAstrometryData | RVData, Array | None, tuple[str, ...] | None]

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

get_datasets_by_type(data_type)¶

Get all datasets/components of a specific data type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by.
Return type:: dict[str, TypeVar(_DT, bound= AbstractAstrometryData | RVData)]

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.get_datasets_by_type(RVData)
{'keck_rv': RVData(...)}
>>> source_data.get_datasets_by_type(GaiaAstrometryData)
{'gaia': GaiaAstrometryData(...)}

indicator_data_by_type(data_type, reference)¶

Return stacked data and indicator flags for one dataset type.

Parameters:

data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
reference (str) – Name of the reference dataset to use for time coordinates and metadata. Must be one of the keys of the dict returned by get_datasets_by_type(data_type).

Return type:

tuple[TypeVar(_DT, bound= AbstractAstrometryData | RVData), Array | None, tuple[str, ...] | None]

items()¶

(name, dataset) pairs.

Return type:: Iterator[tuple[str, AbstractAstrometryData | RVData]]

keys()¶

Dataset/component names.

Return type:: Iterator[str]

plot(ax=None, *, add_legend=True, color_cycler=None, **kwargs)¶

Plot all contained datasets on the same axes.

Parameters:

ax (Any) – The matplotlib.axes.Axes instance to draw on. If None, a new figure is created.
add_legend (bool) – Whether to add a legend labelled by component name. Default: True.
color_cycler (Any) – A cycler.Cycler whose "color" key supplies per-component colors. When None (default), colors are taken from the current axes.prop_cycle rcParam.
**kwargs (Any) – Forwarded to each component’s .plot() method. A color keyword here overrides the cycler for all components.

Returns:

The matplotlib.axes.Axes instance.

Examples

>>> import matplotlib.pyplot as plt
>>> from unxt import Q
>>> from harv import RVData
>>> from harv.data.containers import SystemData
>>> sys_data = SystemData(
...     primary=RVData(
...         time=Q([0.0, 50.0], "day"),
...         rv=Q([10.0, -10.0], "km/s"),
...         rv_err=Q([0.5, 0.5], "km/s"),
...     ),
...     secondary=RVData(
...         time=Q([0.0, 50.0], "day"),
...         rv=Q([-10.0, 10.0], "km/s"),
...         rv_err=Q([0.5, 0.5], "km/s"),
...     ),
... )
>>> ax = sys_data.plot()
>>> plt.close("all")

stacked_by_type(data_type)¶

Stack all datasets of the requested type.

Parameters:: data_type (type[TypeVar(_DT, bound= AbstractAstrometryData | RVData)]) – Concrete data class (e.g. RVData, GaiaAstrometryData) to filter by before stacking.
Return type:: TypeVar(_DT, bound= AbstractAstrometryData | RVData)

Examples

>>> from harv.data.datasets import RVData, GaiaAstrometryData
>>> from harv.data.containers import SourceData
>>> source_data = SourceData(
...     keck_rv=RVData(...),
...     wiyn_rv=RVData(...),
...     gaia=GaiaAstrometryData(...),
... )
>>> source_data.stacked_by_type(RVData)
RVData(...)

property t_ref: Real[Quantity[PhysicalType('time')], ''] | None¶

Reference epoch shared by all contained datasets.

Guaranteed to be consistent across components because every concrete subclass calls _synchronize_t_refs() in its __init__.

values()¶

Dataset/component values.

Return type:: Iterator[AbstractAstrometryData | RVData]

harv.data.datasets module¶

Observation data classes for time series data.

class harv.data.datasets.AbstractAstrometryData¶

Bases: AbstractData

Abstract base class for astrometric data.

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

__init__(time, *, t_ref=None)¶

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

Return type:

None

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

property n_times: int¶: Number of times / epochs / observations.

t_ref: Real[Quantity[PhysicalType('time')], ''] | None = None¶: Reference epoch. If None, uses mean observation time.

time: Real[Quantity[PhysicalType('time')], 'n']¶: Barycentric TCB times.

class harv.data.datasets.AbstractData¶

Bases: Module

Abstract base class for observational data time series.

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

time: Real[Quantity[PhysicalType('time')], 'n']¶: Barycentric TCB times.

t_ref: Real[Quantity[PhysicalType('time')], ''] | None = None¶: Reference epoch. If None, uses mean observation time.

property n_times: int¶: Number of times / epochs / observations.

__init__(time, *, t_ref=None)¶

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

Return type:

None

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

class harv.data.datasets.GaiaAstrometryData¶

Bases: AbstractAstrometryData

Gaia epoch astrometry (along-scan measurements).

Examples

>>> import jax.numpy as jnp
>>> from unxt import Q
>>> from harv import GaiaAstrometryData
>>> data = GaiaAstrometryData(
...     time=Q([0.0, 100.0, 200.0], "day"),
...     al_position=Q([0.1, -0.2, 0.05], "mas"),
...     al_position_err=Q([0.01, 0.01, 0.01], "mas"),
...     scan_angle=Q([0.5, 1.2, 2.8], "rad"),
...     parallax_factor=jnp.array([0.3, -0.1, 0.4]),
... )
>>> data.n_times
3

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
al_position (Real[Quantity[PhysicalType('angle')], 'n'])
al_position_err (Real[Quantity[PhysicalType('angle')], 'n'])
scan_angle (Real[Quantity[PhysicalType('angle')], 'n'])
parallax_factor (Float[Array, 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

al_position: Real[Quantity[PhysicalType('angle')], 'n']¶: Along-scan position.

al_position_err: Real[Quantity[PhysicalType('angle')], 'n']¶: Along-scan uncertainty.

scan_angle: Real[Quantity[PhysicalType('angle')], 'n']¶: Per-CCD scan angle.

parallax_factor: Float[Array, 'n']¶: AL parallax factors.

plot(ax=None, *, al_unit=None, add_labels=True, relative_to_t_ref=False, **kwargs)¶

Plot along-scan residuals vs time.

Parameters:

ax (Any) – matplotlib.axes.Axes instance to draw on. If None, uses plt.gca().
al_unit (str | None) – Display unit for the along-scan position. Defaults to the data’s own unit.
add_labels (bool) – Add axis labels.
relative_to_t_ref (bool) – Plot time relative to t_ref.
**kwargs (Any) – Passed to ax.errorbar(). Defaults can be overridden.

Returns:

The matplotlib.axes.Axes instance.

Examples

>>> import jax.numpy as jnp
>>> import matplotlib.pyplot as plt
>>> from unxt import Q
>>> from harv import GaiaAstrometryData
>>> data = GaiaAstrometryData(
...     time=Q([0.0, 100.0, 200.0], "day"),
...     al_position=Q([0.1, -0.2, 0.05], "mas"),
...     al_position_err=Q([0.01, 0.01, 0.01], "mas"),
...     scan_angle=Q([0.5, 1.2, 2.8], "rad"),
...     parallax_factor=jnp.array([0.3, -0.1, 0.4]),
... )
>>> ax = data.plot()
>>> plt.close("all")

__init__(time, al_position, al_position_err, scan_angle, parallax_factor, *, t_ref=None)¶

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
al_position (Real[Quantity[PhysicalType('angle')], 'n'])
al_position_err (Real[Quantity[PhysicalType('angle')], 'n'])
scan_angle (Real[Quantity[PhysicalType('angle')], 'n'])
parallax_factor (Float[Array, 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

Return type:

None

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

property n_times: int¶: Number of times / epochs / observations.

t_ref: Real[Quantity[PhysicalType('time')], ''] | None = None¶: Reference epoch. If None, uses mean observation time.

time: Real[Quantity[PhysicalType('time')], 'n']¶: Barycentric TCB times.

class harv.data.datasets.RVData¶

Bases: AbstractData

Radial velocity measurements.

Examples

>>> from unxt import Q
>>> from harv import RVData
>>> data = RVData(
...     time=Q([0.0, 50.0, 100.0], "day"),
...     rv=Q([1.0, -2.0, 0.5], "km/s"),
...     rv_err=Q([0.5, 0.5, 0.5], "km/s"),
... )
>>> data.n_times
3

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
rv (Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])
rv_err (Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

__init__(time, rv, rv_err, *, t_ref=None)¶

Parameters:

time (Real[Quantity[PhysicalType('time')], 'n'])
rv (Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])
rv_err (Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n'])
t_ref (Real[Quantity[PhysicalType('time')], ''] | None)

Return type:

None

static __new__(cls, *args, **kwargs)¶

Parameters:

cls (type[TypeVar(_ModuleT, bound= Module)])
args (object)
kwargs (object)

Return type:

TypeVar(_ModuleT, bound= Module)

property n_times: int¶: Number of times / epochs / observations.

t_ref: Real[Quantity[PhysicalType('time')], ''] | None = None¶: Reference epoch. If None, uses mean observation time.

time: Real[Quantity[PhysicalType('time')], 'n']¶: Barycentric TCB times.

rv: Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n']¶: Radial velocities.

rv_err: Real[Quantity[PhysicalType({'speed', 'velocity'})], 'n']¶: Radial velocity uncertainties.

plot(ax=None, *, rv_unit=None, add_labels=True, relative_to_t_ref=False, phase_fold=None, **kwargs)¶

Plot RV data as error bars.

Parameters:

ax (Any) – The matplotlib.axes.Axes instance to draw on. If None, uses plt.gca().
rv_unit (str | None) – Display unit for the RV axis. Defaults to the data’s own unit.
add_labels (bool) – Add axis labels.
relative_to_t_ref (bool) – Plot time relative to t_ref. Mutually exclusive with phase_fold.
phase_fold (Any | None) – If given, fold observations to orbital phase using this period: x = ((time - t_ref) / phase_fold) mod 1. Mutually exclusive with relative_to_t_ref.
**kwargs (Any) – Passed to ax.errorbar(). Defaults can be overridden.

Returns:

The matplotlib.axes.Axes instance.

Return type: