lamindb.curators.CxGCurator

class lamindb.curators.CxGCurator(dataset, schema_version, *, organism='human', defaults=None, extra_sources=None)

Bases: SlotsCurator

Curator for AnnData objects that should adhere to a specific CELLxGENE Schema version.

Parameters:
  • dataset (AnnData | Artifact) – The AnnData-like object to validate and annotate.

  • schema_version (Literal['4.0.0', '5.0.0', '5.1.0', '5.2.0', '5.3.0']) – A CELLxGENE Schema version that defines the validation constraints.

  • organism (Literal['human', 'mouse'], default: 'human') – The organism of the Schema.

  • defaults (dict[str, str], default: None) – Default values that are set if columns or column values are missing.

  • extra_sources (dict[str, SQLRecord], default: None) – A dictionary mapping .obs.columns to Source records. These extra sources are joined with the CELLxGENE fixed sources. Use this parameter when subclassing.

Example
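The following is a minimal sketch rather than the library's official example. It shows how the constructor arguments listed above fit together. The helper name build_curator and the curator_cls parameter are hypothetical, introduced only so the sketch runs without a configured lamindb instance. The allowed values for schema_version and organism are copied from the parameter list above.

```python
from typing import Literal, get_args

# Allowed values, copied from the documented signature.
SchemaVersion = Literal["4.0.0", "5.0.0", "5.1.0", "5.2.0", "5.3.0"]
Organism = Literal["human", "mouse"]


def build_curator(curator_cls, dataset, schema_version, *, organism="human", defaults=None):
    """Hypothetical helper: check the literal arguments, then call the constructor.

    curator_cls is assumed to accept the documented signature
    (dataset, schema_version, *, organism=..., defaults=...).
    """
    if schema_version not in get_args(SchemaVersion):
        raise ValueError(
            f"Unsupported schema_version {schema_version!r}; "
            f"expected one of {get_args(SchemaVersion)}"
        )
    if organism not in get_args(Organism):
        raise ValueError(
            f"Unsupported organism {organism!r}; expected one of {get_args(Organism)}"
        )
    # Positional and keyword-only arguments mirror the documented signature.
    return curator_cls(dataset, schema_version, organism=organism, defaults=defaults)
```

With lamindb installed and an instance connected, passing lamindb.curators.CxGCurator as curator_cls and an AnnData object as dataset would presumably build the real curator. That call is not exercised in this sketch.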

Attributes

property slots: dict[str, DataFrameCurator]

Access sub-curators by slot.

Methods

save_artifact(*, key=None, description=None, revises=None, run=None)

Save an annotated artifact.

Parameters:
  • key (str | None, default: None) – A path-like key to reference the artifact in the default storage, e.g., "myfolder/myfile.fcs". Artifacts with the same key form a version family.

  • description (str | None, default: None) – A description.

  • revises (Artifact | None, default: None) – Previous version of the artifact. An alternative to passing key for triggering a new version.

  • run (Run | None, default: None) – The run that creates the artifact.

Return type:

Artifact

Returns:

A saved artifact record.

validate()

Validate dataset against Schema.

Raises:

lamindb.errors.ValidationError – If validation fails.

Return type:

None
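The two methods above can be combined into a validate-then-save step. The sketch below assumes that one wants to call validate() explicitly and skip saving when it raises. The helper name validate_then_save and its validation_error parameter are hypothetical. In real use, validation_error would be lamindb.errors.ValidationError, the exception documented above.

```python
def validate_then_save(curator, validation_error, *, key=None, description=None):
    """Hypothetical helper: save the artifact only if validation passes.

    curator is assumed to expose validate() and save_artifact(...) as
    documented; validation_error is the exception type that validate() raises.
    """
    try:
        curator.validate()  # returns None on success, raises on failure
    except validation_error as exc:
        print(f"Validation failed, nothing saved: {exc}")
        return None
    # Only the documented keyword arguments are forwarded.
    return curator.save_artifact(key=key, description=description)
```

Passing the exception type as an argument keeps the sketch free of a lamindb import. Code that already imports lamindb could catch lamindb.errors.ValidationError directly instead.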