Analyze a sharded dataset
import lamindb as ln
import lnschema_bionty as lb
ln.track()
💡 loaded instance: testuser1/test-facs (lamindb 0.54.4)
💡 notebook imports: lamindb==0.54.4 lnschema_bionty==0.31.2 scanpy==1.9.5
💡 Transform(id='zzJzdgJ763Dyz8', name='Analyze a sharded dataset', short_name='facs3', version='0', type=notebook, updated_at=2023-10-01 16:44:53, created_by_id='DzTjkKse')
💡 Run(id='WFBzw6EMZ3lBgpeLPYfC', run_at=2023-10-01 16:44:53, transform_id='zzJzdgJ763Dyz8', created_by_id='DzTjkKse')
ln.Dataset.filter().df()
id | name | description | version | hash | reference | reference_type | transform_id | run_id | file_id | initial_version_id | updated_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|
8zlLWz5kwz4eVoYxBRwf | My versioned FACS dataset | None | 1 | Piw2n0vdnoNoAV7ZxgsW-g | None | None | OWuTtS4SAponz8 | TJweM0VKGkTQHyy8CZci | 8zlLWz5kwz4eVoYxBRwf | None | 2023-10-01 16:44:34 | DzTjkKse |
8zlLWz5kwz4eVoYxBREt | My versioned FACS dataset | None | 2 | dmrCH-OEK94Zbh7i51wn | None | None | SmQmhrhigFPLz8 | qh5Vw8DryjLToUWD3lqo | None | 8zlLWz5kwz4eVoYxBRwf | 2023-10-01 16:44:43 | DzTjkKse |
dataset = ln.Dataset.filter(name="My versioned FACS dataset", version="2").one()
adata = dataset.load(join="inner")
/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/anndata/_core/anndata.py:1838: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
utils.warn_names_duplicates("obs")
The AnnData object holds a reference to each individual file in its .obs annotations:
adata.obs.file_id.cat.categories
Index(['8zlLWz5kwz4eVoYxBRwf', 'rYTrXas2KdzpqLDUILvH'], dtype='object')
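Each shard keeps its own observation index, so names can collide after concatenation, which is what the warning above reports. The file_id column still tells the shards apart. The following is a minimal pure-Python sketch of one way to derive unique names by prefixing each observation with its shard id. The lists, the shard ids, and the helper `make_unique` are hypothetical stand-ins, not data from this instance or part of lamindb or anndata. The warning itself names anndata's own `.obs_names_make_unique` as the built-in route.

```python
from collections import Counter

# Hypothetical stand-ins for adata.obs_names and adata.obs.file_id
# after two shards were concatenated (not real data from this instance).
obs_names = ["0", "1", "2", "0", "1"]
file_ids = ["shard_a", "shard_a", "shard_a", "shard_b", "shard_b"]


def make_unique(names, shard_ids):
    """Prefix each observation name with the id of the shard it came from."""
    return [f"{shard}-{name}" for name, shard in zip(names, shard_ids)]


# Names that occur more than once across shards.
duplicates = [name for name, count in Counter(obs_names).items() if count > 1]

# Names that are unique as long as each shard's own index is unique.
unique_names = make_unique(obs_names, file_ids)

# Number of observations contributed by each shard.
rows_per_shard = Counter(file_ids)

print(duplicates)
print(unique_names)
print(dict(rows_per_shard))
```

Prefixing with the shard id keeps the provenance visible in the name itself, at the cost of longer index labels.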
By default, the intersection of features is used:
adata.var.index
Index(['CD57', 'Cd19', 'Cd4', 'CD8', 'CD3', 'CD27', 'Cd14', 'Ccr7', 'CD127',
'CD28'],
dtype='object')
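To illustrate what an inner join over features means, here is a small pure-Python sketch comparing an inner join (intersection) with an outer join (union) of feature names. The two marker panels and the helper `join_features` are hypothetical and only illustrate the idea. They do not reproduce the shards in this instance or the library's implementation, and the ordering rule (order of first appearance) is an assumption of the sketch.

```python
# Hypothetical marker panels for two shards (illustrative only).
shard_vars = {
    "shard_a": ["CD57", "Cd19", "Cd4", "CD8", "CD20"],
    "shard_b": ["Cd4", "CD8", "CD57", "Cd19", "CD45"],
}


def join_features(panels, join="inner"):
    """Return the feature names kept by an inner or outer join."""
    panels = [list(panel) for panel in panels]
    if join == "inner":
        # Keep only features measured in every shard, in first-panel order.
        shared = set(panels[0]).intersection(*panels[1:])
        return [name for name in panels[0] if name in shared]
    if join == "outer":
        # Keep every feature seen in any shard, in order of first appearance.
        seen = {}
        for panel in panels:
            for name in panel:
                seen.setdefault(name, None)
        return list(seen)
    raise ValueError(f"unknown join: {join!r}")


inner = join_features(shard_vars.values(), join="inner")
outer = join_features(shard_vars.values(), join="outer")

print(inner)
print(outer)
```

With an inner join every retained feature has a measured value in every shard. An outer join would keep all features but leave gaps where a shard did not measure them.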
Let us create a plot:
markers = lb.CellMarker.lookup()
import scanpy as sc
sc.pp.pca(adata)
sc.pl.pca(adata, color=markers.cd14.name, save="_cd14")
filepath = "figures/pca_cd14"
WARNING: saving figure to file figures/pca_cd14.pdf
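The warning shows where the figure ends up: the plot name and the `save` suffix are combined under a `figures` directory with a `.pdf` extension. A minimal sketch of composing that path follows. The helper `expected_figure_path` is hypothetical, and its defaults are assumptions taken from the warning line above rather than from scanpy's configuration.

```python
from pathlib import Path


def expected_figure_path(plot="pca", suffix="_cd14", figdir="figures", ext="pdf"):
    """Compose the path a saved plot is expected to have.

    The defaults mirror the warning printed above; they are assumptions,
    not values read from scanpy's settings.
    """
    return Path(figdir) / f"{plot}{suffix}.{ext}"


path = expected_figure_path()
print(path.as_posix())
```

Building the path in one place avoids repeating the string literal when the file is registered afterwards.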
file = ln.File("./figures/pca_cd14.pdf", description="My result on CD14")
file.save()
file.view_flow()
# clean up test instance
!lamin delete --force test-facs
!rm -r test-facs
💡 deleting instance testuser1/test-facs
✅ deleted instance settings file: /home/runner/.lamin/instance--testuser1--test-facs.env
✅ instance cache deleted
✅ deleted '.lndb' sqlite file
❗ consider manually deleting your stored data: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-facs