aggregate_count

aggregate_count(d, *, result_col='n', taxid_col='taxid')

Node count aggregation along branches.

Counts the taxonomy ids found in a DataFrame and aggregates these counts along the branches of the lifemap tree, so that each node's count includes the counts of its descendants.

Parameters

Name        Type                         Description                                       Default
d           pd.DataFrame | pl.DataFrame  DataFrame to aggregate data from.                 required
result_col  str                          Name of the column created to store the counts.   'n'
taxid_col   str                          Name of the column in d containing taxonomy ids.  'taxid'

Returns

Name Type Description
pl.DataFrame | pd.DataFrame Aggregated DataFrame in the same format as input.

See also

aggregate_num: aggregation of a numeric variable.

aggregate_freq: aggregation of the value counts of a categorical variable.

Examples

>>> from pylifemap import aggregate_count
>>> import polars as pl
>>> d = pl.DataFrame({"taxid": [33154, 33090, 2]})
>>> aggregate_count(d)
shape: (5, 2)
┌───────┬─────┐
│ taxid ┆ n   │
│ ---   ┆ --- │
│ i32   ┆ u32 │
╞═══════╪═════╡
│ 0     ┆ 3   │
│ 2     ┆ 1   │
│ 2759  ┆ 2   │
│ 33090 ┆ 1   │
│ 33154 ┆ 1   │
└───────┴─────┘
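
To illustrate what aggregating counts along branches means, the following is a minimal pure-Python sketch, not the pylifemap implementation. The `PARENT` mapping and the `count_along_branches` helper are hypothetical names introduced for illustration only; the child-to-parent links are an assumption chosen to match the five taxids in the example, not data read from the library.

```python
from collections import Counter

# Hypothetical child -> parent links for a tiny slice of the tree.
# Assumption for illustration only; not taken from pylifemap.
PARENT = {2: 0, 2759: 0, 33090: 2759, 33154: 2759}


def count_along_branches(taxids, parent=PARENT):
    """Count, for every node, the input taxids located at or below it."""
    counts = Counter()
    for taxid in taxids:
        node = taxid
        # Walk from the taxid up to the root, adding one to each node visited.
        while node is not None:
            counts[node] += 1
            node = parent.get(node)  # None once the root has been counted
    return dict(sorted(counts.items()))


counts = count_along_branches([33154, 33090, 2])
print(counts)
```

Under these assumed links, the root 0 receives a count of 3 because all three input taxids lie below it, while 2759 receives 2 because it is an ancestor of both 33090 and 33154.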