aggregate_count

aggregate_count(d, *, result_col='n', taxid_col='taxid')

Node count aggregation along branches.

Aggregates node counts in a DataFrame of taxonomy ids along the branches of the lifemap tree, so that the count of each taxon also includes its descendants present in d.

Parameters

Name Type Description Default
d pd.DataFrame | pl.DataFrame DataFrame to aggregate data from. required
result_col str Name of the column created to store the counts, by default “n”. 'n'
taxid_col str Name of the column of d containing taxonomy ids, by default “taxid”. 'taxid'

Returns

Type Description
pl.DataFrame Aggregated DataFrame.

See Also

aggregate_num : aggregation of a numeric variable.

aggregate_freq : aggregation of the value counts of a categorical variable.

Examples

>>> from pylifemap import aggregate_count
>>> import polars as pl
>>> d = pl.DataFrame({"taxid": [33154, 33090, 2]})
>>> aggregate_count(d)
shape: (5, 2)
┌───────┬─────┐
│ taxid ┆ n   │
│ ---   ┆ --- │
│ i32   ┆ u32 │
╞═══════╪═════╡
│ 0     ┆ 3   │
│ 2     ┆ 1   │
│ 2759  ┆ 2   │
│ 33090 ┆ 1   │
│ 33154 ┆ 1   │
└───────┴─────┘
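
The counts in the example above can be reproduced by hand. The following is a minimal pure-Python sketch of one plausible reading of "aggregation along branches": every input taxid adds one to itself and to each of its ancestors up to the root. The `PARENT` map and the `count_along_branches` helper are hypothetical and written only for this illustration. The map is a toy subset inferred from the example output, not the actual lifemap tree data, and the sketch is not pylifemap's implementation.

```python
from collections import Counter

# Hypothetical toy parent map (child -> parent), inferred from the example
# output above. It is NOT the actual lifemap tree data.
PARENT = {33154: 2759, 33090: 2759, 2759: 0, 2: 0}


def count_along_branches(taxids, parent=PARENT, root=0):
    """Count each taxid once for itself and once for each of its ancestors."""
    counts = Counter()
    for taxid in taxids:
        node = taxid
        counts[node] += 1
        # Walk up the branch until the root is reached.
        while node != root:
            node = parent[node]
            counts[node] += 1
    # Sort by taxid to mirror the row order of the example output.
    return dict(sorted(counts.items()))


result = count_along_branches([33154, 33090, 2])
print(result)
```

Under these assumptions the root (taxid 0) receives a count of 3, because all three input taxids lie on branches below it, and taxid 2759 receives 2, because it is the parent of both 33090 and 33154 in the toy map.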