aggregate_count
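aggregate_count(d, *, result_col='n', taxid_col='taxid')

Node count aggregation along branches.

Aggregates node counts in a DataFrame of taxonomy ids along the branches of the lifemap tree. As the example below shows, the result contains one row per input taxid and per ancestor of an input taxid, and each count is the number of input taxids located at or below that node.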
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| d | pd.DataFrame \| pl.DataFrame | DataFrame to aggregate data from. | required |
| result_col | str | Name of the column created to store the counts, by default “n”. | 'n' |
| taxid_col | str | Name of the column in d containing taxonomy ids, by default “taxid”. | 'taxid' |
Returns
| Name | Type | Description |
|---|---|---|
| | pl.DataFrame | Aggregated DataFrame. |
See also
aggregate_num : aggregation of a numeric variable.
aggregate_freq : aggregation of the value counts of a categorical variable.
Examples
>>> from pylifemap import aggregate_count
>>> import polars as pl
>>> d = pl.DataFrame({"taxid": [33154, 33090, 2]})
>>> aggregate_count(d)
shape: (5, 2)
┌───────┬─────┐
│ taxid ┆ n   │
│ ---   ┆ --- │
│ i32   ┆ u32 │
╞═══════╪═════╡
│ 0     ┆ 3   │
│ 2     ┆ 1   │
│ 2759  ┆ 2   │
│ 33090 ┆ 1   │
│ 33154 ┆ 1   │
└───────┴─────┘
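
To make the counts in the output above easier to read, here is a minimal pure-Python sketch of the same idea: each input taxid adds one to its own count and to the count of every ancestor up to the root. This is not the pylifemap implementation. The `TOY_PARENT` map and the `toy_aggregate_count` helper are hypothetical, and the parent links are only inferred from the example output (taxid 0 is assumed to be the root), not read from the real lifemap tree.

```python
from collections import Counter

# Hypothetical child -> parent links, inferred from the example output above.
# This is an assumption for illustration, not the real lifemap tree data.
TOY_PARENT = {2: 0, 2759: 0, 33090: 2759, 33154: 2759}


def toy_aggregate_count(taxids, parent=TOY_PARENT, root=0):
    """Count, for each node, how many input taxids sit at or below it."""
    counts = Counter()
    for taxid in taxids:
        node = taxid
        counts[node] += 1  # the taxid itself
        while node != root:  # then every ancestor up to the root
            node = parent[node]
            counts[node] += 1
    # Sort by taxid so the result is easy to compare with the table above.
    return dict(sorted(counts.items()))


result = toy_aggregate_count([33154, 33090, 2])
print(result)
```

With these assumed links, the sketch yields the same counts as the table above: taxid 2759 is reached from both 33090 and 33154, and the root 0 is reached from all three inputs.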