aggregate_count
aggregate_count(d, *, result_col='n', taxid_col='taxid')
Node count aggregation along branches.
Aggregates node counts in a DataFrame of taxonomy ids along the branches of the lifemap tree.
Parameters
Name | Type | Description | Default |
---|---|---|---|
d | pd.DataFrame \| pl.DataFrame | DataFrame to aggregate data from. | required |
result_col | str | Name of the column created to store the counts, by default “n”. | 'n' |
taxid_col | str | Name of the d column containing taxonomy ids, by default “taxid”. | 'taxid' |
Returns
Name | Type | Description |
---|---|---|
 | pl.DataFrame | Aggregated DataFrame. |
See also
aggregate_num: aggregation of a numeric variable.
aggregate_freq: aggregation of the value counts of a categorical variable.
Examples
>>> from pylifemap import aggregate_count
>>> import polars as pl
>>> d = pl.DataFrame({"taxid": [33154, 33090, 2]})
>>> aggregate_count(d)
shape: (5, 2)
┌───────┬─────┐
│ taxid ┆ n   │
│ ---   ┆ --- │
│ i32   ┆ u32 │
╞═══════╪═════╡
│ 0     ┆ 3   │
│ 2     ┆ 1   │
│ 2759  ┆ 2   │
│ 33090 ┆ 1   │
│ 33154 ┆ 1   │
└───────┴─────┘
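
To make the idea of aggregating "along the branches" concrete, the following is a minimal pure-Python sketch that gives the same counts as the example above. It does not use pylifemap and is not the library's implementation. The `PARENT` map and the `toy_aggregate_count` helper are hypothetical, and the parent links are assumptions chosen only so that the toy tree matches the example output.

```python
from collections import Counter

# Hypothetical toy tree (child -> parent); 0 stands for the root.
# These links are an assumption made to mirror the example output,
# not data read from pylifemap.
PARENT = {2: 0, 2759: 0, 33090: 2759, 33154: 2759}


def toy_aggregate_count(taxids, parent=PARENT):
    """Count, for each node, how many input taxids lie at or below it."""
    counts = Counter()
    for taxid in taxids:
        node = taxid
        # Walk from the observed node up to the root,
        # crediting the node itself and each of its ancestors.
        while True:
            counts[node] += 1
            if node not in parent:
                break
            node = parent[node]
    return dict(sorted(counts.items()))


result = toy_aggregate_count([33154, 33090, 2])
print(result)
```

Each input taxid adds one to its own node and to every ancestor up to the root, which is why the root (taxid 0) ends up with a count of 3 and taxid 2759 with a count of 2.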