get_duplicated_taxids
get_duplicated_taxids(data, taxid_col='taxid')

Get a list of duplicated taxids in a data frame.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| data | pl.DataFrame \| pd.DataFrame | Pandas or polars dataframe with original data. | required |
| taxid_col | str | Name of the column storing taxonomy ids, by default “taxid”. | 'taxid' |
Returns
| Name | Type | Description |
|---|---|---|
| | list | Duplicated taxids |
See also
get_unknown_taxids : function to get a list of unknown taxids.
Examples
>>> from pylifemap import get_duplicated_taxids
>>> import polars as pl
>>> d = pl.DataFrame({"taxid_values": [2, 33154, 33090, 33090, 2], "value": [10, 5, 100, 1, 2]})
>>> get_duplicated_taxids(d, taxid_col="taxid_values")
[2, 33090]
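
To make explicit what "duplicated" means here, the following is a minimal plain-Python sketch of the same check: it returns every value that occurs more than once in a sequence. The helper name `duplicated_values` is hypothetical and is not part of pylifemap; the library's actual implementation, and the order of the list it returns, may differ.

```python
from collections import Counter


def duplicated_values(values):
    """Return the values that occur more than once, in first-seen order.

    Hypothetical helper for illustration only; not pylifemap's code.
    """
    counts = Counter(values)
    # Counter keeps insertion order, so results follow first appearance.
    return [value for value, n in counts.items() if n > 1]


taxids = [2, 33154, 33090, 33090, 2]
print(duplicated_values(taxids))  # [2, 33090]
```

Each duplicated value appears once in the result, however many times it occurs in the input.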