get_duplicated_taxids

get_duplicated_taxids(data, taxid_col='taxid')

Get a list of the taxids that appear more than once in a data frame.

Parameters

Name Type Description Default
data pl.DataFrame | pd.DataFrame Pandas or polars data frame with the original data. required
taxid_col str Name of the column storing taxonomy ids. 'taxid'

Returns

Name Type Description
list Taxids that appear more than once in the taxid_col column.

See also

get_unknown_taxids : function to get a list of unknown taxids.

Examples

>>> from pylifemap import get_duplicated_taxids
>>> import polars as pl
>>> d = pl.DataFrame({"taxid_values": [2, 33154, 33090, 33090, 2], "value": [10, 5, 100, 1, 2]})
>>> get_duplicated_taxids(d, taxid_col="taxid_values")
[2, 33090]
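
To make "duplicated" concrete, the following is a minimal standard-library sketch of the same check applied to a plain list of ids. The helper name `find_duplicated_ids` is hypothetical and is not part of pylifemap; the library's actual implementation, and the order in which it returns duplicates, may differ.

```python
from collections import Counter


def find_duplicated_ids(ids):
    """Hypothetical helper (not part of pylifemap).

    Return the ids that occur more than once, in first-seen order.
    """
    counts = Counter(ids)  # Counter keeps keys in insertion order
    return [taxid for taxid, n in counts.items() if n > 1]


# Same taxid values as in the example above.
taxids = [2, 33154, 33090, 33090, 2]
duplicated = find_duplicated_ids(taxids)
print(duplicated)  # [2, 33090]
```

A taxid is reported once no matter how many times it repeats, so the result can be used directly to filter or aggregate the offending rows.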