Getting started

Presentation

Note

This is a quick introduction to the features and usage of the package. See the usage page for more detailed information.

What is it?

pylifemap is a Python package that lets you visualize your own taxonomy data on the interactive tree of life provided by Lifemap.

It lets you create interactive visualizations such as this one:

import polars as pl
from pylifemap import Lifemap

# Load iucn dataset
iucn = pl.read_parquet(
    "https://raw.githubusercontent.com/Lifemap-ToL/pylifemap/main/data/iucn.parquet"
)

Lifemap(iucn).layer_points(radius=5, opacity=0.8).show()

How do I use it?

pylifemap visualizations are Jupyter widgets and are best created inside a notebook environment such as Jupyter, marimo or Quarto. These environments let you build documents that combine text, code and code output, including plots and interactive widgets, and let you develop computations and visualizations interactively.

It is also possible to generate pylifemap visualizations from a Python script. In this case the interactive widget opens directly in your web browser.

Installation

You can install pylifemap like any other Python package. The recommended way is via pip or uv:

# Install pylifemap with pip, preferably in a virtual environment
pip install pylifemap
# Or add pylifemap to a Python project managed with uv
uv add pylifemap

To get an overview of the package without installing anything, you can also open our introduction notebook in Google Colab.

Creating a visualization

To create a Lifemap data visualization, follow these steps:

  1. Import Lifemap
  2. Prepare and load your data
  3. Initialize a Lifemap object
  4. Add visualization layers

1. Import Lifemap

To use pylifemap in a script or notebook, import Lifemap from the package with the following line:

from pylifemap import Lifemap

2. Prepare your data

The data you want to visualize on the Lifemap tree of life must be in a pandas or polars DataFrame. It must contain observations (species) as rows and variables as columns, and one column must contain the NCBI taxonomy identifier of each species.
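As a minimal sketch of this layout, the following builds a small pandas DataFrame by hand. The taxids are the NCBI identifiers of a few well-known model organisms; the name and group columns are illustrative variables chosen for this example and are not required by pylifemap.

```python
import pandas as pd

# One row per species, one column per variable. The "taxid" column holds
# NCBI taxonomy identifiers; "name" and "group" are illustrative variables.
species = pd.DataFrame(
    {
        "taxid": [9606, 10090, 3702, 562, 4932],
        "name": [
            "Homo sapiens",
            "Mus musculus",
            "Arabidopsis thaliana",
            "Escherichia coli",
            "Saccharomyces cerevisiae",
        ],
        "group": ["animal", "animal", "plant", "bacterium", "fungus"],
    }
)

print(species)
```

Because the identifier column is named taxid, which is the default column name, a table like this could be passed to Lifemap without any extra argument.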

In the following, we will use an example dataset generated from The IUCN Red List of Threatened Species. It is a Parquet file with the Red List category (in 2022) of more than 84,000 species.

We can import it as a polars or pandas DataFrame with the following code:

# With polars
import polars as pl

iucn = pl.read_parquet(
    "https://raw.githubusercontent.com/Lifemap-ToL/pylifemap/main/data/iucn.parquet"
)

# Or with pandas
import pandas as pd

iucn = pd.read_parquet(
    "https://raw.githubusercontent.com/Lifemap-ToL/pylifemap/main/data/iucn.parquet"
)

The resulting table has only two columns: taxid, which contains the species identifiers, and status, which contains the Red List category of each species.

iucn
shape: (84_981, 2)

taxid    status
i32      str
651506   "Data Deficient"
2803960  "Critically Endangered"
143610   "Critically Endangered"
2760993  "Least Concern"
72259    "Least Concern"
337230   "Least Concern"
442623   "Vulnerable"
2303643  "Critically Endangered"
442625   "Critically Endangered"
442626   "Least Concern"

For some visualizations you will have to aggregate a count of species or the values of a numerical or categorical variable along the tree branches. For this you can use one of the provided aggregation functions.
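To illustrate what aggregating a count along the tree branches means, here is a conceptual sketch in plain Python. It is not pylifemap's implementation: the tree, the node ids and the function names are made up for this example, and for real data the aggregation functions provided by the package should be used instead.

```python
from collections import Counter

# Toy tree stored as child -> parent. These ids are made up for illustration
# and are NOT real NCBI taxids; node 1 is the root.
PARENT = {2: 1, 3: 1, 4: 2, 5: 2, 6: 3}


def lineage(node, parent):
    """Yield the node itself, then each of its ancestors up to the root."""
    while node is not None:
        yield node
        node = parent.get(node)


def count_along_branches(observed, parent):
    """For every node, count the observed species located at or below it."""
    counts = Counter()
    for taxid in observed:
        # Each observation adds one to its own node and to all its ancestors.
        counts.update(lineage(taxid, parent))
    return dict(counts)


observed_species = [4, 5, 6]
counts = count_along_branches(observed_species, PARENT)
print(counts)
```

In this toy tree the root ends up with a count of 3, because all three observed species descend from it, while node 2 gets a count of 2 and node 3 a count of 1.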

3. Initialize a Lifemap object

The next step is to create a new Lifemap object. To do this, we pass it our DataFrame as well as the name of the column containing our taxonomy identifiers (see footnote 1). We also add .show(), which is needed to display the visualization.

Lifemap(iucn, taxid_col="taxid").show()
Warning: 779 taxids have not been found in Lifemap database.
Warning: 152 duplicated taxids have been found in the data.

For now, this only displays the Lifemap base map.

Note

When a new Lifemap object is initialized, pylifemap checks the data and displays a warning message if unknown or duplicated taxids are found.

You can get the full list of unknown or duplicated taxids by using the helper functions get_unknown_taxids and get_duplicated_taxids.
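The signatures of these helpers are not covered here. As a rough sketch that does not depend on them, duplicated taxids can also be spotted with plain pandas before building the map. The sample below reuses a few taxids from the table above, with one row repeated on purpose, and the choice to keep the first row of each taxid is only one possible clean-up, not a recommendation of the package.

```python
import pandas as pd

# Small made-up sample; taxid 442623 is repeated on purpose.
data = pd.DataFrame(
    {
        "taxid": [651506, 442623, 72259, 442623],
        "status": ["Data Deficient", "Vulnerable", "Least Concern", "Vulnerable"],
    }
)

# keep=False flags every occurrence of a repeated taxid, not only the later ones.
duplicated_rows = data[data["taxid"].duplicated(keep=False)]
duplicated_taxids = sorted(duplicated_rows["taxid"].unique().tolist())
print(duplicated_taxids)

# One possible clean-up: keep only the first row for each taxid.
deduplicated = data.drop_duplicates(subset="taxid", keep="first")
print(deduplicated)
```

Unknown taxids cannot be detected this way, since that check requires the Lifemap database.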

4. Add visualization layers

After initializing our Lifemap object, we must add visualization layers to create graphical representations of our data. For example, we can add a points layer which will add a colored point over each species in our dataset. To do that, we can add a call to layer_points():

Lifemap(iucn, taxid_col="taxid").layer_points().show()
Warning: 779 taxids have not been found in Lifemap database.
Warning: 152 duplicated taxids have been found in the data.

Some layers can represent the values of one of the columns of our dataset. For example, we can color our points depending on the values in the status column of iucn:

Lifemap(iucn, taxid_col="taxid").layer_points(fill="status").show()
Warning: 779 taxids have not been found in Lifemap database.
Warning: 152 duplicated taxids have been found in the data.

Note

There are many options to customize the base map or the layers; see the usage page for more information.

Exporting a visualization

Once you are satisfied with a visualization, there are several ways to export it:

  • At any time, you can use the “Export PNG” button on the widget to export the current view as a PNG image. You can use the “Fullscreen” button first to generate bigger images if needed.
  • You can use .save("file.html") instead of .show() to save the widget to an HTML file.
  • In a Jupyter notebook, you can save the notebook to HTML, which will include the widgets in the output.
  • You can also use Quarto to generate an HTML document embedding pylifemap widgets, or to convert a Jupyter notebook to an HTML file.

Footnotes

  1. If your column is named “taxid”, you can omit the taxid_col argument, as “taxid” is its default value.