


Kolena Workflows are simplified in Kolena Datasets. If you are setting up your Kolena environments for the first time, please refer to the Datasets Quickstart.

Install Kolena to set up rigorous and repeatable model testing in minutes.

In this quickstart guide, we'll use the object_detection_2d example integration to demonstrate how to curate test data and test models in Kolena.

Install kolena

Install the kolena Python package with either pip or Poetry to interact with Kolena programmatically:

pip install kolena
poetry add kolena

Clone the Examples

The kolenaIO/kolena repository contains a number of example integrations that can be cloned and run directly.

To get started, clone the kolena repository:

git clone

With the repository cloned, let's set up the object_detection_2d example:

cd kolena/examples/workflow/object_detection_2d
poetry update && poetry install

Now we're up and running and can start creating test suites and testing models.

Create Test Suites

Each of the example integrations comes with scripts for two flows:

  1. Create test cases and test suite(s) from a source dataset
  2. Test model(s) on the created test suites

Before running these scripts, let's configure our environment by setting the KOLENA_TOKEN environment variable. Visit the Developer page to generate an API token, then copy and paste the code snippet into your environment:

export KOLENA_TOKEN="********"
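A missing token is an easy mistake to make before a long-running script, so it can help to check for it up front. The following is a minimal sketch using only the Python standard library; the helper name `require_kolena_token` is hypothetical and is not part of the kolena package.

```python
import os


def require_kolena_token(env=None):
    """Return the API token from KOLENA_TOKEN, or raise a clear error if it is unset.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    This helper is illustrative only and is not provided by the kolena package.
    """
    env = os.environ if env is None else env
    token = env.get("KOLENA_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "KOLENA_TOKEN is not set; generate a token on the Developer page and export it first"
        )
    return token


if __name__ == "__main__":
    try:
        require_kolena_token()
        print("KOLENA_TOKEN is set")
    except RuntimeError as err:
        print(f"error: {err}")
```

Running this in the same shell session as the export above should report that the token is set; a fresh shell without the export should report the error instead.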

We can now create test suites using the provided seeding script:

poetry run python3 object_detection_2d/

After this script has completed, we can visit the Test Suites page to view our newly created test suites.

In this object_detection_2d example, we've created test suites that stratify the COCO 2014 validation set (which is stored as a CSV in S3) into test cases by brightness and bounding box size. In this example, we will be looking at the following labels:

["bicycle", "car", "motorcycle", "bus", "train", "truck", "traffic light", "fire hydrant", "stop sign"]
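To make the idea of stratification concrete, here is a rough sketch of grouping ground-truth boxes into size-based buckets, which is the kind of split a test case represents. It is not the seeding script's actual logic: the record fields (`label`, `width`, `height`) are hypothetical, and the small/medium/large thresholds are an assumption borrowed from the COCO evaluation convention (areas of 32² and 96² pixels).

```python
from collections import defaultdict

# Labels of interest, taken from the list above.
LABELS = {
    "bicycle", "car", "motorcycle", "bus", "train",
    "truck", "traffic light", "fire hydrant", "stop sign",
}

# Assumed thresholds (box area in pixels^2), following the COCO small/medium/large
# convention. The example integration may use different cutoffs.
SMALL_MAX_AREA = 32 ** 2
MEDIUM_MAX_AREA = 96 ** 2


def size_bucket(width, height):
    """Classify a bounding box as small, medium, or large by its area."""
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def stratify_by_box_size(boxes):
    """Group boxes whose label is of interest into buckets keyed by size."""
    buckets = defaultdict(list)
    for box in boxes:
        if box["label"] not in LABELS:
            continue  # ignore labels outside the example's scope
        buckets[size_bucket(box["width"], box["height"])].append(box)
    return dict(buckets)


# Hypothetical records; real data would come from the dataset CSV.
boxes = [
    {"label": "car", "width": 20, "height": 20},
    {"label": "bus", "width": 60, "height": 50},
    {"label": "truck", "width": 200, "height": 100},
    {"label": "person", "width": 50, "height": 50},
]

for name, members in sorted(stratify_by_box_size(boxes).items()):
    print(name, [b["label"] for b in members])
```

The same pattern applies to brightness: replace `size_bucket` with a function that maps an image's brightness measure to a bucket name.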

Test a Model

After we've created test suites, the final step is to test models on them. The object_detection_2d example provides the following models to choose from: yolo_r, yolo_x, mask_rcnn, faster_rcnn, yolo_v4s, and yolo_v3. For this step, we'll test yolo_v4s:

poetry run python3 object_detection_2d/ "yolo_v4s"

Note: Testing additional models

In this example, model results have already been extracted and are stored in CSV files in S3. To run a new model, plug it into the infer method in
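As a rough illustration of what an infer-style function backed by precomputed results might look like, the sketch below loads detections from CSV text and returns them per image. Everything specific here is an assumption: the column names, the sample rows, and the `make_infer` helper are hypothetical and do not reflect the example's actual files or function signature. A new model would replace the lookup with a real forward pass.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout and rows; the example's real result files may differ.
SAMPLE_CSV = """locator,label,confidence,min_x,min_y,max_x,max_y
s3://bucket/a.jpg,car,0.91,10,20,110,90
s3://bucket/a.jpg,bus,0.40,0,0,50,60
s3://bucket/b.jpg,truck,0.77,5,5,200,150
"""


def load_results(csv_text):
    """Parse precomputed detections into a dict keyed by image locator."""
    results = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        results[row["locator"]].append(
            {
                "label": row["label"],
                "confidence": float(row["confidence"]),
                "box": tuple(float(row[k]) for k in ("min_x", "min_y", "max_x", "max_y")),
            }
        )
    return dict(results)


def make_infer(results):
    """Build an infer function that looks up stored detections for an image."""

    def infer(locator):
        # A new model would run inference on the image here instead of a lookup.
        return results.get(locator, [])

    return infer


infer = make_infer(load_results(SAMPLE_CSV))
print(len(infer("s3://bucket/a.jpg")), "detections for a.jpg")
```

Keeping the lookup behind a single function means the rest of the testing flow does not need to know whether results are cached or computed on the fly.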

Once this script has completed, click the results link in your console or visit Results to view the test results for this newly tested model.


In this quickstart, we used an example integration from kolenaIO/kolena to create test suites from the COCO dataset and test the open-source yolo_v4s model on these test suites.

This example shows us how to define an ML problem as a workflow for testing in Kolena, and can be arbitrarily extended with additional metrics, plots, visualizations, and data.