Welcome to LOST’s documentation!

_images/LOSTFeaturesIn40seconds.gif

LOST features in a nutshell.

About LOST

LOST (Label Objects and Save Time) is a flexible web-based framework for semi-automatic image annotation. It provides multiple annotation interfaces for fast image annotation.

LOST is flexible since it allows you to run user-defined annotation pipelines in which different annotation interfaces/tools and algorithms can be combined in one process.

It is web-based since the whole annotation process is visualized in your browser. You can quickly set up LOST with docker on your local machine or run it on a web server to make an annotation process available to your annotators around the world. LOST allows you to organize label trees, to monitor the state of an annotation process and to do annotations inside the browser.

LOST was especially designed to model semi-automatic annotation pipelines that speed up the annotation process. Such a semi-automatic process can be achieved by using AI-generated annotation proposals that are presented to an annotator inside the annotation tool.

Getting Started

Setup LOST

LOST releases are hosted on DockerHub and shipped in Containers. See LOST Quick Setup for more information.

Getting Data into LOST

Image Data

In the current version there is no GUI available to load images into LOST, so we will use the command line or a file explorer instead. An image dataset in LOST is just a folder with images. LOST will recognize all folders that are located at path_to_lost/data/data/media in your filesystem as datasets. In order to add your dataset, just copy it to the path above, e.g.:

# Copy your dataset into the LOST media folder
cp -r path/to/my/dataset path_to_lost/data/data/media

# It may be required to copy the dataset as super user, since the
# docker container that executes LOST runs as root
# and owns the media folder.
sudo cp -r path/to/my/dataset path_to_lost/data/data/media

LabelTrees

Labels are organized in LabelTrees. Each LabelLeaf needs to have at least a name. Optional information for a LabelLeaf is a description, an abbreviation and an external ID (e.g. from another system). LOST provides a GUI to create or edit LabelTrees, as well as a command line import for LabelTrees defined in a CSV file. In order to edit LabelTrees in LOST you need to log in as a user with the role Designer. After login, click on the Labels button in the left navigation bar.

Users, Groups and Roles

There are two main user roles in LOST: a Designer and an Annotator role. Both roles have different views and access to information. An Annotator's job is to work on annotation tasks that are assigned to them, while a Designer may do more advanced things plus everything an Annotator may do. For example, a Designer can start annotation pipelines and choose or edit LabelTrees for the annotation tasks.

Independent of its role, a user can be part of one or multiple user Groups. In this way annotation tasks can be assigned to Groups of users that work collaboratively on the same task.

In order to manage users and groups, click on the Users icon on the left menu bar. Please note that only users with the role Designer are allowed to manage users.

Starting an Annotation Pipeline

All annotation processes in LOST are modeled as pipelines. Such a pipeline defines the order in which specific pipeline elements will be executed. Possible elements are Datasources, Scripts, AnnotationTasks, DataExports and VisualOutputs.

Each version of LOST is equipped with a selection of standard pipelines that can be used as a quick start to annotate your data. In order to start an annotation pipeline you need to be logged in as a user with the role Designer and click on the Start Pipeline button in the left navigation bar. You will then see a table of pipelines that can be started.

After selecting a pipeline by clicking on a specific row in the table, you need to configure it. A visualization of the selected pipeline will be displayed. In most cases a Datasource is the first element of a pipeline. Click on it and select an available dataset. After a click on the OK button the pipeline element will turn green to indicate that the configuration was successful.

The next element you need to look for is an AnnotationTask. After clicking on it, a wizard will pop up and guide you through the configuration of this AnnotationTask. In the first step a name and instructions for the AnnotationTask can be defined. Click on the next button and select a user or group of users that should perform this AnnotationTask. Next, a LabelTree needs to be selected by clicking on a specific tree in the table. A visualization of the LabelTree will then be displayed. Here you can select a subset of labels that should be used for the AnnotationTask. The idea is that each parent leaf represents a category that can be selected to use all of its direct child leafs as labels. So if you click on a leaf, all direct child leafs will be used as possible labels for the AnnotationTask. It is possible to select multiple leafs as label categories. After selecting the label subset, click on OK and the configuration of this AnnotationTask is done.

Now visit all other elements that have not been configured yet (indicated by a yellow color) and move on to the next step in the wizard. Here you can enter a name and a description for your pipeline. After entering this information you can click on the checkmark symbol to get to the Start Pipe button. With a click on this button your annotation pipeline will be started :-)

You can monitor the state of all running pipelines on your Designer dashboard. To get to a specific pipeline click on the Dashboard button in the left navigation bar and select a pipeline in the table.

Annotate Your Images

Once your pipeline has requested all annotations for an AnnotationTask, the selected annotators will be able to work on it. If you are logged in as a user with the role Designer, you can now switch to the annotator view by clicking on the Annotator button in the upper right corner of your browser. You will be redirected to the annotator dashboard. If you are logged in as a user with the role Annotator, you see this dashboard directly after login.

Here you can see a table with all AnnotationTasks that are available for you. Click on a task you want to work on and you will be redirected to one of the annotation tools (see also the For Annotators chapter). Instructions will pop up and you are ready to annotate.

Download Your Annotation Results

All example pipelines in LOST have a Script element that will export your annotations to a CSV file when the annotation process has finished. To download this file go to the Designer dashboard that is part of the Designer view and select a pipeline. A visualization of the annotation process will be displayed. Look for a DataExport element and click on it. A pop up will appear that shows all files that are available for download. Now click on a file and the download will start.

LOST Quick Setup

LOST provides a quick_setup script that will configure LOST and instruct you how to start it. We designed this script for Linux environments, but it will also work on Windows host machines.

The quick_setup will import some out-of-the-box annotation pipelines and example label trees. When you start LOST, all required docker containers will be downloaded from DockerHub and started on your machine. The following containers are used in different combinations by the LOST quick_setup:

  • mysql ~124 MB download (extracted 372 MB)
  • rabbitmq ~90 MB download (extracted 149 MB)
  • lost ~1 GB download (extracted 2.95 GB)
  • lost-cv ~3 GB download (extracted 6.94 GB)
  • lost-cv-gpu ~4 GB download (extracted 8.33 GB), an nvidia docker container

There are three configurations that can be created with the quick_setup script:

  1. A standard config that starts the following containers: mysql, rabbitmq, lost, lost-cv. In this config you are able to run all annotation pipelines that are available. Semi-automatic pipelines that make use of AI will be executed on your CPU.
  2. A minimum configuration that starts the mysql, rabbitmq and lost containers. This config can only run simple annotation pipelines without AI, since no container with an environment to perform deep learning algorithms is installed. This setup requires the smallest amount of disk space on your machine.
  3. A gpu config that will allow you to execute our semi-automatic AI annotation pipelines on your nvidia gpu. The following containers will be downloaded: mysql, rabbitmq, lost and lost-cv-gpu.

Standard Setup

  1. Install docker on your machine or server:

    https://docs.docker.com/install/

  2. Install docker-compose:

    https://docs.docker.com/compose/install/

  3. Clone LOST:
    git clone https://github.com/l3p-cv/lost.git
    
  4. Run quick_setup script:
    cd lost/docker/quick_setup/
    # python3 quick_setup.py path/to/install/lost
    # If you want to install a specific release,
    # you can use the --release argument to do so.
    python3 quick_setup.py ~/lost
    
  5. Run LOST:

    Follow instructions of the quick_setup script, printed in the command line.

Minimum Setup (LOST only)

Note

No semi-automatic pipelines will be available for you. So almost no magic will happen here ;-)

  1. Install docker on your machine or server:

    https://docs.docker.com/install/

  2. Install docker-compose:

    https://docs.docker.com/compose/install/

  3. Clone LOST:
    git clone https://github.com/l3p-cv/lost.git
    
  4. Run quick_setup script:
    cd lost/docker/quick_setup/
    # python3 quick_setup.py path/to/install/lost -noai
    # If you want to install a specific release,
    # you can use the --release argument to do so.
    python3 quick_setup.py ~/lost -noai
    
  5. Run LOST:

    Follow instructions of the quick_setup script, printed in the command line.

LOST + GPU Worker

Note

You will need an nvidia GPU to use this setup. This setup also assumes that LOST and the GPU worker are running on the same host machine.

  1. Install docker on your machine or server:

    https://docs.docker.com/install/

  2. Install docker-compose:

    https://docs.docker.com/compose/install/

  3. Install nvidia docker:

    https://github.com/NVIDIA/nvidia-docker#quickstart

  4. Install nvidia-docker2:
    sudo apt-get update
    sudo apt-get install docker-ce nvidia-docker2
    sudo systemctl restart docker
    
  5. Clone LOST:
    git clone https://github.com/l3p-cv/lost.git
    
  6. Run quick_setup script:
    cd lost/docker/quick_setup/
    # python3 quick_setup.py path/to/install/lost -gpu
    # If you want to install a specific release,
    # you can use the --release argument to do so.
    python3 quick_setup.py ~/lost -gpu
    
  7. Run LOST:

    Follow instructions of the quick_setup script, printed in the command line.

Install LOST from backup

  1. Perform a full backup with sudo:

    sudo zip -r backup.zip ~/lost

  2. Install docker on your machine or server:

    https://docs.docker.com/install/

  3. Install docker-compose:

    https://docs.docker.com/compose/install/

  4. Clone LOST:
    git clone https://github.com/l3p-cv/lost.git
    
  5. Run quick_setup script:
    cd lost/docker/quick_setup/
    # python3 quick_setup.py path/to/install/lost
    # If you want to install a specific release,
    # you can use the --release argument to do so.
    python3 quick_setup.py ~/lost
    sudo rm -rf ~/lost
    # zip -r backup.zip ~/lost stores the paths relative to /,
    # so extract the archive back to /
    sudo unzip backup.zip -d /
    

  6. Make sure that the ~/lost/docker/.env file contains the proper absolute path to ~/lost in LOST_DATA and the proper LOST_DB_PASSWORD.
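
    A minimal sketch of the relevant entries (the path and password are examples, adjust them to your setup):

    LOST_DATA=/home/my_user/lost
    LOST_DB_PASSWORD=my_secret_db_password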

  7. Run LOST:
    Follow instructions of the quick_setup script, printed in the command line.

Migration Guide from 0.0.6 to 1.1.0

  1. Make these changes to the database:
_images/db-changes.png

Figure 1: The changes that need to be made manually.

  2. You also need to change your custom pipeline configuration files: backend/lost/pyapi/examples/pipes/<your_pipeline>/<config_file>.json

  3. Old unfinished tasks can become unfinishable, so we recommend creating a special user called ‘trash’ and, for all unfinished tasks, changing lost.anno_task#group_id to the ‘trash’ user group id from lost.user_groups.
  4. We also recommend clearing the lost.choosen_anno_task table.

Utf-8 char encoding fix

  1. Convert the database to utf-8: https://www.a2hosting.com/kb/developer-corner/mysql/convert-mysql-database-utf-8

    Or run the following on the lost database:
SET foreign_key_checks = 0;
ALTER TABLE anno_task CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE choosen_anno_task CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE data_export CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE datasource CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `group` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE image_anno CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE label CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE label_leaf CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `loop` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE pipe CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE pipe_element CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE pipe_template CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE required_label_leaf CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE result CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE result_link CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE role CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE script CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE track CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE two_d_anno CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE user CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE user_groups CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE user_roles CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE visual_output CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE worker CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER DATABASE CHARACTER SET utf8 COLLATE utf8_general_ci;
SET foreign_key_checks = 1;
  2. Change the DB name in your .env to:
    LOST_DB_NAME=lost?charset=utf8mb4
    

For Annotators

Your Dashboard

_images/annotator-dashboard.png

Figure 1: The annotator dashboard.

In Figure 1 you can see an example of the annotator dashboard. At the top, the progress and some statistics of the currently selected AnnotationTask are shown.

In the table at the bottom all available AnnotationTasks are presented. A click on a specific row will direct you to the annotation tool that is required to accomplish the selected AnnotationTask. Rows with a grey background mark finished tasks and cannot be selected to work on.

Getting To Know SIA - A Single Image Annotation Tool

SIA was designed to annotate single images with Points, Lines, Polygons and Boxes. A class label can also be assigned to each of these annotations.

Figure 2 shows an example of the SIA tool. At the top you can see a progress bar with some information about the current AnnotationTask. Below this bar the actual annotation tool is presented. SIA consists of three main components: the canvas, the image bar and the tool bar.

_images/sia-example.png

Figure 2: An example of SIA.

_images/sia-canvas.png

Figure 3: An example of the SIA canvas component. It presents the image to the annotator. By right click, you can draw annotations on the image.

_images/sia-image-bar.png

Figure 4: The image bar component provides information about the image, beginning with the filename and the id of the image in the database. This is followed by the number of the image in the current annotation session and the overall number of images to annotate. The last information is the label that was given to the whole image, if provided.

_images/sia-toolbar.png

Figure 5: The toolbar provides a control to assign a label to the whole image, navigation between images, buttons to select the annotation tool, a button to toggle SIA's fullscreen mode, a junk button to mark the whole image as junk that should not be considered, a control to delete all annotations in the image, a settings button and a help button.

Warning

There may also be tasks where you cannot assign a label to an annotation. The designer of a pipeline can decide that no class labels should be assigned.

Warning

Please note that there may also be tasks where no new annotations can be drawn and where you can only delete or adjust existing annotations.

Note

Please note that not all tools may be available for all tasks. The designer of a pipeline can decide to allow only specific tools.

Meet MIA - A Multi Image Annotation Tool

MIA was designed to annotate clusters of similar objects or images. The idea is to speed up the annotation process by assigning a class label to a whole cluster of images. The annotator's task is to remove images that do not belong to the cluster by clicking on them. When all wrong images are removed, the annotator assigns the same label to the remaining images.

As an example, in Figure 7 the annotator clicked on the car since it does not belong to the cluster of aeroplanes. Because it was clicked, the car is grayed out. The annotator then moved on to the label input field and selected Aeroplane as the label for the remaining images. Now the annotator needs to click on the Submit button to complete this annotation step.

_images/mia-example.png

Figure 7: An example of MIA.

Figure 8 shows the left part of the MIA control panel. You can see the label input field and the currently selected label in a red box.

_images/mia-controls1.png

Figure 8: Left part of the MIA control panel.

In Figure 9 the right part of the MIA control panel is presented. The blue submit button on the left can be used to submit the annotations.

On the right part of the figure there is a reverse button to invert your selection. If it was clicked in the example of Figure 7, the car would be selected for annotation again and all aeroplanes would be grayed out. Next to the reverse button there are two zoom buttons that can be used to scale all presented images simultaneously. Next to the zoom buttons there is a dropdown named amount, where the annotator can select the maximum number of images that are presented at the same time within the cluster view.

_images/mia-controls2.png

Figure 9: Right part of the MIA control panel.

In some cases the annotator may want to have a closer look at a specific image of the cluster. In order to zoom a single image, double-click on it. Figure 10 shows an example of a single image zoom. To scale the image back to its original size, double-click again.

_images/mia-example-zoom.png

Figure 10: Zoomed view of a specific image of the cluster.

Annotation Review

SIA-Review-Tool For Designers

As the owner of a pipeline you often need a quick look at the current annotation progress or to correct single annotations of an annotation task. For that purpose we implemented the SIA-Review tool. It is available for users with the role Designer and can be accessed via the pipeline view by clicking on the Review Annotations button.

_images/lost-review-btn.png

Figure 1: Review button inside the details view of an annotation task.

Figure 1 shows the detail popup of an annotation task in a pipeline. When clicking on the Review Annotations button you will be redirected to the SIA Review Tool. Figure 2 shows the review interface.

_images/lost-review-sia.png

Figure 2: Interface of the SIA Review Tool

In contrast to the normal SIA Annotation Tool you need to explicitly click on the SAVE button in order to save changes. When moving to the next image without saving, no changes will be stored. Everything else is similar to the SIA Annotation Tool. The review tool can be used to review all types of annotation tasks (SIA and MIA).

_images/lost-review-filter.png

Figure 3: Filter box, where images can be filtered by iteration.

Figure 3 shows the filter box where images/annotations can be filtered by iteration.

The LOST Ecosystem

LOST was designed as a web-based framework that is able to run and visualize user-defined annotation pipelines. The frontend of LOST is designed as a Single Page Application and implemented in React using JavaScript. The communication between frontend and backend is built on web services. The primary programming language in the backend is Python, and we utilize Flask to provide the web services. For easy deployment we use Docker.

The LOST Container Landscape

_images/lost-eco-system.svg

Figure 1: The LOST default container landscape.

LOST is a composition of different docker containers. The main ingredients for LOST are a MySQL database, a RabbitMQ message broker, the FLASK framework, a NGINX web server, the LOST framework itself and the LOST data folder where all LOST related data is stored.

Figure 1 shows a schematic illustration of the LOST container landscape. Starting on the left side of the illustration, we see the LOST data folder that is used to store all data of LOST on the host machine. This folder is mounted into most of the LOST containers. On the right side of Figure 1 you can see all containers that are started together with the help of Docker Compose. We see the containers called rabbitmqlost, db-lost, lost, lost-cv and phpmyadmin, while the numbers indicate the ports where the applications can be accessed.

The most important container to understand here is the one called lost. This container serves the LOST web application with NGINX on port 80 and is used as the default Worker to execute scripts. It is connected to the rabbitmqlost container to use Celery for script execution scheduling, and to the db-lost container in order to access the MySQL database that contains the current application state. The container called lost-cv is connected analogously to lost. The phpmyadmin container is used for easy database monitoring during development and serves a graphical user interface to the MySQL database on port 8081.

Pipeline Engine and Workers

The PipeEngine brings your annotation process to life by executing PipelineElements in the specified order. It starts AnnotationTasks and assigns Scripts to Workers that will execute these Scripts.

Workers

A Worker is a specific docker container that is able to execute LOST Script elements inside an annotation pipeline. In each Worker a set of python libraries is installed inside an Anaconda environment.

A LOST application may have multiple Workers with different Environments installed, since some scripts have dependencies on specific libraries. For example, as you can see in Figure 1, LOST is shipped with two workers by default. One is called lost and the other one lost-cv. The lost worker can execute scripts that only rely on the lost python api. The lost-cv worker additionally has libraries like Keras, TensorFlow and OpenCV installed that are used for computer vision and machine learning.

Celery as Scheduler

In order to assign Scripts for execution to available Workers we use Celery in combination with RabbitMQ as message broker.

Since each worker may have a specific software environment installed, the PipeEngine takes care that scripts are only executed by Workers that have the correct Environment. This is achieved by creating one message queue per Environment. Workers that have this Environment installed will listen to this message queue. Once the PipeEngine finds a Script that should be executed in a specific Environment, it sends it to the related message queue. The Script is then assigned in a round-robin fashion to one of the Workers that listen to the message queue related to the Environment.
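
The following minimal Python sketch illustrates the queue-per-environment idea (an illustration only, not LOST's actual scheduler code):

from collections import deque

# One message queue per Environment. Workers subscribe to the queues
# of the Environments they have installed.
queues = {'lost': deque(), 'lost-cv': deque()}

def submit_script(script_path, env):
    """Send a script to the message queue of its required Environment."""
    queues[env].append(script_path)

submit_script('request_annos.py', 'lost')    # any lost worker may pick this up
submit_script('train_model.py', 'lost-cv')   # only lost-cv workers listen here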

For Pipeline Designers

The pipeline idea

A LOST-Pipeline consists of different elements that will be processed in a defined order by the LOST-Engine to transform data into knowledge.

A LOST-Pipeline is defined by a PipelineTemplate and modeled as a directed graph. A Pipeline is an instance of a PipelineTemplate. A PipelineTemplate may define a graph that consists of the following PipelineElements:

  • Script: A user defined script that transforms input data to output data.
  • Datasource: Input data for an annotation pipeline. In most cases this will be a folder with images.
  • AnnotationTask: Some kind of an image annotation task performed by a human annotator.
  • Visualization: Can display an image or html text in the web gui that was generated by a user defined script.
  • DataExport: Provides a download link to a file that was generated by a script.
  • Loop: A loop element points to another element in the Pipeline and creates a loop in the graph. A loop element implements a similar behaviour as a while loop in a programming language.

Designing a pipeline - A first example

In the following we will have a look at the sia_all_tools pipeline which is part of the sia pipeline project example in LOST. Based on this example we will discuss all the important steps when developing your own pipeline.

Pipeline Projects

A pipeline is defined by a json file and is related to Script elements. A Script is essentially a python file. Multiple pipelines and scripts can be bundled as a pipeline project and imported into LOST. A pipeline project is defined as a folder of pipeline and script files. The listing below shows the file structure of the sia pipeline project. In our example we will focus on the sia_all_tools pipeline and its related scripts (export_csv.py, request_annos.py).

sia/
├── export_csv.py
├── request_annos.py
├── request_yolo_annos.py
├── semiauto_yolov3.json
└── sia_all_tools.json

0 directories, 5 files

A Pipeline Definition File

Below you can see the pipeline definition file of the sia_all_tools pipeline. This pipeline will request annotations for all images inside a folder from the Single Image Annotation (SIA) tool and export these annotations to a csv file. The created csv file will be available for download by means of a DataExport element inside the web gui.

As you can see in the listing, the pipeline is defined by a json object that has a description, an author, a pipe-schema-version and a list of pipeline elements. Each element is defined by a json object and has a peN (pipeline element number), which is the identifier of the element itself. Each element also needs an attribute called peOut, which contains a list of the elements the current element is connected to.

The first element in the sia_all_tools pipeline is a Datasource (peN: 0) of type rawFile. This Datasource will provide a path to a folder with images inside the LOST filesystem. The exact path is selected when the pipeline is started. The Datasource element is connected to the Script element with peN: 1. This Script element is defined by peN, peOut, a script path and a script description. The script path needs to be defined relative to the pipeline project folder.

The Script element is connected to an AnnotationTask element with peN: 2 of type sia. Within the json object of the AnnotationTask you can specify the name of the task and also instructions for the annotator. In the configuration part, all tools and actions are allowed for SIA in this pipeline. If you want annotators to use only bounding boxes, you could set point, line and polygon to false, as shown in the sketch below.
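
A minimal sketch of such a bbox-only setup (an assumption for illustration; only the tools part of the configuration object is shown):

"tools": {
    "point": false,
    "line": false,
    "polygon": false,
    "bbox": true,
    "junk": true
}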

The AnnotationTask element is connected to a Script element with path: export_csv.py. This script will read all annotations from the AnnotationTask and create a csv file with these annotations inside the LOST filesystem. The created csv file is made available for download by the DataExport element with peN: 4. A DataExport element may serve an arbitrary file from the LOST filesystem for download.

{
  "description": "This pipeline selects all images of a datasource and requests annotations.",
  "author": "Jonas Jaeger",
  "pipe-schema-version" : 1.0,
  "elements": [{
      "peN": 0,
      "peOut": [1],
      "datasource": {
        "type": "rawFile"
      }
    },
    {
      "peN": 1,
      "peOut": [2],
      "script": {
        "path": "request_annos.py",
        "description": "Request annotations for all images in a folder"
      }
    },
    {
      "peN": 2,
      "peOut": [3],
      "annoTask": {
        "name": "Single Image Annotation Task",
        "type": "sia",
        "instructions": "Please draw bounding boxes for all objects in image.",
        "configuration": {
          "tools": {
              "point": true,
              "line": true,
              "polygon": true,
              "bbox": true,
              "junk": true
          },
          "annos":{
              "multilabels": false,
              "actions": {
                  "draw": true,
                  "label": true,
                  "edit": true
              },
              "minArea": 250
          },
          "img": {
              "multilabels": false,
              "actions": {
                  "label": true
              }
          }
        }
      }
    },
    {
      "peN": 3,
      "peOut": [4],
      "script": {
        "path": "export_csv.py",
        "description": "Export all annotations to a csv file."
      }
    },
    {
      "peN": 4,
      "peOut": null,
      "dataExport": {}
    }
  ]
}

How to write a script?

request_annos.py

A script in LOST is just a normal python3 module. In the listing below you can see the request_annos.py script from our example pipeline (sia_all_tools). The request_annos.py script reads a path to an imageset from the previous datasource element in the pipeline and requests annotations from the next annotation task element in the pipeline. This script will also send dummy annotation proposals to the annotation task if one of its arguments is set to true when the pipeline is started in the web gui.

from lost.pyapi import script
import os
import random

ENVS = ['lost']
ARGUMENTS = {'polygon' : { 'value':'false',
                            'help': 'Add a dummy polygon proposal as example.'},
            'line' : { 'value':'false',
                            'help': 'Add a dummy line proposal as example.'},
            'point' : { 'value':'false',
                            'help': 'Add a dummy point proposal as example.'},
            'bbox' : { 'value':'false',
                            'help': 'Add a dummy bbox proposal as example.'}
            }
class RequestAnnos(script.Script):
    '''Request annotations for each image of an imageset.

    An imageset is basically a folder with images.
    '''
    def main(self):
        for ds in self.inp.datasources:
            media_path = ds.path
            annos = []
            anno_types = []
            if self.get_arg('polygon').lower() == 'true':
                polygon= [[0.1,0.1],[0.4,0.1],[0.2,0.3]]
                annos.append(polygon)
                anno_types.append('polygon')
            if self.get_arg('line').lower() == 'true':
                line= [[0.5,0.5],[0.7,0.7]]
                annos.append(line)
                anno_types.append('line')
            if self.get_arg('point').lower() == 'true':
                point= [0.8,0.1]
                annos.append(point)
                anno_types.append('point')
            if self.get_arg('bbox').lower() == 'true':
                box= [0.6,0.6,0.1,0.05]
                annos.append(box)
                anno_types.append('bbox')
            for img_file in os.listdir(media_path):
                img_path = os.path.join(media_path, img_file)
                self.outp.request_annos(img_path=img_path, annos=annos, anno_types=anno_types)
                self.logger.info('Requested annos for: {}'.format(img_path))

if __name__ == "__main__":
    my_script = RequestAnnos() 

In order to write a LOST script you need to define a class that inherits from lost.pyapi.script.Script and defines a main method (see below).

from lost.pyapi import script
class RequestAnnos(script.Script):
    '''Request annotations for each image of an imageset.

    An imageset is basically a folder with images.
    '''
    def main(self):
        for ds in self.inp.datasources:

Finally, you need to instantiate the class, and your LOST script is done.

if __name__ == "__main__":
    my_script = RequestAnnos() 

In the request_annos.py script you can also see the special variables ENVS and ARGUMENTS. These variables will be read during the import process. The ENVS variable provides meta information for the pipeline engine by defining a list of environments (similar to conda environments) that this script may be executed in. In this way you can assure that a script will only be executed in environments where all of its dependencies are installed. Environments are installed in the workers that may execute your script. If multiple environments are defined within the ENVS list of a script, the pipeline engine will try to assign the script to a worker in the same order as defined within the ENVS list: if a worker is online that has the first environment in the list installed, the pipeline engine will assign the script to this worker. If no worker with the first environment is online, it will try to assign the script to a worker with the second environment in the list, and so on.

ENVS = ['lost']
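
For example, a script that prefers a GPU environment but can fall back to a CPU environment could declare the following (a sketch; lost-cv-gpu and lost-cv correspond to the default workers described in The LOST Ecosystem):

# Try to assign this script to a lost-cv-gpu worker first.
# If no such worker is online, fall back to a lost-cv worker.
ENVS = ['lost-cv-gpu', 'lost-cv']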

The ARGUMENTS variable is used to provide script arguments that can be set during the start process of a pipeline within the web gui. ARGUMENTS is defined as a dictionary of dictionaries that contain the arguments. Each argument is again a dictionary with the keys value and help. As you can see in the listing below, the first argument is called polygon, its value is false and its help text is Add a dummy polygon proposal as example.

ARGUMENTS = {'polygon' : { 'value':'false',
                            'help': 'Add a dummy polygon proposal as example.'},
            'line' : { 'value':'false',
                            'help': 'Add a dummy line proposal as example.'},
            'point' : { 'value':'false',
                            'help': 'Add a dummy point proposal as example.'},
            'bbox' : { 'value':'false',
                            'help': 'Add a dummy bbox proposal as example.'}
            }

Within your script you can access the value of an argument with the get_arg(…) method as shown below.

            if self.get_arg('polygon').lower() == 'true':
                polygon= [[0.1,0.1],[0.4,0.1],[0.2,0.3]]
                annos.append(polygon)
                anno_types.append('polygon')

A script can access all the elements it is connected to. Each script has an input and an output object. Since the input of our request_annos.py script is connected to a Datasource element, we access it by iterating over all Datasource objects that are connected to the input and read out the path where a folder with images is provided:

        for ds in self.inp.datasources:
            media_path = ds.path

Now we can use the path provided by the datasource to read all image files that are located there and request annotations for each image, as you can see in the listing below.

It would be sufficient to provide only the img_path argument to the request_annos(..) method, but in our example script there is also the option to send some dummy annotations to the annotation tool. In a semi-automatic setup, you could use an AI to generate annotation proposals and send these proposals to the annotation tool in the same way.

            for img_file in os.listdir(media_path):
                img_path = os.path.join(media_path, img_file)
                self.outp.request_annos(img_path=img_path, annos=annos, anno_types=anno_types)

Since each script has a logger, we can also write which images we have requested to the pipeline log file. The log file can be downloaded in the web gui. The logger object is a standard python logger.

                self.logger.info('Requested annos for: {}'.format(img_path))

export_csv.py

The export_csv.py script (see Listing e1: Full export_csv.py script.) will read all annotations from its input and create a csv file from them. This csv file will then be added to a DataExport element, which will provide the file in the web gui for download.

Listing e1: Full export_csv.py script.
from lost.pyapi import script
import os
import pandas as pd

ENVS = ['lost']
ARGUMENTS = {'file_name' : { 'value':'annos.csv',
                            'help': 'Name of the file with exported bbox annotations.'}
            }

class ExportCsv(script.Script):
    '''This Script creates a csv file from image annotations and adds a data_export
    to the output of this script in pipeline.
    '''
    def main(self):
        df = self.inp.to_df()
        csv_path = self.get_path(self.get_arg('file_name'), context='instance')
        df.to_csv(path_or_buf=csv_path,
                      sep=',',
                      header=True,
                      index=False)
        self.outp.add_data_export(file_path=csv_path)

if __name__ == "__main__":
    my_script = ExportCsv()

Now we will do a step-by-step walkthrough of the code.

Listing e2: ENVS and ARGUMENTS of export_csv.py.
ENVS = ['lost']
ARGUMENTS = {'file_name' : { 'value':'annos.csv',
                            'help': 'Name of the file with exported bbox annotations.'}
            }

As you can see in the listing above, the script is executed in the standard lost environment. The name of the csv file can be set by the argument file_name and has a default value of annos.csv.

Listing e3: Transforming all annotations from input into a pandas.DataFrame.
        df = self.inp.to_df()

The lost.pyapi.inout.Input.to_df() method will read all annotations from self.inp (the Input of this script, lost.pyapi.script.Script.inp) and transform the annotations into a pandas.DataFrame.

Listing e4: Get the path to store the csv file.
        csv_path = self.get_path(self.get_arg('file_name'), context='instance')

Now the script will calculate the path to store the csv file (lost.pyapi.script.Script.get_path()). In general a script can store files in three different contexts. Since our csv file should only be used by this instance of the script, the instance context was selected.

It would also be possible to store a file in a context called pipe. In the pipe context all scripts within an annotation pipeline can access the file and exchange information in this way. The third context is called static. The static context allows you to access and store files in the pipeline project folder.
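
As a sketch, the script could additionally share its DataFrame with later scripts of the same pipeline via the pipe context (the file name intermediate.csv is hypothetical):

        # Store the DataFrame where all scripts of this pipeline
        # instance can access it.
        shared_path = self.get_path('intermediate.csv', context='pipe')
        df.to_csv(path_or_buf=shared_path, sep=',', header=True, index=False)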

Listing e5: Store csv file to the LOST filesystem.
        df.to_csv(path_or_buf=csv_path,
                      sep=',',
                      header=True,
                      index=False)

After we have calculated the csv_path, the csv file can be stored to this path. In order to do that the to_csv method from the pandas.DataFrame is used.

Listing e6: Adding the csv file path to all connected DataExport elements.
        self.outp.add_data_export(file_path=csv_path)

As the final step, the path to the csv file is assigned to the connected DataExport element in order to make it available for download via the web gui.

Importing a pipeline project

After creating a pipeline it needs to be imported into LOST. Please see Importing a Pipeline Project into LOST for more information.

Debugging a script

When your script starts to throw errors it is time for debugging your script inside the docker container. Please see Debugging a Script for more information.

All About Pipelines

PipeProjects

A pipeline project in LOST is defined as a folder that contains pipeline definition files in json format and related python3 scripts. Additionally, other files can be placed into this folder and accessed by the scripts of a pipeline.

Pipeline Project Examples

Pipeline project examples can be found here: lost/backend/lost/pyapi/examples/pipes

Directory Structure

Example directory structure for a pipeline project.
my_pipeline_project/
├── an_ai_script.py
├── another_pipeline.json
├── another_script.py
├── a_pretrained_model_for_the_ai.md5
├── export_the_annos.py
├── my_pipeline.json
├── my_script.py
└── my_special_python_lib
    ├── __init__.py
    ├── my_magic_module.py
    └── utils.py

1 directory, 10 files

The listing above shows an example of a pipeline project directory structure. Within the project there are two pipeline definition files, another_pipeline.json and my_pipeline.json. These pipelines can use all the scripts (an_ai_script.py, another_script.py, export_the_annos.py, my_script.py) inside the project folder. Some of the scripts may require a special python package you have written. If you want to use such a package (e.g. my_special_python_lib), just place it inside the pipeline project folder as well. Sometimes it is also useful to place other files into the project folder, for example a pretrained AI model that should be loaded inside a script.

Importing a Pipeline Project into LOST

After creating a pipeline it needs to be imported into LOST. In order to do that we need to copy the pipeline project folder into lost_data_folder/my_data in your host filesystem, e.g.:

# Copy your pipe_project into the LOST data folder
cp -r my_pipe_project path_to_lost_data/my_data/

Every file that is located under lost_data_folder will be visible inside the lost docker container. Now we log in to the container with:

# Log in to the docker container.
# If your user is not part of the docker group,
# you may need to use *sudo*
docker exec -it lost bash

After a successful login we can start the pipeline import. For this import we will use the lost command line tools. To import a pipeline project we use a program called import_pipe_project.py. This program expects the path to the pipeline project as argument.

If you copied your pipeline project to /home/my_user/lost/data/my_data/my_pipe_project on the host machine, it will be available inside the container under /home/lost/my_data/my_pipe_project.

Note

It is just a convention to place pipelines that should be imported into the my_data folder. Theoretically you could place your pipeline projects anywhere in the lost_data_folder, but life is easier when following this convention.

Let's do the import:

# Import my_pipe_project into LOST
import_pipe_project.py /home/lost/my_data/my_pipe_project

The import_pipe_project.py program will copy your pipeline project folder into the folder /home/lost/data/pipes and write all the meta information into the lost database. After this import the pipeline should be visible in the web gui when clicking on the Start Pipeline button in the Designer view.

Updating a LOST Pipeline

If you changed anything inside your pipeline project, e.g. bug fixes, you need to update your pipeline project in LOST. The procedure is the same as for importing a pipeline, with the difference that you need to call the update_pipe_project.py program:

# Update my_pipe_project in LOST
update_pipe_project.py /home/lost/my_data/my_pipe_project

Namespacing

When importing or updating a pipeline project in LOST the following namespacing will be applied to pipelines: <name of pipeline project folder>.<name of pipeline json file>. In the same way scripts will be namespaced internally by LOST: <name of pipeline project folder>.<name of python script file>.

So in our example the pipelines would be named my_pipe_project.another_pipeline and my_pipe_project.my_pipeline.

Pipeline Definition Files

Within the pipeline definition file you define your annotation process. Such a pipeline is composed of different standard elements that are supported by LOST, like datasource, script, annoTask, dataExport, visualOutput and loop. Each pipeline element is represented by a json object inside the pipeline definition.

As you can see in the example, the pipeline itself is also defined by a json object. This object has a description, an author, a pipe-schema-version and a list of pipeline elements. Each element object has a peN (pipeline element number), which is the identifier of the element itself. An element also needs an attribute called peOut, which contains a list of the elements the current element is connected to.

An Example

A simple example pipeline.
{
  "description" : "This pipeline selects all images of an rawFile for an annotation task",
  "author" : "Jonas Jaeger",
  "pipe-schema-version" : 1.0,
  "elements" : [
    {
      "peN" : 0,
      "peOut" : [1],
      "datasource" : {
        "type" : "rawFile"
      }
    },
    {
      "peN" : 1,
      "peOut" : [2],
      "script" : {
        "path": "anno_all_imgs.py",
        "description" : "Request ImageAnnotations for all images in an rawFile"
      }
    },
    {
      "peN" : 2,
      "peOut" : [3],
      "annoTask" : {
        "name" : "MultiImageAnnoation",
        "type" : "mia",
        "instructions" : "Please assign a class labels to all presented images.",
        "configuration": {
          "type": "imageBased"
        }
      }
    },
    {
      "peN" : 3,
      "peOut" : [4],
      "script" : {
        "path": "export_csv.py",
        "description" : "Export all annotations to csv file"
      }
    },
    {
      "peN" : 4,
      "peOut" : null,
      "dataExport" : {}
    }
  ]
}

Possible Pipeline Elements

Below you will find the definition of all possible pipeline elements in LOST.

Datasource Element
 {
   "peN" : "[int]",
   "peOut" : "[list of int]|[null]",
   "datasource" : {
     "type" : "rawFile"
   }
 }

Datasource elements are intended to provide datasets to Script elements. To be more specific, a Datasource will provide a path inside the LOST system. In most cases this will be a path to a folder with images that should be annotated. The listing above shows the definition of a Datasource element. Currently only the type rawFile is supported, which will provide a path.

Script Element
 {
   "peN" : "[int]",
   "peOut" : "[list of int]|[null]",
   "script" : {
     "path": "[string]",
     "description" : "[string]"
   }
 }

Script elements represent python3 scripts that are executed as part of your pipeline. In order to define a Script you need to specify a path to the script file relative to the pipeline project folder and a short description of your script.

AnnoTask Element
 {
   "peN" : "[int]",
   "peOut" : "[list of int]|[null]",
   "annoTask" : {
     "type" : "mia|sia",
     "name" : "[string]",
     "instructions" : "[string]",
     "configuration":{"..."}
   }
 }

An AnnoTask represents an annotation task for a human-in-the-loop. Scripts can request annotations for specific images that will be presented in one of the annotation tools in the web gui.

Right now two types of annotation tools are available. If you set type to sia, the single image annotation tool will be used for annotation. When choosing mia, the images will be presented in the multi image annotation tool.

An AnnoTask also requires a name and instructions for the annotator. Based on the type, a specific configuration is required.

If “type” is “mia” the configuration will be the following:

 {
   "type": "annoBased|imageBased",
   "showProposedLabel": "[boolean]",
   "drawAnno": "[boolean]",
   "addContext": "[float]"
 }
MIA configuration:
  • type
    • If imageBased a whole image will be presented in the clustered view.
    • If annoBased all lost.db.model.TwoDAnno objects related to an image will be cropped and presented in the clustered view.
  • showProposedLabel
    • If true, the assigned sim_class will be interpreted as label and be used as pre-selection of the label in the MIA tool.
  • drawAnno
    • If true and type : annoBased the specific annotation will be drawn inside the cropped image.
  • addContext
    • If type : annoBased and addContext > 0.0, some amount of pixels will be added around the annotation when it is cropped. The number of pixels that are added is calculated relative to the image size. So if you set addContext to 0.1, 10 percent of the image size will be added to the crop (see the sketch after this list). This setting is useful to provide the annotator with some more visual context during the annotation step.
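
The following minimal calculation illustrates the addContext setting (the image size and the per-dimension arithmetic are assumptions for illustration):

# For a 1000x800 image and addContext set to 0.1, roughly 10 percent
# of each image dimension is added around the cropped annotation.
img_w, img_h = 1000, 800
add_context = 0.1
pad_x = int(img_w * add_context)  # 100 extra pixels horizontally
pad_y = int(img_h * add_context)  # 80 extra pixels vertically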

If “type” is “sia” the configuration will be the following:

 {
   "tools": {
           "point": "[boolean]",
           "line": "[boolean]",
           "polygon": "[boolean]",
           "bbox": "[boolean]",
           "junk": "[boolean]"
   },
   "annos":{
       "multilabels": "[boolean]",
       "actions": {
           "draw": "[boolean]",
           "label": "[boolean]",
           "edit": "[boolean]",
       },
       "minArea": "[int]",
       "maxAnnos": "[int or null]"
   },
   "img": {
       "multilabels": "[boolean]",
       "actions": {
           "label": "[boolean]",
       }
   }
 }
SIA configuration:
  • tools
    • Inside the tools object you can select which drawing tools are available and if the junk button is present in the SIA gui. You may choose either true or false for each of the tools (point, line, polygon, bbox, junk).
  • annos (configuration for annotations on the image)
    • actions
      • If draw is set to false, a user may not draw any new annotations. This is useful if a script sent annotation proposals to SIA and the user should only correct the proposed annotations.
      • label allows you to disable the assignment of labels to annotations. This option is useful if you want your annotators to only draw annotations.
      • edit indicates whether an annotator may edit an annotation that is already present.
    • multilabels allows assigning multiple labels per annotation.
    • minArea The minimum area in pixels that an annotation may have. This constraint is only applied to annotations where an area can be defined (e.g. BBoxes, Polygons).
    • maxAnnos Maximum number of annotations that are allowed per image. If null, an infinite number of annotations is allowed per image.
  • img (configuration for the image)
    • actions
      • label allows to disable the possibility to assign labels to the image.
    • multilabels allows to assign multiple labels to the image.
DataExport
 {
   "peN" : "[int]",
   "peOut" : "[list of int]|[null]",
   "dataExport" : {}
 }

A DataExport is used to serve a file generated by a script in the web gui. No special configuration is required for this pipeline element. The file to download will be provided by a Script that is connected to the input of the DataExport element. This Script will call the lost.pyapi.inout.ScriptOutput.add_data_export() method in order to do that.
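
As a sketch, inside a connected Script this call looks as follows (csv_path is assumed to point to a file the Script created before, compare export_csv.py above):

        # Make the created file available for download via all
        # connected DataExport elements.
        self.outp.add_data_export(file_path=csv_path)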

VisualOutput
 {
   "peN" : "[int]",
   "peOut" : "[list of int]|[null]",
   "visualOutput" : {}
 }

A VisualOutput element can display images and html text inside the LOST web gui. A connected Script element provides the content to a VisualOutput by calling lost.pyapi.inout.ScriptOutput.add_visual_output().

Loop
 {
   "peN": "[int]",
   "peOut": "[list of int]|[null]",
   "loop": {
     "maxIteration": "[int]|[null]",
     "peJumpId": "[int]"
   }
 }

A Loop element can be used to build learning loops inside of a pipeline. Such a Loop models behaviour similar to a while loop in a programming language.

The peJumpId defines the peN of another element in the pipeline that this Loop should jump to while looping. The maxIteration setting inside a loop definition can be set to a maximum number of iterations that should be performed, or to null in order to have an infinite loop.

A Script element inside a loop cycle may break a loop by calling lost.pyapi.script.Script.break_loop(). Scripts inside a loop cycle may check if a loop was broken by calling lost.pyapi.script.Script.loop_is_broken().
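
A minimal sketch of a Script that breaks a surrounding loop once enough annotations have been collected (the threshold of 1000 is an arbitrary example):

from lost.pyapi import script

ENVS = ['lost']

class BreakOnEnoughAnnos(script.Script):
    '''Break the surrounding loop when enough annotations were collected.'''
    def main(self):
        # Read all annotations from the connected input elements.
        df = self.inp.to_df()
        if len(df) >= 1000:
            # Stop the surrounding Loop element.
            self.break_loop()

if __name__ == "__main__":
    BreakOnEnoughAnnos()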

All About Scripts

What is a Script?

Scripts are specific elements that are part of a LOST annotation pipeline. A script element is implemented as a python3 module. The listing below shows an example of such a script. This script will request image annotations for all images of a dataset.

Listing 1: An example LOST script.
from lost.pyapi import script
import os

ENVS = ['lost']

class AnnoAllImgs(script.Script):
    '''This Script requests image annotations for each image of an imageset.

    An imageset is basically a folder with images.
    '''
    def main(self):
        self.logger.info("Request image annotations for:")
        for ds in self.inp.datasources:
            media_path = ds.path
            for img_file in os.listdir(media_path):
                img_path = os.path.join(media_path, img_file)
                self.outp.request_image_anno(img_path=img_path)
                self.logger.debug(img_path)

if __name__ == "__main__":
    my_script = AnnoAllImgs()

In order to implement a script you need to create a python class that inherits from lost.pyapi.script.Script. Your class needs to implement a main method and needs to be instantiated within your python script. The listing below shows a minimum example for a script.

Listing 2: A minimum example for a script in LOST
from lost.pyapi import script

class MyScript(script.Script):

    def main(self):
        self.logger.info('Hello World!')

if __name__ == "__main__":
    MyScript()

Example Scripts

More script examples can be found here: lost/backend/lost/pyapi/examples/pipes

The LOST PyAPI Script Model

Like all pipeline elements, a script has an input and an output object. Via these objects it is connected to other elements in a pipeline (see also Pipeline Definition Files).

Inside a script you can exchange information with the connected elements by using the self.inp object and the self.outp object.

Reading Imagesets

It is a common pattern to read a path to an imageset from a Datasource element in your annotation pipeline. See Listing 3 for a code example. Since multiple Datasources could be connected to our script, we iterate over all connected Datasources of the input with self.inp.datasources. For each Datasource element we can read the path attribute to get the filesystem path to a folder with images.

Listing 3: Getting the path to all images of a Datasource.
from lost.pyapi import script
import os

class MyScript(script.Script):

    def main(self):
        for ds in self.inp.datasources:
            for img_file in os.listdir(ds.path):
                img_path = os.path.join(ds.path, img_file)

if __name__ == "__main__":
    MyScript()

Requesting Annotations

The most important feature of the LOST PyAPI is the ability to request annotations for a connected AnnotationTask element. Inside a Script you can access the output element and call the self.outp.request_annos method (see Listing 4).

Listing 4: Requesting an annotation for an image.
self.outp.request_annos(img_path)

Sometimes you also want to send annotation proposals to an AnnotationTask in order to support your annotator. In most cases these proposals will be generated by an AI, like an object detector. The listing below shows a simple example to send a dummy box and a dummy point to an annotation tool.

Listing 5: Requesting an annotation for an image with annotation proposals.
self.outp.request_annos(img_path,
    annos = [[0.1, 0.1, 0.2, 0.2], [0.1, 0.2]],
    anno_types = ['bbox', 'point'])

Annotation Broadcasting

If multiple AnnoTask elements are connected to your ScriptOutput and you call self.outp.request_annos, the annotation request will be broadcasted to all connected AnnoTasks. So each AnnoTask will get its own copy of your annotation request. Technically, for each annotation request an empty ImageAnno will be created for each AnnoTask. During the annotation process this ImageAnno will be filled with information.
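
As a sketch, a single request is enough, no matter how many AnnoTasks are connected (anno_tasks is the documented property of the script output; the image path is an example):

# One call is broadcasted to every connected AnnoTask
n_tasks = len(self.outp.anno_tasks)
self.logger.info('Broadcasting annotation request to {} AnnoTasks'.format(n_tasks))
self.outp.request_annos('path/to/img.jpg')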

Reading Annotations

Another important task is to read annotations from previous pipeline elements. In most cases this will be AnnoTask elements.

If you like to read all annotations at the script input in a vectorized way, you can use self.inp.to_df() to get a pandas DataFrame or self.inp.to_vec() to get a list of lists.
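
For example, a small sketch that counts all bbox annotations at the input (column names as documented for to_df):

df = self.inp.to_df()
bboxes = df[df['anno.dtype'] == 'bbox']
self.logger.info('Received {} bbox annotations'.format(len(bboxes)))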

If you prefer to iterate over all ImageAnnos you can use the respective iterator self.inp.img_annos. See the listing below for an example.

Iterate over all annotations at the script input.
for img_anno in self.inp.img_annos:
    for twod_anno in img_anno.twod_annos:
        self.logger.info('image path: {}, 2d_anno_data: {}'.format(img_anno.img_path, twod_anno.data))

Contexts to Store Files

There are three different contexts that can be used to store files that should be handled by your script. Each context is modeled as a specific folder in the lost filesystem. In order to get the path to a context, call self.get_path. Listing 6 shows an application of self.get_path in order to get the path to the instance context.

Listing 6: Create a csv file and store this file to the instance context.
from lost.pyapi import script
import os
import pandas as pd

ENVS = ['lost']
ARGUMENTS = {'file_name' : { 'value':'annos.csv',
                            'help': 'Name of the file with exported bbox annotations.'}
            }

class ExportCsv(script.Script):
    '''This Script creates a csv file from image annotations and adds a data_export
    to the output of this script in the pipeline.
    '''
    def main(self):
        df = self.inp.to_df()
        csv_path = self.get_path(self.get_arg('file_name'), context='instance')
        df.to_csv(path_or_buf=csv_path,
                      sep=',',
                      header=True,
                      index=False)
        self.outp.add_data_export(file_path=csv_path)

if __name__ == "__main__":
    my_script = ExportCsv()

There are three types of contexts that can be accessed: instance, pipe and static.

The instance context is only accessible by the current instance of your script. Each time a pipeline is started each script will get its own instance folder in the LOST filesystem. No other script in the same pipeline will access this folder.

If you like to exchange files among the script instances of a started pipeline, you can choose the pipe context. When calling self.get_path with context='pipe' you will get a path to a folder that is available to all script instances of a pipeline instance.
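
A sketch of this pattern, assuming two scripts in the same pipeline and an example file name shared_results.csv:

import pandas as pd

# Script A: write a file to the pipe context
csv_path = self.get_path('shared_results.csv', context='pipe')
self.inp.to_df().to_csv(csv_path, index=False)

# Script B, later in the same pipeline: read the file again
csv_path = self.get_path('shared_results.csv', context='pipe')
df = pd.read_csv(csv_path)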

The static context is a path to the pipeline project folder that all script instances have access to. In this way you can access files that you have provided inside the Pipeline Project. For example, if you like to load a pretrained machine learning model inside of your script, you can put it into the pipeline project folder and access it via the static context:

Listing 7: Getting the path to the static context.
path_to_model = self.get_path('pretrained_model.md5', context='static')

Logging

Each Script will have its own logger. This logger is an instance of the standard python logger. The example below shows how to log an info message, a warning and an error. All logs are redirected to a pipeline log file that can be downloaded via the pipeline view inside the web gui.

Listing 8: Logging examples.
self.logger.info('I am a info message')
self.logger.warning('I am a warning')
self.logger.error('An error occurred!')

Script Errors and Exceptions

If an error occurs in your script, the traceback of the exception will be visible in the web gui, when clicking on the respective script in your pipeline. The error will also be automatically logged to the pipeline log file.

Script ARGUMENTS

The ARGUMENTS variable is used to provide script arguments that can be set during the start of a pipeline within the web gui. ARGUMENTS are defined as a dictionary of dictionaries. Each argument dictionary has the keys value and help. As you can see in the listing below, the first argument is called my_arg, its value is true and its help text is A boolean argument.

Listing 9: Defining arguments.
ARGUMENTS = {'my_arg' : { 'value':'true',
                'help': 'A boolean argument.'}
            }

Within your script you can access the value of an argument with the get_arg(...) method as shown below.

Listing 10: Accessing argument values.
if self.get_arg('my_arg').lower() == 'true':
    self.logger.info('my_arg was true')

Script ENVS

The ENVS variable provides meta information for the pipeline engine by defining a list of environments (similar to conda environments) in which this script may be executed. In this way you can assure that a script will only be executed in environments where all of its dependencies are installed. The environments are installed in the workers that may execute your script. If multiple environments are defined within the ENVS list of a script, the pipeline engine will try to assign the script to a worker in the same order as defined within the ENVS list: if a worker is online that has the first environment in the list installed, the pipeline engine will assign the script to this worker; if no worker with the first environment is online, it will try to assign the script to a worker with the second environment in the list, and so on. Listing 11 shows an example of the ENVS definition in a script that may be executed in two different environments.

Listing 11: ENVS definition inside a script.
ENVS = ['lost', 'lost-cv']

Script RESOURCES

Sometimes a script will require all resources of a worker, and therefore no other script should be executed in parallel by the worker that executes your script. This is often the case if you train an AI model and need all GPU memory to do so. In those cases, you can define a RESOURCES variable inside your python script and assign a list containing the string lock_all to it. See the listing below for an example:

Listing 12: RESOURCES definition inside a script.
RESOURCES = ['lock_all']

Debugging a Script

Most likely, if you imported your pipeline and run it for the first time, some scripts will not work, since you have placed some tiny bug in your code :-)

Inside the web GUI all exceptions and errors of your script will be visualized when clicking on the respective script element in the pipeline visualization. In this way you get a first hint about what is wrong.

In order to debug your code you need to login to the docker container and find the instance folder that is created for each script instance. Inside this folder there is a bash script called debug.sh that needs to be executed in order to start the pudb debugger. You will find your script by its unique pipeline element id. The path to the script instance folder will be /home/lost/data/instance/i-<pipe_element_id>.

# Log in to docker
docker exec -it lost bash
# Change directory to the instance path of your script
cd /home/lost/data/instance/i-<pipe_element_id>
# Start debugging
bash debug.sh

Note

If your script requires a special ENV to be executed, you need to login to a container that has this environment installed for debugging.
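
For example, if your script defines ENVS = ['lost-cv'], you would log in to the lost-cv container instead (assuming the default container names from the docker-compose setup):

# Log in to the container that provides the lost-cv environment
docker exec -it lost-cv bash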

The LOST Command Line Interface

Login to a Docker Container

In order to use the LOST command line interface, you need to login to a lost docker container:

# Log in to the docker container.
# If your user is not part of the docker group,
# you may need to use *sudo*
docker exec -it lost bash

Managing Pipeline Projects

Import Project

After creating a pipeline it needs to be imported into LOST. In order to do that we need to copy the pipeline project folder into lost_data_folder/my_data in your host file system, e.g.:

# Copy your pipe_project into the LOST data folder
cp -r my_pipe_project path_to_lost_data/my_data/

Every file that is located under lost_data_folder will be visible inside the lost docker container.

Now we will login to the container with:

# Log in to the docker container.
# If your user is not part of the docker group,
# you may need to use *sudo*
docker exec -it lost bash

After a successful login we can start the pipeline import. For this import we will use the lost command line tools. To import a pipeline project we use a program called import_pipe_project.py. This program expects the path to the pipeline project as an argument.

If you copied your pipeline project to /home/my_user/lost/data/my_data/my_pipe_project on the host machine, it will be available inside the container under /home/lost/my_data/my_pipe_project.

Note

It is just a convention to place pipelines that should be imported into the my_data folder. Theoretically you could place your pipeline projects anywhere in the lost_data_folder, but life is easier when following this convention.

Let's do the import:

# Import my_pipe_project into LOST
import_pipe_project.py /home/lost/my_data/my_pipe_project

The import_pipe_project.py program will copy your pipeline project folder into the folder /home/lost/data/pipes and write all the meta information into the lost database. After this import the pipeline should be visible in the web gui when clicking on the Start Pipeline button in the Designer view.

Update Project

To update a pipe project you need to perform the same steps as for the import, with the difference that you need to call the update_pipe_project.py program:

# Update my_pipe_project in LOST
update_pipe_project.py /home/lost/my_data/my_pipe_project

Remove Project

If you want to remove a pipeline project from your lost instance, you can use the remove_pipe_project.py script. After logging into the container, perform:

# Remove my_pipe_project from a LOST instance
remove_pipe_project.py /home/lost/my_data/my_pipe_project

Note

You can only remove pipelines that are not already in use, since your data would become inconsistent otherwise. If you like to remove a pipeline that was instantiated, you need to delete all instances of this pipeline first.

Managing Label Trees

Sets of labels are managed in label trees. See Figure 1 for an example. The LOST command line tools support the import of a label tree from a csv file, the export of a label tree to a csv file and the removal of a label tree by its name.

_images/labeltree_img.png

Figure 1: An example label tree, as it is visualized in the web gui.

Import Label Tree

Before you can import a labeltree, you need to define it in a csv file. See Figure 2 for an example of such a tree definition. For more examples navigate to lost/backend/lost/pyapi/examples/label_trees in our GitHub repository. Each leaf in a tree represents a label, while the root is the tree name and can not be selected as label during an annotation process.

_images/labeltree_csv.png

Figure 2: CSV representation of the example label tree in Figure 1.

When you have created your own label tree (let's assume you defined it in my_labeltree.csv), you need to copy it to lost_data_folder/my_data/:

# Copy your labeltree definition to the LOST data folder
cp my_labeltree.csv path_to_lost_data/my_data/

Now your csv file can be accessed from inside of the docker container. In order to import your label tree, we will login to the container and call import_label_tree.py:

# Login to the lost docker container
docker exec -it lost bash

# Import the label tree from your csv file
import_label_tree.py /home/lost/my_data/my_labeltree.csv

The label tree should now be visible in the web gui.

Export Label Tree

If you like to export a label tree that you have created with the lost web gui to a csv file you can use export_label_tree.py.

For now we will assume that we want to export the tree presented in Figure 1. Its name is dummy tree (the name of the root node) and we want to export it to a file called exported_tree.csv. To do that we need to perform the following steps:

# Login to the lost docker container
docker exec -it lost bash

# Export a label tree to a csv file
export_label_tree.py "dummy tree" /home/lost/my_data/exported_tree.csv

On the host machine, the exported_tree.csv will now be visible at lost_data_folder/my_data/exported_tree.csv.

Remove Label Tree

You can remove a label tree from LOST by calling remove_label_tree.py inside the lost docker container. A label tree can be identified by its name. So if you like to remove our example tree from Figure 1 with name dummy tree, you need to perform the following steps:

# Login to the lost docker container
docker exec -it lost bash

# Remove a label tree by name
remove_label_tree.py --name "dummy tree"

Note

A label tree can only be removed by the cli, if no label in this tree is used by a pipeline instance in LOST.

Advanced Setup

Nginx Configuration

LOST is shipped in docker containers. The base image inherits from an official nginx container, and LOST is installed in this container. The communication with the host system is done via the nginx webserver, which can be configured via a configuration file. A differentiation is made between debug mode and a production environment.

Configuration File

When starting the lost container, the corresponding nginx configuration file (depending on debug mode) is copied from the repository to

/etc/nginx/conf.d/default.conf

by the entrypoint.sh script.

Both nginx configuration files can be found at: lost/docker/lost/nginx in our GitHub repository.

Custom Configuration File

If a custom configuration file is desired, this file must be mounted from the host machine into the lost container.

volumes:
    - /host/path/to/nginx/conf:/etc/nginx/conf.d/default.conf

Custom Settings

Email

Lorem Ipsum

Pipeline Schedule

Lorem Ipsum

Database

Lorem Ipsum

Session Timeout

Lorem Ipsum

Worker

  1. Worker Timeout
  2. Worker Beat

Secret Key

Lorem Ipsum

Lost Frontend Port

Lorem Ipsum

Lost Data

Lorem Ipsum

LOST Worker

GPU Worker

Lorem Ipsum

Distributed Computing

Lorem Ipsum

Contribution Guide

How to contribute new features or bug fixes?

  1. Select a feature you want to implement / a bug you want to fix from the lost issue list
    • If you have a new feature, create a new feature request
  2. State in the issue comments that you are willing to implement the feature/ fix the bug
  3. We will respond to your comment
  4. Implement the feature
  5. Create a pull request

How to do backend development?

The backend is written in python. We use Flask as web server and Celery to execute time-consuming tasks.

If you want to adjust backend code and test your changes please perform the following steps:

  1. Install LOST as described in LOST QuickSetup.
  2. Adjust the DEBUG variable in the .env config file. This file should be located at lost_install_dir/docker/.env.
Changes that need to be performed in the .env file. This will cause the LOST flask server to start in debug mode.
    DEBUG=True
  3. In order to run your code, you need to mount your code into the docker container. You can do this by adding docker volumes in the docker-compose.yml file. The file should be located at lost_install_dir/docker/docker-compose.yml. Do this for all containers in the compose file that contain lost source code (lost, lost-cv, lost-cv-gpu).
Adjustments to docker-compose.yml. Mount your backend code into the docker container.
version: '2'
services:
    lost:
        image: l3pcv/lost:${LOST_VERSION}
        container_name: lost
        command: bash /entrypoint.sh
        env_file:
            - .env
        volumes:
            - ${LOST_DATA}:/home/lost
            - </path/to/lost_clone>/backend/lost:/code/backend/lost

Note

Because flask is in debug mode, code changes are applied immediately. An exception to this behaviour are changes to code that is related to celery tasks. After such changes lost needs to be restarted manually to get the code changes working.
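
A restart of the container is usually sufficient; a sketch, assuming the container name lost from the compose file above:

# Restart the lost container to reload celery related code
docker restart lost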

How to do frontend development?

The frontend is developed with React, Redux, CoreUI and reactstrap.

  1. To start developing the frontend, follow the LOST QuickSetup instructions.
  2. Change directory to the frontend folder and install the npm packages:
cd lost/frontend/lost/
npm i
  3. [Optional] Set the backend port in the package.json start script with the REACT_APP_PORT variable.
  4. Start the development server with:
npm start

Frontend Applications

  Application                       Directory
  Dashboard                         src/components/Dashboard
  SIA (Single Image Annotation)     src/components/SIA
  MIA (Multi Image Annotation)      src/components/MIA
  Running Pipeline                  src/components/pipeline/src/running
  Start Pipeline                    src/components/pipeline/src/start
  Labels                            src/components/Labels
  Workers                           src/components/Workers
  Users                             src/components/Users

Building lost containers locally

  • The whole build process is described in .gitlab-ci.yml.
  • All required docker files are provided in lost/docker within the lost repo.
  • There are 3 lost containers that will execute scripts and the webserver:
    • lost: Will run the webserver and provide the basic environment where scripts can be executed.
    • lost-cv: Will provide a computer vision environment in order to execute scripts that require special libraries like opencv.
    • lost-cv-gpu: Will provide gpu support for scripts that use libraries that need gpu support, like tensorflow.
  • Building the lost container
    • The lost container inherits from lost-base.
    • As a first step, build lost-base. The Dockerfile is located at lost/docker/lost-base.
    • After that you can build the lost container, using your local version of lost-base. The Dockerfile can be found here: lost/docker/lost
  • Building lost-cv works analogously to building the lost container
  • lost-cv-gpu is based on lost-gpu-base
    • Build lost-gpu-base first and then use your local version to build lost-cv-gpu

Users and Groups

Management

Users and groups can be added via the “Users” section. Each created user gets its own default group with the same name as the username. No further users can be added to this default group. Groups that are added manually can be assigned to any number of users.

Visibility

Pipeline

Pipelines can be assigned to a group or to your own user when they are started. Only groups to which the user is assigned can be selected. Later, these pipelines will only be visible to the selected group or user.

Label Trees

Label Trees are visible system-wide across all applications.

AnnoTasks

AnnoTasks can be assigned either to your own user or to a group when starting a pipeline. Only groups to which the user is assigned can be selected.

Pipeline Templates

Pipeline Templates are visible system-wide across all applications.

Conventions

Image coordinate system

The same coordinate system as in OpenCV is used, so an image is treated as a matrix: the x-axis increases to the right and the y-axis increases while moving downwards, as in matrix notation.

Annotations

All annotations are defined relative to the image size. See also lost.db.model.TwoDAnno.

Bounding Box definition

  • x: Defines the x-coordinate of the center of a bounding box (relative coordinates).
  • y: Defines the y-coordinate of the center of a bounding box (relative coordinates).
  • width: Defines the width of a bbox (relative coordinates).
  • height: Defines the height of a bbox (relative coordinates).
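
Since the values are relative, they have to be scaled with the image size if you need pixel coordinates. The following sketch converts such a center-based, relative bbox [x, y, w, h] into absolute pixel corner coordinates, assuming that x and width are normalized by the image width and y and height by the image height; the image size values are example assumptions:

# Sketch: convert a relative, center-based bbox [x, y, w, h]
# into absolute pixel corners (xmin, ymin, xmax, ymax)
def bbox_to_pixel_corners(bbox, img_w, img_h):
    x, y, w, h = bbox
    xmin = (x - w / 2.0) * img_w
    ymin = (y - h / 2.0) * img_h
    xmax = (x + w / 2.0) * img_w
    ymax = (y + h / 2.0) * img_h
    return xmin, ymin, xmax, ymax

print(bbox_to_pixel_corners([0.5, 0.5, 0.2, 0.4], img_w=640, img_h=480))
# -> (256.0, 144.0, 384.0, 336.0)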

pyapi

script

Script

class lost.pyapi.script.Script(pe_id=None)[source]

Superclass for a user defined Script.

Custom scripts need to inherit from Script and implement the main method.

pe_id

Pipe element id. Assign the pe id of a pipeline script in order to emulate this script, for example in a jupyter notebook.

Type:int
break_loop()[source]

Break next loop in pipeline.

create_label_tree(name, external_id=None)[source]

Create a new LabelTree

Parameters:
  • name (str) – Name of the tree / name of the root leaf.
  • external_id (str) – An external id for the root leaf.
Returns:

The created LabelTree.

Return type:

lost.logic.label.LabelTree
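
Example

A minimal sketch; the tree name is an arbitrary example. Children can afterwards be added via lost.logic.label.LabelTree.create_child (see below):

>>> tree = self.create_label_tree('my dummy tree')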

get_abs_path(path)[source]

Get absolute path in current file system.

Parameters:path (str) – A relative path.
Returns:Absolute path
Return type:str
get_alien_element(pe_id)[source]

Get a pipeline element by id from somewhere in the LOST system.

It is an alien element since it is most likely not part of the pipeline instance this script belongs to.

Parameters:pe_id (int) – PipeElementID of the alien element.
Returns:
get_arg(arg_name)[source]

Get argument value by name for this script.

Parameters:arg_name (str) – Name of the argument.
Returns:Value of the given argument.
get_label_tree(name)[source]

Get a LabelTree by name.

Parameters:name (str) – Name of the desired LabelTree.
Returns:
lost.logic.label.LabelTree or None:
If a label tree with the given name exists it will be returned. Otherwise None will be returned
get_path(file_name, context='instance', ptype='abs')[source]

Get path for the filename in a specific context in filesystem.

Parameters:
  • file_name (str) – Name or relative path for a file.
  • context (str) – Options: instance, pipe, static
  • ptype (str) – Type of this path. Can be relative or absolute. Options: abs, rel
Returns:

Path to the file in the specified context.

Return type:

str

get_rel_path(path)[source]

Get relative path for the current project

Parameters:path (str) – An absolute path
Returns:Relative path
Return type:str
inp

lost.pyapi.inout.Input

instance_context

Get the path to store files that are only valid for this instance.

Type:str
iteration

Get the current iteration.

Number of times this script has been executed.

Type:int
logger

A standard python logger for this script.

It will log to the pipeline log file.

Type:logging.Logger
loop_is_broken()[source]

Check if the current loop is broken

outp

lost.pyapi.inout.ScriptOutput

pipe_context

Root path to store files that should be visible for all elements in the pipeline.

Type:str
pipe_info

An object with pipeline information

Type:lost.pyapi.pipeline.PipeInfo
progress

Get current progress that is displayed in the progress bar of this script.

Current progress in percent 0…100

Type:float
reject_execution()[source]

Reject execution of this script and set it to PENDING again.

Note

This method is useful if you want to execute this script only when some condition based on previous pipeline elements is met.
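
A sketch of such a condition, rejecting execution as long as no annotations have arrived at the input:

>>> if len(self.inp.to_vec('img.img_path')) == 0:
...     self.reject_execution()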

report_err(msg)[source]

Report an error for this user script to portal

Parameters:msg – The error message that should be reported.

Note

You can call this method multiple times if you like. All messages will be concatenated and sent to the portal.

static_context

Get the static path.

Files that are stored at this path can be accessed by all instances of a script.

Type:str
update_progress(value)[source]

Update the progress for this script.

Parameters:value (float) – Progress in percent 0…100
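
Example

A sketch of updating the progress while iterating over the input annotations (do_something is a placeholder for your own processing):

>>> annos = list(self.inp.img_annos)
>>> for n, img_anno in enumerate(annos):
...     do_something(img_anno)
...     self.update_progress(100 * (n + 1) / len(annos))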

inout

ScriptOutput

class lost.pyapi.inout.ScriptOutput(script)[source]

Special Output class since lost.pyapi.script.Script objects may manipulate and request annotations.

add_annos(img_path, img_labels=None, img_sim_class=None, annos=[], anno_types=[], anno_labels=[], anno_sim_classes=[], frame_n=None, video_path=None)[source]

Add annos in list style to an image.

Parameters:
  • img_path (str) – Path to the image that annotations are added for.
  • img_labels (list of int) – Labels that will be assigned to the image. Each label in the list is represented by a label_leaf_id.
  • img_sim_class (int) – A cluster id that will be used to cluster this image in the MIA annotation tool.
  • annos (list of list) – A list of POINTs: [x,y] BBOXes: [x,y,w,h] LINEs or POLYGONs: [[x,y], [x,y], …]
  • anno_types (list of str) – Can be ‘point’, ‘bbox’, ‘line’, ‘polygon’
  • anno_labels (list of list of int) – Labels for the twod annos. Each label in the list is represented by a label_leaf_id. (see also LabelLeaf).
  • anno_sim_classes (list of ints) – List of arbitrary cluster ids that are used to cluster annotations in the MIA annotation tool.
  • frame_n (int) – If img_path belongs to a video, frame_n indicates the frame number.
  • video_path (str) – If img_path belongs to a video this is the path to this video.

Example

Add annotations to an image:

>>> self.outp.add_annos('path/to/img.jpg',
...     annos = [
...         [0.1, 0.1, 0.2, 0.2],
...         [0.1, 0.2],
...         [[0.1, 0.3], [0.2, 0.3], [0.15, 0.1]]
...     ],
...     anno_types=['bbox', 'point', 'polygon'],
...     anno_labels=[
...         [1],
...         [1],
...         [4]
...     ],
...     anno_sim_classes=[10, 10, 15]
... )

Note

In contrast to request_annos this method will broadcast the added annotations to all connected pipeline elements.

add_data_export(file_path)[source]

Serve a file for download inside the web gui via a DataExport element.

Parameters:file_path (str) – Path to the file that should be provided for download.
add_visual_output(img_path=None, html=None)[source]

Display an image and html in the web gui via a VisualOutput element.

Parameters:
  • img_path (str) – Path in the lost filesystem to the image to display.
  • html (str) – HTML text to display.
anno_tasks

list of lost.pyapi.pipe_elements.AnnoTask objects

bbox_annos

Iterate over all bbox annotations.

Returns:Iterator of lost.db.model.TwoDAnno.
data_exports

list of lost.pyapi.pipe_elements.DataExport objects.

datasources

list of lost.pyapi.pipe_elements.Datasource objects

img_annos

Iterate over all lost.db.model.ImageAnno objects in this Resultset.

Returns:Iterator of lost.db.model.ImageAnno objects.
line_annos

Iterate over all line annotations.

Returns:Iterator of lost.db.model.TwoDAnno objects.
mia_tasks

list of lost.pyapi.pipe_elements.MIATask objects

point_annos

Iterate over all point annotations.

Returns:Iterator of lost.db.model.TwoDAnno.
polygon_annos

Iterate over all polygon annotations.

Returns:Iterator of lost.db.model.TwoDAnno objects.
raw_files

list of lost.pyapi.pipe_elements.RawFile objects

request_annos(img_path, img_labels=None, img_sim_class=None, annos=[], anno_types=[], anno_labels=[], anno_sim_classes=[], frame_n=None, video_path=None)[source]

Request annotations for a subsequent annotation task.

Parameters:
  • img_path (str) – Path to the image that annotations are requested for.
  • img_labels (list of int) – Labels that will be assigned to the image. The labels should be represented by a label_leaf_id. An image may have multiple labels.
  • img_sim_class (int) – A cluster id that will be used to cluster this image in the MIA annotation tool.
  • annos (list of list) – A list of POINTs: [x,y] BBOXes: [x,y,w,h] LINEs or POLYGONs: [[x,y], [x,y], …]
  • anno_types (list of str) – Can be ‘point’, ‘bbox’, ‘line’, ‘polygon’
  • anno_labels (list of list of int) – Labels for the twod annos. Each label in the list is represented by a label_leaf_id. (see also LabelLeaf).
  • anno_sim_classes (list of ints) – List of arbitrary cluster ids that are used to cluster annotations in the MIA annotation tool.
  • frame_n (int) – If img_path belongs to a video, frame_n indicates the frame number.
  • video_path (str) – If img_path belongs to a video this is the path to this video.

Example

Request human annotations for an image with annotation proposals:

>>> self.outp.request_annos('path/to/img.jpg',
...     annos = [
...         [0.1, 0.1, 0.2, 0.2],
...         [0.1, 0.2],
...         [[0.1, 0.3], [0.2, 0.3], [0.15, 0.1]]
...     ],
...     anno_types=['bbox', 'point', 'polygon'],
...     anno_labels=[
...         [1],
...         [1],
...         [4]
...     ],
...     anno_sim_classes=[10, 10, 15]
... )

Request human annotations for an image without proposals:

>>> self.outp.request_annos('path/to/img.jpg')
request_bbox_annos(img_path, boxes=[], labels=[], frame_n=None, video_path=None, sim_classes=[])[source]

Request BBox annotations for a subsequent annotation task.

Parameters:
  • img_path (str) – Path of the image.
  • boxes (list) – A list of boxes [[x,y,w,h],..].
  • labels (list) – A list of labels for each box.
  • frame_n (int) – If img_path belongs to a video, frame_n indicates the frame number.
  • video_path (str) – If img_path belongs to a video this is the path to this video.
  • sim_classes (list) – [sim_class1, sim_class2,…] A list of similarity classes that is used to cluster BBoxes when using MIA for annotation.

Note

There are three cases when you request a bbox annotation.

Case1: Annotate empty image
You just want to get bounding boxes drawn by a human annotator for an image. -> Only set the img_path argument.
Case2: Annotate image with a preset of boxes
You want predicted bounding boxes to be verified by a human annotator, and you have not predicted labels for the boxes. -> Set the img_path and boxes arguments.
Case3: Annotate image with a preset of boxes and labels
You want predicted bounding boxes and the related predicted labels to be verified by a human annotator. -> Set the img_path, boxes and labels arguments. For boxes you need to assign a list of boxes, and for labels a list of label_id lists, since an annotation may have multiple labels. E.g. boxes =[[0.1,0.1,0.2,0.3],…], labels =[[1,5],[5],…]

Example

How to use this method in a Script:

>>> self.outp.request_bbox_annos('path/to/img.png',
...     boxes=[[0.1,0.1,0.2,0.3],[0.2,0.2,0.4,0.4]],
...     labels=[[0],[1]]
... )
request_image_anno(img_path, sim_class=None, labels=None, frame_n=None, video_path=None)[source]

Request a class label annotation for an image.

Parameters:
  • img_path (str) – Path to the image that should be annotated.
  • sim_class (int) – A similarity class for this image. This similarity measure will be used to cluster images for the MIA annotation tool -> Images with the same sim_class will be presented to the annotator in one step.
  • labels (list of int) – Labels that will be assigned to the image. Each label should represent a label_leaf_id.
  • frame_n (int) – If img_path belongs to a video, frame_n indicates the frame number.
  • video_path (str) – If img_path belongs to a video this is the path to this video.

Example

Request image annotation:
>>> self.outp.request_image_anno('path/to/image', sim_class=2)
sia_tasks

list of lost.pyapi.pipe_elements.SIATask objects

to_df()

Get a pandas DataFrame of all annotations related to this object.

Returns:
Column names are:
’img.idx’, ‘img.anno_task_id’, ‘img.timestamp’, ‘img.timestamp_lock’, ‘img.state’, ‘img.sim_class’, ‘img.frame_n’, ‘img.video_path’, ‘img.img_path’, ‘img.result_id’, ‘img.iteration’, ‘img.group_id’, ‘img.anno_time’, ‘img.lbl.idx’, ‘img.lbl.name’, ‘img.lbl.external_id’, ‘img.annotator’, ‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_n’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.group_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’
Return type:pandas.DataFrame
to_vec(columns='all')

Get a vector of all Annotations related to this object.

Parameters:columns (str or list of str) – ‘all’ OR ‘img.idx’, ‘img.anno_task_id’, ‘img.timestamp’, ‘img.timestamp_lock’, ‘img.state’, ‘img.sim_class’, ‘img.frame_n’, ‘img.video_path’, ‘img.img_path’, ‘img.result_id’, ‘img.iteration’, ‘img.group_id’, ‘img.anno_time’, ‘img.lbl.idx’, ‘img.lbl.name’, ‘img.lbl.external_id’, ‘img.annotator’, ‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_n’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.group_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’
Returns:
list OR list of lists: Desired columns

Example

Return just a list of 2d anno labels:

>>> img_anno.to_vec('anno.lbl.name')
['Aeroplane', 'Bicycle', 'Bottle', 'Horse']

Return a list of lists:

>>> self.inp.to_vec(['img.img_path', 'anno.lbl.name',
...     'anno.data', 'anno.dtype'])
[
    ['path/to/img1.jpg', 'Aeroplane', [0.1, 0.1, 0.2, 0.2], 'bbox'],
    ['path/to/img1.jpg', 'Bicycle', [0.1, 0.1], 'point'],
    ['path/to/img2.jpg', 'Bottle', [[0.1, 0.1], [0.2, 0.2]], 'line'],
    ['path/to/img3.jpg', 'Horse', [0.2, 0.15, 0.3, 0.18], 'bbox']
]
twod_annos

Iterate over 2D-annotations.

Returns:lost.db.model.TwoDAnno objects.
Return type:Iterator
visual_outputs

list of lost.pyapi.pipe_elements.VisualOutput objects.

Output

class lost.pyapi.inout.Output(element)[source]
anno_tasks

list of lost.pyapi.pipe_elements.AnnoTask objects

bbox_annos

Iterate over all bbox annotations.

Returns:Iterator of lost.db.model.TwoDAnno.
data_exports

list of lost.pyapi.pipe_elements.DataExport objects.

datasources

list of lost.pyapi.pipe_elements.Datasource objects

img_annos

Iterate over all lost.db.model.ImageAnno objects in this Resultset.

Returns:Iterator of lost.db.model.ImageAnno objects.
line_annos

Iterate over all line annotations.

Returns:Iterator of lost.db.model.TwoDAnno objects.
mia_tasks

list of lost.pyapi.pipe_elements.MIATask objects

point_annos

Iterate over all point annotations.

Returns:Iterator of lost.db.model.TwoDAnno.
polygon_annos

Iterate over all polygon annotations.

Returns:Iterator of lost.db.model.TwoDAnno objects.
raw_files

list of lost.pyapi.pipe_elements.RawFile objects

sia_tasks

list of lost.pyapi.pipe_elements.SIATask objects

to_df()

Get a pandas DataFrame of all annotations related to this object.

Returns:
Column names are:
’img.idx’, ‘img.anno_task_id’, ‘img.timestamp’, ‘img.timestamp_lock’, ‘img.state’, ‘img.sim_class’, ‘img.frame_n’, ‘img.video_path’, ‘img.img_path’, ‘img.result_id’, ‘img.iteration’, ‘img.group_id’, ‘img.anno_time’, ‘img.lbl.idx’, ‘img.lbl.name’, ‘img.lbl.external_id’, ‘img.annotator’, ‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_n’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.group_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’
Return type:pandas.DataFrame
to_vec(columns='all')

Get a vector of all Annotations related to this object.

Parameters:columns (str or list of str) – ‘all’ OR ‘img.idx’, ‘img.anno_task_id’, ‘img.timestamp’, ‘img.timestamp_lock’, ‘img.state’, ‘img.sim_class’, ‘img.frame_n’, ‘img.video_path’, ‘img.img_path’, ‘img.result_id’, ‘img.iteration’, ‘img.group_id’, ‘img.anno_time’, ‘img.lbl.idx’, ‘img.lbl.name’, ‘img.lbl.external_id’, ‘img.annotator’, ‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_n’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.group_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’
Returns:
list OR list of lists: Desired columns

Example

Return just a list of 2d anno labels:

>>> img_anno.to_vec('anno.lbl.name')
['Aeroplane', 'Bicycle', 'Bottle', 'Horse']

Return a list of lists:

>>> self.inp.to_vec(['img.img_path', 'anno.lbl.name',
...     'anno.data', 'anno.dtype'])
[
    ['path/to/img1.jpg', 'Aeroplane', [0.1, 0.1, 0.2, 0.2], 'bbox'],
    ['path/to/img1.jpg', 'Bicycle', [0.1, 0.1], 'point'],
    ['path/to/img2.jpg', 'Bottle', [[0.1, 0.1], [0.2, 0.2]], 'line'],
    ['path/to/img3.jpg', 'Horse', [0.2, 0.15, 0.3, 0.18], 'bbox']
]
twod_annos

Iterate over 2D-annotations.

Returns:lost.db.model.TwoDAnno objects.
Return type:Iterator
visual_outputs

list of lost.pyapi.pipe_elements.VisualOutput objects.

Input

class lost.pyapi.inout.Input(element)[source]

Class that represents an input of a pipeline element.

Parameters:element (object) – Related lost.db.model.PipeElement object.
anno_tasks

list of lost.pyapi.pipe_elements.AnnoTask objects

bbox_annos

Iterate over all bbox annotations.

Returns:Iterator of lost.db.model.TwoDAnno.
data_exports

list of lost.pyapi.pipe_elements.DataExport objects.

datasources

list of lost.pyapi.pipe_elements.Datasource objects

img_annos

Iterate over all lost.db.model.ImageAnno objects in this Resultset.

Returns:Iterator of lost.db.model.ImageAnno objects.
line_annos

Iterate over all line annotations.

Returns:Iterator of lost.db.model.TwoDAnno objects.
mia_tasks

list of lost.pyapi.pipe_elements.MIATask objects

point_annos

Iterate over all point annotations.

Returns:Iterator of lost.db.model.TwoDAnno.
polygon_annos

Iterate over all polygon annotations.

Returns:Iterator of lost.db.model.TwoDAnno objects.
raw_files

list of lost.pyapi.pipe_elements.RawFile objects

sia_tasks

list of lost.pyapi.pipe_elements.SIATask objects

to_df()[source]

Get a pandas DataFrame of all annotations related to this object.

Returns:
Column names are:
’img.idx’, ‘img.anno_task_id’, ‘img.timestamp’, ‘img.timestamp_lock’, ‘img.state’, ‘img.sim_class’, ‘img.frame_n’, ‘img.video_path’, ‘img.img_path’, ‘img.result_id’, ‘img.iteration’, ‘img.group_id’, ‘img.anno_time’, ‘img.lbl.idx’, ‘img.lbl.name’, ‘img.lbl.external_id’, ‘img.annotator’, ‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_n’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.group_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’
Return type:pandas.DataFrame
to_vec(columns='all')[source]

Get a vector of all Annotations related to this object.

Parameters:columns (str or list of str) – ‘all’ OR ‘img.idx’, ‘img.anno_task_id’, ‘img.timestamp’, ‘img.timestamp_lock’, ‘img.state’, ‘img.sim_class’, ‘img.frame_n’, ‘img.video_path’, ‘img.img_path’, ‘img.result_id’, ‘img.iteration’, ‘img.group_id’, ‘img.anno_time’, ‘img.lbl.idx’, ‘img.lbl.name’, ‘img.lbl.external_id’, ‘img.annotator’, ‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_n’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.group_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’
Returns:
list OR list of lists: Desired columns

Example

Return just a list of 2d anno labels:

>>> img_anno.to_vec('anno.lbl.name')
['Aeroplane', 'Bicycle', 'Bottle', 'Horse']

Return a list of lists:

>>> self.inp.to_vec(['img.img_path', 'anno.lbl.name',
...     'anno.data', 'anno.dtype'])
[
    ['path/to/img1.jpg', 'Aeroplane', [0.1, 0.1, 0.2, 0.2], 'bbox'],
    ['path/to/img1.jpg', 'Bicycle', [0.1, 0.1], 'point'],
    ['path/to/img2.jpg', 'Bottle', [[0.1, 0.1], [0.2, 0.2]], 'line'],
    ['path/to/img3.jpg', 'Horse', [0.2, 0.15, 0.3, 0.18], 'bbox']
]
twod_annos

Iterate over 2D-annotations.

Returns:lost.db.model.TwoDAnno objects.
Return type:Iterator
visual_outputs

list of lost.pyapi.pipe_elements.VisualOutput objects.

pipeline

PipeInfo

class lost.pyapi.pipeline.PipeInfo(pipe, dbm)[source]
description

Description that was defined when pipeline was started.

Type:str
logfile_path

Path to pipeline log file.

Type:str
name

Name of this pipeline

Type:str
timestamp

Timestamp when pipeline was started.

Type:str
timestamp_finished

Timestamp when pipeline was finished.

Type:str

pipe_elements

Datasource

class lost.pyapi.pipe_elements.Datasource(pe, dbm)[source]
inp

Input of this pipeline element

Type:lost.pyapi.inout.Input
outp

Output of this pipeline element

Type:lost.pyapi.inout.Output
path

Absolute path to file or folder

Type:str
pipe_info

An object with pipeline information

Type:lost.pyapi.pipeline.PipeInfo

RawFile

class lost.pyapi.pipe_elements.RawFile(pe, dbm)[source]
inp

Input of this pipeline element

Type:lost.pyapi.inout.Input
outp

Output of this pipeline element

Type:lost.pyapi.inout.Output
path

Absolute path to file or folder

Type:str
pipe_info

An object with pipeline information

Type:lost.pyapi.pipeline.PipeInfo

AnnoTask

class lost.pyapi.pipe_elements.AnnoTask(pe, dbm)[source]
configuration

Configuration of this annotask.

Type:str
inp

Input of this pipeline element

Type:lost.pyapi.inout.Input
instructions

Instructions for the annotator of this AnnoTask.

Type:str
name

A name for this annotask.

Type:str
outp

Output of this pipeline element

Type:lost.pyapi.inout.Output
pipe_info

An object with pipeline information

Type:lost.pyapi.pipeline.PipeInfo
possible_label_df

Get all possible labels for this annotation task in DataFrame format

pd.DataFrame: Column names are:
‘idx’, ‘name’, ‘abbreviation’, ‘description’, ‘timestamp’, ‘external_id’, ‘is_deleted’, ‘parent_leaf_id’ ,’is_root’
Type:pd.DataFrame
progress

Progress in percent.

Value range 0…100.

Type:float

MIATask

class lost.pyapi.pipe_elements.MIATask(pe, dbm)[source]
configuration

Configuration of this annotask.

Type:str
inp

Input of this pipeline element

Type:lost.pyapi.inout.Input
instructions

Instructions for the annotator of this AnnoTask.

Type:str
name

A name for this annotask.

Type:str
outp

Output of this pipeline element

Type:lost.pyapi.inout.Output
pipe_info

An object with pipeline information

Type:lost.pyapi.pipeline.PipeInfo
possible_label_df

Get all possible labels for this annotation task in DataFrame format

pd.DataFrame: Column names are:
‘idx’, ‘name’, ‘abbreviation’, ‘description’, ‘timestamp’, ‘external_id’, ‘is_deleted’, ‘parent_leaf_id’ ,’is_root’
Type:pd.DataFrame
progress

Progress in percent.

Value range 0…100.

Type:float

SIATask

class lost.pyapi.pipe_elements.SIATask(pe, dbm)[source]
configuration

Configuration of this annotask.

Type:str
inp

Input of this pipeline element

Type:lost.pyapi.inout.Input
instructions

Instructions for the annotator of this AnnoTask.

Type:str
name

A name for this annotask.

Type:str
outp

Output of this pipeline element

Type:lost.pyapi.inout.Output
pipe_info

An object with pipeline information

Type:lost.pyapi.pipeline.PipeInfo
possible_label_df

Get all possible labels for this annotation task in DataFrame format

pd.DataFrame: Column names are:
‘idx’, ‘name’, ‘abbreviation’, ‘description’, ‘timestamp’, ‘external_id’, ‘is_deleted’, ‘parent_leaf_id’ ,’is_root’
Type:pd.DataFrame
progress

Progress in percent.

Value range 0…100.

Type:float

DataExport

class lost.pyapi.pipe_elements.DataExport(pe, dbm)[source]
file_path

A list of absolute paths to exported files

Type:list of str
inp

Input of this pipeline element

Type:lost.pyapi.inout.Input
outp

Output of this pipeline element

Type:lost.pyapi.inout.Output
pipe_info

An object with pipeline information

Type:lost.pyapi.pipeline.PipeInfo
to_dict()[source]

Transform a list of exports to a dictionary.

Returns:[{‘iteration’:int, ‘file_path’:str},…]
Return type:list of dict

VisualOutput

class lost.pyapi.pipe_elements.VisualOutput(pe, dbm)[source]
html_strings

list of html strings.

Type:list of str
img_paths

List of absolute paths to images.

Type:list of str
inp

Input of this pipeline element

Type:lost.pyapi.inout.Input
outp

Output of this pipeline element

Type:lost.pyapi.inout.Output
pipe_info

An object with pipeline information

Type:lost.pyapi.pipeline.PipeInfo
to_dict()[source]

Transforms a list of visualization information into a list of dicts.

Returns:[{‘iteration’:int, ‘img_path’:str, ‘html_string’:str},…]
Return type:list of dicts

Loop

class lost.pyapi.pipe_elements.Loop(pe, dbm)[source]
inp

Input of this pipeline element

Type:lost.pyapi.inout.Input
is_broken

True if loop is broken

Type:bool
iteration

Current iteration of this loop.

Type:int
max_iteration

Maximum number of iterations.

Type:int
outp

Output of this pipeline element

Type:lost.pyapi.inout.Output
pe_jump

PipelineElement where this loop will jump to when looping.

Can be of type:
pipe_info

An object with pipeline information

Type:lost.pyapi.pipeline.PipeInfo

model

ImageAnno

class lost.db.model.ImageAnno(anno_task_id=None, user_id=None, timestamp=None, state=None, sim_class=None, result_id=None, img_path=None, frame_n=None, video_path=None, iteration=0, anno_time=None, is_junk=None, description=None)[source]

An ImageAnno represents an image annotation.

Multiple labels as well as 2d annotations (e.g. points, lines, boxes, polygons) can be assigned to an image.

labels

A list of related Label objects.

Type:list
twod_annos

A list of TwoDAnno objects.

Type:list
img_path

Path to the image where this anno belongs to.

Type:str
frame_n

If this image is part of a video, frame_n indicates the frame number.

Type:int
video_path

If this image is part of a video, this should be the path to that video in the file system.

Type:str
sim_class

The similarity class this anno belongs to. It is used to cluster similar annos in MIA

Type:int
anno_time

Overall annotation time in seconds.

timestamp

Timestamp of ImageAnno

Type:DateTime
iteration

The iteration of a loop when this anno was created.

Type:int
idx

ID of this ImageAnno in database

Type:int
anno_task_id

ID of the anno_task this ImageAnno belongs to.

Type:int
state

See lost.db.state.Anno

Type:enum
result_id

Id of the related result.

user_id

Id of the annotator.

Type:int
is_junk

This image was marked as Junk.

Type:bool
description

Description for this annotation. Assigned by an annotator or algorithm.

Type:str
get_anno_vec(anno_type='bbox')[source]

Get related 2d annotations in list style.

Parameters:anno_type (str) – Can be ‘bbox’, ‘point’, ‘line’, ‘polygon’
Returns:
For POINTs:
[[x, y], [x, y], …]
For BBOXs:
[[x, y, w, h], [x, y, w, h], …]
For LINEs and POLYGONs:
[[[x, y], [x, y],…], [[x, y], [x, y],…]]
Return type:list of list of floats

Example

In the following example all bounding boxes of the image annotation will be returned in list style:

>>> img_anno.get_anno_vec()
[[0.1 , 0.2 , 0.3 , 0.18],
 [0.25, 0.25, 0.2, 0.4]]
>>> img_anno.get_anno_lbl_vec('name', 'bbox') #Get related label names
[['cow'], ['horse']]
iter_annos(anno_type='bbox')[source]

Iterator for all related 2D annotations of this image.

Parameters:anno_type (str) – Can be ‘bbox’, ‘point’, ‘line’, ‘polygon’, ‘all’
Returns:
iterator of TwoDAnno objects

Example

>>> for bb in img_anno.iter_annos('bbox'):
...     do_something(bb)
to_df()[source]

Transform this ImageAnnotation and all related TwoDAnnotations into a pandas DataFrame.

Returns:
Column names are:
’img.idx’, ‘img.anno_task_id’, ‘img.timestamp’, ‘img.timestamp_lock’, ‘img.state’, ‘img.sim_class’, ‘img.frame_n’, ‘img.video_path’, ‘img.img_path’, ‘img.result_id’, ‘img.iteration’, ‘img.user_id’, ‘img.anno_time’, ‘img.lbl.idx’, ‘img.lbl.name’, ‘img.lbl.external_id’, ‘img.annotator’, ‘img.is_junk’, ‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_n’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.user_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’
Return type:pandas.DataFrame
to_dict(style='flat')[source]

Transform this ImageAnno and all related TwoDAnnos into a dict.

Parameters:style (str) – ‘flat’ or ‘hierarchical’. Return a dict in flat or nested style.
Returns:In ‘flat’ style return a list of dicts with one dict per annotation. In ‘hierarchical’ style, return a nested dictionary.
Return type:list of dict OR dict

Note

In ‘flat’ style annotation data and lists of labels are serialized as json strings. You may want to deserialize them with json.loads()

Example

HowTo iterate through all TwoDAnnotations of this ImageAnno dictionary in flat style:

>>> for d in img_anno.to_dict():
...     print(d['img.img_path'], d['anno.lbl.name'], d['anno.dtype'])
path/to/img1.jpg ['Aeroplane'] bbox
path/to/img1.jpg ['Bicycle'] point

Possible keys in flat style:

>>> img_anno.to_dict()[0].keys()
dict_keys([
    'img.idx', 'img.anno_task_id', 'img.timestamp',
    'img.timestamp_lock', 'img.state', 'img.sim_class',
    'img.frame_n', 'img.video_path', 'img.img_path',
    'img.result_id', 'img.iteration', 'img.user_id',
    'img.anno_time', 'img.lbl.idx', 'img.lbl.name',
    'img.lbl.external_id', 'img.annotator', 'img.is_junk',
    'anno.idx', 'anno.anno_task_id', 'anno.timestamp',
    'anno.timestamp_lock', 'anno.state', 'anno.track_n',
    'anno.dtype', 'anno.sim_class', 'anno.iteration',
    'anno.user_id', 'anno.img_anno_id', 'anno.annotator',
    'anno.confidence', 'anno.anno_time', 'anno.lbl.idx',
    'anno.lbl.name', 'anno.lbl.external_id', 'anno.data'
])

HowTo iterate through all TwoDAnnotations of this ImageAnno dictionary in hierarchical style:

>>> h_dict = img_anno.to_dict(style='hierarchical')
>>> for d in h_dict['img.twod_annos']:
...     print(h_dict['img.img_path'], d['anno.lbl.name'], d['anno.dtype'])
path/to/img1.jpg [Aeroplane] bbox
path/to/img1.jpg [Bicycle] point

Possible keys in hierarchical style:

>>> h_dict = img_anno.to_dict(style='hierarchical')
>>> h_dict.keys()
dict_keys([
    'img.idx', 'img.anno_task_id', 'img.timestamp',
    'img.timestamp_lock', 'img.state', 'img.sim_class',
    'img.frame_n', 'img.video_path', 'img.img_path',
    'img.result_id', 'img.iteration', 'img.user_id',
    'img.anno_time', 'img.lbl.idx', 'img.lbl.name',
    'img.lbl.external_id', 'img.annotator', 'img.twod_annos'
])
>>> h_dict['img.twod_annos'][0].keys()
dict_keys([
    'anno.idx', 'anno.anno_task_id', 'anno.timestamp',
    'anno.timestamp_lock', 'anno.state', 'anno.track_n',
    'anno.dtype', 'anno.sim_class', 'anno.iteration',
    'anno.user_id', 'anno.img_anno_id', 'anno.annotator',
    'anno.confidence', 'anno.anno_time', 'anno.lbl.idx',
    'anno.lbl.name', 'anno.lbl.external_id', 'anno.data'
])
to_vec(columns='all')[source]

Transform this ImageAnnotation and all related TwoDAnnotations in list style.

Parameters:columns (str or list of str) – ‘all’ OR ‘img.idx’, ‘img.anno_task_id’, ‘img.timestamp’, ‘img.timestamp_lock’, ‘img.state’, ‘img.sim_class’, ‘img.frame_n’, ‘img.video_path’, ‘img.img_path’, ‘img.result_id’, ‘img.iteration’, ‘img.user_id’, ‘img.anno_time’, ‘img.lbl.idx’, ‘img.lbl.name’, ‘img.lbl.external_id’, ‘img.annotator’, ‘img.is_junk’, ‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_n’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.user_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’
Returns:
list OR list of lists: Desired columns

Example

Return just a list of serialized 2d anno labels:

>>> img_anno.to_vec('anno.lbl.name')
["['Aeroplane']", "['Bicycle']"]

Return a list of lists:

>>> img_anno.to_vec(['img.img_path', 'anno.lbl.name',
...     'anno.lbl.idx', 'anno.dtype'])
[
    ['path/to/img1.jpg', "['Aeroplane']", "[14]", 'bbox'],
    ['path/to/img1.jpg', "['Bicycle']", "[15]", 'point']
]

TwoDAnno

class lost.db.model.TwoDAnno(anno_task_id=None, user_id=None, timestamp=None, state=None, track_id=None, sim_class=None, img_anno_id=None, timestamp_lock=None, iteration=0, data=None, dtype=None, confidence=None, anno_time=None, description=None)[source]

A TwoDAnno represents a 2D annotation/ drawing for an image.

A TwoDAnno can be of type point, line, bbox or polygon.

idx

ID of this TwoDAnno in database

Type:int
anno_task_id

ID of the anno_task this TwoDAnno belongs to.

Type:int
timestamp

Timestamp when this TwoDAnno was created

Type:DateTime
timestamp_lock

Timestamp locked in view

Type:DateTime
state

can be unlocked, locked, locked_priority or labeled (see lost.db.state.Anno)

Type:enum
track_id

The track id this TwoDAnno belongs to.

Type:int
sim_class

The similarity class this anno belongs to. It is used to cluster similar annos in MIA.

Type:int
iteration

The iteration of a loop when this anno was created.

Type:int
user_id

Id of the annotator.

Type:int
img_anno_id

ID of ImageAnno this TwoDAnno is appended to

Type:int
data

drawing data (e.g. x, y, width, height) of the anno - depends on dtype

Type:Text
dtype

type of TwoDAnno (e.g. bbox, polygon) (see lost.db.dtype.TwoDAnno)

Type:int
labels

A list of Label objects related to the TwoDAnno.

Type:list
confidence

Confidence of Annotation.

Type:float
anno_time

Overall Annotation Time in ms.

description

Description for this annotation. Assigned by an annotator or algorithm.

Type:str
add_label(label_leaf_id)[source]

Add a label to this 2D annotation.

Parameters:label_leaf_id (int) – Id of the label_leaf that should be added.
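
Example

A sketch, assuming 14 is the id of an existing label leaf:

>>> twod_anno.add_label(14)
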
bbox

BBOX annotation in list style [x, y, w, h]

Example

>>> anno = TwoDAnno()
>>> anno.bbox = [0.1, 0.1, 0.2, 0.2]
>>> anno.bbox
[0.1, 0.1, 0.2, 0.2]
Type:list
get_anno_vec()[source]

Get annotation data in list style.

Returns:
For a POINT:
[x, y]
For a BBOX:
[x, y, w, h]
For a LINE and POLYGONS:
[[x, y], [x, y],…]
Return type:list of floats

Example

HowTo get a numpy array? In the following example a bounding box is returned:

>>> np.array(twod_anno.get_anno_vec())
array([0.1 , 0.2 , 0.3 , 0.18])
line

LINE annotation in list style [[x, y], [x, y], …]

Example

>>> anno = TwoDAnno()
>>> anno.line = [[0.1, 0.1], [0.2, 0.2]]
>>> anno.line
[[0.1, 0.1], [0.2, 0.2]]
Type:list of list
point

POINT annotation in list style [x, y]

Example

>>> anno = TwoDAnno()
>>> anno.point = [0.1, 0.1]
>>> anno.point
[0.1, 0.1]
Type:list
polygon

polygon annotation in list style [[x, y], [x, y], …]

Example

>>> anno = TwoDAnno()
>>> anno.polygon = [[0.1, 0.1], [0.2, 0.1], [0.15, 0.2]]
>>> anno.polygon
[[0.1, 0.1], [0.2, 0.1], [0.15, 0.2]]
Type:list of list
to_df()[source]

Transform this annotation into a pandas DataFrame

Returns:A DataFrame where column names correspond to the keys of the dictionary returned from to_dict() method.
Return type:pandas.DataFrame

Note

Column names are:
[‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_id’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.user_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’]
to_dict(style='flat')[source]

Transform this object into a dict.

Parameters:style (str) – ‘flat’ or ‘hierarchical’. ‘flat’: Return a dictionary in table style. ‘hierarchical’: Return a nested dictionary.
Returns:
dict: In flat or hierarchical style.

Example

Get a dict in flat style. Note that ‘anno.data’, ‘anno.lbl.idx’, ‘anno.lbl.name’ and ‘anno.lbl.external_id’ are json strings in contrast to the hierarchical style.

>>> bbox.to_dict(style='flat')
{
    'anno.idx': 88,
    'anno.anno_task_id': None,
    'anno.timestamp': None,
    'anno.timestamp_lock': None,
    'anno.state': None,
    'anno.track_id': None,
    'anno.dtype': 'bbox',
    'anno.sim_class': None,
    'anno.iteration': 0,
    'anno.user_id': 47,
    'anno.img_anno_id': None,
    'anno.annotator': 'test',
    'anno.confidence': None,
    'anno.anno_time': None,
    'anno.lbl.idx': '["14"]',
    'anno.lbl.name': '["Aeroplane"]',
    'anno.lbl.external_id': '["6"]',
    'anno.data': '{"x": 0.1, "y": 0.1, "w": 0.2, "h": 0.2}'
}

Get a dict in hierarchical style. Note that ‘anno.data’ is a dict in contrast to the flat style.

>>> bbox.to_dict(style='hierarchical')
{
    'anno.idx': 86,
    'anno.anno_task_id': None,
    'anno.timestamp': None,
    'anno.timestamp_lock': None,
    'anno.state': None,
    'anno.track_id': None,
    'anno.dtype': 'bbox',
    'anno.sim_class': None,
    'anno.iteration': 0,
    'anno.user_id': 46,
    'anno.img_anno_id': None,
    'anno.annotator': 'test',
    'anno.confidence': None,
    'anno.anno_time': None,
    'anno.lbl.idx': [14],
    'anno.lbl.name': ['Aeroplane'],
    'anno.lbl.external_id': ['6'],
    'anno.data': {
        'x': 0.1, 'y': 0.1, 'w': 0.2, 'h': 0.2
    }
}
to_vec(columns='all')[source]

Transform this annotation in list style.

Parameters:columns (list of str OR str) – Possible column names are: ‘all’ OR [‘anno.idx’, ‘anno.anno_task_id’, ‘anno.timestamp’, ‘anno.timestamp_lock’, ‘anno.state’, ‘anno.track_id’, ‘anno.dtype’, ‘anno.sim_class’, ‘anno.iteration’, ‘anno.user_id’, ‘anno.img_anno_id’, ‘anno.annotator’, ‘anno.confidence’, ‘anno.anno_time’, ‘anno.lbl.idx’, ‘anno.lbl.name’, ‘anno.lbl.external_id’, ‘anno.data’]
Returns:A list of the desired columns.
Return type:list of objects

Example

If you want to get only the annotation data in list style, e.g. [x, y, w, h] (if this TwoDAnnotation is a bbox):

>>> anno.to_vec('anno.data')
[0.1, 0.1, 0.2, 0.2]

To get the corresponding label names and label IDs for this annotation as well, just add the respective column names:

>>> bbox.to_vec(['anno.data', 'anno.lbl.idx', 'anno.lbl.name'])
[[0.1, 0.1, 0.2, 0.2], "[14]", "['Aeroplane']"]

LabelLeaf

class lost.db.model.LabelLeaf(idx=None, name=None, abbreviation=None, description=None, timestamp=None, external_id=None, label_tree_id=None, is_deleted=None, parent_leaf_id=None, is_root=None)[source]

A LabelLeaf

idx

ID in database.

Type:int
name

Name of the LabelLeaf.

Type:str
abbreviation
Type:str
description
Type:str
timestamp
Type:DateTime
external_id

ID in an external semantic label system (e.g. a synset ID in WordNet).

Type:str
is_deleted
Type:Boolean
is_root

Indicates if this leaf is the root of a tree.

Type:Boolean
parent_leaf_id

Reference to parent LabelLeaf.

Type:Integer
label_leafs
Type:list of LabelLeaf
to_df()[source]

Transform this LabelLeaf to a pandas DataFrame.

Returns:
Return type:pd.DataFrame
to_dict()[source]

Transform this object to a dict.

Returns:
Return type:dict
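
Example

A minimal sketch, constructing a leaf directly via the keyword arguments of the class signature; the values are illustrative:

>>> leaf = LabelLeaf(name='Aeroplane', external_id='6')
>>> d = leaf.to_dict()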

Label

class lost.db.model.Label(idx=None, dtype=None, label_leaf_id=None, img_anno_id=None, two_d_anno_id=None, annotator_id=None, timestamp_lock=None, timestamp=None, confidence=None, anno_time=None)[source]

Represents a Label that is related to an annotation.

idx

ID in database.

Type:int
dtype

lost.db.dtype.Result type of this attribute.

Type:enum
label_leaf_id

ID of related model.LabelLeaf.

img_anno_id
Type:int
two_d_anno_id
Type:int
timestamp
Type:DateTime
timestamp_lock
Type:DateTime
label_leaf

related model.LabelLeaf object.

Type:model.LabelLeaf
annotator_id

Group ID of the annotator who assigned this Label.

Type:Integer
confidence

Confidence of Annotation.

Type:float
anno_time

Duration of the annotation.

Type:float

logic.label

LabelTree

class lost.logic.label.LabelTree(dbm, root_id=None, root_leaf=None, name=None, logger=None)[source]

A class that represents a LabelTree.

Parameters:
  • dbm (lost.db.access.DBMan) – Database manager object.
  • root_id (int) – label_leaf_id of the root Leaf.
  • root_leaf (lost.db.model.LabelLeaf) – Root leaf of the tree.
  • name (str) – Name of a label tree.
  • logger (logger) – A logger.
create_child(parent_id, name, external_id=None)[source]

Create a new leaf in label tree.

Parameters:
  • parent_id (int) – ID of the parent leaf.
  • name (str) – Name of the leaf, e.g. the class name.
  • external_id (str) – Some ID of an external label system.
Returns:
lost.db.model.LabelLeaf: The created child leaf.
create_root(name, external_id=None)[source]

Create the root of a label tree.

Parameters:
  • name (str) – Name of the root leaf.
  • external_id (str) – Some ID of an external label system.
Returns:
lost.db.model.LabelLeaf or None:
The created root leaf, or None if a root leaf with the same name is already present in the database.
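
Example

A minimal sketch of building a small tree from scratch, assuming an existing database manager dbm (lost.db.access.DBMan); the external_id is purely illustrative:

>>> tree = LabelTree(dbm)
>>> root = tree.create_root('animals')
>>> cow = tree.create_child(root.idx, 'cow', external_id='n02402425')
>>> horse = tree.create_child(root.idx, 'horse')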
delete_subtree(leaf)[source]

Recursively delete all leafs in the subtree starting at leaf.

Parameters:leaf (lost.db.model.LabelLeaf) – Delete all children of this leaf. The leaf itself stays.
delete_tree()[source]

Delete the whole tree from the system.

get_child_vec(parent_id, columns='idx')[source]

Get a vector of child labels.

Parameters:
  • parent_id (int) – Id of the parent leaf.
  • columns (str or list of str) – Can be any attribute of lost.db.model.LabelLeaf, for example ‘idx’, ‘external_id’, ‘name’, or a list of these, e.g. [‘name’, ‘idx’].

Example

>>> label_tree.get_child_vec(1, columns='idx')
[2, 3, 4]
>>> label_tree.get_child_vec(1, columns=['idx', 'name'])
[
    [2, 'cow'],
    [3, 'horse'],
    [4, 'person']
]
Returns:A list of child labels with the requested columns.
Return type:list
import_df(df)[source]

Import a LabelTree from a DataFrame.

Parameters:df (pandas.DataFrame) – LabelTree in DataFrame style.
Returns:
lost.db.model.LabelLeaf or None:
The created root leaf, or None if a root leaf with the same name is already present in the database.
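
Example

A minimal sketch, assuming a CSV file that contains a LabelTree in the DataFrame layout produced by to_df(); the path is illustrative:

>>> import pandas as pd
>>> df = pd.read_csv('path/to/label_tree.csv')
>>> root = tree.import_df(df)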
to_df()[source]

Transform this LabelTree to a pandas DataFrame.

Returns:pandas.DataFrame

dtype

TwoDAnno

class lost.db.dtype.TwoDAnno[source]

Type of a TwoDAnno

BBOX

A BBox.

Type:1
POLYGON

A Polygon.

Type:2
POINT

A Point.

Type:3
LINE

A Line.

Type:4
CIRCLE

A Circle.

Type:5
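
Example

Since these values are plain integer constants, a stored dtype can be compared against them directly:

>>> from lost.db import dtype
>>> dtype.TwoDAnno.BBOX
1
>>> dtype.TwoDAnno.POINT
3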

util methods

anno_helper

A module with helper methods to transform annotations into different formats and to crop annotations from an image.

lost.pyapi.utils.anno_helper.calc_box_for_anno(annos, types, point_padding=0.05)[source]

Calculate a bounding box for an arbitrary 2D annotation.

Parameters:
  • annos (list) – List of annotations.
  • types (list) – List of types.
  • point_padding (float, optional) – In case of a point we need to add some padding to get a box.
Returns:

A list of bounding boxes in format [[xc,yc,w,h],…]

Return type:

list
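
Example

A minimal sketch, assuming types are given as strings matching the annotation type names (e.g. ‘point’); the values are illustrative:

>>> # a point annotation is padded by point_padding to form a small box
>>> boxes = calc_box_for_anno([[0.5, 0.5]], ['point'], point_padding=0.05)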

lost.pyapi.utils.anno_helper.crop_boxes(annos, types, img, context=0.0, draw_annotations=False)[source]

Crop bounding boxes for TwoDAnnos from an image.

Parameters:
  • annos (list) – List of annotations.
  • types (list) – List of types.
  • img (numpy.array) – The image where boxes should be cropped from.
  • context (float) – The context that should be added to the box.
  • draw_annotations (bool) – If True, the annotation will be painted inside the crop.
Returns:

A tuple that contains a list of image crops and a list of bboxes [[xc,yc,w,h],…]

Return type:

(list of numpy.array, list of list of float)
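
Example

A minimal sketch, assuming img was loaded as a numpy array; the bbox values are illustrative and context=0.1 enlarges each crop by 10 percent:

>>> import skimage.io
>>> img = skimage.io.imread('path/to/img.jpg')
>>> crops, boxes = crop_boxes([[0.5, 0.5, 0.2, 0.2]], ['bbox'], img, context=0.1)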

lost.pyapi.utils.anno_helper.divide_into_patches(img, x_splits=2, y_splits=2)[source]

Divide image into x_splits*y_splits patches.

Parameters:
  • img (array) – RGB image (skimage.io.imread).
  • x_splits (int) – Number of elements on x axis.
  • y_splits (int) – Number of elements on y axis.
Returns:

img_patches, box_coordinates

Image patches and the box coordinates of these patches in the image.

Return type:

list, list

Note

img_patches are in following order:
[[x0,y0], [x0,y1],…[x0,yn],…,[xn,y0], [xn, y1]…[xn,yn]]
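
Example

A minimal sketch; splitting into 2x2 yields four patches:

>>> import skimage.io
>>> img = skimage.io.imread('path/to/img.jpg')
>>> patches, boxes = divide_into_patches(img, x_splits=2, y_splits=2)
>>> len(patches)
4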
lost.pyapi.utils.anno_helper.draw_annos(annos, types, img, color=(255, 0, 0), point_r=2)[source]

Draw annotations inside an image.

Parameters:
  • annos (list) – List of annotations.
  • types (list) – List of types.
  • img (numpy.array) – The image to draw annotations in.
  • color (tuple) – (R,G,B) color that is used for drawing.
  • point_r (int) – Radius used to draw point annotations.

Note

The given image will be directly edited!

Returns:Image with drawn annotations
Return type:numpy.array
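
Example

Since the given image is edited in place, pass a copy if the original is still needed. A minimal sketch with an illustrative bbox:

>>> img_vis = draw_annos([[0.5, 0.5, 0.2, 0.2]], ['bbox'], img.copy(), color=(0, 255, 0))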
lost.pyapi.utils.anno_helper.to_abs(annos, types, img_size)[source]

Convert relative annotation coordinates to absolute ones

Parameters:
  • annos (list of list) – List of annotations in relative coordinates.
  • types (list of str) – List of annotation types.
  • img_size (tuple) – (width, height) of the image in pixels.
Returns:

Annotations in absolute format.

Return type:

list of list
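
Example

A minimal sketch: relative coordinates are scaled by the image width and height. The values are illustrative:

>>> # a relative bbox on a 640x480 image, scaled to absolute pixel coordinates
>>> abs_annos = to_abs([[0.1, 0.1, 0.2, 0.2]], ['bbox'], (640, 480))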

lost.pyapi.utils.anno_helper.trans_boxes_to(boxes, convert_to='minmax')[source]

Transform boxes from the standard LOST format into a different format.

Parameters:
  • boxes (list of list) – Boxes in standard lost format [[xc,yc,w,h],…]
  • convert_to (str) – minmax -> [[xmin,ymin,xmax,ymax]…]
Returns:

Converted boxes.

Return type:

list of list
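
Example

A minimal sketch: since [xc, yc, w, h] describes the box center and size, the minmax corners follow as xc ± w/2 and yc ± h/2. The values are illustrative:

>>> # [xc, yc, w, h] -> [xmin, ymin, xmax, ymax], here [[0.4, 0.4, 0.6, 0.6]]
>>> minmax = trans_boxes_to([[0.5, 0.5, 0.2, 0.2]], convert_to='minmax')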

blacklist

A helper module to deal with blacklists.

class lost.pyapi.utils.blacklist.ImgBlacklist(my_script, name='img-blacklist.json', context='pipe')[source]

A class to deal with image blacklists.

Such blacklists are often used for annotation loops, in order to prevent annotating the same image multiple times.

my_script

The script instance that creates this blacklist.

Type:lost.pyapi.script.Script
name

The name of the blacklist file.

Type:str
context

Options: instance, pipe, static

Type:str

Example

Add images to blacklist.

>>> blacklist = ImgBlacklist(self, name='blacklist.json')
>>> blacklist.add(['path/to/img0.jpg'])
>>> blacklist.save()

Load a blacklist and check if a certain image is already in list.

>>> blacklist = ImgBlacklist(self, name='blacklist.json')
>>> blacklist.contains('path/to/img0.jpg')
True
>>> blacklist.contains('path/to/img1.jpg')
False

Get list of images that are not part of the blacklist

>>> blacklist.get_whitelist(['path/to/img0.jpg', 'path/to/img1.jpg', 'path/to/img2.jpg'])
['path/to/img1.jpg', 'path/to/img2.jpg']

Add images to the blacklist

>>> blacklist.add(['path/to/img1.jpg', 'path/to/img2.jpg'])
add(imgs)[source]

Add a list of images to the blacklist.

Parameters:imgs (list) – A list of image identifiers that should be added to the blacklist.
contains(img)[source]

Check if the blacklist contains a specific image.

Parameters:img (str) – The image identifier
Returns:True if img in blacklist, False if not.
Return type:bool
delete_blacklist()[source]

Remove blacklist from filesystem

get_whitelist(img_list, n='all')[source]

Get a list of images that are not part of the blacklist.

Parameters:
  • img_list (list of str) – A list of images that should be checked against the blacklist.
  • n ('all' or int) – The maximum number of images that should be returned.
Returns:

A list of images that are not in the blacklist.

Return type:

list of str

remove_item(item)[source]

Remove an item from the blacklist.

Parameters:item (str) – The item/image to remove from the blacklist.
save()[source]

Write blacklist to filesystem

vis

lost.pyapi.utils.vis.boxes(script, img_anno, figsize=(15, 15), fontsize=15, label_offset=(0, 15))[source]

Draw bboxes into a matplotlib figure.

Parameters:
  • script (lost.pyapi.script.Script) – The script object that uses this method.
  • img_anno (lost.pyapi.annos.ImageAnno) – The image anno for which bboxes should be visualized.
  • figsize (tuple) – Size of the matplotlib figure
  • fontsize (int) – Font size in pixels for the label display.
  • label_offset (tuple) – Position of the label in pixels in relation to the upper left corner of the box.
Returns:

Matplotlib figure
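
Example

A minimal sketch, as it might be used inside a script; self is assumed to be the running lost.pyapi.script.Script instance and img_anno one of its input image annotations:

>>> from lost.pyapi.utils import vis
>>> fig = vis.boxes(self, img_anno, figsize=(10, 10), fontsize=12)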

lost.pyapi.utils.vis.vis_tracks(img, tracks, frame_n, dots=15, figsize=(10, 10), dot_radius=5, linewidth=2)[source]

Visualize tracks on an image.

Parameters:
  • img (array or str) – An RGB image or path to the image file.
  • tracks (array) – [[frame_n, track_id, xc, yc, w, h]…[…]] Boxes are defined relative to the image.
  • frame_n (int) – The frame number belonging to the image.
  • dots (int) – Number of dots to display, i.e. the number of past locations that will be visualized.
  • figsize (tuple) – (int,int) Size of the figure to display.
  • dot_radius (int) – Radius of the first dot.
  • linewidth (int) – Linewidth of the box to draw.
Returns:

Matplotlib figure
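
Example

A minimal sketch with two illustrative track entries in the documented [frame_n, track_id, xc, yc, w, h] format:

>>> import numpy as np
>>> tracks = np.array([
...     [0, 1, 0.50, 0.50, 0.10, 0.10],
...     [1, 1, 0.52, 0.50, 0.10, 0.10],
... ])
>>> fig = vis_tracks('path/to/img.jpg', tracks, frame_n=1)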
