Labeling tool for human activity data
This tool allows to annotate observations of human activity data by setting markers in a timeline that indicate the end of a segment. Marker can describe e.g. the types of actions, objects the observed subject has interacted with, etc. The program then generates annotations for the frames since the previous marker. The set of marker types as well as mappings to be used when exporting the annotations can be configured using YAML files.
The source code of this tool can be found at GitHub in the repository knowrob/data_labeling_tool. In Ubuntu, you need to have the following packages installed:
sudo apt-get install libgstreamer0.10-0 libgstreamer0.10-dev libgstreamer-plugins-base0.10-0 libgstreamer-plugins-base0.10-dev
To provide some test data to start with, the following archive contains a sample folder structure including a video, some first dummy annotations, and some configuration files.
For comments, questions and support queries please contact Moritz Tenorth.
- Start the program and select the video file to be labeled
- The program tries to load the configuration file from <filename>/../config/config.yaml
- If the config.yaml could not be found, it shows a dialog to search for it
- The program then looks for existing labels in the annotation folder relative to config.yaml
- Skip through the video frame by frame using left and right arrow keys
- Jump to a position by clicking in the timeline (slider)
- Start and pause playback using the space bar
- Set label markers at the end of a segment to assign a label to the segment before
- Export labels to a file (for formats see below)
- Load labels from file (just restart and use the same output folder)
- Jump between labels with pageup/down
- Remove marker with DEL, clear current label buffer with ESCAPE, commit current label buffer (i.e. set a label on the timeline) with ENTER
In order to facilitate the annotation and the processing of the resulting information, the following folder layout shall be used. Additional files and folders can of course be added, but the following are considered mandatory.
<sequence-id> |- annotations (will contain the generated labels) | |- models (will contain auto-generated BLN models) | |- labels.dat (main label file) | \- <label-seq>.owl etc (other file formats possible) | |- config | |- config.yaml | |- owl-class-mapping.yaml | \- ... | |- mocap | \- motion capture data (format TBD) | |- video |- <sequence-id>.ogv (low-res version for annotation, OGG video format) \- <sequence-id>.dv (high-res version as recorded)
The set of label categories and the label values can be configured using a set of YAML files in the config folder. Usually, such a set of configuration files needs to be created only once for a whole sequence of experiments and can be copied to each experiment data folder. Since these configuration files are stored inside the respective data folders, the label domains for the annotations are also inherently documented.
The format of the config.yaml is the following:
# list mappings between internal IDs and # displayed text for each category label-categories: actionIDs : Actions obj1IDs : Objects_1 # ... # Define which class denotes the action (becomes main index) action-category: actionIDs # define domain of label entries for each category label-values: actionIDs: - None - Take - Put # ... obj1IDs: - None - Cup_1 - Cup_2 # ...
The labels used for the annotation are common English words – to generate action representations that can be used in the knowledge base, they need to be mapped against the concepts in the ontology, which are usually called slightly differently. This mapping can be defined in the owl-mapping.yaml file which contains one mapping per line, grouped into three categories (classes, properties, and individuals). Each of these categories needs to exist, though may be empty.
classes: put: PuttingSomethingSomewhere, take: TakingSomething, stir: Stirring, box: BoxTheContainer, bag: PlasticBag, bowl: Bowl-Mixing properties: with: primaryObjectMoving, into: toLocation, out_of: fromLocation individuals: cupboard1: cupboard23
Annotation file formats
All labels are stored as the concatenation of the different label-categories with a hyphen (-) in between. The annotation using a kind of 'restricted natural language' (i.e. action, object, and possibly preposition and another object) can easily be translated into relational representations and ontology-based annotations by defining a mapping between the identifiers in the ontology and the labels used by the human annotator.
1 32 Put-Sugar-Into-Cup_1-Sequence_omission 33 82 Put-Milk-Into-Coffee_jar-Misuse1 ...
In this format, there is one entry per marker that has been set in the labeling tool. The first number is the first frame in this segment, the second one the last frame, followed by the label (see above). This is the format that is used internally by the program to serialize and re-load a set of annotations.
If a labels.dat file is found in the output folder, it is automatically parsed and the markers are set accordingly. This can be very convenient for inspecting a set of labels, e.g. for checking correctness.
<code> … Put-Sugar-Into-Cup_1-Sequence_omission Put-Sugar-Into-Cup_1-Sequence_omission Put-Sugar-Into-Cup_1-Sequence_omission Put-Milk-Into-Coffee_jar-Misuse1 Put-Milk-Into-Coffee_jar-Misuse1 Put-Milk-Into-Coffee_jar-Misuse1 … <code>
This format contains one label per video frame, each on a separate line.
The OWL exporter outputs a description of the task in terms of action instances in OWL, linked to the respective objects and additional information. This file can be loaded into KnowRob and be used for reasoning, e.g. using the mod_ham package from the knowrob_human repository. For more information on the OWL representation of actions, have a look here.
This exporter outputs the data in terms of BLOG files that can be used for training Bayesian Logic Networks (BLN). It relies on having a complete mapping of all labels that appear in the data to OWL classes, properties or individual (see owl-mapping.yaml)