The following file formats are used within clana.
Label Format¶
The label file format is a text format. It is used to make sense of the prediction: the n-th line names the n-th output value of the classifier, so the order of the lines matters.
Specification¶
One label per line.
It is a CSV file with ; as the delimiter and " as the quoting character.
The first value is a short version of the label. It has to be unique over all short versions.
The second value is a long version of the label. It has to be unique over all long versions.
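The rules above can be sketched with Python's standard csv module. This is only an illustration under stated assumptions: the function name read_labels is hypothetical and not part of clana, and blank lines are skipped, which the specification does not mention.

```python
import csv
import io


def read_labels(fileobj):
    """Return a list of (short, long) label pairs in file order.

    Hypothetical helper for illustration; not part of clana.
    """
    reader = csv.reader(fileobj, delimiter=";", quotechar='"')
    labels = []
    for row in reader:
        if not row:  # assumption: blank lines are ignored
            continue
        if len(row) != 2:
            raise ValueError(f"expected 2 values per line, got {len(row)}: {row}")
        labels.append((row[0], row[1]))

    # Both columns have to be unique on their own.
    shorts = [short for short, _ in labels]
    longs = [long_ for _, long_ in labels]
    if len(set(shorts)) != len(shorts):
        raise ValueError("short label versions are not unique")
    if len(set(longs)) != len(longs):
        raise ValueError("long label versions are not unique")
    return labels


sample = io.StringIO("car;car\ncat;cat\ndog;dog\nmouse;mouse\n")
labels = read_labels(sample)
print(labels)
```

The list keeps the file order, so the index of a label is also the column index of its value in the other formats.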
Example¶
Computer Vision¶
car;car
cat;cat
dog;dog
mouse;mouse
mnist.csv:
0;0
1;1
2;2
3;3
4;4
5;5
6;6
7;7
8;8
9;9
Language Identification¶
de;German
en;English
fr;French
Classification Dump Format¶
TODO: THIS IS WAY TOO BIG!
The classification dump format is a text format. It describes the output of a classifier for some inputs.
Specification¶
Each line contains exactly one output of the classifier for one input.
It is a CSV file with ; as the delimiter and " as the quoting character.
The first value is an identifier for the input. It is no longer than 60 characters.
The second and following values are the outputs for each label. Each of those values is a number in [0, 1].
The outputs are in the same order as in the related label.csv file.
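As a rough sketch of how such a file could be read and checked, the following uses Python's standard csv module. The function read_dump and the three label names are assumptions made for illustration, not clana's own API. The last step picks the label with the highest output for each input, which relies on the outputs being in label-file order.

```python
import csv
import io


def read_dump(fileobj, n_labels):
    """Return a list of (identifier, scores) pairs, validating each line.

    Hypothetical helper for illustration; not part of clana.
    """
    reader = csv.reader(fileobj, delimiter=";", quotechar='"')
    rows = []
    for row in reader:
        if not row:  # assumption: blank lines are ignored
            continue
        identifier = row[0]
        scores = [float(value) for value in row[1:]]
        if len(identifier) > 60:
            raise ValueError(f"identifier longer than 60 characters: {identifier!r}")
        if len(scores) != n_labels:
            raise ValueError(f"expected {n_labels} outputs, got {len(scores)}")
        if any(not 0.0 <= score <= 1.0 for score in scores):
            raise ValueError(f"output outside [0, 1] for {identifier!r}")
        rows.append((identifier, scores))
    return rows


# Assumed label order; in practice this comes from the related label file.
labels = ["car", "cat", "dog"]
dump = io.StringIO("identifier 1;0.1;0.3;0.6\nident 2;0.8;0.1;0.1\n")
rows = read_dump(dump, len(labels))

# Index of the largest output selects the predicted label.
top = [
    (identifier, labels[max(range(len(scores)), key=scores.__getitem__)])
    for identifier, scores in rows
]
print(top)
```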
Example¶
identifier 1;0.1;0.3;0.6
ident 2;0.8;0.1;0.1
Ground Truth Format¶
The Ground Truth Format is a text file format. It is used to describe the ground truth of data.
Specification¶
Each line contains the ground truth of exactly one element.
It is a CSV file with ; as the delimiter and " as the quoting character.
The first value is an identifier for the input. It is no longer than 60 characters.
The second and following values are the ground truth values for each label. Each of those values is a number in [0, 1].
The values are in the same order as in the related label.csv file.
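Writing such a file can be sketched with Python's csv.writer, which also shows what the quoting character is for: an identifier that contains the delimiter gets wrapped in quotes. The function write_ground_truth and the identifier image;02 are made up for illustration, and the sketch assumes the 60-character limit applies to the identifier before quoting.

```python
import csv
import io


def write_ground_truth(fileobj, rows):
    """Write (identifier, values) pairs as ;-delimited lines.

    Hypothetical helper for illustration; not part of clana.
    """
    writer = csv.writer(
        fileobj,
        delimiter=";",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\n",
    )
    for identifier, values in rows:
        # Assumption: the length limit is checked on the unquoted identifier.
        if len(identifier) > 60:
            raise ValueError(f"identifier longer than 60 characters: {identifier!r}")
        if any(not 0.0 <= value <= 1.0 for value in values):
            raise ValueError(f"value outside [0, 1] for {identifier!r}")
        writer.writerow([identifier, *values])


buffer = io.StringIO()
write_ground_truth(
    buffer,
    [("identifier 1", [1, 0, 1]), ("image;02", [0.5, 0, 0.5])],
)
text = buffer.getvalue()
print(text)
```

Reading the result back with csv.reader and the same delimiter and quoting character recovers image;02 as a single identifier.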
Example¶
identifier 1;1;0;1
identifier 1;0.5;0;0.5