Provenance Database Schema

Main database

Below we describe the JSON schema for the anomalies, normalexecs and metadata collections of the main database component of the provenance database.

Function event schema

This section describes the JSON schema for the anomalies and normalexecs collections. The fields of the JSON object are bolded, and a brief description follows the colon (:).

Function execution “events” in Chimbuko are labeled by a unique (for each process) string of following form “$RANK:$IO_STEP:$IDX” (eg “0:12:225”), where RANK, IO_STEP and IDX are the MPI rank, the io step and an integer index, respectively, and $VAL indicates the numerical value of the variable VAL. We will refer to such a string as an “event label” below.

{ start of schema

“__id”: Record index assigned by Sonata,
“version”: The schema version,
“call_stack”:    Function execution call stack (starting with anomalous function execution),
[
{
“entry”: timestamp of function execution entry ,
“exit”: timestamp of function execution exit (0 if has not exited at time of write) ,
“fid”: Global function index (can be used as a key instead of function name),
“func”: function name,
“event_id”: Event label (see above),
“is_anomaly”: True/false depending on whether event is anomalous (applies only to executions that have exited by time of write)
},
….
],
“counter_events”: [  An array of counter data received on the specific process thread during function execution
{
“counter_idx”: An index used internally to index counters,
“counter_name”: A string describing the counter,
“counter_value”: The value of the counter (integer),
“pid”: process index,
“rid”: process rank,
“tid”: process thread,
“ts”: timestamp
},
…
],
“entry”: Timestamp of function execution entry,
“exit”: Timestamp of function execution exit,
“event_id”: Event label (see above),
“fid”: Global function index (can be used as a key instead of function name),
“func”: function name,
“algo_params”:   The parameters used by the outlier detection algorithm to classify this event (format is algorithm dependent, see below),
“is_gpu_event”: true or false depending on whether function executed on a GPU
“gpu_location”: if a GPU event, a JSON description of the context (see below), otherwise null,
“gpu_parent”: if a GPU event, a JSON description of the parent CPU function (see below), otherwise null,
“pid”: process index,
“rid”: process rank,
“tid”: thread index
“hostname”: The hostname of the node on which the application was executing
“runtime_exclusive”: Function execution time exclusive of children,
“runtime_total”: Function total execution time,
“io_step”: IO step index,
“io_step_tstart”: Time of start of IO step,
“io_step_tend”:  Time of end of IO step,
“outlier_score”: The anomaly score of the execution reflecting how unlikely the event is (algorithm dependent, larger is more anomalous),
“outlier_severity”: The severity of the anomaly, reflecting how important the anomaly was,
“event_window”: Capture of function executions and MPI comms events occuring in window around anomaly on same thread (object)
{
“exec_window”: The function executions in the window arranged in order of their entry time (array)
[
{
“entry”: timestamp of function execution entry ,
“exit”: timestamp of function execution exit (0 if has not exited at time of write) ,
“fid”: Global function index (can be used as a key instead of function name),
“func”: function name,
“event_id”: Event label (see above),
“parent_event_id”: Event label of parent function execution,
“is_anomaly”: True/false depending on whether event is anomalous (applies only to executions that have exited by time of write)
},
…
],
“comm_window”: The MPI communications in the window (array)
[
{
type: SEND or RECV,
pid: process index,
rid: rank of current process,
tid: thread idx,
src: message origin rank,
tar: message target rank,
bytes: message size,
tag: an integer tag associated with the message,
timestamp: time MPI function executed,
execdata_key: the ID label of the parent function
},
…
]
} end of event_window object
“node_state”: The state of the node provided by TAU’s monitoring plugin. This is null if no state information is available. (object)
{
“data”: A list of fields and values (list)
[
{
“field”: The field name, e.g. “Memory Available (MB)”
“value”: The value
},
…
],
timestamp: The timestamp of the most recent state update
}

} end of schema

For the SSTD (original) algorithm, the algo_params field has the following format:

{

“accumulate”: not used at present,
“count”: number of times function encountered (global),
“kurtosis”: kurtosis of distribution,
“maximum”: maximum of distribution,
“mean”: mean of distribution,
“minimum”: minimum of distribution,
“skewness”: skewness of distribution,
“stddev”: standard deviation of distribution

}

For the HBOS and COPOD algorithms, the algo_params field has the following format:

{

“histogram”: the histogram,

{

“Histogram Bin Counts” : the height of the histogram bin (array) ,

“Histogram Bin Edges” : the edges of the bins starting with the lower edge of bin 0 followed by the upper edges of bins 0..N (array)

},

“internal_global_threshold” : a score threshold for anomaly detection used internally

}

The schema for the gpu_location field is as follows:

{

“context”: GPU device context (NVidia terminology),
“device”: GPU device index,
“stream”: GPU device stream (NVidia terminology),
“thread”: virtual thread index assigned to this context/device/stream by Tau

}

and for the gpu_parent field:

{

“event_id”: The event label (see above) of the parent function execution,
“tid”: Thread index for CPU parent function,
“call_stack”:    Parent function call stack (starting with parent function execution),
[
{
“entry”: timestamp of function execution entry ,
“exit”: timestamp of function execution exit (0 if has not exited at time of write) ,
“fid”: Global function index (can be used as a key instead of function name),
“func”: function name,
“event_id”: The event label
},
….
]

}

Note that Tau considers a GPU device/context/stream much in the same way as a CPU thread, and assigns it a unique index. This index is the “thread index” for GPU events.

Metadata schema

Metadata are stored in the metadata collection in the following JSON schema:

{

“descr”: String description (key) of metadata entry
“pid”: Program index from which metadata originated,
“rid”: Process rank from which metadata originated,
“tid”: Process thread associated with metadata,
“value”: Value of the metadata entry,
“__id”: Record index assigned by Sonata*

}

Note that the tid (thread index) for metadata is usually 0, apart from for metadata associated with a GPU context/device/stream, for which the index is the virtual thread index assigned by Tau to the context/device/stream.

Global database

Below we describe the JSON schema for the func_stats, counter_stats and ad_model collections of the global database component of the provenance database.

A common data structure RunStats is used extensively to represent statistics (mean, min/max, std. dev., etc) of some quantity. It has the following schema:

{

‘accumulate’: The sum of all values (same as mean * count). In some cases this entry is not populated,
‘count’: The number of values,
‘kurtosis’: kurtosis of the distribution of values,
‘maximum’: maximum value,
‘mean’: average value,
‘minimum’: minimum value,
‘skewness’: skewness of distribution of values,
‘stddev’: standard deviation of distribution of values

}

Function profile statistics schema

func_stats contains aggregated profile information and anomaly information for all functions. The JSON schema is as follows:

{

“__id”: record index,
“app”: application/program index,
“fid”: function index,
“fname”: function name,
“anomaly_metrics”: statistics on anomalies for this function (object). Note this entry is null if no anomalies were detected
{
“anomaly_count”: statistics on the anomaly count for time steps in which anomalies were detected, as well as the total number of anomalies (RunStats)
“first_io_step”: the first IO step in which an anomaly was detected,
“last_io_step”: the last IO step in which an anomaly was detected,
“max_timestamp”: the last anomaly’s timestamp,
“min_timestamp”: the first anomaly’s timestamp,
“score”: statistics on the scores for the anomalies (RunStats),
“severity”: statistics on the severity of the anomalies (RunStats),
},
“runtime_profile”: statistics on function runtime (i.e. the function profile)  (object)
{
“exclusive_runtime”: statistics on the runtime excluding child function calls (RunStats),
“inclusive_runtime”: statistics on the runtime including child function calls (RunStats)
}

}

Counter statistics schema

The counter_stats collection has the following schema:

{

‘app’: Program index,
‘counter’: Counter description,
‘stats’:   Global aggregated statistics on counter values since start of run (RunStats)

}

AD model schema

The ad_model collection contains the final AD model for each function. It has the following schema:

{

“__id”: A unique record index,
“pid”: The program index,
“fid”: The function index,
“func_name”: The function name,
“model” : The model for this function, form dependent on algorithm used (object)

}

The “model” entry has the same form as the “algo_params” entry of the main database, and is documented above.