📖 Glossary

Common terms used throughout the SimBA documentation. Terms defined here can be cross-referenced from anywhere in the docs with the :term: role (e.g. :term:`ROI` renders as ROI).

aggregate statistics

Session- or video-level summaries of classifier output (total time, bout counts, mean bout duration, latency, etc.) saved to the project logs.

annotation
labelling

The process of marking, frame-by-frame, whether a behavior is present or absent in a video. These human labels are the ground truth used to train a classifier.

behavior
target

The action a SimBA classifier is trained to detect (e.g. attack, grooming, freezing). Also referred to as the target.

body-part

A single tracked point on an animal (e.g. nose, left_ear, tail_base), produced by pose estimation and stored as x, y (and probability) columns.

bounding box

The smallest axis-aligned (or rotated) rectangle that encloses an animal or a set of body-parts; used for overlap, area and proximity computations.

bout

A continuous, uninterrupted episode of a behavior, i.e. a run of consecutive frames classified as the same behavior. Bout-level statistics summarise the count, duration and timing of these episodes.
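The idea of a bout as a run of consecutive positive frames can be sketched in a few lines. This is an illustrative example only, not SimBA's own implementation; the function name `find_bouts` and the 0/1 input list are assumptions made for the sketch.

```python
from itertools import groupby


def find_bouts(predictions, fps):
    """Return (start_frame, end_frame, duration_s) for each run of 1s.

    `predictions` is a hypothetical per-frame list of 0/1 classifier outputs.
    """
    bouts = []
    frame = 0
    for value, run in groupby(predictions):
        length = len(list(run))
        if value == 1:
            bouts.append((frame, frame + length - 1, length / fps))
        frame += length
    return bouts


predictions = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
print(find_bouts(predictions, fps=10))
# -> [(1, 3, 0.3), (6, 7, 0.2), (9, 9, 0.1)]
```

Counting the returned tuples gives the bout count, and averaging the third element gives the mean bout duration.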

centroid

The geometric centre of a set of body-parts (or a shape); often used as a single location for an animal.

circular statistics

Statistics for angular/directional data (degrees), where 359° and 1° are close. Used in SimBA for heading, turning and directional analyses.
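To see why angles need their own statistics, compare the ordinary mean of 359° and 1° (180°, pointing the wrong way) with a circular mean computed from unit vectors. The helper below is a minimal sketch with a hypothetical name, not a SimBA function.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction in degrees, computed by averaging unit vectors."""
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


angles = [359.0, 1.0]
arithmetic_mean = sum(angles) / len(angles)   # 180.0, misleading for headings
circular_mean = circular_mean_deg(angles)     # close to 0 degrees (equivalently 360)
print(arithmetic_mean, circular_mean)
```

Because of floating-point rounding the circular result may land marginally on either side of the 0°/360° wrap, so comparisons should use angular distance rather than exact equality.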

classifier

A supervised machine-learning model (typically a random forest) trained on annotated features to predict the presence of a behavior on each frame.

clustering
embedding

Unsupervised grouping of behavioral data without labels, e.g. projecting features with UMAP/t-SNE and clustering the result to discover behavioral motifs.

confusion matrix

A table of predicted vs. true labels (true/false positives and negatives) used to evaluate a classifier.

convex hull

The smallest convex polygon enclosing a set of body-parts; a common basis for animal area, shape and overlap metrics.

cross-validation

Splitting annotated data into train/test folds to estimate how well a classifier generalises to unseen frames, guarding against over-fitting.

DeepLabCut
DLC

A popular open-source pose estimation toolbox. SimBA imports DLC tracking data (single- and multi-animal).

directionality

Whether, and where, an animal is facing, e.g. toward another animal, a body-part or an ROI.

discrimination threshold
probability threshold

The probability cut-off above which a frame is scored as the behavior. Raising it makes detection stricter (higher precision), lowering it more permissive (higher recall).
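As a small illustration of the cut-off, the sketch below converts per-frame probabilities into 0/1 labels. The function name is hypothetical, and scoring only frames strictly above the threshold follows the wording of the definition rather than any confirmed SimBA behaviour at the boundary.

```python
def apply_threshold(probabilities, threshold):
    """Score a frame as the behavior (1) when its probability exceeds the cut-off."""
    return [1 if p > threshold else 0 for p in probabilities]


probabilities = [0.10, 0.40, 0.55, 0.70, 0.90]
print(apply_threshold(probabilities, 0.5))  # -> [0, 0, 1, 1, 1]
print(apply_threshold(probabilities, 0.8))  # -> [0, 0, 0, 0, 1]
```

Raising the threshold from 0.5 to 0.8 keeps only the most confident frame, which is the stricter, higher-precision end of the trade-off.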

egocentric alignment

Re-centering and rotating each frame so a chosen body-part is fixed in position and orientation, removing the animal’s global location/heading from the analysis.
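One way to picture the transform for a single frame: translate so an anchor body-part sits at the origin, then rotate so a second body-part lies on the positive x-axis. The choice of two body-parts (anchor plus direction), the function name and the coordinates are assumptions for this sketch; SimBA's own routine may differ.

```python
import math


def egocentric_align(points, anchor, direction):
    """Translate `anchor` to (0, 0) and rotate so `direction` lies on the +x axis.

    `points` maps hypothetical body-part names to (x, y) pixel coordinates.
    """
    ax, ay = points[anchor]
    dx = points[direction][0] - ax
    dy = points[direction][1] - ay
    angle = math.atan2(dy, dx)                # current heading of anchor -> direction
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    aligned = {}
    for name, (x, y) in points.items():
        tx, ty = x - ax, y - ay               # re-center on the anchor
        aligned[name] = (tx * cos_a - ty * sin_a, tx * sin_a + ty * cos_a)
    return aligned


frame = {"tail_base": (10.0, 10.0), "nose": (10.0, 20.0), "left_ear": (8.0, 18.0)}
aligned = egocentric_align(frame, anchor="tail_base", direction="nose")
print(aligned["nose"])  # approximately (10.0, 0.0)
```

After alignment the anchor-to-direction distance is unchanged, but the animal's position and heading in the arena no longer affect the coordinates.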

ethogram

A catalogue of the distinct behaviors an animal performs, and (in a session) their occurrence over time.

feature
feature extraction

A numeric quantity computed per frame from pose estimation data (distances, velocities, angles, areas, etc.). Feature extraction turns raw tracking into the inputs a classifier learns from.

feature importance

A ranking of how much each feature contributes to a classifier’s decisions (e.g. Gini importance, permutation importance, or SHAP).

FPS

Frames per second: the video frame rate. Required to convert frame counts to seconds and to compute time-based metrics.

FSTTC

Forward Spike Time Tiling Coefficient: a measure of the temporal association between two behaviors (how often one tends to follow another within a time window), adapted from spike-train analysis.

Gantt plot

A timeline visualization showing when each behavior occurs across a session as horizontal bars.

geometry

Representing animals/arenas as shapes (points, lines, convex hulls, polygons, circles) via Shapely, enabling area, overlap, distance and containment computations.

heatmap

A spatial visualization of where an animal spends time (location heatmap) or where a behavior occurs, binned over the arena.

interpolation

Filling in missing body-part coordinates (e.g. dropped/occluded frames) by estimating values from neighbouring frames.
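A minimal sketch of gap filling, assuming linear interpolation between the nearest tracked frames (only one of several possible methods). The function name and the use of `None` to mark missing frames are assumptions for illustration.

```python
def interpolate_linear(values):
    """Fill interior None gaps by linear interpolation between known neighbours.

    Leading and trailing gaps are left as None because they have only one neighbour.
    """
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


nose_x = [10.0, None, None, 16.0, None, 20.0]
print(interpolate_linear(nose_x))
# -> [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
```

The same function would be applied separately to the x and y columns of each body-part.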

keypoint

Synonym for body-part: a tracked point produced by pose estimation.

Kleinberg smoothing
burst detection

A burst-detection algorithm (Kleinberg, 2003) applied to classifier output to merge fragmented detections into coherent bouts and remove noise.

machine results

The per-video CSV files (in project_folder/csv/machine_results) holding the classifier predictions for each frame.

maDLC

Multi-animal DeepLabCut: the multi-animal variant of DeepLabCut.

minimum bout length

The shortest allowed bout duration; shorter detected episodes are removed as noise during post-classification smoothing.
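The filtering step can be sketched as follows. The function name, the millisecond unit for the minimum length and the 0/1 input list are assumptions made for this example, not confirmed details of SimBA's implementation.

```python
from itertools import groupby


def remove_short_bouts(predictions, fps, min_bout_ms):
    """Set runs of 1s shorter than `min_bout_ms` to 0; leave everything else unchanged."""
    min_frames = min_bout_ms * fps / 1000
    cleaned = []
    for value, run in groupby(predictions):
        run = list(run)
        if value == 1 and len(run) < min_frames:
            run = [0] * len(run)   # too short: treat as noise
        cleaned.extend(run)
    return cleaned


predictions = [0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1]
print(remove_short_bouts(predictions, fps=10, min_bout_ms=300))
# -> [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
```

At 10 FPS a 300 ms minimum equals 3 frames, so the 1-frame and 2-frame episodes are dropped and the 4-frame bout is kept.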

multi-animal tracking
identity

Tracking several animals at once while maintaining each individual’s identity across frames (and recovering it after occlusion), e.g. via maDLC or SLEAP.

occlusion

When a body-part is hidden (by another animal, an object or self) and so is poorly tracked or missing; often handled by interpolation.

outlier correction

Detecting and correcting implausible body-part coordinates (location- and movement-based) before feature extraction.

p
pose confidence

The probability/likelihood score (0–1) that pose estimation assigns to each tracked body-part, indicating tracking reliability.

path plot

A visualization tracing an animal’s movement trajectory through the arena over time.

pose estimation

Tracking the 2D positions of animal body-parts across video frames, using tools such as DeepLabCut, SLEAP or YOLO.

precision
recall
F1

Standard classification metrics. Precision = fraction of predicted-positive frames that are correct; recall = fraction of true behavior frames detected; F1 = their harmonic mean.
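The three definitions above can be worked through on a toy example. This is a plain-Python sketch with a hypothetical function name; in practice a library routine would normally be used.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary per-frame labels (1 = behavior)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # human annotations
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]   # classifier output
print(precision_recall_f1(y_true, y_pred))
# -> (0.75, 0.75, 0.75)
```

Here 3 of the 4 predicted-positive frames are correct (precision 0.75) and 3 of the 4 true behavior frames are detected (recall 0.75).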

project config

The project_config.ini file at the root of a SimBA project, storing all project settings (paths, body-parts, classifiers, thresholds).

px/mm
pixels per millimeter

The conversion factor between image pixels and real-world millimetres, used to report distances/speeds in physical units. Set per video via a known reference length.
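The arithmetic behind the conversion is short enough to show directly. The function names and the reference measurements below are made up for illustration.

```python
def pixels_per_mm(reference_px, reference_mm):
    """Conversion factor from a reference object of known real-world length."""
    return reference_px / reference_mm


def px_to_mm(distance_px, px_per_mm):
    """Convert a pixel distance to millimetres."""
    return distance_px / px_per_mm


# Hypothetical example: a 200 mm arena wall spans 400 pixels in the video.
factor = pixels_per_mm(reference_px=400, reference_mm=200)
print(factor)                 # -> 2.0
print(px_to_mm(150, factor))  # -> 75.0
```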

random forest

The default supervised algorithm behind a SimBA classifier: an ensemble of decision trees whose votes give a per-frame behavior probability.

ROI

Region of Interest: a user-defined shape (rectangle, circle or polygon) drawn on the video frame, used to quantify time spent, entries, movement and directionality within specific areas.

sequential analysis

Analysing the order and timing of behaviors, i.e. which tend to precede or follow others (see FSTTC), to uncover behavioral structure.

severity scoring

Grading the intensity of a detected behavior (e.g. attack severity) using movement/feature-based criteria.

SHAP

SHapley Additive exPlanations: a model-interpretability method giving each feature a contribution score, used in SimBA to explain why a classifier made a prediction.

SLEAP

An open-source multi-animal pose estimation framework whose output SimBA can import.

sliding window
rolling window

A fixed-length time window slid across the data to compute time-resolved features (e.g. mean velocity over the last 0.5 s).
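As a sketch of the idea, the function below computes a trailing rolling mean, where each frame's value is the mean over the preceding window. The function name is hypothetical, and letting the window shrink at the start of the series is an assumption; other edge treatments are equally valid.

```python
def rolling_mean(values, fps, window_s):
    """Trailing mean over the last `window_s` seconds, evaluated at every frame."""
    window = max(1, int(round(window_s * fps)))   # window length in frames
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


velocity = [0, 2, 4, 6, 8, 10]
print(rolling_mean(velocity, fps=4, window_s=0.5))
# -> [0.0, 1.0, 3.0, 5.0, 7.0, 9.0]
```

At 4 FPS a 0.5 s window covers 2 frames, so each output is the mean of the current and previous frame.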

smoothing

Reducing frame-to-frame jitter in tracking data (e.g. with Savitzky–Golay or Gaussian filtering) to stabilise body-part trajectories.

time bins

Dividing a session into fixed-duration intervals (e.g. 60 s) to report how metrics change over the course of a recording.
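A minimal sketch of binning, assuming per-frame 0/1 classifier output and a hypothetical function name: each bin reports the seconds of behavior it contains.

```python
def behavior_seconds_per_bin(predictions, fps, bin_s):
    """Seconds of behavior in each consecutive bin of `bin_s` seconds.

    The final bin may be shorter if the recording does not divide evenly.
    """
    frames_per_bin = max(1, int(fps * bin_s))
    return [
        sum(predictions[start:start + frames_per_bin]) / fps
        for start in range(0, len(predictions), frames_per_bin)
    ]


predictions = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]   # 5 s of video at 2 FPS
print(behavior_seconds_per_bin(predictions, fps=2, bin_s=2))
# -> [1.0, 1.5, 0.5]
```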

validation

Checking a trained classifier on a held-out or new video, including the one-click “validation video” with the predicted probability overlaid frame-by-frame.

velocity

An animal’s speed of movement (distance per unit time), typically derived from the frame-to-frame displacement of a body-part or centroid and converted to physical units (e.g. mm/s) using px/mm.
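The calculation combines three glossary terms: frame-to-frame displacement in pixels, px/mm to convert to millimetres, and FPS to convert frames to seconds. The function name and coordinates below are illustrative assumptions.

```python
import math


def velocity_mm_per_s(xy, fps, px_per_mm):
    """Speed between consecutive frames, in mm/s, from (x, y) pixel coordinates."""
    speeds = []
    for (x0, y0), (x1, y1) in zip(xy, xy[1:]):
        dist_px = math.hypot(x1 - x0, y1 - y0)   # displacement in pixels
        speeds.append(dist_px / px_per_mm * fps)  # px -> mm, per frame -> per second
    return speeds


centroid_xy = [(0, 0), (3, 4), (3, 4), (9, 12)]
print(velocity_mm_per_s(centroid_xy, fps=30, px_per_mm=2))
# -> [75.0, 0.0, 150.0]
```

A series of N positions yields N - 1 speeds, one per frame transition.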

video info

The per-project table (video_info.csv) mapping each video to its FPS, resolution and px/mm.

YOLO

A fast real-time object/keypoint detection model family; SimBA supports YOLO-based detection and pose estimation workflows.