YOLO methods

Methods for training YOLO models, creating training and validation datasets, and converting behavioral neuroscience specific datasets to YOLO datasets.

Utilities

simba.utils.yolo.apply_fixed_bbox_size(data, video_name, img_w, img_h, bbox_size)

Apply a fixed axis-aligned bounding-box size to detected rows in a results table.

The current box center is preserved, then the box is resized to bbox_size (h, w). If the resized box would exceed frame boundaries, the box is shifted so it remains fully inside the image while preserving the requested size.

The function expects YOLO corner columns X1..Y4 and updates them in-place on the input dataframe before returning it.

Parameters
  • data (pd.DataFrame) – Detection dataframe containing CONFIDENCE and corner coordinate columns X1, Y1, X2, Y2, X3, Y3, X4, Y4.

  • video_name (str) – Video identifier used in error messages.

  • img_w (int) – Image width in pixels.

  • img_h (int) – Image height in pixels.

  • bbox_size (Tuple[int, int]) – Target fixed bounding-box size as (height, width) in pixels.

Returns

Input dataframe with updated fixed-size bbox coordinates for detected rows.

Return type

pd.DataFrame

Raises

InvalidInputError – If required columns are missing or if bbox_size is larger than image dimensions.
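
The center-preserving resize and boundary shift described above can be sketched for a single box as follows. This is a minimal illustration, not the SimBA implementation: the function name fixed_size_box is hypothetical, it works on one box center rather than on the X1..Y4 dataframe columns, and it raises ValueError where the real function raises InvalidInputError.

```python
from typing import Tuple


def fixed_size_box(cx: float, cy: float, img_w: int, img_h: int,
                   bbox_size: Tuple[int, int]) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) for a box of bbox_size=(h, w)
    centered on (cx, cy), shifted if needed to stay inside the image."""
    box_h, box_w = bbox_size
    if box_w > img_w or box_h > img_h:
        raise ValueError("bbox_size is larger than the image")
    # Top-left corner that keeps the current center.
    x_min = int(round(cx - box_w / 2))
    y_min = int(round(cy - box_h / 2))
    # Shift (never shrink) the box so it stays fully inside the frame.
    x_min = min(max(x_min, 0), img_w - box_w)
    y_min = min(max(y_min, 0), img_h - box_h)
    return x_min, y_min, x_min + box_w, y_min + box_h
```

A box centered near a frame edge keeps its requested size and is moved inward, which is the behavior the description above calls for.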

simba.utils.yolo.create_yolo_sample_visualizations(samples, save_dir, names=None, palette='Set1', seg_opacity=0.5, draw_labels=True, verbose=True, source='')

Create annotated visualizations from YOLO-format (image, label_str) samples.

Auto-detects annotation type (bounding-box or segmentation) from the label string format and draws the appropriate overlays. Images are saved as PNG files in save_dir.

Parameters
  • samples (List[Tuple[str, np.ndarray, str]]) – List of (sample_name, image, label_str) tuples produced by a SAM3-to-YOLO converter.

  • save_dir (Union[str, os.PathLike]) – Directory where annotated images are saved. Created if it does not exist.

  • names (Optional[Tuple[str, ...]]) – Class names in index order. Required when draw_labels=True; otherwise optional and only used to size the color palette. Default None.

  • palette (str) – Color palette name. Default 'Set1'.

  • seg_opacity (float) – Opacity of filled segmentation polygons (0.0–1.0). Default 0.5.

  • draw_labels (bool) – If True, draw the class name text alongside each box/polygon. Default True.

  • verbose (bool) – Print progress messages. Default True.

  • source (str) – Caller class name for log messages.

simba.utils.yolo.detect_yolo_project_type(label_path)

Detect YOLO project type (bbox, keypoint, or segmentation) from a single label file.

  • bbox: class_id + 4 values (x_center, y_center, w, h)

  • keypoint: class_id + 4 values + N*3 keypoints (x, y, visibility)

  • segmentation: class_id + N*2 polygon vertices (N >= 3)
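
The three layouts above can be told apart largely by counting the values on a label line. The sketch below is one possible heuristic, not the SimBA implementation: guess_label_type is a hypothetical name, and the visibility-flag check is an assumption added here because some value counts (for example 10 values after class_id) fit both the keypoint and the segmentation layout. It can misclassify a polygon whose coordinates happen to be exactly 0 or 1 in every third position.

```python
def guess_label_type(label_line: str) -> str:
    """Classify one YOLO label line as 'bbox', 'keypoint' or 'segmentation'."""
    values = label_line.split()
    n = len(values) - 1  # number of values after class_id
    if n == 4:
        return "bbox"
    # Keypoint rows: 4 bbox values + N triplets whose third value is a visibility flag.
    if n > 4 and (n - 4) % 3 == 0:
        flags = [float(v) for v in values[5:][2::3]]
        if all(f in (0.0, 1.0, 2.0) for f in flags):
            return "keypoint"
    # Segmentation rows: N >= 3 polygon vertices, each an (x, y) pair.
    if n >= 6 and n % 2 == 0:
        return "segmentation"
    raise ValueError(f"Unrecognized YOLO label line with {n} values")
```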

simba.utils.yolo.export_yolo_model(model_path, export_format, imgsz=256, device=0, int8=False, batch=1, workspace=8, data=None, task=None, dynamic=False, simplify=True, half=False)

Export a YOLO model using Ultralytics model.export.

Wrapper around Ultralytics export that supports common deployment formats (including ONNX and TensorRT engine).

Note

INT8 export is only valid for engine format and cannot be combined with half=True.

Important

When exporting a segmentation model, the imgsz parameter is critical for mask quality. Segmentation requires pixel-level precision along object boundaries, so spatial detail lost to downscaling hurts segmentation far more than detection or pose tasks. Set imgsz as large as your GPU memory allows. The default 256 may be too coarse for high-quality segmentation masks.

Parameters
  • model_path (Union[str, os.PathLike]) – Path to source YOLO weights (typically .pt).

  • export_format (Literal["onnx", "engine", "torchscript", "onnxsimplify", "coreml", "openvino", "pb", "tf", "tflite", "torch"]) – Target export format.

  • imgsz (int) – Export input image size in pixels.

  • device (Union[Literal['cpu'], int]) – Export device ('cpu' or CUDA index).

  • int8 (bool) – If True, request INT8 TensorRT export. Requires export_format='engine'.

  • batch (int) – Export batch/profile size (must be >= 1). For INT8, ensure calibration data size is at least this value.

  • workspace (int) – TensorRT workspace budget in GB (must be >= 1).

  • data (Optional[Union[str, os.PathLike]]) – Optional dataset yaml path used for export/calibration.

  • task (Optional[Literal["detect", "segment", "classify", "pose", "obb"]]) – Optional explicit YOLO task. Set this to avoid backend task auto-guessing warnings.

  • dynamic (bool) – If True, build with dynamic input profiles.

  • half (bool) – If True, request FP16 export where supported.

Returns

Path-like export artifact returned by Ultralytics.

Return type

Union[str, os.PathLike]

Raises
Example

>>> export_yolo_model(
...     model_path=r"F://netholabs\primintellect_test\mdl\weights\best.pt",
...     export_format='engine',
...     imgsz=256,
...     device=0,
...     int8=True,
...     batch=4,
...     workspace=8,
...     task='detect',
...     dynamic=False,
...     half=False
... )
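
The constraints stated in the Note and in the parameter descriptions (INT8 only with the engine format, INT8 not combined with half=True, batch and workspace at least 1) can be checked before an export is started. The sketch below is illustrative only: check_export_args is a hypothetical helper, and it raises ValueError because the exception type used by export_yolo_model is not stated in this section.

```python
def check_export_args(export_format: str, int8: bool = False, half: bool = False,
                      batch: int = 1, workspace: int = 8) -> None:
    """Raise ValueError for option combinations the documentation rules out."""
    if int8 and export_format != "engine":
        raise ValueError("int8=True requires export_format='engine'")
    if int8 and half:
        raise ValueError("int8=True cannot be combined with half=True")
    if batch < 1:
        raise ValueError("batch must be >= 1")
    if workspace < 1:
        raise ValueError("workspace must be >= 1")
```

Running such a check first avoids discovering an invalid combination only after a lengthy TensorRT build has begun.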

simba.utils.yolo.filter_yolo_keypoint_data(bbox_data, keypoint_data, class_id=None, confidence=None, class_idx=None, confidence_idx=None)

Helper that filters YOLO bounding-box and keypoint data by class ID and/or confidence threshold.

Parameters
  • bbox_data (np.ndarray) – A 2D array of shape (N, M) representing YOLO bounding box data, where each row corresponds to one detection and contains class and confidence values.

  • keypoint_data (np.ndarray) – A 3D array of shape (N, K, 3) representing keypoints for each detection, where K is the number of keypoints per detection.

  • class_id (Optional[int]) – Target class ID to filter detections. Defaults to None.

  • confidence (Optional[float]) – Minimum confidence threshold to keep detections. Must be in [0, 1]. Defaults to None.

  • confidence_idx (int) – Index in bbox_data where confidence value is stored. Defaults to 5.

  • class_idx (int) – Index in bbox_data where class ID is stored. Defaults to 6.


simba.utils.yolo.fit_yolo(weights_path, model_yaml, save_path, epochs=25, batch=16, plots=True, imgsz=640, format=None, device=0, verbose=True, workers=8)

Trains a YOLO model using specified initial weights and a configuration YAML file.

See also

For the recommended wrapper class with parameter validation, see simba.model.yolo_fit.FitYolo.

Parameters
  • weights_path – Path to the pre-trained YOLO model weights (usually a .pt file). Example weights can be found [here](https://huggingface.co/Ultralytics).

  • model_yaml – YAML file containing paths to the training, validation, and testing datasets and the object class mappings. Example YAML file can be found [here](https://github.com/sgoldenlab/simba/blob/master/misc/ex_yolo_model.yaml).

  • save_path – Directory path where the trained model, logs, and results will be saved.

  • epochs – Number of epochs to train the model. Default is 25.

  • batch – Batch size for training. Default is 16.

Returns

None. The trained model and associated training logs are saved in the specified save_path.

Example

>>> fit_yolo(weights_path=r"C:/troubleshooting/coco_data/weights/yolov8n-obb.pt", model_yaml=r"C:/troubleshooting/coco_data/model.yaml", save_path=r"C:/troubleshooting/coco_data/mdl", batch=16)

simba.utils.yolo.get_yolo_imgsz_and_batch_size(model, raise_error=True)

Attempt to read the image size and batch size baked into a YOLO model.

Note

For .engine (TensorRT) files both values are read straight from the embedded header and represent the fixed input bindings. For .pt and other formats the values are scraped from the training arguments, so imgsz reflects the training size (a sensible default, not a hard constraint) and batch is frequently unavailable.

See also

read_yolo_metadata() (full metadata dictionary)

Parameters
  • model (Union[str, os.PathLike, YOLO]) – Path to a YOLO model file, or an already-loaded ultralytics.YOLO instance.

  • raise_error (bool) – If True (default), raise InvalidInputError when imgsz or batch cannot be found in the model metadata. If False, missing values are returned as None.

Returns

Tuple of (imgsz, batch_size), each an int (or None if not found and raise_error is False).

Return type

Tuple[Optional[int], Optional[int]]

Raises

InvalidInputError – If raise_error is True and imgsz and/or batch is not present in the model metadata.

Example

>>> get_yolo_imgsz_and_batch_size(r'/models/best.engine')
(256, 192)
>>> get_yolo_imgsz_and_batch_size(r'/models/best.pt', raise_error=False)
(640, None)

simba.utils.yolo.keypoint_array_to_yolo_annotation_str(x, img_h, img_w, padding=None)

Convert a set of keypoints into a YOLO-format annotation string that includes the normalized bounding box and keypoints.

[x_center y_center width height x1 y1 v1 x2 y2 v2 ... xn yn vn]
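
The string layout above can be produced from a keypoint list as sketched below. This is a simplified illustration rather than the SimBA code: keypoints_to_yolo_str is a hypothetical name, the bounding box is assumed to be the tightest box around all keypoints regardless of visibility, the padding option is left out, and six-decimal formatting is an arbitrary choice.

```python
from typing import List, Sequence


def keypoints_to_yolo_str(keypoints: Sequence[Sequence[float]], img_h: int, img_w: int) -> str:
    """Build 'x_center y_center width height x1 y1 v1 ...' with normalized coordinates."""
    xs = [kp[0] for kp in keypoints]
    ys = [kp[1] for kp in keypoints]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    # Bounding box enclosing all keypoints, normalized to [0, 1].
    fields: List[str] = [
        f"{(x_min + x_max) / 2 / img_w:.6f}",
        f"{(y_min + y_max) / 2 / img_h:.6f}",
        f"{(x_max - x_min) / img_w:.6f}",
        f"{(y_max - y_min) / img_h:.6f}",
    ]
    # Each keypoint contributes normalized x, normalized y and its visibility flag.
    for x, y, v in keypoints:
        fields += [f"{x / img_w:.6f}", f"{y / img_h:.6f}", str(int(v))]
    return " ".join(fields)
```

For N keypoints the result has 4 + 3N space-separated fields, matching the layout shown above.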

Parameters
  • x (np.ndarray) – Array of keypoints with shape (N, 3), where each row contains (x, y, visibility).

  • img_h (int) – Height of the image.

  • img_w (int) – Width of the image.

  • padding (Optional[float]) – Optional padding factor (between 0.0 and 1.0) to expand the bounding box around the keypoints.

Returns

YOLO string representation of the pose-estimation data including bounding box and keypoints.

Return type

str

Example

>>> x = np.array([[100, 200, 2], [150, 250, 2], [120, 240, 1]])
>>> keypoint_array_to_yolo_annotation_str(x=x, img_h=480, img_w=640)

simba.utils.yolo.load_yolo_model(weights_path, verbose=True, format=None, device=0)

Load a YOLO model.

Parameters
  • weights_path (Union[str, os.PathLike]) – Path to model weights (.pt, .engine, etc).

  • verbose (bool) – Whether to print loading info.

  • format (Optional[str]) – Export format, one of VALID_FORMATS or None to skip export.

  • device (Union[Literal['cpu'], int]) – Device to load the model on: 'cpu' or an integer GPU index.

Example

>>> load_yolo_model(weights_path=r"/mnt/c/troubleshooting/coco_data/mdl/train8/weights/best.pt", format="onnx", device=0)

simba.utils.yolo.read_yolo_metadata(model)

Read metadata from a YOLO model file or loaded YOLO instance.

Supports .engine (TensorRT), .pt (PyTorch), .onnx, .torchscript, and any other format that ultralytics.YOLO can load. For .engine files the embedded JSON header is read directly without loading the model. For all other formats the model is loaded via Ultralytics to extract metadata.

Parameters

model (Union[str, os.PathLike, YOLO]) – Path to a YOLO model file, or an already-loaded ultralytics.YOLO instance.

Returns

Dictionary of model metadata. Common keys: batch, imgsz, task, names, stride, fp16, dynamic.

Return type

dict

Raises

InvalidInputError – If model is not a YOLO instance, not a valid path, or has an unsupported extension.

Example

>>> meta = read_yolo_metadata('/models/best.engine')
>>> meta['batch']
192
>>> meta['imgsz']
[256, 256]
>>> meta = read_yolo_metadata('/models/best.pt')
>>> meta['task']
'detect'
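
Reading the embedded JSON header of an .engine file without loading the model can be sketched as below. The sketch assumes the layout Ultralytics uses when it writes TensorRT engines (a 4-byte little-endian length, followed by that many bytes of UTF-8 JSON, followed by the serialized engine); read_engine_header is a hypothetical name and is not the SimBA function.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union


def read_engine_header(engine_path: Union[str, Path]) -> Dict[str, Any]:
    """Return the JSON metadata stored at the start of an Ultralytics-exported engine."""
    with open(engine_path, "rb") as f:
        # First 4 bytes: little-endian length of the JSON metadata block.
        meta_len = int.from_bytes(f.read(4), byteorder="little")
        # The next meta_len bytes hold the metadata; the engine itself follows.
        return json.loads(f.read(meta_len).decode("utf-8"))
```

Because only the first few hundred bytes are read, this is much cheaper than deserializing the engine, which is why keys such as batch and imgsz can be inspected quickly for .engine files.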

simba.utils.yolo.yolo_predict(model, source, half=False, batch_size=4, stream=False, imgsz=640, iou=0.75, device=0, threshold=0.25, max_detections=300, verbose=True, retina_msk=False)

Produce YOLO predictions.

Parameters
  • model (Union[str, os.PathLike]) – Loaded ultralytics.YOLO model. Returned by load_yolo_model().

  • source (Union[str, os.PathLike, np.ndarray]) – Path to video, video stream, directory, image, or image as loaded array.

  • half (bool) – Whether to use half precision (FP16) for inference to speed up processing.

  • stream (bool) – If True, return a generator that yields results one by one. Useful for streams or large videos.

  • imgsz (int) – Size to resize input images to (square dimension). Must be a positive integer.

  • iou (float) – IoU threshold for non-maximum suppression. Relevant when max_detections > 1: it sets how much bounding boxes may overlap and still be kept as separate detections (e.g., multiple animals).

  • batch_size (Optional[int]) – If stream is False, then the number of images to process in each batch.

  • device (Union[Literal['cpu'], int]) – Device identifier for inference. Use 'cpu' to force CPU inference, or the integer index of the GPU device (e.g., 0 for 'cuda:0').

  • threshold (float) – Confidence threshold for filtering predictions. Only detections with confidence >= threshold are returned. Must be between 0.0 and 1.0.

  • max_detections (int) – Maximum number of detections per image/frame to return.

  • verbose (bool) – If True, print inference progress and summary information.

Returns

YOLO results or generator of YOLO results.

Bounding-box inference

class simba.model.yolo_inference.YoloInference(weights, video_path, verbose=False, save_dir=None, half_precision=True, device=0, batch_size=400, core_cnt=8, threshold=0.25, max_detections=300, max_per_class=None, smoothing_method=None, smoothing_time_window=None, interpolate=False, imgsz=320, bbox_size=None, stream=True)

Bases: object

Performs object detection inference on a video using a YOLO model.

YOLO-based object detection (bounding-box) on one or more video files. It supports GPU acceleration, batch processing, streaming, and optional result saving. The model returns bounding box coordinates and class confidence scores for each frame. Results can be smoothed or interpolated to handle detection gaps.

See also

To perform bounding box and keypoint (pose) detection, see YOLOPoseInference(). To perform keypoint (pose) detection with tracking, see YOLOPoseTrackInference(). To visualize bounding boxes only, see YOLOVisualizer().

EXPECTED RUNTIMES

VIDEOS (COUNT)    FRAMES (COUNT)    TIME (S)       STDEV (S)
1                 9000              19.69          0.185202592
2                 18000             39.91333333    0.718424202
3                 27000             59.20333333    0.29143324
4                 36000             80.82          1.407870733

BATCH SIZE: 500; IMGSZ: 256; NVIDIA GeForce RTX 4070; CPU COUNT (LOADERS): 16; 3 runs
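
To compare these timings with other hardware or settings, it can help to convert them to throughput. The short calculation below uses only the mean times from the runtime table above; on this benchmark it works out to roughly 450 frames per second, with runtime growing close to linearly in the number of frames.

```python
# (videos, frames, mean seconds) rows copied from the EXPECTED RUNTIMES table.
runtimes = [
    (1, 9000, 19.69),
    (2, 18000, 39.91333333),
    (3, 27000, 59.20333333),
    (4, 36000, 80.82),
]

# Throughput per row: total frames divided by mean wall-clock time.
fps = [frames / seconds for _, frames, seconds in runtimes]

for (videos, _, _), rate in zip(runtimes, fps):
    print(f"{videos} video(s): {rate:.0f} frames/s")
```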

Parameters
  • weights (Union[str, os.PathLike, YOLO]) – Path to YOLO model weights or a preloaded ultralytics.YOLO model instance.

  • video_path (Union[Union[str, os.PathLike], List[Union[str, os.PathLike]]]) – Input video path, list of paths, or directory containing videos.

  • verbose (Optional[bool]) – If True, print progress information.

  • save_dir (Optional[Union[str, os.PathLike]]) – Directory to save output CSV files. If None, results are returned in-memory.

  • half_precision (Optional[bool]) – If True, run inference in fp16 where supported.

  • device (Union[Literal['cpu'], int]) – Inference device ('cpu' or CUDA index).

  • batch_size (Optional[int]) – Number of frames per prediction batch.

  • core_cnt (int) – CPU thread count used by torch.

  • threshold (float) – Detection confidence threshold in [0.0, 1.0].

  • max_detections (int) – Maximum detections per frame (total, across all classes) returned by the model.

  • max_per_class (Optional[int]) – Maximum number of detections to retain per class per frame. E.g., if one 'resident' and one 'intruder' are expected, set this to 1. Defaults to None, meaning all detected instances of each class are retained (up to max_detections).

  • smoothing_method (Optional[Literal['savitzky-golay', 'bartlett', 'blackman', 'boxcar', 'cosine', 'gaussian', 'hamming', 'exponential']]) – Optional temporal smoothing method for bbox coordinates.

  • smoothing_time_window (Optional[int]) – Smoothing window in milliseconds. Used only when smoothing_method is not None.

  • interpolate (bool) – If True, interpolate missing bbox coordinates (nearest, per class).

  • imgsz (int) – Model inference image size.

  • bbox_size (Optional[Tuple[int, int]]) – Optional fixed bbox size (height, width) in pixels applied to detected boxes.

  • stream (Optional[bool]) – If True, use streaming predictions.

Returns

If save_dir is None, returns a dict mapping video name to result dataframe. Otherwise saves CSVs and returns None.

Return type

Union[None, Dict[str, pd.DataFrame]]
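
The effect of max_per_class on a single frame can be illustrated as follows. This is a sketch under assumptions, not the SimBA implementation: keep_top_per_class and the (class name, confidence) record are hypothetical, and ranking detections by confidence before applying the per-class cap is an assumption, since the parameter description only states that a maximum number per class is retained.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Detection = Tuple[str, float]  # (class name, confidence); hypothetical minimal record


def keep_top_per_class(detections: List[Detection], max_per_class: Optional[int]) -> List[Detection]:
    """Keep at most max_per_class detections of each class, most confident first."""
    if max_per_class is None:
        return list(detections)
    kept: List[Detection] = []
    counts: Dict[str, int] = defaultdict(int)
    # Visit detections from most to least confident.
    for cls, conf in sorted(detections, key=lambda d: d[1], reverse=True):
        if counts[cls] < max_per_class:
            kept.append((cls, conf))
            counts[cls] += 1
    return kept
```

With one 'resident' and one 'intruder' expected, max_per_class=1 discards the weaker duplicate of each class while leaving the other class untouched.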

Example

>>> video_path = "/mnt/d/netholabs/yolo_videos/input/mp4_20250606083508/2025-05-28_19-50-23.mp4"
>>> i = YoloInference(
...     weights=r"/mnt/c/troubleshooting/coco_data/mdl/train8/weights/best.pt",
...     video_path=video_path,
...     save_dir=r"/mnt/c/troubleshooting/coco_data/mdl/results",
...     verbose=True,
...     device=0,
...     interpolate=True,
...     bbox_size=(128, 128)
... )
>>> i.run()

NVDEC GPU-accelerated YOLO inference

class simba.model.yolo_nvdec_inference.YoloNVDECInference(video_path, engine_path, save_dir=None, task='detect', imsz=None, batch_size=None, max_workers=None, gpu_id=0, conf_threshold=0.05, iou_threshold=0.45, keypoint_names=None, vertice_cnt=60, max_detections=None, segment_smoothing=None, interpolate=True, recursive=False, smoothing_method=None, smoothing_time_window=None, verbose=True)

Bases: object

GPU-accelerated YOLO inference on videos using NVDEC decode + TensorRT.

Decodes video frames on GPU via NVDEC (PyNvVideoCodec), runs YOLO detection, pose-estimation, or segmentation through a TensorRT engine with GPU-side letterboxing and NMS, and stores per-frame results as DataFrames.
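
Letterboxing resizes a frame to fit a square model input while keeping its aspect ratio, padding the remainder. The arithmetic can be sketched on the CPU as below; this is a generic illustration of the technique, not the GPU-side code used by the class, and letterbox_params and to_source_coords are hypothetical names. Centered padding is assumed.

```python
from typing import Tuple


def letterbox_params(src_w: int, src_h: int, imgsz: int) -> Tuple[float, int, int, int, int]:
    """Return (scale, new_w, new_h, pad_x, pad_y) for a square imgsz canvas."""
    # A single scale factor for both axes keeps the aspect ratio.
    scale = min(imgsz / src_w, imgsz / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    # Split the leftover space evenly; these are the left and top offsets.
    pad_x, pad_y = (imgsz - new_w) // 2, (imgsz - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y


def to_source_coords(x: float, y: float, scale: float, pad_x: int, pad_y: int) -> Tuple[float, float]:
    """Map a point from letterboxed model space back to source-frame pixels."""
    return (x - pad_x) / scale, (y - pad_y) / scale
```

Detections produced in model space have to be mapped back with the inverse transform before they are stored as source-frame coordinates.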

Important

The number of parallel NVDEC hardware decode engines varies by GPU (e.g., 1 on RTX 4070, 3 on RTX 4090, 7 on H100) and directly controls how many videos can be decoded simultaneously. More NVDEC engines means higher throughput when processing multiple videos. The count is auto-detected via get_nvdec_count(). If your GPU is not listed or the count is incorrect, pass max_workers explicitly.

Important

When running segmentation (task='segment'), the imsz parameter is critical for mask quality. Segmentation requires pixel-level precision along object boundaries, so spatial detail lost to downscaling hurts segmentation far more than detection or pose tasks. Set imsz as large as your GPU memory allows. The default 256 may be too coarse for high-quality segmentation masks.

See also

  • SAM3ToYoloBBox: create a YOLO bounding-box project from SAM3 annotations.

  • SAM3ToYoloSeg: create a YOLO segmentation project from SAM3 annotations.

  • YoloInference: CPU-based YOLO bounding-box inference.

  • YOLOPoseInference: CPU-based YOLO pose inference.

  • YOLOSegmentationInference: CPU-based YOLO segmentation inference.

EXPECTED RUNTIMES BOUNDING BOX

VIDEOS (COUNT)    FRAMES (COUNT)    TIME (S)       STDEV (S)
5                 9010              11.2562        0.569887814
10                18020             20.87785       0.145593286
20                36040             41.24536667    1.867777656

BATCH SIZE: 10; IMGSZ: 256; ORIGINAL SIZE: 1280x1024; NVIDIA GeForce RTX 4070 (NVDECs: 1); 3 runs

Parameters
  • video_path (Union[str, os.PathLike]) – Directory containing input video files, or path to a single video file.

  • engine_path (Union[str, os.PathLike]) – Path to TensorRT engine file (.engine). If you only have another model format (e.g., .pt), convert it to an engine using simba.utils.yolo.export_yolo_model(). For multi-GPU, place the source .pt weights alongside the engine; per-GPU engines are auto-exported on first run.

  • save_dir (Optional[Union[str, os.PathLike]]) – Directory for per-video CSV output. If None, results are kept in memory only. Default None.

  • task (Literal['detect', 'pose', 'segment']) – YOLO task type. Default 'detect'.

  • imsz (Optional[int]) – Model input image size (square). If None, read from engine metadata. Default None.

  • batch_size (Optional[int]) – Inference batch size. If None, read from engine metadata. Default None.

  • max_workers (Optional[int]) – Number of parallel worker processes. If None, auto-detected from GPU NVDEC count. Default None.

  • gpu_id (Union[int, Tuple[int, ...]]) – CUDA device index or tuple of device indices for multi-GPU inference. Workers are round-robin assigned across listed GPUs. When multiple GPUs are specified, NVDEC engine counts are summed across all GPUs. Default 0.

  • conf_threshold (float) – Confidence threshold for detections. Default 0.05.

  • iou_threshold (float) – IoU threshold for NMS. Default 0.45.

  • keypoint_names (Optional[Tuple[str, ...]]) – Keypoint names in index order. Required when task='pose' (an error is raised if not provided); ignored otherwise.

  • vertice_cnt (int) – Number of resampled polygon vertices, used only when task='segment' (ignored otherwise). Default 60.

  • max_detections (Optional[int]) – Maximum number of detections to keep per frame after NMS (sorted by confidence). If None, keep all. Default None.

  • segment_smoothing (Optional[int]) – B-spline smoothing factor for segmentation polygon vertices, used only when task='segment' (ignored otherwise). Higher values produce smoother contours. If None, no smoothing is applied. Default None.

  • interpolate (bool) – If True, linearly interpolate missing detections across frames, used only when task='detect' (ignored otherwise). Default True.

  • recursive (bool) – If True and video_path is a directory, search all subdirectories for video files. Default False.

  • smoothing_method (Optional[Literal['savitzky-golay', 'bartlett', 'blackman', 'boxcar', 'cosine', 'gaussian', 'hamming', 'exponential']]) – Smoothing method for detection coordinates, used only when task='detect' (ignored otherwise). If None, no smoothing is applied. Default None.

  • smoothing_time_window (Optional[int]) – Time window (in ms) for coordinate smoothing. Required when smoothing_method is not None. Default None.

  • verbose (bool) – Print progress messages. Default True.
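
The round-robin assignment of workers to GPUs mentioned for gpu_id can be pictured with the sketch below. It illustrates the scheduling idea only; assign_workers is a hypothetical helper and the worker numbering is an assumption, not the class's internal bookkeeping.

```python
from typing import Dict, List, Tuple, Union


def assign_workers(gpu_id: Union[int, Tuple[int, ...]], max_workers: int) -> Dict[int, List[int]]:
    """Map each GPU index to the worker numbers it would serve under round-robin."""
    gpus = (gpu_id,) if isinstance(gpu_id, int) else tuple(gpu_id)
    plan: Dict[int, List[int]] = {g: [] for g in gpus}
    for worker in range(max_workers):
        # Worker 0 goes to the first GPU, worker 1 to the second, and so on, wrapping around.
        plan[gpus[worker % len(gpus)]].append(worker)
    return plan
```

For example, four workers over gpu_id=(0, 1) gives each GPU two workers, so the decode load is spread evenly when the GPUs have the same NVDEC count.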

Example

>>> detector = YoloNVDECInference(video_path=r'/videos', engine_path=r'/best.engine', task='detect')
>>> detector.run()
>>> detector.results['video_name']
>>> detector = YoloNVDECInference(video_path=r'/videos', engine_path=r'/pose.engine', task='pose', keypoint_names=('NOSE', 'LEFT_EAR', 'RIGHT_EAR'))
>>> detector.run()
>>> detector.save()
>>> detector = YoloNVDECInference(video_path=r'/videos/my_video.mp4', engine_path=r'/seg.engine', task='segment', save_dir=r'/output', vertice_cnt=30)
>>> detector.run()
>>> detector.save()

Pose-estimation inference

class simba.model.yolo_pose_inference.YOLOPoseInference(weights, video_path, keypoint_names=None, verbose=True, save_dir=None, device=0, format=None, batch_size=4, torch_threads=8, half_precision=True, stream=False, box_threshold=0.5, bbox_size=None, max_tracks=None, interpolate=False, smoothing=None, imgsz=640, iou=0.5, overwrite=True, raise_error=True, randomize_order=False, recursive=False)

Bases: object

YOLOPoseInference performs pose estimation on videos using a YOLO-based keypoint detection model.

This class runs YOLO-based keypoint detection on a given video or list of videos. It supports GPU acceleration, batch or stream-based inference, result interpolation, and saving results to disk. The model returns detected keypoints and their confidence scores for each frame, and optionally tracks poses over time.

EXPECTED RUNTIMES

VIDEOS (COUNT)    FRAMES (COUNT)    TIME (S)    STDEV (S)
1                 9000              21.89       2.87
2                 18000             41.83       0.48
3                 27000             63.08       0.41
4                 36000             84.44       1.32
5                 45000             103.84      1.17
6                 54000             126.29      1.22
7                 63000             148.71      1.86

BATCH SIZE: 500; IMGSZ: 288; NVIDIA GeForce RTX 4070; 3 runs

See also

For bounding box inference only (no pose), see simba.model.yolo_inference.YoloInference(). For segmentation inference, see simba.model.yolo_seg_inference.YOLOSegmentationInference(). To fit a YOLO model, see simba.model.yolo_fit.FitYolo. For detailed instructions, see YOLO Pose Estimation Inference Documentation.

Parameters
  • weights (Union[str, os.PathLike]) – Path to the trained YOLO model weights (e.g., 'best.pt').

  • video_path (Union[str, os.PathLike] or List[Union[str, os.PathLike]]) – Path to a single video, list of videos, or directory containing video files.

  • keypoint_names (Tuple[str, ...]) – Tuple containing the names of keypoints to be tracked (e.g., ('nose', 'left_ear', ...)). If None, ('BP_0', 'BP_1', ...) will be used.

  • verbose (Optional[bool]) – If True, outputs progress information and timing. Defaults to True.

  • save_dir (Optional[Union[str, os.PathLike]]) – Directory to save the inference results. If None, results are returned in memory. Defaults to None.

  • device (Union[Literal['cpu'], int]) – Device to use for inference. Use 'cpu' for CPU or GPU index (e.g., 0 for CUDA:0). Defaults to 0.

  • format (Optional[str]) – Optional export format for the model. Supported values: "onnx", "engine", "torchscript", "onnxsimplify", "coreml", "openvino", "pb", "tf", "tflite". Defaults to None.

  • batch_size (Optional[int]) – Number of frames to process in parallel. Defaults to 4.

  • torch_threads (int) – Number of PyTorch threads to use. Defaults to 8.

  • half_precision (bool) – If True, uses half-precision (FP16) inference. Defaults to True.

  • stream (bool) – If True, processes frames one-by-one in a generator style. Recommended for long videos. Defaults to False.

  • box_threshold (float) – Confidence threshold for bounding-box detection. All detections (bounding boxes AND keypoints) below this value are ignored. Defaults to 0.5.

  • max_tracks (Optional[int]) – Maximum number (total sum) of pose tracks to keep. If None, all tracks are retained.

  • max_per_class (Optional[int]) – Maximum number of pose tracks per class. E.g., if one 'resident' and one 'intruder' are expected, set this to 1. Defaults to None, meaning all detected instances of each class are retained.

  • interpolate (bool) – If True, interpolates missing keypoints across frames using the 'nearest' method. Defaults to False.

  • smoothing (Optional[int]) – If not None, then the time in milliseconds for Gaussian-applied body-part smoothing.

  • overwrite (bool) – If True, overwrites the data at the save_dir. If False, skips the file if it exists.

  • raise_error (bool) – If True, raise an error if the input video metadata cannot be read. If False, then skips the video file.

  • randomize_order (bool) – If True, analyzes the input data in a random order. If False, then in fixed order.

  • recursive (bool) – If True, analyzes all video files found recursively in the video_path directory. If False, only looks in the top directory.

  • imgsz (int) – Input image size for inference. Must be square. Defaults to 640.
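
The 'nearest' interpolation mentioned for interpolate fills a frame without a detection using the value from the closest frame that has one. A minimal sketch for one coordinate series follows; fill_nearest is a hypothetical helper, missing values are represented as None, and breaking ties in favor of the earlier frame is an assumption made here, so the library's own behavior at ties may differ.

```python
from typing import List, Optional


def fill_nearest(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace None entries with the value of the closest non-missing frame."""
    valid = [i for i, v in enumerate(values) if v is not None]
    if not valid:
        return list(values)  # nothing to interpolate from
    filled = []
    for i, v in enumerate(values):
        if v is None:
            # Index of the closest frame that has a value (earlier frame wins ties).
            nearest = min(valid, key=lambda j: abs(j - i))
            v = values[nearest]
        filled.append(v)
    return filled
```

Unlike linear interpolation, this never invents intermediate coordinates; a gap is filled with copies of the neighboring observed positions.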

YOLO pose-estimation segmentation visualizer

class simba.plotting.yolo_seg_visualizer.YOLOSegmentationVisualizer(data_path, video_path, save_dir, color=(255, 255, 0), core_cnt=-1, threshold=0.0, verbose=False, shape_opacity=0.5)

Bases: object

Visualizes polygon-based YOLO segmentation results overlaid on video frames.

Accepts either a single video file + CSV data file, or a directory of videos + a directory of CSVs (matched by filename stem).

See also

To run segmentation inference, see simba.model.yolo_seg_inference.YOLOSegmentationInference(). To fit a YOLO model, see simba.model.yolo_fit.FitYolo().

Parameters
  • data_path (Union[str, os.PathLike]) – Path to a CSV file or a directory of CSV files with YOLO segmentation output. Must include columns "FRAME", "ID", and at least six "VERTICE" columns.

  • video_path (Union[str, os.PathLike]) – Path to a single video file or a directory of video files. When directories are passed, files are matched by stem name.

  • save_dir (Union[str, os.PathLike]) – Directory where output videos (temp and final) are saved.

  • color (Tuple[int, int, int]) – RGB color for drawing polygons. Defaults to (255, 255, 0).

  • core_cnt (Optional[int]) – Number of parallel processes to use. Defaults to -1 (use all available cores).

  • threshold (float) – Confidence threshold (not currently used in rendering). Defaults to 0.0.

  • verbose (bool) – Whether to print detailed progress messages. Defaults to False.

  • shape_opacity (float) – Alpha blending factor for filled polygon overlay (range 0.0–1.0). If None, solid fill is used. Defaults to 0.5.
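
Alpha blending with shape_opacity mixes the polygon color with the underlying frame pixel. The per-pixel arithmetic can be sketched as below; this shows the standard blending formula only, blend_pixel is a hypothetical helper, and the visualizer itself presumably operates on whole image arrays rather than single pixels.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def blend_pixel(pixel: RGB, color: RGB, shape_opacity: float) -> RGB:
    """Blend an overlay color onto a pixel: opacity * color + (1 - opacity) * pixel."""
    if not 0.0 <= shape_opacity <= 1.0:
        raise ValueError("shape_opacity must be in [0.0, 1.0]")
    return tuple(round(shape_opacity * c + (1.0 - shape_opacity) * p) for c, p in zip(color, pixel))
```

An opacity of 0.0 leaves the frame unchanged and 1.0 paints the polygon as a solid fill; values in between let the animal remain visible under the overlay.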

Example

>>> runner = YOLOSegmentationVisualizer(data_path=r"D:/results/video1.csv", video_path=r"D:/videos/video1.mp4", save_dir=r'D:/output', verbose=True)
>>> runner.run()
>>> runner = YOLOSegmentationVisualizer(data_path=r"D:/results", video_path=r"D:/videos", save_dir=r'D:/output', verbose=True)
>>> runner.run()

YOLO pose-estimation segmentation inference

class simba.model.yolo_seg_inference.YOLOSegmentationInference(weights_path, video_path, verbose=True, save_dir=None, device=0, format=None, batch_size=4, torch_threads=8, half_precision=True, stream=False, threshold=0.5, max_tracks=300, interpolate=False, imgsz=640, iou=0.5, retina_msk=False, vertice_cnt=30)

Bases: object

Run inference on video(s) using a trained YOLO segmentation model.

Parameters
  • weights_path (Union[str, os.PathLike]) – Path to the trained YOLO .pt weights file.

  • video_path (Union[str, os.PathLike]) – Path to a single video or a list of video paths to run inference on.

  • verbose (bool) – Whether to print progress information. Default is True.

  • save_dir (Union[str, os.PathLike]) – Directory where output videos and data will be saved.

  • device (Union[str, int]) – Device to run inference on; use 'cpu' or an integer GPU index (e.g., 0).

  • format (str) – Optional export format for the model. Supported values: "onnx", "engine", "torchscript", "onnxsimplify", "coreml", "openvino", "pb", "tf", "tflite". Defaults to None.

  • batch_size (Optional[int]) – Number of frames to process at once. Increase for faster performance with sufficient memory.

  • torch_threads (int) – Number of CPU threads to use (when on CPU).

  • half_precision (bool) – Whether to use half-precision (FP16) for inference on GPU. Default is True.

  • stream (bool) – Whether to stream video processing (less memory, suitable for long videos).

  • threshold (float) – Confidence threshold for object/segmentation detection.

  • max_tracks (int) – Optional maximum number of objects to track. If None, tracking is disabled.

  • interpolate (bool) – Whether to interpolate results (useful for smoothing or low-FPS videos).

  • imgsz (int) – Inference image size (width/height in pixels); must be a multiple of 32.

  • iou (float) – IoU threshold for non-max suppression (NMS).

  • retina_msk (bool) – Whether to use high-resolution Retina-style masks.

  • vertice_cnt (int) – Number of vertices used to approximate the segmentation mask polygon.

Important

The imgsz parameter is critical for mask quality. Segmentation requires pixel-level precision along object boundaries, so spatial detail lost to downscaling hurts segmentation far more than detection or pose tasks. Set imgsz as large as your GPU memory allows. The default 640 may be too coarse for high-quality segmentation masks.

Note

To create a YOLO segmentation dataset for fitting, use simba.third_party_label_appenders.transform.labelme_to_yolo_seg.LabelmeKeypoints2YoloSeg(). To fit a YOLO model, see simba.model.yolo_fit.FitYolo. To visualize the segmentation results, see simba.plotting.yolo_seg_visualizer.YOLOSegmentationVisualizer().

Example

>>> weights_path = r"D:/platea/yolo_071525/mdl/train3/weights/best.pt"
>>> video_path = r"D:/platea/platea_videos/videos/clipped/10B_Mouse_5-choice_MustTouchTrainingNEWFINAL_a7.mp4"
>>> save_dir = r"D:/platea/platea_videos/videos/yolo_results"
>>> runner = YOLOSegmentationInference(weights_path=weights_path, video_path=video_path, save_dir=save_dir, verbose=True, device=0, format=None, stream=True, batch_size=10, imgsz=320, interpolate=True, threshold=0.8, retina_msk=True)
>>> runner.run()

Pose-estimation track inference

class simba.model.yolo_pose_track_inference.YOLOPoseTrackInference(weights_path, video_path, keypoint_names, config_path, verbose=False, save_dir=None, device=0, format=None, batch_size=4, torch_threads=8, half_precision=True, recursive=False, stream=False, interpolate=False, threshold=0.7, max_tracks=2, smoothing=None, randomize_order=False, n=None, imgsz=320, min_video_size=None, overwrite=True, iou=0.5, raise_error=True)

Bases: object

Perform YOLO-based pose estimation and object tracking inference on single or multiple videos.

Uses a YOLO pose model to detect and track objects with keypoint localization across video frames. Supports GPU acceleration, multi-object tracking with configurable trackers (e.g., BoTSORT, ByteTrack), and post-processing options including interpolation and smoothing of keypoint trajectories.

Note

Requires GPU with CUDA support. The class will raise an error if no GPU is detected. The ultralytics package must be installed.

See also

For pose inference without tracks, see simba.model.yolo_pose_inference.YOLOPoseInference(); if the animals are individually identifiable and you do not need tracks, it gives better runtimes. For visualizing pose tracks, see simba.plotting.yolo_pose_track_visualizer.YOLOPoseTrackVisualizer().

EXPECTED RUNTIMES

VIDEOS (COUNT)   FRAMES (COUNT)   TIME (S)      STDEV (S)
1                1500             22.41333333   1.243958735
2                3000             44.76866667   0.22300299
3                6000             66.592        1.305805881
4                9000             89.683        1.132298106

Settings: BATCH SIZE: 500; IMGSZ: 256; GPU: NVIDIA GeForce RTX 4070; CPU COUNT (LOADERS): 16; 3 runs.

Parameters
  • weights_path (Union[str, os.PathLike]) – Path to the YOLO pose model weights file (e.g., .pt file). Must be a valid YOLO pose model, not a detection or segmentation model.

  • video_path (Union[Union[str, os.PathLike], List[Union[str, os.PathLike]]]) – Path(s) to video file(s) or directory containing videos. Can be a single video path, a list of video paths, or a directory path. If a directory is provided, all video files in the directory will be processed.

  • keypoint_names (Tuple[str, ...]) – Tuple of keypoint names corresponding to the model's keypoint outputs. Length must match the number of keypoints expected by the model.

  • config_path (Union[str, os.PathLike]) – Path to the tracker configuration YAML file (e.g., 'botsort.yml', 'bytetrack.yml').

  • verbose (Optional[bool]) – If True, prints progress information during processing. Default: False.

  • save_dir (Optional[Union[str, os.PathLike]]) – Directory to save output CSV files. If None, results are returned as a dictionary of DataFrames instead of being saved. Default: None.

  • device (Union[Literal['cpu'], int]) – Device to run inference on. Use 'cpu' for CPU inference or an integer for GPU device ID (e.g., 0 for cuda:0). Default: 0.

  • format (Optional[str]) – Model format for loading. If None, inferred from weights file extension. Default: None.

  • batch_size (Optional[int]) – Number of frames to process in each batch. Higher values increase memory usage but may improve throughput. Default: 4.

  • torch_threads (int) – Number of threads for PyTorch operations. Default: 8.

  • half_precision (bool) – If True, uses FP16 half-precision inference for faster processing. Requires GPU support. Default: True.

  • recursive (bool) – If True and video_path is a directory, recursively searches subdirectories for video files. Default: False.

  • stream (bool) – If True, enables streaming mode for memory-efficient processing of long videos. Default: False.

  • interpolate (bool) – If True, interpolates missing keypoint coordinates and confidence values using nearest-neighbor interpolation with forward/backward filling. Default: False.

  • threshold (float) – Confidence threshold for detections. Detections with confidence below this value are filtered out. Range: (0, 1]. Default: 0.7.

  • max_tracks (Optional[int]) – Maximum number of tracks to maintain simultaneously. If None, unlimited tracks are allowed. Default: 2.

  • smoothing (Optional[int]) – Smoothing window size in milliseconds. If provided, applies Gaussian smoothing to keypoint trajectories. If None, no smoothing is applied. Default: None.

  • imgsz (int) – Input image size for inference. Images are resized to this dimension while maintaining aspect ratio. Default: 320.

  • iou (float) – IoU (Intersection over Union) threshold for Non-Maximum Suppression (NMS) during tracking. Range: (0, 1]. Default: 0.5.
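
The interpolate option is described above as nearest-neighbor interpolation with forward/backward filling. A minimal pure-Python sketch of that idea for a single coordinate track follows. The function name fill_nearest and the tie-breaking toward the earlier frame are assumptions made for illustration; this is not SimBA's implementation.

```python
from typing import List, Optional


def fill_nearest(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value (None) with the closest observed value by frame index."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        # Nothing was detected in any frame, so there is nothing to copy from.
        return list(values)
    filled = []
    for i, v in enumerate(values):
        if v is None:
            # Closest observed frame; ties go to the earlier frame (assumption).
            nearest = min(known, key=lambda k: (abs(k - i), k))
            v = values[nearest]
        filled.append(v)
    return filled


x_track = [None, 10.0, None, None, None, 20.0, None]
print(fill_nearest(x_track))
# [10.0, 10.0, 10.0, 10.0, 20.0, 20.0, 20.0]
```

Leading and trailing gaps are filled from the first and last observed frames, which is the role forward/backward filling plays in the description above.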

Example
>>> keypoint_names = ('Nose', 'Left_ear', 'Right_ear', 'Tail_base')
>>> inference = YOLOPoseTrackInference(
...     weights_path='path/to/model.pt',
...     video_path='path/to/video.mp4',
...     keypoint_names=keypoint_names,
...     config_path='botsort.yml',
...     save_dir='path/to/output',
...     verbose=True,
...     threshold=0.5,
...     interpolate=True,
...     smoothing=100
... )
>>> inference.run()
Example
>>> # Process multiple videos and return results as DataFrames
>>> inference = YOLOPoseTrackInference(
...     weights_path='model.pt',
...     video_path=['video1.mp4', 'video2.mp4'],
...     keypoint_names=('Nose', 'Tail'),
...     config_path='bytetrack.yml',
...     save_dir=None,
...     max_tracks=5
... )
>>> results_dict = inference.run()  # Returns dict of video_name: DataFrame

Pose-estimation track plotting

class simba.plotting.yolo_pose_track_visualizer.YOLOPoseTrackVisualizer(data_path, video_path, save_dir, palettes=None, core_cnt=-1, threshold=0.0, thickness=None, circle_size=None, verbose=False, bbox=False, overwrite=True)[source]

Bases: object

Visualizes YOLO-based keypoint pose estimation tracks on video frames and creates annotated output videos.

This class overlays tracked keypoints and optional bounding boxes onto the source video, grouping detections by track identifier and applying per-track color palettes.

See also

  • YOLOPoseInference() for generating YOLO pose CSV data.

  • FitYolo() for fitting YOLO detector models.

  • Tracker configuration templates in simba.assets.tracker_yml.

Parameters
  • data_path (Union[str, os.PathLike]) – Path to a YOLO pose CSV file, or directory containing multiple CSV files.

  • video_path (Union[str, os.PathLike]) – Path to the source video, or directory with videos matching the CSV filenames.

  • save_dir (Union[str, os.PathLike]) – Directory where annotated videos are written.

  • palettes (Optional[Union[str, Tuple[str, ...]]]) – Custom color palettes keyed by track IDs. Defaults to random palettes per track when None.

  • core_cnt (Optional[int]) – CPU cores to use for parallel rendering. Defaults to -1 (all available cores).

  • threshold (float) – Confidence threshold for rendering detections. Detections below this threshold are skipped.

  • thickness (Optional[int]) – Line thickness for bounding boxes. If None, derived from frame dimensions.

  • circle_size (Optional[int]) – Radius of drawn keypoints. If None, derived from frame dimensions.

  • verbose (Optional[bool]) – Set True to print progress information.

Example

>>> video_path = r"/mnt/c/troubleshooting/mitra/project_folder/videos/501_MA142_Gi_CNO_0521.mp4"
>>> data_path = "/mnt/c/troubleshooting/mitra/yolo_pose/501_MA142_Gi_CNO_0521.csv"
>>> kp_vis = YOLOPoseTrackVisualizer(data_path=data_path,
...                                  video_path=video_path,
...                                  save_dir='/mnt/c/troubleshooting/mitra/yolo_pose/',
...                                  core_cnt=18,
...                                  bbox=True)
>>> kp_vis.run()

Pose-estimation plotting

class simba.plotting.yolo_pose_visualizer.YOLOPoseVisualizer(data_path, video_path, save_dir, palettes='Set1', core_cnt=-1, threshold=0.0, thickness=None, circle_size=None, verbose=True, bbox=True, skeleton=None, recursive=False, pool=None, sample_n=None)[source]

Bases: object

Visualizes YOLO-based keypoint pose estimation data on video frames and creates an annotated output video.

This class takes keypoint data (CSV) and overlays it onto the corresponding video using color-coded keypoints and optional filtering. The result is saved as a new annotated video, and supports multicore parallel rendering for efficient processing of long videos.

See also

To create YOLO pose data, see YOLOPoseInference(). To fit a YOLO model, see FitYolo(). For instructions, see YOLO Pose Estimation Visualization Documentation.

Parameters
  • data_path (Union[str, os.PathLike]) – Path to the CSV file containing keypoint data, or folder containing keypoint data (output from YOLO pose inference).

  • video_path (Union[str, os.PathLike]) – Path to the original input video, or folder containing original videos, to overlay keypoints on.

  • save_dir (Union[str, os.PathLike]) – Directory to save the resulting annotated video.

  • palettes (Optional[Union[str, Tuple[str, ...]]]) – Name(s) of categorical color palettes used to draw keypoints per detected class. A single string applies to all classes; a tuple assigns one palette per class. Defaults to 'Set1'.

  • core_cnt (Optional[int]) – Number of CPU cores to use for parallel rendering. Defaults to -1 (all available cores).

  • threshold (float) – Confidence threshold for rendering bounding boxes, keypoints, and skeleton edges. Only entries with confidence >= threshold are drawn.

  • thickness (Optional[int]) – Thickness of bounding boxes and skeleton edges. If None, computed from frame dimensions.

  • circle_size (Optional[int]) – Radius of keypoint circles. If None, computed from frame dimensions.

  • verbose (Optional[bool]) – Set True to enable progress logging.

  • bbox (Optional[bool]) – Set False to disable rendering of bounding boxes around detections.

  • skeleton (Optional[List[Tuple[str, str]]]) – Iterable of keypoint name pairs defining skeleton edges. An edge is rendered only when both of its keypoints meet the threshold.

  • recursive (Optional[bool]) – If True, search data and video directories recursively; otherwise only the top level is scanned.

  • sample_n (Optional[int]) – Randomly sample sample_n data files to visualize. If None, visualize all detected files.
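
The interaction between threshold and skeleton can be sketched as a simple filter: an edge is kept only when both of its endpoints meet the confidence threshold. This is an illustrative sketch under that reading of the parameter descriptions; the function name edges_to_draw and the per-keypoint confidence dictionary are hypothetical and not SimBA's internal data structures.

```python
from typing import Dict, List, Tuple


def edges_to_draw(
    confidences: Dict[str, float],
    skeleton: List[Tuple[str, str]],
    threshold: float,
) -> List[Tuple[str, str]]:
    """Return the skeleton edges whose two endpoints both meet the threshold."""
    return [
        (a, b)
        for a, b in skeleton
        # A keypoint missing from the frame is treated as confidence 0.0.
        if confidences.get(a, 0.0) >= threshold and confidences.get(b, 0.0) >= threshold
    ]


conf = {"Nose": 0.9, "Left_ear": 0.4, "Right_ear": 0.8, "Tail_base": 0.95}
skeleton = [("Nose", "Left_ear"), ("Nose", "Right_ear"), ("Nose", "Tail_base")]
print(edges_to_draw(conf, skeleton, threshold=0.5))
# [('Nose', 'Right_ear'), ('Nose', 'Tail_base')]
```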

Example

>>> video_path = r"/mnt/c/troubleshooting/mitra/project_folder/videos/501_MA142_Gi_CNO_0521.mp4"
>>> data_path = "/mnt/c/troubleshooting/mitra/yolo_pose/501_MA142_Gi_CNO_0521.csv"
>>> kp_vis = YOLOPoseVisualizer(data_path=data_path,
...                             video_path=video_path,
...                             save_dir='/mnt/c/troubleshooting/mitra/yolo_pose/',
...                             core_cnt=18)
>>> kp_vis.run()

Bounding box plotting

class simba.plotting.yolo_visualize.YOLOVisualizer(data_path, video_path, save_dir, palette='Set1', core_cnt=-1, threshold=0.0, padding=0, pool=None, thickness=None, opacity=0.6, outline_color=None, color_by='class', verbose=True)[source]

Bases: object

Visualize YOLO bounding-box inference results on a source video.

See also

For bounding-box inference, see simba.model.yolo_inference.YoloInference.

Parameters
  • data_path (Union[str, os.PathLike]) – Path to YOLO results CSV. Expected columns: FRAME, CLASS_ID, CLASS_NAME, CONFIDENCE, X1..Y4. Multiple rows sharing the same FRAME and CLASS_NAME (i.e. several detections of one class per frame, as produced by YoloInference with max_per_class > 1) are rendered as separate instances, each drawn as its own polygon track and color (ordered by detection confidence).

  • video_path (Union[str, os.PathLike]) – Path to the video from which the data was produced.

  • save_dir (Union[str, os.PathLike]) – Directory where to save visualization output.

  • palette (Optional[str]) – Matplotlib color palette name for per-class geometry colors (e.g., 'Set1', 'tab10'). Default: 'Set1'.

  • core_cnt (Optional[int]) – CPU core count for parallel processing. Use -1 for all available cores.

  • threshold (float) – Confidence threshold in [0.0, 1.0]. Detections below threshold are masked before polygon conversion.

  • padding (int) – Polygon offset in pixels used during multiframe bbox-to-polygon conversion for rendering. Defaults to 0 (draw the exact detection box). Positive values expand polygons outward, -1 shrinks them inward. This affects visualization geometry only, not the underlying YOLO detections in the input CSV.

  • thickness (Optional[int]) – Polygon line thickness. If None, default geometry plotter thickness is used.

  • opacity (float) – Polygon fill opacity in [0.0, 1.0]. Default: 0.6.

  • outline_color (Optional[Tuple[int, int, int]]) – BGR color for polygon outlines. If None, no outlines are drawn. Default: None.

  • color_by (Literal['class', 'instance']) – How detections are colored when multiple instances per class are present. 'class' (default) gives every instance of a class the same class color (avoids color flicker, since instance slots are confidence-ranked per frame and not identity-tracked). 'instance' gives each instance slot its own color (useful only when the data carries stable identities, e.g. from a tracker). For single-instance-per-class data both options are equivalent.

  • verbose (bool) – If True, prints progress information. Default: True.

Raises

FrameRangeError – If YOLO result frame coverage does not match video frame count.

Example

>>> test = YOLOVisualizer(
...     data_path=r"/mnt/c/troubleshooting/yolo_inference/08102021_DOT_Rat7_8(2).csv",
...     video_path=r"/mnt/c/troubleshooting/RAT_NOR/project_folder/videos/08102021_DOT_Rat7_8(2).mp4",
...     save_dir="/mnt/c/troubleshooting/yolo_videos",
...     threshold=0.25,
...     core_cnt=4
... )
>>> test.run()

YOLO annotation visualizer

class simba.plotting.yolo_annotation_visualizer.YOLOAnnotationVisualizer(map_yaml_path, save_dir, split='all', n=None, circle_size=None, thickness=None, palette='Set1', img_format='.png', seg_opacity=0.5, show_names=False, show_outline=False, verbose=True)[source]

Bases: object

Visualize YOLO annotation label files overlaid on their source images.

See also

For visualizing YOLO bounding-box inference results on video, see simba.plotting.yolo_visualize.YOLOVisualizer(). For visualizing YOLO keypoint pose-estimation results on video, see simba.plotting.yolo_pose_visualizer.YOLOPoseVisualizer(). For visualizing YOLO segmentation polygon results on video, see simba.plotting.yolo_seg_visualizer.YOLOSegmentationVisualizer(). For auto-detecting the YOLO project type from a label file, see simba.utils.yolo.detect_yolo_project_type().

Parameters
  • map_yaml_path (Union[str, os.PathLike]) – Path to the YOLO project map.yaml file.

  • save_dir (Union[str, os.PathLike]) – Directory where annotated images are saved.

  • split (Optional[str]) – Which split to visualize: 'train', 'val', or 'all'. Default: 'all'.

  • n (Optional[int]) – Number of images to visualize. If None, visualize every image. Default: None.

  • circle_size (Optional[int]) – Radius of keypoint circles. If None, computed from image dimensions.

  • thickness (Optional[int]) – Line thickness for bounding boxes / polygon edges. If None, computed from image dimensions.

  • palette (str) – Color palette name (e.g. 'Set1'). Default: 'Set1'.

  • img_format (str) – Output image format extension. Default: '.png'.

  • seg_opacity (float) – Opacity of filled segmentation polygons (0.0-1.0). Default: 0.5.

  • show_names (bool) – If True, draw class name labels on each annotation. Default: False.

  • show_outline (bool) – If True, draw polygon outline for segmentation annotations. Default: False.

  • verbose (bool) – Print progress messages. Default: True.

Example

>>> viz = YOLOAnnotationVisualizer(map_yaml_path=r'F:\netholabs\moira_lp_sam\map.yaml', save_dir=r'F:\netholabs\annotation_visualizations', n=400)
>>> viz.run()
>>> viz = YOLOAnnotationVisualizer(map_yaml_path=r'/path/to/map.yaml', save_dir=r'/path/to/output')
>>> viz.run()
>>> viz = YOLOAnnotationVisualizer(map_yaml_path=r'/path/to/map.yaml', save_dir=r'/path/to/output', n=50, circle_size=5, thickness=2, img_format='.jpeg')
>>> viz.run()

COCO key-points -> YOLO pose-estimation format conversion

class simba.third_party_label_appenders.transform.coco_keypoints_to_yolo.COCOKeypoints2Yolo(coco_path, img_dir, save_dir, train_size=0.7, flip_idx=(0, 2, 1, 5, 4, 3, 6), verbose=True, greyscale=False, clahe=False, bbox_pad=None)[source]

Bases: object

Convert COCO Keypoints version 1.0 data format into a YOLO keypoints training set.

Processes COCO format keypoint annotations and converts them to YOLO keypoint format, splitting the data into training and validation sets. Images are copied to the output directory and annotations are converted to YOLO format text files. A YAML configuration file is automatically generated.

Note

COCO keypoint files can be created using https://www.cvat.ai/.

This function expects the path to a single COCO Keypoints version 1.0 file. To merge several before passing the file to this function, use simba.third_party_label_appenders.transform.utils.merge_coco_keypoints_files().

Important

All image file names have to be unique.

See also

To convert COCO Keypoints version 1.0 data format into a YOLO bounding box training set, use simba.third_party_label_appenders.transform.coco_keypoints_to_yolo_bbox.COCOKeypoints2YoloBbox(). To train YOLO pose models with the converted data, see simba.model.yolo_fit.FitYolo() and YOLO Pose Estimation Training Documentation. To run inference with trained YOLO pose models, see simba.model.yolo_pose_inference.YOLOPoseInference() or simba.model.yolo_pose_track_inference.YOLOPoseTrackInference() and YOLO Pose Estimation Inference Documentation.

Parameters
  • coco_path (Union[str, os.PathLike]) – Path to COCO keypoints 1.0 file in JSON format. Must contain 'categories', 'images', and 'annotations' keys.

  • img_dir (Union[str, os.PathLike]) – Directory holding image files representing the annotated entries in the coco_path. Will search recursively, so it's OK to have images in subdirectories.

  • save_dir (Union[str, os.PathLike]) – Directory where to save the YOLO formatted data. Will create 'images/train', 'images/val', 'labels/train', 'labels/val' subdirectories.

  • train_size (float) – Size of the training set as a fraction between 0.1 and 0.99. Remaining data becomes validation set. Default: 0.7 (70% training, 30% validation).

  • flip_idx (Tuple[int, ...]) – Tuple of integers representing the re-ordering of body-part indices when the image is flipped horizontally. Must match the number of keypoints. Default: (0, 2, 1, 5, 4, 3, 6).

  • verbose (bool) – If True (default), prints progress messages. If False, suppresses output.

  • greyscale (bool) – If True, converts images to greyscale before saving. If False (default), keeps original color format.

  • clahe (bool) – If True, applies CLAHE (Contrast Limited Adaptive Histogram Equalization) enhancement to images before saving. If False (default), no enhancement is applied.

  • bbox_pad (Optional[float]) – Optional padding factor for bounding boxes (between 10e-6 and 1.0). If provided, bounding boxes are expanded by this percentage to better encompass all body-parts. If None (default), no padding is applied.
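
To illustrate what flip_idx encodes: when an image is mirrored left-to-right, the keypoint that was the left ear now sits where a right ear would be, so left/right keypoints have to swap slots. The sketch below mirrors normalized keypoints and re-orders them with flip_idx. The function name flip_keypoints and the three-keypoint layout are hypothetical, chosen only to keep the example short.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def flip_keypoints(keypoints: Sequence[Point], flip_idx: Sequence[int]) -> List[Point]:
    """Mirror normalized (x, y) keypoints left-to-right and re-order them with flip_idx."""
    if sorted(flip_idx) != list(range(len(keypoints))):
        raise ValueError("flip_idx must be a permutation of the keypoint indices")
    # Mirroring a normalized x coordinate maps x to 1 - x; y is unchanged.
    mirrored = [(1.0 - x, y) for x, y in keypoints]
    # Slot i of the flipped sample takes the mirrored keypoint flip_idx[i].
    return [mirrored[i] for i in flip_idx]


# Hypothetical keypoint order: 0 = nose, 1 = left ear, 2 = right ear.
keypoints = [(0.25, 0.5), (0.125, 0.75), (0.375, 0.625)]
print(flip_keypoints(keypoints, flip_idx=(0, 2, 1)))
# [(0.75, 0.5), (0.625, 0.625), (0.875, 0.75)]
```

Keypoints on the midline (such as the nose) map to themselves, which is why index 0 stays at position 0 in the default flip_idx.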

Returns

None. YOLO formatted data is saved to save_dir with structure: images/train, images/val, labels/train, labels/val, and map.yaml.

Example

>>> runner = COCOKeypoints2Yolo(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/s1/annotations/s1.json", img_dir=r"D:/cvat_annotations/frames/simon", save_dir=r"D:/cvat_annotations/frames/yolo_keypoints", clahe=True)
>>> runner.run()

Example II

>>> runner = COCOKeypoints2Yolo(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/merged.json", img_dir=r"D:/cvat_annotations/frames", save_dir=r"D:/cvat_annotations/frames/yolo", clahe=False)
>>> runner.run()

Example III

>>> runner = COCOKeypoints2Yolo(coco_path=r"E:/netholabs_videos/mosaics/subset/to_annotate/2d_mosaic_batch_1.json", img_dir=r"E:/netholabs_videos/mosaics/subset/to_annotate", save_dir=r"E:/netholabs_videos/mosaics/yolo_mdl", clahe=False, bbox_pad=0.1)
>>> runner.run()

References

1. Helpful YouTube tutorial by Farhan to get YOLO tracking data in animals - https://www.youtube.com/watch?v=CcGbgFPwQTc.

2. Great YouTube tutorial by Felipe on annotating data and making data YOLO ready - https://www.youtube.com/watch?v=m9fH9OWn8YM.

COCO key-points -> YOLO bounding box conversion

class simba.third_party_label_appenders.transform.coco_keypoints_to_yolo_bbox.COCOKeypoints2YoloBbox(coco_path, img_dir, save_dir, train_size=0.7, verbose=True, greyscale=False, clahe=False, bbox_pad=None, obb=False)[source]

Bases: object

Convert COCO Keypoints version 1.0 data format into a YOLO bounding box training set.

Note

COCO keypoint files can be created using https://www.cvat.ai/.

This function expects the path to a single COCO Keypoints version 1.0 file. To merge several before passing the file to this function, use simba.third_party_label_appenders.transform.utils.merge_coco_keypoints_files().

Important

All image file names have to be unique.

See also

To convert COCO Keypoints version 1.0 data format into a YOLO keypoint training set, use simba.third_party_label_appenders.transform.coco_keypoints_to_yolo.COCOKeypoints2Yolo().

Parameters
  • coco_path (Union[str, os.PathLike]) – Path to COCO keypoints 1.0 file in JSON format.

  • img_dir (Union[str, os.PathLike]) – Directory holding image files representing the annotated entries in the coco_path. Will search recursively, so it's OK to have images in subdirectories.

  • save_dir (Union[str, os.PathLike]) – Directory where to save the YOLO formatted data.

  • train_size (float) – Size of the training set as a fraction between 0 and 1.0. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: True.

Returns

None

Example

>>> runner = COCOKeypoints2YoloBbox(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/s1/annotations/s1.json", img_dir=r"D:/cvat_annotations/frames/simon", save_dir=r"D:/cvat_annotations/frames/yolo_keypoints", clahe=True)
>>> runner.run()

Example II

>>> runner = COCOKeypoints2YoloBbox(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/merged.json", img_dir=r"D:/cvat_annotations/frames", save_dir=r"D:/cvat_annotations/frames/yolo", clahe=False)
>>> runner.run()

References

1. Helpful YouTube tutorial by Farhan to get YOLO tracking data in animals - https://www.youtube.com/watch?v=CcGbgFPwQTc.

2. Great YouTube tutorial by Felipe on annotating data and making data YOLO ready - https://www.youtube.com/watch?v=m9fH9OWn8YM.

COCO key-points -> YOLO segmentation conversion

class simba.third_party_label_appenders.transform.coco_keypoints_to_yolo_seg.COCOKeypoints2YoloSeg(coco_path, img_dir, save_dir, train_size=0.7, verbose=True, greyscale=False, clahe=False, bbox_pad=None)[source]

Bases: object

SAM3 -> YOLO segmentation project

class simba.third_party_label_appenders.transform.sam3_to_yolo_seg.SAM3ToYoloSeg(video_dir, sam_path, save_dir, txt_prompt='mouse', n_frames=50, names=('animal',), train_val_split=0.7, conf=0.5, sam_imgsz=644, greyscale=False, clahe=False, vertice_cnt=40, seed=None, visualize=False, io_timeout=30.0, verbose=True)[source]

Bases: object

Sample N random frames from each video in a directory, run SAM3 with a text prompt, and write the resulting masks as a YOLO segmentation project.

Note

To fit a YOLO segmentation model, see FitYolo. For YOLO segmentation inference, see YOLOSegmentationInference.

See also

  • MergeYoloProjects: merge several map.yaml projects (same classes and task) into one dataset.

Parameters
  • video_dir (Union[str, os.PathLike]) – Directory containing input videos.

  • sam_path (Union[str, os.PathLike]) – Path to SAM3 model weights (e.g. sam3.pt).

  • save_dir (Union[str, os.PathLike]) – Root output directory for the YOLO project.

  • txt_prompt (str) – Text prompt for SAM3 (e.g. "mouse", "mouse tail"). Default: 'mouse'.

  • n_frames (int) – Number of random frames to sample from each video. Default: 50.

  • names (Tuple[str, ...]) – Class names in index order. Default: ('animal',).

  • train_val_split (float) – Fraction allocated to training (0.1-0.9). Default: 0.7.

  • conf (float) – SAM3 confidence threshold. Default: 0.5.

  • sam_imgsz (int) – Image size for SAM3 inference. Default: 644.

  • greyscale (bool) – If True, save extracted frames in greyscale. Default: False.

  • clahe (Optional[Union[Tuple[int, int, int], bool]]) – If True, applies CLAHE with default params. If tuple of (clip_limit, tile_x, tile_y), applies CLAHE with those params. Default: False.

  • vertice_cnt (Optional[int]) – If not None, resample each mask polygon to this many vertices. Default: 40.

  • seed (Optional[int]) – Random seed for reproducible frame sampling.

  • visualize (bool) – If True, saves annotated images with segmentation polygon overlays to a visualizations subfolder inside save_dir. Useful for verifying SAM3 annotation quality. Default: False.

  • io_timeout (float) – Seconds to keep retrying file I/O (read/write) when the operation fails (e.g. temporary drive disconnect). Default: 30.0.

  • verbose (bool) – If True, print progress updates. Default: True.
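
As an illustration of what resampling a mask polygon to vertice_cnt vertices can mean, the sketch below places a fixed number of points at equal arc-length intervals along a closed polygon. This is one common approach and is an assumption here; the function name resample_polygon is hypothetical and SimBA may resample differently.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def resample_polygon(points: Sequence[Point], n: int) -> List[Point]:
    """Return n points spaced evenly by arc length along a closed polygon."""
    if len(points) < 3 or n < 3:
        raise ValueError("need at least 3 input points and n >= 3")
    closed = list(points) + [points[0]]  # repeat the first vertex to close the ring
    seg_lens = [math.dist(a, b) for a, b in zip(closed, closed[1:])]
    perimeter = sum(seg_lens)
    out: List[Point] = []
    seg, walked = 0, 0.0  # current segment and the perimeter length before it
    for k in range(n):
        target = perimeter * k / n
        # Advance to the segment that contains the target arc length.
        while seg < len(seg_lens) - 1 and walked + seg_lens[seg] < target:
            walked += seg_lens[seg]
            seg += 1
        t = (target - walked) / seg_lens[seg] if seg_lens[seg] > 0 else 0.0
        (x0, y0), (x1, y1) = closed[seg], closed[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(resample_polygon(square, 8))
# [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (4.0, 4.0), (2.0, 4.0), (0.0, 4.0), (0.0, 2.0)]
```

A fixed vertex count keeps label files a predictable size regardless of how detailed the raw mask contour is.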

Example

>>> runner = SAM3ToYoloSeg(video_dir=r'/path/to/videos', sam_path=r'/path/to/sam3.pt', save_dir=r'/path/to/yolo_project', txt_prompt='mouse', n_frames=50)
>>> runner.run()

SAM3 -> YOLO bounding-box (detection) project

Merge multiple YOLO projects

class simba.third_party_label_appenders.transform.merge_yolo_projects.MergeYoloProjects(yaml_paths, save_dir, train_val_split=None, seed=None, verbose=True)[source]

Bases: object

Merge multiple YOLO projects into a single YOLO project.

Reads each project's YAML, validates that all projects share the same task type (bounding-box detection, segmentation, or keypoint pose) and class names, then copies all images and labels into a single output project with train/val splits.

Parameters
  • yaml_paths (List[Union[str, os.PathLike]]) – List of paths to YOLO project YAML files.

  • save_dir (Union[str, os.PathLike]) – Root output directory for the merged project.

  • train_val_split (Optional[float]) – If provided, reshuffle all samples and split at this ratio (0.1-0.9). If None, preserve each project's existing train/val assignments. Default: None.

  • seed (Optional[int]) – Random seed for reproducible splitting. Only used when train_val_split is not None.

  • verbose (bool) – If True, print progress. Default: True.
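
The reshuffle-and-split behavior described for train_val_split and seed can be sketched as follows. This is a minimal illustration under the assumption that samples are pooled, shuffled once with a seeded generator, and cut at the requested ratio; the function name split_samples is hypothetical and not part of SimBA.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_samples(
    samples: Sequence[str],
    train_val_split: float,
    seed: Optional[int] = None,
) -> Tuple[List[str], List[str]]:
    """Shuffle pooled samples with a seeded RNG and cut them into train/val lists."""
    if not 0.1 <= train_val_split <= 0.9:
        raise ValueError("train_val_split must be between 0.1 and 0.9")
    shuffled = list(samples)
    # A private Random instance keeps the split reproducible without
    # touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_val_split)
    return shuffled[:n_train], shuffled[n_train:]


samples = [f"img_{i:03d}.png" for i in range(10)]
train, val = split_samples(samples, train_val_split=0.8, seed=42)
print(len(train), len(val))  # 8 2
```

Calling the function again with the same seed returns the same assignment, which is the purpose of the seed parameter.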

Example

>>> merger = MergeYoloProjects(yaml_paths=[r'/project_a/map.yaml', r'/project_b/map.yaml'], save_dir=r'/merged_project', train_val_split=0.8)
>>> merger.run()

Multi-animal DeepLabCut predictions -> YOLO pose-estimation annotations format conversion

class simba.third_party_label_appenders.transform.dlc_ma_h5_to_yolo.MADLCH52Yolo(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, threshold=0, train_size=0.7, flip_idx=None, greyscale=False, clahe=False, padding=0.0, single_id=None)[source]

Bases: object

Convert multi-animal DeepLabCut pose estimation H5 data and corresponding videos into YOLO keypoint dataset format.

Note

This converts DeepLabCut inference data to YOLO keypoints (not DeepLabCut annotations).

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory path containing DLC-generated H5 files with inferred keypoints.

  • video_dir (Union[str, os.PathLike]) – Directory path containing corresponding videos from which frames are to be extracted.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.

  • frms_cnt (Optional[int]) – Number of frames to randomly sample from each video for conversion. If None, all frames are used.

  • threshold (float) – Minimum confidence score threshold to filter out low-confidence pose instances. Only instances with instance.score >= this threshold are used. Default: 0.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: True.

  • flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping. If None, it will be inferred.

  • padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions), enlarging them by this percentage. Useful, e.g., when all body-parts lie along the length of the animal. Default: 0.0.

  • single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, and the YOLO data will be formatted to the number of objects which the H5 data contains.

Returns

None. Results saved in save_dir.

Example

>>> DATA_DIR = r'D:\troubleshooting\dlc_h5_multianimal_to_yolo\data'
>>> VIDEO_DIR = r'D:\troubleshooting\dlc_h5_multianimal_to_yolo\videos'
>>> SAVE_DIR = r"D:\imgs\madlc"
>>> runner = MADLCH52Yolo(data_dir=DATA_DIR, video_dir=VIDEO_DIR, save_dir=SAVE_DIR, clahe=True, single_id='animal_1')
>>> runner.run()

DeepLabCut predictions -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.dlc_to_yolo.DLC2Yolo(dlc_dir, save_dir, train_size=0.7, verbose=False, padding=0.15, flip_idx=None, names=('mouse',), greyscale=False, clahe=False)[source]

Bases: object

Converts DLC annotations into YOLO keypoint format for model training.

Important

Use for single animal DLC data. For multi-animal DLC data,

Note

dlc_dir can be a directory with subdirectories containing images and CSV files with the CollectedData substring filename. For creating the flip_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_flip_idx(). For creating the bp_id_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_bp_id_idx()

Parameters
  • dlc_dir (Union[str, os.PathLike]) – Directory path containing DLC-generated CSV files with keypoint annotations and images.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: False.

  • padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions), enlarging them by this percentage. Useful, e.g., when all body-parts lie along the length of the animal. Default: 0.15.

  • flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping.

  • names (Tuple[str]) – Tuple of animal (class) names. Used for creating the YAML class names mapping file.
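
The effect of padding can be sketched as follows: a tight box is taken around the keypoints, grown on each side by a fraction of the image dimensions, and clamped to the frame. This is an illustrative sketch assuming that reading of "relative to image dimensions"; the function name padded_bbox is hypothetical and not SimBA's implementation.

```python
from typing import Sequence, Tuple


def padded_bbox(
    keypoints: Sequence[Tuple[float, float]],
    img_w: int,
    img_h: int,
    padding: float = 0.0,
) -> Tuple[float, float, float, float]:
    """Return (x1, y1, x2, y2) around the keypoints, padded and clamped to the image."""
    if not keypoints:
        raise ValueError("at least one keypoint is required")
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    # Padding is a fraction of the image size, not of the box size (assumption).
    pad_x, pad_y = padding * img_w, padding * img_h
    x1 = max(0.0, min(xs) - pad_x)
    y1 = max(0.0, min(ys) - pad_y)
    x2 = min(float(img_w), max(xs) + pad_x)
    y2 = min(float(img_h), max(ys) + pad_y)
    return x1, y1, x2, y2


# Three keypoints lying almost on a horizontal line (e.g. nose, centre, tail base).
kps = [(100.0, 200.0), (150.0, 210.0), (200.0, 205.0)]
print(padded_bbox(kps, img_w=640, img_h=480, padding=0.0))
print(padded_bbox(kps, img_w=640, img_h=480, padding=0.05))
```

Without padding the box in this example is only 10 pixels tall, which shows why padding helps when all body-parts lie along the length of the animal.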

Returns

None. Results saved in save_dir.

Example

>>> DLC_DIR = r'D:/rat_resident_intruder/dlc_data'
>>> SAVE_DIR = r'D:/rat_resident_intruder/yolo_3'
>>> runner = DLC2Yolo(dlc_dir=DLC_DIR, save_dir=SAVE_DIR, verbose=True, clahe=True, names=('resident', 'intruder'))
>>> runner.run()

Labelme annotations -> YOLO bounding box annotations

class simba.third_party_label_appenders.transform.labelme_to_yolo.LabelmeBoundingBoxes2YoloBoundingBoxes(labelme_dir, save_dir, obb=False, verbose=True, clahe=False, train_size=0.7, greyscale=False)[source]

Bases: object

Convert LabelMe annotations in json to YOLO format and save the corresponding images and labels in txt format.

Note

For more information on the LabelMe annotation tool, see the LabelMe GitHub repository. The LabelMe JSON files have to contain an imageData key holding the image as a base64 string. For an expected LabelMe JSON format, see THIS FILE.

See also

To split YOLO data into train, test, and validation sets (as expected by, e.g., Ultralytics), see simba.third_party_label_appenders.converters.split_yolo_train_test_val(). To convert Labelme points annotations to YOLO keypoint training data, see simba.third_party_label_appenders.transform.labelme_to_yolo_keypoints.LabelmeKeypoints2YoloKeypoints().

Important

This class produces YOLO bounding boxes (not YOLO keypoint data!) from Labelme keypoints.

Parameters
  • labelme_dir (Union[str, os.PathLike]) – Path to the directory containing LabelMe annotation .json files.

  • save_dir (Union[str, os.PathLike]) – Directory where the YOLO-format images and labels will be saved. Will create 'images/', 'labels/', and 'map.json' inside this directory.

  • obb (bool) – If True, saves annotations as oriented bounding boxes (8 coordinates). If False, uses the standard YOLO format (x_center, y_center, width, height).

  • verbose (bool) – If True, prints progress messages during conversion.

Example

>>> LABELME_DIR = r'D:\platea\ts_annotations'
>>> SAVE_DIR = r"D:\platea\yolo"
>>> runner = LabelmeBoundingBoxes2YoloBoundingBoxes(labelme_dir=LABELME_DIR, save_dir=SAVE_DIR)
>>> runner.run()
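
The difference between the two obb settings can be sketched as follows. The helper name bbox_label is hypothetical, and the oriented layout shown (class id followed by four normalized corner points) follows the Ultralytics OBB label convention rather than anything confirmed about SimBA's internals.

```python
from typing import List, Tuple


def bbox_label(class_id: int,
               corners: List[Tuple[float, float]],
               img_w: int,
               img_h: int,
               obb: bool = False) -> str:
    """Hypothetical helper: format four pixel corners as one YOLO label line."""
    if obb:
        # Oriented format: class id followed by the four normalized corner points.
        values = [v for x, y in corners for v in (x / img_w, y / img_h)]
    else:
        # Standard format: normalized center and size of the enclosing box.
        xs = [x for x, _ in corners]
        ys = [y for _, y in corners]
        values = [(min(xs) + max(xs)) / 2 / img_w,
                  (min(ys) + max(ys)) / 2 / img_h,
                  (max(xs) - min(xs)) / img_w,
                  (max(ys) - min(ys)) / img_h]
    return " ".join([str(class_id)] + [f"{v:.6f}" for v in values])


corners = [(100, 50), (300, 50), (300, 150), (100, 150)]
standard_line = bbox_label(0, corners, img_w=400, img_h=200)
obb_line = bbox_label(0, corners, img_w=400, img_h=200, obb=True)
```

The standard line carries five fields and the oriented line nine, which is what "8 coordinates" refers to.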

Labelme points -> YOLO keypoints annotations

class simba.third_party_label_appenders.transform.labelme_to_yolo_keypoints.LabelmeKeypoints2YoloKeypoints(data_path, save_dir, greyscale=True, train_size=0.7, padding=0.0, names=('mouse',), flip_idx=None, clahe=True, verbose=True)[source]

Bases: object

Labelme points -> YOLO segmentation annotations

class simba.third_party_label_appenders.transform.labelme_to_yolo_seg.LabelmeKeypoints2YoloSeg(data_path, save_dir, greyscale=True, train_size=0.7, padding=0, names=('mouse',), clahe=True, verbose=True)[source]

Bases: object

SimBA ROIs -> YOLO bounding box annotations

class simba.third_party_label_appenders.transform.simba_roi_to_yolo.SimBAROI2Yolo(config_path=None, roi_path=None, video_dir=None, save_dir=None, roi_frm_cnt=10, train_size=0.7, obb=False, greyscale=False, clahe=False, verbose=True)[source]

Bases: object

Converts SimBA ROI definitions into annotations and images for training a YOLO network.

Parameters
  • config_path (Optional[Union[str, os.PathLike]]) – Optional path to the project config file in the SimBA project.

  • roi_path (Optional[Union[str, os.PathLike]]) – Path to the SimBA ROI definitions .h5 file. If None, then the roi_coordinates_path of the project.

  • video_dir (Optional[Union[str, os.PathLike]]) – Directory where to find the videos. If None, then the videos folder of the project.

  • save_dir (Optional[Union[str, os.PathLike]]) – Directory where to save the labels and images. If None, then the logs folder of the project.

  • roi_frm_cnt (Optional[int]) – Number of frames for each video to create bounding boxes for. Default: 10.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • obb (Optional[bool]) – If True, creates oriented YOLO bounding boxes. Else, axis-aligned YOLO bounding boxes. Default: False.

  • greyscale (Optional[bool]) – If True, converts the images to greyscale if RGB. Default: False.

  • verbose (Optional[bool]) – If True, prints progress. Default: True.

Returns

None

Example I

>>> SimBAROI2Yolo(config_path=r"C:/troubleshooting/RAT_NOR/project_folder/project_config.ini").run()

Example II

>>> SimBAROI2Yolo(config_path=r"C:/troubleshooting/RAT_NOR/project_folder/project_config.ini", save_dir=r"C:/troubleshooting/RAT_NOR/project_folder/logs/yolo", video_dir=r"C:/troubleshooting/RAT_NOR/project_folder/videos", roi_path=r"C:/troubleshooting/RAT_NOR/project_folder/logs/measures/ROI_definitions.h5").run()

Example III

>>> SimBAROI2Yolo(video_dir=r"C:/troubleshooting/RAT_NOR/project_folder/videos", roi_path=r"C:/troubleshooting/RAT_NOR/project_folder/logs/measures/ROI_definitions.h5", save_dir=r'C:/troubleshooting/RAT_NOR/project_folder/yolo', verbose=True, roi_frm_cnt=20, obb=True).run()

SimBA pose-estimation -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.simba_to_yolo.SimBA2Yolo(config_path, save_dir, data_dir=None, train_size=0.7, verbose=False, greyscale=False, clahe=False, padding=0.0, threshold=0.0, flip_idx=None, names=('animal_1',), sample_size=None, bp_id_idx=None, single_id=None)[source]

Bases: object

Convert pose estimation data from a SimBA project into the YOLO keypoint format, including frame sampling, image-label pair creation, bounding box computation, and train/validation splitting.

Parameters
  • config_path (Union[str, os.PathLike]) – Path to the SimBA project .ini configuration file.

  • save_dir (Union[str, os.PathLike]) – Directory where YOLO-formatted data will be saved. Subdirectories for images/labels (train/val) are created.

  • data_dir (Optional[Union[str, os.PathLike]]) – Optional directory containing outlier-corrected SimBA pose estimation data. If None, uses path from config.

  • train_size (float) – Proportion of samples to allocate to the training set (range 0.1 to 0.99). Remaining samples go to validation.

  • verbose (bool) – If True, prints progress updates to the console.

  • greyscale (bool) – If True, saves extracted video frames in greyscale. Otherwise, saves in color.

  • padding (float) – Padding added around the bounding box (as a proportion of image dimensions, range 0.0 to 1.0). Useful if animal body-parts are in a "line".

  • flip_idx (Tuple[int, ...]) – Tuple defining symmetric keypoint indices for horizontal flipping. Used to write the map.yaml file. If None, then attempt to infer.

  • names (Dict[int, str]) – Dictionary mapping instance IDs to class names. Used in annotation labels and map.yaml.

  • sample_size (Optional[int]) – If specified, limits the number of randomly sampled frames per video. If None, all frames are used.

  • bp_id_idx (Optional[Dict[int, Union[Tuple[int], List[int]]]]) – Optional mapping of instance IDs to keypoint index groups, allowing support for multiple animals per frame. Must match keys in map_dict.

  • single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, and the YOLO data will be formatted to the number of objects which the H5 data contains.

Returns

None. Saves YOLO-formatted images and annotations to disk in the save_dir location.

Example

>>> SAVE_DIR = r'D:\troubleshooting\mitra\mitra_yolo'
>>> CONFIG_PATH = r"C:\troubleshooting\mitra\project_folder\project_config.ini"
>>> runner = SimBA2Yolo(config_path=CONFIG_PATH, save_dir=SAVE_DIR, sample_size=10, verbose=True)
>>> runner.run()
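
To make the target format concrete, the sketch below writes one label line in the Ultralytics pose layout: class id, normalized box, then an (x, y, visibility) triplet per keypoint. The helper name pose_label and the way threshold is applied (low-confidence points written as "not labeled") are assumptions for illustration; SimBA may treat thresholded points differently.

```python
from typing import List, Tuple


def pose_label(class_id: int,
               bbox: Tuple[float, float, float, float],
               keypoints: List[Tuple[float, float, float]],
               img_w: int,
               img_h: int,
               threshold: float = 0.0) -> str:
    """Hypothetical helper: one label line in the Ultralytics pose layout.

    bbox is already normalized (x_center, y_center, width, height);
    keypoints are (x_pixel, y_pixel, confidence) triplets.
    """
    fields = [str(class_id)] + [f"{v:.6f}" for v in bbox]
    for x, y, p in keypoints:
        if p < threshold:
            # Assumed convention: low-confidence points are written as not labeled.
            fields += ["0.000000", "0.000000", "0"]
        else:
            # Visibility flag 2 marks a labeled, visible keypoint.
            fields += [f"{x / img_w:.6f}", f"{y / img_h:.6f}", "2"]
    return " ".join(fields)


line = pose_label(0, (0.5, 0.5, 0.2, 0.1),
                  [(100.0, 50.0, 0.99), (120.0, 60.0, 0.10)],
                  img_w=200, img_h=100, threshold=0.5)
```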

SimBA pose-estimation -> YOLO segmentation annotations

class simba.third_party_label_appenders.transform.simba_to_yolo_seg.SimBA2YoloSegmentation(config_path, save_dir, data_dir=None, train_size=0.7, verbose=False, greyscale=False, clahe=False, padding=0, threshold=0.0, sample_size=None, single_id=None)[source]

Bases: simba.mixins.config_reader.ConfigReader

Convert pose estimation data from a SimBA project into the YOLO segmentation format, including frame sampling, image-label pair creation, bounding box computation, and train/validation splitting.

Parameters
  • config_path (Union[str, os.PathLike]) – Path to the SimBA project .ini configuration file.

  • save_dir (Union[str, os.PathLike]) – Directory where YOLO-formatted data will be saved. Subdirectories for images/labels (train/val) are created.

  • data_dir (Optional[Union[str, os.PathLike]]) – Optional directory containing outlier-corrected SimBA pose estimation data. If None, uses path from config.

  • train_size (float) – Proportion of samples to allocate to the training set (range 0.1 to 0.99). Remaining samples go to validation.

  • verbose (bool) – If True, prints progress updates to the console.

  • greyscale (bool) – If True, saves extracted video frames in greyscale. Otherwise, saves in color.

  • padding (float) – Padding added around the bounding box (as a proportion of image dimensions, range 0.0 to 1.0). Useful if animal body-parts are in a "line".

  • sample_size (Optional[int]) – If specified, limits the number of randomly sampled frames per video. If None, all frames are used.

  • single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, and the YOLO data will be formatted to the number of objects which the H5 data contains.

Returns

None. Saves YOLO-formatted images and annotations to disk in the save_dir location.

Example

>>> SAVE_DIR = r'D:\troubleshooting\mitra\mitra_yolo'
>>> CONFIG_PATH = r"C:\troubleshooting\mitra\project_folder\project_config.ini"
>>> runner = SimBA2YoloSegmentation(config_path=CONFIG_PATH, save_dir=SAVE_DIR, sample_size=10, verbose=True)
>>> runner.run()
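
A YOLO segmentation label is a class id followed by normalized polygon vertices. One plausible way to obtain a polygon from pose keypoints is a convex hull, sketched below with Andrew's monotone chain algorithm. Both the hull approach and the helper names are assumptions for illustration; the method SimBA actually uses to derive the polygon may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        # Drop points that would make a non-left turn (interior or collinear).
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain repeats the first point of the other.
    return lower[:-1] + upper[:-1]


def seg_label(class_id: int, keypoints: List[Point], img_w: int, img_h: int) -> str:
    """Hypothetical helper: class id followed by normalized polygon vertices."""
    hull = convex_hull(keypoints)
    coords = [f"{v:.6f}" for x, y in hull for v in (x / img_w, y / img_h)]
    return " ".join([str(class_id)] + coords)


# The interior point (50, 50) is dropped; only the four outer points remain.
line = seg_label(0, [(10, 10), (90, 10), (90, 90), (10, 90), (50, 50)], 100, 100)
```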

SLEAP CSV predictions -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.sleap_csv_to_yolo.Sleap2Yolo(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, instance_threshold=0, train_size=0.7, flip_idx=None, names=None, greyscale=False, clahe=False, padding=0.0, single_id=None)[source]

Bases: object

Convert SLEAP pose estimation CSV data and corresponding videos into YOLO keypoint dataset format.

Note

This converts SLEAP inference data to YOLO keypoints (not SLEAP annotations).

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory path containing SLEAP-generated CSV files with inferred keypoints.

  • video_dir (Union[str, os.PathLike]) – Directory path containing corresponding videos from which frames are to be extracted.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.

  • frms_cnt (Optional[int]) – Number of frames to randomly sample from each video for conversion. If None, all frames are used.

  • instance_threshold (float) – Minimum confidence score threshold to filter out low-confidence pose instances. Only instances with instance.score >= this threshold are used.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: True.

  • flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping.

  • map_dict (Dict[str, int]) – Dictionary mapping class indices to class names. Used for creating the YAML class names mapping file.

  • padding (float) – Fractional padding added around the bounding boxes (relative to image dimensions), enlarging each box by this proportion. Useful, for example, when all body-parts lie along the length of the animal. Default: 0.0.

  • single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, and the YOLO data will be formatted to the number of objects which the H5 data contains.

Returns

None. Results saved in save_dir.

Example

>>> DATA_DIR = r'D:\ares\data\ant\sleap_csv'
>>> VIDEO_DIR = r'D:\ares\data\ant\sleap_video'
>>> SAVE_DIR = r"D:\imgs\sleap_csv"
>>> runner = Sleap2Yolo(data_dir=DATA_DIR, video_dir=VIDEO_DIR, frms_cnt=50, train_size=0.8, instance_threshold=0.9, save_dir=SAVE_DIR, single_id='ant')
>>> runner.run()
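
The effect of instance_threshold can be pictured as a simple filter on the instance.score value. In the sketch below the rows are hypothetical stand-ins for records read from a SLEAP CSV, and frame_idx is an assumed column name used only for this example.

```python
from typing import Dict, List

# Hypothetical rows standing in for per-instance records read from a SLEAP CSV.
rows: List[Dict[str, float]] = [
    {"frame_idx": 0, "instance.score": 0.95},
    {"frame_idx": 1, "instance.score": 0.42},
    {"frame_idx": 2, "instance.score": 0.90},
]


def filter_instances(records: List[Dict[str, float]],
                     instance_threshold: float) -> List[Dict[str, float]]:
    """Keep records whose instance.score is at or above the threshold."""
    return [r for r in records if r["instance.score"] >= instance_threshold]


kept = filter_instances(rows, instance_threshold=0.9)
kept_frames = [r["frame_idx"] for r in kept]
```

Note that the comparison is inclusive, so an instance scoring exactly 0.9 is kept at instance_threshold=0.9.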

SLEAP H5 predictions -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.sleap_h5_to_yolo.SleapH52Yolo(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, threshold=0, train_size=0.7, flip_idx=None, animal_cnt=2, greyscale=False, clahe=False, padding=0.0, single_id=None)[source]

Bases: object

Convert SLEAP .h5 pose estimation annotations to YOLO keypoint annotation format.

Reads SLEAP .h5 files and associated videos, samples frames based on a confidence threshold, extracts keypoints for one or more animals, and saves image-label pairs in a format compatible with YOLOv8 keypoint training.

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory containing SLEAP .h5 files.

  • video_dir (Union[str, os.PathLike]) – Directory containing the videos associated with .h5 files.

  • save_dir (Union[str, os.PathLike]) – Directory to save YOLO-formatted images, labels, and metadata.

  • frms_cnt (Optional[int]) – Number of frames to sample per video. If None, all valid frames are used.

  • verbose (bool) – If True, print progress during processing.

  • threshold (float) – Likelihood threshold below which poses are discarded.

  • train_size (float) – Proportion of frames to assign to the training set (rest go to validation).

  • flip_idx (Tuple[int, ...]) – Tuple indicating how to flip body-parts for augmentation. Length must match keypoint count.

  • animal_cnt (int) – Number of animals tracked per frame.

  • greyscale (bool) – If True, convert images to grayscale.

  • clahe (bool) – If True, apply CLAHE (Contrast Limited Adaptive Histogram Equalization).

  • padding (float) – Relative padding to apply around the bounding box of keypoints (range 0.0 to 1.0).

  • single_id (Optional[str]) – Optional custom ID to assign all annotations the same class (used in single-animal datasets).

Example

>>> DATA_DIR = r'D:/ares/data/termite_1/data'
>>> VIDEO_DIR = r'D:/ares/data/termite_1/video'
>>> SAVE_DIR = r"D:/imgs/sleap_h5"
>>> runner = SleapH52Yolo(data_dir=DATA_DIR, video_dir=VIDEO_DIR, save_dir=SAVE_DIR, threshold=0.9, frms_cnt=50, single_id='termite')
>>> runner.run()
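
Because flip_idx must have one entry per keypoint, it can help to validate it before running a conversion. The sketch below checks that flip_idx is a permutation of the keypoint indices and shows how such a tuple reorders horizontally mirrored keypoints. Both helper names and the body-part order are hypothetical.

```python
from typing import List, Sequence, Tuple


def check_flip_idx(flip_idx: Sequence[int], keypoint_count: int) -> None:
    """Raise if flip_idx is not a permutation of 0..keypoint_count-1."""
    if sorted(flip_idx) != list(range(keypoint_count)):
        raise ValueError(f"flip_idx must reorder all {keypoint_count} keypoints exactly once.")


def flip_keypoints(keypoints: List[Tuple[float, float]],
                   flip_idx: Sequence[int]) -> List[Tuple[float, float]]:
    """Mirror normalized keypoints horizontally and reorder them with flip_idx."""
    mirrored = [(1.0 - x, y) for x, y in keypoints]
    return [mirrored[i] for i in flip_idx]


# Assumed body-part order: nose, left_ear, right_ear.
flip_idx = (0, 2, 1)
check_flip_idx(flip_idx, keypoint_count=3)
flipped = flip_keypoints([(0.5, 0.2), (0.25, 0.25), (0.75, 0.25)], flip_idx)
```

After mirroring, the point that was the left ear sits on the right side of the image, so swapping indices 1 and 2 keeps each label attached to the anatomically correct side.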

SLEAP annotations -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.sleap_to_yolo.SleapAnnotations2Yolo(sleap_dir, save_dir, video_dir=None, padding=None, train_size=0.8, verbose=True, greyscale=False, clahe=False, single_id=None)[source]

Bases: object

Convert SLEAP annotations to YOLO formatted training data.

Parameters
  • sleap_dir (Union[str, os.PathLike]) – Directory containing SLEAP annotation .slp files.

  • save_dir (Union[str, os.PathLike]) – Directory to save YOLO-formatted images, labels, and metadata.

  • verbose (bool) – If True, print progress during processing.

  • train_size (float) – Proportion of frames to assign to the training set (rest go to validation).

  • greyscale (bool) – If True, convert images to grayscale.

  • clahe (bool) – If True, apply CLAHE (Contrast Limited Adaptive Histogram Equalization).

  • padding (float) – Relative padding to apply around the bounding box of keypoints (range 0.0 to 1.0).

  • single_id (Optional[str]) – Optional custom ID to assign all annotations the same class (used in single-animal datasets).

Example

>>> runner = SleapAnnotations2Yolo(sleap_dir=r'D:/cvat_annotations/frames/slp_to_yolo', save_dir=r'D:/cvat_annotations/frames/slp_to_yolo/yolo')
>>> runner.run()
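
The train_size split can be pictured as a shuffle followed by a cut. The sketch below is a minimal stand-in using the standard library; the helper name split_train_val, the fixed seed, and the rounding rule are assumptions, not SimBA's implementation.

```python
import random
from typing import List, Tuple


def split_train_val(sample_names: List[str],
                    train_size: float = 0.8,
                    seed: int = 0) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: shuffle samples and split them by train_size."""
    shuffled = list(sample_names)
    # Seeded only so that this sketch is repeatable.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_size)
    return shuffled[:n_train], shuffled[n_train:]


names = [f"frame_{i:04d}" for i in range(10)]
train, val = split_train_val(names, train_size=0.8)
```

With ten samples and train_size=0.8, eight names go to the training list and two to validation, and every sample lands in exactly one of the two.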

LightningPose keypoints -> YOLO bounding box conversion

class simba.third_party_label_appenders.transform.litpose_to_yolo_bbox.LitPose2YOLOBbox(litpose_dir, save_dir, train_size=0.7, verbose=False, padding=0.0, sample_n=None, names=('mouse',), greyscale=False, clahe=False)[source]

Bases: object

Convert LitPose keypoint annotations into a YOLO bounding-box dataset.

Parameters
  • litpose_dir (Union[str, os.PathLike]) – Path to LitPose directory containing annotation CSV files and the labeled-data image folder.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images and labels subdirectories are created.

  • train_size (float) – Fraction of samples assigned to the training split. Default 0.7.

  • verbose (bool) – If True, print per-image progress during conversion.

  • padding (float) – Extra fractional padding around each axis-aligned box inferred from keypoints.

  • sample_n (Optional[int]) – Optional cap on the number of sampled frames before split. If None, all frames are used.

  • names (Tuple[str, ...]) – Class names in YOLO index order.

  • greyscale (bool) – If True, load and save images in grayscale.

  • clahe (bool) – If True, apply CLAHE preprocessing when reading images.

References

[1] Lightning Pose documentation: https://lightning-pose.readthedocs.io/en/latest/

[2] Biderman et al., Lightning Pose: improved animal pose estimation via semi-supervised learning, Bayesian ensembling and cloud-native open-source tools, Nature Methods (2024), doi: https://doi.org/10.1038/s41592-024-02319-1

Example

>>> runner = LitPose2YOLOBbox(litpose_dir=r'Z:\home\simon\lp_300126', save_dir=r'E:\litpose_yolobox', verbose=True, clahe=False, greyscale=False, sample_n=1000, padding=0.15)
>>> runner.run()
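
"Class names in YOLO index order" means the position of each name in the tuple becomes its class id. The sketch below builds that mapping and renders it as an Ultralytics-style dataset YAML. The helper names, the example path, and the images/train and images/val layout are assumptions; the file SimBA actually writes may contain different keys.

```python
from typing import Dict, Sequence


def class_map(names: Sequence[str]) -> Dict[int, str]:
    """Map YOLO class indices to names, in the order the names were given."""
    return dict(enumerate(names))


def dataset_yaml(save_dir: str, names: Sequence[str]) -> str:
    """Hypothetical helper: text of an Ultralytics-style dataset YAML."""
    lines = [f"path: {save_dir}", "train: images/train", "val: images/val", "names:"]
    lines += [f"  {idx}: {name}" for idx, name in class_map(names).items()]
    return "\n".join(lines) + "\n"


# The save directory below is a placeholder, not a path from the documentation.
yaml_text = dataset_yaml("/data/litpose_yolobox", names=("mouse",))
```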

LightningPose keypoints -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.litpose_to_yolo_keypoints.LitPose2YOLO(litpose_dir, save_dir, train_size=0.7, verbose=False, padding=0.0, sample_n=None, flip_idx=None, names=('mouse',), greyscale=False, clahe=False)[source]

Bases: object

Convert LitPose keypoint annotations into a YOLO keypoint dataset.

Parameters
  • litpose_dir (Union[str, os.PathLike]) – Path to LitPose directory containing annotation CSV files and the labeled-data image folder.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images and labels subdirectories are created.

  • train_size (float) – Fraction of samples assigned to the training split. Default 0.7.

  • verbose (bool) – If True, print per-image progress during conversion.

  • padding (float) – Extra padding factor used when computing normalized YOLO boxes from keypoints.

  • sample_n (Optional[int]) – Optional cap on the number of sampled frames before split. If None, all frames are used.

  • flip_idx (Optional[Tuple[int, ...]]) – Optional keypoint flip index order for YOLO pose augmentation. If None, inferred from body-part names.

  • names (Tuple[str, ...]) – Class names in YOLO index order.

  • greyscale (bool) – If True, load and save images in grayscale.

  • clahe (bool) – If True, apply CLAHE preprocessing when reading images.

References

[1] Lightning Pose documentation: https://lightning-pose.readthedocs.io/en/latest/

[2] Biderman et al., Lightning Pose: improved animal pose estimation via semi-supervised learning, Bayesian ensembling and cloud-native open-source tools, Nature Methods (2024), doi: https://doi.org/10.1038/s41592-024-02319-1

Example

>>> runner = LitPose2YOLO(litpose_dir=r'Z:\home\simon\lp_300126', save_dir=r'E:\litpose_yolobox', verbose=True, clahe=False, greyscale=False, sample_n=1000, padding=0.15)
>>> runner.run()
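
When flip_idx is None it is described as being inferred from body-part names. One simple way such inference could work is to pair names that differ only by "left" and "right", as sketched below. The helper name infer_flip_idx and the matching rule are assumptions for illustration and may not match how SimBA performs the inference.

```python
from typing import List, Tuple


def infer_flip_idx(body_parts: List[str]) -> Tuple[int, ...]:
    """Hypothetical helper: pair 'left'/'right' names; other parts map to themselves."""
    lowered = [bp.lower() for bp in body_parts]
    flip = []
    for i, name in enumerate(lowered):
        if "left" in name:
            partner = name.replace("left", "right")
        elif "right" in name:
            partner = name.replace("right", "left")
        else:
            partner = name
        # Fall back to the keypoint's own index when no mirrored name exists.
        flip.append(lowered.index(partner) if partner in lowered else i)
    return tuple(flip)


flip_idx = infer_flip_idx(["nose", "left_ear", "right_ear", "tail_base"])
```

Midline parts such as the nose and tail base keep their own index, while the two ears swap places.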