YOLO methods
Methods for training YOLO models, creating training and validation datasets, and converting behavioral neuroscience specific datasets to YOLO datasets.
Utilities
- simba.utils.yolo.apply_fixed_bbox_size(data, video_name, img_w, img_h, bbox_size)
Apply a fixed axis-aligned bounding-box size to detected rows in a results table.
The current box center is preserved, then the box is resized to bbox_size (h, w). If the resized box would exceed frame boundaries, the box is shifted so it remains fully inside the image while preserving the requested size.
The function expects YOLO corner columns X1..Y4 and updates them in-place on the input dataframe before returning it.
- Parameters
data (pd.DataFrame) – Detection dataframe containing CONFIDENCE and corner coordinate columns X1, Y1, X2, Y2, X3, Y3, X4, Y4.
video_name (str) – Video identifier used in error messages.
img_w (int) – Image width in pixels.
img_h (int) – Image height in pixels.
bbox_size (Tuple[int, int]) – Target fixed bounding-box size as (height, width) in pixels.
- Returns
Input dataframe with updated fixed-size bbox coordinates for detected rows.
- Return type
pd.DataFrame
- Raises
InvalidInputError – If required columns are missing or if bbox_size is larger than image dimensions.
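The center-preserving resize and boundary shift described for apply_fixed_bbox_size can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name fixed_size_box is hypothetical, it takes a single box as left/top/right/bottom values rather than the X1..Y4 dataframe columns, and it raises ValueError where SimBA raises InvalidInputError.

```python
from typing import Tuple


def fixed_size_box(x1: float, y1: float, x2: float, y2: float,
                   img_w: int, img_h: int,
                   bbox_size: Tuple[int, int]) -> Tuple[float, float, float, float]:
    """Return (left, top, right, bottom) of a box of size bbox_size=(height, width),
    centered on the input box and shifted to stay inside the image."""
    box_h, box_w = bbox_size
    if box_w > img_w or box_h > img_h:
        raise ValueError("bbox_size is larger than the image")
    # Preserve the current center.
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    left, top = cx - box_w / 2.0, cy - box_h / 2.0
    # Shift (rather than clip) so the requested size is preserved.
    left = min(max(left, 0.0), img_w - box_w)
    top = min(max(top, 0.0), img_h - box_h)
    return left, top, left + box_w, top + box_h


# A box well inside the frame keeps its center.
centered = fixed_size_box(100, 100, 140, 120, img_w=640, img_h=480, bbox_size=(50, 60))
# A box near the top-left corner is pushed back inside the frame.
near_edge = fixed_size_box(0, 0, 20, 20, img_w=640, img_h=480, bbox_size=(50, 60))
```

Shifting instead of clipping is what keeps every output box the same size, which matters when downstream code crops fixed-size patches.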
- simba.utils.yolo.create_yolo_sample_visualizations(samples, save_dir, names=None, palette='Set1', seg_opacity=0.5, draw_labels=True, verbose=True, source='')
Create annotated visualizations from YOLO-format (image, label_str) samples.
Auto-detects annotation type (bounding-box or segmentation) from the label string format and draws the appropriate overlays. Images are saved as PNG files in save_dir.
- Parameters
samples (List[Tuple[str, np.ndarray, str]]) – List of (sample_name, image, label_str) tuples produced by a SAM3-to-YOLO converter.
save_dir (Union[str, os.PathLike]) – Directory where annotated images are saved. Created if it does not exist.
names (Optional[Tuple[str, ...]]) – Class names in index order. Required when draw_labels=True; otherwise optional and only used to size the color palette. Default None.
palette (str) – Color palette name. Default 'Set1'.
seg_opacity (float) – Opacity of filled segmentation polygons (0.0–1.0). Default 0.5.
draw_labels (bool) – If True, draw the class name text alongside each box/polygon. Default True.
verbose (bool) – Print progress messages. Default True.
source (str) – Caller class name for log messages.
- simba.utils.yolo.detect_yolo_project_type(label_path)
Detect YOLO project type (bbox, keypoint, or segmentation) from a single label file.
bbox: class_id + 4 values (x_center, y_center, w, h)
keypoint: class_id + 4 values + N*3 keypoints (x, y, visibility)
segmentation: class_id + N*2 polygon vertices (N >= 3)
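The three layouts above can be told apart by counting the values on a label line. The sketch below is a hypothetical helper (candidate_label_types is not a SimBA function) that lists every layout consistent with one line. Token counts alone can be ambiguous: ten values after the class id fit both two keypoints and a five-vertex polygon. The source does not say how detect_yolo_project_type resolves such cases, so the sketch returns all candidates.

```python
from typing import List


def candidate_label_types(label_line: str) -> List[str]:
    """List the YOLO annotation types whose token count matches one label line."""
    n_values = len(label_line.split()) - 1  # values after the class_id
    candidates = []
    if n_values == 4:
        candidates.append("bbox")
    if n_values > 4 and (n_values - 4) % 3 == 0:
        candidates.append("keypoint")      # 4 box values + N * (x, y, visibility)
    if n_values >= 6 and n_values % 2 == 0:
        candidates.append("segmentation")  # N >= 3 polygon vertices
    return candidates


bbox_line = "0 0.5 0.5 0.2 0.3"
seg_line = "0 0.1 0.1 0.9 0.1 0.5 0.9"
ambiguous_line = " ".join(["0"] + ["0.5"] * 10)

bbox_types = candidate_label_types(bbox_line)
seg_types = candidate_label_types(seg_line)
ambiguous_types = candidate_label_types(ambiguous_line)
```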
- simba.utils.yolo.export_yolo_model(model_path, export_format, imgsz=256, device=0, int8=False, batch=1, workspace=8, data=None, task=None, dynamic=False, simplify=True, half=False)
Export a YOLO model using Ultralytics model.export.
Wrapper around Ultralytics export that supports common deployment formats (including ONNX and TensorRT engine).
Note
INT8 export is only valid for engine format and cannot be combined with half=True.
Important
When exporting a segmentation model, the imgsz parameter is critical for mask quality. Segmentation requires pixel-level precision along object boundaries, so spatial detail lost to downscaling hurts segmentation far more than detection or pose tasks. Set imgsz as large as your GPU memory allows. The default 256 may be too coarse for high-quality segmentation masks.
- Parameters
model_path (Union[str, os.PathLike]) – Path to source YOLO weights (typically .pt).
export_format (Literal["onnx", "engine", "torchscript", "onnxsimplify", "coreml", "openvino", "pb", "tf", "tflite", "torch"]) – Target export format.
imgsz (int) – Export input image size in pixels.
device (Union[Literal['cpu'], int]) – Export device ('cpu' or CUDA index).
int8 (bool) – If True, request INT8 TensorRT export. Requires export_format='engine'.
batch (int) – Export batch/profile size (must be >= 1). For INT8, ensure calibration data size is at least this value.
workspace (int) – TensorRT workspace budget in GB (must be >= 1).
data (Optional[Union[str, os.PathLike]]) – Optional dataset yaml path used for export/calibration.
task (Optional[Literal["detect", "segment", "classify", "pose", "obb"]]) – Optional explicit YOLO task. Set this to avoid backend task auto-guessing warnings.
dynamic (bool) – If True, build with dynamic input profiles.
half (bool) – If True, request FP16 export where supported.
- Returns
Path-like export artifact returned by Ultralytics.
- Return type
Union[str, os.PathLike]
- Raises
SimBAPAckageVersionError – If Ultralytics is unavailable.
InvalidInputError – On unsupported format or invalid precision combination.
- Example
>>> export_yolo_model(
...     model_path=r"F://netholabs\primintellect_test\mdl\weights\best.pt",
...     export_format='engine',
...     imgsz=256,
...     device=0,
...     int8=True,
...     batch=4,
...     workspace=8,
...     task='detect',
...     dynamic=False,
...     half=False
... )
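The precision rules in the note (INT8 only with the engine format, and never together with half=True) amount to a small pre-flight check. The sketch below is a hypothetical standalone helper, not SimBA's own validation; it raises ValueError where export_yolo_model is documented to raise InvalidInputError.

```python
def check_export_precision(export_format: str, int8: bool = False, half: bool = False) -> None:
    """Raise ValueError for the precision combinations the note rules out."""
    if int8 and export_format != "engine":
        raise ValueError("int8=True requires export_format='engine'")
    if int8 and half:
        raise ValueError("int8=True cannot be combined with half=True")


def is_valid_combination(export_format: str, int8: bool, half: bool) -> bool:
    """Convenience wrapper that reports validity instead of raising."""
    try:
        check_export_precision(export_format, int8=int8, half=half)
        return True
    except ValueError:
        return False


engine_int8_ok = is_valid_combination("engine", int8=True, half=False)
onnx_int8_ok = is_valid_combination("onnx", int8=True, half=False)
int8_half_ok = is_valid_combination("engine", int8=True, half=True)
```

Running a check like this before calling the exporter avoids starting a long TensorRT build that would be rejected anyway.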
- simba.utils.yolo.filter_yolo_keypoint_data(bbox_data, keypoint_data, class_id=None, confidence=None, class_idx=None, confidence_idx=None)
Helper that filters YOLO bounding box and keypoint data based on class ID and/or confidence threshold.
- Parameters
bbox_data (np.ndarray) – A 2D array of shape (N, M) representing YOLO bounding box data, where each row corresponds to one detection and contains class and confidence values.
keypoint_data (np.ndarray) – A 3D array of shape (N, K, 3) representing keypoints for each detection, where K is the number of keypoints per detection.
class_id (Optional[int]) – Target class ID to filter detections. Defaults to None.
confidence (Optional[float]) – Minimum confidence threshold to keep detections. Must be in [0, 1]. Defaults to None.
confidence_idx (int) – Index in bbox_data where the confidence value is stored. The signature default is None; the source docstring gives 5.
class_idx (int) – Index in bbox_data where the class ID is stored. The signature default is None; the source docstring gives 6.
- simba.utils.yolo.fit_yolo(weights_path, model_yaml, save_path, epochs=25, batch=16, plots=True, imgsz=640, format=None, device=0, verbose=True, workers=8)
Trains a YOLO model using specified initial weights and a configuration YAML file.
See also
For the recommended wrapper class with parameter validation, see simba.model.yolo_fit.FitYolo.
- Parameters
weights_path – Path to the pre-trained YOLO model weights (usually a .pt file). Example weights can be found [here](https://huggingface.co/Ultralytics).
model_yaml – YAML file containing paths to the training, validation, and testing datasets and the object class mappings. Example YAML file can be found [here](https://github.com/sgoldenlab/simba/blob/master/misc/ex_yolo_model.yaml).
save_path – Directory path where the trained model, logs, and results will be saved.
epochs – Number of epochs to train the model. Default is 25.
batch – Batch size for training. Default is 16.
- Returns
None. The trained model and associated training logs are saved in the specified save_path.
- Example
>>> fit_yolo(weights_path=r"C:/troubleshooting/coco_data/weights/yolov8n-obb.pt", model_yaml=r"C:/troubleshooting/coco_data/model.yaml", save_path=r"C:/troubleshooting/coco_data/mdl", batch=16)
- simba.utils.yolo.get_yolo_imgsz_and_batch_size(model, raise_error=True)
Attempt to read the image size and batch size baked into a YOLO model.
Note
For .engine (TensorRT) files both values are read straight from the embedded header and represent the fixed input bindings. For .pt and other formats the values are scraped from the training arguments, so imgsz reflects the training size (a sensible default, not a hard constraint) and batch is frequently unavailable.
See also
read_yolo_metadata() (full metadata dictionary)
- Parameters
model (Union[str, os.PathLike, YOLO]) – Path to a YOLO model file, or an already-loaded ultralytics.YOLO instance.
raise_error (bool) – If True (default), raise InvalidInputError when imgsz or batch cannot be found in the model metadata. If False, missing values are returned as None.
- Returns
Tuple of (imgsz, batch_size), each an int (or None if not found and raise_error is False).
- Return type
Tuple[Optional[int], Optional[int]]
- Raises
InvalidInputError – If raise_error is True and imgsz and/or batch is not present in the model metadata.
- Example
>>> get_yolo_imgsz_and_batch_size(r'/models/best.engine')
(256, 192)
>>> get_yolo_imgsz_and_batch_size(r'/models/best.pt', raise_error=False)
(640, None)
- simba.utils.yolo.keypoint_array_to_yolo_annotation_str(x, img_h, img_w, padding=None)
Convert a set of keypoints into a YOLO-format annotation string that includes the normalized bounding box and keypoints.
[x_center y_center width height x1 y1 v1 x2 y2 v2 … xn yn vn]
- Parameters
- Returns
YOLO string representation of the pose-estimation data including bounding box and keypoints.
- Return type
- Example
>>> import numpy as np
>>> x = np.array([[100, 200, 2], [150, 250, 2], [120, 240, 1]])
>>> keypoint_array_to_yolo_annotation_str(x=x, img_h=480, img_w=640)
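To make the string layout concrete, here is a minimal pure-Python sketch of the conversion. It rests on assumptions the source does not confirm: the bounding box is taken as the tight box around all keypoints, padding is ignored, values are written with six decimals, and the helper name keypoints_to_yolo_str is hypothetical. The output of the real function may differ in these details.

```python
from typing import Sequence


def keypoints_to_yolo_str(keypoints: Sequence[Sequence[float]], img_h: int, img_w: int) -> str:
    """Build 'x_center y_center width height x1 y1 v1 ...' with normalized coordinates."""
    xs = [kp[0] for kp in keypoints]
    ys = [kp[1] for kp in keypoints]
    # Tight bounding box around the keypoints (no padding in this sketch).
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    values = [
        (x_min + x_max) / 2 / img_w,  # x_center
        (y_min + y_max) / 2 / img_h,  # y_center
        (x_max - x_min) / img_w,      # width
        (y_max - y_min) / img_h,      # height
    ]
    parts = [f"{v:.6f}" for v in values]
    # Each keypoint contributes normalized x, normalized y, and its visibility flag.
    for x, y, v in keypoints:
        parts += [f"{x / img_w:.6f}", f"{y / img_h:.6f}", str(int(v))]
    return " ".join(parts)


annotation = keypoints_to_yolo_str(
    [[100, 200, 2], [150, 250, 2], [120, 240, 1]], img_h=480, img_w=640
)
tokens = annotation.split()
```

Dividing x values by img_w and y values by img_h keeps every coordinate in [0, 1], so the annotation stays valid if the image is later resized.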
- simba.utils.yolo.load_yolo_model(weights_path, verbose=True, format=None, device=0)
Load a YOLO model.
See also
For recommended wrapper classes that use this function, see simba.model.yolo_fit.FitYolo, simba.model.yolo_inference.YoloInference, simba.model.yolo_pose_inference.YOLOPoseInference, simba.model.yolo_seg_inference.YOLOSegmentationInference, and simba.model.yolo_pose_track_inference.YOLOPoseTrackInference.
- Parameters
weights_path (Union[str, os.PathLike]) – Path to model weights (.pt, .engine, etc).
verbose (bool) – Whether to print loading info.
format (Optional[str]) – Export format, one of VALID_FORMATS or None to skip export.
device (Union[Literal['cpu'], int]) – Device to load the model on: 'cpu' or an integer GPU index.
- Example
>>> load_yolo_model(weights_path=r"/mnt/c/troubleshooting/coco_data/mdl/train8/weights/best.pt", format="onnx", device=0)
- simba.utils.yolo.read_yolo_metadata(model)
Read metadata from a YOLO model file or loaded YOLO instance.
Supports .engine (TensorRT), .pt (PyTorch), .onnx, .torchscript, and any other format that ultralytics.YOLO can load. For .engine files the embedded JSON header is read directly without loading the model. For all other formats the model is loaded via Ultralytics to extract metadata.
- Parameters
model (Union[str, os.PathLike, YOLO]) – Path to a YOLO model file, or an already-loaded ultralytics.YOLO instance.
- Returns
Dictionary of model metadata. Common keys: batch, imgsz, task, names, stride, fp16, dynamic.
- Return type
dict
- Raises
InvalidInputError – If model is not a YOLO instance, not a valid path, or has an unsupported extension.
- Example
>>> meta = read_yolo_metadata('/models/best.engine')
>>> meta['batch']
192
>>> meta['imgsz']
[256, 256]
>>> meta = read_yolo_metadata('/models/best.pt')
>>> meta['task']
'detect'
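Reading an embedded JSON header without loading the model can be sketched as follows. The layout is an assumption: engines exported by Ultralytics are understood to begin with a 4-byte little-endian length followed by UTF-8 JSON, with the TensorRT payload after that. Whether read_yolo_metadata parses the file exactly this way is not stated in the source, and read_engine_header is a hypothetical name. The sketch writes a stand-in file so it runs without TensorRT.

```python
import json
import os
import struct
import tempfile


def read_engine_header(engine_path: str) -> dict:
    """Read the JSON metadata stored in front of the TensorRT payload."""
    with open(engine_path, "rb") as f:
        # First 4 bytes: little-endian length of the JSON block (assumed layout).
        (meta_len,) = struct.unpack("<i", f.read(4))
        return json.loads(f.read(meta_len).decode("utf-8"))


# Build a stand-in file with the assumed layout; the trailing bytes mimic the payload.
meta = json.dumps({"batch": 192, "imgsz": [256, 256], "task": "detect"}).encode("utf-8")
with tempfile.TemporaryDirectory() as tmp_dir:
    fake_engine = os.path.join(tmp_dir, "fake.engine")
    with open(fake_engine, "wb") as f:
        f.write(struct.pack("<i", len(meta)) + meta + b"\x00" * 16)
    header = read_engine_header(fake_engine)
```

Because only the first few kilobytes are read, this kind of lookup stays fast even for engine files that are hundreds of megabytes.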
- simba.utils.yolo.yolo_predict(model, source, half=False, batch_size=4, stream=False, imgsz=640, iou=0.75, device=0, threshold=0.25, max_detections=300, verbose=True, retina_msk=False)
Produce YOLO predictions.
See also
For recommended wrapper classes that use this function, see simba.model.yolo_inference.YoloInference, simba.model.yolo_pose_inference.YOLOPoseInference, and simba.model.yolo_seg_inference.YOLOSegmentationInference.
- Parameters
model (Union[str, os.PathLike]) – Loaded ultralytics.YOLO model. Returned by load_yolo_model().
source (Union[str, os.PathLike, np.ndarray]) – Path to video, video stream, directory, image, or image as loaded array.
half (bool) – Whether to use half precision (FP16) for inference to speed up processing.
stream (bool) – If True, return a generator that yields results one by one. Useful for streams or large videos.
imgsz (int) – Size to resize input images to (square dimension). Must be a positive integer.
iou (float) – If max_detections > 1, the bounding-box overlap allowed when detecting multiple animals.
batch_size (Optional[int]) – If stream is False, the number of images to process in each batch.
device (Union[Literal['cpu'], int]) – Device identifier for inference. Use 'cpu' to force CPU inference, or the integer index of the GPU device (e.g., 0 for 'cuda:0').
threshold (float) – Confidence threshold for filtering predictions. Only detections with confidence >= threshold are returned. Must be between 0.0 and 1.0.
max_detections (int) – Maximum number of detections per image/frame to return.
verbose (bool) – If True, print inference progress and summary information.
- Returns
YOLO results or generator of YOLO results.
Bounding-box inference
- class simba.model.yolo_inference.YoloInference(weights, video_path, verbose=False, save_dir=None, half_precision=True, device=0, batch_size=400, core_cnt=8, threshold=0.25, max_detections=300, max_per_class=None, smoothing_method=None, smoothing_time_window=None, interpolate=False, imgsz=320, bbox_size=None, stream=True)
Bases: object
Performs object detection inference on a video using a YOLO model.
Runs YOLO-based object detection (bounding-box) on one or more video files. It supports GPU acceleration, batch processing, streaming, and optional result saving. The model returns bounding box coordinates and class confidence scores for each frame. Results can be smoothed or interpolated to handle detection gaps.
See also
To perform bounding box and keypoint (pose) detection, see YOLOPoseInference(). To perform keypoint (pose) detection with tracking, see YOLOPoseTrackInference(). To visualize bounding boxes only, see YOLOVisualizer().
EXPECTED RUNTIMES
| VIDEOS (COUNT) | FRAMES (COUNT) | TIME (S) | STDEV (S) |
|---|---|---|---|
| 1 | 9000 | 19.69 | 0.185202592 |
| 2 | 18000 | 39.91333333 | 0.718424202 |
| 3 | 27000 | 59.20333333 | 0.29143324 |
| 4 | 36000 | 80.82 | 1.407870733 |

BATCH SIZE: 500; IMGSZ: 256; NVIDIA GeForce RTX 4070; CPU COUNT (LOADERS): 16; 3 runs
- Parameters
weights (Union[str, os.PathLike, YOLO]) – Path to YOLO model weights or a preloaded ultralytics.YOLO model instance.
video_path (Union[Union[str, os.PathLike], List[Union[str, os.PathLike]]]) – Input video path, list of paths, or directory containing videos.
verbose (Optional[bool]) – If True, print progress information.
save_dir (Optional[Union[str, os.PathLike]]) – Directory to save output CSV files. If None, results are returned in-memory.
half_precision (Optional[bool]) – If True, run inference in fp16 where supported.
device (Union[Literal['cpu'], int]) – Inference device ('cpu' or CUDA index).
batch_size (Optional[int]) – Number of frames per prediction batch.
core_cnt (int) – CPU thread count used by torch.
threshold (float) – Detection confidence threshold in [0.0, 1.0].
max_detections (int) – Maximum detections per frame (total, across all classes) returned by the model.
max_per_class (Optional[int]) – Maximum number of detections to retain per class per frame. E.g., if one 'resident' and one 'intruder' is expected, set this to 1. Defaults to None, meaning all detected instances of each class are retained (up to max_detections).
smoothing_method (Optional[Literal['savitzky-golay', 'bartlett', 'blackman', 'boxcar', 'cosine', 'gaussian', 'hamming', 'exponential']]) – Optional temporal smoothing method for bbox coordinates.
smoothing_time_window (Optional[int]) – Smoothing window in milliseconds. Used only when smoothing_method is not None.
interpolate (bool) – If True, interpolate missing bbox coordinates (nearest, per class).
imgsz (int) – Model inference image size.
bbox_size (Optional[Tuple[int, int]]) – Optional fixed bbox size (height, width) in pixels applied to detected boxes.
stream (Optional[bool]) – If True, use streaming predictions.
- Returns
If save_dir is None, returns a dict mapping video name to result dataframe. Otherwise saves CSVs and returns None.
- Return type
Union[None, Dict[str, pd.DataFrame]]
- Example
>>> video_path = "/mnt/d/netholabs/yolo_videos/input/mp4_20250606083508/2025-05-28_19-50-23.mp4"
>>> i = YoloInference(
...     weights=r"/mnt/c/troubleshooting/coco_data/mdl/train8/weights/best.pt",
...     video_path=video_path,
...     save_dir=r"/mnt/c/troubleshooting/coco_data/mdl/results",
...     verbose=True,
...     device=0,
...     interpolate=True,
...     bbox_size=(128, 128)
... )
>>> i.run()
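The effect of max_per_class on a single frame can be illustrated with a short sketch. It assumes that the detections kept for each class are the ones with the highest confidence; the source states only the count limit, not the ranking rule. The record layout and the helper name keep_top_per_class are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Detection = Tuple[str, float]  # (class_name, confidence); hypothetical record layout


def keep_top_per_class(detections: List[Detection], max_per_class: int) -> List[Detection]:
    """Keep the max_per_class highest-confidence detections of each class."""
    by_class: Dict[str, List[Detection]] = defaultdict(list)
    for det in detections:
        by_class[det[0]].append(det)
    kept: List[Detection] = []
    for dets in by_class.values():
        # Rank within the class by confidence, highest first.
        kept.extend(sorted(dets, key=lambda d: d[1], reverse=True)[:max_per_class])
    return kept


# One frame with a spurious second detection for each class.
frame = [("resident", 0.91), ("intruder", 0.55), ("resident", 0.40), ("intruder", 0.80)]
kept = keep_top_per_class(frame, max_per_class=1)
```

With one resident and one intruder expected, max_per_class=1 removes the duplicate detections, whereas max_detections=2 alone could keep two boxes of the same class.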
NVDEC GPU-accelerated YOLO inference
- class simba.model.yolo_nvdec_inference.YoloNVDECInference(video_path, engine_path, save_dir=None, task='detect', imsz=None, batch_size=None, max_workers=None, gpu_id=0, conf_threshold=0.05, iou_threshold=0.45, keypoint_names=None, vertice_cnt=60, max_detections=None, segment_smoothing=None, interpolate=True, recursive=False, smoothing_method=None, smoothing_time_window=None, verbose=True)
Bases: object
GPU-accelerated YOLO inference on videos using NVDEC decode + TensorRT.
Decodes video frames on GPU via NVDEC (PyNvVideoCodec), runs YOLO detection, pose-estimation, or segmentation through a TensorRT engine with GPU-side letterboxing and NMS, and stores per-frame results as DataFrames.
Important
The number of parallel NVDEC hardware decode engines varies by GPU (e.g., 1 on RTX 4070, 3 on RTX 4090, 7 on H100) and directly controls how many videos can be decoded simultaneously. More NVDEC engines means higher throughput when processing multiple videos. The count is auto-detected via get_nvdec_count(). If your GPU is not listed or the count is incorrect, pass max_workers explicitly.
Important
When running segmentation (task='segment'), the imsz parameter is critical for mask quality. Segmentation requires pixel-level precision along object boundaries, so spatial detail lost to downscaling hurts segmentation far more than detection or pose tasks. Set imsz as large as your GPU memory allows. The default 256 may be too coarse for high-quality segmentation masks.
See also
SAM3ToYoloBBox – create a YOLO bounding-box project from SAM3 annotations.
SAM3ToYoloSeg – create a YOLO segmentation project from SAM3 annotations.
YoloInference – CPU-based YOLO bounding-box inference.
YOLOPoseInference – CPU-based YOLO pose inference.
YOLOSegmentationInference – CPU-based YOLO segmentation inference.
EXPECTED RUNTIMES BOUNDING BOX

| VIDEOS (COUNT) | FRAMES (COUNT) | TIME (S) | STDEV (S) |
|---|---|---|---|
| 5 | 9010 | 11.2562 | 0.569887814 |
| 10 | 18020 | 20.87785 | 0.145593286 |
| 20 | 36040 | 41.24536667 | 1.867777656 |

BATCH SIZE: 10; IMGSZ: 256; ORIGINAL SIZE: 1280x1024; NVIDIA GeForce RTX 4070 (NVDECs: 1); 3 runs
- Parameters
video_path (Union[str, os.PathLike]) – Directory containing input video files, or path to a single video file.
engine_path (Union[str, os.PathLike]) – Path to TensorRT engine file (.engine). If the model exists only in another format, convert it to an engine using simba.utils.yolo.export_yolo_model(). For multi-GPU, place the source .pt weights alongside the engine; per-GPU engines are auto-exported on first run.
save_dir (Optional[Union[str, os.PathLike]]) – Directory for per-video CSV output. If None, results are kept in memory only. Default None.
task (Literal['detect', 'pose', 'segment']) – YOLO task type. Default 'detect'.
imsz (Optional[int]) – Model input image size (square). If None, read from engine metadata. Default None.
batch_size (Optional[int]) – Inference batch size. If None, read from engine metadata. Default None.
max_workers (Optional[int]) – Number of parallel worker processes. If None, auto-detected from GPU NVDEC count. Default None.
gpu_id (Union[int, Tuple[int, ...]]) – CUDA device index or tuple of device indices for multi-GPU inference. Workers are round-robin assigned across listed GPUs. When multiple GPUs are specified, NVDEC engine counts are summed across all GPUs. Default 0.
conf_threshold (float) – Confidence threshold for detections. Default 0.05.
iou_threshold (float) – IoU threshold for NMS. Default 0.45.
keypoint_names (Optional[Tuple[str, ...]]) – Keypoint names in index order, used only when task='pose' (ignored otherwise). Required when task='pose'; an error is raised if not provided.
vertice_cnt (int) – Number of resampled polygon vertices, used only when task='segment' (ignored otherwise). Default 60.
max_detections (Optional[int]) – Maximum number of detections to keep per frame after NMS (sorted by confidence). If None, keep all. Default None.
segment_smoothing (Optional[int]) – B-spline smoothing factor for segmentation polygon vertices, used only when task='segment' (ignored otherwise). Higher values produce smoother contours. If None, no smoothing is applied. Default None.
interpolate (bool) – If True, linearly interpolate missing detections across frames, used only when task='detect' (ignored otherwise). Default True.
recursive (bool) – If True and video_path is a directory, search all subdirectories for video files. Default False.
smoothing_method (Optional[Literal['savitzky-golay', 'bartlett', 'blackman', 'boxcar', 'cosine', 'gaussian', 'hamming', 'exponential']]) – Smoothing method for detection coordinates, used only when task='detect' (ignored otherwise). If None, no smoothing is applied. Default None.
smoothing_time_window (Optional[int]) – Time window (in ms) for coordinate smoothing. Required when smoothing_method is not None. Default None.
verbose (bool) – Print progress messages. Default True.
- Example
>>> detector = YoloNVDECInference(video_path=r'/videos', engine_path=r'/best.engine', task='detect')
>>> detector.run()
>>> detector.results['video_name']
>>> detector = YoloNVDECInference(video_path=r'/videos', engine_path=r'/pose.engine', task='pose', keypoint_names=('NOSE', 'LEFT_EAR', 'RIGHT_EAR'))
>>> detector.run()
>>> detector.save()
>>> detector = YoloNVDECInference(video_path=r'/videos/my_video.mp4', engine_path=r'/seg.engine', task='segment', save_dir=r'/output', vertice_cnt=30)
>>> detector.run()
>>> detector.save()
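The gpu_id description (worker count summed over NVDEC engines, workers assigned round-robin across the listed GPUs) can be sketched as below. The helper assign_workers and the per-GPU decoder counts are hypothetical inputs for illustration. The example uses 3 decoders per GPU, the figure the text gives for an RTX 4090. How SimBA handles GPUs with unequal decoder counts is not described in the source.

```python
from typing import Dict, List, Tuple


def assign_workers(gpu_ids: Tuple[int, ...], nvdec_per_gpu: Dict[int, int]) -> List[int]:
    """Return the GPU index for each worker, cycling through gpu_ids."""
    # Total workers = summed NVDEC engine count across the listed GPUs.
    n_workers = sum(nvdec_per_gpu[g] for g in gpu_ids)
    return [gpu_ids[i % len(gpu_ids)] for i in range(n_workers)]


# Two GPUs with three decode engines each (made-up machine for the illustration).
workers = assign_workers(gpu_ids=(0, 1), nvdec_per_gpu={0: 3, 1: 3})
# A single GPU with one decode engine yields a single worker.
single = assign_workers(gpu_ids=(0,), nvdec_per_gpu={0: 1})
```

Note that plain round-robin only matches the decoder count per GPU when all listed GPUs have the same number of NVDEC engines; with mixed hardware, passing max_workers explicitly is the safer choice.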
Pose-estimation inference
- class simba.model.yolo_pose_inference.YOLOPoseInference(weights, video_path, keypoint_names=None, verbose=True, save_dir=None, device=0, format=None, batch_size=4, torch_threads=8, half_precision=True, stream=False, box_threshold=0.5, bbox_size=None, max_tracks=None, interpolate=False, smoothing=None, imgsz=640, iou=0.5, overwrite=True, raise_error=True, randomize_order=False, recursive=False)
Bases: object
YOLOPoseInference performs pose estimation on videos using a YOLO-based keypoint detection model.
This class runs YOLO-based keypoint detection on a given video or list of videos. It supports GPU acceleration, batch or stream-based inference, result interpolation, and saving results to disk. The model returns detected keypoints and their confidence scores for each frame, and optionally tracks poses over time.
EXPECTED RUNTIMES

| VIDEOS (COUNT) | FRAMES (COUNT) | TIME (S) | STDEV (S) |
|---|---|---|---|
| 1 | 9000 | 21.89 | 2.87 |
| 2 | 18000 | 41.83 | 0.48 |
| 3 | 27000 | 63.08 | 0.41 |
| 4 | 36000 | 84.44 | 1.32 |
| 5 | 45000 | 103.84 | 1.17 |
| 6 | 54000 | 126.29 | 1.22 |
| 7 | 63000 | 148.71 | 1.86 |

BATCH SIZE: 500; IMGSZ: 288; NVIDIA GeForce RTX 4070; 3 runs
See also
For bounding box inference only (no pose), see simba.model.yolo_inference.YoloInference(). For segmentation inference, see simba.model.yolo_seg_inference.YOLOSegmentationInference(). To fit a YOLO model, see simba.model.yolo_fit.FitYolo. For detailed instructions, see YOLO Pose Estimation Inference Documentation.
- Parameters
weights (Union[str, os.PathLike]) – Path to the trained YOLO model weights (e.g., 'best.pt').
video_path (Union[str, os.PathLike] or List[Union[str, os.PathLike]]) – Path to a single video, list of videos, or directory containing video files.
keypoint_names (Tuple[str, ...]) – Tuple containing the names of keypoints to be tracked (e.g., ('nose', 'left_ear', …)). If None, ('BP_0', 'BP_1', …) will be used.
verbose (Optional[bool]) – If True, outputs progress information and timing. Defaults to True.
save_dir (Optional[Union[str, os.PathLike]]) – Directory to save the inference results. If None, results are returned in memory. Defaults to None.
device (Union[Literal['cpu'], int]) – Device to use for inference. Use 'cpu' for CPU or GPU index (e.g., 0 for CUDA:0). Defaults to 0.
format (Optional[str]) – Optional export format for the model. Supported values: 'onnx', 'engine', 'torchscript', 'onnxsimplify', 'coreml', 'openvino', 'pb', 'tf', 'tflite'. Defaults to None.
batch_size (Optional[int]) – Number of frames to process in parallel. Defaults to 4.
torch_threads (int) – Number of PyTorch threads to use. Defaults to 8.
half_precision (bool) – If True, uses half-precision (FP16) inference. Defaults to True.
stream (bool) – If True, processes frames one-by-one in a generator style. Recommended for long videos. Defaults to False.
box_threshold (float) – Confidence threshold for bounding box detection. All detections (bounding boxes AND keypoints) below this value are ignored. Defaults to 0.5.
max_tracks (Optional[int]) – Maximum number (total sum) of pose tracks to keep. If None, all tracks are retained.
max_per_class (Optional[int]) – Maximum number of pose tracks per class. E.g., if one 'resident' and one 'intruder' is expected, set this to 1. Defaults to None, meaning all detected instances of each class are retained.
interpolate (bool) – If True, interpolates missing keypoints across frames using the 'nearest' method. Defaults to False.
smoothing (Optional[int]) – If not None, the time window in milliseconds for Gaussian smoothing of body-part coordinates.
overwrite (bool) – If True, overwrites the data at the save_dir. If False, skips the file if it exists.
raise_error (bool) – If True, raise an error if the input video metadata can't be read. If False, skip the video file.
randomize_order (bool) – If True, analyzes the input data in a random order. If False, then in fixed order.
recursive (bool) – If True, analyzes all video files found recursively in the video_path directory. If False, only looks in the top directory.
imgsz (int) – Input image size for inference. Must be square. Defaults to 640.
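The 'nearest' interpolation named for the interpolate parameter fills each missing keypoint value from the closest frame that has one. The sketch below shows the idea on a single coordinate track, using None for missing frames. It is not SimBA's implementation: the helper name interpolate_nearest is hypothetical, and the handling of ties and of gaps at the start or end of a track is an assumption.

```python
from typing import List, Optional


def interpolate_nearest(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill None entries with the value from the closest frame that has one."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        return list(values)  # nothing to copy from
    filled = []
    for i, v in enumerate(values):
        if v is None:
            # Ties go to the earlier frame because min() keeps the first minimum.
            v = values[min(known, key=lambda k: abs(k - i))]
        filled.append(v)
    return filled


# x-coordinate of one body part; None marks frames without a detection.
nose_x = [10.0, None, None, None, 20.0, None]
filled = interpolate_nearest(nose_x)
```

Unlike linear interpolation, this copies observed values and never invents intermediate positions, which makes long gaps show up as visible steps in the track.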
YOLO pose-estimation segmentation visualizer
- class simba.plotting.yolo_seg_visualizer.YOLOSegmentationVisualizer(data_path, video_path, save_dir, color=(255, 255, 0), core_cnt=-1, threshold=0.0, verbose=False, shape_opacity=0.5)
Bases: object
Visualizes polygon-based YOLO segmentation results overlaid on video frames.
Accepts either a single video file + CSV data file, or a directory of videos + a directory of CSVs (matched by filename stem).
See also
To run segmentation inference, see simba.model.yolo_seg_inference.YOLOSegmentationInference(). To fit a YOLO model, see simba.model.yolo_fit.FitYolo().
- Parameters
data_path (Union[str, os.PathLike]) – Path to a CSV file or a directory of CSV files with YOLO segmentation output. Must include columns 'FRAME', 'ID', and at least six 'VERTICE' columns.
video_path (Union[str, os.PathLike]) – Path to a single video file or a directory of video files. When directories are passed, files are matched by stem name.
save_dir (Union[str, os.PathLike]) – Directory where output videos (temp and final) are saved.
color (Tuple[int, int, int]) – RGB color for drawing polygons. Defaults to (255, 255, 0).
core_cnt (Optional[int]) – Number of parallel processes to use. Defaults to -1 (all available cores).
threshold (float) – Confidence threshold (not currently used in rendering). Defaults to 0.0.
verbose (bool) – Whether to print detailed progress messages. Defaults to False.
shape_opacity (float) – Alpha blending factor for filled polygon overlay (range 0.0–1.0). If None, solid fill is used. Defaults to 0.5.
- Example
>>> runner = YOLOSegmentationVisualizer(data_path=r"D:/results/video1.csv", video_path=r"D:/videos/video1.mp4", save_dir=r'D:/output', verbose=True)
>>> runner.run()
>>> runner = YOLOSegmentationVisualizer(data_path=r"D:/results", video_path=r"D:/videos", save_dir=r'D:/output', verbose=True)
>>> runner.run()
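The shape_opacity parameter is described as an alpha blending factor. The standard per-pixel blend is out = opacity * fill + (1 - opacity) * pixel, sketched below for a single RGB pixel. This shows the general formula only; how the visualizer applies it to whole frames is not described in the source, and blend_pixel is a hypothetical name.

```python
from typing import Tuple

Color = Tuple[int, int, int]


def blend_pixel(pixel: Color, fill: Color, opacity: float) -> Color:
    """Blend a fill color over one pixel: out = opacity * fill + (1 - opacity) * pixel."""
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be in [0.0, 1.0]")
    r, g, b = (round(opacity * f + (1.0 - opacity) * p) for f, p in zip(fill, pixel))
    return r, g, b


# Default polygon color (255, 255, 0) drawn at 50 % opacity over a bluish-grey pixel.
blended = blend_pixel(pixel=(55, 55, 100), fill=(255, 255, 0), opacity=0.5)
# Opacity 1.0 replaces the pixel entirely (solid fill).
solid = blend_pixel(pixel=(55, 55, 100), fill=(255, 255, 0), opacity=1.0)
```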
YOLO pose-estimation segmentation inference
- class simba.model.yolo_seg_inference.YOLOSegmentationInference(weights_path, video_path, verbose=True, save_dir=None, device=0, format=None, batch_size=4, torch_threads=8, half_precision=True, stream=False, threshold=0.5, max_tracks=300, interpolate=False, imgsz=640, iou=0.5, retina_msk=False, vertice_cnt=30)
Bases: object
Run inference on video(s) using a trained YOLO segmentation model.
- Parameters
weights_path (Union[str, os.PathLike]) – Path to the trained YOLO .pt weights file.
video_path (Union[str, os.PathLike]) – Path to a single video or a list of video paths to run inference on.
verbose (bool) – Whether to print progress information. Default is True.
save_dir (Union[str, os.PathLike]) – Directory where output videos and data will be saved.
device (Union[str, int]) – Device to run inference on; use 'cpu' or an integer GPU index (e.g., 0).
format (str) – Optional export format for the model. Supported values: 'onnx', 'engine', 'torchscript', 'onnxsimplify', 'coreml', 'openvino', 'pb', 'tf', 'tflite'. Defaults to None.
batch_size (Optional[int]) – Number of frames to process at once. Increase for faster performance with sufficient memory.
torch_threads (int) – Number of CPU threads to use (when on CPU).
half_precision (bool) – Whether to use half-precision (FP16) for inference on GPU. Default is True.
stream (bool) – Whether to stream video processing (less memory, suitable for long videos).
threshold (float) – Confidence threshold for object/segmentation detection.
max_tracks (int) – Optional maximum number of objects to track. If None, tracking is disabled.
interpolate (bool) – Whether to interpolate results (useful for smoothing or low-FPS videos).
imgsz (int) – Inference image size (width/height in pixels); must be a multiple of 32.
iou (float) – IoU threshold for non-max suppression (NMS).
retina_msk (bool) – Whether to use high-resolution Retina-style masks.
vertice_cnt (int) – Number of vertices used to approximate the segmentation mask polygon.
Important
The imgsz parameter is critical for mask quality. Segmentation requires pixel-level precision along object boundaries, so spatial detail lost to downscaling hurts segmentation far more than detection or pose tasks. Set imgsz as large as your GPU memory allows. The default 640 may be too coarse for high-quality segmentation masks.
Note
To create a YOLO segmentation dataset for fitting, use simba.third_party_label_appenders.transform.labelme_to_yolo_seg.LabelmeKeypoints2YoloSeg(). To fit a YOLO model, see simba.model.yolo_fit.FitYolo. To visualize the segmentation results, see simba.plotting.yolo_seg_visualizer.YOLOSegmentationVisualizer().
- Example
>>> weights_path = r"D:/platea/yolo_071525/mdl/train3/weights/best.pt"
>>> video_path = r"D:/platea/platea_videos/videos/clipped/10B_Mouse_5-choice_MustTouchTrainingNEWFINAL_a7.mp4"
>>> save_dir = r"D:/platea/platea_videos/videos/yolo_results"
>>> runner = YOLOSegmentationInference(weights_path=weights_path, video_path=video_path, save_dir=save_dir, verbose=True, device=0, format=None, stream=True, batch_size=10, imgsz=320, interpolate=True, threshold=0.8, retina_msk=True)
>>> runner.run()
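The vertice_cnt parameter fixes how many vertices describe each mask outline, so every frame yields the same number of columns. One simple way to get a fixed count is to place points at equal arc-length intervals along the closed polygon, sketched below. The source does not describe the resampling method SimBA uses, so uniform arc-length spacing and the helper name resample_polygon are assumptions for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample_polygon(vertices: List[Point], vertice_cnt: int) -> List[Point]:
    """Return vertice_cnt points spaced evenly along the closed polygon outline."""
    closed = vertices + [vertices[0]]
    seg_lens = [math.dist(a, b) for a, b in zip(closed, closed[1:])]
    step = sum(seg_lens) / vertice_cnt
    out: List[Point] = []
    seg, walked = 0, 0.0  # current edge index and outline length before that edge
    for i in range(vertice_cnt):
        target = i * step
        # Advance to the edge that contains the target arc length.
        while seg < len(seg_lens) - 1 and walked + seg_lens[seg] < target:
            walked += seg_lens[seg]
            seg += 1
        t = (target - walked) / seg_lens[seg] if seg_lens[seg] else 0.0
        (ax, ay), (bx, by) = closed[seg], closed[seg + 1]
        out.append((ax + t * (bx - ax), ay + t * (by - ay)))
    return out


# A 4 x 4 square resampled to 8 vertices: corners plus edge midpoints.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
resampled = resample_polygon(square, vertice_cnt=8)
```

A larger vertice_cnt follows curved outlines more closely at the cost of wider output tables; mask detail that was lost at a small imgsz cannot be recovered by adding vertices.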
Pose-estimation track inference
- class simba.model.yolo_pose_track_inference.YOLOPoseTrackInference(weights_path, video_path, keypoint_names, config_path, verbose=False, save_dir=None, device=0, format=None, batch_size=4, torch_threads=8, half_precision=True, recursive=False, stream=False, interpolate=False, threshold=0.7, max_tracks=2, smoothing=None, randomize_order=False, n=None, imgsz=320, min_video_size=None, overwrite=True, iou=0.5, raise_error=True)
Bases: object
Perform YOLO-based pose estimation and object tracking inference on single or multiple videos.
Uses a YOLO pose model to detect and track objects with keypoint localization across video frames. It supports GPU acceleration, multi-object tracking with configurable trackers (e.g., BoTSORT, ByteTrack), and post-processing options including interpolation and smoothing of keypoint trajectories.
Note
Requires GPU with CUDA support. The class will raise an error if no GPU is detected. The ultralytics package must be installed.
See also
For pose inference without tracks, see simba.model.yolo_pose_inference.YOLOPoseInference(). For visualizing pose tracks, see simba.plotting.yolo_pose_track_visualizer.YOLOPoseTrackVisualizer(). If you do not need tracks (the animals are individually identifiable), use simba.model.yolo_pose_inference.YOLOPoseInference() for better runtimes.
EXPECTED RUNTIMES
VIDEOS (COUNT)
FRAMES (COUNT)
TIME (S)
STDEV(S)
1
1500
22.41333333
1.243958735
2
3000
44.76866667
0.22300299
3
6000
66.592
1.305805881
4
9000
89.683
1.132298106
BATCH SIZE: 500
IMGSZ: 256
NVIDIA GeForce RTX 4070
CPU COUNT (LOADERS): 16
3 runs
- Parameters
weights_path (Union[str, os.PathLike]) – Path to the YOLO pose model weights file (e.g., .pt file). Must be a valid YOLO pose model, not a detection or segmentation model.
video_path (Union[Union[str, os.PathLike], List[Union[str, os.PathLike]]]) – Path(s) to video file(s) or directory containing videos. Can be a single video path, a list of video paths, or a directory path. If a directory is provided, all video files in the directory will be processed.
keypoint_names (Tuple[str, ...]) – Tuple of keypoint names corresponding to the model's keypoint outputs. Length must match the number of keypoints expected by the model.
config_path (Union[str, os.PathLike]) – Path to the tracker configuration YAML file (e.g., 'botsort.yml', 'bytetrack.yml').
verbose (Optional[bool]) – If True, prints progress information during processing. Default: False.
save_dir (Optional[Union[str, os.PathLike]]) – Directory to save output CSV files. If None, results are returned as a dictionary of DataFrames instead of being saved. Default: None.
device (Union[Literal['cpu'], int]) – Device to run inference on. Use 'cpu' for CPU inference or an integer for GPU device ID (e.g., 0 for cuda:0). Default: 0.
format (Optional[str]) – Model format for loading. If None, inferred from weights file extension. Default: None.
batch_size (Optional[int]) – Number of frames to process in each batch. Higher values increase memory usage but may improve throughput. Default: 4.
torch_threads (int) – Number of threads for PyTorch operations. Default: 8.
half_precision (bool) – If True, uses FP16 half-precision inference for faster processing. Requires GPU support. Default: True.
recursive (bool) – If True and video_path is a directory, recursively searches subdirectories for video files. Default: False.
stream (bool) – If True, enables streaming mode for memory-efficient processing of long videos. Default: False.
interpolate (bool) – If True, interpolates missing keypoint coordinates and confidence values using nearest-neighbor interpolation with forward/backward filling. Default: False.
threshold (float) – Confidence threshold for detections. Detections with confidence below this value are filtered out. Range: (0, 1]. Default: 0.7.
max_tracks (Optional[int]) – Maximum number of tracks to maintain simultaneously. If None, unlimited tracks are allowed. Default: 2.
smoothing (Optional[int]) – Smoothing window size in milliseconds. If provided, applies Gaussian smoothing to keypoint trajectories. If None, no smoothing is applied. Default: None.
imgsz (int) – Input image size for inference. Images are resized to this dimension while maintaining aspect ratio. Default: 320.
iou (float) – IoU (Intersection over Union) threshold for Non-Maximum Suppression (NMS) during tracking. Range: (0, 1]. Default: 0.5.
- Example
>>> keypoint_names = ('Nose', 'Left_ear', 'Right_ear', 'Tail_base')
>>> inference = YOLOPoseTrackInference(
...     weights_path='path/to/model.pt',
...     video_path='path/to/video.mp4',
...     keypoint_names=keypoint_names,
...     config_path='botsort.yml',
...     save_dir='path/to/output',
...     verbose=True,
...     threshold=0.5,
...     interpolate=True,
...     smoothing=100
... )
>>> inference.run()
- Example
>>> # Process multiple videos and return results as DataFrames
>>> inference = YOLOPoseTrackInference(
...     weights_path='model.pt',
...     video_path=['video1.mp4', 'video2.mp4'],
...     keypoint_names=('Nose', 'Tail'),
...     config_path='bytetrack.yml',
...     save_dir=None,
...     max_tracks=5
... )
>>> results_dict = inference.run()  # Returns dict of video_name: DataFrame
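The interpolate option is described as nearest-neighbor interpolation with forward/backward filling. The sketch below illustrates that idea on a single keypoint coordinate in plain Python. It is not SimBA's implementation: the function name fill_nearest, the use of None for missing detections, and the tie-breaking rule (earlier frame wins) are all assumptions made for illustration.

```python
from typing import List, Optional


def fill_nearest(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace None entries with the value from the nearest detected frame."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        return list(values)  # no detections at all, nothing to copy from
    filled = []
    for i, v in enumerate(values):
        if v is None:
            # Nearest detected frame; ties go to the earlier frame (assumption).
            nearest = min(known, key=lambda k: (abs(k - i), k))
            v = values[nearest]
        filled.append(v)
    return filled


# Hypothetical x-coordinates of one keypoint over six frames (None = no detection).
nose_x = [None, 10.0, None, None, 16.0, None]
print(fill_nearest(nose_x))  # [10.0, 10.0, 10.0, 16.0, 16.0, 16.0]
```

Leading and trailing gaps are filled from the first and last detection, which is the effect that backward and forward filling would have.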
Pose-estimation track plotting
- class simba.plotting.yolo_pose_track_visualizer.YOLOPoseTrackVisualizer(data_path, video_path, save_dir, palettes=None, core_cnt=-1, threshold=0.0, thickness=None, circle_size=None, verbose=False, bbox=False, overwrite=True)[source]
Bases: object
Visualizes YOLO-based keypoint pose estimation tracks on video frames and creates annotated output videos.
This class overlays tracked keypoints and optional bounding boxes onto the source video, grouping detections by track identifier and applying per-track color palettes.
See also
YOLOPoseInference() for generating YOLO pose CSV data. FitYolo() for fitting YOLO detector models. Tracker configuration templates in simba.assets.tracker_yml.
- Parameters
data_path (Union[str, os.PathLike]) – Path to a YOLO pose CSV file, or directory containing multiple CSV files.
video_path (Union[str, os.PathLike]) – Path to the source video, or directory with videos matching the CSV filenames.
save_dir (Union[str, os.PathLike]) – Directory where annotated videos are written.
palettes (Optional[Union[str, Tuple[str, ...]]]) – Custom color palettes keyed by track IDs. Defaults to random palettes per track when None.
core_cnt (Optional[int]) – CPU cores to use for parallel rendering. Defaults to -1 (all available cores).
threshold (float) – Confidence threshold for rendering detections. Detections below this threshold are skipped.
thickness (Optional[int]) – Line thickness for bounding boxes. If None, derived from frame dimensions.
circle_size (Optional[int]) – Radius of drawn keypoints. If None, derived from frame dimensions.
verbose (Optional[bool]) – Set True to print progress information.
- Example
>>> video_path = r"/mnt/c/troubleshooting/mitra/project_folder/videos/501_MA142_Gi_CNO_0521.mp4"
>>> data_path = "/mnt/c/troubleshooting/mitra/yolo_pose/501_MA142_Gi_CNO_0521.csv"
>>> kp_vis = YOLOPoseTrackVisualizer(data_path=data_path,
...                                  video_path=video_path,
...                                  save_dir='/mnt/c/troubleshooting/mitra/yolo_pose/',
...                                  core_cnt=18,
...                                  bbox=True)
>>> kp_vis.run()
Pose-estimation plotting
- class simba.plotting.yolo_pose_visualizer.YOLOPoseVisualizer(data_path, video_path, save_dir, palettes='Set1', core_cnt=-1, threshold=0.0, thickness=None, circle_size=None, verbose=True, bbox=True, skeleton=None, recursive=False, pool=None, sample_n=None)[source]
Bases: object
Visualizes YOLO-based keypoint pose estimation data on video frames and creates an annotated output video.
This class takes keypoint data (CSV) and overlays it onto the corresponding video using color-coded keypoints and optional filtering. The result is saved as a new annotated video. Multicore parallel rendering is supported for efficient processing of long videos.
See also
To create YOLO pose data, see YOLOPoseInference(). To fit a YOLO model, see FitYolo(). For instructions, see YOLO Pose Estimation Visualization Documentation.
- Parameters
data_path (Union[str, os.PathLike]) – Path to the CSV file containing keypoint data, or folder containing keypoint data (output from YOLO pose inference).
video_path (Union[str, os.PathLike]) – Path to the original input video, or folder containing original videos, to overlay keypoints on.
save_dir (Union[str, os.PathLike]) – Directory to save the resulting annotated video.
palettes (Optional[Union[str, Tuple[str, ...]]]) – Name(s) of categorical color palettes used to draw keypoints per detected class. A single string applies to all classes; a tuple assigns one palette per class. Defaults to 'Set1'.
core_cnt (Optional[int]) – Number of CPU cores to use for parallel rendering. Defaults to -1 (all available cores).
threshold (float) – Confidence threshold for rendering bounding boxes, keypoints, and skeleton edges. Only entries with confidence >= threshold are drawn.
thickness (Optional[int]) – Thickness of bounding boxes and skeleton edges. If None, computed from frame dimensions.
circle_size (Optional[int]) – Radius of keypoint circles. If None, computed from frame dimensions.
verbose (Optional[bool]) – Set True to enable progress logging.
bbox (Optional[bool]) – Set False to disable rendering of bounding boxes around detections.
skeleton (Optional[List[Tuple[str, str]]]) – Iterable of keypoint name pairs defining skeleton edges to render when both keypoints exceed threshold.
recursive (Optional[bool]) – If True, search data and video directories recursively; otherwise only the top level is scanned.
sample_n (Optional[int]) – Randomly sample sample_n data files to visualize. If None, visualize all detected files.
- Example
>>> video_path = r"/mnt/c/troubleshooting/mitra/project_folder/videos/501_MA142_Gi_CNO_0521.mp4"
>>> data_path = "/mnt/c/troubleshooting/mitra/yolo_pose/501_MA142_Gi_CNO_0521.csv"
>>> kp_vis = YOLOPoseVisualizer(data_path=data_path,
...                             video_path=video_path,
...                             save_dir='/mnt/c/troubleshooting/mitra/yolo_pose/',
...                             core_cnt=18)
>>> kp_vis.run()
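The skeleton parameter takes pairs of keypoint names, and an edge is rendered only when both of its keypoints pass the confidence threshold. The following is a minimal sketch of that selection rule, not the visualizer's own code. The function name edges_to_draw, the per-keypoint confidence dictionary, and the example values are hypothetical.

```python
from typing import Dict, List, Tuple


def edges_to_draw(skeleton: List[Tuple[str, str]],
                  confidences: Dict[str, float],
                  threshold: float) -> List[Tuple[str, str]]:
    """Keep only the skeleton edges whose two endpoints both pass the threshold."""
    drawable = []
    for start, end in skeleton:
        # A keypoint absent from the detection is treated as confidence 0.0 (assumption).
        if confidences.get(start, 0.0) >= threshold and confidences.get(end, 0.0) >= threshold:
            drawable.append((start, end))
    return drawable


skeleton = [("Nose", "Left_ear"), ("Nose", "Right_ear"), ("Nose", "Tail_base")]
confidences = {"Nose": 0.92, "Left_ear": 0.81, "Right_ear": 0.34, "Tail_base": 0.77}
print(edges_to_draw(skeleton, confidences, threshold=0.5))
# [('Nose', 'Left_ear'), ('Nose', 'Tail_base')]
```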
Bounding box plotting
- class simba.plotting.yolo_visualize.YOLOVisualizer(data_path, video_path, save_dir, palette='Set1', core_cnt=-1, threshold=0.0, padding=0, pool=None, thickness=None, opacity=0.6, outline_color=None, color_by='class', verbose=True)[source]
Bases: object
Visualize YOLO bounding-box inference results on a source video.
See also
For bounding-box inference, see simba.model.yolo_inference.YoloInference.
- Parameters
data_path (Union[str, os.PathLike]) – Path to YOLO results CSV. Expected columns: FRAME, CLASS_ID, CLASS_NAME, CONFIDENCE, X1..Y4. Multiple rows sharing the same FRAME and CLASS_NAME (i.e. several detections of one class per frame, as produced by YoloInference with max_per_class > 1) are rendered as separate instances, each drawn as its own polygon track and color (ordered by detection confidence).
video_path (Union[str, os.PathLike]) – Path to the video from which the data was produced.
save_dir (Union[str, os.PathLike]) – Directory where to save visualization output.
palette (Optional[str]) – Matplotlib color palette name for per-class geometry colors (e.g., 'Set1', 'tab10'). Default: 'Set1'.
core_cnt (Optional[int]) – CPU core count for parallel processing. Use -1 for all available cores.
threshold (float) – Confidence threshold in [0.0, 1.0]. Detections below threshold are masked before polygon conversion.
padding (int) – Polygon offset in pixels used during multiframe bbox-to-polygon conversion for rendering. Defaults to 0 (draw the exact detection box). Positive values expand polygons outward, -1 shrinks them inward. This affects visualization geometry only, not the underlying YOLO detections in the input CSV.
thickness (Optional[int]) – Polygon line thickness. If None, default geometry plotter thickness is used.
opacity (float) – Polygon fill opacity in [0.0, 1.0]. Default: 0.6.
outline_color (Optional[Tuple[int, int, int]]) – BGR color for polygon outlines. If None, no outlines are drawn. Default: None.
color_by (Literal['class', 'instance']) – How detections are colored when multiple instances per class are present. 'class' (default) gives every instance of a class the same class color (avoids color flicker, since instance slots are confidence-ranked per frame and not identity-tracked). 'instance' gives each instance slot its own color (useful only when the data carries stable identities, e.g. from a tracker). For single-instance-per-class data both options are equivalent.
verbose (bool) – If True, prints progress information. Default: True.
- Raises
FrameRangeError – If YOLO result frame coverage does not match video frame count.
- Example
>>> test = YOLOVisualizer(
...     data_path=r"/mnt/c/troubleshooting/yolo_inference/08102021_DOT_Rat7_8(2).csv",
...     video_path=r"/mnt/c/troubleshooting/RAT_NOR/project_folder/videos/08102021_DOT_Rat7_8(2).mp4",
...     save_dir="/mnt/c/troubleshooting/yolo_videos",
...     threshold=0.25,
...     core_cnt=4
... )
>>> test.run()
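The color_by description notes that instance slots are confidence-ranked per frame rather than identity-tracked. The sketch below shows why that can make per-instance colors flicker: the same animal can land in a different slot from one frame to the next. It uses the FRAME, CLASS_NAME and CONFIDENCE columns named above; the SLOT key, the ANIMAL key (a ground-truth label added only for illustration) and the function name are hypothetical, and this is not the visualizer's own code.

```python
from collections import defaultdict
from typing import Dict, List


def assign_instance_slots(rows: List[Dict]) -> List[Dict]:
    """Rank detections of the same class within each frame by descending confidence."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["FRAME"], row["CLASS_NAME"])].append(row)
    ranked_rows = []
    for key in sorted(groups):
        ranked = sorted(groups[key], key=lambda r: r["CONFIDENCE"], reverse=True)
        for slot, row in enumerate(ranked):
            ranked_rows.append({**row, "SLOT": slot})  # slot 0 = most confident
    return ranked_rows


rows = [
    {"FRAME": 0, "CLASS_NAME": "rat", "CONFIDENCE": 0.9, "ANIMAL": "A"},
    {"FRAME": 0, "CLASS_NAME": "rat", "CONFIDENCE": 0.6, "ANIMAL": "B"},
    {"FRAME": 1, "CLASS_NAME": "rat", "CONFIDENCE": 0.5, "ANIMAL": "A"},
    {"FRAME": 1, "CLASS_NAME": "rat", "CONFIDENCE": 0.8, "ANIMAL": "B"},
]
for r in assign_instance_slots(rows):
    print(r["FRAME"], r["ANIMAL"], r["SLOT"])
```

Animal A occupies slot 0 in frame 0 but slot 1 in frame 1, so coloring by slot would swap the two animals' colors between frames, while coloring by class would not.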
YOLO annotation visualizer
- class simba.plotting.yolo_annotation_visualizer.YOLOAnnotationVisualizer(map_yaml_path, save_dir, split='all', n=None, circle_size=None, thickness=None, palette='Set1', img_format='.png', seg_opacity=0.5, show_names=False, show_outline=False, verbose=True)[source]
Bases: object
Visualize YOLO annotation label files overlaid on their source images.
See also
For visualizing YOLO bounding-box inference results on video, see simba.plotting.yolo_visualize.YOLOVisualizer(). For visualizing YOLO keypoint pose-estimation results on video, see simba.plotting.yolo_pose_visualizer.YOLOPoseVisualizer(). For visualizing YOLO segmentation polygon results on video, see simba.plotting.yolo_seg_visualizer.YOLOSegmentationVisualizer(). For auto-detecting the YOLO project type from a label file, see simba.utils.yolo.detect_yolo_project_type().
- Parameters
map_yaml_path (Union[str, os.PathLike]) – Path to the YOLO project map.yaml file.
save_dir (Union[str, os.PathLike]) – Directory where annotated images are saved.
split (Optional[str]) – Which split to visualize: 'train', 'val', or 'all'. Default 'all'.
n (Optional[int]) – Number of images to visualize. If None, visualize every image. Default None.
circle_size (Optional[int]) – Radius of keypoint circles. If None, computed from image dimensions.
thickness (Optional[int]) – Line thickness for bounding boxes / polygon edges. If None, computed from image dimensions.
palette (str) – Color palette name (e.g. 'Set1'). Default 'Set1'.
img_format (str) – Output image format extension. Default '.png'.
seg_opacity (float) – Opacity of filled segmentation polygons (0.0 to 1.0). Default 0.5.
show_names (bool) – If True, draw class name labels on each annotation. Default False.
show_outline (bool) – If True, draw polygon outline for segmentation annotations. Default False.
verbose (bool) – Print progress messages. Default True.
- Example
>>> viz = YOLOAnnotationVisualizer(map_yaml_path=r'F:\netholabs\moira_lp_sam\map.yaml', save_dir=r'F:\netholabs\annotation_visualizations', n=400)
>>> viz.run()
>>> viz = YOLOAnnotationVisualizer(map_yaml_path=r'/path/to/map.yaml', save_dir=r'/path/to/output')
>>> viz.run()
>>> viz = YOLOAnnotationVisualizer(map_yaml_path=r'/path/to/map.yaml', save_dir=r'/path/to/output', n=50, circle_size=5, thickness=2, img_format='.jpeg')
>>> viz.run()
COCO key-points -> YOLO pose-estimation format conversion
- class simba.third_party_label_appenders.transform.coco_keypoints_to_yolo.COCOKeypoints2Yolo(coco_path, img_dir, save_dir, train_size=0.7, flip_idx=(0, 2, 1, 5, 4, 3, 6), verbose=True, greyscale=False, clahe=False, bbox_pad=None)[source]
Bases: object
Convert COCO Keypoints version 1.0 data format into a YOLO keypoints training set.
Processes COCO format keypoint annotations and converts them to YOLO keypoint format, splitting the data into training and validation sets. Images are copied to the output directory and annotations are converted to YOLO format text files. A YAML configuration file is automatically generated.
Note
COCO keypoint files can be created using https://www.cvat.ai/.
This function expects the path to a single COCO Keypoints version 1.0 file. To merge several before passing the file to this function, use simba.third_party_label_appenders.transform.utils.merge_coco_keypoints_files().
Important
All image file names have to be unique.
See also
To convert COCO Keypoints version 1.0 data format into a YOLO bounding box training set, use simba.third_party_label_appenders.transform.coco_keypoints_to_yolo_bbox.COCOKeypoints2YoloBbox(). To train YOLO pose models with the converted data, see simba.model.yolo_fit.FitYolo() and YOLO Pose Estimation Training Documentation. To run inference with trained YOLO pose models, see simba.model.yolo_pose_inference.YOLOPoseInference() or simba.model.yolo_pose_track_inference.YOLOPoseTrackInference() and YOLO Pose Estimation Inference Documentation.
- Parameters
coco_path (Union[str, os.PathLike]) – Path to COCO keypoints 1.0 file in JSON format. Must contain 'categories', 'images', and 'annotations' keys.
img_dir (Union[str, os.PathLike]) – Directory holding image files representing the annotated entries in the coco_path. Will search recursively, so it's OK to have images in subdirectories.
save_dir (Union[str, os.PathLike]) – Directory where to save the YOLO formatted data. Will create 'images/train', 'images/val', 'labels/train', 'labels/val' subdirectories.
train_size (float) – Size of the training set as a fraction between 0.1 and 0.99. Remaining data becomes validation set. Default: 0.7 (70% training, 30% validation).
flip_idx (Tuple[int, ...]) – Tuple of integers representing the re-ordering of body-part indices when the image is mirrored horizontally (left-right flip). Must match the number of keypoints. Default: (0, 2, 1, 5, 4, 3, 6).
verbose (bool) – If True (default), prints progress messages. If False, suppresses output.
greyscale (bool) – If True, converts images to greyscale before saving. If False (default), keeps original color format.
clahe (bool) – If True, applies CLAHE (Contrast Limited Adaptive Histogram Equalization) enhancement to images before saving. If False (default), no enhancement is applied.
bbox_pad (Optional[float]) – Optional padding factor for bounding boxes (between 10e-6 and 1.0). If provided, bounding boxes are expanded by this fraction to better encompass all body-parts. If None (default), no padding is applied.
- Returns
None. YOLO formatted data is saved to save_dir with structure: images/train, images/val, labels/train, labels/val, and map.yaml.
- Example
>>> runner = COCOKeypoints2Yolo(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/s1/annotations/s1.json", img_dir=r"D:/cvat_annotations/frames/simon", save_dir=r"D:/cvat_annotations/frames/yolo_keypoints", clahe=True)
>>> runner.run()
- Example II
>>> runner = COCOKeypoints2Yolo(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/merged.json", img_dir=r"D:/cvat_annotations/frames", save_dir=r"D:/cvat_annotations/frames/yolo", clahe=False)
>>> runner.run()
- Example III
>>> runner = COCOKeypoints2Yolo(coco_path=r"E:/netholabs_videos/mosaics/subset/to_annotate/2d_mosaic_batch_1.json", img_dir=r"E:/netholabs_videos/mosaics/subset/to_annotate", save_dir=r"E:/netholabs_videos/mosaics/yolo_mdl", clahe=False, bbox_pad=0.1)
>>> runner.run()
- References
[1] Helpful YouTube tutorial by Farhan to get YOLO tracking data in animals - https://www.youtube.com/watch?v=CcGbgFPwQTc.
[2] Great YouTube tutorial by Felipe on annotating data and making data YOLO ready - https://www.youtube.com/watch?v=m9fH9OWn8YM.
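The flip_idx tuple tells the trainer which keypoint index takes over each slot after a left-right flip, so that 'left ear' still refers to the ear on the animal's left. The sketch below illustrates the idea on normalized (x, y) keypoints. The function name, the three-keypoint layout and the coordinates are hypothetical; the re-indexing order (slot j receives mirrored keypoint flip_idx[j]) is an assumption about how such an index is typically applied, not a statement about SimBA's internals.

```python
from typing import List, Sequence, Tuple


def hflip_keypoints(keypoints: List[Tuple[float, float]],
                    flip_idx: Sequence[int]) -> List[Tuple[float, float]]:
    """Mirror normalized keypoints left-right, then re-order them with flip_idx."""
    if sorted(flip_idx) != list(range(len(keypoints))):
        raise ValueError("flip_idx must be a permutation of the keypoint indices")
    # Mirror x around the vertical image axis (coordinates are in [0, 1]).
    mirrored = [(round(1.0 - x, 6), y) for x, y in keypoints]
    # Slot j of the output takes the mirrored keypoint at index flip_idx[j].
    return [mirrored[i] for i in flip_idx]


# Hypothetical order: 0 = nose, 1 = left ear, 2 = right ear.
keypoints = [(0.5, 0.2), (0.3, 0.3), (0.6, 0.35)]
print(hflip_keypoints(keypoints, flip_idx=(0, 2, 1)))
# [(0.5, 0.2), (0.4, 0.35), (0.7, 0.3)]
```

After the flip, the 'left ear' slot holds the mirrored position of what was the right ear, which is what a left/right swap in flip_idx is meant to achieve.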
COCO key-points -> YOLO bounding box conversion
- class simba.third_party_label_appenders.transform.coco_keypoints_to_yolo_bbox.COCOKeypoints2YoloBbox(coco_path, img_dir, save_dir, train_size=0.7, verbose=True, greyscale=False, clahe=False, bbox_pad=None, obb=False)[source]
Bases: object
Convert COCO Keypoints version 1.0 data format into a YOLO bounding box training set.
Note
COCO keypoint files can be created using https://www.cvat.ai/.
This function expects the path to a single COCO Keypoints version 1.0 file. To merge several before passing the file to this function, use simba.third_party_label_appenders.transform.utils.merge_coco_keypoints_files().
Important
All image file names have to be unique.
See also
To convert COCO Keypoints version 1.0 data format into a YOLO keypoint training set, use simba.third_party_label_appenders.transform.coco_keypoints_to_yolo.COCOKeypoints2Yolo().
- Parameters
coco_path (Union[str, os.PathLike]) – Path to COCO keypoints 1.0 file in JSON format.
img_dir (Union[str, os.PathLike]) – Directory holding image files representing the annotated entries in the coco_path. Will search recursively, so it's OK to have images in subdirectories.
save_dir (Union[str, os.PathLike]) – Directory where to save the YOLO formatted data.
train_size (float) – The size of the training set. Value between 0 and 1.0 representing the fraction of data used for training.
verbose (bool) – If True, prints progress. Default: True.
flip_idx (Tuple[int, ...]) – Tuple of ints, representing the flip of body-part coordinates when the animal image flips 180 degrees.
- Returns
None
- Example
>>> runner = COCOKeypoints2YoloBbox(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/s1/annotations/s1.json", img_dir=r"D:/cvat_annotations/frames/simon", save_dir=r"D:/cvat_annotations/frames/yolo_keypoints", clahe=True)
>>> runner.run()
- Example II
>>> runner = COCOKeypoints2YoloBbox(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/merged.json", img_dir=r"D:/cvat_annotations/frames", save_dir=r"D:/cvat_annotations/frames/yolo", clahe=False)
>>> runner.run()
- References
[1] Helpful YouTube tutorial by Farhan to get YOLO tracking data in animals - https://www.youtube.com/watch?v=CcGbgFPwQTc.
[2] Great YouTube tutorial by Felipe on annotating data and making data YOLO ready - https://www.youtube.com/watch?v=m9fH9OWn8YM.
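COCO stores a box as [x_min, y_min, width, height] in pixels, while a standard YOLO label uses a normalized (x_center, y_center, width, height). The sketch below shows that conversion with an optional padding step. It is illustrative only: the function name is hypothetical, and applying bbox_pad as a fraction of the box's own width and height, followed by clipping to the image, is an assumption about how such a padding factor could work.

```python
from typing import Optional, Sequence, Tuple


def coco_bbox_to_yolo(bbox: Sequence[float], img_w: int, img_h: int,
                      bbox_pad: Optional[float] = None) -> Tuple[float, float, float, float]:
    """Convert a COCO [x_min, y_min, w, h] pixel box to normalized YOLO (xc, yc, w, h)."""
    x_min, y_min, w, h = bbox
    if bbox_pad is not None:
        # Grow the box around its center by bbox_pad times its size (assumption).
        x_min -= w * bbox_pad / 2
        y_min -= h * bbox_pad / 2
        w *= 1 + bbox_pad
        h *= 1 + bbox_pad
    # Clip to the image so the normalized values stay within [0, 1].
    x_max, y_max = min(x_min + w, img_w), min(y_min + h, img_h)
    x_min, y_min = max(x_min, 0.0), max(y_min, 0.0)
    w, h = x_max - x_min, y_max - y_min
    return ((x_min + w / 2) / img_w, (y_min + h / 2) / img_h, w / img_w, h / img_h)


print(coco_bbox_to_yolo([100, 50, 200, 100], img_w=640, img_h=480))
print(coco_bbox_to_yolo([100, 50, 200, 100], img_w=640, img_h=480, bbox_pad=0.1))
```

With padding the center stays at the same place while the normalized width and height grow by 10 percent.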
COCO key-points -> YOLO segmentation conversion
SAM3 -> YOLO segmentation project
- class simba.third_party_label_appenders.transform.sam3_to_yolo_seg.SAM3ToYoloSeg(video_dir, sam_path, save_dir, txt_prompt='mouse', n_frames=50, names=('animal',), train_val_split=0.7, conf=0.5, sam_imgsz=644, greyscale=False, clahe=False, vertice_cnt=40, seed=None, visualize=False, io_timeout=30.0, verbose=True)[source]
Bases: object
Sample N random frames from each video in a directory, run SAM3 with a text prompt, and write the resulting masks as a YOLO segmentation project.
Note
To fit a YOLO segmentation model, see FitYolo. For YOLO segmentation inference, see YOLOSegmentationInference.
See also
MergeYoloProjects – merge several map.yaml projects (same classes and task) into one dataset.
- Raises
SimBAGPUError – If no NVIDIA GPU is detected (via nvidia-smi).
SimBAPAckageVersionError – If ultralytics is not installed, or SAM3SemanticPredictor cannot be imported.
- Parameters
video_dir (Union[str, os.PathLike]) – Directory containing input videos.
sam_path (Union[str, os.PathLike]) – Path to SAM3 model weights (e.g. sam3.pt).
save_dir (Union[str, os.PathLike]) – Root output directory for the YOLO project.
txt_prompt (str) – Text prompt for SAM3 (e.g. 'mouse', 'mouse tail').
n_frames (int) – Number of random frames to sample from each video.
names (Tuple[str, ...]) – Class names in index order. Default ('animal',).
train_val_split (float) – Fraction allocated to training (0.1-0.9). Default 0.7.
conf (float) – SAM3 confidence threshold. Default 0.5.
sam_imgsz (int) – Image size for SAM3 inference. Default 644.
greyscale (bool) – If True, save extracted frames in greyscale. Default False.
clahe (Optional[Union[Tuple[int, int, int], bool]]) – If True, applies CLAHE with default params. If tuple of (clip_limit, tile_x, tile_y), applies CLAHE with those params. Default False.
vertice_cnt (Optional[int]) – If not None, resample each mask polygon to this many vertices. Default 40.
seed (Optional[int]) – Random seed for reproducible frame sampling.
visualize (bool) – If True, saves annotated images with segmentation polygon overlays to a visualizations subfolder inside save_dir. Useful for verifying SAM3 annotation quality. Default False.
io_timeout (float) – Seconds to keep retrying file I/O (read/write) when the operation fails (e.g. temporary drive disconnect). Default 30.0.
verbose (bool) – If True, print progress updates. Default True.
- Example
>>> runner = SAM3ToYoloSeg(video_dir=r'/path/to/videos', sam_path=r'/path/to/sam3.pt', save_dir=r'/path/to/yolo_project', txt_prompt='mouse', n_frames=50)
>>> runner.run()
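A YOLO segmentation label is one text line per object: the class index followed by the polygon's x y pairs, normalized by image width and height. The sketch below writes such a line from a pixel-space polygon. The function name and the example polygon are hypothetical, and this is only an illustration of the label format, not the converter's own code.

```python
from typing import List, Tuple


def polygon_to_yolo_seg_line(class_idx: int, polygon: List[Tuple[float, float]],
                             img_w: int, img_h: int) -> str:
    """Format a pixel-space polygon as a YOLO segmentation label line."""
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least three vertices")
    coords = []
    for x, y in polygon:
        # Normalize to [0, 1] and clamp vertices that fall slightly outside the frame.
        coords.append(f"{min(max(x / img_w, 0.0), 1.0):.6f}")
        coords.append(f"{min(max(y / img_h, 0.0), 1.0):.6f}")
    return " ".join([str(class_idx)] + coords)


# Hypothetical triangle mask outline in a 640 x 480 frame, class index 0 ('animal').
print(polygon_to_yolo_seg_line(0, [(64, 48), (320, 48), (320, 240)], img_w=640, img_h=480))
# 0 0.100000 0.100000 0.500000 0.100000 0.500000 0.500000
```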
SAM3 -> YOLO bounding-box (detection) project
Merge multiple YOLO projects
- class simba.third_party_label_appenders.transform.merge_yolo_projects.MergeYoloProjects(yaml_paths, save_dir, train_val_split=None, seed=None, verbose=True)[source]
Bases: object
Merge multiple YOLO projects into a single YOLO project.
Reads each project's YAML, validates that all projects share the same task type (bounding-box detection, segmentation, or keypoint pose) and class names, then copies all images and labels into a single output project with train/val splits.
See also
SAM3ToYoloBBox
- Parameters
yaml_paths (List[Union[str, os.PathLike]]) – List of paths to YOLO project YAML files.
save_dir (Union[str, os.PathLike]) – Root output directory for the merged project.
train_val_split (Optional[float]) – If provided, reshuffle all samples and split at this ratio (0.1-0.9). If None, preserve each project's existing train/val assignments. Default None.
seed (Optional[int]) – Random seed for reproducible splitting. Only used when train_val_split is not None.
verbose (bool) – If True, print progress. Default True.
- Example
>>> merger = MergeYoloProjects(yaml_paths=[r'/project_a/map.yaml', r'/project_b/map.yaml'], save_dir=r'/merged_project', train_val_split=0.8)
>>> merger.run()
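The merge logic described above has two steps: check that the projects are compatible, then optionally reshuffle all samples and split them at train_val_split using a seed. A minimal in-memory sketch of those two steps follows. The dictionary layout ('names', 'samples'), the function name and the sample file names are hypothetical, and no files are read or copied; it only illustrates the validation and the seeded split.

```python
import random
from typing import Dict, List, Optional, Tuple


def merge_and_split(projects: List[Dict], train_val_split: float,
                    seed: Optional[int] = None) -> Tuple[List[str], List[str]]:
    """Pool samples from compatible projects and split them into train and val lists."""
    names = projects[0]["names"]
    for project in projects[1:]:
        if project["names"] != names:
            raise ValueError("all projects must share the same class names")
    samples = [s for project in projects for s in project["samples"]]
    rng = random.Random(seed)  # a fixed seed makes the shuffle reproducible
    rng.shuffle(samples)
    n_train = int(len(samples) * train_val_split)
    return samples[:n_train], samples[n_train:]


project_a = {"names": ("animal",), "samples": [f"a_{i}.png" for i in range(6)]}
project_b = {"names": ("animal",), "samples": [f"b_{i}.png" for i in range(4)]}
train, val = merge_and_split([project_a, project_b], train_val_split=0.8, seed=42)
print(len(train), len(val))  # 8 2
```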
Multi-animal DeepLabCut predictions -> YOLO pose-estimation annotations format conversion
- class simba.third_party_label_appenders.transform.dlc_ma_h5_to_yolo.MADLCH52Yolo(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, threshold=0, train_size=0.7, flip_idx=None, greyscale=False, clahe=False, padding=0.0, single_id=None)[source]
Bases: object
Convert multi-animal DeepLabCut pose estimation H5 data and corresponding videos into YOLO keypoint dataset format.
Note
This converts DeepLabCut inference data to YOLO keypoints (not DeepLabCut annotations).
- Parameters
data_dir (Union[str, os.PathLike]) – Directory path containing DLC-generated H5 files with inferred keypoints.
video_dir (Union[str, os.PathLike]) – Directory path containing corresponding videos from which frames are to be extracted.
save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.
frms_cnt (Optional[int]) – Number of frames to randomly sample from each video for conversion. If None, all frames are used.
threshold (float) – Minimum confidence score threshold to filter out low-confidence pose instances. Only instances with instance.score >= this threshold are used.
train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.
verbose (bool) – If True, prints progress. Default: True.
flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping. If None, it will be inferred.
padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions). Enlarges bounding boxes by this fraction, which is useful, for example, when all body-parts lie along the length of the animal. Default: 0.0.
single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, and the YOLO data will be formatted to the number of objects which the H5 data contains.
- Returns
None. Results saved in save_dir.
- Example
>>> DATA_DIR = r'D:\troubleshooting\dlc_h5_multianimal_to_yolo\data'
>>> VIDEO_DIR = r'D:\troubleshooting\dlc_h5_multianimal_to_yolo\videos'
>>> SAVE_DIR = r"D:\imgs\madlc"
>>> runner = MADLCH52Yolo(data_dir=DATA_DIR, video_dir=VIDEO_DIR, save_dir=SAVE_DIR, clahe=True, single_id='animal_1')
>>> runner.run()
DeepLabCut predictions -> YOLO pose-estimation annotations
- class simba.third_party_label_appenders.transform.dlc_to_yolo.DLC2Yolo(dlc_dir, save_dir, train_size=0.7, verbose=False, padding=0.15, flip_idx=None, names=('mouse',), greyscale=False, clahe=False)[source]
Bases: object
Converts DLC annotations into YOLO keypoint format for model training.
Important
Use for single animal DLC data. For multi-animal DLC data,
Note
dlc_dir can be a directory with subdirectories containing images and CSV files with the CollectedData substring in the filename. For creating the flip_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_flip_idx(). For creating the bp_id_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_bp_id_idx().
- Parameters
dlc_dir (Union[str, os.PathLike]) – Directory path containing DLC-generated CSV files with keypoint annotations and images.
save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.
train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.
verbose (bool) – If True, prints progress. Default: False.
padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions). Enlarges bounding boxes by this fraction, which is useful, for example, when all body-parts lie along the length of the animal. Default: 0.15.
flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping.
names (Tuple[str]) – Tuple of animal (class) names. Used for creating the YAML class names mapping file.
- Returns
None. Results saved in save_dir.
- Example
>>> DLC_DIR = r'D:/rat_resident_intruder/dlc_data'
>>> SAVE_DIR = r'D:/rat_resident_intruder/yolo_3'
>>> runner = DLC2Yolo(dlc_dir=DLC_DIR, save_dir=SAVE_DIR, verbose=True, clahe=True, names=('resident', 'intruder'))
>>> runner.run()
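The padding parameter matters because the bounding box is derived from the keypoints: when all body-parts lie along one line, the tight box around them has almost no height or width. The sketch below derives a box from keypoints and pads it by a fraction of the image dimensions, as the parameter description states. The function name and example coordinates are hypothetical, and clipping to the frame is an assumption.

```python
from typing import List, Tuple


def keypoints_to_padded_bbox(keypoints: List[Tuple[float, float]], img_w: int, img_h: int,
                             padding: float = 0.0) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) in pixels around the keypoints, with padding."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    # Padding is expressed as a fraction of the image width and height.
    pad_x, pad_y = padding * img_w, padding * img_h
    x_min = max(min(xs) - pad_x, 0.0)
    y_min = max(min(ys) - pad_y, 0.0)
    x_max = min(max(xs) + pad_x, float(img_w))
    y_max = min(max(ys) + pad_y, float(img_h))
    return x_min, y_min, x_max, y_max


# Hypothetical nose, center and tail-base on one horizontal line in a 640 x 480 frame.
body = [(100.0, 200.0), (200.0, 200.0), (300.0, 200.0)]
print(keypoints_to_padded_bbox(body, 640, 480))                # zero-height box
print(keypoints_to_padded_bbox(body, 640, 480, padding=0.05))  # (68.0, 176.0, 332.0, 224.0)
```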
Labelme annotations -> YOLO bounding box annotations
- class simba.third_party_label_appenders.transform.labelme_to_yolo.LabelmeBoundingBoxes2YoloBoundingBoxes(labelme_dir, save_dir, obb=False, verbose=True, clahe=False, train_size=0.7, greyscale=False)[source]
Bases: object
Convert LabelMe annotations in JSON to YOLO format and save the corresponding images and labels in txt format.
Note
For more information on the LabelMe annotation tool, see the LabelMe GitHub repository. The Labelme JSON files have to contain an imageData key holding the image as a b64 string. For an expected Labelme JSON format, see THIS FILE.
See also
To split YOLO data into train, test, and validation sets (expected by e.g., UltraLytics), see simba.third_party_label_appenders.converters.split_yolo_train_test_val(). To convert Labelme points annotations to YOLO keypoint training data, see simba.third_party_label_appenders.transform.labelme_to_yolo_keypoints.LabelmeKeypoints2YoloKeypoints().
Important
Use this class to create YOLO bounding boxes (not YOLO keypoint data) from Labelme keypoints.
- Parameters
labelme_dir (Union[str, os.PathLike]) – Path to the directory containing LabelMe annotation .json files.
save_dir (Union[str, os.PathLike]) – Directory where the YOLO-format images and labels will be saved. Will create 'images/', 'labels/', and 'map.json' inside this directory.
obb (bool) – If True, saves annotations as oriented bounding boxes (8 coordinates). If False, uses standard YOLO format (x_center, y_center, width, height).
verbose (bool) – If True, prints progress messages during conversion.
- Example
>>> LABELME_DIR = r'D:\platea\ts_annotations'
>>> SAVE_DIR = r"D:\platea\yolo"
>>> runner = LabelmeBoundingBoxes2YoloBoundingBoxes(labelme_dir=LABELME_DIR, save_dir=SAVE_DIR)
>>> runner.run()
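The obb flag switches between two label layouts: the standard (x_center, y_center, width, height) and an oriented box given as four corner points (8 coordinates). The sketch below writes both from a Labelme-style rectangle shape and decodes the base64 imageData field. It is only an illustration of the two formats: the annotation is a stand-in built in memory, the image bytes are fake, and the function name and the assumption that the shape is an axis-aligned two-point rectangle are hypothetical.

```python
import base64
from typing import Dict


def labelme_rect_to_yolo(shape: Dict, img_w: int, img_h: int,
                         class_idx: int, obb: bool = False) -> str:
    """Format a two-point Labelme rectangle as a standard or 8-coordinate YOLO line."""
    (xa, ya), (xb, yb) = shape["points"]
    x1, x2 = sorted((xa, xb))
    y1, y2 = sorted((ya, yb))
    if obb:
        # Four corners, clockwise from top-left, each normalized by image size.
        corners = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
        values = [v for x, y in corners for v in (x / img_w, y / img_h)]
    else:
        values = [(x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h,
                  (x2 - x1) / img_w, (y2 - y1) / img_h]
    return " ".join([str(class_idx)] + [f"{v:.6f}" for v in values])


# Stand-in annotation; a real file would hold an encoded image in imageData.
annotation = {
    "imageWidth": 640,
    "imageHeight": 480,
    "imageData": base64.b64encode(b"fake-image-bytes").decode("ascii"),
    "shapes": [{"label": "mouse", "shape_type": "rectangle",
                "points": [[160, 120], [480, 360]]}],
}
image_bytes = base64.b64decode(annotation["imageData"])  # raw bytes of the image file
shape = annotation["shapes"][0]
w, h = annotation["imageWidth"], annotation["imageHeight"]
print(labelme_rect_to_yolo(shape, w, h, class_idx=0))
print(labelme_rect_to_yolo(shape, w, h, class_idx=0, obb=True))
```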
Labelme points -> YOLO keypoints annotations
Labelme points -> YOLO segmentation annotations
SimBA ROIs -> YOLO bounding box annotations
- class simba.third_party_label_appenders.transform.simba_roi_to_yolo.SimBAROI2Yolo(config_path=None, roi_path=None, video_dir=None, save_dir=None, roi_frm_cnt=10, train_size=0.7, obb=False, greyscale=False, clahe=False, verbose=True)[source]
Bases: object
Converts SimBA ROI definitions into annotations and images for training a YOLO network.
- Parameters
config_path (Optional[Union[str, os.PathLike]]) – Optional path to the project config file in SimBA project.
roi_path (Optional[Union[str, os.PathLike]]) – Path to the SimBA ROI definitions .h5 file. If None, then the roi_coordinates_path of the project.
video_dir (Optional[Union[str, os.PathLike]]) – Directory where to find the videos. If None, then the videos folder of the project.
save_dir (Optional[Union[str, os.PathLike]]) – Directory where to save the labels and images. If None, then the logs folder of the project.
roi_frm_cnt (Optional[int]) – Number of frames for each video to create bounding boxes for.
train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.
obb (Optional[bool]) – If True, creates oriented YOLO bounding boxes (OBB). Otherwise, creates axis-aligned YOLO bounding boxes. Default: False.
greyscale (Optional[bool]) – If True, converts the images to greyscale if RGB. Default: False.
verbose (Optional[bool]) – If True, prints progress. Default: True.
- Returns
None
- Example I
>>> SimBAROI2Yolo(config_path=r"C:/troubleshooting/RAT_NOR/project_folder/project_config.ini").run()
- Example II
>>> SimBAROI2Yolo(config_path=r"C:/troubleshooting/RAT_NOR/project_folder/project_config.ini", save_dir=r"C:/troubleshooting/RAT_NOR/project_folder/logs/yolo", video_dir=r"C:/troubleshooting/RAT_NOR/project_folder/videos", roi_path=r"C:/troubleshooting/RAT_NOR/project_folder/logs/measures/ROI_definitions.h5").run()
- Example III
>>> SimBAROI2Yolo(video_dir=r"C:/troubleshooting/RAT_NOR/project_folder/videos", roi_path=r"C:/troubleshooting/RAT_NOR/project_folder/logs/measures/ROI_definitions.h5", save_dir=r'C:/troubleshooting/RAT_NOR/project_folder/yolo', verbose=True, roi_frm_cnt=20, obb=True).run()
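The effect of train_size can be sketched as a seeded random split of the extracted frames. This is only an illustration under stated assumptions: split_train_val and the frame file names are hypothetical, and SimBA's own splitting logic may differ in detail.

```python
import random


def split_train_val(items, train_size=0.7, seed=0):
    """Hypothetical helper (not part of SimBA): shuffle items, then split into train/val lists."""
    if not 0.1 <= train_size <= 0.99:
        raise ValueError("train_size must be between 0.1 and 0.99")
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(len(shuffled) * train_size))
    return shuffled[:n_train], shuffled[n_train:]


# Ten frames from one video, mirroring the default roi_frm_cnt=10.
frames = [f"video_1_frame_{i}.png" for i in range(10)]
train, val = split_train_val(frames, train_size=0.7)
print(len(train), len(val))
```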
SimBA pose-estimation -> YOLO pose-estimation annotations
- class simba.third_party_label_appenders.transform.simba_to_yolo.SimBA2Yolo(config_path, save_dir, data_dir=None, train_size=0.7, verbose=False, greyscale=False, clahe=False, padding=0.0, threshold=0.0, flip_idx=None, names=('animal_1',), sample_size=None, bp_id_idx=None, single_id=None)[source]
Bases: object
Convert pose estimation data from a SimBA project into the YOLO keypoint format, including frame sampling, image-label pair creation, bounding box computation, and train/validation splitting.
Note
For creating the flip_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_flip_idx(). For creating the bp_id_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_bp_id_idx().
- Parameters
config_path (Union[str, os.PathLike]) – Path to the SimBA project .ini configuration file.
save_dir (Union[str, os.PathLike]) – Directory where YOLO-formatted data will be saved. Subdirectories for images/labels (train/val) are created.
data_dir (Optional[Union[str, os.PathLike]]) – Optional directory containing outlier-corrected SimBA pose estimation data. If None, uses the path from the config.
train_size (float) – Proportion of samples to allocate to the training set (range 0.1–0.99). Remaining samples go to validation.
verbose (bool) – If True, prints progress updates to the console.
greyscale (bool) – If True, saves extracted video frames in greyscale. Otherwise, saves in color.
padding (float) – Padding added around the bounding box (as a proportion of image dimensions, range 0.0–1.0). Useful if animal body-parts are in a 'line'.
flip_idx (Tuple[int, ...]) – Tuple defining symmetric keypoint indices for horizontal flipping. Used to write the map.yaml file. If None, then attempts to infer.
names (Dict[int, str]) – Dictionary mapping instance IDs to class names. Used in annotation labels and map.yaml.
sample_size (Optional[int]) – If specified, limits the number of randomly sampled frames per video. If None, all frames are used.
bp_id_idx (Optional[Dict[int, Union[Tuple[int], List[int]]]]) – Optional mapping of instance IDs to keypoint index groups, allowing support for multiple animals per frame. Must match keys in map_dict.
single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, in which case the YOLO data is formatted for the number of objects that the H5 data contains.
- Returns
None. Saves YOLO-formatted images and annotations to disk in the save_dir location.
- Example
>>> SAVE_DIR = r'D:\troubleshooting\mitra\mitra_yolo'
>>> CONFIG_PATH = r"C:\troubleshooting\mitra\project_folder\project_config.ini"
>>> runner = SimBA2Yolo(config_path=CONFIG_PATH, save_dir=SAVE_DIR, sample_size=10, verbose=True)
>>> runner.run()
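Why padding helps when body-parts lie in a line can be seen in a small sketch of a YOLO keypoint label line. The helper yolo_keypoint_line is hypothetical and not part of SimBA. It assumes padding is a fraction of the image size added on every side and clipped to the frame, and it marks every keypoint with visibility flag 2 (labeled and visible).

```python
def yolo_keypoint_line(keypoints, img_w, img_h, class_id=0, padding=0.0):
    """Hypothetical helper (not part of SimBA): one YOLO pose label line from pixel keypoints."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    # Assumption: padding is a fraction of the image size, added on each side and clipped.
    pad_x, pad_y = padding * img_w, padding * img_h
    x1, x2 = max(min(xs) - pad_x, 0), min(max(xs) + pad_x, img_w)
    y1, y2 = max(min(ys) - pad_y, 0), min(max(ys) + pad_y, img_h)
    box = [(x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h, (x2 - x1) / img_w, (y2 - y1) / img_h]
    # Each keypoint is written as normalized x, y plus a visibility flag.
    kps = []
    for x, y in keypoints:
        kps += [f"{x / img_w:.4f}", f"{y / img_h:.4f}", "2"]
    return " ".join([str(class_id)] + [f"{v:.4f}" for v in box] + kps)


# Body-parts on a horizontal line give a zero-height box unless padding is applied.
in_line = [(100, 100), (200, 100), (300, 100)]
no_pad = yolo_keypoint_line(in_line, img_w=400, img_h=400)
padded = yolo_keypoint_line(in_line, img_w=400, img_h=400, padding=0.05)
print(no_pad)
print(padded)
```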
SimBA pose-estimation -> YOLO segmentation annotations
- class simba.third_party_label_appenders.transform.simba_to_yolo_seg.SimBA2YoloSegmentation(config_path, save_dir, data_dir=None, train_size=0.7, verbose=False, greyscale=False, clahe=False, padding=0, threshold=0.0, sample_size=None, single_id=None)[source]
Bases: simba.mixins.config_reader.ConfigReader
Convert pose estimation data from a SimBA project into the YOLO segmentation format, including frame sampling, image-label pair creation, and train/validation splitting.
Note
For creating the flip_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_flip_idx(). For creating the bp_id_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_bp_id_idx().
- Parameters
config_path (Union[str, os.PathLike]) – Path to the SimBA project .ini configuration file.
save_dir (Union[str, os.PathLike]) – Directory where YOLO-formatted data will be saved. Subdirectories for images/labels (train/val) are created.
data_dir (Optional[Union[str, os.PathLike]]) – Optional directory containing outlier-corrected SimBA pose estimation data. If None, uses the path from the config.
train_size (float) – Proportion of samples to allocate to the training set (range 0.1–0.99). Remaining samples go to validation.
verbose (bool) – If True, prints progress updates to the console.
greyscale (bool) – If True, saves extracted video frames in greyscale. Otherwise, saves in color.
padding (float) – Padding added around the bounding box (as a proportion of image dimensions, range 0.0–1.0). Useful if animal body-parts are in a 'line'.
flip_idx (Tuple[int, ...]) – Tuple defining symmetric keypoint indices for horizontal flipping. Used to write the map.yaml file. If None, then attempts to infer.
names (Dict[int, str]) – Dictionary mapping instance IDs to class names. Used in annotation labels and map.yaml.
sample_size (Optional[int]) – If specified, limits the number of randomly sampled frames per video. If None, all frames are used.
bp_id_idx (Optional[Dict[int, Union[Tuple[int], List[int]]]]) – Optional mapping of instance IDs to keypoint index groups, allowing support for multiple animals per frame. Must match keys in map_dict.
single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, in which case the YOLO data is formatted for the number of objects that the H5 data contains.
- Returns
None. Saves YOLO-formatted images and annotations to disk in the save_dir location.
- Example
>>> SAVE_DIR = r'D:\troubleshooting\mitra\mitra_yolo'
>>> CONFIG_PATH = r"C:\troubleshooting\mitra\project_folder\project_config.ini"
>>> runner = SimBA2YoloSegmentation(config_path=CONFIG_PATH, save_dir=SAVE_DIR, sample_size=10, verbose=True)
>>> runner.run()
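A YOLO segmentation label is a class index followed by normalized polygon vertices. The sketch below shows one way such a polygon could be derived from body-part locations, using the convex hull of the keypoints. This is an assumption made for illustration only: the page does not state how SimBA builds its polygons, and both helper functions are hypothetical.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other, so drop it.
    return lower[:-1] + upper[:-1]


def yolo_segmentation_line(keypoints, img_w, img_h, class_id=0):
    """Hypothetical helper (not part of SimBA): class id plus normalized hull vertices."""
    hull = convex_hull(keypoints)
    coords = [f"{v:.4f}" for x, y in hull for v in (x / img_w, y / img_h)]
    return " ".join([str(class_id)] + coords)


# Four outer body-parts and one interior body-part; the interior point is not a vertex.
body_parts = [(100, 100), (300, 100), (300, 200), (100, 200), (200, 150)]
seg_line = yolo_segmentation_line(body_parts, img_w=400, img_h=400)
print(seg_line)
```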
SLEAP CSV predictions -> YOLO pose-estimation annotations
- class simba.third_party_label_appenders.transform.sleap_csv_to_yolo.Sleap2Yolo(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, instance_threshold=0, train_size=0.7, flip_idx=None, names=None, greyscale=False, clahe=False, padding=0.0, single_id=None)[source]
Bases: object
Convert SLEAP pose estimation CSV data and corresponding videos into YOLO keypoint dataset format.
Note
This converts SLEAP inference data to YOLO keypoints (not SLEAP annotations).
- Parameters
data_dir (Union[str, os.PathLike]) – Directory path containing SLEAP-generated CSV files with inferred keypoints.
video_dir (Union[str, os.PathLike]) – Directory path containing corresponding videos from which frames are to be extracted.
save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.
frms_cnt (Optional[int]) – Number of frames to randomly sample from each video for conversion. If None, all frames are used.
instance_threshold (float) – Minimum confidence score threshold to filter out low-confidence pose instances. Only instances with instance.score >= this threshold are used.
train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.
verbose (bool) – If True, prints progress. Default: True.
flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping.
map_dict (Dict[str, int]) – Dictionary mapping class indices to class names. Used for creating the YAML class names mapping file.
padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions), enlarging each box by this proportion. Useful, for example, when all body-parts lie along the animal's length. Default: 0.0.
single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, in which case the YOLO data is formatted for the number of objects that the data contains.
- Returns
None. Results saved in save_dir.
- Example
>>> DATA_DIR = r'D:\ares\data\ant\sleap_csv'
>>> VIDEO_DIR = r'D:\ares\data\ant\sleap_video'
>>> SAVE_DIR = r"D:\imgs\sleap_csv"
>>> runner = Sleap2Yolo(data_dir=DATA_DIR, video_dir=VIDEO_DIR, frms_cnt=50, train_size=0.8, instance_threshold=0.9, save_dir=SAVE_DIR, single_id='ant')
>>> runner.run()
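How a flip_idx tuple defines keypoint order after a horizontal flip can be shown with a short sketch. The function flip_keypoints_horizontally and the body-part order are hypothetical; the reordering step mirrors the common YOLO pose convention in which position i of the flipped sample takes the keypoint stored at flip_idx[i].

```python
def flip_keypoints_horizontally(keypoints, flip_idx):
    """Mirror normalized (x, y) keypoints left-right, then reorder them with flip_idx."""
    if sorted(flip_idx) != list(range(len(keypoints))):
        raise ValueError("flip_idx must be a permutation of the keypoint indices")
    mirrored = [(1.0 - x, y) for x, y in keypoints]
    # Position i in the flipped sample takes the keypoint stored at flip_idx[i].
    return [mirrored[i] for i in flip_idx]


# Hypothetical body-part order: nose, left_ear, right_ear, tail_base.
keypoints = [(0.5, 0.2), (0.25, 0.3), (0.75, 0.3), (0.5, 0.9)]
flip_idx = (0, 2, 1, 3)  # swap the two ears, keep the midline points
flipped = flip_keypoints_horizontally(keypoints, flip_idx)
print(flipped)
```

For this left-right symmetric pose the flipped sample equals the original, because swapping the ear indices undoes the mirroring of their x-coordinates. Without the swap, the left_ear slot would hold a point on the wrong side of the animal.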
SLEAP H5 predictions -> YOLO pose-estimation annotations
- class simba.third_party_label_appenders.transform.sleap_h5_to_yolo.SleapH52Yolo(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, threshold=0, train_size=0.7, flip_idx=None, animal_cnt=2, greyscale=False, clahe=False, padding=0.0, single_id=None)[source]
Bases: object
Convert SLEAP .h5 pose estimation annotations to YOLO keypoint annotation format.
Reads SLEAP .h5 files and associated videos, samples frames based on a confidence threshold, extracts keypoints for one or more animals, and saves image-label pairs in a format compatible with YOLOv8 keypoint training.
- Parameters
data_dir (Union[str, os.PathLike]) – Directory containing SLEAP .h5 files.
video_dir (Union[str, os.PathLike]) – Directory containing the videos associated with .h5 files.
save_dir (Union[str, os.PathLike]) – Directory to save YOLO-formatted images, labels, and metadata.
frms_cnt (Optional[int]) – Number of frames to sample per video. If None, all valid frames are used.
verbose (bool) – If True, print progress during processing.
threshold (float) – Likelihood threshold below which poses are discarded.
train_size (float) – Proportion of frames to assign to the training set (rest go to validation).
flip_idx (Tuple[int, ...]) – Tuple indicating how to flip body-parts for augmentation. Length must match keypoint count.
animal_cnt (int) – Number of animals tracked per frame.
greyscale (bool) – If True, convert images to grayscale.
clahe (bool) – If True, apply CLAHE (Contrast Limited Adaptive Histogram Equalization).
padding (float) – Relative padding to apply around the bounding box of keypoints (range 0.0 to 1.0).
single_id (Optional[str]) – Optional custom ID to assign all annotations the same class (used in single-animal datasets).
- Example
>>> DATA_DIR = r'D:/ares/data/termite_1/data'
>>> VIDEO_DIR = r'D:/ares/data/termite_1/video'
>>> SAVE_DIR = r'D:/imgs/sleap_h5'
>>> runner = SleapH52Yolo(data_dir=DATA_DIR, video_dir=VIDEO_DIR, save_dir=SAVE_DIR, threshold=0.9, frms_cnt=50, single_id='termite')
>>> runner.run()
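The interplay between threshold and frms_cnt can be sketched as a filter followed by random sampling. This is a simplified illustration: sample_valid_frames is hypothetical, and it assumes one confidence score per frame, whereas real SLEAP data carries scores per instance and per keypoint.

```python
import random


def sample_valid_frames(frame_scores, threshold=0.0, frms_cnt=None, seed=0):
    """Hypothetical helper (not part of SimBA): indices of frames that pass threshold."""
    # Frames scoring below the threshold are discarded.
    valid = [idx for idx, score in enumerate(frame_scores) if score >= threshold]
    # With frms_cnt=None (or too few valid frames), all valid frames are kept.
    if frms_cnt is None or frms_cnt >= len(valid):
        return valid
    return sorted(random.Random(seed).sample(valid, frms_cnt))


scores = [0.95, 0.40, 0.99, 0.91, 0.10, 0.97]
all_valid = sample_valid_frames(scores, threshold=0.9)
subset = sample_valid_frames(scores, threshold=0.9, frms_cnt=2)
print(all_valid, subset)
```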
SLEAP annotations -> YOLO pose-estimation annotations
- class simba.third_party_label_appenders.transform.sleap_to_yolo.SleapAnnotations2Yolo(sleap_dir, save_dir, video_dir=None, padding=None, train_size=0.8, verbose=True, greyscale=False, clahe=False, single_id=None)[source]
Bases: object
Convert SLEAP annotations to YOLO formatted training data.
- Parameters
sleap_dir (Union[str, os.PathLike]) – Directory containing SLEAP annotation .slp files.
save_dir (Union[str, os.PathLike]) – Directory to save YOLO-formatted images, labels, and metadata.
verbose (bool) – If True, print progress during processing.
train_size (float) – Proportion of frames to assign to the training set (rest go to validation).
greyscale (bool) – If True, convert images to grayscale.
clahe (bool) – If True, apply CLAHE (Contrast Limited Adaptive Histogram Equalization).
padding (float) – Relative padding to apply around the bounding box of keypoints (range 0.0 to 1.0).
single_id (Optional[str]) – Optional custom ID to assign all annotations the same class (used in single-animal datasets).
- Example
>>> runner = SleapAnnotations2Yolo(sleap_dir=r'D:/cvat_annotations/frames/slp_to_yolo', save_dir=r'D:/cvat_annotations/frames/slp_to_yolo/yolo')
>>> runner.run()
LightningPose keypoints -> YOLO bounding box conversion
- class simba.third_party_label_appenders.transform.litpose_to_yolo_bbox.LitPose2YOLOBbox(litpose_dir, save_dir, train_size=0.7, verbose=False, padding=0.0, sample_n=None, names=('mouse',), greyscale=False, clahe=False)[source]
Bases: object
Convert LitPose keypoint annotations into a YOLO bounding-box dataset.
- Parameters
litpose_dir (Union[str, os.PathLike]) – Path to LitPose directory containing annotation CSV files and the labeled-data image folder.
save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images and labels subdirectories are created.
train_size (float) – Fraction of samples assigned to the training split. Default 0.7.
verbose (bool) – If True, print per-image progress during conversion.
padding (float) – Extra fractional padding around each axis-aligned box inferred from keypoints.
sample_n (Optional[int]) – Optional cap on the number of sampled frames before split. If None, all frames are used.
names (Tuple[str, ...]) – Class names in YOLO index order.
greyscale (bool) – If True, load and save images in grayscale.
clahe (bool) – If True, apply CLAHE preprocessing when reading images.
References
- 1
Lightning Pose documentation: https://lightning-pose.readthedocs.io/en/latest/
- 2
Biderman et al., Lightning Pose: improved animal pose estimation via semi-supervised learning, Bayesian ensembling and cloud-native open-source tools, Nature Methods (2024), doi: https://doi.org/10.1038/s41592-024-02319-1
- Example
>>> runner = LitPose2YOLOBbox(litpose_dir=r'Z:\home\simon\lp_300126', save_dir=r'E:\litpose_yolobox', verbose=True, clahe=False, greyscale=False, sample_n=1000, padding=0.15)
>>> runner.run()
LightningPose keypoints -> YOLO pose-estimation annotations
- class simba.third_party_label_appenders.transform.litpose_to_yolo_keypoints.LitPose2YOLO(litpose_dir, save_dir, train_size=0.7, verbose=False, padding=0.0, sample_n=None, flip_idx=None, names=('mouse',), greyscale=False, clahe=False)[source]
Bases: object
Convert LitPose keypoint annotations into a YOLO keypoint dataset.
- Parameters
litpose_dir (Union[str, os.PathLike]) – Path to LitPose directory containing annotation CSV files and the labeled-data image folder.
save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images and labels subdirectories are created.
train_size (float) – Fraction of samples assigned to the training split. Default 0.7.
verbose (bool) – If True, print per-image progress during conversion.
padding (float) – Extra padding factor used when computing normalized YOLO boxes from keypoints.
sample_n (Optional[int]) – Optional cap on the number of sampled frames before split. If None, all frames are used.
flip_idx (Optional[Tuple[int, ...]]) – Optional keypoint flip index order for YOLO pose augmentation. If None, inferred from body-part names.
names (Tuple[str, ...]) – Class names in YOLO index order.
greyscale (bool) – If True, load and save images in grayscale.
clahe (bool) – If True, apply CLAHE preprocessing when reading images.
References
- 1
Lightning Pose documentation: https://lightning-pose.readthedocs.io/en/latest/
- 2
Biderman et al., Lightning Pose: improved animal pose estimation via semi-supervised learning, Bayesian ensembling and cloud-native open-source tools, Nature Methods (2024), doi: https://doi.org/10.1038/s41592-024-02319-1
- Example
>>> runner = LitPose2YOLO(litpose_dir=r'Z:\home\simon\lp_300126', save_dir=r'E:\litpose_yolobox', verbose=True, clahe=False, greyscale=False, sample_n=1000, padding=0.15)
>>> runner.run()
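When flip_idx is None it is inferred from the body-part names. A minimal sketch of how such an inference could work is shown below, pairing names that differ only by "left" and "right". The function infer_flip_idx and the body-part names are hypothetical, and the rule SimBA actually applies may differ.

```python
def infer_flip_idx(body_parts):
    """Hypothetical sketch (SimBA's own rule may differ): pair left/right body-part names."""
    lookup = {name.lower(): i for i, name in enumerate(body_parts)}
    flip_idx = []
    for i, name in enumerate(body_parts):
        lowered = name.lower()
        if "left" in lowered:
            partner = lowered.replace("left", "right")
        elif "right" in lowered:
            partner = lowered.replace("right", "left")
        else:
            partner = lowered
        # Fall back to the keypoint's own index when no mirrored partner exists.
        flip_idx.append(lookup.get(partner, i))
    return tuple(flip_idx)


names = ["nose", "left_ear", "right_ear", "center", "tail_base"]
flip_idx = infer_flip_idx(names)
print(flip_idx)
```

Midline body-parts map to themselves, so only the paired left/right entries change position in the resulting tuple.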