Third-party label appenders

BENTO

class simba.third_party_label_appenders.BENTO_appender.BentoAppender(config_path, data_dir)[source]

Bases: simba.mixins.config_reader.ConfigReader

Append BENTO annotation to SimBA featurized datasets.

Parameters
  • config_path (str) – path to SimBA project config file in Configparser format

  • data_dir (str) – Path to folder containing BENTO data.

Example

>>> bento_dir = 'tests/test_data/bento_example'
>>> config_path = 'tests/test_data/import_tests/project_folder/project_config.ini'
>>> bento_appender = BentoAppender(config_path=config_path, data_dir=bento_dir)
>>> bento_appender.run()

References

1

Segalin et al., eLife, https://doi.org/10.7554/eLife.63720

BORIS

class simba.third_party_label_appenders.BORIS_appender.BorisAppender(config_path, data_dir)[source]

Bases: simba.mixins.config_reader.ConfigReader

Append BORIS human annotations onto featurized pose-estimation data.

Parameters
  • config_path (str) – path to SimBA project config file in Configparser format

  • data_dir (str) – path to folder holding BORIS data files in CSV format

Example

>>> test = BorisAppender(config_path=r"C:\troubleshooting\boris_test\project_folder\project_config.ini", data_dir=r"C:\troubleshooting\boris_test\project_folder\boris_files")
>>> test.run()

References

1

Behavioral Observation Research Interactive Software (BORIS) user guide.

BORIS source cleaner

class simba.third_party_label_appenders.boris_source_cleaner.BorisSourceCleaner(data_dir, save_dir, settings)[source]

Bases: object

Helper to clean BORIS files where behaviors with the same name are assigned to different animals through the Subjects field.

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory with BORIS annotations in CSV format.

  • save_dir (Union[str, os.PathLike]) – Directory to store the modified BORIS annotations in CSV format.

  • settings (dict) – Rules for how to change the behavior names.

Example

>>> boris_cleaner = BorisSourceCleaner(data_dir='/Users/simon/Downloads/boris_data', save_dir='/Users/simon/Downloads/save_dir', settings=SETTINGS)
>>> boris_cleaner.run()

Deepethogram

class simba.third_party_label_appenders.deepethogram_importer.DeepEthogramImporter(data_dir, config_path)[source]

Bases: simba.mixins.config_reader.ConfigReader

Append DeepEthogram optical flow annotations onto featurized pose-estimation data.

Parameters
  • config_path (str) – path to SimBA project config file in Configparser format

  • data_dir (str) – path to folder holding DeepEthogram data files in CSV format

Examples

>>> deepethogram_importer = DeepEthogramImporter(config_path=r'MySimBAConfigPath', data_dir=r'MyDeepEthogramDir')
>>> deepethogram_importer.run()

References

1

DeepEthogram repo.

2

Example DeepEthogram input file.

Ethovision

class simba.third_party_label_appenders.ethovision_import.ImportEthovision(config_path, data_dir)[source]

Bases: simba.mixins.config_reader.ConfigReader

Append ETHOVISION human annotations onto featurized pose-estimation data. Results are saved within the project_folder/csv/targets_inserted directory of the SimBA project (as parquet or CSV files).

Parameters
  • config_path (str) – path to SimBA project config file in Configparser format

  • data_dir (str) – path to folder holding ETHOVISION data files in XLSX or XLS format

Examples

>>> ethovision_importer = ImportEthovision(config_path="MyConfigPath", data_dir="MyEthovisionFolderPath")
>>> ethovision_importer.run()

Noldus Observer

class simba.third_party_label_appenders.observer_importer.NoldusObserverImporter(config_path, data_dir)[source]

Bases: simba.mixins.config_reader.ConfigReader

Append Noldus Observer human annotations onto featurized pose-estimation data. Results are saved within the project_folder/csv/targets_inserted directory of the SimBA project (as parquet or CSV files).

Parameters
  • config_path (str) – path to SimBA project config file in Configparser format

  • data_dir (str) – path to folder holding Observer data files in XLSX or XLS format

Examples

>>> _ = NoldusObserverImporter(config_path='MyConfigPath', data_dir='MyNoldusObserverDataDir').run()

Solomon coder

class simba.third_party_label_appenders.solomon_importer.SolomonImporter(config_path, data_dir)[source]

Bases: simba.mixins.config_reader.ConfigReader

Append SOLOMON human annotations onto featurized pose-estimation data.

Parameters
  • config_path (str) – path to SimBA project config file in Configparser format

  • data_dir (str) – path to folder holding SOLOMON data files in CSV format

Examples

>>> solomon_imported = SolomonImporter(config_path=r'MySimBAConfigPath', data_dir=r'MySolomonDir')
>>> solomon_imported.run()

References

1

SOLOMON CODER USER-GUIDE (PDF).

Shah appender

simba.third_party_label_appenders.shah_appender.shah_appender(labels_dir, features_dir, targets_dir, clf_names)[source]

Appends behavioral annotations from Shah-formatted .txt files to featurized pose estimation data.

Parses third-party annotation files in Shah format (containing FULL LOG sections with start/stop events), matches them with corresponding feature-extracted CSV files by video name, and creates binary target columns for each specified behavior. The merged data is saved to the targets directory.

Note

For expected input files, see simba.tests.data.shah_examples.zip. Data from Alyssa Hall, Broad Institute.

Parameters
  • labels_dir (Union[str, os.PathLike]) – Path to directory containing Shah-formatted .txt annotation files with start/stop event logs.

  • features_dir (Union[str, os.PathLike]) – Path to directory containing featurized pose estimation CSV files.

  • targets_dir (Union[str, os.PathLike]) – Path to directory where merged feature and annotation CSV files will be saved.

  • clf_names (List[str]) – List of behavior/classifier names to extract from the annotation files and append as binary target columns.

Returns

None

Example

>>> LABELS_DIR = r'/Users/simon/Desktop/envs/simba/troubleshooting/shah_data'                                           #PATH TO A DIRECTORY CONTAINING .TXT FILES
>>> FEATURES_DIR = r'/Users/simon/Desktop/envs/simba/troubleshooting/mitra/project_folder/csv/features_extracted'       #PATH TO A DIRECTORY CONTAINING FEATURIZED POSE ESTIMATION
>>> TARGETS_DIR = r'/Users/simon/Desktop/envs/simba/troubleshooting/mitra/project_folder/csv/targets_inserted'          #PATH TO A DIRECTORY WHERE FEATURIZED POSE ESTIMATION AND ALIGNED ANNOTATIONS ARE TO BE SAVED
>>> CLF_NAMES = ['Rearing', 'Grooming']                                                                                 #NAMES OF THE ANNOTATED BEHAVIORS TO BUILD CLASSIFIERS FROM
>>> shah_appender(labels_dir=LABELS_DIR, features_dir=FEATURES_DIR, targets_dir=TARGETS_DIR, clf_names=CLF_NAMES)
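
The central step described for this function, turning start/stop events into binary target columns, can be sketched as below. This is a minimal illustration and not SimBA's implementation: the `events_to_binary` helper, the frame-indexed `(start, stop)` tuples, and the inclusive stop frame are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def events_to_binary(events: List[Tuple[int, int]], n_frames: int) -> List[int]:
    """Return a 0/1 list of length n_frames that is 1 wherever an event is active."""
    target = [0] * n_frames
    for start, stop in events:
        # Clip each bout to the video length; the stop frame is inclusive (assumption).
        for frame in range(max(start, 0), min(stop, n_frames - 1) + 1):
            target[frame] = 1
    return target


# Hypothetical annotations for a single 10-frame video.
bouts: Dict[str, List[Tuple[int, int]]] = {
    "Rearing": [(2, 4)],
    "Grooming": [(0, 1), (8, 9)],
}
targets = {clf: events_to_binary(ev, n_frames=10) for clf, ev in bouts.items()}
print(targets["Rearing"])   # [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
print(targets["Grooming"])  # [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
```

Each resulting list has one entry per frame, so it can be attached to the feature table of the matching video as a column named after the behavior.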

Generic third-party appender tool

class simba.third_party_label_appenders.third_party_appender.ThirdPartyLabelAppender(config_path, data_dir, app, file_format, error_settings, log=False)[source]

Bases: simba.mixins.config_reader.ConfigReader

Concatenate third-party annotations to featurized pose-estimation datasets in SimBA.

Parameters
  • app (str) – Third-party application. OPTIONS: [‘BORIS’, ‘BENTO’, ‘DEEPETHOGRAM’, ‘ETHOVISION’, ‘OBSERVER’, ‘SOLOMON’].

  • config_path (str) – path to SimBA project config file in Configparser format.

  • data_dir (str) – Directory holding third-party annotation data files.

  • settings (dict) – User-defined settings including how to handle errors, logging, and data file types associated with the third-party application.

Note

Third-party import tutorials.

BENTO: expected input (https://github.com/sgoldenlab/simba/blob/master/misc/bento_example.annot). BORIS: expected input. DEEPETHOGRAM: expected input. ETHOVISION: expected input. OBSERVER: expected input I. OBSERVER: expected input II. SOLOMON: expected input II.

Example

>>> test = ThirdPartyLabelAppender(config_path='/Users/simon/Desktop/envs/simba/troubleshooting/two_black_animals_14bp/project_folder/project_config.ini',
...                                data_dir='/Users/simon/Desktop/envs/simba/simba/tests/data/test_projects/two_c57/observer_annotations',
...                                app='OBSERVER',
...                                file_format='.xlsx',
...                                error_settings=error_settings,
...                                log=log)
>>> test.run()

References

1

DeepEthogram repo.

2

Segalin et al., eLife, https://doi.org/10.7554/eLife.63720

3

Behavioral Observation Research Interactive Software (BORIS) user guide.

4

Noldus Ethovision XT.

5

Noldus Observer XT.

6

Solomon coder user-guide (PDF).

Third-party annotation tools

simba.third_party_label_appenders.tools.is_new_boris_version(pd_df)[source]

Check the format of a BORIS annotation file.

In the new version, additional column names are present, while others have slightly different names. Here, we check for the presence of a column name present only in the newer version.

Returns

True if newer version
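
The check described above amounts to a membership test on the file header. The sketch below illustrates the idea on raw CSV text using only the standard library; the real function takes a DataFrame. The marker column name `Media file name` and both sample headers are assumptions chosen for illustration, not confirmed BORIS or SimBA values.

```python
import csv
import io

# Assumption: this column name stands in for one that exists only in newer exports.
NEW_VERSION_COLUMN = "Media file name"


def has_new_format_header(csv_text: str, marker: str = NEW_VERSION_COLUMN) -> bool:
    """Return True if the first CSV row (the header) contains the marker column."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return marker in header


# Hypothetical exports; the column layouts are illustrative only.
old_export = "Time,Media file path,Behavior,Status\n1.0,a.mp4,attack,START\n"
new_export = "Observation id,Media file name,Behavior,Behavior type,Time\nobs,a.mp4,attack,START,1.0\n"

print(has_new_format_header(old_export))  # False
print(has_new_format_header(new_export))  # True
```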

simba.third_party_label_appenders.tools.read_bento_files(data_paths, video_info_df, error_setting=None, log_setting=False)[source]

Reads multiple BENTO annotation files and processes them into a dictionary of DataFrames, each representing the combined annotations for a corresponding video. The function verifies that all files exist and that the file names match the video information provided.

Parameters
  • data_paths (Union[List[str], str, os.PathLike]) – Paths to BENTO annotation files or a directory containing such files. If a directory is provided, all files with the extension '.annot' will be processed.

  • video_info_df (Union[str, os.PathLike, pd.DataFrame]) – Path to a CSV file containing video information or a preloaded DataFrame with the same data. This information is used to match BENTO files with their corresponding videos and extract the FPS.

  • error_setting (Literal[Union[None, Methods.ERROR.value, Methods.WARNING.value]]) – Determines the error handling mode. If set to Methods.ERROR.value, errors will raise exceptions. If set to Methods.WARNING.value, errors will generate warnings instead. If None, no error handling modifications are applied.

  • log_setting (Optional[bool]) – If True, logging will be enabled for the process, providing detailed information about the steps being executed. Default: False.

Returns

A dictionary where the keys are video names and the values are DataFrames containing the combined annotations for each video.

Return type

Dict[str, pd.DataFrame]

Example

>>> dfs = read_bento_files(data_paths=r"C:\troubleshooting\bento_test\bento_files", error_setting='WARNING', log_setting=False, video_info_df=r"C:\troubleshooting\bento_test\project_folder\logs\video_info.csv")

simba.third_party_label_appenders.tools.read_boris_annotation_files(data_paths, video_info_df, error_setting=None, orient='columns', log_setting=False)[source]

Reads multiple BORIS behavioral annotation files and compiles the data into a dictionary of dataframes.

Parameters
  • data_paths (Union[List[str], str, os.PathLike]) – Paths to the BORIS annotation files. This can be a list of file paths, a single directory containing the files, or a single file path.

  • video_info_df (Union[str, os.PathLike, pd.DataFrame]) – The path to a CSV file, an existing dataframe, or a file-like object containing video information (e.g., FPS, video name). This data is used to align the annotation files with their respective videos.

  • error_setting (Literal[Union[None, Methods.ERROR.value, Methods.WARNING.value]]) – Defines the behavior when encountering issues in the files. Options are Methods.ERROR.value to raise errors, Methods.WARNING.value to log warnings, or None for no action.

  • log_setting (Optional[bool]) – Whether to log warnings and errors when error_setting is set to Methods.WARNING.value. Defaults to False.

Returns

A dictionary where each key is a video name, and each value is a dataframe containing the compiled behavioral annotations from the corresponding BORIS file.

Example

>>> data = read_boris_annotation_files(data_paths=[r"C:\troubleshooting\boris_test\project_folder\boris_files\c_oxt23_190816_132617_s_trimmcropped.csv"], error_setting='WARNING', log_setting=False, video_info_df=r"C:\troubleshooting\boris_test\project_folder\logs\video_info.csv")

Annotation format converters

simba.third_party_label_appenders.converters.arr_to_b64(x)[source]

Helper to convert an image in array format to an image in byte string format.

simba.third_party_label_appenders.converters.b64_dict_to_imgs(x)[source]

Helper to convert a dictionary of images in base64 format to a dictionary of images in array format.

Example

>>> df = labelme_to_df(labelme_dir=r'C:\troubleshooting\coco_data\labels\test_2')
>>> x = df.set_index('image_name')['image'].to_dict()
>>> b64_dict_to_imgs(x)

simba.third_party_label_appenders.converters.b64_to_arr(img_b64)[source]

Helper to convert a byte string (e.g., created by labelme) to an image in numpy array format.
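
The byte-string half of these helpers can be illustrated with the standard library alone. The sketch below only round-trips raw bytes through base64; the step between an image array and encoded image bytes (for example via an image library) is deliberately left out, and the function names `bytes_to_b64` and `b64_to_bytes` are hypothetical, not part of SimBA.

```python
import base64


def bytes_to_b64(raw: bytes) -> str:
    """Encode raw (already image-encoded) bytes as a base64 text string."""
    return base64.b64encode(raw).decode("utf-8")


def b64_to_bytes(encoded: str) -> bytes:
    """Decode a base64 text string back into the original bytes."""
    return base64.b64decode(encoded)


# The 8-byte PNG file signature serves as a small stand-in for image data.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = bytes_to_b64(png_signature)
print(encoded)                                  # iVBORw0KGgo=
print(b64_to_bytes(encoded) == png_signature)   # True
```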

simba.third_party_label_appenders.converters.coco_keypoints_to_yolo(coco_path, img_dir, save_dir, train_size=0.7, flip_idx=(0, 2, 1, 3, 5, 4, 6), verbose=True)[source]

Convert COCO Keypoints version 1.0 data format into a YOLO keypoints training set.

Note

COCO keypoint files can be created using https://www.cvat.ai/.

Parameters
  • coco_path (Union[str, os.PathLike]) – Path to COCO keypoints 1.0 file in json format.

  • img_dir (Union[str, os.PathLike]) – Directory holding the image files representing the annotated entries in the coco_path.

  • save_dir (Union[str, os.PathLike]) – Directory where to save the YOLO-formatted data.

  • train_size (float) – The size of the training set. Value between 0 and 1.0 representing the proportion of training data. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: True.

  • flip_idx (Tuple[int, …]) – Tuple of ints, representing the flip of body-part coordinates when the animal image flips 180 degrees.

Returns

None

Example

>>> coco_path = r"D:\netholabs\imgs_vcat\batch_1\batch_1\coco_annotations\person_keypoints_default.json"
>>> coco_keypoints_to_yolo(coco_path=coco_path, img_dir=r'D:\netholabs\imgs_vcat\batch_1', save_dir=r"D:\netholabs\imgs_vcat\batch_1\batch_1\yolo_annotations")

simba.third_party_label_appenders.converters.create_yolo_keypoint_yaml(path, train_path, val_path, names, kpt_shape=None, flip_idx=None, save_path=None, use_wsl_paths=False)[source]

Given a set of paths to directories, create a model.yaml file for YOLO pose model training through ultralytics wrappers.

See also

Used by simba.sandbox.coco_keypoints_to_yolo.coco_keypoints_to_yolo()

Parameters
  • path (Union[str, os.PathLike]) – Parent directory holding both an images and a labels directory.

  • train_path (Union[str, os.PathLike]) – Directory holding training images. For example, if C:\troubleshooting\coco_data\images\train is passed, then a C:\troubleshooting\coco_data\labels\train is expected.

  • val_path (Union[str, os.PathLike]) – Directory holding validation images. For example, if C:\troubleshooting\coco_data\images\test is passed, then a C:\troubleshooting\coco_data\labels\test is expected.

  • test_path (Union[str, os.PathLike]) – Directory holding test images. For example, if C:\troubleshooting\coco_data\images\validation is passed, then a C:\troubleshooting\coco_data\labels\validation is expected.

  • names (Dict[str, int]) – Dictionary mapping object names to object integer identifiers. E.g., {'OBJECT 1': 0, 'OBJECT 2': 2}

  • flip_idx (Optional[Tuple[int, …]]) – Optional tuple of integers representing keypoint switch indexes if the image is flipped horizontally. Only pass if pose-estimation data.

  • kpt_shape (Optional[Tuple[int, int]]) – Optional tuple of integers representing the shape of each animal's keypoints, e.g., (6, 3). Only pass if pose-estimation data.

  • save_path (Union[str, os.PathLike]) – Optional location where to save the YOLO model yaml file. If None, then the dict is returned.

  • use_wsl_paths (bool) – If True, use Windows WSL paths (e.g., /mnt/…) in the config file.

Returns

None

simba.third_party_label_appenders.converters.create_yolo_yaml(path, train_path, val_path, names, save_path=None, test_path=None, reverse_ids=True)[source]

Given a set of paths to directories, create a model.yaml file for model training through ultralytics wrappers.

Parameters
  • path (Union[str, os.PathLike]) – Parent directory holding both an images and a labels directory.

  • train_path (Union[str, os.PathLike]) – Directory holding training images. For example, if C:\troubleshooting\coco_data\images\train is passed, then a C:\troubleshooting\coco_data\labels\train is expected.

  • val_path (Union[str, os.PathLike]) – Directory holding validation images. For example, if C:\troubleshooting\coco_data\images\test is passed, then a C:\troubleshooting\coco_data\labels\test is expected.

  • test_path (Union[str, os.PathLike]) – Directory holding test images. For example, if C:\troubleshooting\coco_data\images\validation is passed, then a C:\troubleshooting\coco_data\labels\validation is expected.

  • names (Dict[str, int]) – Dictionary mapping object names to object integer identifiers. E.g., {'OBJECT 1': 0, 'OBJECT 2': 2}

  • save_path (Union[str, os.PathLike]) – Optional location where to save the YOLO model yaml file. If None, then the dict is returned.

Returns

None
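
To make the shape of such a file concrete, the sketch below assembles a dataset description with the `path`, `train`, `val`, `test`, and `names` keys used by Ultralytics-style dataset configs. It is an illustration under assumptions, not SimBA's code: `build_dataset_config` and `to_yaml_text` are hypothetical helpers, the `{name: id}` mapping is inverted to `{id: name}` here, and the hand-rolled writer only handles flat values and one nested mapping (a YAML library would normally be used instead).

```python
from typing import Dict, Optional


def build_dataset_config(path: str, train_path: str, val_path: str,
                         names: Dict[str, int],
                         test_path: Optional[str] = None) -> Dict:
    """Assemble the dataset description as a plain dictionary."""
    cfg = {"path": path, "train": train_path, "val": val_path}
    if test_path is not None:
        cfg["test"] = test_path
    # Invert {name: id} to {id: name}, ordered by identifier.
    cfg["names"] = {idx: name for name, idx in sorted(names.items(), key=lambda kv: kv[1])}
    return cfg


def to_yaml_text(cfg: Dict) -> str:
    """Serialize simple values and one level of nested mappings as YAML text."""
    lines = []
    for key, value in cfg.items():
        if isinstance(value, dict):
            lines.append(f"{key}:")
            lines.extend(f"  {k}: {v}" for k, v in value.items())
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


# Hypothetical paths and class names.
cfg = build_dataset_config(path="/data/yolo", train_path="images/train",
                           val_path="images/val", names={"mouse": 0, "rat": 1})
print(to_yaml_text(cfg))
```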

simba.third_party_label_appenders.converters.dlc_multi_animal_h5_to_yolo_keypoints(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, threshold=0, train_size=0.7, flip_idx=None, names=None, greyscale=False, padding=0.0)[source]

Convert DLC multi-animal pose-estimation H5 data and corresponding videos into YOLO keypoint dataset format.

Note

This converts DLC inference data to YOLO keypoints (not DLC annotations).

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory path containing DLC-generated H5 files with inferred keypoints.

  • video_dir (Union[str, os.PathLike]) – Directory path containing corresponding videos from which frames are to be extracted.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.

  • frms_cnt (Optional[int]) – Number of frames to randomly sample from each video for conversion. If None, all frames are used.

  • threshold (float) – Minimum confidence score threshold to filter out low-confidence pose instances. Only instances with a score >= this threshold are used. Default: 0.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: True.

  • flip_idx (Tuple[int, …]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping.

  • names (Dict[str, int]) – Dictionary mapping class indices to class names. Used for creating the YAML class names mapping file.

  • padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions). Enlarges the bounding boxes by this fraction, which is useful, e.g., when all body-parts lie along the animal's length. Default: 0.0.

Returns

None. Results saved in save_dir.

Example

>>> dlc_multi_animal_h5_to_yolo_keypoints(data_dir=r'D:\troubleshooting\dlc_h5_multianimal_to_yolo\data', video_dir=r'D:\troubleshooting\dlc_h5_multianimal_to_yolo\videos', save_dir=r'D:\troubleshooting\dlc_h5_multianimal_to_yolo\yolo')
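
The random train/validation assignment controlled by train_size can be sketched as follows. This is a minimal illustration assuming a simple shuffle-and-cut strategy; the `split_train_val` helper and its `seed` argument are hypothetical and not part of SimBA.

```python
import random
from typing import List, Sequence, Tuple


def split_train_val(items: Sequence, train_size: float = 0.7,
                    seed: int = 0) -> Tuple[List, List]:
    """Shuffle items reproducibly and cut them into train and validation lists."""
    if not 0.1 <= train_size <= 0.99:
        raise ValueError("train_size must be between 0.1 and 0.99")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_size)
    return shuffled[:n_train], shuffled[n_train:]


# Ten hypothetical frame indices split 70/30.
train, val = split_train_val(range(10), train_size=0.7)
print(len(train), len(val))  # 7 3
```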

simba.third_party_label_appenders.converters.dlc_to_labelme(dlc_dir, save_dir, labelme_version='5.3.1', flags=None, verbose=True)[source]

Convert a folder of DLC annotations into labelme json format.

Parameters
  • dlc_dir (Union[str, os.PathLike]) – Folder with DLC annotations. I.e., directory inside

  • save_dir (Union[str, os.PathLike]) – Directory where to save the labelme json files.

  • labelme_version (Optional[str]) – Version number encoded in the json files.

  • flags (Optional[Dict[Any, Any]]) – Flags included in the json files.

  • verbose (Optional[bool]) – If True, prints progress.

Returns

None

Example

>>> dlc_to_labelme(dlc_dir=r"D:\TS_DLC\labeled-data\ts_annotations", save_dir=r"C:\troubleshooting\coco_data\labels\test")
>>> dlc_to_labelme(dlc_dir=r'D:\rat_resident_intruder\dlc_data\WIN_20190816081353', save_dir=r'D:\rat_resident_intruder\labelme')

simba.third_party_label_appenders.converters.dlc_to_yolo_keypoints(dlc_dir, save_dir, train_size=0.7, verbose=False, padding=0.0, flip_idx=(1, 0, 2, 3, 5, 4, 6, 7), map_dict={0: 'mouse'}, greyscale=False, bp_id_idx=None)[source]

Converts DLC annotations into YOLO keypoint format for model training.

Note

dlc_dir can be a directory with subdirectories containing images and CSV files with the CollectedData substring filename. For creating the flip_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_flip_idx(). For creating the bp_id_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_bp_id_idx()

Parameters
  • dlc_dir (Union[str, os.PathLike]) – Directory path containing DLC-generated CSV files with keypoint annotations and images.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: False.

  • padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions). Enlarges the bounding boxes by this fraction, which is useful, e.g., when all body-parts lie along the animal's length. Default: 0.0.

  • flip_idx (Tuple[int, …]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping.

  • map_dict (Dict[int, str]) – Dictionary mapping class indices to class names. Used for creating the YAML class names mapping file.

Returns

None. Results saved in save_dir.

Example

>>> dlc_to_yolo_keypoints(dlc_dir=r'D:\mouse_operant_data\Operant_C57_labelled_images\labeled-data', save_dir=r"D:\mouse_operant_data\yolo", verbose=True)
>>> dlc_to_yolo_keypoints(dlc_dir=r'D:\rat_resident_intruder\dlc_data', save_dir=r"D:\rat_resident_intruder\yolo", verbose=True, bp_id_idx={0: list(range(0, 8)), 1: list(range(8, 16))}, map_dict={0: 'resident', 1: 'intruder'})
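
For orientation, a single line of a YOLO keypoint label file holds a class index, a normalized bounding box (center x, center y, width, height), and then normalized x, y and a visibility flag per keypoint. The sketch below builds such a line and shows how fractional padding relative to the image dimensions enlarges the box. It is an illustration under assumptions, not SimBA's implementation: `yolo_keypoint_line` is a hypothetical helper, every keypoint is marked visible (flag 2), and the padded box is clipped to the image.

```python
from typing import Sequence, Tuple


def yolo_keypoint_line(cls_id: int, keypoints: Sequence[Tuple[float, float]],
                       img_w: int, img_h: int, padding: float = 0.0) -> str:
    """Build one normalized YOLO keypoint label line from pixel coordinates."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    # Padding is a fraction of the image size, added on every side of the box.
    pad_x, pad_y = padding * img_w, padding * img_h
    x_min, x_max = max(min(xs) - pad_x, 0.0), min(max(xs) + pad_x, float(img_w))
    y_min, y_max = max(min(ys) - pad_y, 0.0), min(max(ys) + pad_y, float(img_h))
    box = [(x_min + x_max) / 2 / img_w, (y_min + y_max) / 2 / img_h,
           (x_max - x_min) / img_w, (y_max - y_min) / img_h]
    fields = [str(cls_id)] + [f"{v:.4f}" for v in box]
    for x, y in keypoints:
        # Visibility flag 2 marks the keypoint as labeled and visible (assumption).
        fields += [f"{x / img_w:.4f}", f"{y / img_h:.4f}", "2"]
    return " ".join(fields)


line = yolo_keypoint_line(0, [(100, 50), (300, 150)], img_w=400, img_h=200, padding=0.05)
print(line)  # 0 0.5000 0.5000 0.6000 0.6000 0.2500 0.2500 2 0.7500 0.7500 2
```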

simba.third_party_label_appenders.converters.geometries_to_coco(geometries, video_path, save_dir, version=1, description=None, licences=None)[source]

Convert a dictionary of geometries (keypoints or polygons) into COCO format annotations and save images extracted from a video to a specified directory.

This function takes a dictionary of geometries (e.g., keypoints, bounding boxes, or polygons) and converts them into COCO format annotations. The geometries are associated with frames of a video, and the corresponding images are extracted from the video, saved as PNG files, and linked to the annotations.

Example

>>> data_path = r"C:\troubleshooting\mitra\project_folder\csv\outlier_corrected_movement_location\FRR_gq_Saline_0624.csv"
>>> animal_data = read_df(file_path=data_path, file_type='csv', usecols=['Nose_x', 'Nose_y', 'Tail_base_x', 'Tail_base_y', 'Left_side_x', 'Left_side_y', 'Right_side_x', 'Right_side_y']).values.reshape(-1, 4, 2)[0:20].astype(np.int32)
>>> animal_polygons = GeometryMixin().bodyparts_to_polygon(data=animal_data)
>>> animal_polygons = GeometryMixin().multiframe_minimum_rotated_rectangle(shapes=animal_polygons)
>>> animal_polygons = GeometryMixin().geometries_to_exterior_keypoints(geometries=animal_polygons)
>>> animal_polygons = GeometryMixin.keypoints_to_axis_aligned_bounding_box(keypoints=animal_polygons)
>>> animal_polygons = {0: animal_polygons}
>>> geometries_to_coco(geometries=animal_polygons, video_path=r'C:\troubleshooting\mitra\project_folder\videos\FRR_gq_Saline_0624.mp4', save_dir=r"C:\troubleshooting\coco_data")

simba.third_party_label_appenders.converters.geometries_to_yolo(geometries, video_path, save_dir, verbose=True, sample=None, obb=False, map=None)[source]

Converts geometrical shapes (like polygons) into YOLO format annotations and saves them along with corresponding video frames as images.

Parameters
  • geometries (Dict[Union[str, int], np.ndarray]) – A dictionary where the keys represent category IDs (either string or int), and the values are NumPy arrays of shape (n_frames, n_points, 2). Each entry in the array represents the geometry of an object in a particular frame (e.g., keypoints or polygons).

  • video_path (Union[str, os.PathLike]) – Path to the video file from which frames are extracted. The video is used to extract images corresponding to the geometrical annotations.

  • save_dir (Union[str, os.PathLike]) – The directory where the output images and YOLO annotation files will be saved. Images will be stored in a subfolder images/ and annotations in labels/.

  • verbose (Optional[bool]) – If True, prints progress while processing each frame. This can be useful for monitoring long-running tasks. Default is True.

  • sample (Optional[int]) – If provided, only a random sample of the geometries will be used for annotation. This value represents the number of frames to sample. If None, all frames will be processed. Default is None.

  • obb (Optional[bool]) – If True, uses oriented bounding boxes (OBB) by extracting the four corner points of the geometries. Otherwise, axis-aligned bounding boxes (AABB) are used. Default is False.

  • map (Optional[Dict[int, str]]) – Optional dictionary mapping integer category IDs to category names. Default is None.

Returns

None

Example

>>> data_path = r"C:\troubleshooting\mitra\project_folder\csv\outlier_corrected_movement_location\501_MA142_Gi_CNO_0514.csv"
>>> animal_data = read_df(file_path=data_path, file_type='csv', usecols=['Nose_x', 'Nose_y', 'Tail_base_x', 'Tail_base_y', 'Left_side_x', 'Left_side_y', 'Right_side_x', 'Right_side_y']).values.reshape(-1, 4, 2).astype(np.int32)
>>> animal_polygons = GeometryMixin().bodyparts_to_polygon(data=animal_data)
>>> polygons = GeometryMixin().multiframe_minimum_rotated_rectangle(shapes=animal_polygons)
>>> animal_polygons = GeometryMixin().geometries_to_exterior_keypoints(geometries=polygons)
>>> animal_polygons = {0: animal_polygons}
>>> geometries_to_yolo(geometries=animal_polygons, video_path=r'C:\troubleshooting\mitra\project_folder\videos\501_MA142_Gi_CNO_0514.mp4', save_dir=r"C:\troubleshooting\coco_data", sample=500, obb=True)
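
The difference between the two box styles selected by obb can be sketched with four corner points. An oriented box is written as the four normalized corners, while an axis-aligned box is written as a normalized center, width, and height. The `yolo_box_line` helper below is hypothetical and shows only the label text, not frame extraction or file writing.

```python
from typing import Sequence, Tuple


def yolo_box_line(cls_id: int, corners: Sequence[Tuple[float, float]],
                  img_w: int, img_h: int, obb: bool = False) -> str:
    """Format four pixel-space corners as a normalized YOLO label line."""
    if obb:
        # Oriented box: x1 y1 x2 y2 x3 y3 x4 y4, each divided by the image size.
        coords = [c for x, y in corners for c in (x / img_w, y / img_h)]
    else:
        # Axis-aligned box: x_center y_center width height.
        xs = [x for x, _ in corners]
        ys = [y for _, y in corners]
        coords = [(min(xs) + max(xs)) / 2 / img_w, (min(ys) + max(ys)) / 2 / img_h,
                  (max(xs) - min(xs)) / img_w, (max(ys) - min(ys)) / img_h]
    return " ".join([str(cls_id)] + [f"{v:.4f}" for v in coords])


corners = [(100, 50), (300, 50), (300, 150), (100, 150)]
print(yolo_box_line(0, corners, img_w=400, img_h=200, obb=True))
# 0 0.2500 0.2500 0.7500 0.2500 0.7500 0.7500 0.2500 0.7500
print(yolo_box_line(0, corners, img_w=400, img_h=200, obb=False))
# 0 0.5000 0.5000 0.5000 0.5000
```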

simba.third_party_label_appenders.converters.get_yolo_keypoint_bp_id_idx(animal_bp_dict)[source]

Helper to create a dictionary holding the indexes for each animal's body-parts. Used for transforming data for creating a YOLO training set.

Parameters

animal_bp_dict – Dictionary of animal body-parts. Can be created by simba.mixins.config_reader.ConfigReader.create_body_part_dictionary().

Returns

Dictionary where the key is the animal name, and the values are the indexes of the columns belonging to each animal.

Return type

Dict[int, List[int]]

simba.third_party_label_appenders.converters.get_yolo_keypoint_flip_idx(x)[source]

Given a list of body-parts, create a flip_index YOLO yaml entry.

Important

Only works if the left and right body-parts have the substrings left and right (case-insensitive).

Parameters

x (List[str]) – List of the names of the body-parts. If several animals, then just a list of names for the body-parts for one animal.

Returns

The flip_idx required by the YOLO model yaml file. E.g., [1, 0, 2, 3, 4]

Return type

Tuple[int, …]
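
One way to derive such a flip index from body-part names is to swap each name containing "left" with its "right" counterpart and leave all other body-parts in place. The sketch below is an assumption about the approach rather than SimBA's code; `make_flip_idx` is a hypothetical helper, and unpaired names map to themselves.

```python
from typing import List, Tuple


def make_flip_idx(names: List[str]) -> Tuple[int, ...]:
    """Return, for each body-part, the index of its mirrored counterpart."""
    lowered = [name.lower() for name in names]
    flip = []
    for i, name in enumerate(lowered):
        if "left" in name:
            partner = name.replace("left", "right")
        elif "right" in name:
            partner = name.replace("right", "left")
        else:
            partner = name
        # Fall back to the body-part's own index when no counterpart exists.
        flip.append(lowered.index(partner) if partner in lowered else i)
    return tuple(flip)


print(make_flip_idx(["Left_ear", "Right_ear", "Nose", "Tail_base"]))  # (1, 0, 2, 3)
```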

simba.third_party_label_appenders.converters.labelme_to_df(labelme_dir, greyscale=False, pad=False, size=None, normalize=False, save_path=None, verbose=True)[source]

Convert a directory of labelme .json files into a pandas dataframe.

Note

The images are stored as base64-encoded byte strings under the image header of the output dataframe.

Parameters
  • labelme_dir (Union[str, os.PathLike]) – Directory with labelme json files.

  • greyscale (Optional[bool]) – If True, converts the labelme images to greyscale if in rgb format. Default: False.

  • pad (Optional[bool]) – If True, checks if all images are the same size and, if not, pads the images with a black border so all images are the same size.

  • size (Union[Literal['min', 'max'], Tuple[int, int]]) – The size of the output images. Can be the smallest (min), the largest (max), or a tuple with the width and height of the images. Automatically corrects the labels to account for the image size.

  • normalize (Optional[bool]) – If true, normalizes the images. Default: False.

  • save_path (Optional[Union[str, os.PathLike]]) – The location where to store the dataframe. If None, then returns the dataframe. Default: None.

Return type

Union[None, pd.DataFrame]

Example

>>> labelme_to_df(labelme_dir=r'C:\troubleshooting\coco_data\labels\test_2')
>>> df = labelme_to_df(labelme_dir=r'C:\troubleshooting\coco_data\labels\test_read', greyscale=False, pad=False, normalize=False, size='min')

simba.third_party_label_appenders.converters.labelme_to_dlc(labelme_dir, scorer='SN', save_dir=None)[source]

Convert labels from labelme format to DLC format.

Parameters
  • labelme_dir (Union[str, os.PathLike]) – Directory with labelme json files.

  • scorer (Optional[str]) – Name of the scorer (anticipated by DLC as header)

  • save_dir (Optional[Union[str, os.PathLike]]) – Directory where to save the DLC annotations. If None, then same directory as labelme_dir with _dlc_annotations suffix.

Returns

None

Example

>>> labelme_dir = r'D:\ts_annotations'
>>> labelme_to_dlc(labelme_dir=labelme_dir)

simba.third_party_label_appenders.converters.labelme_to_img_dir(labelme_dir, img_dir, img_format='png', verbose=True, greyscale=False)[source]

Given a directory of labelme JSON annotations, extract the images from the JSONs in b64 format and store them as images in a directory.

Parameters
  • labelme_dir – Directory containing labelme json annotations.

  • img_dir – Directory where to store the images.

  • img_format – Format in which to save the images.

Returns

None

Example

>>> labelme_to_img_dir(img_dir=r"C:\troubleshooting\coco_data\labels\train_images", labelme_dir=r'C:\troubleshooting\coco_data\labels\train_')
>>> labelme_to_img_dir(img_dir=r"D:\rat_resident_intruder\labelme", labelme_dir=r'D:\rat_resident_intruder\imgs')

simba.third_party_label_appenders.converters.labelme_to_yolo(labelme_dir, save_dir, obb=False, verbose=True)[source]

Convert LabelMe annotations in json to YOLO format and save the corresponding images and labels in txt format.

Note

For more information on the LabelMe annotation tool, see the LabelMe GitHub repository. The LabelMe JSON files have to contain an imageData key holding the image as a b64 string.

See also

To split YOLO data into train, test, and validation sets (expected by e.g., UltraLytics), see simba.third_party_label_appenders.converters.split_yolo_train_test_val().

Parameters
  • labelme_dir (Union[str, os.PathLike]) – Path to the directory containing LabelMe annotation .json files.

  • save_dir (Union[str, os.PathLike]) – Directory where the YOLO-format images and labels will be saved. Will create 'images/', 'labels/', and 'map.json' inside this directory.

  • obb (bool) – If True, saves annotations as oriented bounding boxes (8 coordinates). If False, uses standard YOLO format (x_center, y_center, width, height)

  • verbose (bool) – If True, prints progress messages during conversion.

Example

>>> LABELME_DIR = r'D:/annotations'
>>> SAVE_DIR = r"D:/yolo_data"
>>> labelme_to_yolo(labelme_dir=LABELME_DIR, save_dir=SAVE_DIR)
simba.third_party_label_appenders.converters.merge_coco_keypoints_files(data_dir, save_path)[source]

Merges multiple COCO-format keypoint annotation JSON files into a single file.

Note

Image and annotation entries are appended after adjusting their id fields to be unique.

These files can be created using https://www.cvat.ai/.

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory containing multiple COCO keypoints .json files to merge.

  • save_path (Union[str, os.PathLike]) – File path to save the merged COCO keypoints JSON.

Returns

None. Results are saved in save_path.
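
The id adjustment mentioned in the note can be sketched on in-memory dictionaries. This is an illustration under stated assumptions, not SimBA's code: the helper merge_coco is hypothetical, it assumes all files share the same categories, and it renumbers ids sequentially from 1.

```python
from typing import Dict, List


def merge_coco(files: List[Dict]) -> Dict:
    """Concatenate COCO dicts, renumbering image and annotation ids."""
    merged = {"categories": files[0]["categories"], "images": [], "annotations": []}
    next_img_id, next_ann_id = 1, 1
    for coco in files:
        id_map = {}  # old image id -> new image id, per input file
        for img in coco["images"]:
            id_map[img["id"]] = next_img_id
            merged["images"].append({**img, "id": next_img_id})
            next_img_id += 1
        for ann in coco["annotations"]:
            merged["annotations"].append(
                {**ann, "id": next_ann_id, "image_id": id_map[ann["image_id"]]})
            next_ann_id += 1
    return merged


cats = [{"id": 1, "name": "mouse"}]
a = {"categories": cats, "images": [{"id": 1, "file_name": "a.png"}],
     "annotations": [{"id": 1, "image_id": 1}]}
b = {"categories": cats, "images": [{"id": 1, "file_name": "b.png"}],
     "annotations": [{"id": 1, "image_id": 1}]}
merged = merge_coco([a, b])
```

Both inputs reuse id 1, so the second file's image and annotation are reassigned to id 2 while the annotation keeps pointing at its own image.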

simba.third_party_label_appenders.converters.scale_pose_img_sizes(pose_data, imgs, size, interpolation=2)[source]

Resizes images and scales corresponding pose-estimation data to match the new image sizes.

Parameters
  • pose_data – 3D MxNxR array of pose-estimation data, where M is the number of images, N the number of body-parts in each frame, and R represents the x,y coordinates of the body-parts.

  • imgs – Iterable of images with the same length as the M dimension of pose_data. Can be byte string representations of images, or images as arrays.

  • size – The target size for the resizing operation. It can be:

      - ‘min’: Resize all images to the smallest height and width found among the input images.
      - ‘max’: Resize all images to the largest height and width found among the input images.

  • interpolation – Interpolation method to use for resizing. This can be one of OpenCV’s interpolation methods.

Returns

The converted pose_data and converted images to align with the new size.

Return type

Tuple[np.ndarray, Iterable[Union[np.ndarray, str]]]

Example

>>> df = labelme_to_df(labelme_dir=r'C:\troubleshooting\coco_data\labels\test_read', greyscale=False, pad=False, normalize=False)
>>> imgs = list(df['image'])
>>> pose_data = df.drop(['image', 'image_name'], axis=1)
>>> pose_data_arr = pose_data.values.reshape(len(pose_data), int(len(pose_data.columns) / 2), 2).astype(np.float32)
>>> new_pose, new_imgs = scale_pose_img_sizes(pose_data=pose_data_arr, imgs=imgs, size=(700, 3000))
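
The coordinate scaling that accompanies an image resize can be sketched without any image library. This is an illustration only: the helper scale_points is hypothetical, and sizes are assumed to be given as (width, height), which the documentation above does not state.

```python
from typing import List, Tuple


def scale_points(points: List[Tuple[float, float]],
                 old_size: Tuple[int, int],
                 new_size: Tuple[int, int]) -> List[Tuple[float, float]]:
    """Scale (x, y) points from an old image size to a new image size."""
    old_w, old_h = old_size
    new_w, new_h = new_size
    sx, sy = new_w / old_w, new_h / old_h  # per-axis scale factors
    return [(x * sx, y * sy) for x, y in points]


scaled = scale_points([(100.0, 50.0), (200.0, 100.0)],
                      old_size=(400, 200), new_size=(800, 100))
```
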
simba.third_party_label_appenders.converters.simba_rois_to_yolo(config_path=None, roi_path=None, video_dir=None, save_dir=None, roi_frm_cnt=10, train_size=0.7, obb=False, greyscale=True, verbose=False)[source]

Converts SimBA ROI definitions into annotations and images for training a YOLO network.

Parameters
  • config_path (Optional[Union[str, os.PathLike]]) – Optional path to the project config file in a SimBA project.

  • roi_path (Optional[Union[str, os.PathLike]]) – Path to the SimBA ROI definitions .h5 file. If None, then the roi_coordinates_path of the project.

  • video_dir (Optional[Union[str, os.PathLike]]) – Directory where to find the videos. If None, then the videos folder of the project.

  • save_dir (Optional[Union[str, os.PathLike]]) – Directory where to save the labels and images. If None, then the logs folder of the project.

  • roi_frm_cnt (Optional[int]) – Number of frames for each video to create bounding boxes for.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • obb (Optional[bool]) – If True, creates oriented YOLO bounding boxes. Else, axis-aligned YOLO bounding boxes. Default: False.

  • greyscale (Optional[bool]) – If True, converts the images to greyscale if RGB. Default: True.

  • verbose (Optional[bool]) – If True, prints progress. Default: False.

Returns

None

Example I

>>> simba_rois_to_yolo(config_path=r"C:\troubleshooting\RAT_NOR\project_folder\project_config.ini")

Example II

>>> simba_rois_to_yolo(config_path=r"C:\troubleshooting\RAT_NOR\project_folder\project_config.ini", save_dir=r"C:\troubleshooting\RAT_NOR\project_folder\logs\yolo", video_dir=r"C:\troubleshooting\RAT_NOR\project_folder\videos", roi_path=r"C:\troubleshooting\RAT_NOR\project_folder\logs\measures\ROI_definitions.h5")

Example III

>>> simba_rois_to_yolo(video_dir=r"C:\troubleshooting\RAT_NOR\project_folder\videos", roi_path=r"C:\troubleshooting\RAT_NOR\project_folder\logs\measures\ROI_definitions.h5", save_dir=r'C:\troubleshooting\RAT_NOR\project_folder\yolo', verbose=True, roi_frm_cnt=20, obb=True)

simba.third_party_label_appenders.converters.simba_to_yolo_keypoints(config_path, save_dir, data_dir=None, train_size=0.7, verbose=False, greyscale=False, padding=0.0, flip_idx=(1, 0, 2, 4, 3, 5, 6, 7, 8, 9), map_dict={0: 'mouse'}, sample_size=None, bp_id_idx=None)[source]

Convert pose estimation data from a SimBA project into the YOLO keypoint format, including frame sampling, image-label pair creation, bounding box computation, and train/validation splitting.

Parameters
  • config_path (Union[str, os.PathLike]) – Path to the SimBA project .ini configuration file.

  • save_dir (Union[str, os.PathLike]) – Directory where YOLO-formatted data will be saved. Subdirectories for images/labels (train/val) are created.

  • data_dir (Optional[Union[str, os.PathLike]]) – Optional directory containing outlier-corrected SimBA pose estimation data. If None, uses path from config.

  • train_size (float) – Proportion of samples to allocate to the training set (range 0.1–0.99). Remaining samples go to validation.

  • verbose (bool) – If True, prints progress updates to the console.

  • greyscale (bool) – If True, saves extracted video frames in greyscale. Otherwise, saves in color.

  • padding (float) – Padding added around the bounding box (as a proportion of image dimensions, range 0.0–1.0). Useful if animal body-parts are in a “line”.

  • flip_idx (Tuple[int, ...]) – Tuple defining symmetric keypoint indices for horizontal flipping. Used to write the map.yaml file.

  • map_dict (Dict[int, str]) – Dictionary mapping instance IDs to class names. Used in annotation labels and map.yaml.

  • sample_size (Optional[int]) – If specified, limits the number of randomly sampled frames per video. If None, all frames are used.

  • bp_id_idx (Optional[Dict[int, Union[Tuple[int], List[int]]]]) – Optional mapping of instance IDs to keypoint index groups, allowing support for multiple animals per frame. Must match keys in map_dict.

Returns

None. Saves YOLO-formatted images and annotations to disk in the save_dir location.

Example

>>> simba_to_yolo_keypoints(config_path=r"C:\troubleshooting\mitra\project_folder\project_config.ini", save_dir=r'C:\troubleshooting\mitra\yolo', sample_size=150, verbose=True)
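
For multi-animal data, the relationship between bp_id_idx and map_dict can be sketched as below. The helper group_keypoints and the two-animal layout are hypothetical and only illustrate how keys in bp_id_idx are expected to match keys in map_dict.

```python
from typing import Dict, List, Sequence, Tuple


def group_keypoints(frame_kps: List[Tuple[float, float]],
                    bp_id_idx: Dict[int, Sequence[int]],
                    map_dict: Dict[int, str]) -> Dict[str, List[Tuple[float, float]]]:
    """Split one frame's keypoints into per-instance groups keyed by class name."""
    grouped = {}
    for instance_id, kp_indices in bp_id_idx.items():
        # Every instance id in bp_id_idx must also exist in map_dict.
        grouped[map_dict[instance_id]] = [frame_kps[i] for i in kp_indices]
    return grouped


frame = [(10, 10), (12, 14), (50, 60), (55, 66)]  # four keypoints in one frame
grouped = group_keypoints(frame,
                          bp_id_idx={0: (0, 1), 1: (2, 3)},
                          map_dict={0: "resident", 1: "intruder"})
```
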
simba.third_party_label_appenders.converters.sleap_to_yolo_keypoints(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, instance_threshold=0, train_size=0.7, flip_idx=None, names=None, greyscale=False, padding=0.0)[source]

Convert SLEAP pose estimation CSV data and corresponding videos into YOLO keypoint dataset format.

Note

This converts SLEAP inference data to YOLO keypoints (not SLEAP annotations).

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory path containing SLEAP-generated CSV files with inferred keypoints.

  • video_dir (Union[str, os.PathLike]) – Directory path containing corresponding videos from which frames are to be extracted.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.

  • frms_cnt (Optional[int]) – Number of frames to randomly sample from each video for conversion. If None, all frames are used.

  • instance_threshold (float) – Minimum confidence score threshold to filter out low-confidence pose instances. Only instances with instance.score >= this threshold are used.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: True.

  • flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping.

  • map_dict (Dict[str, int]) – Dictionary mapping class indices to class names. Used for creating the YAML class names mapping file.

  • padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions), enlarging each box by this fraction. Useful, e.g., when all body-parts lie along the length of the animal. Default: 0.0.

Returns

None. Results saved in save_dir.

Example

>>> sleap_to_yolo_keypoints(data_dir=r'D:\ares\data\ant\sleap_csv', video_dir=r'D:\ares\data\ant\sleap_video', frms_cnt=550, train_size=0.8, instance_threshold=0.9, save_dir=r"D:\ares\data\ant\yolo")
simba.third_party_label_appenders.converters.split_yolo_train_test_val(data_dir, save_dir, split=(0.7, 0.2, 0.1), verbose=False)[source]

Split a directory of yolo labels and associated images into training, testing, and validation batches and create a mapping file for downstream model training.

Parameters
Returns

None
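
A three-way split of this kind can be sketched with the standard library. This is an illustration, not SimBA's implementation: the helper split_three_way is hypothetical, the split tuple is assumed to be ordered (train, test, val), and the seed argument is added here only to make the example reproducible.

```python
import random
from typing import List, Sequence, Tuple


def split_three_way(items: Sequence[str],
                    split: Tuple[float, float, float] = (0.7, 0.2, 0.1),
                    seed: int = 0) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle items and cut them into train, test, and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * split[0])
    n_test = round(len(shuffled) * split[1])
    # Whatever remains after train and test goes to validation.
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])


images = [f"img_{i}.png" for i in range(10)]
train, test, val = split_three_way(images)
```

In practice each image would be moved together with its label file so that the pairs stay aligned across the three subsets.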

simba.third_party_label_appenders.converters.yolo_obb_data_to_bounding_box(center_x, center_y, width, height, angle)[source]

Converts the YOLO-oriented bounding box data to a set of bounding box corner points.

Given the center coordinates, width, height, and rotation angle of an oriented bounding box, this function computes the coordinates of the four corner points of the bounding box, with rotation applied about the center.

Parameters
  • center_x (float) – The x-coordinate of the bounding box center.

  • center_y (float) – The y-coordinate of the bounding box center.

  • width (float) – The width of the bounding box.

  • height (float) – The height of the bounding box.

  • angle (float) – The rotation angle of the bounding box in degrees, measured counterclockwise.

Returns

An array of shape (4, 2) containing the (x, y) coordinates of the four corners of the bounding box in the following order: top-left, top-right, bottom-right, and bottom-left.

Return type

np.ndarray
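
The corner computation can be sketched with plain trigonometry. This is a minimal sketch, not the library function: obb_to_corners is a hypothetical helper that returns a list of tuples rather than an array, and it applies the standard 2D rotation matrix. Whether a positive angle appears counterclockwise on screen depends on the direction of the y-axis, so the sign convention here is an assumption.

```python
import math
from typing import List, Tuple


def obb_to_corners(center_x: float, center_y: float, width: float,
                   height: float, angle: float) -> List[Tuple[float, float]]:
    """Rotate the four corner offsets about the center; angle in degrees."""
    theta = math.radians(angle)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2, height / 2
    # Unrotated order: top-left, top-right, bottom-right, bottom-left.
    offsets = [(-half_w, -half_h), (half_w, -half_h), (half_w, half_h), (-half_w, half_h)]
    return [(center_x + dx * cos_t - dy * sin_t,
             center_y + dx * sin_t + dy * cos_t) for dx, dy in offsets]


corners = obb_to_corners(10, 10, 4, 2, 0)
rotated = obb_to_corners(10, 10, 4, 2, 90)
```

With an angle of 0 the result is the axis-aligned box; any other angle moves each corner along a circle around the center.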

COCO key-points -> YOLO pose-estimation format conversion

class simba.third_party_label_appenders.transform.coco_keypoints_to_yolo.COCOKeypoints2Yolo(coco_path, img_dir, save_dir, train_size=0.7, flip_idx=(0, 2, 1, 5, 4, 3, 6), verbose=True, greyscale=False, clahe=False, bbox_pad=None)[source]

Bases: object

Convert COCO Keypoints version 1.0 data format into a YOLO keypoints training set.

Processes COCO format keypoint annotations and converts them to YOLO keypoint format, splitting the data into training and validation sets. Images are copied to the output directory and annotations are converted to YOLO format text files. A YAML configuration file is automatically generated.

Note

COCO keypoint files can be created using https://www.cvat.ai/.

This function expects the path to a single COCO Keypoints version 1.0 file. To merge several before passing the file to this function, use simba.third_party_label_appenders.transform.utils.merge_coco_keypoints_files().

Important

All image file names have to be unique.

See also

To convert COCO Keypoints version 1.0 data format into a YOLO bounding box training set, use simba.third_party_label_appenders.transform.coco_keypoints_to_yolo_bbox.COCOKeypoints2YoloBbox(). To train YOLO pose models with the converted data, see simba.model.yolo_fit.FitYolo() and YOLO Pose Estimation Training Documentation. To run inference with trained YOLO pose models, see simba.model.yolo_pose_inference.YOLOPoseInference() or simba.model.yolo_pose_track_inference.YOLOPoseTrackInference() and YOLO Pose Estimation Inference Documentation.

Parameters
  • coco_path (Union[str, os.PathLike]) – Path to COCO keypoints 1.0 file in JSON format. Must contain ‘categories’, ‘images’, and ‘annotations’ keys.

  • img_dir (Union[str, os.PathLike]) – Directory holding image files representing the annotated entries in the coco_path. Will search recursively, so it’s OK to have images in subdirectories.

  • save_dir (Union[str, os.PathLike]) – Directory where to save the YOLO formatted data. Will create ‘images/train’, ‘images/val’, ‘labels/train’, ‘labels/val’ subdirectories.

  • train_size (float) – Size of the training set as a fraction between 0.1 and 0.99. Remaining data becomes validation set. Default: 0.7 (70% training, 30% validation).

  • flip_idx (Tuple[int, ...]) – Tuple of integers representing the re-ordering of body-part indices when the image is horizontally flipped 180 degrees. Must match the number of keypoints. Default: (0, 2, 1, 5, 4, 3, 6).

  • verbose (bool) – If True (default), prints progress messages. If False, suppresses output.

  • greyscale (bool) – If True, converts images to greyscale before saving. If False (default), keeps original color format.

  • clahe (bool) – If True, applies CLAHE (Contrast Limited Adaptive Histogram Equalization) enhancement to images before saving. If False (default), no enhancement is applied.

  • bbox_pad (Optional[float]) – Optional padding factor for bounding boxes (between 10e-6 and 1.0). If provided, bounding boxes are expanded by this percentage to better encompass all body-parts. If None (default), no padding is applied.

Returns

None. YOLO formatted data is saved to save_dir with structure: images/train, images/val, labels/train, labels/val, and map.yaml.

Example

>>> runner = COCOKeypoints2Yolo(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/s1/annotations/s1.json", img_dir=r"D:/cvat_annotations/frames/simon", save_dir=r"D:/cvat_annotations/frames/yolo_keypoints", clahe=True)
>>> runner.run()
Example II

>>> runner = COCOKeypoints2Yolo(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/merged.json", img_dir=r"D:/cvat_annotations/frames", save_dir=r"D:/cvat_annotations/frames/yolo", clahe=False)
>>> runner.run()
Example III

>>> runner = COCOKeypoints2Yolo(coco_path=r"E:/netholabs_videos/mosaics/subset/to_annotate/2d_mosaic_batch_1.json", img_dir=r"E:/netholabs_videos/mosaics/subset/to_annotate", save_dir=r"E:/netholabs_videos/mosaics/yolo_mdl", clahe=False, bbox_pad=0.1)
>>> runner.run()
References

1

Helpful YouTube tutorial by Farhan to get YOLO tracking data in animals - https://www.youtube.com/watch?v=CcGbgFPwQTc.

2

Great YouTube tutorial by Felipe on annotating data and making data YOLO ready - https://www.youtube.com/watch?v=m9fH9OWn8YM.
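
The role of flip_idx during horizontal-flip augmentation can be sketched as follows. This illustrates the general idea rather than the code run by SimBA or the training framework; the helper flip_keypoints and the three-keypoint layout (nose, left ear, right ear) are hypothetical.

```python
from typing import List, Sequence, Tuple


def flip_keypoints(keypoints: List[Tuple[float, float]],
                   flip_idx: Sequence[int],
                   img_w: float) -> List[Tuple[float, float]]:
    """Mirror keypoints horizontally, then reorder them with flip_idx."""
    mirrored = [(img_w - x, y) for x, y in keypoints]
    # After mirroring, the point that was the right ear sits where a left
    # ear is expected, so left/right indices swap while midline points stay.
    return [mirrored[i] for i in flip_idx]


kps = [(10, 5), (20, 5), (80, 5)]  # nose, left ear, right ear
flipped = flip_keypoints(kps, flip_idx=(0, 2, 1), img_w=100)
```

A flip_idx whose length differs from the number of keypoints would fail at the reordering step, which is why the tuple must match the number of keypoints.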

COCO key-points -> YOLO bounding box conversion

class simba.third_party_label_appenders.transform.coco_keypoints_to_yolo_bbox.COCOKeypoints2YoloBbox(coco_path, img_dir, save_dir, train_size=0.7, verbose=True, greyscale=False, clahe=False, bbox_pad=None, obb=False)[source]

Bases: object

Convert COCO Keypoints version 1.0 data format into a YOLO bounding box training set.

Note

COCO keypoint files can be created using https://www.cvat.ai/.

This function expects the path to a single COCO Keypoints version 1.0 file. To merge several before passing the file to this function, use simba.third_party_label_appenders.transform.utils.merge_coco_keypoints_files().

Important

All image file names have to be unique.

See also

To convert COCO Keypoints version 1.0 data format into a YOLO keypoint training set, use simba.third_party_label_appenders.transform.coco_keypoints_to_yolo.COCOKeypoints2Yolo().

Parameters
  • coco_path (Union[str, os.PathLike]) – Path to COCO keypoints 1.0 file in JSON format.

  • img_dir (Union[str, os.PathLike]) – Directory holding image files representing the annotated entries in the coco_path. Will search recursively, so it’s OK to have images in subdirectories.

  • save_dir (Union[str, os.PathLike]) – Directory where to save the YOLO formatted data.

  • train_size (float) – Fraction of the data assigned to the training set, as a value between 0 and 1.0. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: True.

  • flip_idx (Tuple[int, ...]) – Tuple of ints, representing the flip of body-part coordinates when the animal image flips 180 degrees.

Returns

None

Example

>>> runner = COCOKeypoints2YoloBbox(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/s1/annotations/s1.json", img_dir=r"D:/cvat_annotations/frames/simon", save_dir=r"D:/cvat_annotations/frames/yolo_keypoints", clahe=True)
>>> runner.run()
Example II

>>> runner = COCOKeypoints2YoloBbox(coco_path=r"D:/cvat_annotations/frames/coco_keypoints_1/merged.json", img_dir=r"D:/cvat_annotations/frames", save_dir=r"D:/cvat_annotations/frames/yolo", clahe=False)
>>> runner.run()
References

1

Helpful YouTube tutorial by Farhan to get YOLO tracking data in animals - https://www.youtube.com/watch?v=CcGbgFPwQTc.

2

Great YouTube tutorial by Felipe on annotating data and making data YOLO ready - https://www.youtube.com/watch?v=m9fH9OWn8YM.
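
Deriving a bounding box from keypoints can be sketched as below. This is a sketch under stated assumptions, not the class's actual logic: keypoints_to_yolo_bbox is a hypothetical helper, and the padding is assumed to be a fraction of the box width and height applied on each side, clamped to the image.

```python
from typing import List, Tuple


def keypoints_to_yolo_bbox(keypoints: List[Tuple[float, float]], img_w: int,
                           img_h: int, pad: float = 0.0) -> Tuple[float, float, float, float]:
    """Return a normalized (x_center, y_center, width, height) box."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    w, h = x_max - x_min, y_max - y_min
    # Expand each side by a fraction of the box size, staying inside the image.
    x_min, x_max = max(0.0, x_min - pad * w), min(float(img_w), x_max + pad * w)
    y_min, y_max = max(0.0, y_min - pad * h), min(float(img_h), y_max + pad * h)
    return ((x_min + x_max) / 2 / img_w, (y_min + y_max) / 2 / img_h,
            (x_max - x_min) / img_w, (y_max - y_min) / img_h)


bbox = keypoints_to_yolo_bbox([(100, 100), (200, 100), (150, 180)],
                              img_w=400, img_h=400, pad=0.1)
```
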

COCO key-points -> YOLO segmentation conversion

class simba.third_party_label_appenders.transform.coco_keypoints_to_yolo_seg.COCOKeypoints2YoloSeg(coco_path, img_dir, save_dir, train_size=0.7, verbose=True, greyscale=False, clahe=False, bbox_pad=None)[source]

Bases: object

SAM3 -> YOLO segmentation project

class simba.third_party_label_appenders.transform.sam3_to_yolo_seg.SAM3ToYoloSeg(video_dir, sam_path, save_dir, txt_prompt='mouse', n_frames=50, names=('animal',), train_val_split=0.7, conf=0.5, sam_imgsz=644, greyscale=False, clahe=False, vertice_cnt=40, seed=None, visualize=False, io_timeout=30.0, verbose=True)[source]

Bases: object

Sample N random frames from each video in a directory, run SAM3 with a text prompt, and write the resulting masks as a YOLO segmentation project.

Note

To fit a YOLO segmentation model, see FitYolo. For YOLO segmentation inference, see YOLOSegmentationInference.

See also

  • MergeYoloProjects — merge several map.yaml projects (same classes and task) into one dataset.

Parameters
  • video_dir (Union[str, os.PathLike]) – Directory containing input videos.

  • sam_path (Union[str, os.PathLike]) – Path to SAM3 model weights (e.g. sam3.pt).

  • save_dir (Union[str, os.PathLike]) – Root output directory for the YOLO project.

  • txt_prompt (str) – Text prompt for SAM3 (e.g. “mouse”, “mouse tail”).

  • n_frames (int) – Number of random frames to sample from each video.

  • names (Tuple[str, ...]) – Class names in index order. Default ('animal',).

  • train_val_split (float) – Fraction allocated to training (0.1-0.9). Default 0.7.

  • conf (float) – SAM3 confidence threshold. Default 0.5.

  • sam_imgsz (int) – Image size for SAM3 inference. Default 644.

  • greyscale (bool) – If True, save extracted frames in greyscale. Default False.

  • clahe (Optional[Union[Tuple[int, int, int], bool]]) – If True, applies CLAHE with default params. If tuple of (clip_limit, tile_x, tile_y), applies CLAHE with those params. Default False.

  • vertice_cnt (Optional[int]) – If not None, resample each mask polygon to this many vertices. Default 40.

  • seed (Optional[int]) – Random seed for reproducible frame sampling.

  • visualize (bool) – If True, saves annotated images with segmentation polygon overlays to a visualizations subfolder inside save_dir. Useful for verifying SAM3 annotation quality. Default False.

  • io_timeout (float) – Seconds to keep retrying file I/O (read/write) when the operation fails (e.g. temporary drive disconnect). Default 30.0.

  • verbose (bool) – If True, print progress updates. Default True.

Example

>>> runner = SAM3ToYoloSeg(video_dir=r'/path/to/videos', sam_path=r'/path/to/sam3.pt', save_dir=r'/path/to/yolo_project', txt_prompt='mouse', n_frames=50)
>>> runner.run()
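
A YOLO segmentation label stores one object per line as a class index followed by normalized polygon coordinates. The sketch below formats one such line; the helper yolo_seg_line and the six-decimal formatting are illustrative choices, not taken from SimBA.

```python
from typing import List, Tuple


def yolo_seg_line(class_idx: int, polygon: List[Tuple[float, float]],
                  img_w: int, img_h: int) -> str:
    """Format one polygon as 'class x1 y1 x2 y2 ...' with values in 0-1."""
    fields = [str(class_idx)]
    for x, y in polygon:
        fields.append(f"{x / img_w:.6f}")
        fields.append(f"{y / img_h:.6f}")
    return " ".join(fields)


line = yolo_seg_line(0, [(64, 48), (320, 48), (320, 240)], img_w=640, img_h=480)
```

Resampling every polygon to a fixed vertex count, as vertice_cnt does, keeps the lines of a label file at a uniform length.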

SAM3 -> YOLO bounding-box (detection) project

Merge multiple YOLO projects

class simba.third_party_label_appenders.transform.merge_yolo_projects.MergeYoloProjects(yaml_paths, save_dir, train_val_split=None, seed=None, verbose=True)[source]

Bases: object

Merge multiple YOLO projects into a single YOLO project.

Reads each project’s YAML, validates that all projects share the same task type (bounding-box detection, segmentation, or keypoint pose) and class names, then copies all images and labels into a single output project with train/val splits.

Parameters
  • yaml_paths (List[Union[str, os.PathLike]]) – List of paths to YOLO project YAML files.

  • save_dir (Union[str, os.PathLike]) – Root output directory for the merged project.

  • train_val_split (Optional[float]) – If provided, reshuffle all samples and split at this ratio (0.1-0.9). If None, preserve each project’s existing train/val assignments. Default None.

  • seed (Optional[int]) – Random seed for reproducible splitting. Only used when train_val_split is not None.

  • verbose (bool) – If True, print progress. Default True.

Example

>>> merger = MergeYoloProjects(yaml_paths=[r'/project_a/map.yaml', r'/project_b/map.yaml'], save_dir=r'/merged_project', train_val_split=0.8)
>>> merger.run()
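
The compatibility check performed before merging can be sketched on already-parsed project settings. This is an illustration only: check_mergeable is a hypothetical helper, and the "task" key is an assumption made for the example rather than a field known to exist in the YAML files.

```python
from typing import Dict, List


def check_mergeable(projects: List[Dict]) -> None:
    """Raise ValueError unless all projects share class names and task type."""
    names = {tuple(p["names"]) for p in projects}
    tasks = {p["task"] for p in projects}  # hypothetical key, see lead-in
    if len(names) > 1:
        raise ValueError(f"Class names differ across projects: {sorted(names)}")
    if len(tasks) > 1:
        raise ValueError(f"Task types differ across projects: {sorted(tasks)}")


ok = [{"names": ["mouse"], "task": "pose"}, {"names": ["mouse"], "task": "pose"}]
bad = [{"names": ["mouse"], "task": "pose"}, {"names": ["rat"], "task": "pose"}]

check_mergeable(ok)  # passes silently
try:
    check_mergeable(bad)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```
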

Multi-animal DeepLabCut predictions -> YOLO pose-estimation annotations format conversion

class simba.third_party_label_appenders.transform.dlc_ma_h5_to_yolo.MADLCH52Yolo(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, threshold=0, train_size=0.7, flip_idx=None, greyscale=False, clahe=False, padding=0.0, single_id=None)[source]

Bases: object

Convert multi-animal DeepLabCut pose estimation H5 data and corresponding videos into YOLO keypoint dataset format.

Note

This converts DeepLabCut inference data to YOLO keypoints (not DeepLabCut annotations).

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory path containing DLC-generated H5 files with inferred keypoints.

  • video_dir (Union[str, os.PathLike]) – Directory path containing corresponding videos from which frames are to be extracted.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.

  • frms_cnt (Optional[int]) – Number of frames to randomly sample from each video for conversion. If None, all frames are used.

  • threshold (float) – Minimum confidence score threshold to filter out low-confidence pose instances. Only instances with instance.score >= this threshold are used.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: True.

  • flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping. If None, it will be inferred.

  • padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions), enlarging each box by this fraction. Useful, e.g., when all body-parts lie along the length of the animal. Default: 0.0.

  • single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, in which case the YOLO data is formatted for the number of objects that the H5 data contains.

Returns

None. Results saved in save_dir.

Example

>>> DATA_DIR = r'D:\troubleshooting\dlc_h5_multianimal_to_yolo\data'
>>> VIDEO_DIR = r'D:\troubleshooting\dlc_h5_multianimal_to_yolo\videos'
>>> SAVE_DIR = r"D:\imgs\madlc"
>>> runner = MADLCH52Yolo(data_dir=DATA_DIR, video_dir=VIDEO_DIR, save_dir=SAVE_DIR, clahe=True, single_id='animal_1')
>>> runner.run()

DeepLabCut predictions -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.dlc_to_yolo.DLC2Yolo(dlc_dir, save_dir, train_size=0.7, verbose=False, padding=0.15, flip_idx=None, names=('mouse',), greyscale=False, clahe=False)[source]

Bases: object

Converts DLC annotations into YOLO keypoint format for model training.

Important

Use for single animal DLC data. For multi-animal DLC data,

Note

dlc_dir can be a directory with subdirectories containing images and CSV files with the CollectedData substring filename. For creating the flip_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_flip_idx(). For creating the bp_id_idx, see simba.third_party_label_appenders.converters.get_yolo_keypoint_bp_id_idx()

Parameters
  • dlc_dir (Union[str, os.PathLike]) – Directory path containing DLC-generated CSV files with keypoint annotations and images.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: False.

  • padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions), enlarging each box by this fraction. Useful, e.g., when all body-parts lie along the length of the animal. Default: 0.15.

  • flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping.

  • names (Tuple[str]) – Tuple of animal (class) names. Used for creating the YAML class names mapping file.

Returns

None. Results saved in save_dir.

Example

>>> DLC_DIR = r'D:/rat_resident_intruder/dlc_data'
>>> SAVE_DIR = r'D:/rat_resident_intruder/yolo_3'
>>> runner = DLC2Yolo(dlc_dir=DLC_DIR, save_dir=SAVE_DIR, verbose=True, clahe=True, names=('resident', 'intruder'))
>>> runner.run()
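
A YOLO keypoint label stores one instance per line: the class index, the normalized bounding box, then x, y, and a visibility flag for each keypoint. The sketch below formats one such line; the helper yolo_pose_line is hypothetical, and the visibility value 2 (labeled and visible) follows the common YOLO pose convention rather than anything stated in this documentation.

```python
from typing import List, Tuple


def yolo_pose_line(class_idx: int, bbox: Tuple[float, float, float, float],
                   keypoints: List[Tuple[float, float, int]],
                   img_w: int, img_h: int) -> str:
    """Format 'class cx cy w h x1 y1 v1 x2 y2 v2 ...' for one instance."""
    # bbox is already normalized (x_center, y_center, width, height).
    fields = [str(class_idx)] + [f"{v:.6f}" for v in bbox]
    for x, y, visible in keypoints:
        # Keypoints arrive in pixels and are normalized here.
        fields += [f"{x / img_w:.6f}", f"{y / img_h:.6f}", str(visible)]
    return " ".join(fields)


line = yolo_pose_line(0, (0.5, 0.5, 0.25, 0.25),
                      [(160, 120, 2), (480, 360, 2)], img_w=640, img_h=480)
```
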

Lightning Pose annotations -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.litpose_to_yolo_keypoints.LitPose2YOLO(litpose_dir, save_dir, train_size=0.7, verbose=False, padding=0.0, sample_n=None, flip_idx=None, names=('mouse',), greyscale=False, clahe=False)[source]

Bases: object

Convert LitPose keypoint annotations into a YOLO keypoint dataset.

Parameters
  • litpose_dir (Union[str, os.PathLike]) – Path to LitPose directory containing annotation CSV files and the labeled-data image folder.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images and labels subdirectories are created.

  • train_size (float) – Fraction of samples assigned to the training split. Default 0.7.

  • verbose (bool) – If True, print per-image progress during conversion.

  • padding (float) – Extra padding factor used when computing normalized YOLO boxes from keypoints.

  • sample_n (Optional[int]) – Optional cap on the number of sampled frames before split. If None, all frames are used.

  • flip_idx (Optional[Tuple[int, ...]]) – Optional keypoint flip index order for YOLO pose augmentation. If None, inferred from body-part names.

  • names (Tuple[str, ...]) – Class names in YOLO index order.

  • greyscale (bool) – If True, load and save images in grayscale.

  • clahe (bool) – If True, apply CLAHE preprocessing when reading images.

References

1

Lightning Pose documentation: https://lightning-pose.readthedocs.io/en/latest/

2

Biderman et al., Lightning Pose: improved animal pose estimation via semi-supervised learning, Bayesian ensembling and cloud-native open-source tools, Nature Methods (2024), doi: https://doi.org/10.1038/s41592-024-02319-1

Example

>>> runner = LitPose2YOLO(litpose_dir=r'Z:\home\simon\lp_300126', save_dir=r'E:\litpose_yolobox', verbose=True, clahe=False, greyscale=False, sample_n=1000, padding=0.15)
>>> runner.run()
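
One way a flip_idx could be inferred from body-part names is to pair names that differ only in "left" versus "right". The heuristic below is a hypothetical sketch and may differ from what LitPose2YOLO actually does when flip_idx is None.

```python
from typing import Sequence, Tuple


def infer_flip_idx(bp_names: Sequence[str]) -> Tuple[int, ...]:
    """Pair left/right body-parts by name; unpaired parts map to themselves."""
    lookup = {name.lower(): i for i, name in enumerate(bp_names)}
    flip = []
    for i, name in enumerate(bp_names):
        low = name.lower()
        if "left" in low:
            partner = low.replace("left", "right")
        elif "right" in low:
            partner = low.replace("right", "left")
        else:
            partner = low
        # Fall back to the part's own index if no partner name exists.
        flip.append(lookup.get(partner, i))
    return tuple(flip)


flip_idx = infer_flip_idx(["nose", "left_ear", "right_ear", "tail_base"])
```

Naming schemes that mark sides differently (for example "L" and "R" suffixes) would need their own rule, so passing flip_idx explicitly is the safer choice when names are unusual.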

Lightning Pose annotations -> YOLO bounding box annotations

class simba.third_party_label_appenders.transform.litpose_to_yolo_bbox.LitPose2YOLOBbox(litpose_dir, save_dir, train_size=0.7, verbose=False, padding=0.0, sample_n=None, names=('mouse',), greyscale=False, clahe=False)[source]

Bases: object

Convert LitPose keypoint annotations into a YOLO bounding-box dataset.

Parameters
  • litpose_dir (Union[str, os.PathLike]) – Path to LitPose directory containing annotation CSV files and the labeled-data image folder.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images and labels subdirectories are created.

  • train_size (float) – Fraction of samples assigned to the training split. Default 0.7.

  • verbose (bool) – If True, print per-image progress during conversion.

  • padding (float) – Extra fractional padding around each axis-aligned box inferred from keypoints.

  • sample_n (Optional[int]) – Optional cap on the number of sampled frames before split. If None, all frames are used.

  • names (Tuple[str, ...]) – Class names in YOLO index order.

  • greyscale (bool) – If True, load and save images in grayscale.

  • clahe (bool) – If True, apply CLAHE preprocessing when reading images.

References

1

Lightning Pose documentation: https://lightning-pose.readthedocs.io/en/latest/

2

Biderman et al., Lightning Pose: improved animal pose estimation via semi-supervised learning, Bayesian ensembling and cloud-native open-source tools, Nature Methods (2024), doi: https://doi.org/10.1038/s41592-024-02319-1

Example

>>> runner = LitPose2YOLOBbox(litpose_dir=r'Z:\home\simon\lp_300126', save_dir=r'E:\litpose_yolobox', verbose=True, clahe=False, greyscale=False, sample_n=1000, padding=0.15)
>>> runner.run()

Merge Lightning Pose projects

Crop Lightning Pose annotations

class simba.third_party_label_appenders.transform.litpose_crop_annotations.CropLPAnnotations(lp_project_dir, save_dir, crop_size=(512, 512), visualize=None, padding=None)[source]

Bases: object

Creates a new, self-contained Lightning Pose project from an existing one where every labeled image is cropped to a fixed size around the annotated animal.

When running inference on cropped frames (e.g. from an object detector), the training data should also be cropped so the model sees the same distribution at train and test time. This class produces a new LP project where every labeled frame is cropped around the animal’s keypoints, ready for training a model that will run inference on crops — without any re-labeling.

The output project is ready for training/inference: configs, calibrations, models, scripts, and project.yaml are copied, config paths are updated to the new location, and all CollectedData_*.csv keypoint coordinates are shifted to match the cropped frames. A row is only dropped when it is all-NaN in every camera view; in any other case each view is processed independently. For a view that has all-NaN keypoints but valid keypoints in some other view, the image is center-cropped and the keypoint coords stay NaN.

Parameters
  • lp_project_dir (str) – Root of the source LP project (e.g. Z:/home/simon/lp_300126).

  • save_dir (str) – Root of the new cropped LP project.

  • crop_size (Tuple[int, int]) – Output crop (width, height) in pixels (e.g. (512, 512)). Each crop is centered on the keypoint centroid per frame.

  • visualize (Optional[Union[bool, int]]) – If True, save annotated overlay images for every cropped frame to save_dir/visualizations/. If int, save that many randomly sampled overlays. None / False disables visualization.

  • padding (Optional[int]) – Minimum number of pixels between any keypoint and the crop edge. The crop window is shifted (within image bounds) so that all keypoints are at least padding pixels from each border. If the keypoint span plus 2 * padding exceeds crop_size in either dimension, a warning is printed and the padding is best-effort. None is treated as 0.

See also

CropLPAnnotationsBboxSquare

Bounding-box-based square crop that pads and resizes — matches inference-time crop behavior.
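
The fixed-size crop and the matching coordinate shift can be sketched as below. This is a simplified illustration, not the class's code: crop_window is a hypothetical helper that assumes the image is at least as large as the crop and leaves out the padding and NaN handling described above.

```python
from typing import List, Tuple


def crop_window(keypoints: List[Tuple[float, float]],
                img_size: Tuple[int, int],
                crop_size: Tuple[int, int]) -> Tuple[Tuple[int, int], List[Tuple[float, float]]]:
    """Center a crop on the keypoint centroid and shift keypoints into it."""
    img_w, img_h = img_size
    crop_w, crop_h = crop_size
    cx = sum(x for x, _ in keypoints) / len(keypoints)
    cy = sum(y for _, y in keypoints) / len(keypoints)
    left = int(round(cx - crop_w / 2))
    top = int(round(cy - crop_h / 2))
    # Slide the window back inside the image if it overlaps a border.
    left = max(0, min(left, img_w - crop_w))
    top = max(0, min(top, img_h - crop_h))
    shifted = [(x - left, y - top) for x, y in keypoints]
    return (left, top), shifted


origin, shifted = crop_window([(600, 300), (640, 340)],
                              img_size=(1280, 720), crop_size=(512, 512))
```

The returned origin is the top-left corner of the crop in the source image; subtracting it from every keypoint is the shift applied to the CollectedData coordinates.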

Crop Lightning Pose annotations (bounding box square)

class simba.third_party_label_appenders.transform.litpose_crop_annotations_bbox_square.CropLPAnnotationsBboxSquare(lp_project_dir, save_dir, crop_size=(512, 512), bbox_pad_frac=0.15, visualize=None, verbose=False)[source]

Bases: object

Creates a cropped Lightning Pose project where each labeled image is cropped to a square region around the keypoint bounding box, then resized to crop_size. This produces training images that match the inference pipeline’s crop-and-resize behavior.

Unlike CropLPAnnotations, which takes a fixed-size crop centered on the keypoint centroid, this class computes a tight bounding box around the keypoints, pads it by a fraction, extends the shorter side to make a square, and resizes to crop_size.

See also

CropLPAnnotations

Fixed-size center crop around keypoint centroid.

Parameters
  • lp_project_dir (str) – Root of the source LP project.

  • save_dir (str) – Root of the new cropped LP project.

  • crop_size (Tuple[int, int]) – Output size (width, height), e.g. (512, 512).

  • bbox_pad_frac (float) – Fraction to pad the bbox on each side (default 0.15 = 15%).

  • visualize (Optional[Union[bool, int]]) – Save annotated overlays for QC.

  • verbose (bool) – If True, print per-frame progress (frame i/N within view j/M) as each image is cropped. Default False.
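The bounding-box crop geometry (tight box, fractional padding, square extension, resize) can be sketched as below. This is a hypothetical illustration only: the function name is invented, the padding is assumed to be relative to each bounding-box side, and clipping at the image borders is left out.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def square_bbox_crop(keypoints: List[Point],
                     crop_size: Tuple[int, int] = (512, 512),
                     bbox_pad_frac: float = 0.15):
    """Return the square crop (x0, y0, side) and the keypoints mapped into the resized crop."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    # Pad the tight box on each side by a fraction of its width / height (assumption).
    pad_x = (x_max - x_min) * bbox_pad_frac
    pad_y = (y_max - y_min) * bbox_pad_frac
    x_min, x_max = x_min - pad_x, x_max + pad_x
    y_min, y_max = y_min - pad_y, y_max + pad_y
    # Extend the shorter side so the crop becomes square around the same center.
    side = max(x_max - x_min, y_max - y_min, 1.0)
    x0 = (x_min + x_max) / 2 - side / 2
    y0 = (y_min + y_max) / 2 - side / 2
    # Resizing the square to crop_size scales the keypoints by the same factors.
    sx, sy = crop_size[0] / side, crop_size[1] / side
    scaled = [((x - x0) * sx, (y - y0) * sy) for x, y in keypoints]
    return (x0, y0, side), scaled


crop, scaled = square_bbox_crop([(100.0, 100.0), (200.0, 150.0)])
print(crop, scaled)
```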

Create Lightning Pose bounding box files

simba.third_party_label_appenders.transform.utils.get_litpose_project_bboxes(project_dir, padding=0.15, visualize=None, verbose=True)[source]

Create per-view bounding box CSV files for a Lightning Pose multiview project.

Parameters
  • project_dir (Union[str, os.PathLike]) – Root of the Lightning Pose project (must contain project.yaml and CollectedData_*.csv files).

  • padding (Optional[float]) – Fractional padding applied to each side of the keypoint bounding box (default 0.15 = 15%).

  • visualize (Optional[Union[bool, int]]) – If True, save bbox overlay images for all frames. If int, save overlays for that many randomly sampled frames per view. If None/False, skip visualization. Default None.

  • verbose (Optional[bool]) – If True, print progress messages. Default True.

Example

>>> get_litpose_project_bboxes(project_dir=r'Z:\home\simon\LPProjects\mini_project_0504', padding=0.15, verbose=True, visualize=20)

Multi-animal DeepLabCut -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.dlc_multi_to_yolo.MultiDLC2Yolo(dlc_dir, save_dir, train_size=0.7, verbose=False, padding=0.0, flip_idx=None, names=('resident', 'intruder'), greyscale=False, clahe=False)[source]

Bases: object

Example

>>> DLC_DIR = r'E:\deeplabcut_projects\resident_intruder_white_black-SN-2025-09-30\labeled-data'
>>> SAVE_DIR = r'E:\yolo_resident_intruder'
>>> runner = MultiDLC2Yolo(dlc_dir=DLC_DIR, save_dir=SAVE_DIR, verbose=True, clahe=True, names=('resident', 'intruder'))
>>> runner.run()

DeepLabCut single-to-multi-animal format converter

simba.third_party_label_appenders.transform.dlc_single_to_multi_format_converter.convert_dlc_annotation_format(input_dir, output_dir)[source]

Converts DeepLabCut annotation files from the format without an individuals row to the format with an individuals row.

Takes annotation files where bodyparts include individual identifiers as suffixes (e.g., Ear_left_1, Nose_2) and converts them to the standard DeepLabCut multi-individual format with a separate individuals header row. Recursively searches for CSV files containing ‘CollectedData’ in the filename, preserves the folder structure, and copies all image files from the source folders to the output folders.

Parameters
  • input_dir (Union[str, os.PathLike]) – Path to directory containing source CSV annotation files (searches recursively).

  • output_dir (Union[str, os.PathLike]) – Path to directory where converted CSV files and images will be saved (preserves folder structure).

Returns

None

Example

>>> INPUT_DIR = r'E:\maplight_videos'
>>> OUTPUT_DIR = r'E:\maplight_videos\converted'
>>> convert_dlc_annotation_format(input_dir=INPUT_DIR, output_dir=OUTPUT_DIR)
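The header transformation can be illustrated with a small sketch. It is not the function's implementation: the helper names are hypothetical, and using the numeric suffix itself as the individual label is an assumption.

```python
from typing import Dict, List, Tuple


def split_individual_suffix(name: str) -> Tuple[str, str]:
    """Split 'Ear_left_1' into the individual suffix '1' and the body-part 'Ear_left'."""
    bodypart, _, individual = name.rpartition("_")
    if not bodypart or not individual.isdigit():
        raise ValueError(f"No numeric individual suffix in body-part name: {name}")
    return individual, bodypart


def build_multi_animal_header(scorer: str, bodyparts: List[str]) -> Dict[str, List[str]]:
    """Build the four header rows (scorer, individuals, bodyparts, coords) column by column."""
    header = {"scorer": [], "individuals": [], "bodyparts": [], "coords": []}
    for name in bodyparts:
        individual, bodypart = split_individual_suffix(name)
        for coord in ("x", "y"):
            header["scorer"].append(scorer)
            header["individuals"].append(individual)
            header["bodyparts"].append(bodypart)
            header["coords"].append(coord)
    return header


header = build_multi_animal_header("SN", ["Ear_left_1", "Nose_2"])
for row_name, values in header.items():
    print(row_name, values)
```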

DeepLabCut annotations -> Labelme annotations

class simba.third_party_label_appenders.transform.dlc_to_labelme.DLC2Labelme(dlc_dir, save_dir, labelme_version='5.3.1', flags=None, verbose=True, greyscale=False, clahe=False)[source]

Bases: object

Convert a folder of DLC annotations into labelme json format.

See also

For Labelme -> DLC annotation conversion, see simba.third_party_label_appenders.transform.labelme_to_dlc.Labelme2DLC()

Parameters
  • dlc_dir (Union[str, os.PathLike]) – Folder with DLC annotations. I.e., directory inside

  • save_dir (Union[str, os.PathLike]) – Directory where to save the labelme json files.

  • labelme_version (Optional[str]) – Version number encoded in the json files.

  • flags (Optional[Dict[Any, Any]]) – Flags included in the json files.

  • verbose (Optional[bool]) – If True, prints progress.

Returns

None

Example

>>> DLC2Labelme(dlc_dir=r"D:\TS_DLC\labeled-data\ts_annotations", save_dir=r"C:\troubleshooting\coco_data\labels\test").run()
>>> DLC2Labelme(dlc_dir=r'D:\rat_resident_intruder\dlc_data\WIN_20190816081353', save_dir=r'D:\rat_resident_intruder\labelme').run()

Labelme annotations -> DeepLabCut annotations

class simba.third_party_label_appenders.transform.labelme_to_dlc.Labelme2DLC(labelme_dir, scorer='SN', greyscale=False, clahe=False, verbose=True, save_dir=None)[source]

Bases: object

Convert labels from labelme format to DLC annotation format.

See also

For DLC -> Labelme annotation conversion, see simba.third_party_label_appenders.transform.dlc_to_labelme.DLC2Labelme()

Parameters
  • labelme_dir (Union[str, os.PathLike]) – Directory with labelme json files.

  • scorer (Optional[str]) – Name of the scorer (anticipated by DLC as header).

  • greyscale (bool) – If True, convert images to grayscale.

  • clahe (bool) – If True, apply CLAHE (Contrast Limited Adaptive Histogram Equalization).

  • verbose (bool) – If True, prints progress.

  • save_dir (Optional[Union[str, os.PathLike]]) – Directory where to save the DLC annotations. If None, then same directory as labelme_dir with _dlc_annotations suffix.

Returns

None

Example

>>> labelme_dir = r"D:\platea\ts_annotations"
>>> runner = Labelme2DLC(labelme_dir=labelme_dir)
>>> runner.run()

Labelme annotations -> DataFrame

class simba.third_party_label_appenders.transform.labelme_to_df.LabelMe2DataFrame(labelme_dir, greyscale=False, clahe=False, pad=False, size=None, normalize=False, save_path=None, verbose=True)[source]

Bases: object

Convert a directory of labelme .json files into a pandas dataframe.

Note

The images are stored as base64-encoded byte strings under the image header of the output dataframe.

Parameters
  • labelme_dir (Union[str, os.PathLike]) – Directory with labelme json files.

  • greyscale (Optional[bool]) – If True, converts the labelme images to greyscale if in rgb format. Default: False.

  • pad (Optional[bool]) – If True, checks if all images are the same size and, if not, pads the images with a black border so all images are the same size.

  • size (Union[Literal['min', 'max'], Tuple[int, int]]) – The size of the output images. Can be the smallest (min), the largest (max), or a tuple with the width and height of the images. Automatically corrects the labels to account for the image size.

  • normalize (Optional[bool]) – If True, normalizes the images. Default: False.

  • save_path (Optional[Union[str, os.PathLike]]) – The location where to store the dataframe. If None, then returns the dataframe. Default: None.

Return type

Union[None, pd.DataFrame]

Example I

>>> LABELME_DIR = r'C:\troubleshooting\coco_data\labels\test_2'
>>> runner = LabelMe2DataFrame(labelme_dir=LABELME_DIR)
>>> runner.run()

Example II

>>> LABELME_DIR = r'C:\troubleshooting\coco_data\labels\test_2'
>>> runner = LabelMe2DataFrame(labelme_dir=LABELME_DIR, greyscale=True, pad=True, normalize=True, size='min')
>>> runner.run()

Labelme annotations -> YOLO bounding box annotations

class simba.third_party_label_appenders.transform.labelme_to_yolo.LabelmeBoundingBoxes2YoloBoundingBoxes(labelme_dir, save_dir, obb=False, verbose=True, clahe=False, train_size=0.7, greyscale=False)[source]

Bases: object

Convert LabelMe annotations in json to YOLO format and save the corresponding images and labels in txt format.

Note

For more information on the LabelMe annotation tool, see the LabelMe GitHub repository. The Labelme JSON files have to contain an imageData key holding the image as a b64 string. For an expected Labelme JSON format, see THIS FILE.

See also

To split YOLO data into train, test, and validation sets (expected by e.g., UltraLytics), see simba.third_party_label_appenders.converters.split_yolo_train_test_val(). To convert Labelme points annotations to YOLO keypoint training data, see simba.third_party_label_appenders.transform.labelme_to_yolo_keypoints.LabelmeKeypoints2YoloKeypoints().

Important

Creates YOLO bounding boxes (not YOLO keypoint data!) from labelme keypoints.

Parameters
  • labelme_dir (Union[str, os.PathLike]) – Path to the directory containing LabelMe annotation .json files.

  • save_dir (Union[str, os.PathLike]) – Directory where the YOLO-format images and labels will be saved. Will create ‘images/’, ‘labels/’, and ‘map.json’ inside this directory.

  • obb (bool) – If True, saves annotations as oriented bounding boxes (8 coordinates). If False, uses standard YOLO format (x_center, y_center, width, height).

  • verbose (bool) – If True, prints progress messages during conversion.

Example

>>> LABELME_DIR = r'D:\platea\ts_annotations'
>>> SAVE_DIR = r"D:\platea\yolo"
>>> runner = LabelmeBoundingBoxes2YoloBoundingBoxes(labelme_dir=LABELME_DIR, save_dir=SAVE_DIR)
>>> runner.run()
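The standard (non-oriented) YOLO box format named above stores the box center and size normalized by the image dimensions. A minimal sketch of that conversion, using a hypothetical helper name and assuming the labelme points are given in pixel coordinates:

```python
from typing import List, Tuple


def points_to_yolo_bbox(points: List[Tuple[float, float]],
                        img_w: int,
                        img_h: int,
                        class_id: int = 0) -> Tuple[int, float, float, float, float]:
    """Return (class_id, x_center, y_center, width, height), all box values in the 0-1 range."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return class_id, x_center, y_center, width, height


label = points_to_yolo_bbox([(100, 50), (300, 250)], img_w=400, img_h=500)
# One line of a YOLO label .txt file.
print(" ".join(str(v) for v in label))
```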

Labelme points -> YOLO keypoints annotations

class simba.third_party_label_appenders.transform.labelme_to_yolo_keypoints.LabelmeKeypoints2YoloKeypoints(data_path, save_dir, greyscale=True, train_size=0.7, padding=0.0, names=('mouse',), flip_idx=None, clahe=True, verbose=True)[source]

Bases: object

Labelme points -> YOLO segmentation annotations

class simba.third_party_label_appenders.transform.labelme_to_yolo_seg.LabelmeKeypoints2YoloSeg(data_path, save_dir, greyscale=True, train_size=0.7, padding=0, names=('mouse',), clahe=True, verbose=True)[source]

Bases: object

SimBA ROIs -> YOLO bounding box annotations

class simba.third_party_label_appenders.transform.simba_roi_to_yolo.SimBAROI2Yolo(config_path=None, roi_path=None, video_dir=None, save_dir=None, roi_frm_cnt=10, train_size=0.7, obb=False, greyscale=False, clahe=False, verbose=True)[source]

Bases: object

Converts SimBA ROI definitions into annotations and images for training a YOLO network.

Parameters
  • config_path (Optional[Union[str, os.PathLike]]) – Optional path to the project config file in SimBA project.

  • roi_path (Optional[Union[str, os.PathLike]]) – Path to the SimBA roi definitions .h5 file. If None, then the roi_coordinates_path of the project.

  • video_dir (Optional[Union[str, os.PathLike]]) – Directory where to find the videos. If None, then the videos folder of the project.

  • save_dir (Optional[Union[str, os.PathLike]]) – Directory where to save the labels and images. If None, then the logs folder of the project.

  • roi_frm_cnt (Optional[int]) – Number of frames for each video to create bounding boxes for.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • obb (Optional[bool]) – If True, creates oriented YOLO bounding boxes. Else, axis-aligned YOLO bounding boxes. Default: False.

  • greyscale (Optional[bool]) – If True, converts the images to greyscale if rgb. Default: False.

  • verbose (Optional[bool]) – If True, prints progress. Default: True.

Returns

None

Example I

>>> SimBAROI2Yolo(config_path=r"C:/troubleshooting/RAT_NOR/project_folder/project_config.ini").run()
Example II

>>> SimBAROI2Yolo(config_path=r"C:/troubleshooting/RAT_NOR/project_folder/project_config.ini", save_dir=r"C:/troubleshooting/RAT_NOR/project_folder/logs/yolo", video_dir=r"C:/troubleshooting/RAT_NOR/project_folder/videos", roi_path=r"C:/troubleshooting/RAT_NOR/project_folder/logs/measures/ROI_definitions.h5").run()
Example III

>>> SimBAROI2Yolo(video_dir=r"C:/troubleshooting/RAT_NOR/project_folder/videos", roi_path=r"C:/troubleshooting/RAT_NOR/project_folder/logs/measures/ROI_definitions.h5", save_dir=r'C:/troubleshooting/RAT_NOR/project_folder/yolo', verbose=True, roi_frm_cnt=20, obb=True).run()
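The train_size parameter controls a random split of the sampled frames into training and validation sets. A minimal sketch of such a split follows; the function name, the seed argument, and the rounding rule are assumptions rather than SimBA's actual logic.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_train_val(items: Sequence[str],
                    train_size: float = 0.7,
                    seed: Optional[int] = None) -> Tuple[List[str], List[str]]:
    """Randomly assign a train_size proportion of items to training and the rest to validation."""
    if not 0.1 <= train_size <= 0.99:
        raise ValueError("train_size must be between 0.1 and 0.99")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(len(shuffled) * train_size))
    return shuffled[:n_train], shuffled[n_train:]


frames = [f"video_1_frame_{i}.png" for i in range(10)]
train, val = split_train_val(frames, train_size=0.7, seed=42)
print(len(train), len(val))
```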

SimBA pose-estimation -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.simba_to_yolo.SimBA2Yolo(config_path, save_dir, data_dir=None, train_size=0.7, verbose=False, greyscale=False, clahe=False, padding=0.0, threshold=0.0, flip_idx=None, names=('animal_1',), sample_size=None, bp_id_idx=None, single_id=None)[source]

Bases: object

Convert pose estimation data from a SimBA project into the YOLO keypoint format, including frame sampling, image-label pair creation, bounding box computation, and train/validation splitting.

Parameters
  • config_path (Union[str, os.PathLike]) – Path to the SimBA project .ini configuration file.

  • save_dir (Union[str, os.PathLike]) – Directory where YOLO-formatted data will be saved. Subdirectories for images/labels (train/val) are created.

  • data_dir (Optional[Union[str, os.PathLike]]) – Optional directory containing outlier-corrected SimBA pose estimation data. If None, uses path from config.

  • train_size (float) – Proportion of samples to allocate to the training set (range 0.1–0.99). Remaining samples go to validation.

  • verbose (bool) – If True, prints progress updates to the console.

  • greyscale (bool) – If True, saves extracted video frames in greyscale. Otherwise, saves in color.

  • padding (float) – Padding added around the bounding box (as a proportion of image dimensions, range 0.0–1.0). Useful if animal body-parts are in a “line”.

  • flip_idx (Tuple[int, ...]) – Tuple defining symmetric keypoint indices for horizontal flipping. Used to write the map.yaml file. If None, then attempt to infer.

  • names (Dict[int, str]) – Dictionary mapping instance IDs to class names. Used in annotation labels and map.yaml.

  • sample_size (Optional[int]) – If specified, limits the number of randomly sampled frames per video. If None, all frames are used.

  • bp_id_idx (Optional[Dict[int, Union[Tuple[int], List[int]]]]) – Optional mapping of instance IDs to keypoint index groups, allowing support for multiple animals per frame. Must match keys in map_dict.

  • single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, and the YOLO data will be formatted to the number of objects which the H5 data contains.

Returns

None. Saves YOLO-formatted images and annotations to disk in the save_dir location.

Example

>>> SAVE_DIR = r'D:\troubleshooting\mitra\mitra_yolo'
>>> CONFIG_PATH = r"C:\troubleshooting\mitra\project_folder\project_config.ini"
>>> runner = SimBA2Yolo(config_path=CONFIG_PATH, save_dir=SAVE_DIR, sample_size=10, verbose=True)
>>> runner.run()
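For orientation, a YOLO keypoint label line consists of a class index, a normalized bounding box, and one normalized (x, y, visibility) triplet per keypoint. The sketch below builds one such line from pixel coordinates; the function name is hypothetical, and applying padding as a proportion of the image dimensions follows the parameter description above but may differ from SimBA's exact arithmetic.

```python
from typing import List, Tuple


def yolo_pose_label(keypoints: List[Tuple[float, float]],
                    img_w: int,
                    img_h: int,
                    class_id: int = 0,
                    padding: float = 0.0) -> str:
    """Return one YOLO keypoint label line for a single animal."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    # Enlarge the keypoint bounding box by the padding and keep it inside the image.
    pad_x, pad_y = padding * img_w, padding * img_h
    x_min, x_max = max(min(xs) - pad_x, 0), min(max(xs) + pad_x, img_w)
    y_min, y_max = max(min(ys) - pad_y, 0), min(max(ys) + pad_y, img_h)
    box = [(x_min + x_max) / 2 / img_w, (y_min + y_max) / 2 / img_h,
           (x_max - x_min) / img_w, (y_max - y_min) / img_h]
    tokens = [str(class_id)] + [f"{v:.6f}" for v in box]
    for x, y in keypoints:
        # Visibility flag 2 marks a labeled, visible keypoint.
        tokens += [f"{x / img_w:.6f}", f"{y / img_h:.6f}", "2"]
    return " ".join(tokens)


print(yolo_pose_label([(100, 100), (200, 300)], img_w=400, img_h=400))
```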

SLEAP CSV predictions -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.sleap_csv_to_yolo.Sleap2Yolo(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, instance_threshold=0, train_size=0.7, flip_idx=None, names=None, greyscale=False, clahe=False, padding=0.0, single_id=None)[source]

Bases: object

Convert SLEAP pose estimation CSV data and corresponding videos into YOLO keypoint dataset format.

Note

This converts SLEAP inference data to YOLO keypoints (not SLEAP annotations).

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory path containing SLEAP-generated CSV files with inferred keypoints.

  • video_dir (Union[str, os.PathLike]) – Directory path containing corresponding videos from which frames are to be extracted.

  • save_dir (Union[str, os.PathLike]) – Output directory where YOLO-formatted images, labels, and map YAML file will be saved. Subdirectories images/train, images/val, labels/train, labels/val will be created.

  • frms_cnt (Optional[int]) – Number of frames to randomly sample from each video for conversion. If None, all frames are used.

  • instance_threshold (float) – Minimum confidence score threshold to filter out low-confidence pose instances. Only instances with instance.score >= this threshold are used.

  • train_size (float) – Proportion of frames randomly assigned to the training dataset. Value must be between 0.1 and 0.99. Default: 0.7.

  • verbose (bool) – If True, prints progress. Default: True.

  • flip_idx (Tuple[int, ...]) – Tuple of keypoint indices used for horizontal flip augmentation during training. The tuple defines the order of keypoints after flipping.

  • map_dict (Dict[str, int]) – Dictionary mapping class indices to class names. Used for creating the YAML class names mapping file.

  • padding (float) – Fractional padding to add around the bounding boxes (relative to image dimensions). Enlarges the bounding boxes by this proportion. Default 0.0. Useful, e.g., when all body-parts lie along the length of the animal.

  • single_id (Optional[str]) – If the data contains pose-estimation for multiple individuals, but you want to treat it as examples of a single individual, pass the name of the single individual. Defaults to None, and the YOLO data will be formatted to the number of objects which the H5 data contains.

Returns

None. Results saved in save_dir.

Example

>>> DATA_DIR = r'D:\ares\data\ant\sleap_csv'
>>> VIDEO_DIR = r'D:\ares\data\ant\sleap_video'
>>> SAVE_DIR = r"D:\imgs\sleap_csv"
>>> runner = Sleap2Yolo(data_dir=DATA_DIR, video_dir=VIDEO_DIR, frms_cnt=50, train_size=0.8, instance_threshold=0.9, save_dir=SAVE_DIR, single_id='ant')
>>> runner.run()
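The interplay of instance_threshold and frms_cnt can be sketched as a filter followed by random frame sampling. This is an illustration under assumptions: rows are plain dictionaries, the frame column is assumed to be named frame_idx, and the helper name and seed argument are hypothetical.

```python
import random
from typing import Dict, List, Optional


def sample_confident_rows(rows: List[Dict[str, float]],
                          instance_threshold: float = 0.0,
                          frms_cnt: Optional[int] = None,
                          seed: Optional[int] = None) -> List[Dict[str, float]]:
    """Keep rows with instance.score >= threshold, then sample up to frms_cnt frames."""
    kept = [r for r in rows if r["instance.score"] >= instance_threshold]
    frames = sorted({r["frame_idx"] for r in kept})
    if frms_cnt is not None and frms_cnt < len(frames):
        frames = random.Random(seed).sample(frames, frms_cnt)
    selected = set(frames)
    return [r for r in kept if r["frame_idx"] in selected]


rows = [
    {"frame_idx": 0, "instance.score": 0.95},
    {"frame_idx": 1, "instance.score": 0.50},
    {"frame_idx": 2, "instance.score": 0.99},
]
print(sample_confident_rows(rows, instance_threshold=0.9))
```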

SLEAP H5 predictions -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.sleap_h5_to_yolo.SleapH52Yolo(data_dir, video_dir, save_dir, frms_cnt=None, verbose=True, threshold=0, train_size=0.7, flip_idx=None, animal_cnt=2, greyscale=False, clahe=False, padding=0.0, single_id=None)[source]

Bases: object

Convert SLEAP .h5 pose estimation annotations to YOLO keypoint annotation format.

Reads SLEAP .h5 files and associated videos, samples frames based on a confidence threshold, extracts keypoints for one or more animals, and saves image-label pairs in a format compatible with YOLOv8 keypoint training.

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory containing SLEAP .h5 files.

  • video_dir (Union[str, os.PathLike]) – Directory containing the videos associated with .h5 files.

  • save_dir (Union[str, os.PathLike]) – Directory to save YOLO-formatted images, labels, and metadata.

  • frms_cnt (Optional[int]) – Number of frames to sample per video. If None, all valid frames are used.

  • verbose (bool) – If True, print progress during processing.

  • threshold (float) – Likelihood threshold below which poses are discarded.

  • train_size (float) – Proportion of frames to assign to the training set (rest go to validation).

  • flip_idx (Tuple[int, ...]) – Tuple indicating how to flip body-parts for augmentation. Length must match keypoint count.

  • animal_cnt (int) – Number of animals tracked per frame.

  • greyscale (bool) – If True, convert images to grayscale.

  • clahe (bool) – If True, apply CLAHE (Contrast Limited Adaptive Histogram Equalization).

  • padding (float) – Relative padding to apply around the bounding box of keypoints (range 0.0 to 1.0).

  • single_id (Optional[str]) – Optional custom ID to assign all annotations the same class (used in single-animal datasets).

Example

>>> DATA_DIR = r'D:/ares/data/termite_1/data'
>>> VIDEO_DIR = r'D:/ares/data/termite_1/video'
>>> SAVE_DIR = r"D:/imgs/sleap_h5"
>>> runner = SleapH52Yolo(data_dir=DATA_DIR, video_dir=VIDEO_DIR, save_dir=SAVE_DIR, threshold=0.9, frms_cnt=50, single_id='termite')
>>> runner.run()

SLEAP annotations -> YOLO pose-estimation annotations

class simba.third_party_label_appenders.transform.sleap_to_yolo.SleapAnnotations2Yolo(sleap_dir, save_dir, video_dir=None, padding=None, train_size=0.8, verbose=True, greyscale=False, clahe=False, single_id=None)[source]

Bases: object

Convert SLEAP annotations to YOLO formatted training data.

Parameters
  • sleap_dir (Union[str, os.PathLike]) – Directory containing SLEAP annotation .slp files.

  • save_dir (Union[str, os.PathLike]) – Directory to save YOLO-formatted images, labels, and metadata.

  • verbose (bool) – If True, print progress during processing.

  • train_size (float) – Proportion of frames to assign to the training set (rest go to validation).

  • greyscale (bool) – If True, convert images to grayscale.

  • clahe (bool) – If True, apply CLAHE (Contrast Limited Adaptive Histogram Equalization).

  • padding (float) – Relative padding to apply around the bounding box of keypoints (range 0.0 to 1.0).

  • single_id (Optional[str]) – Optional custom ID to assign all annotations the same class (used in single-animal datasets).

Example

>>> runner = SleapAnnotations2Yolo(sleap_dir=r'D:/cvat_annotations/frames/slp_to_yolo', save_dir=r'D:/cvat_annotations/frames/slp_to_yolo/yolo')
>>> runner.run()

Annotation conversion utilities

simba.third_party_label_appenders.transform.utils.arr_to_b64(x)[source]

Helper to convert an image in array format to an image in byte string format.

simba.third_party_label_appenders.transform.utils.b64_to_arr(img_b64)[source]

Helper to convert a byte string (e.g., created by labelme) to an image in numpy array format.

simba.third_party_label_appenders.transform.utils.check_valid_yolo_map(yolo_map)[source]

Helper to perform a surface-level check of whether a yaml path leads to a valid YOLO map file for pose-estimation.

simba.third_party_label_appenders.transform.utils.concatenate_dlc_annotations(data_dir, save_dir, annotator='SN')[source]

Concatenate DeepLabCut annotation files from multiple directories into a single CSV file.

This function searches for DeepLabCut ‘CollectedData_*.csv’ files in the specified data directory, processes each file to standardize frame naming conventions, copies associated PNG images, and combines all annotation data into a single CSV file with multi-index headers.

Parameters
  • data_dir (Union[str, os.PathLike]) – Path to directory containing DeepLabCut annotation subdirectories.

  • save_dir (Union[str, os.PathLike]) – Path to directory where concatenated results will be saved.

  • annotator (str) – Name of the annotator (default: ‘SN’). Used in the output filename ‘CollectedData_{annotator}.csv’.

Returns

None. Creates concatenated CSV file and copies PNG images to ‘labeled-data’ subdirectory in save_dir.

Example

>>> concatenate_dlc_annotations(
...     data_dir='/path/to/dlc/annotations',
...     save_dir='/path/to/output',
...     annotator='John'
... )
>>> concatenate_dlc_annotations(data_dir=r'E:\crim13_imgs\CRIM_labelled_images', save_dir=r'E:\crim13_imgs\combined')
>>> concatenate_dlc_annotations(data_dir=r'E:\rgb_white_vs_black_imgs\GB_labelled_images.zip\labeled-data', save_dir=r'E:\rgb_white_vs_black_imgs\combined')

simba.third_party_label_appenders.transform.utils.create_yolo_keypoint_yaml(path, train_path, val_path, names, kpt_shape=None, flip_idx=None, save_path=None, use_wsl_paths=False)[source]

Given a set of paths to directories, create a model.yaml file for YOLO pose model training through ultralytics wrappers.

See also

Used by simba.sandbox.coco_keypoints_to_yolo.coco_keypoints_to_yolo()

Parameters
  • path (Union[str, os.PathLike]) – Parent directory holding both an images and a labels directory.

  • train_path (Union[str, os.PathLike]) – Directory holding training images. For example, if C:\troubleshooting\coco_data\images\train is passed, then a C:\troubleshooting\coco_data\labels\train is expected.

  • val_path (Union[str, os.PathLike]) – Directory holding validation images. For example, if C:\troubleshooting\coco_data\images\test is passed, then a C:\troubleshooting\coco_data\labels\test is expected.

  • test_path (Union[str, os.PathLike]) – Directory holding test images. For example, if C:\troubleshooting\coco_data\images\validation is passed, then a C:\troubleshooting\coco_data\labels\validation is expected.

  • names (Dict[str, int]) – Dictionary pairing object names with object integer identifiers. E.g., {'OBJECT 1': 0, 'OBJECT 2': 2}

  • flip_idx (Optional[Tuple[int, ...]]) – Optional tuple of integers representing keypoint switch indexes if image is flipped horizontally. Only pass if pose-estimation data.

  • kpt_shape (Optional[Tuple[int, int]]) – Optional tuple of integers representing the shape of each animal's keypoints, e.g., (6, 3). Only pass if pose-estimation data.

  • save_path (Union[str, os.PathLike]) – Optional location where to save the yolo model yaml file. If None, then the dict is returned.

  • use_wsl_paths (bool) – If True, use Windows WSL paths (e.g., /mnt/…) in the config file.

Returns

None
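As a rough picture of what such a model yaml holds, the sketch below assembles the usual Ultralytics pose dataset keys (path, train, val, kpt_shape, flip_idx, names) into a dictionary. The helper name is hypothetical, and inverting the name-to-index mapping so that indices become the keys is an assumption about how the file is laid out.

```python
from typing import Dict, Optional, Tuple


def build_yolo_pose_config(path: str,
                           train_path: str,
                           val_path: str,
                           names: Dict[str, int],
                           kpt_shape: Optional[Tuple[int, int]] = None,
                           flip_idx: Optional[Tuple[int, ...]] = None) -> dict:
    """Assemble a dataset configuration dictionary for YOLO pose training."""
    cfg = {"path": path, "train": train_path, "val": val_path}
    if kpt_shape is not None:
        cfg["kpt_shape"] = list(kpt_shape)  # (number of keypoints, values per keypoint)
    if flip_idx is not None:
        cfg["flip_idx"] = list(flip_idx)
    # Class index -> class name, ordered by index.
    cfg["names"] = {idx: name for name, idx in sorted(names.items(), key=lambda kv: kv[1])}
    return cfg


cfg = build_yolo_pose_config(path="/data/yolo", train_path="images/train", val_path="images/val",
                             names={"mouse": 0}, kpt_shape=(6, 3), flip_idx=(1, 0, 2, 3, 4, 5))
print(cfg)
```

The resulting dictionary can be written to disk with any YAML library.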

simba.third_party_label_appenders.transform.utils.downsample_coco_dataset(json_path, img_dir, save_dir, shrink_factor=4, verbose=True)[source]

Downsample a COCO-format dataset (images and annotations) by a fixed integer factor.

This function resizes all images and updates annotation coordinates accordingly. Bounding box coordinates and keypoints (x, y only) are scaled by shrink_factor, while visibility flags in keypoints remain unchanged. The updated dataset is saved in COCO format to save_dir.

Parameters
  • json_path (Union[str, os.PathLike]) – Path to the input COCO JSON annotation file.

  • img_dir (Union[str, os.PathLike]) – Directory containing the original images referenced in the JSON file.

  • save_dir (Union[str, os.PathLike]) – Directory where resized images and updated COCO JSON will be stored.

  • shrink_factor (int) – Factor by which to downsample both images and annotation coordinates. Must be >= 2. Default is 4.

  • verbose (bool) – If True, prints progress information during processing. Default is True.

Returns

None. Saves new images and updated COCO JSON to save_dir.

Example

>>> downsample_coco_dataset(
...     json_path=r"D:\cvat_annotations\frames\coco_keypoints_1\merged\merged_08132025.json",
...     img_dir=r"D:\cvat_annotations\frames\all_imgs_071325",
...     save_dir=r"D:\cvat_annotations\frames\resampled_coco_081225"
... )
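The coordinate update can be sketched for a single annotation as follows. The helper name is hypothetical; it assumes the standard COCO layout, with bbox as [x, y, width, height] and keypoints as a flat [x, y, visibility, ...] list, and it ignores other fields such as area.

```python
from typing import Any, Dict


def downsample_annotation(ann: Dict[str, Any], shrink_factor: int = 4) -> Dict[str, Any]:
    """Divide bbox and keypoint x, y values by shrink_factor; keep visibility flags as-is."""
    if shrink_factor < 2:
        raise ValueError("shrink_factor must be >= 2")
    out = dict(ann)
    out["bbox"] = [v / shrink_factor for v in ann["bbox"]]
    keypoints = list(ann.get("keypoints", []))
    for i in range(0, len(keypoints), 3):
        keypoints[i] = keypoints[i] / shrink_factor          # x
        keypoints[i + 1] = keypoints[i + 1] / shrink_factor  # y
        # keypoints[i + 2] is the visibility flag and is left unchanged.
    out["keypoints"] = keypoints
    return out


ann = {"id": 1, "image_id": 1, "bbox": [40, 80, 120, 160], "keypoints": [8, 16, 2, 0, 0, 0]}
print(downsample_annotation(ann, shrink_factor=4))
```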
simba.third_party_label_appenders.transform.utils.get_yolo_keypoint_bp_id_idx(animal_bp_dict)[source]

Helper to create a dictionary holding the indexes for each animal's body-parts. Used for transforming data for creating a YOLO training set.

Parameters

animal_bp_dict – Dictionary of animal body-parts. Can be created by simba.mixins.config_reader.ConfigReader.create_body_part_dictionary().

Returns

Dictionary where the key is the animal name, and the values are the indexes of the columns belonging to each animal.

Return type

Dict[int, List[int]]

simba.third_party_label_appenders.transform.utils.get_yolo_keypoint_flip_idx(x)[source]

Given a list of body-parts, create a flip_index YOLO yaml entry.

Important

Only works if the left and right body-parts have the substrings left and right (case-insensitive).

Parameters

x (List[str]) – List of the names of the body-parts. If several animals, then just a list of names for the body-parts for one animal.

Returns

The flip_idx required by the YOLO model yaml file. E.g., [1, 0, 2, 3, 4]

Return type

Tuple[int, …]
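The left/right matching rule can be sketched as below. This is an illustrative re-implementation with a hypothetical function name, not the SimBA source: each body-part containing "left" is paired with the name obtained by swapping in "right" (and vice versa), and unpaired body-parts map to themselves.

```python
from typing import List, Tuple


def infer_flip_idx(bodyparts: List[str]) -> Tuple[int, ...]:
    """Return, for every body-part, the index of its horizontally mirrored partner."""
    lowered = [bp.lower() for bp in bodyparts]
    flip = []
    for i, name in enumerate(lowered):
        if "left" in name:
            partner = name.replace("left", "right")
        elif "right" in name:
            partner = name.replace("right", "left")
        else:
            partner = name
        # Fall back to the body-part's own index when no partner exists.
        flip.append(lowered.index(partner) if partner in lowered else i)
    return tuple(flip)


print(infer_flip_idx(["Left_ear", "Right_ear", "Nose", "Center", "Tail_base"]))
```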

simba.third_party_label_appenders.transform.utils.merge_coco_keypoints_files(data_dir, save_path, max_width=None, max_height=None)[source]

Merges multiple annotation COCO-format keypoint JSON files into a single file.

Note

Image and annotation entries are appended after adjusting their id fields to be unique.

COCO-format keypoint JSON files can be created using https://www.cvat.ai/.

See also

To convert COCO-format keypoint JSON to YOLO training set, see simba.third_party_label_appenders.transform.coco_keypoints_to_yolo.COCOKeypoints2Yolo()

Parameters
  • data_dir (Union[str, os.PathLike]) – Directory containing multiple COCO keypoints .json files to merge.

  • save_path (Union[str, os.PathLike]) – File path to save the merged COCO keypoints JSON.

  • max_width (int) – Optional max width keypoint coordinate annotation. If above max, the annotation will be set to “not visible”.

  • max_height (int) – Optional max height keypoint coordinate annotation. If above max, the annotation will be set to “not visible”.

Returns

None. Results are saved in save_path.

Example I

>>> DATA_DIR = r'D:\cvat_annotations\frames\coco_keypoints_1\TEST'
>>> SAVE_PATH = r"D:\cvat_annotations\frames\coco_keypoints_1\TEST\merged.json"
>>> merge_coco_keypoints_files(data_dir=DATA_DIR, save_path=SAVE_PATH)

Example II

>>> merge_coco_keypoints_files(data_dir=DATA_DIR, save_path=SAVE_PATH, max_width=662, max_height=217)
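The id adjustment mentioned in the note can be sketched as follows. The helper is hypothetical and works on already-loaded COCO dictionaries; it renumbers image and annotation ids sequentially and takes the categories from the first dataset, both of which are assumptions. The max_width and max_height handling is left out.

```python
from typing import Any, Dict, List


def merge_coco_keypoints(datasets: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge COCO dictionaries, re-numbering image and annotation ids so they stay unique."""
    merged: Dict[str, Any] = {"images": [], "annotations": [], "categories": []}
    next_img_id, next_ann_id = 1, 1
    for data in datasets:
        if not merged["categories"]:
            merged["categories"] = list(data.get("categories", []))
        id_map = {}  # old image id -> new image id, per input file
        for img in data.get("images", []):
            id_map[img["id"]] = next_img_id
            merged["images"].append({**img, "id": next_img_id})
            next_img_id += 1
        for ann in data.get("annotations", []):
            merged["annotations"].append(
                {**ann, "id": next_ann_id, "image_id": id_map[ann["image_id"]]}
            )
            next_ann_id += 1
    return merged


a = {"images": [{"id": 1, "file_name": "a.png"}], "annotations": [{"id": 1, "image_id": 1}]}
b = {"images": [{"id": 1, "file_name": "b.png"}], "annotations": [{"id": 1, "image_id": 1}]}
print(merge_coco_keypoints([a, b]))
```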
simba.third_party_label_appenders.transform.utils.normalize_img_dict(img_dict)[source]

Normalize a dictionary of grayscale or RGB images by standardizing pixel intensities.

Parameters

img_dict (Dict[str, np.ndarray]) – Dictionary of image arrays with string keys. Each image must be a 2D or 3D NumPy array.

Returns

Dictionary of normalized image arrays, with the same keys as the input.

Return type

Dict[str, np.ndarray]

simba.third_party_label_appenders.transform.utils.scale_pose_img_sizes(pose_data, imgs, size, interpolation=2)[source]

Resizes images and scales corresponding pose-estimation data to match the new image sizes.

Parameters
  • pose_data – 3d MxNxR array of pose-estimation data where M is the number of images, N the number of body-parts in each frame, and R represents the x,y coordinates of the body-parts.

  • imgs – Iterable of images of the same size as the pose_data M dimension. Can be byte string representations of images, or images as arrays.

  • size – The target size for the resizing operation. It can be 'min' (resize all images to the smallest height and width found among the input images) or 'max' (resize all images to the largest height and width found among the input images).

  • interpolation – Interpolation method to use for resizing. This can be one of OpenCV’s interpolation methods.

Returns

The converted pose_data and converted images to align with the new size.

Return type

Tuple[np.ndarray, Iterable[Union[np.ndarray, str]]]

Example

>>> df = labelme_to_df(labelme_dir=r'C:\troubleshooting\coco_data\labels\test_read', greyscale=False, pad=False, normalize=False)
>>> imgs = list(df['image'])
>>> pose_data = df.drop(['image', 'image_name'], axis=1)
>>> pose_data_arr = pose_data.values.reshape(len(pose_data), int(len(pose_data.columns) / 2), 2).astype(np.float32)
>>> new_pose, new_imgs = scale_pose_img_sizes(pose_data=pose_data_arr, imgs=imgs, size=(700, 3000))
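The coordinate side of the resizing can be sketched in isolation. The helper name is hypothetical and sizes are assumed to be (width, height) tuples; the image resizing itself is not shown.

```python
from typing import List, Tuple


def scale_pose_points(points: List[Tuple[float, float]],
                      old_size: Tuple[int, int],
                      new_size: Tuple[int, int]) -> List[Tuple[float, float]]:
    """Scale x, y coordinates from an image of old_size to an image of new_size."""
    sx = new_size[0] / old_size[0]
    sy = new_size[1] / old_size[1]
    return [(x * sx, y * sy) for x, y in points]


print(scale_pose_points([(10, 20), (100, 200)], old_size=(100, 200), new_size=(50, 100)))
```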