Scenario 4: Adding New Experimental Data to a Previously Started Project
Tooltips: Many menu items and dialog options in SimBA have tooltips. Hover over a label or control to see a short description.
Overview
SimBA provides several scenario-based tutorials. This tutorial covers Scenario 4: using an existing project and classifier to analyze a new batch of experimental data (e.g. Day 2 and Day 3) after you have already analyzed an earlier batch (e.g. Day 1) in Scenario 2. You have already generated a classifier (Scenario 1), run the classifier on Day 1 (Scenario 2), and optionally improved the classifier with more training data (Scenario 3). You now want to run the same classifier on Days 2 and 3 without re-analyzing the Day 1 files.
This scenario is very similar to Scenario 2; the main difference is how you prepare the project so that SimBA only processes the new data (Part 1 below).
What this scenario is about: You analyze new videos with a classifier that has already been created and validated. Once the model is trained (Scenario 1, and optionally refined in Scenario 3), the point of machine scoring is to score behavior quickly on new experimental data without re-annotating or re-training. Scenario 4 is that workflow: adding a new batch of videos to an existing project and running the same classifier to get predictions and analyses with minimal extra work.
You are ready for Scenario 4 if:
You have a SimBA project that already contains analyzed data from an earlier batch (e.g. Day 1).
You have new experimental videos and pose-estimation files (e.g. Day 2 and Day 3) to add and analyze with the same classifier.
Hypothetical Experiment
Three days of resident–intruder testing between aggressive CD-1 mice and subordinate C57 intruders. Each day has 10 pairs of mice, for a total of 30 videos across 3 days. Recordings are 3 minutes, in color, at 30 fps. Day 1 was analyzed in Scenario 2; here you add and analyze Day 2 and Day 3.
| Property | Value |
|---|---|
| Experiment | Resident–intruder: aggressive CD-1 vs. subordinate C57 intruders |
| Days | 3 days of testing |
| Pairs per day | 10 pairs per day, 30 videos total |
| Video specs | 3 min, color, 30 fps |
| Already analyzed | Day 1 (Scenario 2) |
| To analyze here | Day 2 and Day 3 |
Workflow at a Glance
Part 1: Clean up your previous project (or create a new one)
Part 2: Load the project and import your new data
Part 3: Process the data for Day 2–3
Part 4: Run the classifier on Day 2–3
Part 5: Analyze machine results
Part 6: Visualize machine predictions
Part 7: Post-classification validation
Part 1: Clean up your previous project (or create a new one)
You need a project that contains only the data you want to analyze in this run. If you continue from Scenario 2, your project has Day 1 CSV files in project_folder/csv/input/, project_folder/csv/features_extracted/, project_folder/csv/machine_results/, and related folders. SimBA processes every CSV it finds in those directories, so if you leave the Day 1 files in place, they will be analyzed again together with Day 2–3. To avoid re-analyzing Day 1, either archive the old files or use a new project.
Options:
Archive processed files: Keep using the same project folder. After loading your project, go to the Further imports tab. In the FURTHER METHODS section, click ARCHIVE PROCESSED FILES IN SIMBA PROJECT. Enter a name for the archive (e.g. Day1) and click RUN ARCHIVE. SimBA moves the processed CSV files from the active project_folder/csv/ subdirectories into an archive subfolder so they are no longer analyzed. Day 1 data stays in the project but is hidden from the next run. This is the same workflow described in Scenario 2, Part 1: Clean up or create project.
Manually move files: Alternatively, move the already-analyzed CSV files out of the immediate project_folder/csv/ subdirectories into a subfolder (e.g. Day1) or another location. You must move the files in (i) project_folder/csv/input/, (ii) project_folder/csv/outlier_corrected_movement/, (iii) project_folder/csv/outlier_corrected_movement_location/, (iv) project_folder/csv/features_extracted/, and (v) project_folder/csv/machine_results/. SimBA only reads CSVs that sit directly in these folders (or their configured equivalents), so moving them into a subfolder hides them. Log files (e.g. outlier correction counts, descriptive statistics) live in project_folder/logs/; you can move those into a Day1 subfolder too if you want to keep outputs organized.
Create a new project: Start a new project that contains only the new experimental data (Day 2 and Day 3). Follow Scenario 1, Part 1: Create a new project, and import only the new videos and tracking data. Use the same body-part configuration and classifier names as your existing project so the classifier runs correctly.
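The manual-move option can be scripted. The following is a minimal sketch, assuming the default folder names listed above; `archive_csvs` and `CSV_SUBDIRS` are hypothetical names for illustration and are not part of SimBA. It moves only the CSVs that sit directly in each subfolder and refuses to overwrite files already in the archive subfolder.

```python
from pathlib import Path
import shutil

# Subfolders named in this tutorial; adjust if your project is configured differently.
CSV_SUBDIRS = [
    "input",
    "outlier_corrected_movement",
    "outlier_corrected_movement_location",
    "features_extracted",
    "machine_results",
]


def archive_csvs(project_folder, archive_name):
    """Move CSVs sitting directly in each csv/ subfolder into <subfolder>/<archive_name>/.

    Hypothetical helper, not a SimBA function. Returns the destination paths.
    """
    moved = []
    csv_root = Path(project_folder) / "csv"
    for subdir in CSV_SUBDIRS:
        src_dir = csv_root / subdir
        if not src_dir.is_dir():
            continue  # this processing step was never run
        dest_dir = src_dir / archive_name
        # sorted() materializes the listing before any file is moved.
        for csv_path in sorted(src_dir.glob("*.csv")):
            dest_dir.mkdir(exist_ok=True)
            target = dest_dir / csv_path.name
            if target.exists():
                raise FileExistsError(f"{target} already exists")
            shutil.move(str(csv_path), str(target))
            moved.append(target)
    return moved
```

Calling `archive_csvs("path/to/project_folder", "Day1")` would leave each subfolder empty of top-level CSVs, which is the state the tutorial asks for before importing Day 2 and Day 3.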
Part 2: Load the project and import your new data
Load the project. If you archived or moved files in Part 1, load the same project. If you created a new project, load that one. Follow Scenario 1, Step 1: Load project config.
Import the new tracking data and videos. Use the Further imports tab to add the Day 2 and Day 3 pose-estimation CSVs and videos. Follow Scenario 1, Step 2 (optional): Further imports. After importing, the new files appear in project_folder/csv/input/ (and videos in project_folder/videos/).
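After importing, a quick check can confirm that the new batch is complete. The sketch below assumes that each pose-estimation CSV shares its file stem with its video (for example Day2_Pair1.csv and Day2_Pair1.mp4); that naming convention, the extension list, and the helper name `unmatched_files` are assumptions for illustration, not something this tutorial specifies.

```python
from pathlib import Path


def unmatched_files(project_folder, video_exts=(".mp4", ".avi")):
    """Return (CSV stems with no video, video stems with no CSV).

    Hypothetical helper; assumes CSVs and videos are paired by file stem.
    """
    project = Path(project_folder)
    csv_stems = {p.stem for p in (project / "csv" / "input").glob("*.csv")}
    video_stems = {
        p.stem
        for p in (project / "videos").glob("*")
        if p.suffix.lower() in video_exts
    }
    return sorted(csv_stems - video_stems), sorted(video_stems - csv_stems)
```

Two empty lists mean every imported CSV has a video and vice versa.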
Part 3: Process the data for Day 2–3
Process the newly imported data so it has the same structure as the data the classifier was trained on: set video parameters, correct outliers, and extract features. Follow Steps 3–5 in Scenario 1:
Step 3: Set video parameters
Step 4: Outlier correction
Step 5: Extract features
Calibration: Calibrate each new video with its own pixels per mm and fps (e.g. using the known-distance picker). The numeric value of pixels per mm will often differ between videos if the camera distance or resolution changed; that is expected. What matters is that you calibrate so that features are in the same real-world units (e.g. mm, mm/s) as the training data, so the classifier receives inputs on the same scale. Use the same body-part configuration and outlier criteria as in your original project so the feature set (which body parts, which metrics) matches what the classifier was trained on.
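To see why per-video calibration keeps features comparable, here is a small worked sketch. The function names and the example numbers are illustrative assumptions, not SimBA code: pixels per mm is the pixel distance between two clicked points divided by the known real-world distance, and a per-frame displacement is converted to mm/s by dividing by pixels per mm and multiplying by fps.

```python
import math


def pixels_per_mm(point_a, point_b, known_distance_mm):
    """Pixel distance between two (x, y) points divided by the real distance in mm."""
    if known_distance_mm <= 0:
        raise ValueError("known_distance_mm must be positive")
    return math.dist(point_a, point_b) / known_distance_mm


def speed_mm_per_s(displacement_px, px_per_mm, fps):
    """Convert a per-frame displacement in pixels to a speed in mm/s."""
    return displacement_px / px_per_mm * fps


# Two videos of the same movement recorded at different zoom levels:
# the pixel displacements differ, but the calibrated speeds agree.
close_up = speed_mm_per_s(displacement_px=8, px_per_mm=4.0, fps=30)
wide_shot = speed_mm_per_s(displacement_px=4, px_per_mm=2.0, fps=30)
```

Both `close_up` and `wide_shot` come out as 60 mm/s, which is the sense in which the classifier receives inputs on the same scale.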
Part 4: Run the classifier on Day 2–3
The Day 2–3 data is now in the project with corrected pose and extracted features. Run the BtWGaNP classifier (or whichever classifier you use) on the Day 2–3 videos. For full step-by-step instructions and all options, see Scenario 2, Part 3: Run the classifier on new data. For where the model file comes from (training) and how to validate the model on a single video, see Scenario 1, Step 7: Train machine model, and Scenario 1, Step 8: Validate model on new data.
Go to the Run machine model tab and click RUN MODELS.
In the pop-up, ensure the model path points to your .sav file (e.g. from project_folder/models/generated_models/ or from when you ran Day 1 in Scenario 2). Paths are stored in project_config.ini and usually do not need to be re-entered.
Set Discrimination threshold and Minimum bout length (ms) as in Scenario 2 (see Scenario 2, Part 3: Set discrimination threshold and minimum bout length).
Click RUN. SimBA writes predictions to project_folder/csv/machine_results/ for each new CSV in project_folder/csv/features_extracted/.
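The effect of the two settings can be illustrated with a short sketch. This is one plausible reading of how a discrimination threshold and a minimum bout length interact, not SimBA's actual implementation: frames whose probability is at or above the threshold are marked as behavior, and any run of behavior frames shorter than the minimum bout length is then discarded. The function name and the `>=` comparison are assumptions.

```python
import math


def apply_threshold_and_min_bout(probabilities, threshold, min_bout_ms, fps):
    """Binarize per-frame probabilities, then drop bouts shorter than min_bout_ms."""
    min_frames = math.ceil(min_bout_ms * fps / 1000)
    labels = [1 if p >= threshold else 0 for p in probabilities]
    start = None
    # A trailing 0 acts as a sentinel so a bout that runs to the last frame is closed.
    for i, value in enumerate(labels + [0]):
        if value == 1 and start is None:
            start = i
        elif value == 0 and start is not None:
            if i - start < min_frames:
                labels[start:i] = [0] * (i - start)  # bout too short: remove it
            start = None
    return labels


probs = [0.9, 0.2, 0.8, 0.8, 0.8, 0.1]
cleaned = apply_threshold_and_min_bout(probs, threshold=0.5, min_bout_ms=100, fps=30)
```

At 30 fps, 100 ms corresponds to 3 frames, so the single-frame detection at the start is removed and the 3-frame bout is kept. Raising the threshold or the minimum bout length trades false positives for missed events.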
To validate the classifier on a single video (e.g. to tune threshold and bout length before running on all of Day 2–3), use the VALIDATE MODEL ON SINGLE VIDEO section in the same tab; see Scenario 1, Step 8.
Part 5: Analyze machine results
Generate descriptive statistics and analysis for the Day 2–3 classifications. Go to the Analyze machine results section of the Run machine model tab. All tools read from project_folder/csv/machine_results/ (or, for distance/velocity, from project_folder/csv/outlier_corrected_movement_location/) and write date-time stamped CSVs to project_folder/logs/ (and plot subfolders when applicable). For detailed options and screenshots, see Scenario 2, Part 4: Analyze machine learning results.
| Analysis | Description |
|---|---|
| 1. | Summary statistics per video and per classifier: first occurrence, bout count, total/mean/median duration, mean/median interval. Optional detailed bout table. |
| 2. Distance / velocity (aggregates) | Total distance (cm) and average velocity (cm/s) per video per animal from pose data. No classifier required. |
| 3. | Same metrics as aggregates but per time bin (e.g. 30 s or 60 s) within each video. |
| 4. Distance / velocity: time bins | Distance and velocity per time bin per video. Optional line plots. |
| 5. Machine predictions: by ROI | Behavior time and bout counts inside user-defined ROIs (zones). Requires ROIs to be defined. |
| 6. Machine predictions: by severity | Grade behavior intensity (e.g. mild vs severe) using movement during bouts; output severity brackets and optional example clips. |
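For readers who want to see what the aggregate summary statistics mean in practice, here is a minimal sketch that computes them from a per-frame 0/1 prediction sequence. The function names and the dictionary keys are illustrative assumptions; SimBA's own output columns and exact definitions may differ.

```python
from statistics import mean, median


def find_bouts(labels):
    """Return (start_frame, end_frame_exclusive) for each run of 1s."""
    bouts, start = [], None
    for i, value in enumerate(list(labels) + [0]):  # sentinel closes a final bout
        if value == 1 and start is None:
            start = i
        elif value == 0 and start is not None:
            bouts.append((start, i))
            start = None
    return bouts


def summarize_bouts(labels, fps):
    """Bout count, durations, first occurrence and inter-bout intervals, in seconds."""
    bouts = find_bouts(labels)
    if not bouts:
        return {"bout_count": 0, "total_duration_s": 0.0, "first_occurrence_s": None}
    durations = [(end - start) / fps for start, end in bouts]
    # Interval: gap between the end of one bout and the start of the next.
    intervals = [(nxt[0] - cur[1]) / fps for cur, nxt in zip(bouts, bouts[1:])]
    return {
        "bout_count": len(bouts),
        "total_duration_s": sum(durations),
        "mean_duration_s": mean(durations),
        "median_duration_s": median(durations),
        "first_occurrence_s": bouts[0][0] / fps,
        "mean_interval_s": mean(intervals) if intervals else None,
        "median_interval_s": median(intervals) if intervals else None,
    }


stats = summarize_bouts([0, 1, 1, 0, 0, 0, 1, 1, 1, 1], fps=2)
```

In this toy example at 2 fps there are two bouts lasting 1.0 s and 2.0 s, the first starting at 0.5 s, separated by a 1.5 s interval.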
Part 6: Visualize machine predictions
Create visualizations for the Day 2–3 results. Go to the Visualizations tab. For a consolidated reference of all visualization methods, see Visualizations. All tools read from project_folder/csv/machine_results/ (or project_folder/csv/outlier_corrected_movement_location/ for path/distance/data tables) and from videos in project_folder/videos/; outputs go under project_folder/frames/output/ in type-specific subfolders. You can only visualize on videos for which you have feature data. For detailed options and screenshots, see Scenario 2, Part 5: Visualizing results.
| Visualization | Description |
|---|---|
| | Overlay predictions and probabilities on video (or save per-frame images). Optional pose, Gantt, timers. |
| | Horizontal bar charts: when each behavior occurred and for how long. Static image, video, or per-frame. |
| | Line plots of classifier probability over time. |
| | Animal movement paths with behavior locations overlaid. |
| | Distance or velocity over time (from pose). |
| | Heatmaps of where behavior occurred in the arena. |
| | Live data table (or video) of metrics per frame. |
| | Concatenate multiple videos (e.g. clips or visualization outputs) into one file. |
Part 7: Post-classification validation (detecting false-positives)
When you have many predictions, you may want to review them in short clips that show only the classified events. The post-classification validation tool generates one video per CSV, containing concatenated clips of all bouts of the target behavior identified by the classifier. It helps you spot false positives and tune the discrimination threshold or minimum bout length.
Seconds: Duration (in seconds) to add before the start and after the end of each event. For example, for a 2-second attack, entering 1 adds 1 second before and 1 second after, so the clip is 4 seconds long. Larger values produce longer clips.
Target: The target behavior (e.g. BtWGaNP) to include in the validation video.
Steps:
In the Run machine model tab, open the Validate / post-classification validation section.
Enter a value in the Seconds entry box (e.g. 1 or 2).
Select the target behavior from the Target dropdown.
Click Validate. Videos are generated in project_folder/frames/output/classifier_validation/. Each file is named: videoname + target behavior + number of bouts + .mp4.
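The Seconds setting amounts to simple frame arithmetic, sketched below. `clip_window` is a hypothetical helper, not SimBA code, and clamping the window to the video boundaries is an assumption about sensible behavior rather than a documented detail.

```python
def clip_window(start_frame, end_frame, pad_seconds, fps, total_frames):
    """Pad a bout (start inclusive, end exclusive) by pad_seconds on each side.

    The window is clamped so it never extends past the first or last frame.
    """
    pad = int(round(pad_seconds * fps))
    clip_start = max(0, start_frame - pad)
    clip_end = min(total_frames, end_frame + pad)
    return clip_start, clip_end


# A 2-second attack (frames 300-360 at 30 fps) in a 3-minute video (5400 frames),
# padded by 1 second on each side.
start, end = clip_window(300, 360, pad_seconds=1, fps=30, total_frames=5400)
clip_seconds = (end - start) / 30
```

This reproduces the example above: the padded clip spans frames 270 to 390, which is 4 seconds at 30 fps.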
Next Steps
Scenario 1: Train or validate a classifier.
Scenario 2: Run the classifier on experimental data and create visualizations.
Scenario 3: Update the classifier with more annotated data.
Congratulations! You have added and analyzed a new batch of experimental data in the same project. You can repeat this workflow (Parts 1–6) for further batches (e.g. Day 4, Day 5) or refine your classifier with Scenario 3.
Bugs and feature requests: Please help improve SimBA by reporting bugs or suggesting features: either open an issue on GitHub or reach out on Gitter.
Author: Simon N