Statistics transformations
Statistics mixin
- class simba.mixins.statistics_mixin.Statistics
Statistics methods used for feature extraction, drift assessment, distance computations, and distribution comparisons in sliding and static windows.
Note
Most methods are implemented using numba parallelization for improved run-times. See line graph below for expected run-times for a few methods included in this class.
Most methods have numba typed signatures to decrease compilation time through reduced type inference. Make sure to pass the correct dtypes as indicated by the signature decorators. If the dtype is not specified at array creation, it will typically be float64 or int64. As most methods here use float32 for the input data argument, make sure to downcast.
This class contains a few probability distribution comparison methods. These are being moved to simba.sandbox.distances (05.24).
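The downcasting advice in the note can be illustrated with plain NumPy. This is a minimal sketch that only shows the dtype handling; it does not call any method of the Statistics class, and the array values and variable names are made up for illustration.

```python
import numpy as np

# An array of Python floats created without an explicit dtype defaults to float64.
data = np.array([1.5, 2.0, 3.25, 4.0])
print(data.dtype)

# Downcast to float32 before passing the array to a method whose
# numba signature expects float32 input.
data32 = data.astype(np.float32)
print(data32.dtype)

# Alternatively, request float32 when the array is created,
# which avoids the extra copy made by astype.
data32_direct = np.array([1.5, 2.0, 3.25, 4.0], dtype=np.float32)
```

Passing an array whose dtype does not match a numba typed signature typically raises a typing error at call time, so checking `arr.dtype` before the call is a cheap safeguard.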
References
- [1] Bernard Desgraupes - https://cran.r-project.org/web/packages/clusterCrit/vignettes/clusterCrit.pdf
- [2] Ikotun, A. M., Habyarimana, F., & Ezugwu, A. E. (2025). Cluster validity indices for automatic clustering: A comprehensive review. Heliyon, 11(2), e41953. https://doi.org/10.1016/j.heliyon.2025.e41953
- [3] Hassan, B. A., Tayfor, N. B., Hassan, A. A., Ahmed, A. M., Rashid, T. A., & Abdalla, N. N. (2024). From A-to-Z review of clustering validation indices. arXiv. https://doi.org/10.48550/arXiv.2407.20246
- [4] Leland McInnes - pynndescent.
Statistics GPU methods