emlearn is an open-source machine learning inference engine designed for microcontrollers and embedded devices. It supports a range of machine learning models for classification, regression, unsupervised learning, and feature extraction. The engine is portable: it is written in C99, uses static memory allocation, and is included through a single header file. Users train models in Python and convert them to C code for inference.
Scikit-learn: the go-to library for machine learning, offering a user-friendly, consistent interface.
PyCaret: lowers the barrier to entry for machine learning with low-code, automated, end-to-end solutions.
PyTorch: build and deploy powerful, scalable neural networks with its highly flexible architecture.
TensorFlow: one of the most mature deep learning libraries, highly flexible and suited to a wide range of applications.
Keras: TensorFlow made simple.
FastAI: makes deep learning more accessible with a high-level API built on top of PyTorch.
Comparing Clustering Algorithms
The following table compares the clustering algorithms in scikit-learn by their parameters, scalability, and the metric they use.
Sr.No | Algorithm Name | Parameters | Scalability | Metric Used
1 | K-Means | No. of clusters | Very large n_samples | Distance between points
2 | Affinity Propagation | Damping | Not scalable with n_samples | Graph distance
3 | Mean-Shift | Bandwidth | Not scalable with n_samples | Distance between points
4 | Spectral Clustering | No. of clusters | Medium n_samples, small n_clusters | Graph distance
5 | Hierarchical Clustering | Distance threshold or no. of clusters | Large n_samples, large n_clusters | Distance between points
6 | DBSCAN | Size of neighborhood | Very large n_samples, medium n_clusters | Nearest point distance
7 | OPTICS | Minimum cluster membership | Very large n_samples, large n_clusters | Distance between points
8 | BIRCH | Threshold, branching factor | Large n_samples, large n_clusters | Euclidean distance between points
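To make the first row of the table concrete, the sketch below shows how K-Means uses its one parameter (the number of clusters) and its metric (the distance between points). It is a minimal pure-Python version of Lloyd's algorithm written for illustration, not scikit-learn's implementation. The function name kmeans, the n_iter argument, the choice of the first points as starting centroids, and the sample data are all assumptions made for this example.

```python
import math


def kmeans(points, n_clusters, n_iter=100):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Illustrative initialisation (an assumption of this sketch):
    # use the first n_clusters points as the starting centroids.
    centroids = [tuple(p) for p in points[:n_clusters]]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid,
        # measured by Euclidean distance between points.
        labels = [
            min(range(n_clusters), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(n_clusters):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return labels, centroids


# Two well-separated groups of 2-D points (made-up data).
points = [
    (0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
    (1.0, 0.0), (10.0, 11.0), (11.0, 10.0),
]
labels, centroids = kmeans(points, n_clusters=2)
print(labels)
print(centroids)
```

In scikit-learn the same parameter is exposed as n_clusters on sklearn.cluster.KMeans, which adds better initialisation and scales to much larger n_samples than this sketch.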
Linear Regression
Multiple Regression
Polynomial Regression
Decision Tree
Logistic Regression
K Nearest Neighbor
Naive Bayes
Random Forest
Support Vector Machines
Principal Component Analysis
Linear Discriminant Analysis
K Means Clustering
Hierarchical Clustering