This time around on SD Times GitHub Project of the Week, we are shining the spotlight on SmileMiner, which stands for Statistical Machine Intelligence and Learning Engine. Created by Haifeng Li, a chief data scientist at ADP, SmileMiner is a comprehensive library and engine of advanced machine-learning algorithms.

“SMILE is self contained and requires only the standard Java library,” Li wrote on his blog. “With advanced data structures and learning algorithms, SMILE achieves the state of the art of performance.”

According to Li, the major components of SmileMiner include:

  • A core machine-learning library
  • Mathematical functions
  • Graph algorithms
  • One- and two-dimensional interpolation
  • A Swing-based data visualization library, which requires the SwingX library for JXTable

In addition, SmileMiner implements major machine-learning algorithms such as:

  • Classification, including support vector machines, decision trees, AdaBoost, gradient boosting, logistic regression and neural networks
  • Regression, including support vector regression, Gaussian processes, regression trees, gradient boosting, random forest, and ridge regression
  • Feature selection, including genetic algorithm-based feature selection, signal-to-noise ratio, and sum-of-squares ratio
  • Clustering, including deterministic annealing, growing neural gas, hierarchical clustering and self-organizing maps
  • Association rule and frequent item set mining, including the FP-Growth mining algorithm
  • Manifold learning, including Laplacian Eigenmap, PCA, kernel PCA, probabilistic PCA, and random projection
  • Nearest neighbor search, including BK-tree, cover tree, KD-tree and LSH
  • Sequence learning, including the hidden Markov model
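
To give a feel for one of the nearest neighbor structures named above, here is a minimal BK-tree sketch that uses only the standard Java library. It is an illustration under stated assumptions, not SmileMiner's API: the class name BKTreeSketch, its methods, and the choice of Levenshtein edit distance as the metric are all hypothetical and were picked for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative BK-tree over strings. Hypothetical sketch, not SmileMiner's API. */
final class BKTreeSketch {
    /** A tree node; children are keyed by their edit distance to this node's word. */
    private static final class Node {
        final String word;
        final Map<Integer, Node> children = new HashMap<>();
        Node(String word) { this.word = word; }
    }

    private Node root;

    /** Levenshtein distance computed with two rolling rows of the DP table. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // newest row is now prev
        }
        return prev[b.length()];
    }

    /** Inserts a word by walking down the edges labeled with its distance. */
    void add(String word) {
        if (root == null) { root = new Node(word); return; }
        Node node = root;
        while (true) {
            int d = editDistance(word, node.word);
            if (d == 0) return; // word is already stored
            Node child = node.children.get(d);
            if (child == null) { node.children.put(d, new Node(word)); return; }
            node = child;
        }
    }

    /** Returns every stored word within the given edit distance of the query. */
    List<String> search(String query, int radius) {
        List<String> hits = new ArrayList<>();
        if (root != null) search(root, query, radius, hits);
        return hits;
    }

    private static void search(Node node, String query, int radius, List<String> hits) {
        int d = editDistance(query, node.word);
        if (d <= radius) hits.add(node.word);
        // By the triangle inequality, matches can only sit under edges
        // whose label lies in [d - radius, d + radius].
        for (Map.Entry<Integer, Node> e : node.children.entrySet()) {
            int label = e.getKey();
            if (label >= d - radius && label <= d + radius) {
                search(e.getValue(), query, radius, hits);
            }
        }
    }
}
```

After adding a handful of words with add, calling search with a query and a small radius returns the stored words within that edit distance while skipping subtrees the triangle inequality rules out. A production library would likely generalize the metric and element type, which this sketch does not attempt.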

More information is available here.