Python Machine Learning Packages

This is a brief overview of Python machine learning toolkits, as of June 7, 2008. I was looking for something like Weka for Python. I settled on Orange because it seemed to have the largest feature set and was the only one with a GUI. I’ve used it for about a week and it seems pretty nice, although I haven’t tried out the GUI yet.

| Package | Last release | # of classifiers | Clustering? | Cross-validation? | GUI? | Native to Python? | Sparse data sets? | Integrates with Matplotlib? | Notes |
|---|---|---|---|---|---|---|---|---|---|
| Orange | 05/2008 | 10+ (rules, SVM, clustering, trees) | Yes | Yes | Yes | Wraps C++, but designed for Python | Yes | No | |
| PyML | 05/2008 | 3 | No | Yes | No | Native | Yes | Yes | |
| Shogun | 05/2008 | 5 (with SVM craziness) | No | Yes | No | Wraps C++ | Yes | Yes | Long page of citations. Interfaces to R, Octave, and Matlab as well as Python. |
| MDP | 05/2008 | 10+ nodes, some of which are classifiers | No | No | No | Native | No | No | More complicated than just a classifier suite. Users construct networks of operations, each node of which is a classifier or something else. |
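
To make the MDP note about "networks of operations" a bit more concrete, here is a minimal sketch of the general idea in plain Python. It is not MDP's actual API: the class names `CenterNode`, `NearestMeanNode`, and `Flow`, and the `train`/`execute` methods, are hypothetical names chosen for illustration. It only shows how a preprocessing node and a classifier node might be chained so that each node is trained on the output of the ones before it.

```python
# Hypothetical sketch of a "network of operations"; these class names are
# illustrative assumptions and are NOT MDP's actual API.

class CenterNode:
    """Preprocessing node: subtracts the per-feature mean learned in training."""

    def train(self, rows, labels=None):
        n = len(rows)
        self.means = [sum(col) / n for col in zip(*rows)]

    def execute(self, rows):
        return [[x - m for x, m in zip(row, self.means)] for row in rows]


class NearestMeanNode:
    """Classifier node: gives each row the label of the closest class mean."""

    def train(self, rows, labels):
        grouped = {}
        for row, label in zip(rows, labels):
            grouped.setdefault(label, []).append(row)
        self.centroids = {
            label: [sum(col) / len(members) for col in zip(*members)]
            for label, members in grouped.items()
        }

    def execute(self, rows):
        def sq_dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))

        return [
            min(self.centroids, key=lambda lab: sq_dist(row, self.centroids[lab]))
            for row in rows
        ]


class Flow:
    """Chains nodes: each node is trained on the output of the nodes before it."""

    def __init__(self, nodes):
        self.nodes = nodes

    def train(self, rows, labels):
        for node in self.nodes:
            node.train(rows, labels)
            rows = node.execute(rows)

    def execute(self, rows):
        for node in self.nodes:
            rows = node.execute(rows)
        return rows


# Toy data, made up for this example.
train_rows = [[1.0, 1.0], [1.5, 2.0], [8.0, 9.0], [9.0, 8.5]]
train_labels = ["low", "low", "high", "high"]

flow = Flow([CenterNode(), NearestMeanNode()])
flow.train(train_rows, train_labels)
predictions = flow.execute([[1.2, 1.4], [8.5, 9.1]])
print(predictions)
```

The point of this kind of design is that a classifier is just one kind of node, so the same chaining mechanism handles preprocessing, feature extraction, and classification. A real toolkit would presumably work on arrays rather than Python lists.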