An example of what such code might look like (for 2 classes) is as follows:
    (define (load-learn-and-SVMlite data-directory category1 category2)
      (fluid-let ((*required-words* 50)              ; min required words for a doc
                  (*data-directory* data-directory)  ; directory from which to load data
                  (*documents-considered* 500)       ; how many documents to load
                  (*dictionary-size* 100))           ; default dictionary size
        (begin
          (load (string-append *data-directory* "allfiles.scm")) ;; defines countfiles
          (let*
              ;; load up the raw datasets from file
              ((category1-data (load-dataset category1 countfiles))
               (category2-data (load-dataset category2 countfiles))
               ;; split up our documents into a training set and testing set
               ;; (quotient keeps the split index an integer for odd-length lists)
               (c1-train (sublist category1-data 0
                                  (quotient (length category1-data) 2)))
               (c1-test (sublist category1-data
                                 (quotient (length category1-data) 2)
                                 (length category1-data)))
               (c2-train (sublist category2-data 0
                                  (quotient (length category2-data) 2)))
               (c2-test (sublist category2-data
                                 (quotient (length category2-data) 2)
                                 (length category2-data)))
               (train-docs (append c1-train c2-train))
               (test-docs (append c1-test c2-test))
               ;; turn documents into feature vectors
               (dictionary (make-dictionary train-docs))
               (train-set (data-set-from-docs train-docs))
               (test-set (data-set-from-docs test-docs))
               (train-filename (string-append "train.svm-mc.w"
                                              (number->string *dictionary-size*)
                                              ".in." category1 "." category2))
               (test-filename (string-append "test.svm-mc.w"
                                             (number->string *dictionary-size*)
                                             ".in." category1 "." category2)))
            (display* "writing : " train-filename)
            (write-svm-multiclass-file train-filename train-set)
            (display* "writing : " test-filename)
            (write-svm-multiclass-file test-filename test-set)))))
    (load-learn-and-SVMlite "../Data/" "basketball" "dance")

You will have to manually delete any previous output files before invoking the code again, because it refuses to overwrite existing files.
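Since the output files must be removed by hand before each rerun, a small cleanup helper can save some typing. The following is only a sketch: svm-filename, delete-if-exists and clear-svm-files are hypothetical names that are not part of the course code. It assumes the files are written to the current working directory and follow the naming pattern built by the string-append calls above.

```scheme
;; Hypothetical helpers, not part of the course code.

;; Rebuild an output file name the same way load-learn-and-SVMlite does.
(define (svm-filename prefix dictionary-size category1 category2)
  (string-append prefix ".svm-mc.w" (number->string dictionary-size)
                 ".in." category1 "." category2))

;; Delete a file only if it is actually there.
(define (delete-if-exists filename)
  (if (file-exists? filename)
      (delete-file filename)))

;; Remove both the training and the testing file for a category pair.
(define (clear-svm-files dictionary-size category1 category2)
  (for-each (lambda (prefix)
              (delete-if-exists
               (svm-filename prefix dictionary-size category1 category2)))
            '("train" "test")))

;; (svm-filename "train" 100 "basketball" "dance")
;; => "train.svm-mc.w100.in.basketball.dance"
(clear-svm-files 100 "basketball" "dance")
```

Calling clear-svm-files before load-learn-and-SVMlite with the same dictionary size and categories should leave nothing for the writer to refuse to overwrite.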
Learning options:
    -c float    - C: trade-off between training error and margin
                  (default [avg. x*x]^-1)
Kernel options:
    -t int      - type of kernel function:
                    0: linear (default)
                    1: polynomial (s a*b+c)^d
                    2: radial basis function exp(-gamma ||a-b||^2)
    -d int      - parameter d in polynomial kernel
    -g float    - parameter gamma in rbf kernel
    -s float    - parameter s in sigmoid/poly kernel
    -r float    - parameter c in sigmoid/poly kernel
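To make the kernel options concrete, here is a minimal sketch of the three kernel formulas listed above, written for dense vectors represented as plain lists of numbers. The procedure names are hypothetical and the code is for illustration only; the SVM tool computes its kernels internally on its own data structures.

```scheme
;; Illustrative only: dense vectors as lists of numbers.
(define (dot a b)
  (apply + (map * a b)))

;; -t 0: linear kernel, a*b
(define (linear-kernel a b)
  (dot a b))

;; -t 1: polynomial kernel, (s a*b + c)^d
;; s comes from -s, c from -r, d from -d
(define (polynomial-kernel a b s c d)
  (expt (+ (* s (dot a b)) c) d))

;; -t 2: radial basis function, exp(-gamma ||a-b||^2)
;; gamma comes from -g
(define (rbf-kernel a b gamma)
  (let ((diff (map - a b)))
    (exp (- (* gamma (dot diff diff))))))

(linear-kernel '(1 2) '(3 4))            ; 1*3 + 2*4 = 11
(polynomial-kernel '(1 2) '(3 4) 1 1 3)  ; (1*11 + 1)^3 = 1728
(rbf-kernel '(1 2) '(1 2) 0.5)           ; identical vectors give 1
```

With -t 1 -d 3, the d argument above is 3, which is what makes the kernel cubic.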
For example:

    ../../Svm/Bin/svm_multiclass_learn -c .01 -t 1 -d 3 train.svm-mc.w100.in.basketball.dance

trains a multiclass SVM with a cubic kernel, with C = 0.01.
Then run the classifier on the test file, using the model file produced by training:

    ../../Svm/Bin/svm_multiclass_classify test.svm-mc.w100.in.basketball.dance svm_struct_model

and voila! You will see the performance of your SVM. For example:
    Reading model... (380 support vectors read) done.
    Reading test examples..Scanning examples...done
    Reading examples into memory...100..200..300..400..500..OK. (500 examples read)
     (500 examples) done.
    Classifying test examples..99..199..299..399..499..done
    Runtime (without IO) in cpu-seconds: 0.40
    Average loss on test set: 0.0620
    Zero/one-error on test set: 6.20% (469 correct, 31 incorrect, 500 total)
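The final line of the report is plain arithmetic on the counts: 31 incorrect out of 500 total is 6.20%. The reported average loss of 0.0620 is the same fraction, which is what one would expect if each misclassified example counts as a loss of 1. A quick check, using a hypothetical helper that is not part of the course code or of the SVM tool:

```scheme
;; Hypothetical helper: percentage of misclassified examples.
(define (zero-one-error-percent incorrect total)
  (/ (* 100. incorrect) total))

(zero-one-error-percent 31 500)   ; => 6.2
(+ 469 31)                        ; => 500, correct + incorrect = total
```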