An example of what such code might look like (for 2 classes) is as follows:
(define (load-learn-and-SVMlite data-directory category1 category2)
  (fluid-let ((*required-words* 50)              ; min required words for a doc
              (*data-directory* data-directory)  ; directory from which to load data
              (*documents-considered* 500)       ; how many documents to load
              (*dictionary-size* 100))           ; default dictionary size
    (begin
      (load (string-append *data-directory* "allfiles.scm")) ;; defines countfiles
      (let*
          ;; load up the raw datasets from file
          ((category1-data (load-dataset category1 countfiles))
           (category2-data (load-dataset category2 countfiles))
           ;; split up our documents into a training set and testing set
           ;; (quotient keeps the split index an integer when the length is odd)
           (c1-train (sublist category1-data 0 (quotient (length category1-data) 2)))
           (c1-test (sublist category1-data (quotient (length category1-data) 2) (length category1-data)))
           (c2-train (sublist category2-data 0 (quotient (length category2-data) 2)))
           (c2-test (sublist category2-data (quotient (length category2-data) 2) (length category2-data)))
           (train-docs (append c1-train c2-train))
           (test-docs (append c1-test c2-test))
           ;; turn documents into feature vectors
           (dictionary (make-dictionary train-docs))
           (train-set (data-set-from-docs train-docs))
           (test-set (data-set-from-docs test-docs))
           (train-filename (string-append "train.svm-mc.w" (number->string *dictionary-size*) ".in." category1 "." category2))
           (test-filename (string-append "test.svm-mc.w" (number->string *dictionary-size*) ".in." category1 "." category2)))
        (display* "writing : " train-filename)
        (write-svm-multiclass-file train-filename train-set)
        (display* "writing : " test-filename)
        (write-svm-multiclass-file test-filename test-set)))))

(load-learn-and-SVMlite "../Data/" "basketball" "dance")

You will have to manually delete the previous output files before invoking the code again, because it refuses to automatically clobber files.
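Since the writer refuses to overwrite existing files, a small shell helper can clear the previous outputs before each run. This is only a sketch: the function name clean_svm_inputs is made up for illustration, and it assumes the files follow the train.svm-mc.w<size>.in.<category1>.<category2> naming built by the code above and sit in the current directory.

```shell
# Hypothetical helper (not part of the course code): remove the train/test
# files that a previous run of load-learn-and-SVMlite would have written.
# Assumes the naming scheme used in the Scheme code above and that the
# files are in the current working directory.
clean_svm_inputs() {
    size="$1"   # dictionary size, e.g. 100
    cat1="$2"   # first category name, e.g. basketball
    cat2="$3"   # second category name, e.g. dance
    # -f: stay quiet if a file is already absent
    rm -f "train.svm-mc.w${size}.in.${cat1}.${cat2}" \
          "test.svm-mc.w${size}.in.${cat1}.${cat2}"
}
```

For the invocation above, this would be called as: clean_svm_inputs 100 basketball dance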
The relevant options to svm_multiclass_learn are:

Learning options:
    -c float    - C: trade-off between training error
                  and margin (default [avg. x*x]^-1)
Kernel options:
    -t int      - type of kernel function:
                  0: linear (default)
                  1: polynomial (s a*b+c)^d
                  2: radial basis function exp(-gamma ||a-b||^2)
    -d int      - parameter d in polynomial kernel
    -g float    - parameter gamma in rbf kernel
    -s float    - parameter s in sigmoid/poly kernel
    -r float    - parameter c in sigmoid/poly kernel
For example:

    ../../Svm/Bin/svm_multiclass_learn -c .01 -t 1 -d 3 train.svm-mc.w100.in.basketball.dance

trains a multiclass SVM with a cubic kernel, with C = 0.01. Then run the classifier on the test file, using the model the learner just wrote:

    ../../Svm/Bin/svm_multiclass_classify test.svm-mc.w100.in.basketball.dance svm_struct_model

and voila! You will see the performance of your SVM. For example:
    Reading model... (380 support vectors read) done.
    Reading test examples..Scanning examples...done
    Reading examples into memory...100..200..300..400..500..OK. (500 examples read)
    (500 examples) done.
    Classifying test examples..99..199..299..399..499..done
    Runtime (without IO) in cpu-seconds: 0.40
    Average loss on test set: 0.0620
    Zero/one-error on test set: 6.20% (469 correct, 31 incorrect, 500 total)
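To compare runs (say, different kernels or values of C), it can help to pull just the error rate out of this output. A minimal sketch, assuming the classifier prints its summary to standard output in the format shown above; the function name zero_one_error is made up for illustration.

```shell
# Hypothetical helper (not part of SVM-multiclass): read classifier output
# on stdin and print only the zero/one error percentage, without the % sign.
# Assumes a summary of the form shown above:
#   Zero/one-error on test set: 6.20% (469 correct, 31 incorrect, 500 total)
zero_one_error() {
    # -n with the p flag prints only lines where the substitution matched
    sed -n 's/.*Zero\/one-error on test set: \([0-9.]*\)%.*/\1/p'
}
```

It would be used by piping the classifier into it:

    ../../Svm/Bin/svm_multiclass_classify test.svm-mc.w100.in.basketball.dance svm_struct_model | zero_one_error

On the sample output above it prints 6.20, which agrees with 31 incorrect out of 500.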