Zhi-Zhuo Zhang
MOTTO:
“The meaning of life is to find
out what it ought to be!”
Research Interests:
My research interests are broad and mainly include artificial intelligence, statistical machine
learning, data mining, bioinformatics, information retrieval, nonlinear
embedding, and mathematical modeling. Recently I have mainly focused on the following problems:
“Non-Convex Optimization and Imbalance Learning”, “Efficient Semi-Supervised
Learning with Fenchel Duality”, and “Nowhere Differentiable Functions in
Learning”. I would be glad to hear from
anyone who has ideas on these topics and is willing to share them.
My thesis is
now available (New!)
《基于损失函数的不平衡分类问题的研究》
(A Study of Imbalance Classification Problem based on Loss Functions) (PDF 2.2M)
Data imbalance
is considered an important factor affecting the performance of classifiers.
Many meta-methods, such as re-sampling, classifier ensembles, and alternative
evaluation measures, have been tried to handle the imbalance problem. This paper
treats the machine learning problem as an optimization problem within the
Tikhonov regularization framework and discusses the effect of different loss
functions on the optimized solutions in the imbalanced setting. A case-by-case
discussion of three types of convex loss functions further shows that the
essential cause of the performance degradation of classifiers under imbalance
is the convex optimization itself. Moreover, “imbalance sensitive” and
“imbalance insensitive” loss functions are defined in this paper, together with
their sufficient conditions. However, an “imbalance insensitive” loss function
is non-convex, which turns the machine learning problem into a non-convex
optimization problem. Hence, methods for solving the non-convex problem, such as
stochastic gradient descent and semidefinite programming approximation, are
also analyzed in this paper. Finally, the paper generalizes the “imbalance
insensitive” theory to the multi-class and cost-sensitive cases and
discusses the issue of “sampling imbalance”.
Keywords: data imbalance, Tikhonov
regularization, non-convex
optimization, imbalance insensitive, loss function, semidefinite programming
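
For readers who want something concrete, the regularized objective mentioned in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the formulation used in the thesis: it assumes a linear model, the standard form "mean loss plus lam times the squared norm of w", and uses the hinge loss and the ramp loss purely as familiar examples of a convex and a bounded non-convex loss. Whether the ramp loss satisfies the thesis's sufficient condition for being “imbalance insensitive” is not stated in the abstract. All function names and the toy data are hypothetical.

```python
import numpy as np

def hinge_loss(margin):
    """Convex hinge loss max(0, 1 - m), where m = y * f(x) is the margin."""
    return np.maximum(0.0, 1.0 - margin)

def ramp_loss(margin):
    """Non-convex ramp loss: the hinge loss clipped at 2 (assumed example)."""
    return np.minimum(2.0, hinge_loss(margin))

def tikhonov_objective(w, X, y, loss, lam):
    """Regularized empirical risk: mean loss over samples plus lam * ||w||^2."""
    margins = y * (X @ w)
    return loss(margins).mean() + lam * np.dot(w, w)

# Hypothetical imbalanced toy data: 9 negatives, 1 positive,
# one feature plus a constant bias column.
features = [-3.0, -2.5, -2.0, -2.0, -1.5, -1.5, -1.0, -1.0, -0.5, 1.0]
X = np.array([[x, 1.0] for x in features])
y = np.array([-1.0] * 9 + [1.0])
w = np.array([1.0, 0.0])

print("hinge objective:", tikhonov_objective(w, X, y, hinge_loss, 0.1))
print("ramp objective: ", tikhonov_objective(w, X, y, ramp_loss, 0.1))
```

The two losses agree wherever the margin is at least -1 and differ only for badly misclassified points, where the ramp loss stops growing. That bounded penalty is what makes the objective non-convex.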