Linear and kernel methods are important machine learning techniques for data classification. Popular examples include support vector machines (SVM) and logistic regression. We begin with an introduction on this subject by deriving their optimization problems through different aspects. This discussion is useful because many people are confused about the relationships between, for example, SVM and logistic regression. We then move to investigate techniques for solving optimization problems for linear and kernel classification. In particular, we show details of two representative settings: coordinate descent methods and Newton methods. Recently, extending these optimization techniques to handle big data in either multi-core or distributed environments is a very important research direction. We present some promising results and discuss future challenges.