In this chapter we describe a class of regression techniques that achieve flexibility in estimating the regression function f(X) over the domain ℝ^p by fitting a different but simple model separately at each query point x0. This is done by using only those observations close to the target point x0 to fit the simple model, and in such a way that the resulting estimated function f̂(X) is smooth in ℝ^p. This localization is achieved via a weighting function or kernel Kλ(x0, xi), which assigns a weight to xi based on its distance from x0. The kernels Kλ are typically indexed by a parameter λ that dictates the width of the neighborhood. These memory-based methods require in principle little or no training; all the work gets done at evaluation time. The only parameter that needs to be determined from the training data is λ. The model, however, is the entire training data set.

We also discuss more general classes of kernel-based techniques, which tie in with structured methods in other chapters, and are useful for density estimation and classification.

The techniques in this chapter should not be confused with those associated with the more recent usage of the phrase “kernel methods”. In this chapter kernels are mostly used as a device for localization. We discuss kernel methods in Sections 5.8, 14.5.4, 18.5 and Chapter 12; in those contexts the kernel computes an inner product in a high-dimensional (implicit) feature space, and is used for regularized nonlinear modeling. We make some connections to the methodology in this chapter at the end of Section 6.7.
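The localization idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the text: it uses the Epanechnikov kernel and a kernel-weighted average (the Nadaraya–Watson estimator) as one concrete choice of Kλ and of the simple local model fitted at each query point x0; the function names and the toy data are ours.

```python
import numpy as np

def epanechnikov(t):
    # Compactly supported kernel: zero for |t| > 1, so only points
    # within distance lambda of x0 receive positive weight.
    return np.where(np.abs(t) <= 1, 0.75 * (1 - t**2), 0.0)

def kernel_smooth(x0, x, y, lam):
    # Nadaraya-Watson estimate at the query point x0: a weighted
    # average of the y_i, with weights K_lambda(x0, x_i) that decay
    # with the distance |x_i - x0| / lambda.
    w = epanechnikov(np.abs(x - x0) / lam)
    if w.sum() == 0.0:
        return np.nan  # no training points fall in the neighborhood
    return np.dot(w, y) / w.sum()

# Toy data: a noisy sine curve. Note that "fitting" stores nothing;
# the entire training set (plus lambda) is the model, and all the
# work happens at evaluation time, once per query point.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 100))
y = np.sin(4 * x) + rng.normal(0.0, 0.3, 100)
grid = np.linspace(0.0, 1.0, 50)
fhat = np.array([kernel_smooth(x0, x, y, lam=0.2) for x0 in grid])
```

Because the kernel weights vary continuously with x0, the resulting f̂ is smooth, unlike the discontinuous fit produced by a hard k-nearest-neighbor average; shrinking λ narrows the neighborhood and makes the fit wigglier.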