First Principal Component
Forrest Young's Notes
Copyright © 1999 by Forrest W. Young.

The First Principal Component

The first principal component is the linear
function of the set of variables that fits the variables as well
as possible in a least squares sense. This linear combination has
the following properties:

The first principal component line is as close as possible, in a specific
average least squares sense, to all of the points.

The first principal component line identifies the "central tendency"
of the set of variables, just as the mean identifies the "central tendency"
of a single variable.

The first principal component line provides a simplified description,
a model, of the set of variables.

The first principal component line gives us a way to summarize the
set of variables by a single linear combination.
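The properties above can be checked numerically. The following is a minimal sketch, not the author's own procedure: it assumes the first principal component is obtained from the singular value decomposition of the mean-centered data, and it uses made-up data. The function names first_principal_component and squared_distance_to_line are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def first_principal_component(Y):
    """Return (mean, direction, scores) for the first principal component.

    Y is an (observations x variables) array. The direction is a unit
    vector; the scores are the single linear combination that summarizes
    the variables.
    """
    mean = Y.mean(axis=0)
    centered = Y - mean
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    scores = centered @ direction
    return mean, direction, scores

def squared_distance_to_line(Y, mean, direction):
    """Sum of squared perpendicular distances from the rows of Y to the
    line that passes through `mean` along `direction`."""
    centered = Y - mean
    unit = direction / np.linalg.norm(direction)
    projections = np.outer(centered @ unit, unit)
    return float(np.sum((centered - projections) ** 2))

# Hypothetical data: three variables driven by one latent factor plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
Y = np.column_stack([
    2.0 * latent + rng.normal(scale=0.5, size=200),
    -1.0 * latent + rng.normal(scale=0.5, size=200),
    0.5 * latent + rng.normal(scale=0.5, size=200),
])

mean, direction, scores = first_principal_component(Y)
pc_error = squared_distance_to_line(Y, mean, direction)

# Lines through the same mean but along random directions, for comparison.
other_errors = [
    squared_distance_to_line(Y, mean, rng.normal(size=3)) for _ in range(100)
]

print("first principal component line:", pc_error)
print("best of 100 random lines:      ", min(other_errors))
```

Under these assumptions, no line through the mean has a smaller sum of squared perpendicular distances than the first principal component line, which is the sense in which it is "as close as possible" to all of the points.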

Equation for the First Principal Component

The n variables, denoted $Y_1, Y_2, \ldots, Y_n$, are described by the
following linear equation, where $X_1$ is the vector of scores on the
first principal component, and $b_1$ is the coefficient of the first
principal component:

$$Y_1, Y_2, \ldots, Y_n = a + b_1 X_1 + r$$

where $r$ is the "residual" information in the Y's not fit by the
component's linear combination.
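The decomposition into a constant, a fitted part, and a residual can be illustrated with a short sketch. It rests on assumptions that the notes do not state explicitly: a is taken to be the vector of variable means, b1 is taken to hold one coefficient per variable (the unit-length first principal direction), and the data are made up for illustration.

```python
import numpy as np

# Hypothetical data: two variables driven by one latent factor plus noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=100)
Y = np.column_stack([
    3.0 + 1.5 * latent + rng.normal(scale=0.3, size=100),
    -1.0 + 0.8 * latent + rng.normal(scale=0.3, size=100),
])

# Assumption: a is the vector of variable means (one constant per variable).
a = Y.mean(axis=0)

# Assumption: b1 is the first right singular vector of the centered data,
# giving one coefficient per variable.
_, s, vt = np.linalg.svd(Y - a, full_matrices=False)
b1 = vt[0]

# X1: scores of each observation on the first principal component.
X1 = (Y - a) @ b1

# Fitted part a + b1 X1, and the residual r that the component does not fit.
fitted = a + np.outer(X1, b1)
r = Y - fitted

# Share of the total sum of squares (about the means) captured by the component.
explained = s[0] ** 2 / np.sum(s ** 2)
print("proportion of variation fit by the component:", explained)
print("residual sum of squares:", np.sum(r ** 2))
```

In this sketch the residual is uncorrelated with the scores X1, and its sum of squares equals the sum of the squared singular values beyond the first, so the residual is exactly the information left over for later components.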
