Standard Definition of Standard Error
Definition 1
For a given estimator $T$, the estimated standard deviation of $T$ is called the standard error. $$ \text{s.e.} \left( T \right) := \sqrt{ \widehat{ \operatorname{Var} \left( T \right) } } $$
Explanation
The definition deliberately speaks of an estimator, not merely a statistic, because the standard error is meaningless unless we are asking how well $T$ matches the parameter $\theta$ we want to estimate. That is why, even though $\theta$ never appears in the formula, the definition is purposely phrased in terms of an estimator. Natural candidates for $T$ are therefore the sample mean $\overline{X}$ or estimated regression coefficients $\widehat{\beta}_{k}$, and $\text{s.e.} \left( T \right)$ becomes necessary precisely because we want confidence intervals for them.
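As a concrete illustration (not part of the original text), here is a minimal Python sketch computing both kinds of standard error just mentioned, on simulated data. It assumes the `statsmodels` library, whose fitted-model attribute `bse` holds the standard errors of the estimated coefficients $\widehat{\beta}_{k}$; the model and data are purely hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)

# Simulate a simple linear model y = 1 + 2x + noise
n = 100
x = rng.normal(size=n)
y = 1 + 2 * x + rng.normal(size=n)

# Standard error of the sample mean: s.e.(X-bar) = S / sqrt(n)
S = y.std(ddof=1)                  # sample standard deviation
se_mean = S / np.sqrt(n)
print("s.e. of the sample mean:", se_mean)

# Approximate 95% confidence interval for the population mean
print("CI for the mean:", (y.mean() - 1.96 * se_mean, y.mean() + 1.96 * se_mean))

# Standard errors of the estimated regression coefficients
X = sm.add_constant(x)             # design matrix with intercept column
fit = sm.OLS(y, X).fit()
print("s.e. of beta-hat:", fit.bse)  # one standard error per coefficient
```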
Usually, from definitions like these, the standard error of the sample mean $\overline{X} = {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k}$, namely $S / \sqrt{n}$, is taught as if it were the only kind of standard error, so many come to believe it is the sole form, when in reality it is not even a definition but just a formula derived through calculation. Let us derive it step by step, omitting as little as possible. Here the $X_{k}$ are assumed iid, and the sample variance $S^{2}$ serves as the estimate $\widehat{ \operatorname{Var} \left( X_{k} \right) }$.
$$ \begin{align*} \text{s.e.} \left( \overline{X} \right) =& \sqrt{ \widehat{ \operatorname{Var} \left( \overline{X} \right) } } \\ =& \sqrt{ \widehat{ \operatorname{Var} \left( {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k} \right) } } \\ =& \sqrt{ \widehat{ {{ 1 } \over { n^{2} }} \operatorname{Var} \left( \sum_{k=1}^{n} X_{k} \right) } } \\ \overset{\text{iid}}{=} & \sqrt{ \widehat{ {{ 1 } \over { n^{2} }} \sum_{k=1}^{n} \operatorname{Var} \left( X_{k} \right) } } \\ =& \sqrt{ {{ 1 } \over { n^{2} }} \sum_{k=1}^{n} \widehat{ \operatorname{Var} \left( X_{k} \right) } } \\ =& \sqrt{ {{ 1 } \over { n^{2} }} \sum_{k=1}^{n} S^{2} } \\ =& \sqrt{ {{ 1 } \over { n^{2} }} n S^{2} } \\ =& \sqrt{ {{ 1 } \over { n }} S^{2} } \\ =& {{ 1 } \over { \sqrt{n} }} S \end{align*} $$
As you can see, the concepts of estimator and estimate differ, so even this simple example can be quite confusing. Moreover, since in many practical uses of the standard error the form that appears is a sample variance divided by the degrees of freedom with a square root taken, it is easy to misconstrue that form as the standard error itself. Such intuition is frequently correct, but the standard error is not defined by any such recipe; rather, it is derived mathematically, as shown above.
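To check the derived formula numerically (a sketch not from the original text), the following compares the hand-computed $S / \sqrt{n}$ against `scipy.stats.sem`, which computes the standard error of the mean with `ddof=1` by default, matching the sample standard deviation $S$ used above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=50)  # an iid sample

n = len(x)
S = x.std(ddof=1)           # sample standard deviation (divisor n - 1)
se_manual = S / np.sqrt(n)  # the derived formula s.e.(X-bar) = S / sqrt(n)

se_scipy = stats.sem(x)     # scipy uses ddof=1 by default, matching S

print(se_manual, se_scipy)  # the two values coincide
assert np.isclose(se_manual, se_scipy)
```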
Hadi. (2006). Regression Analysis by Example (4th Edition): p33. ↩︎