It's pretty well known that there are issues with initialization, with non-stationarity of the resulting estimates, and so on. For this question, though, one can assume that the algorithm takes care of all of these. So my question is: can the resulting parameter estimates be thought of as the ones that minimize the one-step-ahead forecast error of the respective ARIMA model? Mathematically, any such algorithm is maximizing a likelihood, but that likelihood is a function of the estimated residuals, so it still feels like one is picking out the best one-step-ahead forecast. Thanks for any insights or references. This question came to mind recently when I realized that I'm not really interested in the one-step-ahead forecast.
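
To make the intuition concrete, here is a minimal sketch for the simplest case I could think of: a zero-mean AR(1), with the likelihood taken conditional on the first observation and the innovation variance concentrated out. Everything in it (the simulated series, the grid of candidate values, and the helper names `one_step_sse` and `conditional_loglik`) is my own illustration, not what any particular ARIMA routine does internally. In this restricted setting the conditional Gaussian log-likelihood is a decreasing function of the sum of squared one-step-ahead errors, so both criteria pick the same parameter. An exact likelihood also carries a term in the log of the one-step forecast-error variances, so the equivalence should not be expected to hold exactly in general.

```python
import numpy as np

def one_step_sse(y, phi):
    """Sum of squared one-step-ahead forecast errors for a zero-mean AR(1)."""
    # The one-step forecast of y[t] given y[t-1] is phi * y[t-1].
    resid = y[1:] - phi * y[:-1]
    return float(np.sum(resid ** 2))

def conditional_loglik(y, phi):
    """Gaussian log-likelihood conditional on y[0], sigma^2 concentrated out."""
    n = len(y) - 1
    sigma2_hat = one_step_sse(y, phi) / n  # variance estimate for this phi
    return -0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1.0)

# Simulated data (assumption: true coefficient 0.6, unit-variance innovations).
rng = np.random.default_rng(0)
true_phi, n_obs = 0.6, 500
y = np.zeros(n_obs)
for t in range(1, n_obs):
    y[t] = true_phi * y[t - 1] + rng.standard_normal()

# Evaluate both criteria on the same grid of candidate coefficients.
grid = np.linspace(-0.95, 0.95, 381)
sse = np.array([one_step_sse(y, p) for p in grid])
ll = np.array([conditional_loglik(y, p) for p in grid])

phi_css = grid[np.argmin(sse)]  # minimizes one-step-ahead squared error
phi_mle = grid[np.argmax(ll)]   # maximizes the conditional likelihood
print(phi_css, phi_mle)
```

The two grid points coincide here only because of the conditioning on `y[0]` and the constant innovation variance; with MA terms or an exact likelihood, the forecast-error variances change over time and enter the objective separately.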
