
The challenge is that, at the time the forecast is generated, transactions from the second half of June are not yet available, creating a gap between the latest observed data and the beginning of the forecast horizon.

This is not a problem at all: if you are asked to provide your July forecast on June 15, simply forecast out from your history ending June 15 beyond the end of July (see below). Sure, it would be nice to have more history, but we can't always get what we want.
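To make the "just forecast across the gap" idea concrete, here is a minimal sketch. The data are made up, and `seasonal_naive` is only a placeholder I am assuming for illustration; swap in whatever model you actually use. The point is only the horizon arithmetic: forecast every day after the cutoff, then keep the days you are asked about.

```python
from datetime import date, timedelta

def seasonal_naive(history, horizon, season=7):
    """Placeholder forecaster (an assumption): repeat the last observed week."""
    last = history[-season:]
    return [last[i % season] for i in range(horizon)]

cutoff = date(2026, 6, 15)            # last day with observed data
target_start = date(2026, 7, 1)       # first day we actually need
target_end = date(2026, 7, 31)        # use date(2026, 8, 2) to cover the full last ISO week

history = [10, 12, 11, 13, 15, 4, 2] * 8   # made-up daily quantities ending on the cutoff

# Forecast every day after the cutoff up to the end of the target period ...
horizon = (target_end - cutoff).days
fc = seasonal_naive(history, horizon)
fc_dates = [cutoff + timedelta(days=h) for h in range(1, horizon + 1)]

# ... and then simply discard the gap (June 16 to June 30).
july_fc = [(d, f) for d, f in zip(fc_dates, fc) if d >= target_start]
```

The forecasts for the second half of June are still computed, they are just not reported. Forecast uncertainty will be larger than it would be with a June 30 cutoff, because every July day is further from the last observation.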

The data I have consists of purchase order transactions at the SKU level, including timestamp (date and time) and quantity consumed.

Note that purchase orders are not the same as raw demand. If SKU A is out of stock, people may want (demand) it, but they can't buy it. So they might substitute SKU B, inflating POs there above raw demand for SKU B. Or demand for SKU A may build up, and once A is available again, POs for A may reflect unsatisfied demand from the stockout period. This may or may not be an issue.

To your main question: I am not completely clear on what it is. You write about weekly and monthly forecasting. I will proceed under the assumption that you want a total forecast for July, but split up by (partial) ISO week, so for July 2026, you want five buckets:

- July 1-5
- the full week starting July 6
- the full week starting July 13
- the full week starting July 20
- and July 27-31

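So you will usually have five buckets, sometimes six (for instance, a 31-day month that starts on a Saturday or Sunday, such as August 2026), and four only when a non-leap-year February starts on a Monday.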
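If you want to enumerate the buckets for an arbitrary month rather than work them out by hand, something like the following sketch would do. The function name `month_buckets` is my own invention; it only relies on the standard library's ISO calendar support.

```python
from datetime import date, timedelta
from itertools import groupby

def month_buckets(year, month):
    """Split a calendar month into (first_day, last_day) buckets, one per ISO week."""
    first = date(year, month, 1)
    next_first = date(year + (month == 12), month % 12 + 1, 1)
    days = [first + timedelta(days=i) for i in range((next_first - first).days)]
    buckets = []
    # isocalendar()[:2] is (ISO year, ISO week number); days of one week are consecutive
    for _, group in groupby(days, key=lambda d: d.isocalendar()[:2]):
        group = list(group)
        buckets.append((group[0], group[-1]))
    return buckets

for start, end in month_buckets(2026, 7):
    print(start, end, (end - start).days + 1, "days")
```

Running this for July 2026 reproduces the buckets listed above, and looping over other months is a quick way to check how the bucket count varies.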

This is a textbook case of hierarchical forecasting, where past data obey a natural sum constraint (daily POs add up to weekly and to monthly totals). We may want our forecasts to obey the same constraint, or we may at least want to leverage it. Take a look at https://otexts.com/fpp3/hierarchical.html.

There are two simple approaches and one more complex approach here.

- Bottom up: forecast on the daily level, then aggregate to (partial) weeks. Very simple, and importantly, easy to explain. You can directly model dynamics that happen on the daily level, like day-of-week patterns or promotions (which often do not coincide with ISO weeks).
- Top down: forecast on the weekly level for all weeks that touch July, then cut down the partial weeks. Since some of your partial weeks cover the beginning and others the end of the week (and there are usually day-of-week patterns), I would not simply take 5/7 of the forecast for CW 27 (June 29 to July 5) to be the forecast for "the July part of CW 27". Rather, I would in parallel run a daily forecast and then prorate the weekly forecast by the daily forecasts for the relevant days. This is "disaggregation by forecasted proportions" in hierarchical forecasting parlance.

Either one can be more accurate. It will depend on how strong your daily dynamics are, perhaps also on whether your customers work more on daily or weekly granularity. I personally would have a slight preference for the bottom-up approach, but that is because in my world we always need the daily forecasts in the first place, and I usually work with retail series with a lot of promotions, which, as above, are not aligned with ISO weeks.
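To illustrate the difference between the two simple approaches for a single partial week, here is a small sketch for CW 27 of 2026 (June 29 to July 5). All forecast numbers are made up, and the variable names are mine; in practice `daily_fc` and `weekly_fc` would come from two separately fitted models.

```python
from datetime import date, timedelta

week_start = date(2026, 6, 29)                        # Monday of CW 27
daily_fc = [12.0, 11.0, 10.0, 10.0, 13.0, 4.0, 2.0]   # Mon..Sun, from a daily model (made up)
weekly_fc = 60.0                                      # from a separate weekly model (made up)

days = [week_start + timedelta(days=i) for i in range(7)]
in_july = [d.month == 7 for d in days]

# Bottom up: sum the daily forecasts that fall in July.
bottom_up = sum(f for f, keep in zip(daily_fc, in_july) if keep)

# Top down with forecasted proportions: the weekly forecast, split according
# to the share of the daily forecasts that falls in July.
july_share = bottom_up / sum(daily_fc)
top_down = weekly_fc * july_share

# For comparison: naive proration by the number of calendar days in July.
naive = weekly_fc * sum(in_july) / 7

print(bottom_up, round(top_down, 2), round(naive, 2))
```

With a weekend dip like the one assumed here, the naive calendar split allocates noticeably more to the July part than the proportion-based split does, which is exactly why I would not prorate by day counts alone.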

Finally, here is a third approach:

Optimal reconciliation (https://otexts.com/fpp3/reconciliation.html). This is currently the state of the art for forecasting smallish hierarchies (it gets numerically intractable with complex and large ones), and has regularly been found to improve forecasts on all levels involved. The downside is that it is a bit harder to explain if someone wants to drill more deeply into just why your forecast is the way it is.

The basic idea is that you forecast on all levels separately. The resulting forecasts will not be sum-consistent ("coherent"). So in a second step, we reconcile them. And this second step leverages information from all forecasts.

In your case, I see two possibilities:
- The simpler approach is to reconcile weekly and daily forecasts. You would forecast as above and use a two-level hierarchy with no single top node. The summation matrix $S$ contains an identity matrix (for the bottom "daily" level) and one block of five rows, one per week, each of these five rows containing seven $1$s and the rest $0$s. So $S$ would have 40 rows (35 for the days, 5 for the weeks) and 35 columns for the 35 days involved (June 29 to August 2). After reconciliation, you would again collect the (now reconciled) bottom level forecasts into (possibly partial) weeks as above.
- The more complex approach would be to leverage monthly forecasts in addition. So in addition to the weekly and daily forecasts (covering all days from June 29 to August 2), you would also calculate a monthly forecast for July. The summation matrix $S$ would look like the previous one but have one more row for the monthly forecast, which would start with two $0$s corresponding to the daily forecasts for June 29 & 30, which do not enter into the total for July, then 31 $1$s for the days in July, finally another two $0$s for the August 1 & 2 forecasts.
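Here is a sketch of the more complex variant, so you can see what $S$ looks like in practice. Two assumptions to flag: all base forecasts are made-up numbers, and I use plain OLS reconciliation, $\tilde{y} = S(S^\top S)^{-1}S^\top \hat{y}$, which is the simplest special case. The fpp3 chapter linked above discusses more refined weightings such as MinT. The row order (month, weeks, days) is my choice; any order works as long as the base forecast vector is stacked the same way.

```python
import numpy as np

n_days, n_weeks = 35, 5          # June 29 to August 2, 2026: five full ISO weeks

# Summation matrix: 1 monthly row, 5 weekly rows, 35 daily (identity) rows.
july_row = np.concatenate([np.zeros(2), np.ones(31), np.zeros(2)])[None, :]
week_rows = np.kron(np.eye(n_weeks), np.ones((1, 7)))
S = np.vstack([july_row, week_rows, np.eye(n_days)])       # shape (41, 35)

# Made-up base forecasts from three separately fitted models.
day_fc = np.tile([12.0, 11.0, 10.0, 10.0, 13.0, 4.0, 2.0], n_weeks)
week_fc = np.array([60.0, 65.0, 58.0, 63.0, 61.0])
month_fc = 270.0
y_hat = np.concatenate([[month_fc], week_fc, day_fc])      # same order as the rows of S

# OLS reconciliation: least-squares projection of y_hat onto the column space of S.
beta, *_ = np.linalg.lstsq(S, y_hat, rcond=None)           # reconciled daily forecasts
y_tilde = S @ beta                                         # coherent forecasts on all levels

# Collect the reconciled daily forecasts into the five (partial) July buckets.
bounds = [(2, 7), (7, 14), (14, 21), (21, 28), (28, 33)]   # July 1-5, ..., July 27-31
july_buckets = [beta[a:b].sum() for a, b in bounds]
```

The five bucket forecasts now add up to the reconciled July total `y_tilde[0]`, and the reconciled weekly forecasts equal the sums of the reconciled days. For the simpler weekly/daily variant, drop `july_row` and `month_fc`. Note that OLS reconciliation does not guarantee non-negative forecasts, so check for that with intermittent SKUs.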

My preference would depend on how important it is to explain the final forecasts to someone, and how comfortable that someone might be with the linear algebra involved in the reconciliation approach.

In any case, of course you could (and really should) prototype all possible approaches and test them on holdout data.
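A holdout test should mimic your actual situation, i.e., cut the history off mid-month and score the forecast for the following month. Below is a rough sketch of such a rolling-origin backtest. The data are synthetic, the function names are mine, and the two forecasters (`seasonal_naive`, `flat_mean`) are deliberately trivial stand-ins for the approaches you would actually compare.

```python
from datetime import date, timedelta

def seasonal_naive(history, horizon, season=7):
    """Stand-in forecaster: repeat the last observed week."""
    last = history[-season:]
    return [last[i % season] for i in range(horizon)]

def flat_mean(history, horizon, window=28):
    """Stand-in forecaster: mean of the last `window` days, held flat."""
    level = sum(history[-window:]) / window
    return [level] * horizon

def month_end(d):
    next_first = date(d.year + (d.month == 12), d.month % 12 + 1, 1)
    return next_first - timedelta(days=1)

def backtest(dates, values, cutoffs, forecaster):
    """Mean absolute error of the next-month total, over several mid-month cutoffs."""
    errors = []
    for cutoff in cutoffs:
        n_hist = dates.index(cutoff) + 1               # history includes the cutoff day
        target_start = month_end(cutoff) + timedelta(days=1)
        target_end = month_end(target_start)
        horizon = (target_end - cutoff).days
        skip = (target_start - cutoff).days - 1        # gap days before the target month
        fc = forecaster(values[:n_hist], horizon)
        actual = values[n_hist + skip : n_hist + horizon]
        errors.append(abs(sum(fc[skip:]) - sum(actual)))
    return sum(errors) / len(errors)

# Synthetic daily series with a pure day-of-week pattern, January 1 to June 30, 2026.
pattern = [12, 11, 10, 10, 13, 4, 2]                   # Mon..Sun
dates = [date(2026, 1, 1) + timedelta(days=i) for i in range(181)]
values = [pattern[d.weekday()] for d in dates]

cutoffs = [date(2026, 3, 15), date(2026, 4, 15), date(2026, 5, 15)]
mae_snaive = backtest(dates, values, cutoffs, seasonal_naive)
mae_flat = backtest(dates, values, cutoffs, flat_mean)
```

The same loop can score per-bucket errors instead of the monthly total; which error measure to use depends on what the forecast is used for.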
