I realized this morning that it has been a while since I posted any Python code. I’ve been a bit busy with Handyman Kevin and haven’t been doing much data science. Still, I decided it was time to carve out a couple hours this morning to practice my skills. The result are these functions, which perform basic double exponential smoothing using the Holt-Winters method. I deliberately avoided using NumPy, SciPy, or any other libraries. It isn’t that I dislike Numpy/Scipy (far from it), but you can’t always get sysadmins to install extra libraries on the machines you’re using, especially if you are a guerrilla data scientist like me.
There are a lot of different time series methods out there, and they all have their points. Holt-Winters is the one that I keep coming back to, though. One of the reasons is simplicity–I can always remember it and bang it into a spreadsheet without needing to Google anything or download libraries. About the 40th time I typed it into a spreadsheet, though, it occurred to me that it would be smart to implement it in Python so I could save some typing.
The first function, MAPE, simply calculates the mean absolute percentage error (MAPE) of a list of estimated values, as compared to a list of actual values.
The next function, holtwinters, uses Holt-Winters to predict the next three values in a time series. You need to supply two smoothing coefficients, alpha and beta, for the level and trend, respectively. Typically, you would have a pretty good idea what these were from doing similar forecasts in the past.
If you don’t know the coefficients then use the third function, holtwinters_auto, to automatically determine them. This function uses a grid search. Those of you who have read my monograph probably remember that I’m not usually wild about grid searches. In this case it makes sense, though, since you don’t usually need more than a few digits of precision on the coefficients.
def MAPE(actual, estimate): '''Given two lists, one of actual values and one of estimated values, computes the Mean Absolute Percentage Error''' if len(actual) != len(estimate): print "ERROR: Lists not the same length." return  pcterrors =  for i in range(len(estimate)): pcterrors.append(abs(estimate[i]-actual[i])/actual[i]) return sum(pcterrors)/len(pcterrors)
def holtwinters(ts, *args): '''Uses the Holt-Winters exp. smoothing method to forecast the next three points in a time series. The second two arguments are smoothing coefficients, alpha and beta. If no coefficients are given, both are assumed to be 0.5. ''' if len(args) >= 1: alpha = args else: alpha = .5 findcoeff = True if len(args) >= 2: beta = args else: beta = .5 if len(ts) < 3: print "ERROR: At least three points are required for TS forecast." return 0 est =  #estimated value (level) trend =  #estimated trend '''For first value, assume trend and level are both 0.''' est.append(0) trend.append(0) '''For second value, assume trend still 0 and level same as first actual value''' est.append(ts) trend.append(0) '''Now roll on for the rest of the values''' for i in range(len(ts)-2): trend.append(beta*(ts[i+1]-ts[i])+(1-beta)*trend[i+1]) est.append(alpha*ts[i+1]+(1-alpha)*est[i+1]+trend[i+2]) '''now back-cast for the first three values that we fudged''' est.reverse() trend.reverse() ts.reverse() for i in range(len(ts)-3, len(ts)): trend[i] = beta*(ts[i-1]-ts[i-2])+(1-beta)*(trend[i-1]) est[i] = alpha*ts[i-1]+(1-alpha)*est[i-1]+trend[i] est.reverse() trend.reverse() ts.reverse() '''and do one last forward pass to smooth everything out''' for i in range(2, len(ts)): trend[i] = beta*(ts[i-1]-ts[i-2])+(1-beta)*(trend[i-1]) est[i]= alpha*ts[i-1]+(1-alpha)*est[i-1]+trend[i] '''Holt-Winters method is only good for about 3 periods out''' next3 = [alpha*ts[-1]+(1-alpha)*(est[-1])+beta*(ts[-1]-ts[-2])+(1-beta)* trend[-1]] next3.append(next3+trend[-1]) next3.append(next3+trend[-1]) return next3, MAPE(ts,est)
def holtwinters_auto(ts, *args): '''Calls the holtwinters function, but automatically determines the alpha and betta coefficients which minimize the error. The optional argument is the number of digits of precision you need for the coefficients. The default is 4, which is plenty for most real life forecasting applications. ''' if len(args) > 0: digits = args else: digits = 4 '''Perform an iterative grid search to find minimum MAPE''' alpha = .5 beta = .5 for d in range(1,digits): grid =  for b in [x * .1**d+beta for x in range(-5,6)]: for a in [x * .1**d+alpha for x in range(-5,6)]: grid.append(holtwinters(ts, a, b)[-1]) if grid[-1]==min(grid): alpha = a beta = b next3, mape = holtwinters(ts, alpha, beta) return(next3, mape, alpha, beta)