Expanded Methods

EXPANDED SUMMARY OF RECONSTRUCTION METHOD

This section briefly summarizes the reconstruction method. A detailed description of the reconstruction steps can be found below. Hydrologic reconstructions for 8 different basins were generated for the California Department of Water Resources (CADWR) using a combination of multiple linear regression (Myers 1990) and locally weighted regression (loess; Martinez and Martinez 2005). The predictand, y, for a basin was either river discharge (Q) averaged over the water year, or precipitation (P) summed for the water year. This section summarizes the reconstruction method, whose two main phases are (1) single-site reconstruction (SSR) to convert individual tree-ring chronologies into estimates of y, and (2) multi-site reconstruction (MSR) to combine the individual SSRs into an improved estimate of y.

The choice of reconstruction method was guided by several considerations. First, the relationship between a tree-ring site chronology and any particular discharge or precipitation record might not be linear and might not be contemporaneous. Moreover the linearity and lagged properties of the relationship might reasonably be expected to vary across species and tree-ring sites. SSR is intended to deal with lags and curvilinear relationships at a early stage in the reconstruction process, and at the level of the individual tree-ring site. SSR is intended to check that relationships between the hydrologic series and tree-ring series are temporally stable. MSR, a subsequent step to SSR, can deal with the question of combining tree-ring information from multiple sites without complications from site-to-site differences in lagged dependence and linearity of signal. Overfitting is unlikely to be a problem with combining the SSRs into a MSR because the MSR model requires estimation of only the smoothing parameter for loess. The MSR itself is interpolated from a smoothed scatter plot, and does not involve fitting of parameters. Time-nested modeling was adopted to take advantage of the millennial length of the longest tree-ring chronologies and the dense site network of chronologies that emerges in more-recent centuries.

The SSR/MSR reconstruction approach was introduced in a methods paper in the context of a sample application to tree-ring reconstructions of precipitation Meko (1997). The approach was further modified and extended in reconstructions of streamflow for the Sacramento River (Meko et al. 2001), Colorado River (Meko et al. 2007), and various rivers in the Sacramento and San Joaquin basins (Meko et al. 2014).

A network of 46 tree-ring sites was assembled for this study. Most sites were developed or updated specifically for this study by new field collections in 2015 and 2016. A few additional sites were added based on exploratory correlation analysis and knowledge on the quality of chronologies for hydroclimatic reconstruction from previous studies. The 46 tree-ring sites can be divided into groups: 9 for Colorado River reconstruction and 37 for reconstructions in southern California.

All chronologies for the study were standardized uniformly with program ARSTAN (Cook and Krusic 2007). Program settings included detrending by the ratio method using a cubic smoothing spline with a frequency response 0.5 at a wavelength equal to the series length (Cook and Peters 1981), and variance stabilization to account for effects of time-varying sample size (Osborn et al. 1997) and to remove any remaining variance trends over the length of the series in the site chronology (Cook and Kairiukstis1990). The 46 chronologies for the CADWR study have start years ranging from -350 to 1710, and end years ranging from 1992 to 2016. While each ring-width series was detrended over its full available length, site chronologies for use in reconstruction were truncated to begin with the first year with expressed population signal (EPS) exceeding 0.85 (Wigley et al. 1984). With this criterion, the start-year of usable site chronologies in the 46-site network ranges from 585 to 1775. Standard rather than residual site chronologies were used in reconstructions.

Four separate time-nested and overlapping reconstructions with differing time coverage and accuracy were estimated for each of the 8 basins:

1. Rec1: starts in early 1100s; relies on only the longest available chronologies; least accurate of the nested reconstructions
2. Rec2: starts in the late 1390s to early 1400s; tree-ring network generally large enough to give close to maximum accuracy of reconstruction while still providing long time coverage
3. Rec3: starts in early 1700s: makes use of a dense tree-ring network; the most accurate reconstruction, but generally only marginally more accurate than Rec2
4. Rec4; highly variable start year, but extends as close as possible to present (2014 for Colorado River, 2015 for other basins); tree-ring network much smaller than for Rec2 and Rec3; generally lower reconstruction accuracy than Rec2 or Rec3; sole purpose is to provide the last reconstructed year

Two spliced versions of reconstruction were built from the basic nested reconstructions. Except for the last year (provided by Rec4), the spliced reconstructions are based on a tree-ring network of unvarying site composition, and are intended to give a longer and shorter perspective on relative changes in moisture anomalies (droughts and wet periods) over time.

1. “Longest” (Rec1): uses Rec4 for just its last year, and Rec1 for all other years
2. “Most-skillful” (Rec23): uses Rec4 for its last year, and Rec2/Rec3 for all other years

Four basic nested reconstructions and the spliced reconstructions built from them were developed for each of the 8 basins using the same statistical methods. A 95% confidence interval was assigned to each reconstructed value using the method of upper and lower smooths (Martinez and Martinez 2005). Confidence bands are interpolated from separate loess models fit to the positive and negative cross-validation residuals of the MSR. The resulting confidence bands vary with magnitude of reconstructed discharge or precipitation, and reflect, for example, the amplified uncertainty in wet years typical of dendrohydrologic reconstructions (Meko and Woodhouse 2011). While the loess reconstruction method does not yield an “R-squared for regression”, an ad hoc explained variance statistic was computed as a single metric to compare the “typical” accuracy of these reconstructions with that of other published reconstructions. Here, SSE is the sum of squares of departures of observed minus the reconstructed predictand, and the SSE is the sum of squares of departures of the observed predictand from its calibration period mean.

DETAILED STEPS IN RECONSTRUCTION

1 Single-site reconstruction (SSR). SSR is the filtering and scaling of a single tree-ring chronology, x, into an estimate of a predictand y. The goal is an SSR that has desirable statistical properties resembling those of the observed y, and has variance proportional to the strength of the relationship between the x and y. The SSR procedure moreover is intended to screen out chronologies whose signal for y is either weak or temporally unstable, to adjust for possible lagged dependence of y on x, and for possible curvature in the relationship between x and y.

1.1 Preliminary stepwise regression

1.1.1 Define calibration period as the overlap of y and x, possibly shortened by 1 year on the recent end to accommodate a +1 year lag in the model
1.1.2 Regress y on x and x2 lagged -1, 0 and +1 years from y in stepwise regression. The predictand y for regression is log10(Q) for reconstruction of river discharge, and P (no transformation) for reconstruction of precipitation. The resulting equation has at most 6 predictors -- original and squared x at lags 0, -1 and +1 years from y. Predictors are entered stepwise, with p-to-enter of 0.05 and p-to-remove of 0.10 (Weisberg 1985, Wilks 1995). If no variables enter in the stepwise, the default model is assumed to be y on x without lags or a squared term.
1.1.3 Store the order of entry of predictors in the above preliminary stepwise regression

1.2 Cross-validation of preliminary regression

1.2.1 Repeat the stepwise regression above, using the same order of entry of predictors, and cross-validating (Myers 1990; Michaelsen 1987) at each step by leave-5-out cross-validation. Omitting 5 observations instead of 1 observation and predicting for central observation of the omitted segment is suggested to ensure that none of the same predictor observations are used for the calibration and validation data when a model includes lags up to ±1 year on the predictor time series (Meko 1997).
1.2.2 Store the cross-validation reduction of error statistic (RE; Fritts et al. 1990) at each step.
1.2.3 Mark as the stopping step for the final SSR model the last step before RE begins to decline (cross-validation stopping rule; Myers 1990; Wilks 1995).

1.3 Final SSR

1.3.1 Re-calibrate the model for the stopping step defined in steps 1.2.3. If no lag +1 term is needed for the model, the calibration period can be extended by 1 year on the recent end.
1.3.2 Store the regression R2 as a measure of accuracy of the SSR model.
1.3.3 Substitute the long-term tree-ring index into the model to generate the single-site reconstruction (SSR). If no lags in the model, this reconstruction extends from last year of the available tree-ring series back to the first year when the chronology reaches the EPS threshold of 0.85 (Wigley et al. 1984).

1.3.4 Validate the SSR model using both leave-5-out cross-validation and split-sample validation. For split-sample validation, the full calibration period is split in half, and the model is fit and validated in turn on separate halves (e.g., Meko and Graybill 1995).

1.3.5 Flag the chronology’s SSR as unusable in subsequent steps of the reconstruction if any of the following conditions are true:

1.3.5.1 Calibration overall-F statistic is not significant at p = 0.05 (no signal)
1.3.5.2 Cross-validation RE≤0
1.3.5.3 Split-sample RE≤0 for either half of the validation

1.4 Repeat steps1.2 - 1.3 for each of the n1 tree-ring chronologies (for Colorado River, for other basins)

1.5 Backtransform the n1 SSRs to the original units of the predictand; this step is necessary only if the predictand for SSR was transformed (e.g., log10 transform for river discharge)

2 Multi-site reconstruction (MSR). MSR results from combining the SSRs from individual tree-ring chronologies into a single reconstructed time series. The idea is that averaging SSRs over tree-ring sites will emphasize the common signal and de-emphasize local and nonclimatic noise. Because the SSRs as defined have variance proportional to strength of their signal for y, no differential weighting is needed: an unweighted average emphasizes those tree-ring sites with stronger signal. Broadly, the MSR is interpolated from a smoothed scatter plot of observed y on the average of SSRs. Which of the original SSRs are included among the SSRs for a particular basin and nested model? This depends on several factors. Obviously, to be considered a candidate for a particular nested model (e.g., Rec1) an SSR must completely cover the specified time period for the model. To qualify as a candidate, the SSR is also required to have a regression, which, for the lengths of calibration period used here, is a stricter requirement that just a significant overall-F statistic for the regression. Other requirements are that the SSR must have a positive RE of cross-validation and a positive RE in both halves of its split-sample validation. These constraints rule out many of the original SSRs. Remaining SSRs make up the pool of SSRs from which the SSRs for use in MSR are selected. MSR models are built chronologically (Rec1, Rec2, …) by the steps described in detail below.

2.1 Identify the SSRs for the Rec1 reconstruction model (earliest nested model)

2.1.1 Identify qualified (see above) SSRs covering the common interval, and consider just those SSRs.
2.1.2 Identify how many different species are represented in that set of SSRs.
2.1.3 For each species represented, include the 3 SSRs with the strongest signal, as measured by the R2 of the SSR model (see 1.3.2). If fewer than 3 SSRs available, include all of them.
2.1.3.1 Aim is to favor representation from multiple species, and to use those chronologies with the strongest bivariate signal for the predictand y
2.1.3.2 Will end up with a subset of SSRs.
2.1.4 Compute the time series of reconstructed predictand averaged over the selected SSRs; call this average v.

2.2 Plot the observed predictand, y, against v in a scatter plot

2.3 Fit a loess model to the scatter plot

2.3.1 Make loess estimates of y at 6 points along the abscissa of the scatter plot: minimum, maximum and percentiles 20 40 60 and 80 of v.
2.3.2 Begin with a loess smoothing parameter, α=0.6
2.3.3 Fit the loess model. Check that the resulting loess curve is monotonic increasing; if not, increase by 0.1 (less flexible curve), and check again; repeat until monotonic increasing curve is attained. The resulting is a final value of smoothing parameter.

2.4 Validate the loess model

2.4.1 Cross-validate, leaving out 5 observations at each iteration and predicting y for the middle year of the omitted segment
2.4.2 Split-sample validate, exchanging the first and last halves of the overlap of v and y
2.4.3 Check that the final loess model has positive RE of cross-validation and positive RE of split-sample validation on both halves (temporal stability)

2.5 Extend the loess curve to cover the full range of v over the full nested period (not just calibration years)

2.5.1 Identify extreme high and low values in the time series of v of SSRs averaged over tree-ring sites. Usually these extremes lie outside the range of v in the calibration period of the fitted loess curve
2.5.2 Extend the loess curve to the left and right on the scatter plot and cover the identified high and low extremes of v. The loess curve before extension is monotonic increasing and piecewise linear, with segments joining the minimum, 20th, 40th, 60th, 80th percentiles and maximum of v for the calibration period. Call the segments joining these 6 points segment 1-5. Two new straight-line segments (segment 0 on the left and segment 6 on the right) will be added.
2.5.2.1 On the right side, set the slope of segment 6 such that the change in slope from segment 5 to 6 is the same as the change in slope from segment 4 to 5
2.5.2.2 On the left side, set the slope of segment 0 such that the change in slope from segment 0 to segment 1 is the same as the change in slope from segment 1 to segment 2.

2.6 Interpolate the multi-site reconstruction (MSR) from the extended loess curve
For each year of the current nest, or set of tree-ring chronologies with common time coverage, linearly interpolate a reconstructed value of y from the extended loess plot of observed y against v

2.7 Estimate error bars for the MSR (50% confidence band around annual reconstructed y)
The method of “upper and lower smooths” (Martinez and Martinez 2005) was used to estimate a 50% confidence band for reconstructed y. This method is specifically applicable where, as here, the reconstructed values are estimated from a smoothed scatter plot and the error variance is a function of the size of reconstructed y.

2.7.1 Scatter plot the positive cross-validation residuals against the fitted values, or estimated y
2.7.2 Fit a piecewise-linear loess model to the scatter plot
2.7.3 Smoothing parameter. Use this same setting for all basins and nested models. This setting was selected from exploratory analysis, and is not claimed to be optimal in a statistical sense
2.7.4 Estimation points at the minimum, maximum, and percentiles 20, 40, 60, and 80 of calibration-period predicted y
2.7.5 Extend the loess curve by adding leading and trailing straight-line segments connecting to the lowest (left) and highest (right) reconstructed y in the full-length reconstruction. Unlike the extension used for the loess models of the reconstruction itself (see 2.4), the extension is horizontal. Thus the confidence interval for extremely low or high reconstructed y is assumed to stay at the same width as for the extremes in reconstructed y for the calibration period.
2.7.6 Repeat steps 2.6.1-2.6.5 for the negative cross-validation residuals
2.7.7 Linearly interpolate from the 2 smoothed scatter plots (upper and lower smooths) to get estimated upper and lower 50% confidence bands for each year of reconstructed y

2.8 Repeat steps 2.1-2.6 for nested models Rec2, Rec3, and Rec4

2.8.1 Each nested model has a specified time coverage (e.g., 1405-2015)
2.8.2 More sites, and SSRs, become available in Rec2 and Rec3; site density declines for Rec4
2.8.3 If an SSR is used in an earlier nested model and that SSR covers the current nested period, include the SSR in the current model. This approach favors continuity in the mix of tree-ring predictors from one period to the next.
2.8.4 The rule of using the 3 “best” SSRs (highest) for each available species is followed at each nest. Any SSRs retained from an earlier model do not count toward this 3. It is therefore possible for an MSR model for Rec2, Rec3 or Rec4 to be represented by more than 3 chronologies of the same species

REFERENCES

Cleveland, W. S. (1979), Robust locally weighted regression and smoothing scatterplots, J. Am. Stat. Assoc., 74, 829–836.

Cleveland, W. S., and R. McGill (1984), The many faces of a scatterplot, J. Am. Stat.Assoc., 79, 807–822.

Cook, E. R., and L. A. Kairiukstis (Eds.) (1990), Methods of Dendrochronology: Applications in the Environmental Sciences, Kluwer Academic Publishers, Dordrecht, 394 pp.

Cook, E. R., and K. Peters (1981), The smoothing spline: A new approach to standardizing forest interior tree-ring width series for dendroclimatic studies, Tree-Ring Bulletin, 41, 45–53.

Cook, E. R., P. J. Krusic, R. H. Holmes, and K. Peters (2007), Program ARSTAN, Version 41d, 2007, www.ldeo.columbia.edu/tree-ring-laboratory.

Fritts, H. C., J. Guiot, and G. A. Gordon (1990), Verification, in Cook and Kairiukstis [1990], pp. 178–185, 394 pp.

Martinez, W. L., and A. R. Martinez (2005), Exploratory Data Analysis with MATLAB, Chapman & Hall/CRC, New York, 405 pp.

Meko, D. (1997), Dendroclimatic reconstruction with time varying predictor subsets of tree indices, J. Climate, 10 (4), 687–696.

Meko, D., and D. A. Graybill (1995), Tree-ring reconstruction of Upper Gila River discharge, J. Am. Water Resour. Assoc., 31 (4), 605–616.

Meko, D. M., and C. A. Woodhouse (2011), Application of Streamflow Reconstruction to Water Resources Management, in Dendroclimatology, Progress and Prospects, Developments in Paleoenvironmental Research, vol. 11, edited by M. K. Hughes, T. W. Swetnam, and H. F. Diaz, pp. 231–261, Springer Netherlands.

Meko, D. M., M. D. Therrell, C. H. Baisan, and M. K. Hughes (2001), Sacramento River flow reconstructed to A.D. 869 from tree rings, J. Am. Water Resour. Assoc., 37 (4), 1029–1040.

Meko, D. M., C. A. Woodhouse, C. H. Baisan, T. Knight, J. J. Lukas, M. K. Hughes, and M. W. Salzer (2007), Medieval drought in the Upper Colorado River Basin, Geophys. Res. Lett., 34 (L10705), 10.1029/2007GL029,988.

Meko, D. M., C. A. Woodhouse, and R. Touchan (2014), Klamath/San Joaquin/Sacramento Hydroclimatic Reconstructions from Tree Rings, Draft Final Report to California Department of Water Resources Agreement 4600008850, Available at: http://www.water.ca.gov/waterconditions/docs/tree_ring_report_for_web.pdf.

Michaelsen, J. (1987), Cross-validation in statistical climate forecast models, J. Clim. Appl. Meteor., 26, 1589–1600.

Osborn, T. J., K. R. Briffa, and P. D. Jones (1997), Adjusting variance for sample-size in tree-ring chronologies and other regional mean timeseries, Dendrochronologia, 15, 89–99.

Weisberg, S. (1985), Applied Linear Regression, 2nd ed., John Wiley, New York, 324 pp.

Wigley, T. M. L., K. R. Briffa, and P. D. Jones (1984), On the average value of correlated time series, with applications in dendroclimatology and hydrometeorology, J. Clim. Appl. Meteor., 23, 201–213.

Wilks, D. S. (1995), Statistical Methods in the Atmospheric Sciences, International Geophysics Series, vol. 59, Academic Press, San Diego, 467 pp.