Trail drainage features: Development and testing of an assessment tool

By Kaitlin Burroughs, Yu-Fai Leung, Roger L. Moore, and Gary B. Blank

Trails that are sustainably designed and well maintained allow visitors better access to natural areas and reduce their impacts on those resources by concentrating use on a limited number of linear paths (Hesselbarth et al. 2007). Increased understanding about what makes trails sustainable supports protected area and park managers who strive to integrate conservation and recreation goals.

The sustainability of trails as a recreation resource depends, to a large extent, on how effectively surface runoff can be controlled (Appalachian Mountain Club 2008; Grab and Kalibbala 2008; Hesselbarth et al. 2007; Marion and Wimpey 2017). Running water on and alongside trail treads leads to degradation problems such as muddiness, widening, multiple treading (additional trampled tracks alongside the original tread), root exposure, and soil erosion (Monz et al. 2010; Upland Path Advisory Group 2004). If trail degradation becomes too advanced, rehabilitation can be prohibitively expensive with lasting impacts on surrounding resources and recreational experiences (Garland 1990). Trail degradation often leads to altered visitor behavior in negotiating degraded areas and can act as a catalyst for larger-scale resource damage, such as vegetation trampling, land erosion, water pollution, and wildlife displacement (Leung and Marion 1996; Olive and Marion 2009).

A common solution to water-induced trail degradation is to construct and maintain drainage features (Birchard and Proudman 1982; Grab and Kalibbala 2008; Hesselbarth et al. 2007; Marion and Wimpey 2017). A trail drainage feature (TDF) is a purposeful arrangement or installation of any material (most commonly rock, wood, or soil) on or adjacent to the trail tread that aids in intercepting continuous surface runoff and diverting water from tread surfaces (fig. 1). Trail drainage features are commonly used by land management agencies such as the National Park Service, the US Forest Service, and state park systems. A variety of “best practice” approaches to constructing trail drainage features have been developed by various agencies and outdoor organizations (Appalachian Mountain Club 2008; Birchard and Proudman 1982; Hesselbarth et al. 2007). However, there is little evidence that this guidance is directly linked to empirical research, and little information exists on assessing, monitoring, and evaluating TDFs and their effectiveness. Thus data are limited for empirical examination of factors that contribute to the effectiveness of TDFs.

Left: A trail drainage feature constructed of rock. Right: A trail drainage feature constructed of wood.
Figure 1. Rock (left) and wood waterbars (right) are common examples of trail drainage features.

Kaitlin Burroughs, 2015 (2)

This study aims to contribute to the long-term sustainability of trails through an improved TDF assessment tool that can generate timely and objective data supporting analysis and evaluation, thereby informing trail and TDF management decisions. The purpose of this report is to illustrate the TDF assessment tool, including its conception, development, and testing.


Methods

Phase one: Assessment tool

We reviewed construction and maintenance “best practice” methods used by land management agencies and associated trail-building organizations such as the US Forest Service, the National Park Service, the Conservation Corps Trail Crews, the Appalachian Trail Conservancy, the Pacific Crest Trail Association, and the International Mountain Bicycling Association. Characteristics of trail drainage features described as a priority in multiple handbooks were among those selected for review (Appalachian Mountain Club 2008; Birchard and Proudman 1982; Hesselbarth et al. 2007; Upland Path Advisory Group 2004). In addition, we consulted previous research on trail and other drainage features used on logging and unpaved backcountry roads to better understand the empirical data collected on similar drainage characteristics. Significant findings related to drainage attributes were among those selected for review (Garland 1990; Grab and Kalibbala 2008; Leung and Marion 1999). Last, we considered personal knowledge and experience gained from participating in and leading trail-building crews for three years in generating a list of important characteristics that should be assessed when determining overall effectiveness of trail drainage features.

Collectively, we selected 30 measurements for pilot testing in Wesser, North Carolina, in February 2015, on 10 different trail drainage features constructed along the Appalachian Trail. Measurements, definitions, and methods were refined as a result of the pilot. For example, we further distinguished the TDF construction measurement responses so as to provide more accurate identification of TDF construction methods (displayed in fig. 2). A TDF material type categorized as “rock” may describe a construction method of rocks assembled flush to each other, stacked on top of or underneath one another, arranged in an alternating pattern, or combined in a way that has no distinguishable pattern.

Left: Rocks are laid side by side in this flush-constructed trail waterbar. Right: Rocks are stacked above and below one another in this example of a stacked waterbar.
Figure 2. Rock waterbars are often constructed using different methods. Shown here are flush (left) and stacked under (right).

Kaitlin Burroughs, 2015 (2)

Phase two: Reliability test

We collected data on existing TDFs to determine the reliability of the measurements used in the trail drainage feature assessment as performed by different field staff. Two groups of two raters, or field assessment tool users, were recruited and trained separately on how to use the assessment tool. Each training session lasted 30 minutes, in which participants were given background information and walked through each measurement as defined, explained, and illustrated in two assessment handouts. An example of these procedures is shown in table 1 and figure 3.

Table 1. Assessment measures and descriptions for trail drainage features

Category | Measurement | Description | Unit
General TDF Information | 1. TDF Number | Identification number | N/A
General TDF Information | 2. Date and Time | Date and time of assessment | N/A
General TDF Information | 3. Trimble GPS Number | Spatial reference identification number | N/A
General TDF Information | 4. Photo Up/Downslope | View of TDF looking uphill and downhill | N/A
TDF Characteristics | 5. Material Type | Material used to construct TDF (rock, wood, soil, plastic, other) | N/A
TDF Characteristics | 6. Construction Method | Building style (flushed, stacked over, stacked under, stacked alternating, tree log, treated log, rolling dip, knick, other) | N/A
TDF Characteristics | 7. TDF Length | Length of TDF from end to end along the centerline | cm
TDF Characteristics | 8. TDF Thickness | Average thickness of TDF material on the left, center, and right side or centerline of each individual rock, whichever is most appropriate | cm
TDF Characteristics | 9. TDF to Tread Angle | Angle of TDF and trail tread taken on the downhill side hugging the inside of the landscape (opposite end of drainage/trench) | degrees
TDF Characteristics | 10. Material Gaps | Categorical estimation of existing gaps in TDF structure (0–5 cm, 5.1–10 cm, or >10 cm) | cm
TDF Characteristics | 11. Side Slope Connection | Whether or not TDF material is tied into the trail’s surrounding inside landscape | N/A
TDF Characteristics | 12. Structure Stability | Categorical estimation of TDF strength using a firm boot shove (strongly anchored, moderately anchored, weakly anchored) | N/A
Sediment Characteristics | 13. TDF Height | Average depth of sediment/organic material measured 3 cm uphill from TDF feature and on the left side, center, and right side | cm
Sediment Characteristics | 14. Erosion Feature | The presence of erosion within 5 m of TDF uphill or downhill | N/A
Sediment Characteristics | 15. Trench Extension | Whether or not an existing channel dug into the soil exists on the outside side slope | N/A
Sediment Characteristics | 16. Trench Extension Depth | Depth of sediment/organic material in trench measured 30 cm away from TDF and from the top of the trench to the bottom | cm
Trail Characteristics | 17. Trail Grade | Trail slope percentage measured from the TDF feature and 5 m upslope along existing trail | %
Trail Characteristics | 18. Trail Direction | Trail direction measured from TDF feature and existing trail centerline | degrees
Trail Characteristics | 19. Landform Grade | Landform slope percentage measured from the TDF feature and 4 m upslope along the fall line of existing landscape | %
Trail Characteristics | 20. Landform Direction | Landform direction measured from TDF feature and fall line of existing landscape | degrees
Trail Characteristics | 21. Canopy Cover | Percentage of canopy cover from the center of existing trail recorded 10 m upslope, 5 m upslope, and directly at TDF | %
Maintenance | 22. Recent Maintenance | Evidence of recent TDF maintenance (trench cleaning, sediment clearing, upturned soil, fresh material, other) | N/A
Effectiveness | 23. Effectiveness Rating | Categorical estimation of the overall TDF effectiveness (effective, partially effective, not effective) | N/A
Four waterbar photos labeled with text and arrows showing these measures: material type, length and height of feature, thickness and angle to trail tread, structural stability, gaps, trench depth and extension, and side slope connection.
Figure 3. The photo diagram illustrates examples of measurements from the trail drainage feature assessment, including material type, length and height of feature, thickness and angle to trail tread, structural stability, gaps, trench depth and extension, and side slope connection.

Photos Kaitlin Burroughs, 2015 (4)

Using the data collected in the field, we assessed inter-rater reliability using Cohen’s Kappa (K) and the Intraclass Correlation Coefficient (ICC). Inter-rater reliability describes the degree of agreement among different raters and reflects the consistency among ratings. We selected two statistics because we collected two types of data. Some of the measurements generated categorical data (such as material type and construction method, indicated in table 2 by an asterisk); for these we applied Cohen’s Kappa, which compares the observed agreement among raters with the agreement expected by chance and so accounts for the possibility that raters give similar answers at random (Hallgren 2012). Other measurements generated numeric data (such as TDF length and TDF thickness, indicated in table 2 by the absence of an asterisk); for these we applied the ICC (two-way random effects, absolute agreement), which compares the variability of different ratings of the same attribute to the total variation across all ratings and all attributes. It incorporates the magnitude of disagreement and assumes that each rating is composed of a true score and a measurement error (Hallgren 2012). High inter-rater reliability scores mean that different raters are more likely to record the same or very similar values for each measure. This consistency is important because it indicates that raters rely less on subjective judgment in assigning a given score, which matters for a tool intended for use by land management agencies in different regions of the world. Both statistical models were run using IBM SPSS Statistics 2015 software.
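To make the two statistics concrete, the sketch below computes unweighted Cohen's Kappa for two raters' categorical labels and a single-measures ICC under a two-way random effects, absolute agreement model. It is an illustration only: the study's values came from SPSS, the function names and sample ratings are hypothetical, and the choice of the single-measures form of the ICC is an assumption, since the text does not say whether single or average measures were reported.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's Kappa for two equal-length lists of labels."""
    n = len(rater_a)
    # Observed agreement: share of features given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    # Undefined (division by zero) if both raters use one identical label throughout.
    return (p_o - p_e) / (1 - p_e)


def icc_2_1(ratings):
    """ICC, two-way random effects, absolute agreement, single measures.

    `ratings` is a list of rows, one per feature, each holding one numeric
    value per rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Made-up material types recorded by two raters for six features.
a = ["rock", "rock", "wood", "soil", "rock", "wood"]
b = ["rock", "wood", "wood", "soil", "rock", "wood"]
print(cohens_kappa(a, b))  # 17/23, about 0.74

# Made-up TDF lengths (cm): one row per feature, one column per rater.
lengths = [[120, 118], [95, 97], [150, 149], [80, 84]]
print(icc_2_1(lengths))
```

In the Kappa example the raters agree on five of six features (observed agreement 5/6), while their label frequencies alone would produce agreement of 13/36 by chance, which is why the statistic is lower than the raw agreement rate.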

Table 2. Inter-rater reliability results from tests performed at Baxter State Park, Maine, and Umstead State Park, North Carolina

Category | Measurement | K/ICC
TDF Characteristics | Material Type* | 1
TDF Characteristics | Construction Method* | 0.64
TDF Characteristics | TDF Length | 0.85
TDF Characteristics | TDF Thickness | 0.93
TDF Characteristics | TDF to Tread Angle | 0.70
TDF Characteristics | Material Gaps** | 0.62
TDF Characteristics | Side Slope Connection* | 0.69
TDF Characteristics | Structural Stability** | 1
Sediment Characteristics | TDF Height | 0.95
Sediment Characteristics | Erosion Feature* | 0.69
Sediment Characteristics | Trench Extension* | 1
Sediment Characteristics | Trench Depth | 0.56
Trail Characteristics | Trail Grade | 0.52
Trail Characteristics | Trail Direction | 0.96
Trail Characteristics | Landform Grade | 0.76
Trail Characteristics | Landform Direction | 0.96
Trail Characteristics | Canopy Cover (10 m) | 0.66
Trail Characteristics | Canopy Cover (5 m) | 0.58
Trail Characteristics | Canopy Cover (TDF) | 0.42
Maintenance | Recent Maintenance** | 0.78
Overall Effectiveness Rating | Effectiveness Rating** | 0.54
Note: The absence of an asterisk signifies the Intraclass Correlation Coefficient (exact measurements), one asterisk (*) indicates Cohen’s Kappa (categorical measurements), and two asterisks (**) indicate weighted Cohen’s Kappa.
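For the ordered categories marked with two asterisks, a weighted Kappa penalizes a one-step disagreement (for example, "effective" versus "partially effective") less than a two-step one. The sketch below uses linear disagreement weights. That weighting scheme is an assumption, since the table note does not say whether linear or quadratic weights were applied, and the function name and sample ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, ordered_levels):
    """Linearly weighted Cohen's Kappa for two raters on an ordinal scale.

    `ordered_levels` lists the categories from lowest to highest and must
    contain at least two levels.
    """
    index = {level: i for i, level in enumerate(ordered_levels)}
    k, n = len(ordered_levels), len(rater_a)

    # Observed proportion of features in each (rater A, rater B) cell.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    # Marginal proportions for each rater.
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Linear disagreement weight: 0 on the diagonal, 1 at the extremes.
            w = abs(i - j) / (k - 1)
            weighted_observed += w * observed[i][j]
            weighted_expected += w * marg_a[i] * marg_b[j]

    # Undefined if chance disagreement is zero (all ratings in one category).
    return 1 - weighted_observed / weighted_expected


levels = ["not effective", "partially effective", "effective"]
# Made-up effectiveness ratings from two raters for six features.
a = ["effective", "effective", "partially effective",
     "not effective", "partially effective", "effective"]
b = ["effective", "partially effective", "partially effective",
     "not effective", "effective", "effective"]
print(weighted_kappa(a, b, levels))  # 4/7, about 0.57
```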

The first group of raters consisted of students with undergraduate environmental knowledge in addition to trail assessment experience. They performed the reliability test in Baxter State Park, Maine. The second group had advanced environmental knowledge (MS and PhD) and some trail assessment experience. They performed the reliability test in Umstead State Park, North Carolina (fig. 4). We chose the two locations for variety in trail drainage features and settings and to ensure consistency in application of the assessment across a diverse region where TDFs are common in managed natural areas.

Trail assessors use calipers to check the width of a waterbar.
Figure 4. Field assessors perform a reliability test at Umstead State Park, North Carolina.

Kaitlin Burroughs, 2015


Results

Phase one: Assessment tool

A refined version of the TDF assessment tool from the pilot yielded 21 measurements (25 total questions), representing four distinct categories (fig. 5): (1) TDF characteristics, (2) sediment characteristics, (3) trail characteristics, and (4) maintenance. We added an “overall effectiveness rating” measure from Leung and Marion’s 1999 study of Great Smoky Mountains National Park to build upon previous research on trail drainage features.

A diagram showing the relationship of four key attributes of trail drainage features to TDF effectiveness, as follows: TDF characteristics, sediment characteristics, trail characteristics, and maintenance.
Figure 5. The training materials paired written instructions and illustrations for these measurement categories, which contribute to TDF effectiveness.

A description of each measurement category used in the assessment is shown in table 1. Those categorized as “TDF characteristics” measure physical attributes of and relate to the integrity of the trail drainage features. Measurements under “sediment characteristics” document trends in sediment deposition around the features and serve as indicators of whether or not water is effectively diverted from the trail. “Trail characteristics” capture trail and landform attributes that may influence how effectively water is drained from the trail. Finally, the “maintenance” measure identifies evidence of recent or obvious maintenance of the features.

Phase two: Reliability test

We calculated inter-rater reliability for each measurement listed in table 1 (results in table 2). Every measurement showed moderate agreement (Cohen’s K or ICC ≥ 0.41) or higher as interpreted using the six-category K scale displayed in table 3 (Zenk et al. 2007). Measurements with the highest agreement included material type (1.0), TDF length (0.85) and thickness (0.93), structural stability (1.0), TDF height (0.95), and trail (0.96) and landform (0.96) direction. Measurements with moderate agreement (the lowest level of agreement recorded in this study) included trench extension depth (0.56), trail grade (0.52), canopy cover (at TDF location) (0.42), and canopy cover (at 5 m) (0.58). The “overall effectiveness rating,” applied following the guidelines in Leung and Marion (1999), received a lower inter-rater reliability value (0.54) than three of the identified key variables: TDF length (0.85) (fig. 6), TDF thickness (0.93), and TDF to tread angle (0.70). Its value was comparable to that of the last key measurement, trail grade (0.52).

Table 3. Inter-rater reliability interpretation

Kappa | Interpretation
< 0 | Poor Agreement
0.00–0.20 | Slight Agreement
0.21–0.40 | Fair Agreement
0.41–0.60 | Moderate Agreement
0.61–0.80 | Substantial Agreement
0.81–1.00 | Strong Agreement
Source: Zenk et al. 2007.
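The scale in table 3 can be expressed as a simple lookup, sketched below. The function name is hypothetical, and treating each band's upper bound as inclusive (so that a value such as 0.205 falls into the next band) is an assumption, because the table lists values to two decimal places only.

```python
# Upper bound of each band in table 3, paired with its label.
AGREEMENT_SCALE = [
    (0.20, "Slight Agreement"),
    (0.40, "Fair Agreement"),
    (0.60, "Moderate Agreement"),
    (0.80, "Substantial Agreement"),
    (1.00, "Strong Agreement"),
]


def interpret_agreement(value):
    """Return the table 3 label for a Kappa or ICC value."""
    if value < 0:
        return "Poor Agreement"
    for upper_bound, label in AGREEMENT_SCALE:
        if value <= upper_bound:
            return label
    raise ValueError("agreement values cannot exceed 1.0")


# Values taken from table 2.
print(interpret_agreement(0.42))  # Moderate Agreement (canopy cover at TDF)
print(interpret_agreement(0.64))  # Substantial Agreement (construction method)
print(interpret_agreement(0.93))  # Strong Agreement (TDF thickness)
```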
A woman measures the width of a trail drainage feature as part of the TDF assessment.
Figure 6. An assessor measures TDF length along the Appalachian Trail in New Hampshire.

Kaitlin Burroughs, 2015


High K and ICC scores for measurements such as material type likely occurred because trail drainage features are commonly built with only one type of material, and different material types are distinct (e.g., rock versus tree log). Consistent recording of material type is valuable because several “best practices” manuals describe soil-based TDFs as more effective over the long term than waterbars made of rocks and logs (Appalachian Mountain Club 2008; Hesselbarth et al. 2007). Other measurements with high consistency, such as TDF thickness and TDF length, are valuable because past research has found thickness of drainage features to be an important indicator of TDF effectiveness (Grab and Kalibbala 2008).

Trail drainage feature height, also a high-performing inter-rater reliability measurement, may be a good indication of TDF effectiveness, given that an assessor is able to record precise measurements. Such precision may allow for a better understanding of the amount of sediment deposited from surface runoff in front of a trail drainage feature, further informing land managers how often it may need to be serviced or maintained.

Measurements such as trench depth performed less favorably in terms of consistency. Trail drainage feature trenches vary in construction and often have uneven surfaces on either side. Grab and Kalibbala (2008) determined the importance of TDF trenches (referring to them as drainage furrows): features with poorly constructed trenches do not function well, which underscores the importance of trench characteristic measurements such as trench depth.

Also among the less favorably performing K and ICC scores (moderate agreement) were trail grade, canopy cover (TDF), and canopy cover (5 m). Although these measurements received moderate agreement, more accurate measurements may be obtained using GIS if geospatial data are available. For example, trail grade can be obtained with high-tech lidar data for a given area and can provide consistent measurements over entire landscapes, as described by Marion et al. (2011). The canopy cover measurements would also benefit from lidar data as more accurate readings can be taken along the entire trail system or surrounding landscape, enabling further analysis and modeling of rainfall drainage patterns as influenced by trail networks at a watershed or landscape scale. Trail grade measurements can and should still be taken in the field to check lidar data for accuracy. More precise instructions can be given for making the trail grade measurement, such as “Stand at the center of the TDF, just above the feature and not standing on top of the TDF material, and measure 5 m uphill with an accurate measuring device.” Wind can be a significant factor when recording tree canopy, so wind strength should be monitored with an anemometer in order to record wind speed and direction.

Finally, the single “overall effectiveness” measure was among the least consistent across raters. This parallels past discussion by Leung and Marion (1999) identifying the limitation of subjectivity in determining TDF effectiveness. By using the other 24 measurements, a more quantitative and objective assessment is possible in determining actual TDF effectiveness. Additionally, the assessment of one TDF can be recorded in an average of just six minutes after all initial training and practice TDFs are completed. Newly trained field staff will complete assessments in about 20 minutes for the first four to five features and then rapidly increase in speed as they become more familiar with the process, leading quickly to the six-minute TDF assessment speed.


We suggest training improvements be made for similar research in the future. First, approximate measurements associated with trail grade and canopy cover can be changed to exact measurements. For example, a tool such as a laser range finder can be used to determine exactly how far along the trail 5 meters is. Additionally, precise directions should be given to assessors describing where to stand to take such measurements as trail grade and canopy cover. Field training will continue to improve as assessors become familiar with multiple examples of trail drainage features and terminology. Finally, consistency in measurements will improve as individuals who already have a background in trail building and maintenance are recruited, because of their familiarity with terms and features.

We have several recommendations for improving the TDF assessment tool in the field. A side slope measurement will be helpful in determining what percentage of trail is sloping outward, aiding in better water drainage. Additionally, canopy cover, trail grades, and landscape grades may be recorded for features using categorical data when lidar data are unavailable. However, high-precision field measurements may not add value to the evaluation of trail drainage features and trail condition. For example, a trail grade of 5% can be variously measured as 4–7% depending on where and how far apart recorders stand, and what material the trail is made of. These differences decrease inter-rater reliability even though the fine level of detail recorded increases, and they do not add meaningful information to the evaluation. It is more important to record major differences in trail grade (e.g., 5% or 20%) as this has a dramatic effect on the flow of water across the trail.

The TDF trench measurement can be improved by tying a string to two stakes, laying out the string line perpendicular to and across the trench, and measuring trench depth from the center of the line down to the deepest spot. Finally, maintaining a log that records the age of each trail drainage feature and how often it is maintained is important in evaluating the effectiveness of the feature and greatly reduces subjectivity. Information such as construction and maintenance dates, type of maintenance, and presence of erosion can be recorded using the NPS Facility Management Software System or filed online by maintenance crews and volunteer groups.
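As a rough illustration of what such a log might hold, the sketch below stores the items named above (construction date, maintenance dates, type of maintenance, and presence of erosion) and derives a feature's age and its average maintenance interval. The class and field names are hypothetical and are not drawn from the NPS Facility Management Software System or any other existing system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MaintenanceEvent:
    performed_on: date
    maintenance_type: str   # e.g. "trench cleaning" or "sediment clearing"
    erosion_present: bool   # erosion observed near the feature at this visit


@dataclass
class TDFLog:
    tdf_number: int
    constructed_on: date
    events: list = field(default_factory=list)

    def age_in_days(self, as_of):
        """Days elapsed between construction and the given date."""
        return (as_of - self.constructed_on).days

    def mean_days_between_maintenance(self):
        """Average interval between logged maintenance visits, or None
        when fewer than two visits have been recorded."""
        dates = sorted(event.performed_on for event in self.events)
        if len(dates) < 2:
            return None
        return (dates[-1] - dates[0]).days / (len(dates) - 1)


# Made-up entries for a single feature.
log = TDFLog(tdf_number=1, constructed_on=date(2015, 1, 1))
log.events.append(MaintenanceEvent(date(2015, 2, 1), "trench cleaning", False))
log.events.append(MaintenanceEvent(date(2015, 2, 11), "sediment clearing", True))
log.events.append(MaintenanceEvent(date(2015, 3, 3), "trench cleaning", False))
print(log.age_in_days(date(2015, 1, 31)))   # 30
print(log.mean_days_between_maintenance())  # 15.0
```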



An assessment tool is indispensable in constructing and maintaining effective trail drainage features insofar as it uses the most relevant measurements in conjunction with information from research. Our assessment tool provides a way to inventory existing trail drainage features and their corresponding attributes. It highlights undesirable attributes that can lead to poor feature performance such as incorrect tread to TDF angle or trench depth, to which managers can respond by allocating additional time and resources. The tool also helps identify TDFs that should be removed if, for example, a majority of the measurements produced results that were undesirable. Finally, this tool helps land managers to justify the expense of money, time, and resources on trail management with objectively collected evidence.


Future research should test this assessment tool with larger sample sizes. Also, it should evaluate the need for additional TDF measurements and measurement alterations in order to determine the reliability and value of incorporating this information into the existing assessment tool. As measurements are updated, added, and tested, this rapid assessment tool can be further developed by leveraging the best and most reliable measurements (Marion and Wimpey 2017). Additionally, training methods such as in the classroom, in the field, or a combination of the two, can be assessed to determine which is most effective at producing consistent and high inter-rater reliability measurement outcomes. The assessment tool can also be used to give each TDF a composite score, meaning one simple score to serve as an overall effectiveness measure allowing for efficient comparisons among trail drainage features.
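One possible form for such a composite score is a weighted mean of per-measurement scores, sketched below. This is a hypothetical construction rather than a method tested in this study: it assumes each measurement has already been converted to a score between 0 (undesirable) and 1 (desirable), and both the conversion rules and the weights would need to be established by future research.

```python
def composite_score(scores, weights=None):
    """Weighted mean of per-measurement scores, each scaled from 0 to 1.

    `scores` maps measurement names to scores. `weights` optionally maps
    the same names to non-negative weights; equal weights are used when
    it is omitted.
    """
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Made-up scores for one feature (hypothetical scoring rules).
scores = {"structure_stability": 1.0, "material_gaps": 0.5, "trench_depth": 0.0}
print(composite_score(scores))  # 0.5

# Made-up weights that emphasize structural stability.
weights = {"structure_stability": 2.0, "material_gaps": 1.0, "trench_depth": 1.0}
print(composite_score(scores, weights))  # 0.625
```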


This field assessment tool gives trail managers a quantitative and more objective tool to aid in determining the effectiveness of trail drainage features in parks and protected areas. It compiles a suite of important variables that can be used in assessments while highlighting those that are critical in determining TDF effectiveness. The assessment complements existing trail-building manuals such as the Appalachian Mountain Club’s Complete Guide to Trail Building and Maintenance of 2008 and the US Forest Service’s Trail Construction and Maintenance Notebook of 2007. The longer list of variables incorporated into the assessment tool is informed by these and other relevant literature and helps to differentiate levels of effectiveness among trail drainage features. Finally, identifying, locating, and fixing underperforming TDFs will improve overall trail quality and long-term sustainability, and contribute to surrounding environmental health and overall visitor enjoyment.


The authors would like to thank Dr. Jeffrey Marion for allowing this work to be conducted within the larger AT sustainability study as well as for his valuable input. Dylan Spencer, Mary Burnett, Katelin McArdle, Casey Hastings, Anna Miller, Mirza Halim, Chelsey Walden-Schreiner, and Shaun Fisher are acknowledged for their field assistance. We also appreciate the constructive comments and edits provided by reviewers.


Appalachian Mountain Club. 2008. AMC’s complete guide to trail building and maintenance. Fourth edition. Appalachian Mountain Club, Boston, Massachusetts, USA.

Birchard, W., and R. Proudman. 1982. Appalachian Trail: Design, construction, and maintenance. Second edition. The Appalachian Trail Conference, Harpers Ferry, West Virginia, USA.

Garland, G. G. 1990. Technique for assessing erosion risk from mountain footpaths. Environmental Management 14(6):793–796.

Grab, S., and F. Kalibbala. 2008. “Anti-erosion” logs across paths in the southern Ukhahlamba-Drakensberg Transfrontier Park, South Africa: Cure or curse? Catena 73(1):134–145.

Hallgren, K. A. 2012. Computing inter-rater reliability for observational data: An overview and tutorial. Tutorials in Quantitative Methods for Psychology 8(1):23–34.

Hesselbarth, W., B. Vachowski, and M. A. Davies. 2007. Trail construction and maintenance notebook: 2007 edition. US Forest Service and Missoula Technology and Development Center, Missoula, Montana, USA.

Leung, Y-F., and J. L. Marion. 1996. Trail degradation as influenced by environmental factors: A state-of-knowledge review. Journal of Soil and Water Conservation 51(2):130–136.

———. 1999. Assessing trail conditions in protected areas: Application of problem-assessment method in Great Smoky Mountains National Park, USA. Environmental Conservation 26(4):270–279.

Marion, J. L., and J. F. Wimpey. 2017. Assessing the influence of sustainable trail design and maintenance on soil loss. Journal of Environmental Management 189:46–57.

Marion, J. L., J. F. Wimpey, and L. O. Park. 2011. The science of trail surveys: Recreation ecology provides new tools for managing wilderness trails. Park Science 28(3):60–65.

Monz, C. A., D. N. Cole, Y-F. Leung, and J. L. Marion. 2010. Sustaining visitor use in protected areas: Future opportunities in recreation ecology research based on the USA experience. Environmental Management 45(3):551–562.

Olive, N., and J. L. Marion. 2009. The influence of use-related environmental and managerial factors on soil loss from recreational trails. Journal of Environmental Management 90(3):1483–1493.

Upland Path Advisory Group. 2004. Upland pathwork: Construction standards for Scotland. Second edition. Scottish Natural Heritage, Battleby, Scotland, UK.

Zenk, S. N., A. J. Schulz, G. Mentz, J. S. House, C. C. Gravlee, P. Y. Miranda, P. Miller, and S. Kannan. 2007. Inter-rater and test-retest reliability: Methods and results for the neighborhood observational checklist. Health and Place 13(2):452–465.

About the authors

All of the authors are with North Carolina State University (NCSU), Raleigh, except as noted. Kaitlin Burroughs received her MS in Natural Resources (Outdoor Recreation Technical Option) from NCSU in 2016 and currently serves as a Wilderness Fellow at Hawai’i Volcanoes National Park. Yu-Fai Leung is professor and director of graduate programs in the Department of Parks, Recreation, and Tourism Management (PRTM) at NCSU. Roger L. Moore is associate professor in PRTM at NCSU. Gary B. Blank is associate professor and director of undergraduate programs in the Department of Forestry and Environmental Resources at NCSU.

Appalachian National Scenic Trail

Last updated: September 13, 2019