Predictive Modelling Methodology (continued)

by
Luke Dalla Bona

Table of Contents

Introduction
Settlement Studies
Predictive Modelling
Inductive Models
Deductive Models
The Graphical Approach
Modelling Procedures
The Intersection Method
The Weighted Value Method
Methodological Issues
Advantages of Predictive Modelling
On Developing A Predictive Model
Primary Stage Modelling: Organization and Data Collection
Secondary Stage Modelling: Initial Model Development and Testing
Tertiary Stage Modelling: Application and Refinement
Summary


METHODOLOGICAL ISSUES

There are a number of methodological issues concerning predictive modelling as it is represented in the literature. The first relates to the data sources used to develop predictive models. Regardless of the theoretical framework or the methodological approach used, all predictive models make use of a limited number of primary variables: slope, aspect, elevation, distance to water, vegetation zones and, in some cases, soil characteristics. These variables are regularly found on site record forms. Additionally, they can be traced to three main sources: topographic maps (either paper or digital), existing archaeological data, and aerial photographs. For example, Parker's Sparta Mine predictive model lists fifteen variables used to predict site location. Seven of those fifteen relate to streams, one is soil moisture and another is depth to the water table. Thus nine of the fifteen variables are expressions of a water variable. While the impression is given of a complex interplay of variables, in effect the model simply gives more emphasis to water.

Also, while the choice of variables has been limited and the source for these variables even more so, there seems to be an absence of culturally relevant variables. Very few, if any, predictive models incorporate variables derived from Native land use studies, ethnographic data or local informant interviews. While the utility of incorporating these kinds of data has yet to be demonstrated, logically, there is considerable value to incorporating them into predictive models of prehistoric activity locations (Dalla Bona and Larcombe 1992).

A second concern relates to the quality of some primary data sources. For example, Altschul states that the primary source for his three variables is a USGS digital terrain model. Kvamme (1990:114) evaluated USGS digital terrain data and concluded:

"although there is a general correspondence, in the (purchased USGS data) (1) many small ridges and hills are absent, (2) minor drainages are missing, (3) large features are greatly smoothed, and (4) there is a major error in the form of a 120 ft. high cliff face which would surely make a spectacular waterfall on the Colorado River (which flows down the central valley) if it really existed!"

Archaeologists generally consider terraces, hills and small creeks to be extremely important for regional settlement analysis. It is difficult to perform an analytical study using criteria such as terraces and small hills, when the map from which these criteria are drawn does not represent them accurately.

In addition to the questionable quality of some digital data, there is the issue of appropriate scale for modelling. For example, consider a predictive model developed for a large region at an effective scale of 1:50,000, for which much of the data may be derived from maps published at scales of 1:125,000 and 1:250,000. The high level of data generalization on these maps relative to the 1:50,000 map forces a significant degradation of the quality of the final model. Additional difficulties can develop if some variables are derived from large scale sources such as 1:15,000 aerial photographs. In this circumstance it may be necessary to reduce the precision and detail of variables derived from the large scale sources (air photos) to match the scale of the 1:50,000 predictive model. The implications of variable scales of primary data are addressed more fully in a later chapter.
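The reduction of detail described above can be illustrated with a minimal sketch. It assumes that a variable (here, hypothetical slope values in percent) is stored as a regular grid, and that coarsening is done by averaging non-overlapping blocks of cells. The function name, the grid values and the choice of block averaging are all illustrative assumptions, not a procedure taken from this report.

```python
def aggregate(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D list of numbers."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            # Collect every fine cell that falls inside this coarse cell.
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

# Hypothetical fine-resolution slope values (percent).
fine = [
    [2, 4, 10, 10],
    [4, 6, 10, 30],
    [0, 0, 1, 1],
    [0, 8, 1, 1],
]
coarse = aggregate(fine, 2)
```

In this example the single steep cell (30) disappears into a block average of 15, which is the kind of smoothing of small features that makes mixed-scale data a concern.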

Thirdly, those developing predictive models tend to present their results as statements of high/medium/low potential areas, or areas of favourability/non-favourability. However, the means by which these categories were defined is seldom clearly expressed, and the cutoff points between high and medium, and between medium and low potential, are rarely if ever discussed. Clearly, this is an issue of importance to cultural resource managers and archaeological researchers alike. The modelling approach developed in this research does not categorize potential; rather, it presents a scale of potential on which zones of high/medium/low can be determined more clearly, and the rationale for that determination is openly presented for further discussion.
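One way to make such cutoff points open to discussion is to state them as explicit, named values alongside the continuous scale. The following is a minimal sketch only: the 0 to 100 scale, the cutoff values of 40 and 70, and the function name are hypothetical and would need to be justified by the researcher.

```python
from bisect import bisect_right

# Hypothetical cutoffs on an assumed 0-100 potential scale:
# low < 40 <= medium < 70 <= high.
CUTOFFS = [40.0, 70.0]
LABELS = ["low", "medium", "high"]

def classify_potential(score, cutoffs=CUTOFFS, labels=LABELS):
    """Map a continuous potential score to a named zone using stated cutoffs."""
    return labels[bisect_right(cutoffs, score)]

scores = [12.5, 40.0, 69.9, 70.0, 95.0]
zones = [classify_potential(s) for s in scores]
```

Because the cutoffs are data rather than buried logic, a reviewer can change them and immediately see how the zoning responds.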

ADVANTAGES OF PREDICTIVE MODELLING

With respect to the kinds of research and applications being made by cultural resource managers, predictive modelling holds considerable promise as a planning tool. A predictive model can offer explicit measures of the likelihood of cultural resources in specific localities. Generating such information can greatly increase the timeframe within which cultural resource managers may plan survey, confirmation and mitigation - well ahead of development activities. This could result in the avoidance of conflicts between land "values", and allow land developers and cultural resource managers to plan land use in fashions that minimize deleterious impacts. In addition, a credible predictive model can help focus archaeological reconnaissance, and direct research to areas within a region that hold some cultural significance. Using such "pointing tools" offers tremendous savings in time and money, and can be highly effective as a means of "stratifying" conventional random sample surveys over large areas. Finally, a good predictive model can be of use to both the cultural resource manager and the archaeologist interested in academic research applications. A predictive model may provide insight into the dynamics of the settlement system within a given region. By explicitly outlining the variables associated with specific types of sites, a predictive model may actually indicate some probable choices made by prehistoric people in developing their land use strategy.

ON DEVELOPING A PREDICTIVE MODEL

As discussed in the sections above, there are many different approaches to the development of predictive models. These approaches make use of different theoretical and methodological frameworks. The methodology outlined in Volume 4 of this report series crosscuts a number of theoretical and methodological constraints. It can be applied to academic research applications or for 'applied' cultural resource management purposes. It can be used from a deductive or inductive perspective, or employ elements of both. It can be used to develop a numerical and/or graphical model, and it can manipulate variables using the intersection and/or the weighted value approach. In summary, this methodology is relevant to many different archaeological applications with limits imposed only by the creativeness of individual researchers.
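The difference between the two ways of manipulating variables can be sketched briefly. This example assumes that each variable has been reduced to a binary layer (1 = favourable, 0 = not favourable), that the intersection approach counts how many favourable conditions coincide in a cell, and that the weighted value approach sums researcher-assigned weights. The cell identifiers, layer names and weights are hypothetical.

```python
# Hypothetical cells, each described by binary layers (1 = favourable).
cells = {
    "A": {"near_water": 1, "gentle_slope": 1, "south_aspect": 0},
    "B": {"near_water": 1, "gentle_slope": 1, "south_aspect": 1},
    "C": {"near_water": 0, "gentle_slope": 1, "south_aspect": 1},
}

# Hypothetical weights expressing the researcher's stated judgement.
WEIGHTS = {"near_water": 3, "gentle_slope": 2, "south_aspect": 1}

def intersection_score(layers):
    """Count how many favourable conditions coincide in a cell."""
    return sum(layers.values())

def weighted_score(layers, weights=WEIGHTS):
    """Sum the weights of the favourable conditions present in a cell."""
    return sum(weights[name] * value for name, value in layers.items())

intersection = {cid: intersection_score(l) for cid, l in cells.items()}
weighted = {cid: weighted_score(l) for cid, l in cells.items()}
```

Cells A and C tie under the intersection count (two favourable conditions each) but separate once weights are applied, which shows how the choice of approach, and of weights, shapes the result.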

As stated in Kohler's definition (1988:33), a predictive model is comprised of a set of testable hypotheses. To arrive at testable hypotheses, a model must be explicit in the variables that are used, and the manner in which those variables are manipulated. This includes clearly identifying and outlining the variables included in the model, the manner in which these variables interact, and any weighting placed upon the variables. Ideally, a flowchart-like diagram outlining the various processes involved in developing the model should be available. Such a diagram would graphically illustrate what variables are used, and how they interact to produce the final result. One of the major stumbling blocks and criticisms of all predictive models is the subjective input of the researcher's own knowledge. All archaeologists acknowledge that this information is important and should not be ignored. However, to be really useful, it should be made clear what knowledge is being applied to the development of the model as well as how it is being used. The methodology employed here makes that explicit - indeed the researcher is forced to be explicit.

There are a number of assumptions that one works under when developing a predictive model. The first is that choices of activity locations made by prehistoric people were influenced by elements of the natural and physical environment. The second is that these environmental variables have survived, and can be represented by presently available data. These data may be in the form of maps or monographs, or may still remain to be collected in the field. The third is that correlations between archaeological sites and the natural/physical environment observed by modern researchers reflect land use choices made by prehistoric decision makers. That is, the correlation is assumed not to be due to chance, nor to reflect the effect of another, presently undocumented, independent variable. These assumptions may be strengthened or confirmed by repeated testing or application of a model, but the true nature of prehistoric human action can never be fully known.

As predictive models attempt to codify aspects of human behaviour, one cannot expect a model to be simplistic in its makeup, or to be developed in a single effort. The development period of a predictive model is not finite. Altschul calls this a

"...dynamic modelling approach. Once anomalies...are identified, they become the subject of additional research. As patterns are found, many anomalies become predictable. Those sites whose locations remain anomalous grow in importance" (1990:228).

Modelling should be seen and conducted as a dynamic process whereby data collected from any source, at any time, can be incorporated into the modelling process to increase its integrity, accuracy and scope. As such, predictive modelling may be seen as involving three stages: (1) primary stage predictive modelling involving data collection and organization; (2) secondary stage predictive modelling in which an initial model is developed and tested, and; (3) tertiary stage predictive modelling in which the model is subjected to an infinite number of applications and refinements. This process is summarized in Table 2.

Table 2. Summary of the three stage modelling process.

Primary Stage
    hypothesis building - data collection strategies
    initial data collection
    field reconnaissance, collection of baseline data

Secondary Stage
    deductive phase of modelling
    association between environmental variables and sites
    literature review and integration into model(s)
    development or application of initial model(s)
    testing of model(s) on previously surveyed areas

Tertiary Stage
    continued application of model(s)
    new data continuously incorporated into process
    new sites discovered are interpreted and incorporated into existing model(s)

Primary Stage Modelling: Organization and Data Collection

The development of the primary stage predictive model involves three activities: hypothesis building and data collection strategies, initial data collection, and field reconnaissance. Hypothesis building and data collection strategies are the crucial first step. Hypotheses must be generated about the people and activities being modelled. These hypotheses will in large part dictate the important variables to be modelled, the manner in which those variables will contribute to site potential and the data that will be collected. Initial data collection is conducted within the parameters of the hypotheses generated and can be viewed as taking place within both deductive and inductive theoretical frameworks. Existing archaeological site inventories are often a primary source of such information. Predictive models presented in the literature can be reviewed for pertinent data and analytically useful variables. In addition, information gathered from other sources such as ethnographic or land use studies can be evaluated and incorporated into the model. These data are important in developing a theoretical framework in which to interpret the results of the model as well as to guide the data to be collected.

"To have confidence in any models which emerge, we need to know why the behavior we predict patterns as it does" (Tainter 1983:7).

It is important to note that the researcher must start somewhere and existing data and successful examples of other predictive models offer an acceptable base, subject to a careful evaluation of their relevance and completeness.

The primary stage may be understood as the organizational stage of the modelling process. The researcher must make numerous decisions including:

a) the scale at which modelling will take place;
b) the spatial boundaries within which the model is applicable;
c) the temporal scope of the model, and;
d) the functional scope of the model, i.e. does it apply to all or selected activity types.

Many issues may be predetermined and a function of the project proposal or terms of reference. This is particularly the case with predictive models developed for cultural resource management purposes. It is during the primary stage that an archaeological field survey may be conducted. Usually this involves inductive data collection from portions of the "research universe" that are unrepresented in the existing heritage resource inventory. While it may be that some archaeological information already exists in the form of a site database, it may be subject to a number of biases beyond the control of the researcher. Thus, the collection of new baseline archaeological data provides the researcher with a more complete and representative database with which to build the model. The field program should include as complete an areal survey as possible. The size of the survey area need not be exceedingly large but should represent the study area as a whole. It is also important that a range of environmental characteristics, that are deemed to be the independent variables, be known and mapped within the survey area. The intention of the survey is to understand the distribution, frequency, and component parts of all the sites in the survey area. With the completion of the initial round of data collection and archaeological reconnaissance, primary stage predictive modelling is complete.

Secondary Stage Modelling: Initial Model Development and Testing

A secondary stage predictive model can be said to begin when the requirements of primary stage modelling have been fulfilled. Once this has been achieved, the researcher enters into a deductive phase and can begin to incorporate these data into the second stage of the model. The degree of correlation between the sites discovered during the field survey and the defined environmental variables is measured and ranked. Existing variables derived from the literature can then be evaluated on the basis of the strength of their correlation with this expanded site database. Cultural variables such as plant gathering or species-specific hunting activities can be incorporated into the variables to be modelled.
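The measuring and ranking step can be illustrated with a simple sketch. The measure used here is an assumption chosen for clarity, not one prescribed by this report: for each binary variable, the share of site cells in which it is present is compared with its share across the whole survey area. All cell values, site locations and names are hypothetical.

```python
def association(cells, site_ids, variable):
    """Share of site cells with the variable present, minus its share in all cells."""
    all_share = sum(c[variable] for c in cells.values()) / len(cells)
    site_share = sum(cells[i][variable] for i in site_ids) / len(site_ids)
    return site_share - all_share

# Hypothetical survey area of six cells, two of which contain sites.
cells = {
    1: {"near_water": 1, "gentle_slope": 1},
    2: {"near_water": 1, "gentle_slope": 0},
    3: {"near_water": 0, "gentle_slope": 1},
    4: {"near_water": 0, "gentle_slope": 1},
    5: {"near_water": 0, "gentle_slope": 0},
    6: {"near_water": 0, "gentle_slope": 1},
}
site_ids = [1, 2]

# Rank variables from strongest to weakest positive association.
ranked = sorted(
    ((association(cells, site_ids, v), v) for v in ["near_water", "gentle_slope"]),
    reverse=True,
)
```

A value near zero suggests that sites occur on a variable no more often than the landscape as a whole would lead one to expect; with so few sites, as here, any such figure would of course need formal statistical testing before being relied upon.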

The researcher may now develop an initial predictive model and test it using the area surveyed in the primary stage. While it may appear that this step is a 'self-fulfilling prophecy', one must be reminded that a variety of data were used to develop the initial predictive model - not solely the data derived from the primary stage survey. Based upon the hypotheses generated earlier, variables can be introduced or removed from the process, or the weighting of the variables can be adjusted until the model is able to predict the highest percentage of sites possible.
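When adjusting a model against known sites, it may help to record not only the share of sites predicted but also the share of the landscape that had to be flagged to achieve it, since a model that flags everything trivially predicts every site. The sketch below assumes per-cell potential scores and a single threshold; the scores, site locations and function name are hypothetical.

```python
def evaluate(scores, site_ids, threshold):
    """Return (share of known sites flagged, share of all cells flagged)."""
    flagged = {cid for cid, s in scores.items() if s >= threshold}
    sites_hit = sum(1 for cid in site_ids if cid in flagged) / len(site_ids)
    area = len(flagged) / len(scores)
    return sites_hit, area

# Hypothetical potential scores for six cells; cells 1-3 contain known sites.
scores = {1: 6, 2: 5, 3: 3, 4: 2, 5: 0, 6: 5}
site_ids = [1, 2, 3]

strict = evaluate(scores, site_ids, threshold=5)  # flag only high scores
loose = evaluate(scores, site_ids, threshold=0)   # flag every cell
```

Here the loose threshold captures all three sites only by flagging the entire area, whereas the strict threshold captures two of the three sites in half the area. Reporting both figures keeps the adjustment of variables and weights honest.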

A second field survey program in an area near the first is necessary to collect more baseline data and/or test the model. It is recognized however, that this may not always be feasible because of external limitations such as time and money. Once again, the strength of correlation between known site locations and the identified independent variables should be measured. This information should be incorporated into a new, 'second generation' predictive model. This model would then be applied to both the primary and secondary stage survey areas. The variables would be modified in such a way as to produce a model predicting the highest percentages of known archaeological sites. Once this has been achieved, tertiary stage modelling may begin.

Tertiary Stage Modelling: Application and Refinement

A tertiary stage predictive model begins when the secondary stage predictive model predicts the location of the highest number of sites possible in the two previous survey areas. It is at this point that the model may be considered applicable in a real sense. Testing procedures have been carried out, and have demonstrated the validity and integrity of the model. The researcher must nevertheless be vigilant to ensure that the model is not applied blindly. Any continuing application (or expansion of application) of the model must undergo thorough testing, much like that conducted in the secondary stage, to ensure the ongoing validity of the model. New data must be incorporated into the model year after year in an effort to produce the most robust model possible. In addition, as repeated applications of the model are effected, sites that were once 'anomalous' may now become patterned. Such observations must be formalized, carefully interpreted, and then integrated into the model to maintain and update its integrity.

Few, if any, models have achieved tertiary stage development because of the nature of the agencies employing them. Most cultural resource management agencies are limited by their role as resource managers and have neither the time nor resources for ongoing research and development. They are interested in identifying the location of resources in order to facilitate informed planning and management. Given the urgency of these goals, it is often an irresistible temptation to implement partially tested models as a part of routine resource management and planning. As a result, predictive models developed for such agencies often resemble a procedural 'cook book'. The variables identified in the prototype model become transformed from untested indicators of site distribution into routine 'red flags' of site location. As the predictive model is being routinely used, it gains unwarranted credence and uncritical acceptance. Users may reason that if certain steps are followed, a scientifically valid result will follow.

The three stage modelling process outlined here reduces the likelihood that such a 'cook book' approach will result. An additional point may be raised concerning the prospect that a predictive modelling approach will supplant conventional archaeological field work. There should never be a point where predictive models take the place of field work. In the context of an academic modelling exercise, the negative implications of a poor model are relatively minor. However, in the context of cultural resource management, the implications of applying a poorly tested model can be severe, and perhaps even disastrous. At the same time, it is not realistic to expect resource management agencies to do nothing until all possible sites have been field inspected. Clearly a reasonable compromise is possible and best serves all agencies involved.

"I have no objection to the use of multivariate locational models for research and planning purposes, but they simply cannot provide sufficient evidence to warrant the granting of archaeological clearance without the benefit of field survey. Any such reliance on predictive models to 'write off' areas of low projected site density constitutes both an abuse of statistical methods and an abrogation of ... management responsibilities" (Berry 1984:845-6).

Once again, the development of predictive models is a dynamic process where models are rigorously tested over many years and in many different areas. The results of each year's testing must also be incorporated into the existing model. Information gained in future years of application is also incorporated into the model development process. Ideally, this process should never stop.

SUMMARY

Predictive modelling is a research methodology used by archaeologists to identify prehistoric activity locations. It has its basis in settlement pattern analysis, and the results of predictive modelling continue to further the aims of settlement pattern studies. Predictive models have been identified as operating within inductive and deductive theoretical frameworks. They are also developed using distinct methodological approaches. Since the mid 1970s, predictive models have become less associated with the settlement studies from which they emerged. For the most part, predictive models have developed within the sphere of cultural resource management, and their application has been primarily in this management context. However, the breadth of application is increasing as archaeologists recognize the potential of predictive models as an academic research tool.
