Exercise 9: Seriation.
 

1.  First, let's try a little experiment.  Using the data in PLAY4.SYD (the same data used in the previous exercise), calculate a matrix of Euclidean distances for all pairs of objects using the CORR procedure (Statistics | Correlations | Simple).  Then, based on this matrix, use the MDS procedure to scale these objects in two dimensions (Statistics | Scale | MDS).  How well does SYSTAT's multidimensional scaling algorithm reproduce the relative locations of the points in your original scatter diagram?  Was this the result you expected?

[Hint: To calculate Euclidean distances with SYSTAT, you must create a dataset in which the cases correspond to the original variables X and Y, and the variables correspond to the original objects A-Q.  In other words, the dataset PLAY4 must be turned on its side: the variables must become cases and the cases must become variables.  This can be accomplished easily by using the TRANSPOSE command (Data | Transpose).  Because TRANSPOSE works only on numeric variables, the character variable called OBJECT$ will automatically be dropped.  After the dataset has been transposed, you can use the Data Window to rename each variable with the object designations that were formerly in OBJECT$.  This will make it easier for you to interpret the output of the multidimensional scaling.  To get a scatter plot of your results, be sure that the "Statistical Quickgraphs" option (Edit | Options | Output) is turned on when you run MDS.]
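
If you would like to check the logic of this experiment outside SYSTAT, the following is a minimal Python sketch of the same idea.  It makes several assumptions: the coordinates are hypothetical stand-ins (they are not the values in PLAY4.SYD), the function names are invented for illustration, and the scaling method is classical (Torgerson) metric scaling rather than whatever algorithm SYSTAT's MDS procedure uses, so the two sets of results need not agree in detail.

```python
import numpy as np

# Hypothetical stand-in coordinates; the real X and Y values are in PLAY4.SYD.
labels = ["A", "B", "C", "D", "E"]
xy = np.array([[1.0, 1.0], [2.0, 4.0], [5.0, 3.0], [6.0, 6.0], [3.0, 2.0]])


def euclidean_distances(points):
    """Return the square matrix of Euclidean distances between all rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def classical_mds(dist, n_dims=2):
    """Classical (Torgerson) scaling; an assumption, not SYSTAT's algorithm."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared distances to get a matrix of scalar products.
    b = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    # Keep the n_dims largest eigenvalues and their eigenvectors.
    order = np.argsort(eigvals)[::-1][:n_dims]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))


dist = euclidean_distances(xy)
config = classical_mds(dist)

# Compare distances in the scaled configuration with the original distances.
recovered = euclidean_distances(config)
max_error = np.abs(recovered - dist).max()

for label, (dim1, dim2) in zip(labels, config):
    print(f"{label}: {dim1:8.3f} {dim2:8.3f}")
print("largest distance discrepancy:", max_error)
```

In this sketch the scaled configuration may be rotated, reflected, or shifted relative to the original scatter diagram, so it is the distances between points, not the raw coordinates, that should be compared.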
 

2. The second part of this exercise is designed to give you practice in using a variety of seriation techniques.  The data consist of type frequencies (percentages) at nine Late Woodland components in the Eno River drainage of North Carolina.  Site Or231H is a historic village with European trade goods dating to about AD 1700.  The rest of the components lack European trade goods and are presumably prehistoric.  The data are presented to you in three forms: (a) a bar chart on the attached sheet showing the relative frequencies of types at the various sites; (b) a SYSTAT file called ENOSITES.SYD in which the cases are sites and the variables are types; and (c) a SYSTAT file called ENOTYPES.SYD in which the cases are types and the variables are sites (in other words, ENOTYPES is simply a transposed version of ENOSITES).  Your job is to do the following:

Discuss and interpret your results.  (Note: You can use Kintigh's FORD.EXE program to produce nice seriation graphs and send them to a laser printer or an HPGL plotter file.)
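
As an illustration of how percentage data of this kind can be compared between sites, here is a small Python sketch of the Brainerd-Robinson coefficient, one similarity measure commonly used in seriation.  It is offered only as an example and may not be one of the techniques this exercise asks for.  The site names and percentages are hypothetical (they are not the ENOSITES data), and the function name is invented for illustration.

```python
# Hypothetical type percentages (each row sums to 100); not the ENOSITES data.
sites = {
    "Site1": [60.0, 30.0, 10.0],
    "Site2": [40.0, 40.0, 20.0],
    "Site3": [10.0, 30.0, 60.0],
}


def brainerd_robinson(p, q):
    """Brainerd-Robinson similarity: 200 minus the summed absolute
    differences in type percentages (200 = identical, 0 = no overlap)."""
    return 200.0 - sum(abs(a - b) for a, b in zip(p, q))


names = list(sites)
br = {(a, b): brainerd_robinson(sites[a], sites[b]) for a in names for b in names}

# Print the similarity matrix, one row per site.
for a in names:
    print(a, " ".join(f"{br[(a, b)]:6.1f}" for b in names))
```

With these made-up figures, Site2 is more similar to both Site1 and Site3 than those two are to each other, which is the pattern one would expect if Site2 fell between them in a seriation order.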


Datasets and programs for this exercise (right-click to download):

