Exercise 11: Correspondence Analysis
 

The file ARCHAIC2.SYD contains data on assemblage composition from 10 Early Archaic sites on the Atlantic slope of the eastern U.S., the same data used in the cluster-analysis exercise. In this case, however, the dataset has been reconfigured into a form more suitable for correspondence analysis. The variables in the file are the following: SITE$, site name; TOOL$, tool type; and N, the count of that tool type at that site. If you think of ARCHAIC.SYD (the cluster-analysis dataset) as a spreadsheet, then ARCHAIC2.SYD (the correspondence-analysis dataset) is the same data reconfigured so that each cell in the original spreadsheet now appears as a separate row.
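To make the spreadsheet-to-rows reconfiguration concrete, here is a minimal Python sketch. It is only an illustration of the idea, not part of the SYSTAT workflow: the site names, tool types, and counts are invented, and the helper name wide_to_long is hypothetical.

```python
# Hypothetical counts, invented for illustration; NOT the ARCHAIC.SYD data.
# Outer keys play the role of spreadsheet rows (sites),
# inner keys the role of spreadsheet columns (tool types).
wide = {
    "SiteA": {"scraper": 12, "point": 5, "drill": 0},
    "SiteB": {"scraper": 3, "point": 9, "drill": 4},
}


def wide_to_long(table):
    """Return one (SITE$, TOOL$, N) record per cell of the site-by-tool table."""
    rows = []
    for site, counts in table.items():
        for tool, n in counts.items():
            rows.append({"SITE$": site, "TOOL$": tool, "N": n})
    return rows


long_rows = wide_to_long(wide)
for row in long_rows:
    print(row["SITE$"], row["TOOL$"], row["N"])
```

A table with 2 sites and 3 tool types yields 2 x 3 = 6 rows, one per original cell, including cells whose count is zero.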

In order to do a correspondence analysis, you must first tell SYSTAT that N is a “frequency variable” (Data | Frequency), i.e., that N represents the frequency (count) with which each particular combination of SITE$ and TOOL$ occurs.
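The idea behind a frequency variable can be sketched in Python as follows. This is an assumption-laden illustration rather than a description of SYSTAT's internals: the data are invented, and the two function names are hypothetical. It shows that treating N as a weight gives the same cross-tabulation as literally repeating each row N times.

```python
from collections import Counter

# Hypothetical (SITE$, TOOL$, N) rows, invented for illustration.
long_rows = [
    ("SiteA", "scraper", 12), ("SiteA", "point", 5), ("SiteA", "drill", 0),
    ("SiteB", "scraper", 3), ("SiteB", "point", 9), ("SiteB", "drill", 4),
]


def crosstab_weighted(rows):
    """Tabulate SITE$ by TOOL$, using N as the weight of each row."""
    table = Counter()
    for site, tool, n in rows:
        table[(site, tool)] += n
    return table


def crosstab_expanded(rows):
    """Tabulate after repeating each row N times, i.e. one record per artifact."""
    cases = [(site, tool) for site, tool, n in rows for _ in range(n)]
    return Counter(cases)


weighted = crosstab_weighted(long_rows)
expanded = crosstab_expanded(long_rows)

# Both approaches give the same count in every cell.
for site, tool, _ in long_rows:
    print(site, tool, weighted[(site, tool)], expanded[(site, tool)])
```

Without the frequency declaration, each row would count as a single case, so every cell of the cross-tabulation would contain 1 instead of the artifact count.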

Once you’ve done this, carry out a correspondence analysis of these data (Statistics | Data reduction | Correspondence analysis). The two variables to use in this analysis are SITE$ and TOOL$. In SYSTAT’s terminology, the variable you specify as the “dependent variable” at the outset becomes the “row variable” in the output listing; similarly, the variable you specify as the “independent variable” at the outset becomes the “column variable” in the output listing.
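For readers who want to see what a correspondence analysis computes, the following is a minimal sketch of one standard formulation (singular value decomposition of the standardized residuals of the row-by-column table), written with NumPy. It is not SYSTAT's implementation, and its output may differ from SYSTAT's in scaling or sign conventions. The counts are invented, and the function name correspondence_analysis is hypothetical.

```python
import numpy as np

# Hypothetical counts, invented for illustration.
# Rows = sites (the "row variable"), columns = tool types (the "column variable").
counts = np.array([
    [12.0, 5.0, 1.0],
    [3.0, 9.0, 4.0],
    [6.0, 6.0, 6.0],
])


def correspondence_analysis(counts):
    """Return principal row coordinates, principal column coordinates,
    and the inertia (squared singular value) of each dimension."""
    P = counts / counts.sum()          # correspondence matrix (proportions)
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    expected = np.outer(r, c)          # proportions expected under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv ** 2


row_coords, col_coords, inertia = correspondence_analysis(counts)
print("share of inertia per dimension:", inertia / inertia.sum())
print("site coordinates (first two dimensions):")
print(row_coords[:, :2])
print("tool coordinates (first two dimensions):")
print(col_coords[:, :2])
```

In this formulation the total inertia equals the chi-square statistic of the table divided by the total count, and passing the transposed table simply exchanges the roles of the row and column variables.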


Datasets and command files for this exercise (right-click to download):

