Glossary


A

Absent - a feature may be present or absent in an artifact or assemblage. Using the feature parameters Earlier, Later, Blanks and Zeroes, a user may indicate when blank data or a zero should be considered as indicating the absence of a feature.

Active - a parameter that allows the user to specify the data set and seriation to use.

Artifact - an object being seriated (ordered). See Artifacts.

Assemblage - a collection of artifacts. See Assemblages.


B

Blank - values in the data may be blank. See Blanks.

Blanks - a feature parameter that lets the user tell OptiPath how to interpret blanks in the data (as zeroes, as absences or as unknowns). See Blanks and Setting the Earlier, Later, Blanks, Zeroes and Transition Parameters.


C

Classed Data - alphanumeric values ("red", "-78", "large", "3.14159", "0", "absent", "dentate", etc.) that represent classifications of feature values. The feature parameter Data can be used to specify that a feature should use classed data. See Classed Data.

Custom Seriation - a seriation technique defined by the user by selecting the parameter settings normally used to specify the Optimal Path, Occurrence, Frequency, Discrete or Nominal seriation techniques. See Custom Seriation.


D

Data - values or measurements associated with the features for each artifact. See Data.

Data Parameter - a feature parameter that lets the user tell OptiPath how to interpret the input data (as Measured, Ranked or Classed). See Data.

Data Set - a single set, or sample, of artifacts which the user is trying to seriate. See Data Sets.

Discrete Seriation - a seriation technique that is a special case of optimal path seriation where the feature values happen to be integral. See Discrete Seriation.

Distance - a measure of how much the features of one artifact differ from another. Distance is measured according to a distance function, or metric (Euclidean, Manhattan or Hamming). See Metric.
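As a minimal sketch of the three metrics named above, assuming each artifact is represented as an equal-length list of numeric feature values. The function names are illustrative and are not part of OptiPath.

```python
import math

def euclidean(a, b):
    """Square root of the sum of squared per-feature differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of the absolute per-feature differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    """Number of features whose values differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

# Two hypothetical artifacts described by three features each.
artifact_1 = [3.0, 0.0, 5.0]
artifact_2 = [0.0, 4.0, 5.0]

print(euclidean(artifact_1, artifact_2))  # 5.0
print(manhattan(artifact_1, artifact_2))  # 7.0
print(hamming(artifact_1, artifact_2))    # 2
```

The third feature is identical in both artifacts, so it contributes nothing to any of the three distances.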

Distance Function - a metric (Euclidean, Manhattan or Hamming). See Metric.


E

Earlier - a feature parameter that lets the user tell OptiPath what assumptions to make about artifacts earlier than those in the data set. See Earlier and Setting the Earlier, Later, Blanks, Zeroes and Transition Parameters.

Earliest - a parameter that lets the user tell OptiPath the earliest allowable date for artifacts. An earliest date for all artifacts can be set in the Seriations window; earliest dates for individual artifacts can be set in the Artifacts and Results windows. See Earliest.

Euclidean Distance - a distance function (metric) where the distance between two items (artifacts) is the square root of the sum of the squared differences in all of the dimensions (features) in which the items differ. Compare with Hamming distance and Manhattan distance. See Euclidean Distance.

Exclude - a parameter that lets the user tell OptiPath to exclude features, artifacts or assemblages from the current seriation without removing them permanently from the data.

Excel - Microsoft Excel. OptiPath supports importing a data table from and exporting results to a Microsoft Excel spreadsheet. See Import and Export.

Export - to export a data set and seriation from OptiPath to an external data format. Currently OptiPath only supports exporting a data set to a Microsoft Excel spreadsheet. See Export.


F

Features - Features are measurable attributes shared by artifacts. It is assumed in seriation that the evolution of a feature's measure over time is gradual. See Features.

Files - Files are the permanent files retained on your computer even when OptiPath is not running. A file holds all of the data for your data sets and seriations. See Files.

Frequency Seriation - a seriation technique that is a special case of optimal path seriation where assemblages are treated as artifacts. See Frequency Seriation.


H

Hamming Distance - a distance function (metric) where the distance between two items (artifacts) is the number of dimensions (features) in which the items differ. Compare with Manhattan distance and Euclidean distance. See Hamming Distance.

Heuristic - a simplified and approximate method of analyzing a problem (for example, a rule of thumb rather than formal mathematical analysis). Heuristics may not always achieve the optimal outcome, but they are generally much easier to implement than optimal procedures and are generally much faster to execute.


I

Import - to import data from a source outside of OptiPath. Currently OptiPath only supports importing a data table from a Microsoft Excel spreadsheet. See Import.

Index - an integer value assigned to an item that allows users to sort items in a list.


L

Later - a feature parameter that lets the user tell OptiPath what assumptions to make about artifacts later than those in the data set. See Later and Setting the Earlier, Later, Blanks, Zeroes and Transition Parameters.

Latest - a seriation parameter that lets the user tell OptiPath the latest allowable date for all artifacts. A latest date for all artifacts can be set in the Seriations window; latest dates for individual artifacts can be set in the Artifacts and Results windows. See Latest.


M

Manhattan Distance - a distance function (metric) where the distance between two items (artifacts) is the sum of the absolute values of the differences in all of the dimensions (features) in which the items differ. Compare with Hamming distance and Euclidean distance. See Manhattan Distance.

Measured Data - numerical values, including integers (-5, 1, 3, 0, etc.) and fractions expressed as decimals (1.3, -0.43, -3.14159, 2.71828, etc.), that represent measurements (or counts) of feature values. The feature parameter Data can be used to specify that a feature should use measured data. See Measured Data.

Metric - a feature parameter that lets the user tell OptiPath which metric, or distance function (Euclidean, Manhattan or Hamming), to use in computing the feature's contribution to the distance between two artifacts.


N

Nominal - establishing an identity rather than an ordinal value or seriable magnitude. The feature parameter Data can be used to specify that a feature should use classed data. See Nominal Seriation.

Nominal Seriation - a seriation technique that is a generalization of occurrence seriation. Instead of categorizing each feature into two classes (present or absent), the technique allows any number of classes. See Nominal Seriation.

Normalize - a feature parameter that lets the user tell OptiPath to normalize the data for the feature. Normalization is done by converting values to standard deviations. See Normalize.
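A minimal sketch of one common reading of "converting values to standard deviations": each value is expressed as its distance from the feature's mean, measured in standard deviations. The choice of the population standard deviation, the handling of a constant feature, and the function name are all assumptions made for illustration; the glossary does not say which variant OptiPath uses.

```python
import statistics

def normalize(values):
    """Express each value as a number of standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # assumption: population standard deviation
    if spread == 0:
        # Assumption: a feature whose values are all equal normalizes to zeroes.
        return [0.0 for _ in values]
    return [(v - mean) / spread for v in values]

# Hypothetical measurements of one feature across three artifacts.
print(normalize([2.0, 4.0, 6.0]))
```

After normalization, features measured on very different scales (millimetres versus counts, for example) contribute comparably to a distance.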


O

Occurrence Seriation - a seriation technique used when only the presence or absence of features is known and measured data is unavailable. See Occurrence Seriation.

OPS Date Index - the OPS Date Index is a rough measure of the effectiveness of a seriation. The OPS Date Index for a feature is an R squared value. For each feature the OPS Date Index is an estimate of the fraction of the variance in the dates assigned by the seriation for all features which is accounted for by an optimal assignment of dates when only the one feature is considered. The value is calculated for each feature. Features with fewer than 2 data entries are ignored. The average OPS Date Index for all features is reported in the Distance column. The variance in the sample data is the variation of the item dates from their average value. A seriation determines an ordering of the items. Given this ordering, we infer dates for our items for each feature taken alone. We use these dates to create a "best fit" curve in the two-dimensional space defined by feature values and dates. The OPS Date Index statistic is the fraction of the variation in the original dates that is accounted for by this "best fit" curve.

OPS R Squared - the OPS R Squared is a rough measure of the effectiveness of a seriation. This statistic is an estimate of the fraction of the variance in the sample data that is accounted for by the seriation. The value is calculated for each feature. Features with fewer than 2 data entries are ignored. The average R squared value for all features is reported in the Distance column. The variance in the sample data is the variation of the feature values from their average value. A seriation determines an ordering of the items. Given this ordering, we infer dates for our items for each feature taken alone. We use these dates to create a "best fit" curve in the two-dimensional space defined by feature values and dates. The OPS R Squared statistic is the fraction of the variation in the sample data that is accounted for by this "best fit" curve.

OPS Order Index - the OPS Order Index is a rough measure of the effectiveness of a seriation. For each feature the OPS Order Index is the ratio of the minimum possible path length for that feature considered alone to the current seriation path length for that feature alone. Normally, the minimum path length for a single feature is simply the difference between the largest value and the smallest value for the feature. However, values for artifacts earlier or later than those being seriated may be considered as absences, zeroes or unknowns, and zeroes may be considered as values, absences or unknowns. Depending upon the value of the transition penalty, and the settings for Earlier, Later and Zeroes (see Features), the minimum path length may be more complex than the difference between the largest and smallest feature values. For the overall seriation the OPS Order Index is the average of the OPS Order Indices for all the features, and is recorded in the Distance column. If your data contains earliest and latest dates for the items being seriated, OptiPath ignores these in computing the OPS Order Index, resulting in a more conservative estimate of effectiveness.
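The simplest case described above can be sketched as follows, assuming every feature value is known and the transition penalty and the Earlier, Later and Zeroes settings play no part. The function names and the treatment of a constant feature are illustrative assumptions.

```python
def feature_order_index(values_in_order):
    """Ratio of the shortest possible path to the path the ordering takes."""
    shortest = max(values_in_order) - min(values_in_order)
    actual = sum(abs(b - a) for a, b in zip(values_in_order, values_in_order[1:]))
    if actual == 0:
        return 1.0  # assumption: a constant feature counts as perfectly ordered
    return shortest / actual

def order_index(features_in_order):
    """Average of the per-feature indices."""
    scores = [feature_order_index(values) for values in features_in_order]
    return sum(scores) / len(scores)

# Two hypothetical features, each listed in the order given by a seriation.
rim_diameter = [1, 2, 4, 3]  # path 1 + 2 + 1 = 4, shortest possible 3
handle_count = [0, 1, 2, 3]  # path 3, shortest possible 3

print(feature_order_index(rim_diameter))          # 0.75
print(order_index([rim_diameter, handle_count]))  # 0.875
```

A feature that rises or falls steadily through the ordering scores 1; every reversal of direction lengthens the path and lowers the score.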

Optimal Path Seriation - a seriation technique based upon mathematical optimization heuristics. Optimal path seriation relies on the assumption that the artifacts to be seriated share characteristics or features whose measures evolve gradually over time. The ordering of artifacts that produces the most gradual evolution of all features for all artifacts is considered the optimal seriation. Optimal path seriation was developed by Brett Shepardson and Fred Shepardson and is being documented in a paper to be submitted for publication. See Optimal Path Seriation.
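To illustrate the idea of a "most gradual" ordering, the sketch below scores an ordering by its path length (the sum of the distances between consecutive artifacts) and finds the shortest path by trying every ordering. Exhaustive search is a stand-in chosen for clarity and is only practical for a handful of artifacts; it is not the heuristic procedure OptiPath uses. The artifact names, the values, and the choice of the Manhattan metric are assumptions.

```python
from itertools import permutations

def manhattan(a, b):
    """Sum of the absolute per-feature differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def path_length(order, artifacts):
    """Total distance travelled when visiting artifacts in the given order."""
    return sum(manhattan(artifacts[i], artifacts[j]) for i, j in zip(order, order[1:]))

def best_order(artifacts):
    """Try every ordering and keep the one with the shortest path."""
    names = sorted(artifacts)
    return min(permutations(names), key=lambda order: path_length(order, artifacts))

# Four hypothetical artifacts, each with two measured features.
artifacts = {
    "pot_a": [1.0, 10.0],
    "pot_b": [5.0, 2.0],
    "pot_c": [2.0, 8.0],
    "pot_d": [4.0, 4.0],
}

order = best_order(artifacts)
print(order, path_length(order, artifacts))
# ('pot_a', 'pot_c', 'pot_d', 'pot_b') 12.0
```

An ordering and its reverse always have the same path length, so a score of this kind fixes the sequence but not which end is earlier.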

Order - an artifact or assemblage's sequential (ordinal) position in a seriation.


P

Present - a feature may be present or absent in an artifact or assemblage.


R

R Squared - a statistic that measures the fraction of the variance in the sample data that is accounted for by a proposed explanation, often referred to as the "best fit" curve.
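As a minimal sketch of the statistic, the example below fits a straight line by least squares and reports the fraction of the variance it accounts for. Using a straight line as the "best fit" curve is an assumption made for illustration, as are the function name and the sample numbers; the curve OptiPath fits is not specified here.

```python
def r_squared(xs, ys):
    """R squared of the least-squares line through the points (xs, ys).

    Requires at least two distinct x values and some variation in ys.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the line, and total variation about the mean.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

dates = [1, 2, 3, 4]                      # hypothetical inferred dates
print(r_squared(dates, [2, 4, 6, 8]))     # 1.0: the line explains everything
print(round(r_squared(dates, [1, 3, 2, 4]), 2))  # 0.64
```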

Randomize - a seriation parameter that lets the user tell OptiPath whether or not to randomize the search for a good seriation when using heuristic solution procedures. Randomizing can lead to better, but not necessarily reproducible, results. Reproducible randomized results can be obtained by setting the Seed. See Randomize.

Ranked Data - integer values (-5, 1, 3, 0, etc.) that represent counts or rankings of feature values. The feature parameter Data can be used to specify that a feature should use ranked data. See Ranked Data.

Ranks - a feature parameter that lets the user tell OptiPath the limit on the number of categories or classes to which a feature can be assigned. See Ranks.


S

Seed - a seriation parameter that lets the user specify a starting point for the random number generator that allows results to be reproduced when using randomized heuristic solution procedures. See Seed.
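The effect of a seed can be sketched with Python's standard random module, used here only as a stand-in for OptiPath's own random number generator. The function name is illustrative.

```python
import random

def shuffled(items, seed=None):
    """Return a randomly reordered copy; the same seed gives the same order."""
    rng = random.Random(seed)  # seed=None falls back to an unpredictable start
    result = list(items)
    rng.shuffle(result)
    return result

order = ["a", "b", "c", "d", "e"]
print(shuffled(order, seed=42) == shuffled(order, seed=42))  # True
```

Without a seed, two runs will generally produce different orders, which is why randomized results are not reproducible unless the Seed is set.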

Seriate - to order or arrange in a series. For archaeologists a seriation is generally an attempt to arrange items in chronological order.

Seriation - an ordering or arrangement of items (artifacts) in a manner to approximate assumed serial patterns in the measures of the items' features. The underlying assumption in archaeological seriation is that a correct chronological ordering of artifacts would exhibit such patterns, in particular that the evolution of a feature is gradual. Evolutionary gradualness is rooted in the assumption that the form of an artifact is influenced by the form of functionally similar artifacts that preceded it. The implication is that artifacts whose manufacture is proximate in time and space will be of similar form or style. From this the converse is concluded, that artifacts with similar style or form are likely to have origins that are proximate in time and space. Therefore, by considering artifacts’ style and form it is possible to determine a most likely ordering of their origins in time or space. See Seriations.

Shuffle - to reorder a seriation randomly. Shuffle is an option on the seriation dialog. See Shuffle.


T

Table - a standard way of presenting data in OptiPath. Any information entered into a row of a table will not be saved or take effect until the cursor is moved to another row of the table.

Technique - a seriation parameter that allows the user to specify the seriation technique (Optimal Path, Occurrence, Frequency, Discrete, Nominal or Custom) to use. See Technique.

Transition Penalty - a feature parameter that lets the user tell OptiPath how important it is that a feature appear only once in the archaeological record, rather than appearing multiple times with intervening intervals where the feature is absent. See Transition and Setting the Earlier, Later, Blanks, Zeroes and Transition Parameters.


U

Unknown - a blank or zero in the data may indicate that the value of the feature for this artifact is unknown. Using the feature parameters Earlier, Later, Blanks and Zeroes, a user may indicate when blank data or a zero should be considered as having an unknown value. See Setting the Earlier, Later, Blanks, Zeroes and Transition Parameters.

Updates - for the latest version of OptiPath visit the OptiPath website.


V

Value - using the feature parameter Zeroes a user may indicate that a zero in the data should be considered the value for the feature (rather than an absence or an unknown value). See Setting the Earlier, Later, Blanks, Zeroes and Transition Parameters.

Value & Absence - using the feature parameter Zeroes a user may indicate that a zero in the data should be considered both as the value for the feature and as indicating the feature is absent (rather than an unknown value). See Setting the Earlier, Later, Blanks, Zeroes and Transition Parameters.

Values - the input data after it has been interpreted and processed according to the user's feature parameters. See Values.


W

Weight - a feature parameter that lets the user tell OptiPath how much importance to give a feature relative to other features. See Weight.

Weights - a seriation parameter that lets the user tell OptiPath whether or not to use weights on features while computing distances between artifacts. See Weights.


Z

Zero - values in the data may be zero. Using the feature parameters Earlier, Later and Blanks, a user may indicate when blank data should be considered as having a value of zero.

Zeroes - a feature parameter that lets the user tell OptiPath how to interpret zeroes in the data (as values, as absences, as values and absences, or as unknowns). See Setting the Earlier, Later, Blanks, Zeroes and Transition Parameters.
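The interpretations listed for Blanks and Zeroes can be sketched as a small function that turns one raw cell into a value and an absence flag. This is a hypothetical illustration: the function name, the option strings, the return shape, and in particular the assumption that a blank read as zero is then handled like a typed zero are not taken from OptiPath.

```python
def interpret(cell, blanks="unknown", zeroes="value"):
    """Return (value, absent) for one raw numeric cell.

    value is a float, or None when there is no usable value;
    absent is True when the cell marks the feature as absent.
    """
    text = cell.strip()
    if text == "":
        if blanks == "absence":
            return (None, True)
        if blanks == "unknown":
            return (None, False)
        text = "0"  # blanks == "zero" (assumption: then treated like a typed zero)
    number = float(text)
    if number != 0:
        return (number, False)
    if zeroes == "value":
        return (0.0, False)
    if zeroes == "absence":
        return (None, True)
    if zeroes == "value & absence":
        return (0.0, True)
    return (None, False)  # zeroes == "unknown"

print(interpret("3.5"))                          # (3.5, False)
print(interpret("", blanks="absence"))           # (None, True)
print(interpret("0", zeroes="value & absence"))  # (0.0, True)
```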