Copyright Sociological Research Online, 1998


Prandy, K. and W. Bottero (1998) 'The Use of Marriage Data to Measure the Social Order in Nineteenth-Century Britain'
Sociological Research Online, vol. 3, no. 1

To cite articles published in Sociological Research Online, please reference the above information and include paragraph numbers if necessary

Received: 24/11/97      Accepted: 19/1/98     Published: 30/3/98


This article describes the construction of a measure of the social order in the nineteenth century, which will subsequently be used as a basis for studying processes of social reproduction (or social mobility). The technique of correspondence analysis is used to map the ordering of groups of occupations in two time periods, 1777-1866 and 1867-1913. The data are derived from the occupations at marriage of the groom, his father and his father-in-law (the occupations of brides, unfortunately, being very much under-recorded). Marriage, it is argued, is a socially significant act linking, on average, families that occupy similar positions in the social order, and analyses of the patterns of social interaction involved provide a means of determining the nature of the social space within which similarity is defined. The three occupations provide three pair-wise comparisons and each comparison gives a mapping of the row occupations and of the column occupations, six in all. Since any one of these should provide a measure of the social order, assuming there to be any consistency in such a concept, we would expect that, at both time periods, the result of the analyses would be six closely-related estimates of the same underlying dimension. This is what is found; the inter-correlations are very high. Furthermore, there is a very strong relationship between the measures of the social order constructed for the two time periods. The analyses are presented within a framework that emphasises the value of the procedures used for understanding the nature of measurement in social science.

Correspondence Analysis; Marriage; Measurement; Social Mobility; Social Order; Social Reproduction; Social Space


The idea that patterns of marriage - who, in occupational terms, marries whom - can be used to provide information about the nature of the social order is not a new one. Specifically in the case of Britain in the nineteenth century, there have been several studies using this approach (Foster, 1974; Gray, 1976; Crossick, 1978; Penn, 1985). These authors, however, have all operated with a theoretical model of social classes. The purpose of their studies was to investigate the issue of class boundaries and their existence either within the manual working class or between it and non-manual groups. Given the statistical techniques generally available at the time those studies were carried out, even this limited aim proved difficult to deal with in a wholly satisfactory manner. Essentially, the problem is that although it was possible to assess the social distance between a particular occupational group and the extremes - unskilled labourers at one end and professionals or bourgeoisie at the other - it was much more difficult to take account of the social distances involving intermediate groups. In this paper we demonstrate that a technique, correspondence analysis, now exists that enables this kind of problem to be dealt with much more effectively. More importantly, we also show how the technique allows us to develop a far more appropriate conceptualisation of the nature of the social order.

The fact that previous attempts to study historical mobility have generally used class models of the social structure has entailed their making a great many prior assumptions about the occupational ordering. The method presented here starts, not with the assumption of a hierarchy of social groups that may interact to a greater or lesser extent, but from the opposite direction, from the patterns of social interaction themselves. Rather than measuring how the class structure gives rise to social groups, we ask: what is the social space - the pattern of social relationships - within which occupations are located? Using very simple pieces of information - namely the sets of relationships that can be derived from marriage records - a picture of the social space within which marriage occurs can be constructed. With no prior assumptions about the nature of this space, we can nonetheless produce a model of both the hierarchical ordering of occupational groups and of their relative social distance from each other.

The method uses occupations, taken from marriage records, and the outcome is a scale of occupations. The scale is derived from patterns of social interaction, starting from the assumption that close social relationships (such as friendship or marriage) occur in situations of social similarity. We might expect the families of two people entering into a marriage partnership to be roughly similar in their social position - that is to be close to each other in the social order. The idea, put crudely, is that like marries like. This does not, of course, assume a simple model of endogamy; marriage inevitably occurs across social divides, between individuals with a gulf in their social backgrounds. The assumption is simply that close social ties such as inter-marriage are more likely between people in those occupations that are similarly placed in the social order; through contiguity, through similar standards of lifestyle, through routine social interactions - at work, in the community and so on - which bring them and their families into contact with each other. By taking a large sample of these pairs of marriage relations, we can build up a model of the relative social distance of occupations, and thus of the social structure within which marriage, and other key social relationships, occur.


However, although this last aspect will be the sole substantive issue that the paper deals with, we believe that there are more general lessons that can be drawn and that the technique used has potentially a wider application. The point cannot be elaborated in the present context, but we believe that central to the concept of measurement is the idea of a set of 'objects' and an operation that defines a significant theoretical relationship between those 'objects'. A paradigm case in the natural world is a set of physical objects and the operation of tilting a balance. Although simple, this operation is central to the concept of weight or, in more developed theoretical terms, mass and its hypothesised causal effect. Using the operation for every pairing of the objects it is possible to determine a structure of relations between the objects. In this example it would be a simple uni-dimensional ordering of the objects by their weight. It is this one-dimensional structure of relationships between a set of objects that allows us to claim the existence of a quantity.

The idea can be extended in a slightly more complex, but still relatively straightforward, way to situations where there are two sets of objects and an operation that links them. For example, we might have a set of weights and a set of threads of varying thickness, together with the operation of 'suspending', by which is meant a thread holding the weight without breaking. The result of all pair-wise operations would be an ordering of the objects by weight and the threads by 'strength'. In this case, then, we have established the existence of two quantities, with a very clear theoretical and causal link between them.[1]

It is not possible in the social world to undertake a precisely similar investigation of a system that can be closed in terms of either objects or operations. Social 'objects' cannot be constructed by the investigator, but have to be taken as they are found. Similarly, operations cannot be imposed; they can only be observed when they occur.[2] However, almost anything that can be recorded in a two-way cross-tabulation can be seen as an analogous case of two sets of objects (the categories of the rows and the columns) and an operation - whatever it is that links the two. For our present purposes the categories are those of a woman (or her family) and a man (or his family) and the link is 'marries (into)'. Of course, we need to be more precise about the question of individual or family and, as we shall see, this actually involves us in several cross-tabulations. One of our main concerns will be whether the results of these are compatible with one another.

By their nature, cross-tabulations involve the use of aggregate groupings. Thus, in the present case, the set of simple 'marries/does not marry' dichotomies for any two individuals (or families) are converted into a set of frequencies, which has the effect of transforming the analysis into one involving structure rather than individual action. A further, important consequence is that the information available about pairs of categories can now be considered as information about relative distances between the categories. Broadly speaking, we could consider two row (or column) categories to be close, if their frequency distributions are very similar, or distant, if the frequency distributions are very different. By a slightly circuitous route we have arrived at the same position as with the paradigm case earlier. That is to say, we have information about a whole set of relationships and we need to be able to investigate whether that information can be adequately represented by a simple structure. In the paradigm case the structure was a uni-dimensional ordering, a quantity. A single dimension may underlie the structure of a particular cross-tabulation, though this is extremely unlikely outside a closed, determinate system. Rather more likely is a situation where there appears to be a single dimension, strongly indicating a quantity, but with a certain amount of random 'error' that is being represented in further dimensions. It may also be the case, partly for reasons that we consider later, that two or more dimensions are required to represent the structure of relationships adequately. The issue of establishing the existence of a quantity is more complex in such cases. The dimensions may give a clue, but of course they are necessarily orthogonal, whereas it is rather unlikely that the underlying quantities are totally unrelated. 
Effectively, one is looking for a line in the space that indicates a quantitative ordering, but without any clear criteria for its selection.[3] In such cases there has to be an appeal to evidence beyond the basic data.

One way of treating cross-tabulations of the kind that we are dealing with, particularly if they are very large, is that of Multi-Dimensional Scaling analysis (MDS) (Coxon, 1982), as used in the original construction of the Cambridge Scale (Stewart et al, 1980). This involves obtaining measures of social 'distance' between the members of the set of row categories by comparing the proportionate distributions across the columns for each pair of rows and deriving a summary statistic such as the index of dissimilarity. Thus, the more similar two row categories are in their distributions, the 'closer' they are in a spatial sense, and conversely the more dissimilar they are, the more distant. (The whole procedure can be carried out in an identical way, but using the set of column categories.) What MDS does, simply, is to take a set of such distances between points (in fact, it uses only their rank order, not the absolute values) and determine how well they can be fitted into a space of few dimensions. In the simplest case, all of the estimated 'distances' would be found to be consistent with an ordering of the points along a single dimension, a line or continuum.
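The distance-estimation stage that precedes MDS can be illustrated with a minimal sketch. It assumes the index of dissimilarity in its usual form (half the sum of absolute differences between two proportionate distributions); the function names and the small table of counts are invented for illustration and are not taken from the study's data.

```python
from itertools import combinations


def row_profiles(table):
    """Convert each row of counts into proportions across the columns."""
    return [[cell / sum(row) for cell in row] for row in table]


def dissimilarity(p, q):
    """Index of dissimilarity between two proportionate distributions:
    0 means identical distributions, 1 means no overlap at all."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical counts: rows are fathers' occupational groups,
# columns are fathers-in-law's occupational groups.
table = [
    [30, 10, 0],
    [15, 5, 0],
    [0, 10, 30],
]

profiles = row_profiles(table)

# One 'distance' for every pair of row categories; these would be the
# input to an MDS program, which uses only their rank order.
distances = {
    (i, j): dissimilarity(profiles[i], profiles[j])
    for i, j in combinations(range(len(profiles)), 2)
}
```

In this invented table the first two rows have identical proportions, so their distance is zero even though one group is twice the size of the other; the third row lies at a distance of 0.75 from both.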

MDS is an iterative, numerical, rather than an analytical, technique and one where the criteria for the adequacy of the results are not wholly clear. There is also the problem of whether a solution based on the rows or on the columns should be used or, if both are obtained, how the two might be combined. For a re-analysis of the marriage data published in the studies referred to earlier, Prandy (1993) advocated the use of an alternative method, which is analytical and has the advantage of dealing with rows and columns together, that of correspondence analysis (CA) (Greenacre, 1984). It appears, in fact, that the method had unwittingly been used in an earlier revision of the Cambridge Scale (Prandy, 1990), where the technique adopted was 'identical to the "reciprocal averaging" interpretation of correspondence analysis' (Lampard, 1993: p. 4; see also Lampard (1992) for a discussion of the superiority of CA over MDS). The use of MDS depends upon a prior stage of determination of distances; these are taken as given data, with MDS providing the solution to the problem of fitting the set of inter-point distances into a low-dimensional space. However, this preliminary process involves a loss of information. For each pair of categories, say of rows in a cross-tabulation, the two distributions across the columns are collapsed into a single measure of the 'distance' between the categories. At the same time, information about the row totals is lost (the column totals will have been taken account of in calculating proportions). Thus, the estimated distance between two relatively small groups will be given equal weight to that between two large groups, despite the greater statistical significance of the latter.

The nature of 'distance' in CA is rather harder to grasp than it is with MDS, partly because there is no separation into the two distinct phases of distance estimation and fitting. Rather than collapsing pairs of rows or columns into a single statistic, as MDS does, CA uses the idea that a table of i rows and j columns can be seen as representing either i points in a space of j dimensions or j points in a space of i dimensions, or indeed as both at the same time. The total 'size' of the space within which the points are located is the familiar chi-square (that is, the sum over all the cells of the square of the observed minus the expected frequency divided by the expected frequency) divided by n, the total number of cases in the table. Each of the i row points in the j-dimensional space can be considered with respect to its contribution to the total value of chi-square. In essence, this is the measure of 'distance', referred to in CA as inertia. It is a measure that takes account of both the distance of a particular point from the 'average' position of the set of points (the centroid) and its relative importance, as indicated by the column relative frequency. The distance between each pair of i row points is expressed by their (chi-square related) location in each of the dimensions corresponding to the j columns. The concept of inertia is also used to evaluate how much a particular row or column has contributed to a dimension, and how well each point is represented in a dimension.
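The arithmetic described in this paragraph can be sketched as follows. This is an illustration under stated assumptions rather than the SPSS procedure: total inertia is computed as chi-square divided by n, and each row's contribution is taken, as is standard in CA, to be its mass (its share of all cases) multiplied by its squared chi-square distance from the centroid. The function names and the 2 x 2 table are hypothetical, and every row and column is assumed to have a non-zero total.

```python
def total_inertia(table):
    """Chi-square of the table divided by n, the total number of cases."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    return chi_square / n


def row_inertias(table):
    """Contribution of each row point to the total inertia:
    mass times squared chi-square distance from the centroid."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    # The centroid is the average column distribution.
    centroid = [sum(col) / n for col in zip(*table)]
    contributions = []
    for row, total in zip(table, row_totals):
        mass = total / n
        distance_sq = sum(
            (cell / total - c) ** 2 / c for cell, c in zip(row, centroid)
        )
        contributions.append(mass * distance_sq)
    return contributions


# Hypothetical 2 x 2 table of counts.
table = [
    [20, 10],
    [10, 20],
]

inertia = total_inertia(table)
by_row = row_inertias(table)
```

The row contributions sum to the total inertia, which is what allows a point's contribution to the whole, or to a single dimension, to be assessed.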

Technically, one description of CA (Lampard, 1993: p. 4) is that it 'is in effect the canonical correlation analysis of a contingency table, and it can be shown also to be equivalent to the Singular Value Decomposition of a transformed version of the table (see Weller and Romney, 1990, p. 58)'. As the reference to canonical correlation may suggest, the major purpose is not simply to locate the points in a space, but to determine the major axes, or dimensions, of that space. That is to say, interest centres on the question of how much of the total inertia can be explained in a limited number of dimensions. Preferably, just one or two dimensions will suffice to account for a high proportion of the variance, because it is most likely that these are the ones that can be given a meaningful interpretation.

A major advantage of CA is that it is available in the SPSS statistical package, where it is presented in terms of optimal scaling, that is as a method that 'assigns scores to the categories of the row and column variables in such a way as to account for as much of the association between the two variables as possible' (SPSS Inc., 1990: A-4). This is perhaps the most straightforward way of understanding the end, though not the statistical means, of CA, while the SPSS package, used in the analysis reported here, is the simplest way to carry it out. Although there is no direct reference to distance in this form of presentation, the emphasis that we have placed on the spatial analogy in scaling procedures should make the link clear.[4] Fortunately, a good deal of order has been introduced into what was threatening to become a rather confused area by Goodman (1985, 1987), who has shown how CA is related to a broader class of statistical models, including latent-class analysis and the log-linear-based association model. While it is fair to point out that, broadly speaking, Goodman's own preference is for the last of these models (in part because it is more satisfactory from an inferential point of view), given the very positive nature of our results it seems highly unlikely that use of any of the variants would have led to any significant differences. In this respect, we are particularly encouraged by the value of the Cambridge Scale which, whatever the limitations of the technique used to create it, has proved remarkably successful in analysis (Prandy, 1996, 1998).
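The optimal scaling idea can be made concrete through the 'reciprocal averaging' interpretation of CA mentioned earlier. The sketch below is a simplified illustration, not the SPSS implementation: each row score is repeatedly set to the average of the column scores weighted by that row's counts, each column score to the weighted average of the row scores, and the column scores are then centred and standardised so that the trivial constant solution is removed. The function name, the starting values, the fixed number of iterations and the table are all assumptions made for the example, and the table is assumed to show some association (otherwise the standardisation step would divide by zero).

```python
def reciprocal_averaging(table, iterations=200):
    """First-dimension row and column scores by reciprocal averaging."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    columns = list(zip(*table))
    col_totals = [sum(col) for col in columns]

    def rows_from_columns(col_scores):
        # Each row score is the average column score of its cases.
        return [
            sum(x * s for x, s in zip(row, col_scores)) / total
            for row, total in zip(table, row_totals)
        ]

    # Arbitrary, non-constant starting scores for the columns.
    col_scores = [float(j) for j in range(len(col_totals))]
    for _ in range(iterations):
        row_scores = rows_from_columns(col_scores)
        # Each column score is the average row score of its cases.
        col_scores = [
            sum(x * s for x, s in zip(col, row_scores)) / total
            for col, total in zip(columns, col_totals)
        ]
        # Centre and standardise, weighting by the column totals.
        mean = sum(t * s for t, s in zip(col_totals, col_scores)) / n
        col_scores = [s - mean for s in col_scores]
        spread = (sum(t * s * s for t, s in zip(col_totals, col_scores)) / n) ** 0.5
        col_scores = [s / spread for s in col_scores]
    return rows_from_columns(col_scores), col_scores


# Hypothetical table in which adjacent groups intermarry most often.
table = [
    [30, 10, 0],
    [10, 30, 10],
    [0, 10, 30],
]

row_scores, col_scores = reciprocal_averaging(table)
```

For this invented table both sets of scores place the three groups in the same order, with the middle group between the other two. The direction of the scale is arbitrary: reversing the sign of every score gives an equally valid solution.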

We referred earlier to the question of the nature of the 'objects' that enter into our analyses - that is to say, the categories used in the cross-tabulation. In effect, the process of classification or categorisation is one of constructing 'objects'. In some cases these objects may have a reasonably strong relationship to actual social categories - type of school attended, or party voted for, for example - but in other cases they are to a much greater extent the researcher's own constructions. Of course, such categories would normally be constructed in a way that the researcher believes to be sociologically significant, providing groupings that are important even though they may not have actual social counterparts. This is the case with the social class categories that have been used in the previous studies of marriage that we have referred to.

However, any particular categorisation is clearly open to theoretical challenge. If a set of 'objects' does not display the results expected of it, then the problem may lie with the categories rather than with the theoretical expectations. There is always the possibility that a different set of categories would display the results, or would display them far more coherently. Referring back to our paradigm case, it should be clear that as far as possible objects should be homogeneous in nature and that their homogeneity should be in terms of the quality that is believed to form the basis for their potential quantitative ordering. So, a set of physical objects to be used in the weight example, with the 'tilting a balance' operation, would ideally be composed of the same material and have the same shape (apart from necessary differences in size).

As we have said, this degree of homogeneity is not attainable in social analysis, but that is not to say that we should not seek to achieve as much of it as possible. In the case of the 'marries' operation, as with many other situations in which class categories are used, it is arguable whether this has actually been achieved and it is quite likely that different and/or more refined categories would provide a clearer, more coherent picture. In fact there is generally little to be lost by using more categories, if at all possible, because if any two or more of them were indistinguishable (with respect to the operation), this would emerge in the analysis. So, if one starts with more basic occupational groups it should become apparent if these tend to cluster together, leaving distinct gaps between the clusters - that is to say, if a class model is in fact more appropriate.

Marriage and the Social Order

If it proves to be possible to identify a space within which the set of relationships between occupations determined by the 'marries' operation can be economically located, what can we say about its nature? The answer to this lies in the theoretical meaning of the operation. Just as 'tilting a balance' or 'suspending' are key theoretical operations for defining the quantity of weight (and strength), so marriage, if the procedure is successful, has to be seen as a key theoretical operation that will define a social quantity. Since a quantity is simply a particular structuring of objects in a space - a one-dimensional ordering - we need to generalise the idea of 'quantity' to one of 'space'.[5] The nature of this, what in the present case we may refer to as the social space, as established by empirical investigation, will determine the theoretical interpretation. A set of distinct groups, each in its own dimension - a situation that would arise if there was no intermarriage between groups - would clearly suggest an extreme caste-like structure. An ordering along a major dimension would suggest a social hierarchy.[6] The evidence to be presented later overwhelmingly suggests the latter situation, that is, that there is a quantity defined by marriage patterns.

In deciding what this quantity is, we have to consider what it is that determines that two families will be more or less likely to be connected by marriage. In doing this, we must beware of attempting to reduce it to something else. It may, of course, be the case that it is related to some other quantity that has previously been established. Social science, though, is not particularly well endowed with satisfactory measures and it is unlikely that anything suitable would be found. In the present case, for example, it could be argued that the quantity is 'status' and that this can be established, or tested, by considering the relationship of the newly-found quantity to reputational prestige or socioeconomic status measures. However, given the inadequacy of the latter, on both methodological and theoretical grounds (Stewart and Blackburn, 1975), this amounts to an attempt to validate a good measure by reference to poor ones. At the same time, it means incorporating a range of theoretical baggage, in particular concerning the role and nature of valuations in social life, that is highly unsatisfactory.

The only way that the quantity can be understood, in our view, is as a measure of what in shorthand form we refer to as the 'social order'. Like the Cambridge Scale[7], on which this work is modelled and which also uses data on marriage as well as social interaction in the form of friendship, what emerges is a measure of general, hierarchical 'material and social advantage' (Stewart et al, 1980: p. 28). We demonstrate elsewhere (Prandy, 1998), that the Scale provides a theoretically and empirically more adequate account of social mobility than does a class schema - further evidence, in our view, that the distinction between class and status serves no useful purpose (see Stewart et al, 1980; Prandy and Bottero, 1995; cf also Bourdieu, 1985). Since the time of the Scale's original construction there has been increasing evidence that whichever of a range of social relationships is chosen, the underlying structure that is revealed is very much the same. Examples are the work of Hout (1982), which uses the occupations of husbands and wives in two-earner households, and that of Mitchell and Critchley (1985), which uses the occupations of male respondents, their fathers, fathers-in-law and spare-time associates. The fact that relations of social mobility and of social interaction seem to be aspects of the same structure is particularly significant. As we shall see, it is also the case in our analyses that essentially the same structure emerges, whichever of several possible combinations of individuals involved in the marriage relationship is considered - evidence that serves to strengthen our interpretation.

Occupation and Marriage Data


As we have said, the construction of the objects of analysis through the creation of categories is inevitably a matter of judgement, tempered by experience in the use of the categories chosen. Like virtually all other researchers in the area of social stratification in industrial societies, we have chosen to use occupation as the basic starting point. Occupational groups are not necessarily ideal, but occupation is, in Bourdieu's (1987: p. 4) words, 'generally a good and economical indicator of position in social space'. 'Occupation' is, in any case, a far from unproblematic concept, as any study of the varying classification schemes, either over time or across nations, reveals. There are the further problems that, however good the categorisation, the data available must be detailed enough to make it possible to allocate individual cases with a reasonable degree of certainty, and that there have to be sufficient cases in a category to make any conclusions drawn about it fairly reliable.

Our data on occupations at marriage are derived from 3,200 family historians who responded to our request for information on their ancestors. This request was made initially through the journals of county family history and genealogical societies in England and Wales and later through similar bodies in Scotland and Ireland. Our assumption was that members of such societies would be likely to be more thorough in their approach than the average amateur family historian, as well as being easier to contact. Given the circumstances, it was not possible to attempt to sample respondents and the final number of participants happened to be of the order that we had anticipated. There is a slight preponderance of female respondents, but since it was information on their ancestors that we were collecting, this probably makes no difference. Although the quality of information provided varied, overall we were very impressed by the responses. Each respondent was sent a set of forms covering 31 couples, following ancestors back from themselves through their mother and father, and so on along both the female and male direct lines for five generations. For each couple we asked for whatever information the respondent had been able to collect on the occupations of the husband and wife, in most cases together with the date and the place of residence at the time. Like the respondents, we were obviously constrained by the nature of the information recorded at the time, much more of which relates to men than to women. Although we certainly intend to extend our analyses to cover women, the results presented here, as we explain later, use only the occupations of men. Many family historians spend considerable time, effort and money ferreting out information on their ancestors, including their occupations, from sources such as birth, marriage and death registers, censuses (those from 1851 to 1891 now being available in Britain), trade directories and employers' records. 
In many cases, therefore, we have quite good pictures at least of male ancestors' work histories. It would be financially prohibitive to attempt to match this by employing research assistants and it is unlikely that the quality of work would be any higher.

The work history data will obviously be valuable for later analyses, but it has also been useful in supplementing the information recorded at the time of marriage. We have used this information, for example on the number of acres farmed or on whether the individual employs others, in order to make more precise allocations than would otherwise be possible. In some cases also, where the individual's occupation at marriage is not known, we have been able to 'estimate' it by taking the nearest occupation in time (within a maximum time span of ten years either side, though the average was actually around four).

Our objective in the categorisation was to group individuals who were believed to be socially similar, which meant interpreting the concept of 'occupation' in a particular way. For example, we tried to distinguish masters, those employing others, from employed journeymen. So, we have separate categories of those calling themselves 'gentleman' or 'independent', of those calling themselves 'manufacturer' (whether alone or with a qualifier) and of those known to be employers, whatever their trade. For occupations in mining and textile manufacture we have probably been quite successful in applying the distinction, because there is a sharp differentiation between owners/employers and employees. Elsewhere, however, where the separation of masters and journeymen may in any case be less significant, it is quite possible that our categories, based purely on occupation, will have a mixture of both.

In deciding on the grouping of occupations we had to bear in mind the need to have groups of sufficient size - twenty was taken as the usual minimum. Even with the relatively large numbers in our study this has in some cases meant us using groups that are probably rather more heterogeneous than we would wish. There was also, in this respect, a trade-off between the numbers available and the fineness of the time periods into which our data could be broken down. Treating the whole of our period - broadly speaking, an extended nineteenth century - as a single unit would give greater numbers, but would also miss any major changes occurring in the nature of particular occupations. On the other hand, a set of much more restricted time periods would be more precise from that point of view, but would provide groupings that were either far too small or too diffuse. The compromise was to split the period into two parts with roughly equal numbers: marriages taking place between 1777 and 1866 (the median year being 1849) and those taking place between 1867 and 1913 (median 1885). The analyses for the first period were based on 9,700 marriages (5,208 father to father-in-law, 7,760 father to son and 6,896 father-in-law to son pairs), divided into a total of 83 groups, and for the second period 8,664 marriages (5,046 father to father-in-law, 6,787 father to son and 6,668 father-in-law to son pairs) and 91 groups.

In both periods there were sufficient numbers for us to be able to add one further refinement to occupation alone and to create separate groups for farmers, based on the acreage farmed. The divisions were: large (over 150 acres), medium-large (76 - 150 acres), small-medium (26 - 75 acres) and small (under 26 acres). This still left a very large group of farmers about whom we had no more information. As we shall see, this refinement provides a good opportunity to apply an internal test to the final results.
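The acreage divisions amount to a simple allocation rule, sketched below. The function name and the label used for farmers of unknown acreage are hypothetical. The published bands are stated in whole acres, so the treatment of fractional values (for example, 75.5 acres falling in the small-medium group) is an assumption of this sketch.

```python
def farm_size_group(acres):
    """Allocate a farmer to a size group from the acreage farmed.

    acres may be None when the acreage is not known.
    """
    if acres is None:
        return "farmer, acreage unknown"
    if acres > 150:
        return "large"
    if acres >= 76:
        return "medium-large"
    if acres >= 26:
        return "small-medium"
    return "small"


# Hypothetical acreages, one falling in each group.
examples = {acres: farm_size_group(acres) for acres in (300, 150, 76, 40, 25)}
```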


Marriage brings together two families. The fact that there are the two parties to the marriage means that, although our concern is with a single social order, we are potentially considering two structures, that in which the wife's family is located and another in which the family of the husband is placed. Of course, given that a family's appearance at a particular marriage on either the bride's or the groom's side is more or less random, we would expect these two structures to be simply subsets of a single structure. If the results of our analysis were to suggest otherwise that would throw considerable doubt on the existence of a social order. In fact the situation is rather more complicated, because at the individual level there is more than one representative of each of these families. Specifically, if one is looking at occupation and at what is practically available as data, there are four persons involved. What is recorded at the point of marriage is (or should be) the occupations of the groom, the bride, the groom's father and the bride's father.

In practice, unfortunately, throughout much of the nineteenth century the bride's occupation was not recorded at marriage. Of course, one cannot infer from this that most brides did not have an occupation - indeed it is likely that there was considerable under-reporting of women's occupations. However, this means that one of the parties cannot be properly taken into account. This still leaves the three men and their inter-relationships. While the woman's side is represented by just one person, her father, the groom's side is represented by himself and by his father. There are arguments for treating either of these as the more significant; in particular, there is an argument for taking the two fathers, because they will be more similar in terms of age and career position, and an argument for taking the groom and his father-in-law because their relationship is the more direct. One obvious solution is to consider both relationships; however, this has the consequence that one is then introducing a third, potentially different, structure - that in which the groom is located.

Again, though, it has to be the case that for the argument about a social order to be convincing, there should be a reasonably strong relationship between the social position of fathers and sons. If this were not the case, then it would be inappropriate to see marriage as a linking of families, rather than simply of individuals. It then follows, though, that one should include also the relationship between fathers and sons. This introduces a different operation, in our previous terms - 'transmits to', rather than 'marries' - but if this operation leads to much the same structure, the same 'social order' quantity, then the theoretical argument about the nature of this quantity as being one of hierarchical material and social advantage is considerably strengthened.

The three relationships, then - between (groom's) father and son (groom), son and (groom's) father-in-law, and father and father-in-law - can all be utilised and should each reveal essentially the same structure. Because each relationship involves two parties, we are actually obtaining six estimates of what is being hypothesised as the same structure. They are the structure of relationships between the occupational groups of: (1) fathers as revealed by their relationship to the occupations of (a) fathers-in-law and (b) sons; (2) fathers-in-law as revealed by their relationship to the occupations of (a) fathers and (b) sons; (3) sons as revealed by their relationship to the occupations of (a) fathers-in-law and (b) fathers.


The starting point for each analysis is a cross-tabulation of occupations - fathers by sons, fathers by fathers-in-law and sons by fathers-in-law. In the optimal scaling terms that we referred to earlier, a correspondence analysis of each cross-tabulation should give an ordering and scoring of the column categories and an ordering and scoring of the row categories such that the association, or correlation, between the two is maximised. Each of these orderings and scorings constitutes a 'solution', or in other words a particular spatial, quantitative account of the structure that underlies the association in the cross-tabulation. Three tables and two solutions for each table give a total of six solutions (for each of the two periods). Our theoretical position requires that each of the six solutions is essentially an estimate of the same structure of social relations of inequality, so if we are to place any great faith in any of the solutions - the structures - that emerge from the correspondence analyses, there has to be a high degree of consistency between them. There are three requirements that should be satisfied in order for us to be able to accept that there is consistency. In ascending order of demands being made on the results they are as follows. First, for each of the solutions derived from a particular cross-tabulation there has to be a strong relationship between the structure of the row and column points. If this demand is satisfied we can then consider the second, which is that the solutions from each of the three analyses should be strongly related to one another. If the exercise is repeated for a different time period and the above demands are satisfied for both of them, the third requirement is that there should be a strong relationship between the structures at the two points in time. Technological and social change might result in the movement up or down of particular occupations, but overall one would not expect to find any dramatic change in the nature of the social order.
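To make the idea of a row and a column 'solution' concrete, the following is a minimal sketch of simple correspondence analysis computed from the singular value decomposition of the standardised residuals. It is an illustration only, not the software used for the analyses reported here; the function name `correspondence_analysis` and the small 4x4 table are hypothetical.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple correspondence analysis of a two-way contingency table.

    Returns row scores, column scores and the singular values (the
    correlations between row and column scores on each dimension),
    ordered from the strongest dimension to the weakest.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardised residuals from the independence model
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Standard coordinates: one column of scores per dimension
    row_scores = U / np.sqrt(r)[:, None]
    col_scores = Vt.T / np.sqrt(c)[:, None]
    return row_scores, col_scores, sv

# Hypothetical table: fathers' (rows) by sons' (columns) occupational groups
table = np.array([
    [40, 15,  4,  1],
    [12, 50, 20,  6],
    [ 3, 18, 60, 25],
    [ 1,  5, 22, 70],
])
rows, cols, sv = correspondence_analysis(table)
print("first-dimension row scores:   ", np.round(rows[:, 0], 2))
print("first-dimension column scores:", np.round(cols[:, 0], 2))
print("correlation on first dimension:", round(float(sv[0]), 3))
```

The sign of each dimension is arbitrary, so a solution and its mirror image are equivalent. The first singular value is the correlation between the row and column scores that the first dimension maximises.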

Two additional considerations need to be introduced at this point. It is conceivable that a set of results could meet all of the requirements set out above and yet the structure that emerged might not appear to bear much relationship to a conventional conception of the nature of the social order. Undoubtedly, this would be an interesting result, and the nature of the structure that was revealed would merit careful investigation. However, we would hope to be spared this and that the results of our analyses would at least have a high degree of face validity. Any further demonstration of validity would, of course, require more extended analyses investigating the theoretical coherence of the new measure. The second, closely related, consideration is that it should not be necessary to cast around, as it were, in the social space revealed by the correspondence analyses in order to discover the appropriate dimension. If the structure that we are interested in is truly of social significance, then it should be the, or at the very least a, major determinant of patterns of marriage. That is to say, it should emerge as the first dimension, or at least as a component of the first two dimensions.

This last is actually quite a severe demand. In particular, spatial location inevitably constrains patterns of social interaction, and this would have been especially true in the days before mass transport. Those in rural occupations would have limited opportunity to meet, far less marry, many of those in occupations associated with large-scale urban centres. Similarly, particular industries would be sited in distinct localities and regions, making social interaction between those employed in them relatively unlikely. In technical scaling terms, the situation that we are dealing with is not one of a total ordering of occupational groups, but of the aggregation of a set of partial orders. In other words, although two occupations may, in terms of inter-marriage, appear to be socially, because spatially, distant, they may nevertheless be placed close to each other in a particular dimension because of the similar relationships they have to the set of other occupational groups. There can be many of these spatial separations and they emerge as major components of higher-order dimensions. Some may be so marked that they could affect even first or second dimensions - as we shall see, fishermen and coalminers caused us particular problems in this regard. However, as we have said, we would hope that this does not seriously interfere with the identification of the social ordering dimension.

Period 1: 1777-1866

The degree to which we have been successful in obtaining a series of estimates of essentially the same structure can be seen by considering the correlations between them. So as not to capitalise on the effect from possible outliers, we have calculated the Spearman rank-order correlations, as well as the standard product-moment coefficients. Table 1 shows the correlation matrix, with the product-moment coefficients below the diagonal and the Spearman rank-order coefficients above. It should be borne in mind that in both cases the correlations are between the values for the groups, where each group counts once. In other words, there is no allowance for the fact that some groups are much larger than others. Our assumption would be that our estimates will be less reliable for the smaller-sized groups, because they are more prone to sampling error. Consequently, they are more likely to vary across the different solutions and to lead to weaker correlations. If this is correct, then the correlations at the individual level should be stronger. Alternatively, of course, it may happen that it is the larger groupings that show the greater variability in scores. In any event, it is important not to confuse the correlation given below between, for example, the occupational group scores for fathers as compared with those for sons, with the correlation between the occupations of all fathers in our sample and those of their sons. The latter may be either stronger or weaker.
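The layout of such a matrix can be sketched as follows, with product-moment coefficients below the diagonal and Spearman rank-order coefficients above it, each occupational group counting once. This is a minimal illustration; the function names and the small score matrix are hypothetical and the values are not taken from the tables in this article.

```python
import numpy as np

def average_ranks(x):
    """Ranks from 1 to n, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks

def combined_correlation_matrix(estimates):
    """Product-moment below the diagonal, Spearman rank-order above it.

    `estimates` has one row per occupational group and one column per
    solution; groups are unweighted, whatever their size.
    """
    X = np.asarray(estimates, dtype=float)
    pearson = np.corrcoef(X, rowvar=False)
    ranked = np.column_stack([average_ranks(col) for col in X.T])
    spearman = np.corrcoef(ranked, rowvar=False)
    return np.tril(pearson) + np.triu(spearman, k=1)

# Hypothetical scores for five groups on three solutions; the last group
# is an outlier on the second solution.
scores = np.array([
    [1.0,  1.0, 2.0],
    [2.0,  2.0, 1.0],
    [3.0,  3.0, 4.0],
    [4.0,  4.0, 3.0],
    [5.0, 50.0, 5.0],
])
m = combined_correlation_matrix(scores)
print(np.round(m, 2))
```

In this example the outlier lowers the product-moment coefficient between the first two solutions, while the rank-order coefficient is unaffected because the ordering of the groups is identical.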

As far as the first requirement is concerned, there can be little doubt that all three of the analyses produced solutions where the ordering and scoring of the row categories was very strongly related to the ordering and scoring of the column categories (the relevant coefficients are displayed in bold in the table). In the case of sons and fathers-in-law, the product-moment coefficient was 0.91 and the rank-order coefficient 0.88. For fathers and fathers-in-law the product-moment correlation is higher, 0.94, and the rank-order correlation lower, 0.85. The solution from the cross-tabulation of fathers and sons is a little more complicated, because it was clear in this case that the nature of the first dimension, for both row and column solutions, was slightly different from that emerging from the other two cross-tabulations. This was indicated by correlations that were quite reasonable, but not impressively high. At the same time it was clear that the second dimension was also moderately well related to the other solutions, suggesting that a small rotation of the plane of the first two dimensions would give a one-dimensional solution that was more consistent with the overall pattern. The appropriate values for the rotation were obtained by performing a canonical correlation analysis between the first-dimensional solutions for the fathers/fathers-in-law and sons/fathers-in-law analyses and the values on the first and second dimensions of, respectively, the row and column solutions for the father/son analysis. The product-moment correlation between these two new derived measures was 0.98, the rank-order correlation 0.95.
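The logic of obtaining a rotation from a canonical correlation analysis can be sketched as below. This is a hedged illustration under stated assumptions, not a reconstruction of the actual computation: groups are unweighted, the data are simulated, and the names `first_canonical_pair`, `dims`, `reference` and `target` are hypothetical. The canonical weights for the two dimensions define the direction in the plane, and hence the new one-dimensional solution.

```python
import numpy as np

def first_canonical_pair(X, Y):
    """First pair of canonical weights and the canonical correlation.

    X and Y have one row per occupational group. Groups are unweighted
    here, which is an assumption of this sketch.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy)
    a = np.linalg.solve(Rx, U[:, 0])     # weights for the columns of X
    b = np.linalg.solve(Ry, Vt[0])       # weights for the columns of Y
    return a, b, s[0]

rng = np.random.default_rng(0)
n_groups = 30
# Simulated first and second dimensions of one row (or column) solution
dims = rng.normal(size=(n_groups, 2))
# Simulated structure lying slightly off the first axis of that plane
target = 0.9 * dims[:, 0] + 0.4 * dims[:, 1]
# Two simulated reference solutions that both contain that structure
reference = np.column_stack([target, target + 0.5 * rng.normal(size=n_groups)])

a, b, rho = first_canonical_pair(reference, dims)
if b[0] < 0:                             # the sign of a canonical pair is arbitrary
    a, b = -a, -b
rotated = dims @ b                       # new one-dimensional solution
angle = np.degrees(np.arctan2(b[1], b[0]))
print("canonical correlation:", round(float(rho), 3))
print("rotation within the plane (degrees):", round(float(angle), 1))
```

Because the simulated structure lies exactly in the plane of the two dimensions, the canonical correlation here is effectively 1; with real data it would be lower.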

There can be no doubt, then, that for all three analyses the row and column solutions were giving essentially the same result. The comparison across solutions was equally persuasive. All of the product-moment coefficients fall within the range 0.63 to 0.98; the Spearman coefficients within the range 0.75 to 0.95. A principal components analysis confirms the picture and allows a degree of quantitative evaluation: the first factor accounts for 84 percent of the total variance. The factor loadings are all very high, ranging from 0.85 to 0.96.
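The two quantities reported from the principal components analysis, the share of total variance taken by the first component and the loadings of the six estimates on it, can be obtained from the eigendecomposition of the correlation matrix. The sketch below uses simulated data and a hypothetical function name; it illustrates the calculation only and does not reproduce the figures given above.

```python
import numpy as np

def first_principal_component(estimates):
    """Variance share and loadings of the first principal component.

    `estimates` has one row per occupational group and one column per
    solution; the analysis is of the correlation matrix.
    """
    R = np.corrcoef(np.asarray(estimates, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)          # ascending eigenvalues
    top = eigvals[-1]
    loadings = eigvecs[:, -1] * np.sqrt(top)      # correlations with the component
    if loadings.sum() < 0:                        # fix the arbitrary sign
        loadings = -loadings
    return top / eigvals.sum(), loadings

# Simulated data: six noisy estimates of one underlying ordering of 12 groups
rng = np.random.default_rng(1)
base = np.linspace(-2, 2, 12)
estimates = base[:, None] + 0.3 * rng.normal(size=(12, 6))

share, loadings = first_principal_component(estimates)
print("share of total variance:", round(float(share), 2))
print("loadings:", np.round(loadings, 2))
```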

Table 1: Correlations between six marriage-based estimates of the structure of the social order 1777-1866

                          (1)    (2)    (3)    (4)    (5)    (6)
(1) (vs son)              1.00   0.95   0.83   0.75   0.83   0.78
(2) (vs father)           0.98   1.00   0.85   0.79   0.87   0.83
(3) (vs father-in-law)    0.73   0.74   1.00   0.85   0.89   0.86
(4) (vs father)           0.71   0.75   0.94   1.00   0.91   0.80
(5) (vs son)
(6) (vs father-in-law)

The six solutions from the three analyses can be regarded with a very high degree of certainty as different estimates of essentially the same structure. The best estimate of that structure would be given by a combination of all six. There is a potential problem in cases where the numbers falling into a particular occupational category were relatively small, because the likelihood of the solution being unduly affected by one or two atypical and unrepresentative cases was obviously higher in such instances. In calculating the final scores, therefore, each value was weighted by the number of cases used in its derivation. Although, given the similarity of the factor coefficients, there was hardly any need to do so, we also incorporated these in the weightings.
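A weighted combination of this kind might be sketched as follows. The sketch assumes that the estimates have already been put on a common scale, and that the factor coefficients enter as simple multipliers of the case counts; both are assumptions of the illustration. The function name and the numbers are hypothetical.

```python
import numpy as np

def combine_estimates(scores, counts, coefficients=None):
    """Weighted mean of several estimates for each occupational group.

    scores:       groups x solutions array of estimated positions
    counts:       groups x solutions array of cases behind each estimate
    coefficients: optional per-solution multipliers (e.g. factor coefficients)
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(counts, dtype=float)
    if coefficients is not None:
        weights = weights * np.asarray(coefficients, dtype=float)
    return (scores * weights).sum(axis=1) / weights.sum(axis=1)

# Two hypothetical groups, each with two estimates based on different numbers of cases
scores = [[1.0, 3.0], [-1.0, -2.0]]
counts = [[30, 10], [5, 15]]
combined = combine_estimates(scores, counts)
print(combined)
```

An estimate based on few cases contributes correspondingly little to the final score. When the per-solution coefficients are nearly equal, as they are described as being here, including them changes the result very little.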

There were two aspects of the resulting scale which caused us some concern. One was the location of three groups - fishermen, colliers and coalminers - in a position below that of labourers and, moreover, some distance below them. We had separated out those using the occupational title of 'collier' from those using 'coalminer' in the hope that it might distinguish the more skilled face-workers, although in practice the titles seem not to have worked in that way. More seriously, though, both fishing and coalmining are industries that are geographically isolated and it seemed highly likely that what we were getting was an overlay of geographical upon social distance. Fishermen and coalminers, that is, were probably genuinely low in social order terms, but were being pushed even lower to accommodate the fact that they were (spatially) distant even from the lowest of the other groups. In order to check this, we looked at the occupations of those with whom they were associated in marriage, excluding all cases where the occupations of both parties were the same, and using the scores obtained so far to calculate a mean for each of the three problem groups. This suggested that fishermen and, to a lesser extent, colliers/coalminers did in fact rank somewhat higher in the social scale than labourers. Since the ordering of the other occupational groups was only very slightly affected by this procedure, we were able to move fishermen, colliers and coalminers to what seemed to be a more appropriate rank-order position. In doing so, we simply gave them a score intermediate between that of the occupation above and that below them.
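The check described above can be illustrated with a short sketch: for a given group, take every marriage link to a different occupation and average the current scores of those partner occupations. The occupations, scores and pairs are hypothetical, and taking the midpoint as the 'intermediate' score is an assumption of the sketch.

```python
def mean_partner_score(pairs, scores, group):
    """Mean score of the occupations linked by marriage to `group`.

    Pairs in which both parties have the same occupation are excluded.
    """
    partner_scores = []
    for a, b in pairs:
        if a == b:
            continue
        if a == group:
            partner_scores.append(scores[b])
        elif b == group:
            partner_scores.append(scores[a])
    return sum(partner_scores) / len(partner_scores)

def intermediate_score(score_below, score_above):
    """Score placed between two neighbours (the midpoint is an assumption)."""
    return (score_below + score_above) / 2

# Hypothetical scores obtained so far, and hypothetical linked occupations
scores = {"fisherman": 2.0, "labourer": 10.0, "weaver": 20.0, "carpenter": 30.0}
pairs = [
    ("fisherman", "fisherman"),   # same occupation: excluded
    ("fisherman", "weaver"),
    ("labourer", "fisherman"),
    ("carpenter", "fisherman"),
    ("labourer", "weaver"),       # does not involve fishermen
]
fisherman_mean = mean_partner_score(pairs, scores, "fisherman")
print(fisherman_mean)
print(intermediate_score(scores["labourer"], scores["weaver"]))
```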

The other source of disquiet lay at the other end of the scale, where the top few groups tended to be disproportionately spaced out. In part, this may reflect a social reality, a higher degree of social exclusivity at the upper levels of society, but it is almost certainly to some extent an artefact.[8] We decided, therefore, without changing the rank order of occupations at all, to scale down these extreme values.

Period 2: 1867-1913

Table 2: Correlations between six marriage-based estimates of the structure of the social order 1867-1913

                          (1)    (2)    (3)    (4)    (5)    (6)
(1) (vs son)              1.00   0.91   0.86   0.82   0.85   0.83
(2) (vs father)           0.95   1.00   0.77   0.79   0.82   0.89
(3) (vs father-in-law)    0.89   0.85   1.00   0.76   0.85   0.76
(4) (vs father)           0.80   0.79   0.86   1.00   0.85   0.81
(5) (vs son)
(6) (vs father-in-law)

The cases that were difficult in the first period were so again in the second. In fact, the isolation of fishermen was such that in several solutions the first dimension was simply fishermen at one extreme as against the remainder at the other. In these cases the second dimension was fairly clearly the social order quantity, except that we could have little confidence in the location of fishermen on it. Given this and the possibility that the special situation of the fishermen might even be distorting the location of some of the other groups, we decided to omit them from the initial analyses. Similarly, colliers and coalminers were again located some way below labourers, and the highest groups were stretched out at the top. We adopted the same procedures as in the first period to deal with all of these cases.

Overall, the results for the second period are very similar to those for the first (Table 2), in fact in certain respects rather better. The correlations between pairs of solutions are again high: father/son 0.95 (product-moment) and 0.91 (Spearman), father/father-in-law 0.86 and 0.76, and father-in-law/son 0.88 for both. There was a similar problem with the father/father-in-law solution to the one experienced with the father/son solution in the earlier period and it was dealt with in the same way.

As can be seen from the table, the product-moment correlations range from 0.77 to 0.95 and the Spearman coefficients from 0.76 to 0.91. Once again, a principal components analysis reveals that one factor accounts for the greater part (89%) of the total variance of the six measures, confirming as before that what we obtained were essentially six different estimates of the same structure. They could therefore be combined to construct a single measure, with a similar weighting procedure to that used previously.

The Relationship between Periods 1 and 2

The final requirement that we set out as necessary for an evaluation of these results was that there should be a reasonably strong relationship between the measures constructed for the two periods. One might not expect an extremely high correlation, because of possible changes either in the nature and composition of occupational groups - more strictly groups with the same label - or a rise or fall in their social position. On the other hand, as we said earlier, there should be a substantial amount of continuity in the social order. In the event, looking at the two measures it is clear that they differ only in fairly minor ways - a fact that is best illustrated by the very high correlation between them. The value of this - 0.92 for both the product-moment and rank-order coefficients - would be regarded as high even for two estimates of the same structure derived from independent samples. Confidence in the reality of what these analyses have uncovered is therefore increased even further.


Apart from the couple of problem cases we have not so far said anything about the substantive nature of the social order quantity that has been uncovered by these analyses. We referred earlier to the question of face validity and in that respect our results are very satisfactory. At one end of the scale are located the main professional groups of doctors, clergy, lawyers and army officers, closely followed by manufacturers, those of independent means and those in high administrative positions. At the other end are labourers and other unskilled workers. The intermediate groupings are consistent with conventional class-type distinctions between manual and non-manual workers or between skilled artisan and other manual workers. One particularly persuasive piece of evidence for the ability of the method to detect relatively fine, but socially meaningful, distinctions is the fact that the four farming groups are properly ordered in terms of size from large farmers through the two medium groups to the small farmers. The undifferentiated farmers, those for whom there was no information on acreage farmed, are located in the middle of this range.

As we argued earlier, the idea of assessing this scale in terms of criterion-related validity is misguided - there simply are no alternative measures of the same 'quantity' (certainly none that possess anything like the same degree of technical adequacy) against which it can be assessed. Since the technique establishes a quantity, that quantity is what it is.[9] Of course, there is then the empirical task of exploring its relationship to other phenomena and the theoretical one of setting these results into a coherent explanatory framework. It is only in this way that the quantity can be properly assessed.

For the present, we would claim that this example clearly shows that CA is a powerful technique for the analysis of relationships between sets of categories. Sensitively used, it provides a means of clarifying the nature of the space within which social 'objects' are located and so of suggesting the existence of unidimensional orderings that can properly be conceived of as quantities. Of course, given the open nature of such objects, the unavoidable intrusion of characteristics and qualities other than the one of immediate theoretical interest, these 'quantities' are unlikely ever to have the same status as those of the physical world. One could find that the same set of objects used together with a different empirical operation - i.e. in a cross-tabulation with a new set of categories - would actually be rank-ordered differently, suggesting a different quantity. The extent to which this was theoretically disturbing would depend on the nature of the operation and the predictions based upon it. In any event, the development of more adequate measures of social phenomena would allow a far more fruitful exploration of the relationships involving differences of this kind.

Substantively, we believe that these results fully vindicate our belief in the existence of a hierarchical order of generalised social and material advantage in nineteenth and early twentieth-century Britain. Marriage was (and, much evidence would suggest, remains) a linking of two families that is structurally significant in the sense that it reflects, on average, their social similarity in terms of their position in that hierarchical order. In other words, those who are alike in social, cultural and material terms will tend to inter-marry. The significance of the fact that they do so does not lie simply in seeing it as a consequence of their similarity; the precise, socially determined and socially perceived nature of the similarity is reaffirmed in these patterns of marriage. The social order is not just being reflected; it is also, at the same time, being reproduced. There is no need to believe that those involved have any conscious intention of bringing this about, nor that they have any clear view or evaluation of the nature of the total structure in which they are located, though of course some may, to varying extents. The operation of 'marries into' (or 'transmits to') that links the occupational groups is a theoretical link between, or within, social structures, out of which emerges a structural quantity of hierarchical social order.


1 This will be recognised as the Guttman scaling model.

2 This may not be entirely true. For example, questionnaires can be constructed and administered and experimental situations can be created. In such cases, though, the element of social unreality is always a potential problem.

3 There is a related problem of rotation of axes in factor analysis.

4 Optimal scaling, in the form of Conjoint Measurement, was in fact one component of the early set of programs written by Lingoes under the broad heading of Smallest Space Analysis, which also included MDS.

5 The most useful concept here is the mathematical one of a topological space - 'a collection of objects (these objects usually being referred to as points), and a structure that endows this collection of points with some coherence, in the sense that we may speak of nearby points or points that in some sense are close together' (Mendelson, 1968).

6 The issue of whether this is a hierarchy of status, something conceptually or analytically distinct from class, is one that we cannot go into here. In our view, the empirical evidence, such as that presented later in this paper, demonstrates that such a distinction serves no useful purpose (see Stewart et al, 1980; Prandy and Bottero, 1995; cf also Bourdieu, 1985).

7 Further information on the background and use of the Cambridge Scale is available from the website at <http://www.sps.c>.

8 A similar phenomenon seems often to occur in these 'smallest space' techniques. It is probably a result of the fact that the extreme points are less constrained than those closer to the centre.

9 This may sound rather like the infamous claim that intelligence is what IQ tests measure, except that in the latter case the measurement procedures - the addition of individual item scores - are wholly inadequate and certainly do not establish a quantity. If no quantity is established, it very obviously cannot be 'intelligence'.


The authors wish to acknowledge the support of the Economic and Social Research Council in funding the work reported in this article under awards R000235147 (The Family, Occupation and Social Stratification: 1840 to the Present) and R000221729 (The Family, Occupation and Social Stratification in Scotland and Ireland).


BOURDIEU, P. (1985) 'The Social Space and the Genesis of Groups', Theory and Society, vol. 14, pp. 723 - 744.

BOURDIEU, P. (1987) 'What Makes a Social Class? On the Theoretical and Practical Existence of Groups', Berkeley Journal of Sociology, vol. 32, pp. 1 - 17.

COXON, A.P.M. (1982) The User's Guide to Multidimensional Scaling. London: Heinemann Educational Books.

CROSSICK, G. (1978) An Artisan Elite in Victorian Society: Kentish London 1840-1880. London: Croom Helm.

FOSTER, J. (1974) Class Struggle and the Industrial Revolution. London: Weidenfeld and Nicolson.

GOODMAN, L.A. (1985) 'The Analysis of Cross-Classified Data having Ordered and/or Unordered Categories: Association Models, Correlation Models, and Asymmetry Models for Contingency Tables with or without Missing Entries', The Annals of Statistics, vol. 13, pp. 10 - 69.

GOODMAN, L.A. (1987) 'New Methods for Analyzing the Intrinsic Character of Qualitative Variables using Cross-Classified Data', American Journal of Sociology, vol. 93, pp. 529 - 583.

GRAY, R.Q. (1976) The Labour Aristocracy in Victorian Edinburgh. Oxford: Oxford University Press.

GREENACRE, M.J. (1984) Theory and Applications of Correspondence Analysis. London: Academic Press.

HOUT, M. (1982) 'The Association between Husbands' and Wives' Occupations in Two-Earner Families', American Journal of Sociology, vol. 88, no. 2.

LAMPARD, R. (1992) An Empirical Study of Marriage and Social Stratification. Unpublished DPhil thesis, University of Oxford.

LAMPARD, R. (1993) 'Applications of association models to sociological data'. Paper presented to the British Sociological Association Annual Conference, University of Essex.

MENDELSON, B. (1968) Introduction to Topology. Boston: Allyn and Bacon.

MITCHELL, J.C. and CRITCHLEY, F. (1985) 'Configurational Similarity in Three Class Contexts in British Society', Sociology, vol. 19, no. 1.

PENN, R.D. (1985) Skilled Workers in the Class Structure. Cambridge: Cambridge University Press.

PRANDY, K. (1990) 'The Revised Cambridge Scale of Occupations', Sociology, vol. 24, pp. 629 - 655.

PRANDY, K. (1993) Marriage and Social Stratification in Nineteenth-Century Britain. Cambridge: Sociological Research Group Working Paper No. 15.

PRANDY, K. (1996) 'Categories or Quantities: Class or Hierarchy?' Paper presented at the ISA Fourth International Social Science Methods Conference, University of Essex.

PRANDY, K. (1998) 'Class and Continuity in Social Reproduction', Sociological Review (forthcoming).

PRANDY, K. and BOTTERO, W. (1995) The Social Analysis of Stratification and Mobility. Cambridge: Sociological Research Group Working Paper No. 18.

SPSS Inc. (1990) SPSS Categories. Chicago: SPSS Inc.

STEWART, A. and BLACKBURN, R.M. (1975) 'The Stability of Structural Inequality', Sociological Review, vol. 23, pp. 481 - 508.

STEWART, A., PRANDY, K. and BLACKBURN, R.M. (1980) Social Stratification and Occupations. London: Macmillan.

WELLER, S.C. and ROMNEY, A.K. (1990) Metric Scaling. Newbury Park, CA: Sage.
