Saturday, December 03, 2005

A ranged marriage / The Importance of being Normal

If you're Indian, in your mid-20s or older and unmarried, there's a reasonably good chance that you've had at least one conversation (with family, with married friends) about arranged marriage. You've been told that there's no harm in at least meeting with people that your parents / family find for you. That the arranged marriage market can be really efficient and really convenient if you just let it. That of course, you'll need to make some compromises, but that's what all relationships are about.

The trouble with all these arguments is that they ignore the fundamental heterogeneity of human personality, and make the implicit assumption that everyone will experience the arranged marriage market the same way. That assumption implies that there is a single objective assessment of the value of the marriage market, and so any disagreement about whether you should look for an arranged marriage implies that at least one of the parties in question is being biased or prejudiced and allowing his / her subjective feelings to come in the way of good sense.

My own contention is that how efficient the marriage market is and how much compromise you need to make to operate through that market is overwhelmingly a function of who you are. This means that it's perfectly feasible for person A to see the arranged marriage market as the best thing ever and person B to see it as a completely value-less proposition, and for them both to be right.

Let's see why this should be true. The key assumption you need to make to understand this is that human characteristics are normally distributed, i.e. they conform to the familiar bell curve. This is not an entirely arbitrary assumption. Normal distribution of human behaviour is an inherent assumption in practically all grading and evaluation, and the argument for human behaviour being normally distributed goes all the way back to Thorndike's treatise on Individuality in 1911. Of course, personality is a notoriously difficult thing to measure, and it's certainly true that many human choices are driven by states, moods or attitudes rather than by enduring traits. Still, I think it's safe to argue that if you're solving for someone you're going to spend the rest of your life with, you'd want a little more than common mood or emotion - you'd want compatibility at a trait level.

For the purposes of what follows, I'm going to assume that all aspects of personality (including values, tastes, interests, abilities, etc.) can be combined into a single construct and furthermore, that this construct can somehow be scaled to give a continuous measure of personality. The positive and negative of the scale is irrelevant here - I'm not trying to make value judgements - I'm simply interested in looking at what is average and what is not. It's an unfortunate (though not, perhaps, entirely coincidental) quirk of the language that words like common, average, mean, etc. come with negative connotations. I use them here in a purely statistical sense - I'm not trying to imply that someone who is at the mean is in any way inferior to someone who is four standard deviations above the mean - only that the two are very different.

Right, then. The crux of my argument is that within a given population, human personality is normally distributed. Let us assume that we took the entire population and scored them on our personality scale. Let u denote the mean value of the personality measure and s denote its standard deviation. Then my assumption is that the personality scores of the population will be normally distributed around this mean, that is to say they will show a distribution such as the one shown below. I can't prove that this is true, of course, but anecdotal empirical evidence would suggest it is - just think about the number of people who watch football vs those who watch ballet [1].


What does the search for a partner look like in a population like this? To understand this, we first need to define a compatibility level c - which is defined as the distance between two people and is measured in multiples of standard deviation[2]. Thus c = 1s implies that people are one population standard deviation away from each other. I assume that the value of the relationship is inversely proportional to c, i.e. a relationship between people who are 1s apart is twice as valuable as a relationship between people who are 2s apart. Notice that this implicitly assumes that this distribution is not visible to people and that therefore the value of the relationship per se is independent of the probability of its occurrence. I will return to examine this assumption more closely.

Let us take an individual located at u + xs [3] on this distribution and assume that she enters the arranged marriage market and meets with n potential spouses, evaluating them at a minimum compatibility cut off c'. Assume also that, within the selected population, draws in the marriage market are made at random [4] (I will have more to say shortly on how the relevant population is selected and what impact the marriage market has on that). What is the probability that this person will find a mate through this process?

Given c', the % of the population that is compatible with this person is the share lying between u+xs-c' and u+xs+c'. Standardising both limits (subtracting u and dividing by s), this is N(x + c'/s) - N(x - c'/s), where N is the cumulative standard normal (z) distribution. Thus a person exactly at the mean and with a compatibility criterion of 1s would be compatible with 68% of the population and would therefore have a 68% chance of meeting someone and finding them acceptable. A person at the mean plus one standard deviation (u+s) would have a 48% chance, a person at u+2s would have a 16% chance, and so on. The point is, of course, that as you get out in the tail of the distribution, the probability of finding any one person you meet compatible declines rapidly.
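To make the arithmetic concrete, here is a minimal Python sketch of that calculation. The function names are mine, invented for illustration, and the cut-off is passed in as a multiple of s (so c = 1.0 stands for c' = 1s); N is computed from the error function in the standard library.

```python
import math


def norm_cdf(z: float) -> float:
    """Cumulative standard normal distribution, N(z) in the notation above."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def compatible_share(x: float, c: float) -> float:
    """Share of the population within c standard deviations of a person at u + x*s.

    Both x and c are expressed in multiples of s (an assumption of this sketch).
    """
    return norm_cdf(x + c) - norm_cdf(x - c)


# Share of compatible people for someone 0, 1, 2, 3 and 4 deviations from the mean.
for x in (0, 1, 2, 3, 4):
    print(f"u+{x}s: p = {compatible_share(x, 1.0):.2%}")
```

Because the distribution is symmetric, compatible_share(-x, c) gives the same answer as compatible_share(x, c), which is the point made in note [3].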

But let's get back to the market. Let's denote the probability of finding a person drawn at random acceptable = % of population that is compatible = p. If you meet n people, the probability of finding at least one person who you think is compatible (assuming all trials are independent) is m = 1-(1-p)^n. Say n=5 (not an unreasonable number, you would think). For the strictly average person (located at u, with p=68%), the probability of finding at least one person they feel they're compatible with is 99.7% - a virtual certainty. Conversely, for someone at u+3s (p=2.3%) the chance of finding at least one acceptable mate is a mere 11%. And if you happen to be out at u+4s (p=0.1%), forget it, you've got less than a 1% chance of meeting someone acceptable. The graph below maps out the probability of finding at least one acceptable person for the case where c'=1s and n=5. You can see how sharply this probability drops beyond a deviation of 1.5 from the average.
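The step from p to m can be sketched the same way. Again this is only an illustration under the assumptions already stated (independent, random draws), with hypothetical function names and with x and c given in multiples of s.

```python
import math


def norm_cdf(z: float) -> float:
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def compatible_share(x: float, c: float) -> float:
    """p: share of the population within c deviations of a person at u + x*s."""
    return norm_cdf(x + c) - norm_cdf(x - c)


def chance_of_match(x: float, c: float, n: int) -> float:
    """m = 1 - (1 - p)^n: chance that at least one of n random draws is compatible."""
    p = compatible_share(x, c)
    return 1.0 - (1.0 - p) ** n


# Five meetings, cut-off of one deviation, for people further and further from the mean.
for x in (0, 1, 2, 3, 4):
    print(f"u+{x}s: m = {chance_of_match(x, 1.0, 5):.1%}")
```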

Notice that I'm not even trying to make the case for what Byron calls the 'credulous hope of mutual minds' here. I'm not talking about love or soul-mates - a compatibility criterion of 1 deviation is a fairly substantial compromise. Yet even so, if you're out in the tail of the distribution, the chance of your being able to find someone even that barely acceptable is pretty low.

There are a number of different ways to think about this problem, of course. One way is to think about how many people you would have to meet before you'd have a reasonable chance of finding someone acceptable. The figure below graphs out the different probability distributions for m taking different ns, keeping c'=1s. Clearly, as n increases, the inflexion point beyond which m starts to drop rapidly shifts out, so that for people at u+2s for example, m with fifteen trials (92%) is significantly better than m with just three trials (40%). For people in the extreme tail, however, the difference, though significant in relative terms, isn't particularly great in absolute terms. If you're at u+4s, shifting from 3 trials to 20 trials will take your chances of finding someone, m, from virtually zero to 3%. That's a big increase, but you're still going to meet 20 potential spouses and have a 3% chance of finding someone acceptable. Hardly exciting odds. In practice, of course, n is likely to be sharply constrained, so that going beyond single digit trials may be hard (if you're at u+4s for instance, you'd need roughly 1,700 trials to achieve a 90% probability of finding someone acceptable - that's more than four and a half years of spending every night, weekends included, no repeats, meeting new prospectives!). Still, if you think you're at u+2s, you may want to push for a larger number of trials.
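If you want to turn the question around and ask how many meetings it takes to reach a given m, you can invert m = 1-(1-p)^n to get n = log(1-m) / log(1-p), rounded up. A small sketch, with hypothetical function names and the same assumptions as before (independent random draws, x and c in multiples of s):

```python
import math


def norm_cdf(z: float) -> float:
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def compatible_share(x: float, c: float) -> float:
    """p: share of the population within c deviations of a person at u + x*s."""
    return norm_cdf(x + c) - norm_cdf(x - c)


def trials_needed(x: float, c: float, target: float) -> int:
    """Smallest n for which 1 - (1 - p)^n reaches the target probability (0 < target < 1)."""
    p = compatible_share(x, c)
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


# Meetings needed for a 90% chance of at least one compatible person, with c' = 1s.
for x in (0, 1, 2, 3, 4):
    print(f"u+{x}s: n = {trials_needed(x, 1.0, 0.90)}")
```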

The more obvious variable to change here, of course, is c'. The graph below holds number of trials constant at 5, but plots the distribution of m for different values of c'. The key point here is that relaxing your criteria has a much greater benefit than meeting more people if you're in the tail of the distribution. If you're at u+4s, relaxing c' to 1.5s will give you the same probability of finding someone (still 3%, though) with just five trials as you would get by sticking to your criteria but meeting 20 prospectives. If you think about the way the arranged marriage process is structured, you can see why for people in the tail there's a disproportionate pressure to compromise on who they end up with rather than on how many people they meet. If you're at u+4s, you could get a 90% chance of finding someone 'acceptable' with just five trials, if you set c' at 3.66s - so that basically you're looking for someone just marginally different from average.
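There is no neat closed form for the c' that delivers a given m when you are away from the mean, but since m only ever rises as c' is relaxed, a simple bisection search finds it. The sketch below is one way to do that; the function names, the search range and the tolerance are my own choices for illustration, and x and c are in multiples of s.

```python
import math


def norm_cdf(z: float) -> float:
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chance_of_match(x: float, c: float, n: int) -> float:
    """m = 1 - (1 - p)^n for a person at u + x*s with cut-off c."""
    p = norm_cdf(x + c) - norm_cdf(x - c)
    return 1.0 - (1.0 - p) ** n


def cutoff_for_target(x: float, n: int, target: float, tol: float = 1e-6) -> float:
    """Smallest cut-off c (in deviations) giving at least the target m in n trials.

    Bisection works because m increases monotonically with c. Assumes 0 < target < 1.
    """
    lo, hi = 0.0, abs(x) + 10.0  # at the upper bound practically everyone is compatible
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chance_of_match(x, mid, n) < target:
            lo = mid
        else:
            hi = mid
    return hi


# How far must someone at u+4s relax the cut-off for a 90% chance in five meetings?
print(f"c' = {cutoff_for_target(4, 5, 0.90):.2f}s")
```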

Two points need to be made here. First, the analyses above suggest why it's important to consider carefully before entering the arranged marriage market in the first place. Research shows that people like to finish things they've started, and once you meet with the first few prospectives the pressure to lower your standards and go with someone less compatible rather than carry on petulantly rejecting people will mount, especially if you're the kind of person who is sensitive about rejecting other people. So as the arranged marriage process goes forward, you're likely to find yourself lowering your criteria considerably. This brings me to the second point - the assumption here is that you're trying to maximise compatibility with the person you're going to spend your life with; I'm assuming that the fact of being married carries no value by itself. You could make the opposite assumption and assume that the entire value comes from the marriage and the person is irrelevant, in which case the argument above becomes simply a way of estimating the minimum compromise you have to make to have reasonable certainty of finding someone.

As you think about adjusting c', you can also go the other way, of course. If you're at u, you can achieve the same m as someone at u+2s could get with 5 trials and c' of 1s (57.5%) by setting c' to .198s; you could do as well as a person at u+4s who has c' set to 1s with a c' of roughly .0017s. In other words, people who are more or less average have as good a chance of finding their true soulmates through the arranged marriage process as people far out in the tails have of finding someone barely acceptable.
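For someone exactly at the mean the inversion is direct: p = 2N(c) - 1, so c = N^-1((1 + p) / 2), and with n held equal, matching p is the same as matching m. A sketch using Python's statistics.NormalDist (available from Python 3.8); the function names are hypothetical, and x and c are in multiples of s.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def compatible_share(x: float, c: float) -> float:
    """p: share of the population within c deviations of a person at u + x*s."""
    return STD_NORMAL.cdf(x + c) - STD_NORMAL.cdf(x - c)


def equivalent_cutoff_at_mean(x: float, c: float) -> float:
    """Cut-off that gives a person at u the same p as a person at u + x*s with cut-off c."""
    p = compatible_share(x, c)
    return STD_NORMAL.inv_cdf((1.0 + p) / 2.0)


# How picky can the average person be and still match the odds of someone in the tail?
for x in (2, 3, 4):
    print(f"u+{x}s at c'=1s  <->  u at c'={equivalent_cutoff_at_mean(x, 1.0):.4f}s")
```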

The implication of all this is clear - how good a deal arranged marriage is is entirely a function of how different / similar you are to people from your own socio-economic and cultural background. If you're just a regular guy / girl, then the arranged marriage market may actually make extremely good sense - you have a pretty fair shot of finding someone you could truly come to love (for people at u, m=67% with n=5 and c'=.25s); if you're someone who finds it difficult to get along with people of your own age group / background, then you've basically got no chance at all, unless you're willing to totally give up on any semblance of compatibility and just get married for the sake of getting married. That's why people who are close to the average can speak so blithely of compromise - the compromise they need to make is so much smaller than the ones the people in the tail would have to make.

At this point, you're probably asking what arranged marriages have to do with this - isn't this true more generally? The key point about arranged marriages, of course, is that the population they sample from is selected very differently from the one that you might sample from yourself. For one thing, arranged marriage samples are likely to have lower standard deviation, simply because they'll sample from within a particular community and control for various socio-economic and other background factors, all of which will tend to drive the variance down. So you may not be a particularly unique person otherwise, but within the narrow subset of the arranged marriage sample you may still end up being at u+3s. The second point is that arranged marriage samples are generally standard samples that are drawn fairly independently of the person they are being drawn for (partly because those drawing the sample don't know much about the person it's being drawn for themselves, and partly because the whole point of the arranged marriage market is that it increases efficiency by making one big common draw of the sample, rather than making separate draws for each individual), so that unlike a personal sample, where you're more likely to be close to the mean, the arranged sample has nothing that will tend to centre it around you. The key comparison to make then, is between the sample population that the arranged marriage market already has drawn and ready, and the population that you can draw for yourself. If you believe that you're going to be between u-1.5s and u+1.5s in the arranged marriage sample, that's a good way to go; if you think you're going to be closer to u-3s or u+3s, then you probably want to consider drawing your own sample. (Notice that on the assumption that most people intuitively do this anyway, the variance in the arranged marriage sample declines even further, because people in the tails essentially self select out of the market.) [5]

Finally, for those of you who are bristling with indignation at the cold analytics of this and want to deliver long soliloquies about love, let me say that I totally agree with you. A steady teenage diet of Keats, Byron and Shelley means that I'm all for rejecting the "lore of nicely calculated less and more" and doing the Emersonian "Give all to love" thing. The trouble is that that's not a useful argument against those who recommend arranged marriage - they'll argue that you're just being romantic and impractical (as if that's a bad thing!). The point of this is simply to prove that even from a purely objective, analytic perspective, arranged marriage does not make sense for some people.

Now if only I can get my great aunts to understand normal distributions and binomial probability!

Notes

[1] An alternative assumption would be to assume a chi-square distribution - suggesting that the distribution has a longer right tail and is concentrated on the left. As should be obvious, this would have even more extreme implications for the model presented - in that sense the normality assumption is a conservative one.

[2] Another implicit assumption in this analysis, now that I think about it, is homophily - the idea that you would want to be with someone who is more like you than otherwise. I agree that opposites attract, of course, but I'm not sure they make for stable life-partners. At any rate, I don't see why you would even consider an arranged marriage if you were looking for someone completely unlike you. You could of course hypothesize a curvi-linear relationship between attraction and similarity, and re-do this analysis with different assumptions on preferences.

[3] I use u+xs purely for notational convenience, the argument is exactly the same for u-xs - the final probability distribution is symmetric around u and depends only on the absolute value of the deviation, not on its sign.

[4] You could argue that people who know you may be able to do better than a random draw. I think the random draw assumption is reasonable because a) the people making the draws usually don't know anything about you and just use some general criteria b) to the extent that the search can be extremely specific and targeted, I'm not sure that it qualifies as an arranged marriage anymore.

[5] Another way to think about the difference between the distribution from the arranged marriage vs your personal distribution is that respecifying the personality vector in your own rather than general terms may cause the same population to arrange itself somewhat differently, leaving you closer to the mean. In practice, this is equivalent to hypothesising a different population entirely.

4 comments:

Heh Heh said...

brilliant!

the One said...

Er .. what about "opposites attract" and that sort of thing?

zedzded said...

fantastic! I totally agree with your analysis. I have gone through this whole process and finally decided to stay single. People see me as a "u" person but I really am a "u+4s" person. With such disparity between the "boys" introduced to me thro this process and the real me, it is like watching a painfully-boring-badly-made-movie in a popcorn-smelling theatre with bad sound system.
My american friends think it is a pretty neat idea, as you get to have a dozen blind-dates when you visit home, all the background-check has been done your parents and you have all the family backing and support if things go wrong [you can blame it on them :-)].

Falstaff said...

Heh heh: Thanks. I thought you would like it.

The One: See note [2]

zedzded: Thanks. And yes, I see how it could be comforting to know that you can blame someone else for your own problems. Aside from the fact that that's cold comfort, though, there's the problem that it could work the other way - your family could come back saying, look, we found you the best guy around, if you didn't manage to make it work even with him then the problem has to be with you. As a general rule, it's dangerous to assume that you can get the better of family - remember they've had a lot more practise, plus making you feel bad about yourself is what they do for a living!