The Mathematics of French-Canadian Cousinhood

From the Storyline Notebook

Why, if you have a French-Canadian ancestor, we are probably related

A note on the founder effect · pedigree collapse · and why the math always works

When my son mentioned recently that a coworker had French-Canadian background, I told him before either of us had checked anything: "We are probably related, somewhere in the 9th or 10th cousin range. We just need to figure out the line."

It turned out we shared Marguerite Gaulin (1627–1703) and her daughter Marie Crête. My descent traces through Marie's second husband; his coworker's traces through her first. Two branches that diverged in seventeenth-century Beauport rejoined three and a half centuries later in a conversation between two economists.

This was not the first time I had made the prediction. I have used the same framing in conversations about the filmmaker Denis Villeneuve, the actress Angélina Jolie, and the singer Madonna — all documented French-Canadian-founder descendants whose public profiles make the cousinhood claim a concrete demonstration of the underlying math. In every case I have examined directly, the cousinship has been traceable in the records.

What I had been observing across cases is, in fact, one of the most mathematically tractable problems in population genetics. Two economists in particular might appreciate the structure of it. So this piece is for them, with the math in plain sight.

Eight thousand five hundred to eight million

The first thing to know about French-Canadian ancestry is how few people made it. New France received roughly 33,500 colonists from the founding of Quebec in 1608 to the British Conquest in 1760. Fewer than 10,000 stayed. Of those who stayed, only about 8,500 actually settled, married, and had at least one child in the colony. That is the entire genealogical base for the modern French-Canadian population — somewhere between eight and ten million descendants today, depending on how broadly you count.

~8,500 founding settlers who reproduced · ~800 filles du roi (1663–1673) · ~262 filles à marier (1634–1663)

The 8,500 figure is the canonical number used in current peer-reviewed research. The Project BALSAC database at the Université du Québec à Chicoutimi, the most complete population-genealogy database for any human group in the world, has reconstructed lineages for the vast majority of these founders and their descendants. Pair it with the Programme de recherche en démographie historique (PRDH) at the Université de Montréal and the data infrastructure becomes the closest thing to a controlled experiment in human pedigree collapse that exists.

The crucial point for what follows: the modern French-Canadian descendant pool ran through a demographic bottleneck of fewer than ten thousand people. Mathematically, that has consequences.

The math of pedigree collapse

An individual has two parents, four grandparents, eight great-grandparents, and in general $2^n$ ancestors at the $n$th generation back. This doubling is the most familiar piece of arithmetic in genealogy. The next observation is the less familiar one: that arithmetic cannot possibly hold forever.

For a typical North American descendant of seventeenth-century New France settlers, Marguerite Gaulin sits roughly at the tenth generation back. The arithmetic says I should have $2^{10}$ = 1,024 distinct ancestors at her level. But Marguerite belongs to a founding pool of only about 8,500 individuals. By the eleventh generation back the formula calls for 2,048 distinct ancestors, and by the twelfth, 4,096. By the thirteenth it calls for 8,192, nearly the entire founding population, and by the fourteenth (16,384) it demands almost twice as many ancestors as there were founders.

What that mismatch means is that the same individuals must appear multiple times in any deep French-Canadian pedigree. Lines that look distinct in the first few generations must, by combinatorial necessity, converge on the same ancestral pool.

Mathematical Sidebar 1

Where the lines must converge

If a population has $F$ founders (here $F \approx 8{,}500$), and an individual's pedigree calls for $2^n$ ancestor slots at the $n$th generation, then the pedigree must collapse onto repeated individuals once:

$$2^n > F \quad\Rightarrow\quad n > \log_2 F \approx 13.05$$

Because $n$ is a whole number, the arithmetic forces convergence by the fourteenth generation back, where $2^{14}$ = 16,384 slots exceed the 8,500 founders. In practice convergence begins much earlier, because the effective founder pool is far smaller than $F$: many founders contributed no surviving descendants, and a small subset contributed disproportionately. (For more on this Pareto-shaped contribution structure, see the next sidebar.)
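For readers who like to check the arithmetic, here is a minimal Python sketch of the threshold. The function name is my own, and the 5,000 and 2,500 pool sizes are simply the ends of the effective-founder range quoted in this piece. Since the generation count must be a whole number, the first generation whose slot count strictly exceeds 8,500 is the fourteenth.

```python
import math

def first_forced_collapse(founders: int) -> int:
    """Smallest generation n at which 2**n ancestor slots exceed `founders`.

    For a positive integer F, the smallest n with 2**n > F equals F.bit_length(),
    which avoids floating-point rounding near exact powers of two.
    """
    if founders < 1:
        raise ValueError("founders must be a positive integer")
    return founders.bit_length()

# 8,500 is the founder count used in this piece; the smaller pools are the
# ends of the quoted effective-founder range (illustrative inputs only).
for pool in (8_500, 5_000, 2_500):
    n = first_forced_collapse(pool)
    print(f"pool={pool:>5}: log2(pool)={math.log2(pool):.2f}, "
          f"slots first exceed the pool at generation {n} (2**{n} = {2**n:,})")
```

With the smaller effective pools the forced collapse arrives a generation or two sooner, which is the sense in which convergence "begins much earlier" in practice.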

The 8,500 figure is itself an upper bound. The number of founders whose lines actually persisted into modern populations — the "effective" founder size — is much smaller, perhaps 2,500–5,000 depending on the descendant subgroup. Population-genetics researchers have shown that the distribution of founder contributions is sharply Pareto-shaped: a small minority of founders, particularly the high-fertility filles du roi and filles à marier, account for a disproportionate share of modern descendants.

Mathematical Sidebar 2

The Pareto structure of founder contribution

Not every founder contributed equally to the modern descendant pool. The contribution distribution is heavily skewed:

  • A small share of founders accounts for a disproportionate share of contemporary descendants — characteristic of a power-law or Pareto-shaped distribution.
  • Some founders (notably high-fertility filles du roi and filles à marier) saw 7–11 surviving children, each producing 7–11 of their own, compounding exponentially.
  • Other founders left no descendants at all — emigrated back to France, died young, married late, or had no surviving issue.
  • The resulting Gini-like inequality in ancestral contribution amplifies the convergence: descendants concentrate disproportionately around the high-contribution founders.

This is why the expected number of common ancestors between any two French-Canadian descendants is even higher than the naive birthday-paradox calculation would predict.
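The amplification can be illustrated with a small sketch. If each ancestor slot lands on founder i with probability w_i, two independent slots land on the same founder with probability equal to the sum of the squared w_i, which is smallest when every founder is equally likely. The Zipf-like weights below (founder of rank r weighted 1/r) are a hypothetical stand-in for the real contribution distribution, not a fitted model, and the pool size of 4,000 is just a value inside the range quoted above.

```python
def match_probability(weights):
    """Probability that two independent draws pick the same founder."""
    total = sum(weights)
    return sum((w / total) ** 2 for w in weights)

POOL = 4_000     # assumed effective founder pool (illustrative)
SLOTS = 1_024    # ancestor slots at the tenth generation back

uniform = [1.0] * POOL
# Hypothetical skew: the founder of rank r gets weight 1/r (Zipf-like).
skewed = [1.0 / rank for rank in range(1, POOL + 1)]

for name, weights in (("uniform", uniform), ("skewed", skewed)):
    p = match_probability(weights)
    # Expected number of (slot, slot) pairs that hit the same founder.
    print(f"{name:>7}: expected matching slot pairs = {SLOTS * SLOTS * p:,.0f}")
```

Any departure from equal weights raises the match probability, so the qualitative conclusion does not depend on the particular skew chosen here.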

The fertility engine: filles à marier and filles du roi

The 8,500-founder bottleneck would have produced strong pedigree collapse on its own. What made the convergence even sharper was the policy-driven demographic surge between 1634 and 1673, when two cohorts of state-supported women emigrants — roughly 262 filles à marier and roughly 770–800 filles du roi — arrived in a colony desperate for marriages and children.

Marguerite Gaulin was a fille à marier: she arrived in or near 1654, married within months of landing, and bore ten children at Beauport over forty-nine years of marriage. That pattern (immediate marriage, a large completed family, and low infant mortality relative to seventeenth-century France) is what produced the explosive fanning of descent.

Belleau, a leading Quebec genealogist and herself a descendant of one of these women, estimates that roughly 95 percent of "old stock" French-Canadians can find at least one fille du roi in their family tree. Other published estimates put the figure at about two-thirds of all Canadians of French descent, including those of more recent or mixed origin. Both figures are useful: the higher one applies to the population genealogists most often encounter; the lower one applies to the broader demographic envelope.

The cumulative effect is what economists would call a multiplier on a small base. Eight hundred founding mothers, multiplied by ~5–10 surviving children, multiplied across ten or eleven generations of compounded reproductive success, produces a descendant pool numbering in the millions — with each modern descendant typically tracing back to multiple founders rather than just one.
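As a back-of-the-envelope check on the compounding, the sketch below asks what steady per-generation growth factor would carry roughly 8,500 founders to roughly 8 million descendants. The steady rate and the choice of ten or eleven generations are simplifying assumptions of mine, not figures from the sources, and the function name is hypothetical.

```python
def implied_growth_factor(founders: float, descendants: float, generations: int) -> float:
    """Constant per-generation multiplier g such that founders * g**generations == descendants."""
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return (descendants / founders) ** (1.0 / generations)

# Illustrative inputs: ~8,500 founders, ~8 million descendants today.
for generations in (10, 11):
    g = implied_growth_factor(8_500, 8_000_000, generations)
    # A population that multiplies by g each generation needs about 2 * g
    # reproducing children per couple.
    print(f"{generations} generations: factor ≈ {g:.2f}, "
          f"about {2 * g:.1f} reproducing children per couple")
```

Under these assumptions the whole population roughly doubles each generation, so the 5 to 10 surviving children of the early high-fertility families sit well above the long-run average.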

The birthday paradox, applied to ancestors

Here is the question of practical interest: if two random French-Canadian descendants meet, what is the probability that they share at least one common ancestor in the founding cohort?

The answer, to a first approximation, is essentially one.

The reasoning is structurally identical to the birthday paradox. In a room of 23 people, the probability that two share a birthday exceeds 50 percent — counterintuitive because we instinctively compare 23 to 365, not the 253 possible pairs of people. The same logic, scaled up, applies to French-Canadian ancestral pools.
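The two numbers in that paragraph are easy to verify. This short Python sketch assumes 365 equally likely birthdays and independent people, which is the textbook version of the problem; the function name is mine.

```python
from math import comb

def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for k in range(people):
        # The (k+1)th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

print(comb(23, 2))                                  # 253 possible pairs
print(round(shared_birthday_probability(23), 4))    # 0.5073
```

The count that matters is the number of pairs, not the number of people, and the same holds when the "birthdays" are founders and the "people" are ancestor slots.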

Mathematical Sidebar 3

The probability of shared ancestry

Suppose two apparently unrelated descendants each have approximately 1,024 ancestral slots at the tenth generation back, drawing from an effective founder pool of approximately 2,500–5,000 high-contribution founders.

The expected number of shared ancestors between them, if their ancestor slots were drawn independently from that pool, is given by:

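$$E[\text{shared}] \approx \frac{n_1 \times n_2}{F_{\text{effective}}}$$

where $n_1$ and $n_2$ are the two descendants' ancestor-slot counts and $F_{\text{effective}}$ is the size of the effective founder pool.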

For two descendants each with ~1,024 slots and an effective pool of ~4,000, this yields 1,024 × 1,024 / 4,000 ≈ 262 matching pairs of slots, or on the order of 200 distinct shared ancestors once a founder who is drawn several times is counted only once. The Pareto skew amplifies this further, since both descendants are drawing disproportionately from the same high-contribution founders.

The probability that the actual overlap is zero is, for any realistic parameter values, vanishingly small. Two French-Canadian descendants will share ancestors. The empirical question is only which ancestors and how many.
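A sketch of the sidebar's arithmetic under its own simplification: every slot is an independent, equally likely draw from the effective pool. Real pedigrees are neither independent nor uniform, so treat the outputs as orders of magnitude. The function and variable names are mine, and the zero-overlap figure is the exact value for this idealized model only.

```python
import math

def overlap_estimates(n1: int, n2: int, pool: int):
    """Return (matching slot pairs, distinct shared founders, log10 P(no overlap)).

    Assumes each ancestor slot is an independent uniform draw from `pool`.
    """
    miss = 1.0 - 1.0 / pool                 # chance one draw misses a given founder
    matching_pairs = n1 * n2 / pool         # the sidebar's formula
    covered_1 = 1.0 - miss ** n1            # chance a given founder is in pedigree 1
    covered_2 = 1.0 - miss ** n2            # chance a given founder is in pedigree 2
    distinct_shared = pool * covered_1 * covered_2
    # No overlap means no (slot, slot) pair matches; in this idealized model
    # that probability is miss ** (n1 * n2), reported here as a base-10 log.
    log10_p_none = n1 * n2 * math.log10(miss)
    return matching_pairs, distinct_shared, log10_p_none

pairs, shared, log10_none = overlap_estimates(1_024, 1_024, 4_000)
print(f"matching slot pairs      ≈ {pairs:.0f}")
print(f"distinct shared founders ≈ {shared:.0f}")
print(f"P(no overlap)            ≈ 10^{log10_none:.0f}")
```

Even if the pool were several times larger than assumed, the zero-overlap probability would remain negligible, which is why the conclusion is insensitive to the exact parameters.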

The exact arithmetic depends on the choice of $F_{\text{effective}}$ and on the depth at which one starts counting, but the result is qualitatively robust to a wide range of plausible assumptions. The genealogical infrastructure of seventeenth-century Quebec essentially guarantees shared ancestry for any two modern descendants of the founding population.

If two French-Canadians meet and both have ancestry reaching the founding period, the question is not whether they share ancestors. It is which, and how many, and through which lines.

What modern DNA studies confirm

The arithmetic above is a prediction derived from population history. Genome-wide DNA analyses provide the empirical confirmation. Studies of identity-by-descent (IBD) sharing — segments of DNA that are identical between two individuals because they were inherited from a common ancestor — show extensive IBD sharing throughout the French-Canadian population.

Researchers using the BALSAC genealogical data, paired with modern genome-wide single-nucleotide polymorphism data, have demonstrated that the kinship structure of contemporary Quebec was visibly established by roughly 1750, after only three to four generations of European settlement. Regional founder effects subsequently amplified the convergence: the Saguenay–Lac-Saint-Jean population, the Beauce, the Gaspésian Acadians, all show even sharper internal kinship than the Quebec average.

The genealogical evidence and the genetic evidence agree. The mathematics is not theoretical; it is documented in both the parish registers and the genome.

Documented Distant Cousins

The same arithmetic that connects us

Hillary Clinton · Madonna · Angélina Jolie
Céline Dion · Justin Trudeau · Pierre Trudeau
Alanis Morissette · Jack Kerouac · Denis Villeneuve

All documented French-Canadian-founder descendants, all distant cousins of one another. A 2008 study by the New England Historic Genealogical Society found, for example, that Hillary Clinton and Angélina Jolie are ninth cousins twice removed, both descending from Jean Cusson who died in St-Sulpice, Quebec, in 1718.

The modal cousin distance

If you ask a population geneticist what cousin distance to expect between two random French-Canadians who both descend from the founding cohort, the answer falls in a narrow band: roughly the eighth to eleventh cousin range, often with multiple cousin relationships at slightly different distances through different ancestor pairs.

This is exactly the distance the Hillary Clinton × Angélina Jolie example shows. It is exactly what I predicted to my son about his coworker before doing the research. It is what the math says we should expect, and what the parish registers and the genome both confirm.
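The link between "generations back" and "cousin distance" is simple enough to write down. In this sketch the inputs are the number of generations from each person up to the shared ancestor, counting parents as 1. The function names are mine, and the (10, 12) pair is only an illustrative input that produces a ninth-cousins-twice-removed label, not the documented generation counts for any real pair.

```python
def ordinal(n: int) -> str:
    """1 -> '1st', 2 -> '2nd', 9 -> '9th', 11 -> '11th', ..."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def cousin_relationship(gen_a: int, gen_b: int) -> str:
    """Label the relationship given each person's distance to the shared ancestor."""
    if min(gen_a, gen_b) < 2:
        raise ValueError("cousins need a shared ancestor at least 2 generations back")
    degree = min(gen_a, gen_b) - 1      # shared grandparents -> 1st cousins
    removed = abs(gen_a - gen_b)        # difference in generation depth
    label = f"{ordinal(degree)} cousins"
    if removed:
        times = {1: "once", 2: "twice"}.get(removed, f"{removed} times")
        label += f" {times} removed"
    return label

print(cousin_relationship(10, 10))   # shared ancestor at generation 10 for both
print(cousin_relationship(10, 12))   # illustrative unequal depths
```

A founder sitting at the tenth generation back for both people therefore makes them ninth cousins, which is why the predicted band centers where it does.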

At that genealogical depth, DNA matches become unreliable: ninth cousins related through a single ancestral couple usually share no detectable DNA at all, and when they do it is typically one short segment of a few centimorgans, often below the threshold most consumer testing services report. But the genealogical record (the parish baptisms, the marriage contracts, the burials) bridges that gap with documentary precision the genetic data alone cannot achieve. This is why, for French-Canadian genealogy specifically, the documentary record matters as much as it does. The cousinship is provable on paper even where DNA cannot detect it.
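For a rough sense of scale, the sketch below uses the standard pedigree expectation that full cousins of degree n share a fraction 2^-(2n+1) of their autosomal DNA through one ancestral couple, halved again for each generation of removal. The 6,800 cM total is an assumed convention (it puts first cousins at 850 cM); testing companies use slightly different totals. The figure is an average over all such pairs, most of whom share nothing, and it ignores the stacking of many distant links that is typical of French-Canadian pedigrees.

```python
TOTAL_CM = 6_800.0   # assumed genome total, counting both copies (illustrative)

def expected_shared_cm(degree: int, removed: int = 0, total_cm: float = TOTAL_CM) -> float:
    """Expected shared cM for full cousins linked through one ancestral couple."""
    if degree < 1 or removed < 0:
        raise ValueError("degree must be >= 1 and removed >= 0")
    return total_cm * 2.0 ** -(2 * degree + 1 + removed)

for degree in (1, 3, 5, 9):
    print(f"{degree} cousins: {expected_shared_cm(degree):.4f} cM expected")
print(f"9th cousins twice removed: {expected_shared_cm(9, removed=2):.4f} cM expected")
```

The expectation at ninth-cousin depth is a small fraction of one centimorgan, far below any reporting threshold, which is the quantitative reason the paper trail has to carry the proof.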

What this means in practice

The rule of thumb I have used across cases comes out of this math more than out of anecdotal observation. If two people both have meaningful French-Canadian ancestry reaching back to the founding period — one or two great-grandparents born in Quebec, say, with deeper roots from there — the probability that they share multiple common ancestors at the 8th-to-11th cousin range is essentially one. The genealogical exercise is not to discover whether they are related, but to identify which lines, and which ancestors, are the bridges.

For my own family, the bridge to my son's coworker is Marguerite Gaulin and her daughter Marie Crête. Marie had three marriages over twenty-two years — to Pierre Pepin, to Jean Brideau, to Pierre Soudain dit Bellerose. My descent passes through husband number two. The coworker's descent passes through husband number one. Two distinct genealogical lines, both originating in a single Beauport household of the 1660s, both re-emerging three and a half centuries later in adjacent cubicles.

One boundary condition belongs alongside the rule of thumb. Acadian descent is not the same founder population as the Québécois one, even when both are grouped colloquially under the same "French-Canadian" label. Acadia's seventeenth-century founders settled the Maritime provinces — Nova Scotia, New Brunswick, Prince Edward Island — and were dispersed in the Grand Dérangement of 1755. They produced their own distinct pedigree-collapse arithmetic, through a different founder pool, with a different chronology. The math above applies specifically to descendants of the Nouvelle-France founder pool established along the St. Lawrence valley. A "French-Canadian ancestor" who turns out to be Acadian belongs to a parallel but distinct documentary chain, and the cousinhood calculus runs through a different set of founders.

This is the kind of finding that gives the math its weight. The numbers are not abstract. The 8,500 settlers are real people whose names are in the parish registers. The 800 filles du roi and 262 filles à marier are real women whose marriages are documented to the day. The cousins they produced are all of us.

If you have a French-Canadian ancestor, we are probably related too

Storyline Genealogy can identify the line. The documentary record of seventeenth-century New France is among the most complete of any colonial population, and the cousin-bridge from your family to any other French-Canadian descendant is almost always provable in writing.


Sources and further reading

Project BALSAC, Université du Québec à Chicoutimi. The most comprehensive population-genealogy database for any human group. balsac.uqac.ca

Programme de recherche en démographie historique (PRDH), Université de Montréal. The foundational Early Quebec Population Register. prdh-igd.com

Anglehart, A., Vézina, H., Roy-Gagnon, M.-H., et al. Deciphering the genetic structure of the Quebec founder population using genealogies. European Journal of Human Genetics (2023). The 8,500-founder figure and the 1750 kinship-structure date are drawn from this paper.

Moreau, C., Bhérer, C., Vézina, H., Jomphe, M., Labuda, D., & Excoffier, L. Deep human genealogies reveal a selective advantage to be on an expanding wave front. Science (2011). Foundational study of the spatial dynamics of the Quebec founder population.

Roberts, Gary Boyd. The Royal Descents, Notable Kin, and Printed Sources. New England Historic Genealogical Society, 2008. The documented genealogies underlying the Hillary Clinton × Angélina Jolie ninth-cousin finding and related celebrity-descent research.

Landry, Yves. Les filles du roi au XVIIe siècle. Leméac, 1992. The canonical reference on the filles du roi and their marriages.

Charbonneau, Hubert, et al. Naissance d'une population: les Français établis au Canada au XVIIe siècle. Presses Universitaires de France & Presses de l'Université de Montréal, 1987. The foundational demographic study of the New France founding population.
