On Randomness, Determinism, False Dichotomies and Cancer

Before I start – a short summary

[1] A recent paper attributed a large proportion of the variation in cancer incidence across different tissues to the number of stem cell divisions in them, and to stochastic errors in cell division.

[2] The paper grouped tumour types with known external causes as “deterministic” and those without as “stochastic”.

[3] I have seen people who are hostile to the notion of stochasticity in cancer postulate other deterministic factors, with the implicit assumption that anything called stochastic is really a deterministic process whose causes are as yet undiscovered.

[4] Here I explain why processes with known causes are still stochastic. That is the root of my gripe with both the misunderstanding that has permeated discussion of the paper and the paper's own iffy grouping of tumours into stochastic and deterministic ones. My assertion is that even those cancers strongly driven by external carcinogens involve randomness/stochasticity.


Sooo, last week, a paper was published in the journal Science that linked the number of stem cell divisions in normal tissues to the rates of incidence of cancer in those tissues. The more stem cell divisions a tissue goes through, it turns out, the more prone that tissue is to developing cancers in populations.

The paper can be found here: http://www.sciencemag.org/content/347/6217/78 . Where I quote without further reference, the quote is from this paper.

To quote, the abstract reads…


Some tissue types give rise to human cancers millions of times more often than other tissue types. Although this has been recognized for more than a century, it has never been explained. Here, we show that the lifetime risk of cancers of many different types is strongly correlated (0.81) with the total number of divisions of the normal self-renewing cells maintaining that tissue’s homeostasis. These results suggest that only a third of the variation in cancer risk among tissues is attributable to environmental factors or inherited predispositions. The majority is due to “bad luck,” that is, random mutations arising during DNA replication in normal, noncancerous stem cells. This is important not only for understanding the disease but also for designing strategies to limit the mortality it causes.

Much of the reaction I’ve seen to the paper on the cybersphere involves a fundamental misunderstanding of the processes that drive cancer. Far too many people think that things cause cancers deterministically, and even the authors group cancers into stochastic and deterministic ones in Figure 2, conveying the impression that some cancers are caused and others are due to chance. There are several good summaries for laypeople already on the web, ranging from the almost always excellent David Gorski’s post http://www.sciencebasedmedicine.org/is-cancer-due-mostly-to-bad-luck/ to PZ Myers’ explanation of the paper http://freethoughtblogs.com/pharyngula/2015/01/03/cancer-bad-genes-or-bad-luck/ . David’s post in particular describes the trainwreck that the media misinterpretation of stochastic errors as “bad luck” has led to.

To summarise all of that – the paper says that differences in the incidence of cancers amongst different tissues can be mostly explained by the number of stem cell divisions, and that known environmental factors and genetic predisposition explain only about a third of why different tissues get cancers at different rates. The authors postulate that mutations accumulate with the number of stem cell divisions because of stochastic or chance errors in cell division.
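As a rough check on where the “third” in the abstract comes from: under the usual linear-fit reading, squaring a correlation coefficient gives the fraction of variance explained. A minimal sketch, assuming that is how the figure was arrived at (the variable names are mine, not the paper's):

```python
# Correlation quoted in the abstract between lifetime cancer risk and
# total stem cell divisions across tissues.
r = 0.81

# Under a simple linear fit, r squared is the fraction of variance explained.
explained = r ** 2
unexplained = 1 - explained

print(f"explained by stem cell divisions: {explained:.0%}")
print(f"left over for everything else:    {unexplained:.0%}")
```

The leftover share comes out at roughly a third, and it refers to variation among tissues, which is what the abstract's wording is about.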

This led David Colquhoun, on Twitter, to note that a lot of the opposition to this finding seemed to come from people who opposed the role of chance in driving cancers… and he is right about the amazing indignation of those reacting with hostility to the role of chance, for reasons I will tell you in a little bit.



Additionally, there were people positing that it couldn’t be chance and that there must be some undiscovered latent factor or factors. And so a dichotomy was set up between stochastic (assumed to mean without cause) and deterministic (assumed to mean with causes) in what passed for discourse amongst those that did protest too much.

Where the paper gets it right…

Coming to the paper itself, there are bits I like. It is quite elegant evidence for the role of stem cell divisions in driving the evolution of cancer. In some cases tumours can be latent for a long time before they present clinically; previous studies have reported lung cancer lying latent for up to two decades before the tumour showed up. It turns out you need multiple mutations to go from a normal cell to a cancerous cell, and cellular lineages that persist longer (and go through more divisions) are likelier to acquire the full complement of mutations.
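To see why more divisions make the full complement likelier, here is a toy calculation. It is only a sketch under simplifying assumptions of my own: every required mutation arises independently with the same made-up probability per division, and is never lost once acquired. The function name and the numbers are hypothetical.

```python
def prob_all_hits(divisions: int, hits_needed: int, p_hit: float) -> float:
    """Chance a lineage has picked up every required mutation.

    Toy assumptions: each required mutation arises independently with
    probability p_hit per division, and once acquired it is never lost.
    """
    p_one = 1 - (1 - p_hit) ** divisions  # this mutation seen at least once
    return p_one ** hits_needed


# Same per-division probability, increasing numbers of divisions.
for n in (100, 1_000, 10_000):
    print(n, prob_all_hits(n, hits_needed=3, p_hit=1e-4))
```

The per-division probability never changes here; only the number of chances does, and the probability of collecting all the hits climbs steeply with it.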

What I disagree with in the paper is the authors lumping together everything that is not attributable to external factors as the product of stochastic errors in cell division.

Importantly, my gripe is not that they attribute it to a stochastic process, because, as I shall explain, almost all mutations, even those with known causes, are still random. My problem is with them putting it all down to cell division and DNA replication. It turns out there are loads of internal cellular processes, neither inherited nor products of environmental factors, that can generate mutations. Of course, all of these are still stochastic, but phrasing everything as errors in cell division is something I find too vague.

One such internal process is the spontaneous deamination of cytosines at CpG dinucleotides, which generates age-related mutations; by and large these comprise a mutational signature that is found ubiquitously across cancers of different types. http://www.sanger.ac.uk/about/press/2013/130814.html

Wondering what a mutational signature is? Well, DNA is made of four bases. When you look at which mutations (DNA sequence changes) have taken place in tumours compared to normal tissue, you can also look at the DNA sequence around each mutated base, and it turns out that certain processes generate mutations in certain sequence contexts. I’ve blogged about this earlier in the context of APOBEC enzymes, which accidentally mutate human DNA and can potentially cause cancer.

So, the point I am trying to make here is that not every mutation that isn’t caused by inheritance or exposure to external carcinogens is down to errors in DNA replication during cell division, unless you use the term for all mutations generated internally, in which case it loses nuance. Indeed, there is extensive documentation of internal mutagenic processes, the repair pathways that deal with the lesions they produce, and so on and so forth… http://www.nature.com/nrg/journal/v15/n9/full/nrg3729.html

However, this is the important bit – even if causes are known, external or internal, this does not mean they are deterministic. I reiterate: mutations with known causes are still random, and chance still plays a massive role.

How can things have causes and yet be random?!

The trouble is that the popular use of the word “random” differs from its scientifically and statistically rigorous usage. In common parlance, people often assume that random means “for no reason” or “with no cause”, like “She turned up wearing a hat, totally random, blud”.

In science, it means “following a probability distribution”, where uncertainty is involved. This is why we talk of risks. Smoking causes lung cancer, there is a mutational signature associated with smoking, and many of the mechanisms by which cigarette smoke induces mutations are well known. However, the relationship between smoking and lung cancer is not deterministic, i.e., not everyone who smokes heavily gets lung cancer. What smoking does is increase one’s chances of getting lung cancer. This is why the relationship between lung cancer and smoking is stochastic (chance is involved).
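A small simulation makes the distinction concrete. This is only a sketch: the two risk values are invented for illustration and are not real epidemiological estimates, and `simulate_cohort` is a hypothetical helper of mine.

```python
import random


def simulate_cohort(n_people: int, risk: float, rng: random.Random) -> int:
    """Count how many of n_people develop the disease, each independently
    with probability `risk`."""
    return sum(rng.random() < risk for _ in range(n_people))


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 100_000

# Illustrative lifetime risks only, NOT real epidemiological estimates.
RISK_NON_SMOKER = 0.01
RISK_SMOKER = 0.15

cases_non_smokers = simulate_cohort(n, RISK_NON_SMOKER, rng)
cases_smokers = simulate_cohort(n, RISK_SMOKER, rng)

print(f"non-smokers with cancer: {cases_non_smokers / n:.1%}")
print(f"smokers with cancer:     {cases_smokers / n:.1%}")
```

The cause is fully known in this toy world (it is the `RISK_SMOKER` line), yet most simulated smokers never get the disease and some non-smokers do. A known cause shifts the probability; it does not fix the outcome for any individual.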

Likewise, APOBEC enzymes cause very specific mutations and have a specific mutational signature associated with them: they change C to T or C to G when there is a T before and an A or a T after, i.e., TCW -> TTW or TGW (W being shorthand for A or T). However, the mutations induced by APOBEC enzymes are still random.
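The sequence context can be written down as a pattern match. The sketch below takes W in its standard IUPAC sense (A or T), looks at the forward strand only, and uses a made-up sequence; `tcw_sites` and `mutate` are hypothetical helper names.

```python
import re

# T, then the target C, then W (A or T). The lookahead checks the trailing
# base without consuming it.
TCW = re.compile(r"TC(?=[AT])")


def tcw_sites(seq: str) -> list[int]:
    """Return 0-based positions of the C in every TCW motif (forward strand only)."""
    return [m.start() + 1 for m in TCW.finditer(seq.upper())]


def mutate(seq: str, pos: int, new_base: str) -> str:
    """Return seq with the base at pos replaced by new_base."""
    return seq[:pos] + new_base + seq[pos + 1:]


seq = "GATCAGGTCTAATCGTTCA"  # made-up sequence for illustration
sites = tcw_sites(seq)
print(sites)                       # positions of the candidate Cs
print(mutate(seq, sites[0], "T"))  # TCW -> TTW at the first site
print(mutate(seq, sites[0], "G"))  # TCW -> TGW at the first site
```

The pattern says which sites are eligible, and that is the specific part. It says nothing about which eligible site actually gets hit.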

How can this be the case?

Well, it turns out that for cancers to develop, you need mutations or epigenetic changes in certain types of genes: those that control the ability of a cell (or in this case, a lineage of cells) to acquire the hallmarks of cancer, i.e., the ability to grow without external stimulation, to escape cell death, to escape the immune system, et cetera. Not all regions of the human genome harbour genes that can cause cancer. So there are, for instance, plenty of TCW sites in the genome where an APOBEC-induced mutation would not affect any cancer-associated gene.

So while we know that a molecule of APOBEC can act upon a TCW site to mutate it at a given rate, which TCW site in the genome gets mutated is down to chance, and whether the mutation gets repaired also involves an element of chance. This is how even a factor with a well-defined mode of action can still make random mutations.
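The two layers of chance can be sketched as a simulation. Everything numerical here is invented: the number of TCW sites, how many of them sit in cancer-associated genes, and the repair probability are all hypothetical, as is the assumption that every site is equally likely to be hit.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical toy genome: 10,000 TCW sites, of which only 50 sit in genes
# where a mutation could contribute to cancer. Both numbers are invented.
N_SITES = 10_000
cancer_gene_sites = set(range(50))

P_REPAIR = 0.9  # invented chance that the lesion is repaired


def one_apobec_event(rng: random.Random) -> str:
    """Simulate a single APOBEC hit and report what became of it."""
    site = rng.randrange(N_SITES)  # which site is hit: chance
    if rng.random() < P_REPAIR:    # whether it is repaired: chance
        return "repaired"
    return "in cancer gene" if site in cancer_gene_sites else "elsewhere"


outcomes = [one_apobec_event(rng) for _ in range(100_000)]
for label in ("repaired", "elsewhere", "in cancer gene"):
    print(label, outcomes.count(label))
```

The enzyme's mode of action is identical in every simulated event; what differs from event to event is where it lands and whether the damage sticks.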

On top of this, chance is involved in which combinations of mutations occur in a cell or its lineage. Whether the right combination of cancer-causing mutations happens is a matter of chance, and whether the lineage evolves sufficiently to evade the immune system is a matter of chance. Chance is everywhere – cancer evolution is, fundamentally, a stochastic phenomenon.

Additionally, mutations happen at different sites in the genome with different probabilities, but whether a given mutation lands in a cancer-related gene, and whether it then gives the cells that carry it a selective advantage, is a matter of chance. This is why cancers contain both driver mutations, which confer growth advantages, and passenger mutations, which don’t.

This leads me to my main bone of contention with the paper: along with the iffy statistics of the second figure, the authors group tumours into stochastic and deterministic classes.

They also make this clanger:

We refer to the tumors with relatively high ERS as D-tumors (D for deterministic; blue cluster in Fig. 2) because deterministic factors such as environmental mutagens or hereditary predispositions strongly affect their risk. We refer to tumors with relatively low ERS as R-tumors (R for replicative; green cluster in Fig. 2) because stochastic factors, presumably related to errors during DNA replication, most strongly appear to affect their risk.

It turns out that they are all stochastic, because, for instance, exactly where in the genome smoke-derived carcinogens induce mutations is down to chance. Smoke and environmental mutagens are not deterministic factors, nor are any internal mutational processes.

