Right, it is not often that I write two paper reviews in a row but I think this paper is interesting enough to review. The paper in question is “A census of amplified and overexpressed human cancer genes, Santarius et al, Nature Reviews Cancer, doi:10.1038/nrc2771”. I will be back to writing non-specialist articles for laymen from the next article I post.
Now, unfortunately, the paper needs institutional access (NRC and its paywalls), so you will have to find a way of getting a copy of the paper yourself. The apposite NRC page where further access may be sought is here
The abstract provides a decent introduction to what is covered in the paper, and I will elaborate upon the details later in the post. I quote the abstract verbatim.
Abstract | Integrated genome-wide screens of DNA copy number and gene expression in human cancers have accelerated the rate of discovery of amplified and overexpressed genes.
However, the biological importance of most of the genes identified in such studies remains unclear. In this Analysis, we propose a weight-of-evidence based classification system for identifying individual genes in amplified regions that are selected for during tumour development.
In a census of the published literature we have identified 77 genes for which there is good evidence of involvement in the development of human cancer.
This paper describes a system that sets out criteria for using evidence reported in the literature to identify genes within amplified regions that are selected for during cancer development (the fact that something is selected for in cancer cells is a strong pointer to it being critical to disease phenotypes).
The problem the authors address is this: there are regions that are significantly amplified in cancer cells (that is, the number of copies of particular genes, called the copy number, is increased relative to normal cells through duplication of chromosomal regions), but it is difficult to identify which genes in these amplified segments (called amplicons) actually drive cancer. They note that until this paper was published, the Cancer Gene Census had just a paltry six entries for genes that were causally implicated in cancer due to amplification and consequent overexpression, as opposed to a total of 384 genes that were causally implicated in cancer through other mutagenic processes. They emphasize that the list was this short because of a lack of reliable data.
In the context of the paper, they define an amplified gene as one that has a somatically acquired increase in copy number and is overexpressed because of it. They note that the evidence for this link varies in strength. In some cases we only have data showing that a gene lies in an amplified region, which on its own tells us nothing about whether it is a driver. In some cases we have evidence that amplification is linked to overexpression. In others we have stringent data showing that knocking such genes out perturbs a cancer phenotype, or that copy number is correlated with clinical outcome. It makes sense to put all of this into perspective with a system that takes the strength of evidence into consideration, and such a system is what the paper describes.
Details of the Classification System.
The system assigns points to genes for which evidence is present. I present a graphical summary of the scoring and classification system below; I think it does a good job of making the details clear.
I think the table is pretty much self-explanatory. It is, however, useful in my opinion to reiterate that in this system each type of evidence is awarded one point, so all types are weighted equally. Class IV genes have only evidence of being in an amplified region, Class III genes have to score at least 1 point, Class II genes at least 3 points, and Class I genes must be demonstrated, preferably in clinical trials, to be viable therapeutic targets where blocking the gene or the gene product improves clinical outcome.
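To make the rules above concrete, here is a minimal Python sketch of how such a scoring scheme might look. This is my own illustration and not code from the paper: the function name, the evidence labels and the `clinically_validated` flag are all hypothetical, the thresholds are simply the ones quoted in this post, and I have assumed that clinical validation alone is enough for Class I.

```python
def classify_gene(evidence, clinically_validated=False):
    """Return 'I', 'II', 'III' or 'IV' for one gene in one cancer type.

    evidence: collection of distinct evidence-type labels (hypothetical names),
        each worth exactly one point.
    clinically_validated: True if blocking the gene or its product has been
        shown to improve clinical outcome (assumed sufficient for Class I).
    """
    points = len(set(evidence))  # each evidence type counts once, equally weighted
    if clinically_validated:
        return "I"
    if points >= 3:
        return "II"
    if points >= 1:
        return "III"
    return "IV"  # only known to lie in an amplified region


print(classify_gene([]))                                # IV
print(classify_gene(["overexpressed_when_amplified"]))  # III
print(classify_gene([
    "overexpressed_when_amplified",
    "knockdown_alters_phenotype",
    "copy_number_predicts_outcome",
]))                                                     # II
```

The point of the sketch is only that the class follows mechanically from a count of independent evidence types, with Class I held back for genes that have passed the much stricter test of clinical usefulness.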
Class I genes can therefore be said to be integral to pathogenesis by amplification, Class II genes could also potentially be implicated, and Class III genes require further study. Now to heterogeneity: the authors point to instances where different genes within the same amplicon (amplified region) act as drivers in different types of cancer. For example, they point out that in lobular breast carcinoma the gene FGFR1 is a driver, whereas conventional wisdom deemed two other genes in the same amplified region to be better candidates. This has led the authors to classify genes using this system on a per-cancer basis.
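One way to picture per-cancer classification is to key the evidence by (gene, cancer type) pairs rather than by gene alone. The sketch below is purely illustrative: the gene and cancer names are placeholders, the evidence labels are invented, none of the entries are data from the paper, and the thresholds are the ones quoted earlier in this post.

```python
from collections import defaultdict

# Evidence is stored per (gene, cancer type), so the same gene can end up
# in different classes in different cancers. All entries are placeholders.
evidence_by_context = defaultdict(set)
evidence_by_context[("GENE_A", "cancer_type_1")].update({
    "overexpressed_when_amplified",
    "knockdown_alters_phenotype",
    "copy_number_predicts_outcome",
})
evidence_by_context[("GENE_A", "cancer_type_2")].add("overexpressed_when_amplified")


def class_from_points(points):
    """Map a point total to Class II, III or IV (Class I needs clinical validation)."""
    if points >= 3:
        return "II"
    if points >= 1:
        return "III"
    return "IV"


per_cancer_class = {
    context: class_from_points(len(labels))
    for context, labels in evidence_by_context.items()
}

for (gene, cancer), gene_class in sorted(per_cancer_class.items()):
    print(f"{gene} in {cancer}: Class {gene_class}")
```

Here the same placeholder gene lands in Class II in one cancer type and Class III in another, which is exactly the situation a per-cancer scheme is meant to capture.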
Results & Comments
The analysis only includes amplified, unmutated (normal) genes. It does not include genes that are both mutated and amplified, nor does it include miRNAs (microRNAs), which may also be involved in dysfunctional expression and could be treated in the same way as normal genes that are amplified. For a brief introduction to miRNA, please read my post on RNA interference on this blog.
The authors have documented 62 Class III genes, 12 Class II genes and 3 Class I genes, which together make up the 77 genes mentioned in the abstract. I fully expect the number of genes in these classes to go up as more and more data from integrated genomic studies start to flood in. The system may prove particularly useful in guiding research programmes by showing what evidence still needs to be gathered for a given gene. It may also enable the identification of overexpressed genes that are integral to cancer progression and thus offer a way of identifying drug targets.
I will now go on to post a table and a diagram of the results the authors have presented in the paper.
The whole lot can be visualized using a virtual karyogram/idiogram, which follows below.
Further data about what evidence was used in the classification and scoring of amplified genes may be obtained from the supplements. You may find these supplementary materials here
To conclude, I think this shows how oncologists may try to make sense of information from a wide range of studies that provide evidence of different types and of differing strength. I also hope that you can see how such systems help organize, compare and utilize data to guide decisions ranging from which studies to carry out to which targets drug development programmes should pursue in the quest for safe, effective targeted therapies against cancer.
That is all from me as far as this paper review is concerned. I hope you enjoyed the post and the paper, and I also hope that you will take time to learn more about the Cancer Genome Project and how it is helping us understand cancer in all its complexity.