scnRCA is a Java program that obtains the self-consistent reference set of a given genome using the nRCA (or CAI) codon bias index. The self-consistent reference set is the set of genes within the genome that possess a dominant codon bias, in the sense that ranking all genes in the genome with a codon usage index built from that set picks out the same set as the top-scoring group of genes. When translational bias is present, the self-consistent reference set is likely to be populated by genes with heavy translational bias, although other biases, such as GC content, can confound the algorithm. We have shown that nRCA outperforms CAI at identifying the self-consistent reference set in biased genomes.
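To make "a codon usage index based on such a set" concrete, the sketch below computes CAI from a reference set following the standard definition (relative adaptiveness of each codon within its synonymous family, then the geometric mean over a gene's codons). It is an illustration only and not scnRCA's own code: the class `CaiSketch`, its method names, and the caller-supplied `codonToAminoAcid` map are hypothetical, genes are assumed to be in-frame upper-case coding sequences, and skipping codons that have no weight is an assumption of this sketch. nRCA is defined in the paper cited below and is not reproduced here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CaiSketch {

    /** Relative adaptiveness per codon: its count in the reference set divided by
     *  the highest count among its synonymous codons. */
    static Map<String, Double> weights(List<String> referenceGenes,
                                       Map<String, String> codonToAminoAcid) {
        // Count codons across all reference genes (codons missing from the map are ignored).
        Map<String, Integer> counts = new HashMap<>();
        for (String gene : referenceGenes) {
            for (int i = 0; i + 3 <= gene.length(); i += 3) {
                String codon = gene.substring(i, i + 3);
                if (codonToAminoAcid.containsKey(codon)) {
                    counts.merge(codon, 1, Integer::sum);
                }
            }
        }
        // Highest codon count within each amino acid's synonymous family.
        Map<String, Integer> maxPerAminoAcid = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            maxPerAminoAcid.merge(codonToAminoAcid.get(e.getKey()), e.getValue(), Integer::max);
        }
        Map<String, Double> w = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int max = maxPerAminoAcid.get(codonToAminoAcid.get(e.getKey()));
            w.put(e.getKey(), e.getValue() / (double) max);
        }
        return w;
    }

    /** CAI of one gene: geometric mean of the weights of its codons.
     *  Assumption of this sketch: codons without a weight are skipped. */
    static double cai(String gene, Map<String, Double> w) {
        double logSum = 0.0;
        int n = 0;
        for (int i = 0; i + 3 <= gene.length(); i += 3) {
            Double wi = w.get(gene.substring(i, i + 3));
            if (wi != null) {
                logSum += Math.log(wi);
                n++;
            }
        }
        return n == 0 ? 0.0 : Math.exp(logSum / n);
    }
}
```

Scoring every gene in the genome with `cai` and weights derived from a candidate set, then checking whether the top-scoring genes are that same set, is the self-consistency test described above.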
scnRCA is written entirely in Java. Based on the iterative self-consistent reference set algorithm of Carbone et al.*, scnRCA computes the self-consistent reference set from a GenBank genome file. It performs expectation maximization on the source genome, progressively partitioning the genome into ranked fractions until it homes in on a final reference set that stably ranks itself the highest. The percentage of genes in the genome that makes up the final reference set can be specified by the user and defaults to 1%. The user may also specify the denominator used in genome partitioning (2 by default), the maximum number of iterations, and whether the search should start at different random points in the genome.
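The iteration described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not scnRCA's actual source: the class `SelfConsistentSetSketch`, the method `findReferenceSet`, and the `indexFrom` callback (which builds a scoring function from a reference set, for example nRCA or CAI) are hypothetical names. The sketch starts from the whole genome, divides the retained fraction by the denominator at each step until the final fraction is reached, and stops when the reference set no longer changes. GenBank parsing and random starting points are not covered.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.function.Function;
import java.util.function.ToDoubleFunction;

class SelfConsistentSetSketch {

    /**
     * @param genes         all genes in the genome (any representation G)
     * @param indexFrom     hypothetical callback: builds a gene-scoring index from a reference set
     * @param finalFraction fraction of the genome kept in the final set (e.g. 0.01 for 1%)
     * @param denominator   factor by which the retained fraction shrinks per iteration (e.g. 2)
     * @param maxIterations upper bound on the number of iterations
     */
    static <G> List<G> findReferenceSet(List<G> genes,
                                        Function<List<G>, ToDoubleFunction<G>> indexFrom,
                                        double finalFraction,
                                        double denominator,
                                        int maxIterations) {
        if (genes.isEmpty()) {
            return new ArrayList<>();
        }
        List<G> reference = new ArrayList<>(genes); // assumption: start from the whole genome
        double fraction = 1.0;
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            // Build the index from the current reference set and rank every gene with it.
            ToDoubleFunction<G> score = indexFrom.apply(reference);
            List<G> ranked = new ArrayList<>(genes);
            Comparator<G> byScore = Comparator.comparingDouble(score);
            ranked.sort(byScore.reversed()); // highest score first; ties keep genome order

            // Shrink the retained fraction, but never below the final fraction.
            fraction = Math.max(fraction / denominator, finalFraction);
            int size = (int) Math.round(fraction * genes.size());
            size = Math.min(genes.size(), Math.max(1, size));
            List<G> next = new ArrayList<>(ranked.subList(0, size));

            // Self-consistent: at the final size, the set selects itself again.
            boolean stable = fraction == finalFraction
                    && new HashSet<>(next).equals(new HashSet<>(reference));
            reference = next;
            if (stable) {
                break;
            }
        }
        return reference;
    }
}
```

With the defaults mentioned above, a call would pass `0.01` as the final fraction and `2.0` as the denominator. If `maxIterations` is reached first, the sketch returns the last set it computed, which is then not guaranteed to be self-consistent.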
If you use scnRCA in your research, please cite:
O’Neill PK, Erill I. (2013) “scnRCA: a novel method to detect consistent patterns of translational selection in mutationally-biased genomes”, PLoS ONE, 8(10):e76177.
* Carbone A, Zinovyev A, Képès F. (2003) “Codon adaptation index as a measure of dominating codon bias”, Bioinformatics, 19(16):2005-2015.