Genomic Feature: Any genomic region with some annotated function. E.g. a gene.
Homologous: Genomic features that share a common ancestry. Although there are different definitions of this term, for comparative genomics work, this definition is the simplest. If two genes are said to be homologous with respect to one another, then they share a common ancestral gene. In such a case, these genes are homologs. With this definition, phrases such as "95% homologous" are nonsensical. Things are either homologous, or not. The only exception would be gene fusion events, whereby a new gene is created by the fusion of two other genes (or domains thereof).
Orthologous: A special case of homology where the genomic features in question (e.g. genes) are in different organisms. Orthologs arise from speciation events.
Perfect orthologs: Orthologs that retain synteny.
Paralogs: A special case of homology where duplicate genes or regions are in the same genome.
Homeologs: A special case of paralogy resulting from polyploidy.
Syntelog: A special case of orthologs and/or paralogs where genes are derived from the same ancestral genomic region.
Synteny: A valid deduction that two chromosomal regions are derived from a single ancestral chromosomal region. This deduction is based on collinearity data. If two regions have obvious collinear features, they are syntenic. However, many noncollinear regions may be logically rearranged (e.g. through inversions) into a common ancestor, and thus also be syntenic.
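As a minimal sketch of the collinearity test behind the synteny entry above, the Python below checks whether matched features keep the same (or fully reversed) order in a second region. The function name and the representation of a region as a list of positions are assumptions made for illustration, not part of any particular tool.

```python
def is_collinear(positions_in_b):
    """Return True if features, listed in their order along region A,
    appear in strictly increasing or strictly decreasing order in region B.

    positions_in_b: hypothetical list of positions in region B, one per
    matched feature, ordered by the feature's position in region A.
    """
    pairs = list(zip(positions_in_b, positions_in_b[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)  # whole region inverted
    return increasing or decreasing


print(is_collinear([1, 2, 3, 4]))        # same order in both regions
print(is_collinear([4, 3, 2, 1]))        # entire region inverted
print(is_collinear([1, 2, 5, 4, 3, 6]))  # internal inversion breaks collinearity
```

The last case fails this simple test even though a single inversion would restore the shared order, which is why noncollinear regions can still be syntenic.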
Fractionation: The mechanism by which a duplicated genomic feature is lost from one of the duplicate regions but not the other. For example, if a genome is duplicated via tetraploidy, every genomic feature in the genome is also duplicated. Over evolutionary time, most of the duplicated features are lost from one genomic region or the other (although a significant fraction may be retained in duplicate, which we can detect to identify syntenic regions). Post-tetraploid genome fractionation will cause a genome to return to its pre-duplication state in terms of total gene content (but not its gene order).
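The bookkeeping in the fractionation entry above can be illustrated with a small Python sketch. The gene names and the particular pattern of losses are invented for the example, and the function name is hypothetical.

```python
def summarize_fractionation(ancestral, region_a, region_b):
    """Compare two duplicate regions against the pre-duplication gene set."""
    anc, a, b = set(ancestral), set(region_a), set(region_b)
    return {
        "retained_in_duplicate": sorted(a & b),
        "lost_from_a": sorted(anc - a),
        "lost_from_b": sorted(anc - b),
        # True when every ancestral gene survives in at least one region
        "content_preserved": (a | b) == anc,
    }


# Hypothetical ancestral region and its two fractionated duplicates
ancestral = ["g1", "g2", "g3", "g4", "g5"]
region_a = ["g1", "g2", "g4"]
region_b = ["g2", "g3", "g5"]

summary = summarize_fractionation(ancestral, region_a, region_b)
print(summary)
```

Here g2 is retained in duplicate, each of the other genes survives in only one region, and the combined gene content still equals the ancestral set.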
Fractionation is a natural process where genomic regions that contain genes, regulatory elements, or other genomic features are lost over evolutionary time. This process is especially prevalent in plant genomes with a history of polyploidy events. After the duplication of a genomic region, or the entire genome as is the case with tetraploidy, most of the duplicated genomic features are lost from one homeolog or the other. This is assumed to happen because there is no positive or purifying selection for the retention of these features in duplicate. (However, it is important to note that there are several classes of genomic features that are selected to be retained following genome duplication events. An example of such a class is transcription factor genes.)
When conserved noncoding sequences (CNSs) are fractionated, this is evidence that the cis-regulation of their retained nearby genes is being subfunctionalized. Since the regulatory function of CNSs is inferred, we do not know how they may be regulating gene function (e.g. as enhancers or repressors). However, since they are usually clustered near a single gene in plants, we can assume that their presence is usually linked to that gene. As such, when a CNS is found by comparison to an outgroup genomic region (e.g. rice in our example), and not found in the neighboring gene's homeolog, then the ancestral function of that CNS is retained in one homeolog and not in the other. When regulatory elements are fractionated, then the function of the gene has been subfunctionalized. For example, let's say there are two cis-regulatory elements in a hypothetical ancestor gene, one responsible for regulating gene expression in leaves, and the other for regulating gene expression in roots. If this gene is duplicated and one duplicate loses the root regulatory element and the other loses the leaf regulatory element, then the gene's function has subfunctionalized.
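The outgroup comparison described above can be sketched as follows. This is a simplified illustration that assumes CNSs have already been detected and given shared labels; the labels and the function name are hypothetical.

```python
def classify_cns(outgroup_cns, homeolog1_cns, homeolog2_cns):
    """Label each CNS found via the outgroup by where it is retained."""
    result = {}
    for cns in outgroup_cns:
        in1 = cns in homeolog1_cns
        in2 = cns in homeolog2_cns
        if in1 and in2:
            result[cns] = "retained in both homeologs"
        elif in1:
            result[cns] = "retained in homeolog 1 only"
        elif in2:
            result[cns] = "retained in homeolog 2 only"
        else:
            result[cns] = "lost from both homeologs"
    return result


# The leaf/root example: each duplicate has lost a different element
outgroup = ["leaf_element", "root_element"]
fates = classify_cns(outgroup, {"leaf_element"}, {"root_element"})
print(fates)
```

A CNS labelled as retained in only one homeolog is a fractionated CNS in the sense used above.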
CNS: Conserved noncoding sequence. Conservation of genomic sequence that does not encode protein. When detected, CNSs indicate that the genomic region has a function related to its primary sequence. CNSs are often operationally (or methodologically) defined by the sequence alignment algorithms used for their detection and a cutoff imposed on the percent of sequence identity and length. CNSs have been extensively characterized in vertebrate and plant lineages. Vertebrate CNSs have been defined as >= 100bp long with >70% sequence identity; plant CNSs have been defined as >= 15bp long with a blast e-value <= that of a 15/15bp exact match. However, other operational definitions will work as well using different alignment algorithms.
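The two operational cutoffs quoted above can be written as simple filters. This is only a sketch: the Hit record and function names are hypothetical, and the e-value of a 15/15bp exact match is passed in as a parameter because it depends on the search being run.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical summary of one noncoding alignment."""
    length: int      # aligned length in bp
    identity: float  # percent sequence identity
    evalue: float    # blast e-value of the alignment


def is_vertebrate_cns(hit):
    # >= 100bp long with > 70% sequence identity
    return hit.length >= 100 and hit.identity > 70.0


def is_plant_cns(hit, evalue_15_of_15):
    # >= 15bp long with an e-value no worse than a 15/15bp exact match;
    # evalue_15_of_15 must be supplied by the caller (assumption)
    return hit.length >= 15 and hit.evalue <= evalue_15_of_15


# Illustrative values only, not real alignment results
long_hit = Hit(length=120, identity=82.5, evalue=1e-30)
short_hit = Hit(length=18, identity=100.0, evalue=0.001)
print(is_vertebrate_cns(long_hit), is_vertebrate_cns(short_hit))
print(is_plant_cns(short_hit, evalue_15_of_15=0.01))
```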
Useful divergence: When detecting conserved sequences between genomic regions (usually non-coding sequence), the sequences need to have diverged for long enough that sequences not under selection will have been randomized, but not for so long that detecting conservation is impossible. This means that there is a useful "window" of divergence between genomic regions for detecting conserved sequences. For example, siblings are not diverged enough and much conserved sequence is expected due to their recent ancestry, whereas comparing a bacterium to an archaeon is useless for finding conserved noncoding sequences. CNSs have been detected in vertebrate lineages spanning humans to fish (~400 My divergence), while the most significant CNSs tend to disappear after ~50 My divergence in plants.
Genespace: A genomic region containing a gene and its neighboring cis-acting regulatory sequences. Genespaces are computationally defined by, for example, finding all the CNSs around a gene and using the positions of the outermost CNS or genic feature (UTR) on either side of the gene.
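One way to compute a genespace as described above is to take the outermost coordinate on each side of the gene. The sketch below assumes that the gene coordinates already include its UTRs and that CNSs are given as (start, end) pairs on the same coordinate system; the function name and coordinates are illustrative.

```python
def genespace(gene_start, gene_end, cns_intervals):
    """Return (start, end) spanning the gene and all of its associated CNSs.

    gene_start, gene_end: gene boundaries including UTRs (assumed).
    cns_intervals: list of (start, end) tuples for CNSs linked to the gene.
    """
    starts = [gene_start] + [start for start, _ in cns_intervals]
    ends = [gene_end] + [end for _, end in cns_intervals]
    return min(starts), max(ends)


# Hypothetical gene with two flanking CNSs and one intronic CNS
bounds = genespace(1000, 2000, [(700, 730), (2100, 2150), (1500, 1520)])
print(bounds)  # (700, 2150)
```

When no CNSs are found, the genespace reduces to the gene boundaries themselves.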
Subfunctionalization: The evolutionary process by which each duplicate gene or cis-acting feature loses a different, complementary part of its ancestral function, but combined, they retain the full complement of their ancestral functions. For protein coding genes, subfunctionalization can happen at the level of protein function, or can happen in cis-acting regulatory sequences (e.g. CNSs). An example would be a gene with essential functions A and B that is duplicated. One duplicate loses function A; the other loses function B. As long as they are both present, they retain the full function (A and B) of their preduplicated state. Examples of functions A and B are protein binding and phosphorylation, or expression in roots and leaves.
Neofunctionalization: The evolutionary process for a duplicated genomic feature by which one duplicate retains the complete ancestral function and the other evolves a new function.
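The definitions of subfunctionalization and neofunctionalization can be contrasted with a short sketch that treats each gene's functions as a set of labels. This is a deliberate simplification for illustration; the function name, labels, and the "other" category are assumptions, not an established classification method.

```python
def classify_duplicate_fate(ancestral, copy1, copy2):
    """Classify a duplicate pair from sets of function labels."""
    ancestral, copy1, copy2 = set(ancestral), set(copy1), set(copy2)
    kept1, kept2 = copy1 & ancestral, copy2 & ancestral  # ancestral functions kept
    new1, new2 = copy1 - ancestral, copy2 - ancestral    # functions not in ancestor

    # One copy keeps the complete ancestral function, the other gains a new one
    if (kept1 == ancestral and new2) or (kept2 == ancestral and new1):
        return "neofunctionalization"
    # Both copies unchanged relative to the ancestor
    if kept1 == ancestral and kept2 == ancestral:
        return "retained unchanged"
    # Each copy lost something, but together they cover the ancestral functions
    if kept1 != ancestral and kept2 != ancestral and (kept1 | kept2) == ancestral:
        return "subfunctionalization"
    return "other"


print(classify_duplicate_fate({"A", "B"}, {"B"}, {"A"}))       # subfunctionalization
print(classify_duplicate_fate({"A", "B"}, {"A", "B"}, {"C"}))  # neofunctionalization
```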