Nomenclature
Transgene
Excerpt from ILAR News 1992;34(4):45-52.
1. SYMBOL
A transgene symbol consists of three parts, all in Roman type, as follows:
TgX(YYYYYY)#####Zzz, where
TgX = mode,
(YYYYYY) = insert designation,
##### = laboratory-assigned number, and
Zzz = laboratory code.
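The layout above can be sketched as a regular expression. This is a minimal illustration, not an official grammar: the character classes (mode letters N, R, or H; up to eight alphanumeric characters for the insert designation; up to five digits; a laboratory code of one capital followed by lower-case letters) are assumptions drawn from the rules and examples in this excerpt, and the names SYMBOL_RE and parse_symbol are hypothetical.

```python
import re

# Assumed pattern for TgX(YYYYYY)#####Zzz, based on the rules in this excerpt.
SYMBOL_RE = re.compile(
    r"^Tg(?P<mode>[NRH])"                  # mode of insertion: N, R, or H
    r"\((?P<insert>[A-Za-z0-9]{1,8})\)"    # insert designation in parentheses
    r"(?P<number>[0-9]{1,5})"              # laboratory-assigned number
    r"(?P<lab>[A-Z][a-z]*)$"               # laboratory code (assumed shape)
)


def parse_symbol(symbol):
    """Return the parts of a transgene symbol as a dict, or None if it does not match."""
    match = SYMBOL_RE.match(symbol)
    return match.groupdict() if match else None


print(parse_symbol("TgN(CD8Ge)23Jwg"))
```

The pattern checks only the overall shape; the combined length limit on the insert designation and number is a separate rule and is not enforced here.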
The first part of the symbol always consists of the letters Tg (for "transgene") and a letter designating the mode of insertion of the DNA: N for nonhomologous insertion, R for insertion via infection with a retroviral vector, and H for homologous recombination. The purpose of this designation is to identify it as a symbol for a transgene and to distinguish among three fundamentally different organizations of the introduced sequence relative to the host genome, not simply to indicate the method of insertion or nature of the vector.
For example, mice derived by infection of embryos with MuLV vectors will be designated TgR, and mice derived by microinjection or electroporation of MuLV DNA into zygotes will be designated TgN; mice derived from ES cells by introduction of DNA followed by recombination with the homologous genomic sequence will be designated TgH, while mice derived by insertions of the same sequence by nonhomologous crossing-over events will be designated TgN.
When a targeted mutation introduced by homologous recombination does not involve the insertion of a novel functional sequence, the new mutant allele (often called a "knockout" mutation) will be designated in accordance with the guidelines for gene nomenclature for each species (see Lyon 1989b reproduced below). The gene nomenclature will also be used when the process of homologous recombination results in integration of a novel functional sequence if that sequence is a functional drug-resistance gene. For example, Mbp(m1Dn), with m1Dn set as a superscript, would be used to denote the first targeted mutation of the myelin basic protein gene (Mbp) in the mouse made by Muriel T. Davisson (Dn). In this example, the transgenic insertion, even if it contains a functional neomycin-resistance gene, is incidental to "knocking out" or mutating the targeted locus (see Lyon 1989b reproduced below).
The mode symbol TgH is reserved for a time in the future when homologous recombination might be employed to transfer genes to specific sites in the genome by using cloned DNA from the target site to produce a homologous recombination vector. Such target loci might be anonymous, but might exhibit important regulatory features that render them desirable for targeting transgenes. A hypothetical example is given below.
The second part of the symbol indicates the salient features of the transgene as determined by the investigator. It is always in parentheses and consists of no more than eight characters: letters (capitals or capitals and lower-case letters) or a combination of letters and numbers. Italics, superscripts, subscripts, internal spaces, and punctuation should not be used. The choice of the insert designation is up to the investigator, but the following guidelines should be used:
Short symbols (six or fewer characters) are preferred. The total number of characters in the insert designation plus the laboratory-assigned number may not exceed 11 (see following item); therefore, if seven or eight characters are used, the number of digits in the laboratory-assigned number will be limited to four or three, respectively.
The insert designation should identify the inserted sequence and indicate important features. If the insertion uses sequences from a named gene, it is preferable that the insert designation contain the standard symbol for that gene. If the gene symbol would exceed the spaces available, its beginning letters should be used. Hyphens should be omitted when normally hyphenated gene symbols are used. For example, Ins1 should be used as an important part of the insert designation in the symbols of transgenes that contain either coding or regulatory sequences from the mouse insulin gene (Ins-1). Resources are available to identify standard gene symbols (see below).
Symbols that are identical with other named genes in the same species should be avoided. For example, the use of Ins to designate "insertion" would be incorrect.
For consistency, a series of transgenic animals produced with the same construct might be given the same insert designation. However, that is not required; some lines might manifest unique and important characteristics (e.g., insertional mutations) that would warrant a unique insert designation. If two different symbols are used for the same construct in different transgenic lines, the published descriptions should clearly identify the construct as being the same in both lines. Two different gene constructs used for transgenic animal production, either within a laboratory or in separate laboratories, should not be identified by identical insert designations. Designations can be checked through the available resources (see below).
A standard abbreviation can be used as part of the insert designation (see examples). If a standard abbreviation is used, it should be placed at the end of the insert designation. The list of abbreviations will be expanded as needed and maintained by appropriate international nomenclature committees. The insert designation should identify the inserted sequence, not its location or phenotype.
Examples:
An = anonymous sequence
Ge = genomic clone
Im = insertional mutation
Nc = noncoding sequence
Rp = reporter sequence
Sn = synthetic sequence
Et = enhancer trap construct
Pt = promoter trap construct
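The rules for the insert designation lend themselves to a small checker. The sketch below is one reading of those rules, not an official validator: the function names are hypothetical, the requirement of at least one capital letter is an interpretation of "capitals or capitals and lower-case letters", and the suffix test is only a heuristic, because a gene symbol could happen to end in the same two letters as a standard abbreviation.

```python
# Standard abbreviations listed in the text.
STANDARD_ABBREVIATIONS = {
    "An": "anonymous sequence",
    "Ge": "genomic clone",
    "Im": "insertional mutation",
    "Nc": "noncoding sequence",
    "Rp": "reporter sequence",
    "Sn": "synthetic sequence",
    "Et": "enhancer trap construct",
    "Pt": "promoter trap construct",
}


def check_insert_designation(designation):
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    if not 1 <= len(designation) <= 8:
        problems.append("length must be 1 to 8 characters")
    # No italics, spaces, or punctuation: plain letters and digits only.
    if not (designation.isascii() and designation.isalnum()):
        problems.append("only letters and digits are allowed")
    # Assumed reading of the rule on capitals.
    if not any(ch.isupper() for ch in designation):
        problems.append("at least one capital letter is expected")
    return problems


def trailing_abbreviation(designation):
    """Return the standard abbreviation at the end of the designation, or None."""
    suffix = designation[-2:]
    if len(designation) > 2 and suffix in STANDARD_ABBREVIATIONS:
        return suffix
    return None


print(check_insert_designation("CD8Ge"), trailing_abbreviation("CD8Ge"))
print(check_insert_designation("Ins-1"))
```

The checker cannot tell whether a designation collides with a named gene or with another laboratory's construct; those checks still depend on the resources listed later in the text.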
LABORATORY-ASSIGNED NUMBER AND LABORATORY CODE
The third part of the symbol consists of two components: the laboratory-assigned number and the laboratory code.
The laboratory-assigned number
The laboratory-assigned number is a unique number that is assigned by the laboratory to each stably transmitted insertion when germline transmission is confirmed. As many as five characters (numbers as high as 99,999) may be used; however, the total number of characters in the insert designation plus the laboratory-assigned number may not exceed 11. No two lines generated within one laboratory should have the same assigned number. Unique numbers should be given even to separate lines with the same insert integrated at different positions. The number can have some intralaboratory meaning or simply be a number in a series of transgenes produced by the laboratory.
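The interaction between the two length limits can be worked through in a few lines. This sketch assumes only the limits stated in the text (insert designation of at most 8 characters, number of at most 5 digits, combined total of at most 11); the function names are hypothetical.

```python
MAX_INSERT_CHARS = 8     # insert designation limit
MAX_NUMBER_DIGITS = 5    # laboratory-assigned number limit
MAX_COMBINED = 11        # insert designation plus number


def max_number_digits(insert_designation):
    """Return how many digits the laboratory-assigned number may have."""
    length = len(insert_designation)
    if not 1 <= length <= MAX_INSERT_CHARS:
        raise ValueError("insert designation must be 1 to 8 characters")
    return min(MAX_NUMBER_DIGITS, MAX_COMBINED - length)


def number_fits(insert_designation, number):
    """Check that a positive integer fits next to the given insert designation."""
    return number >= 1 and len(str(number)) <= max_number_digits(insert_designation)


for designation in ("SVDhfr", "ABCDEFG", "ABCDEFGH"):
    print(designation, max_number_digits(designation))
```

With six or fewer characters the full five digits remain available; seven characters leave four digits and eight leave three, as the guideline on short symbols states.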
The laboratory code is uniquely assigned to each laboratory that produces transgenic animals. A laboratory that has already been assigned such a code for other genetically defined mice and rats or for DNA loci should use that code. The registry of these codes is maintained by ILAR.
The complete designation identifies the inserted site, provides a symbol for ease of communication, and supplies a unique identifier to distinguish it from all other insertions.
Each insertion retains the same symbol even if it is placed on a different genetic background. Specific lines of animals carrying the insertion should be additionally distinguished by a stock designator preceding the transgene symbol. In general, this designator will follow the established conventions for the naming of strains or stocks of the particular animal used. If the background is a mixture of several strains, stocks, or both, the transgene symbol should be used without a strain or stock name.
C57BL/6J-TgN(CD8Ge)23Jwg. The human CD8 genomic clone (Ge) inserted into C57BL/6 mice from the Jackson Laboratory (J); the 23rd mouse screened in a series of microinjections in the laboratory of Jon W. Gordon (Jwg).
Crl:ICR-TgN(SVDhfr)432Jwg. The SV40 early promoter driving a mouse dihydrofolate reductase (Dhfr) gene; 4 kilobase plasmid; the 32nd animal screened in the laboratory of Jon W. Gordon (Jwg). The ICR outbred mice were obtained from Charles River Laboratories (Crl).
TgN(GPDHIm)1Bir. The human glycerol phosphate dehydrogenase (GPDH) gene inserted into zygotes retrieved from (C57BL/6J x SJL/J)F1 females; the insertion caused an insertional mutation (Im) and was the 1st transgenic mouse named by Edward H. Birkenmeier (Bir). No strain designation is provided because each zygote derived from such an F1 hybrid mouse has a different complement of alleles derived from the original inbred parental strains.
129/J-TgH(SV40Tk)65Rpw (hypothetical). An SV40-thymidine kinase (Tk) transgene targeted by homologous recombination to a specific but anonymous locus by using embryonic stem cells derived from mouse strain 129/J. This was the 65th mouse of this series produced by Richard P. Woychik (Rpw).
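In the examples above, the stock designator, when present, precedes the transgene symbol and is joined to it by a hyphen. A minimal sketch for separating the two follows; it assumes the transgene symbol always begins with "Tg" and that the last "-Tg" in the string marks the boundary, which holds for these examples but is not stated as a rule in the text. The function name is hypothetical.

```python
def split_stock_designator(designation):
    """Split 'STOCK-TgX(...)...' into (stock designator, transgene symbol).

    Returns (None, designation) when no stock designator is present.
    """
    stock, separator, rest = designation.rpartition("-Tg")
    if separator:
        return stock, "Tg" + rest
    return None, designation


for example in (
    "C57BL/6J-TgN(CD8Ge)23Jwg",
    "Crl:ICR-TgN(SVDhfr)432Jwg",
    "TgN(GPDHIm)1Bir",
    "129/J-TgH(SV40Tk)65Rpw",
):
    print(split_stock_designator(example))
```

Because the transgene symbol itself does not change with genetic background, the second element of the result is the part to use when comparing insertions across stocks.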
Abbreviations
Transgene symbols can be abbreviated by omitting the insert designation, including its parentheses. For example, the full symbol TgN(GPDHIm)1Bir would be abbreviated TgN1Bir. The full symbol should be used the first time the transgene is mentioned in a publication; thereafter, the abbreviation may be used.
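The abbreviation rule amounts to removing the parenthesized part of the symbol. A minimal sketch, with a hypothetical function name, assuming the symbol contains exactly one parenthesized insert designation:

```python
import re


def abbreviate(symbol):
    """Drop the parenthesized insert designation from a transgene symbol."""
    abbreviated, count = re.subn(r"\([^()]*\)", "", symbol, count=1)
    if count == 0:
        raise ValueError("no insert designation found in symbol")
    return abbreviated


print(abbreviate("TgN(GPDHIm)1Bir"))  # TgN1Bir
```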
Insertional Mutations and Phenotypes
The symbol should not be used to identify the specific insertional mutation or phenotype caused directly or indirectly by the transgene. If an insertional mutation that produces an observable phenotype is caused by the insertion, the locus so identified must be named according to standard procedures for the species involved (see Resources Available for Assistance with Transgenic Nomenclature below). The allele of the locus identified by the insertion can then be identified by the abbreviated transgene symbol (see previous paragraph) according to the conventions adopted for the species.
Examples:
ho(TgN447Jwg). The insertion of a transgene into the hotfoot locus (ho).
xxx(TgN21Jwg). The insertion of a transgene that leads to a recessive mutation in a previously unidentified gene. A gene symbol for xxx must be obtained from a species-genome database or member of a nomenclature committee (see below).
References
Lyon M. F. 1989b. Alleles. Section 1.1.5.6 of Rules and guidelines for gene nomenclature. p. 2 in Genetic Variants and Strains of the Laboratory Mouse, 2d Ed., M. F. Lyon and G. Searle, eds. London: Oxford University Press.
Resources Available for Assistance with Transgenic Nomenclature
Before naming a transgene, an investigator should obtain a laboratory code from ILAR at the address given in the list that follows. An investigator who has already been assigned such a code for other genetically defined mice and rats or for DNA loci should use the same code. The transgene should be named as stated in the rules. Assistance in selecting transgene symbols is available from several organizations (see list that follows). Lists of named genes for mice and rats are published periodically in Mouse Genome (Oxford University Press, Journal Subscriptions Department, Pinkhill House, Southfield Road, Eynsham, Oxford OX8 1JJ, UK) and Rat News Letter (Dr. Viktor Stolc, ed., Rat News Letter, 2542 Harlo Drive, Allison Park, Pittsburgh, PA 15101). The list of mouse genes is also maintained in GBASE, a genomic data base for the mouse maintained by Dr. Don P. Doolittle, Dr. Alan L. Hillyard, Ms. Lois J. Maltais, Dr. Muriel T. Davisson, Dr. Thomas H. Roderick, and Mr. John N. Guidi at The Jackson Laboratory (see list that follows). Human gene symbols are recorded in the Genome Data Base (GDB), which is maintained at The Johns Hopkins University.
Institute of Laboratory Animal Resources (ILAR). Assigns laboratory codes; assists in naming transgenes; provides rules for naming transgenes. Contact: Dr. Dorothy D. Greenhouse, ILAR, National Research Council, 2101 Constitution Avenue, Washington, DC 20418 (telephone 1-202-334-2590; fax 1-202-334-1687; Bitnet DGREENHO@NAS).
The Jackson Laboratory. Assists in naming transgenes; provides rules for standardized nomenclature for mice; provides lists of named mouse genes. Contact: Dr. Muriel T. Davisson, The Jackson Laboratory, Bar Harbor, ME 04609 (telephone 1-207-288-3371; fax 1-207-288-8982).
Medical Research Council Radiobiology Unit. Assists in naming transgenes; provides lists of named mouse genes. Contact: Dr. Josephine Peters, MRC Radiobiology Unit, Chilton, Didcot, Oxford OX11 0RD, UK (telephone 44-235-834-393; fax 44-235-834-918).
The Transgenic and Targeted Mutant Animal Database. Records, stores, and provides information on transgenic mice and other species, including standardized nomenclature and a complete description of each transgenic animal; maintains rules for transgenic nomenclature on electronic bulletin board. Contact: Ms. Karin Schneider, Coordinator, Oak Ridge National Laboratory, 1060 Commerce Park MS-6480, Oak Ridge, TN 37830 (telephone 1-615-574-7776; fax 1-615-574-9888; Bitnet TUG@ORNLSTC; Internet tbase@iravx2.hsr.ornl.gov).
Genome Data Base (GDB). Records, stores, and provides information on mapped human genes and clones. Contact: GDB, Welch Medical Library, The Johns Hopkins University, 1830 East Monument Street, Baltimore, MD 21205 (telephone 1-301-955-9705; fax 301-955-0054). For assistance in naming human genes, the contact is Dr. Phyllis J. McAlpine, GDB Nomenclature Editor, University of Manitoba, Department of Human Genetics, 250 Old Basic Sciences Building, 770 Bannatyne Avenue, Winnipeg, Manitoba, Canada R3E 0W3 (telephone 1-204-788-6393; fax 1-204-786-8712; Bitnet GENMAP@UOFMCC).
Rules for Naming Targeted Mutations
(excerpted from Lyon M. F. 1989b. Alleles. Section 1.1.5.6 of Rules and guidelines for gene nomenclature. p. 2 in Genetic Variants and Strains of the Laboratory Mouse, 2d Ed., M. F. Lyon and G. Searle, eds. London: Oxford University Press.)
Sect. 1.1.5 Alleles
6. Mutations or other variations occurring in known alleles may be denoted by a superscript m followed by an appropriate series symbol and separated from the original allele symbol by a hyphen; e.g., Mod-1(a-m1Lws), the first mutant allele of Mod-1a found by Lewis.
For known deletions of all or part of an allele the superscript m may be replaced with dl. Information on the allele of origin of mutations may be valuable in elucidating changes in DNA sequence.