Appendix F to Subpart G of Part 1 - List of Feature Keys Related to Protein Sequences

    Source: World Intellectual Property Organization (WIPO) Handbook on Industrial Property Information and Documentation, Standard ST.25: Standard for the Presentation of Nucleotide and Amino Acid Sequence Listings in Patent Applications (2009).

    Key Description
    CONFLICT different papers report differing sequences.
    VARIANT authors report that sequence variants exist.
    VARSPLIC description of sequence variants produced by alternative splicing.
    MUTAGEN site which has been experimentally altered.
    MOD_RES post-translational modification of a residue.
    ACETYLATION N-terminal or other.
    AMIDATION generally at the C-terminal of a mature active peptide.
    BLOCKED undetermined N- or C-terminal blocking group.
    FORMYLATION of the N-terminal methionine.
    GAMMA-CARBOXYGLUTAMIC ACID
    HYDROXYLATION of asparagine, aspartic acid, proline, or lysine.
    METHYLATION generally of lysine or arginine.
    PHOSPHORYLATION of serine, threonine, tyrosine, aspartic acid or histidine.
    PYRROLIDONE CARBOXYLIC ACID N-terminal glutamate which has formed an internal cyclic lactam.
    SULFATATION generally of tyrosine.
    LIPID covalent binding of a lipidic moiety.
    MYRISTATE myristate group attached through an amide bond to the N-terminal glycine residue of the mature form of a protein or to an internal lysine residue.
    PALMITATE palmitate group attached through a thioether bond to a cysteine residue or through an ester bond to a serine or threonine residue.
    FARNESYL farnesyl group attached through a thioether bond to a cysteine residue.
    GERANYL-GERANYL geranyl-geranyl group attached through a thioether bond to a cysteine residue.
    GPI-ANCHOR glycosyl-phosphatidylinositol (GPI) group linked to the alpha-carboxyl group of the C-terminal residue of the mature form of a protein.
    N-ACYL DIGLYCERIDE N-terminal cysteine of the mature form of a prokaryotic lipoprotein with an amide-linked fatty acid and a glyceryl group to which two fatty acids are linked by ester linkages.
    DISULFID disulfide bond; the `FROM' and `TO' endpoints represent the two residues which are linked by an intra-chain disulfide bond; if the `FROM' and `TO' endpoints are identical, the disulfide bond is an interchain one and the description field indicates the nature of the cross-link.
    THIOLEST thiolester bond; the `FROM' and `TO' endpoints represent the two residues which are linked by the thiolester bond.
    THIOETH thioether bond; the `FROM' and `TO' endpoints represent the two residues which are linked by the thioether bond.
    CARBOHYD glycosylation site; the nature of the carbohydrate (if known) is given in the description field.
    METAL binding site for a metal ion; the description field indicates the nature of the metal.
    BINDING binding site for any chemical group (co-enzyme, prosthetic group, etc.); the chemical nature of the group is given in the description field.
    SIGNAL extent of a signal sequence (prepeptide).
    TRANSIT extent of a transit peptide (mitochondrial, chloroplastic, or for a microbody).
    PROPEP extent of a propeptide.
    CHAIN extent of a polypeptide chain in the mature protein.
    PEPTIDE extent of a released active peptide.
    DOMAIN extent of a domain of interest on the sequence; the nature of that domain is given in the description field.
    CA_BIND extent of a calcium-binding region.
    DNA_BIND extent of a DNA-binding region.
    NP_BIND extent of a nucleotide phosphate binding region; the nature of the nucleotide phosphate is indicated in the description field.
    TRANSMEM extent of a transmembrane region.
    ZN_FING extent of a zinc finger region.
    SIMILAR extent of a similarity with another protein sequence; precise information, relative to that sequence, is given in the description field.
    REPEAT extent of an internal sequence repetition.
    HELIX secondary structure: Helices, for example, Alpha-helix, 3(10) helix, or Pi-helix.
    STRAND secondary structure: Beta-strand, for example, Hydrogen bonded beta-strand, or Residue in an isolated beta-bridge.
    TURN secondary structure: Turns, for example, H-bonded turn (3-turn, 4-turn, or 5-turn).
    ACT_SITE amino acid(s) involved in the activity of an enzyme.
    SITE any other interesting site on the sequence.
    INIT_MET the sequence is known to start with an initiator methionine.
    NON_TER the residue at an extremity of the sequence is not the terminal residue; if applied to position 1, this signifies that the first position is not the N-terminus of the complete molecule; if applied to the last position, it signifies that this position is not the C-terminus of the complete molecule; there is no description field for this key.
    NON_CONS non-consecutive residues; indicates that two residues in a sequence are not consecutive and that there are a number of unsequenced residues between them.
    UNSURE uncertainties in the sequence; used to describe region(s) of a sequence for which the authors are unsure about the sequence assignment.

    [86 FR 57052, Oct. 14, 2021]
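
The DISULFID entry above gives a rule for reading the `FROM' and `TO' endpoints: two different residues indicate an intra-chain bond, and identical endpoints indicate an interchain bond. The following is a minimal sketch of that rule only, not part of the standard. The `Feature` record, its field names, the function `classify_disulfide`, and the use of 1-based integer residue positions are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical record for one feature-table entry (not defined by the standard)."""
    key: str               # feature key, e.g. "DISULFID"
    start: int             # 'FROM' endpoint (assumed 1-based residue position)
    end: int               # 'TO' endpoint (assumed 1-based residue position)
    description: str = ""  # free-text description field


def classify_disulfide(feature: Feature) -> str:
    """Apply the DISULFID endpoint rule from the table above.

    Identical 'FROM' and 'TO' endpoints mean an interchain bond;
    different endpoints mean an intra-chain bond.
    """
    if feature.key != "DISULFID":
        raise ValueError(f"expected a DISULFID feature, got {feature.key!r}")
    return "interchain" if feature.start == feature.end else "intrachain"


if __name__ == "__main__":
    # Illustrative positions only; they do not come from any real sequence.
    print(classify_disulfide(Feature("DISULFID", 12, 48)))
    print(classify_disulfide(Feature("DISULFID", 30, 30, "interchain cross-link")))
```

For an interchain bond, the description field is where the nature of the cross-link is recorded, so a fuller check might also require that field to be non-empty.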