Created at the Biohackathon 2012 and 2013
Used to describe a location that consists of a number of Regions whose order is not known, e.g. the oddly named order() keyword in an INSDC file.
Bag of regions
The 'both strands position' indicates a region that is best described as being on 'both' strands of a double-stranded sequence, rather than one or the other.
Both strands position
The C-terminus is the end of an amino acid chain (protein or polypeptide), terminated by a free carboxyl group (-COOH).
C-Terminal position
Sometimes the location of a feature is defined by a collection of regions, e.g. join() and order() in INSDC records. One should always try to model the semantics more accurately than this; these classes are fallback options for encoding legacy data.
Collection of regions
A position that is exactly known.
Exact position
1
The position is on the forward (positive, 5' to 3') strand. Shown as a '+' in GFF3 and GTF.
Forward/positive strand position
A position that lacks exact data.
Fuzzy position
This indicates that a feature is between two other positions that are both known exactly and that are next to each other. An example is a restriction enzyme cutting site. The cut is after one position and before the other position (hence, in between).
In between positions
1
1
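As an illustration of the restriction-site example, here is a minimal sketch of an in-between position in Python. The class name `InBetweenPosition` and the field names `after` and `before` follow the vocabulary used in this document; the `cut` helper and the adjacency check are hypothetical additions for illustration only, not part of the ontology.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InBetweenPosition:
    """Hypothetical in-memory model of a site between two adjacent positions."""
    after: int   # 1-based position immediately before the site
    before: int  # 1-based position immediately following the site

    def __post_init__(self) -> None:
        # The two flanking positions must be next to each other.
        if self.before != self.after + 1:
            raise ValueError("after and before must be adjacent positions")


def cut(sequence: str, site: InBetweenPosition) -> Tuple[str, str]:
    """Split a sequence at an in-between site given in 1-based coordinates."""
    if not 1 <= site.after < len(sequence):
        raise ValueError("site lies outside the sequence")
    # With 1-based coordinates, the first `after` residues form the left part.
    return sequence[:site.after], sequence[site.after:]
```

For example, `cut("GAATTC", InBetweenPosition(after=1, before=2))` splits the sequence after its first nucleotide and before its second.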
Use when you have an idea of the range in which you can find the position, but you cannot be sure about the exact position.
Indeterminate position within a range
1
1
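A small sketch of how an indeterminate position could be handled in code. It assumes the range is bounded by an inclusive begin and an inclusive end, as the begin and end predicates are described in this document; the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InRangePosition:
    """Hypothetical model: the true position lies somewhere in [begin, end]."""
    begin: int  # one inclusive bound of the range
    end: int    # the other inclusive bound of the range

    def could_be(self, candidate: int) -> bool:
        """Return True if the candidate is compatible with this range."""
        # Sort the bounds so the check also works when begin > end.
        low, high = sorted((self.begin, self.end))
        return low <= candidate <= high
```

The check only says that a candidate is not ruled out; it does not make the position any more exact.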
Describes a location as an ordered list of regions (the list might not be complete).
Should be used when the location of a region is defined by an ordered list of Regions. However, try to avoid using these types in favor of using more explicit semantics about why the order is important.
List of regions
The position of the starting amino acid of a protein or polypeptide, which is terminated by an amino acid with a free amine group (-NH2). The convention for writing peptide sequences is to put the N-terminus on the left and write the sequence from N- to C-terminus. Instances of this class are often used when the reference sequence is not complete.
The position is known to be one of the more detailed positions listed by the location predicate.
One of positions
Superclass for the general concept of a position on a sequence. The sequence is designated with the reference predicate.
Position
1
A region describes a length of sequence with a start position and end position that represents a feature on a sequence, e.g. a gene.
Region
1
1
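To make the shape of a region concrete, here is a minimal sketch that writes one out as Turtle using plain string formatting. The predicate names `begin`, `end`, `position` and `reference` are the ones described in this document; the namespace IRI and the local names `faldo:Region` and `faldo:ExactPosition` are assumptions about how the vocabulary is published, and the example IRIs are hypothetical.

```python
# Assumed namespace for the vocabulary; adjust if your copy uses another IRI.
FALDO = "http://biohackathon.org/resource/faldo#"


def position_node(value: int, reference: str) -> str:
    """Render an exact position as an anonymous Turtle node."""
    return (
        f"[ a faldo:ExactPosition ; faldo:position {value} ; "
        f"faldo:reference <{reference}> ]"
    )


def region_to_turtle(region: str, reference: str, begin: int, end: int) -> str:
    """Render a region with a begin and an end position on one reference."""
    if begin < 1 or end < 1:
        raise ValueError("positions are 1-based")
    lines = [
        f"@prefix faldo: <{FALDO}> .",
        "",
        f"<{region}> a faldo:Region ;",
        f"    faldo:begin {position_node(begin, reference)} ;",
        f"    faldo:end {position_node(end, reference)} .",
    ]
    return "\n".join(lines)


def region_length(begin: int, end: int) -> int:
    """Length of a region whose begin and end are both inclusive."""
    return abs(end - begin) + 1
```

Because both ends are inclusive, a region from 100 to 200 covers 101 positions, not 100.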
The position is on the reverse (complement, 3' to 5') strand of the sequence. Shown as '-' in GTF and GFF3.
Negative/reverse strand position
Part of the coordinate system denoting on which strand the feature can be found. If you do not yet know which strand the feature is on, you should tag the position with just this class. If you know more, you should use one of the subclasses. This corresponds to a region described with a '?' in GFF3 (strand relevant but unknown). A GFF3 unstranded position ('.') does not have this type in FALDO; such positions are just a 'position'.
Stranded position
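The strand classes can be related to the strand column of a GFF3 file with a simple lookup. The sketch below assumes the GFF3 reading in which '?' marks a feature whose strand matters but is unknown and '.' marks an unstranded feature; the namespace IRI and the class local names are assumptions about how the vocabulary is published, and the function name is hypothetical.

```python
# Assumed namespace for the vocabulary; adjust if your copy uses another IRI.
FALDO = "http://biohackathon.org/resource/faldo#"

# GFF3 strand symbol -> assumed local name of the matching class.
STRAND_CLASSES = {
    "+": "ForwardStrandPosition",
    "-": "ReverseStrandPosition",
    "?": "StrandedPosition",  # stranded, but the strand is not yet known
    ".": "Position",          # unstranded: no strand type at all
}


def strand_class(symbol: str) -> str:
    """Return the class IRI to use for a GFF3 strand symbol."""
    try:
        return FALDO + STRAND_CLASSES[symbol]
    except KeyError:
        raise ValueError(f"unexpected strand symbol: {symbol!r}") from None
```

Keeping the mapping in one table makes it easy to review against the file format being converted.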
This predicate is used when you want to describe a non-inclusive range. Only used in the InBetweenPosition to say it is after a nucleotide, but before the next one.
after
This predicate is used to indicate that the feature is found before the exact position. Use it to indicate, for example, a cleavage site: the cleavage happens between two amino acids, before one and after the other.
before
The inclusive beginning of a position. Also known as start.
begin
This is the inverse of the begin property. It is included to make it easier to write a number of OWL axioms. You should rarely use this in your raw data.
beginOf
The inclusive end of the position.
end
This is the inverse of the end property. It is included to make it easier to write a number of OWL axioms. You should rarely use this in your raw data.
endOf
This is the link between the concept whose location you are annotating and its range or position. For example, when annotating the region that describes an exon, the exon would be the subject and the region would be the object of the triple. Similarly: 'active site' 'location' [is] 'position 3'.
location
Denoted in 1-based closed coordinates, i.e. the position on the first amino acid or nucleotide of a sequence has the value 1. For nucleotide sequences we count from the 5' end of the sequence, while for amino acid sequences we start counting from the N-terminus.
The position value is the offset along the reference where this position is found. Thus only the position value in combination with the reference determines where a position is.
position
1
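Since position values are 1-based and closed, coordinates taken from 0-based, half-open formats (BED is a common example) need shifting before they are stored. A minimal sketch of the conversion in both directions follows; the function names are hypothetical.

```python
from typing import Tuple


def to_one_based_closed(start0: int, end0: int) -> Tuple[int, int]:
    """Convert a 0-based half-open interval [start0, end0) to 1-based closed."""
    if start0 < 0 or end0 <= start0:
        raise ValueError("expected 0 <= start0 < end0")
    # Only the start shifts: the exclusive end already equals the
    # 1-based index of the last included position.
    return start0 + 1, end0


def to_zero_based_half_open(begin1: int, end1: int) -> Tuple[int, int]:
    """Convert a 1-based closed interval [begin1, end1] to 0-based half-open."""
    if begin1 < 1 or end1 < begin1:
        raise ValueError("expected 1 <= begin1 <= end1")
    return begin1 - 1, end1
```

For instance, the first ten nucleotides are `[0, 10)` in 0-based half-open terms and `1..10` in 1-based closed terms; the interval length is the same in both.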
One of the possible positions listed for a OneOfPosition element.
possiblePosition
The reference is the resource that the position value is anchored to. For example, a contig or chromosome in a genome assembly.
reference