Genetic variants are drawing increasing interest for their role in pathologies, for designing new drugs and for refining treatment efficacy. Selecting between existing treatment options and offering a personalized solution based on the latest evidence requires efficient retrieval of the literature on variants. While many databases of polymorphisms and somatic variants exist, such as ClinVar, COSMIC or dbSNP, using those resources as terminologies is fairly challenging. Depending on the database, variants are mapped at different levels: genomic, transcript or protein. The combinatorial nature of these levels (many-to-many relationships) hinders a linear mapping between them. In addition, HGVS standards require both a precise syntax and a reference sequence on which the variation is described in order to avoid positional ambiguities, and publications often omit them.
To enable smooth and effective retrieval of variants in the literature, we developed a synonym generation tool that generates, for a given SNP (including variants not described in existing databases), its corresponding descriptions at the genome, transcript and protein levels, both in the HGVS format and in many non-standard, yet frequently used, descriptions found in the literature. It is also suited to variant recognition at any description level and to normalization.
Enter a chromosome number or name (e.g. 7, MT, X) or a gene name (e.g. BRAF). The field can be empty if a dbSNP or COSMIC id is searched.
Enter a description of the variation in the following format: V600E (for amino acid sequence) or 1799T>A (for DNA sequence) or a dbSNP id (e.g. rs113488022) or COSMIC id (e.g. COSM476).
Select the level of variant description you provided (protein, transcript, genome, dbsnp or cosmic).
You can directly access the results in XML or beacon (JSON) format by using the URL 'http://goldorak.hesge.ch/synvar/generate/litterature/fromMutation' with the following parameters:
- ref: Gene name or chromosome number or name (e.g. JAK2, BRAF, 9, X). Optional if an identifier (COSMIC or dbSNP) is given in the variant parameter.
- variant: Variant description, COSMIC id, or dbSNP id (e.g. V617F, Val600Glu, rs113488022, COSM476).
- level: Level of variant description (protein, transcript, genome, dbsnp or cosmic).
- iso: Validate on and generate synonyms for isoforms: false (default) or true.
- map: Output syntactic variations even if the variant could not be mapped on genome: false (default) or true.
- format: Output format: xml (default) or beacon (json).
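As an illustration of how the parameters above combine, here is a minimal Python sketch that assembles a request URL. It assumes the parameters are passed as an ordinary GET query string, which the text does not state explicitly; the helper name `build_synvar_url` and its argument names are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint quoted in the text above.
BASE_URL = "http://goldorak.hesge.ch/synvar/generate/litterature/fromMutation"


def build_synvar_url(variant, level, ref=None, iso=False, map_unmapped=False, fmt="xml"):
    """Assemble a query URL from the documented parameters.

    Hypothetical helper: assumes the service reads its parameters
    from a standard GET query string.
    """
    params = {}
    if ref:  # ref is optional when the variant is a dbSNP or COSMIC id
        params["ref"] = ref
    params["variant"] = variant
    params["level"] = level
    params["iso"] = "true" if iso else "false"
    params["map"] = "true" if map_unmapped else "false"
    params["format"] = fmt
    # urlencode percent-escapes characters such as '>' in '1799T>A'
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_synvar_url("V600E", "protein", ref="BRAF"))
    print(build_synvar_url("rs113488022", "dbsnp", fmt="beacon"))
```

Building the query with `urlencode` rather than string concatenation keeps DNA-level descriptions such as 1799T>A safe to embed in a URL.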
To query the service and parse the output: queryVariant.java
The default output is in XML format. Variants are grouped by affected genes. The main elements are the following:
- synonym: Synonyms of gene and protein names.
- hgvs: Variant description in the standard HGVS format. The main HGVS description can be used as a unique identifier. Other HGVS descriptions are given for each level of description using the NCBI reference sequences.
- syntactic-variation: Variant formats as encountered in the literature.
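The elements listed above could be collected from an XML response along the following lines. This is only a sketch: the sample document is hypothetical and is not real service output. Only the element names synonym, hgvs and syntactic-variation come from the description above, while the wrapper elements and nesting are assumptions. The function `collect_terms` is likewise an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, NOT real service output. The <result> and <gene>
# wrappers and the nesting are assumed for illustration only.
SAMPLE_XML = """
<result>
  <gene>
    <synonym>BRAF</synonym>
    <synonym>B-raf</synonym>
    <hgvs>NP_004324.2:p.Val600Glu</hgvs>
    <syntactic-variation>V600E</syntactic-variation>
    <syntactic-variation>Val600Glu</syntactic-variation>
  </gene>
</result>
"""


def collect_terms(xml_text):
    """Gather the text of the three documented element types.

    root.iter(tag) walks the whole tree, so the result does not
    depend on how deeply the elements are nested.
    """
    root = ET.fromstring(xml_text)
    wanted = ("synonym", "hgvs", "syntactic-variation")
    return {
        tag: [el.text.strip() for el in root.iter(tag) if el.text and el.text.strip()]
        for tag in wanted
    }


if __name__ == "__main__":
    for tag, values in collect_terms(SAMPLE_XML).items():
        print(tag, values)
```

Searching by element name rather than by fixed path keeps the sketch usable even if the real grouping by gene differs from the assumed layout.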