Worksheet - Switch database IDs

The GI is an identification number assigned to nucleotide and protein sequences. The Accession Number of a sequence identifies its database record in GenBank, a database in which nucleotide and protein sequences from more than 260,000 organisms are publicly available thanks to an international collaboration among the NCBI, the European Molecular Biology Laboratory (EMBL) Nucleotide Sequence Database, and the DNA Data Bank of Japan (DDBJ).

You can switch from an Accession Number to its corresponding GenInfo Identifier (GI), or vice versa, or convert one database ID into another, by selecting “Annotation” -> “switch database format”. A dialog will appear in which you choose the worksheet column to convert in “select column from” and the type of ID it contains in “Format from”. You then have two options: a) select an existing column in the boxes “Select column to” and “Format to” if you want to replace the contents of that column; b) create a new column for the converted IDs, for which you only need to provide a name.
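The two output options can be illustrated outside the tool with a minimal Python sketch. This is not GPRO's implementation: the function `switch_ids`, the lookup table `ACCESSION_TO_GI`, and the identifiers in it are hypothetical placeholders, and a real conversion would take its mapping from a database rather than a hard-coded dictionary.

```python
import csv
import io

# Hypothetical lookup table; the identifiers are placeholders, not real records.
ACCESSION_TO_GI = {"ACC_0001": "1001", "ACC_0002": "1002"}


def switch_ids(rows, column_from, mapping, column_to=None, new_column=None):
    """Return copies of rows with the IDs in column_from translated via mapping.

    Option a): pass column_to to overwrite an existing column.
    Option b): pass new_column to add a column with that name.
    IDs missing from the mapping produce an empty cell.
    """
    if (column_to is None) == (new_column is None):
        raise ValueError("give exactly one of column_to or new_column")
    target = column_to if column_to is not None else new_column
    converted = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        new_row[target] = mapping.get(row[column_from], "")
        converted.append(new_row)
    return converted


# A small in-memory worksheet with an empty "gi" column.
sheet = io.StringIO("accession,gi\nACC_0001,\nACC_0002,\nACC_0003,\n")
rows = list(csv.DictReader(sheet))

replaced = switch_ids(rows, "accession", ACCESSION_TO_GI, column_to="gi")
added = switch_ids(rows, "accession", ACCESSION_TO_GI, new_column="gi_new")
```

In this sketch, `replaced` corresponds to option a) and `added` to option b); an ID without a known counterpart (here `ACC_0003`) is left blank rather than guessed.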

The tool can run this process in two modes: directly on a CSV file opened in the GPRO worksheet, or in batch mode (by selecting the option “Select folder”) to process several CSV files at once.
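As a rough illustration of what batch mode amounts to, the following Python sketch applies the same conversion to every CSV file in a folder. It is an assumption-laden example rather than the tool's own code: `convert_csv`, `convert_folder`, and the lookup table `ACCESSION_TO_GI` with its placeholder identifiers are all hypothetical names introduced here.

```python
import csv
from pathlib import Path

# Hypothetical lookup table; the identifiers are placeholders, not real records.
ACCESSION_TO_GI = {"ACC_0001": "1001", "ACC_0002": "1002"}


def convert_csv(path, column_from, new_column, mapping):
    """Add new_column to one CSV file in place; return the number of data rows."""
    path = Path(path)
    with path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)
    if new_column not in fieldnames:
        fieldnames.append(new_column)
    for row in rows:
        # IDs missing from the mapping produce an empty cell.
        row[new_column] = mapping.get(row.get(column_from, ""), "")
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def convert_folder(folder, column_from, new_column, mapping):
    """Batch mode: run convert_csv on every .csv file directly inside folder."""
    return {
        csv_path.name: convert_csv(csv_path, column_from, new_column, mapping)
        for csv_path in sorted(Path(folder).glob("*.csv"))
    }
```

A call such as `convert_folder("my_folder", "accession", "gi", ACCESSION_TO_GI)`, where `my_folder` is a placeholder path, would return the number of rows processed per file. Because this sketch overwrites the files, working on a copy of the folder is the safer choice.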


