A comprehensive system for consistent numbering of HCV sequences, proteins and epitopes

Carla Kuiken, Christophe Combet, Jens Bukh, Tadasu Shin-I, Gilbert Deleage, Masashi Mizokami, Russell Richardson, Erwin Sablon, Karina Yusim, Jean Michel Pawlotsky, Peter Simmonds, Bette Korber, Werner Abfalterer, Charles Calef, Brian Foley, Robert Funkhouser, Brian Gaschen, Dorothy Lang, Thomas Leitner, James Szinger, Ming Zhang

Research output: Contribution to journal (Review article)

83 Citations (Scopus)

Abstract

This numbering proposal, which uses the AF009606 (isolate H77) sequence as a reference, should make it possible to number unequivocally all possible mutations in HCV, both natural and man-made. The HCV sequence databases (ref. 8) and the Los Alamos HCV immunology database (ref. 9) (as well as the Los Alamos HIV database) number positions and epitopes according to this system. Moreover, the database websites provide tools for finding stretches of sequence by their numbers, for assigning start and end coordinates to a sequence, and for converting between the various numbering systems. HCV nucleotide sequences are numbered by analogy to H77. The first step is to align the sequence to H77. If there is no length variation, the numbering is straightforward: nucleotide numbers run from 1 (start of the 5′ UTR) to 9646 (end of the 3′ UTR). Insertions relative to H77 are labeled with letters. Protein numbering works like nucleotide numbering, but begins at the start of the polyprotein. Absolute (polyprotein) numbering runs across the coding regions; relative (protein) numbering starts over at every coding region. The sequence databases will support both systems, but use polyprotein numbering as a basis. Relative numbering is used almost exclusively for proteins; polyprotein numbering is used mainly in immunology, and protein numbering in drug resistance research. The Los Alamos immunology database uses polyprotein numbering. The 5′ UTR is numbered from 1 to 341; the Core cds starts at 342. The numbering of the 3′ UTR starts at 9378 (after the stop codon), but complications arise from the variable length of the PPT. The 3′ UTR consists of three elements: a variable 5′ region, the PPT, and a conserved 3′ region, often called X. The first region is numbered 9378-9410. The PPT consists almost entirely of T's and therefore cannot be meaningfully aligned; it is numbered according to its length in H77, 9411-9545. The X region starts at 9546 (regardless of its actual location, which depends on the length of the PPT) and ends at 9646.
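
The numbering rules in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the databases' own tooling. The function names (`region_of`, `polyprotein_position`, `number_by_reference`) are hypothetical, the input is assumed to be a pairwise alignment of a query against H77 with `-` as the gap character, and the insertion label format (number of the preceding H77 position plus a lowercase letter, such as `2a`) is an assumed convention, since the abstract only states that insertions are labeled with letters. The coding span 342-9377 is inferred from the stated start of the Core cds and the stated start of the 3′ UTR.

```python
from string import ascii_lowercase

# H77 (AF009606) nucleotide coordinates as given in the abstract.
# The coding span 342-9377 is an inference: Core cds starts at 342 and
# the 3' UTR starts at 9378, directly after the stop codon.
H77_REGIONS = [
    ("5' UTR", 1, 341),
    ("polyprotein coding region (incl. stop codon)", 342, 9377),
    ("3' UTR variable region", 9378, 9410),
    ("3' UTR PPT", 9411, 9545),
    ("3' UTR X region", 9546, 9646),
]


def region_of(position):
    """Return the name of the H77 region containing a nucleotide position."""
    for name, start, end in H77_REGIONS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside H77 numbering (1-9646)")


def polyprotein_position(nt_position):
    """Map an H77 nucleotide position to its codon index in the polyprotein.

    Codon 1 starts at nucleotide 342. The last codon in the span is the
    stop codon, so the highest index returned is not an amino acid.
    """
    if not 342 <= nt_position <= 9377:
        raise ValueError("position is not in the polyprotein coding region")
    return (nt_position - 342) // 3 + 1


def number_by_reference(ref_aligned, query_aligned, ref_start=1, gap="-"):
    """Label each query residue with its reference-based number.

    Both strings must come from the same pairwise alignment. Residues
    inserted relative to the reference receive the number of the preceding
    reference position plus a letter (assumed convention). Reference
    positions deleted in the query are simply skipped.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    labels = []
    ref_pos = ref_start - 1  # number of the last reference position seen
    inserted = 0             # residues inserted since that position
    for ref_char, query_char in zip(ref_aligned, query_aligned):
        if ref_char != gap:
            ref_pos += 1
            inserted = 0
            if query_char != gap:
                labels.append((str(ref_pos), query_char))
        elif query_char != gap:
            if inserted >= len(ascii_lowercase):
                raise ValueError("insertion too long for single-letter labels")
            labels.append((f"{ref_pos}{ascii_lowercase[inserted]}", query_char))
            inserted += 1
    return labels
```

For example, `number_by_reference("AC-GT", "ACTG-")` returns `[("1", "A"), ("2", "C"), ("2a", "T"), ("3", "G")]`: the inserted `T` is labeled `2a`, and reference position 4, which is deleted in the query, receives no label. Note that this sketch does not apply the special handling of the PPT described in the abstract, where the region is numbered by its H77 length rather than by alignment.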

Original language: English (US)
Pages (from-to): 1355-1361
Number of pages: 7
Journal: Hepatology
Volume: 44
Issue number: 5
DOI: https://doi.org/10.1002/hep.21377
State: Published - Nov 1 2006

Fingerprint

Polyproteins
Epitopes
3' Untranslated Regions
Allergy and Immunology
Databases
5' Untranslated Regions
Proteins
Nucleotides
Terminator Codon
Drug Resistance
HIV
Mutation
Research

ASJC Scopus subject areas

  • Hepatology

Cite this

Kuiken, C., Combet, C., Bukh, J., Shin-I, T., Deleage, G., Mizokami, M., ... Zhang, M. (2006). A comprehensive system for consistent numbering of HCV sequences, proteins and epitopes. Hepatology, 44(5), 1355-1361. https://doi.org/10.1002/hep.21377

@article{6388a8d9ae4544598aa31c53e3bfd8a1,
title = "A comprehensive system for consistent numbering of HCV sequences, proteins and epitopes",
author = "Carla Kuiken and Christophe Combet and Jens Bukh and Tadasu Shin-I and Gilbert Deleage and Masashi Mizokami and Russell Richardson and Erwin Sablon and Karina Yusim and Pawlotsky, {Jean Michel} and Peter Simmonds and Bette Korber and Werner Abfalterer and Charles Calef and Brian Foley and Robert Funkhouser and Brian Gaschen and Dorothy Lang and Thomas Leitner and James Szinger and Ming Zhang",
year = "2006",
month = "11",
day = "1",
doi = "10.1002/hep.21377",
language = "English (US)",
volume = "44",
pages = "1355--1361",
journal = "Hepatology",
issn = "0270-9139",
publisher = "John Wiley and Sons Ltd",
number = "5",

}

TY - JOUR

T1 - A comprehensive system for consistent numbering of HCV sequences, proteins and epitopes

AU - Kuiken, Carla

AU - Combet, Christophe

AU - Bukh, Jens

AU - Shin-I, Tadasu

AU - Deleage, Gilbert

AU - Mizokami, Masashi

AU - Richardson, Russell

AU - Sablon, Erwin

AU - Yusim, Karina

AU - Pawlotsky, Jean Michel

AU - Simmonds, Peter

AU - Korber, Bette

AU - Abfalterer, Werner

AU - Calef, Charles

AU - Foley, Brian

AU - Funkhouser, Robert

AU - Gaschen, Brian

AU - Lang, Dorothy

AU - Leitner, Thomas

AU - Szinger, James

AU - Zhang, Ming

PY - 2006/11/1

Y1 - 2006/11/1

UR - http://www.scopus.com/inward/record.url?scp=33750999948&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33750999948&partnerID=8YFLogxK

U2 - 10.1002/hep.21377

DO - 10.1002/hep.21377

M3 - Review article

C2 - 17058236

AN - SCOPUS:33750999948

VL - 44

SP - 1355

EP - 1361

JO - Hepatology

JF - Hepatology

SN - 0270-9139

IS - 5

ER -