
Iterate over and index a list?

I am trying to iterate over a string that has many repeated characters and reorganize it into a list in which each character is replaced by its three-letter code followed by its position (index + 1).

So I want to reorganize:

Seq = "THGTILH"

Into:

NewSeq = [Thr 1, His 2, Gly 3, Thr 4, Ile 5, Leu 6, His 7]

This is just an example string, and the final one will be ~300 characters long. Thanks for any advice!

Edit: Here is the code I have written which iterates through the string to replace the single letters with 3 letter codes.

Seq = "THGTILH"
NewSeq = []

for i in Seq:
    AA = None
    Num = Seq.index(i) + 1
    if i == 'M':
        AA = 'Met'
    if i == 'E':
        AA = 'Glu'
    if i == 'A':
        AA = 'Ala'
    if i == 'C':
        AA = 'Cys'
    if i == 'D':
        AA = 'Asp'
    if i == 'F':
        AA = 'Phe'
    if i == 'G':
        AA = 'Gly'
    if i == 'H':
        AA = 'His'
    if i == 'I':
        AA = 'Ile'
    if i == 'K':
        AA = 'Lys'
    if i == 'L':
        AA = 'Leu'
    if i == 'N':
        AA = 'Asn'
    if i == 'P':
        AA = 'Pro'
    if i == 'Q':
        AA = 'Gln'
    if i == 'R':
        AA = 'Arg'
    if i == 'S':
        AA = 'Ser'
    if i == 'T':
        AA = 'Thr'
    if i == 'V':
        AA = 'Val'
    if i == 'W':
        AA = 'Trp'
    if i == 'Y':
        AA = 'Tyr'
    NewSeq.append(AA)


Answer

These are the basic steps:

  • Write a dictionary whose keys are the one-letter amino acid codes and whose values are the three-letter codes.
  • Split the peptide/protein string into a list of single letters.
  • Declare an empty list to append to later.
  • Iterate over the letters with enumerate, which yields each letter together with its index.
  • For every one-letter code, look up the three-letter code and append it, together with index + 1, to the list declared above.
  • Print or return the final list.
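The reason to enumerate rather than look each letter up with `Seq.index(i)`, as the code in the question does, is that `str.index` returns the position of the first match only, so repeated letters would all receive the same number. A quick check on the example string (the names `by_index` and `by_enumerate` are just illustrative):

```python
Seq = "THGTILH"

# str.index returns the position of the FIRST match, so both 'T's
# and both 'H's end up with the same number.
by_index = [Seq.index(letter) + 1 for letter in Seq]

# enumerate yields a running counter, so every position is distinct.
by_enumerate = [position for position, letter in enumerate(Seq, start=1)]

print(by_index)      # [1, 2, 3, 1, 5, 6, 2]
print(by_enumerate)  # [1, 2, 3, 4, 5, 6, 7]
```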

Here is the code.

# Map each one-letter amino acid code to its three-letter code.
AA_3_Letter_Code = {"A":"ALA",
                    "C":"CYS",
                    "D":"ASP",
                    "E":"GLU",
                    "F":"PHE",
                    "G":"GLY",
                    "H":"HIS",
                    "I":"ILE",
                    "K":"LYS",
                    "L":"LEU",
                    "M":"MET",
                    "N":"ASN",
                    "P":"PRO",
                    "Q":"GLN",
                    "R":"ARG",
                    "S":"SER",
                    "T":"THR",
                    "V":"VAL",
                    "W":"TRP",
                    "Y":"TYR"}

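def Convert_Peptide(Peptide):
    # Split the sequence into a list of one-letter codes.
    SplitIntoList = list(Peptide)
    FinalAnswer = []
    # enumerate yields (0-based index, letter) pairs.
    for index, aa in enumerate(SplitIntoList):
        # Look up the three-letter code and append it with its 1-based position.
        FinalAnswer.append(AA_3_Letter_Code[aa] + " " + str(index + 1))
    print(FinalAnswer)
    return FinalAnswer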

Calling it on your peptide gives the following:

Convert_Peptide("THGTILH")
['THR 1', 'HIS 2', 'GLY 3', 'THR 4', 'ILE 5', 'LEU 6', 'HIS 7']
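The question asked for title-case entries such as Thr 1, whereas the function above returns upper-case codes. As a possible variant (a sketch, not the only way to do it), the same idea can be written with a list comprehension and `enumerate(..., start=1)`, which removes the need for `index + 1`. The names `ONE_TO_THREE` and `convert_peptide` are hypothetical and introduced only for this example:

```python
# The same 20 codes as the dictionary above, written compactly in title case.
ONE_TO_THREE = dict(zip(
    "ACDEFGHIKLMNPQRSTVWY",
    "Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr".split(),
))

def convert_peptide(peptide):
    """Return entries such as 'Thr 1' for each residue, numbered from 1."""
    return [f"{ONE_TO_THREE[aa]} {position}"
            for position, aa in enumerate(peptide, start=1)]

print(convert_peptide("THGTILH"))
# ['Thr 1', 'His 2', 'Gly 3', 'Thr 4', 'Ile 5', 'Leu 6', 'His 7']
```

Unlike Convert_Peptide, this version only returns the list and leaves printing to the caller.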

You can convert much longer peptides or proteins the same way, as long as every letter is a key in the dictionary. For insulin, the output is as follows:

Convert_Peptide("MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN")
['MET 1', 'ALA 2', 'LEU 3', 'TRP 4', 'MET 5', 'ARG 6', 'LEU 7', 'LEU 8', 'PRO 9', 'LEU 10', 'LEU 11', 'ALA 12', 'LEU 13', 'LEU 14', 'ALA 15', 'LEU 16', 'TRP 17', 'GLY 18', 'PRO 19', 'ASP 20', 'PRO 21', 'ALA 22', 'ALA 23', 'ALA 24', 'PHE 25', 'VAL 26', 'ASN 27', 'GLN 28', 'HIS 29', 'LEU 30', 'CYS 31', 'GLY 32', 'SER 33', 'HIS 34', 'LEU 35', 'VAL 36', 'GLU 37', 'ALA 38', 'LEU 39', 'TYR 40', 'LEU 41', 'VAL 42', 'CYS 43', 'GLY 44', 'GLU 45', 'ARG 46', 'GLY 47', 'PHE 48', 'PHE 49', 'TYR 50', 'THR 51', 'PRO 52', 'LYS 53', 'THR 54', 'ARG 55', 'ARG 56', 'GLU 57', 'ALA 58', 'GLU 59', 'ASP 60', 'LEU 61', 'GLN 62', 'GLY 63', 'SER 64', 'LEU 65', 'GLN 66', 'PRO 67', 'LEU 68', 'ALA 69', 'LEU 70', 'GLU 71', 'GLY 72', 'SER 73', 'LEU 74', 'GLN 75', 'LYS 76', 'ARG 77', 'GLY 78', 'ILE 79', 'VAL 80', 'GLU 81', 'GLN 82', 'CYS 83', 'CYS 84', 'THR 85', 'SER 86', 'ILE 87', 'CYS 88', 'SER 89', 'LEU 90', 'TYR 91', 'GLN 92', 'LEU 93', 'GLU 94', 'ASN 95', 'TYR 96', 'CYS 97', 'ASN 98']
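Note that Convert_Peptide raises a KeyError if the sequence contains a letter that is not a key in the dictionary, for example X or a lowercase letter. If that might happen with your data, one option is to check the sequence first. This is a minimal sketch; `STANDARD_CODES` and `find_unknown_letters` are hypothetical names, not part of the code above:

```python
# The 20 one-letter codes that appear as keys in the dictionary above.
STANDARD_CODES = set("ACDEFGHIKLMNPQRSTVWY")

def find_unknown_letters(peptide):
    """Return (position, letter) pairs for letters outside the 20 standard codes."""
    return [(position, letter)
            for position, letter in enumerate(peptide, start=1)
            if letter not in STANDARD_CODES]

print(find_unknown_letters("THGTILH"))  # []
print(find_unknown_letters("THXTILB"))  # [(3, 'X'), (7, 'B')]
```

An empty result means the sequence can be converted without errors.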

For a more elegant solution, you could also ask this question on Bioinformatics Stack Exchange.

User contributions licensed under: CC BY-SA