Topic 6: Sequence motif searches and protein domain structure analysis II

Multiple alignments:


Tasks

6.1

Construction of a leucine-rich repeat (LRR) profile. Take the file of LRR repeats produced in Task 5.3 and select all sequences of equal length. Display them using the SMS Color Align Conservation utility (lower the identity/similarity threshold to 60% to see something interesting). Then go back to your original repeats file (i.e. all the repeats), open it in BioEdit, and introduce gaps into the shorter sequences so that all repeats have the same length while the pattern is preserved.
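
If you prefer to do the length selection with a script rather than by hand, the following minimal Python sketch shows one way. It assumes the repeats are stored in plain FASTA format; the function names (read_fasta, most_common_length, select_by_length) and the toy repeats are made up for illustration and are not part of any course tool.

```python
from collections import Counter

def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def most_common_length(records):
    """Return the sequence length that occurs most often."""
    lengths = Counter(len(seq) for _, seq in records)
    return lengths.most_common(1)[0][0]

def select_by_length(records, length):
    """Keep only the records whose sequence has exactly the given length."""
    return [(h, s) for h, s in records if len(s) == length]

if __name__ == "__main__":
    # Hypothetical toy repeats; replace with the contents of your Task 5.3 file.
    toy = ">rep1\nLTSLQVLDLSNNKLSGSIPSSLGN\n>rep2\nLSNLTHLDLSFNQLSGEIPSS\n"
    records = read_fasta(toy)
    n = most_common_length(records)
    for header, seq in select_by_length(records, n):
        print(">" + header)
        print(seq)
```

The output is again FASTA, so it can be pasted directly into the Color Align Conservation form.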

Deduce a consensus pattern and write it in the PROSITE format. Then compare this pattern to the "official" LRR profile kept at the SMART site.
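
As a cross-check on a pattern deduced by eye, a first draft can be generated from the equal-length repeats. The sketch below is only an illustration under simplifying assumptions: the function name prosite_from_alignment and the max_alternatives cut-off are invented here, every column with more than max_alternatives different residues (or with a gap) is written as x, and variable-length spacers such as x(2,4) are not produced. Expect to refine the result manually.

```python
def prosite_from_alignment(seqs, max_alternatives=3):
    """Draft a PROSITE-style pattern from aligned sequences of equal length."""
    if not seqs:
        raise ValueError("no sequences given")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")

    # One pattern element per alignment column.
    elements = []
    for column in zip(*seqs):
        residues = sorted(set(r.upper() for r in column))
        if "-" in residues or len(residues) > max_alternatives:
            elements.append("x")            # too variable (or gapped): any residue
        elif len(residues) == 1:
            elements.append(residues[0])    # fully conserved position
        else:
            elements.append("[" + "".join(residues) + "]")  # allowed alternatives

    # Collapse runs of x into x(n); the None sentinel flushes a trailing run.
    merged = []
    run = 0
    for el in elements + [None]:
        if el == "x":
            run += 1
            continue
        if run:
            merged.append("x" if run == 1 else "x(%d)" % run)
            run = 0
        if el is not None:
            merged.append(el)
    return "-".join(merged)

if __name__ == "__main__":
    # Hypothetical toy repeats, not real LRR data.
    print(prosite_from_alignment(["LKDLS", "LRELS", "LKNLT"]))
```

A small max_alternatives keeps the pattern strict; raising it makes the pattern more permissive but also less specific.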


6.2

Below is a collection of a few Rab GDP dissociation inhibitors (GDIs) and related sequences from yeast, plants and metazoa. Produce an alignment using MACAW (keep the file for future use). Try to define sequence motifs characteristic of all the sequences, as well as significant differences between the GDIs and the Rab escort proteins.
>ScGDI1 gi_6320983_ref_NP_011062.1 Regulates vesicle traffic in secretory pathway... ; Gdi1p [Saccharomyces cerevisiae]
MDQETIDTDYDVIVLGTGITECILSGLLSVDGKKVLHIDKQDHYGGEAASVTLSQLYEKFKQNPISKEER
ESKFGKDRDWNVDLIPKFLMANGELTNILIHTDVTRYVDFKQVSGSYVFKQGKIYKVPANEIEAISSPLM
GIFEKRRMKKFLEWISSYKEDDLSTHQGLDLDKNTMDEVYYKFGLGNSTKEFIGHAMALWTNDDYLQQPA
RPSFERILLYCQSVARYGKSPYLYPMYGLGELPQGFARLSAIYGGTYMLDTPIDEVLYKKDTGKFEGVKT
KLGTFKAPLVIADPTYFPEKCKSTGQRVIRAICILNHPVPNTSNADSLQIIIPQSQLGRKSDIYVAIVSD
AHNVCSKGHYLAIISTIIETDKPHIELEPAFKLLGPIEEKFMGIAELFEPREDGSKDNIYLSRSYDASSH
FESMTDDVKDIYFRVTGHPLVLKQRQEQEKQ
>ScMRS6 gi_6324946_ref_NP_015015.1 protein of the TCD/MRS6 family... (Rab escort protein); Mrs6p [Saccharomyces cerevisiae]
MLSPERRPSMAERRPSFFSFTQNPSPLVVPHLAGIEDPLPATTPDKVDVLIAGTGMVESVLAAALAWQGS
NVLHIDKNDYYGDTSATLTVDQIKRWVNEVNEGSVSCYKNAKLYVSTLIGSGKYSSRDFGIDLSPKILFA
KSDLLSILIKSRVHQYLEFQSLSNFHTYENDCFEKLTNTKQEIFTDQNLPLMTKRNLMKFIKFVLNWEAQ
TEIWQPYAERTMSDFLGEKFKLEKPQVFELIFSIGLCYDLNVKVPEALQRIRRYLTSFDVYGPFPALCSK
YGGPGELSQGFCRSAAVGGATYKLNEKLVSFNPTTKVATFQDGSKVEVSEKVIISPTQAPKDSKHVPQQQ
YQVHRLTCIVENPCTEWFNEGESAAMVVFPPGSLKSGNKEVVQAFILGAGSEICPEGTIVWYLSTTEQGP
RAEMDIDAALEAMEMALLRESSSGLENDEEIVQLTGNGHTIVNSVKLGQSFKEYVPRERLQFLFKLYYTQ
YTSTPPFGVVNSSFFDVNQDLEKKYIPGASDNGVIYTTMPSAEISYDEVVTAAKVLYEKIVGSDDDFFDL
DFEDEDEIQASGVANAEQFENAIDDDDDVNMEGSGEFVGEMEI
>DmGDI gi_480358_pir_S36746 GDP dissociation inhibitor - fruit fly (Drosophila melanogaster)
MDEEYDVDVLGTGLKECILSGIMLSVSGKKVLHIDRNKYYGGESASITPLEELFQRYRTGAARPRFGRGR
DWNVDLIPKFLMANGQLVKLLIHTGVTRYLEFKSIEGSYVYKGGKIAKVPVDQKEALASDLMGMFEKRRF
RNFLIYVQDFREDDPKTWKDFDPTKANMQGLYDKFGLDKNTQDFTGHALALFRDDEYLNEPAVNTIRRIK
LYSDSLARYGKSPYLYPMYGLGELPQGFARLSAIYGGTYMLDKPIDEIVLGEGGKVVGVRSGEEVAKCKQ
VYCDPSYVPRRLRKRGKVIRCICIQDHPGASTKDGLSTQIIIPQKQVGRKSDIYVSLVSSTHQVAAKGWF
VGMVSTTVETENPEVEIKPGLDLLEPIAQKFVTISDYLEPIDDGSESQIFISESYDATTHFETTCWDVLN
IFKRGTGETFDFSKDQGTSWVTRSSKRE
>DmRepP1 gi_17137652_ref_NP_477420.1 Rep-P1; rab escort protein [Drosophila melanogaster]
MLDDLPEQFDLVVIGTGFTESCIAAAGSRIGKSVLHLDSNEYYGDVWSSFSMDALCARLDQEVEPHSALR
NARYTWHSMEKESETDAQSWNRDSVLAKSRRFSLDLCPRILYAAGELVQLLIKSNICRYAEFRAVDHVCM
RHNGEIVSVPCSRSDVFNTKTLTIVEKRLLMKFLTACNDYGEDKCNEDSLEFRGRTFLEYLQAQRVTEKI
SSCVMQAIAMCGPSTSFEEGMQRTQRFLGSLGRYGNTPFLFPMYGCGELPQCFCRLCAVYGGIYCLKRAV
DDIALDSNSNEFLLSSAGKTLRAKNVVSAPGYTPVSKGIELKPHISRGLFISSSPLGNEELNKGGGGVNL
LRLLDNEGGREAFLIQLSHYTGACPEGLYIFHLTTPALSEDPASDLAIFTSQLFDQSDAQIIFSSYFTIA
AQSSKSPAAEHIYYTDPPTYELDYDAAIANARDIFGKMFPDADFLPRAPDPEEIVVDGEDPSALNEHTLP
EDLRAQLHDMQQATQEMDIQE
>HsGDI1 gi_4503971_ref_NP_001484.1 GDP dissociation inhibitor 1; mental retardation, X-linked... [Homo sapiens]
MDEEYDVIVLGTGLTECILSGIMSVNGKKVLHMDRNPYYGGESSSITPLEELYKRFQLLEGPPESMGRGR
DWNVDLIPKFLMANGQLVKMLLYTEVTRYLDFKVVEGSFVYKGGKIYKVPSTETEALASNLMGMFEKRRF
RKFLVFVANFDENDPKTFEGVDPQTTSMRDVYRKFDLGQDVIDFTGHALALYRTDDYLDQPCLETVNRIK
LYSESLARYGKSPYLYPLYGLGELPQGFARLSAIYGGTYMLNKPVDDIIMENGKVVGVKSEGEVARCKQL
ICDPSYIPDRVRKAGQVIRIICILSHPIKNTNDANSCQIIIPQNQVNRKSDIYVCMISYAHNVAAQGKYI
AIASTTVETTDPEKEVEPALELLEPIDQKFVAISDLYEPIDDGCESQVFCSCSYDATTHFETTCNDIKDI
YKRMAGTAFDFENMKRKQNDVFGEAEQ
>HsGDI2 gi_6598323_ref_NP_001485.2 GDP dissociation inhibitor 2; rab GDP-dissociation inhibitor, beta [Homo sapiens]
MNEEYDVIVLGTGLTECILSGIMSVNGKKVLHMDRNPYYGGESASITPLEDLYKRFKIPGSPPESMGRGR
DWNVDLIPKFLMANGQLVKMLLYTEVTRYLDFKVTEGSFVYKGGKIYKVPSTEAEALASSLMGLFEKRRF
RKFLVYVANFDEKDPRTFEGIDPKKTTMRDVYKKFDLGQDVIDFTGHALALYRTDDYLDQPCYETINRIK
LYSESLARYGKSPYLYPLYGLGELPQGFARLSAIYGGTYMLNKPIEEIIVQNGKVIGVKSEGEIARCKQL
ICDPSYVKDRVEKVGQVIRVICILSHPIKNTNDANSCQIIIPQNQVNRKSDIYVCMISFAHNVAAQGKYI
AIVSTTVETKEPEKEIRPALELLEPIEQKFVSISDLLVPKDLGTESQIFISRTYDATTHFETTCDDIKNI
YKRMTGSEFDFEEMKRKKNDIYGED
>HsREP2 gi_4502811_ref_NP_001812.1 choroideremia-like Rab escort protein 2; REP-2 ... [Homo sapiens]
MADNLPTEFDVVIIGTGLPESILAAACSRSGQRVLHIDSRSYYGGNWASFSFSGLLSWLKEYQQNNDIGE
ESTVVWQDLIHETEEAITLRKKDETIQHTEAFPYASQDMEDNVEEIGALQKNPSLGVSNTFTEVLDSALP
EESQLSYFNSDEMPAKHTQKSDTEISLEVTDVEESVEKEKYCGDKTCMHTVSDKDGDKDESKSTVEDKAD
EPIRNRITYSQIVKEGRRFNIDLVSKLLYSQGLLIDLLIKSDVSRYVEFKNVTRILAFREGKVEQVPCSR
ADVFNSKELTMVEKRMLMKFLTFCLEYEQHPDEYQAFRQCSFSEYLKTKKLTPNLQHFVLHSIAMTSESS
CTTIDGLNATKNFLQCLGRFGNTPFLFPLYGQGEIPQGFCRMCAVFGGIYCLRHKVQCFVVDKESGRCKA
IIDHFGQRINAKYFIVEDSYLSEETCSNVQYKQISRAVLITDQSILKTDLDQQTSILIVPPAEPGACAVR
VTELCSSTMTCMKDTYLVHLTCSSSKTAREDLESVVKKLFTPYTETEINEEELTKPRLLWALYFNMRDSS
GISRSSYNGLPSNVYVCSGPDCGLGNEHAVKQAETLFQEIFPTEEFCPPPPNPEDIIFDGDDKQPEAPGT
NNVVMAKLESSEESKNLESPEKHLQN
>CeGDI1 gi_17540906_ref_NP_502788.1 GDI-1 GDP dissociation inhibitor [Caenorhabditis elegans]
MDEEYDAIVLGTGLKECIISGMLSVSGKKVLHIDRNNYYGGESASLTPLEQLYEKFHGPQAKPQQEMGRG
RDWNVDLIPKFLMANGPLVKLLIHTGVTRYLEFKSIEASFVVKGGKIYKVPADEMEALATSLMGMFEKRR
FKKFLVWVQQFDENKEDTWQGLDPHNSTMQQVYEKFGLDENTADFTGHALALYRDDEHKNQPYAPAVEKI
RLYSDSLARYGKSPYLYPLYGLGELPQGFARLSAIYGGTYMLDKPVDEIVMENGKAIGVKCGDEIVRGKQ
IYCDPSYAKDRVKKTGQVVRAICLLNHPIPNTNDAQSCQIIIPQKQVGRHYDIYISCCSNTNMVTPKGWY
LAMVSTTVETANPEAEVLPGLQLLGAIAEKFIQISDVYEPSDLGSESQIFISQSYDATTHFETTCKDVLN
MFERGTTKEFDFTNITHLSLNDQE
>CeY67D2 gi_17556376_ref_NP_497423.1 Y67D2.1.p [Caenorhabditis elegans]
MDEKLPESVDVVVLGTGLPEAILASACARAGLSVLHLDRNEYYGGDWSSFTMSMVHEVTENQVKKLDSSE
ISKLSELLTENEQLIELGNREIVENIEMTWIPRGKDEEKPMKTQLEEASQMRRFSIDLVPKILLSKGAMV
QTLCDSQVSHYAEFKLVNRQLCPTETPEAGITLNPVPCSKGEIFQSNALSILEKRALMKFITFCTQWSTK
DTEEGRKLLAEHADRPFSEFLEQMGVGKTLQSFIINTIGILQQRPTAMTGMLASCQFMDSVGHFGPSPFL
FPLYGCGELSQCFCRLAAVFGSLYCLGRPVQAIVKKDGKITAVIANGDRVNCRYIVMSPRFVPETVPASS
TLKIERIVYATDKSIKEAEKEQLTLLNLASLRPDAAVSRLVEVGFEACTAPKGHFLVHATGTQEGETSVK
TIAEKIFEKNEVEPYWKMSFTANSMKFDTAGAENVVVAPPVDANLHYASVVEECRQLFCTTWPELDFLPR
AMKKEEEEEEEPETEEIAEN
>OsGDI2 AAB69871.1 GDP dissociation inhibitor protein OsGDI2 [Oryza sativa]
MDEEYDLIVLGTGLKECILSGLLSVDGLKVLHMDRNDYYGGDSTSLNLNQLWKRFRGEDKPPAHLGSSKD
YNVDMVPKFMMANGTLVRTLIHTDVTKYLSFKAVDGSYVFSKGKIHKVPATDMEALKSPLMGLFEKRRAR
NFFIYVQDYNEADPKTHQGLDLTTMTTRELIAKYGLSDDTVDFIGHALALHKDDRYLNEPAIDTVKRMKV
YAESLAPFQGGSPSIYPLYGLGELPQGHARLSAVYGGTYILNKPDCKVEFDMEGKVCGVTSEGETAKCKK
VVCDPSYLPNKVRKDRKVARAIAIMSHPIASTNDSHSVQIILPQKQLGRKSDMYVFCCSYTHNVAPKGKF
IAFVSTEAETDNPQSELKPGIDLLGQVDELFFDIYDRYEPVNEPSLDNCFVSTSYDATTHFETTVTDVLN
MYTLITGKAVDLSVDLSAASAAEEY
>OsGDI1 AAB69870.1 GDP dissociation inhibitor protein OsGDI1 [Oryza sativa]
MDEEYDVIVLGTGLKECILSGLLSVDGLKVLHMDRNDYYGGDSTSLNLNQLWKRFRGEDKPPAHLGASRD
YNVDMVPKFMMANGTLVRTLIHTDVTKYLSFKAVDGSYVFSKGKIHKVPATDMEALKSPLMGLFEKRRAR
NFFIYVQDYDEADPKTHQGLDLTTMTTRELIAKYGLSDDTVDFIGHALALHRDDRYLNEPAIDTVKRMKL
YAESLPRFQGGSPSIYPLYGLGELPQGFARLRAVYGGTYMLNKPDCKVEFDMEGKVCGVTSEGESAKCKK
VVCDPSYLPNKVRKIGKVARAIAIMSHPIANTNDSHSVQIILPQKQLGRKSDMYVFGCSYTHNVAPKGKF
IAFVSTEAETDHPESELKPGIDLLGQVDELFFDIYDRYEPVNEPSLDNCFVSTSYDATTHFETTVTDVLN
MYTLITGKTVDLSVDLSAASAAEKY
>OsREP NP_001042697.1 Os01g0269100 [Oryza sativa Japonica Group]
MADAPATGGGFPAQDYPTIDPTSFDVVLCGTGLPESVLAAACAAAGKTVLHVDPNPFYGSLFSSLPLPSL
PSFLSPSPSDDPAPSPSPSSAAAVDLRRRSPYSEVETSGAVPEPSRRFTADLVGPRLLYCADEAVDLLLR
SGGSHHVEFKSVEGGTLLYWDGDLYPVPDSRQAIFKDTTLQLREKNLLFRFFKLVQAHIAASAAGAAAAG
EGEASGRLPDEDLDLPFVEFLKRQNLSPKMRAVVLYAIAMADYDQDGVESCERLLTTREGVKTIALYSSS
IGRFANAEGAFIYPMYGHGELPQAFCRCAAVKGIANASHSTSC 
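
Before (not instead of) building the alignment, it can be useful to get a rough idea of where strictly conserved stretches lie. The sketch below simply lists short peptides (k-mers) that occur as exact matches in every sequence. It is a crude aid, not a replacement for MACAW: it ignores conservative substitutions and positional context, and the function name shared_kmers is made up for this example.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_kmers(seqs, k):
    """Return the k-mers that occur (exactly) in every sequence."""
    if not seqs:
        return set()
    common = kmers(seqs[0], k)
    for s in seqs[1:]:
        common &= kmers(s, k)   # keep only k-mers also present in this sequence
    return common

if __name__ == "__main__":
    # Hypothetical short fragments; use the full sequences above in practice.
    fragments = ["AVLHID", "KVLHMD", "TVLHLD"]
    print(sorted(shared_kmers(fragments, 3)))
```

Running it separately on the GDIs and on the Rab escort proteins, and comparing the two result sets, gives a first hint of group-specific stretches to look for in the alignment.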
    


Searching databases of known patterns - a few entry points


Tasks

6.3

Take at least one of the motifs produced in Tasks 6.1 or 6.2 and use it to search Swiss-Prot/UniProt with one of the tools listed above. Restrict the search taxonomically as closely to Arabidopsis thaliana as possible. Watch out for pattern formats!
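
To illustrate the point about pattern formats: the same motif is written differently in PROSITE syntax and as a regular expression, and some tools accept only one of the two. The sketch below converts the common PROSITE elements (x, [..], {..}, repeat counts in parentheses, and the < and > anchors) into a Python regular expression. The function name prosite_to_regex is invented for this example, and rare constructs such as an anchor inside square brackets are not handled.

```python
import re

def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")   # a trailing period ends a PROSITE pattern
    translation = {
        "-": "",     # element separator has no regex equivalent
        "x": ".",    # any residue
        "{": "[^",   # {ABC} = any residue except A, B, C
        "}": "]",
        "(": "{",    # x(2,4) = repeat count
        ")": "}",
        "<": "^",    # N-terminal anchor
        ">": "$",    # C-terminal anchor
    }
    return "".join(translation.get(ch, ch) for ch in pattern)

if __name__ == "__main__":
    # Hypothetical LRR-like pattern, for illustration only.
    regex = prosite_to_regex("L-x(2)-L-x-L-x(2)-N-x-[LI]")
    print(regex)
    match = re.search(regex, "AALKDLDLSNNKLAA")
    print(match.group(0) if match else "no match")
```

The conversion is purely character by character, so it should be checked on a sequence known to contain the motif before it is used for a database search.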