Identified secondary metabolite clusters

Cluster | Type | From | To | Size (kb) | Core domains | Product/substrate predicted by subgroup | Most similar known cluster | MIBiG BGC-ID
The following clusters are from record NC_066579.1:
Cluster 1 | Alkaloid-Fatty_acid | 400266933 | 400951382 | 684.45 | BBE, FAD_binding_4, FA_hydroxylase | - | - | -
Cluster 2 | Alkaloid-Saccharide | 457936123 | 458129637 | 193.51 | Pyridoxal_deC, UDPGT_2 | *saccharide-2, cyanogenic glucoside, monoterpenoid | - | -
The following clusters are from record NC_066580.1:
Cluster 3 | Saccharide | 1534884 | 1700613 | 165.73 | Acetyltransf_1, Lyase_aromatic, UDPGT_2 | - | - | -
Cluster 4 | Saccharide | 22377888 | 23492931 | 1115.04 | Transferase, UDPGT_2 | flavonoid | - | -
Cluster 5 | Cyclopeptide | 115080239 | 120517543 | 5437.30 | BURP | - | - | -
Cluster 6 | Saccharide | 136017328 | 136286230 | 268.90 | UDPGT_2, p450 | - | - | -
Cluster 7 | Cyclopeptide | 319960237 | 324407479 | 4447.24 | BURP | - | - | -
Cluster 8 | Polyketide | 490186082 | 490582249 | 396.17 | AMP-binding, Acetyltransf_1, Chal_sti_synt_C, Chal_sti_synt_N | - | - | -
The following clusters are from record NC_066581.1:
Cluster 9 | Fatty_acid | 300701301 | 301712231 | 1010.93 | Epimerase, FA_hydroxylase, PALP, p450 | - | - | -
Cluster 10 | Saccharide | 318388410 | 318943107 | 554.70 | Aldo_ket_red, Methyltransf_11, UDPGT_2 | flavonoid-5, oleananes-5 | - | -
Cluster 11 | Cyclopeptide | 492976799 | 495438359 | 2461.56 | BURP | - | - | -
The following clusters are from record NC_066582.1:
Cluster 12 | Saccharide | 46211827 | 46466121 | 254.29 | 2OG-FeII_Oxy, DIOX_N, Glyco_hydro_1, Peptidase_S10 | - | - | -
Cluster 13 | Saccharide | 176346356 | 177168869 | 822.51 | UDPGT_2, p450, polyprenyl_synt | - | - | -
Cluster 14 | Cyclopeptide | 275870531 | 282150978 | 6280.45 | BURP | - | - | -
Cluster 15 | Cyclopeptide | 378556867 | 393027082 | 14470.22 | BURP | - | - | -
Cluster 16 | Cyclopeptide | 386776475 | 389381266 | 2604.79 | BURP | - | - | -
Cluster 17 | Cyclopeptide | 485797570 | 487497883 | 1700.31 | BURP | - | - | -
Cluster 18 | Saccharide | 496896586 | 497081872 | 185.29 | UDPGT_2, p450 | small phenolic-4 | - | -
The following clusters are from record NC_066583.1:
Cluster 19 | Cyclopeptide | 12167775 | 16798869 | 4631.09 | BURP | - | - | -
Cluster 20 | Cyclopeptide | 13274831 | 18413883 | 5139.05 | BURP | - | - | -
Cluster 21 | Cyclopeptide | 15983889 | 18454160 | 2470.27 | BURP | - | - | -
Cluster 22 | Cyclopeptide | 16808328 | 19526976 | 2718.65 | BURP | - | - | -
Cluster 23 | Saccharide | 102550203 | 103091286 | 541.08 | Aminotran_1_2, Glycos_transf_2, SE | - | - | -
Cluster 24 | Polyketide | 194555028 | 195494026 | 939.00 | 2OG-FeII_Oxy, Chal_sti_synt_C, Chal_sti_synt_N, DIOX_N, Methyltransf_11 | - | - | -
Cluster 25 | Polyketide | 214812041 | 217867500 | 3055.46 | Chal_sti_synt_C, Chal_sti_synt_N, adh_short | - | - | -
Cluster 26 | Saccharide | 246094858 | 246596422 | 501.56 | NAD_binding_1, Peptidase_S10, UDPGT_2 | cyanogenic glucoside-5, monoterpenoid-5 | - | -
Cluster 27 | Transporter_associated-Fatty_acid | 248378010 | 249024385 | 646.38 | ADH_N, ADH_zinc_N, FA_desaturase_2, LTP_2, Lipoxygenase, adh_short | - | - | -
Cluster 28 | Cyclopeptide | 253965621 | 254775003 | 809.38 | BURP | - | - | -
Cluster 29 | Saccharide | 586402507 | 586727140 | 324.63 | Cellulose_synt, p450 | - | - | -
The following clusters are from record NC_066584.1:
Cluster 30 | Cyclopeptide | 14105463 | 15825182 | 1719.72 | BURP | - | - | -
Cluster 31 | Cyclopeptide | 14928915 | 15617745 | 688.83 | BURP | - | - | -
Cluster 32 | Cyclopeptide | 15068227 | 15949968 | 881.74 | BURP | - | - | -
Cluster 33 | Cyclopeptide | 34223403 | 35401889 | 1178.49 | BURP | - | - | -
Cluster 34 | Terpene | 265558929 | 268252320 | 2693.39 | Terpene_synth, Terpene_synth_C, Transferase, p450 | - | - | -
The following clusters are from record NC_066585.1:
Cluster 35 | Alkaloid-Fatty_acid | 161969102 | 162676728 | 707.63 | BBE, FAD_binding_4, FA_hydroxylase | - | - | -
Cluster 36 | Terpene | 181259511 | 182181420 | 921.91 | Epimerase, Terpene_synth, Terpene_synth_C | - | - | -
Cluster 37 | Saccharide | 233677316 | 234181648 | 504.33 | Epimerase, UDPGT_2, adh_short_C2 | flavonoid | - | -
Cluster 38 | Terpene | 239871631 | 241273047 | 1401.42 | Epimerase, Peptidase_S10, Terpene_synth, Terpene_synth_C | - | - | -
Cluster 39 | Cyclopeptide | 290755896 | 298199303 | 7443.41 | BURP | - | - | -
Cluster 40 | Cyclopeptide | 291424092 | 294497725 | 3073.63 | BURP | - | - | -
Cluster 41 | Cyclopeptide | 292231994 | 294566446 | 2334.45 | BURP | - | - | -
Cluster 42 | Polyketide | 304249747 | 305484261 | 1234.51 | Chal_sti_synt_C, Epimerase, FAE1_CUT1_RppA, NAD_binding_4, p450 | - | - | -
Cluster 43 | Saccharide | 471104851 | 472933702 | 1828.85 | AMP-binding, Glyco_hydro_1, Lipoxygenase | - | - | -
Cluster 44 | Lignan | 489750385 | 490502114 | 751.73 | Dirigent, p450 | - | - | -
Cluster 45 | Saccharide | 505507535 | 506336966 | 829.43 | UDPGT_2, p450 | flavonoid, oleananes | - | -
Cluster 46 | Lignan-Saccharide | 536942171 | 537667057 | 724.89 | Dirigent, Glyco_hydro_1, Methyltransf_11 | - | - | -
Cluster 47 | Terpene | 540270736 | 540492999 | 222.26 | Epimerase, SQHop_cyclase_C, SQHop_cyclase_N, p450 | beta-amyrin-2, triterpene-2 | yossoside I/yossoside II/yossoside III/yossoside IV/yossos... (80% of genes show similarity) | BGC0002402.2_c1
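
The Size (kb) values in the overview appear to be the span between the From and To coordinates, divided by 1000 and rounded to two decimals. The following is a minimal sketch of that arithmetic; the function name `cluster_size_kb` is hypothetical and not part of the report or of any tool.

```python
def cluster_size_kb(start: int, end: int) -> float:
    """Return the span between two coordinates in kilobases (2 decimals).

    Assumption: size = (To - From) / 1000, which agrees with the rows
    checked below; the report itself does not state the formula.
    """
    return round((end - start) / 1000, 2)


# Coordinates copied from the overview (From, To).
for name, start, end in [
    ("Cluster 1", 400266933, 400951382),
    ("Cluster 3", 1534884, 1700613),
    ("Cluster 47", 540270736, 540492999),
]:
    print(name, cluster_size_kb(start, end))
```

The same check can be used to split the From and To columns when a copy of the overview has lost its column separators.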

NC_066579 - Cluster 1 - Alkaloid-fatty_acid

Gene cluster description

NC_066579 - Gene Cluster 1. Type = alkaloid-fatty_acid. Location: 400266933 - 400951382 nt.
pHMM detection rules used:
plants/alkaloid: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Bet_v_1/Cu_amine_oxid/Str_synth/BBE/Orn_DAP_Arg_deC/Pyridoxal_deC]))
plants/fatty_acid: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[FA_desaturase/FA_desaturase_2/FA_hydroxylase/CER1-like_C]) or minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,ECH_2]) or 
minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,AMP-binding]))
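
One way to read a rule of the form `minimum(3,[...],[...])` is sketched below. This is an interpretation, not something the report confirms: it assumes the first list names the domain types that count toward the minimum, that a second list written with `/` means "at least one of these", and that a second list written with `,` means "all of these". The function `rule_matches`, the shortened `ALLOWED` list, and the example domain sets are hypothetical.

```python
def rule_matches(domains, n, allowed, core_any=(), core_all=()):
    """Evaluate one minimum(n, allowed, core) clause under the stated assumptions."""
    hits = set(domains) & set(allowed)
    if len(hits) < n:                            # fewer than n distinct listed domain types
        return False
    if core_any and not hits & set(core_any):    # "/" list: need at least one
        return False
    if core_all and not set(core_all) <= hits:   # "," list: need every entry
        return False
    return True


# Shortened, illustrative subset of the long domain list in the rule text.
ALLOWED = ["BBE", "FA_hydroxylase", "Pyridoxal_deC", "p450", "Transferase", "UDPGT_2"]
# Core alternatives copied from the plants/alkaloid rule.
ALKALOID_CORE = ["Bet_v_1", "Cu_amine_oxid", "Str_synth", "BBE",
                 "Orn_DAP_Arg_deC", "Pyridoxal_deC"]

# Hypothetical sets of domains found in a candidate region.
print(rule_matches({"BBE", "p450", "Transferase"}, 3, ALLOWED, core_any=ALKALOID_CORE))
print(rule_matches({"p450", "Transferase", "UDPGT_2"}, 3, ALLOWED, core_any=ALKALOID_CORE))
```

Clauses joined by `or` in a rule would then be checked one after another, with the rule matching if any clause does.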

Legend:

Only available when smCOG analysis was run
biosynthetic genes
transport-related genes
regulatory genes
other genes

Similar gene clusters

NC_066579 - Cluster 2 - Alkaloid-saccharide

Gene cluster description

NC_066579 - Gene Cluster 2. Type = alkaloid-saccharide. Location: 457936123 - 458129637 nt.
pHMM detection rules used:
plants/alkaloid: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Bet_v_1/Cu_amine_oxid/Str_synth/BBE/Orn_DAP_Arg_deC/Pyridoxal_deC]))
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066580 - Cluster 3 - Saccharide

Gene cluster description

NC_066580 - Gene Cluster 3. Type = saccharide. Location: 1534884 - 1700613 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066580 - Cluster 4 - Saccharide

Gene cluster description

NC_066580 - Gene Cluster 4. Type = saccharide. Location: 22377888 - 23492931 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066580 - Cluster 5 - Cyclopeptide

Gene cluster description

NC_066580 - Gene Cluster 5. Type = cyclopeptide. Location: 115080239 - 120517543 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127122177
Repeat occurs 3 times in a sequence of 190 amino acids
Location between 117318496 and 117322021
Coverage of 9.47 %
Instances:
EAEAKT | EAEEKA | EAEAKA |
pattern: EAE[EA]K[TA]
MGLVSLIELTSYNEALQDKELIMEMEKELDQFEKNDVWDLVPKPKGTHVIGTKEAGERLHDRL
AREAEAKTCREAEEKARLEEEESAREVAEKAAAEAVAVVEAEAKAKADAEKAARIATEEAGKAR
YTALTQGEKSHSDFAPLVLKTLEELQKEQQIMRARLDQQDSVNSNIQKLLTQLLQRMPPPPNP
Repeat found in LOC127122182
Repeat occurs 6 times in a sequence of 265 amino acids
Location between 117692849 and 117693784
Coverage of 15.85 %
Instances:
EAEERAR | EAEEKAR | EAEEKAV | EAEAEAK | EAEAKAK
EAEEATR |
pattern: EAE[EA][EAKR][TA][VKR]
MQGWETYFQRLYEHPNPPPEHPNPTTSKQPQTPPPAQQPNRPPKRPIYSEPQQTHSPFEPTPQ
PEQTTQSPSSVPTPITFVSSITPTLNCSAPNSPSTSSLASATEPETTLPTLEEAILVFVESSVE
KVKDTGIRLQERLAREAEERARKEAEEKARQEEEQRIREAEEKAVADAEAEAEAKAKAEAEEAT
RITVEEATKAKADALTQGEHTNYGFVPLVLKTLEELQKEQQVVRARLDQHDSVNINIQNMLSQL
LQRMRPLPNP
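
The reported pattern and coverage can be reproduced with a regular expression. The sketch below uses the LOC127122177 sequence and pattern from this section; the helper name `repeat_coverage` is hypothetical, and the coverage formula (residues inside non-overlapping matches divided by sequence length) is an assumption that happens to agree with the reported 9.47 %.

```python
import re


def repeat_coverage(sequence: str, pattern: str):
    """Return the non-overlapping matches and the percent of residues they cover."""
    matches = re.findall(pattern, sequence)
    covered = sum(len(m) for m in matches)
    return matches, round(100 * covered / len(sequence), 2)


# Protein sequence of LOC127122177 as printed in the Repeatfinder output.
seq = (
    "MGLVSLIELTSYNEALQDKELIMEMEKELDQFEKNDVWDLVPKPKGTHVIGTKEAGERLHDRL"
    "AREAEAKTCREAEEKARLEEEESAREVAEKAAAEAVAVVEAEAKAKADAEKAARIATEEAGKAR"
    "YTALTQGEKSHSDFAPLVLKTLEELQKEQQIMRARLDQQDSVNSNIQKLLTQLLQRMPPPPNP"
)

matches, coverage = repeat_coverage(seq, r"EAE[EA]K[TA]")
print(matches, coverage)
```

The same calculation gives 42 of 265 residues, or 15.85 %, for the six 7-residue instances reported for LOC127122182.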

Similar gene clusters

NC_066580 - Cluster 6 - Saccharide

Gene cluster description

NC_066580 - Gene Cluster 6. Type = saccharide. Location: 136017328 - 136286230 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066580 - Cluster 7 - Cyclopeptide

Gene cluster description

NC_066580 - Gene Cluster 7. Type = cyclopeptide. Location: 319960237 - 324407479 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output

Similar gene clusters

NC_066580 - Cluster 8 - Polyketide

Gene cluster description

NC_066580 - Gene Cluster 8. Type = polyketide. Location: 490186082 - 490582249 nt.
pHMM detection rules used:
plants/polyketide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Chal_sti_synt_C/Chal_sti_synt_N]) or minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Thr_dehydrat_C]) or 
minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Chal_sti_synt_C,Chal_sti_synt_N]))

Similar gene clusters

NC_066581 - Cluster 9 - Fatty_acid

Gene cluster description

NC_066581 - Gene Cluster 9. Type = fatty_acid. Location: 300701301 - 301712231 nt.
pHMM detection rules used:
plants/fatty_acid: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[FA_desaturase/FA_desaturase_2/FA_hydroxylase/CER1-like_C]) or minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,ECH_2]) or 
minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,AMP-binding]))

Similar gene clusters

NC_066581 - Cluster 10 - Saccharide

Gene cluster description

NC_066581 - Gene Cluster 10. Type = saccharide. Location: 318388410 - 318943107 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066581 - Cluster 11 - Cyclopeptide

Gene cluster description

NC_066581 - Gene Cluster 11. Type = cyclopeptide. Location: 492976799 - 495438359 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output

Similar gene clusters

NC_066582 - Cluster 12 - Saccharide

Gene cluster description

NC_066582 - Gene Cluster 12. Type = saccharide. Location: 46211827 - 46466121 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066582 - Cluster 13 - Saccharide

Gene cluster description

NC_066582 - Gene Cluster 13. Type = saccharide. Location: 176346356 - 177168869 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066582 - Cluster 14 - Cyclopeptide

Gene cluster description

NC_066582 - Gene Cluster 14. Type = cyclopeptide. Location: 275870531 - 282150978 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127075102
Repeat occurs 3 times in a sequence of 532 amino acids
Location between 280138140 and 280140790
Coverage of 3.38 %
Instances:
DVDRAV | DVDKIA | DVDQAV |
pattern: DVD[KQR][IA][VA]
MASSMRISRFISRSVSSSSVFSRGLSSRVCKYSTNASSIEEPIKPTFHVDHTQLLIDGKFVDS
ASGKTFPTLDPRNGQVIAHVSEGQHEDVDRAVAAARKAFDHGPWPKMTAYERQRILLRAADLLE
KHNNELATLETWDNGKPYEQAAEIEVPMLTRLIRYYAGWADKIHGLTVPADGPHQVHTLHEPIG
VAGQIIPWNFPLLMFGWKVGPALACGNTIVLKTSEQTPLSALYAAKLFHEAGLPPGVLNIVSGF
GPIAGAALASHMDVDKIAFTGSTATGKIILELAAKSNLKAATLELGGKSPFIVCEDADVDQAVE
LAHFALFFNQGQCCCAGSRTFVHESVYDEFVEKAKARALKRVVGDPFKSGVEQGPQIDSKQFEK
ILKYINSGVENGATLEAGGEKIGNKGFYIQPTVFSNVQDEMLIAKDEIFGPVQTILKFKEIDEV
IRRANNSRFGLAAGIFTKSIDTANTLTRALRVGSVWVNCYDVFDATIPFGGYKMSGQGREKGEY
SLKNYLQVKAVVTPLKNPAWL
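
The reported patterns look like column-wise summaries of the listed instances: a position where all instances agree is written as a single residue, and a position where they differ is written as a character class. The sketch below rebuilds such a pattern under that assumption. The function `consensus_pattern` is hypothetical, and it sorts residues inside each class, so the order within brackets can differ from the report (for example `[AI]` instead of `[IA]`) while matching the same strings.

```python
import re


def consensus_pattern(instances):
    """Build a regex with one literal or character class per alignment column."""
    if len({len(s) for s in instances}) != 1:
        raise ValueError("all instances must have the same length")
    parts = []
    for column in zip(*instances):
        residues = sorted(set(column))
        parts.append(residues[0] if len(residues) == 1 else "[" + "".join(residues) + "]")
    return "".join(parts)


# Instances reported for LOC127075102.
instances = ["DVDRAV", "DVDKIA", "DVDQAV"]
pattern = consensus_pattern(instances)
print(pattern)
print(all(re.fullmatch(pattern, s) for s in instances))
```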

Similar gene clusters

NC_066582 - Cluster 15 - Cyclopeptide

Gene cluster description

NC_066582 - Gene Cluster 15. Type = cyclopeptide. Location: 378556867 - 393027082 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127075589
Repeat occurs 7 times in a sequence of 473 amino acids
Location between 387342525 and 387345756
Coverage of 10.36 %
Instances:
NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTT
NQPFGLS | NQPFETQ |
pattern: NQPF[GE][LT][SQTF]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTRTNQP
FGTFVWWYKKQTGSLTTRSDKATKIETTATNQPFGTFVWWNEKEFDRPTIRSDKLTKIETTRAN
QPFGTFVWWYKKEIERPTIRSDKTTKIETTTINQPFGTFVWWNKKETDRPTIRSDKVTKIETTR
TNQPFGTTAWWHKKETEKETEIETENNLLEENQPFGLSEQGKKETEKSNQPFETQTSDEKEAHV
LNNYCGTPSAIGEHKHCALSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVM
CHRLNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVS
VCHFIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075590
Repeat occurs 7 times in a sequence of 470 amino acids
Location between 387462977 and 387475338
Coverage of 10.43 %
Instances:
NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTT
NQPFGLS | NQPFETQ |
pattern: NQPF[GE][LT][SQTF]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGTFVWWYKKETGRPTTRSDKETKIETTATNQPFGTFVWWNKKEFDRPTTIRSDKLTKIETTRA
NQPFGTFVWWYKKEIERPTIRSDKTTKIETTTTNQPFGTFVWWNKKETDRPTIRSDKVTKIETT
RTNQPFGTTAWWHKKETEIETENNLLEENQPFGLSEQGKKETKKSNQPFETQTLDEKEAHVLNS
YCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVICHR
LNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVSVCH
FIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075590
Repeat occurs 7 times in a sequence of 470 amino acids
Location between 387471988 and 387475338
Coverage of 10.43 %
Instances:
NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTT
NQPFGLS | NQPFETQ |
pattern: NQPF[GE][LT][SQTF]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGTFVWWYKKETGRPTTRSDKETKIETTATNQPFGTFVWWNKKEFDRPTTIRSDKLTKIETTRA
NQPFGTFVWWYKKEIERPTIRSDKTTKIETTTTNQPFGTFVWWNKKETDRPTIRSDKVTKIETT
RTNQPFGTTAWWHKKETEIETENNLLEENQPFGLSEQGKKETKKSNQPFETQTLDEKEAHVLNS
YCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVICHR
LNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVSVCH
FIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075591
Repeat occurs 5 times in a sequence of 371 amino acids
Location between 387592445 and 387595987
Coverage of 9.43 %
Instances:
NQPFGTF | NQPFGTF | NQPFGTF | NQPFGLS | NQPFETQ

pattern: NQPF[GE][LT][SQF]
MEFKNLSVLALFFLTFLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHHLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGTFVWSDKLTKIETTRANQPFGTFVWCDKTTKIETTTTNQPFGTFVWWNKKETDKENQPFGLS
EQGKKETKKSNQPFETQTLDEKEAHVLNSYCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKV
MSSSFSQSQDKYVVQEVNKIGDKAVMCHRLNFEEVVFYCHVVNATTTYMVPLMASDGTISKALT
ICHHDTRGMNPKVLNEVLNVKPGNVSVCHFIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075602
Repeat occurs 9 times in a sequence of 575 amino acids
Location between 389027355 and 389029325
Coverage of 10.96 %
Instances:
KPTFKDM | KPTFIEK | KPTSIEK | KPTLIER | KPTFIER
KPTFIER | KPTFVER | KPTFIER | KPTSIEK |
pattern: KPT[LSF][VKI][ED][MKR]
MAFNSQRTFRAPTFRFPLSVRRVYETWEPKSDVKETSETYFLHVYLPGYTKNQPKITLEDASQ
KLRITGERPIEGDKWKKFDQTYPVPENSDVGTLEAKFEQETLILKMQKKPISQSQVVAPKQQVE
KSQQEPLSNEGLDGTKLEKVQETIQPTQSTTKFEESTQDMNSDLPQTQSIEKKRQETIHDDTLS
QIAKETISNDTTKTQIGENSQQQFELKPTFKDMTKLQFNEKAQKGPEEFEPKPTFIEKIKTQID
EIAQKGQEEFEKKSTFIEKVKTQISEKAHKAQEEFALKPTSIEKAKTEPNEKPQISEEEFEKKP
TLIERIITQIAERAQKGPKEIEAKPTFIERTNKQIDENVQKVQEEFESKPTFIERTKTQIDEKA
QNGLEEFEKKPTFVERIKTRIIEEAQKVQEEFEAKPTFIERIKTQIDEKVQKDKEEFEPKPTSI
EKAKTETNKKLQKGPEEFEPKPIEKIVTKENLEKNIVKNSDEDAEKKRILVKEETKEKKEKPYE
SSKTLVGVKNQNIKENETEKEELPTPKVTESKWLGEERHLIENVSVAILVIAAFGAYISYKFSS
Repeat found in LOC127075617
Repeat occurs 9 times in a sequence of 508 amino acids
Location between 391600911 and 391607480
Coverage of 12.4 %
Instances:
CKFGESC | CKFDHPI | CKFGSKC | CKFNHPN | CKFHHPK
CKFGASC | CKFGERC | CKFGATC | CKFDHPP |
pattern: CKF[GNHD][ESHA][SKTPR][KPNCI]
MENRIYASYSPTNYTVSGAPSTSPTRFYNPDTMFLAHYRRTAEAAAAAAAIDIAPPGVSSTAN
FLCHTNPWASAFTAANVASASLGLKRSSDALYHPTILSTIGQNEAWYTTNSLAKRPRYETGSTL
PIYPHRPGERDCAHYMLTRTCKFGESCKFDHPIWVPDGGIPDWKEVPNNVPSETLPERPGEPDC
PFFLKTQKCKFGSKCKFNHPNVPSENADVSGLPERPLEPPCAFYLKTGKCKYGVACKFHHPKDI
QIQFYDELSRTVEQTQTNSSVFDGAIGDAQPTKSLISPLLHNSKGLPVRQGEVDCPFYMKTGSC
KFGASCRYNHPDMNAINPAMSALAPSVLASSAAANLNIGVINPAASFYQAFDPRLSNPMAQVGM
TENIYPQRPGMIECDFFMKTGICKFGERCKYHHPVDRSTSSLSKLQSNVKLTPAGLPRREDVEL
CPYYLKTGTCKFGATCKFDHPPPGEVMEKAKSQGTSTTNGEEEETNVNVAGSAPEQCLDHV
Repeat found in LOC127075617
Repeat occurs 7 times in a sequence of 350 amino acids
Location between 391600911 and 391607245
Coverage of 14.0 %
Instances:
CKFGSKC | CKFNHPN | CKFHHPK | CKFGASC | CKFGERC
CKFGATC | CKFDHPP |
pattern: CKF[GDHN][ESHA][SKTPR][NKCP]
MLPLLHLVLNDPLTVPNNVPSETLPERPGEPDCPFFLKTQKCKFGSKCKFNHPNVPSENADVS
GLPERPLEPPCAFYLKTGKCKYGVACKFHHPKDIQIQFYDELSRTVEQTQTNSSVFDGAIGDAQ
PTKSLISPLLHNSKGLPVRQGEVDCPFYMKTGSCKFGASCRYNHPDMNAINPAMSALAPSVLAS
SAAANLNIGVINPAASFYQAFDPRLSNPMAQVGMTENIYPQRPGMIECDFFMKTGICKFGERCK
YHHPVDRSTSSLSKLQSNVKLTPAGLPRREDVELCPYYLKTGTCKFGATCKFDHPPPGEVMEKA
KSQGTSTTNGEEEETNVNVAGSAPEQCLDHV
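
The coverage figures in the records above appear to follow from the other header fields: number of instances times repeat length, divided by sequence length. The sketch below checks that reading against values copied from the records above. The function name `repeat_coverage` is illustrative and is not part of the Repeatfinder output, and the formula itself is an inference from the reported numbers rather than documented behaviour.

```python
def repeat_coverage(n_instances: int, repeat_len: int, seq_len: int) -> float:
    """Percent coverage, assuming coverage = instances * repeat length / sequence length."""
    return round(100.0 * n_instances * repeat_len / seq_len, 2)


# (instances, repeat length, sequence length, coverage reported in the record)
records = [
    (7, 7, 470, 10.43),  # LOC127075590
    (5, 7, 371, 9.43),   # LOC127075591
    (9, 7, 575, 10.96),  # LOC127075602
    (9, 7, 508, 12.4),   # LOC127075617
    (7, 7, 350, 14.0),   # LOC127075617, shorter sequence
]

for n, length, seq_len, reported in records:
    computed = repeat_coverage(n, length, seq_len)
    print(n, length, seq_len, computed, "matches" if computed == reported else "differs")
```

Because the formula multiplies instance count by repeat length, overlapping instances would be counted more than once under this reading.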

NC_066582 - Cluster 16 - Cyclopeptide

Gene cluster description

NC_066582 - Gene Cluster 16. Type = cyclopeptide. Location: 386776475 - 389381266 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)
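
The cluster span can be derived from the location line. The following is a minimal sketch, assuming the description line keeps the `Location: START - END nt.` layout shown above; `cluster_size_kb` is a hypothetical helper, not a plantiSMASH function.

```python
import re


def cluster_size_kb(description: str) -> float:
    """Return (end - start) / 1000 for a 'Location: START - END nt' description line."""
    match = re.search(r"Location:\s*(\d+)\s*-\s*(\d+)\s*nt", description)
    if match is None:
        raise ValueError("no 'Location: START - END nt' field found")
    start, end = int(match.group(1)), int(match.group(2))
    return round((end - start) / 1000, 2)


line = "NC_066582 - Gene Cluster 16. Type = cyclopeptide. Location: 386776475 - 389381266 nt."
print(cluster_size_kb(line))
```

For Cluster 16 this gives 2604.79, which agrees with the Size (kb) value listed for that cluster in the overview table.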

Legend: biosynthetic genes, transport-related genes, regulatory genes, other genes
(Only available when smCOG analysis was run.)

Repeatfinder output


Repeat found in LOC127075589
Repeat occurs 7 times in a sequence of 473 amino acids
Location between 387342525 and 387345756
Coverage of 10.36 %
Instances:
NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTT
NQPFGLS | NQPFETQ |
pattern: NQPF[GE][LT][SQTF]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTRTNQP
FGTFVWWYKKQTGSLTTRSDKATKIETTATNQPFGTFVWWNEKEFDRPTIRSDKLTKIETTRAN
QPFGTFVWWYKKEIERPTIRSDKTTKIETTTINQPFGTFVWWNKKETDRPTIRSDKVTKIETTR
TNQPFGTTAWWHKKETEKETEIETENNLLEENQPFGLSEQGKKETEKSNQPFETQTSDEKEAHV
LNNYCGTPSAIGEHKHCALSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVM
CHRLNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVS
VCHFIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075590
Repeat occurs 7 times in a sequence of 470 amino acids
Location between 387462977 and 387475338
Coverage of 10.43 %
Instances:
NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTT
NQPFGLS | NQPFETQ |
pattern: NQPF[GE][LT][SQTF]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGTFVWWYKKETGRPTTRSDKETKIETTATNQPFGTFVWWNKKEFDRPTTIRSDKLTKIETTRA
NQPFGTFVWWYKKEIERPTIRSDKTTKIETTTTNQPFGTFVWWNKKETDRPTIRSDKVTKIETT
RTNQPFGTTAWWHKKETEIETENNLLEENQPFGLSEQGKKETKKSNQPFETQTLDEKEAHVLNS
YCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVICHR
LNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVSVCH
FIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075590
Repeat occurs 7 times in a sequence of 470 amino acids
Location between 387471988 and 387475338
Coverage of 10.43 %
Instances:
NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTF | NQPFGTT
NQPFGLS | NQPFETQ |
pattern: NQPF[GE][LT][SQTF]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGTFVWWYKKETGRPTTRSDKETKIETTATNQPFGTFVWWNKKEFDRPTTIRSDKLTKIETTRA
NQPFGTFVWWYKKEIERPTIRSDKTTKIETTTTNQPFGTFVWWNKKETDRPTIRSDKVTKIETT
RTNQPFGTTAWWHKKETEIETENNLLEENQPFGLSEQGKKETKKSNQPFETQTLDEKEAHVLNS
YCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVICHR
LNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVSVCH
FIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075591
Repeat occurs 5 times in a sequence of 371 amino acids
Location between 387592445 and 387595987
Coverage of 9.43 %
Instances:
NQPFGTF | NQPFGTF | NQPFGTF | NQPFGLS | NQPFETQ

pattern: NQPF[GE][LT][SQF]
MEFKNLSVLALFFLTFLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHHLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGTFVWSDKLTKIETTRANQPFGTFVWCDKTTKIETTTTNQPFGTFVWWNKKETDKENQPFGLS
EQGKKETKKSNQPFETQTLDEKEAHVLNSYCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKV
MSSSFSQSQDKYVVQEVNKIGDKAVMCHRLNFEEVVFYCHVVNATTTYMVPLMASDGTISKALT
ICHHDTRGMNPKVLNEVLNVKPGNVSVCHFIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075602
Repeat occurs 9 times in a sequence of 575 amino acids
Location between 389027355 and 389029325
Coverage of 10.96 %
Instances:
KPTFKDM | KPTFIEK | KPTSIEK | KPTLIER | KPTFIER
KPTFIER | KPTFVER | KPTFIER | KPTSIEK |
pattern: KPT[LSF][VKI][ED][MKR]
MAFNSQRTFRAPTFRFPLSVRRVYETWEPKSDVKETSETYFLHVYLPGYTKNQPKITLEDASQ
KLRITGERPIEGDKWKKFDQTYPVPENSDVGTLEAKFEQETLILKMQKKPISQSQVVAPKQQVE
KSQQEPLSNEGLDGTKLEKVQETIQPTQSTTKFEESTQDMNSDLPQTQSIEKKRQETIHDDTLS
QIAKETISNDTTKTQIGENSQQQFELKPTFKDMTKLQFNEKAQKGPEEFEPKPTFIEKIKTQID
EIAQKGQEEFEKKSTFIEKVKTQISEKAHKAQEEFALKPTSIEKAKTEPNEKPQISEEEFEKKP
TLIERIITQIAERAQKGPKEIEAKPTFIERTNKQIDENVQKVQEEFESKPTFIERTKTQIDEKA
QNGLEEFEKKPTFVERIKTRIIEEAQKVQEEFEAKPTFIERIKTQIDEKVQKDKEEFEPKPTSI
EKAKTETNKKLQKGPEEFEPKPIEKIVTKENLEKNIVKNSDEDAEKKRILVKEETKEKKEKPYE
SSKTLVGVKNQNIKENETEKEELPTPKVTESKWLGEERHLIENVSVAILVIAAFGAYISYKFSS
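
The `pattern:` lines use bracketed residue classes that are also valid regular-expression syntax, so they can be applied to a sequence directly. A short sketch, using one line of the LOC127075602 sequence above as input; the variable names are illustrative, and treating the pattern as a Python regular expression is an assumption about how it is meant to be read.

```python
import re

# Pattern copied from the LOC127075602 record above.
pattern = re.compile(r"KPT[LSF][VKI][ED][MKR]")

# One line of the LOC127075602 sequence (a fragment, not the full protein).
fragment = "QIAKETISNDTTKTQIGENSQQQFELKPTFKDMTKLQFNEKAQKGPEEFEPKPTFIEKIKTQID"

instances = pattern.findall(fragment)
print(" | ".join(instances))
```

`findall` reports non-overlapping matches only; repeats whose instances overlap would need a lookahead-based search instead.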

NC_066582 - Cluster 17 - Cyclopeptide

Gene cluster description

NC_066582 - Gene Cluster 17. Type = cyclopeptide. Location: 485797570 - 487497883 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127076485
Repeat occurs 3 times in a sequence of 357 amino acids
Location between 486646512 and 486648940
Coverage of 8.4 %
Instances:
IQPSRSNAQI | IQPFGAWRNE | IQPFGAWRNE |
pattern: IQP[SF][GR][SA][NW][RA][NQ][EI]
MEFTSLSILALLCVAFMEVNASMSGEEYWNSIWPNTPIPKTISDLVLSNNTELIRGQEMKQYW
TVFFNHDLYPGKEMSLGIQKQSYIQPSRSNAQIFIKKASTHVATKEEIEKSTQPHGEATKEDIE
EPIQPFGAWRNEKEIEEPIQPFGAWRNEATKKEIERPNKHFEGIVWPRKTTIKKLEKVSQTSIT
RTLDEKETHILRDYCEKPSAIGEDRHCVTSLESMMYFVISKLGKNIKVMSSSFAQNQTQYVVEE
VKKIGDKAVMCHKMNLKIVVFNCHQVNATTIYKVPLVASDGTKSNALTICHHDTRGMNANALYK
VLKVRPGTVPICHFIGNKAIAWVPNDSVSEDDDCPRLI
Repeat found in LOC127076498
Repeat occurs 7 times in a sequence of 238 amino acids
Location between 487372378 and 487373095
Coverage of 44.12 %
Instances:
TTTTSSETTTTTTTS | TTTSSETTTTTTTSN | TTTTTTTSNSSTSSS | TTTTTTSNSSTSSSS | TTTTTSNSSTSSSSS
TTTTSNSSTSSSSSP | TTTSNSSTSSSSSPP |
pattern: TTT[ST][NST][ESTN][ESTN][NST][NST][ST][ST][ST][ST][STP][NSP]
MTQTTTTSSETTTTTTTSNSSTSSSSSPPPPSHSTQKQTQTKPRDDDNNNNNKHPTYHGVRKR
SWGKWVSEIREPRKKSRIWLGTFSTPEAAARAHDVAALTIKGKTAILNFPNISNMLPIPATSAP
RDIQAAATAAAAMVDFDEPVVHVTEQCCSESDESEQEQEQELSQIVELPKINEGEDDSVVDSAG
SEFVLLDDSVGSTNWVYHHPFTPSIGFEDGIEFYATFSDDFLSPIWD
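
Each Repeatfinder record starts with the same four header lines, which makes the output easy to load into a table. The sketch below parses those headers with a regular expression; the `RepeatRecord` class and `parse_records` function are hypothetical names introduced here, and the layout is assumed to match the records above exactly.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class RepeatRecord:
    locus: str
    occurrences: int
    seq_length: int
    start: int
    end: int
    coverage: float


# Matches the four header lines that open every record.
HEADER = re.compile(
    r"Repeat found in (\S+)\s+"
    r"Repeat occurs (\d+) times in a sequence of (\d+) amino acids\s+"
    r"Location between (\d+) and (\d+)\s+"
    r"Coverage of ([\d.]+) %"
)


def parse_records(text: str) -> List[RepeatRecord]:
    """Extract one RepeatRecord per header block found in the text."""
    return [
        RepeatRecord(m.group(1), int(m.group(2)), int(m.group(3)),
                     int(m.group(4)), int(m.group(5)), float(m.group(6)))
        for m in HEADER.finditer(text)
    ]


sample = """Repeat found in LOC127076485
Repeat occurs 3 times in a sequence of 357 amino acids
Location between 486646512 and 486648940
Coverage of 8.4 %
Repeat found in LOC127076498
Repeat occurs 7 times in a sequence of 238 amino acids
Location between 487372378 and 487373095
Coverage of 44.12 %
"""

for record in parse_records(sample):
    print(record)
```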

NC_066582 - Cluster 18 - Saccharide

Gene cluster description

NC_066582 - Gene Cluster 18. Type = saccharide. Location: 496896586 - 497081872 nt.
pHMM detection rules used:
plants/plant: (minimum(4,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[]))
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

NC_066583 - Cluster 19 - Cyclopeptide

Gene cluster description

NC_066583 - Gene Cluster 19. Type = cyclopeptide. Location: 12167775 - 16798869 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127078457
Repeat occurs 5 times in a sequence of 394 amino acids
Location between 13291365 and 13293782
Coverage of 7.61 %
Instances:
PPPEQQ | PPPSDQ | PPPEQQ | PPPEQP | PPPSDI

pattern: PPP[ES][QD][IQP]
MDAQEQSVYNYSQQMESTDQTPISTQQTTTTTGVVSTSIYREPHILDREPHVHLATPFEKLEV
LCESLVDFDNMKRNGIELTKELIMQGWETYFQRLYGHVYTYLLPEHPPNLMKRQREPSEKSKKA
KKEKLEETSGSRPPVPLADSPKVTSPPPEQQNPPPSDQPQTPPPEQQTNPPSEQPQTPPPEQPA
PSPSEHQPSPPLEQTTPPPSDIPPLPTSEAIITPTQNPADTNPNPPSSPSSIPEPETAFPILEE
AITLFAESLVEKIKSLDAGIRPQARLAREAKEKARKEAEEKALLEEEQRIIEVEQKGVVADAAE
AEAKVKVEAEEAAHIAAEEAAKASVDALTQGEQSNSGFAPLVLKTLEELQKEQQVVRARLDHQD
SVNNNIQNLLT
Repeat found in LOC127084472
Repeat occurs 3 times in a sequence of 156 amino acids
Location between 14251448 and 14260342
Coverage of 28.85 %
Instances:
GAIGEFEPRPYASAY | GAIGEFEPRPNASAY | GAIGEFETRPNASAY |
pattern: GAIGEFE[TP]RP[NY]ASAY
The following known motifs were found:
FEPR was found 3 times in this sequence
MMSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKE
NMGAIGEFEPRPYASAYPYASAYGDNEIHAKENMGAIGEFEPRPNASAYPNASAYGDNEIHANE
NKGAIGEFETRPNASAYGDNEIGAEFTDDFEPRPSMTKYNA
Repeat found in LOC127084472
Repeat occurs 4 times in a sequence of 182 amino acids
Location between 14251448 and 14260342
Coverage of 32.97 %
Instances:
GEFEPRPYASAYGDN | GEFEPRPNASAYGDN | GEFEPRPNISAYGDN | GEFETRPNASAYGDN |
pattern: GEFE[TP]RP[NY][IA]SAYGDN
The following known motifs were found:
FEPR was found 4 times in this sequence
MMSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKE
NMGAIGEFEPRPYASAYGDNPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNPNASAYGDN
EIHANENKGATGEFEPRPNISAYGDNPNISAYGDNEIHANENKGAIGEFETRPNASAYGDNEIG
AEFTDDFEPRPSMTKYNA
Repeat found in LOC127084472
Repeat occurs 4 times in a sequence of 181 amino acids
Location between 14251448 and 14253205
Coverage of 33.15 %
Instances:
GEFEPRPYASAYGDN | GEFEPRPNASAYGDN | GEFEPRPNISAYGDN | GEFETRPNASAYGDN |
pattern: GEFE[TP]RP[NY][IA]SAYGDN
The following known motifs were found:
FEPR was found 4 times in this sequence
MSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKEN
MGAIGEFEPRPYASAYGDNPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNPNASAYGDNE
IHANENKGATGEFEPRPNISAYGDNPNISAYGDNEIHANENKGAIGEFETRPNASAYGDNEIGA
EFTDDFEPRPSMTKYNA
Repeat found in LOC127084473
Repeat occurs 5 times in a sequence of 291 amino acids
Location between 14482842 and 14483801
Coverage of 18.9 %
Instances:
HHHHHHHQHNH | HHHHHHQHNHM | HHHHHQHNHMT | HHHHQHNHMTH | HHHQHNHMTHN

pattern: HHH[QH][QH][NQH][NQH][HMQN][NMTH][NMTH][NMTH]
MAINFASWIILLHLLLILLCSNGNQAREIVETENKSLEIQHNHHHHHHHQHNHMTHNIDPSLM
VFFTLKDLKVGKTMQIYFPKRDPSTSPKLWPKEKAESLPFSSNQLSYLLKFFSFSPNTPQAMAM
ENTLQECESKHIKGEVKFCATSLQSMLEFTQKTLGSTSEIQVYATLHKTKSSVTFQNYTIVEIM
MEILAPKMVACHTVPYPFAVFYCHSQESENRVYKVLLGGENGDKVEAMVVCHMDTSQWSPSHVS
FQVLGVTPGSSSVCHFFPADNYIWVPKLKSQGSSSM
Repeat found in LOC127084474
Repeat occurs 9 times in a sequence of 285 amino acids
Location between 14758282 and 14759808
Coverage of 22.11 %
Instances:
FEPRPNV | FEPKPNV | FEPRPNV | FEPRPNI | FEPRPNI
FEPRPNV | FEPIPNV | FEPRPNV | FEPRPSV |
pattern: FEP[KIR]P[NS][VI]
The following known motifs were found:
VS[AI]Y was found 3 times in this sequence
FEPR was found 7 times in this sequence
MMRLRPAFALLPLFLLLIITIVESRKDLGKYWKLVMKDQDVSEEIQGLLDANIKKNFKTLRQS
FDAKENKVVKDFEPRPNVPNVSVYGENDIDFMKNKAAIEEFEPKPNVSVYGNNNIDVEENNKGI
EDFEPRPNVPNVSTYGNNDIDNKKKDKEVEDFEPRPNIPNISAYGNNDIDNKKKDKEVEDFEPR
PNIPNISAYGNNDIDNKEKEKAVEDFEPRPNVPNVSAYGNNDINSRENEKVVEDFEPIPNVSAY
GNNDIYNKEKKKVVEDFEPRPNVPNVSAYGNNEIGAEFTEDFEPRPSV
Repeat found in LOC127084474
Repeat occurs 8 times in a sequence of 259 amino acids
Location between 14758282 and 14759808
Coverage of 21.62 %
Instances:
FEPRPNV | FEPKPNV | FEPRPNV | FEPRPNI | FEPRPNV
FEPIPNV | FEPRPNV | FEPRPSV |
pattern: FEP[KIR]P[NS][VI]
The following known motifs were found:
VS[AI]Y was found 3 times in this sequence
FEPR was found 6 times in this sequence
MMRLRPAFALLPLFLLLIITIVESRKDLGKYWKLVMKDQDVSEEIQGLLDANIKKNFKTLRQS
FDAKENKVVKDFEPRPNVPNVSVYGENDIDFMKNKAAIEEFEPKPNVSVYGNNNIDVEENNKGI
EDFEPRPNVPNVSTYGNNDIDNKKKDKEVEDFEPRPNIPNISAYGNNDIDNKEKEKAVEDFEPR
PNVPNVSAYGNNDINSRENEKVVEDFEPIPNVSAYGNNDIYNKEKKKVVEDFEPRPNVPNVSAY
GNNEIGAEFTEDFEPRPSV
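
The "known motifs" lines can be approximated with a regular-expression search. The sketch below counts hits with a lookahead so that overlapping occurrences are included; whether Repeatfinder counts overlaps this way is an assumption, and `count_motif` is a hypothetical helper. The input is a fragment of the LOC127084474 sequence as displayed above, so the counts apply to that fragment only and are not expected to equal the whole-sequence totals reported in the record.

```python
import re


def count_motif(motif: str, sequence: str) -> int:
    """Count occurrences of a motif (regex syntax), including overlapping ones."""
    return len(re.findall(f"(?={motif})", sequence))


# Fragment of the LOC127084474 sequence as displayed above (not the full protein).
fragment = "PNVSAYGNNDINSRENEKVVEDFEPIPNVSAYGNNDIYNKEKKKVVEDFEPRPNVPNVSAY"

for motif in ("FEPR", "VS[AI]Y"):
    print(f"{motif} was found {count_motif(motif, fragment)} times in this fragment")
```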
Repeat found in LOC127084475
Repeat occurs 17 times in a sequence of 525 amino acids
Location between 15248183 and 15251142
Coverage of 38.86 %
Instances:
PNISAYGKKNVD | PNISAYGENDID | PNISAYGENNID | PNISAYGENNID | PNISAYGENNID
PNISAYGENDID | PNISAYGENNFD | PNISAYGENDID | PNISAYGENNFD | PNISAYGENDID
PNISAYGENNFD | PNISAYGENNID | PNISAYGENNVD | PNISAYGENNID | PNISAYVGNDID
PNISAYGNNNID | PNISAYGNNEID |
pattern: PNISAY[GV][EGKN][NK][NED][VIF]D
The following known motifs were found:
FEPR was found 14 times in this sequence
MKMMRPALSLLPLFLLLIVGIVESRKDLGEYWKLVMKQQDMPQEIQGLLNQNPKKNFKTLKQF
FDDGKKKKVVKDFEQRPNISAYGKKNVDVKEKNGVIEDFEPRPNISAYGENDIDVKEKKGAIED
FEPIPNISAYGENNIDDKEKNEGIEDFEPRPNISAYGENNIDVKEKKGVIEDFEPRPNISAYGE
NNIDVKEKNGTIEEFEPRPNISAYGENDIDVKEKKGAIEDFEPRPNISAYGENNFDDKKKNGAI
EDFEPRPNISAYGENDIDVKENKGNIEDFEPRPNISAYGENNFDDKKKNGAIEDFEPRPNISAY
GENDIDVKENKGNIEDFEPRPNISAYGENNFDVKENNGAIEDFEPRPNISAYGENNIDFKEKKG
AIEEFEPRPNISAYGENNVDVKEKSGAIEDFEPRPNISAYGENNIDIKEKKGAIEDFKPRPNIS
AYVGNDIDVKEKKGDIEDFEPRPNISAYGNNNIDVKEKNKTIKDFEPRPNISAYGNNEIDDESM
KDVEPIPSLTKYDA
Repeat found in LOC127085072
Repeat occurs 16 times in a sequence of 636 amino acids
Location between 15931056 and 15934133
Coverage of 20.13 %
Instances:
YLDGWLKK | YLDGWLKD | YLDGWLKD | YLDGWLKD | YLDGWLKD
YLDGWLKD | YLDGWLKD | YLDGWLKD | YLDGWFKD | YLDGWLKD
YLDGWLKD | YLDGWLKN | YLDGWLKD | YLDGWLKD | YLDGLLKD
YLDIGSKI |
pattern: YLD[GI][LWG][LSF]K[NKID]
MTYRIVRSILHFLIFLLMNGHGNFARDTKLLQENVEEKQVDQPYLDGWLKKPLKNQKRIPDSN
EVYHDGWLKDNRGEKEKTNLDSNQVYLDGWLKDTRTEKEKVSHDSKQVYLDGWLKDTRVEKAKG
NPDSKQVYLDGWLKDIRAEKAQVNPDTNQVYLDGWLKDTRDEKEKVNPNSNQAYLDGWLKDIRT
EKAKSTLDTNQVYLDGWLKDGRAKKVKFTPDTNQVYLDGWLKDSRTEKAKSTPDSNQIYLDGWF
KDNRGDKSKSTPDTNQVYLDGWLKDFRVEKEKSTPNEVYLDGWLKDTRDQKEKSTTNSNQVYLD
GWLKNTQAEKEKVTPNSNKVYLDGWLKDTKDQKKKTTRNFNPAYLDGWLKDSHVDKAKFTPNSK
QAYLDGLLKDSHAESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPLLPRKVAD
DIPFSKSQIPSLLQLFSFTKDSPQGEDMKDIINQCEFEPTKGETKACPTSLESMVEFVHSVIGT
ETKFNIHSTSYPTTSGARLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSDMNPNHFIFELLGMKPGEAPLCHFFPVKHVLWVPAPPDVTK
Repeat found in LOC127078471
Repeat occurs 6 times in a sequence of 286 amino acids
Location between 15949366 and 15957118
Coverage of 27.27 %
Instances:
SNQVYLDGWLKDT | SNQVYLDGWLKDT | SNQVYLDGWLKDT | SNQIYLDGWLKDT | SNQVYLDEWLKDT
SNQVYLDGWLKDT |
pattern: SNQ[VI]YLD[GE]WLKDT
MQLAKEFSKVMQDEFEMSIMGELNYFLGLQIKQLDEGTLMCQTKYYNDLLKRFGMENAKSIDT
PMPTNGTWKGMKMSAPKESHLKAVKRNLRYLHGTSKYGLWYSKGSGCNLVGYTDSDFAGCKSDR
KSISGPCHMFSNSLTLNNRDVQVEKEKPTPESNQVYLDGWLKDTRAEKEKITSNSNQVYLDGWL
KDTRVKKEKLTPESNQVYLDGWLKDTRAEKTKPTPNSNQIYLDGWLKDTRTEKAKSVSNSNQVY
LDEWLKDTRAEKEKFTLTSNQVYLDGWLKDT
Repeat found in LOC127085073
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 15987697 and 15990606
Coverage of 28.68 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVI][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTRDEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
RVEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDG
WLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQ
VYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSS
PNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGE
DMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 28.24 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTRDEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
RVEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 21 times in a sequence of 756 amino acids
Location between 15987697 and 15990606
Coverage of 27.78 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTRDEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
RVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDG
WLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQ
VYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKN
GQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQ
SPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHS
TSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDI
MNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 28.24 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVI][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTRDEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
RVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 15987697 and 15990606
Coverage of 27.29 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTRDEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDT
RVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDG
WLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNR
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 15987697 and 15990606
Coverage of 26.76 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTRDEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDT
RDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDG
WLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127085073
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 15987697 and 15990606
Coverage of 26.2 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVI][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTRDEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKE
KSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDT
RAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 15987697 and 15990606
Coverage of 25.6 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTRDEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPD
SKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVEND
KSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDT
RDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDG
WLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 16009594 and 16012357
Coverage of 27.29 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDT
RAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDG
WLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQ
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 16009594 and 16012357
Coverage of 26.76 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDI
RVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127085074
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 16009594 and 16012357
Coverage of 26.2 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDI
RAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][STF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
RAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 16009594 and 16012357
Coverage of 25.6 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKE
KVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 24.96 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 24.96 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 23.53 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
H
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 23.53 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
H
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK

NC_066583 - Cluster 20 - Cyclopeptide

Gene cluster description

NC_066583 - Gene Cluster 20. Type = cyclopeptide. Location: 13274831 - 18413883 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127078457
Repeat occurs 5 times in a sequence of 394 amino acids
Location between 13291365 and 13293782
Coverage of 7.61 %
Instances:
PPPEQQ | PPPSDQ | PPPEQQ | PPPEQP | PPPSDI

pattern: PPP[ES][QD][IQP]
MDAQEQSVYNYSQQMESTDQTPISTQQTTTTTGVVSTSIYREPHILDREPHVHLATPFEKLEV
LCESLVDFDNMKRNGIELTKELIMQGWETYFQRLYGHVYTYLLPEHPPNLMKRQREPSEKSKKA
KKEKLEETSGSRPPVPLADSPKVTSPPPEQQNPPPSDQPQTPPPEQQTNPPSEQPQTPPPEQPA
PSPSEHQPSPPLEQTTPPPSDIPPLPTSEAIITPTQNPADTNPNPPSSPSSIPEPETAFPILEE
AITLFAESLVEKIKSLDAGIRPQARLAREAKEKARKEAEEKALLEEEQRIIEVEQKGVVADAAE
AEAKVKVEAEEAAHIAAEEAAKASVDALTQGEQSNSGFAPLVLKTLEELQKEQQVVRARLDHQD
SVNNNIQNLLT
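The `pattern:` line reads like an ordinary regular expression with character classes, so the listed instances can be re-derived by scanning the protein sequence. Below is a minimal Python sketch, assuming standard `re` semantics (the Repeatfinder implementation itself is not shown in this report). `find_instances` is a hypothetical helper, and the fragment is copied from the LOC127078457 sequence above.

```python
import re

def find_instances(pattern: str, sequence: str) -> list[str]:
    """Return non-overlapping matches of a repeat pattern, left to right."""
    return re.findall(pattern, sequence)

# Fragment of the LOC127078457 protein spanning the five reported instances.
fragment = (
    "KVTSPPPEQQNPPPSDQPQTPPPEQQTNPPSEQPQTPPPEQPA"
    "PSPSEHQPSPPLEQTTPPPSDIPPLPTSE"
)

instances = find_instances(r"PPP[ES][QD][IQP]", fragment)
print(" | ".join(instances))
```

Near misses in the fragment, such as `NPPSEQ` or `SPPLEQ`, are skipped because the pattern requires three consecutive prolines.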
Repeat found in LOC127084472
Repeat occurs 3 times in a sequence of 156 amino acids
Location between 14251448 and 14260342
Coverage of 28.85 %
Instances:
GAIGEFEPRPYASAY | GAIGEFEPRPNASAY | GAIGEFETRPNASAY |
pattern: GAIGEFE[TP]RP[NY]ASAY
The following known motifs were found:
FEPR was found 3 times in this sequence
MMSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKE
NMGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNEIHANE
NKGAIGEFETRPNASAYGDNEIGAEFTDDFEPRPSMTKYNA
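The coverage figures in these blocks are consistent with a simple ratio: the number of instances times the instance length, divided by the protein length. The report does not state the formula, so the Python sketch below is an assumption that happens to reproduce the printed values; `repeat_coverage` is a hypothetical name.

```python
def repeat_coverage(n_instances: int, instance_len: int, seq_len: int) -> float:
    """Percentage of the sequence accounted for by repeat instances."""
    return round(100 * n_instances * instance_len / seq_len, 2)

# Inputs taken from blocks in this report: (instances, instance length, protein length).
cov_84472 = repeat_coverage(3, 15, 156)    # LOC127084472, GAIGEFE... repeat
cov_78457 = repeat_coverage(5, 6, 394)     # LOC127078457, PPP... repeat
cov_84475 = repeat_coverage(17, 12, 525)   # LOC127084475, PNISAY... repeat
print(cov_84472, cov_78457, cov_84475)
```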
Repeat found in LOC127084472
Repeat occurs 4 times in a sequence of 182 amino acids
Location between 14251448 and 14260342
Coverage of 32.97 %
Instances:
GEFEPRPYASAYGDN | GEFEPRPNASAYGDN | GEFEPRPNISAYGDN | GEFETRPNASAYGDN |
pattern: GEFE[TP]RP[NY][IA]SAYGDN
The following known motifs were found:
FEPR was found 4 times in this sequence
MMSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKE
NMGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDN
EIHANENKGATGEFEPRPNISAYGDNEIHANENKGAIGEFETRPNASAYGDNEIG
AEFTDDFEPRPSMTKYNA
Repeat found in LOC127084472
Repeat occurs 4 times in a sequence of 181 amino acids
Location between 14251448 and 14253205
Coverage of 33.15 %
Instances:
GEFEPRPYASAYGDN | GEFEPRPNASAYGDN | GEFEPRPNISAYGDN | GEFETRPNASAYGDN |
pattern: GEFE[TP]RP[NY][IA]SAYGDN
The following known motifs were found:
FEPR was found 4 times in this sequence
MSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKEN
MGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNE
IHANENKGATGEFEPRPNISAYGDNEIHANENKGAIGEFETRPNASAYGDNEIGA
EFTDDFEPRPSMTKYNA
Repeat found in LOC127084473
Repeat occurs 5 times in a sequence of 291 amino acids
Location between 14482842 and 14483801
Coverage of 18.9 %
Instances:
HHHHHHHQHNH | HHHHHHQHNHM | HHHHHQHNHMT | HHHHQHNHMTH | HHHQHNHMTHN

pattern: HHH[QH][QH][NQH][NQH][HMQN][NMTH][NMTH][NMTH]
MAINFASWIILLHLLLILLCSNGNQAREIVETENKSLEIQHNHHHHHHHQHNHMTHNIDPSLM
VFFTLKDLKVGKTMQIYFPKRDPSTSPKLWPKEKAESLPFSSNQLSYLLKFFSFSPNTPQAMAM
ENTLQECESKHIKGEVKFCATSLQSMLEFTQKTLGSTSEIQVYATLHKTKSSVTFQNYTIVEIM
MEILAPKMVACHTVPYPFAVFYCHSQESENRVYKVLLGGENGDKVEAMVVCHMDTSQWSPSHVS
FQVLGVTPGSSSVCHFFPADNYIWVPKLKSQGSSSM
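The five instances listed for LOC127084473 overlap: each starts one residue after the previous one, inside a single histidine-rich stretch. A plain regex scan would report only the first, so reproducing the list needs a lookahead. The Python sketch below illustrates this, assuming the pattern follows standard regex semantics; `overlapping_instances` and `covered_residues` are hypothetical helpers, not part of the tool.

```python
import re

PATTERN = r"HHH[QH][QH][NQH][NQH][HMQN][NMTH][NMTH][NMTH]"

def overlapping_instances(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (start, match) for every position where the pattern matches,
    including matches that overlap earlier ones."""
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]

def covered_residues(hits: list[tuple[int, str]]) -> int:
    """Count distinct sequence positions covered by at least one match."""
    positions: set[int] = set()
    for start, text in hits:
        positions.update(range(start, start + len(text)))
    return len(positions)

# Fragment of the LOC127084473 protein around the histidine-rich stretch
# (the run of seven H residues is written out explicitly).
fragment = "SLEIQHN" + "H" * 7 + "QHNHMTHNIDPSLM"

hits = overlapping_instances(PATTERN, fragment)
plain = re.findall(PATTERN, fragment)
print([text for _, text in hits])
print(len(plain), covered_residues(hits))
```

On this fragment the five overlapping matches span 15 distinct residues, whereas the reported 18.9 % equals 5 × 11 / 291, so the coverage figure appears to count residues shared by overlapping instances more than once.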
Repeat found in LOC127084474
Repeat occurs 9 times in a sequence of 285 amino acids
Location between 14758282 and 14759808
Coverage of 22.11 %
Instances:
FEPRPNV | FEPKPNV | FEPRPNV | FEPRPNI | FEPRPNI
FEPRPNV | FEPIPNV | FEPRPNV | FEPRPSV |
pattern: FEP[KIR]P[NS][VI]
The following known motifs were found:
VS[AI]Y was found 3 times in this sequence
FEPR was found 7 times in this sequence
MMRLRPAFALLPLFLLLIITIVESRKDLGKYWKLVMKDQDVSEEIQGLLDANIKKNFKTLRQS
FDAKENKVVKDFEPRPNVSVYGENDIDFMKNKAAIEEFEPKPNVSVYGNNNIDVEENNKGI
EDFEPRPNVSTYGNNDIDNKKKDKEVEDFEPRPNISAYGNNDIDNKKKDKEVEDFEPR
PNISAYGNNDIDNKEKEKAVEDFEPRPNVSAYGNNDINSRENEKVVEDFEPIPNVSAY
GNNDIYNKEKKKVVEDFEPRPNVSAYGNNEIGAEFTEDFEPRPSV
Repeat found in LOC127084474
Repeat occurs 8 times in a sequence of 259 amino acids
Location between 14758282 and 14759808
Coverage of 21.62 %
Instances:
FEPRPNV | FEPKPNV | FEPRPNV | FEPRPNI | FEPRPNV
FEPIPNV | FEPRPNV | FEPRPSV |
pattern: FEP[KIR]P[NS][VI]
The following known motifs were found:
VS[AI]Y was found 3 times in this sequence
FEPR was found 6 times in this sequence
MMRLRPAFALLPLFLLLIITIVESRKDLGKYWKLVMKDQDVSEEIQGLLDANIKKNFKTLRQS
FDAKENKVVKDFEPRPNVSVYGENDIDFMKNKAAIEEFEPKPNVSVYGNNNIDVEENNKGI
EDFEPRPNVSTYGNNDIDNKKKDKEVEDFEPRPNISAYGNNDIDNKEKEKAVEDFEPR
PNVSAYGNNDINSRENEKVVEDFEPIPNVSAYGNNDIYNKEKKKVVEDFEPRPNVSAY
GNNEIGAEFTEDFEPRPSV
Repeat found in LOC127084475
Repeat occurs 17 times in a sequence of 525 amino acids
Location between 15248183 and 15251142
Coverage of 38.86 %
Instances:
PNISAYGKKNVD | PNISAYGENDID | PNISAYGENNID | PNISAYGENNID | PNISAYGENNID
PNISAYGENDID | PNISAYGENNFD | PNISAYGENDID | PNISAYGENNFD | PNISAYGENDID
PNISAYGENNFD | PNISAYGENNID | PNISAYGENNVD | PNISAYGENNID | PNISAYVGNDID
PNISAYGNNNID | PNISAYGNNEID |
pattern: PNISAY[GV][EGKN][NK][NED][VIF]D
The following known motifs were found:
FEPR was found 14 times in this sequence
MKMMRPALSLLPLFLLLIVGIVESRKDLGEYWKLVMKQQDMPQEIQGLLNQNPKKNFKTLKQF
FDDGKKKKVVKDFEQRPNISAYGKKNVDVKEKNGVIEDFEPRPNISAYGENDIDVKEKKGAIED
FEPIPNISAYGENNIDDKEKNEGIEDFEPRPNISAYGENNIDVKEKKGVIEDFEPRPNISAYGE
NNIDVKEKNGTIEEFEPRPNISAYGENDIDVKEKKGAIEDFEPRPNISAYGENNFDDKKKNGAI
EDFEPRPNISAYGENDIDVKENKGNIEDFEPRPNISAYGENNFDDKKKNGAIEDFEPRPNISAY
GENDIDVKENKGNIEDFEPRPNISAYGENNFDVKENNGAIEDFEPRPNISAYGENNIDFKEKKG
AIEEFEPRPNISAYGENNVDVKEKSGAIEDFEPRPNISAYGENNIDIKEKKGAIEDFKPRPNIS
AYVGNDIDVKEKKGDIEDFEPRPNISAYGNNNIDVKEKNKTIKDFEPRPNISAYGNNEIDDESM
KDVEPIPSLTKYDA
Repeat found in LOC127085072
Repeat occurs 16 times in a sequence of 636 amino acids
Location between 15931056 and 15934133
Coverage of 20.13 %
Instances:
YLDGWLKK | YLDGWLKD | YLDGWLKD | YLDGWLKD | YLDGWLKD
YLDGWLKD | YLDGWLKD | YLDGWLKD | YLDGWFKD | YLDGWLKD
YLDGWLKD | YLDGWLKN | YLDGWLKD | YLDGWLKD | YLDGLLKD
YLDIGSKI |
pattern: YLD[GI][LWG][LSF]K[NKID]
MTYRIVRSILHFLIFLLMNGHGNFARDTKLLQENVEEKQVDQPYLDGWLKKPLKNQKRIPDSN
EVYHDGWLKDNRGEKEKTNLDSNQVYLDGWLKDTRTEKEKVSHDSKQVYLDGWLKDTRVEKAKG
NPDSKQVYLDGWLKDIRAEKAQVNPDTNQVYLDGWLKDTRDEKEKVNPNSNQAYLDGWLKDIRT
EKAKSTLDTNQVYLDGWLKDGRAKKVKFTPDTNQVYLDGWLKDSRTEKAKSTPDSNQIYLDGWF
KDNRGDKSKSTPDTNQVYLDGWLKDFRVEKEKSTPNEVYLDGWLKDTRDQKEKSTTNSNQVYLD
GWLKNTQAEKEKVTPNSNKVYLDGWLKDTKDQKKKTTRNFNPAYLDGWLKDSHVDKAKFTPNSK
QAYLDGLLKDSHAESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPLLPRKVAD
DIPFSKSQIPSLLQLFSFTKDSPQGEDMKDIINQCEFEPTKGETKACPTSLESMVEFVHSVIGT
ETKFNIHSTSYPTTSGARLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSDMNPNHFIFELLGMKPGEAPLCHFFPVKHVLWVPAPPDVTK
Repeat found in LOC127078471
Repeat occurs 6 times in a sequence of 286 amino acids
Location between 15949366 and 15957118
Coverage of 27.27 %
Instances:
SNQVYLDGWLKDT | SNQVYLDGWLKDT | SNQVYLDGWLKDT | SNQIYLDGWLKDT | SNQVYLDEWLKDT
SNQVYLDGWLKDT |
pattern: SNQ[VI]YLD[GE]WLKDT
MQLAKEFSKVMQDEFEMSIMGELNYFLGLQIKQLDEGTLMCQTKYYNDLLKRFGMENAKSIDT
PMPTNGTWKGMKMSAPKESHLKAVKRNLRYLHGTSKYGLWYSKGSGCNLVGYTDSDFAGCKSDR
KSISGPCHMFSNSLTLNNRDVQVEKEKPTPESNQVYLDGWLKDTRAEKEKITSNSNQVYLDGWL
KDTRVKKEKLTPESNQVYLDGWLKDTRAEKTKPTPNSNQIYLDGWLKDTRTEKAKSVSNSNQVY
LDEWLKDTRAEKEKFTLTSNQVYLDGWLKDT
Repeat found in LOC127085073
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 15987697 and 15990606
Coverage of 28.68 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVI][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDG
WLKDTR
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQ
VYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSS
PNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGE
DMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 28.24 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 21 times in a sequence of 756 amino acids
Location between 15987697 and 15990606
Coverage of 27.78 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDG
WLKDTR
AKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQ
VYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKN
GQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQ
SPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHS
TSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDI
MNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 28.24 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVI][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
R
VEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 15987697 and 15990606
Coverage of 27.29 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDT
R
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDG
WLKDTR
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNR
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 15987697 and 15990606
Coverage of 26.76 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDT
R
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDG
WLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127085073
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 15987697 and 15990606
Coverage of 26.2 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVI][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKE
KSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDT
R
AKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTR
AEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 15987697 and 15990606
Coverage of 25.6 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPD
SKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVEND
KSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 16009594 and 16012357
Coverage of 27.29 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDT
R
AEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDG
WLKDTR
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQ
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 16009594 and 16012357
Coverage of 26.76 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDI
R
VQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127085074
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 16009594 and 16012357
Coverage of 26.2 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDI
R
AEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTR
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][STF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 16009594 and 16012357
Coverage of 25.6 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKE
KVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 24.96 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 24.96 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
RAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
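The instance count and coverage reported for the record above can be cross-checked by hand. The Python sketch below assumes (the output itself does not state this) that coverage is the number of instances times the instance length, divided by the sequence length. Under that assumption, 15 instances of length 10 in 618 residues give the reported 24.27 %. The function names are illustrative only and are not part of any tool.

```python
import re

# Pattern copied verbatim from the record above.
PATTERN = re.compile(r"YLD[GI][GW][LS]K[NID][SITF][RKHP]")

def find_instances(sequence, pattern=PATTERN):
    """Return all non-overlapping matches of the repeat pattern, in order."""
    return [m.group(0) for m in pattern.finditer(sequence)]

def coverage_percent(n_instances, instance_len, seq_len):
    """Assumed formula: residues inside repeats / total residues, in percent."""
    return round(100.0 * n_instances * instance_len / seq_len, 2)

# Short fragment from the start of the sequence above (not the full protein).
fragment = "NQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWLKDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKV"
instances = find_instances(fragment)
print(instances)

# Values taken from the record above: 15 instances, length 10, 618 residues.
print(coverage_percent(15, 10, 618))
```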
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
RAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 23.53 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
HVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 23.53 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
HVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088327
Repeat occurs 18 times in a sequence of 684 amino acids
Location between 17186384 and 17189258
Coverage of 26.32 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIRDEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYLDGWLK
DTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQHLEESN
GKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPSLLQLF
SLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSYPTTSG
APLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICH
LDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 27.89 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIRDEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
RVEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 27.89 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIRDEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
RVEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088327
Repeat occurs 22 times in a sequence of 776 amino acids
Location between 17186384 and 17189258
Coverage of 28.35 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIRDEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
RVEKEKSAPDSKQVYLDGWLKDIRVEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDG
WLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQ
VYLDGWLKDTQAKSNLDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNS
KQAYLDGWLKDSRVENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIR
EYAPFLPRKLADEIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLE
SMLEFVHGIIGPETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCH
YLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLW
VPSPSNATK
Repeat found in LOC127088328
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 17248750 and 17251664
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDNH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][STNFI][RKHP]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTRGEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKDTRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
RAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127088328
Repeat occurs 19 times in a sequence of 706 amino acids
Location between 17248750 and 17251664
Coverage of 26.91 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDNH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STNFI][RKHP]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTRGEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSRAEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDGWLKDTRGEKEKSNH
DSNQVYLDGWLKDTRGEKEKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQK
AKSNSDSNRVYLDGWLKDTRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKD
TRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLD
GWLKDSHAENDMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFL
PRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFV
HGIIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGS
KIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSD
ATK
Repeat found in LOC127088328
Repeat occurs 21 times in a sequence of 752 amino acids
Location between 17248750 and 17251664
Coverage of 27.93 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSR | YLDGWLKDTR | YLDRWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDNH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[GIR][GW][LS]K[NID][STNFI][RKHP]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTRGEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSRAEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDRWLKDTRVQKEKVNS
DSNEVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEK
EKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKD
TRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLD
GWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHL
EESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSL
LQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYP
TTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNAL
GICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 672 amino acids
Location between 18120165 and 18123127
Coverage of 26.79 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKANPDSNQVY
LDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRA
DKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWL
KDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVY
LDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPN
SNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEA
FKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDM
IDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDIS
KDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFI
FDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPD
SNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTGAENAKSNLD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
HAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENE
KSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
RVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 718 amino acids
Location between 18120165 and 18123127
Coverage of 27.86 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWL
KDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVY
LDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSD
SNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIA
KSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMT
LQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKA
CPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPY
ALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFP
VKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 19 times in a sequence of 687 amino acids
Location between 18120165 and 18123127
Coverage of 27.66 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKE
KSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDT
RAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDG
WLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLL
QLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
HAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 17 times in a sequence of 641 amino acids
Location between 18120165 and 18123127
Coverage of 26.52 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRAEKA
KLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNT
QTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDG
WLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
RKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVH
GVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHA
TK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
RAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
HAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
RVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
RAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 733 amino acids
Location between 18120165 and 18123127
Coverage of 28.65 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDT
RVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDG
WLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNR
VYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDV
MNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 23 times in a sequence of 787 amino acids
Location between 18120165 and 18123127
Coverage of 29.22 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVY
LDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPD
SKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKL
NSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDS
HVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNK
GETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 28.68 %
Instances:
YLDGWLKNTS | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][SKHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTRGEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
RAEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDG
WLKDIRDEKAKSTPDSNQVYVDGWLKDTLAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 18128433 and 18215138
Coverage of 28.24 %
Instances:
YLDGWLKNTS | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][SKHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTRGEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
RVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTRAEKTKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQ
VYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKST
PNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPT
SLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 28.68 %
Instances:
YLDGWLKNTS | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][SKHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTR
GENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTRAEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 27 times in a sequence of 894 amino acids
Location between 18128433 and 18215138
Coverage of 24.16 %
Instances:
DGWLKNTS | DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTR
DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTR
DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTQ | DGWLKDIR
DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTQ | DGWLKDIR
DGWLKDTL | DGWLKDTR | DGWLKDTR | DGWLKDTQ | DGWLKDTR
DGWLKDSH | DGWLKDSH |
pattern: DGWLK[ND][SIT][LSHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTR
AEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQ
VYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDGWLKDTLAEKAKLN
SDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTL
NPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLK
DSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKL
ADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVI
GAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
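
Note on the reported coverage: the figures above appear to equal the number of repeat instances times the repeat length, divided by the sequence length (for example 27 x 8 / 894 = 24.16 %). The Python sketch below reproduces that arithmetic and counts matches of a reported pattern with the standard `re` module. The function names `coverage_percent` and `count_repeats` are hypothetical, and the formula is an assumption inferred from the reported numbers, not taken from the Repeatfinder source. The test fragment is a short excerpt of the LOC127088336 sequence.

```python
import re

def coverage_percent(n_repeats: int, repeat_len: int, seq_len: int) -> float:
    """Percent of the sequence covered by n_repeats motifs of repeat_len residues.

    Assumed formula (inferred from the reported values, not from the tool's code).
    """
    return round(100 * n_repeats * repeat_len / seq_len, 2)

def count_repeats(sequence: str, pattern: str) -> int:
    """Count non-overlapping matches of a reported pattern in a protein sequence."""
    return len(re.findall(pattern, sequence))

# Repeat counts, motif lengths and sequence lengths copied from three records.
print(coverage_percent(27, 8, 894))   # 8-residue motif DGWLK[ND][SIT][LSHQR]
print(coverage_percent(22, 10, 779))  # 10-residue motif YLD[GI][GW][LS]K...
print(coverage_percent(23, 10, 802))

# Short excerpt of the sequence, containing two copies of the 8-residue motif.
fragment = "NQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTR"
print(count_repeats(fragment, r"DGWLK[ND][SIT][LSHQR]"))
```

The same check can be applied to any other record by substituting its repeat count, motif length and sequence length.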

Similar gene clusters

NC_066583 - Cluster 21 - Cyclopeptide

Gene cluster description

NC_066583 - Gene Cluster 21. Type = cyclopeptide. Location: 15983889 - 18454160 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Legend:

Only available when smCOG analysis was run
biosynthetic genes
transport-related genes
regulatory genes
other genes

Repeatfinder output


Repeat found in LOC127085073
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 15987697 and 15990606
Coverage of 28.68 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVI][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDG
WLKDTR
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQ
VYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSS
PNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGE
DMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
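
Note on the `pattern:` line: it appears to be the position-wise union of the residues observed across the listed instances. The Python sketch below rebuilds such a pattern from the distinct instances of the record above. The function names `consensus_columns` and `columns_to_regex` are hypothetical, and this is an assumption about how the pattern is derived, not the tool's actual code. Residues inside each bracket are sorted alphabetically here, so the string differs in ordering from the reported one while matching the same sequences.

```python
import re

def consensus_columns(instances: list[str]) -> list[set[str]]:
    """Collect the residues observed at each position of equal-length repeats."""
    length = len(instances[0])
    if any(len(inst) != length for inst in instances):
        raise ValueError("all instances must have the same length")
    return [set(column) for column in zip(*instances)]

def columns_to_regex(columns: list[set[str]]) -> str:
    """Write invariant positions as a letter and variable ones as a sorted class."""
    parts = []
    for residues in columns:
        ordered = "".join(sorted(residues))
        parts.append(ordered if len(ordered) == 1 else f"[{ordered}]")
    return "".join(parts)

# Distinct instances copied from the first LOC127085073 record.
instances = [
    "YLDGWLKNTP", "YLDGWLKDTR", "YLDGWLKNTR", "YLDGWLKDVR",
    "YLDGWLKDIR", "YLDGWLKDSH", "YLDGWLKDSR", "YLDIGSKIFK",
]
derived = columns_to_regex(consensus_columns(instances))
reported = "YLD[GI][GW][LS]K[NID][STFVI][RKHP]"
print(derived)

# Both patterns accept every listed instance.
print(all(re.fullmatch(derived, s) and re.fullmatch(reported, s) for s in instances))
```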
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 28.24 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 21 times in a sequence of 756 amino acids
Location between 15987697 and 15990606
Coverage of 27.78 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDG
WLKDTR
AKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQ
VYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKN
GQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQ
SPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHS
TSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDI
MNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 28.24 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVI][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
R
VEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 15987697 and 15990606
Coverage of 27.29 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDT
R
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDG
WLKDTR
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNR
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 15987697 and 15990606
Coverage of 26.76 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SVTF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDT
R
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDG
WLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127085073
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 15987697 and 15990606
Coverage of 26.2 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDVR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVI][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKE
KSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDT
R
AKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTR
AEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 15987697 and 15990606
Coverage of 25.6 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKNTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTR
GEKAKANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPD
SKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVEND
KSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 16009594 and 16012357
Coverage of 27.29 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDT
R
AEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDG
WLKDTR
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQ
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 16009594 and 16012357
Coverage of 26.76 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDI
R
VQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127085074
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 16009594 and 16012357
Coverage of 26.2 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDI
R
AEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTR
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STF][KHPQR]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 16009594 and 16012357
Coverage of 25.6 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKE
KVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 24.96 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 24.96 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 23.53 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
H
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 23.53 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][RKHP]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKTNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
H
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088327
Repeat occurs 18 times in a sequence of 684 amino acids
Location between 17186384 and 17189258
Coverage of 26.32 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDAR
GEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYLDGWLK
DTR
AEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQHLEESN
GKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPSLLQLF
SLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSYPTTSG
APLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICH
LDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 27.89 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDAR
GEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
R
VEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTR
AEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 27.89 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDAR
GEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTR
AEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088327
Repeat occurs 22 times in a sequence of 776 amino acids
Location between 17186384 and 17189258
Coverage of 28.35 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDAR
GEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDIRVEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDG
WLKDTR
DEKAKSTPDSNQVYVDGWLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQ
VYLDGWLKDTQAKSNLDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNS
KQAYLDGWLKDSRVENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIR
EYAPFLPRKLADEIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLE
SMLEFVHGIIGPETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCH
YLDIGSKIFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLW
VPSPSNATK
Repeat found in LOC127088328
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 17248750 and 17251664
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDNH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STNFI][RKHP]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKINPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKDTRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127088328
Repeat occurs 19 times in a sequence of 706 amino acids
Location between 17248750 and 17251664
Coverage of 26.91 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDNH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STNFI][RKHP]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSR
AEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDGWLKDTRGEKEKSNH
DSNQVYLDGWLKDTRGEKEKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQK
AKSNSDSNRVYLDGWLKDTRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKD
TRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLD
GWLKDSH
AENDMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFL
PRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFV
HGIIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGS
KIFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSD
ATK
Repeat found in LOC127088328
Repeat occurs 21 times in a sequence of 752 amino acids
Location between 17248750 and 17251664
Coverage of 27.93 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSR | YLDGWLKDTR | YLDRWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDNH | YLDGWLKDSH
YLDIGSKIFK
pattern: YLD[GIR][GW][LS]K[NID][STNFI][RKHP]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSR
AEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDRWLKDTRVQKEKVNS
DSNEVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEK
EKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKD
TR
AEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLD
GWLKDTR
AEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHL
EESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSL
LQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYP
TTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNAL
GICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 672 amino acids
Location between 18120165 and 18123127
Coverage of 26.79 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKANPDSNQVY
LDGWLKDTR
VEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRA
DKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWL
KDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVY
LDGWLKDTR
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPN
SNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEA
FKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDM
IDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDIS
KDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFI
FDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKAKVNPDSNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPD
SNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTR
ADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
AEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTGAENAKSNLD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENE
KSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
R
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSH
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 718 amino acids
Location between 18120165 and 18123127
Coverage of 27.86 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWL
KDTR
VENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVY
LDGWLKDTR
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSD
SNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIA
KSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMT
LQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKA
CPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPY
ALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFP
VKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 19 times in a sequence of 687 amino acids
Location between 18120165 and 18123127
Coverage of 27.66 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKE
KSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDT
R
AEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDG
WLKDTR
AEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLL
QLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTR
ADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 17 times in a sequence of 641 amino acids
Location between 18120165 and 18123127
Coverage of 26.52 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRAEKA
KLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNT
Q
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDG
WLKDSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
RKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVH
GVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHA
TK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
R
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQ
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
R
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSH
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
R
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQ
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 733 amino acids
Location between 18120165 and 18123127
Coverage of 28.65 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDT
R
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDG
WLKDTR
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNR
VYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDV
MNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 23 times in a sequence of 787 amino acids
Location between 18120165 and 18123127
Coverage of 29.22 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVY
LDGWLKDTR
VENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPD
SKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKL
NSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDS
H
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNK
GETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 28.68 %
Instances:
YLDGWLKNTS | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][SKHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTR
GENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
AEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDG
WLKDIR
DEKAKSTPDSNQVYVDGWLKDTLAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 18128433 and 18215138
Coverage of 28.24 %
Instances:
YLDGWLKNTS | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][SKHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTR
GENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTRAEKTKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQ
VYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKST
PNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPT
SLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 28.68 %
Instances:
YLDGWLKNTS | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][SITF][SKHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTR
GENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTRAEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 27 times in a sequence of 894 amino acids
Location between 18128433 and 18215138
Coverage of 24.16 %
Instances:
DGWLKNTS | DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTR
DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTR
DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTQ | DGWLKDIR
DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTQ | DGWLKDIR
DGWLKDTL | DGWLKDTR | DGWLKDTR | DGWLKDTQ | DGWLKDTR
DGWLKDSH | DGWLKDSH
pattern: DGWLK[ND][SIT][LSHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTR
AEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQ
VYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDGWLKDTLAEKAKLN
SDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTL
NPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLK
DSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKL
ADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVI
GAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
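Each "pattern:" line uses bracketed residue sets, which can be read directly as regular-expression character classes. The sketch below assumes Python's standard `re` module and uses only the first residues of the LOC127088336 sequence above, joined across the line breaks. It illustrates how instances could be located and is not the Repeatfinder implementation; `find_instances` is a hypothetical helper name.

```python
import re

# Pattern copied from the record above; each [...] lists the residues allowed at that position.
PATTERN = re.compile(r"DGWLK[ND][SIT][LSHQR]")

# Opening residues of the LOC127088336 sequence, with the report's line breaks removed.
fragment = "MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWLKDTR"


def find_instances(sequence):
    """Return (start_index, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in PATTERN.finditer(sequence)]


instances = find_instances(fragment)
print([text for _, text in instances])  # ['DGWLKNTS', 'DGWLKDTR']
```

Because the report wraps sequences and breaks lines around some repeat instances, the lines of a record would need to be concatenated before matching a full sequence.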

Similar gene clusters

NC_066583 - Cluster 22 - Cyclopeptide

Gene cluster description

NC_066583 - Gene Cluster 22. Type = cyclopeptide. Location: 16808328 - 19526976 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127088327
Repeat occurs 18 times in a sequence of 684 amino acids
Location between 17186384 and 17189258
Coverage of 26.32 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSR | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDAR
GEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYLDGWLK
DTR
AEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQHLEESN
GKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPSLLQLF
SLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSYPTTSG
APLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICH
LDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
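Every repeat record opens with the same four header lines (gene, instance count and protein length, genomic location, coverage). The following sketch shows one way those lines could be turned into structured data, assuming the wording stays exactly as it appears in this report. The regular expression and the `parse_repeat_headers` name are illustrative and not part of plantiSMASH.

```python
import re

# Matches the four header lines that open each repeat record in this report.
RECORD_RE = re.compile(
    r"Repeat found in (?P<gene>\S+)\s+"
    r"Repeat occurs (?P<count>\d+) times in a sequence of (?P<length>\d+) amino acids\s+"
    r"Location between (?P<start>\d+) and (?P<end>\d+)\s+"
    r"Coverage of (?P<coverage>\d+(?:\.\d+)?) %"
)


def parse_repeat_headers(report_text):
    """Return one dict per repeat record header found in the text."""
    records = []
    for m in RECORD_RE.finditer(report_text):
        records.append({
            "gene": m["gene"],
            "count": int(m["count"]),
            "length": int(m["length"]),
            "start": int(m["start"]),
            "end": int(m["end"]),
            "coverage": float(m["coverage"]),
        })
    return records


# Header lines copied from the first LOC127088327 record above.
example = """Repeat found in LOC127088327
Repeat occurs 18 times in a sequence of 684 amino acids
Location between 17186384 and 17189258
Coverage of 26.32 %
"""
print(parse_repeat_headers(example))
```

Grouping the parsed records by gene makes it easier to compare the several sequence variants reported for the same locus.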
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 27.89 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDAR
GEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
R
VEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTR
AEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 27.89 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSR
YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDAR
GEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTR
AEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088327
Repeat occurs 22 times in a sequence of 776 amino acids
Location between 17186384 and 17189258
Coverage of 28.35 %
Instances:
YLDGWLKNTP | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDAR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDVR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSR | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STFVIA][KHPQR]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDAR
GEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDIRVEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDG
WLKDTR
DEKAKSTPDSNQVYVDGWLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQ
VYLDGWLKDTQAKSNLDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNS
KQAYLDGWLKDSRVENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIR
EYAPFLPRKLADEIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLE
SMLEFVHGIIGPETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCH
YLDIGSKIFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLW
VPSPSNATK
Repeat found in LOC127088328
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 17248750 and 17251664
Coverage of 24.27 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDNH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STNFI][RKHP]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKINPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKDTRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127088328
Repeat occurs 19 times in a sequence of 706 amino acids
Location between 17248750 and 17251664
Coverage of 26.91 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDNH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[GI][GW][LS]K[NID][STNFI][RKHP]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSR
AEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDGWLKDTRGEKEKSNH
DSNQVYLDGWLKDTRGEKEKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQK
AKSNSDSNRVYLDGWLKDTRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKD
TRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLD
GWLKDSH
AENDMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFL
PRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFV
HGIIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGS
KIFK
VLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSD
ATK
Repeat found in LOC127088328
Repeat occurs 21 times in a sequence of 752 amino acids
Location between 17248750 and 17251664
Coverage of 27.93 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDSR | YLDGWLKDTR | YLDRWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDNH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[GIR][GW][LS]K[NID][STNFI][RKHP]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSR
AEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDRWLKDTRVQKEKVNS
DSNEVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEK
EKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKD
TR
AEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLD
GWLKDTR
AEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHL
EESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSL
LQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYP
TTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNAL
GICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
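The coverage figures reported for the repeat entries above appear consistent with (number of instances x motif length) / sequence length, expressed as a percentage. A minimal Python sketch of that arithmetic follows; `repeat_coverage` is a hypothetical helper name used for illustration, not a function of the tool that produced this report, and the formula is inferred from the reported numbers rather than taken from the tool's source.

```python
def repeat_coverage(occurrences: int, motif_length: int, sequence_length: int) -> float:
    """Percentage of residues covered by non-overlapping repeat instances.

    Assumption: coverage = occurrences * motif_length / sequence_length * 100,
    rounded to two decimals, as inferred from the values in the report.
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return round(100.0 * occurrences * motif_length / sequence_length, 2)


# Values taken from the two LOC127088328 entries above (10-residue motif).
print(repeat_coverage(19, 10, 706))
print(repeat_coverage(21, 10, 752))
```

With the values reported for the two LOC127088328 entries, the sketch gives 26.91 and 27.93, which matches the "Coverage of" lines.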
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 672 amino acids
Location between 18120165 and 18123127
Coverage of 26.79 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKANPDSNQVY
LDGWLKDTR
VEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRA
DKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWL
KDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVY
LDGWLKDTR
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPN
SNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEA
FKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDM
IDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDIS
KDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFI
FDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKAKVNPDSNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPD
SNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTR
ADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
AEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTGAENAKSNLD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENE
KSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
R
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSH
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 718 amino acids
Location between 18120165 and 18123127
Coverage of 27.86 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWL
KDTR
VENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVY
LDGWLKDTR
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSD
SNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIA
KSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMT
LQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKA
CPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPY
ALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFP
VKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 19 times in a sequence of 687 amino acids
Location between 18120165 and 18123127
Coverage of 27.66 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKE
KSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDT
R
AEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDG
WLKDTR
AEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLL
QLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTR
ADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 17 times in a sequence of 641 amino acids
Location between 18120165 and 18123127
Coverage of 26.52 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRAEKA
KLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNT
Q
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDG
WLKDSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
RKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVH
GVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHA
TK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
R
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQ
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][STF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
R
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSH
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
R
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQ
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 733 amino acids
Location between 18120165 and 18123127
Coverage of 28.65 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDT
R
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDG
WLKDTR
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNR
VYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDV
MNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 23 times in a sequence of 787 amino acids
Location between 18120165 and 18123127
Coverage of 29.22 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][KHPQGR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVY
LDGWLKDTR
VENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPD
SKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKL
NSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDS
H
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNK
GETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 28.68 %
Instances:
YLDGWLKNTS | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][SKHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTR
GENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
AEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDG
WLKDIR
DEKAKSTPDSNQVYVDGWLKDTLAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 18128433 and 18215138
Coverage of 28.24 %
Instances:
YLDGWLKNTS | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][SKHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTR
GENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTRAEKTKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQ
VYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKST
PNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPT
SLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 28.68 %
Instances:
YLDGWLKNTS | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[GI][GW][LS]K[NID][SITF][SKHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTR
GENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTRAEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 27 times in a sequence of 894 amino acids
Location between 18128433 and 18215138
Coverage of 24.16 %
Instances:
DGWLKNTS | DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTR
DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTR
DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTQ | DGWLKDIR
DGWLKDTR | DGWLKDTR | DGWLKDTR | DGWLKDTQ | DGWLKDIR
DGWLKDTL | DGWLKDTR | DGWLKDTR | DGWLKDTQ | DGWLKDTR
DGWLKDSH | DGWLKDSH |
pattern: DGWLK[ND][SIT][LSHQR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTR
AEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQ
VYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDGWLKDTLAEKAKLN
SDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTL
NPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLK
DSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKL
ADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVI
GAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
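The `pattern:` lines use bracketed residue alternatives, which is also valid regular-expression character-class syntax. The sketch below shows how such a pattern could be applied to a sequence in Python. It assumes the pattern can be passed to `re` unchanged, and `find_repeat_instances` is a hypothetical helper name. The fragment is the start of the LOC127088336 sequence above, including the line break that the page layout introduced.

```python
import re


def find_repeat_instances(pattern: str, sequence: str) -> list:
    """Return all non-overlapping matches of a repeat pattern in a sequence."""
    # Line breaks in the report are layout only, not residues, so strip whitespace.
    joined = "".join(sequence.split())
    return re.findall(pattern, joined)


# Pattern copied from the 8-residue LOC127088336 entry above.
pattern = "DGWLK[ND][SIT][LSHQR]"

# First residues of that entry's sequence, with its original line break.
fragment = """MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR"""

instances = find_repeat_instances(pattern, fragment)
print(instances)
```

On this fragment the sketch finds `DGWLKNTS` and `DGWLKDTR`, the first two entries of the "Instances" list. The second instance spans the line break, which is why the whitespace is removed before matching.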

Similar gene clusters

NC_066583 - Cluster 23 - Saccharide

Gene cluster description

NC_066583 - Gene Cluster 23. Type = saccharide. Location: 102550203 - 103091286 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))
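As a rough illustration of how a `minimum(3,[...],[...])` rule like the one above might be evaluated, the Python sketch below checks a candidate region against it. The interpretation is an assumption, not the tool's documented behaviour: the rule is read as "at least 3 genes carry a domain from the first list, and at least one gene carries a domain from the second list", with the slash-separated second list treated as alternatives. The function name `meets_minimum_rule`, the shortened domain list, and the example regions are all hypothetical.

```python
# Shortened subset of the first (allowed) domain list in the saccharide rule above.
ALLOWED = {"UDPGT", "UDPGT_2", "p450", "Transferase", "Glyco_hydro_1",
           "Glycos_transf_1", "Glycos_transf_2", "Cellulose_synt"}

# Second list of the rule, copied as written; treated as alternatives (assumption).
CORE = {"Glycos_transf_1", "Glycos_transf_2", "Glycos_transf_28",
        "UDPGT", "UDPGT_2", "Glyco_hydro_1", "Cellulose_synt"}


def meets_minimum_rule(gene_domains, minimum=3, allowed=ALLOWED, core=CORE):
    """Check a region against an assumed reading of minimum(n, allowed, core).

    gene_domains: one set of pHMM domain names per gene in the candidate region.
    """
    # Count genes that carry at least one domain from the allowed list.
    matching_genes = [domains for domains in gene_domains if domains & allowed]
    # Require at least one gene with a core (signature) domain.
    has_core = any(domains & core for domains in gene_domains)
    return len(matching_genes) >= minimum and has_core


# Hypothetical regions, not taken from the report.
with_core = meets_minimum_rule([{"UDPGT_2"}, {"p450"}, {"p450"}])
without_core = meets_minimum_rule([{"p450"}, {"p450"}, {"Transferase"}])
print(with_core, without_core)
```

Under this reading the first hypothetical region passes because it has three matching genes, one of which carries `UDPGT_2`. The second fails because none of its genes carries a domain from the second list.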

Legend:

Only available when smCOG analysis was run
biosynthetic genes
transport-related genes
regulatory genes
other genes

NC_066583 - Cluster 24 - Polyketide

Gene cluster description

NC_066583 - Gene Cluster 24. Type = polyketide. Location: 194555028 - 195494026 nt.
pHMM detection rules used:
plants/polyketide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Chal_sti_synt_C/Chal_sti_synt_N]) or minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Thr_dehydrat_C]) or 
minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Chal_sti_synt_C,Chal_sti_synt_N]))

NC_066583 - Cluster 25 - Polyketide

Gene cluster description

NC_066583 - Gene Cluster 25. Type = polyketide. Location: 214812041 - 217867500 nt.
pHMM detection rules used:
plants/polyketide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Chal_sti_synt_C/Chal_sti_synt_N]) or minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Thr_dehydrat_C]) or 
minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Chal_sti_synt_C,Chal_sti_synt_N]))

NC_066583 - Cluster 26 - Saccharide

Gene cluster description

NC_066583 - Gene Cluster 26. Type = saccharide. Location: 246094858 - 246596422 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))
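As a rough illustration of how a clause of the form minimum(n, [pool], [core]) could be read, the sketch below assumes the following semantics, which are an assumption and are not taken from the plantiSMASH source: a locus qualifies when at least n distinct domain types from the pool are present, every comma-separated core entry is satisfied, and a slash-separated core entry is satisfied by any one of its alternatives. The function name `rule_matches`, the abbreviated pool, and the second test set are hypothetical; the first test set uses the core domains listed for Cluster 3 in the summary table at the top of this report.

```python
def rule_matches(n, pool, core, found):
    """Evaluate one minimum(n, [pool], [core]) clause against the set of
    domain names found in a candidate locus (assumed semantics, see text)."""
    found = set(found)
    # At least n distinct domain types from the pool must be present.
    enough_domains = len(found & set(pool)) >= n
    # Each core entry is a '/'-separated group of alternatives; every group
    # needs at least one alternative present. An empty core list always passes.
    core_ok = all(found & set(entry.split("/")) for entry in core)
    return enough_domains and core_ok


# Abbreviated pool: only a handful of the names from the full rule above.
POOL = ["Acetyltransf_1", "Lyase_aromatic", "UDPGT", "UDPGT_2",
        "Glyco_hydro_1", "Cellulose_synt", "p450", "BBE"]
# Core entry copied from the plants/saccharide rule.
SACCHARIDE_CORE = ["Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/"
                   "UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt"]

# Core domains reported for Cluster 3 (Saccharide) in the summary table.
cluster3_domains = ["Acetyltransf_1", "Lyase_aromatic", "UDPGT_2"]
# Hypothetical locus with three pool domains but no saccharide core domain.
no_core_domains = ["p450", "BBE", "Acetyltransf_1"]

print(rule_matches(3, POOL, SACCHARIDE_CORE, cluster3_domains))
print(rule_matches(3, POOL, SACCHARIDE_CORE, no_core_domains))
```

Under these assumed semantics the first locus satisfies the clause and the second does not, because none of its domains appears among the core alternatives.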

Similar gene clusters

NC_066583 - Cluster 27 - Transporter_associated-fatty_acid

Gene cluster description

NC_066583 - Gene Cluster 27. Type = transporter_associated-fatty_acid. Location: 248378010 - 249024385 nt.
pHMM detection rules used:
plants/fatty_acid: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[FA_desaturase/FA_desaturase_2/FA_hydroxylase/CER1-like_C]) or minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,ECH_2]) or 
minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,AMP-binding]))
plants/plant: (minimum(4,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[]))
plants/transporter_associated: (minimum(4,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[MatE/LTP_2/ABC2_membrane/ABC_tran]))

Similar gene clusters

NC_066583 - Cluster 28 - Cyclopeptide

Gene cluster description

NC_066583 - Gene Cluster 28. Type = cyclopeptide. Location: 253965621 - 254775003 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127083614
Repeat occurs 4 times in a sequence of 634 amino acids
Location between 254369311 and 254371312
Coverage of 4.42 %
Instances:
DSFKNYS | DSFKKYS | DSFTSYS | DSFQSYG |
pattern: DSF[KTQ][NSK]Y[GS]
MKIQFIFLFFSLSFFHGSIATIGNKGITQTQEYSFKQKTNPFTPKASLIRYWNTKISNKLPNP
IPNFFLSKASPLTPQHYANLVNLLKQKPFSANFHNSLCSTPYLLCSFDHPSEYYQSKKTNKPDA
NFAVYSNKKFATYGSSRLGGVDSFKNYSNGLNTNNDSFKKYSTTSTRHSGQFNSYAENGNVANT
NFTSYGSGSSSGTGEFKSYDKLVNDPNLGFTTYDSSATNHKLSFASYGNETNSGSESFNSYGKR
VRSGNSDFINYAVSSNILQSSFTGYGELGTGAANDSFTSYSFNGNNPRSTFKTYGAGSVSGSDT
FVSYRNRANVGDDSFQSYGSKSKSGAATFTNYGQSFNEGNDTFTEYGKGSSGKTAFGFKTYGLG
RAFKGYNKNGVSFSSYNNFSTFSGKIVNKFVEPGKFFRESMLKEGNVMVMPDIRDKMPERSFLP
LSISSKLPFSSSMLEDIKEAFHARDGSATEHVIKNALGECERGPSMGETKRCVGSAEAMIDFAV
SVLGPNVVVKTTESVNGSKNSVMIGKVYGINGGKVTKSVSCHQTLYPYLLYYCHSVPQVRVYEA
EILDVETKSKINHGVAICHLDTSSWGPQHGAFIALGSEPGKIEVCHWIFENDMTWTIAS
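The coverage figures in the Repeatfinder records are consistent with a simple ratio: occurrences multiplied by the repeat length, divided by the sequence length. This relation is inferred from the reported numbers, not from Repeatfinder documentation, and the helper name `coverage_percent` is hypothetical. A minimal check against three records from this report:

```python
def coverage_percent(occurrences, repeat_length, sequence_length):
    """Share of the sequence taken up by repeat instances, in percent."""
    return round(100.0 * occurrences * repeat_length / sequence_length, 2)


# (gene, occurrences, repeat length, sequence length, reported coverage in %)
records = [
    ("LOC127083614", 4, 7, 634, 4.42),     # DSF[KTQ][NSK]Y[GS]
    ("LOC127093676", 3, 11, 322, 10.25),   # FSTQV[GP][TP][EF][SK][ET][EQ]
    ("LOC127106000", 15, 14, 690, 30.43),  # NPW[LR]PWGSRE[VIT][NKI][KR][LSHPR]
]

for gene, count, length, seq_len, reported in records:
    computed = coverage_percent(count, length, seq_len)
    print(gene, computed, computed == reported)
```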

Similar gene clusters

NC_066583 - Cluster 29 - Saccharide

Gene cluster description

NC_066583 - Gene Cluster 29. Type = saccharide. Location: 586402507 - 586727140 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))
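The cluster description lines follow a fixed layout, so they can be parsed mechanically. The sketch below extracts the record, cluster number, type and coordinates from the Cluster 29 line and derives a size in kb as (end - start) / 1000, which is the convention the summary table appears to use (for example Cluster 3: 1700613 - 1534884 = 165729 nt, listed as 165.73 kb). The regular expression and the function name `parse_description` are illustrative assumptions, not part of plantiSMASH.

```python
import re

# Layout of a "Gene cluster description" line as it appears in this report.
DESCRIPTION = re.compile(
    r"(?P<record>\S+) - Gene Cluster (?P<number>\d+)\. "
    r"Type = (?P<type>\S+)\. "
    r"Location: (?P<start>\d+) - (?P<end>\d+) nt\."
)


def parse_description(line):
    """Return the fields of one cluster description line as a dict."""
    match = DESCRIPTION.search(line)
    if match is None:
        raise ValueError(f"unrecognised description line: {line!r}")
    start, end = int(match["start"]), int(match["end"])
    return {
        "record": match["record"],
        "number": int(match["number"]),
        "type": match["type"],
        "start": start,
        "end": end,
        # Size convention inferred from the summary table: (end - start) / 1000.
        "size_kb": round((end - start) / 1000, 2),
    }


line = ("NC_066583 - Gene Cluster 29. Type = saccharide. "
        "Location: 586402507 - 586727140 nt.")
cluster = parse_description(line)
print(cluster)
```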

Similar gene clusters

NC_066584 - Cluster 30 - Cyclopeptide

Gene cluster description

NC_066584 - Gene Cluster 30. Type = cyclopeptide. Location: 14105463 - 15825182 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127093676
Repeat occurs 3 times in a sequence of 322 amino acids
Location between 14652541 and 14669039
Coverage of 10.25 %
Instances:
FSTQVPPFSTQ | FSTQVPPFSTQ | FSTQVGTEKEE |
pattern: FSTQV[GP][TP][EF][SK][ET][EQ]
MQNYQNPNPQNSQIPPMPTNPAIFLPSPNNPNMYPIPQTNSNSMEFSTQVPPFSTQVPPFSTQ
VGTEKEERVVVKKRSREQFTREEDILLIQSWLNVSKDPIVGVDQKAESFWLRIVGSYNQYRGQL
REKLGGQLKCRWHRINGMVQKFVGCYKISLNGKKSGTSETDVMADAHAIFAQDQGTTFNLEYAW
RLSYDEAKWRIVEESIGSSAKITKTYASGASSENPDTTSSYEFNSSSPMERPMGQKAAKRKGKA
SEIPNATQDAKNKRAITMDRLAQAKEDELELRVVQMMMKDTSTMNDSQRDIHEKYCNKMKKKNM
ECS
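The three instances reported for LOC127093676 overlap in the sequence: each 11-residue instance starts 7 residues after the previous one. Read as a regular expression, the reported pattern recovers all three only when overlapping matches are allowed. The sketch below shows this on a short fragment of the sequence above; it does not describe how Repeatfinder itself searches, which this report does not state.

```python
import re

PATTERN = "FSTQV[GP][TP][EF][SK][ET][EQ]"
# Fragment of the LOC127093676 sequence around the repeat region.
fragment = "NSNSMEFSTQVPPFSTQVPPFSTQVGTEKEERVVVKK"

# A zero-width lookahead with a capture group reports overlapping hits.
overlapping = [m.group(1) for m in re.finditer(f"(?=({PATTERN}))", fragment)]
# A plain scan resumes after each match and therefore skips the middle one.
non_overlapping = re.findall(PATTERN, fragment)

print(overlapping)
print(non_overlapping)
```

The overlapping scan returns the three instances listed in the record, while the plain scan returns only the first and the last.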

Similar gene clusters

NC_066584 - Cluster 31 - Cyclopeptide

Gene cluster description

NC_066584 - Gene Cluster 31. Type = cyclopeptide. Location: 14928915 - 15617745 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output

No repeats detected in this cluster.

Similar gene clusters

NC_066584 - Cluster 32 - Cyclopeptide

Gene cluster description

NC_066584 - Gene Cluster 32. Type = cyclopeptide. Location: 15068227 - 15949968 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output

No repeats detected in this cluster.

Similar gene clusters

NC_066584 - Cluster 33 - Cyclopeptide

Gene cluster description

NC_066584 - Gene Cluster 33. Type = cyclopeptide. Location: 34223403 - 35401889 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output

No repeats detected in this cluster.
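Clusters 30, 31 and 32 on NC_066584 have overlapping coordinate ranges, while Cluster 33 lies well apart from them. A small interval check makes this explicit, which can help when deciding whether neighbouring cyclopeptide calls describe the same locus. The coordinates are copied from the description lines above; the helper name `overlaps` is hypothetical.

```python
from itertools import combinations

# Cluster number -> (start, end) on NC_066584, from the description lines.
clusters = {
    30: (14105463, 15825182),
    31: (14928915, 15617745),
    32: (15068227, 15949968),
    33: (34223403, 35401889),
}


def overlaps(a, b):
    """True if two closed intervals (start, end) share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


overlapping_pairs = [
    (i, j)
    for i, j in combinations(sorted(clusters), 2)
    if overlaps(clusters[i], clusters[j])
]
print(overlapping_pairs)
```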

Similar gene clusters

NC_066584 - Cluster 34 - Terpene

Gene cluster description

NC_066584 - Gene Cluster 34. Type = terpene. Location: 265558929 - 268252320 nt.
pHMM detection rules used:
plants/terpene: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Terpene_synth/Terpene_synth_C/Prenyltrans/SQHop_cyclase_C/SQHop_cyclase_N/PRISE]))

Similar gene clusters

NC_066585 - Cluster 35 - Alkaloid-fatty_acid

Gene cluster description

NC_066585 - Gene Cluster 35. Type = alkaloid-fatty_acid. Location: 161969102 - 162676728 nt.
pHMM detection rules used:
plants/alkaloid: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Bet_v_1/Cu_amine_oxid/Str_synth/BBE/Orn_DAP_Arg_deC/Pyridoxal_deC]))
plants/fatty_acid: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[FA_desaturase/FA_desaturase_2/FA_hydroxylase/CER1-like_C]) or minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,ECH_2]) or 
minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,AMP-binding]))

Similar gene clusters

NC_066585 - Cluster 36 - Terpene

Gene cluster description

NC_066585 - Gene Cluster 36. Type = terpene. Location: 181259511 - 182181420 nt.
pHMM detection rules used:
plants/terpene: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Terpene_synth/Terpene_synth_C/Prenyltrans/SQHop_cyclase_C/SQHop_cyclase_N/PRISE]))

Similar gene clusters

NC_066585 - Cluster 37 - Saccharide

Gene cluster description

NC_066585 - Gene Cluster 37. Type = saccharide. Location: 233677316 - 234181648 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066585 - Cluster 38 - Terpene

Gene cluster description

NC_066585 - Gene Cluster 38. Type = terpene. Location: 239871631 - 241273047 nt.
pHMM detection rules used:
plants/terpene: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Terpene_synth/Terpene_synth_C/Prenyltrans/SQHop_cyclase_C/SQHop_cyclase_N/PRISE]))

Similar gene clusters

NC_066585 - Cluster 39 - Cyclopeptide

Gene cluster description

NC_066585 - Gene Cluster 39. Type = cyclopeptide. Location: 290755896 - 298199303 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127101320
Repeat occurs 10 times in a sequence of 330 amino acids
Location between 292120408 and 292121902
Coverage of 39.39 %
Instances:
EFEPRPSVTKYDG | EFEPRPSVTKYDG | EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPRPSATKYDG
EFEPIPSVTKYDG | EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPRPSATKYND

pattern: EFEP[IR]PS[VA]TKY[ND][GD]
The following known motifs were found:
FEPR was found 9 times in this sequence
MRPALALLPFLFLFMFAVTIESRKDLKEYWKTVMKDEAMPEGIQGLLQFKSVIEPLKNSKAQE
QLAKGKCDQLDVKEKKLVKEEFEPRPSVTKYDGPSVTKYDGGEGNKNMKLLVNDEFEPRPSVTK
YDGPSVTKYDGDESYKNMKLSINDEFEPRPSATKYDGPSATKYDGNEGYKNIKLPVNDEFEPRP
SATKYDGPSATKYDGDDGYKNMKLPINDEFEPRPSATKYDGPSATKYDGDDGYKNMKLSVNDEF
EPIPSVTKYDGDEGYKNLKLTINDEFEPRPSATKYDGPSATKYDGDDGYQNMKLPINDEFEPRP
SATKYDGPSATKYDGDDGYQNMKLPINYEFEPRPSATKYDGPSATKYDGDDGYKNMKLPLNDEF
EPRPSATKYND
Repeat found in LOC127101320
Repeat occurs 13 times in a sequence of 408 amino acids
Location between 292120408 and 292121902
Coverage of 41.42 %
Instances:
EFEPRPSVTKYDG | EFEPRPSVTKYDG | EFEPRPSATKYDG | EFEPRPSVTKYDG | EFEPRPSVTKYDG
EFEPRPSATEYDG | EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPIPSVTKYDG | EFEPRPSATKYDG
EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPRPSATKYND |
pattern: EFEP[IR]PS[VA]T[EK]Y[ND][GD]
The following known motifs were found:
FEPR was found 12 times in this sequence
MRPALALLPFLFLFMFAVTIESRKDLKEYWKTVMKDEAMPEGIQGLLQFKSVIEPLKNSKAQE
QLAKGKCDQLDVKEKKLVKEEFEPRPSVTKYDGPSVTKYDGGEGNKNMKLLVNDEFEPRPSVTK
YDGPSVTKYDGDESYKNMKLSINDEFEPRPSATKYDGPSATKYDGNEGYKNIKLPVNDEFEPRP
SVTKYDGPSVTKYDGGEGNKNMKLLVNDEFEPRPSVTKYDGPSVTKYDGDESYKNMKLSINDEF
EPRPSATEYDGPSATEYDGNEGYKNIKLPVNDEFEPRPSATKYDGPSATKYDGDDGYKNMKLPI
NDEFEPRPSATKYDGPSATKYDGDDGYKNMKLSVNDEFEPIPSVTKYDGDEGYKNLKLTINDEF
EPRPSATKYDGPSATKYDGDDGYQNMKLPINDEFEPRPSATKYDGPSATKYDGDDGYQNMKLPI
NYEFEPRPSATKYDGPSATKYDGDDGYKNMKLPLNDEFEPRPSATKYND
Repeat found in LOC127105993
Repeat occurs 9 times in a sequence of 437 amino acids
Location between 293350214 and 293351747
Coverage of 12.36 %
Instances:
YITGYR | YITSYR | YITGYR | YITSYS | YITGYR
YITSYG | YITGYR | YITSYG | YITQYR |
pattern: YIT[GSQ]Y[GSR]
MAHALALQFLTLLLFFFMNGQAITARDLKAELQDHKPVDSNEEPYPTSYGNHEVKQPDYITGY
RTHSHDSNKPYITSYRNHEAKQPDYITGYRTHSHDSNKPYITSYSNHEAKQPDYITGYRTHADE
SNGQYITSYGKHEAKQPDYITGYRTHADESNGPYITSYGKHEAKQPYITQYRPTSLDLKGLTYP
NSKDQEGSASPNMDRTEAFKTGYFNMDDLYVGHVMTLQFPVQEVSPYLSKKEADYIPLSKSQLP
SVLQLFSIAEDSTQTKSMINTLEECEGETVTGETKICANSLESMLEFVDKIIGSDTKHSILSTS
KPTPTATPLQKYTILEVSHEIHTPKWVACHPLPYPYAIYYCHYIATGTKVFKVTLVGDENGDKM
EALGMCHLDTSEWNPDHIVFRQLGVKAGKNTPVCHFFPVNHLLWVPEEPSKATM
Repeat found in LOC127105993
Repeat occurs 9 times in a sequence of 434 amino acids
Location between 293350214 and 293351747
Coverage of 12.44 %
Instances:
YITGYR | YITSYR | YITGYR | YITSYS | YITGYR
YITSYG | YITGYR | YITSYG | YITQYR |
pattern: YIT[GSQ]Y[GSR]
MAHALALQFLTLLLFFFMAITARDLKAELQDHKPVDSNEEPYPTSYGNHEVKQPDYITGYRTH
SHDSNKPYITSYRNHEAKQPDYITGYRTHSHDSNKPYITSYSNHEAKQPDYITGYRTHADESNG
QYITSYGKHEAKQPDYITGYRTHADESNGPYITSYGKHEAKQPYITQYRPTSLDLKGLTYPNSK
DQEGSASPNMDRTEAFKTGYFNMDDLYVGHVMTLQFPVQEVSPYLSKKEADYIPLSKSQLPSVL
QLFSIAEDSTQTKSMINTLEECEGETVTGETKICANSLESMLEFVDKIIGSDTKHSILSTSKPT
PTATPLQKYTILEVSHEIHTPKWVACHPLPYPYAIYYCHYIATGTKVFKVTLVGDENGDKMEAL
GMCHLDTSEWNPDHIVFRQLGVKAGKNTPVCHFFPVNHLLWVPEEPSKATM
Repeat found in LOC127106000
Repeat occurs 15 times in a sequence of 690 amino acids
Location between 294476165 and 294479033
Coverage of 30.43 %
Instances:
NPWLPWGSRETKKP | NPWLPWGSREIKKP | NPWLPWGSRETKRP | NPWLPWGSRETIKP | NPWLPWGSREIKRP
NPWLPWGSRETKRP | NPWLPWGSRETIKP | NPWLPWGSREIKKP | NPWLPWGSRETKRP | NPWLPWGSRETIKP
NPWLPWGSRETKKR | NPWLPWGSRETKKS | NPWLPWGSREINKL | NPWRPWGSRETKKH | NPWLPWGSREVKKP

pattern: NPW[LR]PWGSRE[VIT][NKI][KR][LSHPR]
MKTLYNTQKMAPTLAFHFLSLVLFFVTMGEGIIVEDMKIELPDQKDIEEAKQSNHLHNLIDEA
KKPNYNSEDITHDPNPWLPWGSRETKKPIYNSEVNTHDPNPWLPWGSREIKKPIYNSEVNTRDL
NPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPWLPWGSREIKRP
IYNSEVNTRDPNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPW
LPWGSREIKKPIYNSEVNTRDLNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYN
NEVNTRDLNPWLPWGSRETKKRNYNSEADTHDPNPWLPWGSRETKKSNYNSEVNPRDPNPWLPW
GSREINKLNYDSEVNTRDPNPWRPWGSRETKKHNFNSEVYTRNPNPWLPWGSREVKKPKYNYKI
KTHDPNPYIDHTDAFEKGFFNLEDLHVGNVMTLQFSVQEIPHFFSRKEEADSIPFSVSQFSSVL
QLFSIPEDSLEAKTMRGTLEHCQEETVVGETKICANSVESMFEFVDTIIGSENKHNILRTSYPS
PTAAPLQKYTILKVSHDIDAPKWVSCHPLPYPYAVYYCHTMATGTRVFKVTLVGDKNGDKMEAL
GMCHLDTADWNPNHMIFKTLKVKPGKNTPVCHFFSINHLLWLPLPDSKVTM
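For the two LOC127101320 records above, the known-motif counts (FEPR found 9 and 12 times) match the number of listed repeat instances that contain FEPR; in each record the single EFEPIPS... instance lacks it. The sketch below checks this from the instance lists alone. It assumes the motif count refers to plain substring occurrences, which the report does not state, and the variable names are hypothetical.

```python
import re

# Instance lists copied from the two LOC127101320 Repeatfinder records.
instances_330 = (
    ["EFEPRPSVTKYDG"] * 2 + ["EFEPRPSATKYDG"] * 3 + ["EFEPIPSVTKYDG"]
    + ["EFEPRPSATKYDG"] * 3 + ["EFEPRPSATKYND"]
)
instances_408 = (
    ["EFEPRPSVTKYDG"] * 2 + ["EFEPRPSATKYDG"] + ["EFEPRPSVTKYDG"] * 2
    + ["EFEPRPSATEYDG"] + ["EFEPRPSATKYDG"] * 2 + ["EFEPIPSVTKYDG"]
    + ["EFEPRPSATKYDG"] * 3 + ["EFEPRPSATKYND"]
)

MOTIF = "FEPR"
# Pattern reported for the 408-residue record, read as a regular expression.
PATTERN = re.compile("EFEP[IR]PS[VA]T[EK]Y[ND][GD]")

for name, instances in (("330 aa", instances_330), ("408 aa", instances_408)):
    with_motif = sum(MOTIF in instance for instance in instances)
    all_match = all(PATTERN.fullmatch(instance) for instance in instances)
    print(name, len(instances), with_motif, all_match)
```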

Similar gene clusters

NC_066585 - Cluster 40 - Cyclopeptide

Gene cluster description

NC_066585 - Gene Cluster 40. Type = cyclopeptide. Location: 291424092 - 294497725 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127101320
Repeat occurs 10 times in a sequence of 330 amino acids
Location between 292120408 and 292121902
Coverage of 39.39 %
Instances:
EFEPRPSVTKYDG | EFEPRPSVTKYDG | EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPRPSATKYDG
EFEPIPSVTKYDG | EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPRPSATKYND

pattern: EFEP[IR]PS[VA]TKY[ND][GD]
The following known motifs were found:
FEPR was found 9 times in this sequence
MRPALALLPFLFLFMFAVTIESRKDLKEYWKTVMKDEAMPEGIQGLLQFKSVIEPLKNSKAQE
QLAKGKCDQLDVKEKKLVKEEFEPRPSVTKYDGPSVTKYDGGEGNKNMKLLVNDEFEPRPSVTK
YDGPSVTKYDGDESYKNMKLSINDEFEPRPSATKYDGPSATKYDGNEGYKNIKLPVNDEFEPRP
SATKYDGPSATKYDGDDGYKNMKLPINDEFEPRPSATKYDGPSATKYDGDDGYKNMKLSVNDEF
EPIPSVTKYDGDEGYKNLKLTINDEFEPRPSATKYDGPSATKYDGDDGYQNMKLPINDEFEPRP
SATKYDGPSATKYDGDDGYQNMKLPINYEFEPRPSATKYDGPSATKYDGDDGYKNMKLPLNDEF
EPRPSATKYND
Repeat found in LOC127101320
Repeat occurs 13 times in a sequence of 408 amino acids
Location between 292120408 and 292121902
Coverage of 41.42 %
Instances:
EFEPRPSVTKYDG | EFEPRPSVTKYDG | EFEPRPSATKYDG | EFEPRPSVTKYDG | EFEPRPSVTKYDG
EFEPRPSATEYDG | EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPIPSVTKYDG | EFEPRPSATKYDG
EFEPRPSATKYDG | EFEPRPSATKYDG | EFEPRPSATKYND |
pattern: EFEP[IR]PS[VA]T[EK]Y[ND][GD]
The following known motifs were found:
FEPR was found 12 times in this sequence
MRPALALLPFLFLFMFAVTIESRKDLKEYWKTVMKDEAMPEGIQGLLQFKSVIEPLKNSKAQE
QLAKGKCDQLDVKEKKLVKEEFEPRPSVTKYDGPSVTKYDGGEGNKNMKLLVNDEFEPRPSVTK
YDGPSVTKYDGDESYKNMKLSINDEFEPRPSATKYDGPSATKYDGNEGYKNIKLPVNDEFEPRP
SVTKYDGPSVTKYDGGEGNKNMKLLVNDEFEPRPSVTKYDGPSVTKYDGDESYKNMKLSINDEF
EPRPSATEYDGPSATEYDGNEGYKNIKLPVNDEFEPRPSATKYDGPSATKYDGDDGYKNMKLPI
NDEFEPRPSATKYDGPSATKYDGDDGYKNMKLSVNDEFEPIPSVTKYDGDEGYKNLKLTINDEF
EPRPSATKYDGPSATKYDGDDGYQNMKLPINDEFEPRPSATKYDGPSATKYDGDDGYQNMKLPI
NYEFEPRPSATKYDGPSATKYDGDDGYKNMKLPLNDEFEPRPSATKYND
Repeat found in LOC127105993
Repeat occurs 9 times in a sequence of 437 amino acids
Location between 293350214 and 293351747
Coverage of 12.36 %
Instances:
YITGYR | YITSYR | YITGYR | YITSYS | YITGYR
YITSYG | YITGYR | YITSYG | YITQYR |
pattern: YIT[GSQ]Y[GSR]
MAHALALQFLTLLLFFFMNGQAITARDLKAELQDHKPVDSNEEPYPTSYGNHEVKQPDYITGY
RTHSHDSNKPYITSYRNHEAKQPDYITGYRTHSHDSNKPYITSYSNHEAKQPDYITGYRTHADE
SNGQYITSYGKHEAKQPDYITGYRTHADESNGPYITSYGKHEAKQPYITQYRPTSLDLKGLTYP
NSKDQEGSASPNMDRTEAFKTGYFNMDDLYVGHVMTLQFPVQEVSPYLSKKEADYIPLSKSQLP
SVLQLFSIAEDSTQTKSMINTLEECEGETVTGETKICANSLESMLEFVDKIIGSDTKHSILSTS
KPTPTATPLQKYTILEVSHEIHTPKWVACHPLPYPYAIYYCHYIATGTKVFKVTLVGDENGDKM
EALGMCHLDTSEWNPDHIVFRQLGVKAGKNTPVCHFFPVNHLLWVPEEPSKATM
Repeat found in LOC127105993
Repeat occurs 9 times in a sequence of 434 amino acids
Location between 293350214 and 293351747
Coverage of 12.44 %
Instances:
YITGYR | YITSYR | YITGYR | YITSYS | YITGYR
YITSYG | YITGYR | YITSYG | YITQYR |
pattern: YIT[GSQ]Y[GSR]
MAHALALQFLTLLLFFFMAITARDLKAELQDHKPVDSNEEPYPTSYGNHEVKQPDYITGYRTH
SHDSNKPYITSYRNHEAKQPDYITGYRTHSHDSNKPYITSYSNHEAKQPDYITGYRTHADESNG
QYITSYGKHEAKQPDYITGYRTHADESNGPYITSYGKHEAKQPYITQYRPTSLDLKGLTYPNSK
DQEGSASPNMDRTEAFKTGYFNMDDLYVGHVMTLQFPVQEVSPYLSKKEADYIPLSKSQLPSVL
QLFSIAEDSTQTKSMINTLEECEGETVTGETKICANSLESMLEFVDKIIGSDTKHSILSTSKPT
PTATPLQKYTILEVSHEIHTPKWVACHPLPYPYAIYYCHYIATGTKVFKVTLVGDENGDKMEAL
GMCHLDTSEWNPDHIVFRQLGVKAGKNTPVCHFFPVNHLLWVPEEPSKATM
Repeat found in LOC127106000
Repeat occurs 15 times in a sequence of 690 amino acids
Location between 294476165 and 294479033
Coverage of 30.43 %
Instances:
NPWLPWGSRETKKP | NPWLPWGSREIKKP | NPWLPWGSRETKRP | NPWLPWGSRETIKP | NPWLPWGSREIKRP
NPWLPWGSRETKRP | NPWLPWGSRETIKP | NPWLPWGSREIKKP | NPWLPWGSRETKRP | NPWLPWGSRETIKP
NPWLPWGSRETKKR | NPWLPWGSRETKKS | NPWLPWGSREINKL | NPWRPWGSRETKKH | NPWLPWGSREVKKP

pattern: NPW[LR]PWGSRE[VIT][NKI][KR][LSHPR]
MKTLYNTQKMAPTLAFHFLSLVLFFVTMGEGIIVEDMKIELPDQKDIEEAKQSNHLHNLIDEA
KKPNYNSEDITHDPNPWLPWGSRETKKPIYNSEVNTHDPNPWLPWGSREIKKPIYNSEVNTRDL
NPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPWLPWGSREIKRP
IYNSEVNTRDPNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPW
LPWGSREIKKPIYNSEVNTRDLNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYN
NEVNTRDLNPWLPWGSRETKKRNYNSEADTHDPNPWLPWGSRETKKSNYNSEVNPRDPNPWLPW
GSREINKLNYDSEVNTRDPNPWRPWGSRETKKHNFNSEVYTRNPNPWLPWGSREVKKPKYNYKI
KTHDPNPYIDHTDAFEKGFFNLEDLHVGNVMTLQFSVQEIPHFFSRKEEADSIPFSVSQFSSVL
QLFSIPEDSLEAKTMRGTLEHCQEETVVGETKICANSVESMFEFVDTIIGSENKHNILRTSYPS
PTAAPLQKYTILKVSHDIDAPKWVSCHPLPYPYAVYYCHTMATGTRVFKVTLVGDKNGDKMEAL
GMCHLDTADWNPNHMIFKTLKVKPGKNTPVCHFFSINHLLWLPLPDSKVTM

Similar gene clusters

NC_066585 - Cluster 41 - Cyclopeptide

Gene cluster description

NC_066585 - Gene Cluster 41. Type = cyclopeptide. Location: 292231994 - 294566446 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeatfinder output


Repeat found in LOC127105993
Repeat occurs 9 times in a sequence of 437 amino acids
Location between 293350214 and 293351747
Coverage of 12.36 %
Instances:
YITGYR | YITSYR | YITGYR | YITSYS | YITGYR
YITSYG | YITGYR | YITSYG | YITQYR |
pattern: YIT[GSQ]Y[GSR]
MAHALALQFLTLLLFFFMNGQAITARDLKAELQDHKPVDSNEEPYPTSYGNHEVKQPDYITGY
RTHSHDSNKPYITSYRNHEAKQPDYITGYRTHSHDSNKPYITSYSNHEAKQPDYITGYRTHADE
SNGQYITSYGKHEAKQPDYITGYRTHADESNGPYITSYGKHEAKQPYITQYRPTSLDLKGLTYP
NSKDQEGSASPNMDRTEAFKTGYFNMDDLYVGHVMTLQFPVQEVSPYLSKKEADYIPLSKSQLP
SVLQLFSIAEDSTQTKSMINTLEECEGETVTGETKICANSLESMLEFVDKIIGSDTKHSILSTS
KPTPTATPLQKYTILEVSHEIHTPKWVACHPLPYPYAIYYCHYIATGTKVFKVTLVGDENGDKM
EALGMCHLDTSEWNPDHIVFRQLGVKAGKNTPVCHFFPVNHLLWVPEEPSKATM
Repeat found in LOC127105993
Repeat occurs 9 times in a sequence of 434 amino acids
Location between 293350214 and 293351747
Coverage of 12.44 %
Instances:
YITGYR | YITSYR | YITGYR | YITSYS | YITGYR
YITSYG | YITGYR | YITSYG | YITQYR
pattern: YIT[GSQ]Y[GSR]
MAHALALQFLTLLLFFFMAITARDLKAELQDHKPVDSNEEPYPTSYGNHEVKQPDYITGYRTH
SHDSNKPYITSYRNHEAKQPDYITGYRTHSHDSNKPYITSYSNHEAKQPDYITGYRTHADESNG
QYITSYGKHEAKQPDYITGYRTHADESNGPYITSYGKHEAKQPYITQYRPTSLDLKGLTYPNSK
DQEGSASPNMDRTEAFKTGYFNMDDLYVGHVMTLQFPVQEVSPYLSKKEADYIPLSKSQLPSVL
QLFSIAEDSTQTKSMINTLEECEGETVTGETKICANSLESMLEFVDKIIGSDTKHSILSTSKPT
PTATPLQKYTILEVSHEIHTPKWVACHPLPYPYAIYYCHYIATGTKVFKVTLVGDENGDKMEAL
GMCHLDTSEWNPDHIVFRQLGVKAGKNTPVCHFFPVNHLLWVPEEPSKATM
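
The reported pattern `YIT[GSQ]Y[GSR]` happens to be valid regular-expression syntax, so the listed instances can be re-derived from the sequence. The sketch below is an illustration only and does not reproduce how Repeatfinder itself works; `fragment` is one 64-residue line copied from the 434-residue LOC127105993 sequence above, and the variable names are assumptions.

```python
import re

# Pattern copied verbatim from the Repeatfinder output for LOC127105993.
MOTIF = re.compile(r"YIT[GSQ]Y[GSR]")

# One line of the 434-residue sequence listed above (not the full protein).
fragment = "SHDSNKPYITSYRNHEAKQPDYITGYRTHSHDSNKPYITSYSNHEAKQPDYITGYRTHADESNG"

# finditer yields non-overlapping matches from left to right;
# positions are converted to 1-based residue numbers within the fragment.
hits = [(m.start() + 1, m.group()) for m in MOTIF.finditer(fragment)]

for position, instance in hits:
    print(position, instance)
```

Running the same scan over the complete sequence should recover all instances listed under "Instances:", provided the sequence is first joined into a single string without line breaks.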
Repeat found in LOC127106000
Repeat occurs 15 times in a sequence of 690 amino acids
Location between 294476165 and 294479033
Coverage of 30.43 %
Instances:
NPWLPWGSRETKKP | NPWLPWGSREIKKP | NPWLPWGSRETKRP | NPWLPWGSRETIKP | NPWLPWGSREIKRP
NPWLPWGSRETKRP | NPWLPWGSRETIKP | NPWLPWGSREIKKP | NPWLPWGSRETKRP | NPWLPWGSRETIKP
NPWLPWGSRETKKR | NPWLPWGSRETKKS | NPWLPWGSREINKL | NPWRPWGSRETKKH | NPWLPWGSREVKKP

pattern: NPW[LR]PWGSRE[VIT][NKI][KR][LSHPR]
MKTLYNTQKMAPTLAFHFLSLVLFFVTMGEGIIVEDMKIELPDQKDIEEAKQSNHLHNLIDEA
KKPNYNSEDITHDPNPWLPWGSRETKKPIYNSEVNTHDPNPWLPWGSREIKKPIYNSEVNTRDL
NPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPWLPWGSREIKRP
IYNSEVNTRDPNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPW
LPWGSREIKKPIYNSEVNTRDLNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYN
NEVNTRDLNPWLPWGSRETKKRNYNSEADTHDPNPWLPWGSRETKKSNYNSEVNPRDPNPWLPW
GSREINKLNYDSEVNTRDPNPWRPWGSRETKKHNFNSEVYTRNPNPWLPWGSREVKKPKYNYKI
KTHDPNPYIDHTDAFEKGFFNLEDLHVGNVMTLQFSVQEIPHFFSRKEEADSIPFSVSQFSSVL
QLFSIPEDSLEAKTMRGTLEHCQEETVVGETKICANSVESMFEFVDTIIGSENKHNILRTSYPS
PTAAPLQKYTILKVSHDIDAPKWVSCHPLPYPYAVYYCHTMATGTRVFKVTLVGDKNGDKMEAL
GMCHLDTADWNPNHMIFKTLKVKPGKNTPVCHFFSINHLLWLPLPDSKVTM
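
The "Coverage" figures in the Repeatfinder output for this cluster are consistent with a simple ratio: number of repeat instances times motif length, divided by protein length. This formula is inferred from the reported numbers and is an assumption, not a documented definition; the function name `repeat_coverage` is hypothetical.

```python
def repeat_coverage(count, motif_length, protein_length):
    """Percentage of the protein covered by repeat instances (assumed formula)."""
    return round(100 * count * motif_length / protein_length, 2)


# (gene, instances, motif length, protein length) as reported above.
reported = [
    ("LOC127105993", 9, len("YITGYR"), 437),
    ("LOC127105993", 9, len("YITGYR"), 434),
    ("LOC127106000", 15, len("NPWLPWGSRETKKP"), 690),
]

coverages = [repeat_coverage(n, k, length) for _, n, k, length in reported]

for (gene, _, _, length), value in zip(reported, coverages):
    print(gene, length, "aa:", value, "%")
```

The three values match the reported 12.36 %, 12.44 % and 30.43 %, which also shows that the two LOC127105993 entries differ only in protein length (437 versus 434 amino acids), not in repeat content.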

Similar gene clusters

NC_066585 - Cluster 42 - Polyketide

Gene cluster description

NC_066585 - Gene Cluster 42. Type = polyketide. Location: 304249747 - 305484261 nt.
pHMM detection rules used:
plants/plant: (minimum(4,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[]))
plants/polyketide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Chal_sti_synt_C/Chal_sti_synt_N]) or minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Thr_dehydrat_C]) or 
minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Chal_sti_synt_C,Chal_sti_synt_N]))

Similar gene clusters

NC_066585 - Cluster 43 - Saccharide

Gene cluster description

NC_066585 - Gene Cluster 43. Type = saccharide. Location: 471104851 - 472933702 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066585 - Cluster 44 - Lignan

Gene cluster description

NC_066585 - Gene Cluster 44. Type = lignan. Location: 489750385 - 490502114 nt.
pHMM detection rules used:
plants/lignan: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Dirigent]))

Similar gene clusters

NC_066585 - Cluster 45 - Saccharide

Gene cluster description

NC_066585 - Gene Cluster 45. Type = saccharide. Location: 505507535 - 506336966 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066585 - Cluster 46 - Lignan-saccharide

Gene cluster description

NC_066585 - Gene Cluster 46. Type = lignan-saccharide. Location: 536942171 - 537667057 nt.
pHMM detection rules used:
plants/lignan: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Dirigent]))
plants/saccharide: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))
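
Cluster 46 is typed lignan-saccharide because two rules fire on the same region. The sketch below shows one plausible reading of a `minimum(n, [domains], [required])` rule; the semantics are an assumption, not taken from plantiSMASH documentation: at least `n` distinct domains from the first list must be present, every comma-separated entry of the second list must be satisfied, and an entry written with `/` is satisfied by any one of its alternatives. The domain list is shortened for readability, and the example domain sets are hypothetical, not the actual content of Cluster 46.

```python
def minimum(n, allowed, required, found):
    """Evaluate a minimum(n, [allowed], [required]) rule (assumed semantics)."""
    # At least n distinct allowed domains must be present in the region.
    if len(set(allowed) & set(found)) < n:
        return False
    # Each required entry must be met; "/" separates acceptable alternatives.
    return all(
        any(alternative in found for alternative in entry.split("/"))
        for entry in required
    )


# Shortened subset of the domain list shared by the rules above.
ALLOWED = ["Dirigent", "UDPGT", "UDPGT_2", "p450", "Glyco_hydro_1",
           "Cellulose_synt", "Methyltransf_11", "Transferase"]

RULES = {
    "lignan": (3, ALLOWED, ["Dirigent"]),
    "saccharide": (3, ALLOWED, [
        "Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/"
        "UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt"
    ]),
}


def matching_types(found):
    """Return the sorted names of all rules satisfied by a set of domains."""
    return sorted(
        name for name, (n, allowed, required) in RULES.items()
        if minimum(n, allowed, required, found)
    )


# Hypothetical domain content of a region.
print("-".join(matching_types({"Dirigent", "UDPGT_2", "p450"})))
```

With this reading, a region containing a Dirigent domain, a glycosyltransferase and one further listed domain satisfies both rules, which would explain a hybrid type label.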

Similar gene clusters

NC_066585 - Cluster 47 - Terpene

Gene cluster description

NC_066585 - Gene Cluster 47. Type = terpene. Location: 540270736 - 540492999 nt.
pHMM detection rules used:
plants/terpene: (minimum(3,[NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Terpene_synth/Terpene_synth_C/Prenyltrans/SQHop_cyclase_C/SQHop_cyclase_N/PRISE]))

Similar gene clusters

Similar known gene clusters