Identified secondary metabolite clusters

Cluster | Type | From | To | Size (kb) | Core domains | Product/substrate predicted by subgroup | Most similar known cluster | MIBiG BGC-ID
The following clusters are from record NC_066579.1:
Cluster 1 | Fatty_acid-Mate-Alkaloid | 400266933 | 400951382 | 684.45 | BBE, FAD_binding_4, FA_hydroxylase, MatE | - | - | -
Cluster 2 | Saccharide-Alkaloid | 457935053 | 458130707 | 195.65 | Pyridoxal_deC, UDPGT_2 | *saccharide-2, cyanogenic glucoside, monoterpenoid | - | -
The following clusters are from record NC_066580.1:
Cluster 3 | Saccharide | 1534884 | 1700613 | 165.73 | Acetyltransf_1, Lyase_aromatic, UDPGT_2 | - | - | -
Cluster 4 | Saccharide | 22377888 | 23492931 | 1115.04 | Transferase, UDPGT_2 | flavonoid | - | -
Cluster 5 | Saccharide | 136017328 | 136286230 | 268.90 | UDPGT_2, p450 | - | - | -
Cluster 6 | Cyclopeptide | 319960237 | 324407479 | 4447.24 | BURP | - | - | -
Cluster 7 | Polyketide | 490186082 | 490582249 | 396.17 | AMP-binding, Acetyltransf_1, Chal_sti_synt_C, Chal_sti_synt_N | - | - | -
The following clusters are from record NC_066581.1:
Cluster 8 | Fatty_acid | 300701301 | 301712231 | 1010.93 | Epimerase, FA_hydroxylase, PALP, p450 | - | - | -
Cluster 9 | Saccharide | 318388410 | 318943107 | 554.70 | Aldo_ket_red, Methyltransf_11, UDPGT_2 | flavonoid-5, oleananes-5 | - | -
Cluster 10 | Cyclopeptide | 492976799 | 495438359 | 2461.56 | BURP | - | - | -
The following clusters are from record NC_066582.1:
Cluster 11 | Saccharide | 46211827 | 46466121 | 254.29 | 2OG-FeII_Oxy, DIOX_N, Glyco_hydro_1, Peptidase_S10 | - | - | -
Cluster 12 | Saccharide | 176346356 | 177168869 | 822.51 | UDPGT_2, p450, polyprenyl_synt | - | - | -
Cluster 13 | Cyclopeptide | 275870531 | 282150978 | 6280.45 | BURP | - | - | -
Cluster 14 | Cyclopeptide | 378556867 | 393027082 | 14470.22 | BURP | - | - | -
Cluster 15 | Cyclopeptide | 386776475 | 389381266 | 2604.79 | BURP | - | - | -
Cluster 16 | Cyclopeptide | 485797570 | 487497883 | 1700.31 | BURP | - | - | -
Cluster 17 | Saccharide | 496896586 | 497081872 | 185.29 | UDPGT_2, p450 | small phenolic-4 | - | -
The following clusters are from record NC_066583.1:
Cluster 18 | Cyclopeptide | 12167775 | 16798869 | 4631.09 | BURP | - | - | -
Cluster 19 | Cyclopeptide | 13274831 | 18413883 | 5139.05 | BURP | - | - | -
Cluster 20 | Cyclopeptide | 15983889 | 18454160 | 2470.27 | BURP | - | - | -
Cluster 21 | Cyclopeptide | 16808328 | 19526976 | 2718.65 | BURP | - | - | -
Cluster 22 | Saccharide | 102550203 | 103091286 | 541.08 | Aminotran_1_2, Glycos_transf_2, SE | - | - | -
Cluster 23 | Polyketide | 194555028 | 195494026 | 939.00 | 2OG-FeII_Oxy, Chal_sti_synt_C, Chal_sti_synt_N, DIOX_N, Methyltransf_11 | - | - | -
Cluster 24 | Polyketide | 214812041 | 217867500 | 3055.46 | Chal_sti_synt_C, Chal_sti_synt_N, adh_short | - | - | -
Cluster 25 | Saccharide | 246094858 | 246596422 | 501.56 | NAD_binding_1, Peptidase_S10, UDPGT_2 | cyanogenic glucoside-5, monoterpenoid-5 | - | -
Cluster 26 | Fatty_acid | 248378010 | 249024385 | 646.38 | ADH_N, ADH_zinc_N, FA_desaturase_2, Lipoxygenase, adh_short | - | - | -
The following clusters are from record NC_066584.1:
Cluster 27 | Terpene | 265558929 | 268252320 | 2693.39 | Terpene_synth, Terpene_synth_C, Transferase, p450 | - | - | -
The following clusters are from record NC_066585.1:
Cluster 28 | Fatty_acid-Alkaloid | 161969102 | 162676728 | 707.63 | BBE, FAD_binding_4, FA_hydroxylase | - | - | -
Cluster 29 | Terpene | 181259511 | 182181420 | 921.91 | Epimerase, Terpene_synth, Terpene_synth_C | - | - | -
Cluster 30 | Saccharide | 233677316 | 234181648 | 504.33 | Epimerase, UDPGT_2, adh_short_C2 | flavonoid | - | -
Cluster 31 | Terpene | 239871631 | 241273047 | 1401.42 | Epimerase, Peptidase_S10, Terpene_synth, Terpene_synth_C | - | - | -
Cluster 32 | Cyclopeptide | 290755896 | 298199303 | 7443.41 | BURP | - | - | -
Cluster 33 | Cyclopeptide | 291424092 | 294497725 | 3073.63 | BURP | - | - | -
Cluster 34 | Cyclopeptide | 292231994 | 294566446 | 2334.45 | BURP | - | - | -
Cluster 35 | Polyketide | 304249747 | 305484261 | 1234.51 | 3Beta_HSD, Chal_sti_synt_C, Epimerase, p450 | - | - | -
Cluster 36 | Saccharide | 471104851 | 472933702 | 1828.85 | AMP-binding, Glyco_hydro_1, Lipoxygenase | - | - | -
Cluster 37 | Lignan | 489750385 | 490502114 | 751.73 | Dirigent, p450 | - | - | -
Cluster 38 | Saccharide | 505507535 | 506336966 | 829.43 | UDPGT_2, p450 | flavonoid, oleananes | - | -
Cluster 39 | Lignan-Saccharide | 536942171 | 537667057 | 724.89 | Dirigent, Glyco_hydro_1, Methyltransf_11 | - | - | -
Cluster 40 | Terpene | 540270736 | 540492999 | 222.26 | Epimerase, SQHop_cyclase_C, SQHop_cyclase_N, p450 | beta-amyrin-2, triterpene-2 | yossoside I/yossoside II/yossoside III/yossoside IV/yossos... (80% of genes show similarity) | BGC0002402.2_c1
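
The Size (kb) column appears to be the span between the From and To coordinates, expressed in kilobases. A minimal Python sketch of that arithmetic, checked against the coordinates listed for Cluster 1 and Cluster 3; the function name `cluster_size_kb` is illustrative and not part of the tool's output:

```python
def cluster_size_kb(start: int, end: int) -> float:
    """Return the cluster span in kilobases, rounded to two decimals.

    Assumption: size = (To - From) / 1000, which matches the table values.
    """
    return round((end - start) / 1000, 2)


# Coordinates copied from the table (Cluster 1 and Cluster 3).
print(cluster_size_kb(400266933, 400951382))  # 684.45
print(cluster_size_kb(1534884, 1700613))      # 165.73
```

The same relation can be used to split the From, To and Size values apart when they are run together in a copied table row.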

NC_066579 - Cluster 1 - Fatty_acid-mate-alkaloid

Gene cluster description

NC_066579 - Gene Cluster 1. Type = fatty_acid-MatE-alkaloid. Location: 400266933 - 400951382 nt.
pHMM detection rules used:
plants/MatE: (minimum(4,[MatE,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[MatE]))
plants/alkaloid: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Bet_v_1/Cu_amine_oxid/Str_synth/BBE/Orn_DAP_Arg_deC/Pyridoxal_deC]))
plants/fatty_acid: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[FA_desaturase/FA_desaturase_2/FA_hydroxylase/CER1-like_C]) or minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,ECH_2]) or 
minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,AMP-binding]))
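
The rules above are long but share one shape: one or more `minimum(n,[...],[...])` terms joined by `or`. The following Python sketch only splits a rule into those parts so that it is easier to read; it does not evaluate the rule, because the exact matching semantics are not stated here. The field names `threshold`, `candidates` and `required` are labels chosen for this sketch, and the example rule is deliberately shortened.

```python
import re

# Matches one "minimum(n,[...],[...])" term. The field names used below
# (threshold, candidates, required) are labels chosen for this sketch.
RULE_TERM = re.compile(r"minimum\((\d+),\[([^\]]*)\],\[([^\]]*)\]\)")


def parse_rule(rule: str) -> list[dict]:
    """Split a detection rule into its minimum(...) terms."""
    terms = []
    for n, candidates, required in RULE_TERM.findall(rule):
        terms.append({
            "threshold": int(n),
            "candidates": candidates.split(","),
            # Keep "/"-separated names grouped so the original notation is preserved.
            "required": [item.split("/") for item in required.split(",")],
        })
    return terms


# Shortened rule for illustration; the real domain lists are much longer.
example = (
    "(minimum(3,[BBE,FA_hydroxylase,p450],[FA_desaturase/FA_hydroxylase]) or "
    "minimum(3,[BBE,Transferase,ECH_2],[Transferase,ECH_2]))"
)
for term in parse_rule(example):
    print(term["threshold"], len(term["candidates"]), term["required"])
```

Because the pattern is applied with `findall`, a line break between `or` and the next `minimum(` term does not affect the result.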

Legend:

Only available when smCOG analysis was run
biosynthetic genes
transport-related genes
regulatory genes
other genes

Similar gene clusters

NC_066579 - Cluster 2 - Saccharide-alkaloid

Gene cluster description

NC_066579 - Gene Cluster 2. Type = saccharide-alkaloid. Location: 457935053 - 458130707 nt.
pHMM detection rules used:
plants/alkaloid: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Bet_v_1/Cu_amine_oxid/Str_synth/BBE/Orn_DAP_Arg_deC/Pyridoxal_deC]))
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

NC_066580 - Cluster 3 - Saccharide

Gene cluster description

NC_066580 - Gene Cluster 3. Type = saccharide. Location: 1534884 - 1700613 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

NC_066580 - Cluster 4 - Saccharide

Gene cluster description

NC_066580 - Gene Cluster 4. Type = saccharide. Location: 22377888 - 23492931 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

NC_066580 - Cluster 5 - Saccharide

Gene cluster description

NC_066580 - Gene Cluster 5. Type = saccharide. Location: 136017328 - 136286230 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

NC_066580 - Cluster 6 - Cyclopeptide

Gene cluster description

NC_066580 - Gene Cluster 6. Type = cyclopeptide. Location: 319960237 - 324407479 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

NC_066580 - Cluster 7 - Polyketide

Gene cluster description

NC_066580 - Gene Cluster 7. Type = polyketide. Location: 490186082 - 490582249 nt.
pHMM detection rules used:
plants/polyketide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Chal_sti_synt_C/Chal_sti_synt_N]) or minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Thr_dehydrat_C]) or 
minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Chal_sti_synt_C,Chal_sti_synt_N]))

NC_066581 - Cluster 8 - Fatty_acid

Gene cluster description

NC_066581 - Gene Cluster 8. Type = fatty_acid. Location: 300701301 - 301712231 nt.
pHMM detection rules used:
plants/fatty_acid: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[FA_desaturase/FA_desaturase_2/FA_hydroxylase/CER1-like_C]) or minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,ECH_2]) or 
minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,AMP-binding]))

NC_066581 - Cluster 9 - Saccharide

Gene cluster description

NC_066581 - Gene Cluster 9. Type = saccharide. Location: 318388410 - 318943107 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

NC_066581 - Cluster 10 - Cyclopeptide

Gene cluster description

NC_066581 - Gene Cluster 10. Type = cyclopeptide. Location: 492976799 - 495438359 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

NC_066582 - Cluster 11 - Saccharide

Gene cluster description

NC_066582 - Gene Cluster 11. Type = saccharide. Location: 46211827 - 46466121 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

NC_066582 - Cluster 12 - Saccharide

Gene cluster description

NC_066582 - Gene Cluster 12. Type = saccharide. Location: 176346356 - 177168869 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

NC_066582 - Cluster 13 - Cyclopeptide

Gene cluster description

NC_066582 - Gene Cluster 13. Type = cyclopeptide. Location: 275870531 - 282150978 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

NC_066582 - Cluster 14 - Cyclopeptide

Gene cluster description

NC_066582 - Gene Cluster 14. Type = cyclopeptide. Location: 378556867 - 393027082 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeat found in LOC127075590
Repeat occurs 7 times in a sequence of 470 amino acids
Location between 387462977 and 387475338
Coverage of 8.94 %
Instances:
NQPFGT | NQPFGT | NQPFGT | NQPFGT | NQPFGT
NQPFGL | NQPFET |
pattern: NQPF[EG][TL]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGTFVWWYKKETGRPTTRSDKETKIETTATNQPFGTFVWWNKKEFDRPTTIRSDKLTKIETTRA
NQPFGTFVWWYKKEIERPTIRSDKTTKIETTTTNQPFGTFVWWNKKETDRPTIRSDKVTKIETT
RTNQPFGTTAWWHKKETEIETENNLLEENQPFGLSEQGKKETKKSNQPFETQTLDEKEAHVLNS
YCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVICHR
LNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVSVCH
FIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075591
Repeat occurs 5 times in a sequence of 371 amino acids
Location between 387592445 and 387595987
Coverage of 8.09 %
Instances:
NQPFGT | NQPFGT | NQPFGT | NQPFGL | NQPFET

pattern: NQPF[EG][TL]
MEFKNLSVLALFFLTFLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHHLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGTFVWSDKLTKIETTRANQPFGTFVWCDKTTKIETTTTNQPFGTFVWWNKKETDKENQPFGLS
EQGKKETKKSNQPFETQTLDEKEAHVLNSYCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKV
MSSSFSQSQDKYVVQEVNKIGDKAVMCHRLNFEEVVFYCHVVNATTTYMVPLMASDGTISKALT
ICHHDTRGMNPKVLNEVLNVKPGNVSVCHFIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075602
Repeat occurs 9 times in a sequence of 575 amino acids
Location between 389027355 and 389029325
Coverage of 10.96 %
Instances:
KPTFKDM | KPTFIEK | KPTSIEK | KPTLIER | KPTFIER
KPTFIER | KPTFVER | KPTFIER | KPTSIEK |
pattern: KPT[SLF][IKV][ED][KRM]
MAFNSQRTFRAPTFRFPLSVRRVYETWEPKSDVKETSETYFLHVYLPGYTKNQPKITLEDASQ
KLRITGERPIEGDKWKKFDQTYPVPENSDVGTLEAKFEQETLILKMQKKPISQSQVVAPKQQVE
KSQQEPLSNEGLDGTKLEKVQETIQPTQSTTKFEESTQDMNSDLPQTQSIEKKRQETIHDDTLS
QIAKETISNDTTKTQIGENSQQQFELKPTFKDMTKLQFNEKAQKGPEEFEPKPTFIEKIKTQID
EIAQKGQEEFEKKSTFIEKVKTQISEKAHKAQEEFALKPTSIEKAKTEPNEKPQISEEEFEKKP
TLIERIITQIAERAQKGPKEIEAKPTFIERTNKQIDENVQKVQEEFESKPTFIERTKTQIDEKA
QNGLEEFEKKPTFVERIKTRIIEEAQKVQEEFEAKPTFIERIKTQIDEKVQKDKEEFEPKPTSI
EKAKTETNKKLQKGPEEFEPKPIEKIVTKENLEKNIVKNSDEDAEKKRILVKEETKEKKEKPYE
SSKTLVGVKNQNIKENETEKEELPTPKVTESKWLGEERHLIENVSVAILVIAAFGAYISYKFSS
Repeat found in LOC127075590
Repeat occurs 7 times in a sequence of 470 amino acids
Location between 387471988 and 387475338
Coverage of 8.94 %
Instances:
NQPFGT | NQPFGT | NQPFGT | NQPFGT | NQPFGT
NQPFGL | NQPFET |
pattern: NQPF[EG][TL]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGTFVWWYKKETGRPTTRSDKETKIETTATNQPFGTFVWWNKKEFDRPTTIRSDKLTKIETTRA
NQPFGTFVWWYKKEIERPTIRSDKTTKIETTTTNQPFGTFVWWNKKETDRPTIRSDKVTKIETT
RTNQPFGTTAWWHKKETEIETENNLLEENQPFGLSEQGKKETKKSNQPFETQTLDEKEAHVLNS
YCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVICHR
LNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVSVCH
FIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075589
Repeat occurs 7 times in a sequence of 473 amino acids
Location between 387342525 and 387345756
Coverage of 8.88 %
Instances:
NQPFGT | NQPFGT | NQPFGT | NQPFGT | NQPFGT
NQPFGL | NQPFET |
pattern: NQPF[EG][TL]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTRTNQP
FGTFVWWYKKQTGSLTTRSDKATKIETTATNQPFGTFVWWNEKEFDRPTIRSDKLTKIETTRAN
QPFGTFVWWYKKEIERPTIRSDKTTKIETTTINQPFGTFVWWNKKETDRPTIRSDKVTKIETTR
TNQPFGTTAWWHKKETEKETEIETENNLLEENQPFGLSEQGKKETEKSNQPFETQTSDEKEAHV
LNNYCGTPSAIGEHKHCALSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVM
CHRLNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVS
VCHFIGNKAVAWVPNVSQSRGHPCVI
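
The coverage figures reported for the repeats above are consistent with (number of instances × motif length) / sequence length; for example, 7 × 6 / 470 gives 8.94 %. A minimal Python sketch of that calculation using the standard `re` module follows. The helper name `repeat_coverage` is hypothetical, and the input is a short fragment of the LOC127075590 sequence rather than the full protein, so its coverage value is not comparable to the reported one.

```python
import re


def repeat_coverage(sequence: str, pattern: str) -> tuple[int, float]:
    """Count non-overlapping motif matches and the percentage of residues they cover."""
    matches = re.findall(pattern, sequence)
    covered = sum(len(m) for m in matches)
    return len(matches), round(100 * covered / len(sequence), 2)


# Short fragment taken from the LOC127075590 sequence, not the full protein.
fragment = "NQPFGLSEQGKKETKKSNQPFET"
print(repeat_coverage(fragment, r"NQPF[EG][TL]"))

# Reported case: 7 instances of a 6-residue motif in 470 amino acids.
print(round(100 * 7 * 6 / 470, 2))  # 8.94
```

The reported pattern is already a valid regular expression, so it can be passed to `re.findall` unchanged.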

NC_066582 - Cluster 15 - Cyclopeptide

Gene cluster description

NC_066582 - Gene Cluster 15. Type = cyclopeptide. Location: 386776475 - 389381266 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeat found in LOC127075602
Repeat occurs 9 times in a sequence of 575 amino acids
Location between 389027355 and 389029325
Coverage of 10.96 %
Instances:
KPTFKDM | KPTFIEK | KPTSIEK | KPTLIER | KPTFIER
KPTFIER | KPTFVER | KPTFIER | KPTSIEK |
pattern: KPT[SLF][IKV][ED][KRM]
MAFNSQRTFRAPTFRFPLSVRRVYETWEPKSDVKETSETYFLHVYLPGYTKNQPKITLEDASQ
KLRITGERPIEGDKWKKFDQTYPVPENSDVGTLEAKFEQETLILKMQKKPISQSQVVAPKQQVE
KSQQEPLSNEGLDGTKLEKVQETIQPTQSTTKFEESTQDMNSDLPQTQSIEKKRQETIHDDTLS
QIAKETISNDTTKTQIGENSQQQFELKPTFKDMTKLQFNEKAQKGPEEFEPKPTFIEKIKTQID
EIAQKGQEEFEKKSTFIEKVKTQISEKAHKAQEEFALKPTSIEKAKTEPNEKPQISEEEFEKKP
TLIERIITQIAERAQKGPKEIEAKPTFIERTNKQIDENVQKVQEEFESKPTFIERTKTQIDEKA
QNGLEEFEKKPTFVERIKTRIIEEAQKVQEEFEAKPTFIERIKTQIDEKVQKDKEEFEPKPTSI
EKAKTETNKKLQKGPEEFEPKPIEKIVTKENLEKNIVKNSDEDAEKKRILVKEETKEKKEKPYE
SSKTLVGVKNQNIKENETEKEELPTPKVTESKWLGEERHLIENVSVAILVIAAFGAYISYKFSS
Repeat found in LOC127075589
Repeat occurs 7 times in a sequence of 473 amino acids
Location between 387342525 and 387345756
Coverage of 8.88 %
Instances:
NQPFGT | NQPFGT | NQPFGT | NQPFGT | NQPFGT
NQPFGL | NQPFET |
pattern: NQPF[EG][TL]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTRTNQP
FGT
FVWWYKKQTGSLTTRSDKATKIETTATNQPFGTFVWWNEKEFDRPTIRSDKLTKIETTRAN
QPFGT
FVWWYKKEIERPTIRSDKTTKIETTTINQPFGTFVWWNKKETDRPTIRSDKVTKIETTR
TNQPFGTTAWWHKKETEKETEIETENNLLEENQPFGLSEQGKKETEKSNQPFETQTSDEKEAHV
LNNYCGTPSAIGEHKHCALSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVM
CHRLNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVS
VCHFIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075591
Repeat occurs 5 times in a sequence of 371 amino acids
Location between 387592445 and 387595987
Coverage of 8.09 %
Instances:
NQPFGT | NQPFGT | NQPFGT | NQPFGL | NQPFET

pattern: NQPF[EG][TL]
MEFKNLSVLALFFLTFLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHHLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGT
FVWSDKLTKIETTRANQPFGTFVWCDKTTKIETTTTNQPFGTFVWWNKKETDKENQPFGLS
EQGKKETKKSNQPFETQTLDEKEAHVLNSYCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKV
MSSSFSQSQDKYVVQEVNKIGDKAVMCHRLNFEEVVFYCHVVNATTTYMVPLMASDGTISKALT
ICHHDTRGMNPKVLNEVLNVKPGNVSVCHFIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075590
Repeat occurs 7 times in a sequence of 470 amino acids
Location between 387471988 and 387475338
Coverage of 8.94 %
Instances:
NQPFGT | NQPFGT | NQPFGT | NQPFGT | NQPFGT
NQPFGL | NQPFET |
pattern: NQPF[EG][TL]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGT
FVWWYKKETGRPTTRSDKETKIETTATNQPFGTFVWWNKKEFDRPTTIRSDKLTKIETTRA
NQPFGT
FVWWYKKEIERPTIRSDKTTKIETTTTNQPFGTFVWWNKKETDRPTIRSDKVTKIETT
RTNQPFGTTAWWHKKETEIETENNLLEENQPFGLSEQGKKETKKSNQPFETQTLDEKEAHVLNS
YCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVICHR
LNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVSVCH
FIGNKAVAWVPNVSQSRGHPCVI
Repeat found in LOC127075590
Repeat occurs 7 times in a sequence of 470 amino acids
Location between 387462977 and 387475338
Coverage of 8.94 %
Instances:
NQPFGT | NQPFGT | NQPFGT | NQPFGT | NQPFGT
NQPFGL | NQPFET |
pattern: NQPF[EG][TL]
MEFKNLSVLALFFLTLLGIHASKSGEEYWKSVWPNTPIPKTLLDLLLTDKGTSIPIKSQEEKQ
YWTIFFEHDLYPGKTMNLGIQKHSDIQSSKSTTHAPVKRASHTFKTLKGLGQTPEKETTKTNQP
FGT
FVWWYKKETGRPTTRSDKETKIETTATNQPFGTFVWWNKKEFDRPTTIRSDKLTKIETTRA
NQPFGT
FVWWYKKEIERPTIRSDKTTKIETTTTNQPFGTFVWWNKKETDRPTIRSDKVTKIETT
RTNQPFGTTAWWHKKETEIETENNLLEENQPFGLSEQGKKETKKSNQPFETQTLDEKEAHVLNS
YCGTPSAIGEHKHCVLSLESMMDFAISKLGKNIKVMSSSFSQSQDKYVVQEVNKIGDKAVICHR
LNFEEVVFYCHVVNATTTYMVPMMASDGTISKALTICHHDTRGMNPKVLNEVLNVKPGNVSVCH
FIGNKAVAWVPNVSQSRGHPCVI
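
Each repeat block above opens with the same four header lines (locus, instance count and sequence length, location, coverage). A minimal sketch of how those lines could be read into records is given here; the function name, the regular expression, and the dictionary keys are illustrative assumptions and are not part of plantiSMASH itself.

```python
import re

# Hypothetical helper: matches the four header lines of one repeat block.
# Field names are illustrative assumptions, not any tool's API.
HEADER_RE = re.compile(
    r"Repeat found in (?P<locus>\S+)\s+"
    r"Repeat occurs (?P<count>\d+) times in a sequence of (?P<length>\d+) amino acids\s+"
    r"Location between (?P<start>\d+) and (?P<end>\d+)\s+"
    r"Coverage of (?P<coverage>[\d.]+) %"
)

def parse_repeat_headers(text):
    """Return one dict per repeat block header found in `text`."""
    records = []
    for m in HEADER_RE.finditer(text):
        records.append({
            "locus": m["locus"],
            "count": int(m["count"]),
            "length": int(m["length"]),
            "start": int(m["start"]),
            "end": int(m["end"]),
            "coverage": float(m["coverage"]),
        })
    return records

# Header lines copied from the LOC127075590 block above.
example = """Repeat found in LOC127075590
Repeat occurs 7 times in a sequence of 470 amino acids
Location between 387462977 and 387475338
Coverage of 8.94 %
"""
records = parse_repeat_headers(example)
print(records)
```

Grouping the resulting records by locus would make it easier to compare the several entries reported for the same gene.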

Similar gene clusters

NC_066582 - Cluster 16 - Cyclopeptide

Gene cluster description

NC_066582 - Gene Cluster 16. Type = cyclopeptide. Location: 485797570 - 487497883 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeat found in LOC127076485
Repeat occurs 3 times in a sequence of 357 amino acids
Location between 486646512 and 486648940
Coverage of 5.88 %
Instances:
ATKEEIE | ATKEDIE | ATKKEIE |
pattern: ATK[KE][ED]IE
MEFTSLSILALLCVAFMEVNASMSGEEYWNSIWPNTPIPKTISDLVLSNNTELIRGQEMKQYW
TVFFNHDLYPGKEMSLGIQKQSYIQPSRSNAQIFIKKASTHVATKEEIEKSTQPHGEATKEDIE
EPIQPFGAWRNEKEIEEPIQPFGAWRNEATKKEIERPNKHFEGIVWPRKTTIKKLEKVSQTSIT
RTLDEKETHILRDYCEKPSAIGEDRHCVTSLESMMYFVISKLGKNIKVMSSSFAQNQTQYVVEE
VKKIGDKAVMCHKMNLKIVVFNCHQVNATTIYKVPLVASDGTKSNALTICHHDTRGMNANALYK
VLKVRPGTVPICHFIGNKAIAWVPNDSVSEDDDCPRLI
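
The coverage figures reported for the repeats appear consistent with (number of instances × motif length) / sequence length. This formula is inferred from the numbers shown and is not stated by the tool. The sketch below checks it against the LOC127076485 entry and applies the reported pattern as a regular expression to a short fragment of that sequence; the function name is a hypothetical one chosen for illustration.

```python
import re

def repeat_coverage(n_instances, motif_len, seq_len):
    """Percent of the sequence covered by repeat instances (assumed formula)."""
    return 100.0 * n_instances * motif_len / seq_len

# Figures reported above for LOC127076485:
# 3 instances of a 7-residue motif in a 357 amino acid sequence.
cov = repeat_coverage(3, 7, 357)
print(f"{cov:.2f} %")

# Fragment copied from the LOC127076485 sequence; pattern as reported above.
fragment = "ATKEEIEKSTQPHGEATKEDIE"
hits = re.findall(r"ATK[KE][ED]IE", fragment)
print(hits)
```

The same check reproduces other entries, for example 9 instances of a 7-residue motif in 575 amino acids gives 10.96 %.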

Similar gene clusters

NC_066582 - Cluster 17 - Saccharide

Gene cluster description

NC_066582 - Gene Cluster 17. Type = saccharide. Location: 496896586 - 497081872 nt.
pHMM detection rules used:
plants/plant: (minimum(4,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[]))
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066583 - Cluster 18 - Cyclopeptide

Gene cluster description

NC_066583 - Gene Cluster 18. Type = cyclopeptide. Location: 12167775 - 16798869 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 14.98 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDIR
WLKDIR | WLKDTR | WLKDTR | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
R
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127084475
Repeat occurs 17 times in a sequence of 525 amino acids
Location between 15248183 and 15251142
Coverage of 29.14 %
Instances:
SAYGKKNVD | SAYGENDID | SAYGENNID | SAYGENNID | SAYGENNID
SAYGENDID | SAYGENNFD | SAYGENDID | SAYGENNFD | SAYGENDID
SAYGENNFD | SAYGENNID | SAYGENNVD | SAYGENNID | SAYVGNDID
SAYGNNNID | SAYGNNEID |
pattern: SAY[GV][KEGN][KN][EDN][IFV]D
The following known motifs were found:
FEPR was found 14 times in this sequence
MKMMRPALSLLPLFLLLIVGIVESRKDLGEYWKLVMKQQDMPQEIQGLLNQNPKKNFKTLKQF
FDDGKKKKVVKDFEQRPNISAYGKKNVDVKEKNGVIEDFEPRPNISAYGENDIDVKEKKGAIED
FEPIPNISAYGENNIDDKEKNEGIEDFEPRPNISAYGENNIDVKEKKGVIEDFEPRPNISAYGE
NNID
VKEKNGTIEEFEPRPNISAYGENDIDVKEKKGAIEDFEPRPNISAYGENNFDDKKKNGAI
EDFEPRPNISAYGENDIDVKENKGNIEDFEPRPNISAYGENNFDDKKKNGAIEDFEPRPNISAY
GENDID
VKENKGNIEDFEPRPNISAYGENNFDVKENNGAIEDFEPRPNISAYGENNIDFKEKKG
AIEEFEPRPNISAYGENNVDVKEKSGAIEDFEPRPNISAYGENNIDIKEKKGAIEDFKPRPNIS
AYVGNDID
VKEKKGDIEDFEPRPNISAYGNNNIDVKEKNKTIKDFEPRPNISAYGNNEIDDESM
KDVEPIPSLTKYDA
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 14.98 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDIR
WLKDIR | WLKDTR | WLKDTR | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
R
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127085073
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 15987697 and 15990606
Coverage of 15.72 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][ISTV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKE
KSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDT
R
AKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTR
AEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 15987697 and 15990606
Coverage of 15.36 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPD
SKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVEND
KSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085072
Repeat occurs 16 times in a sequence of 636 amino acids
Location between 15931056 and 15934133
Coverage of 20.13 %
Instances:
YLDGWLKK | YLDGWLKD | YLDGWLKD | YLDGWLKD | YLDGWLKD
YLDGWLKD | YLDGWLKD | YLDGWLKD | YLDGWFKD | YLDGWLKD
YLDGWLKD | YLDGWLKN | YLDGWLKD | YLDGWLKD | YLDGLLKD
YLDIGSKI |
pattern: YLD[IG][GLW][SLF]K[IKDN]
MTYRIVRSILHFLIFLLMNGHGNFARDTKLLQENVEEKQVDQPYLDGWLKKPLKNQKRIPDSN
EVYHDGWLKDNRGEKEKTNLDSNQVYLDGWLKDTRTEKEKVSHDSKQVYLDGWLKDTRVEKAKG
NPDSKQVYLDGWLKDIRAEKAQVNPDTNQVYLDGWLKDTRDEKEKVNPNSNQAYLDGWLKDIRT
EKAKSTLDTNQVYLDGWLKDGRAKKVKFTPDTNQVYLDGWLKDSRTEKAKSTPDSNQIYLDGWF
KD
NRGDKSKSTPDTNQVYLDGWLKDFRVEKEKSTPNEVYLDGWLKDTRDQKEKSTTNSNQVYLD
GWLKN
TQAEKEKVTPNSNKVYLDGWLKDTKDQKKKTTRNFNPAYLDGWLKDSHVDKAKFTPNSK
QAYLDGLLKDSHAESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPLLPRKVAD
DIPFSKSQIPSLLQLFSFTKDSPQGEDMKDIINQCEFEPTKGETKACPTSLESMVEFVHSVIGT
ETKFNIHSTSYPTTSGARLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSDMNPNHFIFELLGMKPGEAPLCHFFPVKHVLWVPAPPDVTK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 16009594 and 16012357
Coverage of 16.37 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][IST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDT
R
AEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDG
WLKDTR
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQ
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 15987697 and 15990606
Coverage of 17.21 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][ISTV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDG
WLKDTR
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQ
VYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSS
PNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGE
DMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 14.12 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
H
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
The following known motifs were found in CDS LOC127084472
Location between 14251448 and 14253205
FEPR was found 4 times in this sequence
Sequence:
MSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKEN
MGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNEIHANENKGATGEFEPRPN
ISAYGDNEIHANENKGAIGEFETRPNASAYGDNEIGAEFTDDFEPRPSMTKYNA
Repeat found in LOC127085074
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 16009594 and 16012357
Coverage of 16.06 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDTR | WLKDIR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDI
R
VQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTR
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 16.94 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127084474
Repeat occurs 8 times in a sequence of 259 amino acids
Location between 14758282 and 14759808
Coverage of 21.62 %
Instances:
FEPRPNV | FEPKPNV | FEPRPNV | FEPRPNI | FEPRPNV
FEPIPNV | FEPRPNV | FEPRPSV |
pattern: FEP[IKR]P[SN][IV]
The following known motifs were found:
FEPR was found 6 times in this sequence
VS[AI]Y was found 3 times in this sequence
MMRLRPAFALLPLFLLLIITIVESRKDLGKYWKLVMKDQDVSEEIQGLLDANIKKNFKTLRQS
FDAKENKVVKDFEPRPNVPNVSVYGENDIDFMKNKAAIEEFEPKPNVSVYGNNNIDVEENNKGI
EDFEPRPNVPNVSTYGNNDIDNKKKDKEVEDFEPRPNIPNISAYGNNDIDNKEKEKAVEDFEPR
PNV
PNVSAYGNNDINSRENEKVVEDFEPIPNVSAYGNNDIYNKEKKKVVEDFEPRPNVPNVSAY
GNNEIGAEFTEDFEPRPSV
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 14.12 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
H
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
The following known motifs were found in CDS LOC127084472
Location between 14251448 and 14260342
FEPR was found 3 times in this sequence
Sequence:
MMSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKE
NMGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNEIHANENKGAIGEFETRP
NASAYGDNEIGAEFTDDFEPRPSMTKYNA
Repeat found in LOC127085074
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 16009594 and 16012357
Coverage of 15.72 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDIR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDI
R
AEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTR
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 15987697 and 15990606
Coverage of 16.06 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDT
R
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDG
WLKDTR
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127085073
Repeat occurs 21 times in a sequence of 756 amino acids
Location between 15987697 and 15990606
Coverage of 16.67 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDG
WLKDTR
AKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQ
VYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKN
GQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQ
SPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHS
TSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDI
MNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
The following known motifs were found in CDS LOC127084472
Location between 14251448 and 14260342
FEPR was found 4 times in this sequence
Sequence:
MMSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKE
NMGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNEIHANENKGATGEFEPRP
NISAYGDNEIHANENKGAIGEFETRPNASAYGDNEIGAEFTDDFEPRPSMTKYNA
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][ST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 16.94 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][ISTV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
R
VEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127084474
Repeat occurs 9 times in a sequence of 285 amino acids
Location between 14758282 and 14759808
Coverage of 22.11 %
Instances:
FEPRPNV | FEPKPNV | FEPRPNV | FEPRPNI | FEPRPNI
FEPRPNV | FEPIPNV | FEPRPNV | FEPRPSV |
pattern: FEP[IKR]P[SN][IV]
The following known motifs were found:
FEPR was found 7 times in this sequence
VS[AI]Y was found 3 times in this sequence
MMRLRPAFALLPLFLLLIITIVESRKDLGKYWKLVMKDQDVSEEIQGLLDANIKKNFKTLRQS
FDAKENKVVKDFEPRPNVPNVSVYGENDIDFMKNKAAIEEFEPKPNVSVYGNNNIDVEENNKGI
EDFEPRPNVPNVSTYGNNDIDNKKKDKEVEDFEPRPNIPNISAYGNNDIDNKKKDKEVEDFEPR
PNI
PNISAYGNNDIDNKEKEKAVEDFEPRPNVPNVSAYGNNDINSRENEKVVEDFEPIPNVSAY
GNNDIYNKEKKKVVEDFEPRPNVPNVSAYGNNEIGAEFTEDFEPRPSV
Repeat found in LOC127085074
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 16009594 and 16012357
Coverage of 15.36 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDIR | WLKDIR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKE
KVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 15987697 and 15990606
Coverage of 16.37 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDT
R
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDG
WLKDTR
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNR
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK

Similar gene clusters

NC_066583 - Cluster 19 - Cyclopeptide

Gene cluster description

NC_066583 - Gene Cluster 19. Type = cyclopeptide. Location: 13274831 - 18413883 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 14.12 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
H
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][STF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
R
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSH
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
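
The "pattern:" line of the entry above looks like a per-position summary of the listed instances: a literal residue where all instances agree, and a bracketed class of the observed residues otherwise. A sketch of that idea, assuming this is how the pattern is built (the function name is hypothetical, and the residue order inside each class is sorted here, whereas the report's order differs):

```python
def consensus_pattern(instances: list[str]) -> str:
    """Build a per-position pattern from equal-length repeat instances."""
    if len({len(s) for s in instances}) != 1:
        raise ValueError("all instances must have the same length")
    parts = []
    for column in zip(*instances):  # iterate over alignment columns
        residues = sorted(set(column))
        if len(residues) == 1:
            parts.append(residues[0])  # conserved position
        else:
            parts.append("[" + "".join(residues) + "]")  # variable position
    return "".join(parts)


# Distinct instances reported for LOC127088338 in the entry above.
distinct = ["YLDGWLKNTP", "YLDGWLKDTR", "YLDGWLKDTG",
            "YLDGWLKNTQ", "YLDGWLKDSH", "YLDIGSKIFK"]
print(consensus_pattern(distinct))
```

The result contains the same residue sets as YLD[IG][GW][SL]K[IDN][STF][GHKQPR], only with each class written in alphabetical order.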
The following known motifs were found in CDS LOC127084472
Location between 14251448 and 14260342
FEPR was found 3 times in this sequence
Sequence:
MMSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKE
NMGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNEIHANENKGAIGEFETRP
NASAYGDNEIGAEFTDDFEPRPSMTKYNA
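
The motif count above can be checked by joining the wrapped sequence lines and counting non-overlapping occurrences of FEPR. A small sketch; the variable names are illustrative only:

```python
# Sequence lines of CDS LOC127084472 exactly as wrapped in the report.
fragments = [
    "MMSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKE",
    "NMGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNEIHANENKGAIGEFETRP",
    "NASAYGDNEIGAEFTDDFEPRPSMTKYNA",
]

# Rejoin the wrapped lines before searching, so a motif that straddles
# a line break would still be found.
sequence = "".join(fragments)
motif_count = sequence.count("FEPR")
print(motif_count)
```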
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTR
ADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085074
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 16009594 and 16012357
Coverage of 15.72 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDIR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDI
R
AEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTR
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
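
The pattern notation used in these entries is already valid regular-expression syntax, so the instances can be located with a standard regex engine. A minimal sketch using only the first three wrapped lines of the LOC127085074 sequence above (the choice of fragment is illustrative, and this is not claimed to be how the report itself finds repeats):

```python
import re

# Pattern as printed in the entry above.
pattern = re.compile(r"WLK[DN][IST][QPRH]")

# First three wrapped lines of the sequence, rejoined into one string.
fragment = (
    "MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL"
    "KDTR"
    "DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY"
)

# Collect each match with its 0-based start offset in the fragment.
hits = [(m.start(), m.group()) for m in pattern.finditer(fragment)]
for offset, instance in hits:
    print(offset, instance)
```

The matched strings correspond to the first four entries of the "Instances:" list for this gene.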
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
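
Each repeat entry starts with the same four header lines (gene, occurrence count and sequence length, location, coverage), which makes the report easy to parse into records. A sketch of one way to do so, assuming the wording of those lines is stable across entries; the regex and function name are hypothetical:

```python
import re

HEADER = re.compile(
    r"Repeat found in (?P<gene>\S+)\s+"
    r"Repeat occurs (?P<count>\d+) times in a sequence of (?P<length>\d+) amino acids\s+"
    r"Location between (?P<start>\d+) and (?P<end>\d+)\s+"
    r"Coverage of (?P<coverage>[\d.]+) %"
)


def parse_repeat_header(text: str):
    """Return the first repeat header found in text as a dict, or None."""
    m = HEADER.search(text)
    if m is None:
        return None
    return {
        "gene": m.group("gene"),
        "count": int(m.group("count")),
        "length": int(m.group("length")),
        "start": int(m.group("start")),
        "end": int(m.group("end")),
        "coverage": float(m.group("coverage")),
    }


example = """Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %"""
print(parse_repeat_header(example))
```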
Repeat found in LOC127085073
Repeat occurs 21 times in a sequence of 756 amino acids
Location between 15987697 and 15990606
Coverage of 16.67 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDG
WLKDTR
AKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQ
VYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKN
GQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQ
SPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHS
TSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDI
MNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 17 times in a sequence of 641 amino acids
Location between 18120165 and 18123127
Coverage of 26.52 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][STF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRAEKA
KLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNT
Q
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDG
WLKDSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
RKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVH
GVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHA
TK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 672 amino acids
Location between 18120165 and 18123127
Coverage of 26.79 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKANPDSNQVY
LDGWLKDTR
VEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRA
DKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWL
KDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVY
LDGWLKDTR
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPN
SNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEA
FKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDM
IDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDIS
KDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFI
FDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088327
Repeat occurs 18 times in a sequence of 684 amino acids
Location between 17186384 and 17189258
Coverage of 15.79 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
R
AEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYLDGWLK
DTR
AEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQHLEESN
GKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPSLLQLF
SLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSYPTTSG
APLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICH
LDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
R
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQ
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
The following known motifs were found in CDS LOC127084472
Location between 14251448 and 14260342
FEPR was found 4 times in this sequence
Sequence:
MMSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKE
NMGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNEIHANENKGATGEFEPRP
NISAYGDNEIHANENKGAIGEFETRPNASAYGDNEIGAEFTDDFEPRPSMTKYNA
Repeat found in LOC127088336
Repeat occurs 27 times in a sequence of 894 amino acids
Location between 18128433 and 18215138
Coverage of 18.12 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTL | WLKDTR | WLKDTR | WLKDTQ | WLKDTR
WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSRL]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTR
AEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQ
VYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDGWLKDTLAEKAKLN
SDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTL
NPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLK
DSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKL
ADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVI
GAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 733 amino acids
Location between 18120165 and 18123127
Coverage of 28.65 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDT
R
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDG
WLKDTR
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNR
VYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDV
MNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][ST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 16.94 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][ISTV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
R
VEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
R
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQ
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 16.73 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTQ | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
R
VEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTR
AEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127084474
Repeat occurs 9 times in a sequence of 285 amino acids
Location between 14758282 and 14759808
Coverage of 22.11 %
Instances:
FEPRPNV | FEPKPNV | FEPRPNV | FEPRPNI | FEPRPNI
FEPRPNV | FEPIPNV | FEPRPNV | FEPRPSV |
pattern: FEP[IKR]P[SN][IV]
The following known motifs were found:
FEPR was found 7 times in this sequence
VS[AI]Y was found 3 times in this sequence
MMRLRPAFALLPLFLLLIITIVESRKDLGKYWKLVMKDQDVSEEIQGLLDANIKKNFKTLRQS
FDAKENKVVKDFEPRPNVPNVSVYGENDIDFMKNKAAIEEFEPKPNVSVYGNNNIDVEENNKGI
EDFEPRPNVPNVSTYGNNDIDNKKKDKEVEDFEPRPNIPNISAYGNNDIDNKKKDKEVEDFEPR
PNI
PNISAYGNNDIDNKEKEKAVEDFEPRPNVPNVSAYGNNDINSRENEKVVEDFEPIPNVSAY
GNNDIYNKEKKKVVEDFEPRPNVPNVSAYGNNEIGAEFTEDFEPRPSV
Repeat found in LOC127085074
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 16009594 and 16012357
Coverage of 15.36 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDIR | WLKDIR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKE
KVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 15987697 and 15990606
Coverage of 16.37 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDT
R
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDG
WLKDTR
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNR
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 23 times in a sequence of 787 amino acids
Location between 18120165 and 18123127
Coverage of 29.22 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVY
LDGWLKDTR
VENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPD
SKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKL
NSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDS
H
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNK
GETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 16.73 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTQ | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTR
AEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088338
Repeat occurs 19 times in a sequence of 687 amino acids
Location between 18120165 and 18123127
Coverage of 27.66 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKE
KSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDT
R
AEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDG
WLKDTR
AEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLL
QLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 14.98 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDIR
WLKDIR | WLKDTR | WLKDTR | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
R
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127085073
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 15987697 and 15990606
Coverage of 16.06 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDT
R
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDG
WLKDTR
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 718 amino acids
Location between 18120165 and 18123127
Coverage of 27.86 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWL
KDTR
VENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVY
LDGWLKDTR
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSD
SNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIA
KSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMT
LQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKA
CPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPY
ALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFP
VKHVLWVPSPPHATK
Repeat found in LOC127084475
Repeat occurs 17 times in a sequence of 525 amino acids
Location between 15248183 and 15251142
Coverage of 29.14 %
Instances:
SAYGKKNVD | SAYGENDID | SAYGENNID | SAYGENNID | SAYGENNID
SAYGENDID | SAYGENNFD | SAYGENDID | SAYGENNFD | SAYGENDID
SAYGENNFD | SAYGENNID | SAYGENNVD | SAYGENNID | SAYVGNDID
SAYGNNNID | SAYGNNEID |
pattern: SAY[GV][KEGN][KN][EDN][IFV]D
The following known motifs were found:
FEPR was found 14 times in this sequence
MKMMRPALSLLPLFLLLIVGIVESRKDLGEYWKLVMKQQDMPQEIQGLLNQNPKKNFKTLKQF
FDDGKKKKVVKDFEQRPNISAYGKKNVDVKEKNGVIEDFEPRPNISAYGENDIDVKEKKGAIED
FEPIPNISAYGENNIDDKEKNEGIEDFEPRPNISAYGENNIDVKEKKGVIEDFEPRPNISAYGE
NNID
VKEKNGTIEEFEPRPNISAYGENDIDVKEKKGAIEDFEPRPNISAYGENNFDDKKKNGAI
EDFEPRPNISAYGENDIDVKENKGNIEDFEPRPNISAYGENNFDDKKKNGAIEDFEPRPNISAY
GENDID
VKENKGNIEDFEPRPNISAYGENNFDVKENNGAIEDFEPRPNISAYGENNIDFKEKKG
AIEEFEPRPNISAYGENNVDVKEKSGAIEDFEPRPNISAYGENNIDIKEKKGAIEDFKPRPNIS
AYVGNDID
VKEKKGDIEDFEPRPNISAYGNNNIDVKEKNKTIKDFEPRPNISAYGNNEIDDESM
KDVEPIPSLTKYDA
Repeat found in LOC127088327
Repeat occurs 22 times in a sequence of 776 amino acids
Location between 17186384 and 17189258
Coverage of 17.01 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDIRVEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDG
WLKDTR
DEKAKSTPDSNQVYVDGWLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQ
VYLDGWLKDTQAKSNLDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNS
KQAYLDGWLKDSRVENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIR
EYAPFLPRKLADEIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLE
SMLEFVHGIIGPETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCH
YLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLW
VPSPSNATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 17.21 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDIR | WLKDTL | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSRL]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
AEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDG
WLKDIR
DEKAKSTPDSNQVYVDGWLKDTLAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENE
KSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
R
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSH
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 14.98 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDIR
WLKDIR | WLKDTR | WLKDTR | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
R
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127085073
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 15987697 and 15990606
Coverage of 15.72 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][ISTV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKE
KSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDT
R
AKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTR
AEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
AEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTGAENAKSNLD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 18128433 and 18215138
Coverage of 16.94 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDTR
WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTR
AEKTKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQ
VYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKST
PNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPT
SLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKAKVNPDSNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPD
SNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTR
ADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 15987697 and 15990606
Coverage of 15.36 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPD
SKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVEND
KSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 17.21 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTR
AEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088328
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 17248750 and 17251664
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDNH | WLKDSH

pattern: WLK[DN][ISTN][PRH]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKDTRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127085072
Repeat occurs 16 times in a sequence of 636 amino acids
Location between 15931056 and 15934133
Coverage of 20.13 %
Instances:
YLDGWLKK | YLDGWLKD | YLDGWLKD | YLDGWLKD | YLDGWLKD
YLDGWLKD | YLDGWLKD | YLDGWLKD | YLDGWFKD | YLDGWLKD
YLDGWLKD | YLDGWLKN | YLDGWLKD | YLDGWLKD | YLDGLLKD
YLDIGSKI |
pattern: YLD[IG][GLW][SLF]K[IKDN]
MTYRIVRSILHFLIFLLMNGHGNFARDTKLLQENVEEKQVDQPYLDGWLKKPLKNQKRIPDSN
EVYHDGWLKDNRGEKEKTNLDSNQVYLDGWLKDTRTEKEKVSHDSKQVYLDGWLKDTRVEKAKG
NPDSKQVYLDGWLKDIRAEKAQVNPDTNQVYLDGWLKDTRDEKEKVNPNSNQAYLDGWLKDIRT
EKAKSTLDTNQVYLDGWLKDGRAKKVKFTPDTNQVYLDGWLKDSRTEKAKSTPDSNQIYLDGWF
KD
NRGDKSKSTPDTNQVYLDGWLKDFRVEKEKSTPNEVYLDGWLKDTRDQKEKSTTNSNQVYLD
GWLKN
TQAEKEKVTPNSNKVYLDGWLKDTKDQKKKTTRNFNPAYLDGWLKDSHVDKAKFTPNSK
QAYLDGLLKDSHAESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPLLPRKVAD
DIPFSKSQIPSLLQLFSFTKDSPQGEDMKDIINQCEFEPTKGETKACPTSLESMVEFVHSVIGT
ETKFNIHSTSYPTTSGARLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSDMNPNHFIFELLGMKPGEAPLCHFFPVKHVLWVPAPPDVTK
Repeat found in LOC127085074
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 16009594 and 16012357
Coverage of 16.37 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][IST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDT
R
AEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDG
WLKDTR
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQ
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088328
Repeat occurs 19 times in a sequence of 706 amino acids
Location between 17248750 and 17251664
Coverage of 16.15 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDSR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDNH | WLKDSH |
pattern: WLK[DN][ISTN][PRH]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSRAEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDGWLKDTRGEKEKSNH
DSNQVYLDGWLKDTRGEKEKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQK
AKSNSDSNRVYLDGWLKDTRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKD
TR
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLD
GWLKDSHAENDMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFL
PRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFV
HGIIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGS
KIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSD
ATK
Repeat found in LOC127085073
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 15987697 and 15990606
Coverage of 17.21 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][ISTV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDG
WLKDTR
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQ
VYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSS
PNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGE
DMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
The following known motifs were found in CDS LOC127084472
Location between 14251448 and 14253205
FEPR was found 4 times in this sequence
Sequence:
MSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKEN
MGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNEIHANENKGATGEFEPRPN
ISAYGDNEIHANENKGAIGEFETRPNASAYGDNEIGAEFTDDFEPRPSMTKYNA
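The known-motif counts reported above can be reproduced with a short regular-expression scan. The sketch below is illustrative only: the function name `count_motif` and the motif list are assumptions made for this example, not part of the pipeline that produced this report. The motifs `FEPR` and `VS[AI]Y` are the ones named elsewhere in this output, and the sequence is the LOC127084472 sequence shown above.

```python
import re

def count_motif(sequence, motif):
    """Count occurrences of a regex motif, including overlapping ones.

    Hypothetical helper for illustration; the lookahead makes each start
    position count once even if matches overlap.
    """
    return sum(1 for _ in re.finditer("(?=" + motif + ")", sequence))

# LOC127084472 sequence as printed in the report (line wraps removed).
sequence = (
    "MSLRSAFALLPLFLFLIVANVESRKDVGEYWKLVMKDQDMPEEIQGLLDASNIKNSKTHAKEN"
    "MGAIGEFEPRPYASAYGDNEIHAKENMGAIGEFEPRPNASAYGDNEIHANENKGATGEFEPRPN"
    "ISAYGDNEIHANENKGAIGEFETRPNASAYGDNEIGAEFTDDFEPRPSMTKYNA"
)

# Assumed motif list: the two motifs that appear in this report.
known_motifs = ["FEPR", "VS[AI]Y"]
counts = {motif: count_motif(sequence, motif) for motif in known_motifs}
print(counts)
```

Under these assumptions the scan finds `FEPR` four times and `VS[AI]Y` not at all, which agrees with the report listing only `FEPR` for this CDS.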
Repeat found in LOC127084474
Repeat occurs 8 times in a sequence of 259 amino acids
Location between 14758282 and 14759808
Coverage of 21.62 %
Instances:
FEPRPNV | FEPKPNV | FEPRPNV | FEPRPNI | FEPRPNV
FEPIPNV | FEPRPNV | FEPRPSV |
pattern: FEP[IKR]P[SN][IV]
The following known motifs were found:
FEPR was found 6 times in this sequence
VS[AI]Y was found 3 times in this sequence
MMRLRPAFALLPLFLLLIITIVESRKDLGKYWKLVMKDQDVSEEIQGLLDANIKKNFKTLRQS
FDAKENKVVKDFEPRPNVPNVSVYGENDIDFMKNKAAIEEFEPKPNVSVYGNNNIDVEENNKGI
EDFEPRPNVPNVSTYGNNDIDNKKKDKEVEDFEPRPNIPNISAYGNNDIDNKEKEKAVEDFEPR
PNVPNVSAYGNNDINSRENEKVVEDFEPIPNVSAYGNNDIYNKEKKKVVEDFEPRPNVPNVSAY
GNNEIGAEFTEDFEPRPSV
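The "pattern" and "Coverage" values in these repeat blocks appear to follow directly from the instance list: the pattern collapses each alignment column into a single residue or a character class, and the coverage figures in this report are consistent with (number of instances × instance length) / sequence length. The following sketch reproduces both for LOC127084474. The function names are hypothetical and the coverage formula is inferred from the numbers shown here, not taken from the tool's source.

```python
import re

def consensus_pattern(instances):
    """Build a per-position pattern from equal-length repeat instances.

    Columns with one residue are emitted literally; variable columns
    become a character class. Residues are sorted, so the order inside
    a class may differ from the order printed in the report.
    """
    if len({len(s) for s in instances}) != 1:
        raise ValueError("instances must all have the same length")
    parts = []
    for column in zip(*instances):
        residues = sorted(set(column))
        if len(residues) == 1:
            parts.append(residues[0])
        else:
            parts.append("[" + "".join(residues) + "]")
    return "".join(parts)

def coverage_percent(n_instances, instance_length, sequence_length):
    """Assumed formula: share of the sequence occupied by repeat instances."""
    return round(100.0 * n_instances * instance_length / sequence_length, 2)

# Instances and sequence length as reported for LOC127084474.
instances = ["FEPRPNV", "FEPKPNV", "FEPRPNV", "FEPRPNI",
             "FEPRPNV", "FEPIPNV", "FEPRPNV", "FEPRPSV"]
pattern = consensus_pattern(instances)
coverage = coverage_percent(len(instances), len(instances[0]), 259)
print(pattern, coverage)
```

The derived pattern is equivalent to the reported `FEP[IKR]P[SN][IV]` apart from residue order inside the classes, and the computed coverage matches the reported 21.62 %.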
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 14.12 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
H
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 16009594 and 16012357
Coverage of 16.06 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDTR | WLKDIR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDI
R
VQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTR
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127088328
Repeat occurs 21 times in a sequence of 752 amino acids
Location between 17248750 and 17251664
Coverage of 16.76 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDSR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDNH
WLKDSH |
pattern: WLK[DN][ISTN][PRH]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSRAEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDRWLKDTRVQKEKVNS
DSNEVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEK
EKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKD
TR
AEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLD
GWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHL
EESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSL
LQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYP
TTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNAL
GICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 16.94 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK

Similar gene clusters

NC_066583 - Cluster 20 - Cyclopeptide

Gene cluster description

NC_066583 - Gene Cluster 20. Type = cyclopeptide. Location: 15983889 - 18454160 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Legend (only available when smCOG analysis was run): biosynthetic genes, transport-related genes, regulatory genes, other genes

Repeat found in LOC127085074
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 16009594 and 16012357
Coverage of 16.06 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDTR | WLKDIR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDI
R
VQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTR
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 16.73 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTQ | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
R
VEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTR
AEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 16.94 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 733 amino acids
Location between 18120165 and 18123127
Coverage of 28.65 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDT
R
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDG
WLKDTR
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNR
VYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDV
MNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
R
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQ
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088327
Repeat occurs 22 times in a sequence of 776 amino acids
Location between 17186384 and 17189258
Coverage of 17.01 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDIRVEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDG
WLKDTR
DEKAKSTPDSNQVYVDGWLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQ
VYLDGWLKDTQAKSNLDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNS
KQAYLDGWLKDSRVENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIR
EYAPFLPRKLADEIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLE
SMLEFVHGIIGPETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCH
YLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLW
VPSPSNATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 16.73 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTQ | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTR
AEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127085074
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 16009594 and 16012357
Coverage of 15.72 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDIR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDI
R
AEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTR
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTR
ADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
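Note on reading these records: the reported coverage appears to equal (number of repeats x repeat length) / sequence length. This is inferred from the figures in the report, not taken from the tool's documentation. For the record above, 21 repeats of 10 residues in 741 amino acids give 28.34 %. The bracketed pattern line can also be read directly as a regular expression. A minimal Python sketch follows, assuming that formula; the function name `coverage_percent` is hypothetical and not part of the tool.

```python
import re

def coverage_percent(n_repeats: int, repeat_len: int, seq_len: int) -> float:
    """Percentage of the sequence covered by the repeat (assumed formula)."""
    return round(100.0 * n_repeats * repeat_len / seq_len, 2)

# Pattern line copied from the LOC127088338 record above; the bracket
# notation is already valid regular-expression syntax.
pattern = re.compile(r"YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]")

# Distinct instances listed for that record.
instances = [
    "YLDGWLKNTP", "YLDGWLKDTR", "YLDGWLKDIR",
    "YLDGWLKNTQ", "YLDGWLKDSH", "YLDIGSKIFK",
]

# Every listed instance should be matched in full by the pattern.
all_match = all(pattern.fullmatch(inst) is not None for inst in instances)

print(coverage_percent(21, 10, 741))  # 10-residue repeat, 741 aa record
print(coverage_percent(21, 6, 756))   # 6-residue repeat, 756 aa record
print(all_match)
```

The same check reproduces the 6-residue records, for example 21 repeats in 756 amino acids giving 16.67 %.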
Repeat found in LOC127085073
Repeat occurs 21 times in a sequence of 756 amino acids
Location between 15987697 and 15990606
Coverage of 16.67 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDG
WLKDTR
AKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQ
VYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKN
GQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQ
SPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHS
TSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDI
MNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 14.12 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
H
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][ST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127085073
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 15987697 and 15990606
Coverage of 16.94 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][ISTV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
R
VEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDG
WLKDTR
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQ
IYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSI
PNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPT
SLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
AEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTGAENAKSNLD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088328
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 17248750 and 17251664
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDNH | WLKDSH

pattern: WLK[DN][ISTN][PRH]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKDTRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127085074
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 16009594 and 16012357
Coverage of 15.36 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDIR | WLKDIR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKE
KVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 17.21 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDIR | WLKDTL | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSRL]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
AEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDG
WLKDIR
DEKAKSTPDSNQVYVDGWLKDTLAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 23 times in a sequence of 787 amino acids
Location between 18120165 and 18123127
Coverage of 29.22 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVY
LDGWLKDTR
VENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPD
SKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKL
NSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDS
H
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNK
GETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTR
AEKAKVNPDSNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPD
SNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTR
ADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085073
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 15987697 and 15990606
Coverage of 16.37 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDT
R
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDG
WLKDTR
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNR
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 17 times in a sequence of 641 amino acids
Location between 18120165 and 18123127
Coverage of 26.52 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][STF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRAEKA
KLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNT
Q
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDG
WLKDSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
RKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVH
GVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHA
TK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 14.98 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDIR
WLKDIR | WLKDTR | WLKDTR | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
R
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127088336
Repeat occurs 27 times in a sequence of 894 amino acids
Location between 18128433 and 18215138
Coverage of 18.12 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTL | WLKDTR | WLKDTR | WLKDTQ | WLKDTR
WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSRL]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTR
AEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQ
VYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDGWLKDTLAEKAKLN
SDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTL
NPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLK
DSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKL
ADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVI
GAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088327
Repeat occurs 18 times in a sequence of 684 amino acids
Location between 17186384 and 17189258
Coverage of 15.79 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
R
AEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYLDGWLK
DTR
AEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQHLEESN
GKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPSLLQLF
SLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSYPTTSG
APLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICH
LDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 672 amino acids
Location between 18120165 and 18123127
Coverage of 26.79 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKANPDSNQVY
LDGWLKDTR
VEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRA
DKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWL
KDTR
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVY
LDGWLKDTR
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPN
SNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEA
FKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDM
IDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDIS
KDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFI
FDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085073
Repeat occurs 19 times in a sequence of 710 amino acids
Location between 15987697 and 15990606
Coverage of 16.06 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][STV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDT
R
DEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDG
WLKDTR
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQ
AYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESM
LEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPSDATK
Repeat found in LOC127088328
Repeat occurs 19 times in a sequence of 706 amino acids
Location between 17248750 and 17251664
Coverage of 16.15 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDSR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDNH | WLKDSH |
pattern: WLK[DN][ISTN][PRH]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSRAEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDGWLKDTRGEKEKSNH
DSNQVYLDGWLKDTRGEKEKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQK
AKSNSDSNRVYLDGWLKDTRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKD
TR
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLD
GWLKDSHAENDMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFL
PRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFV
HGIIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGS
KIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSD
ATK
Repeat found in LOC127085074
Repeat occurs 16 times in a sequence of 641 amino acids
Location between 16009594 and 16012357
Coverage of 14.98 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDIR
WLKDIR | WLKDTR | WLKDTR | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKA
KANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
R
AEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDG
WLKDSR
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
KKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVH
GIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDA
TK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 718 amino acids
Location between 18120165 and 18123127
Coverage of 27.86 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWL
KDTR
VENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVY
LDGWLKDTR
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSD
SNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIA
KSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMT
LQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKA
CPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPY
ALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFP
VKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 19 times in a sequence of 687 amino acids
Location between 18120165 and 18123127
Coverage of 27.66 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKE
KSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDT
R
AEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDG
WLKDTR
AEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLL
QLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085073
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 15987697 and 15990606
Coverage of 17.21 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][ISTV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRGEKEKSNPDSNQVYLDGWLKDTRAEKEKANPNSNQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDG
WLKDTR
VENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQ
VYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSS
PNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGE
DMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088336
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 18128433 and 18215138
Coverage of 16.94 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDTR
WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTR
AEKTKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQ
VYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKST
PNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPT
SLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIR
VEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
H
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
R
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQ
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127085073
Repeat occurs 17 times in a sequence of 664 amino acids
Location between 15987697 and 15990606
Coverage of 15.36 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTR
DEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTRVEKEKSAPD
SKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKEKSAPDSKEVYLDGWLKDTRVEND
KSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDTRAKKVKVNPDSNQVYLDGWLKDT
R
DEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDG
WLKDSH
VENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCE
SEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 17.21 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTR
GEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
R
VEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTR
AEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085074
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 16009594 and 16012357
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTR
DEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
R
AEKENSSPTSNQIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
HAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085074
Repeat occurs 20 times in a sequence of 733 amino acids
Location between 16009594 and 16012357
Coverage of 16.37 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDSH | WLKDSR

pattern: WLK[DN][IST][QPRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKHNPD
SNQVYLDGWLKDTRVQKEKANSDSNQVYLDGWLKDTRVQKEKVNSNSNEVYLDGWLKDTQAEKE
KVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDT
RAEKEKVNTDSNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDG
WLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQ
IYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDV
INQCESEPNKGETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENE
KSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
RVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][STF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
RVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127085074
Repeat occurs 14 times in a sequence of 595 amino acids
Location between 16009594 and 16012357
Coverage of 14.12 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IST][PRH]
MTLRVVMSLIPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKDQKPSPNSDQVYLDGWL
KDTRDEKTKVNSDSNQVYLDGWLKDTRTEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKTNPDSNQVYLDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRAEKEKVNTD
SNQVYLDGWLKDIRVQKAKANSDSNQVYLDGWLKDIRAEKVKVNPDSNQVYLDGWLKDTRDEKA
KSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPTSNQIYLDGWLKDS
HVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPKKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNK
GETKACPTSLESMLEFVHGIIGADTNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPSDATK
Repeat found in LOC127088328
Repeat occurs 21 times in a sequence of 752 amino acids
Location between 17248750 and 17251664
Coverage of 16.76 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDSR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDNH
WLKDSH |
pattern: WLK[DN][ISTN][PRH]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTRGEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSRAEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDRWLKDTRVQKEKVNS
DSNEVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEK
EKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKD
TRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLD
GWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHL
EESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSL
LQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYP
TTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNAL
GICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127085073
Repeat occurs 18 times in a sequence of 687 amino acids
Location between 15987697 and 15990606
Coverage of 15.72 %
Instances:
WLKNTP | WLKDTR | WLKNTR | WLKDTR | WLKDTR
WLKDVR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][ISTV][PRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWL
KDTRDEKTKTNPDTNQVYLDGWLKNTRAEKAKVNLDSNQVYLDGWLKDTRAEKEKINPDSNQVY
LDGWLKDTRGEKAKANPDSNQVYLDGWLKDVRGEKEKSNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDIRVEKEKSSPDSKQVYLDGWLKDTRVEKE
KSAPDSKEVYLDGWLKDTRVENDKSSPDSKQVYLDGWLKDTRDEKAKSTLDSNQVYLDGWLKDT
RAKKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQIYVDGWLKDTRAEKAIVNSDSNQVYLDG
WLKDTRAEKENSSPNSNRIYLDGWLKDSHVENAKSIPNSKQAYLDGWLKDSRAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLL
QLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGVDTNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSDATK
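
The "pattern" line of each record can be read as a regular expression: a fixed residue where all instances agree, and a bracketed set where they differ. The following is a minimal Python sketch of that reading, using the distinct instances and the first 67 residues of the LOC127085073 record above. The function names derive_pattern and find_instances are hypothetical and this is not the tool's own code. Residues inside brackets are sorted here, so their order may differ from the reported pattern while matching the same strings.

```python
import re


def derive_pattern(instances):
    """Build a regex-style consensus from equal-length repeat instances."""
    if not instances:
        raise ValueError("need at least one instance")
    length = len(instances[0])
    if any(len(inst) != length for inst in instances):
        raise ValueError("all instances must have the same length")
    parts = []
    # zip(*instances) walks the instances column by column.
    for column in zip(*instances):
        residues = sorted(set(column))
        if len(residues) == 1:
            parts.append(residues[0])                    # conserved position
        else:
            parts.append("[" + "".join(residues) + "]")  # variable position
    return "".join(parts)


def find_instances(pattern, sequence):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, sequence)]


# Distinct instances listed for LOC127085073 (duplicates omitted).
instances = ["WLKNTP", "WLKDTR", "WLKNTR", "WLKDVR", "WLKDIR", "WLKDSH", "WLKDSR"]
pattern = derive_pattern(instances)

# First 67 residues of the LOC127085073 sequence, with line breaks removed.
fragment = "MAHRVVMSLLPFLLLLLINDHGSFARDMNQIDQPYLDGWLKNTPQKNQKSSLNSDQVYLDGWLKDTR"
hits = find_instances(pattern, fragment)

print(pattern)
print(hits)
```

Sequence line breaks are not significant, so a wrapped sequence should be joined into one string before searching; otherwise repeats that straddle a line end are missed.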

NC_066583 - Cluster 21 - Cyclopeptide

Gene cluster description

NC_066583 - Gene Cluster 21. Type = cyclopeptide. Location: 16808328 - 19526976 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 16.73 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTQ | WLKDTR | WLKDSH
WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIRDEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDI
RVEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
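
The coverage values in these records are consistent with (number of instances x pattern length) / protein length, expressed as a percentage. This relationship is inferred from the reported numbers rather than taken from the tool's documentation, so treat it as an assumption. A short Python sketch, with the hypothetical function name repeat_coverage, checks it against the LOC127088327 record above (21 instances of a 6-residue pattern in 753 amino acids):

```python
def repeat_coverage(count, motif_length, sequence_length):
    """Percentage of a protein covered by `count` repeats of `motif_length` residues.

    Assumes non-overlapping repeats; the formula is inferred from the
    reported values, not confirmed by the tool.
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return 100.0 * count * motif_length / sequence_length


# LOC127088327: 21 instances, 6-residue pattern, 753 amino acids.
print(round(repeat_coverage(21, 6, 753), 2))
```

The same calculation with 22 instances of a 10-residue pattern in 764 amino acids gives the 28.8 % reported for LOC127088338.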
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
HAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088328
Repeat occurs 21 times in a sequence of 752 amino acids
Location between 17248750 and 17251664
Coverage of 16.76 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDSR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDNH
WLKDSH |
pattern: WLK[DN][ISTN][PRH]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTRGEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSRAEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDRWLKDTRVQKEKVNS
DSNEVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEK
EKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKD
TRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLD
GWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHL
EESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSL
LQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYP
TTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNAL
GICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 718 amino acids
Location between 18120165 and 18123127
Coverage of 27.86 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWL
KDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVY
LDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSD
SNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIA
KSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMT
LQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKA
CPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPY
ALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFP
VKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 672 amino acids
Location between 18120165 and 18123127
Coverage of 26.79 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKAKANPDSNQVY
LDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRA
DKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWL
KDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVY
LDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPN
SNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEA
FKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDM
IDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDIS
KDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFI
FDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 19 times in a sequence of 687 amino acids
Location between 18120165 and 18123127
Coverage of 27.66 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ
YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKE
KSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDT
RAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDG
WLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLE
ESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLL
QLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPT
TSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALG
ICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 22 times in a sequence of 779 amino acids
Location between 18128433 and 18215138
Coverage of 16.94 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDTR
WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTRGEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
RVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTRAEKTKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQ
VYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKST
PNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQF
PIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPT
SLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALY
YCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKH
VLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPD
SNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][STF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
RVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 17.21 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSR]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTRGEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
RVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTRAEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 741 amino acids
Location between 18120165 and 18123127
Coverage of 28.34 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWL
KDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVY
LDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLD
SNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKD
NSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKV
DHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSP
QGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYT
VLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMN
PNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 23 times in a sequence of 802 amino acids
Location between 18128433 and 18215138
Coverage of 17.21 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDIR | WLKDTL | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSRL]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTRGEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
RAEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDG
WLKDIRDEKAKSTPDSNQVYVDGWLKDTLAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNE
VYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSS
PNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHT
EAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPHGE
DMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLD
ISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNH
FIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088336
Repeat occurs 27 times in a sequence of 894 amino acids
Location between 18128433 and 18215138
Coverage of 18.12 %
Instances:
WLKNTS | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDIR
WLKDTL | WLKDTR | WLKDTR | WLKDTQ | WLKDTR
WLKDSH | WLKDSH |
pattern: WLK[DN][IST][QHSRL]
MTHRVVMFLLPFLLLLLINDHGSFAREMNQIDQPYLDGWLKNTSLKNQKPSPNSDQVYLDGWL
KDTRGEKAKTNPDSNQVYLDGWLKDTRTGKAKVNPDSNQVYLDGWLKDTRAEKEKANPDSNQVY
LDGWLKDTRGENEKSNPESNQVYLDGWLKDTRTEKEKSNPDSNQVYLDGWLKDTRAEKAKTNPN
SNQVYLDGWLKDTRVAKEKSNPDSNQVYLDGWLKDTRVEKEKPSPESKQVYLDGWLKDTRAEKA
KANPDSNQVYLDGWLKDTRSEKGKFNLDSDQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDT
RVEKEKSSPDSKQVYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDG
WLKDTRAEKTKLNSDSNQVYLDGWLKDTRAEKAKSNPDSNQVYLDGWLKDTRVEKEKSSPDSKQ
VYLDGWLKDTQVEKEKSAPNSKQVYLDGWLKDIRDEKAKSTPDSNQVYVDGWLKDTLAEKAKLN
SDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNHNSNQVYLDGWLKDTQTL
NPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSTPNSKQAYLDGWLK
DSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKL
ADEIPVSKSQSSSLLQLFSLTKDSPHGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVI
GAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
HAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088328
Repeat occurs 19 times in a sequence of 706 amino acids
Location between 17248750 and 17251664
Coverage of 16.15 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDSR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDIR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDNH | WLKDSH |
pattern: WLK[DN][ISTN][PRH]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTRGEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRDEKEKSNQVYLDGWLKDTRAEKEKSNPDSNQV
YLDGWLKDSRAEKEKHNPNSNQVYLDGWLKDTRVQKEKASSDSNQVYLDGWLKDTRGEKEKSNH
DSNQVYLDGWLKDTRGEKEKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQK
AKSNSDSNRVYLDGWLKDTRAEKVKVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKD
TRAEKAIVNSDSNQVYLDGWLKDTRAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLD
GWLKDSHAENDMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFL
PRKVADDIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFV
HGIIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGS
KIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSD
ATK
Repeat found in LOC127088328
Repeat occurs 15 times in a sequence of 618 amino acids
Location between 17248750 and 17251664
Coverage of 14.56 %
Instances:
WLKNTP | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDNH | WLKDSH

pattern: WLK[DN][ISTN][PRH]
MTHKVVMSLIPFLLLWLINDHGSLARDMNQVDQPYLDGWLKNTPLKNQKPSPNSDQVYLDGWL
KDTRGEKAKVNLDSNQVYLDGWLKDTRTEKAKVNPDSNLVYLDGWLKDTRAEKEKVNPDSNQVY
LDGWLKDTRAEKEKINPDSNQVYLDGWLKDTRGEKEKSNHDSNQVYLDGWLKDTRGEKEKVNPD
SNQVYLDGWLKDTRAEKEKVNPDSNQIYLDGWLKDIRVQKAKSNSDSNRVYLDGWLKDTRAEKV
KVNPDSNQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAIVNSDSNQVYLDGWLKDT
RAEKENSSPNSNLIYLDGWLKDNHVENAKSIPNSKQAYLDGWLKDSHAENDMKNGQHLEESNGK
LSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKVADDIPVSKSQSPSLLQLFSL
TKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGAETNYNIHSTSYPTTSGAP
LQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICHLD
TSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPTPSDATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVENE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
RAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088327
Repeat occurs 18 times in a sequence of 684 amino acids
Location between 17186384 and 17189258
Coverage of 15.79 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTQ
WLKDTR | WLKDSH | WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIRDEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDGWLKDT
RAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYLDGWLK
DTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQHLEESN
GKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPSLLQLF
SLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSYPTTSG
APLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNALGICH
LDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088327
Repeat occurs 22 times in a sequence of 776 amino acids
Location between 17186384 and 17189258
Coverage of 17.01 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDIR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTQ | WLKDTR
WLKDSH | WLKDSR |
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIRDEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
RVEKEKSAPDSKQVYLDGWLKDIRVEKEKSATDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDG
WLKDTRDEKAKSTPDSNQVYVDGWLKDTRAEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQ
VYLDGWLKDTQAKSNLDSNQVYLDGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNS
KQAYLDGWLKDSRVENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIR
EYAPFLPRKLADEIPVSKSQSPSLLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLE
SMLEFVHGIIGPETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCH
YLDIGSKIFKVLLKGEYGDIMNALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLW
VPSPSNATK
Repeat found in LOC127088338
Repeat occurs 22 times in a sequence of 764 amino acids
Location between 18120165 and 18123127
Coverage of 28.8 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPDSNQVYLDGWLKDTGAENAKSNLD
SNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKEKSAPNSKQVYLDGWLKDTRVENE
KSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVYLDGWLKDTRVENEKSAPNSKQVY
LDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDTRAEKAKLNSD
SNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNTQTLNP
KPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDS
HAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLAD
EIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVHGVIGA
ETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVL
LKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 17 times in a sequence of 641 amino acids
Location between 18120165 and 18123127
Coverage of 26.52 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH
YLDGWLKDSH | YLDIGSKIFK |
pattern: YLD[IG][GW][SL]K[IDN][STF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTRDEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTRGEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRAEKA
KLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDGWLKNT
Q
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQAYLDG
WLKDSH
AENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLP
RKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESMLEFVH
GVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSK
IFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPPHA
TK
Repeat found in LOC127088338
Repeat occurs 18 times in a sequence of 664 amino acids
Location between 18120165 and 18123127
Coverage of 27.11 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[IG][GW][SL]K[IDN][ISTF][QPKRH]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENE
KSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDT
R
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDG
WLKDSH
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSL
DDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCE
SEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKW
VACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKP
GEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088327
Repeat occurs 21 times in a sequence of 753 amino acids
Location between 17186384 and 17189258
Coverage of 16.73 %
Instances:
WLKNTP | WLKDIR | WLKDTR | WLKDTR | WLKDAR
WLKDIR | WLKDTR | WLKDTR | WLKDVR | WLKDTR
WLKDTR | WLKDTR | WLKDTR | WLKDTR | WLKDTR
WLKDTR | WLKDTR | WLKDTQ | WLKDTR | WLKDSH
WLKDSR
pattern: WLK[DN][IASTV][QPRH]
MAHRVVMSLLSFLLLLLINDYGSFARDMNQIDQPYLDGWLKNTPLKNQKSSLNSDQVYLDGWL
KDIR
DEKTKTNSDTNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRGEKEKANPNSNQVY
LDGWLKDARGEKEKSNPDSNQVYLDGWLKDIRGEKEKHNSDSNQVYLDGWLKDTRGEKEKSNPD
SNQVYLDGWLKDTRVEKEKVNPDSNQVYLDGWLKDVRAEKAKASPDSNQVYLDGWLKDTRAEKV
KANPDSNQVYLDGWLKDTRVENAKSNLDSNQVYLDGWLKDTRAEKAKVNPNSNQVYLDGWLKDT
R
VEKEKSAPDSKQVYLDGWLKDTRVEKEKSAPDSKQVYLDGWLKDTRDEKAKSTPDSNQVYVDG
WLKDTR
AEKAKVNSNSNQVYLDGWLKDTRTEKENSNSNSNQVYLDGWLKDTQAKSNLDSNQVYL
DGWLKDTRAEKENSSPNSNRIYLDGWLKDSHIENAKSIPNSKQAYLDGWLKDSRVENYMKNGQH
LEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSPS
LLQLFSLTKDSPQGEDMIDVINQCESEPNKGETKACPTSLESMLEFVHGIIGPETNYNIHSTSY
PTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMNA
LGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVPSPSNATK
Repeat found in LOC127088338
Repeat occurs 21 times in a sequence of 733 amino acids
Location between 18120165 and 18123127
Coverage of 28.65 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH
YLDIGSKIFK
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDT
R
VENEKSTPDSKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDG
WLKDTR
VEKLNSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNR
VYLDGWLKDSHVEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKV
AFFSLDDLYVGNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDV
MNQCESEPNKGETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDI
YAPKWVACHPRPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDL
LGMKPGEGPLCHFFPVKHVLWVPSPPHATK
Repeat found in LOC127088338
Repeat occurs 20 times in a sequence of 710 amino acids
Location between 18120165 and 18123127
Coverage of 28.17 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDIR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKNTQ | YLDGWLKDTR | YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK

pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPDSKQVYLDGWLKDT
R
AEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKLNSNSDSNQVYLDG
WLKNTQ
TLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDSHVEIAKSIPNSKQ
AYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYVGNVMTLQFPIREY
APFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNKGETKACPTSLESM
LEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHPRPYPYALYYCHYL
DIGSKIFK
VLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPLCHFFPVKHVLWVP
SPPHATK
Repeat found in LOC127088338
Repeat occurs 23 times in a sequence of 787 amino acids
Location between 18120165 and 18123127
Coverage of 29.22 %
Instances:
YLDGWLKNTP | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTG | YLDGWLKDTR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDIR | YLDGWLKDTR
YLDGWLKDTR | YLDGWLKDTR | YLDGWLKDTR | YLDGWLKNTQ | YLDGWLKDTR
YLDGWLKDSH | YLDGWLKDSH | YLDIGSKIFK
pattern: YLD[IG][GW][SL]K[IDN][ISTF][GHKQPR]
MAHRVVMSLLPFLLLLLINDHGSFARDMNQVDQPYLDGWLKNTPLKNQKSSLNSDQIYLDGWL
KDTR
DEKAKLNPDTNQVYLDGWLKDTRGEKAKANPDSNQVYLDGWLKDTRAEKEKDNPDSNQVY
LDGWLKDTR
GEKAKVNPDSNQVYLDGWLKDTRAEKEKVNPDSNQVYLDGWLKDTRAEKAKVNPD
SNQVYLDGWLKDTGAENAKSNLDSNQVYLDGWLKDTRAEKAKANPDSNQVYLDGWLKDTRVEKE
KSAPNSKQVYLDGWLKDTRVENEKSALDSNAKSNLDSNQVYLDGWLKDTRADKAKANPDSNQVY
LDGWLKDTR
VENEKSAPNSKQVYLDGWLKDIRVEKEKSALDSKEVYLDGWLKDTRVENEKSTPD
SKQVYLDGWLKDTRAEKAKLNSDSNQVYLDGWLKDTRAEKAKSSLDSNEVYLDGWLKDTRVEKL
NSNSDSNQVYLDGWLKNTQTLNPKPTLDSNQVYLDGWLKDTRAEKDNSSPNSNRVYLDGWLKDS
H
VEIAKSIPNSKQAYLDGWLKDSHAENYMKNGQHLEESNGKLSSKVDHTEAFKVAFFSLDDLYV
GNVMTLQFPIREYAPFLPRKLADEIPVSKSQSSSLLQLFSLTKDSPQGEDMIDVMNQCESEPNK
GETKACPTSLESMLEFVHGVIGAETNYNIHSTSYPTTSGAPLQNYTVLDISKDIYAPKWVACHP
RPYPYALYYCHYLDIGSKIFKVLLKGEYGDIMDALGICHLDTSAMNPNHFIFDLLGMKPGEGPL
CHFFPVKHVLWVPSPPHATK

Similar gene clusters

NC_066583 - Cluster 22 - Saccharide

Gene cluster description

NC_066583 - Gene Cluster 22. Type = saccharide. Location: 102550203 - 103091286 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))
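A rule of the form `minimum(n,[domains],[core])` can be read as a threshold test on the domains found in a region. The sketch below assumes one plausible reading: at least `n` distinct domains from the first list must be present, and every core entry must be satisfied, where alternatives joined by `/` count as one entry. This reading is an assumption, not a statement of how the tool evaluates rules. The function signature and the abbreviated `ALLOWED` set are hypothetical; the Cluster 22 domains are those listed in the summary table.

```python
def minimum(n, allowed, core_groups, found):
    """Evaluate a minimum(n, [allowed], [core]) rule under the assumed reading.

    core_groups is a list of sets; each set holds the alternatives that
    were joined by "/" in the rule text. An empty list imposes no core.
    """
    hits = set(found) & set(allowed)
    core_ok = all(hits & set(group) for group in core_groups)
    return len(hits) >= n and core_ok


# Abbreviated subset of the allowed-domain list (the full list is much longer).
ALLOWED = {"Aminotran_1_2", "Glycos_transf_2", "SE", "UDPGT_2", "p450", "Glyco_hydro_1"}

# Core entry of the plants/saccharide rule: one group of alternatives.
SACCHARIDE_CORE = [{
    "Glycos_transf_1", "Glycos_transf_2", "Glycos_transf_28",
    "UDPGT", "UDPGT_2", "Glyco_hydro_1", "Cellulose_synt",
}]

# Core domains reported for Cluster 22 in the summary table.
cluster22 = {"Aminotran_1_2", "Glycos_transf_2", "SE"}
print(minimum(3, ALLOWED, SACCHARIDE_CORE, cluster22))  # True
```

Under this reading, three allowed domains without any glycosyltransferase-type domain would not satisfy the saccharide rule.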

Legend:

Only available when smCOG analysis was run
biosynthetic genes
transport-related genes
regulatory genes
other genes

NC_066583 - Cluster 23 - Polyketide

Gene cluster description

NC_066583 - Gene Cluster 23. Type = polyketide. Location: 194555028 - 195494026 nt.
pHMM detection rules used:
plants/polyketide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Chal_sti_synt_C/Chal_sti_synt_N]) or minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Thr_dehydrat_C]) or 
minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Chal_sti_synt_C,Chal_sti_synt_N]))

NC_066583 - Cluster 24 - Polyketide

Gene cluster description

NC_066583 - Gene Cluster 24. Type = polyketide. Location: 214812041 - 217867500 nt.
pHMM detection rules used:
plants/polyketide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Chal_sti_synt_C/Chal_sti_synt_N]) or minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Thr_dehydrat_C]) or 
minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Chal_sti_synt_C,Chal_sti_synt_N]))

NC_066583 - Cluster 25 - Saccharide

Gene cluster description

NC_066583 - Gene Cluster 25. Type = saccharide. Location: 246094858 - 246596422 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))
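The Location field in each cluster description gives the start and end coordinates in nucleotides, from which the cluster size in kb follows directly. A minimal sketch of that calculation is shown here; the regular expression and the function name `cluster_size_kb` are illustrative assumptions, and the size is taken as end minus start.

```python
import re

# Matches the "Location: <start> - <end> nt" field of a description line.
LOCATION_RE = re.compile(r"Location:\s*(\d+)\s*-\s*(\d+)\s*nt")


def cluster_size_kb(description: str) -> float:
    """Return the cluster size in kb parsed from a description line."""
    match = LOCATION_RE.search(description)
    if match is None:
        raise ValueError("no Location field found")
    start, end = (int(group) for group in match.groups())
    return round((end - start) / 1000, 2)


line = "NC_066583 - Gene Cluster 25. Type = saccharide. Location: 246094858 - 246596422 nt."
print(cluster_size_kb(line))  # 501.56
```

Applied to the Cluster 22 coordinates (102550203 - 103091286 nt), the same calculation gives 541.08 kb, which matches the size listed in the summary table.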

NC_066583 - Cluster 26 - Fatty_acid

Gene cluster description

NC_066583 - Gene Cluster 26. Type = fatty_acid. Location: 248378010 - 249024385 nt.
pHMM detection rules used:
plants/fatty_acid: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[FA_desaturase/FA_desaturase_2/FA_hydroxylase/CER1-like_C]) or minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,ECH_2]) or 
minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,AMP-binding]))
plants/plant: (minimum(4,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[]))

NC_066584 - Cluster 27 - Terpene

Gene cluster description

NC_066584 - Gene Cluster 27. Type = terpene. Location: 265558929 - 268252320 nt.
pHMM detection rules used:
plants/terpene: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Terpene_synth/Terpene_synth_C/Prenyltrans/SQHop_cyclase_C/SQHop_cyclase_N/PRISE]))

NC_066585 - Cluster 28 - Fatty_acid-alkaloid

Gene cluster description

NC_066585 - Gene Cluster 28. Type = fatty_acid-alkaloid. Location: 161969102 - 162676728 nt.
pHMM detection rules used:
plants/alkaloid: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Bet_v_1/Cu_amine_oxid/Str_synth/BBE/Orn_DAP_Arg_deC/Pyridoxal_deC]))
plants/fatty_acid: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[FA_desaturase/FA_desaturase_2/FA_hydroxylase/CER1-like_C]) or minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,ECH_2]) or 
minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Transferase,AMP-binding]))

NC_066585 - Cluster 29 - Terpene

Gene cluster description

NC_066585 - Gene Cluster 29. Type = terpene. Location: 181259511 - 182181420 nt.
pHMM detection rules used:
plants/terpene: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Terpene_synth/Terpene_synth_C/Prenyltrans/SQHop_cyclase_C/SQHop_cyclase_N/PRISE]))

NC_066585 - Cluster 30 - Saccharide

Gene cluster description

NC_066585 - Gene Cluster 30. Type = saccharide. Location: 233677316 - 234181648 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

NC_066585 - Cluster 31 - Terpene

Gene cluster description

NC_066585 - Gene Cluster 31. Type = terpene. Location: 239871631 - 241273047 nt.
pHMM detection rules used:
plants/terpene: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Terpene_synth/Terpene_synth_C/Prenyltrans/SQHop_cyclase_C/SQHop_cyclase_N/PRISE]))

NC_066585 - Cluster 32 - Cyclopeptide

Gene cluster description

NC_066585 - Gene Cluster 32. Type = cyclopeptide. Location: 290755896 - 298199303 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Repeat found in LOC127101320
Repeat occurs 13 times in a sequence of 408 amino acids
Location between 292120408 and 292121902
Coverage of 22.3 %
Instances:
EFEPRPS | EFEPRPS | EFEPRPS | EFEPRPS | EFEPRPS
EFEPRPS | EFEPRPS | EFEPRPS | EFEPIPS | EFEPRPS
EFEPRPS | EFEPRPS | EFEPRPS
pattern: EFEP[IR]PS
The following known motifs were found:
FEPR was found 12 times in this sequence
MRPALALLPFLFLFMFAVTIESRKDLKEYWKTVMKDEAMPEGIQGLLQFKSVIEPLKNSKAQE
QLAKGKCDQLDVKEKKLVKEEFEPRPSPSVTKYDGGEGNKNMKLLVNDEFEPRPSPSVTKYDGD
ESYKNMKLSINDEFEPRPSPSATKYDGNEGYKNIKLPVNDEFEPRPSPSVTKYDGGEGNKNMKL
LVNDEFEPRPSPSVTKYDGDESYKNMKLSINDEFEPRPSPSATEYDGNEGYKNIKLPVNDEFEP
R
PSPSATKYDGDDGYKNMKLPINDEFEPRPSPSATKYDGDDGYKNMKLSVNDEFEPIPSVTKYD
GDEGYKNLKLTINDEFEPRPSPSATKYDGDDGYQNMKLPINDEFEPRPSPSATKYDGDDGYQNM
KLPINYEFEPRPSPSATKYDGDDGYKNMKLPLNDEFEPRPSATKYND
The following known motifs were found in CDS LOC127101320
Location between 292120408 and 292121902
FEPR was found 9 times in this sequence
Sequence:
MRPALALLPFLFLFMFAVTIESRKDLKEYWKTVMKDEAMPEGIQGLLQFKSVIEPLKNSKAQE
QLAKGKCDQLDVKEKKLVKEEFEPRPSVTKYDGGEGNKNMKLLVNDEFEPRPSVTKYDGDESYK
NMKLSINDEFEPRPSATKYDGNEGYKNIKLPVNDEFEPRPSATKYDGDDGYKNMKLPINDEFEP
R
PSATKYDGDDGYKNMKLSVNDEFEPIPSVTKYDGDEGYKNLKLTINDEFEPRPSATKYDGDDG
YQNMKLPINDEFEPRPSATKYDGDDGYQNMKLPINYEFEPRPSATKYDGDDGYKNMKLPLNDEF
EPR
PSATKYND
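The known-motif count above can be reproduced by scanning the protein sequence for the motif string. A minimal sketch is given here, assuming the wrapped sequence lines are joined first, because a motif can straddle a line break in this listing. The helper name `count_motif` is hypothetical, and the fragments are short excerpts of the sequence above rather than the full protein.

```python
import re


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of a motif, including overlapping ones."""
    return len(re.findall(f"(?={re.escape(motif)})", sequence))


# Excerpt containing two EFEPRPS repeats, hence two FEPR motifs.
fragment = "EFEPRPSVTKYDGGEGNKNMKLLVNDEFEPRPSVTKYDGDESYK"
print(count_motif(fragment, "FEPR"))  # 2

# A motif split across wrapped lines is only found after joining them.
wrapped = ["KLPINDEFEP", "R", "PSATKYDG"]
print(count_motif("".join(wrapped), "FEPR"))  # 1
```

The variant repeat EFEPIPS does not contain FEPR, which is one reason the motif count can be lower than the repeat count.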
Repeat found in LOC127106000
Repeat occurs 15 times in a sequence of 690 amino acids
Location between 294476165 and 294479033
Coverage of 17.39 %
Instances:
WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE
WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE
WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPLPDSK

pattern: WLP[LW][PG][SD][SR][KE]
MKTLYNTQKMAPTLAFHFLSLVLFFVTMGEGIIVEDMKIELPDQKDIEEAKQSNHLHNLIDEA
KKPNYNSEDITHDPNPWLPWGSRETKKPIYNSEVNTHDPNPWLPWGSREIKKPIYNSEVNTRDL
NPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPWLPWGSREIKRP
IYNSEVNTRDPNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPW
LPWGSRE
IKKPIYNSEVNTRDLNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYN
NEVNTRDLNPWLPWGSRETKKRNYNSEADTHDPNPWLPWGSRETKKSNYNSEVNPRDPNPWLPW
GSRE
INKLNYDSEVNTRDPNPWRPWGSRETKKHNFNSEVYTRNPNPWLPWGSREVKKPKYNYKI
KTHDPNPYIDHTDAFEKGFFNLEDLHVGNVMTLQFSVQEIPHFFSRKEEADSIPFSVSQFSSVL
QLFSIPEDSLEAKTMRGTLEHCQEETVVGETKICANSVESMFEFVDTIIGSENKHNILRTSYPS
PTAAPLQKYTILKVSHDIDAPKWVSCHPLPYPYAVYYCHTMATGTRVFKVTLVGDKNGDKMEAL
GMCHLDTADWNPNHMIFKTLKVKPGKNTPVCHFFSINHLLWLPLPDSKVTM
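The pattern line of a repeat record summarises the listed instances position by position: a single letter where all instances agree, and a bracketed set where they differ. A minimal sketch of that construction follows, assuming equal-length instances; the function name `consensus_pattern` is hypothetical. Residues inside each bracket are sorted here, so the order can differ from the reported pattern while matching the same strings.

```python
import re


def consensus_pattern(instances):
    """Build a regex-style pattern from equal-length repeat instances."""
    if len({len(inst) for inst in instances}) != 1:
        raise ValueError("instances must all have the same length")
    parts = []
    for column in zip(*instances):
        residues = sorted(set(column))
        if len(residues) == 1:
            parts.append(residues[0])
        else:
            parts.append("[" + "".join(residues) + "]")
    return "".join(parts)


# The 15 instances listed for LOC127106000.
instances = ["WLPWGSRE"] * 14 + ["WLPLPDSK"]
pattern = consensus_pattern(instances)
print(pattern)  # WLP[LW][GP][DS][RS][EK]

# Every instance matches the derived pattern.
print(all(re.fullmatch(pattern, inst) for inst in instances))  # True
```

Such a pattern is permissive by construction: it also matches combinations of residues that never occur together in any listed instance.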

NC_066585 - Cluster 33 - Cyclopeptide

Gene cluster description

NC_066585 - Gene Cluster 33. Type = cyclopeptide. Location: 291424092 - 294497725 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)

Legend:

Only available when smCOG analysis was run
biosynthetic genes
transport-related genes
regulatory genes
other genes

The following known motifs were found in CDS LOC127101320
Location between 292120408 and 292121902
FEPR was found 9 times in this sequence
Sequence:
MRPALALLPFLFLFMFAVTIESRKDLKEYWKTVMKDEAMPEGIQGLLQFKSVIEPLKNSKAQE
QLAKGKCDQLDVKEKKLVKEEFEPRPSVTKYDGGEGNKNMKLLVNDEFEPRPSVTKYDGDESYK
NMKLSINDEFEPRPSATKYDGNEGYKNIKLPVNDEFEPRPSATKYDGDDGYKNMKLPINDEFEPR
PSATKYDGDDGYKNMKLSVNDEFEPIPSVTKYDGDEGYKNLKLTINDEFEPRPSATKYDGDDG
YQNMKLPINDEFEPRPSATKYDGDDGYQNMKLPINYEFEPRPSATKYDGDDGYKNMKLPLNDEFEPR
PSATKYND
Repeat found in LOC127101320
Repeat occurs 13 times in a sequence of 408 amino acids
Location between 292120408 and 292121902
Coverage of 22.3 %
Instances:
EFEPRPS | EFEPRPS | EFEPRPS | EFEPRPS | EFEPRPS
EFEPRPS | EFEPRPS | EFEPRPS | EFEPIPS | EFEPRPS
EFEPRPS | EFEPRPS | EFEPRPS
pattern: EFEP[IR]PS
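The coverage values reported for the repeats are consistent with a simple ratio: number of instances times motif length, divided by protein length. This is inferred from the figures in this report, not taken from the tool's documentation, and the function name below is hypothetical. A minimal sketch:

```python
def repeat_coverage(instances: int, motif_length: int, sequence_length: int) -> float:
    """Percentage of a protein occupied by repeat instances.

    Assumption: coverage = instances * motif_length / sequence_length.
    This reproduces the values in the report but is not a documented formula.
    """
    return 100.0 * instances * motif_length / sequence_length


# EFEP[IR]PS: 13 instances of a 7-residue motif in 408 amino acids (LOC127101320)
print(round(repeat_coverage(13, 7, 408), 2))  # 22.3

# WLP[LW][PG][SD][SR][KE]: 15 instances of an 8-residue motif in 690 amino acids (LOC127106000)
print(round(repeat_coverage(15, 8, 690), 2))  # 17.39
```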
The following known motifs were found:
FEPR was found 12 times in this sequence
MRPALALLPFLFLFMFAVTIESRKDLKEYWKTVMKDEAMPEGIQGLLQFKSVIEPLKNSKAQE
QLAKGKCDQLDVKEKKLVKEEFEPRPSPSVTKYDGGEGNKNMKLLVNDEFEPRPSPSVTKYDGD
ESYKNMKLSINDEFEPRPSPSATKYDGNEGYKNIKLPVNDEFEPRPSPSVTKYDGGEGNKNMKL
LVNDEFEPRPSPSVTKYDGDESYKNMKLSINDEFEPRPSPSATEYDGNEGYKNIKLPVNDEFEPR
PSPSATKYDGDDGYKNMKLPINDEFEPRPSPSATKYDGDDGYKNMKLSVNDEFEPIPSVTKYD
GDEGYKNLKLTINDEFEPRPSPSATKYDGDDGYQNMKLPINDEFEPRPSPSATKYDGDDGYQNM
KLPINYEFEPRPSPSATKYDGDDGYKNMKLPLNDEFEPRPSATKYND
Repeat found in LOC127106000
Repeat occurs 15 times in a sequence of 690 amino acids
Location between 294476165 and 294479033
Coverage of 17.39 %
Instances:
WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE
WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE
WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPLPDSK

pattern: WLP[LW][PG][SD][SR][KE]
MKTLYNTQKMAPTLAFHFLSLVLFFVTMGEGIIVEDMKIELPDQKDIEEAKQSNHLHNLIDEA
KKPNYNSEDITHDPNPWLPWGSRETKKPIYNSEVNTHDPNPWLPWGSREIKKPIYNSEVNTRDL
NPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPWLPWGSREIKRP
IYNSEVNTRDPNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPWLPWGSRE
IKKPIYNSEVNTRDLNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYN
NEVNTRDLNPWLPWGSRETKKRNYNSEADTHDPNPWLPWGSRETKKSNYNSEVNPRDPNPWLPWGSRE
INKLNYDSEVNTRDPNPWRPWGSRETKKHNFNSEVYTRNPNPWLPWGSREVKKPKYNYKI
KTHDPNPYIDHTDAFEKGFFNLEDLHVGNVMTLQFSVQEIPHFFSRKEEADSIPFSVSQFSSVL
QLFSIPEDSLEAKTMRGTLEHCQEETVVGETKICANSVESMFEFVDTIIGSENKHNILRTSYPS
PTAAPLQKYTILKVSHDIDAPKWVSCHPLPYPYAVYYCHTMATGTRVFKVTLVGDKNGDKMEAL
GMCHLDTADWNPNHMIFKTLKVKPGKNTPVCHFFSINHLLWLPLPDSKVTM

Similar gene clusters

NC_066585 - Cluster 34 - Cyclopeptide

Gene cluster description

NC_066585 - Gene Cluster 34. Type = cyclopeptide. Location: 292231994 - 294566446 nt.
pHMM detection rules used:
plants/cyclopeptide: (BURP)
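The Location fields explain why the same repeat block is reported under more than one cluster: the spans of Clusters 33 and 34 overlap, and LOC127106000 lies inside both, whereas LOC127101320 lies only inside Cluster 33. The sketch below checks this from the coordinates given in this report. The helper names are hypothetical, and the size calculation, (end - start) / 1000, is an assumption that appears to match the Size (kb) column of the overview table.

```python
import re

# Hypothetical helper pattern for the "Location: <start> - <end> nt" field.
LOCATION_RE = re.compile(r"Location:\s*(\d+)\s*-\s*(\d+)\s*nt")


def parse_location(description: str) -> tuple[int, int]:
    """Extract (start, end) from a gene cluster description line."""
    match = LOCATION_RE.search(description)
    if match is None:
        raise ValueError("no Location field found")
    return int(match.group(1)), int(match.group(2))


def size_kb(span: tuple[int, int]) -> float:
    """Span length in kb, assuming size = (end - start) / 1000."""
    return (span[1] - span[0]) / 1000


def contains(span: tuple[int, int], gene: tuple[int, int]) -> bool:
    """True if the gene interval lies entirely inside the cluster span."""
    return span[0] <= gene[0] and gene[1] <= span[1]


cluster_33 = parse_location(
    "NC_066585 - Gene Cluster 33. Type = cyclopeptide. Location: 291424092 - 294497725 nt."
)
cluster_34 = parse_location(
    "NC_066585 - Gene Cluster 34. Type = cyclopeptide. Location: 292231994 - 294566446 nt."
)
loc127101320 = (292120408, 292121902)
loc127106000 = (294476165, 294479033)

for name, span in (("Cluster 33", cluster_33), ("Cluster 34", cluster_34)):
    print(
        name,
        f"{size_kb(span):.2f} kb",
        "LOC127101320:", contains(span, loc127101320),
        "LOC127106000:", contains(span, loc127106000),
    )
```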

Repeat found in LOC127106000
Repeat occurs 15 times in a sequence of 690 amino acids
Location between 294476165 and 294479033
Coverage of 17.39 %
Instances:
WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE
WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE
WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPWGSRE | WLPLPDSK

pattern: WLP[LW][PG][SD][SR][KE]
MKTLYNTQKMAPTLAFHFLSLVLFFVTMGEGIIVEDMKIELPDQKDIEEAKQSNHLHNLIDEA
KKPNYNSEDITHDPNPWLPWGSRETKKPIYNSEVNTHDPNPWLPWGSREIKKPIYNSEVNTRDL
NPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPWLPWGSREIKRP
IYNSEVNTRDPNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPWLPWGSRE
IKKPIYNSEVNTRDLNPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYN
NEVNTRDLNPWLPWGSRETKKRNYNSEADTHDPNPWLPWGSRETKKSNYNSEVNPRDPNPWLPWGSRE
INKLNYDSEVNTRDPNPWRPWGSRETKKHNFNSEVYTRNPNPWLPWGSREVKKPKYNYKI
KTHDPNPYIDHTDAFEKGFFNLEDLHVGNVMTLQFSVQEIPHFFSRKEEADSIPFSVSQFSSVL
QLFSIPEDSLEAKTMRGTLEHCQEETVVGETKICANSVESMFEFVDTIIGSENKHNILRTSYPS
PTAAPLQKYTILKVSHDIDAPKWVSCHPLPYPYAVYYCHTMATGTRVFKVTLVGDKNGDKMEAL
GMCHLDTADWNPNHMIFKTLKVKPGKNTPVCHFFSINHLLWLPLPDSKVTM
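The reported pattern, WLP[LW][PG][SD][SR][KE], uses bracketed alternatives that are also valid regular-expression character classes, so the instances can be located with an ordinary regex search. The sketch below runs the pattern on three fragments copied from the LOC127106000 sequence above (they are excerpts, not the contiguous protein); the function name is hypothetical. Note that the near-miss WRPWGSRE in the second fragment is not matched, because the pattern requires L at the second position.

```python
import re

# Repeat pattern as reported for LOC127106000.
PATTERN = "WLP[LW][PG][SD][SR][KE]"


def find_repeats(sequence: str, pattern: str = PATTERN) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, sequence)]


# Excerpts from the sequence shown in this report, not the full protein.
fragments = [
    "NPWLPWGSRETKRPIYNSEVNTRDPNPWLPWGSRETIKPIYNNEVNTRDLNPWLPWGSREIKRP",
    "INKLNYDSEVNTRDPNPWRPWGSRETKKHNFNSEVYTRNPNPWLPWGSREVKKPKYNYKI",
    "GMCHLDTADWNPNHMIFKTLKVKPGKNTPVCHFFSINHLLWLPLPDSKVTM",
]

for fragment in fragments:
    hits = find_repeats(fragment)
    print(len(hits), [text for _, text in hits])
```

Running the same function on the full, joined protein sequence would be the way to check the reported total of 15 instances.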

Similar gene clusters

NC_066585 - Cluster 35 - Polyketide

Gene cluster description

NC_066585 - Gene Cluster 35. Type = polyketide. Location: 304249747 - 305484261 nt.
pHMM detection rules used:
plants/plant: (minimum(4,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[]))
plants/polyketide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Chal_sti_synt_C/Chal_sti_synt_N]) or minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Thr_dehydrat_C]) or 
minimum(3,[E1_dh,PALP,Thr_dehydrat_C,Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[AMP-binding,Chal_sti_synt_C,Chal_sti_synt_N]))
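The rules above share the form minimum(n, [domain list], [core list]). One plausible reading, stated here as an assumption and not confirmed by this report, is that a locus qualifies when at least n distinct domain types from the first list are present and every entry of the core list is satisfied, with "/" separating interchangeable core domains. The sketch below implements that reading only. The function name is hypothetical, the allowed list is abbreviated to a few names from the rule, and the example inputs are the core domains listed for Clusters 7 and 5 in the overview table.

```python
def minimum_rule(found, n, allowed, required_groups):
    """Evaluate one minimum(n, [allowed], [core]) clause.

    Assumed semantics (not confirmed by the report):
      - at least n distinct domain types from `allowed` are present, and
      - each group in `required_groups` has at least one member present
        (a group models core domains joined by "/").
    """
    hits = set(found) & set(allowed)
    if len(hits) < n:
        return False
    return all(hits & set(group) for group in required_groups)


# Abbreviated subset of the domain list in the plants/polyketide rule.
ALLOWED = {
    "AMP-binding", "Acetyltransf_1", "Chal_sti_synt_C", "Chal_sti_synt_N",
    "p450", "UDPGT_2", "Transferase", "Methyltransf_11",
}

# First clause of plants/polyketide: core is Chal_sti_synt_C/Chal_sti_synt_N.
CORE = [{"Chal_sti_synt_C", "Chal_sti_synt_N"}]

cluster_7_domains = {"AMP-binding", "Acetyltransf_1", "Chal_sti_synt_C", "Chal_sti_synt_N"}
cluster_5_domains = {"UDPGT_2", "p450"}

print(minimum_rule(cluster_7_domains, 3, ALLOWED, CORE))  # True
print(minimum_rule(cluster_5_domains, 3, ALLOWED, CORE))  # False
```

Whether the tool counts distinct domain types or distinct genes carrying those domains is not stated in the report; the sketch counts domain types.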

Similar gene clusters

NC_066585 - Cluster 36 - Saccharide

Gene cluster description

NC_066585 - Gene Cluster 36. Type = saccharide. Location: 471104851 - 472933702 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066585 - Cluster 37 - Lignan

Gene cluster description

NC_066585 - Gene Cluster 37. Type = lignan. Location: 489750385 - 490502114 nt.
pHMM detection rules used:
plants/lignan: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Dirigent]))

Similar gene clusters

NC_066585 - Cluster 38 - Saccharide

Gene cluster description

NC_066585 - Gene Cluster 38. Type = saccharide. Location: 505507535 - 506336966 nt.
pHMM detection rules used:
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066585 - Cluster 39 - Lignan-saccharide

Gene cluster description

NC_066585 - Gene Cluster 39. Type = lignan-saccharide. Location: 536942171 - 537667057 nt.
pHMM detection rules used:
plants/lignan: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Dirigent]))
plants/saccharide: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Glycos_transf_1/Glycos_transf_2/Glycos_transf_28/UDPGT/UDPGT_2/Glyco_hydro_1/Cellulose_synt]))

Similar gene clusters

NC_066585 - Cluster 40 - Terpene

Gene cluster description

NC_066585 - Gene Cluster 40. Type = terpene. Location: 540270736 - 540492999 nt.
pHMM detection rules used:
plants/terpene: (minimum(3,[Orn_DAP_Arg_deC,Pyridoxal_deC,BBE,FA_hydroxylase,CER1-like_C,ECH_2,Oxidored_FMN,3Beta_HSD,Glyco_hydro_1,ADH_N,ADH_N_2,Abhydrolase_3,Aldo_ket_red,cMT,nMT,oMT,adh_short,Chal_sti_synt_C,Chal_sti_synt_N,COesterase,UDPGT,Glyco_transf_28,Glycos_transf_1,Glycos_transf_2,Lycopene_cycl,NAD_binding_1,p450,SQHop_cyclase_C,SQHop_cyclase_N,Prenyltrans,Terpene_synth_C,Terpene_synth,Transferase,Aminotran_1_2,AMP-binding,DIOX_N,Dirigent,Bet_v_1,Cu_amine_oxid,Str_synth,Trp_syntA,His_biosynth,adh_short_C2,Peptidase_S10,Prenyltransf,Epimerase,2OG-FeII_Oxy,Aminotran_3,Methyltransf_2,Methyltransf_3,Methyltransf_7,PRISE,Cellulose_synt,Chalcone,ERG4_ERG24,FA_desaturase,FA_desaturase_2,Methyltransf_11,polyprenyl_synt,SE,SQS_PSY,TPMT,UbiA,Lipoxygenase,Lyase_aromatic,HMGL-like,Chalcone_3,Chalcone_2,Acetyltransf_1,UDPGT_2,GMC_oxred_N,GMC_oxred_C,Amino_oxidase,DAHP_synth_1,DAHP_synth_2],[Terpene_synth/Terpene_synth_C/Prenyltrans/SQHop_cyclase_C/SQHop_cyclase_N/PRISE]))

Similar gene clusters

Similar known gene clusters