********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/91/91.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10777                    1.0000    500  15610                    1.0000    500
17337                    1.0000    500  17396                    1.0000    500
24641                    1.0000    500  263598                   1.0000    500
264258                   1.0000    500  268576                   1.0000    500
32983                    1.0000    500  36522                    1.0000    500
37960                    1.0000    500  4308                     1.0000    500
4359                     1.0000    500  6837                     1.0000    500
8069                     1.0000    500  bd1094                   1.0000    500
bd1095                   1.0000    500  bd722                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/91/91.seqs.fa -oc motifs/91 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.283 C 0.238 G 0.219 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.283 C 0.238 G 0.219 T 0.260
********************************************************************************


********************************************************************************
MOTIF 1 MEME    width = 21  sites = 8  llr = 150  E-value = 6.1e-007
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :95:5a198:399:866:591
pos.-specific     C  9:1a4:9::a81:a1:491:9
probability       G  ::4:1:::3::::::1::41:
matrix            T  11:::::1::::1:13:1:::

         bits    2.2
                 2.0    *     *   *
                 1.8    * *   *   *
                 1.5 *  * **  *   *   *  *
Relative         1.3 ** * *** * ***   * **
Entropy          1.1 ** * *********   * **
(27.0 bits)      0.9 ** * ********** ** **
                 0.7 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CAACAACAACCAACAAACAAC
consensus              G C   G A    TC G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
17396                        75  1.23e-11 CAGTGTTTCT CAACAACAGCCAACAAACAAC AACGCATGCA
17337                        82  5.02e-11 CAGTGTTTCT CAACAACAGCCAACATACAAC AACGCAAGCA
10777                       346  5.14e-10 GCACCATTCA TAGCAACAACAAACAAACAAC TCCTCTCTCA
24641                       446  1.53e-09 AACCAACCAA CACCCACAACAAACAAACGGC TCTCCTCCCT
bd1094                      262  2.04e-09 CCATGTAAAA CAGCAAAAACCAACAACCGAA AGACAATTGC
37960                       343  2.36e-09 CTCACTACAC CAGCGACTACCAACAGCCGAC TCGCAAAAAT
6837                        468  3.26e-09 TCCTCCACAT CAACCACAACCAACTAATCAC CAACACCCCA
15610                       384  3.47e-08 CACCATCAAA CTACCACAACCCTCCTCCAAC AAAACTCTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17396                             1.2e-11  74_[+1]_405
17337                               5e-11  81_[+1]_398
10777                             5.1e-10  345_[+1]_134
24641                             1.5e-09  445_[+1]_34
bd1094                              2e-09  261_[+1]_218
37960                             2.4e-09  342_[+1]_137
6837                              3.3e-09  467_[+1]_12
15610                             3.5e-08  383_[+1]_96
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=8
17396                    (   75) CAACAACAGCCAACAAACAAC  1
17337                    (   82) CAACAACAGCCAACATACAAC  1
10777                    (  346) TAGCAACAACAAACAAACAAC  1
24641                    (  446) CACCCACAACAAACAAACGGC  1
bd1094                   (  262) CAGCAAAAACCAACAACCGAA  1
37960                    (  343) CAGCGACTACCAACAGCCGAC  1
6837                     (  468) CAACCACAACCAACTAATCAC  1
15610                    (  384) CTACCACAACCCTCCTCCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 10.0755 E= 6.1e-007
  -965    188   -965   -105
   163   -965   -965   -105
    82    -93     77   -965
  -965    207   -965   -965
    82     65    -81   -965
   182   -965   -965   -965
  -117    188   -965   -965
   163   -965   -965   -105
   141   -965     19   -965
  -965    207   -965   -965
   -18    165   -965   -965
   163    -93   -965   -965
   163   -965   -965   -105
  -965    207   -965   -965
   141    -93   -965   -105
   114   -965    -81     -6
   114     65   -965   -965
  -965    188   -965   -105
    82    -93     77   -965
   163   -965    -81   -965
  -117    188   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 6.1e-007
 0.000000  0.875000  0.000000  0.125000
 0.875000  0.000000  0.000000  0.125000
 0.500000  0.125000  0.375000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.375000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.750000  0.000000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.125000  0.000000  0.125000
 0.625000  0.000000  0.125000  0.250000
 0.625000  0.375000  0.000000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.500000  0.125000  0.375000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.125000  0.875000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CA[AG]C[AC]ACA[AG]C[CA]AACA[AT][AC]C[AG]AC
--------------------------------------------------------------------------------

Time 2.93 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME    width = 16  sites = 18  llr = 187  E-value = 2.2e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::3:22::12::::31
pos.-specific     C  12272::2:1::1137
probability       G  183:63:74:761822
matrix            T  8:2316a147349111

         bits    2.2
                 2.0       *
                 1.8       *
                 1.5       *
Relative         1.3  *    *   * **
Entropy          1.1 ** *  **  ****
(15.0 bits)      0.9 ** *  ** *****
                 0.7 ** * ********* *
                 0.4 ** *********** *
                 0.2 ** *********** *
                 0.0 ----------------

Multilevel           TGACGTTGGTGGTGAC
consensus             CGT G  T TT  C
sequence               T           G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ----------------
bd1095                      485  2.36e-08 TATCAATCGG TGACCTTGTTGGTGAC
36522                       171  4.04e-08 TGGGTGGTGG TGGTGGTGGTGGTGGC TGGTGGGGAG
263598                      340  2.40e-07 GATATGGAAT TCGCGTTGTTTGTGGC TGGATTGAAG
268576                      131  3.18e-07 CGACATCATC TGGTGGTGGTGGTGGG CTCGGCGAAG
24641                       181  6.00e-07 CACATTAATG TCACGTTCTTGTTGAC AACTAGAGCA
17396                       293  1.07e-06 GATGACGGCT TGCCATTGATGTTGCC TGATGACGAA
17337                       293  1.07e-06 GATGACGACT TGCCATTGATGTTGCC TGATGACGAA
32983                         5  1.48e-06       CAGT TGTCTTTGTCGGTGAC GGAGGCAGTC
37960                        42  2.27e-06 AGAATGAGCG TGACGATGTTTGTCCC TGGAAAGAAG
4359                         43  7.79e-06 GTTCTCTCCC TCGTGGTGGTGTTGCT CTGACTCCAA
bd722                       274  1.64e-05 GATTCGAGTT GGACGTTGGAGGTCGC GTGCGGTGGT
264258                      345  1.77e-05 GTATCGTTCA TGTCGTTGGAGGGGAA GATAATCACG
8069                        300  2.38e-05 TCGTCCAAAT TGCCTTTTGTTTTGCG AGAATACTTT
bd1094                      197  2.96e-05 AAGGAGTCAG CGACAATGTTGTTGAA GCCGTGCTTG
6837                        390  2.96e-05 TTCAACCTGC TGGCCGTGTCTGTGTG CAGTATAGAA
10777                       394  3.41e-05 CAGTCCATCA TCATCGTCGAGTTGCC ATCGCGAGCT
4308                         25  7.34e-05 TTACTATCAT CGTCGTTTGTTGTTAC TTTTGTAGCG
15610                       153  7.79e-05 TGGAGAAAGT TGTTGATCTTGTCGTC TTCATCCAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd1095                            2.4e-08  484_[+2]
36522                               4e-08  170_[+2]_314
263598                            2.4e-07  339_[+2]_145
268576                            3.2e-07  130_[+2]_354
24641                               6e-07  180_[+2]_304
17396                             1.1e-06  292_[+2]_192
17337                             1.1e-06  292_[+2]_192
32983                             1.5e-06  4_[+2]_480
37960                             2.3e-06  41_[+2]_443
4359                              7.8e-06  42_[+2]_442
bd722                             1.6e-05  273_[+2]_211
264258                            1.8e-05  344_[+2]_140
8069                              2.4e-05  299_[+2]_185
bd1094                              3e-05  196_[+2]_288
6837                                3e-05  389_[+2]_95
10777                             3.4e-05  393_[+2]_91
4308                              7.3e-05  24_[+2]_460
15610                             7.8e-05  152_[+2]_332
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=18
bd1095                   (  485) TGACCTTGTTGGTGAC  1
36522                    (  171) TGGTGGTGGTGGTGGC  1
263598                   (  340) TCGCGTTGTTTGTGGC  1
268576                   (  131) TGGTGGTGGTGGTGGG  1
24641                    (  181) TCACGTTCTTGTTGAC  1
17396                    (  293) TGCCATTGATGTTGCC  1
17337                    (  293) TGCCATTGATGTTGCC  1
32983                    (    5) TGTCTTTGTCGGTGAC  1
37960                    (   42) TGACGATGTTTGTCCC  1
4359                     (   43) TCGTGGTGGTGTTGCT  1
bd722                    (  274) GGACGTTGGAGGTCGC  1
264258                   (  345) TGTCGTTGGAGGGGAA  1
8069                     (  300) TGCCTTTTGTTTTGCG  1
bd1094                   (  197) CGACAATGTTGTTGAA  1
6837                     (  390) TGGCCGTGTCTGTGTG  1
10777                    (  394) TCATCGTCGAGTTGCC  1
4308                     (   25) CGTCGTTTGTTGTTAC  1
15610                    (  153) TGTTGATCTTGTCGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 8.91886 E= 2.2e-004
 -1081   -110   -198    168
 -1081    -10    183  -1081
    24    -52     34    -23
 -1081    160  -1081     10
   -76    -52    134   -123
   -76  -1081     34    110
 -1081  -1081  -1081    194
 -1081    -52    172   -123
  -135  -1081    102     77
   -76   -110  -1081    147
 -1081  -1081    172     10
 -1081  -1081    134     77
 -1081   -210   -198    177
 -1081   -110    193   -222
    24     48      2   -123
  -135    148    -40   -222
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 18 E= 2.2e-004
 0.000000  0.111111  0.055556  0.833333
 0.000000  0.222222  0.777778  0.000000
 0.333333  0.166667  0.277778  0.222222
 0.000000  0.722222  0.000000  0.277778
 0.166667  0.166667  0.555556  0.111111
 0.166667  0.000000  0.277778  0.555556
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.722222  0.111111
 0.111111  0.000000  0.444444  0.444444
 0.166667  0.111111  0.000000  0.722222
 0.000000  0.000000  0.722222  0.277778
 0.000000  0.000000  0.555556  0.444444
 0.000000  0.055556  0.055556  0.888889
 0.000000  0.111111  0.833333  0.055556
 0.333333  0.333333  0.222222  0.111111
 0.111111  0.666667  0.166667  0.055556
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[GC][AGT][CT]G[TG]TG[GT]T[GT][GT]TG[ACG]C
--------------------------------------------------------------------------------

Time 5.90 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME    width = 21  sites = 5  llr = 102  E-value = 4.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::2::2::4:::2::428:::
pos.-specific     C  ::26:4a::::82::28::2:
probability       G  :a62a2:a:a826:8:::2::
matrix            T  a::2:2::6:2::a24:288a

         bits    2.2  *  *  * *
                 2.0 **  * ** *   *      *
                 1.8 **  * ** *   *      *
                 1.5 **  * ** *   *      *
Relative         1.3 **  * ** *** ** * ***
Entropy          1.1 **  * ** *** ** *****
(29.4 bits)      0.9 **  * ****** ** *****
                 0.7 ***** ********* *****
                 0.4 ***** ***************
                 0.2 ***** ***************
                 0.0 ---------------------

Multilevel           TGGCGCCGTGGCGTGACATTT
consensus              AG A  A TGA TTATGC
sequence               CT G      C  C
                          T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
17396                        28  1.92e-13 ATTTTTTCTT TGGCGCCGTGGCGTGACATTT GACCCGTGAC
17337                        35  4.02e-12 ATTTTATCTT TGGCGCCGTGGCGTGACTTTT GACCCGTGAC
36522                       378  2.98e-10 AGGCAACGCG TGCCGGCGTGGGCTGTCATTT TTTCACTTAG
bd1095                      232  3.43e-09 CTCTCAATGG TGGTGTCGAGGCGTTCAAGTT TGAGAGTCTA
bd722                       202  4.37e-09 ATTAACAGAC TGAGGACGAGTCATGTCATCT CGTTGCATCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17396                             1.9e-13  27_[+3]_452
17337                               4e-12  34_[+3]_445
36522                               3e-10  377_[+3]_102
bd1095                            3.4e-09  231_[+3]_248
bd722                             4.4e-09  201_[+3]_278
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
17396                    (   28) TGGCGCCGTGGCGTGACATTT  1
17337                    (   35) TGGCGCCGTGGCGTGACTTTT  1
36522                    (  378) TGCCGGCGTGGGCTGTCATTT  1
bd1095                   (  232) TGGTGTCGAGGCGTTCAAGTT  1
bd722                    (  202) TGAGGACGAGTCATGTCATCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 11.0057 E= 4.4e-001
  -897   -897   -897    194
  -897   -897    219   -897
   -50    -25    145   -897
  -897    133    -13    -38
  -897   -897    219   -897
   -50     75    -13    -38
  -897    207   -897   -897
  -897   -897    219   -897
    50   -897   -897    120
  -897   -897    219   -897
  -897   -897    187    -38
  -897    175    -13   -897
   -50    -25    145   -897
  -897   -897   -897    194
  -897   -897    187    -38
    50    -25   -897     62
   -50    175   -897   -897
   150   -897   -897    -38
  -897   -897    -13    162
  -897    -25   -897    162
  -897   -897   -897    194
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 4.4e-001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.200000  0.600000  0.000000
 0.000000  0.600000  0.200000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.400000  0.200000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.800000  0.200000  0.000000
 0.200000  0.200000  0.600000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.800000  0.200000
 0.400000  0.200000  0.000000  0.400000
 0.200000  0.800000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TG[GAC][CGT]G[CAGT]CG[TA]G[GT][CG][GAC]T[GT][ATC][CA][AT][TG][TC]T
--------------------------------------------------------------------------------

Time 8.69 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10777                            5.18e-07  345_[+1(5.14e-10)]_27_\
    [+2(3.41e-05)]_67_[+1(6.16e-05)]_3
15610                            4.64e-05  152_[+2(7.79e-05)]_215_\
    [+1(3.47e-08)]_96
17337                            2.58e-17  34_[+3(4.02e-12)]_26_[+1(5.02e-11)]_\
    35_[+1(8.43e-05)]_134_[+2(1.07e-06)]_192
17396                            3.62e-19  27_[+3(1.92e-13)]_26_[+1(1.23e-11)]_\
    35_[+1(6.30e-05)]_141_[+2(1.07e-06)]_192
24641                            5.68e-08  180_[+2(6.00e-07)]_249_\
    [+1(1.53e-09)]_34
263598                           1.31e-03  339_[+2(2.40e-07)]_145
264258                           3.55e-03  140_[+1(3.94e-05)]_183_\
    [+2(1.77e-05)]_140
268576                           3.03e-03  130_[+2(3.18e-07)]_354
32983                            5.96e-03  4_[+2(1.48e-06)]_480
36522                            4.38e-10  170_[+2(4.04e-08)]_117_\
    [+2(6.51e-05)]_58_[+3(2.98e-10)]_102
37960                            8.24e-09  41_[+2(2.27e-06)]_285_\
    [+1(2.36e-09)]_23_[+3(4.29e-05)]_19_[+1(3.33e-06)]_16_[+1(3.30e-05)]_16
4308                             6.71e-03  24_[+2(7.34e-05)]_156_\
    [+1(3.65e-05)]_283
4359                             3.27e-02  42_[+2(7.79e-06)]_442
6837                             3.65e-06  389_[+2(2.96e-05)]_62_\
    [+1(3.26e-09)]_12
8069                             2.26e-03  161_[+3(5.66e-05)]_117_\
    [+2(2.38e-05)]_185
bd1094                           1.97e-06  196_[+2(2.96e-05)]_49_\
    [+1(2.04e-09)]_218
bd1095                           6.03e-09  231_[+3(3.43e-09)]_232_\
    [+2(2.36e-08)]
bd722                            1.98e-06  201_[+3(4.37e-09)]_51_\
    [+2(1.64e-05)]_211
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************