********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/94/94.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31619                    1.0000    500  42763                    1.0000    500
42800                    1.0000    500  17928                    1.0000    500
13789                    1.0000    500  4073                     1.0000    500
25379                    1.0000    500  19513                    1.0000    500
45992                    1.0000    500  49368                    1.0000    500
35784                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/94/94.seqs.fa -oc motifs/94 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.257 C 0.245 G 0.222 T 0.276
Background letter frequencies (from dataset with add-one prior applied):
A 0.257 C 0.245 G 0.222 T 0.276
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   7  llr = 113  E-value = 4.2e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1a171:13:11:::1:::1
pos.-specific     C  47:::493:1:39:9:::46
probability       G  61:93::634:3::19:::3
matrix            T  :::::41:44931a::aa6:

         bits    2.2
                 2.0   *          *  **
                 1.7   *          *  **
                 1.5   **  *     ******
Relative         1.3   **  *   * ******
Entropy          1.1 * *** *   * ******
(23.3 bits)      0.9 ***** *   * *******
                 0.7 ***** ** ** ********
                 0.4 *********** ********
                 0.2 *********** ********
                 0.0 --------------------

Multilevel           GCAGACCGTGTCCTCGTTTC
consensus            C   GT CAT G      CG
sequence                     G  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
35784                       317  1.59e-12 CCACCCGCGC GCAGACCGTGTCCTCGTTTC AACTGGGAGT
19513                       226  1.32e-08 GAAGATTGTC CGAGATCGTGTTCTCGTTCA AAGCCGCGAA
49368                       457  2.36e-08 GGAACTCCGA CAAGATCAATTGCTCGTTCC GTTTTCCCAA
4073                        139  2.36e-08 TCGAACATTC GCAGATCGGTTTTTGGTTTC GTTCGTCCCT
17928                       177  2.36e-08 CGATACGCGC GCAGGCCCAGAACTCGTTTC GAAAGGTGCG
13789                       367  3.72e-08 ATTTCTCTGT GCAGGACCGCTCCTCGTTCG CTCGGTACAG
31619                       402  1.95e-07 AACGCCAATT CCAAACTGTTTGCTCATTTG GCATTCTTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35784                             1.6e-12  316_[+1]_164
19513                             1.3e-08  225_[+1]_255
49368                             2.4e-08  456_[+1]_24
4073                              2.4e-08  138_[+1]_342
17928                             2.4e-08  176_[+1]_304
13789                             3.7e-08  366_[+1]_114
31619                             1.9e-07  401_[+1]_79
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=7
35784                    (  317) GCAGACCGTGTCCTCGTTTC  1
19513                    (  226) CGAGATCGTGTTCTCGTTCA  1
49368                    (  457) CAAGATCAATTGCTCGTTCC  1
4073                     (  139) GCAGATCGGTTTTTGGTTTC  1
17928                    (  177) GCAGGCCCAGAACTCGTTTC  1
13789                    (  367) GCAGGACCGCTCCTCGTTCG  1
31619                    (  402) CCAAACTGTTTGCTCATTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 10.1664 E= 4.2e+000
  -945     81    136   -945
   -85    154    -63   -945
   196   -945   -945   -945
   -85   -945    195   -945
   147   -945     37   -945
   -85     81   -945     63
  -945    181   -945    -95
   -85     22    136   -945
    15   -945     37     63
  -945    -78     95     63
   -85   -945   -945    163
   -85     22     37      5
  -945    181   -945    -95
  -945   -945   -945    185
  -945    181    -63   -945
   -85   -945    195   -945
  -945   -945   -945    185
  -945   -945   -945    185
  -945     81   -945    105
   -85    122     37   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 4.2e+000
 0.000000  0.428571  0.571429  0.000000
 0.142857  0.714286  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.142857  0.428571  0.000000  0.428571
 0.000000  0.857143  0.000000  0.142857
 0.142857  0.285714  0.571429  0.000000
 0.285714  0.000000  0.285714  0.428571
 0.000000  0.142857  0.428571  0.428571
 0.142857  0.000000  0.000000  0.857143
 0.142857  0.285714  0.285714  0.285714
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.857143  0.142857  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.428571  0.000000  0.571429
 0.142857  0.571429  0.285714  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[GC]CAG[AG][CT]C[GC][TAG][GT]T[CGT]CTCGTT[TC][CG]
--------------------------------------------------------------------------------

Time  1.09 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  14  sites =   9  llr = 102  E-value = 1.1e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  9a13281a1::1::
pos.-specific     C  1:::722:91a322
probability       G  ::761:6::::3:1
matrix            T  ::21::1::9:287

         bits    2.2
                 2.0  *     *  *
                 1.7  *     *  *
                 1.5 **     ** *
Relative         1.3 **   * ****
Entropy          1.1 **   * **** *
(16.3 bits)      0.9 *** ** **** *
                 0.7 ****** **** **
                 0.4 *********** **
                 0.2 **************
                 0.0 --------------

Multilevel           AAGGCAGACTCCTT
consensus              TAACC    GCC
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
4073                        184  4.53e-08 CGCTGCGCCC AAGGAAGACTCGTT CCTCTCATAC
19513                       445  2.15e-07 GACTTTTCGT AAGGCACACTCCTC TGAAGACAAC
42763                       298  6.04e-07 AGGAGAGAGA CAGACAGACTCCTT GGCGAGAAAC
42800                       437  1.19e-06 TACTCAAGAA AAGGCACACTCACT ATCAGTTGGC
35784                        27  4.86e-06 GTTCTCGGGA AAAGCCGACTCGTG TTAGAAAAAC
45992                       468  6.17e-06 TTCTTACATT AATACATACTCGCT AGTAGTTGTG
31619                       101  8.29e-06 AGCTGCAGAC AAGTAAGAATCCTT TACTCTCATG
25379                       479  9.49e-06 TATCCAAGGC AAGGCAAACCCTTC CCCAAAAC
13789                       269  1.01e-05 CGGGTCGTAC AATAGCGACTCTTT CGCTTCAATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4073                              4.5e-08  183_[+2]_303
19513                             2.1e-07  444_[+2]_42
42763                               6e-07  297_[+2]_189
42800                             1.2e-06  436_[+2]_50
35784                             4.9e-06  26_[+2]_460
45992                             6.2e-06  467_[+2]_19
31619                             8.3e-06  100_[+2]_386
25379                             9.5e-06  478_[+2]_8
13789                               1e-05  268_[+2]_218
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=9
4073                     (  184) AAGGAAGACTCGTT  1
19513                    (  445) AAGGCACACTCCTC  1
42763                    (  298) CAGACAGACTCCTT  1
42800                    (  437) AAGGCACACTCACT  1
35784                    (   27) AAAGCCGACTCGTG  1
45992                    (  468) AATACATACTCGCT  1
31619                    (  101) AAGTAAGAATCCTT  1
25379                    (  479) AAGGCAAACCCTTC  1
13789                    (  269) AATAGCGACTCTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 5357 bayes= 9.34938 E= 1.1e+003
   179   -114   -982   -982
   196   -982   -982   -982
  -121   -982    159    -31
    37   -982    132   -131
   -21    144    -99   -982
   160    -14   -982   -982
  -121    -14    132   -131
   196   -982   -982   -982
  -121    186   -982   -982
  -982   -114   -982    168
  -982    203   -982   -982
  -121     45     59    -31
  -982    -14   -982    149
  -982    -14    -99    127
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 9 E= 1.1e+003
 0.888889  0.111111  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.111111  0.000000  0.666667  0.222222
 0.333333  0.000000  0.555556  0.111111
 0.222222  0.666667  0.111111  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.111111  0.222222  0.555556  0.111111
 1.000000  0.000000  0.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.333333  0.333333  0.222222
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.222222  0.111111  0.666667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
AA[GT][GA][CA][AC][GC]ACTC[CGT][TC][TC]
--------------------------------------------------------------------------------

Time  2.26 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  11  llr = 106  E-value = 4.0e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  523:3:2::9a2
pos.-specific     C  3::31::::1:8
probability       G  :3351:5aa:::
matrix            T  35525a4:::::

         bits    2.2        **
                 2.0      * ** *
                 1.7      * ** *
                 1.5      * ****
Relative         1.3      * *****
Entropy          1.1      * *****
(13.9 bits)      0.9      * *****
                 0.7    * * *****
                 0.4 **** *******
                 0.2 ************
                 0.0 ------------

Multilevel           ATTGTTGGGAAC
consensus            CGACA T
sequence             T G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
49368                       440  3.36e-07 AAACAGGTTG ATAGTTGGGAAC TCCGACAAGA
25379                       416  5.95e-07 TCCAAGGCTT CTGGTTGGGAAC AGGTACAGAC
31619                       242  2.12e-06 TCTTCGATGA ATTCTTTGGAAC CTTCCAAGGA
19513                       194  3.74e-06 ATTCTTTTGT AGAGATGGGAAC GGCAAGAATG
13789                       308  1.21e-05 TAGTCGATAT CTTTTTTGGAAC TGACCTCTAA
42800                       199  1.49e-05 CTTCCGGGTC TGTTTTGGGAAC TAAAAGACCA
4073                        293  2.80e-05 AGAAGCTGAC CTTGATTGGAAA GTCAATAACA
35784                        95  4.11e-05 GCAAGCAGAT AGGGCTAGGAAC TCGTTTGGTT
42763                       265  4.11e-05 AACGATGCGG TAGGGTGGGAAC GAATGCCGGG
17928                       241  7.02e-05 GCGGTGGGGG AAACTTTGGAAA CACTGTCAGA
45992                       249  1.20e-04 ACCGAACTAC TTTCATAGGCAC TCTTCAACAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49368                             3.4e-07  439_[+3]_49
25379                               6e-07  415_[+3]_73
31619                             2.1e-06  241_[+3]_247
19513                             3.7e-06  193_[+3]_295
13789                             1.2e-05  307_[+3]_181
42800                             1.5e-05  198_[+3]_290
4073                              2.8e-05  292_[+3]_196
35784                             4.1e-05  94_[+3]_394
42763                             4.1e-05  264_[+3]_224
17928                               7e-05  240_[+3]_248
45992                             0.00012  248_[+3]_240
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=11
49368                    (  440) ATAGTTGGGAAC  1
25379                    (  416) CTGGTTGGGAAC  1
31619                    (  242) ATTCTTTGGAAC  1
19513                    (  194) AGAGATGGGAAC  1
13789                    (  308) CTTTTTTGGAAC  1
42800                    (  199) TGTTTTGGGAAC  1
4073                     (  293) CTTGATTGGAAA  1
35784                    (   95) AGGGCTAGGAAC  1
42763                    (  265) TAGGGTGGGAAC  1
17928                    (  241) AAACTTTGGAAA  1
45992                    (  249) TTTCATAGGCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 9.28648 E= 4.0e+002
    82     16  -1010     -2
   -50  -1010     30     98
     8  -1010     30     72
 -1010     16    130    -60
     8   -143   -128     98
 -1010  -1010  -1010    185
   -50  -1010    104     40
 -1010  -1010    217  -1010
 -1010  -1010    217  -1010
   182   -143  -1010  -1010
   196  -1010  -1010  -1010
   -50    174  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 4.0e+002
 0.454545  0.272727  0.000000  0.272727
 0.181818  0.000000  0.272727  0.545455
 0.272727  0.000000  0.272727  0.454545
 0.000000  0.272727  0.545455  0.181818
 0.272727  0.090909  0.090909  0.545455
 0.000000  0.000000  0.000000  1.000000
 0.181818  0.000000  0.454545  0.363636
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.909091  0.090909  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.181818  0.818182  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[ACT][TG][TAG][GC][TA]T[GT]GGAAC
--------------------------------------------------------------------------------

Time  3.39 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31619                            1.01e-07  100_[+2(8.29e-06)]_127_\
    [+3(2.12e-06)]_148_[+1(1.95e-07)]_79
42763                            4.98e-04  264_[+3(4.11e-05)]_21_\
    [+2(6.04e-07)]_189
42800                            2.03e-04  198_[+3(1.49e-05)]_226_\
    [+2(1.19e-06)]_50
17928                            2.22e-05  176_[+1(2.36e-08)]_44_\
    [+3(7.02e-05)]_248
13789                            1.29e-07  268_[+2(1.01e-05)]_25_\
    [+3(1.21e-05)]_47_[+1(3.72e-08)]_114
4073                             1.28e-09  138_[+1(2.36e-08)]_25_\
    [+2(4.53e-08)]_95_[+3(2.80e-05)]_196
25379                            1.40e-04  9_[+3(5.65e-05)]_394_[+3(5.95e-07)]_\
    51_[+2(9.49e-06)]_8
19513                            4.90e-10  193_[+3(3.74e-06)]_20_\
    [+1(1.32e-08)]_41_[+1(7.44e-05)]_138_[+2(2.15e-07)]_42
45992                            7.18e-03  467_[+2(6.17e-06)]_19
49368                            2.06e-07  439_[+3(3.36e-07)]_5_[+1(2.36e-08)]_\
    24
35784                            1.84e-11  26_[+2(4.86e-06)]_54_[+3(4.11e-05)]_\
    210_[+1(1.59e-12)]_164
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************