********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/44/44.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43560                    1.0000    500  51528                    1.0000    500
43747                    1.0000    500  49112                    1.0000    500
50507                    1.0000    500  44793                    1.0000    500
26775                    1.0000    500  45714                    1.0000    500
27361                    1.0000    500  34622                    1.0000    500
34885                    1.0000    500  46690                    1.0000    500
48087                    1.0000    500  47842                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/44/44.seqs.fa -oc motifs/44 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.251 G 0.218 T 0.258
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.251 G 0.218 T 0.258
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  12  sites =  14  llr = 138  E-value = 2.5e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1:a:4:::a51
pos.-specific     C  516:a1:28:15
probability       G  11::::a:::42
matrix            T  474::5:82::2

         bits    2.2       *
                 2.0    ** *  *
                 1.8    ** *  *
                 1.5    ** *  *
Relative         1.3    ** ****
Entropy          1.1   *** ****
(14.3 bits)      0.9   *** ****
                 0.7 ***********
                 0.4 ***********
                 0.2 ************
                 0.0 ------------

Multilevel           CTCACTGTCAAC
consensus            T T  A CT GG
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47842                       387  2.17e-07 TCGGGAGGTT CTCACAGTCAAC GAGCTTGCTC
46690                       245  2.17e-07 GAGATAATAC CTCACAGTCAAC TACACCCTAT
45714                       378  8.73e-07 TCACAACAGC TTCACAGTCAGC TTGGACATCA
34885                       400  1.63e-06 TTGCGACGGT CTCACTGTCAGT CAATATCCGA
50507                       332  1.63e-06 TTTATTCGTA CTCACTGTCAGT ATTTGACTCC
48087                       149  1.46e-05 TTGAAGTTCG CTTACTGTTAGG TCGGCAAAAA
51528                       316  1.46e-05 ATACATATCC CCTACTGTCAAC GCCGATGTCG
43560                       304  1.81e-05 TGCGACGTAT TTTACAGCCAAG TCTTGTTCAA
27361                        84  2.28e-05 TACATTTATG TTTACAGTTAGG TTCCGTGAAG
49112                       205  2.56e-05 CGGAATCACT GGCACTGTCAAC AACGAAAGCG
44793                       263  4.20e-05 TGGAAAGTCC TATACTGTCAAT CCACAATGAG
34622                       107  4.54e-05 TAGTGTCCTG TGTACCGTCAAC CAACATCGTG
43747                        41  5.44e-05 CTGTAATTGA TTCACAGCTACC AAAGTTTAGT
26775                       446  6.59e-05 CCAACAACTC CTCACTGCCACA GTTCCTTTGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47842                             2.2e-07  386_[+1]_102
46690                             2.2e-07  244_[+1]_244
45714                             8.7e-07  377_[+1]_111
34885                             1.6e-06  399_[+1]_89
50507                             1.6e-06  331_[+1]_157
48087                             1.5e-05  148_[+1]_340
51528                             1.5e-05  315_[+1]_173
43560                             1.8e-05  303_[+1]_185
27361                             2.3e-05  83_[+1]_405
49112                             2.6e-05  204_[+1]_284
44793                             4.2e-05  262_[+1]_226
34622                             4.5e-05  106_[+1]_382
43747                             5.4e-05  40_[+1]_448
26775                             6.6e-05  445_[+1]_43
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=14
47842                    (  387) CTCACAGTCAAC  1
46690                    (  245) CTCACAGTCAAC  1
45714                    (  378) TTCACAGTCAGC  1
34885                    (  400) CTCACTGTCAGT  1
50507                    (  332) CTCACTGTCAGT  1
48087                    (  149) CTTACTGTTAGG  1
51528                    (  316) CCTACTGTCAAC  1
43560                    (  304) TTTACAGCCAAG  1
27361                    (   84) TTTACAGTTAGG  1
49112                    (  205) GGCACTGTCAAC  1
44793                    (  263) TATACTGTCAAT  1
34622                    (  107) TGTACCGTCAAC  1
43747                    (   41) TTCACAGCTACC  1
26775                    (  446) CTCACTGCCACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.53747 E= 2.5e-002
 -1045     99   -161     73
  -193   -181    -61    147
 -1045    119  -1045     73
   187  -1045  -1045  -1045
 -1045    199  -1045  -1045
    65   -181  -1045     95
 -1045  -1045    219  -1045
 -1045    -23  -1045    161
 -1045    165  -1045    -27
   187  -1045  -1045  -1045
    87    -81     71  -1045
  -193     99     -3    -27
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 2.5e-002
 0.000000  0.500000  0.071429  0.428571
 0.071429  0.071429  0.142857  0.714286
 0.000000  0.571429  0.000000  0.428571
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.071429  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.214286  0.000000  0.785714
 0.000000  0.785714  0.000000  0.214286
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.142857  0.357143  0.000000
 0.071429  0.500000  0.214286  0.214286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[CT]T[CT]AC[TA]G[TC][CT]A[AG][CGT]
--------------------------------------------------------------------------------




Time  1.84 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  12  sites =  10  llr = 109  E-value = 8.8e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1::2::::::a1
pos.-specific     C  6:4:4493:::7
probability       G  ::::6:161a::
matrix            T  3a68:6:19::2

         bits    2.2          *
                 2.0  *       **
                 1.8  *       **
                 1.5  *    * ***
Relative         1.3  * *  * ***
Entropy          1.1  ****** ***
(15.8 bits)      0.9  ***********
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CTTTGTCGTGAC
consensus            T CACC C   T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
43747                       324  4.84e-07 CATAGAATTG CTTTCCCGTGAC TACCACTTCG
47842                        82  5.86e-07 AGCTTAGTTT TTCTGTCGTGAC TGACGATACC
50507                       172  5.86e-07 CCGAGAAGAA TTTTGCCGTGAC TATCCCTGTT
43560                       389  2.17e-06 TACGATTGAC CTTACTCGTGAC GCGCTTTGTC
46690                       412  3.14e-06 ACTTAATCGA CTCTGTCTTGAC CTCAAAAGGC
27361                       139  3.61e-06 CGCAGTGACA CTCACTCGTGAC TCGGAAATGC
44793                       378  9.75e-06 CGGAGCACTT CTTTCTGCTGAC TCTAGAGCGC
48087                       328  1.05e-05 GAGTCGATTT ATTTGCCGTGAT CGATCTTTTC
26775                       462  1.64e-05 GCCACAGTTC CTTTGTCCGGAT AAGCTTGACG
34622                       409  1.95e-05 AAAAATTCTC TTCTGCCCTGAA GAACAACATA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43747                             4.8e-07  323_[+2]_165
47842                             5.9e-07  81_[+2]_407
50507                             5.9e-07  171_[+2]_317
43560                             2.2e-06  388_[+2]_100
46690                             3.1e-06  411_[+2]_77
27361                             3.6e-06  138_[+2]_350
44793                             9.8e-06  377_[+2]_111
48087                             1.1e-05  327_[+2]_161
26775                             1.6e-05  461_[+2]_27
34622                             1.9e-05  408_[+2]_80
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=10
43747                    (  324) CTTTCCCGTGAC  1
47842                    (   82) TTCTGTCGTGAC  1
50507                    (  172) TTTTGCCGTGAC  1
43560                    (  389) CTTACTCGTGAC  1
46690                    (  412) CTCTGTCTTGAC  1
27361                    (  139) CTCACTCGTGAC  1
44793                    (  378) CTTTCTGCTGAC  1
48087                    (  328) ATTTGCCGTGAT  1
26775                    (  462) CTTTGTCCGGAT  1
34622                    (  409) TTCTGCCCTGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.66888 E= 8.8e+001
  -145    126   -997     22
  -997   -997   -997    195
  -997     67   -997    122
   -45   -997   -997    163
  -997     67    146   -997
  -997     67   -997    122
  -997    184   -112   -997
  -997     26    146   -136
  -997   -997   -112    180
  -997   -997    219   -997
   187   -997   -997   -997
  -145    148   -997    -37
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 8.8e+001
 0.100000  0.600000  0.000000  0.300000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.000000  0.600000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.900000  0.100000  0.000000
 0.000000  0.300000  0.600000  0.100000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.100000  0.700000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[CT]T[TC][TA][GC][TC]C[GC]TGA[CT]
--------------------------------------------------------------------------------




Time  3.47 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  19  sites =   5  llr = 87  E-value = 3.9e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::2:a8::2a4:22:2:
pos.-specific     C  ::::4::::42:::22:::
probability       G  22a24a::862:6:44::a
matrix            T  88:8:::22:4::a22a8:

         bits    2.2   *  *            *
                 2.0   *  **    * *  * *
                 1.8   *  **    * *  * *
                 1.5   *  **    * *  * *
Relative         1.3 **** ** *  * *  ***
Entropy          1.1 **** ***** ***  ***
(25.0 bits)      0.9 **** ***** ***  ***
                 0.7 **** ***** ***  ***
                 0.4 ********** ***  ***
                 0.2 ********** ********
                 0.0 -------------------

Multilevel           TTGTCGAAGGTAGTGGTTG
consensus            GG GG  TTCA A AA A
sequence                 A     C   CC
                               G   TT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
34885                        38  1.33e-09 AAATGAAACA TGGGCGAAGGTAGTGGTTG CTCCTGCATG
45714                       113  4.08e-09 AGAAAAGAAT TTGTAGAAGGAAGTCTTTG TCAAGGATCA
26775                       207  4.08e-09 GCAGATGGAT TTGTGGAATCCAATGGTTG CCATCCCCCG
43747                       425  1.00e-08 AAACGTAGTT TTGTGGATGGGAATTCTTG ACACTTGACG
50507                       198  2.65e-08 CCTGTTTGAC GTGTCGAAGCTAGTAATAG TGGATACGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34885                             1.3e-09  37_[+3]_444
45714                             4.1e-09  112_[+3]_369
26775                             4.1e-09  206_[+3]_275
43747                               1e-08  424_[+3]_57
50507                             2.7e-08  197_[+3]_284
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=5
34885                    (   38) TGGGCGAAGGTAGTGGTTG  1
45714                    (  113) TTGTAGAAGGAAGTCTTTG  1
26775                    (  207) TTGTGGAATCCAATGGTTG  1
43747                    (  425) TTGTGGATGGGAATTCTTG  1
50507                    (  198) GTGTCGAAGCTAGTAATAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 6748 bayes= 10.649 E= 3.9e+002
  -897   -897    -13    163
  -897   -897    -13    163
  -897   -897    219   -897
  -897   -897    -13    163
   -45     67     87   -897
  -897   -897    219   -897
   187   -897   -897   -897
   155   -897   -897    -37
  -897   -897    187    -37
  -897     67    146   -897
   -45    -33    -13     63
   187   -897   -897   -897
    55   -897    146   -897
  -897   -897   -897    195
   -45    -33     87    -37
   -45    -33     87    -37
  -897   -897   -897    195
   -45   -897   -897    163
  -897   -897    219   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 3.9e+002
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.400000  0.400000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.400000  0.600000  0.000000
 0.200000  0.200000  0.200000  0.400000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.200000  0.400000  0.200000
 0.200000  0.200000  0.400000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[TG][TG]G[TG][CGA]GA[AT][GT][GC][TACG]A[GA]T[GACT][GACT]T[TA]G
--------------------------------------------------------------------------------




Time  5.13 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43560                            4.58e-04  303_[+1(1.81e-05)]_73_\
    [+2(2.17e-06)]_100
51528                            6.18e-02  315_[+1(1.46e-05)]_173
43747                            9.57e-09  40_[+1(5.44e-05)]_271_\
    [+2(4.84e-07)]_89_[+3(1.00e-08)]_57
49112                            8.20e-02  204_[+1(2.56e-05)]_284
50507                            1.11e-09  171_[+2(5.86e-07)]_14_\
    [+3(2.65e-08)]_115_[+1(1.63e-06)]_157
44793                            3.34e-03  262_[+1(4.20e-05)]_103_\
    [+2(9.75e-06)]_111
26775                            1.26e-07  206_[+3(4.08e-09)]_220_\
    [+1(6.59e-05)]_4_[+2(1.64e-05)]_27
45714                            1.42e-07  112_[+3(4.08e-09)]_246_\
    [+1(8.73e-07)]_111
27361                            7.61e-04  83_[+1(2.28e-05)]_43_[+2(3.61e-06)]_\
    350
34622                            5.69e-03  106_[+1(4.54e-05)]_290_\
    [+2(1.95e-05)]_80
34885                            1.28e-07  37_[+3(1.33e-09)]_343_\
    [+1(1.63e-06)]_89
46690                            1.91e-05  244_[+1(2.17e-07)]_155_\
    [+2(3.14e-06)]_77
48087                            1.33e-03  148_[+1(1.46e-05)]_167_\
    [+2(1.05e-05)]_161
47842                            2.75e-06  31_[+1(3.34e-05)]_38_[+2(5.86e-07)]_\
    293_[+1(2.17e-07)]_102
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
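
The regular expression reported for Motif 1 can be used directly to look for
candidate sites in other sequences. The following is a minimal sketch, not part
of the MEME output: the function name find_sites and the example string (the
flanks and site reported above for sequence 47842) are illustrative
assumptions, and only the plus strand is scanned, matching "strands: +" in the
command line summary.

```python
import re

# Regular expression copied from the "Motif 1 regular expression" section.
MOTIF1_REGEX = r"[CT]T[CT]AC[TA]G[TC][CT]A[AG][CGT]"

# Wrapping the pattern in a lookahead lets overlapping matches be reported too.
_SCANNER = re.compile("(?=(" + MOTIF1_REGEX + "))")


def find_sites(sequence):
    """Return (1-based start, matched site) pairs for every regex match.

    find_sites is a hypothetical helper written for this sketch.
    """
    return [(m.start() + 1, m.group(1)) for m in _SCANNER.finditer(sequence.upper())]


# Left flank + site + right flank, as listed for sequence 47842 in the
# "Motif 1 sites sorted by position p-value" table.
example = "TCGGGAGGTT" + "CTCACAGTCAAC" + "GAGCTTGCTC"
print(find_sites(example))
```

The expression appears to keep only the more frequent letters at each
position, so it is stricter than the probability matrix: the site reported for
sequence 49112 (GGCACTGTCAAC) starts with G and is not matched. For scoring
against the full matrix, MAST is the tool this file is meant to feed.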