********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/328/328.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42447                    1.0000    500  46343                    1.0000    500
13175                    1.0000    500  13951                    1.0000    500
54765                    1.0000    500  47663                    1.0000    500
48166                    1.0000    500  48800                    1.0000    500
49717                    1.0000    500  10581                    1.0000    500
10260                    1.0000    500  25840                    1.0000    500
44531                    1.0000    500  44767                    1.0000    500
34403                    1.0000    500  26742                    1.0000    500
34805                    1.0000    500  45356                    1.0000    500
12655                    1.0000    500  50499                    1.0000    500
45352                    1.0000    500  49722                    1.0000    500
47783                    1.0000    500  47807                    1.0000    500
49855                    1.0000    500  45040                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/328/328.seqs.fa -oc motifs/328 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       26    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           13000    N=              26
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.277 C 0.240 G 0.221 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.277 C 0.240 G 0.221 T 0.262
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  19  sites =  13  llr = 165  E-value = 2.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  5::2215137:1::25465
pos.-specific     C  3:::7:291:92:4822:5
probability       G  :18:291:4:15:2:12::
matrix            T  2928::2:23:2a5:224:

         bits    2.2
                 2.0             *
                 1.7      * *  * *
                 1.5  *   * *  * *
Relative         1.3  *** * *  * * *
Entropy          1.1  *** * * ** * *
(18.3 bits)      0.9  ***** * ** * *  **
                 0.7  ***** * ** ***  **
                 0.4 ****** * ** ***  **
                 0.2 **************** **
                 0.0 -------------------

Multilevel           ATGTCGACGACGTTCAAAA
consensus            C T   C AT C CACGTC
sequence                     T  T    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start  P-value             Site
-------------             ----- ---------            -------------------
10260                       332  2.24e-09 CGCAAGCGAC ATGTCGACAACCTTCATAA AAACAAGAAG
45356                       424  1.93e-08 GCAACACTAG ATGTCGTCGACCTTCACAC TGCCTTGTAA
48166                       308  4.46e-08 CACATGCGTC CTGTCGACATCTTCCAGTA AGTGAAGAAT
49717                       216  5.77e-08 GCGAGGCCGA ATTTCGACGACGTCCTCAA CTATATTGTT
34805                       381  3.71e-07 TGAACACACC ATGACGACTACGTTACTAC CTCGAATACA
13951                       448  4.92e-07 TCTTCCTTCG AGGTCGCCAACGTCCTATC GGAAGCTACG
44531                       119  5.38e-07 AGAAAAACAT CTGTAGACGTCGTTCGGTA CTGGAAACAT
54765                       389  6.43e-07 CGAACGAAGA CTGTGGCCCACTTTCAGAA CGCGATGCGA
45352                       167  1.45e-06 TCTATTTATC ATGTCGTATACTTGCAAAA GGATTTATCT
44767                        65  2.41e-06 TCAGCCGCGT TTGTCGACAAGCTGCCTAC AATAACACTT
25840                        13  3.59e-06 GTATGTATAC CTGTAAACGTCGTCAAATC GGACGATCTT
10581                       466  4.33e-06 GTGGACGAGT TTTACGCCGACATTCCAAA ACTCCGAGAC
47783                       122  5.18e-06 GACTGTATAT ATTTGGGCTTCGTCAAATC CATGTCGTAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10260                             2.2e-09  331_[+1]_150
45356                             1.9e-08  423_[+1]_58
48166                             4.5e-08  307_[+1]_174
49717                             5.8e-08  215_[+1]_266
34805                             3.7e-07  380_[+1]_101
13951                             4.9e-07  447_[+1]_34
44531                             5.4e-07  118_[+1]_363
54765                             6.4e-07  388_[+1]_93
45352                             1.5e-06  166_[+1]_315
44767                             2.4e-06  64_[+1]_417
25840                             3.6e-06  12_[+1]_469
10581                             4.3e-06  465_[+1]_16
47783                             5.2e-06  121_[+1]_360
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=13
10260                    (  332) ATGTCGACAACCTTCATAA  1
45356                    (  424) ATGTCGTCGACCTTCACAC  1
48166                    (  308) CTGTCGACATCTTCCAGTA  1
49717                    (  216) ATTTCGACGACGTCCTCAA  1
34805                    (  381) ATGACGACTACGTTACTAC  1
13951                    (  448) AGGTCGCCAACGTCCTATC  1
44531                    (  119) CTGTAGACGTCGTTCGGTA  1
54765                    (  389) CTGTGGCCCACTTTCAGAA  1
45352                    (  167) ATGTCGTATACTTGCAAAA  1
44767                    (   65) TTGTCGACAAGCTGCCTAC  1
25840                    (   13) CTGTAAACGTCGTCAAATC  1
10581                    (  466) TTTACGCCGACATTCCAAA  1
47783                    (  122) ATTTGGGCTTCGTCAAATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 12532 bayes= 10.4424 E= 2.3e+001
    96     36  -1035    -77
 -1035  -1035   -152    182
 -1035  -1035    180    -18
   -85  -1035  -1035    169
   -85    153    -53  -1035
  -184  -1035    206  -1035
    96     -6   -152    -77
  -184    194  -1035  -1035
    15   -164     80    -18
   132  -1035  -1035     23
 -1035    194   -152  -1035
  -184     -6    106    -18
 -1035  -1035  -1035    193
 -1035     68    -53     82
   -26    168  -1035  -1035
    96     -6   -152    -77
    48    -64      6    -18
   115  -1035  -1035     55
    96     94  -1035  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 13 E= 2.3e+001
 0.538462  0.307692  0.000000  0.153846
 0.000000  0.000000  0.076923  0.923077
 0.000000  0.000000  0.769231  0.230769
 0.153846  0.000000  0.000000  0.846154
 0.153846  0.692308  0.153846  0.000000
 0.076923  0.000000  0.923077  0.000000
 0.538462  0.230769  0.076923  0.153846
 0.076923  0.923077  0.000000  0.000000
 0.307692  0.076923  0.384615  0.230769
 0.692308  0.000000  0.000000  0.307692
 0.000000  0.923077  0.076923  0.000000
 0.076923  0.230769  0.461538  0.230769
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.384615  0.153846  0.461538
 0.230769  0.769231  0.000000  0.000000
 0.538462  0.230769  0.076923  0.153846
 0.384615  0.153846  0.230769  0.230769
 0.615385  0.000000  0.000000  0.384615
 0.538462  0.461538  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AC]T[GT]TCG[AC]C[GAT][AT]C[GCT]T[TC][CA][AC][AGT][AT][AC]
--------------------------------------------------------------------------------

Time 5.69 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   7  llr = 94  E-value = 5.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::4:1:::3:
pos.-specific     C  :::a6:::61::
probability       G  ::a:::1:::7:
matrix            T  aa:::a7a49:a

         bits    2.2   *
                 2.0 **** * *   *
                 1.7 **** * *   *
                 1.5 **** * *   *
Relative         1.3 **** * * ***
Entropy          1.1 ****** *****
(19.3 bits)      0.9 ************
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTGCCTTTCTGT
consensus                A   T A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start  P-value             Site
-------------             ----- ---------            ------------
49717                       327  5.72e-08 ACTCTCCACT TTGCCTTTCTGT CTTTTACTGG
34805                        86  1.86e-07 CTTTCGGGTG TTGCCTTTTTGT TCAAATACCG
13951                        48  2.58e-07 CTTCCAGGTG TTGCATTTTTGT GTTGGCGGTT
49722                       352  3.30e-07 AACAGCGGCT TTGCCTTTCTAT ATATACCAAA
45356                       292  1.10e-06 TCAACGTTTT TTGCATTTCCGT CTTTGCGTCA
12655                       218  1.16e-06 TCACCATTAT TTGCATGTTTGT GGATCAGGTC
49855                       293  1.51e-06 ATAGGCATTA TTGCCTATCTAT GTGCCTTCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49717                             5.7e-08  326_[+2]_162
34805                             1.9e-07  85_[+2]_403
13951                             2.6e-07  47_[+2]_441
49722                             3.3e-07  351_[+2]_137
45356                             1.1e-06  291_[+2]_197
12655                             1.2e-06  217_[+2]_271
49855                             1.5e-06  292_[+2]_196
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=7
49717                    (  327) TTGCCTTTCTGT  1
34805                    (   86) TTGCCTTTTTGT  1
13951                    (   48) TTGCATTTTTGT  1
49722                    (  352) TTGCCTTTCTAT  1
45356                    (  292) TTGCATTTCCGT  1
12655                    (  218) TTGCATGTTTGT  1
49855                    (  293) TTGCCTATCTAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12714 bayes= 10.6698 E= 5.7e+001
  -945   -945   -945    193
  -945   -945   -945    193
  -945   -945    217   -945
  -945    206   -945   -945
    63    125   -945   -945
  -945   -945   -945    193
   -95   -945    -63    144
  -945   -945   -945    193
  -945    125   -945     71
  -945    -75   -945    171
     5   -945    169   -945
  -945   -945   -945    193
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 5.7e+001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.000000  0.142857  0.714286
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.571429  0.000000  0.428571
 0.000000  0.142857  0.000000  0.857143
 0.285714  0.000000  0.714286  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TTGC[CA]TTT[CT]T[GA]T
--------------------------------------------------------------------------------

Time 11.11 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   7  llr = 121  E-value = 1.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  9:61:1:::911:::a1::::
pos.-specific     C  ::11369:::6:1:::1:473
probability       G  161:13:4a1:19:a:61::7
matrix            T  :4176:16::37:a::1963:

         bits    2.2         *     *
                 2.0         *    ***
                 1.7         *    ***
                 1.5       * *   ****
Relative         1.3 *     * **  **** *  *
Entropy          1.1 **    ****  **** ****
(24.9 bits)      0.9 ** *  **** ***** ****
                 0.7 ** ************* ****
                 0.4 ** ******************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AGATTCCTGACTGTGAGTTCG
consensus             T  CG G  T       CTC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start  P-value             Site
-------------             ----- ---------            ---------------------
10260                        45  1.20e-10 CTTTTCGACC ATTTTCCGGATTGTGAGTTCG AAAAAACGCG
49855                       174  1.62e-10 GTACTTGGAG AGATCCCGGACTGTGAGTTTC AGTATCGATT
45356                       249  2.80e-09 TCAACTATGA AGCTCGCTGATTGTGAATTCG AAAAGATTGT
54765                       248  8.04e-09 AATTGATCTA ATATTCTGGGCAGTGAGTCCG ACGAATGCGC
48800                       155  1.20e-08 AATTTTTGAT GGAATACTGACTGTGACTTCG CGCGGAAGGC
10581                       385  2.98e-08 GGAGAGGTAA AGACTCCTGAAGGTGATTCTG ATACAGATCA
25840                       268  5.44e-08 ACTGCATCTC ATGTGGCTGACTCTGAGGCCC ATCAAAATCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10260                             1.2e-10  44_[+3]_435
49855                             1.6e-10  173_[+3]_306
45356                             2.8e-09  248_[+3]_231
54765                               8e-09  247_[+3]_232
48800                             1.2e-08  154_[+3]_325
10581                               3e-08  384_[+3]_95
25840                             5.4e-08  267_[+3]_212
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=7
10260                    (   45) ATTTTCCGGATTGTGAGTTCG  1
49855                    (  174) AGATCCCGGACTGTGAGTTTC  1
45356                    (  249) AGCTCGCTGATTGTGAATTCG  1
54765                    (  248) ATATTCTGGGCAGTGAGTCCG  1
48800                    (  155) GGAATACTGACTGTGACTTCG  1
10581                    (  385) AGACTCCTGAAGGTGATTCTG  1
25840                    (  268) ATGTGGCTGACTCTGAGGCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12480 bayes= 11.4052 E= 1.1e+002
   163   -945    -63   -945
  -945   -945    137     71
   105    -75    -63    -87
   -95    -75   -945    144
  -945     25    -63    112
   -95    125     37   -945
  -945    184   -945    -87
  -945   -945     95    112
  -945   -945    217   -945
   163   -945    -63   -945
   -95    125   -945     12
   -95   -945    -63    144
  -945    -75    195   -945
  -945   -945   -945    193
  -945   -945    217   -945
   185   -945   -945   -945
   -95    -75    137    -87
  -945   -945    -63    171
  -945     84   -945    112
  -945    157   -945     12
  -945     25    169   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 1.1e+002
 0.857143  0.000000  0.142857  0.000000
 0.000000  0.000000  0.571429  0.428571
 0.571429  0.142857  0.142857  0.142857
 0.142857  0.142857  0.000000  0.714286
 0.000000  0.285714  0.142857  0.571429
 0.142857  0.571429  0.285714  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.000000  1.000000  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.142857  0.571429  0.000000  0.285714
 0.142857  0.000000  0.142857  0.714286
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.142857  0.571429  0.142857
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.428571  0.000000  0.571429
 0.000000  0.714286  0.000000  0.285714
 0.000000  0.285714  0.714286  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
A[GT]AT[TC][CG]C[TG]GA[CT]TGTGAGT[TC][CT][GC]
--------------------------------------------------------------------------------

Time 16.49 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42447                            4.30e-01  500
46343                            9.58e-01  500
13175                            7.22e-01  500
13951                            4.92e-06  47_[+2(2.58e-07)]_388_\
    [+1(4.92e-07)]_34
54765                            1.31e-07  247_[+3(8.04e-09)]_120_\
    [+1(6.43e-07)]_93
47663                            3.99e-02  130_[+3(7.90e-05)]_349
48166                            2.27e-04  307_[+1(4.46e-08)]_174
48800                            7.40e-05  154_[+3(1.20e-08)]_325
49717                            1.35e-07  215_[+1(5.77e-08)]_92_\
    [+2(5.72e-08)]_162
10581                            4.59e-06  384_[+3(2.98e-08)]_60_\
    [+1(4.33e-06)]_16
10260                            2.49e-11  44_[+3(1.20e-10)]_266_\
    [+1(2.24e-09)]_150
25840                            1.89e-06  12_[+1(3.59e-06)]_236_\
    [+3(5.44e-08)]_212
44531                            4.19e-03  118_[+1(5.38e-07)]_363
44767                            4.98e-03  64_[+1(2.41e-06)]_417
34403                            8.77e-01  500
26742                            4.91e-01  500
34805                            2.88e-06  85_[+2(1.86e-07)]_283_\
    [+1(3.71e-07)]_101
45356                            3.82e-12  31_[+3(2.66e-05)]_196_\
    [+3(2.80e-09)]_22_[+2(1.10e-06)]_120_[+1(1.93e-08)]_58
12655                            5.68e-04  75_[+1(7.35e-05)]_123_\
    [+2(1.16e-06)]_271
50499                            8.43e-01  500
45352                            4.35e-03  166_[+1(1.45e-06)]_315
49722                            5.67e-03  351_[+2(3.30e-07)]_137
47783                            6.31e-03  121_[+1(5.18e-06)]_360
47807                            4.05e-01  500
49855                            1.07e-08  173_[+3(1.62e-10)]_98_\
    [+2(1.51e-06)]_196
45040                            3.15e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************