********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/406/406.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42451                    1.0000    500  83                       1.0000    500
43052                    1.0000    500  46569                    1.0000    500
36842                    1.0000    500  38694                    1.0000    500
38702                    1.0000    500  32877                    1.0000    500
44326                    1.0000    500  44704                    1.0000    500
26714                    1.0000    500  34708                    1.0000    500
45099                    1.0000    500  45149                    1.0000    500
45418                    1.0000    500  46110                    1.0000    500
33646                    1.0000    500  44376                    1.0000    500
46370                    1.0000    500  46487                    1.0000    500
46499                    1.0000    500  46500                    1.0000    500
34815                    1.0000    500  45186                    1.0000    500
35853                    1.0000    500  41559                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/406/406.seqs.fa -oc motifs/406 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       26    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           13000    N=              26
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.242 G 0.219 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.242 G 0.219 T 0.268
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  19  llr = 191  E-value = 5.5e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :17:82:485::
pos.-specific     C  ::1a:22::2:4
probability       G  :9:::6:62:14
matrix            T  a:2:2:8::492

         bits    2.2
                 2.0 *  *
                 1.8 ** *
                 1.5 ** *      *
Relative         1.3 ** *    * *
Entropy          1.1 ** ** *** *
(14.5 bits)      0.9 ** ****** *
                 0.7 ********* *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGACAGTGAATC
consensus              T TACA T G
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
32877                        14  1.23e-07 CTTCCTCTAT TGACAGTGAATG TGTTTGAGTT
43052                        59  1.23e-07 CTGTTTGCTT TGACAGTGAATG TGTATCAAGA
26714                       105  3.98e-07 GCTATCTCAT TGACAGTAAATC GCTCTACGAA
44376                        49  4.70e-07 CCTAGAGTAG TGACAGTGAATT ATTCCAGTGG
45149                       461  8.02e-07 CCACAGTGAG TGACAGTGACTC TGCCAATAAA
46370                       441  4.55e-06 TTTACAGCTT TGACAGCGATTT TATCAGTCCG
34815                       169  5.98e-06 GCGACTACTA TAACAGTGAATC ATTTATCAAA
38702                       201  6.43e-06 TCGGTTGACC TGACACTAATTG TAATGTTGAG
38694                       201  6.43e-06 TCGGTTGACC TGACACTAATTG TAATGTTGAG
44326                       137  1.29e-05 AAAATCCTTC TAACAGTAAATG ACACTGCGTA
44704                       124  1.64e-05 GGACTCACAC TGTCAATGAATT GACGGGTGGT
41559                        29  1.81e-05 CTAGCTGATA TGTCTGTAATTC TGGAATTCCC
46487                       120  2.00e-05 CACAAGTCAC TGACAGTGAAGT CAAGGGATGG
46500                        61  2.48e-05 GTGGTTTGAT TGCCTGTAAATG ACCCAAGGCA
35853                       221  3.28e-05 ACAAAAACTA TGTCAATGGTTC GATAGCCAGT
33646                       221  3.28e-05 ACAAAAACTA TGTCAATGGTTC GATAGCCAGT
83                          206  5.49e-05 CGAACTACTT TGACAACAGTTC TAGCCAAGCA
45099                        80  6.14e-05 GTCCTCCAAA TGACTCCGACTC TAAGTTAAGC
34708                       261  6.78e-05 CGTACCGTGA TGCCTGCGACTG GAGCATTTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32877                             1.2e-07  13_[+1]_475
43052                             1.2e-07  58_[+1]_430
26714                               4e-07  104_[+1]_384
44376                             4.7e-07  48_[+1]_440
45149                               8e-07  460_[+1]_28
46370                             4.6e-06  440_[+1]_48
34815                               6e-06  168_[+1]_320
38702                             6.4e-06  200_[+1]_288
38694                             6.4e-06  200_[+1]_288
44326                             1.3e-05  136_[+1]_352
44704                             1.6e-05  123_[+1]_365
41559                             1.8e-05  28_[+1]_460
46487                               2e-05  119_[+1]_369
46500                             2.5e-05  60_[+1]_428
35853                             3.3e-05  220_[+1]_268
33646                             3.3e-05  220_[+1]_268
83                                5.5e-05  205_[+1]_283
45099                             6.1e-05  79_[+1]_409
34708                             6.8e-05  260_[+1]_228
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=19
32877                    (   14) TGACAGTGAATG  1
43052                    (   59) TGACAGTGAATG  1
26714                    (  105) TGACAGTAAATC  1
44376                    (   49) TGACAGTGAATT  1
45149                    (  461) TGACAGTGACTC  1
46370                    (  441) TGACAGCGATTT  1
34815                    (  169) TAACAGTGAATC  1
38702                    (  201) TGACACTAATTG  1
38694                    (  201) TGACACTAATTG  1
44326                    (  137) TAACAGTAAATG  1
44704                    (  124) TGTCAATGAATT  1
41559                    (   29) TGTCTGTAATTC  1
46487                    (  120) TGACAGTGAAGT  1
46500                    (   61) TGCCTGTAAATG  1
35853                    (  221) TGTCAATGGTTC  1
33646                    (  221) TGTCAATGGTTC  1
83                       (  206) TGACAACAGTTC  1
45099                    (   80) TGACTCCGACTC  1
34708                    (  261) TGCCTGCGACTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 12714 bayes= 10.2825 E= 5.5e-004
 -1089  -1089  -1089    190
  -137  -1089    203  -1089
   133   -120  -1089    -35
 -1089    205  -1089  -1089
   154  -1089  -1089    -35
   -37    -61    153  -1089
 -1089    -20  -1089    156
    44  -1089    153  -1089
   163  -1089    -47  -1089
    80    -61  -1089     46
 -1089  -1089   -205    182
 -1089     80     75    -35
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 19 E= 5.5e-004
 0.000000  0.000000  0.000000  1.000000
 0.105263  0.000000  0.894737  0.000000
 0.684211  0.105263  0.000000  0.210526
 0.000000  1.000000  0.000000  0.000000
 0.789474  0.000000  0.000000  0.210526
 0.210526  0.157895  0.631579  0.000000
 0.000000  0.210526  0.000000  0.789474
 0.368421  0.000000  0.631579  0.000000
 0.842105  0.000000  0.157895  0.000000
 0.473684  0.157895  0.000000  0.368421
 0.000000  0.000000  0.052632  0.947368
 0.000000  0.421053  0.368421  0.210526
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TG[AT]C[AT][GA][TC][GA]A[AT]T[CGT]
--------------------------------------------------------------------------------




Time  5.81 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  18  sites =   7  llr = 124  E-value = 8.3e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :9337:9:4:::::116a
pos.-specific     C  4::73::::a3::1::4:
probability       G  6:7::9::3::a::99::
matrix            T  :1:::11a3:7:a9::::

         bits    2.2            *
                 2.0        * * **    *
                 1.8        * * **    *
                 1.5      * * * ** ** *
Relative         1.3  **  *** * ***** *
Entropy          1.1 ******** ******* *
(25.6 bits)      0.9 ******** *********
                 0.7 ******** *********
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           GAGCAGATACTGTTGGAA
consensus            C AAC   G C     C
sequence                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
38702                       418  7.43e-11 AAAGTGGACA GAGCAGATTCTGTTGGAA TAAGAGAGCG
38694                       418  7.43e-11 AAAGTGGACA GAGCAGATTCTGTTGGAA TAAGAGAGCG
35853                       143  6.67e-10 TGAAAGTTGT CAACAGATACTGTTGGCA CCATCCGAAC
33646                       143  6.67e-10 TGAAAGTTGT CAACAGATACTGTTGGCA CCATCCGAAC
42451                       381  3.36e-08 ACAGGCGTCG GAGCCGATGCCGTCGACA GGTGCCATTT
45418                       128  3.69e-08 CATTGATTGT GAGAATATACCGTTAGAA AATGGGGAAA
45149                        53  4.06e-08 CCGGACATTG CTGACGTTGCTGTTGGAA GACGCTTCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38702                             7.4e-11  417_[+2]_65
38694                             7.4e-11  417_[+2]_65
35853                             6.7e-10  142_[+2]_340
33646                             6.7e-10  142_[+2]_340
42451                             3.4e-08  380_[+2]_102
45418                             3.7e-08  127_[+2]_355
45149                             4.1e-08  52_[+2]_430
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=7
38702                    (  418) GAGCAGATTCTGTTGGAA  1
38694                    (  418) GAGCAGATTCTGTTGGAA  1
35853                    (  143) CAACAGATACTGTTGGCA  1
33646                    (  143) CAACAGATACTGTTGGCA  1
42451                    (  381) GAGCCGATGCCGTCGACA  1
45418                    (  128) GAGAATATACCGTTAGAA  1
45149                    (   53) CTGACGTTGCTGTTGGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 12558 bayes= 11.4142 E= 8.3e-003
  -945     83    138   -945
   166   -945   -945    -90
     7   -945    170   -945
     7    156   -945   -945
   139     24   -945   -945
  -945   -945    197    -90
   166   -945   -945    -90
  -945   -945   -945    190
    66   -945     38      9
  -945    205   -945   -945
  -945     24   -945    142
  -945   -945    219   -945
  -945   -945   -945    190
  -945    -76   -945    168
   -93   -945    197   -945
   -93   -945    197   -945
   107     83   -945   -945
   188   -945   -945   -945
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 7 E= 8.3e-003
 0.000000  0.428571  0.571429  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.285714  0.000000  0.714286  0.000000
 0.285714  0.714286  0.000000  0.000000
 0.714286  0.285714  0.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.428571  0.000000  0.285714  0.285714
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.285714  0.000000  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.142857  0.000000  0.857143
 0.142857  0.000000  0.857143  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.571429  0.428571  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC]A[GA][CA][AC]GAT[AGT]C[TC]GTTGG[AC]A
--------------------------------------------------------------------------------




Time 11.33 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  19  sites =   9  llr = 143  E-value = 1.8e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a::1a711:a:::::4661
pos.-specific     C  :2a6:1398::72a143:1
probability       G  :3:3::::::7::::1:::
matrix            T  :4:::26:2:338:9:148

         bits    2.2
                 2.0 * * *    *   *
                 1.8 * * *    *   *
                 1.5 * * *  * *   *
Relative         1.3 * * *  ***   **
Entropy          1.1 * * *  ********
(22.9 bits)      0.9 * * *  ********  **
                 0.7 * *****************
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           ATCCAATCCAGCTCTAAAT
consensus             G G TC T TTC  CCT
sequence              C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
35853                       454  5.52e-10 GTTTACTCAG ACCCAACCCAGCTCTAATT TTTTGAGTTT
44376                       393  5.52e-10 GGCGCTGTAA AGCCAATCCAGCCCTCAAT AATCTCCACT
33646                       454  5.52e-10 GTTTACTCAG ACCCAACCCAGCTCTAATT TTTTGAGTTT
38702                        82  1.38e-08 AAGCAAAAGC ATCGATTCCATTTCTCAAT TTTGACGTCC
38694                        81  1.38e-08 AAGCAAAAGC ATCGATTCCATTTCTCAAT TTTGACGTCC
34815                       361  3.41e-08 AACCAGCATT ATCAAACCTAGCTCTACTT CAGGGTTTGC
41559                       201  2.26e-07 GCCTCGACTG AGCGACTCCAGTCCTACAC TCCCGGATCC
46487                        77  2.37e-07 CAACTCATCG AGCCAATCCATCTCCGCAA GTGTTGATAC
45186                       365  3.00e-07 CCGTCCATCC ATCCAAAATAGCTCTCTTT CTGTCTTCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35853                             5.5e-10  453_[+3]_28
44376                             5.5e-10  392_[+3]_89
33646                             5.5e-10  453_[+3]_28
38702                             1.4e-08  81_[+3]_400
38694                             1.4e-08  80_[+3]_401
34815                             3.4e-08  360_[+3]_121
41559                             2.3e-07  200_[+3]_281
46487                             2.4e-07  76_[+3]_405
45186                               3e-07  364_[+3]_117
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=9
35853                    (  454) ACCCAACCCAGCTCTAATT  1
44376                    (  393) AGCCAATCCAGCCCTCAAT  1
33646                    (  454) ACCCAACCCAGCTCTAATT  1
38702                    (   82) ATCGATTCCATTTCTCAAT  1
38694                    (   81) ATCGATTCCATTTCTCAAT  1
34815                    (  361) ATCAAACCTAGCTCTACTT  1
41559                    (  201) AGCGACTCCAGTCCTACAC  1
46487                    (   77) AGCCAATCCATCTCCGCAA  1
45186                    (  365) ATCCAAAATAGCTCTCTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 12532 bayes= 9.74375 E= 1.8e-001
   188   -982   -982   -982
  -982    -12     61     73
  -982    205   -982   -982
  -129    120     61   -982
   188   -982   -982   -982
   129   -112   -982    -27
  -129     46   -982    105
  -129    188   -982   -982
  -982    169   -982    -27
   188   -982   -982   -982
  -982   -982    161     32
  -982    146   -982     32
  -982    -12   -982    154
  -982    205   -982   -982
  -982   -112   -982    173
    71     88    -98   -982
   103     46   -982   -127
   103   -982   -982     73
  -129   -112   -982    154
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 9 E= 1.8e-001
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.222222  0.333333  0.444444
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.555556  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.111111  0.000000  0.222222
 0.111111  0.333333  0.000000  0.555556
 0.111111  0.888889  0.000000  0.000000
 0.000000  0.777778  0.000000  0.222222
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.222222  0.000000  0.777778
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.444444  0.444444  0.111111  0.000000
 0.555556  0.333333  0.000000  0.111111
 0.555556  0.000000  0.000000  0.444444
 0.111111  0.111111  0.000000  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
A[TGC]C[CG]A[AT][TC]C[CT]A[GT][CT][TC]CT[AC][AC][AT]T
--------------------------------------------------------------------------------




Time 16.89 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42451                            4.10e-04  380_[+2(3.36e-08)]_102
83                               7.96e-02  205_[+1(5.49e-05)]_283
43052                            4.83e-04  58_[+1(1.23e-07)]_430
46569                            3.00e-01  500
36842                            3.49e-01  500
38694                            4.82e-13  80_[+3(1.38e-08)]_101_\
    [+1(6.43e-06)]_205_[+2(7.43e-11)]_65
38702                            4.82e-13  81_[+3(1.38e-08)]_100_\
    [+1(6.43e-06)]_205_[+2(7.43e-11)]_65
32877                            2.74e-04  13_[+1(1.23e-07)]_475
44326                            8.21e-03  136_[+1(1.29e-05)]_58_\
    [+2(8.56e-05)]_276
44704                            5.16e-02  123_[+1(1.64e-05)]_365
26714                            3.53e-03  104_[+1(3.98e-07)]_384
34708                            3.66e-02  260_[+1(6.78e-05)]_228
45099                            1.70e-02  79_[+1(6.14e-05)]_409
45149                            7.41e-07  52_[+2(4.06e-08)]_390_\
    [+1(8.02e-07)]_28
45418                            1.92e-04  127_[+2(3.69e-08)]_355
46110                            8.95e-01  500
33646                            8.46e-13  142_[+2(6.67e-10)]_60_\
    [+1(3.28e-05)]_221_[+3(5.52e-10)]_28
44376                            1.35e-08  48_[+1(4.70e-07)]_332_\
    [+3(5.52e-10)]_89
46370                            9.57e-03  440_[+1(4.55e-06)]_48
46487                            6.33e-05  76_[+3(2.37e-07)]_24_[+1(2.00e-05)]_\
    369
46499                            2.91e-01  500
46500                            9.29e-02  60_[+1(2.48e-05)]_428
34815                            3.89e-06  168_[+1(5.98e-06)]_180_\
    [+3(3.41e-08)]_121
45186                            1.55e-03  364_[+3(3.00e-07)]_117
35853                            8.46e-13  142_[+2(6.67e-10)]_60_\
    [+1(3.28e-05)]_221_[+3(5.52e-10)]_28
41559                            5.17e-05  28_[+1(1.81e-05)]_160_\
    [+3(2.26e-07)]_281
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************