********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/407/407.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10850                    1.0000    500  1943                     1.0000    500
20746                    1.0000    500  21158                    1.0000    500
22319                    1.0000    500  24357                    1.0000    500
24974                    1.0000    500  260980                   1.0000    500
262070                   1.0000    500  5028                     1.0000    500
5232                     1.0000    500  6698                     1.0000    500
7299                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/407/407.seqs.fa -oc motifs/407 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.253 C 0.240 G 0.241 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.253 C 0.240 G 0.241 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 9 llr = 133 E-value = 2.5e+001
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 2:1:::2::3:12:4:1:711 pos.-specific C 72166a:383a:682693397 probability G ::2:1::1:::42:14::::2 matrix T 18643:8623:4:22::7::: bits 2.1 * * 1.9 * * 1.6 * * * * 1.4 * * * * Relative 1.2 * ** * * * * * Entropy 1.0 * * ** * * * ***** (21.2 bits) 0.8 ** * ** * * * ****** 0.6 ** ****** **** ****** 0.4 ** *********** ****** 0.2 ********************* 0.0 --------------------- Multilevel CTTCCCTTCACGCCACCTACC consensus ACGTT ACTC TATCG CC G sequence T G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 24357 242 2.40e-10 AACTAGTAAG ATTTCCTTCTCGCCAGCTACC AGACAGTCGA 24974 369 2.63e-08 AGCCCCGGGC CTGCCCTGCCCTCTCGCTACC GTCCCTTGCC 21158 243 3.56e-08 GAGTGGCTTC CTGCGCTCCTCGCCGCCTCCC TCCCTTCCTA 10850 278 4.77e-08 AGACCTTCGT CCTTCCTCCCCACCAGCTCCG ACGTCCGACG 1943 189 5.75e-08 ACTTTTTCAA TTTTCCTTTACTCCTGCCACC TTCTCAATCT 20746 401 1.06e-07 GGGGTTATTG CTACTCTTCACGGCACACACC GAACTATCAC 5028 206 1.71e-07 TTCCCTCTGC ATCCTCATCTCGGCTCCTACC GCGGTGCGAG 5232 471 5.05e-07 GACGTAGGGC CTTCTCACCCCTACACCCCAG CCGAAGCAC 6698 278 5.35e-07 TCCCTCCTTT CCTTCCTTTACTATCCCTACA CCACCTCCAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 24357 2.4e-10 241_[+1]_238 24974 2.6e-08 368_[+1]_111 21158 
3.6e-08 242_[+1]_237 10850 4.8e-08 277_[+1]_202 1943 5.7e-08 188_[+1]_291 20746 1.1e-07 400_[+1]_79 5028 1.7e-07 205_[+1]_274 5232 5e-07 470_[+1]_9 6698 5.4e-07 277_[+1]_202 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=9 24357 ( 242) ATTTCCTTCTCGCCAGCTACC 1 24974 ( 369) CTGCCCTGCCCTCTCGCTACC 1 21158 ( 243) CTGCGCTCCTCGCCGCCTCCC 1 10850 ( 278) CCTTCCTCCCCACCAGCTCCG 1 1943 ( 189) TTTTCCTTTACTCCTGCCACC 1 20746 ( 401) CTACTCTTCACGGCACACACC 1 5028 ( 206) ATCCTCATCTCGGCTCCTACC 1 5232 ( 471) CTTCTCACCCCTACACCCCAG 1 6698 ( 278) CCTTCCTTTACTATCCCTACA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 9.56981 E= 2.5e+001 -19 147 -982 -126 -982 -11 -982 154 -119 -111 -11 106 -982 121 -982 74 -982 121 -111 32 -982 206 -982 -982 -19 -982 -982 154 -982 48 -111 106 -982 170 -982 -26 40 48 -982 32 -982 206 -982 -982 -119 -982 88 74 -19 121 -11 -982 -982 170 -982 -26 81 -11 -111 -26 -982 121 88 -982 -119 189 -982 -982 -982 48 -982 132 140 48 -982 -982 -119 189 -982 -982 -119 147 -11 -982 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 2.5e+001 0.222222 0.666667 0.000000 0.111111 0.000000 0.222222 0.000000 0.777778 0.111111 0.111111 0.222222 0.555556 0.000000 
0.555556 0.000000 0.444444 0.000000 0.555556 0.111111 0.333333 0.000000 1.000000 0.000000 0.000000 0.222222 0.000000 0.000000 0.777778 0.000000 0.333333 0.111111 0.555556 0.000000 0.777778 0.000000 0.222222 0.333333 0.333333 0.000000 0.333333 0.000000 1.000000 0.000000 0.000000 0.111111 0.000000 0.444444 0.444444 0.222222 0.555556 0.222222 0.000000 0.000000 0.777778 0.000000 0.222222 0.444444 0.222222 0.111111 0.222222 0.000000 0.555556 0.444444 0.000000 0.111111 0.888889 0.000000 0.000000 0.000000 0.333333 0.000000 0.666667 0.666667 0.333333 0.000000 0.000000 0.111111 0.888889 0.000000 0.000000 0.111111 0.666667 0.222222 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [CA][TC][TG][CT][CT]C[TA][TC][CT][ACT]C[GT][CAG][CT][ACT][CG]C[TC][AC]C[CG] -------------------------------------------------------------------------------- Time 1.69 secs. 
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 15 sites = 7 llr = 97 E-value = 7.7e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::33:a7:6::97a: pos.-specific C 6a:49:3a1:9:::9 probability G ::131:::33::3:1 matrix T 4:6::::::711::: bits 2.1 * * * * 1.9 * * * * 1.6 * * * * 1.4 * ** * ** ** Relative 1.2 * **** ***** Entropy 1.0 ** **** ****** (19.9 bits) 0.8 ** **** ****** 0.6 *** *********** 0.4 *************** 0.2 *************** 0.0 --------------- Multilevel CCTCCAACATCAAAC consensus T AA C GG G sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 262070 472 1.47e-08 TCGGGTGCAC CCTGCAACGTCAAAC CACAGTCTAT 10850 485 6.78e-08 CAACCGAATA TCACCACCATCAAAC C 24357 144 1.16e-07 TTCCGTAATT TCTGCACCGTCAAAC GGGTCAAATT 260980 125 1.58e-07 TTGATTTCAC CCGACAACATCAGAC ATGATCGTGC 22319 82 2.42e-07 TGATTTGTTG CCTACAACAGCAAAG AATCGAGAGA 1943 343 9.32e-07 TTCGAAGCCT CCTCGAACAGTAAAC ACGGCATCTA 21158 310 1.33e-06 CCCTTCAATA TCACCAACCTCTGAC CTCTCTTCAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 
262070 1.5e-08 471_[+2]_14 10850 6.8e-08 484_[+2]_1 24357 1.2e-07 143_[+2]_342 260980 1.6e-07 124_[+2]_361 22319 2.4e-07 81_[+2]_404 1943 9.3e-07 342_[+2]_143 21158 1.3e-06 309_[+2]_176 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=15 seqs=7 262070 ( 472) CCTGCAACGTCAAAC 1 10850 ( 485) TCACCACCATCAAAC 1 24357 ( 144) TCTGCACCGTCAAAC 1 260980 ( 125) CCGACAACATCAGAC 1 22319 ( 82) CCTACAACAGCAAAG 1 1943 ( 343) CCTCGAACAGTAAAC 1 21158 ( 310) TCACCAACCTCTGAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 15 n= 6318 bayes= 9.66 E= 7.7e+001 -945 125 -945 68 -945 206 -945 -945 17 -945 -75 110 17 84 25 -945 -945 184 -75 -945 198 -945 -945 -945 150 25 -945 -945 -945 206 -945 -945 117 -75 25 -945 -945 -945 25 142 -945 184 -945 -90 176 -945 -945 -90 150 -945 25 -945 198 -945 -945 -945 -945 184 -75 -945 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 7.7e+001 0.000000 0.571429 0.000000 0.428571 0.000000 1.000000 0.000000 0.000000 0.285714 0.000000 0.142857 0.571429 0.285714 0.428571 0.285714 0.000000 0.000000 0.857143 0.142857 0.000000 1.000000 0.000000 0.000000 0.000000 0.714286 0.285714 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.571429 0.142857 0.285714 0.000000 0.000000 
0.000000 0.285714 0.714286 0.000000 0.857143 0.000000 0.142857 0.857143 0.000000 0.000000 0.142857 0.714286 0.000000 0.285714 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.857143 0.142857 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [CT]C[TA][CAG]CA[AC]C[AG][TG]CA[AG]AC -------------------------------------------------------------------------------- Time 3.12 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 8 llr = 105 E-value = 2.9e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::3::::::18334:: pos.-specific C :9::11:::1:5:1:: probability G a11:499a851:8:18 matrix T ::6a5:1:3313:593 bits 2.1 * * 1.9 * * * 1.6 * * * 1.4 ** * *** * Relative 1.2 ** * **** * ** Entropy 1.0 ** * **** * ** (19.0 bits) 0.8 ** * **** * * ** 0.6 ********* * **** 0.4 ********* ****** 0.2 **************** 0.0 ---------------- Multilevel GCTTTGGGGGACGTTG consensus A G TT AAA T sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 6698 365 7.30e-08 TGGGCTTGTG GCTTGGGGTTAAGTTG AATGTACAGC 22319 261 1.29e-07 CGGGGGTCGA GGTTTGGGGGACGCTG GAGGAAAAGG 10850 18 1.29e-07 
TATCGACCTG GCATGGGGGGATGATT TGATCCTATG 20746 385 2.51e-07 TTTTTGATAC GCTTTGGGGGTTATTG CTACTCTTCA 21158 112 5.43e-07 GGGCGGTTCC GCGTGGTGGGAAGTTG AGAGTGGACA 260980 83 5.95e-07 ACTGTGATAG GCTTTGGGGAACGAGT TGCGCAATCT 24974 93 6.92e-07 TATTGCCACT GCTTTGGGTTGCATTG CTTCGGTGTT 262070 142 1.62e-06 TACGGACGCC GCATCCGGGCACGATG ACGACATCAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 6698 7.3e-08 364_[+3]_120 22319 1.3e-07 260_[+3]_224 10850 1.3e-07 17_[+3]_467 20746 2.5e-07 384_[+3]_100 21158 5.4e-07 111_[+3]_373 260980 5.9e-07 82_[+3]_402 24974 6.9e-07 92_[+3]_392 262070 1.6e-06 141_[+3]_343 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=16 seqs=8 6698 ( 365) GCTTGGGGTTAAGTTG 1 22319 ( 261) GGTTTGGGGGACGCTG 1 10850 ( 18) GCATGGGGGGATGATT 1 20746 ( 385) GCTTTGGGGGTTATTG 1 21158 ( 112) GCGTGGTGGGAAGTTG 1 260980 ( 83) GCTTTGGGGAACGAGT 1 24974 ( 93) GCTTTGGGTTGCATTG 1 262070 ( 142) GCATCCGGGCACGATG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.62045 E= 2.9e+002 -965 -965 205 -965 -965 187 -94 -965 -2 -965 -94 123 -965 -965 -965 191 -965 -94 64 91 -965 -94 186 -965 -965 -965 186 -109 -965 -965 205 -965 -965 -965 164 -9 -102 -94 105 -9 157 -965 -94 -109 
-2 106 -965 -9 -2 -965 164 -965 57 -94 -965 91 -965 -965 -94 171 -965 -965 164 -9 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 2.9e+002 0.000000 0.000000 1.000000 0.000000 0.000000 0.875000 0.125000 0.000000 0.250000 0.000000 0.125000 0.625000 0.000000 0.000000 0.000000 1.000000 0.000000 0.125000 0.375000 0.500000 0.000000 0.125000 0.875000 0.000000 0.000000 0.000000 0.875000 0.125000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.750000 0.250000 0.125000 0.125000 0.500000 0.250000 0.750000 0.000000 0.125000 0.125000 0.250000 0.500000 0.000000 0.250000 0.250000 0.000000 0.750000 0.000000 0.375000 0.125000 0.000000 0.500000 0.000000 0.000000 0.125000 0.875000 0.000000 0.000000 0.750000 0.250000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- GC[TA]T[TG]GGG[GT][GT]A[CAT][GA][TA]T[GT] -------------------------------------------------------------------------------- Time 4.57 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10850                            2.37e-11  17_[+3(1.29e-07)]_244_\
    [+1(4.77e-08)]_186_[+2(6.78e-08)]_1
1943                             1.85e-06  188_[+1(5.75e-08)]_133_\
    [+2(9.32e-07)]_143
20746                            5.66e-08  334_[+2(7.04e-05)]_35_\
    [+3(2.51e-07)]_[+1(1.06e-07)]_79
21158                            1.11e-09  111_[+3(5.43e-07)]_115_\
    [+1(3.56e-08)]_46_[+2(1.33e-06)]_176
22319                            1.12e-06  81_[+2(2.42e-07)]_164_\
    [+3(1.29e-07)]_224
24357                            2.32e-09  143_[+2(1.16e-07)]_83_\
    [+1(2.40e-10)]_129_[+2(4.77e-05)]_94
24974                            5.79e-07  92_[+3(6.92e-07)]_260_\
    [+1(2.63e-08)]_111
260980                           2.01e-06  82_[+3(5.95e-07)]_26_[+2(1.58e-07)]_\
    361
262070                           1.04e-06  141_[+3(1.62e-06)]_314_\
    [+2(1.47e-08)]_14
5028                             1.66e-04  101_[+2(5.97e-05)]_89_\
    [+1(1.71e-07)]_274
5232                             3.78e-03  51_[+1(6.94e-05)]_398_\
    [+1(5.05e-07)]_9
6698                             1.30e-06  277_[+1(5.35e-07)]_66_\
    [+3(7.30e-08)]_120
7299                             5.73e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************