********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/125/125.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10331                    1.0000    500  10744                    1.0000    500
15161                    1.0000    500  19701                    1.0000    500
2023                     1.0000    500  21447                    1.0000    500
21963                    1.0000    500  22703                    1.0000    500
23705                    1.0000    500  269194                   1.0000    500
5515                     1.0000    500  5870                     1.0000    500
8354                     1.0000    500  8655                     1.0000    500
9396                     1.0000    500  bd1850                   1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/125/125.seqs.fa -oc motifs/125 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.243 G 0.226 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.243 G 0.226 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME  width = 17  sites = 6  llr = 102  E-value = 1.0e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::8:2a:282753:a3:
pos.-specific     C  :::::::::2:2:::::
probability       G  a7:88:a822337a::a
matrix            T  :322:::::5:::::7:

         bits    2.1 *     *      *  *
                 1.9 *    **      ** *
                 1.7 *    **      ** *
                 1.5 *  *****     ** *
Relative         1.3 * *******    ** *
Entropy          1.1 ********* * *****
(24.5 bits)      0.9 ********* * *****
                 0.6 ********* *******
                 0.4 ********* *******
                 0.2 *****************
                 0.0 -----------------

Multilevel           GGAGGAGGATAAGGATG
consensus             T        GGA  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name   Start  P-value                     Site
-------------   -----  ---------            -----------------
21963              56  1.08e-09  CGCGAACGTT GGAGGAGGAAGAGGATG GCATGGTGTG
22703             255  4.45e-09  CGTCCAAGAA GGAGGAGGATACAGAAG CAATCGTCAC
19701             456  4.45e-09  CGAGAAGGCT GGAGGAGGACAAAGAAG GGATGGGAAA
5515              221  1.45e-08  GGGCGATGGA GGAGAAGAATAAGGATG AAGGCGAAGG
269194            219  2.09e-08  GCATACTTTC GTAGGAGGGGGGGGATG GAGGATGTTG
9396               56  3.28e-08  ATACGGTTGT GTTTGAGGATAGGGATG AGCTCAATCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME   POSITION P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
21963                    1.1e-09  55_[+1]_428
22703                    4.5e-09  254_[+1]_229
19701                    4.5e-09  455_[+1]_28
5515                     1.5e-08  220_[+1]_263
269194                   2.1e-08  218_[+1]_265
9396                     3.3e-08  55_[+1]_428
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=17 seqs=6
21963           (  56) GGAGGAGGAAGAGGATG  1
22703           ( 255) GGAGGAGGATACAGAAG  1
19701           ( 456) GGAGGAGGACAAAGAAG  1
5515            ( 221) GGAGAAGAATAAGGATG  1
269194          ( 219) GTAGGAGGGGGGGGATG  1
9396            (  56) GTTTGAGGATAGGGATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 7744 bayes= 10.7805 E= 1.0e-001
  -923  -923   214  -923
  -923  -923   156    32
   166  -923  -923   -68
  -923  -923   188   -68
   -66  -923   188  -923
   192  -923  -923  -923
  -923  -923   214  -923
   -66  -923   188  -923
   166  -923   -44  -923
   -66   -54   -44    90
   134  -923    56  -923
    92   -54    56  -923
    34  -923   156  -923
  -923  -923   214  -923
   192  -923  -923  -923
    34  -923  -923   132
  -923  -923   214  -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 6 E= 1.0e-001
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.000000  0.833333  0.166667
 0.166667  0.000000  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.166667  0.166667  0.166667  0.500000
 0.666667  0.000000  0.333333  0.000000
 0.500000  0.166667  0.333333  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
G[GT]AGGAGGAT[AG][AG][GA]GA[TA]G
--------------------------------------------------------------------------------

Time 3.01 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME  width = 21  sites = 7  llr = 122  E-value = 3.5e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :4:::::111:::::1:::6:
pos.-specific     C  ::aa:6a6319a766169a19
probability       G  14:::4::14:::::4:::::
matrix            T  91::a::3431:344341:31

         bits    2.1   **  *    *      *
                 1.9   *** *    *      *
                 1.7   *** *    *      *
                 1.5   *** *   **     ** *
Relative         1.3 * *** *   **     ** *
Entropy          1.1 * *****   ***** *** *
(25.2 bits)      0.9 * *****   ***** *** *
                 0.6 ********  ***** *****
                 0.4 ********  ***** *****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TACCTCCCTGCCCCCGCCCAC
consensus             G   G TCT  TTTTT  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name   Start  P-value                       Site
-------------   -----  ---------            ---------------------
269194            436  9.40e-11  TCCTACACCC TACCTGCCAGCCCCTGCCCAC AACCAACTCA
2023              144  1.84e-10  CCCTGCCTGA TACCTGCCTTCCCCCACCCAC GAATCGTCTC
5870              125  2.17e-09  CACTGTTTAG TACCTCCCGTCCCTCGTCCCC ACTCTAGTTC
15161             455  2.67e-09  TTTCTCTTGC TGCCTCCTCCCCTCCTCCCAC CCTGCCTTAC
8655               10  2.11e-08   AGGAAAGAC TTCCTCCTTGTCCTTGTCCTC GGTGCCGCCC
5515               48  3.32e-08  CTTCTTCTAC TGCCTGCATACCTCTCTCCTC GTTTCTTGAA
8354               84  3.68e-08  CTACTCGTTC GGCCTCCCCGCCCTCTCTCAT TGCGTTCTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME   POSITION P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
269194                   9.4e-11  435_[+2]_44
2023                     1.8e-10  143_[+2]_336
5870                     2.2e-09  124_[+2]_355
15161                    2.7e-09  454_[+2]_25
8655                     2.1e-08  9_[+2]_470
5515                     3.3e-08  47_[+2]_432
8354                     3.7e-08  83_[+2]_396
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=7
269194          ( 436) TACCTGCCAGCCCCTGCCCAC  1
2023            ( 144) TACCTGCCTTCCCCCACCCAC  1
5870            ( 125) TACCTCCCGTCCCTCGTCCCC  1
15161           ( 455) TGCCTCCTCCCCTCCTCCCAC  1
8655            (  10) TTCCTCCTTGTCCTTGTCCTC  1
5515            (  48) TGCCTGCATACCTCTCTCCTC  1
8354            (  84) GGCCTCCCCGCCCTCTCTCAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.94195 E= 3.5e-001
  -945  -945   -66   168
    70  -945    92   -90
  -945   204  -945  -945
  -945   204  -945  -945
  -945  -945  -945   190
  -945   123    92  -945
  -945   204  -945  -945
   -88   123  -945    10
   -88    23   -66    68
   -88   -76    92    10
  -945   182  -945   -90
  -945   204  -945  -945
  -945   155  -945    10
  -945   123  -945    68
  -945   123  -945    68
   -88   -76    92    10
  -945   123  -945    68
  -945   182  -945   -90
  -945   204  -945  -945
   111   -76  -945    10
  -945   182  -945   -90
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 3.5e-001
 0.000000  0.000000  0.142857  0.857143
 0.428571  0.000000  0.428571  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.571429  0.428571  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.571429  0.000000  0.285714
 0.142857  0.285714  0.142857  0.428571
 0.142857  0.142857  0.428571  0.285714
 0.000000  0.857143  0.000000  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.714286  0.000000  0.285714
 0.000000  0.571429  0.000000  0.428571
 0.000000  0.571429  0.000000  0.428571
 0.142857  0.142857  0.428571  0.285714
 0.000000  0.571429  0.000000  0.428571
 0.000000  0.857143  0.000000  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.142857  0.000000  0.285714
 0.000000  0.857143  0.000000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
T[AG]CCT[CG]C[CT][TC][GT]CC[CT][CT][CT][GT][CT]CC[AT]C
--------------------------------------------------------------------------------

Time 5.87 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME  width = 16  sites = 7  llr = 105  E-value = 2.4e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  4:4:1:::31:::1::
pos.-specific     C  ::1::::9::3:16::
probability       G  6a:::9::741a::a1
matrix            T  ::4a91a1:46:93:9

         bits    2.1  *         *  *
                 1.9  * *  *    *  *
                 1.7  * *  *    *  *
                 1.5  * * ***   *  *
Relative         1.3  * ******  ** **
Entropy          1.1 ** ******  ** **
(21.7 bits)      0.9 ** ******  ** **
                 0.6 ** *************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGATTGTCGGTGTCGT
consensus            A T     ATC  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name   Start  P-value                    Site
-------------   -----  ---------            ----------------
21963             267  7.46e-10  CGAGTTGGTC GGTTTGTCGTTGTCGT TGGCGCTTTA
8655              428  2.36e-08  CACAACTCCG AGCTTGTCGTCGTCGT TGCAGGGGTT
19701             217  2.85e-08  CTGTTTTTGC GGATTGTCGTCGTCGG AAAAAGGCAG
2023              246  3.53e-08  GCGTCGCGTT AGTTTGTCGGGGTTGT CTTATTATAT
8354              246  2.04e-07  AACTCGAATA AGTTTTTTGGTGTCGT TTTCTTGGTC
5870              358  2.04e-07  GATGAACGAT GGATTGTCAGTGCAGT GCTTTGATCT
10331             251  3.27e-07  TGCTACCAGG GGATAGTCAATGTTGT CTCTGATTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME   POSITION P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
21963                    7.5e-10  266_[+3]_218
8655                     2.4e-08  427_[+3]_57
19701                    2.9e-08  216_[+3]_268
2023                     3.5e-08  245_[+3]_239
8354                       2e-07  245_[+3]_239
5870                       2e-07  357_[+3]_127
10331                    3.3e-07  250_[+3]_234
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=7
21963           ( 267) GGTTTGTCGTTGTCGT  1
8655            ( 428) AGCTTGTCGTCGTCGT  1
19701           ( 217) GGATTGTCGTCGTCGG  1
2023            ( 246) AGTTTGTCGGGGTTGT  1
8354            ( 246) AGTTTTTTGGTGTCGT  1
5870            ( 358) GGATTGTCAGTGCAGT  1
10331           ( 251) GGATAGTCAATGTTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 9.95692 E= 2.4e+000
    70  -945   134  -945
  -945  -945   214  -945
    70   -76  -945    68
  -945  -945  -945   190
   -88  -945  -945   168
  -945  -945   192   -90
  -945  -945  -945   190
  -945   182  -945   -90
    11  -945   166  -945
   -88  -945    92    68
  -945    23   -66   110
  -945  -945   214  -945
  -945   -76  -945   168
   -88   123  -945    10
  -945  -945   214  -945
  -945  -945   -66   168
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 2.4e+000
 0.428571  0.000000  0.571429  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.142857  0.000000  0.428571
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.857143  0.000000  0.142857
 0.285714  0.000000  0.714286  0.000000
 0.142857  0.000000  0.428571  0.428571
 0.000000  0.285714  0.142857  0.571429
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.142857  0.571429  0.000000  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[GA]G[AT]TTGTC[GA][GT][TC]GT[CT]GT
--------------------------------------------------------------------------------

Time 8.75 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME   COMBINED P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
10331                   2.79e-03  250_[+3(3.27e-07)]_234
10744                   5.05e-01  500
15161                   2.55e-05  454_[+2(2.67e-09)]_25
19701                   7.20e-09  216_[+3(2.85e-08)]_223_\
                                  [+1(4.45e-09)]_28
2023                    1.59e-12  26_[+1(3.64e-06)]_100_\
                                  [+2(1.84e-10)]_81_[+3(3.53e-08)]_239
21447                   1.88e-01  500
21963                   4.33e-11  55_[+1(1.08e-09)]_194_\
                                  [+3(7.46e-10)]_218
22703                   3.46e-05  254_[+1(4.45e-09)]_229
23705                   2.35e-01  500
269194                  6.70e-11  218_[+1(2.09e-08)]_200_\
                                  [+2(9.40e-11)]_44
5515                    2.61e-08  47_[+2(3.32e-08)]_152_\
                                  [+1(1.45e-08)]_263
5870                    1.38e-08  124_[+2(2.17e-09)]_212_\
                                  [+3(2.04e-07)]_127
8354                    3.46e-07  83_[+2(3.68e-08)]_141_\
                                  [+3(2.04e-07)]_239
8655                    8.55e-09  9_[+2(2.11e-08)]_397_[+3(2.36e-08)]_\
                                  57
9396                    3.17e-04  55_[+1(3.28e-08)]_428
bd1850                  7.06e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
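
Note on the log-odds matrices: the integer scores in each
"position-specific scoring matrix" section appear to be 100 * log2 of the
letter probability divided by its background frequency. The sketch below
checks this against two rows of this report. The helper name log_odds_row
and the pseudocount smoothing (total weight b, taken from the "b= 0.01"
entry in the command line summary) are assumptions inferred from the printed
values, not a description of MEME's actual code. Because the background
frequencies are printed to only three digits, other rows may differ from the
report by one unit.

```python
import math

# Background letter frequencies as printed in the report (rounded to 3 digits).
BACKGROUND = {"A": 0.264, "C": 0.243, "G": 0.226, "T": 0.267}


def log_odds_row(probs, nsites, background=BACKGROUND, b=0.01):
    """Return round(100 * log2(p' / bg)) for each letter in A, C, G, T order.

    p' is the observed frequency smoothed with a pseudocount of total
    weight b (an assumption), so letters that never occur get a finite,
    strongly negative score instead of minus infinity.
    """
    scores = []
    for letter, p in zip("ACGT", probs):
        bg = background[letter]
        smoothed = (p * nsites + b * bg) / (nsites + b)
        scores.append(round(100 * math.log2(smoothed / bg)))
    return scores


# Position 2 of Motif 1 (nsites= 6): probabilities 0, 0, 0.666667, 0.333333.
print(log_odds_row([0.0, 0.0, 0.666667, 0.333333], nsites=6))

# Position 3 of Motif 2 (nsites= 7): C occurs at every site.
print(log_odds_row([0.0, 1.0, 0.0, 0.0], nsites=7))
```

Under these assumptions the two calls reproduce the rows "-923 -923 156 32"
(Motif 1) and "-945 204 -945 -945" (Motif 2) listed in the report.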