********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/95/95.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43224                    1.0000    500  21030                    1.0000    500
28794                    1.0000    500  47742                    1.0000    500
15495                    1.0000    500  48970                    1.0000    500
50045                    1.0000    500  50216                    1.0000    500
50272                    1.0000    500  31322                    1.0000    500
44961                    1.0000    500  45464                    1.0000    500
12329                    1.0000    500  36102                    1.0000    500
46842                    1.0000    500  45950                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/95/95.seqs.fa -oc motifs/95 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 16 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8000 N= 16 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.259 C 0.242 G 0.224 T 0.275 Background letter frequencies (from dataset with add-one prior applied): A 0.259 C 0.242 G 0.224 T 0.275 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 15 sites = 16 llr = 173 E-value = 8.5e-006 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :3317:12a:81:18 pos.-specific C 6141:713:a:::3: probability G 172612:4:::9:52 matrix T 3:232181::3:a2: bits 2.2 1.9 ** * 1.7 ** ** 1.5 ** ** Relative 1.3 ** ** * Entropy 1.1 * * ***** * (15.6 bits) 0.9 * *** ***** * 0.6 ** *** ***** * 0.4 ** **** ******* 0.2 ** ************ 0.0 --------------- Multilevel CGCGACTGACAGTGA consensus TAAT C T C sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 45464 357 9.35e-09 ATCCCGATGC CGCGACTGACTGTGA AAGCACGATC 50216 6 9.35e-09 GATGC CGCGACTGACTGTGA AAGCACGATC 
47742                       379  9.35e-09 CTTTCAGCAT CGCGACTGACTGTGA GAAAAATATT
36102                       359  5.02e-07 GTATACTCTC CGCGGCCGACAGTGA TCGACTACGG
21030                       157  6.40e-07 AGAAGAGAAG CGAGAGTAACAGTCA ACATTGTTTG
45950                        44  1.27e-06 AGGTGACTAT GGAAACTCACAGTGA CAGGAATACA
12329                        38  2.37e-06 GGATTCCCCT CATGACTCACAGTCG AGCAGAGCTG
48970                        38  3.47e-06 CAATGTTTGA TGGGTTTGACAGTGA ATCGTCCCAC
44961                       275  5.45e-06 CCGGATTTAC TGTTACAGACAGTGA CATGAAACGC
31322                       131  5.45e-06 GACATTACAG TACTACTTACAGTGA GTTACAGTCA
50272                       167  7.65e-06 TGAAATACTC CCGTACTCACAGTTA CTGCTTTTTC
15495                       149  1.52e-05 CAGATGTCGG TGGGACCAACAGTCG TATGAGCTTG
43224                       125  2.77e-05 CGCGAACAGT CATTTGTCACAGTTA GTGTCGGTGT
50045                       384  3.74e-05 CTACTCGAAT CGAGAGTAACAATAA AGCTTCCAAT
28794                        94  5.48e-05 AAAATTCAGT GAACGCTCACAGTCA GGCAGTTAAA
46842                       195  6.07e-05 GTGCTCCATA CGCATTTGACTGTTG GAAGCGAGGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45464                             9.3e-09  356_[+1]_129
50216                             9.3e-09  5_[+1]_480
47742                             9.3e-09  378_[+1]_107
36102                               5e-07  358_[+1]_127
21030                             6.4e-07  156_[+1]_329
45950                             1.3e-06  43_[+1]_442
12329                             2.4e-06  37_[+1]_448
48970                             3.5e-06  37_[+1]_448
44961                             5.4e-06  274_[+1]_211
31322                             5.4e-06  130_[+1]_355
50272                             7.6e-06  166_[+1]_319
15495                             1.5e-05  148_[+1]_337
43224                             2.8e-05  124_[+1]_361
50045                             3.7e-05  383_[+1]_102
28794                             5.5e-05  93_[+1]_392
46842                             6.1e-05  194_[+1]_291
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=16
45464                    (  357) CGCGACTGACTGTGA  1
50216                    (    6) CGCGACTGACTGTGA  1
47742                    (  379) CGCGACTGACTGTGA  1
36102                    (  359) CGCGGCCGACAGTGA  1
21030                    (  157) CGAGAGTAACAGTCA  1
45950                    (   44) GGAAACTCACAGTGA  1
12329                    (   38) CATGACTCACAGTCG  1
48970                    (   38) TGGGTTTGACAGTGA  1
44961                    (  275) TGTTACAGACAGTGA  1
31322                    (  131) TACTACTTACAGTGA  1
50272                    (  167) CCGTACTCACAGTTA  1
15495                    (  149) TGGGACCAACAGTCG  1
43224                    (  125) CATTTGTCACAGTTA  1
50045                    (  384) CGAGAGTAACAATAA  1
28794                    (   94) GAACGCTCACAGTCA  1
46842                    (  195) CGCATTTGACTGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7776 bayes= 10.2456 E= 8.5e-006
 -1064    137    -84    -14
    -5   -195    162  -1064
    -5     63    -26    -55
  -105   -195    133    -14
   141  -1064    -84    -55
 -1064    151    -26   -114
  -205    -95  -1064    156
   -47     37     96   -214
   195  -1064  -1064  -1064
 -1064    205  -1064  -1064
   153  -1064  -1064    -14
  -205  -1064    206  -1064
 -1064  -1064  -1064    186
  -205      5    116    -55
   165  -1064    -26  -1064
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 16 E= 8.5e-006
 0.000000  0.625000  0.125000  0.250000
 0.250000  0.062500  0.687500  0.000000
 0.250000  0.375000  0.187500  0.187500
 0.125000  0.062500  0.562500  0.250000
 0.687500  0.000000  0.125000  0.187500
 0.000000  0.687500  0.187500  0.125000
 0.062500  0.125000  0.000000  0.812500
 0.187500  0.312500  0.437500  0.062500
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.062500  0.000000  0.937500  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.062500  0.250000  0.500000  0.187500
 0.812500  0.000000  0.187500  0.000000
--------------------------------------------------------------------------------

-------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [CT][GA][CA][GT]ACT[GC]AC[AT]GT[GC]A -------------------------------------------------------------------------------- Time 2.00 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 5 llr = 103 E-value = 2.0e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :2a2:2:::a2:2::2::a6: pos.-specific C 82:::8622:24::::a:::a probability G 2::::::88:::6:8::a::: matrix T :6:8a:4:::662a28:::4: bits 2.2 * 1.9 * * * * *** * 1.7 * * * * *** * 1.5 * * *** * *** * Relative 1.3 * * ** *** ** *** * Entropy 1.1 * ******** ****** * (29.7 bits) 0.9 * ******** * ******** 0.6 ********************* 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel CTATTCCGGATTGTGTCGAAC consensus GA A ATCC ACA TA T sequence C C T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 45464 436 7.37e-13 ATTGGCAATG CTATTCCGGATTGTGTCGATC TTTCTGTAGA 50216 85 7.37e-13 ATTGGCAATG CTATTCCGGATTGTGTCGATC TTTCTGTAGA 15495 90 1.37e-09 CTCATACCCT CTATTCTGCACCATGACGAAC CTTCGTAAGT 44961 310 1.45e-09 GTAGCGTAAA GAATTCTGGAACGTTTCGAAC CCTACCGCTC 50272 131 2.09e-09 CCTTGGATCT CCAATACCGATTTTGTCGAAC GGGAGTGAAA 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45464                             7.4e-13  435_[+2]_44
50216                             7.4e-13  84_[+2]_395
15495                             1.4e-09  89_[+2]_390
44961                             1.4e-09  309_[+2]_170
50272                             2.1e-09  130_[+2]_349
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
45464                    (  436) CTATTCCGGATTGTGTCGATC  1
50216                    (   85) CTATTCCGGATTGTGTCGATC  1
15495                    (   90) CTATTCTGCACCATGACGAAC  1
44961                    (  310) GAATTCTGGAACGTTTCGAAC  1
50272                    (  131) CCAATACCGATTTTGTCGAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 10.8357 E= 2.0e+000
  -897    173    -16   -897
   -37    -27   -897    112
   195   -897   -897   -897
   -37   -897   -897    154
  -897   -897   -897    186
   -37    173   -897   -897
  -897    131   -897     54
  -897    -27    183   -897
  -897    -27    183   -897
   195   -897   -897   -897
   -37    -27   -897    112
  -897     73   -897    112
   -37   -897    142    -46
  -897   -897   -897    186
  -897   -897    183    -46
   -37   -897   -897    154
  -897    205   -897   -897
  -897   -897    216   -897
   195   -897   -897   -897
   121   -897   -897     54
  -897    205   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 2.0e+000
 0.000000  0.800000  0.200000  0.000000
 0.200000  0.200000  0.000000  0.600000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.200000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.200000  0.000000  0.600000
 0.000000  0.400000  0.000000  0.600000
 0.200000  0.000000  0.600000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.800000  0.200000
 0.200000  0.000000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG][TAC]A[TA]T[CA][CT][GC][GC]A[TAC][TC][GAT]T[GT][TA]CGA[AT]C
--------------------------------------------------------------------------------

Time  4.02 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 7 llr = 123 E-value = 2.0e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 394::1:11a7:97:a1:1:9 pos.-specific C :161::a16:1413::3:47: probability G 7::::::41::1::1:::4:1 matrix T :::9a9:31:14::9:6a:3: bits 2.2 1.9 * * * * * 1.7 * * * * * 1.5 * * * * * Relative 1.3 ** **** * * ** * * Entropy 1.1 ******* * **** * ** (25.3 bits) 0.9 ******* ** **** * ** 0.6 ******* ** ********* 0.4 ******* ************* 0.2 ********************* 0.0 --------------------- Multilevel GACTTTCGCAACAATATTCCA consensus A A T T C C GT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 45464 392 2.90e-12 GCTTTTGACT GAATTTCGCAATAATATTCCA CTCCGACCAC 50216 41 2.90e-12 GCTTTTGACT GAATTTCGCAATAATATTCCA CTCCGACCAC 50045 123 4.61e-09 ACAGAAAGCA GACTTTCTCAACACGACTACA TTTATTGCCC 44961 130 5.03e-09 ATCTGCTGAC GACTTACGTACCAATATTCCA ACGAAGAATC 46842 235 1.97e-08 CAATATTTTC AAATTTCAAAACACTAATGCA AACAAGACCG 45950 451 6.47e-08 ATAGACGTTA AACCTTCTGATTAATACTGTA ATTTCCATTT 15495 270 7.10e-08 CGACCCCAAG GCCTTTCCCAAGCATATTGTG ATTCCCTTTC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams 
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45464                             2.9e-12  391_[+3]_88
50216                             2.9e-12  40_[+3]_439
50045                             4.6e-09  122_[+3]_357
44961                               5e-09  129_[+3]_350
46842                               2e-08  234_[+3]_245
45950                             6.5e-08  450_[+3]_29
15495                             7.1e-08  269_[+3]_210
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=7
45464                    (  392) GAATTTCGCAATAATATTCCA  1
50216                    (   41) GAATTTCGCAATAATATTCCA  1
50045                    (  123) GACTTTCTCAACACGACTACA  1
44961                    (  130) GACTTACGTACCAATATTCCA  1
46842                    (  235) AAATTTCAAAACACTAATGCA  1
45950                    (  451) AACCTTCTGATTAATACTGTA  1
15495                    (  270) GCCTTTCCCAAGCATATTGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.94195 E= 2.0e+000
    14   -945    167   -945
   173    -76   -945   -945
    73    124   -945   -945
  -945    -76   -945    164
  -945   -945   -945    186
   -86   -945   -945    164
  -945    205   -945   -945
   -86    -76     93      5
   -86    124    -65    -94
   195   -945   -945   -945
   146    -76   -945    -94
  -945     83    -65     64
   173    -76   -945   -945
   146     24   -945   -945
  -945   -945    -65    164
   195   -945   -945   -945
   -86     24   -945    105
  -945   -945   -945    186
   -86     83     93   -945
  -945    156   -945      5
   173   -945    -65   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 2.0e+000
 0.285714  0.000000  0.714286  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.000000  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.142857  0.428571  0.285714
 0.142857  0.571429  0.142857  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.714286  0.142857  0.000000  0.142857
 0.000000  0.428571  0.142857  0.428571
 0.857143  0.142857  0.000000  0.000000
 0.714286  0.285714  0.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.285714  0.000000  0.571429
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.428571  0.428571  0.000000
 0.000000  0.714286  0.000000  0.285714
 0.857143  0.000000  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GA]A[CA]TTTC[GT]CAA[CT]A[AC]TA[TC]T[CG][CT]A
--------------------------------------------------------------------------------

Time  6.23 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43224                            1.27e-01  17_[+1(9.63e-05)]_92_[+1(2.77e-05)]_\
    361
21030                            5.98e-03  156_[+1(6.40e-07)]_329
28794                            3.63e-02  93_[+1(5.48e-05)]_392
47742                            9.36e-05  276_[+1(7.03e-05)]_87_\
    [+1(9.35e-09)]_107
15495                            7.64e-11  89_[+2(1.37e-09)]_38_[+1(1.52e-05)]_\
    106_[+3(7.10e-08)]_210
48970                            3.87e-02  37_[+1(3.47e-06)]_448
50045                            4.64e-06  122_[+3(4.61e-09)]_240_\
    [+1(3.74e-05)]_102
50216                            3.44e-21  5_[+1(9.35e-09)]_20_[+3(2.90e-12)]_\
    23_[+2(7.37e-13)]_346_[+2(6.29e-05)]_28
50272                            6.10e-07  130_[+2(2.09e-09)]_15_\
    [+1(7.65e-06)]_319
31322                            5.90e-03  130_[+1(5.45e-06)]_355
44961                            2.57e-12  129_[+3(5.03e-09)]_124_\
    [+1(5.45e-06)]_20_[+2(1.45e-09)]_170
45464                            3.44e-21  356_[+1(9.35e-09)]_20_\
    [+3(2.90e-12)]_23_[+2(7.37e-13)]_44
12329                            1.09e-02  37_[+1(2.37e-06)]_448
36102                            5.53e-03  358_[+1(5.02e-07)]_127
46842                            1.61e-05  194_[+1(6.07e-05)]_25_\
    [+3(1.97e-08)]_245
45950                            1.90e-06  43_[+1(1.27e-06)]_392_\
    [+3(6.47e-08)]_29
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************