********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/225/225.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43208                    1.0000    500  46491                    1.0000    500
46595                    1.0000    500  2957                     1.0000    500
37283                    1.0000    500  6834                     1.0000    500
43288                    1.0000    500  40492                    1.0000    500
16376                    1.0000    500  10208                    1.0000    500
50443                    1.0000    500  33844                    1.0000    500
44486                    1.0000    500  26387                    1.0000    500
44803                    1.0000    500  35562                    1.0000    500
12280                    1.0000    500  42018                    1.0000    500
12813                    1.0000    500  48208                    1.0000    500
42503                    1.0000    500  40043                    1.0000    500
33412                    1.0000    500  37124                    1.0000    500
41277                    1.0000    500  50215                    1.0000    500
45465                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/225/225.seqs.fa -oc motifs/225 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       27    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           13500    N=              27
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.249 C 0.257 G 0.236 T 0.258
Background letter frequencies (from dataset with add-one prior applied):
A 0.249 C 0.257 G 0.236 T 0.258
********************************************************************************


********************************************************************************
MOTIF 1 MEME	width = 21  sites = 9  llr = 155  E-value = 7.7e-004
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::3:443::::2:::3:3:::
pos.-specific     C  :2::3::21:2:::3::1:29
probability       G  :847:2:8:828:a67:3181
matrix            T  a:23237:926:a:1:a29::

         bits    2.1              *
                 1.9 *           **  *
                 1.7 *           **  *
                 1.5 *       *   **  * * *
Relative         1.2 **     *** ***  * ***
Entropy          1.0 ** *  **** *** ** ***
(24.9 bits)      0.8 ** *  **** *** ** ***
                 0.6 ** *  *********** ***
                 0.4 ***************** ***
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGGGAATGTGTGTGGGTATGC
consensus             CATCTAC TCA  CA G C
sequence               T TG    G      T

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
41277                       104  1.62e-09 TATCTGCAGC TGGTATTGTTTGTGGATTTGC AAAACACAAA
40043                       104  1.62e-09 TATCTGCAGC TGGTATTGTTTGTGGATTTGC AAAACACAAA
35562                       193  2.04e-09 GGGGGGGAGA TGGGATTGTGGATGGGTGTCC GCGATCGTTC
46595                       326  2.04e-09 AGTGTGTGTG TGTGTGTGTGTGTGTGTGTGC CATTCTTGGT
45465                       305  3.42e-09 GATGACGTAC TCAGCAAGTGCGTGCGTATGC GATGACGCTT
50215                       305  3.42e-09 GATGACGTAC TCAGCAAGTGCGTGCGTATGC GATGACGCTT
6834                         54  1.01e-08 TCCAACCCGC TGTTCATCCGTGTGGGTGTGC CGATGCTTCT
12813                        37  3.77e-08 GCCGTTCCGG TGGGTATCTGGATGGGTCTCC CTATTGCCAC
37124                       405  4.75e-08 CGTTGTCAGT TGAGAGAGTGTGTGCATAGGG AGTTCTCACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41277                             1.6e-09  103_[+1]_376
40043                             1.6e-09  103_[+1]_376
35562                               2e-09  192_[+1]_287
46595                               2e-09  325_[+1]_154
45465                             3.4e-09  304_[+1]_175
50215                             3.4e-09  304_[+1]_175
6834                                1e-08  53_[+1]_426
12813                             3.8e-08  36_[+1]_443
37124                             4.7e-08  404_[+1]_75
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
41277                    (  104) TGGTATTGTTTGTGGATTTGC  1
40043                    (  104) TGGTATTGTTTGTGGATTTGC  1
35562                    (  193) TGGGATTGTGGATGGGTGTCC  1
46595                    (  326) TGTGTGTGTGTGTGTGTGTGC  1
45465                    (  305) TCAGCAAGTGCGTGCGTATGC  1
50215                    (  305) TCAGCAAGTGCGTGCGTATGC  1
6834                     (   54) TGTTCATCCGTGTGGGTGTGC  1
12813                    (   37) TGGGTATCTGGATGGGTCTCC  1
37124                    (  405) TGAGAGAGTGTGTGCATAGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12960 bayes= 10.6252 E= 7.7e-004
  -982   -982   -982    195
  -982    -21    172   -982
    42   -982     91    -21
  -982   -982    149     37
    83     38   -982    -21
    83   -982     -9     37
    42   -982   -982    137
  -982    -21    172   -982
  -982   -121   -982    178
  -982   -982    172    -21
  -982    -21     -9    111
   -16   -982    172   -982
  -982   -982   -982    195
  -982   -982    208   -982
  -982     38    123   -121
    42   -982    149   -982
  -982   -982   -982    195
    42   -121     49    -21
  -982   -982   -109    178
  -982    -21    172   -982
  -982    179   -109   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 7.7e-004
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.222222  0.777778  0.000000
 0.333333  0.000000  0.444444  0.222222
 0.000000  0.000000  0.666667  0.333333
 0.444444  0.333333  0.000000  0.222222
 0.444444  0.000000  0.222222  0.333333
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.222222  0.777778  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.000000  0.777778  0.222222
 0.000000  0.222222  0.222222  0.555556
 0.222222  0.000000  0.777778  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.555556  0.111111
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.333333  0.111111  0.333333  0.222222
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.222222  0.777778  0.000000
 0.000000  0.888889  0.111111  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
T[GC][GAT][GT][ACT][ATG][TA][GC]T[GT][TCG][GA]TG[GC][GA]T[AGT]T[GC]C
--------------------------------------------------------------------------------




Time 6.98 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME	width = 21  sites = 5  llr = 117  E-value = 4.1e-003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::a:::4:::::4::2aa6
pos.-specific     C  a4:8:::::a:::::::8::4
probability       G  ::22:::a4:8:aa6:6::::
matrix            T  :68::aa:2:2a:::a4::::

         bits    2.1     *  *    **    **
                 1.9 *   **** * *** *  **
                 1.7 *   **** * *** *  **
                 1.5 *   **** * *** *  **
Relative         1.2 * ****** ***** * ***
Entropy          1.0 ******** ************
(33.8 bits)      0.8 ******** ************
                 0.6 ******** ************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CTTCATTGACGTGGGTGCAAA
consensus             CGG    G T   A TA  C
sequence                     T

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
41277                       261  1.47e-12 AGGCACCATT CTTCATTGGCGTGGGTTCAAA TCCCACTCCG
40043                       261  1.47e-12 AGGCACCATT CTTCATTGGCGTGGGTTCAAA TCCCACTCCG
45465                       188  8.58e-12 AAGTCGATAA CCTCATTGACGTGGATGCAAC AGAAGAAACC
50215                       188  8.58e-12 AAGTCGATAA CCTCATTGACGTGGATGCAAC AGAAGAAACC
42503                        55  1.58e-10 CTCGAACCCT CTGGATTGTCTTGGGTGAAAA TACATAAGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41277                             1.5e-12  260_[+2]_219
40043                             1.5e-12  260_[+2]_219
45465                             8.6e-12  187_[+2]_292
50215                             8.6e-12  187_[+2]_292
42503                             1.6e-10  54_[+2]_425
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
41277                    (  261) CTTCATTGGCGTGGGTTCAAA  1
40043                    (  261) CTTCATTGGCGTGGGTTCAAA  1
45465                    (  188) CCTCATTGACGTGGATGCAAC  1
50215                    (  188) CCTCATTGACGTGGATGCAAC  1
42503                    (   55) CTGGATTGTCTTGGGTGAAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 12960 bayes= 11.5909 E= 4.1e-003
  -897    196   -897   -897
  -897     64   -897    122
  -897   -897    -24    163
  -897    164    -24   -897
   200   -897   -897   -897
  -897   -897   -897    195
  -897   -897   -897    195
  -897   -897    208   -897
    68   -897     76    -37
  -897    196   -897   -897
  -897   -897    176    -37
  -897   -897   -897    195
  -897   -897    208   -897
  -897   -897    208   -897
    68   -897    134   -897
  -897   -897   -897    195
  -897   -897    134     63
   -32    164   -897   -897
   200   -897   -897   -897
   200   -897   -897   -897
   127     64   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 4.1e-003
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.800000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.400000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.600000  0.400000
 0.200000  0.800000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.400000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
C[TC][TG][CG]ATTG[AGT]C[GT]TGG[GA]T[GT][CA]AA[AC]
--------------------------------------------------------------------------------




Time 13.27 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME	width = 18  sites = 7  llr = 122  E-value = 2.0e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  96a:a6aaa::34:4:9:
pos.-specific     C  :4:7:4:::31619::::
probability       G  :::3:::::711413a::
matrix            T  1:::::::::7:::3:1a

         bits    2.1   * * ***      *
                 1.9   * * ***      * *
                 1.7   * * ***      * *
                 1.5 * * * ***    * ***
Relative         1.2 * * * ****   * ***
Entropy          1.0 **********   * ***
(25.1 bits)      0.8 ***********  * ***
                 0.6 ************** ***
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           AAACAAAAAGTCACAGAT
consensus             C G C   C AG G
sequence                           T

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ------------------
40043                       126  2.53e-10 TGGATTTGCA AAACACAAAGTCACGGAT GACCAATCAA
45465                       430  2.31e-09 AGCCGGCCGC ACACAAAAACTAGCAGAT TTCAGGCGAC
50215                       430  2.31e-09 AGCCGGCCGC ACACAAAAACTAGCAGAT TTCAGGCGAC
41277                       126  4.42e-09 TGGATTTGCA AAACACAAAGTCACGGTT GACCAACCAA
2957                        186  1.65e-08 TGCTCCGACA TCAGACAAAGTCGCTGAT GAGGCAGAGA
40492                       276  2.41e-08 CAAAATCCGT AAAGAAAAAGCGACTGAT CAAGAGCGTC
48208                        58  2.55e-08 TTGTGAGGAG AAACAAAAAGGCCGAGAT CTTGAAAAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40043                             2.5e-10  125_[+3]_357
45465                             2.3e-09  429_[+3]_53
50215                             2.3e-09  429_[+3]_53
41277                             4.4e-09  125_[+3]_357
2957                              1.7e-08  185_[+3]_297
40492                             2.4e-08  275_[+3]_207
48208                             2.6e-08  57_[+3]_425
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=18 seqs=7
40043                    (  126) AAACACAAAGTCACGGAT  1
45465                    (  430) ACACAAAAACTAGCAGAT  1
50215                    (  430) ACACAAAAACTAGCAGAT  1
41277                    (  126) AAACACAAAGTCACGGTT  1
2957                     (  186) TCAGACAAAGTCGCTGAT  1
40492                    (  276) AAAGAAAAAGCGACTGAT  1
48208                    (   58) AAACAAAAAGGCCGAGAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 13041 bayes= 10.7064 E= 2.0e-001
   178   -945   -945    -85
   120     74   -945   -945
   200   -945   -945   -945
  -945    148     27   -945
   200   -945   -945   -945
   120     74   -945   -945
   200   -945   -945   -945
   200   -945   -945   -945
   200   -945   -945   -945
  -945     15    159   -945
  -945    -84    -73    147
    20    115    -73   -945
    78    -84     86   -945
  -945    174    -73   -945
    78   -945     27     15
  -945   -945    208   -945
   178   -945   -945    -85
  -945   -945   -945    195
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 7 E= 2.0e-001
 0.857143  0.000000  0.000000  0.142857
 0.571429  0.428571  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.714286  0.285714  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.571429  0.428571  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.285714  0.714286  0.000000
 0.000000  0.142857  0.142857  0.714286
 0.285714  0.571429  0.142857  0.000000
 0.428571  0.142857  0.428571  0.000000
 0.000000  0.857143  0.142857  0.000000
 0.428571  0.000000  0.285714  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
A[AC]A[CG]A[AC]AAA[GC]T[CA][AG]C[AGT]GAT
--------------------------------------------------------------------------------




Time 19.73 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43208                            5.30e-01  500
46491                            7.92e-01  500
46595                            1.34e-05  325_[+1(2.04e-09)]_154
2957                             6.16e-04  185_[+3(1.65e-08)]_297
37283                            8.30e-01  500
6834                             8.04e-06  53_[+1(1.01e-08)]_426
43288                            3.00e-01  500
40492                            8.40e-05  275_[+3(2.41e-08)]_207
16376                            1.34e-01  500
10208                            9.03e-01  500
50443                            8.70e-01  500
33844                            2.11e-01  238_[+3(5.94e-05)]_244
44486                            4.62e-01  500
26387                            5.25e-01  500
44803                            3.30e-01  500
35562                            6.54e-06  192_[+1(2.04e-09)]_287
12280                            2.44e-01  500
42018                            6.02e-01  500
12813                            2.13e-05  36_[+1(3.77e-08)]_121_\
    [+3(3.99e-05)]_304
48208                            5.73e-04  57_[+3(2.55e-08)]_425
42503                            1.03e-07  54_[+2(1.58e-10)]_135_\
    [+3(2.14e-05)]_272
40043                            9.08e-20  103_[+1(1.62e-09)]_1_[+3(2.53e-10)]_\
    117_[+2(1.47e-12)]_219
33412                            8.33e-01  500
37124                            7.25e-04  404_[+1(4.75e-08)]_75
41277                            1.42e-18  103_[+1(1.62e-09)]_1_[+3(4.42e-09)]_\
    117_[+2(1.47e-12)]_219
50215                            8.46e-18  187_[+2(8.58e-12)]_96_\
    [+1(3.42e-09)]_104_[+3(2.31e-09)]_53
45465                            8.46e-18  187_[+2(8.58e-12)]_96_\
    [+1(3.42e-09)]_104_[+3(2.31e-09)]_53
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************