********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/245/245.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42559                    1.0000    500  9283                     1.0000    500
47555                    1.0000    500  267                      1.0000    500
14856                    1.0000    500  29885                    1.0000    500
4127                     1.0000    500  16262                    1.0000    500
16357                    1.0000    500  30807                    1.0000    500
23871                    1.0000    500  5873                     1.0000    500
44351                    1.0000    500  26157                    1.0000    500
2217                     1.0000    500  27282                    1.0000    500
36161                    1.0000    500  48984                    1.0000    500
43827                    1.0000    500  43322                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/245/245.seqs.fa -oc motifs/245 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10000    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.256 C 0.255 G 0.239 T 0.251
Background letter frequencies (from dataset with add-one prior applied):
A 0.256 C 0.255 G 0.239 T 0.251
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =  13  llr = 172  E-value = 4.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1128:16542::8322:68:
pos.-specific     C  74::9::3::9215:66428
probability       G  227:19315418:2524:::
matrix            T  1422::1115::1:3::::2

         bits    2.1
                 1.9
                 1.7     **    *
                 1.4     **    **      *
Relative         1.2    ***    ***     **
Entropy          1.0    ***    ***   ****
(19.0 bits)      0.8   ***** * ***   ****
                 0.6 * ***** *****  *****
                 0.4 * ******************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CCGACGAAGTCGACGCCAAC
consensus             T T  GCAG   ATAGC T
sequence                          GA

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
30807                       155  4.37e-11 AGCGAAGAAG CTGACGAAGTCGAAGCGAAC CTCCGGCCTT
9283                        137  2.76e-08 CACTCTACTA CCGTCGGAATCGAAGACCAC GACGCAAATC
27282                       424  3.10e-08 TCCAAGACTA CCGACGAGGACGACACGAAC GCTCCCGACC
2217                        256  1.12e-07 GAACCGGATC CCTACGACGACGAAGACCAC CACGACGACA
47555                       468  1.12e-07 CCGAGTCGTT GTTACGAAATCGAAGCCAAT CGCACATAGG
267                         309  1.49e-07 CAGCGGTAGC CAGACGATGGCGACGGCCAC TGCGGCTGCC
16357                       127  4.38e-07 CAGCGTTTGG CCGTCGAAGGCGCGTCCACC AACCGGTTAA
23871                       390  6.58e-07 GTCACGGTTG GTGACGGCTTCGACAACCAC ACATCGCAAA
4127                         23  7.12e-07 TGACATTTCC AGGACGGAATCGTCTCCAAC AACCGTTCAA
43322                         7  7.70e-07     CTACCA CGAACGTCGGCGAGTCGCAC GAATGTCTGG
42559                       227  2.11e-06 TGCTCATTTT TTGACGGCGGCCAGACGAAT TATGGATGGC
14856                       181  2.41e-06 GGCCGGTACG CCAAGAAAAGCGACGGCAAC AAGGAGATCC
48984                        88  5.32e-06 TACTATAAAA CTGTCGAAATGCACTCGACT TTCTATGGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
30807                             4.4e-11  154_[+1]_326
9283                              2.8e-08  136_[+1]_344
27282                             3.1e-08  423_[+1]_57
2217                              1.1e-07  255_[+1]_225
47555                             1.1e-07  467_[+1]_13
267                               1.5e-07  308_[+1]_172
16357                             4.4e-07  126_[+1]_354
23871                             6.6e-07  389_[+1]_91
4127                              7.1e-07  22_[+1]_458
43322                             7.7e-07  6_[+1]_474
42559                             2.1e-06  226_[+1]_254
14856                             2.4e-06  180_[+1]_300
48984                             5.3e-06  87_[+1]_393
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=13
30807                    (  155) CTGACGAAGTCGAAGCGAAC  1
9283                     (  137) CCGTCGGAATCGAAGACCAC  1
27282                    (  424) CCGACGAGGACGACACGAAC  1
2217                     (  256) CCTACGACGACGAAGACCAC  1
47555                    (  468) GTTACGAAATCGAAGCCAAT  1
267                      (  309) CAGACGATGGCGACGGCCAC  1
16357                    (  127) CCGTCGAAGGCGCGTCCACC  1
23871                    (  390) GTGACGGCTTCGACAACCAC  1
4127                     (   23) AGGACGGAATCGTCTCCAAC  1
43322                    (    7) CGAACGTCGGCGAGTCGCAC  1
42559                    (  227) TTGACGGCGGCCAGACGAAT  1
14856                    (  181) CCAAGAAAAGCGACGGCAAC  1
48984                    (   88) CTGTCGAAATGCACTCGACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 9620 bayes= 9.28465 E= 4.2e+000
  -173    144    -63   -170
  -173     59    -63     62
   -73  -1035    154    -70
   159  -1035  -1035    -12
 -1035    185   -163  -1035
  -173  -1035    195  -1035
   127  -1035     37   -170
   107     27   -163   -170
    59  -1035    117   -170
   -73  -1035     69     88
 -1035    185   -163  -1035
 -1035    -73    183  -1035
   173   -173  -1035   -170
    27     85     -5  -1035
   -15  -1035     95     30
   -15    127    -63  -1035
 -1035    127     69  -1035
   127     59  -1035  -1035
   173    -73  -1035  -1035
 -1035    159  -1035    -12
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 13 E= 4.2e+000
 0.076923  0.692308  0.153846  0.076923
 0.076923  0.384615  0.153846  0.384615
 0.153846  0.000000  0.692308  0.153846
 0.769231  0.000000  0.000000  0.230769
 0.000000  0.923077  0.076923  0.000000
 0.076923  0.000000  0.923077  0.000000
 0.615385  0.000000  0.307692  0.076923
 0.538462  0.307692  0.076923  0.076923
 0.384615  0.000000  0.538462  0.076923
 0.153846  0.000000  0.384615  0.461538
 0.000000  0.923077  0.076923  0.000000
 0.000000  0.153846  0.846154  0.000000
 0.846154  0.076923  0.000000  0.076923
 0.307692  0.461538  0.230769  0.000000
 0.230769  0.000000  0.461538  0.307692
 0.230769  0.615385  0.153846  0.000000
 0.000000  0.615385  0.384615  0.000000
 0.615385  0.384615  0.000000  0.000000
 0.846154  0.153846  0.000000  0.000000
 0.000000  0.769231  0.000000  0.230769
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[CT]G[AT]CG[AG][AC][GA][TG]CGA[CAG][GTA][CA][CG][AC]A[CT]
--------------------------------------------------------------------------------

Time  3.78 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   8  llr = 100  E-value = 3.7e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:::::::::::
pos.-specific     C  18::91::a4::
probability       G  43a11:a:::8:
matrix            T  3::9:9:a:63a

         bits    2.1   *   ***  *
                 1.9   *   ***  *
                 1.7   *   ***  *
                 1.4   *******  *
Relative         1.2  ******** **
Entropy          1.0  ***********
(18.0 bits)      0.8  ***********
                 0.6  ***********
                 0.4  ***********
                 0.2 ************
                 0.0 ------------

Multilevel           GCGTCTGTCTGT
consensus            AG       CT
sequence             T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
29885                       196  5.39e-08 GGTACGAAGA GCGTCTGTCTGT AATCCGTTCC
27282                       164  1.68e-07 GGTTGACCCC ACGTCTGTCTGT CTTTTTGCTC
9283                         83  1.68e-07 TAGTCTATAC TCGTCTGTCTGT AATGCGTTTT
48984                       181  3.90e-07 AGTTCGGGGA GGGTCTGTCTGT GAGTAGTGTG
43827                       314  8.98e-07 ATTCATTTTT CCGTCTGTCCGT CTGTTACGCC
43322                       199  2.71e-06 AAAGAAGGAA ACGTCCGTCCGT CAGTCCGTCC
26157                       262  4.36e-06 CCATAGTCAG GCGTGTGTCCTT TTCGACCAAG
2217                        366  6.53e-06 ACCGTCGCCG TGGGCTGTCTTT GATGCCTGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
29885                             5.4e-08  195_[+2]_293
27282                             1.7e-07  163_[+2]_325
9283                              1.7e-07  82_[+2]_406
48984                             3.9e-07  180_[+2]_308
43827                               9e-07  313_[+2]_175
43322                             2.7e-06  198_[+2]_290
26157                             4.4e-06  261_[+2]_227
2217                              6.5e-06  365_[+2]_123
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=8
29885                    (  196) GCGTCTGTCTGT  1
27282                    (  164) ACGTCTGTCTGT  1
9283                     (   83) TCGTCTGTCTGT  1
48984                    (  181) GGGTCTGTCTGT  1
43827                    (  314) CCGTCTGTCCGT  1
43322                    (  199) ACGTCCGTCCGT  1
26157                    (  262) GCGTGTGTCCTT  1
2217                     (  366) TGGGCTGTCTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9780 bayes= 10.9919 E= 3.7e+000
    -3   -103     65      0
  -965    155      7   -965
  -965   -965    207   -965
  -965   -965    -93    180
  -965    178    -93   -965
  -965   -103   -965    180
  -965   -965    207   -965
  -965   -965   -965    200
  -965    197   -965   -965
  -965     56   -965    132
  -965   -965    165      0
  -965   -965   -965    200
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 3.7e+000
 0.250000  0.125000  0.375000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.375000  0.000000  0.625000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GAT][CG]GTCTGTC[TC][GT]T
--------------------------------------------------------------------------------

Time  7.53 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =  10  llr = 132  E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :4:::3572a1::821
pos.-specific     C  ::::::2::::5111:
probability       G  41:647338:::9:1:
matrix            T  65a46:::::95:169

         bits    2.1   *      *
                 1.9   *      *
                 1.7   *      *  *
                 1.4   *      ** *  *
Relative         1.2   *  *  *** *  *
Entropy          1.0 * **** ******* *
(19.0 bits)      0.8 * **** ******* *
                 0.6 ****** ******* *
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TTTGTGAAGATCGATT
consensus            GA TGAGGA  T  A
sequence                   C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
2217                        113  7.79e-09 TCCTACAGCG TTTGGGGAGATCGATT ACGGCCAAGT
48984                       120  1.60e-07 CTATGGCCAA GATGTAAAGATCGAAT CGCTGGAGAT
42559                       397  1.60e-07 ATTAATGTGA TTTTTGCAAATTGATT CCAACGTCCC
26157                       409  2.18e-07 CGAAAACCTG TTTTTGGAGATTCATT CCCACCCTCA
14856                       476  2.18e-07 ATCGACCGTG TTTGGACGGATCGATT TTACCGAAT
16262                       143  2.43e-07 TAAACAAACT GGTGGGAGGATTGATT GATGTCGCAA
36161                        35  4.54e-07 GGATGGAAAC GATGGGAGGATCGACT ACTTGCGGCG
29885                        89  9.09e-07 CCCAGTTACA TATTTGAAGATTGTGT GATGACGGAC
5873                        158  2.27e-06 CTGTCAAATC GTTTTGGAGATCGCTA ATTCTACGCA
43827                        94  2.98e-06 CTACCAAATG TATGTAAAAAATGAAT GAAGGAAAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
2217                              7.8e-09  112_[+3]_372
48984                             1.6e-07  119_[+3]_365
42559                             1.6e-07  396_[+3]_88
26157                             2.2e-07  408_[+3]_76
14856                             2.2e-07  475_[+3]_9
16262                             2.4e-07  142_[+3]_342
36161                             4.5e-07  34_[+3]_450
29885                             9.1e-07  88_[+3]_396
5873                              2.3e-06  157_[+3]_327
43827                               3e-06  93_[+3]_391
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=10
2217                     (  113) TTTGGGGAGATCGATT  1
48984                    (  120) GATGTAAAGATCGAAT  1
42559                    (  397) TTTTTGCAAATTGATT  1
26157                    (  409) TTTTTGGAGATTCATT  1
14856                    (  476) TTTGGACGGATCGATT  1
16262                    (  143) GGTGGGAGGATTGATT  1
36161                    (   35) GATGGGAGGATCGACT  1
29885                    (   89) TATTTGAAGATTGTGT  1
5873                     (  158) GTTTTGGAGATCGCTA  1
43827                    (   94) TATGTAAAAAATGAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 10.8645 E= 1.5e+002
  -997   -997     74    126
    65   -997   -125    100
  -997   -997   -997    200
  -997   -997    133     67
  -997   -997     74    126
    23   -997    155   -997
    97    -35     33   -997
   145   -997     33   -997
   -35   -997    174   -997
   197   -997   -997   -997
  -135   -997   -997    184
  -997     97   -997    100
  -997   -135    191   -997
   164   -135   -997   -132
   -35   -135   -125    126
  -135   -997   -997    184
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 1.5e+002
 0.000000  0.000000  0.400000  0.600000
 0.400000  0.000000  0.100000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.000000  0.400000  0.600000
 0.300000  0.000000  0.700000  0.000000
 0.500000  0.200000  0.300000  0.000000
 0.700000  0.000000  0.300000  0.000000
 0.200000  0.000000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.100000  0.000000  0.000000  0.900000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.100000  0.900000  0.000000
 0.800000  0.100000  0.000000  0.100000
 0.200000  0.100000  0.100000  0.600000
 0.100000  0.000000  0.000000  0.900000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TG][TA]T[GT][TG][GA][AGC][AG][GA]AT[CT]GA[TA]T
--------------------------------------------------------------------------------

Time 11.18 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42559                            6.41e-06  159_[+3(6.44e-06)]_51_\
    [+1(2.11e-06)]_150_[+3(1.60e-07)]_88
9283                             1.28e-07  82_[+2(1.68e-07)]_42_[+1(2.76e-08)]_\
    344
47555                            1.54e-03  467_[+1(1.12e-07)]_13
267                              1.06e-03  308_[+1(1.49e-07)]_172
14856                            9.02e-06  180_[+1(2.41e-06)]_275_\
    [+3(2.18e-07)]_9
29885                            4.50e-08  88_[+3(9.09e-07)]_91_[+2(5.39e-08)]_\
    173_[+1(2.92e-05)]_100
4127                             2.04e-03  22_[+1(7.12e-07)]_458
16262                            1.97e-03  142_[+3(2.43e-07)]_342
16357                            9.65e-04  126_[+1(4.38e-07)]_354
30807                            9.00e-07  75_[+1(5.07e-05)]_59_[+1(4.37e-11)]_\
    326
23871                            1.74e-03  389_[+1(6.58e-07)]_91
5873                             9.73e-03  157_[+3(2.27e-06)]_327
44351                            4.00e-01  500
26157                            1.40e-05  261_[+2(4.36e-06)]_135_\
    [+3(2.18e-07)]_76
2217                             2.73e-10  112_[+3(7.79e-09)]_127_\
    [+1(1.12e-07)]_90_[+2(6.53e-06)]_123
27282                            2.74e-07  163_[+2(1.68e-07)]_248_\
    [+1(3.10e-08)]_57
36161                            3.48e-03  34_[+3(4.54e-07)]_450
48984                            1.18e-08  87_[+1(5.32e-06)]_12_[+3(1.60e-07)]_\
    45_[+2(3.90e-07)]_308
43827                            6.30e-06  93_[+3(2.98e-06)]_204_\
    [+2(8.98e-07)]_175
43322                            5.33e-05  6_[+1(7.70e-07)]_172_[+2(2.71e-06)]_\
    290
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************