********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/176/176.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42477                    1.0000    500  46411                    1.0000    500
36932                    1.0000    500  54956                    1.0000    500
41746                    1.0000    500  1341                     1.0000    500
43720                    1.0000    500  48946                    1.0000    500
48966                    1.0000    500  49071                    1.0000    500
49288                    1.0000    500  16140                    1.0000    500
50168                    1.0000    500  18872                    1.0000    500
19000                    1.0000    500  44770                    1.0000    500
44908                    1.0000    500  12533                    1.0000    500
36080                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/176/176.seqs.fa -oc motifs/176 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.270 C 0.240 G 0.226 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.270 C 0.240 G 0.226 T 0.263
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  12  sites =  19  llr = 176  E-value = 1.6e-004
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  31:a:51::882
pos.-specific     C  3:a:a31:a212
probability       G  31:::152::23
matrix            T  28:::238:::3

         bits    2.1   * *   *
                 1.9   ***   *
                 1.7   ***   *
                 1.5   ***   *
Relative         1.3   ***  ***
Entropy          1.1  ****  ****
(13.3 bits)      0.9  ****  ****
                 0.6  ****  ****
                 0.4  **** *****
                 0.2  **********
                 0.0 ------------

Multilevel           ATCACAGTCAAG
consensus            C    CT    T
sequence             G          A
                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
54956                        26  1.21e-07 GTGGTCATCT CTCACAGTCAAG TTAATCCTTC
36080                       386  2.59e-07 AGAGATAAAT GTCACAGTCAAT AATGTAAACT
43720                       101  3.32e-07 AAGTTTCATG CTCACAGTCAAT TTGGATTCTG
18872                       212  1.28e-06 GAAGAAGAGA ATCACAGTCAAA ATGCATCAGG
19000                       185  2.28e-06 AGGAAGGAAT CTCACTGTCAAG TATTGAGGAA
44908                       480  4.65e-06 TTTTCCGGCA CTCACATTCAAA GAATGATAT
42477                       367  9.21e-06 CCAAAAAGCC ATCACTGTCAAC GCGACCCCTT
44770                       297  1.05e-05 CGTTGATTGG TGCACAGTCAAG AAGGACGAAA
48946                       352  2.91e-05 GTTGCCGTCC GTCACGATCAAG GACGAAACAC
46411                        16  3.18e-05 GACAATGATC CTCACGGTCCAT ATGGAGACGT
49071                       478  3.48e-05 CTTCCCTTTT TTCACCTGCAAG GCGTACAGAC
16140                       437  5.21e-05 CGACGATCCC GTCACCGTCCGT GACTGTCGAC
49288                       261  5.21e-05 GAGGGAGGGA AACACCTTCAAT TTTGCAGAGC
50168                       438  5.65e-05 ACAACACGGA ATCACCTTCAGC CCAGAAAAAA
41746                       159  6.13e-05 AGTCGAGCGA GTCACACGCAAA GTTGAACTTG
1341                        427  7.07e-05 GTTGAGAACG TTCACATTCACT AAACCACCTA
36932                       154  7.54e-05 ATGCATTTTG GTCACCGTCCGA CTCCTGTTTC
12533                       258  1.76e-04 AAGCAACGAA AACACAAGCAAG AGTCTGTGTT
48966                       314  2.01e-04 CACCAACTTC TGCACTCTCAAC CCTGCCACTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54956                             1.2e-07  25_[+1]_463
36080                             2.6e-07  385_[+1]_103
43720                             3.3e-07  100_[+1]_388
18872                             1.3e-06  211_[+1]_277
19000                             2.3e-06  184_[+1]_304
44908                             4.7e-06  479_[+1]_9
42477                             9.2e-06  366_[+1]_122
44770                             1.1e-05  296_[+1]_192
48946                             2.9e-05  351_[+1]_137
46411                             3.2e-05  15_[+1]_473
49071                             3.5e-05  477_[+1]_11
16140                             5.2e-05  436_[+1]_52
49288                             5.2e-05  260_[+1]_228
50168                             5.6e-05  437_[+1]_51
41746                             6.1e-05  158_[+1]_330
1341                              7.1e-05  426_[+1]_62
36932                             7.5e-05  153_[+1]_335
12533                             0.00018  257_[+1]_231
48966                              0.0002  313_[+1]_175
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=19
54956                    (   26) CTCACAGTCAAG  1
36080                    (  386) GTCACAGTCAAT  1
43720                    (  101) CTCACAGTCAAT  1
18872                    (  212) ATCACAGTCAAA  1
19000                    (  185) CTCACTGTCAAG  1
44908                    (  480) CTCACATTCAAA  1
42477                    (  367) ATCACTGTCAAC  1
44770                    (  297) TGCACAGTCAAG  1
48946                    (  352) GTCACGATCAAG  1
46411                    (   16) CTCACGGTCCAT  1
49071                    (  478) TTCACCTGCAAG  1
16140                    (  437) GTCACCGTCCGT  1
49288                    (  261) AACACCTTCAAT  1
50168                    (  438) ATCACCTTCAGC  1
41746                    (  159) GTCACACGCAAA  1
1341                     (  427) TTCACATTCACT  1
36932                    (  154) GTCACCGTCCGA  1
12533                    (  258) AACACAAGCAAG  1
48966                    (  314) TGCACTCTCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 9.12593 E= 1.6e-004
    -4     13     22    -32
  -136  -1089   -110    158
 -1089    206  -1089  -1089
   189  -1089  -1089  -1089
 -1089    206  -1089  -1089
    81     13   -110    -74
  -136   -119    122      0
 -1089  -1089    -52    168
 -1089    206  -1089  -1089
   164    -60  -1089  -1089
   155   -219    -52  -1089
   -36    -60     48     26
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 19 E= 1.6e-004
 0.263158  0.263158  0.263158  0.210526
 0.105263  0.000000  0.105263  0.789474
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.473684  0.263158  0.105263  0.157895
 0.105263  0.105263  0.526316  0.263158
 0.000000  0.000000  0.157895  0.842105
 0.000000  1.000000  0.000000  0.000000
 0.842105  0.157895  0.000000  0.000000
 0.789474  0.052632  0.157895  0.000000
 0.210526  0.157895  0.315789  0.315789
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[ACGT]TCAC[AC][GT]TCAA[GTA]
--------------------------------------------------------------------------------




Time  3.06 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  18  sites =  17  llr = 188  E-value = 8.6e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :8472135:816442:27
pos.-specific     C  ::6::4:22:81:42582
probability       G  92:261728113::11:1
matrix            T  11:125:1:1::6355::

         bits    2.1
                 1.9
                 1.7
                 1.5 *       *
Relative         1.3 *     * * *     *
Entropy          1.1 * *   * ***     *
(15.9 bits)      0.9 ***** * *****   **
                 0.6 ***** * *****  ***
                 0.4 ******* ****** ***
                 0.2 ******************
                 0.0 ------------------

Multilevel           GACAGTGAGACATATCCA
consensus              A  CAC   GACCT
sequence                          T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ------------------
41746                       121  1.93e-09 TTGATTGACA GACAGGGAGACATCTCCA TATAATTTCG
19000                       328  9.54e-08 AGGTAGTAGT GACTGTGAGACAACATCA AGCAACGACA
42477                       179  2.68e-07 GACCCGGATA GAAAGTACGACATTTTCC ACAGCATTGG
54956                       407  3.41e-07 TCCGAAGCAA GAAGGCGTGACATCTCCA CTCTTTGGAA
49071                       441  6.73e-07 TCGTTTGATT GACAGTGAGTAAAATTCA GAGCACTGTC
12533                       139  9.28e-07 GGCGTTTCCA GAAAGTGAGACGATTTAC AGTTAGGTTG
46411                       202  1.26e-06 ACACACCACG GACATTGGGACATTACAA CCCGTCTCCC
44908                        87  1.40e-06 GGAGACTCCG GAATGAGAGACATCCTCA CCTAGAGTAT
36080                       116  2.49e-06 TTTCTGAAAG GACAGCGGCACAAACGCA TTACTACATT
44770                        46  2.73e-06 CCCGAAGCAA GAAAATAACACGTACCCA AACCCTTGTC
43720                        72  4.64e-06 GTGATTTGTC GAAATGACGACAATCTCA AAAGTTTCAT
48946                       318  5.97e-06 TGGCGCGAAC GTCAACACGACATCACCA TATCCGGTTG
16140                        83  8.24e-06 ACTCCTTCTG GACGGTGAGTCCAAGCCA AAGTGTCCAT
49288                       170  1.98e-05 CAGGTTCATA TGCAGCGCCACGTCTTCG TAGAGTGTAG
36932                       126  2.43e-05 GGGGGGGGGG GGCATTGGGAAGAATTCG ATGCATTTTG
18872                        19  2.76e-05 GATAGTGAAA GGCAGCGAGGGGTTGCCA TGCGAGGCCG
48966                       417  3.55e-05 CTTGATCGAA TACGACAAGACATATCAC ACATTCTCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41746                             1.9e-09  120_[+2]_362
19000                             9.5e-08  327_[+2]_155
42477                             2.7e-07  178_[+2]_304
54956                             3.4e-07  406_[+2]_76
49071                             6.7e-07  440_[+2]_42
12533                             9.3e-07  138_[+2]_344
46411                             1.3e-06  201_[+2]_281
44908                             1.4e-06  86_[+2]_396
36080                             2.5e-06  115_[+2]_367
44770                             2.7e-06  45_[+2]_437
43720                             4.6e-06  71_[+2]_411
48946                               6e-06  317_[+2]_165
16140                             8.2e-06  82_[+2]_400
49288                               2e-05  169_[+2]_313
36932                             2.4e-05  125_[+2]_357
18872                             2.8e-05  18_[+2]_464
48966                             3.6e-05  416_[+2]_66
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=17
41746                    (  121) GACAGGGAGACATCTCCA  1
19000                    (  328) GACTGTGAGACAACATCA  1
42477                    (  179) GAAAGTACGACATTTTCC  1
54956                    (  407) GAAGGCGTGACATCTCCA  1
49071                    (  441) GACAGTGAGTAAAATTCA  1
12533                    (  139) GAAAGTGAGACGATTTAC  1
46411                    (  202) GACATTGGGACATTACAA  1
44908                    (   87) GAATGAGAGACATCCTCA  1
36080                    (  116) GACAGCGGCACAAACGCA  1
44770                    (   46) GAAAATAACACGTACCCA  1
43720                    (   72) GAAATGACGACAATCTCA  1
48946                    (  318) GTCAACACGACATCACCA  1
16140                    (   83) GACGGTGAGTCCAAGCCA  1
49288                    (  170) TGCAGCGCCACGTCTTCG  1
36932                    (  126) GGCATTGGGAAGAATTCG  1
18872                    (   19) GGCAGCGAGGGGTTGCCA  1
48966                    (  417) TACGACAAGACATATCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 9177 bayes= 9.14334 E= 8.6e-001
 -1073  -1073    196   -116
   150  -1073    -36   -216
    39    143  -1073  -1073
   138  -1073    -36   -116
   -61  -1073    151    -58
  -220     56    -94     84
    12  -1073    164  -1073
    97     -3    -36   -216
 -1073    -44    186  -1073
   161  -1073   -194   -116
  -120    178   -194  -1073
   126   -203     38  -1073
    61  -1073  -1073    116
    39     56  -1073     16
   -61     -3    -94     84
 -1073     97   -194     84
   -61    178  -1073  -1073
   138    -44    -94  -1073
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 17 E= 8.6e-001
 0.000000  0.000000  0.882353  0.117647
 0.764706  0.000000  0.176471  0.058824
 0.352941  0.647059  0.000000  0.000000
 0.705882  0.000000  0.176471  0.117647
 0.176471  0.000000  0.647059  0.176471
 0.058824  0.352941  0.117647  0.470588
 0.294118  0.000000  0.705882  0.000000
 0.529412  0.235294  0.176471  0.058824
 0.000000  0.176471  0.823529  0.000000
 0.823529  0.000000  0.058824  0.117647
 0.117647  0.823529  0.058824  0.000000
 0.647059  0.058824  0.294118  0.000000
 0.411765  0.000000  0.000000  0.588235
 0.352941  0.352941  0.000000  0.294118
 0.176471  0.235294  0.117647  0.470588
 0.000000  0.470588  0.058824  0.470588
 0.176471  0.823529  0.000000  0.000000
 0.705882  0.176471  0.117647  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
GA[CA]AG[TC][GA][AC]GAC[AG][TA][ACT][TC][CT]CA
--------------------------------------------------------------------------------




Time  6.09 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =   8  llr = 101  E-value = 3.0e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :66:3::a:::a
pos.-specific     C  1:3::a9::9::
probability       G  931:8:::41a:
matrix            T  :1:a::1:6:::

         bits    2.1      *    *
                 1.9    * * *  **
                 1.7    * * *  **
                 1.5 *  * *** ***
Relative         1.3 *  ***** ***
Entropy          1.1 *  *********
(18.2 bits)      0.9 *  *********
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GAATGCCATCGA
consensus             GC A   G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
36080                        36  5.87e-08 TAAGAACGTC GAATGCCATCGA GTCGGCTCAT
46411                       346  4.16e-07 GACAGGGCAA GAGTGCCATCGA ATCCATCGTT
18872                       178  5.17e-07 AAAGTGATGA GGCTGCCATCGA TTTGGTTTGA
16140                       128  5.77e-07 TTCGTTGATT GAATACCAGCGA GTGCATTTTG
36932                       255  9.47e-07 CACGGATCGA GGATACCATCGA TCTAAACGGT
12533                       313  1.10e-06 GGAATGAGAC GAATGCCAGGGA AAATAGGAGC
50168                        20  2.25e-06 AAATTCCACT CACTGCCAGCGA TGTTACAGTT
49288                        53  3.34e-06 CCCGCTGATA GTATGCTATCGA ATCGTTATTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36080                             5.9e-08  35_[+3]_453
46411                             4.2e-07  345_[+3]_143
18872                             5.2e-07  177_[+3]_311
16140                             5.8e-07  127_[+3]_361
36932                             9.5e-07  254_[+3]_234
12533                             1.1e-06  312_[+3]_176
50168                             2.3e-06  19_[+3]_469
49288                             3.3e-06  52_[+3]_436
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
36080                    (   36) GAATGCCATCGA  1
46411                    (  346) GAGTGCCATCGA  1
18872                    (  178) GGCTGCCATCGA  1
16140                    (  128) GAATACCAGCGA  1
36932                    (  255) GGATACCATCGA  1
12533                    (  313) GAATGCCAGGGA  1
50168                    (   20) CACTGCCAGCGA  1
49288                    (   53) GTATGCTATCGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 10.1804 E= 3.0e+000
  -965    -94    195   -965
   121   -965     14   -107
   121      6    -86   -965
  -965   -965   -965    192
   -11   -965    173   -965
  -965    206   -965   -965
  -965    187   -965   -107
   189   -965   -965   -965
  -965   -965     73    125
  -965    187    -86   -965
  -965   -965    214   -965
   189   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 3.0e+000
 0.000000  0.125000  0.875000  0.000000
 0.625000  0.000000  0.250000  0.125000
 0.625000  0.250000  0.125000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.875000  0.000000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.375000  0.625000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
G[AG][AC]T[GA]CCA[TG]CGA
--------------------------------------------------------------------------------




Time  9.07 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42477                            5.57e-05  178_[+2(2.68e-07)]_170_\
    [+1(9.21e-06)]_122
46411                            4.27e-07  15_[+1(3.18e-05)]_174_\
    [+2(1.26e-06)]_126_[+3(4.16e-07)]_143
36932                            2.65e-05  125_[+2(2.43e-05)]_10_\
    [+1(7.54e-05)]_89_[+3(9.47e-07)]_234
54956                            1.17e-06  25_[+1(1.21e-07)]_369_\
    [+2(3.41e-07)]_76
41746                            1.22e-06  76_[+2(6.68e-05)]_26_[+2(1.93e-09)]_\
    20_[+1(6.13e-05)]_330
1341                             1.53e-01  426_[+1(7.07e-05)]_62
43720                            2.48e-05  71_[+2(4.64e-06)]_11_[+1(3.32e-07)]_\
    388
48946                            1.21e-03  317_[+2(5.97e-06)]_16_\
    [+1(2.91e-05)]_137
48966                            1.96e-02  416_[+2(3.55e-05)]_66
49071                            3.72e-04  440_[+2(6.73e-07)]_19_\
    [+1(3.48e-05)]_11
49288                            4.88e-05  52_[+3(3.34e-06)]_105_\
    [+2(1.98e-05)]_73_[+1(5.21e-05)]_228
16140                            4.79e-06  82_[+2(8.24e-06)]_27_[+3(5.77e-07)]_\
    297_[+1(5.21e-05)]_52
50168                            1.57e-03  19_[+3(2.25e-06)]_406_\
    [+1(5.65e-05)]_51
18872                            4.63e-07  18_[+2(2.76e-05)]_141_\
    [+3(5.17e-07)]_22_[+1(1.28e-06)]_277
19000                            6.11e-06  184_[+1(2.28e-06)]_131_\
    [+2(9.54e-08)]_155
44770                            4.78e-04  45_[+2(2.73e-06)]_233_\
    [+1(1.05e-05)]_192
44908                            6.35e-05  86_[+2(1.40e-06)]_375_\
    [+1(4.65e-06)]_9
12533                            3.49e-06  138_[+2(9.28e-07)]_156_\
    [+3(1.10e-06)]_176
36080                            1.62e-09  35_[+3(5.87e-08)]_68_[+2(2.49e-06)]_\
    230_[+1(4.65e-06)]_10_[+1(2.59e-07)]_103
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
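
Reading the block diagrams: in each MOTIF DIAGRAM entry, a bare number is a run of unmatched bases and [+k(p)] marks an occurrence of motif k on the + strand with position p-value p, so the 1-based start of a site is the number of bases before it plus one. The sketch below is not part of the MEME output or the MEME software; parse_diagram, MOTIF_WIDTHS and TOKEN are illustrative names chosen here, and the motif widths are copied from the MOTIF header lines of this report. It assumes that a trailing backslash only marks a wrapped line.

```python
import re

# Motif widths copied from the MOTIF header lines of this report (assumption:
# the diagram being parsed comes from the same run).
MOTIF_WIDTHS = {1: 12, 2: 18, 3: 12}

# A token is either a motif occurrence such as [+2(2.68e-07)] or [+1],
# or a bare spacer length such as 178.
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]*)\))?\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for one block diagram string.

    sites is a list of (motif_id, strand, start, p_value) tuples with 1-based
    starts; p_value is None when the diagram carries no per-site p-value.
    """
    # Wrapped diagrams end a line with a backslash; join the pieces first.
    text = re.sub(r"\\\s*", "", diagram).strip()
    sites = []
    pos = 0  # bases consumed so far
    for part in text.split("_"):
        m = TOKEN.fullmatch(part)
        if m is None:
            raise ValueError(f"unrecognised token: {part!r}")
        if m.group(4) is not None:
            pos += int(m.group(4))  # spacer
        else:
            motif = int(m.group(2))
            p_value = float(m.group(3)) if m.group(3) else None
            sites.append((motif, m.group(1), pos + 1, p_value))
            pos += widths[motif]
    return sites, pos


if __name__ == "__main__":
    # Combined diagram for sequence 42477, including the line wrap.
    sites, total = parse_diagram(r"178_[+2(2.68e-07)]_170_\ [+1(9.21e-06)]_122")
    for motif, strand, start, p_value in sites:
        print(motif, strand, start, p_value)
    print("total length:", total)
```

For sequence 42477 this gives starts 179 (motif 2) and 367 (motif 1), matching the Start column of the per-motif site tables, and a total of 500, matching the Length column of the training set.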