********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/297/297.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11034                    1.0000    500  14637                    1.0000    500
2235                     1.0000    500  23507                    1.0000    500
24413                    1.0000    500  262396                   1.0000    500
2735                     1.0000    500  3755                     1.0000    500
4907                     1.0000    500  5655                     1.0000    500
7479                     1.0000    500  7817                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/297/297.seqs.fa -oc motifs/297 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.223 G 0.254 T 0.261
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.223 G 0.254 T 0.261
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 20 sites = 8 llr = 129 E-value = 1.8e-002
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::3:85:51::311:1
pos.-specific     C  :9156:191:6:1a::5::9
probability       G  61843a3:1::56:a5:9a:
matrix            T  4:111:41:54:1::34:::

         bits    2.2              *
                 2.0      *       **   *
                 1.7      *       **   *
                 1.5  *   * *     **  ***
Relative         1.3  *   * *     **  ***
Entropy          1.1 **   * *  *  **  ***
(23.3 bits)      0.9 *** ** ***** **  ***
                 0.7 ****** ***** ** ****
                 0.4 ****** *************
                 0.2 ****** *************
                 0.0 --------------------

Multilevel           GCGCCGTCAACAGCGGCGGC
consensus            T  GG A  TTG   AT
sequence                   G        T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
23507                       102  5.51e-10 AAGGCGTTCA GCGGCGGCATTGGCGACGGC CAATGGTGCT
14637                        28  4.41e-09 ATGGACGTCC TCGTCGTCATCGGCGGAGGC GTTGCCATTG
7817                        289  5.55e-09 CACATCCTCT TCCCCGCCAACAGCGTCGGC AAACTGAGTC
5655                        177  5.55e-09 TTTGGTGTTG GCGCTGTCATCATCGGTGGC GGAGAGTACT
11034                       454  1.00e-08 AACCAACACG GCGCCGACCACAACGACGGC GAGACATCGC
4907                        174  6.18e-08 TTTGGAGGTA GGGGCGGCAATGGCGGTGGA TCCCAACGTA
3755                        140  9.93e-08 TCAAGACGGC GCTCGGACATCACCGGCAGC ATGATGTTTT
24413                       127  2.14e-07 CCCTCTTGCC TCGGGGTTGATGGCGTTGGC ACAATACTGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23507                             5.5e-10  101_[+1]_379
14637                             4.4e-09  27_[+1]_453
7817                              5.5e-09  288_[+1]_192
5655                              5.5e-09  176_[+1]_304
11034                               1e-08  453_[+1]_27
4907                              6.2e-08  173_[+1]_307
3755                              9.9e-08  139_[+1]_341
24413                             2.1e-07  126_[+1]_354
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=8
23507                    (  102) GCGGCGGCATTGGCGACGGC  1
14637                    (   28) TCGTCGTCATCGGCGGAGGC  1
7817                     (  289) TCCCCGCCAACAGCGTCGGC  1
5655                     (  177) GCGCTGTCATCATCGGTGGC  1
11034                    (  454) GCGCCGACCACAACGACGGC  1
4907                     (  174) GGGGCGGCAATGGCGGTGGA  1
3755                     (  140) GCTCGGACATCACCGGCAGC  1
24413                    (  127) TCGGGGTTGATGGCGTTGGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5772 bayes= 9.49285 E= 1.8e-002
  -965   -965    130     52
  -965    197   -102   -965
  -965    -83    156   -106
  -965    117     56   -106
  -965    149     -2   -106
  -965   -965    198   -965
    -7    -83     -2     52
  -965    197   -965   -106
   151    -83   -102   -965
    93   -965   -965     94
  -965    149   -965     52
    93   -965     98   -965
  -107    -83    130   -106
  -965    217   -965   -965
  -965   -965    198   -965
    -7   -965     98     -6
  -107    117   -965     52
  -107   -965    179   -965
  -965   -965    198   -965
  -107    197   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 1.8e-002
 0.000000  0.000000  0.625000  0.375000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.125000  0.750000  0.125000
 0.000000  0.500000  0.375000  0.125000
 0.000000  0.625000  0.250000  0.125000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.125000  0.250000  0.375000
 0.000000  0.875000  0.000000  0.125000
 0.750000  0.125000  0.125000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.625000  0.000000  0.375000
 0.500000  0.000000  0.500000  0.000000
 0.125000  0.125000  0.625000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.125000  0.500000  0.000000  0.375000
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.875000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[GT]CG[CG][CG]G[TAG]CA[AT][CT][AG]GCG[GAT][CT]GGC
--------------------------------------------------------------------------------

Time 1.33 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 11 llr = 115 E-value = 2.5e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3:a::9::431
pos.-specific     C  :67::a:57129
probability       G  5:::::11:51:
matrix            T  513:a::43:5:

         bits    2.2      *
                 2.0    ***
                 1.7    ***     *
                 1.5    ****    *
Relative         1.3   ***** *  *
Entropy          1.1   ***** *  *
(15.1 bits)      0.9 ******* *  *
                 0.7 ********** *
                 0.4 ********** *
                 0.2 ************
                 0.0 ------------

Multilevel           TCCATCACCGTC
consensus            GAT    TTAA
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
5655                        473  1.54e-07 ATCTGAATCA TCCATCACCGAC CATCATTATC
3755                        267  4.24e-07 TACAAACGCC GCCATCACCGCC AATTGCAAAT
14637                       162  6.90e-07 TGACGTCCTT TCCATCATCGAC GTTAGCAACG
24413                       415  1.83e-06 ATCGGTTCAT GACATCACCGAC GCGACGAGAA
2235                        489  2.73e-06 ACGCATTACA TACATCATCATC
262396                      486  4.77e-06 GAGATAACAG GCTATCACCACC ACA
4907                        489  1.15e-05 AATTGATGAT TCTATCATCGGC
2735                        317  1.51e-05 ACGAACTCAC TTTATCACCATC GTCACTCTCC
7817                        370  2.25e-05 TACATAAATA TCCATCATTGTA ACGCAAGATC
7479                        377  3.53e-05 ATACAATTGC GCCATCAGTCTC TTGAAAGCAA
11034                       476  3.90e-05 ACGACGGCGA GACATCGCTATC CTACTACTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5655                              1.5e-07  472_[+2]_16
3755                              4.2e-07  266_[+2]_222
14637                             6.9e-07  161_[+2]_327
24413                             1.8e-06  414_[+2]_74
2235                              2.7e-06  488_[+2]
262396                            4.8e-06  485_[+2]_3
4907                              1.1e-05  488_[+2]
2735                              1.5e-05  316_[+2]_172
7817                              2.3e-05  369_[+2]_119
7479                              3.5e-05  376_[+2]_112
11034                             3.9e-05  475_[+2]_13
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=11
5655                     (  473) TCCATCACCGAC  1
3755                     (  267) GCCATCACCGCC  1
14637                    (  162) TCCATCATCGAC  1
24413                    (  415) GACATCACCGAC  1
2235                     (  489) TACATCATCATC  1
262396                   (  486) GCTATCACCACC  1
4907                     (  489) TCTATCATCGGC  1
2735                     (  317) TTTATCACCATC  1
7817                     (  370) TCCATCATTGTA  1
7479                     (  377) GCCATCAGTCTC  1
11034                    (  476) GACATCGCTATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.4122 E= 2.5e+000
 -1010  -1010     84    106
     5    152  -1010   -152
 -1010    171  -1010      6
   193  -1010  -1010  -1010
 -1010  -1010  -1010    194
 -1010    217  -1010  -1010
   179  -1010   -148  -1010
 -1010    129   -148     48
 -1010    171  -1010      6
    47   -129    110  -1010
     5    -29   -148     80
  -153    203  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 2.5e+000
 0.000000  0.000000  0.454545  0.545455
 0.272727  0.636364  0.000000  0.090909
 0.000000  0.727273  0.000000  0.272727
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.909091  0.000000  0.090909  0.000000
 0.000000  0.545455  0.090909  0.363636
 0.000000  0.727273  0.000000  0.272727
 0.363636  0.090909  0.545455  0.000000
 0.272727  0.181818  0.090909  0.454545
 0.090909  0.909091  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[TG][CA][CT]ATCA[CT][CT][GA][TA]C
--------------------------------------------------------------------------------

Time 2.82 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 12 llr = 118 E-value = 2.4e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1:a::32:5:::
pos.-specific     C  33:22::4::21
probability       G  17::86:65:59
matrix            T  5::8:18::a3:

         bits    2.2
                 2.0   *      *
                 1.7   *      *
                 1.5   *      * *
Relative         1.3   *** *  * *
Entropy          1.1  **** ** * *
(14.2 bits)      0.9  **** **** *
                 0.7  ********* *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGATGGTGATGG
consensus            CC   A CG T
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
3755                        336  1.69e-07 CACATCGTTG TGATGGTGATGG ATTGAGCTCA
4907                        128  1.95e-06 AGGAGGCTGA CGATGATGATGG TGCACTCGTT
7479                        485  2.96e-06 TCGACGATGA CGATGATCGTGG CACG
2235                        133  2.96e-06 TAGGATTCGG TCATGATGATGG TGTTGTTGAG
11034                        25  4.06e-06 ACTAGAATGA TCATGATCATGG CGAAGAAGCT
5655                        245  1.03e-05 GATGTGAATA CGACGGTGGTTG AAAAGACGCA
23507                       172  1.45e-05 CTCTTGACTT TGACGGTGATCG ATTCGAACTC
7817                        469  1.93e-05 TGCGTGCGTA TGATGTTCGTTG GCATCCTGAC
14637                       313  1.93e-05 GAGTCTATCG TCATCGTCGTTG TTGTTGTTGT
2735                        245  5.11e-05 TGGAAGGTAG AGATGGAGGTTG TCAATGCAGT
262396                      249  5.11e-05 TCCACGAGAC GCATGGAGGTGG TTTGATATGA
24413                       106  1.05e-04 CGTTCGTCGT CGATCGTCATCC CCTCTTGCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3755                              1.7e-07  335_[+3]_153
4907                                2e-06  127_[+3]_361
7479                                3e-06  484_[+3]_4
2235                                3e-06  132_[+3]_356
11034                             4.1e-06  24_[+3]_464
5655                                1e-05  244_[+3]_244
23507                             1.4e-05  171_[+3]_317
7817                              1.9e-05  468_[+3]_20
14637                             1.9e-05  312_[+3]_176
2735                              5.1e-05  244_[+3]_244
262396                            5.1e-05  248_[+3]_240
24413                              0.0001  105_[+3]_383
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=12
3755                     (  336) TGATGGTGATGG  1
4907                     (  128) CGATGATGATGG  1
7479                     (  485) CGATGATCGTGG  1
2235                     (  133) TCATGATGATGG  1
11034                    (   25) TCATGATCATGG  1
5655                     (  245) CGACGGTGGTTG  1
23507                    (  172) TGACGGTGATCG  1
7817                     (  469) TGATGTTCGTTG  1
14637                    (  313) TCATCGTCGTTG  1
2735                     (  245) AGATGGAGGTTG  1
262396                   (  249) GCATGGAGGTGG  1
24413                    (  106) CGATCGTCATCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.37898 E= 2.4e+001
  -166     58   -160     94
 -1023     58    139  -1023
   193  -1023  -1023  -1023
 -1023    -42  -1023    167
 -1023    -42    172  -1023
    34  -1023    120   -164
   -66  -1023  -1023    167
 -1023     90    120  -1023
    93  -1023     98  -1023
 -1023  -1023  -1023    194
 -1023    -42     98     35
 -1023   -141    185  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 2.4e+001
 0.083333  0.333333  0.083333  0.500000
 0.000000  0.333333  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.166667  0.833333  0.000000
 0.333333  0.000000  0.583333  0.083333
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.416667  0.583333  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.500000  0.333333
 0.000000  0.083333  0.916667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[TC][GC]ATG[GA]T[GC][AG]T[GT]G
--------------------------------------------------------------------------------

Time 4.02 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11034                            4.98e-08  24_[+3(4.06e-06)]_417_\
    [+1(1.00e-08)]_2_[+2(3.90e-05)]_13
14637                            2.40e-09  27_[+1(4.41e-09)]_114_\
    [+2(6.90e-07)]_7_[+1(1.94e-05)]_10_[+3(4.29e-05)]_90_[+3(1.93e-05)]_176
2235                             8.23e-05  132_[+3(2.96e-06)]_260_\
    [+3(9.05e-05)]_72_[+2(2.73e-06)]
23507                            2.76e-07  101_[+1(5.51e-10)]_50_\
    [+3(1.45e-05)]_317
24413                            9.39e-07  126_[+1(2.14e-07)]_268_\
    [+2(1.83e-06)]_74
262396                           2.50e-03  248_[+3(5.11e-05)]_225_\
    [+2(4.77e-06)]_3
2735                             1.42e-03  244_[+3(5.11e-05)]_60_\
    [+2(1.51e-05)]_172
3755                             3.42e-10  139_[+1(9.93e-08)]_107_\
    [+2(4.24e-07)]_57_[+3(1.69e-07)]_153
4907                             4.41e-08  127_[+3(1.95e-06)]_34_\
    [+1(6.18e-08)]_295_[+2(1.15e-05)]
5655                             4.17e-10  176_[+1(5.55e-09)]_48_\
    [+3(1.03e-05)]_216_[+2(1.54e-07)]_16
7479                             7.32e-04  376_[+2(3.53e-05)]_96_\
    [+3(2.96e-06)]_4
7817                             7.27e-08  288_[+1(5.55e-09)]_61_\
    [+2(2.25e-05)]_87_[+3(1.93e-05)]_20
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************