********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/119/119.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
24353                    1.0000    500  3969                     1.0000    500
20779                    1.0000    500  20833                    1.0000    500
46539                    1.0000    500  48328                    1.0000    500
48611                    1.0000    500  48694                    1.0000    500
48784                    1.0000    500  51560                    1.0000    500
49764                    1.0000    500  6052                     1.0000    500
26921                    1.0000    500  45626                    1.0000    500
35623                    1.0000    500  45960                    1.0000    500
47378                    1.0000    500  44034                    1.0000    500
42787                    1.0000    500  44139                    1.0000    500
49248                    1.0000    500  43302                    1.0000    500
35262                    1.0000    500  48896                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/119/119.seqs.fa -oc motifs/119 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       24    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           12000    N=              24
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.257 C 0.241 G 0.220 T 0.282
Background letter frequencies (from dataset with add-one prior applied):
A 0.257 C 0.241 G 0.220 T 0.282
********************************************************************************


********************************************************************************
MOTIF 1 MEME    width = 16    sites = 10    llr = 140    E-value = 7.2e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::4564::9:91::a:
pos.-specific     C  35:112:3:a:::2:1
probability       G  :261::::1::9:::9
matrix            T  73:334a7::1:a8::

         bits    2.2
                 2.0          *    *
                 1.7       *  * ** **
                 1.5       * ***** **
Relative         1.3       * ***** **
Entropy          1.1 * *   **********
(20.2 bits)      0.9 * *   **********
                 0.7 * * * **********
                 0.4 *** ************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCGAAATTACAGTTAG
consensus            CTATTT C     C
sequence              G   C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             -----  ---------            ----------------
35262                       361  4.74e-09 TCGATCCAAG TCGAATTCACAGTTAG TTTGAAATAG
49248                       313  5.41e-09 TCGACGGCGA TCGATTTTACAGTTAG AGTCAATGAC
3969                        290  3.52e-08 CAATTGCAGT TTATATTTACAGTTAG GGAATACAAC
46539                       195  1.61e-07 CTGTTCACAG TCATAATCACAGTCAG CACGTCGTCG
24353                        73  2.00e-07 TTTAGATAAT TCGCCATTACAGTTAG TCTGGTCGTC
45626                       261  2.16e-07 ATCTACCGGC CGAATCTTACAGTTAG TCACTCTCAG
48694                       308  5.16e-07 TTGTGCGCCC CTGATATTACTGTTAG GGTTCCGTTT
20833                        51  5.51e-07 TTACAAGCAA TGATATTTACAATTAG CTTCGCACTT
48896                       296  6.79e-07 GGACGGGTAA CTGGAATCACAGTCAG AGTAATTGTT
42787                       388  8.33e-07 TTGTCATATT TCGAACTTGCAGTTAC ATAGCATTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35262                             4.7e-09  360_[+1]_124
49248                             5.4e-09  312_[+1]_172
3969                              3.5e-08  289_[+1]_195
46539                             1.6e-07  194_[+1]_290
24353                               2e-07  72_[+1]_412
45626                             2.2e-07  260_[+1]_224
48694                             5.2e-07  307_[+1]_177
20833                             5.5e-07  50_[+1]_434
48896                             6.8e-07  295_[+1]_189
42787                             8.3e-07  387_[+1]_97
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=10
35262                    (  361) TCGAATTCACAGTTAG  1
49248                    (  313) TCGATTTTACAGTTAG  1
3969                     (  290) TTATATTTACAGTTAG  1
46539                    (  195) TCATAATCACAGTCAG  1
24353                    (   73) TCGCCATTACAGTTAG  1
45626                    (  261) CGAATCTTACAGTTAG  1
48694                    (  308) CTGATATTACTGTTAG  1
20833                    (   51) TGATATTTACAATTAG  1
48896                    (  296) CTGGAATCACAGTCAG  1
42787                    (  388) TCGAACTTGCAGTTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 11640 bayes= 10.4354 E= 7.2e-003
  -997     32   -997    131
  -997    105    -14      9
    64   -997    145   -997
    96   -127   -114      9
   122   -127   -997      9
    64    -27   -997     51
  -997   -997   -997    183
  -997     32   -997    131
   181   -997   -114   -997
  -997    205   -997   -997
   181   -997   -997   -149
  -136   -997    203   -997
  -997   -997   -997    183
  -997    -27   -997    151
   196   -997   -997   -997
  -997   -127    203   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 7.2e-003
 0.000000  0.300000  0.000000  0.700000
 0.000000  0.500000  0.200000  0.300000
 0.400000  0.000000  0.600000  0.000000
 0.500000  0.100000  0.100000  0.300000
 0.600000  0.100000  0.000000  0.300000
 0.400000  0.200000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.300000  0.000000  0.700000
 0.900000  0.000000  0.100000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.100000  0.000000  0.900000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.000000  0.800000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.100000  0.900000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC][CTG][GA][AT][AT][ATC]T[TC]ACAGT[TC]AG
--------------------------------------------------------------------------------




Time  5.87 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME    width = 19    sites = 8    llr = 125    E-value = 7.7e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  64:a18:4::3:a::1:a1
pos.-specific     C  451:439:8:46:4:93:4
probability       G  :19:1:1::a31:56:8:5
matrix            T  ::::4::63:13:14::::

         bits    2.2          *
                 2.0    *     *  *    *
                 1.7    *     *  *    *
                 1.5   **  *  *  *  * *
Relative         1.3   **  *  *  *  ***
Entropy          1.1 * ** ** **  * ****
(22.5 bits)      0.9 * ** *****  * ****
                 0.7 **** ***** ********
                 0.4 **** ***** ********
                 0.2 *******************
                 0.0 -------------------

Multilevel           ACGACACTCGCCAGGCGAG
consensus            CA  TC AT AT CT C C
sequence                       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             -----  ---------            -------------------
48896                       112  2.47e-10 ACAACGATAC ACGACACACGACACGCGAG TAATTGTAGT
26921                       302  1.21e-09 GATCCGGCAG ACGACACTCGACACTCGAC ACTCGACACT
44139                        19  1.27e-08 GAATTCGCAC ACGAAACTTGCCAGTCGAC GATGAACCTT
46539                       156  3.74e-08 AACCGTGGAA AAGAGACACGGCAGGAGAC GCGAAGCCTT
44034                        75  5.26e-08 TCAAGCTGCA ACGATCCTCGTTAGGCCAG GTATTTCTCT
49764                        60  6.18e-08 GGCATCGATC CGGATACTCGCGATGCGAG GAAAAATTGA
24353                       400  1.55e-07 TCTGTTCACG CAGATACTTGGCAGTCCAA ACTCACGAAA
45626                       132  4.04e-07 CATTTTTCTC CACACCGACGCTACGCGAG AACGAGGGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48896                             2.5e-10  111_[+2]_370
26921                             1.2e-09  301_[+2]_180
44139                             1.3e-08  18_[+2]_463
46539                             3.7e-08  155_[+2]_326
44034                             5.3e-08  74_[+2]_407
49764                             6.2e-08  59_[+2]_422
24353                             1.6e-07  399_[+2]_82
45626                               4e-07  131_[+2]_350
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=8
48896                    (  112) ACGACACACGACACGCGAG  1
26921                    (  302) ACGACACTCGACACTCGAC  1
44139                    (   19) ACGAAACTTGCCAGTCGAC  1
46539                    (  156) AAGAGACACGGCAGGAGAC  1
44034                    (   75) ACGATCCTCGTTAGGCCAG  1
49764                    (   60) CGGATACTCGCGATGCGAG  1
24353                    (  400) CAGATACTTGGCAGTCCAA  1
45626                    (  132) CACACCGACGCTACGCGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 11568 bayes= 11.2342 E= 7.7e+000
   128     64   -965   -965
    54    105    -81   -965
  -965    -95    199   -965
   196   -965   -965   -965
  -104     64    -81     41
   154      5   -965   -965
  -965    186    -81   -965
    54   -965   -965    115
  -965    164   -965    -17
  -965   -965    218   -965
    -4     64     18   -117
  -965    137    -81    -17
   196   -965   -965   -965
  -965     64    118   -117
  -965   -965    150     41
  -104    186   -965   -965
  -965      5    177   -965
   196   -965   -965   -965
  -104     64    118   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 8 E= 7.7e+000
 0.625000  0.375000  0.000000  0.000000
 0.375000  0.500000  0.125000  0.000000
 0.000000  0.125000  0.875000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.375000  0.125000  0.375000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.375000  0.250000  0.125000
 0.000000  0.625000  0.125000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.375000  0.500000  0.125000
 0.000000  0.000000  0.625000  0.375000
 0.125000  0.875000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.375000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AC][CA]GA[CT][AC]C[TA][CT]G[CAG][CT]A[GC][GT]C[GC]A[GC]
--------------------------------------------------------------------------------




Time 11.77 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME    width = 12    sites = 10    llr = 116    E-value = 4.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :19:31::a7::
pos.-specific     C  :7:9:::9:::6
probability       G  :2:::9:1:211
matrix            T  a:117:a::193

         bits    2.2
                 2.0         *
                 1.7 *    ** *
                 1.5 * ** ****
Relative         1.3 * ** **** *
Entropy          1.1 * ******* *
(16.8 bits)      0.9 ***********
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCACTGTCAATC
consensus             G  A    G T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             -----  ---------            ------------
45626                       215  3.14e-07 ACCGTGGCGC TCACTGTCAGTC AGACCCTCCC
49248                       427  3.86e-07 TGCATCGAAC TGACTGTCAATC GACACTGTTA
47378                       284  4.71e-07 CCGGAGGTGT TCACAGTCAATT AACAAAGGCT
20779                       199  1.07e-06 CAGATTTCTC TCACTGTCATTC GTTGCAATTA
6052                        100  1.29e-06 GTGCTCAATG TCTCTGTCAATC TCGCGCGACA
48896                       321  2.05e-06 GAGTAATTGT TCACTGTCAAGT CGCCAACACA
44139                       394  3.56e-06 TGCAGATGGT TGACTGTGAATC GACTACCTAT
35623                        51  4.70e-06 CTAAGAATAG TAACTGTCAATG GGGATGACAA
42787                       281  5.90e-06 ACTGTAAGTA TCATAGTCAATT CGTAGCAGTA
35262                       463  7.18e-06 ATCAGTAACC TCACAATCAGTC ATACGCATAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45626                             3.1e-07  214_[+3]_274
49248                             3.9e-07  426_[+3]_62
47378                             4.7e-07  283_[+3]_205
20779                             1.1e-06  198_[+3]_290
6052                              1.3e-06  99_[+3]_389
48896                               2e-06  320_[+3]_168
44139                             3.6e-06  393_[+3]_95
35623                             4.7e-06  50_[+3]_438
42787                             5.9e-06  280_[+3]_208
35262                             7.2e-06  462_[+3]_26
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
45626                    (  215) TCACTGTCAGTC  1
49248                    (  427) TGACTGTCAATC  1
47378                    (  284) TCACAGTCAATT  1
20779                    (  199) TCACTGTCATTC  1
6052                     (  100) TCTCTGTCAATC  1
48896                    (  321) TCACTGTCAAGT  1
44139                    (  394) TGACTGTGAATC  1
35623                    (   51) TAACTGTCAATG  1
42787                    (  281) TCATAGTCAATT  1
35262                    (  463) TCACAATCAGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 11736 bayes= 10.4472 E= 4.4e+001
  -997   -997   -997    183
  -136    154    -14   -997
   181   -997   -997   -149
  -997    190   -997   -149
    22   -997   -997    131
  -136   -997    203   -997
  -997   -997   -997    183
  -997    190   -114   -997
   196   -997   -997   -997
   144   -997    -14   -149
  -997   -997   -114    167
  -997    131   -114      9
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 4.4e+001
 0.000000  0.000000  0.000000  1.000000
 0.100000  0.700000  0.200000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.000000  0.900000  0.000000  0.100000
 0.300000  0.000000  0.000000  0.700000
 0.100000  0.000000  0.900000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.900000  0.100000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.700000  0.000000  0.200000  0.100000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.600000  0.100000  0.300000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[CG]AC[TA]GTCA[AG]T[CT]
--------------------------------------------------------------------------------




Time 17.53 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24353                            4.41e-07  72_[+1(2.00e-07)]_311_\
    [+2(1.55e-07)]_82
3969                             7.34e-04  289_[+1(3.52e-08)]_195
20779                            2.55e-04  198_[+3(1.07e-06)]_29_\
    [+2(2.84e-05)]_242
20833                            3.43e-03  50_[+1(5.51e-07)]_434
46539                            1.88e-07  155_[+2(3.74e-08)]_20_\
    [+1(1.61e-07)]_290
48328                            8.00e-01  500
48611                            8.45e-01  500
48694                            3.16e-03  307_[+1(5.16e-07)]_177
48784                            1.26e-01  500
51560                            4.46e-01  500
49764                            3.46e-04  59_[+2(6.18e-08)]_422
6052                             7.67e-03  99_[+3(1.29e-06)]_389
26921                            4.94e-05  301_[+2(1.21e-09)]_180
45626                            1.18e-09  131_[+2(4.04e-07)]_64_\
    [+3(3.14e-07)]_34_[+1(2.16e-07)]_224
35623                            2.40e-02  50_[+3(4.70e-06)]_438
45960                            8.73e-01  500
47378                            3.76e-03  283_[+3(4.71e-07)]_205
44034                            1.13e-05  74_[+2(5.26e-08)]_365_\
    [+1(1.92e-05)]_26
42787                            1.02e-04  280_[+3(5.90e-06)]_95_\
    [+1(8.33e-07)]_97
44139                            1.49e-06  18_[+2(1.27e-08)]_356_\
    [+3(3.56e-06)]_95
49248                            8.84e-08  312_[+1(5.41e-09)]_98_\
    [+3(3.86e-07)]_62
43302                            1.87e-01  500
35262                            1.17e-06  360_[+1(4.74e-09)]_86_\
    [+3(7.18e-06)]_26
48896                            1.99e-11  111_[+2(2.47e-10)]_36_\
    [+1(8.99e-05)]_113_[+1(6.79e-07)]_9_[+3(2.05e-06)]_168
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************