********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/423/423.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10405                    1.0000    500  20884                    1.0000    500
21090                    1.0000    500  21133                    1.0000    500
21224                    1.0000    500  24029                    1.0000    500
24031                    1.0000    500  261023                   1.0000    500
262880                   1.0000    500  264883                   1.0000    500
268011                   1.0000    500  269151                   1.0000    500
3034                     1.0000    500  4237                     1.0000    500
7553                     1.0000    500  8309                     1.0000    500
8707                     1.0000    500  bd1959                   1.0000    500
bd721                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/423/423.seqs.fa -oc motifs/423 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.212 G 0.254 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.212 G 0.254 T 0.265
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  19  llr = 178  E-value = 4.6e-004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  282:75357919
pos.-specific     C  8149:2652:91
probability       G  ::3:24::::::
matrix            T  :1111:1:11::

         bits    [per-position information content plot, bits 0.0 to 2.2]
Relative Entropy (13.5 bits)

Multilevel           CACCAACAAACA
consensus            A G GGAC
sequence               A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                Site
-------------            -----  ---------           ------------
20884                      288  3.36e-08  TTTGGTGAAG CACCAACCAACA GAACACAATC
bd721                      393  2.40e-07  CAGCTGACCA CAGCAACAAACA GACGAACACA
24029                      419  9.16e-07  AGCCACCACG CACCGACAAACA ATCCTAGCGC
21133                       17  1.41e-06  CAGACGAGCG CAGCAGAAAACA ACGAATGGGA
10405                      419  1.59e-06  ATTTGCAGCT CATCAACAAACA CAGCAATGCG
4237                       162  1.83e-06  GATTAGATGT CAGCGGCCAACA CTAAACATCA
24031                      465  6.01e-06  AAAGGCATAC CAACACACAACA CATCTCACAA
264883                     483  1.25e-05  TCTCCGTACC CACCAGACATCA CCACTG
262880                      76  1.38e-05  GGGAATGCAG CTGCAGCAAACA TGTAACATGT
bd1959                     487  2.52e-05  CCGAGTTCAT AACCAACAAAAA AA
21090                      235  2.97e-05  TCAGCCATCC AACCAACCAACC AAGAGAGTGG
269151                     459  3.48e-05  CAACGTTTGA CAACGAAACACA CCAAGATTCA
8309                       228  4.10e-05  GAATGTAATG CAGTAGACAACA GCGTAGAGTG
268011                     222  4.42e-05  CTGCCGGCGA CCCCTCCCAACA ACGACGGCAT
21224                      455  5.91e-05  AATCATCAAT CACCGACACAAA TCTTCGCAGC
261023                     172  6.36e-05  TGCGATTGTA CAACAGTACACA TTGGATTGTG
7553                         1  1.02e-04           . CAGCAGAATTCA GTGAGTGTAC
3034                       175  1.15e-04  TGTACAAGTG AATCTCCCAACA TTCGTGGCGA
8707                       266  1.38e-04  CACTTACCTT ACACAACCTACA TAACTCTTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20884                             3.4e-08  287_[+1]_201
bd721                             2.4e-07  392_[+1]_96
24029                             9.2e-07  418_[+1]_70
21133                             1.4e-06  16_[+1]_472
10405                             1.6e-06  418_[+1]_70
4237                              1.8e-06  161_[+1]_327
24031                               6e-06  464_[+1]_24
264883                            1.3e-05  482_[+1]_6
262880                            1.4e-05  75_[+1]_413
bd1959                            2.5e-05  486_[+1]_2
21090                               3e-05  234_[+1]_254
269151                            3.5e-05  458_[+1]_30
8309                              4.1e-05  227_[+1]_261
268011                            4.4e-05  221_[+1]_267
21224                             5.9e-05  454_[+1]_34
261023                            6.4e-05  171_[+1]_317
7553                               0.0001  [+1]_488
3034                              0.00012  174_[+1]_314
8707                              0.00014  265_[+1]_223
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=19
20884                    (  288) CACCAACCAACA  1
bd721                    (  393) CAGCAACAAACA  1
24029                    (  419) CACCGACAAACA  1
21133                    (   17) CAGCAGAAAACA  1
10405                    (  419) CATCAACAAACA  1
4237                     (  162) CAGCGGCCAACA  1
24031                    (  465) CAACACACAACA  1
264883                   (  483) CACCAGACATCA  1
262880                   (   76) CTGCAGCAAACA  1
bd1959                   (  487) AACCAACAAAAA  1
21090                    (  235) AACCAACCAACC  1
269151                   (  459) CAACGAAACACA  1
8309                     (  228) CAGTAGACAACA  1
268011                   (  222) CCCCTCCCAACA  1
21224                    (  455) CACCGACACAAA  1
261023                   (  172) CAACAGTACACA  1
7553                     (    1) CAGCAGAATTCA  1
3034                     (  175) AATCTCCCAACA  1
8707                     (  266) ACACAACCTACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 9.12593 E= 4.6e-004
   -35    189  -1089  -1089
   165   -101  -1089   -233
   -35     80     31   -133
 -1089    216  -1089   -233
   135  -1089    -27   -133
    82    -43     54  -1089
    23    157  -1089   -233
    97    116  -1089  -1089
   146    -43  -1089   -133
   174  -1089  -1089   -133
  -135    208  -1089  -1089
   182   -201  -1089  -1089
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 19 E= 4.6e-004
 0.210526  0.789474  0.000000  0.000000
 0.842105  0.105263  0.000000  0.052632
 0.210526  0.368421  0.315789  0.105263
 0.000000  0.947368  0.000000  0.052632
 0.684211  0.000000  0.210526  0.105263
 0.473684  0.157895  0.368421  0.000000
 0.315789  0.631579  0.000000  0.052632
 0.526316  0.473684  0.000000  0.000000
 0.736842  0.157895  0.000000  0.105263
 0.894737  0.000000  0.000000  0.105263
 0.105263  0.894737  0.000000  0.000000
 0.947368  0.052632  0.000000  0.000000
--------------------------------------------------------------------------------
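
The log-odds and letter-probability matrices reported for Motif 1 look related by score = 100 x log2(p / background), using the background frequencies from the command line summary. The Python sketch below checks that relationship under this assumption; `log_odds_row` and `BACKGROUND` are hypothetical names, not part of MEME. MEME also applies a prior, so zero probabilities appear as large negative scores (-1089 here) instead of being undefined, and individual entries may differ from this approximation by a unit because the printed frequencies are rounded.

```python
import math

# Background frequencies as printed in the command line summary (rounded).
BACKGROUND = {"A": 0.268, "C": 0.212, "G": 0.254, "T": 0.265}


def log_odds_row(probs, background=BACKGROUND, scale=100):
    """Approximate one log-odds row from one letter-probability row.

    Assumes score = scale * log2(p / background). Zero probabilities are
    returned as None because this sketch does not model MEME's prior.
    """
    scores = {}
    for letter, p in probs.items():
        if p == 0.0:
            scores[letter] = None
        else:
            scores[letter] = round(scale * math.log2(p / background[letter]))
    return scores


# First and second rows of the Motif 1 letter-probability matrix.
row1 = {"A": 0.210526, "C": 0.789474, "G": 0.0, "T": 0.0}
row2 = {"A": 0.842105, "C": 0.105263, "G": 0.0, "T": 0.052632}

print(log_odds_row(row1))
print(log_odds_row(row2))
```

Comparing the printed values with the first two rows of the log-odds matrix (-35 189 and 165 -101 ... -233) gives a quick sanity check that the two matrices describe the same motif.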
--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CA]A[CGA]C[AG][AG][CA][AC]AACA
--------------------------------------------------------------------------------

Time  3.47 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   8  llr = 135  E-value = 1.3e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :1:1::3:4::::51:1111:
pos.-specific     C  a39335:8145:a138:6:1a
probability       G  :::6::1:45:1:::3:::::
matrix            T  :61:85631159:46:9398:

         bits    [per-position information content plot, bits 0.0 to 2.2]
Relative Entropy (24.3 bits)

Multilevel           CTCGTCTCAGCTCATCTCTTC
consensus             C CCTATGCT  TCG T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                    Site
-------------            -----  ---------           ---------------------
269151                     290  1.60e-11  TCTGTCTCGT CTCGTCTCGTCTCATCTCTTC TCATTTTTCG
262880                     474  1.40e-09  TTCTAGTGTT CTCACCACACCTCTTCTCTTC GAATAC
264883                     444  1.03e-08  GGGACAAGTC CCCCCTTCACTTCCTCTCTCC CTTCGTTCTC
10405                       87  1.22e-08  CAGTTCCTAT CTCGTTATACCTCATGTCTAC TACTTCACAG
8309                       335  1.54e-08  CACAAGACGT CTCGTTTCCGTTCAACATTTC GCTAAGAGAC
4237                       112  1.67e-08  ACAATGCTTT CACGTCTCGGTTCTCGTCATC TTCAACACTT
3034                       301  3.45e-08  ATATTATTTT CCCCTCGTTGCTCTTCTTTTC TTTCTTTCCC
24029                      457  3.45e-08  TTCGACAAAC CTTGTTTCGGTGCACCTATTC GCGGTGACAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
269151                            1.6e-11  289_[+2]_190
262880                            1.4e-09  473_[+2]_6
264883                              1e-08  443_[+2]_36
10405                             1.2e-08  86_[+2]_393
8309                              1.5e-08  334_[+2]_145
4237                              1.7e-08  111_[+2]_368
3034                              3.4e-08  300_[+2]_179
24029                             3.4e-08  456_[+2]_23
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
269151                   (  290) CTCGTCTCGTCTCATCTCTTC  1
262880                   (  474) CTCACCACACCTCTTCTCTTC  1
264883                   (  444) CCCCCTTCACTTCCTCTCTCC  1
10405                    (   87) CTCGTTATACCTCATGTCTAC  1
8309                     (  335) CTCGTTTCCGTTCAACATTTC  1
4237                     (  112) CACGTCTCGGTTCTCGTCATC  1
3034                     (  301) CCCCTCGTTGCTCTTCTTTTC  1
24029                    (  457) CTTGTTTCGGTGCACCTATTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 10.1536 E= 1.3e-001
  -965    223   -965   -965
  -110     24   -965    123
  -965    204   -965   -108
  -110     24    130   -965
  -965     24   -965    150
  -965    124   -965     91
   -10   -965   -102    123
  -965    182   -965     -9
    48    -76     56   -108
  -965     82     98   -108
  -965    124   -965     91
  -965   -965   -102    172
  -965    223   -965   -965
    90    -76   -965     50
  -110     24   -965    123
  -965    182     -2   -965
  -110   -965   -965    172
  -110    156   -965     -9
  -110   -965   -965    172
  -110    -76   -965    150
  -965    223   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.3e-001
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.250000  0.000000  0.625000
 0.000000  0.875000  0.000000  0.125000
 0.125000  0.250000  0.625000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.500000  0.000000  0.500000
 0.250000  0.000000  0.125000  0.625000
 0.000000  0.750000  0.000000  0.250000
 0.375000  0.125000  0.375000  0.125000
 0.000000  0.375000  0.500000  0.125000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  0.125000  0.875000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.125000  0.000000  0.375000
 0.125000  0.250000  0.000000  0.625000
 0.000000  0.750000  0.250000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.125000  0.625000  0.000000  0.250000
 0.125000  0.000000  0.000000  0.875000
 0.125000  0.125000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[TC]C[GC][TC][CT][TA][CT][AG][GC][CT]TC[AT][TC][CG]T[CT]TTC
--------------------------------------------------------------------------------

Time  7.19 secs.

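
The regular expressions reported for Motifs 1 and 2 keep only the more frequent letters at each position, so they can reject sites that the probabilistic model accepts. The Python sketch below counts how many of the reported sites each expression matches exactly; `exact_matches` and the constant names are hypothetical helpers, and the site strings are copied from the BLOCKS sections above.

```python
import re

MOTIF1_REGEX = "[CA]A[CGA]C[AG][AG][CA][AC]AACA"
MOTIF2_REGEX = "C[TC]C[GC][TC][CT][TA][CT][AG][GC][CT]TC[AT][TC][CG]T[CT]TTC"

# Sites as listed in the "Motif 1 in BLOCKS format" section.
MOTIF1_SITES = [
    "CACCAACCAACA", "CAGCAACAAACA", "CACCGACAAACA", "CAGCAGAAAACA",
    "CATCAACAAACA", "CAGCGGCCAACA", "CAACACACAACA", "CACCAGACATCA",
    "CTGCAGCAAACA", "AACCAACAAAAA", "AACCAACCAACC", "CAACGAAACACA",
    "CAGTAGACAACA", "CCCCTCCCAACA", "CACCGACACAAA", "CAACAGTACACA",
    "CAGCAGAATTCA", "AATCTCCCAACA", "ACACAACCTACA",
]

# Sites as listed in the "Motif 2 in BLOCKS format" section.
MOTIF2_SITES = [
    "CTCGTCTCGTCTCATCTCTTC", "CTCACCACACCTCTTCTCTTC",
    "CCCCCTTCACTTCCTCTCTCC", "CTCGTTATACCTCATGTCTAC",
    "CTCGTTTCCGTTCAACATTTC", "CACGTCTCGGTTCTCGTCATC",
    "CCCCTCGTTGCTCTTCTTTTC", "CTTGTTTCGGTGCACCTATTC",
]


def exact_matches(pattern, sites):
    """Return the sites that the motif regular expression matches in full."""
    compiled = re.compile(pattern)
    return [site for site in sites if compiled.fullmatch(site)]


for name, pattern, sites in [
    ("Motif 1", MOTIF1_REGEX, MOTIF1_SITES),
    ("Motif 2", MOTIF2_REGEX, MOTIF2_SITES),
]:
    hits = exact_matches(pattern, sites)
    print(f"{name}: {len(hits)} of {len(sites)} reported sites match the regex")
```

On these lists the Motif 1 expression matches 5 of the 19 sites and the Motif 2 expression matches none of the 8, which suggests the expressions are better read as compact summaries than as search patterns; scanning with the scoring matrix (for example via MAST) is the more faithful route.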
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   9  llr = 121  E-value = 9.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  173::2:9::4814:7
pos.-specific     C  ::2::1::1:::::2:
probability       G  9319a7a:9a628:73
matrix            T  ::31:::1::::161:

         bits    [per-position information content plot, bits 0.0 to 2.2]
Relative Entropy (19.4 bits)

Multilevel           GAAGGGGAGGGAGTGA
consensus             GT  A    AG ACG
sequence               C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                  Site
-------------            -----  ---------           ----------------
8309                        48  6.86e-09  GGCGTGGGAG GGCGGGGAGGGAGTGA ATCGTGCTCG
7553                       402  1.74e-08  TTTGAGGGGG GATGGGGAGGGGGTGA TTTGGGGGAG
24031                      150  1.36e-07  AGGTTAGTTT GAAGGGGAGGGAGATG GCTTGACCCA
4237                        85  1.62e-07  GATACCAATC AATGGGGAGGGAGTGG CACAATGCTT
24029                       98  1.79e-07  AACGAACAGG GGAGGAGAGGAAGAGG AGGATTAATA
bd1959                      45  9.36e-07  AGCACCGTCG GAGGGGGTGGAGGTGA TGATAACAGC
bd721                       46  1.06e-06  ACTCAATCAT GATGGCGAGGGATACA AGCAAACACA
21133                       31  1.06e-06  AGAAAACAAC GAATGGGAGGAAAAGA CGACAAGACA
261023                      86  1.35e-06  TGGTTGTGTC GGCGGAGACGAAGTCA AGAGCATCGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8309                              6.9e-09  47_[+3]_437
7553                              1.7e-08  401_[+3]_83
24031                             1.4e-07  149_[+3]_335
4237                              1.6e-07  84_[+3]_400
24029                             1.8e-07  97_[+3]_387
bd1959                            9.4e-07  44_[+3]_440
bd721                             1.1e-06  45_[+3]_439
21133                             1.1e-06  30_[+3]_454
261023                            1.4e-06  85_[+3]_399
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
8309                     (   48) GGCGGGGAGGGAGTGA  1
7553                     (  402) GATGGGGAGGGGGTGA  1
24031                    (  150) GAAGGGGAGGGAGATG  1
4237                     (   85) AATGGGGAGGGAGTGG  1
24029                    (   98) GGAGGAGAGGAAGAGG  1
bd1959                   (   45) GAGGGGGTGGAGGTGA  1
bd721                    (   46) GATGGCGAGGGATACA  1
21133                    (   31) GAATGGGAGGAAAAGA  1
261023                   (   86) GGCGGAGACGAAGTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 10.1329 E= 9.2e+000
  -127   -982    181   -982
   131   -982     39   -982
    31      7   -119     33
  -982   -982    181   -125
  -982   -982    198   -982
   -27    -93    139   -982
  -982   -982    198   -982
   173   -982   -982   -125
  -982    -93    181   -982
  -982   -982    198   -982
    73   -982    113   -982
   153   -982    -19   -982
  -127   -982    161   -125
    73   -982   -982    107
  -982      7    139   -125
   131   -982     39   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 9.2e+000
 0.111111  0.000000  0.888889  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.333333  0.222222  0.111111  0.333333
 0.000000  0.000000  0.888889  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.111111  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.444444  0.000000  0.555556  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.111111  0.000000  0.777778  0.111111
 0.444444  0.000000  0.000000  0.555556
 0.000000  0.222222  0.666667  0.111111
 0.666667  0.000000  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[AG][ATC]GG[GA]GAGG[GA][AG]G[TA][GC][AG]
--------------------------------------------------------------------------------

Time 10.45 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10405                            7.92e-08  86_[+2(1.22e-08)]_311_\
    [+1(1.59e-06)]_70
20884                            8.42e-04  287_[+1(3.36e-08)]_201
21090                            1.40e-01  234_[+1(2.97e-05)]_254
21133                            1.88e-05  16_[+1(1.41e-06)]_2_[+3(1.06e-06)]_\
    454
21224                            1.86e-02  454_[+1(5.91e-05)]_34
24029                            2.72e-10  97_[+3(1.79e-07)]_305_\
    [+1(9.16e-07)]_26_[+2(3.45e-08)]_23
24031                            1.79e-05  149_[+3(1.36e-07)]_299_\
    [+1(6.01e-06)]_24
261023                           4.53e-04  85_[+3(1.35e-06)]_70_[+1(6.36e-05)]_\
    317
262880                           2.27e-07  75_[+1(1.38e-05)]_386_\
    [+2(1.40e-09)]_6
264883                           7.19e-07  443_[+2(1.03e-08)]_18_\
    [+1(1.25e-05)]_6
268011                           1.41e-02  221_[+1(4.42e-05)]_267
269151                           2.96e-08  289_[+2(1.60e-11)]_148_\
    [+1(3.48e-05)]_30
3034                             8.31e-05  46_[+2(2.63e-05)]_233_\
    [+2(3.45e-08)]_179
4237                             2.40e-10  84_[+3(1.62e-07)]_11_[+2(1.67e-08)]_\
    29_[+1(1.83e-06)]_327
7553                             8.36e-06  401_[+3(1.74e-08)]_83
8309                             2.11e-10  47_[+3(6.86e-09)]_164_\
    [+1(4.10e-05)]_95_[+2(1.54e-08)]_145
8707                             3.98e-02  479_[+2(9.03e-05)]
bd1959                           4.69e-04  44_[+3(9.36e-07)]_278_\
    [+3(4.67e-05)]_132_[+1(2.52e-05)]_2
bd721                            3.50e-07  45_[+3(1.06e-06)]_331_\
    [+1(2.40e-07)]_96
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
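
The combined block diagrams encode each sequence as spacer lengths alternating with motif occurrences such as `[+3(1.62e-07)]`. The Python sketch below turns one diagram into 1-based start positions, assuming the motif widths from the MOTIF headers (12, 21 and 16) and a diagram string with any `\` line continuation already joined; `parse_diagram`, `MOTIF_WIDTHS` and `TOKEN` are hypothetical names, not part of MEME.

```python
import re

# Motif widths taken from the three MOTIF headers in this file.
MOTIF_WIDTHS = {1: 12, 2: 21, 3: 16}

# Either a motif occurrence like [+3(1.62e-07)] or a bare spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for one combined block diagram.

    Each site is a tuple (motif, strand, start, p_value) with a 1-based start.
    """
    sites = []
    position = 0  # sequence positions consumed so far
    for match in TOKEN.finditer(diagram):
        strand, motif, p_value, spacer = match.groups()
        if spacer is not None:
            position += int(spacer)
        else:
            motif_id = int(motif)
            sites.append((motif_id, strand, position + 1, float(p_value)))
            position += widths[motif_id]
    return sites, position


# Diagram for sequence 4237, with the continuation line joined.
diagram_4237 = "84_[+3(1.62e-07)]_11_[+2(1.67e-08)]_29_[+1(1.83e-06)]_327"
sites, length = parse_diagram(diagram_4237)
for motif, strand, start, p_value in sites:
    print(f"motif {motif} ({strand}) starts at {start}, p = {p_value:.2e}")
print("sequence length:", length)
```

For sequence 4237 the recovered starts (85, 112 and 162) agree with the per-motif site tables, and the spacers plus motif widths add up to the 500-position sequence length listed in the training set.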