********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/428/428.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
13505                    1.0000    500  13587                    1.0000    500
22821                    1.0000    500  32710                    1.0000    500
15948                    1.0000    500  10546                    1.0000    500
7988                     1.0000    500  10963                    1.0000    500
51848                    1.0000    500  32869                    1.0000    500
32937                    1.0000    500  11883                    1.0000    500
45146                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/428/428.seqs.fa -oc motifs/428 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.259 C 0.240 G 0.235 T 0.266
Background letter frequencies (from dataset with add-one prior applied):
A 0.259 C 0.240 G 0.235 T 0.266
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  13  llr = 124  E-value = 1.3e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::1:28:9:15a
pos.-specific     C  2:223:a::22:
probability       G  :25232::a:3:
matrix            T  88372::1:81:

         bits    2.1       * *
                 1.9       * *  *
                 1.7       * *  *
                 1.5       ***  *
Relative         1.3 **   ****  *
Entropy          1.0 **   ***** *
(13.8 bits)      0.8 ** * ***** *
                 0.6 ** * ***** *
                 0.4 ** * ***** *
                 0.2 **** *******
                 0.0 ------------

Multilevel           TTGTCACAGTAA
consensus             GT G     G
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
45146                       308  1.42e-07 TTAGTACTGA TTGTGACAGTAA GGACCAATCA
13505                       165  2.39e-06 TAGACCTACT TTCTGACAGTGA GTAGGGTAAG
22821                       381  3.45e-06 CCTGTTACTG TTGCTACAGTAA CTGAAAGGGT
13587                        19  4.00e-06 ATATCGTCTG TTGTCACAGCGA CATAAAACCA
10963                       177  5.33e-06 ACACTGTTAG CTGTTACAGTAA GTCCTGTTTG
32937                       187  1.11e-05 AGCACGGGCT TGTTCACAGTCA TGCATTCCTT
7988                        107  1.20e-05 CCATGGAATG TTTTTACAGTTA GAGTAAAGTG
10546                       359  1.62e-05 ATCACCGATC TGCTAACAGTAA AACGCTAAAT
32710                       402  1.78e-05 ACTACTATTA TTGTAACAGAAA CCGTAACTGT
15948                       229  2.52e-05 TTCTGACTGA CTGGGACAGTGA GTCGCAATTT
32869                       338  5.17e-05 TACTTATCTT TGTTCGCAGTCA CATGCGCTCT
11883                        50  7.62e-05 TTCAGACCTT TTAGCACAGCAA GCAAACATTG
51848                       387  1.88e-04 TTGCGCGTCG TTTCGGCTGTGA GAAAGTATCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45146                             1.4e-07  307_[+1]_181
13505                             2.4e-06  164_[+1]_324
22821                             3.4e-06  380_[+1]_108
13587                               4e-06  18_[+1]_470
10963                             5.3e-06  176_[+1]_312
32937                             1.1e-05  186_[+1]_302
7988                              1.2e-05  106_[+1]_382
10546                             1.6e-05  358_[+1]_130
32710                             1.8e-05  401_[+1]_87
15948                             2.5e-05  228_[+1]_260
32869                             5.2e-05  337_[+1]_151
11883                             7.6e-05  49_[+1]_439
51848                             0.00019  386_[+1]_102
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=13
45146                    (  308) TTGTGACAGTAA  1
13505                    (  165) TTCTGACAGTGA  1
22821                    (  381) TTGCTACAGTAA  1
13587                    (   19) TTGTCACAGCGA  1
10963                    (  177) CTGTTACAGTAA  1
32937                    (  187) TGTTCACAGTCA  1
7988                     (  107) TTTTTACAGTTA  1
10546                    (  359) TGCTAACAGTAA  1
32710                    (  402) TTGTAACAGAAA  1
15948                    (  229) CTGGGACAGTGA  1
32869                    (  338) TGTTCGCAGTCA  1
11883                    (   50) TTAGCACAGCAA  1
51848                    (  387) TTTCGGCTGTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 8.93074 E= 1.3e+000
 -1035    -64  -1035    167
 -1035  -1035     -3    153
  -175    -64     97     21
 -1035    -64    -61    138
   -75     36     39    -21
   171  -1035    -61  -1035
 -1035    206  -1035  -1035
   183  -1035  -1035   -179
 -1035  -1035    209  -1035
  -175    -64  -1035    153
    83    -64     39   -179
   195  -1035  -1035  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 1.3e+000
 0.000000  0.153846  0.000000  0.846154
 0.000000  0.000000  0.230769  0.769231
 0.076923  0.153846  0.461538  0.307692
 0.000000  0.153846  0.153846  0.692308
 0.153846  0.307692  0.307692  0.230769
 0.846154  0.000000  0.153846  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.923077  0.000000  0.000000  0.076923
 0.000000  0.000000  1.000000  0.000000
 0.076923  0.153846  0.000000  0.769231
 0.461538  0.153846  0.307692  0.076923
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[TG][GT]T[CGT]ACAGT[AG]A
--------------------------------------------------------------------------------


Time  1.49 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =  10  llr = 125  E-value = 5.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  5::1272::a:61::a
pos.-specific     C  :3:23:41a:a:1:8:
probability       G  1:5:31:::::161::
matrix            T  47572249:::3292:

         bits    2.1         * *
                 1.9         ***    *
                 1.7         ***    *
                 1.5        ****  * *
Relative         1.3        ****  ***
Entropy          1.0  **    ****  ***
(18.1 bits)      0.8  *** * ****  ***
                 0.6 **** * ***** ***
                 0.4 **** ***********
                 0.2 **** ***********
                 0.0 ----------------

Multilevel           ATGTCACTCACAGTCA
consensus            TCTCGTT    TT T
sequence                 A A
                         T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
32869                       233  4.90e-09 GATTAATGAT ATGTAACTCACAGTCA CAAAACAGCG
51848                        20  9.00e-08 TTTATTCCGC GTGTAACTCACAGTCA GAATCAACCA
10963                       140  9.00e-08 GCCCCGCCTG ACGTTACTCACTGTCA ACTAGCTTCT
13587                       159  3.46e-07 ACGGATGGTC ATTACATTCACTGTCA GTTCGCATAA
7988                        163  3.83e-07 TCTTTTCTAG ATTTGAATCACAGGCA GTGACCATTG
15948                       364  1.45e-06 AATCATTGCC TTGCGTTTCACAGTTA ATTGTCATTG
45146                        65  1.68e-06 GCGAGCCATC ACTTCATTCACGCTCA TGTCACGGCC
11883                       200  1.80e-06 TAGATCCGTT TCTTCTTTCACTTTCA CGACACAGTC
32937                       468  2.39e-06 CGCTTTGTAC TTTTTGATCACATTCA TTTACAGCAG
32710                       236  9.09e-06 TAGTGGTTCG TTGCGACCCACAATTA CCAGCAACAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32869                             4.9e-09  232_[+2]_252
51848                               9e-08  19_[+2]_465
10963                               9e-08  139_[+2]_345
13587                             3.5e-07  158_[+2]_326
7988                              3.8e-07  162_[+2]_322
15948                             1.4e-06  363_[+2]_121
45146                             1.7e-06  64_[+2]_420
11883                             1.8e-06  199_[+2]_285
32937                             2.4e-06  467_[+2]_17
32710                             9.1e-06  235_[+2]_249
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=10
32869                    (  233) ATGTAACTCACAGTCA  1
51848                    (   20) GTGTAACTCACAGTCA  1
10963                    (  140) ACGTTACTCACTGTCA  1
13587                    (  159) ATTACATTCACTGTCA  1
7988                     (  163) ATTTGAATCACAGGCA  1
15948                    (  364) TTGCGTTTCACAGTTA  1
45146                    (   65) ACTTCATTCACGCTCA  1
11883                    (  200) TCTTCTTTCACTTTCA  1
32937                    (  468) TTTTTGATCACATTCA  1
32710                    (  236) TTGCGACCCACAATTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.54997 E= 5.5e+000
    95   -997   -123     59
  -997     32   -997    139
  -997   -997    109     91
  -137    -26   -997    139
   -37     32     35    -41
   143   -997   -123    -41
   -37     74   -997     59
  -997   -126   -997    176
  -997    206   -997   -997
   195   -997   -997   -997
  -997    206   -997   -997
   121   -997   -123     17
  -137   -126    135    -41
  -997   -997   -123    176
  -997    174   -997    -41
   195   -997   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 5.5e+000
 0.500000  0.000000  0.100000  0.400000
 0.000000  0.300000  0.000000  0.700000
 0.000000  0.000000  0.500000  0.500000
 0.100000  0.200000  0.000000  0.700000
 0.200000  0.300000  0.300000  0.200000
 0.700000  0.000000  0.100000  0.200000
 0.200000  0.400000  0.000000  0.400000
 0.000000  0.100000  0.000000  0.900000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.600000  0.000000  0.100000  0.300000
 0.100000  0.100000  0.600000  0.200000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.800000  0.000000  0.200000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AT][TC][GT][TC][CGAT][AT][CTA]TCAC[AT][GT]T[CT]A
--------------------------------------------------------------------------------


Time  2.86 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =  10  llr = 124  E-value = 9.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a22::::12::13:
pos.-specific     C  8::1:61:2:253::a
probability       G  1a:643175:::6:::
matrix            T  1::141833965197:

         bits    2.1  *             *
                 1.9  **            *
                 1.7  **            *
                 1.5  **      *   * *
Relative         1.3  **    * *   * *
Entropy          1.0 ***   ** * * ***
(17.9 bits)      0.8 ***  *** * *****
                 0.6 ***  ***********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CGAGGCTGGTTCGTTC
consensus               ATG TT ATC A
sequence                 A   C C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
22821                       143  7.85e-09 GCATTCGCGA CGAGGCTGGTATGTTC GTCGCGCCAT
10963                       278  1.09e-08 AATTCCGTTC CGAGGGTGGTTCCTTC TCCGTAGTTC
7988                        207  4.15e-08 GTGACATGAG CGAGTCTTGTCCGTTC TTATCCACGG
15948                       394  5.62e-08 TCATTGTCAT CGAGACTGTTTTCTTC GACCATGAGT
13587                        45  1.40e-06 AAACCATCGA CGACTTTTGTTCGTTC GTTGGTGGCA
32869                       485  2.85e-06 AAACAGACTG CGAAGGTGGTTTCAAC
45146                       186  3.10e-06 GGTGGACAGA CGAGTCTTCTACTTAC AGGTAACTTA
32937                       452  3.59e-06 AGAGCTCTCG GGAGGCCGCTTTGTAC TTTTTGATCA
13505                       204  4.46e-06 GGTGGAAAGA CGATTGGGTTCCGTTC GCTTCCTCGA
11883                        10  7.64e-06  CGCGTGGGA TGAAACTGTATTGTTC CCTTGTTTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22821                             7.9e-09  142_[+3]_342
10963                             1.1e-08  277_[+3]_207
7988                              4.2e-08  206_[+3]_278
15948                             5.6e-08  393_[+3]_91
13587                             1.4e-06  44_[+3]_440
32869                             2.9e-06  484_[+3]
45146                             3.1e-06  185_[+3]_299
32937                             3.6e-06  451_[+3]_33
13505                             4.5e-06  203_[+3]_281
11883                             7.6e-06  9_[+3]_475
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=10
22821                    (  143) CGAGGCTGGTATGTTC  1
10963                    (  278) CGAGGGTGGTTCCTTC  1
7988                     (  207) CGAGTCTTGTCCGTTC  1
15948                    (  394) CGAGACTGTTTTCTTC  1
13587                    (   45) CGACTTTTGTTCGTTC  1
32869                    (  485) CGAAGGTGGTTTCAAC  1
45146                    (  186) CGAGTCTTCTACTTAC  1
32937                    (  452) GGAGGCCGCTTTGTAC  1
13505                    (  204) CGATTGGGTTCCGTTC  1
11883                    (   10) TGAAACTGTATTGTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.54997 E= 9.4e+001
  -997    174   -123   -141
  -997   -997    209   -997
   195   -997   -997   -997
   -37   -126    135   -141
   -37   -997     77     59
  -997    132     35   -141
  -997   -126   -123    159
  -997   -997    157     17
  -997    -26    109     17
  -137   -997   -997    176
   -37    -26   -997    117
  -997    106   -997     91
  -997     32    135   -141
  -137   -997   -997    176
    21   -997   -997    139
  -997    206   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 9.4e+001
 0.000000  0.800000  0.100000  0.100000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.100000  0.600000  0.100000
 0.200000  0.000000  0.400000  0.400000
 0.000000  0.600000  0.300000  0.100000
 0.000000  0.100000  0.100000  0.800000
 0.000000  0.000000  0.700000  0.300000
 0.000000  0.200000  0.500000  0.300000
 0.100000  0.000000  0.000000  0.900000
 0.200000  0.200000  0.000000  0.600000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.300000  0.600000  0.100000
 0.100000  0.000000  0.000000  0.900000
 0.300000  0.000000  0.000000  0.700000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CGA[GA][GTA][CG]T[GT][GTC]T[TAC][CT][GC]T[TA]C
--------------------------------------------------------------------------------


Time  4.33 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
13505                            1.83e-04  164_[+1(2.39e-06)]_27_\
    [+3(4.46e-06)]_281
13587                            6.02e-08  18_[+1(4.00e-06)]_14_[+3(1.40e-06)]_\
    98_[+2(3.46e-07)]_326
22821                            1.16e-06  142_[+3(7.85e-09)]_222_\
    [+1(3.45e-06)]_108
32710                            1.53e-03  235_[+2(9.09e-06)]_150_\
    [+1(1.78e-05)]_87
15948                            6.31e-08  91_[+1(6.33e-05)]_125_\
    [+1(2.52e-05)]_123_[+2(1.45e-06)]_14_[+3(5.62e-08)]_91
10546                            2.84e-02  358_[+1(1.62e-05)]_130
7988                             7.17e-09  106_[+1(1.20e-05)]_44_\
    [+2(3.83e-07)]_28_[+3(4.15e-08)]_278
10963                            2.55e-10  139_[+2(9.00e-08)]_21_\
    [+1(5.33e-06)]_89_[+3(1.09e-08)]_207
51848                            1.47e-04  19_[+2(9.00e-08)]_465
32869                            2.41e-08  232_[+2(4.90e-09)]_85_\
    [+2(4.90e-05)]_135_[+3(2.85e-06)]
32937                            2.04e-06  186_[+1(1.11e-05)]_253_\
    [+3(3.59e-06)]_[+2(2.39e-06)]_17
11883                            1.71e-05  9_[+3(7.64e-06)]_24_[+1(7.62e-05)]_\
    138_[+2(1.80e-06)]_285
45146                            2.49e-08  64_[+2(1.68e-06)]_105_\
    [+3(3.10e-06)]_10_[+1(4.29e-05)]_84_[+1(1.42e-07)]_181
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
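
The log-odds and letter-probability matrices reported for each motif appear to be related by score = 100 * log2(p / background). This relation is inferred from the numbers in this file, which does not state it. The sketch below recomputes the nonzero entries of the first row of the Motif 1 log-odds matrix from the first row of its letter-probability matrix and the printed background frequencies. The names `BACKGROUND`, `PROB_ROW` and `log_odds_row` are hypothetical and not part of MEME. Zero probabilities are returned as `None`, because the large negative values printed for them (such as -1035) depend on MEME internals that this sketch does not model.

```python
import math

# Background letter frequencies, copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.259, "C": 0.240, "G": 0.235, "T": 0.266}

# First row of the Motif 1 letter-probability matrix (column order A, C, G, T).
PROB_ROW = {"A": 0.000000, "C": 0.153846, "G": 0.000000, "T": 0.846154}


def log_odds_row(prob_row, background):
    """Return round(100 * log2(p / background)) for each letter.

    Assumption: this scaling is inferred from the values in the file.
    Letters with zero probability map to None instead of a score.
    """
    scores = {}
    for letter, p in prob_row.items():
        if p == 0.0:
            scores[letter] = None
        else:
            scores[letter] = round(100 * math.log2(p / background[letter]))
    return scores


if __name__ == "__main__":
    # The file lists -64 (C) and 167 (T) for this row of the log-odds matrix.
    print(log_odds_row(PROB_ROW, BACKGROUND))
```

The background frequencies are printed to only three decimals, so entries recomputed this way can differ from the file by one unit where a value falls close to a rounding boundary.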