********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/211/211.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9654                     1.0000    500  46968                    1.0000    500
48508                    1.0000    500  49488                    1.0000    500
10524                    1.0000    500  41639                    1.0000    500
35253                    1.0000    500  33267                    1.0000    500
42906                    1.0000    500  40088                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/211/211.seqs.fa -oc motifs/211 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.259 C 0.231 G 0.239 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.259 C 0.231 G 0.239 T 0.271
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 10 llr = 133 E-value = 6.1e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :3:1::11:4::12:111:::
pos.-specific     C  4:22a:139:5:343:23:39
probability       G  1:84::41::2161551:921
matrix            T  57:3:a451639:3246615:

         bits    2.1     *
                 1.9     **
                 1.7     **  *           *
                 1.5     **  *  *      * *
Relative         1.3   * **  *  *      * *
Entropy          1.1  ** **  *  *      * *
(19.1 bits)      0.8  ** **  ** **     * *
                 0.6 *** **  ***** ** ** *
                 0.4 *** **  ***** *******
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TTGGCTGTCTCTGCGGTTGTC
consensus            CACT  TC AT CTCTCC C
sequence                C      G  AT    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
41639                       386  2.55e-09 GCTCGGTAGT TTGGCTTTCTCTGGTGTTGTC CGGTCAGAGA
9654                        159  2.86e-08 GCTTTCTTAT CTGCCTATCATTGCCTTTGTC GATTTTGTTG
49488                       420  4.09e-08 GCATTCCGTC GAGTCTTCCTTTGCGGTTGCC TCCACCGAGC
33267                       281  9.83e-08 TGTAAAGGAA CTGTCTGTCTTTACTTTCGTC ATCGAAGATG
46968                       195  1.33e-07 CCTCGAATGT CTGGCTGTCTGTCTCACTGTC AAGTTCAACA
35253                       220  1.78e-07 TGTAGAGAAA TTGTCTGCCTGTCAGGGCGTC GCAGCCATCA
48508                       360  2.35e-07 TGACAGTAAC TTGGCTGACACTGAGTTTGCG CTTTGATTGG
42906                       463  1.33e-06 GGATTGTACG TACACTTCCTCGGTGGTTGGC CCTGTCCAAA
40088                       162  3.33e-06 CTGGGTAGAT CACGCTCGCACTCTGGCCGGC GTGTGATGGC
10524                       415  8.48e-06 TCCTTCATGG TTGCCTTTTACTGCCTAATCC TTTTTCGCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41639                             2.5e-09  385_[+1]_94
9654                              2.9e-08  158_[+1]_321
49488                             4.1e-08  419_[+1]_60
33267                             9.8e-08  280_[+1]_199
46968                             1.3e-07  194_[+1]_285
35253                             1.8e-07  219_[+1]_260
48508                             2.3e-07  359_[+1]_120
42906                             1.3e-06  462_[+1]_17
40088                             3.3e-06  161_[+1]_318
10524                             8.5e-06  414_[+1]_65
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=10
41639                    (  386) TTGGCTTTCTCTGGTGTTGTC  1
9654                     (  159) CTGCCTATCATTGCCTTTGTC  1
49488                    (  420) GAGTCTTCCTTTGCGGTTGCC  1
33267                    (  281) CTGTCTGTCTTTACTTTCGTC  1
46968                    (  195) CTGGCTGTCTGTCTCACTGTC  1
35253                    (  220) TTGTCTGCCTGTCAGGGCGTC  1
48508                    (  360) TTGGCTGACACTGAGTTTGCG  1
42906                    (  463) TACACTTCCTCGGTGGTTGGC  1
40088                    (  162) CACGCTCGCACTCTGGCCGGC  1
10524                    (  415) TTGCCTTTTACTGCCTAATCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.1559 E= 6.1e+000
  -997     79   -125     88
    21   -997   -997    137
  -997    -21    174   -997
  -137    -21     74     14
  -997    211   -997   -997
  -997   -997   -997    188
  -137   -121     74     56
  -137     37   -125     88
  -997    196   -997   -144
    63   -997   -997    114
  -997    111    -25     14
  -997   -997   -125    173
  -137     37    133   -997
   -37     79   -125     14
  -997     37    107    -44
  -137   -997    107     56
  -137    -21   -125    114
  -137     37   -997    114
  -997   -997    191   -144
  -997     37    -25     88
  -997    196   -125   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 6.1e+000
 0.000000  0.400000  0.100000  0.500000
 0.300000  0.000000  0.000000  0.700000
 0.000000  0.200000  0.800000  0.000000
 0.100000  0.200000  0.400000  0.300000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.100000  0.100000  0.400000  0.400000
 0.100000  0.300000  0.100000  0.500000
 0.000000  0.900000  0.000000  0.100000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.500000  0.200000  0.300000
 0.000000  0.000000  0.100000  0.900000
 0.100000  0.300000  0.600000  0.000000
 0.200000  0.400000  0.100000  0.300000
 0.000000  0.300000  0.500000  0.200000
 0.100000  0.000000  0.500000  0.400000
 0.100000  0.200000  0.100000  0.600000
 0.100000  0.300000  0.000000  0.600000
 0.000000  0.000000  0.900000  0.100000
 0.000000  0.300000  0.200000  0.500000
 0.000000  0.900000  0.100000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[TC][TA][GC][GTC]CT[GT][TC]C[TA][CTG]T[GC][CTA][GCT][GT][TC][TC]G[TCG]C
--------------------------------------------------------------------------------




Time  1.10 secs.

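The entries of the Motif 1 log-odds matrix appear to track the letter-probability matrix through 100 * log2(probability / background). The Python sketch below checks that relationship for the first matrix row. It is an illustration under stated assumptions, not MEME's own code: the function name `log_odds` is hypothetical, the background values are the three-decimal figures printed in the COMMAND LINE SUMMARY, and zero probabilities are returned as `None` because the report shows a large negative floor (-997) whose exact derivation is not stated in this file.

```python
import math

# Background frequencies as printed (rounded) in the COMMAND LINE SUMMARY.
BACKGROUND = {"A": 0.259, "C": 0.231, "G": 0.239, "T": 0.271}


def log_odds(prob, background):
    """Approximate a PSSM entry as 100 * log2(prob / background), rounded.

    Hypothetical helper for illustration. Returns None for a zero
    probability, where the report prints a floor value such as -997.
    """
    if prob == 0.0:
        return None
    return round(100 * math.log2(prob / background))


# Row 1 of the Motif 1 letter-probability matrix, in A, C, G, T order.
row = [0.0, 0.4, 0.1, 0.5]
scores = [log_odds(p, BACKGROUND[base]) for p, base in zip(row, "ACGT")]
print(scores)
```

The report lists -997, 79, -125 and 88 for this row. Small differences of about one unit are to be expected, since the printed background frequencies are rounded and the report notes that a prior is applied.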
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 10 llr = 112 E-value = 7.3e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  44422:a993:63112
pos.-specific     C  :65:1::::1:2::2:
probability       G  :::44a:1:4127128
matrix            T  6:143:::129::85:

         bits    2.1      *
                 1.9      **
                 1.7      **
                 1.5      **** *
Relative         1.3      **** *    *
Entropy          1.1  *   **** * ** *
(16.1 bits)      0.8 **   **** * ** *
                 0.6 ***  **** **** *
                 0.4 **** **** **** *
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCCGGGAAAGTAGTTG
consensus            AAATT    A CA CA
sequence                AA    T G  G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
9654                        339  5.91e-09 ACGTAACGAA TCCGGGAAAGTAGTCG GCGAAAAGCT
49488                       346  1.14e-07 GTACAGACAG ACCTTGAAAGTCGTTG ACCGGAAGCG
10524                       216  2.42e-07 TGGAGAAACA TACTTGAAAATAATTG GTAAAAGCCC
46968                       145  5.94e-07 TCATCGAATT AACTGGAAAATAGTAG TCTACCTATT
42906                       113  1.29e-06 TCGGTAGGTT TCCAGGAAATTGATTG CGTTGATGGG
35253                       406  3.87e-06 CTTTCGTTGA TAAATGAAAATAGTGA CGATGCTGAT
41639                       410  1.31e-05 TGTTGTCCGG TCAGAGAAAGGAGGGG CTTCAGGATT
48508                       219  1.65e-05 ACTAGATGAG ACATGGAGATTGGTTA GCTTTTTTCG
40088                       318  1.75e-05 GCTTGGATTA AATGAGAAAGTAAATG GTATCTCCTT
33267                       196  2.66e-05 AATTACCCAG TCAGCGAATCTCGTCG TCTTTTTTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9654                              5.9e-09  338_[+2]_146
49488                             1.1e-07  345_[+2]_139
10524                             2.4e-07  215_[+2]_269
46968                             5.9e-07  144_[+2]_340
42906                             1.3e-06  112_[+2]_372
35253                             3.9e-06  405_[+2]_79
41639                             1.3e-05  409_[+2]_75
48508                             1.7e-05  218_[+2]_266
40088                             1.8e-05  317_[+2]_167
33267                             2.7e-05  195_[+2]_289
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=10
9654                     (  339) TCCGGGAAAGTAGTCG  1
49488                    (  346) ACCTTGAAAGTCGTTG  1
10524                    (  216) TACTTGAAAATAATTG  1
46968                    (  145) AACTGGAAAATAGTAG  1
42906                    (  113) TCCAGGAAATTGATTG  1
35253                    (  406) TAAATGAAAATAGTGA  1
41639                    (  410) TCAGAGAAAGGAGGGG  1
48508                    (  219) ACATGGAGATTGGTTA  1
40088                    (  318) AATGAGAAAGTAAATG  1
33267                    (  196) TCAGCGAATCTCGTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 9.17088 E= 7.3e+002
    63   -997   -997    114
    63    137   -997   -997
    63    111   -997   -144
   -37   -997     74     56
   -37   -121     74     14
  -997   -997    207   -997
   195   -997   -997   -997
   180   -997   -125   -997
   180   -997   -997   -144
    21   -121     74    -44
  -997   -997   -125    173
   121    -21    -25   -997
    21   -997    155   -997
  -137   -997   -125    156
  -137    -21    -25     88
   -37   -997    174   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 7.3e+002
 0.400000  0.000000  0.000000  0.600000
 0.400000  0.600000  0.000000  0.000000
 0.400000  0.500000  0.000000  0.100000
 0.200000  0.000000  0.400000  0.400000
 0.200000  0.100000  0.400000  0.300000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.300000  0.100000  0.400000  0.200000
 0.000000  0.000000  0.100000  0.900000
 0.600000  0.200000  0.200000  0.000000
 0.300000  0.000000  0.700000  0.000000
 0.100000  0.000000  0.100000  0.800000
 0.100000  0.200000  0.200000  0.500000
 0.200000  0.000000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[TA][CA][CA][GTA][GTA]GAAA[GAT]T[ACG][GA]T[TCG][GA]
--------------------------------------------------------------------------------




Time  1.94 secs.

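The Motif 2 regular expression is a simplification of the probability matrix: comparing the two shows that letters with a probability of 0.1 at a position are left out of the character classes. As a rough illustration, the Python sketch below applies the expression to three of the sites listed in the Motif 2 BLOCKS section. The variable names are hypothetical, and only the standard `re` module is used.

```python
import re

# Regular expression copied verbatim from the "Motif 2 regular expression" section.
MOTIF2_REGEX = re.compile(r"[TA][CA][CA][GTA][GTA]GAAA[GAT]T[ACG][GA]T[TCG][GA]")

# Three of the ten Motif 2 sites from the BLOCKS listing (sequence name -> site).
SITES = {
    "9654": "TCCGGGAAAGTAGTCG",
    "48508": "ACATGGAGATTGGTTA",
    "33267": "TCAGCGAATCTCGTCG",
}

# fullmatch requires the whole 16-letter site to satisfy the expression.
matches = {name: MOTIF2_REGEX.fullmatch(site) is not None
           for name, site in SITES.items()}

for name, matched in matches.items():
    print(name, "matches" if matched else "does not match")
```

Sites 48508 and 33267 were reported by MEME yet fail the expression (a G at position 8 and a C at position 5, respectively), so a regex scan of this kind is likely to miss weaker sites that the scoring matrix would still pick up.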
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 13 sites = 8 llr = 91 E-value = 9.4e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::6133::983:1
pos.-specific     C  18:9:1:8:::a6
probability       G  :33::6:3:3:::
matrix            T  9:1:8:a:1:8:3

         bits    2.1            *
                 1.9       *    *
                 1.7       *    *
                 1.5    *  * *  *
Relative         1.3 ** *  **** *
Entropy          1.1 ** ** ******
(16.4 bits)      0.8 ** *********
                 0.6 *************
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           TCACTGTCAATCC
consensus             GG AA G GA T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            -------------
40088                       291  1.47e-08 GGATTGGGCT TCACTGTCAATCC TACAGCTTGG
48508                       435  4.28e-07 GCGATTTCCT TCACTGTCAGACC GTATCATTCA
49488                       291  1.54e-06 ATAGACTATC TCGCAGTGAATCC ATACATACAC
41639                       239  2.23e-06 GAATTGACAT TGACAGTGAATCC GGTTGTCCAC
42906                       284  6.57e-06 CGCAAATATG TCACTCTCTGTCC AATCACAATA
46968                       244  7.07e-06 GCACTCCGTT CGACTGTCAATCA TTCAATCCGG
10524                        81  7.71e-06 TGAGCAGCAG TCTCTATCAAACT CAAAAACAAC
33267                       406  9.38e-06 GGTCGCCCTG TCGATATCAATCT CATGAGCCAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40088                             1.5e-08  290_[+3]_197
48508                             4.3e-07  434_[+3]_53
49488                             1.5e-06  290_[+3]_197
41639                             2.2e-06  238_[+3]_249
42906                             6.6e-06  283_[+3]_204
46968                             7.1e-06  243_[+3]_244
10524                             7.7e-06  80_[+3]_407
33267                             9.4e-06  405_[+3]_82
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=13 seqs=8
40088                    (  291) TCACTGTCAATCC  1
48508                    (  435) TCACTGTCAGACC  1
49488                    (  291) TCGCAGTGAATCC  1
41639                    (  239) TGACAGTGAATCC  1
42906                    (  284) TCACTCTCTGTCC  1
46968                    (  244) CGACTGTCAATCA  1
10524                    (   81) TCTCTATCAAACT  1
33267                    (  406) TCGATATCAATCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 4880 bayes= 9.2503 E= 9.4e+002
  -965    -89   -965    169
  -965    170      7   -965
   127   -965      7   -112
  -105    192   -965   -965
    -5   -965   -965    147
    -5    -89    139   -965
  -965   -965   -965    188
  -965    170      7   -965
   176   -965   -965   -112
   154   -965      7   -965
    -5   -965   -965    147
  -965    211   -965   -965
  -105    143   -965    -12
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 8 E= 9.4e+002
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.750000  0.250000  0.000000
 0.625000  0.000000  0.250000  0.125000
 0.125000  0.875000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.250000  0.125000  0.625000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.750000  0.000000  0.250000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.625000  0.000000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
T[CG][AG]C[TA][GA]T[CG]A[AG][TA]C[CT]
--------------------------------------------------------------------------------




Time  2.88 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9654                             6.21e-09  158_[+1(2.86e-08)]_159_\
    [+2(5.91e-09)]_146
46968                            1.90e-08  144_[+2(5.94e-07)]_34_\
    [+1(1.33e-07)]_28_[+3(7.07e-06)]_244
48508                            5.15e-08  218_[+2(1.65e-05)]_125_\
    [+1(2.35e-07)]_54_[+3(4.28e-07)]_53
49488                            3.39e-10  290_[+3(1.54e-06)]_42_\
    [+2(1.14e-07)]_58_[+1(4.09e-08)]_60
10524                            4.01e-07  80_[+3(7.71e-06)]_122_\
    [+2(2.42e-07)]_183_[+1(8.48e-06)]_65
41639                            2.94e-09  238_[+3(2.23e-06)]_134_\
    [+1(2.55e-09)]_3_[+2(1.31e-05)]_75
35253                            8.47e-06  219_[+1(1.78e-07)]_165_\
    [+2(3.87e-06)]_79
33267                            5.94e-07  195_[+2(2.66e-05)]_69_\
    [+1(9.83e-08)]_104_[+3(9.38e-06)]_82
42906                            2.94e-07  112_[+2(1.29e-06)]_34_\
    [+2(8.40e-05)]_105_[+3(6.57e-06)]_85_[+1(7.77e-06)]_60_[+1(1.33e-06)]_17
40088                            2.82e-08  161_[+1(3.33e-06)]_108_\
    [+3(1.47e-08)]_14_[+2(1.75e-05)]_167
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
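
The combined block diagrams in the SUMMARY OF MOTIFS section encode site positions as spacer lengths between motif occurrences, so the 1-based start positions in the per-motif site tables can be recovered from them. The Python sketch below walks one diagram and accumulates positions. It is an illustration only: the function name `motif_starts` and the `WIDTHS` table are hypothetical, the widths are copied from the three MOTIF header lines, and the diagram string is assumed to have its "\" line continuation joined beforehand.

```python
import re

# Motif widths as reported in the three MOTIF header lines of this report.
WIDTHS = {1: 21, 2: 16, 3: 13}

# One site token: "[+1(2.86e-08)]" or, in per-motif diagrams, "[+1]".
SITE_TOKEN = re.compile(r"\[([+-])(\d+)(?:\([^)]*\))?\]")


def motif_starts(diagram, widths=WIDTHS):
    """Return ([(motif, 1-based start), ...], total sequence length)."""
    position = 0  # number of bases consumed so far
    starts = []
    for token in diagram.split("_"):
        if token.isdigit():
            position += int(token)  # spacer of bases outside any site
            continue
        match = SITE_TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised diagram token: {token!r}")
        motif = int(match.group(2))
        starts.append((motif, position + 1))
        position += widths[motif]
    return starts, position


# Diagram for sequence 9654, with the line continuation already joined.
diagram_9654 = "158_[+1(2.86e-08)]_159_[+2(5.91e-09)]_146"
starts, length = motif_starts(diagram_9654)
print(starts, length)
```

For sequence 9654 this gives starts of 159 for motif 1 and 339 for motif 2, which agree with the Start column of the site tables, and a total of 500, which agrees with the sequence length in the TRAINING SET section.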