********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/79/79.seqs.fa
ALPHABET= ACGT
Sequence name  Weight  Length    Sequence name  Weight  Length
-------------  ------  ------    -------------  ------  ------
17187          1.0000     500    46636          1.0000     500
46728          1.0000     500    15468          1.0000     500
43723          1.0000     500    49733          1.0000     500
44084          1.0000     500    33540          1.0000     500
45195          1.0000     500    34896          1.0000     500
20318          1.0000     500    44967          1.0000     500
45967          1.0000     500    49713          1.0000     500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/79/79.seqs.fa -oc motifs/79 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod= zoops  nmotifs= 3  evt= inf
object function= E-value of product of p-values
width:  minw= 12  maxw= 21  minic= 0.00
width:  wg= 11  ws= 1  endgaps= yes
nsites: minsites= 2  maxsites= 14  wnsites= 0.8
theta:  prob= 1  spmap= uni  spfuzz= 0.5
global: substring= yes  branching= no  wbranch= no
em:     prior= dirichlet  b= 0.01  maxiter= 50
        distance= 1e-05
data:   n= 7000  N= 14
strands: +
sample: seed= 0  seqfrac= 1
Letter frequencies in dataset:
A 0.263 C 0.232 G 0.230 T 0.275
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.232 G 0.230 T 0.275
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 20 sites = 7 llr = 114 E-value = 2.8e+001
********************************************************************************
--------------------------------------------------------------------------------
    Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  73::::4::336:::7:44:
pos.-specific     C  179:::1::16:11a:7::3
probability       G  ::1::a::9313:::1116:
matrix            T  1::aa:4a13:199:114:7

         bits    2.1      *        *
                 1.9    *** *      *
                 1.7    *** *      *
                 1.5   **** **     *
Relative         1.3  ***** **   ***
Entropy          1.1  ***** **   ***   **
(23.5 bits)      0.8 ****** **   ***** **
                 0.6 ****** ** ******* **
                 0.4 ********* **********
                 0.2 ********* **********
                 0.0 --------------------

Multilevel           ACCTTGATGACATTCACAGT
consensus             A    T  GAG     TAC
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start    P-value                    Site
-------------  -----  ---------            --------------------
43723            208   3.22e-10 CGAACGCCAA AACTTGATGCCATTCACAGT CACTTTCTCA
44084            384   5.77e-09 ACGAAGTGGA ACCTTGTTGTCGTTCGCTAC TTGCCGGAGT
44967            215   9.38e-09 TGATATTGCT TCCTTGTTGGATTTCACTGT CGGTAGGCGC
46636            378   1.24e-08 AGTCACCGTC ACCTTGATTACGCTCACAGT CAGATTGCCG
49713            473   4.86e-08 CGCTATCTGC ACCTTGTTGAAATTCTGGAT ACCGTACA
45967            193   5.58e-08 CTCTCCCTTG CCGTTGCTGTGATTCACAGT TTCTTCCGTA
49733            352   5.58e-08 AAAGATACGA AACTTGATGGCATCCATTAC CGGTCGCACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
43723                   3.2e-10  207_[+1]_273
44084                   5.8e-09  383_[+1]_97
44967                   9.4e-09  214_[+1]_266
46636                   1.2e-08  377_[+1]_103
49713                   4.9e-08  472_[+1]_8
45967                   5.6e-08  192_[+1]_288
49733                   5.6e-08  351_[+1]_129
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=7
43723  (  208) AACTTGATGCCATTCACAGT  1
44084  (  384) ACCTTGTTGTCGTTCGCTAC  1
44967  (  215) TCCTTGTTGGATTTCACTGT  1
46636  (  378) ACCTTGATTACGCTCACAGT  1
49713  (  473) ACCTTGTTGAAATTCTGGAT  1
45967  (  193) CCGTTGCTGTGATTCACAGT  1
49733  (  352) AACTTGATGGCATCCATTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6734 bayes= 9.7521 E= 2.8e+001
   144    -70   -945    -94
    12    162   -945   -945
  -945    188    -69   -945
  -945   -945   -945    186
  -945   -945   -945    186
  -945   -945    212   -945
    71    -70   -945     64
  -945   -945   -945    186
  -945   -945    189    -94
    12    -70     31      6
    12    130    -69   -945
   112   -945     31    -94
  -945    -70   -945    164
  -945    -70   -945    164
  -945    210   -945   -945
   144   -945    -69    -94
  -945    162    -69    -94
    71   -945    -69     64
    71   -945    131   -945
  -945     30   -945    138
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 2.8e+001
 0.714286  0.142857  0.000000  0.142857
 0.285714  0.714286  0.000000  0.000000
 0.000000  0.857143  0.142857  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.142857  0.000000  0.428571
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.857143  0.142857
 0.285714  0.142857  0.285714  0.285714
 0.285714  0.571429  0.142857  0.000000
 0.571429  0.000000  0.285714  0.142857
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.142857  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.714286  0.000000  0.142857  0.142857
 0.000000  0.714286  0.142857  0.142857
 0.428571  0.000000  0.142857  0.428571
 0.428571  0.000000  0.571429  0.000000
 0.000000  0.285714  0.000000  0.714286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 regular expression
--------------------------------------------------------------------------------
A[CA]CTTG[AT]TG[AGT][CA][AG]TTCAC[AT][GA][TC]
--------------------------------------------------------------------------------

Time 1.60 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 7 llr = 82 E-value = 1.7e+003
********************************************************************************
--------------------------------------------------------------------------------
    Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a::::79:33::
pos.-specific     C  :911:::941aa
probability       G  ::79a3:134::
matrix            T  :11:::1::1::

         bits    2.1     *     **
                 1.9 *   *     **
                 1.7 *   *     **
                 1.5 ** **  *  **
Relative         1.3 ** ** **  **
Entropy          1.1 ** *****  **
(16.8 bits)      0.8 ********  **
                 0.6 ********  **
                 0.4 ********* **
                 0.2 ************
                 0.0 ------------

Multilevel           ACGGGAACCGCC
consensus                 G  AA
sequence                     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start    P-value                Site
-------------  -----  ---------            ------------
45967            452   2.15e-07 TTGATATAAA ACGGGAACGACC ACGCTTCTAA
34896             88   5.81e-07 TCCGTGTGGG ACCGGAACCGCC CGATTCAAAC
44967            274   8.45e-07 TCTAGCCATC ACGGGGACAACC TTAACTCGAT
49713            408   1.80e-06 ATCCTAAGAC ACGGGGACGTCC ACTTTCTCAA
44084            151   3.20e-06 AACCCTGACC ATGGGAACCCCC GACCCCATGG
15468            168   5.42e-06 TGGAGGGCCC ACTGGATCCGCC TTTTTGGGCG
33540            140   7.58e-06 TAAGGATAAA ACGCGAAGAGCC ATCCATGGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
45967                   2.1e-07  451_[+2]_37
34896                   5.8e-07  87_[+2]_401
44967                   8.4e-07  273_[+2]_215
49713                   1.8e-06  407_[+2]_81
44084                   3.2e-06  150_[+2]_338
15468                   5.4e-06  167_[+2]_321
33540                   7.6e-06  139_[+2]_349
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=7
45967  (  452) ACGGGAACGACC  1
34896  (   88) ACCGGAACCGCC  1
44967  (  274) ACGGGGACAACC  1
49713  (  408) ACGGGGACGTCC  1
44084  (  151) ATGGGAACCCCC  1
15468  (  168) ACTGGATCCGCC  1
33540  (  140) ACGCGAAGAGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.77593 E= 1.7e+003
   193   -945   -945   -945
  -945    188   -945    -94
  -945    -70    163    -94
  -945    -70    189   -945
  -945   -945    212   -945
   144   -945     31   -945
   170   -945   -945    -94
  -945    188    -69   -945
    12     88     31   -945
    12    -70     89    -94
  -945    210   -945   -945
  -945    210   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 1.7e+003
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.142857  0.714286  0.142857
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.857143  0.142857  0.000000
 0.285714  0.428571  0.285714  0.000000
 0.285714  0.142857  0.428571  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 regular expression
--------------------------------------------------------------------------------
ACGGG[AG]AC[CAG][GA]CC
--------------------------------------------------------------------------------

Time 3.25 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 9 llr = 99 E-value = 7.3e+002
********************************************************************************
--------------------------------------------------------------------------------
    Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  2:::3:2::11:
pos.-specific     C  ::a:6:1:a2:9
probability       G  :::a::33:18:
matrix            T  8a::1a37:611

         bits    2.1   **    *
                 1.9  *** *  *
                 1.7  *** *  *
                 1.5  *** *  *  *
Relative         1.3  *** *  *  *
Entropy          1.1 **** * ** **
(15.9 bits)      0.8 **** * ** **
                 0.6 ****** ** **
                 0.4 ****** ** **
                 0.2 ****** *****
                 0.0 ------------

Multilevel           TTCGCTGTCTGC
consensus            A   A TG C
sequence                   A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start    P-value                Site
-------------  -----  ---------            ------------
49713            461   2.31e-07 ACCACGCGTA TTCGCTATCTGC ACCTTGTTGA
46636            330   3.50e-07 ATGGTCGAGT TTCGATGTCTGC CGCGCACCTG
43723            469   2.00e-06 TCAACATCGG TTCGCTCTCCGC CATTACTGCA
44084            199   2.23e-06 TTGCGGCAAG ATCGCTGTCCGC TGGACTTTCT
17187              9   2.79e-06   TCTTTTCC TTCGCTATCAGC CACACTACTT
46728            257   5.66e-06 ATATTTGAAC TTCGCTTGCTTC ATCGCACATA
34896            366   9.22e-06 CAATCAACAT TTCGATTGCTAC GGCGATTGTC
20318            446   1.17e-05 AGTTTGTTAT ATCGTTTGCTGC GTTCCCTTCA
49733            294   1.87e-05 GCGCGTTTCT TTCGATGTCGGT ACTTCCGGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
49713                   2.3e-07  460_[+3]_28
46636                   3.5e-07  329_[+3]_159
43723                     2e-06  468_[+3]_20
44084                   2.2e-06  198_[+3]_290
17187                   2.8e-06  8_[+3]_480
46728                   5.7e-06  256_[+3]_232
34896                   9.2e-06  365_[+3]_123
20318                   1.2e-05  445_[+3]_43
49733                   1.9e-05  293_[+3]_195
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
49713  (  461) TTCGCTATCTGC  1
46636  (  330) TTCGATGTCTGC  1
43723  (  469) TTCGCTCTCCGC  1
44084  (  199) ATCGCTGTCCGC  1
17187  (    9) TTCGCTATCAGC  1
46728  (  257) TTCGCTTGCTTC  1
34896  (  366) TTCGATTGCTAC  1
20318  (  446) ATCGTTTGCTGC  1
49733  (  294) TTCGATGTCGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 10.4181 E= 7.3e+002
   -24   -982   -982    150
  -982   -982   -982    186
  -982    210   -982   -982
  -982   -982    212   -982
    34    126   -982   -130
  -982   -982   -982    186
   -24   -106     53     28
  -982   -982     53    128
  -982    210   -982   -982
  -124     -6   -105    102
  -124   -982    175   -130
  -982    193   -982   -130
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 7.3e+002
 0.222222  0.000000  0.000000  0.777778
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.555556  0.000000  0.111111
 0.000000  0.000000  0.000000  1.000000
 0.222222  0.111111  0.333333  0.333333
 0.000000  0.000000  0.333333  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.222222  0.111111  0.555556
 0.111111  0.000000  0.777778  0.111111
 0.000000  0.888889  0.000000  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 regular expression
--------------------------------------------------------------------------------
[TA]TCG[CA]T[GTA][TG]C[TC]GC
--------------------------------------------------------------------------------

Time 4.88 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
    Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME  COMBINED P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
17187                  2.77e-03  8_[+3(2.79e-06)]_453_[+1(9.55e-05)]_\
    7
46636                  4.28e-08  329_[+3(3.50e-07)]_36_\
    [+1(1.24e-08)]_103
46728                  1.29e-02  256_[+3(5.66e-06)]_232
15468                  3.80e-02  167_[+2(5.42e-06)]_321
43723                  1.42e-08  207_[+1(3.22e-10)]_241_\
    [+3(2.00e-06)]_20
49733                  8.84e-06  293_[+3(1.87e-05)]_46_\
    [+1(5.58e-08)]_129
44084                  1.73e-09  150_[+2(3.20e-06)]_36_\
    [+3(2.23e-06)]_173_[+1(5.77e-09)]_97
33540                  3.79e-02  139_[+2(7.58e-06)]_349
45195                  4.40e-01  500
34896                  6.14e-05  87_[+2(5.81e-07)]_266_\
    [+3(9.22e-06)]_123
20318                  7.30e-02  154_[+3(4.31e-05)]_279_\
    [+3(1.17e-05)]_43
44967                  2.56e-08  25_[+3(9.86e-05)]_177_\
    [+1(9.38e-09)]_39_[+2(8.45e-07)]_215
45967                  1.59e-07  192_[+1(5.58e-08)]_239_\
    [+2(2.15e-07)]_37
49713                  9.01e-10  407_[+2(1.80e-06)]_41_\
    [+3(2.31e-07)]_[+1(4.86e-08)]_8
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************