********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/448/448.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42584                    1.0000    500  9233                     1.0000    500
43026                    1.0000    500  47332                    1.0000    500
33251                    1.0000    500  31518                    1.0000    500
37028                    1.0000    500  35443                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/448/448.seqs.fa -oc motifs/448 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.230 G 0.227 T 0.278
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.230 G 0.227 T 0.277
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  19  sites =   4  llr = 76  E-value = 1.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::3:53:8::::::a
pos.-specific     C  a::a8:3:3:::::55:5:
probability       G  :aa:33:a33835a55a5:
matrix            T  :::::85::53:5::::::

         bits    2.1 ****   *     *  *
                 1.9 ****   *     *  * *
                 1.7 ****   *     *  * *
                 1.5 ****   *     *  * *
Relative         1.3 *****  *  *  *  * *
Entropy          1.1 ****** *  *********
(27.5 bits)      0.9 ****** *  *********
                 0.6 ****** *  *********
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           CGGCCTTGATGAGGCCGCA
consensus                GGA CATGT GG G
sequence                   C GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
33251                       165  5.58e-11 TCTACTTATC CGGCCTCGATGAGGCCGCA AAATCACTAA
9233                        226  6.68e-10 CCAGCGTCTA CGGCCTTGCAGATGCGGGA TCCTTTCATC
47332                       274  2.82e-09 GATTTCTCGA CGGCGTTGATTATGGCGCA GAATCTTGCT
43026                       393  6.04e-09 TGTCTCTCGC CGGCCGAGGGGGGGGGGGA AACGCACACG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33251                             5.6e-11  164_[+1]_317
9233                              6.7e-10  225_[+1]_256
47332                             2.8e-09  273_[+1]_208
43026                               6e-09  392_[+1]_89
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=4
33251                    (  165) CGGCCTCGATGAGGCCGCA  1
9233                     (  226) CGGCCTTGCAGATGCGGGA  1
47332                    (  274) CGGCGTTGATTATGGCGCA  1
43026                    (  393) CGGCCGAGGGGGGGGGGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 3856 bayes= 9.91139 E= 1.8e+002
  -865    212   -865   -865
  -865   -865    214   -865
  -865   -865    214   -865
  -865    212   -865   -865
  -865    170     14   -865
  -865   -865     14    143
    -9     12   -865     85
  -865   -865    214   -865
    91     12     14   -865
    -9   -865     14     85
  -865   -865    172    -15
   149   -865     14   -865
  -865   -865    114     85
  -865   -865    214   -865
  -865    112    114   -865
  -865    112    114   -865
  -865   -865    214   -865
  -865    112    114   -865
   191   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 1.8e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.250000  0.250000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.250000  0.000000  0.250000  0.500000
 0.000000  0.000000  0.750000  0.250000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CGGC[CG][TG][TAC]G[ACG][TAG][GT][AG][GT]G[CG][CG]G[CG]A
--------------------------------------------------------------------------------

Time  0.53 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   8  llr = 110  E-value = 3.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :916913419::14:94:3:
pos.-specific     C  3:141:158:::33:13::6
probability       G  815::43111:643a::::3
matrix            T  ::3::54:::a431::4a81

         bits    2.1               *
                 1.9           *   *  *
                 1.7           *   *  *
                 1.5  *  *    **   ** *
Relative         1.3 **  *    **   ** *
Entropy          1.1 ** **   ****  ** **
(19.9 bits)      0.9 ** **   ****  ** ***
                 0.6 ** *** *****  ** ***
                 0.4 ** *** *****  ******
                 0.2 ****** *************
                 0.0 --------------------

Multilevel           GAGAATTCCATGGAGAATTC
consensus            C TC GAA   TCC  T AG
sequence                   G     TG  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
35443                       285  7.48e-09 ACAAACAGCA GAGAAGGACATGGAGATTTT GCTCTTGGGC
42584                       362  2.21e-08 TTTTCTGGTA CAGAATACCATGCCGACTTG CATGATTTAG
47332                       250  4.03e-08 TTCTGCTTGC GAGCATAAAATGTAGATTTC TCGACGGCGT
9233                        162  7.77e-08 ATTCCACTCA GACAAGGCCATGATGAATTC AATCTGTCAC
37028                        69  2.22e-07 AGATTTTGTT GATAATTCCATTGGGCCTAC TAAATCTTTC
31518                       419  3.96e-07 TTTGTCCGAC GGGCATCCCATTTGGAATAC GCAGGGTGCT
43026                        49  9.29e-07 TACGTGGCAG CATCCATACATGCCGATTTC GTTGCACTTG
33251                       450  2.52e-06 AAATCTCTGC GAAAAGTGGGTTGAGAATTG AAGTTTCAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35443                             7.5e-09  284_[+2]_196
42584                             2.2e-08  361_[+2]_119
47332                               4e-08  249_[+2]_231
9233                              7.8e-08  161_[+2]_319
37028                             2.2e-07  68_[+2]_412
31518                               4e-07  418_[+2]_62
43026                             9.3e-07  48_[+2]_432
33251                             2.5e-06  449_[+2]_31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=8
35443                    (  285) GAGAAGGACATGGAGATTTT  1
42584                    (  362) CAGAATACCATGCCGACTTG  1
47332                    (  250) GAGCATAAAATGTAGATTTC  1
9233                     (  162) GACAAGGCCATGATGAATTC  1
37028                    (   69) GATAATTCCATTGGGCCTAC  1
31518                    (  419) GGGCATCCCATTTGGAATAC  1
43026                    (   49) CATCCATACATGCCGATTTC  1
33251                    (  450) GAAAAGTGGGTTGAGAATTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3848 bayes= 9.64506 E= 3.8e+001
  -965     12    172   -965
   172   -965    -86   -965
  -109    -88    114    -15
   123     71   -965   -965
   172    -88   -965   -965
  -109   -965     72     85
    -9    -88     14     43
    50    112    -86   -965
  -109    171    -86   -965
   172   -965    -86   -965
  -965   -965   -965    185
  -965   -965    146     43
  -109     12     72    -15
    50     12     14   -115
  -965   -965    214   -965
   172    -88   -965   -965
    50     12   -965     43
  -965   -965   -965    185
    -9   -965   -965    143
  -965    144     14   -115
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 3.8e+001
 0.000000  0.250000  0.750000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.125000  0.125000  0.500000  0.250000
 0.625000  0.375000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.125000  0.000000  0.375000  0.500000
 0.250000  0.125000  0.250000  0.375000
 0.375000  0.500000  0.125000  0.000000
 0.125000  0.750000  0.125000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.625000  0.375000
 0.125000  0.250000  0.375000  0.250000
 0.375000  0.250000  0.250000  0.125000
 0.000000  0.000000  1.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.375000  0.250000  0.000000  0.375000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.625000  0.250000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC]A[GT][AC]A[TG][TAG][CA]CAT[GT][GCT][ACG]GA[ATC]T[TA][CG]
--------------------------------------------------------------------------------

Time  1.15 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   8  llr = 98  E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  8:5::1:496:a415:
pos.-specific     C  351::383::9:::49
probability       G  ::34:::3:41::9:1
matrix            T  :516a6311:::6:1:

         bits    2.1
                 1.9     *      *
                 1.7     *      *
                 1.5     *     ** * *
Relative         1.3     * * * ** * *
Entropy          1.1 ** ** * **** * *
(17.7 bits)      0.9 ** ** * ****** *
                 0.6 ** **** ********
                 0.4 ** **** ********
                 0.2 ****************
                 0.0 ----------------

Multilevel           ACATTTCAAACATGAC
consensus            CTGG CTC G  A C
sequence                    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
33251                        68  4.83e-09 GAAGCAAACA ACAGTTCAAACATGCC GGATGGAGCG
37028                       193  1.27e-07 CTTCCCGTTT ACATTTCAAACAAGTC ATCGCCATAT
35443                        10  2.66e-07  ACGTCTAGA ACCGTTCCAACAAGCC GCTTGGAGTG
31518                       190  9.42e-07 TTCTGGGCCG ACGGTTCCAGCATGAG TGGCTTGCAT
9233                         82  1.12e-06 AATGAAACCC ATATTATGAGCATGAC CCATGACGAG
42584                        15  1.42e-06 AGCTTGGTTT CTATTCCTAGCAAGAC ATCCATTGAA
43026                       483  5.56e-06 GTTGACGTTC CTTTTCCAAACATACC GT
47332                       390  8.25e-06 AATCTTCATA ATGTTTTGTAGATGAC ACAAATATTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33251                             4.8e-09  67_[+3]_417
37028                             1.3e-07  192_[+3]_292
35443                             2.7e-07  9_[+3]_475
31518                             9.4e-07  189_[+3]_295
9233                              1.1e-06  81_[+3]_403
42584                             1.4e-06  14_[+3]_470
43026                             5.6e-06  482_[+3]_2
47332                             8.2e-06  389_[+3]_95
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
33251                    (   68) ACAGTTCAAACATGCC  1
37028                    (  193) ACATTTCAAACAAGTC  1
35443                    (   10) ACCGTTCCAACAAGCC  1
31518                    (  190) ACGGTTCCAGCATGAG  1
9233                     (   82) ATATTATGAGCATGAC  1
42584                    (   15) CTATTCCTAGCAAGAC  1
43026                    (  483) CTTTTCCAAACATACC  1
47332                    (  390) ATGTTTTGTAGATGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3880 bayes= 9.65702 E= 1.5e+002
   149     12   -965   -965
  -965    112   -965     85
    91    -88     14   -115
  -965   -965     72    117
  -965   -965   -965    185
  -109     12   -965    117
  -965    171   -965    -15
    50     12     14   -115
   172   -965   -965   -115
   123   -965     72   -965
  -965    193    -86   -965
   191   -965   -965   -965
    50   -965   -965    117
  -109   -965    195   -965
    91     71   -965   -115
  -965    193    -86   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 1.5e+002
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.500000  0.125000  0.250000  0.125000
 0.000000  0.000000  0.375000  0.625000
 0.000000  0.000000  0.000000  1.000000
 0.125000  0.250000  0.000000  0.625000
 0.000000  0.750000  0.000000  0.250000
 0.375000  0.250000  0.250000  0.125000
 0.875000  0.000000  0.000000  0.125000
 0.625000  0.000000  0.375000  0.000000
 0.000000  0.875000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.375000  0.000000  0.000000  0.625000
 0.125000  0.000000  0.875000  0.000000
 0.500000  0.375000  0.000000  0.125000
 0.000000  0.875000  0.125000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AC][CT][AG][TG]T[TC][CT][ACG]A[AG]CA[TA]G[AC]C
--------------------------------------------------------------------------------

Time  1.66 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42584                            5.45e-08  14_[+3(1.42e-06)]_215_\
    [+1(5.72e-05)]_97_[+2(2.21e-08)]_119
9233                             3.70e-12  81_[+3(1.12e-06)]_64_[+2(7.77e-08)]_\
    44_[+1(6.68e-10)]_256
43026                            1.31e-09  48_[+2(9.29e-07)]_216_\
    [+1(3.35e-05)]_89_[+1(6.04e-09)]_71_[+3(5.56e-06)]_2
47332                            5.01e-11  249_[+2(4.03e-08)]_4_[+1(2.82e-09)]_\
    97_[+3(8.25e-06)]_95
33251                            5.56e-14  67_[+3(4.83e-09)]_81_[+1(5.58e-11)]_\
    266_[+2(2.52e-06)]_31
31518                            1.30e-05  189_[+3(9.42e-07)]_213_\
    [+2(3.96e-07)]_62
37028                            1.23e-06  68_[+2(2.22e-07)]_104_\
    [+3(1.27e-07)]_292
35443                            6.41e-08  9_[+3(2.66e-07)]_259_[+2(7.48e-09)]_\
    196
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************