********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/274/274.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
22554                    1.0000    500  43638                    1.0000    500
43698                    1.0000    500  49350                    1.0000    500
23260                    1.0000    500  50339                    1.0000    500
50570                    1.0000    500  34782                    1.0000    500
46031                    1.0000    500  48228                    1.0000    500
34978                    1.0000    500  38213                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/274/274.seqs.fa -oc motifs/274 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.239 G 0.239 T 0.249
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.239 G 0.239 T 0.249
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 9 llr = 139 E-value = 2.4e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :142::1973:291131:::2
pos.-specific     C  :7:87:::::a7:4:7311::
probability       G  1:6:3971:2:1:33:2491:
matrix            T  92:::12:34::116:34:98

         bits    2.1           *
                 1.9           *
                 1.7      *    *       *
                 1.4 *    * *  * *     **
Relative         1.2 *  *** *  * *     ***
Entropy          1.0 * **** ** * *  *  ***
(22.2 bits)      0.8 ********* ***  *  ***
                 0.6 ********* *** ** ****
                 0.4 ************* ** ****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TCGCCGGAATCCACTCCGGTT
consensus             TAAG T TA A GGATT  A
sequence                      G      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
50339                       405  3.83e-12 GTCAGACAAA TCGCCGGAAACCACTCTTGTT TACAACAGTT
49350                       405  1.16e-11 GTCAGACAAA TCACCGGAAACCACTCTTGTT TACAACAGTT
48228                       211  1.58e-08 TTCATCCTGG TCGACGGATACAACGATGGTT CGTTTTAGGG
43698                        51  4.20e-08 TGGGATGAAA TTACCGGAATCCAGTACGGGA AAATTGAGGG
23260                       355  1.42e-07 AATCCCCAAA TCGAGGAAATCAAGTCGCGTT TTCTAGGATC
22554                       117  1.42e-07 CATCGCAACT GAACGGGAAGCCACGCATGTT CCAAAAGCCC
34782                       450  1.85e-07 GAGTCGGCAG TCGCCGTGTTCGAGGCCGGTA CATTCGCTTC
46031                       211  2.55e-07 TGGGTGAGAA TCGCGTTAAGCCTTTCCTGTT GGTTTTGTTG
50570                        88  4.10e-07 AACACCAGAC TTACCGGATTCCAAAAGGCTT GGTGCACGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50339                             3.8e-12  404_[+1]_75
49350                             1.2e-11  404_[+1]_75
48228                             1.6e-08  210_[+1]_269
43698                             4.2e-08  50_[+1]_429
23260                             1.4e-07  354_[+1]_125
22554                             1.4e-07  116_[+1]_363
34782                             1.9e-07  449_[+1]_30
46031                             2.6e-07  210_[+1]_269
50570                             4.1e-07  87_[+1]_392
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
50339                    (  405) TCGCCGGAAACCACTCTTGTT  1
49350                    (  405) TCACCGGAAACCACTCTTGTT  1
48228                    (  211) TCGACGGATACAACGATGGTT  1
43698                    (   51) TTACCGGAATCCAGTACGGGA  1
23260                    (  355) TCGAGGAAATCAAGTCGCGTT  1
22554                    (  117) GAACGGGAAGCCACGCATGTT  1
34782                    (  450) TCGCCGTGTTCGAGGCCGGTA  1
46031                    (  211) TCGCGTTAAGCCTTTCCTGTT  1
50570                    (   88) TTACCGGATTCCAAAAGGCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 9.45417 E= 2.4e-001
  -982   -982   -110    183
  -130    148   -982    -16
    70   -982    122   -982
   -30    170   -982   -982
  -982    148     48   -982
  -982   -982    189   -116
  -130   -982    148    -16
   170   -982   -110   -982
   129   -982   -982     42
    29   -982    -10     83
  -982    206   -982   -982
   -30    148   -110   -982
   170   -982   -982   -116
  -130     90     48   -116
  -130   -982     48    116
    29    148   -982   -982
  -130     48    -10     42
  -982   -110     90     83
  -982   -110    189   -982
  -982   -982   -110    183
   -30   -982   -982    164
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 2.4e-001
 0.000000  0.000000  0.111111  0.888889
 0.111111  0.666667  0.000000  0.222222
 0.444444  0.000000  0.555556  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.888889  0.111111
 0.111111  0.000000  0.666667  0.222222
 0.888889  0.000000  0.111111  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.333333  0.000000  0.222222  0.444444
 0.000000  1.000000  0.000000  0.000000
 0.222222  0.666667  0.111111  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.111111  0.444444  0.333333  0.111111
 0.111111  0.000000  0.333333  0.555556
 0.333333  0.666667  0.000000  0.000000
 0.111111  0.333333  0.222222  0.333333
 0.000000  0.111111  0.444444  0.444444
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.000000  0.111111  0.888889
 0.222222  0.000000  0.000000  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
T[CT][GA][CA][CG]G[GT]A[AT][TAG]C[CA]A[CG][TG][CA][CTG][GT]GT[TA]
--------------------------------------------------------------------------------

Time 1.39 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 5 llr = 102 E-value = 1.8e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::4:::a2::a::::::2:a2
pos.-specific     C  :2:a::::8::82:8::84:8
probability       G  :8::8a::28:264:66::::
matrix            T  a:6:2::8:2::26244:6::

         bits    2.1 *  * *
                 1.9 *  * **   *        *
                 1.7 *  * **   *        *
                 1.4 ** * ** * **       *
Relative         1.2 ** *********  *  * **
Entropy          1.0 ************ ********
(29.6 bits)      0.8 ************ ********
                 0.6 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGTCGGATCGACGTCGGCTAC
consensus             CA T  AGT GCGTTTAC A
sequence                         T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
50339                        84  9.86e-13 CCGGACAGCC TGACGGATCGACGTCGGCTAC CGTTTCGGTC
49350                        82  9.86e-13 CCGGACAGCC TGACGGATCGACGTCGGCTAC CGTTTCGGTC
43698                       358  2.90e-10 GTATTTTCCG TGTCGGATCGAGGGCTGCCAA ACGGCACCAA
43638                        53  1.65e-09 CCTCCATTTT TGTCTGATGTACCTCGTCCAC TAGAGATCGT
23260                        27  5.17e-09 TATCTTGCGG TCTCGGAACGACTGTTTATAC CGCGACAATA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50339                             9.9e-13  83_[+2]_396
49350                             9.9e-13  81_[+2]_398
43698                             2.9e-10  357_[+2]_122
43638                             1.7e-09  52_[+2]_427
23260                             5.2e-09  26_[+2]_453
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
50339                    (   84) TGACGGATCGACGTCGGCTAC  1
49350                    (   82) TGACGGATCGACGTCGGCTAC  1
43698                    (  358) TGTCGGATCGAGGGCTGCCAA  1
43638                    (   53) TGTCTGATGTACCTCGTCCAC  1
23260                    (   27) TCTCGGAACGACTGTTTATAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 10.4204 E= 1.8e+000
  -897   -897   -897    200
  -897    -26    174   -897
    55   -897   -897    127
  -897    206   -897   -897
  -897   -897    174    -32
  -897   -897    206   -897
   187   -897   -897   -897
   -45   -897   -897    168
  -897    174    -26   -897
  -897   -897    174    -32
   187   -897   -897   -897
  -897    174    -26   -897
  -897    -26    133    -32
  -897   -897     74    127
  -897    174   -897    -32
  -897   -897    133     68
  -897   -897    133     68
   -45    174   -897   -897
  -897     74   -897    127
   187   -897   -897   -897
   -45    174   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 1.8e+000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.800000  0.000000
 0.400000  0.000000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.800000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.200000  0.600000  0.200000
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.000000  0.600000  0.400000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.400000  0.000000  0.600000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
T[GC][TA]C[GT]GA[TA][CG][GT]A[CG][GCT][TG][CT][GT][GT][CA][TC]A[CA]
--------------------------------------------------------------------------------

Time 2.72 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 4 llr = 88 E-value = 9.5e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :a35:5::::::5:::::8:
pos.-specific     C  a:55a:aaa8a8:aa83::a
probability       G  :::::::::::::::::a::
matrix            T  ::3::5:::3:35::38:3:

         bits    2.1 *   * *** *  **  * *
                 1.9 **  * *** *  **  * *
                 1.7 **  * *** *  **  * *
                 1.4 **  * *** *  **  * *
Relative         1.2 **  * ****** ***** *
Entropy          1.0 ** *****************
(31.9 bits)      0.8 ** *****************
                 0.6 ** *****************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CACACACCCCCCACCCTGAC
consensus              AC T   T TT  TC T
sequence               T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
49350                       293  1.74e-12 AGATTCGTCG CACCCACCCCCCTCCCTGAC ATTGTATTCG
50339                       293  2.75e-11 GAGTTCGTCG CACCCACCCCCTTCCCTGAC ATTGTATTCG
23260                       436  1.95e-10 GTTCCCATCC CATACTCCCCCCACCTTGTC CAGACTAATG
43698                       386  2.66e-10 CAAACGGCAC CAAACTCCCTCCACCCCGAC TCGAACCATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49350                             1.7e-12  292_[+3]_188
50339                             2.7e-11  292_[+3]_188
23260                             1.9e-10  435_[+3]_45
43698                             2.7e-10  385_[+3]_95
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=4
49350                    (  293) CACCCACCCCCCTCCCTGAC  1
50339                    (  293) CACCCACCCCCTTCCCTGAC  1
23260                    (  436) CATACTCCCCCCACCTTGTC  1
43698                    (  386) CAAACTCCCTCCACCCCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5772 bayes= 10.4939 E= 9.5e+000
  -865    206   -865   -865
   187   -865   -865   -865
   -13    106   -865      0
    87    106   -865   -865
  -865    206   -865   -865
    87   -865   -865    100
  -865    206   -865   -865
  -865    206   -865   -865
  -865    206   -865   -865
  -865    165   -865      0
  -865    206   -865   -865
  -865    165   -865      0
    87   -865   -865    100
  -865    206   -865   -865
  -865    206   -865   -865
  -865    165   -865      0
  -865      7   -865    159
  -865   -865    206   -865
   145   -865   -865      0
  -865    206   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 9.5e+000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.500000  0.000000  0.250000
 0.500000  0.500000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.500000  0.000000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
CA[CAT][AC]C[AT]CCC[CT]C[CT][AT]CC[CT][TC]G[AT]C
--------------------------------------------------------------------------------

Time 3.83 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22554                            2.19e-03  116_[+1(1.42e-07)]_363
43638                            1.16e-05  52_[+2(1.65e-09)]_427
43698                            3.39e-16  50_[+1(4.20e-08)]_286_\
    [+2(2.90e-10)]_7_[+3(2.66e-10)]_95
49350                            4.30e-24  81_[+2(9.86e-13)]_190_\
    [+3(1.74e-12)]_92_[+1(1.16e-11)]_75
23260                            1.24e-14  26_[+2(5.17e-09)]_307_\
    [+1(1.42e-07)]_60_[+3(1.95e-10)]_45
50339                            2.12e-23  83_[+2(9.86e-13)]_188_\
    [+3(2.75e-11)]_92_[+1(3.83e-12)]_75
50570                            2.76e-03  87_[+1(4.10e-07)]_392
34782                            1.47e-03  449_[+1(1.85e-07)]_30
46031                            1.02e-03  210_[+1(2.55e-07)]_269
48228                            4.11e-05  210_[+1(1.58e-08)]_269
34978                            8.84e-01  500
38213                            9.94e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************