********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/148/148.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10554                    1.0000    500  1337                     1.0000    500
19890                    1.0000    500  21956                    1.0000    500
25252                    1.0000    500  25480                    1.0000    500
261726                   1.0000    500  262636                   1.0000    500
262637                   1.0000    500  262807                   1.0000    500
263001                   1.0000    500  2771                     1.0000    500
3069                     1.0000    500  32319                    1.0000    500
32503                    1.0000    500  35643                    1.0000    500
37621                    1.0000    500  38632                    1.0000    500
4254                     1.0000    500  9163                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/148/148.seqs.fa -oc motifs/148 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10000    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.235 G 0.235 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.235 G 0.235 T 0.262
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 20 llr = 176 E-value = 4.7e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  24:921495611
pos.-specific     C  71:1::3:::21
probability       G  16a189315459
matrix            T  2:::::1:1:3:

         bits    2.1   *
                 1.9   *
                 1.7   *  *
                 1.5   *  * *
Relative         1.3   **** *   *
Entropy          1.0   **** * * *
(12.7 bits)      0.8  ***** *** *
                 0.6 ****** *** *
                 0.4 ****** *** *
                 0.2 ************
                 0.0 ------------

Multilevel           CGGAGGAAGAGG
consensus             A  A G AGT
sequence                   C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
4254                        239  9.01e-08 GTGGTGGTGA CGGAGGGAGAGG AAGGACGAAA
2771                          7  3.12e-06     TGAAGA AGGAGGAAAAGG CTCCCAAACG
35643                       150  4.88e-06 TTGCCTCTTG CGGAAGAAAGGG CATCCTCTTC
263001                      270  7.73e-06 AGCGAATATG AGGAGGAAGATG AGGAGGACGA
262636                      383  7.73e-06 GACGTGTTTA CGGAGGAATAGG TAAATGTAGA
21956                        53  7.73e-06 AAATCTTCAA CAGAGGAAGAAG TCATCACAAT
262637                      407  8.90e-06 GTGAGAGTTG CGGAGGAGAGGG GGTTTGAACC
10554                       371  9.76e-06 CGGAGGGAAG CAGAGGCAAGCG TCGCCTCGTC
37621                       158  2.28e-05 GAGGTGGATG CGGAGAGAGATG GCGGTTCATG
3069                        154  2.28e-05 ATTCGTTGAG GAGAGGGAGAGG GCTCTGTCCA
38632                       254  2.71e-05 AAGGTGACGA CGGCGGCAGACG AGGTGGTCGA
262807                      192  3.58e-05 AAACTGTTAG TAGAGGCAAGTG TATCAACACA
25480                       329  3.94e-05 CGCCATTTTT CGGGGGAAGGTG CCACCCTGAT
32503                       366  5.59e-05 CCCCTCCTCG CGGAGGGGGAGC CTCTGCCGGC
1337                        243  6.01e-05 CAATACGGAT TAGAAGGAAGGG TCGTGGTCTT
32319                       174  7.11e-05 CTTTCTTCAA CCGAGGAAGGAG CAAGCTTGTT
261726                       72  8.95e-05 GCATCAACCG CAGAAGCAAAGC GATTGCCCGT
25252                       291  1.11e-04 TCTCCCATCC AGGAGGCAAAGA GTTGGTGGAT
9163                        227  2.46e-04 TCTTGCATTC CAGAAAGAGGCG TCTTGAAGAG
19890                       440  3.31e-04 TCGAAACTCT TGGCGGTAAATG TGAGAATCGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4254                                9e-08  238_[+1]_250
2771                              3.1e-06  6_[+1]_482
35643                             4.9e-06  149_[+1]_339
263001                            7.7e-06  269_[+1]_219
262636                            7.7e-06  382_[+1]_106
21956                             7.7e-06  52_[+1]_436
262637                            8.9e-06  406_[+1]_82
10554                             9.8e-06  370_[+1]_118
37621                             2.3e-05  157_[+1]_331
3069                              2.3e-05  153_[+1]_335
38632                             2.7e-05  253_[+1]_235
262807                            3.6e-05  191_[+1]_297
25480                             3.9e-05  328_[+1]_160
32503                             5.6e-05  365_[+1]_123
1337                                6e-05  242_[+1]_246
32319                             7.1e-05  173_[+1]_315
261726                            8.9e-05  71_[+1]_417
25252                             0.00011  290_[+1]_198
9163                              0.00025  226_[+1]_262
19890                             0.00033  439_[+1]_49
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=20
4254                     (  239) CGGAGGGAGAGG  1
2771                     (    7) AGGAGGAAAAGG  1
35643                    (  150) CGGAAGAAAGGG  1
263001                   (  270) AGGAGGAAGATG  1
262636                   (  383) CGGAGGAATAGG  1
21956                    (   53) CAGAGGAAGAAG  1
262637                   (  407) CGGAGGAGAGGG  1
10554                    (  371) CAGAGGCAAGCG  1
37621                    (  158) CGGAGAGAGATG  1
3069                     (  154) GAGAGGGAGAGG  1
38632                    (  254) CGGCGGCAGACG  1
262807                   (  192) TAGAGGCAAGTG  1
25480                    (  329) CGGGGGAAGGTG  1
32503                    (  366) CGGAGGGGGAGC  1
1337                     (  243) TAGAAGGAAGGG  1
32319                    (  174) CCGAGGAAGGAG  1
261726                   (   72) CAGAAGCAAAGC  1
25252                    (  291) AGGAGGCAAAGA  1
9163                     (  227) CAGAAAGAGGCG  1
19890                    (  440) TGGCGGTAAATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9780 bayes= 9.18275 E= 4.7e+000
   -84    147   -223    -80
    38   -223    135  -1097
 -1097  -1097    209  -1097
   166   -123   -223  -1097
   -42  -1097    177  -1097
  -142  -1097    194  -1097
    58      9     35   -239
   175  -1097   -123  -1097
    75  -1097    109   -239
   116  -1097     77  -1097
  -142    -65    109     -7
  -242   -123    186  -1097
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 20 E= 4.7e+000
 0.150000  0.650000  0.050000  0.150000
 0.350000  0.050000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.850000  0.100000  0.050000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.100000  0.000000  0.900000  0.000000
 0.400000  0.250000  0.300000  0.050000
 0.900000  0.000000  0.100000  0.000000
 0.450000  0.000000  0.500000  0.050000
 0.600000  0.000000  0.400000  0.000000
 0.100000  0.150000  0.500000  0.250000
 0.050000  0.100000  0.850000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[GA]GA[GA]G[AGC]A[GA][AG][GT]G
--------------------------------------------------------------------------------

Time 4.37 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 9 llr = 140 E-value = 3.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :44:412:::12::::::1::
pos.-specific     C  33:66::31:322a:28:22:
probability       G  7:2::81194:6::92:8::a
matrix            T  :234:176:66:8:162278:

         bits    2.1              *      *
                 1.9              *      *
                 1.7         *    **     *
                 1.5         *    **     *
Relative         1.3 *       *   *** ** **
Entropy          1.0 *  ***  **  *** ** **
(22.4 bits)      0.8 *  ***  **  *** *****
                 0.6 *  ******************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GAACCGTTGTTGTCGTCGTTG
consensus            CCTTA AC GCAC  CTTCC
sequence              TG        C   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
1337                        151  3.54e-12 GCAAGCAGAT GCACCGTCGTTGTCGTCGTTG TCATTTGTCG
262636                       11  1.53e-10 GACATTTGCA GCGTCGTCGTCGTCGTCGTTG TGTAGGCGGC
9163                          5  3.48e-08       GAAC CAACAGTTGTTGCCGGCTTCG ACCGCACGAA
263001                       15  3.48e-08 TCAGTGTTGT GTTCAGGTGGTGTCGGTGTTG GAGTTCAGTA
32503                        73  6.16e-08 TGACGATAAT GCGTCAATGGAGTCGTCGTTG CTTTGGAGTG
261726                       29  8.34e-08 GTACTACGCT GTTCCGACGGCACCGTCGCTG TACTTTGCAT
35643                       122  8.98e-08 CTTGGAAGCG CAACAGTGGTTCTCTCCGTTG CCTCTTGCGG
3069                         52  3.00e-07 TCAATCAGTT CATTAGTTGGTATCGCTGACG ACCAAGTTTG
32319                       128  3.34e-07 AACGCCCCAT GAATCTTTCTCCTCGTCTCTG AAAACGCCCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1337                              3.5e-12  150_[+2]_329
262636                            1.5e-10  10_[+2]_469
9163                              3.5e-08  4_[+2]_475
263001                            3.5e-08  14_[+2]_465
32503                             6.2e-08  72_[+2]_407
261726                            8.3e-08  28_[+2]_451
35643                               9e-08  121_[+2]_358
3069                                3e-07  51_[+2]_428
32319                             3.3e-07  127_[+2]_352
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
1337                     (  151) GCACCGTCGTTGTCGTCGTTG  1
262636                   (   11) GCGTCGTCGTCGTCGTCGTTG  1
9163                     (    5) CAACAGTTGTTGCCGGCTTCG  1
263001                   (   15) GTTCAGGTGGTGTCGGTGTTG  1
32503                    (   73) GCGTCAATGGAGTCGTCGTTG  1
261726                   (   29) GTTCCGACGGCACCGTCGCTG  1
35643                    (  122) CAACAGTGGTTCTCTCCGTTG  1
3069                     (   52) CATTAGTTGGTATCGCTGACG  1
32319                    (  128) GAATCTTTCTCCTCGTCTCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9600 bayes= 11.4628 E= 3.9e+001
  -982     50    151   -982
    73     50   -982    -24
    73   -982     -8     35
  -982    124   -982     76
    73    124   -982   -982
  -127   -982    173   -124
   -27   -982   -108    135
  -982     50   -108    108
  -982   -108    192   -982
  -982   -982     92    108
  -127     50   -982    108
   -27     -8    124   -982
  -982     -8   -982    157
  -982    209   -982   -982
  -982   -982    192   -124
  -982     -8     -8    108
  -982    172   -982    -24
  -982   -982    173    -24
  -127     -8   -982    135
  -982     -8   -982    157
  -982   -982    209   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 3.9e+001
 0.000000  0.333333  0.666667  0.000000
 0.444444  0.333333  0.000000  0.222222
 0.444444  0.000000  0.222222  0.333333
 0.000000  0.555556  0.000000  0.444444
 0.444444  0.555556  0.000000  0.000000
 0.111111  0.000000  0.777778  0.111111
 0.222222  0.000000  0.111111  0.666667
 0.000000  0.333333  0.111111  0.555556
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.000000  0.444444  0.555556
 0.111111  0.333333  0.000000  0.555556
 0.222222  0.222222  0.555556  0.000000
 0.000000  0.222222  0.000000  0.777778
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.888889  0.111111
 0.000000  0.222222  0.222222  0.555556
 0.000000  0.777778  0.000000  0.222222
 0.000000  0.000000  0.777778  0.222222
 0.111111  0.222222  0.000000  0.666667
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC][ACT][ATG][CT][CA]G[TA][TC]G[TG][TC][GAC][TC]CG[TCG][CT][GT][TC][TC]G
--------------------------------------------------------------------------------

Time 8.62 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 10 llr = 126 E-value = 5.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :3a1:9::16:6::7
pos.-specific     C  1::31::4:31::23
probability       G  97:691158:9158:
matrix            T  ::::::9111:35::

         bits    2.1
                 1.9   *
                 1.7 * * *     *
                 1.5 * * ***   *  *
Relative         1.3 *** ***   *  *
Entropy          1.0 *** *** * * ***
(18.2 bits)      0.8 ******* * * ***
                 0.6 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GGAGGATGGAGAGGA
consensus             A C   C C TTCC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
262637                      479  7.92e-10 GGGAGGAGGA GGAGGATGGAGAGGA TAGTTTC
32503                       207  1.67e-08 CTACTCCTCC GGAGGATCGAGTTGA TTACTAATAT
4254                        144  7.64e-08 AAGGCTTTGT GGAGGATGGAGAGCC TGTAACGGTA
38632                       175  4.98e-07 AGGTCGTTTG GGACGATGGCGGTGC CACCGGCGAA
35643                       211  8.24e-07 CCTCGCTTGG GAAGCATCGCGATGA TTCAAATGAC
19890                       162  8.24e-07 AATCAACATT GGAGGATTGACATGA AGCTCGTTAT
25480                       169  1.63e-06 ATTCCGCGTG CAACGATCGAGTGGA ACGATCAACA
21956                       279  2.79e-06 TACCCATGTT GGAAGGTCGCGTTGA GATTGTGTAG
9163                        187  3.16e-06 GGCCGCCTGC GAAGGATGTTGAGGC GGTCAAGTTT
263001                      286  5.62e-06 AAGATGAGGA GGACGAGGAAGAGCA GGTGTATCAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
262637                            7.9e-10  478_[+3]_7
32503                             1.7e-08  206_[+3]_279
4254                              7.6e-08  143_[+3]_342
38632                               5e-07  174_[+3]_311
35643                             8.2e-07  210_[+3]_275
19890                             8.2e-07  161_[+3]_324
25480                             1.6e-06  168_[+3]_317
21956                             2.8e-06  278_[+3]_207
9163                              3.2e-06  186_[+3]_299
263001                            5.6e-06  285_[+3]_200
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=10
262637                   (  479) GGAGGATGGAGAGGA  1
32503                    (  207) GGAGGATCGAGTTGA  1
4254                     (  144) GGAGGATGGAGAGCC  1
38632                    (  175) GGACGATGGCGGTGC  1
35643                    (  211) GAAGCATCGCGATGA  1
19890                    (  162) GGAGGATTGACATGA  1
25480                    (  169) CAACGATCGAGTGGA  1
21956                    (  279) GGAAGGTCGCGTTGA  1
9163                     (  187) GAAGGATGTTGAGGC  1
263001                   (  286) GGACGAGGAAGAGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 9720 bayes= 10.1751 E= 5.1e+001
  -997   -123    194   -997
    16   -997    158   -997
   190   -997   -997   -997
  -142     35    135   -997
  -997   -123    194   -997
   175   -997   -123   -997
  -997   -997   -123    178
  -997     77    109   -139
  -142   -997    177   -139
   116     35   -997   -139
  -997   -123    194   -997
   116   -997   -123     19
  -997   -997    109     93
  -997    -23    177   -997
   138     35   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 10 E= 5.1e+001
 0.000000  0.100000  0.900000  0.000000
 0.300000  0.000000  0.700000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.100000  0.300000  0.600000  0.000000
 0.000000  0.100000  0.900000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.400000  0.500000  0.100000
 0.100000  0.000000  0.800000  0.100000
 0.600000  0.300000  0.000000  0.100000
 0.000000  0.100000  0.900000  0.000000
 0.600000  0.000000  0.100000  0.300000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.200000  0.800000  0.000000
 0.700000  0.300000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[GA]A[GC]GAT[GC]G[AC]G[AT][GT][GC][AC]
--------------------------------------------------------------------------------

Time 12.66 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10554                            1.47e-02  370_[+1(9.76e-06)]_118
1337                             2.38e-09  150_[+2(3.54e-12)]_71_\
    [+1(6.01e-05)]_246
19890                            1.28e-03  161_[+3(8.24e-07)]_324
21956                            2.22e-04  52_[+1(7.73e-06)]_214_\
    [+3(2.79e-06)]_207
25252                            7.89e-02  500
25480                            3.90e-04  168_[+3(1.63e-06)]_145_\
    [+1(3.94e-05)]_160
261726                           3.57e-05  28_[+2(8.34e-08)]_22_[+1(8.95e-05)]_\
    417
262636                           8.22e-09  10_[+2(1.53e-10)]_260_\
    [+2(4.96e-05)]_70_[+1(7.73e-06)]_106
262637                           3.34e-07  406_[+1(8.90e-06)]_60_\
    [+3(7.92e-10)]_7
262807                           1.65e-01  191_[+1(3.58e-05)]_297
263001                           4.75e-08  14_[+2(3.48e-08)]_234_\
    [+1(7.73e-06)]_4_[+3(5.62e-06)]_200
2771                             6.06e-03  6_[+1(3.12e-06)]_334_[+1(8.28e-05)]_\
    136
3069                             1.18e-04  51_[+2(3.00e-07)]_81_[+1(2.28e-05)]_\
    335
32319                            4.09e-04  127_[+2(3.34e-07)]_25_\
    [+1(7.11e-05)]_315
32503                            2.33e-09  72_[+2(6.16e-08)]_113_\
    [+3(1.67e-08)]_144_[+1(5.59e-05)]_123
35643                            1.28e-08  121_[+2(8.98e-08)]_7_[+1(4.88e-06)]_\
    49_[+3(8.24e-07)]_275
37621                            7.09e-02  157_[+1(2.28e-05)]_331
38632                            8.32e-05  174_[+3(4.98e-07)]_64_\
    [+1(2.71e-05)]_235
4254                             1.92e-07  143_[+3(7.64e-08)]_80_\
    [+1(9.01e-08)]_250
9163                             6.21e-07  4_[+2(3.48e-08)]_161_[+3(3.16e-06)]_\
    299
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************