********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/268/268.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10041                    1.0000    500  11026                    1.0000    500
19631                    1.0000    500  21801                    1.0000    500
21841                    1.0000    500  22249                    1.0000    500
22311                    1.0000    500  23677                    1.0000    500
23836                    1.0000    500  23946                    1.0000    500
24801                    1.0000    500  261117                   1.0000    500
262617                   1.0000    500  262877                   1.0000    500
264728                   1.0000    500  38363                    1.0000    500
5904                     1.0000    500  6473                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/268/268.seqs.fa -oc motifs/268 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 18 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9000 N= 18 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.267 C 0.239 G 0.232 T 0.262 Background letter frequencies (from dataset with add-one prior applied): A 0.267 C 0.239 G 0.232 T 0.262 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 16 llr = 192 E-value = 1.7e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 545:63723:13:1212:13: pos.-specific C 2:1:11:::23::31::2:16 probability G 361a26278633a2681393: matrix T ::3:1:11:244:41175:44 bits 2.1 * * 1.9 * * 1.7 * * * 1.5 * * * Relative 1.3 * * * * Entropy 1.1 * * * * * * * (17.3 bits) 0.8 * * **** * ** * * 0.6 * * ***** * ***** * 0.4 ** ******* ** ***** * 0.2 ********************* 0.0 --------------------- Multilevel AGAGAGAGGGTTGTGGTTGTC consensus GAT A A GA C G AT sequence CG G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 21841 177 1.31e-10 
TGGCAAGTAC AAAGAGAGGGTTGGGGTTGGC CATGGTTGAT 23946 254 6.18e-08 GAGGAGGGGT AGCGAGAGGGAGGTGGTGGAT GGATGGACAC 264728 83 7.97e-08 TGCTCCACTG AAAGACAGAGTGGCAGTTGTC AAGAGTAGGA 22311 338 1.30e-07 ACGATGCAGG AGTGAGAGGCTTGAGGAGGAC ACGTGTGGAC 21801 325 2.59e-07 TTTCGTCGGG CAAGGGAGGTTGGTGGGTGGC TTGGGAGATG 38363 249 3.22e-07 AACCATTGCA CAAGCGAGGGTTGGGATTGTT TGGCTTTGGA 23836 1 3.98e-07 . GGTGTGTGGGCAGTGGTTGGC TGGAGTAGAA 6473 106 7.30e-07 ATGAATCTAT CAAGAGAGACTAGCGTTTGTT AGCACAAATG 23677 350 9.74e-07 TTCTTCTTGT GGTGAAATGTCTGTGGATGTC TGATCTCACT 22249 313 9.74e-07 ACGATGGCAG GGAGAGGAGGGGGCAGGGGAC CACAAGAGTA 5904 182 4.11e-06 ATAAAAGAAG AGGGAGAGAGGAGCAGTCAGC ATTGTAGCGG 262877 350 6.37e-06 GTCAAAAAGA AGAGGAAGAGCTGCCGTGGCT GGCGCTGGGT 10041 132 6.83e-06 CATCGTGGCA GGAGCCAAGTGTGTTGTGGTT ACGCCTCAGA 261117 426 7.31e-06 ACTCGAGGTG GAGGAGGAGCCAGTGGACGTT TTACGAGCGT 11026 413 1.08e-05 CTTTTACCAC AACGAAGTGGGAGGGTTCGTC TCCAGTTCCA 262617 197 1.47e-05 TGGAGAAAAG AGTGGATGGGGGGATATTGAC CCTCACCACG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 21841 1.3e-10 176_[+1]_303 23946 6.2e-08 253_[+1]_226 264728 8e-08 82_[+1]_397 22311 1.3e-07 337_[+1]_142 21801 2.6e-07 324_[+1]_155 38363 3.2e-07 248_[+1]_231 23836 4e-07 [+1]_479 6473 7.3e-07 105_[+1]_374 23677 9.7e-07 349_[+1]_130 22249 9.7e-07 312_[+1]_167 5904 4.1e-06 181_[+1]_298 262877 6.4e-06 349_[+1]_130 10041 6.8e-06 131_[+1]_348 261117 7.3e-06 425_[+1]_54 11026 1.1e-05 412_[+1]_67 262617 1.5e-05 196_[+1]_283 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format 
-------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=16 21841 ( 177) AAAGAGAGGGTTGGGGTTGGC 1 23946 ( 254) AGCGAGAGGGAGGTGGTGGAT 1 264728 ( 83) AAAGACAGAGTGGCAGTTGTC 1 22311 ( 338) AGTGAGAGGCTTGAGGAGGAC 1 21801 ( 325) CAAGGGAGGTTGGTGGGTGGC 1 38363 ( 249) CAAGCGAGGGTTGGGATTGTT 1 23836 ( 1) GGTGTGTGGGCAGTGGTTGGC 1 6473 ( 106) CAAGAGAGACTAGCGTTTGTT 1 23677 ( 350) GGTGAAATGTCTGTGGATGTC 1 22249 ( 313) GGAGAGGAGGGGGCAGGGGAC 1 5904 ( 182) AGGGAGAGAGGAGCAGTCAGC 1 262877 ( 350) AGAGGAAGAGCTGCCGTGGCT 1 10041 ( 132) GGAGCCAAGTGTGTTGTGGTT 1 261117 ( 426) GAGGAGGAGCCAGTGGACGTT 1 11026 ( 413) AACGAAGTGGGAGGGTTCGTC 1 262617 ( 197) AGTGGATGGGGGGATATTGAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 9.81218 E= 1.7e-001 91 -35 43 -1064 71 -1064 128 -1064 91 -93 -89 -7 -1064 -1064 210 -1064 123 -93 -31 -207 -9 -93 143 -1064 137 -1064 -31 -107 -51 -1064 156 -107 -9 -1064 169 -1064 -1064 -35 143 -48 -209 7 43 51 23 -1064 43 51 -1064 -1064 210 -1064 -109 39 -31 51 -51 -193 143 -107 -109 -1064 169 -107 -51 -1064 -89 139 -1064 -35 43 93 -209 -1064 201 -1064 -9 -193 11 74 -1064 139 -1064 51 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 16 E= 1.7e-001 0.500000 0.187500 0.312500 0.000000 0.437500 0.000000 0.562500 0.000000 0.500000 0.125000 0.125000 0.250000 0.000000 0.000000 1.000000 0.000000 0.625000 0.125000 0.187500 0.062500 0.250000 0.125000 0.625000 0.000000 
0.687500 0.000000 0.187500 0.125000 0.187500 0.000000 0.687500 0.125000 0.250000 0.000000 0.750000 0.000000 0.000000 0.187500 0.625000 0.187500 0.062500 0.250000 0.312500 0.375000 0.312500 0.000000 0.312500 0.375000 0.000000 0.000000 1.000000 0.000000 0.125000 0.312500 0.187500 0.375000 0.187500 0.062500 0.625000 0.125000 0.125000 0.000000 0.750000 0.125000 0.187500 0.000000 0.125000 0.687500 0.000000 0.187500 0.312500 0.500000 0.062500 0.000000 0.937500 0.000000 0.250000 0.062500 0.250000 0.437500 0.000000 0.625000 0.000000 0.375000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [AG][GA][AT]GA[GA]AG[GA]G[TGC][TAG]G[TC]GGT[TG]G[TAG][CT] -------------------------------------------------------------------------------- Time 3.19 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 10 llr = 146 E-value = 1.4e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 116:::::132::a5::43:a pos.-specific C 49216:a961548:275226: probability G ::2::1:1151:2:1131:4: matrix T 5::949::2126::22235:: bits 2.1 * 1.9 * * * 1.7 * ** * * 1.5 * * *** * * Relative 1.3 * * *** ** * Entropy 1.1 * ***** *** ** (21.1 bits) 0.8 * ***** *** * ** 0.6 ******** *** ** ** 0.4 ********** *** ** *** 0.2 ********************* 0.0 --------------------- Multilevel TCATCTCCCGCTCAACCATCA consensus C C T TAACG CTGTAG sequence G T T TCC -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 23677 464 6.87e-10 AACGACCGAG TCATCTCCAGCTCACCCATCA TCCATCGACG 22249 455 5.91e-09 CCCTCTCACT TCCTCTCCCAACCAACCACCA GTTCCTTTTT 261117 189 7.59e-09 ACTGTCTCTG TCGTTTCCGGCTCAACGTTCA ACGTACAGCA 22311 108 1.53e-08 ACGTGACATA CCATCTCCCGACCACTGATGA TTTGAACCTA 23946 474 2.37e-08 ACCCCAGCTC TCATCTCCCCCTCAACTGAGA TGCACA 6473 351 1.38e-07 CATGACTTCT TCCTCTCCCATTCATTGCACA AAGATGCATT 19631 217 2.04e-07 CTTCTCCGGT CCACCTCGCGGTCATCCATCA AATGGTCCTG 10041 391 5.33e-07 CTCTAATTTG CCATTGCCTACTGAACTCACA CCTCCATCCT 21801 421 9.02e-07 TACTACGCCT CAGTTTCCTTCCCAACCTCGA AGACAGAGAC 38363 158 9.53e-07 GAATTCGCCA ACATTTCCCGTCGAGGCTTGA GAATGATAGG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 23677 6.9e-10 463_[+2]_16 22249 5.9e-09 454_[+2]_25 261117 7.6e-09 188_[+2]_291 22311 1.5e-08 107_[+2]_372 23946 2.4e-08 473_[+2]_6 6473 1.4e-07 350_[+2]_129 19631 2e-07 216_[+2]_263 10041 5.3e-07 390_[+2]_89 21801 9e-07 420_[+2]_59 38363 9.5e-07 157_[+2]_322 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=10 23677 ( 464) TCATCTCCAGCTCACCCATCA 1 22249 ( 455) TCCTCTCCCAACCAACCACCA 1 261117 ( 189) TCGTTTCCGGCTCAACGTTCA 1 22311 ( 108) CCATCTCCCGACCACTGATGA 1 23946 ( 474) 
TCATCTCCCCCTCAACTGAGA 1 6473 ( 351) TCCTCTCCCATTCATTGCACA 1 19631 ( 217) CCACCTCGCGGTCATCCATCA 1 10041 ( 391) CCATTGCCTACTGAACTCACA 1 21801 ( 421) CAGTTTCCTTCCCAACCTCGA 1 38363 ( 158) ACATTTCCCGTCGAGGCTTGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 9.18682 E= 1.4e+001 -141 75 -997 93 -141 191 -997 -997 117 -25 -22 -997 -997 -125 -997 178 -997 133 -997 61 -997 -997 -121 178 -997 207 -997 -997 -997 191 -121 -997 -141 133 -121 -39 17 -125 110 -139 -41 107 -121 -39 -997 75 -997 119 -997 174 -22 -997 191 -997 -997 -997 91 -25 -121 -39 -997 155 -121 -39 -997 107 37 -39 58 -25 -121 19 17 -25 -997 93 -997 133 78 -997 191 -997 -997 -997 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 1.4e+001 0.100000 0.400000 0.000000 0.500000 0.100000 0.900000 0.000000 0.000000 0.600000 0.200000 0.200000 0.000000 0.000000 0.100000 0.000000 0.900000 0.000000 0.600000 0.000000 0.400000 0.000000 0.000000 0.100000 0.900000 0.000000 1.000000 0.000000 0.000000 0.000000 0.900000 0.100000 0.000000 0.100000 0.600000 0.100000 0.200000 0.300000 0.100000 0.500000 0.100000 0.200000 0.500000 0.100000 0.200000 0.000000 0.400000 0.000000 0.600000 0.000000 0.800000 0.200000 0.000000 1.000000 0.000000 0.000000 0.000000 0.500000 0.200000 0.100000 0.200000 0.000000 0.700000 0.100000 0.200000 0.000000 0.500000 0.300000 0.200000 0.400000 0.200000 0.100000 0.300000 0.300000 0.200000 0.000000 0.500000 0.000000 0.600000 0.400000 
0.000000 1.000000 0.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [TC]C[ACG]T[CT]TCC[CT][GA][CAT][TC][CG]A[ACT][CT][CGT][ATC][TAC][CG]A -------------------------------------------------------------------------------- Time 5.89 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 10 llr = 127 E-value = 6.0e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 27118541a85:a8:2 pos.-specific C 817922:9::29::48 probability G :1:::21::221:::: matrix T :12::15:::1::26: bits 2.1 1.9 * * 1.7 * ** ** 1.5 * ** ** Relative 1.3 * ** *** *** * Entropy 1.1 * ** *** ***** (18.3 bits) 0.8 * *** *** ***** 0.6 ***** **** ***** 0.4 ***** **** ***** 0.2 **************** 0.0 ---------------- Multilevel CACCAATCAAACAATC consensus A T CCA GC TCA sequence G G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 24801 65 3.63e-09 GCTTCTATGG CACCAAACAACCAATC TTCTTTCAGA 10041 434 1.22e-08 TACCACCACC CATCAATCAAACAACC AAACCACCCG 11026 191 1.08e-07 AGGTAGTATC CACCACACAGGCAATC ACAAAAGCGA 262617 354 1.25e-07 GCAATAAAAT CACCAATCAAACATCA TCACACTGCA 38363 457 2.68e-07 GCGCCATCAT 
CATCCATCAAACATTC CGGATTCATA 264728 218 3.31e-07 ACCAATCCAT CGCCAGACAACCAACC GGACGTAATT 262877 485 2.23e-06 GAGTACTGCC ACCAAAACAAACAATC 6473 288 4.18e-06 AATAATTACG ATCCATTCAATCAATC CTCCTCAAGC 21801 477 4.40e-06 CAATAACAGG CAACACTCAAAGAACA GCAAAATC 19631 387 8.40e-06 CCGCCTCAAC CACCCGGAAGGCAATC GTATCCAATG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 24801 3.6e-09 64_[+3]_420 10041 1.2e-08 433_[+3]_51 11026 1.1e-07 190_[+3]_294 262617 1.2e-07 353_[+3]_131 38363 2.7e-07 456_[+3]_28 264728 3.3e-07 217_[+3]_267 262877 2.2e-06 484_[+3] 6473 4.2e-06 287_[+3]_197 21801 4.4e-06 476_[+3]_8 19631 8.4e-06 386_[+3]_98 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=16 seqs=10 24801 ( 65) CACCAAACAACCAATC 1 10041 ( 434) CATCAATCAAACAACC 1 11026 ( 191) CACCACACAGGCAATC 1 262617 ( 354) CACCAATCAAACATCA 1 38363 ( 457) CATCCATCAAACATTC 1 264728 ( 218) CGCCAGACAACCAACC 1 262877 ( 485) ACCAAAACAAACAATC 1 6473 ( 288) ATCCATTCAATCAATC 1 21801 ( 477) CAACACTCAAAGAACA 1 19631 ( 387) CACCCGGAAGGCAATC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 10.02 E= 6.0e+001 -41 174 -997 -997 139 -125 -121 -139 -141 155 -997 -39 -141 191 -997 -997 158 -25 -997 -997 91 -25 -22 
-139 58 -997 -121 93 -141 191 -997 -997 191 -997 -997 -997 158 -997 -22 -997 91 -25 -22 -139 -997 191 -121 -997 191 -997 -997 -997 158 -997 -997 -39 -997 75 -997 119 -41 174 -997 -997 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 6.0e+001 0.200000 0.800000 0.000000 0.000000 0.700000 0.100000 0.100000 0.100000 0.100000 0.700000 0.000000 0.200000 0.100000 0.900000 0.000000 0.000000 0.800000 0.200000 0.000000 0.000000 0.500000 0.200000 0.200000 0.100000 0.400000 0.000000 0.100000 0.500000 0.100000 0.900000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.800000 0.000000 0.200000 0.000000 0.500000 0.200000 0.200000 0.100000 0.000000 0.900000 0.100000 0.000000 1.000000 0.000000 0.000000 0.000000 0.800000 0.000000 0.000000 0.200000 0.000000 0.400000 0.000000 0.600000 0.200000 0.800000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [CA]A[CT]C[AC][ACG][TA]CA[AG][ACG]CA[AT][TC][CA] -------------------------------------------------------------------------------- Time 8.69 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10041                            1.81e-09  131_[+1(6.83e-06)]_238_\
    [+2(5.33e-07)]_22_[+3(1.22e-08)]_51
11026                            2.98e-06  190_[+3(1.08e-07)]_206_\
    [+1(1.08e-05)]_67
19631                            2.42e-05  216_[+2(2.04e-07)]_149_\
    [+3(8.40e-06)]_98
21801                            3.28e-08  324_[+1(2.59e-07)]_75_\
    [+2(9.02e-07)]_35_[+3(4.40e-06)]_8
21841                            2.83e-06  57_[+1(2.96e-05)]_98_[+1(1.31e-10)]_\
    303
22249                            1.23e-07  312_[+1(9.74e-07)]_121_\
    [+2(5.91e-09)]_25
22311                            1.16e-07  107_[+2(1.53e-08)]_209_\
    [+1(1.30e-07)]_142
23677                            1.59e-08  349_[+1(9.74e-07)]_93_\
    [+2(6.87e-10)]_16
23836                            3.45e-04  [+1(3.98e-07)]_330_[+2(5.96e-05)]_\
    128
23946                            5.98e-08  253_[+1(6.18e-08)]_199_\
    [+2(2.37e-08)]_6
24801                            3.36e-05  64_[+3(3.63e-09)]_420
261117                           7.24e-07  188_[+2(7.59e-09)]_216_\
    [+1(7.31e-06)]_54
262617                           1.95e-05  196_[+1(1.47e-05)]_136_\
    [+3(1.25e-07)]_131
262877                           1.73e-04  349_[+1(6.37e-06)]_114_\
    [+3(2.23e-06)]
264728                           1.73e-07  82_[+1(7.97e-08)]_114_\
    [+3(3.31e-07)]_231_[+1(3.11e-05)]_15
38363                            3.20e-09  157_[+2(9.53e-07)]_70_\
    [+1(3.22e-07)]_187_[+3(2.68e-07)]_28
5904                             8.04e-03  181_[+1(4.11e-06)]_298
6473                             1.44e-08  105_[+1(7.30e-07)]_161_\
    [+3(4.18e-06)]_47_[+2(1.38e-07)]_129
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************