********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/306/306.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42928                    1.0000    500  54602                    1.0000    500
20899                    1.0000    500  38294                    1.0000    500
48435                    1.0000    500  48572                    1.0000    500
49358                    1.0000    500  49570                    1.0000    500
50347                    1.0000    500  43867                    1.0000    500
43959                    1.0000    500  44828                    1.0000    500
45518                    1.0000    500  45926                    1.0000    500
49574                    1.0000    500  42600                    1.0000    500
47038                    1.0000    500  33664                    1.0000    500
43441                    1.0000    500  45761                    1.0000    500
48069                    1.0000    500  48168                    1.0000    500
45025                    1.0000    500  44101                    1.0000    500
50254                    1.0000    500  49201                    1.0000    500
43709                    1.0000    500  49001                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/306/306.seqs.fa -oc motifs/306 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       28    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           14000    N=              28

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.278 C 0.235 G 0.219 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.278 C 0.235 G 0.219 T 0.268
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   4  llr = 93  E-value = 9.7e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a::::a::::::::3:::::a
pos.-specific     C  ::38::::388::5:3:aa::
probability       G  :a83::3a8:3a:3:53::a:
matrix            T  ::::a:8::3::a3838::::

         bits    2.2  *     *   *     ***
                 2.0  *  *  *   **    ***
                 1.8 **  ** *   **    ****
                 1.5 **  ** *   **    ****
Relative         1.3 ****** ******    ****
Entropy          1.1 ************* * *****
(33.4 bits)      0.9 ************* * *****
                 0.7 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AGGCTATGGCCGTCTGTCCGA
consensus              CG  G CTG  GACG
sequence                          T T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
50347                       278  1.21e-13 CGGGCAAAGT AGGCTATGGCCGTCTGTCCGA ACAGCTGCGC
49358                       278  1.21e-13 CGGGCAAAGT AGGCTATGGCCGTCTGTCCGA ACAGCTGCGC
44101                        47  1.35e-10 ATCCTTACAT AGCCTATGCCGGTTTTTCCGA AAAGGGGGAT
45518                        44  2.20e-10 GTTGAGCTAC AGGGTAGGGTCGTGACGCCGA TTCGTATGGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50347                             1.2e-13  277_[+1]_202
49358                             1.2e-13  277_[+1]_202
44101                             1.3e-10  46_[+1]_433
45518                             2.2e-10  43_[+1]_436
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=4
50347                    (  278) AGGCTATGGCCGTCTGTCCGA  1
49358                    (  278) AGGCTATGGCCGTCTGTCCGA  1
44101                    (   47) AGCCTATGCCGGTTTTTCCGA  1
45518                    (   44) AGGGTAGGGTCGTGACGCCGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 13440 bayes= 11.7138 E= 9.7e-001
   185  -865  -865  -865
  -865  -865   219  -865
  -865     9   177  -865
  -865   167    19  -865
  -865  -865  -865   190
   185  -865  -865  -865
  -865  -865    19   148
  -865  -865   219  -865
  -865     9   177  -865
  -865   167  -865   -10
  -865   167    19  -865
  -865  -865   219  -865
  -865  -865  -865   190
  -865   109    19   -10
   -15  -865  -865   148
  -865     9   119   -10
  -865  -865    19   148
  -865   209  -865  -865
  -865   209  -865  -865
  -865  -865   219  -865
   185  -865  -865  -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 9.7e-001
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.250000  0.250000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.000000  0.250000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
AG[GC][CG]TA[TG]G[GC][CT][CG]GT[CGT][TA][GCT][TG]CCGA
--------------------------------------------------------------------------------




Time  6.35 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =  18  llr = 219  E-value = 1.0e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  151:211::31::62::31::
pos.-specific     C  54:44:282:277::393:27
probability       G  :::13:2:1233234:1:112
matrix            T  4195196276411147:4871

         bits    2.2
                 2.0
                 1.8
                 1.5   *             *
Relative         1.3   *  * *        *
Entropy          1.1   *  * *       ** *
(17.5 bits)      0.9   *  * **  **  ** ***
                 0.7 **** * **  *** ** ***
                 0.4 **** ***** **********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CATTCTTCTTTCCAGTCTTTC
consensus            TC CG CTCAGGGGTC A C
sequence                 A     C   A  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
48435                       104  2.89e-10 CCACTCTCTC CCTCGTTCTTTCCAATCATTC ACGGTCAACT
50347                       218  6.91e-10 TTGATGATGC CATTGTTCTTTCGATTCCTTC GACTAACTGT
49358                       218  6.06e-09 TTGATGATGC CATTGTTCCTTCGATTCCTTC GACTAACTGT
42928                        54  1.99e-08 TTGAATGGAA AATCCTCCTTGCCATCCTTTC GTAGTCGTTA
50254                       115  1.63e-07 GGATCAAACT TTTTGTTCCGTGCAGTCTTTC GCAAGATAGT
49574                       320  3.77e-07 GGGATAAATT TCTCATCCTTCGCATTCTTTT TACATGAGAA
43959                       476  7.38e-07 TGGTTCCAAC CATCCATCCATCCATCCATCC AAGT
44828                       264  1.06e-06 GACAGAAAGG AATCCTGTTTGGCGGTCATCC TGAGTCCTGC
48168                       224  1.49e-06 ATGGCACGCC CCTTATGTTTTCCGATGCTTC GTGATTTGCT
43867                       185  2.24e-06 AATTGCCCTG TAATATTCTAGCGATTCTTTG TAATTTCTTA
33664                       291  2.63e-06 ATCTACGGAA TCTTCTTCTGCGCAGCCCTGG AAAGACCGAA
45761                       411  2.84e-06 CCGTCTGCCG CCTCCTACTAGCCGGTCAACC CCTCTTTGAT
44101                       352  3.06e-06 GATGCATGGA CTTCGTTCTGCCTAACCTTCC CCGCGATCAA
43709                       383  4.44e-06 GCTTTGTGGA CCTGCTTCTTCCCTACGCTTC GACGGCTGTA
38294                       266  5.12e-06 TATAAAAGTT TATTTATCTATGCGGTCATTG CCGGGAACAC
47038                       466  5.89e-06 TTTCCGAGCT TATCGTCTGAACCGTTCTTTC CGCCTTGTTT
43441                       343  8.79e-06 GTTCTTTGTT CATTCTCTCTGTGAGTCAGTC AATACCCGTC
45518                        16  1.45e-05 GCGCCCGTGC TCTTATGCTTGCTTGTCTGTT GAGCTACAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48435                             2.9e-10  103_[+2]_376
50347                             6.9e-10  217_[+2]_262
49358                             6.1e-09  217_[+2]_262
42928                               2e-08  53_[+2]_426
50254                             1.6e-07  114_[+2]_365
49574                             3.8e-07  319_[+2]_160
43959                             7.4e-07  475_[+2]_4
44828                             1.1e-06  263_[+2]_216
48168                             1.5e-06  223_[+2]_256
43867                             2.2e-06  184_[+2]_295
33664                             2.6e-06  290_[+2]_189
45761                             2.8e-06  410_[+2]_69
44101                             3.1e-06  351_[+2]_128
43709                             4.4e-06  382_[+2]_97
38294                             5.1e-06  265_[+2]_214
47038                             5.9e-06  465_[+2]_14
43441                             8.8e-06  342_[+2]_137
45518                             1.4e-05  15_[+2]_464
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=18
48435                    (  104) CCTCGTTCTTTCCAATCATTC  1
50347                    (  218) CATTGTTCTTTCGATTCCTTC  1
49358                    (  218) CATTGTTCCTTCGATTCCTTC  1
42928                    (   54) AATCCTCCTTGCCATCCTTTC  1
50254                    (  115) TTTTGTTCCGTGCAGTCTTTC  1
49574                    (  320) TCTCATCCTTCGCATTCTTTT  1
43959                    (  476) CATCCATCCATCCATCCATCC  1
44828                    (  264) AATCCTGTTTGGCGGTCATCC  1
48168                    (  224) CCTTATGTTTTCCGATGCTTC  1
43867                    (  185) TAATATTCTAGCGATTCTTTG  1
33664                    (  291) TCTTCTTCTGCGCAGCCCTGG  1
45761                    (  411) CCTCCTACTAGCCGGTCAACC  1
44101                    (  352) CTTCGTTCTGCCTAACCTTCC  1
43709                    (  383) CCTGCTTCTTCCCTACGCTTC  1
38294                    (  266) TATTTATCTATGCGGTCATTG  1
47038                    (  466) TATCGTCTGAACCGTTCTTTC  1
43441                    (  343) CATTCTCTCTGTGAGTCAGTC  1
45518                    (   16) TCTTATGCTTGCTTGTCTGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 13440 bayes= 10.3912 E= 1.0e+000
  -132   109 -1081    53
    85    73 -1081  -127
  -232 -1081 -1081   181
 -1081    92  -198    90
   -32    73    60  -227
  -132 -1081 -1081   173
  -232    -8   -40   105
 -1081   173 -1081   -27
 -1081    -8  -198   143
     0 -1081   -40   105
  -232    -8    60    53
 -1081   151    34  -227
 -1081   151     2  -127
   114 -1081    34  -127
   -32 -1081    83    53
 -1081    24 -1081   143
 -1081   192   -98 -1081
    26    24 -1081    53
  -232 -1081   -98   163
 -1081    -8  -198   143
 -1081   162   -40  -127
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 18 E= 1.0e+000
 0.111111  0.500000  0.000000  0.388889
 0.500000  0.388889  0.000000  0.111111
 0.055556  0.000000  0.000000  0.944444
 0.000000  0.444444  0.055556  0.500000
 0.222222  0.388889  0.333333  0.055556
 0.111111  0.000000  0.000000  0.888889
 0.055556  0.222222  0.166667  0.555556
 0.000000  0.777778  0.000000  0.222222
 0.000000  0.222222  0.055556  0.722222
 0.277778  0.000000  0.166667  0.555556
 0.055556  0.222222  0.333333  0.388889
 0.000000  0.666667  0.277778  0.055556
 0.000000  0.666667  0.222222  0.111111
 0.611111  0.000000  0.277778  0.111111
 0.222222  0.000000  0.388889  0.388889
 0.000000  0.277778  0.000000  0.722222
 0.000000  0.888889  0.111111  0.000000
 0.333333  0.277778  0.000000  0.388889
 0.055556  0.000000  0.111111  0.833333
 0.000000  0.222222  0.055556  0.722222
 0.000000  0.722222  0.166667  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[CT][AC]T[TC][CGA]T[TC][CT][TC][TA][TGC][CG][CG][AG][GTA][TC]C[TAC]T[TC]C
--------------------------------------------------------------------------------




Time 12.48 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  15  sites =   7  llr = 107  E-value = 1.7e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  6969a1:::::::1:
pos.-specific     C  ::1::4::a:::31a
probability       G  4:31:3:a:::::::
matrix            T  :1:::1a::aaa77:

         bits    2.2        **     *
                 2.0       ******  *
                 1.8     * ******  *
                 1.5     * ******  *
Relative         1.3  * ** ******  *
Entropy          1.1 ** ** ******* *
(22.1 bits)      0.9 ** ** *********
                 0.7 ***** *********
                 0.4 ***** *********
                 0.2 ***************
                 0.0 ---------------

Multilevel           AAAAACTGCTTTTTC
consensus            G G  G      C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
49001                       200  6.04e-09 AGAAATTCAA GAAAAGTGCTTTTTC AAACAGGCAT
50347                       325  1.12e-08 AGGATACACC AAAAACTGCTTTCTC GATCTCGATG
49358                       325  1.12e-08 AGGATACACC AAAAACTGCTTTCTC GATCTCGATG
45926                       358  1.35e-08 ACGGCAGCAG GAGAAGTGCTTTTTC CTTTCATCAC
50254                        41  1.31e-07 ATGATAATGA AAAAAATGCTTTTCC GATTTATTTA
42600                       111  2.18e-07 AAGTTCCAAT AAGAATTGCTTTTAC TACAGCACTC
49570                       420  3.85e-07 GTTGGTTGCC GTCGACTGCTTTTTC TGTAAATCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49001                               6e-09  199_[+3]_286
50347                             1.1e-08  324_[+3]_161
49358                             1.1e-08  324_[+3]_161
45926                             1.4e-08  357_[+3]_128
50254                             1.3e-07  40_[+3]_445
42600                             2.2e-07  110_[+3]_375
49570                             3.9e-07  419_[+3]_66
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=7
49001                    (  200) GAAAAGTGCTTTTTC  1
50347                    (  325) AAAAACTGCTTTCTC  1
49358                    (  325) AAAAACTGCTTTCTC  1
45926                    (  358) GAGAAGTGCTTTTTC  1
50254                    (   41) AAAAAATGCTTTTCC  1
42600                    (  111) AAGAATTGCTTTTAC  1
49570                    (  420) GTCGACTGCTTTTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 13608 bayes= 9.90284 E= 1.7e+000
   104  -945    97  -945
   162  -945  -945   -91
   104   -71    38  -945
   162  -945   -62  -945
   185  -945  -945  -945
   -96    87    38   -91
  -945  -945  -945   190
  -945  -945   219  -945
  -945   209  -945  -945
  -945  -945  -945   190
  -945  -945  -945   190
  -945  -945  -945   190
  -945    28  -945   141
   -96   -71  -945   141
  -945   209  -945  -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 1.7e+000
 0.571429  0.000000  0.428571  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.571429  0.142857  0.285714  0.000000
 0.857143  0.000000  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.428571  0.285714  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.285714  0.000000  0.714286
 0.142857  0.142857  0.000000  0.714286
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[AG]A[AG]AA[CG]TGCTTT[TC]TC
--------------------------------------------------------------------------------




Time 18.63 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42928                            3.28e-04  53_[+2(1.99e-08)]_426
54602                            5.73e-01  500
20899                            3.52e-01  500
38294                            2.97e-02  265_[+2(5.12e-06)]_214
48435                            1.33e-06  103_[+2(2.89e-10)]_376
48572                            7.17e-02  83_[+3(7.11e-05)]_402
49358                            1.13e-18  217_[+2(6.06e-09)]_39_\
    [+1(1.21e-13)]_26_[+3(1.12e-08)]_161
49570                            2.41e-03  419_[+3(3.85e-07)]_66
50347                            1.40e-19  217_[+2(6.91e-10)]_39_\
    [+1(1.21e-13)]_26_[+3(1.12e-08)]_161
43867                            2.90e-02  184_[+2(2.24e-06)]_295
43959                            1.97e-03  475_[+2(7.38e-07)]_4
44828                            2.27e-03  263_[+2(1.06e-06)]_216
45518                            1.77e-07  15_[+2(1.45e-05)]_7_[+1(2.20e-10)]_\
    436
45926                            6.72e-05  298_[+3(1.52e-05)]_44_\
    [+3(1.35e-08)]_111_[+3(4.00e-05)]_2
49574                            2.07e-03  319_[+2(3.77e-07)]_160
42600                            3.64e-04  110_[+3(2.18e-07)]_375
47038                            4.91e-03  465_[+2(5.89e-06)]_14
33664                            2.25e-02  290_[+2(2.63e-06)]_189
43441                            3.76e-02  342_[+2(8.79e-06)]_137
45761                            6.12e-03  410_[+2(2.84e-06)]_69
48069                            4.37e-01  500
48168                            3.94e-03  223_[+2(1.49e-06)]_256
45025                            5.62e-01  500
44101                            1.22e-08  46_[+1(1.35e-10)]_284_\
    [+2(3.06e-06)]_128
50254                            7.26e-07  40_[+3(1.31e-07)]_59_[+2(1.63e-07)]_\
    365
49201                            9.63e-01  500
43709                            8.41e-03  382_[+2(4.44e-06)]_97
49001                            1.02e-04  199_[+3(6.04e-09)]_95_\
    [+3(7.82e-05)]_176
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************