********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/276/276.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1827                     1.0000    500  23216                    1.0000    500
25511                    1.0000    500  263499                   1.0000    500
3081                     1.0000    500  3212                     1.0000    500
38030                    1.0000    500  6523                     1.0000    500
6789                     1.0000    500  7708                     1.0000    500
8945                     1.0000    500  8953                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/276/276.seqs.fa -oc motifs/276 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.267 C 0.231 G 0.238 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.231 G 0.238 T 0.264
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =  12  llr = 155  E-value = 6.3e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  6:23:861:::34:1::6:5:
pos.-specific     C  :57:9:364a82356282629
probability       G  1:24:::2::1315:23:211
matrix            T  35:312126:132:37:333:

         bits    2.1          *
                 1.9          *
                 1.7     *    *          *
                 1.5     *    *          *
Relative         1.3     **   **     *   *
Entropy          1.1  *  **  ***  *  *   *
(18.7 bits)      0.8  ** **  ***  ** *   *
                 0.6 *** *** ***  ****** *
                 0.4 ***********  ****** *
                 0.2 *********** *********
                 0.0 ---------------------

Multilevel           ACCGCAACTCCGACCTCACAC
consensus            TT T  C C  ACGT GTTT
sequence                A       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
3212                        303  2.47e-09 CATACTTTGA TCCTCAACTCCGCCTTGACAC AACCATCAAC
8953                        456  9.79e-09 GAGCGGGATC ATCACAACTCCGAGCTGTCTC TGCGGCTTGC
1827                        467  4.62e-08 CTCACATTGC ATCGCAAACCCAACCTCACGC ATCCTTCAGC
8945                        350  7.33e-08 TCATCGATTC ACAACAACTCCGAGCTGTCTC TGAGCTCGCT
3081                        450  1.55e-07 CTTAGTGGAC ACCGCTCTCCCCACCGCACAC TGACCAGACA
23216                       447  3.37e-07 CAGCTGGAAC GTCTCTTCTCCTAGCTCACAC AACAACCGAC
263499                      410  5.26e-07 TTTCCTTCCT ACCACACCTCCACGTCCCTCC TCTCATCTCG
7708                        300  1.10e-06 CTCCTCGTTG TCGGCACCTCGTCGTTCTCTC GTCTATGGCT
6523                        238  1.38e-06 AAGCACTAAC ATGGTAACCCCCCCCCCAGAC CAACAAATCA
25511                       353  1.38e-06 TGCATAACTC ACATCAATTCCTTGATCATAC TCGAGTCTCG
38030                       137  2.42e-06 TACACAGTAT TTCTCAAGCCCGTCTTCATCG TCGTCGACGA
6789                         59  1.13e-05 TCGTAGTATC TTCGCACGCCTAGCCGCCGAC TTGACTCTAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3212                              2.5e-09  302_[+1]_177
8953                              9.8e-09  455_[+1]_24
1827                              4.6e-08  466_[+1]_13
8945                              7.3e-08  349_[+1]_130
3081                              1.6e-07  449_[+1]_30
23216                             3.4e-07  446_[+1]_33
263499                            5.3e-07  409_[+1]_70
7708                              1.1e-06  299_[+1]_180
6523                              1.4e-06  237_[+1]_242
25511                             1.4e-06  352_[+1]_127
38030                             2.4e-06  136_[+1]_343
6789                              1.1e-05  58_[+1]_421
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=12
3212                     (  303) TCCTCAACTCCGCCTTGACAC  1
8953                     (  456) ATCACAACTCCGAGCTGTCTC  1
1827                     (  467) ATCGCAAACCCAACCTCACGC  1
8945                     (  350) ACAACAACTCCGAGCTGTCTC  1
3081                     (  450) ACCGCTCTCCCCACCGCACAC  1
23216                    (  447) GTCTCTTCTCCTAGCTCACAC  1
263499                   (  410) ACCACACCTCCACGTCCCTCC  1
7708                     (  300) TCGGCACCTCGTCGTTCTCTC  1
6523                     (  238) ATGGTAACCCCCCCCCCAGAC  1
25511                    (  353) ACATCAATTCCTTGATCATAC  1
38030                    (  137) TTCTCAAGCCCGTCTTCATCG  1
6789                     (   59) TTCGCACGCCTAGCCGCCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 8.90388 E= 6.3e-002
   113  -1023   -151     34
 -1023    111  -1023     92
   -68    153    -51  -1023
    -9  -1023     81     34
 -1023    199  -1023   -166
   164  -1023  -1023    -66
   113     53  -1023   -166
  -168    133    -51    -66
 -1023     85  -1023    114
 -1023    211  -1023  -1023
 -1023    185   -151   -166
    -9    -47     48     -8
    64     53   -151    -66
 -1023    111    107  -1023
  -168    133  -1023     34
 -1023    -47    -51    134
 -1023    170      7  -1023
   113    -47  -1023     -8
 -1023    133    -51     -8
    91    -47   -151     -8
 -1023    199   -151  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 12 E= 6.3e-002
 0.583333  0.000000  0.083333  0.333333
 0.000000  0.500000  0.000000  0.500000
 0.166667  0.666667  0.166667  0.000000
 0.250000  0.000000  0.416667  0.333333
 0.000000  0.916667  0.000000  0.083333
 0.833333  0.000000  0.000000  0.166667
 0.583333  0.333333  0.000000  0.083333
 0.083333  0.583333  0.166667  0.166667
 0.000000  0.416667  0.000000  0.583333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.833333  0.083333  0.083333
 0.250000  0.166667  0.333333  0.250000
 0.416667  0.333333  0.083333  0.166667
 0.000000  0.500000  0.500000  0.000000
 0.083333  0.583333  0.000000  0.333333
 0.000000  0.166667  0.166667  0.666667
 0.000000  0.750000  0.250000  0.000000
 0.583333  0.166667  0.000000  0.250000
 0.000000  0.583333  0.166667  0.250000
 0.500000  0.166667  0.083333  0.250000
 0.000000  0.916667  0.083333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[AT][CT]C[GTA]CA[AC]C[TC]CC[GAT][AC][CG][CT]T[CG][AT][CT][AT]C
--------------------------------------------------------------------------------

Time  1.24 secs.
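
The nonzero entries of the Motif 1 log-odds matrix look consistent with 100 * log2(p / background), rounded to the nearest integer, where p comes from the letter-probability matrix and the background frequencies come from the COMMAND LINE SUMMARY. The sketch below is only a reader-side check of that relation; the helper name `log_odds_row` is hypothetical. How MEME arrives at the large negative score for zero probabilities (for example -1023) is not shown in this file, so such entries are returned as None. Because the background is printed to three decimals, a recomputed score can differ from the file by one unit.

```python
import math

# Background frequencies copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.267, "C": 0.231, "G": 0.238, "T": 0.264}


def log_odds_row(probs, background=BACKGROUND):
    """Return round(100 * log2(p / bg)) for A, C, G, T; None where p is 0."""
    row = []
    for letter, p in zip("ACGT", probs):
        if p == 0.0:
            # The file shows a large negative floor here; its derivation
            # is not given in this output, so it is left unmodelled.
            row.append(None)
        else:
            row.append(round(100 * math.log2(p / background[letter])))
    return row


# First row of the Motif 1 letter-probability matrix.
first_row = log_odds_row([0.583333, 0.000000, 0.083333, 0.333333])
print(first_row)
```

The first row of the log-odds matrix in this file reads 113, -1023, -151, 34, which can be compared with the printed result.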
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  10  llr = 105  E-value = 3.9e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1:1:1:8::7:a
pos.-specific     C  288:4a:1a:7:
probability       G  5:::1::7:13:
matrix            T  221a4:22:2::

         bits    2.1      *  *
                 1.9    * *  *  *
                 1.7    * *  *  *
                 1.5    * *  *  *
Relative         1.3  * * ** * **
Entropy          1.1  *** ** * **
(15.2 bits)      0.8  *** *******
                 0.6  *** *******
                 0.4  *** *******
                 0.2 ************
                 0.0 ------------

Multilevel           GCCTCCAGCACA
consensus            CT  T TT TG
sequence             T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
3081                         39  9.27e-08 CGATCCAATG GCCTTCAGCACA TTCAACAGCT
1827                        489  3.81e-07 ACCTCACGCA TCCTTCAGCACA
8953                        166  6.08e-07 TGTCATCGTT GCCTGCAGCACA GGCATTGAGG
25511                        55  2.26e-06 TAGGATGAGA GCCTTCAGCTGA GTGAGAGTGT
6789                        309  1.20e-05 TGAAATGATG GTCTCCTGCAGA TTTGTTTGAG
6523                        307  1.20e-05 CCTCATCGAT ACATCCAGCACA AACCATGATA
263499                      316  1.20e-05 GGTCACATTT TTCTCCATCACA ATGAATTTCT
8945                        219  1.67e-05 CAAAGAGGAG GCCTACAGCGGA CGAGGCATTA
23216                       481  1.77e-05 AACCGACGCA CCTTCCATCACA GCGGCAAC
7708                        334  4.20e-05 TATGGCTCTA CCCTTCTCCTCA AACGAGGCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3081                              9.3e-08  38_[+2]_450
1827                              3.8e-07  488_[+2]
8953                              6.1e-07  165_[+2]_323
25511                             2.3e-06  54_[+2]_434
6789                              1.2e-05  308_[+2]_180
6523                              1.2e-05  306_[+2]_182
263499                            1.2e-05  315_[+2]_173
8945                              1.7e-05  218_[+2]_270
23216                             1.8e-05  480_[+2]_8
7708                              4.2e-05  333_[+2]_155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=10
3081                     (   39) GCCTTCAGCACA  1
1827                     (  489) TCCTTCAGCACA  1
8953                     (  166) GCCTGCAGCACA  1
25511                    (   55) GCCTTCAGCTGA  1
6789                     (  309) GTCTCCTGCAGA  1
6523                     (  307) ACATCCAGCACA  1
263499                   (  316) TTCTCCATCACA  1
8945                     (  219) GCCTACAGCGGA  1
23216                    (  481) CCTTCCATCACA  1
7708                     (  334) CCCTTCTCCTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 10.1389 E= 3.9e+001
  -141    -21    107    -40
  -997    179   -997    -40
  -141    179   -997   -140
  -997   -997   -997    192
  -141     79   -125     60
  -997    211   -997   -997
   158   -997   -997    -40
  -997   -121    155    -40
  -997    211   -997   -997
   139   -997   -125    -40
  -997    160     33   -997
   190   -997   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 3.9e+001
 0.100000  0.200000  0.500000  0.200000
 0.000000  0.800000  0.000000  0.200000
 0.100000  0.800000  0.000000  0.100000
 0.000000  0.000000  0.000000  1.000000
 0.100000  0.400000  0.100000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.100000  0.700000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.700000  0.000000  0.100000  0.200000
 0.000000  0.700000  0.300000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[GCT][CT]CT[CT]C[AT][GT]C[AT][CG]A
--------------------------------------------------------------------------------

Time  2.68 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   6  llr = 103  E-value = 5.3e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  2::52:8523:::3::2::::
pos.-specific     C  :a2:7a22332a22:788:88
probability       G  :::2::::3:2::32::::2:
matrix            T  8:832::3237:8283:2a:2

         bits    2.1  *   *     *
                 1.9  *   *     *      *
                 1.7  *   *     *      *
                 1.5  *   *     *    *****
Relative         1.3 ***  **    ** * *****
Entropy          1.1 ***  **    ** *******
(24.7 bits)      0.8 *** ***    ** *******
                 0.6 *** ***   *** *******
                 0.4 ******** **** *******
                 0.2 ************* *******
                 0.0 ---------------------

Multilevel           TCTACCAACATCTATCCCTCC
consensus               T   TGC   G T
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
1827                        404  3.03e-10 TTGGACACGA TCTGCCAACCTCTAGCCCTCC TGAAACAACC
263499                      388  2.49e-09 ACAGACACGA TCTAACAACCTCTTTCCTTCC TACCACACCT
7708                        439  5.67e-09 TTGGTACGAG TCTTCCCTATTCCGTCCCTCC GTCGTCCCCT
25511                       464  1.36e-08 ACATCATCTC TCTATCACTTCCTATTCCTCC CTGAACGCAT
3081                        171  1.47e-08 CAATTCACTG TCTACCAAGAGCTGTTACTGC TCGTGGTGCG
8945                         37  2.90e-08 TCACTAGCTT ACCTCCATGATCTCTCCCTCT ACCACTAATA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1827                                3e-10  403_[+3]_76
263499                            2.5e-09  387_[+3]_92
7708                              5.7e-09  438_[+3]_41
25511                             1.4e-08  463_[+3]_16
3081                              1.5e-08  170_[+3]_309
8945                              2.9e-08  36_[+3]_443
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
1827                     (  404) TCTGCCAACCTCTAGCCCTCC  1
263499                   (  388) TCTAACAACCTCTTTCCTTCC  1
7708                     (  439) TCTTCCCTATTCCGTCCCTCC  1
25511                    (  464) TCTATCACTTCCTATTCCTCC  1
3081                     (  171) TCTACCAAGAGCTGTTACTGC  1
8945                     (   37) ACCTCCATGATCTCTCCCTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 10.3532 E= 5.3e+001
   -68   -923   -923    166
  -923    211   -923   -923
  -923    -47   -923    166
    90   -923    -51     34
   -68    153   -923    -66
  -923    211   -923   -923
   164    -47   -923   -923
    90    -47   -923     34
   -68     53     48    -66
    32     53   -923     34
  -923    -47    -51    134
  -923    211   -923   -923
  -923    -47   -923    166
    32    -47     48    -66
  -923   -923    -51    166
  -923    153   -923     34
   -68    185   -923   -923
  -923    185   -923    -66
  -923   -923   -923    192
  -923    185    -51   -923
  -923    185   -923    -66
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 5.3e+001
 0.166667  0.000000  0.000000  0.833333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.500000  0.000000  0.166667  0.333333
 0.166667  0.666667  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.500000  0.166667  0.000000  0.333333
 0.166667  0.333333  0.333333  0.166667
 0.333333  0.333333  0.000000  0.333333
 0.000000  0.166667  0.166667  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.333333  0.166667  0.333333  0.166667
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.666667  0.000000  0.333333
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.833333  0.000000  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
TCT[AT]CCA[AT][CG][ACT]TCT[AG]T[CT]CCTCC
--------------------------------------------------------------------------------

Time  3.99 secs.
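
The motif regular expression is a per-column summary, not a filter that every reported site passes. In this file, letters with probability below 0.2 in a column do not appear in the expression, so a site carrying even one rare letter fails to match. The sketch below checks this for Motif 3 with Python's standard `re` module; the helper name `count_full_matches` is hypothetical, and the pattern, consensus and sites are copied from the Motif 3 sections.

```python
import re

# Regular expression, multilevel consensus and sites copied from Motif 3.
MOTIF3_REGEX = re.compile(r"TCT[AT]CCA[AT][CG][ACT]TCT[AG]T[CT]CCTCC")
MOTIF3_CONSENSUS = "TCTACCAACATCTATCCCTCC"
MOTIF3_SITES = [
    "TCTGCCAACCTCTAGCCCTCC",  # 1827
    "TCTAACAACCTCTTTCCTTCC",  # 263499
    "TCTTCCCTATTCCGTCCCTCC",  # 7708
    "TCTATCACTTCCTATTCCTCC",  # 25511
    "TCTACCAAGAGCTGTTACTGC",  # 3081
    "ACCTCCATGATCTCTCCCTCT",  # 8945
]


def count_full_matches(pattern, sequences):
    """Count sequences that the compiled pattern matches over their full length."""
    return sum(1 for seq in sequences if pattern.fullmatch(seq))


matching_sites = count_full_matches(MOTIF3_REGEX, MOTIF3_SITES)
consensus_matches = MOTIF3_REGEX.fullmatch(MOTIF3_CONSENSUS) is not None
print(matching_sites, "of", len(MOTIF3_SITES), "sites match")
print("consensus matches:", consensus_matches)
```

For scanning sequences, the position-specific scoring matrix is usually the more faithful representation, since it scores rare letters instead of rejecting them outright.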

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1827                             3.91e-13  403_[+3(3.03e-10)]_42_\
    [+1(4.62e-08)]_1_[+2(3.81e-07)]
23216                            1.04e-04  446_[+1(3.37e-07)]_13_\
    [+2(1.77e-05)]_8
25511                            1.75e-09  54_[+2(2.26e-06)]_286_\
    [+1(1.38e-06)]_90_[+3(1.36e-08)]_16
263499                           6.95e-10  176_[+1(7.07e-05)]_118_\
    [+2(1.20e-05)]_60_[+3(2.49e-09)]_1_[+1(5.26e-07)]_2_[+1(4.14e-05)]_47
3081                             1.25e-11  38_[+2(9.27e-08)]_120_\
    [+3(1.47e-08)]_258_[+1(1.55e-07)]_30
3212                             4.60e-05  302_[+1(2.47e-09)]_177
38030                            1.30e-02  136_[+1(2.42e-06)]_343
6523                             3.68e-05  237_[+1(1.38e-06)]_48_\
    [+2(1.20e-05)]_182
6789                             1.68e-03  58_[+1(1.13e-05)]_229_\
    [+2(1.20e-05)]_180
7708                             9.30e-09  299_[+1(1.10e-06)]_13_\
    [+2(4.20e-05)]_93_[+3(5.67e-09)]_41
8945                             1.48e-09  36_[+3(2.90e-08)]_161_\
    [+2(1.67e-05)]_119_[+1(7.33e-08)]_130
8953                             6.65e-08  165_[+2(6.08e-07)]_278_\
    [+1(9.79e-09)]_24
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
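
In the block diagrams, a bare number is a run of unmatched positions and `[+m(p)]` is an occurrence of motif m on the + strand with position p-value p. Adding spacer lengths and motif widths therefore recovers each site's 1-based start and the sequence length. The sketch below is a minimal reader-side parser under that reading; the names `parse_diagram` and `MOTIF_WIDTHS` are hypothetical, and the widths are taken from the MOTIF header lines of this file.

```python
import re

# Motif widths as reported in the MOTIF header lines of this file.
MOTIF_WIDTHS = {1: 21, 2: 12, 3: 21}

# Either a motif token such as [+3(3.03e-10)] or [+1], or a spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length); each site is (motif, strand, start, p_value)."""
    # Long diagrams are wrapped with a trailing backslash; join them first.
    diagram = re.sub(r"[\\\s]+", "", diagram)
    position = 0
    sites = []
    for match in TOKEN.finditer(diagram):
        strand, motif, p_value, spacer = match.groups()
        if spacer is not None:
            position += int(spacer)
            continue
        motif = int(motif)
        start = position + 1  # MEME reports 1-based start positions
        sites.append((motif, strand, start, float(p_value) if p_value else None))
        position += widths[motif]
    return sites, position


# Combined diagram for sequence 1827, copied from the summary above.
sites_1827, length_1827 = parse_diagram(
    "403_[+3(3.03e-10)]_42_[+1(4.62e-08)]_1_[+2(3.81e-07)]"
)
for motif, strand, start, p_value in sites_1827:
    print(motif, strand, start, p_value)
print("sequence length:", length_1827)
```

The recovered starts can be compared with the Start columns in the per-motif site tables, and the total with the sequence length listed in the TRAINING SET section.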