********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/356/356.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11078                    1.0000    500  11637                    1.0000    500
11926                    1.0000    500  1203                     1.0000    500
2576                     1.0000    500  268619                   1.0000    500
269868                   1.0000    500  3040                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/356/356.seqs.fa -oc motifs/356 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.282 C 0.241 G 0.231 T 0.246
Background letter frequencies (from dataset with add-one prior applied):
A 0.282 C 0.241 G 0.231 T 0.246
********************************************************************************


********************************************************************************
MOTIF 1 MEME    width = 15  sites = 8  llr = 99  E-value = 2.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :9136::598:a3:a
pos.-specific     C  6:94:993:13:6::
probability       G  4::34:13118:19:
matrix            T  :1:1:1:::::::1:

         bits    2.1
                 1.9            *  *
                 1.7            *  *
                 1.5   *  **    * **
Relative         1.3  **  ** * ** **
Entropy          1.1 *** *** * ** **
(17.8 bits)      0.8 *** *** **** **
                 0.6 *** *** *******
                 0.4 *** ***********
                 0.2 *** ***********
                 0.0 ---------------

Multilevel           CACCACCAAAGACGA
consensus            G  AG  C  C A
sequence                G   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start  P-value                  Site
-------------             ----- ---------            ---------------
11078                       231  1.93e-08 TCTACTTGCC CACGACCCAAGACGA ACACCTGCCT
11637                       379  9.74e-08 TCACCTATCG GACCGCCAAACACGA ATCTTTGAAT
269868                      427  3.98e-07 GAATAGGTTT GACCGCCCAAGAGGA GACTCTGACA
3040                        269  1.06e-06 ACATAGACCA CACGGCCAAAGAATA GACTGATTTG
268619                       91  1.14e-06 TTTGGACGAT GACTACCGACGACGA TGAAACAATG
11926                       368  1.84e-06 AGAGATGATA CAAAACCGAACACGA AATCACACAT
2576                        461  2.28e-06 CTTACTTTAA CACCATCAAGGAAGA AACCAAGCAC
1203                        403  6.04e-06 GACCTCACAT CTCAACGAGAGACGA GAAGGGGAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11078                             1.9e-08  230_[+1]_255
11637                             9.7e-08  378_[+1]_107
269868                              4e-07  426_[+1]_59
3040                              1.1e-06  268_[+1]_217
268619                            1.1e-06  90_[+1]_395
11926                             1.8e-06  367_[+1]_118
2576                              2.3e-06  460_[+1]_25
1203                                6e-06  402_[+1]_83
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=8
11078                    (  231) CACGACCCAAGACGA  1
11637                    (  379) GACCGCCAAACACGA  1
269868                   (  427) GACCGCCCAAGAGGA  1
3040                     (  269) CACGGCCAAAGAATA  1
268619                   (   91) GACTACCGACGACGA  1
11926                    (  368) CAAAACCGAACACGA  1
2576                     (  461) CACCATCAAGGAAGA  1
1203                     (  403) CTCAACGAGAGACGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 3888 bayes= 8.92184 E= 2.1e+000
  -965    137     70   -965
   163   -965   -965    -98
  -117    186   -965   -965
   -17     64     11    -98
   115   -965     70   -965
  -965    186   -965    -98
  -965    186    -89   -965
    83      5     11   -965
   163   -965    -89   -965
   141    -95    -89   -965
  -965      5    170   -965
   183   -965   -965   -965
   -17    137    -89   -965
  -965   -965    192    -98
   183   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 8 E= 2.1e+000
 0.000000  0.625000  0.375000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.125000  0.875000  0.000000  0.000000
 0.250000  0.375000  0.250000  0.125000
 0.625000  0.000000  0.375000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.000000  0.875000  0.125000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.750000  0.125000  0.125000  0.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.625000  0.125000  0.000000
 0.000000  0.000000  0.875000  0.125000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG]AC[CAG][AG]CC[ACG]AA[GC]A[CA]GA
--------------------------------------------------------------------------------

Time 0.63 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME    width = 12  sites = 8  llr = 86  E-value = 4.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  996389:a199:
pos.-specific     C  ::3:1:1::::8
probability       G  11:8119:9::3
matrix            T  ::1::::::11:

         bits    2.1
                 1.9        *
                 1.7        *
                 1.5       ***
Relative         1.3 ** * *******
Entropy          1.1 ** * *******
(15.4 bits)      0.8 ** *********
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AAAGAAGAGAAC
consensus              CA       G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start  P-value                Site
-------------             ----- ---------            ------------
11078                       102  1.19e-07 CGTTGCCTTC AAAGAAGAGAAC AATGCGGAGA
11926                       418  1.63e-06 CGCAAACACC AACAAAGAGAAC TCAGATGCCT
11637                       440  3.75e-06 GATGCCTTCA AAAGAAGAGTAG CACTGGACGG
269868                      366  5.15e-06 GTAGTCTTGG AATGCAGAGAAC GACACTCCGT
1203                        428  6.52e-06 GAAGGGGAGA GAAGAGGAGAAC CAGGCAGGCT
2576                         98  8.48e-06 CGAGGGAAGA AACAGAGAGAAC GCTCTTCTAC
268619                      359  1.36e-05 CGCAGAAGGC AAAGAACAAAAC GACAAACGGC
3040                         44  1.83e-05 AAGACTCAAC AGAGAAGAGATG ACCACGTTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11078                             1.2e-07  101_[+2]_387
11926                             1.6e-06  417_[+2]_71
11637                             3.8e-06  439_[+2]_49
269868                            5.1e-06  365_[+2]_123
1203                              6.5e-06  427_[+2]_61
2576                              8.5e-06  97_[+2]_391
268619                            1.4e-05  358_[+2]_130
3040                              1.8e-05  43_[+2]_445
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=8
11078                    (  102) AAAGAAGAGAAC  1
11926                    (  418) AACAAAGAGAAC  1
11637                    (  440) AAAGAAGAGTAG  1
269868                   (  366) AATGCAGAGAAC  1
1203                     (  428) GAAGAGGAGAAC  1
2576                     (   98) AACAGAGAGAAC  1
268619                   (  359) AAAGAACAAAAC  1
3040                     (   44) AGAGAAGAGATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 8.93074 E= 4.1e+001
   163   -965    -89   -965
   163   -965    -89   -965
   115      5   -965    -98
   -17   -965    170   -965
   141    -95    -89   -965
   163   -965    -89   -965
  -965    -95    192   -965
   183   -965   -965   -965
  -117   -965    192   -965
   163   -965   -965    -98
   163   -965   -965    -98
  -965    164     11   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 4.1e+001
 0.875000  0.000000  0.125000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.625000  0.250000  0.000000  0.125000
 0.250000  0.000000  0.750000  0.000000
 0.750000  0.125000  0.125000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.125000  0.875000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.875000  0.000000  0.000000  0.125000
 0.000000  0.750000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
AA[AC][GA]AAGAGAA[CG]
--------------------------------------------------------------------------------

Time 1.20 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME    width = 16  sites = 8  llr = 99  E-value = 7.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::1:::::3::3633:
pos.-specific     C  :415::::1:95:151
probability       G  1:84::351a:143:9
matrix            T  96:1aa855:11:43:

         bits    2.1     **   *
                 1.9     **   *
                 1.7     **   *
                 1.5 *   **   **    *
Relative         1.3 *   ***  **    *
Entropy          1.1 *** **** ** *  *
(17.8 bits)      0.8 *** **** ** *  *
                 0.6 ******** ** *  *
                 0.4 ******** ** * **
                 0.2 ************* **
                 0.0 ----------------

Multilevel           TTGCTTTGTGCCATCG
consensus             C G  GTA  AGAA
sequence                          GT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start  P-value                  Site
-------------             ----- ---------            ----------------
269868                      246  1.78e-09 GTTTCTTGCG TCGCTTTTTGCCATCG GCCGAATGTA
11078                       445  4.21e-07 TCTTGCCTCT TTGCTTTTTGCAAACC ATACATCAGC
2576                        205  1.05e-06 TTCTTGGGAA TTCGTTTTAGCCGTAG ACGTGTTCAT
11637                        84  1.05e-06 GTTCATGATG TCGGTTTGCGCAAGAG AATGTCAGGA
3040                         89  1.45e-06 AGTGCGGCGT TTGGTTTGTGTGGACG AGTCTGCCGG
11926                        72  1.68e-06 TGTGCTACTG TTGTTTGGGGCCATTG CTTGACGACT
1203                         17  1.84e-06 TTGTAGATTG TTGCTTGGAGCTGGTG CCGACTCTGA
268619                      249  3.83e-06 TCTCCACCAC GCACTTTTTGCCACCG CAGAGTGCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
269868                            1.8e-09  245_[+3]_239
11078                             4.2e-07  444_[+3]_40
2576                              1.1e-06  204_[+3]_280
11637                             1.1e-06  83_[+3]_401
3040                              1.5e-06  88_[+3]_396
11926                             1.7e-06  71_[+3]_413
1203                              1.8e-06  16_[+3]_468
268619                            3.8e-06  248_[+3]_236
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
269868                   (  246) TCGCTTTTTGCCATCG  1
11078                    (  445) TTGCTTTTTGCAAACC  1
2576                     (  205) TTCGTTTTAGCCGTAG  1
11637                    (   84) TCGGTTTGCGCAAGAG  1
3040                     (   89) TTGGTTTGTGTGGACG  1
11926                    (   72) TTGTTTGGGGCCATTG  1
1203                     (   17) TTGCTTGGAGCTGGTG  1
268619                   (  249) GCACTTTTTGCCACCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3880 bayes= 8.91886 E= 7.1e+001
  -965   -965    -89    183
  -965     64   -965    134
  -117    -95    170   -965
  -965    105     70    -98
  -965   -965   -965    202
  -965   -965   -965    202
  -965   -965     11    161
  -965   -965    111    102
   -17    -95    -89    102
  -965   -965    211   -965
  -965    186   -965    -98
   -17    105    -89    -98
   115   -965     70   -965
   -17    -95     11     61
   -17    105   -965      2
  -965    -95    192   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 7.1e+001
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.375000  0.000000  0.625000
 0.125000  0.125000  0.750000  0.000000
 0.000000  0.500000  0.375000  0.125000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.500000  0.500000
 0.250000  0.125000  0.125000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.250000  0.500000  0.125000  0.125000
 0.625000  0.000000  0.375000  0.000000
 0.250000  0.125000  0.250000  0.375000
 0.250000  0.500000  0.000000  0.250000
 0.000000  0.125000  0.875000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[TC]G[CG]TT[TG][GT][TA]GC[CA][AG][TAG][CAT]G
--------------------------------------------------------------------------------

Time 1.82 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11078                            5.29e-11  101_[+2(1.19e-07)]_117_\
    [+1(1.93e-08)]_199_[+3(4.21e-07)]_40
11637                            1.37e-08  83_[+3(1.05e-06)]_279_\
    [+1(9.74e-08)]_46_[+2(3.75e-06)]_49
11926                            1.44e-07  71_[+3(1.68e-06)]_280_\
    [+1(1.84e-06)]_35_[+2(1.63e-06)]_71
1203                             1.60e-06  16_[+3(1.84e-06)]_370_\
    [+1(6.04e-06)]_10_[+2(6.52e-06)]_61
2576                             5.13e-07  97_[+2(8.48e-06)]_95_[+3(1.05e-06)]_\
    240_[+1(2.28e-06)]_25
268619                           1.35e-06  90_[+1(1.14e-06)]_143_\
    [+3(3.83e-06)]_94_[+2(1.36e-05)]_130
269868                           1.83e-10  245_[+3(1.78e-09)]_104_\
    [+2(5.15e-06)]_49_[+1(3.98e-07)]_59
3040                             6.84e-07  43_[+2(1.83e-05)]_33_[+3(1.45e-06)]_\
    164_[+1(1.06e-06)]_217
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
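
Note on the two matrix forms: judging only from the numbers in this file, each log-odds entry appears to equal 100 * log2(p / background), rounded to an integer, where p is the matching letter-probability entry and the background is the frequency listed under COMMAND LINE SUMMARY. This is an inference from the data here, not a statement taken from the MEME documentation. The sketch below checks that relation for the first two rows of Motif 1. The name log_odds_row is hypothetical, zero probabilities (printed as -965 in the file) are returned as None rather than reproduced, and because the printed background is rounded to three places an entry can occasionally differ from the file by one unit.

```python
import math

# Background letter frequencies as printed in this file (rounded to 3 places).
BACKGROUND = {"A": 0.282, "C": 0.241, "G": 0.231, "T": 0.246}


def log_odds_row(probs, background=BACKGROUND, alphabet="ACGT"):
    """Convert one letter-probability row to rounded 100*log2(p/bg) scores.

    Assumption: this scaling is inferred from the matrices in this file.
    Zero probabilities yield None; the file's -965 value is not reproduced.
    """
    row = []
    for letter, p in zip(alphabet, probs):
        if p == 0.0:
            row.append(None)
        else:
            row.append(round(100 * math.log2(p / background[letter])))
    return row


if __name__ == "__main__":
    # First two rows of the Motif 1 letter-probability matrix.
    print(log_odds_row([0.0, 0.625, 0.375, 0.0]))
    print(log_odds_row([0.875, 0.0, 0.0, 0.125]))
```

For these two rows the non-zero scores agree with the corresponding log-odds rows of Motif 1 (137 and 70, then 163 and -98).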