********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/221/221.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
13864                    1.0000    500  47879                    1.0000    500
48446                    1.0000    500  52468                    1.0000    500
48958                    1.0000    500  49056                    1.0000    500
15555                    1.0000    500  49558                    1.0000    500
8457                     1.0000    500  7293                     1.0000    500
1241                     1.0000    500  4342                     1.0000    500
23798                    1.0000    500  41071                    1.0000    500
23838                    1.0000    500  7294                     1.0000    500
2171                     1.0000    500  50408                    1.0000    500
43880                    1.0000    500  1875                     1.0000    500
44031                    1.0000    500  10539                    1.0000    500
44134                    1.0000    500  50465                    1.0000    500
3686                     1.0000    500  47484                    1.0000    500
40078                    1.0000    500  49072                    1.0000    500
49808                    1.0000    500  50217                    1.0000    500
48021                    1.0000    500  38867                    1.0000    500
49735                    1.0000    500  49762                    1.0000    500
47598                    1.0000    500  48038                    1.0000    500
40902                    1.0000    500  48817                    1.0000    500
35176                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/221/221.seqs.fa -oc motifs/221 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       39    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           19500    N=              39
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.267 C 0.252 G 0.225 T 0.255
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.252 G 0.225 T 0.255
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   9  llr = 153  E-value = 2.8e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::1:::861::3:2298:::
pos.-specific     C  8181111::9:2213::1::9
probability       G  2921:9:2::a1:238:176:
matrix            T  :::79:9:4::7471:1:341

         bits    2.2           *
                 1.9           *
                 1.7  *   *    *
                 1.5  *  ***  **     *   *
Relative         1.3 *** **** **    **   *
Entropy          1.1 *** **** **    ** ***
(24.5 bits)      0.9 *** ******** * ******
                 0.6 ************ * ******
                 0.4 ************** ******
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CGCTTGTAACGTTTCGAAGGC
consensus            G G    GT  CAGGA  TT
sequence                         C A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
7294                        327  6.07e-12 TCCCTAGGAT CGCTTGTAACGCTTGGAAGGC CACTCACTTG
7293                        327  6.07e-12 TCCCTAGGAT CGCTTGTAACGCTTGGAAGGC CACTCACTTG
35176                       405  4.03e-11 ACGGTATTCC CGCTTGTATCGTAGCGAAGTC GGTATCGGCA
50217                       405  4.03e-11 ACGGTATTCC CGCTTGTATCGTAGCGAAGTC GGTATCGGCA
48958                       364  7.14e-08 CGAAACGAAC CGGCTGTATCGTCTAGTAGGT GCCTGCAATT
43880                        80  7.59e-08 AAAAGGAGCT GGCTCGTAACGTCTCAACTTC TATTGAAGAG
49072                        65  9.59e-08 GTCGATTCGT GCCATGTATCGTTTTAAATGC TATTTTCTCA
15555                        33  9.59e-08 TCAGAAAGAG CGGGTGTGAAGGATAGAAGTC GTCATTCGAT
47484                       234  1.74e-07 GTCGATGGCT CGCTTCCGACGTTCGGAGTGC ATCGTCGGGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7294                              6.1e-12  326_[+1]_153
7293                              6.1e-12  326_[+1]_153
35176                               4e-11  404_[+1]_75
50217                               4e-11  404_[+1]_75
48958                             7.1e-08  363_[+1]_116
43880                             7.6e-08  79_[+1]_400
49072                             9.6e-08  64_[+1]_415
15555                             9.6e-08  32_[+1]_447
47484                             1.7e-07  233_[+1]_246
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
7294                     (  327) CGCTTGTAACGCTTGGAAGGC  1
7293                     (  327) CGCTTGTAACGCTTGGAAGGC  1
35176                    (  405) CGCTTGTATCGTAGCGAAGTC  1
50217                    (  405) CGCTTGTATCGTAGCGAAGTC  1
48958                    (  364) CGGCTGTATCGTCTAGTAGGT  1
43880                    (   80) GGCTCGTAACGTCTCAACTTC  1
49072                    (   65) GCCATGTATCGTTTTAAATGC  1
15555                    (   33) CGGGTGTGAAGGATAGAAGTC  1
47484                    (  234) CGCTTCCGACGTTCGGAGTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 18720 bayes= 11.87 E= 2.8e-002
  -982    162     -2   -982
  -982   -118    198   -982
  -982    162     -2   -982
  -126   -118   -102    139
  -982   -118   -982    180
  -982   -118    198   -982
  -982   -118   -982    180
   154   -982     -2   -982
   105   -982   -982     80
  -126    182   -982   -982
  -982   -982    215   -982
  -982    -18   -102    139
    32    -18   -982     80
  -982   -118     -2    139
   -27     40     56   -120
   -27   -982    179   -982
   173   -982   -982   -120
   154   -118   -102   -982
  -982   -982    156     39
  -982   -982    130     80
  -982    182   -982   -120
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 2.8e-002
 0.000000  0.777778  0.222222  0.000000
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.777778  0.222222  0.000000
 0.111111  0.111111  0.111111  0.666667
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.777778  0.000000  0.222222  0.000000
 0.555556  0.000000  0.000000  0.444444
 0.111111  0.888889  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.222222  0.111111  0.666667
 0.333333  0.222222  0.000000  0.444444
 0.000000  0.111111  0.222222  0.666667
 0.222222  0.333333  0.333333  0.111111
 0.222222  0.000000  0.777778  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.777778  0.111111  0.111111  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.555556  0.444444
 0.000000  0.888889  0.000000  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG]G[CG]TTGT[AG][AT]CG[TC][TAC][TG][CGA][GA]AA[GT][GT]C
--------------------------------------------------------------------------------




Time 13.02 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =  10  llr = 146  E-value = 4.6e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :2147aa31::1aa9:
pos.-specific     C  :1:53:::::4:::::
probability       G  a:3::::39764:::9
matrix            T  :761:::4:3:5::11

         bits    2.2 *
                 1.9 *    **     **
                 1.7 *    ** *   ** *
                 1.5 *    ** *   ****
Relative         1.3 *    ** **  ****
Entropy          1.1 *   *** *** ****
(21.1 bits)      0.9 **  *** *** ****
                 0.6 ******* ********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTTCAAATGGGTAAAG
consensus             AGAC  A TCG
sequence                    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ----------------
35176                       316  2.51e-09 ACTCAAAAAT GTTCAAATGGCTAAAG ACCGACCACA
50217                       316  2.51e-09 ACTCAAAAAT GTTCAAATGGCTAAAG ACCGACCACA
48038                       367  4.36e-08 AGCAGCAACC GTGACAATGGCGAAAG TCCCAACTCC
49808                       367  4.36e-08 AGCAGCAACA GTGACAATGGCGAAAG TCCCAACTCC
7294                        203  5.91e-08 CTTTGGTAGT GATCAAAGGTGTAAAG CATAGAATTA
7293                        203  5.91e-08 CTTTGGTAGT GATCAAAGGTGTAAAG CATAGAATTA
40902                         1  2.91e-07          . GTGAAAAGGTGTAATG ACATAAACAA
3686                        165  3.31e-07 ACTGGCCGAG GTTCAAAAGGGAAAAT CATCCTCGAA
13864                       119  3.52e-07 CAACAAGCAA GCAAAAAAGGGGAAAG CAGGCGTGCT
2171                        401  6.16e-07 CTTGTTCATC GTTTCAAAAGGGAAAG CAGCCCGACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35176                             2.5e-09  315_[+2]_169
50217                             2.5e-09  315_[+2]_169
48038                             4.4e-08  366_[+2]_118
49808                             4.4e-08  366_[+2]_118
7294                              5.9e-08  202_[+2]_282
7293                              5.9e-08  202_[+2]_282
40902                             2.9e-07  [+2]_484
3686                              3.3e-07  164_[+2]_320
13864                             3.5e-07  118_[+2]_366
2171                              6.2e-07  400_[+2]_84
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=10
35176                    (  316) GTTCAAATGGCTAAAG  1
50217                    (  316) GTTCAAATGGCTAAAG  1
48038                    (  367) GTGACAATGGCGAAAG  1
49808                    (  367) GTGACAATGGCGAAAG  1
7294                     (  203) GATCAAAGGTGTAAAG  1
7293                     (  203) GATCAAAGGTGTAAAG  1
40902                    (    1) GTGAAAAGGTGTAATG  1
3686                     (  165) GTTCAAAAGGGAAAAT  1
13864                    (  119) GCAAAAAAGGGGAAAG  1
2171                     (  401) GTTTCAAAAGGGAAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 18915 bayes= 11.1362 E= 4.6e-002
  -997   -997    215   -997
   -42   -133   -997    146
  -142   -997     41    123
    58     99   -997   -135
   139     25   -997   -997
   190   -997   -997   -997
   190   -997   -997   -997
    17   -997     41     65
  -142   -997    200   -997
  -997   -997    163     23
  -997     66    141   -997
  -142   -997     83     97
   190   -997   -997   -997
   190   -997   -997   -997
   175   -997   -997   -135
  -997   -997    200   -135
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 4.6e-002
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.100000  0.000000  0.700000
 0.100000  0.000000  0.300000  0.600000
 0.400000  0.500000  0.000000  0.100000
 0.700000  0.300000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.300000  0.000000  0.300000  0.400000
 0.100000  0.000000  0.900000  0.000000
 0.000000  0.000000  0.700000  0.300000
 0.000000  0.400000  0.600000  0.000000
 0.100000  0.000000  0.400000  0.500000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.000000  0.000000  0.900000  0.100000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[TA][TG][CA][AC]AA[TAG]G[GT][GC][TG]AAAG
--------------------------------------------------------------------------------




Time 25.26 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  18  sites =   8  llr = 136  E-value = 3.6e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :5a13a5:4a:4:15::a
pos.-specific     C  95:98::91:a6391:a:
probability       G  1:::::31::::8::6::
matrix            T  ::::::3:5:::::44::

         bits    2.2
                 1.9   *  *   **     **
                 1.7   *  *   **     **
                 1.5 * ** * * **  *  **
Relative         1.3 * ** * * ** **  **
Entropy          1.1 * **** * ***** ***
(24.5 bits)      0.9 ****** * ***** ***
                 0.6 ****** ******* ***
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           CAACCAACTACCGCAGCA
consensus             C  A G A  AC TT
sequence                   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ------------------
7294                        447  1.47e-09 GTACCTCTCC CCACCAACTACCCCTGCA TCCGCCATGT
7293                        447  1.47e-09 GTACCTCTCC CCACCAACTACCCCTGCA TCCGCCATGT
15555                        66  1.47e-09 CATTCGATAA CAACAAACAACCGCAGCA GACACCACGA
35176                       478  2.60e-09 GTCCACAGTT CAACCATCTACAGCATCA AGACG
50217                       478  2.60e-09 GTCCACAGTT CAACCATCTACAGCATCA AGACG
48958                        79  4.49e-09 AAGCAGCCTG CAACAAGCAACCGCTGCA ATTTCTCACT
50408                       469  6.28e-08 GGGTAGTACA GCAACAGCAACAGCAGCA ATACTAGCGG
48021                       170  1.49e-07 AATAGACAGT CCACCAAGCACCGACTCA CTCCATTCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7294                              1.5e-09  446_[+3]_36
7293                              1.5e-09  446_[+3]_36
15555                             1.5e-09  65_[+3]_417
35176                             2.6e-09  477_[+3]_5
50217                             2.6e-09  477_[+3]_5
48958                             4.5e-09  78_[+3]_404
50408                             6.3e-08  468_[+3]_14
48021                             1.5e-07  169_[+3]_313
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=18 seqs=8
7294                     (  447) CCACCAACTACCCCTGCA  1
7293                     (  447) CCACCAACTACCCCTGCA  1
15555                    (   66) CAACAAACAACCGCAGCA  1
35176                    (  478) CAACCATCTACAGCATCA  1
50217                    (  478) CAACCATCTACAGCATCA  1
48958                    (   79) CAACAAGCAACCGCTGCA  1
50408                    (  469) GCAACAGCAACAGCAGCA  1
48021                    (  170) CCACCAAGCACCGACTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 18837 bayes= 11.2007 E= 3.6e-002
  -965    179    -85   -965
    90     99   -965   -965
   190   -965   -965   -965
  -109    179   -965   -965
   -10    157   -965   -965
   190   -965   -965   -965
    90   -965     15     -3
  -965    179    -85   -965
    49   -101   -965     97
   190   -965   -965   -965
  -965    199   -965   -965
    49    131   -965   -965
  -965     -1    173   -965
  -109    179   -965   -965
    90   -101   -965     56
  -965   -965    147     56
  -965    199   -965   -965
   190   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 8 E= 3.6e-002
 0.000000  0.875000  0.125000  0.000000
 0.500000  0.500000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.250000  0.250000
 0.000000  0.875000  0.125000  0.000000
 0.375000  0.125000  0.000000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.625000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.500000  0.125000  0.000000  0.375000
 0.000000  0.000000  0.625000  0.375000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[AC]AC[CA]A[AGT]C[TA]AC[CA][GC]C[AT][GT]CA
--------------------------------------------------------------------------------




Time 37.52 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
13864                            7.91e-05  118_[+2(3.52e-07)]_239_\
    [+3(2.64e-05)]_109
47879                            5.93e-01  500
48446                            1.23e-01  500
52468                            8.29e-02  500
48958                            1.57e-08  78_[+3(4.49e-09)]_267_\
    [+1(7.14e-08)]_116
49056                            5.64e-01  500
15555                            9.68e-09  32_[+1(9.59e-08)]_12_[+3(1.47e-09)]_\
    417
49558                            1.79e-01  15_[+1(8.49e-05)]_119_\
    [+1(3.01e-05)]_324
8457                             6.48e-02  387_[+2(5.39e-05)]_97
7293                             6.07e-17  202_[+2(5.91e-08)]_108_\
    [+1(6.07e-12)]_99_[+3(1.47e-09)]_36
1241                             6.78e-01  500
4342                             1.28e-01  401_[+3(3.25e-05)]_81
23798                            5.53e-01  500
41071                            2.46e-01  500
23838                            9.83e-01  500
7294                             6.07e-17  202_[+2(5.91e-08)]_108_\
    [+1(6.07e-12)]_99_[+3(1.47e-09)]_36
2171                             1.84e-03  400_[+2(6.16e-07)]_84
50408                            1.08e-03  468_[+3(6.28e-08)]_14
43880                            1.48e-03  79_[+1(7.59e-08)]_400
1875                             6.33e-01  500
44031                            1.52e-01  125_[+2(4.11e-05)]_359
10539                            2.18e-02  500
44134                            6.04e-01  500
50465                            5.49e-01  500
3686                             4.10e-03  164_[+2(3.31e-07)]_320
47484                            1.32e-03  233_[+1(1.74e-07)]_246
40078                            6.66e-01  500
49072                            4.26e-04  64_[+1(9.59e-08)]_415
49808                            3.10e-04  120_[+2(8.32e-06)]_230_\
    [+2(4.36e-08)]_118
50217                            3.12e-17  315_[+2(2.51e-09)]_73_\
    [+1(4.03e-11)]_52_[+3(2.60e-09)]_5
48021                            6.27e-04  169_[+3(1.49e-07)]_313
38867                            5.72e-01  500
49735                            6.88e-01  500
49762                            1.42e-01  500
47598                            1.61e-01  500
48038                            3.10e-04  120_[+2(8.32e-06)]_230_\
    [+2(4.36e-08)]_118
40902                            6.75e-04  [+2(2.91e-07)]_484
48817                            5.55e-02  4_[+1(3.94e-05)]_475
35176                            3.12e-17  315_[+2(2.51e-09)]_73_\
    [+1(4.03e-11)]_52_[+3(2.60e-09)]_5
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************