<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Skippy Records &#187; statistics mathematics probability coin toss</title>
	<atom:link href="http://skippyrecords.wordpress.com/tag/statistics-mathematics-probability-coin-toss/feed/" rel="self" type="application/rss+xml" />
	<link>http://skippyrecords.wordpress.com</link>
	<description></description>
	<lastBuildDate>Tue, 17 Jan 2012 22:41:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='skippyrecords.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Skippy Records &#187; statistics mathematics probability coin toss</title>
		<link>http://skippyrecords.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://skippyrecords.wordpress.com/osd.xml" title="Skippy Records" />
	<atom:link rel='hub' href='http://skippyrecords.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Statistics of Coin-Toss Patterns</title>
		<link>http://skippyrecords.wordpress.com/2008/04/03/statistics-of-coin-toss-patterns/</link>
		<comments>http://skippyrecords.wordpress.com/2008/04/03/statistics-of-coin-toss-patterns/#comments</comments>
		<pubDate>Fri, 04 Apr 2008 03:22:59 +0000</pubDate>
		<dc:creator>Dr. Skippy</dc:creator>
				<category><![CDATA[Everything Else]]></category>
		<category><![CDATA[statistics mathematics probability coin toss]]></category>

		<guid isPermaLink="false">http://h180745wp.setupmyblog.com/?p=71</guid>
		<description><![CDATA[Yesterday, I watched Peter Donnelly&#8217;s TED presentation on statistical mind-benders. Among other things (statistician jokes!), Peter observes that humans don&#8217;t have good intuition for some kinds of statistical thinking. In the presentation, Donnelly poses a coin toss problem to demonstrate his point.&#160; He chooses one that is easy to get wrong. Consider a series of [...]]]></description>
			<content:encoded><![CDATA[<p align="left">Yesterday, I watched <a title="TED Peter Donelly" target="_blank" href="http://www.ted.com/talks/view/id/67">Peter Donnelly&#8217;s TED presentation</a> on statistical mind-benders.  Among other things (statistician jokes!), Peter observes that humans don&#8217;t have good intuition for some kinds of statistical thinking. In the presentation, Donnelly poses a coin toss problem to demonstrate his point.&nbsp; He chooses one that is easy to get wrong.</p>
<p>Consider a series of fair coin tosses.  For example, one possible sequence of coin tosses is THTTHHTHTTH. How many tosses are required to get a particular pattern?  How does this depend on the length of the pattern?</p>
<p>Peter poses a concrete question as follows.&nbsp; Consider the pattern HTH. If we do the experiment of tossing a coin repeatedly and counting the number of tosses, we find that the first occurrence of HTH arises in some average number of coin tosses <img border="0" align="middle" title="n_{HTH}" alt="n_{HTH}" src="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BHTH%7D.gif" />. For a different pattern, say TTH, we can repeat the experiment and find that the first occurrence of this pattern arises in some average number of coin tosses <img border="0" align="middle" title="n_{TTH}" alt="n_{TTH}" src="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BTTH%7D.gif" />.</p>
<p>One of the following statements must be true:</p>
<blockquote><p>   (a) <img border="0" align="middle" title="n_{HTH} = n_{TTH}" alt="n_{HTH} = n_{TTH}" src="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BHTH%7D%20%3D%20n_%7BTTH%7D.gif" /><br />    (b) <img border="0" align="middle" title="n_{HTH} \gt n_{TTH}" alt="n_{HTH} \gt n_{TTH}" src="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BHTH%7D%20%5Cgt%20n_%7BTTH%7D.gif" /><br />    (c) <img border="0" align="middle" title="n_{HTH} \lt n_{TTH}" alt="n_{HTH} \lt n_{TTH}" src="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BHTH%7D%20%5Clt%20n_%7BTTH%7D.gif" /></p>
</blockquote>
<p>Which statement is correct?</p>
<p>My reflex reaction was (a). The heuristic leading to this conclusion is that the probability of getting TTH is the same as the probability of getting HTH in any 3 coin tosses, i.e.,</p>
<div align="center">
<p align="center"> <img border="0" align="middle" title="P(HTH)=P(TTH)=\frac{1}{8}" alt="P(HTH)=P(TTH)=\frac{1}{8}" src="http://www.texify.com/img/%5Cnormalsize%5C%21P%28HTH%29%3DP%28TTH%29%3D%5Cfrac%7B1%7D%7B8%7D.gif" />.</p>
</div>
<div align="left">
<p>On the other hand, if the pattern were HHH, the probability of getting it in any three coin tosses would also be 1/8. But intuitively, I expect to have to make more coin tosses on average to get this pattern.</p>
<p>It is difficult to get to the correct answer with this kind of reasoning.&nbsp;</p>
<p>Somewhat counterintuitively, (b) is the answer. It takes more tosses on average to get the pattern HTH than the pattern TTH.  Peter spends some time arguing that this is plausible; watch his presentation for those arguments. Below, I calculate this for myself&#8230;</p>
</div>
<p><span id="more-71"></span></p>
<p>Remember that the average or expected value of n is calculated by, </p>
<div align="center"><img border="0" alt="&lt;n&gt;=\sum{P(n)n}" src="http://www.texify.com/img/%5Cnormalsize%5C%21%3Cn%3E%3D%5Csum%7BP%28n%29n%7D.gif" title="&lt;n&gt;=\sum{P(n)n}" /></div>
<p align="left">The probability of any single coin toss sequence is <img border="0" align="middle" title="\frac{1}{2^n}" alt="\frac{1}{2^n}" src="http://www.texify.com/img/%5Cnormalsize%5C%21%5Cfrac%7B1%7D%7B2%5En%7D.gif" /> where n is the length of the sequence.</p>
<div align="left"> </div>
<p align="left">One approach to calculating P(n) is to count the coin-toss sequences that contain the target pattern only in the last position. Let N(n-1) be the number of sequences of length n-1 that do not contain the pattern but are one toss away from completing it. Each of these sequences has the length-dependent probability above, so that</p>
<div align="center">
<p align="center"><img border="0" align="middle" alt="P(n)=N(n-1)\frac{1}{2^{n-1}}(\frac{1}{2})=\frac{N(n-1)}{2^{n}}" src="http://www.texify.com/img/%5Cnormalsize%5C%21P%28n%29%3DN%28n-1%29%5Cfrac%7B1%7D%7B2%5E%7Bn-1%7D%7D%28%5Cfrac%7B1%7D%7B2%7D%29%3D%5Cfrac%7BN%28n-1%29%7D%7B2%5E%7Bn%7D%7D.gif" />.</p>
</div>
<div align="left"> </div>
<p align="left">To take into account that some target patterns can overlap themselves, I force the last (l-1) places in the sequence to match the first (l-1) places of the pattern, where l is the length of the pattern.&nbsp; The chance of completing the pattern on the next toss is 1/2, so multiply by 1/2 to get the formula above.</p>
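<p align="left">As a rough illustration of this counting step, here is a minimal brute-force sketch. It is not the script linked in this post; the function names <code>count_first_occurrences</code> and <code>first_occurrence_probability</code> are made up for this example. It enumerates every sequence of length n and keeps those whose first occurrence of the pattern ends on the last toss, which should give the same number as N(n-1) under the definition above.</p>

```python
from itertools import product


def count_first_occurrences(pattern, n):
    """Count length-n H/T sequences whose first occurrence of `pattern`
    ends exactly on toss n (brute force over all 2**n sequences)."""
    count = 0
    for tosses in product("HT", repeat=n):
        seq = "".join(tosses)
        # The sequence must end with the pattern, and the first place the
        # pattern is found must be that final position.
        if seq.endswith(pattern) and seq.find(pattern) == n - len(pattern):
            count += 1
    return count


def first_occurrence_probability(pattern, n):
    """P(n): every length-n sequence has probability 1 / 2**n."""
    return count_first_occurrences(pattern, n) / 2 ** n


if __name__ == "__main__":
    for n in range(3, 11):
        print(n,
              count_first_occurrences("TTH", n),
              count_first_occurrences("HTH", n))
```

<p align="left">Because it visits all 2^n sequences, this is only practical for small n, which is the same limit you hit with a pad of paper, just a little later.</p>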
<div align="left"> </div>
<p align="left">The table below shows the function N(n-1) for the two patterns TTH and HTH. For longer sequences (12 tosses or more in the table), there are more possible sequences not containing HTH than not containing TTH, and the gap widens quickly. Since the mean is a sum over n of n times N(n-1) times a probability that depends only on the length of the sequence, HTH is going to have the greater expected n.</p>
<div align="left"> </div>
<p align="left">You can use a pad of paper to count sequences, but this gets out of hand quickly.  I wrote a Python script to help keep things sorted out. This script makes all the calculations shown for the rest of this entry and can be <a href="http://drskippy.net/python/coinTossPatterns.py" target="_blank" title="Coin Toss Python Script">downloaded here</a>.</p>
<p align="center">&nbsp;</p>
<div align="center">
<table border="0">
<tbody>
<tr>
<td valign="bottom" align="center">Tosses (n)</td>
<td valign="bottom" align="center">Coin Toss Sequences<br />without Pattern (TTH),<br />N(n-1)</td>
<td valign="bottom" align="center">Coin Toss Sequences<br />without Pattern (HTH),<br />N(n-1)</td>
</tr>
<tr>
<td align="center">4</td>
<td align="center">2</td>
<td align="center">2</td>
</tr>
<tr>
<td align="center">5</td>
<td align="center">4</td>
<td align="center">3</td>
</tr>
<tr>
<td align="center">6</td>
<td align="center">7</td>
<td align="center">5</td>
</tr>
<tr>
<td align="center">7</td>
<td align="center">12</td>
<td align="center">9</td>
</tr>
<tr>
<td align="center">8</td>
<td align="center">20</td>
<td align="center">16</td>
</tr>
<tr>
<td align="center">9</td>
<td align="center">33</td>
<td align="center">28</td>
</tr>
<tr>
<td align="center">10</td>
<td align="center">54</td>
<td align="center">49</td>
</tr>
<tr>
<td align="center">11</td>
<td align="center">88</td>
<td align="center">86</td>
</tr>
<tr>
<td align="center">12</td>
<td align="center">143</td>
<td align="center">151</td>
</tr>
<tr>
<td align="center">13</td>
<td align="center">232</td>
<td align="center">265</td>
</tr>
<tr>
<td align="center">14</td>
<td align="center">376</td>
<td align="center">465</td>
</tr>
<tr>
<td align="center">15</td>
<td align="center">609</td>
<td align="center">816</td>
</tr>
<tr>
<td align="center">16</td>
<td align="center">986</td>
<td align="center">1432</td>
</tr>
<tr>
<td align="center">17</td>
<td align="center">1596</td>
<td align="center">2513</td>
</tr>
<tr>
<td align="center">18</td>
<td align="center">2583</td>
<td align="center">4410</td>
</tr>
<tr>
<td align="center">19</td>
<td align="center">4180</td>
<td align="center">7739</td>
</tr>
<tr>
<td align="center">20</td>
<td align="center">6764</td>
<td align="center">13581</td>
</tr>
<tr>
<td align="center">21</td>
<td align="center">10945</td>
<td align="center">23833</td>
</tr>
<tr>
<td align="center">22</td>
<td align="center">17710</td>
<td align="center">41824</td>
</tr>
<tr>
<td align="center">23</td>
<td align="center">28656</td>
<td align="center">73396</td>
</tr>
<tr>
<td align="center">24</td>
<td align="center">46367</td>
<td align="center">128801</td>
</tr>
<tr>
<td align="center">25</td>
<td align="center">75024</td>
<td align="center">226030</td>
</tr>
</tbody>
</table>
<p>&nbsp;</p>
</div>
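<p>The counts in the table above can also be reproduced without listing every sequence. The sketch below is one way to do it, assuming a small state machine that tracks how many leading characters of the pattern are currently matched; this is a swapped-in technique, not necessarily how the downloadable script works, and <code>next_state</code> and <code>first_occurrence_counts</code> are hypothetical names.</p>

```python
def next_state(pattern, matched, toss):
    """Length of the longest prefix of `pattern` that is also a suffix of
    pattern[:matched] + toss."""
    s = pattern[:matched] + toss
    while not pattern.startswith(s):
        s = s[1:]  # drop the oldest toss and try again
    return len(s)


def first_occurrence_counts(pattern, max_n):
    """Return {n: number of length-n sequences whose first occurrence of
    `pattern` ends on toss n} for n = 1 .. max_n."""
    full = len(pattern)
    # ways[k]: number of sequences so far that avoid the pattern and
    # currently match its first k characters.
    ways = [0] * full
    ways[0] = 1
    counts = {}
    for n in range(1, max_n + 1):
        new_ways = [0] * full
        finished = 0
        for k, w in enumerate(ways):
            if w == 0:
                continue
            for toss in "HT":
                j = next_state(pattern, k, toss)
                if j == full:
                    finished += w   # pattern completed on toss n
                else:
                    new_ways[j] += w
        counts[n] = finished
        ways = new_ways
    return counts


if __name__ == "__main__":
    tth = first_occurrence_counts("TTH", 25)
    hth = first_occurrence_counts("HTH", 25)
    for n in range(4, 26):
        print(n, tth[n], hth[n])
```

<p>The work grows linearly with the number of tosses rather than doubling with each toss, so extending the table well past 25 tosses is cheap.</p>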
<p>It is easy to use these counts and the formula above to calculate the expected value of n for short patterns.  Here is an example for a pattern of length 2. This calculation applies to either TH or HT and shows that the expected value of n is 4.00.</p>
<div align="center">
<table border="0">
<tbody>
<tr>
<td valign="bottom" align="center">Coin<br />Tosses</td>
<td valign="bottom" align="center">Coin Toss Sequences<br />without Pattern (TH or HT)</td>
<td valign="bottom" align="center">Average Coin Tosses<br />(successive approximations)</td>
</tr>
<tr>
<td align="center">3</td>
<td align="center">2</td>
<td align="center">1.250</td>
</tr>
<tr>
<td align="center">4</td>
<td align="center">3</td>
<td align="center">2.000</td>
</tr>
<tr>
<td align="center">5</td>
<td align="center">4</td>
<td align="center">2.625</td>
</tr>
<tr>
<td align="center">6</td>
<td align="center">5</td>
<td align="center">3.094</td>
</tr>
<tr>
<td align="center">7</td>
<td align="center">6</td>
<td align="center">3.422</td>
</tr>
<tr>
<td align="center">8</td>
<td align="center">7</td>
<td align="center">3.641</td>
</tr>
<tr>
<td align="center">9</td>
<td align="center">8</td>
<td align="center">3.781</td>
</tr>
<tr>
<td align="center">10</td>
<td align="center">9</td>
<td align="center">3.869</td>
</tr>
<tr>
<td align="center">11</td>
<td align="center">10</td>
<td align="center">3.923</td>
</tr>
<tr>
<td align="center">12</td>
<td align="center">11</td>
<td align="center">3.955</td>
</tr>
<tr>
<td align="center">13</td>
<td align="center">12</td>
<td align="center">3.974</td>
</tr>
<tr>
<td align="center">14</td>
<td align="center">13</td>
<td align="center">3.985</td>
</tr>
<tr>
<td align="center">15</td>
<td align="center">14</td>
<td align="center">3.992</td>
</tr>
<tr>
<td align="center">16</td>
<td align="center">15</td>
<td align="center">3.995</td>
</tr>
<tr>
<td align="center">17</td>
<td align="center">16</td>
<td align="center">3.997</td>
</tr>
<tr>
<td align="center">18</td>
<td align="center">17</td>
<td align="center">3.999</td>
</tr>
<tr>
<td align="center">19</td>
<td align="center">18</td>
<td align="center">3.999</td>
</tr>
<tr>
<td align="center">20</td>
<td align="center">19</td>
<td align="center">4.000</td>
</tr>
</tbody>
</table>
</div>
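<p>The successive approximations in the table above can be sketched in a few lines. This assumes, as the table's second column suggests, that the count for a pattern of length 2 such as TH is n-1 for n tosses; <code>partial_means</code> is a hypothetical helper name, not part of the downloadable script.</p>

```python
def partial_means(max_n):
    """Running sums of n * P(n) for the pattern TH (or HT),
    using P(n) = (n - 1) / 2**n."""
    total = 0.0
    rows = []
    for n in range(2, max_n + 1):
        count = n - 1                 # assumed count from the table above
        total += n * count / 2 ** n   # add this term of the sum of n * P(n)
        rows.append((n, count, total))
    return rows


if __name__ == "__main__":
    for n, count, approx in partial_means(20):
        print(f"{n:3d} {count:3d} {approx:.3f}")
```

<p>Each row is the tosses, the count, and the running estimate of the average, in the same order as the table's columns.</p>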
<p>This series converges slowly and the number of permutations becomes very large for longer patterns.  So, although this brute-force summing is straightforward, it runs up against practical limits surprisingly quickly.</p>
<p>Another approach to calculating the average n for a pattern is to use a Monte Carlo simulation of the coin toss events. The Python script above also performs this calculation.  The table below shows the results for patterns up to length 4.</p>
<div align="center">
<table border="0">
<tbody>
<tr align="center">
<td>Pattern</td>
<td>Estimated average n</td>
</tr>
<tr align="center">
<td colspan="2">Patterns of length 1 </td>
</tr>
<tr align="center">
<td>T</td>
<td>2.00</td>
</tr>
<tr align="center">
<td>H</td>
<td>1.98</td>
</tr>
<tr align="center">
<td colspan="2">Patterns of length 2 </td>
</tr>
<tr align="center">
<td>TT</td>
<td>5.99</td>
</tr>
<tr align="center">
<td>HH</td>
<td>6.00</td>
</tr>
<tr align="center">
<td>HT</td>
<td>4.02</td>
</tr>
<tr align="center">
<td>TH</td>
<td>4.01</td>
</tr>
<tr align="center">
<td colspan="2">Patterns of length 3 </td>
</tr>
<tr align="center">
<td>TTT</td>
<td>13.95</td>
</tr>
<tr align="center">
<td>HHH</td>
<td>14.07</td>
</tr>
<tr align="center">
<td>HTT</td>
<td>8.03</td>
</tr>
<tr align="center">
<td>THH</td>
<td>8.02</td>
</tr>
<tr align="center">
<td>THT</td>
<td>9.95</td>
</tr>
<tr align="center">
<td>HTH</td>
<td>10.04</td>
</tr>
<tr align="center">
<td>HHT</td>
<td>8.00</td>
</tr>
<tr align="center">
<td>TTH</td>
<td>8.06</td>
</tr>
<tr align="center">
<td colspan="2">Patterns of length 4 </td>
</tr>
<tr align="center">
<td>TTTT</td>
<td>30.07</td>
</tr>
<tr align="center">
<td>HHHH</td>
<td>30.15</td>
</tr>
<tr align="center">
<td>HTTT</td>
<td>16.10</td>
</tr>
<tr align="center">
<td>THHH</td>
<td>16.07</td>
</tr>
<tr align="center">
<td>THTT</td>
<td>17.96</td>
</tr>
<tr align="center">
<td>HTHH</td>
<td>17.98</td>
</tr>
<tr align="center">
<td>HHTT</td>
<td>15.92</td>
</tr>
<tr align="center">
<td>TTHH</td>
<td>16.16</td>
</tr>
<tr align="center">
<td>TTHT</td>
<td>18.09</td>
</tr>
<tr align="center">
<td>HHTH</td>
<td>18.00</td>
</tr>
<tr align="center">
<td>HTHT</td>
<td>20.14</td>
</tr>
<tr align="center">
<td>THTH</td>
<td>19.94</td>
</tr>
<tr align="center">
<td>THHT</td>
<td>18.12</td>
</tr>
<tr align="center">
<td>HTTH</td>
<td>18.20</td>
</tr>
<tr align="center">
<td>HHHT</td>
<td>16.13</td>
</tr>
<tr align="center">
<td>TTTH</td>
<td>16.0</td>
</tr>
</tbody>
</table>
</div>
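<p align="left">This table makes it easy to compare the results for different patterns of the same length. Patterns that can overlap themselves, such as HH, HTH, and HTHT, take more tosses on average than other patterns of the same length.</p>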
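<p align="left">A Monte Carlo estimate like the ones in the table can be sketched as follows. This is a minimal version written for illustration, not the downloadable script: the names <code>tosses_until</code>, <code>estimate_mean</code>, and <code>overlap_mean</code> are hypothetical, and the fixed seed is only there to make runs repeatable. The last function is a cross-check that is not used in this post: for a fair coin, a known closed form gives the exact average as the sum of 2^k over every k for which the first k characters of the pattern equal its last k characters.</p>

```python
import random


def tosses_until(pattern, rng):
    """Toss a fair coin until `pattern` appears; return the number of tosses."""
    recent = ""
    n = 0
    while recent != pattern:
        # Keep only the most recent len(pattern) tosses.
        recent = (recent + rng.choice("HT"))[-len(pattern):]
        n += 1
    return n


def estimate_mean(pattern, trials=20000, seed=0):
    """Average the waiting time over `trials` simulated toss sequences."""
    rng = random.Random(seed)
    return sum(tosses_until(pattern, rng) for _ in range(trials)) / trials


def overlap_mean(pattern):
    """Exact average for a fair coin from the pattern's self-overlaps."""
    return sum(2 ** k for k in range(1, len(pattern) + 1)
               if pattern[:k] == pattern[-k:])


if __name__ == "__main__":
    for pattern in ("TTH", "HTH", "TTTH", "HTHT"):
        print(pattern, round(estimate_mean(pattern), 2), overlap_mean(pattern))
```

<p align="left">With 20,000 trials the estimates should scatter around the exact values by a few hundredths to a tenth or two, which is consistent with the spread seen in the table.</p>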
<p align="left">Finally, the Monte Carlo method gives an estimate of the distribution of n for the patterns.  (The numbers above are based on 20,000 sequences.) The histogram below shows the distribution of sequence length for the pattern TTTH.&nbsp; The average sequence length is estimated to be 16 (from the table above). Because the distribution is asymmetric and has a very long tail to the right, the average doesn&#8217;t appear near any interesting features (the peak, for example) of the distribution.</p>
<p align="center"><img width="440" height="330" border="0" align="absmiddle" title="TTTH Distribution (20000 Trials)" alt="TTTH Distribution (20000 Trials)" src="http://drskippy.net/img/TTTH_hist_20080403.png" />&nbsp;</p>
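<p align="left">To see how far the average sits from the bulk of the distribution, the sketch below computes the distribution of n directly instead of sampling it. It assumes the same matched-prefix bookkeeping as a small state machine (an approach swapped in here, not taken from the downloadable script), and <code>waiting_time_distribution</code> and <code>summarize</code> are hypothetical names.</p>

```python
def next_state(pattern, matched, toss):
    """Length of the longest prefix of `pattern` that is also a suffix of
    pattern[:matched] + toss."""
    s = pattern[:matched] + toss
    while not pattern.startswith(s):
        s = s[1:]
    return len(s)


def waiting_time_distribution(pattern, max_n):
    """Return {n: P(first occurrence of `pattern` ends on toss n)}
    for n = 1 .. max_n (the tail beyond max_n is ignored)."""
    full = len(pattern)
    # prob[k]: probability the pattern has not appeared yet and its
    # first k characters are currently matched.
    prob = [0.0] * full
    prob[0] = 1.0
    dist = {}
    for n in range(1, max_n + 1):
        new_prob = [0.0] * full
        done = 0.0
        for k, p in enumerate(prob):
            for toss in "HT":
                j = next_state(pattern, k, toss)
                if j == full:
                    done += p / 2
                else:
                    new_prob[j] += p / 2
        dist[n] = done
        prob = new_prob
    return dist


def summarize(dist):
    """Return (mean, median, mode); mode is the earliest most likely n."""
    mean = sum(n * p for n, p in dist.items())
    mode = max(dist, key=dist.get)
    cumulative = 0.0
    median = None
    for n in sorted(dist):
        cumulative += dist[n]
        if cumulative >= 0.5:
            median = n
            break
    return mean, median, mode


if __name__ == "__main__":
    print(summarize(waiting_time_distribution("TTTH", 500)))
```

<p align="left">For TTTH the median and the most likely values of n both fall below the average, which is what the long right tail in the histogram suggests.</p>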
]]></content:encoded>
			<wfw:commentRss>http://skippyrecords.wordpress.com/2008/04/03/statistics-of-coin-toss-patterns/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fd95bd67cd406fcb27a627a44570f2a2?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">drskippy27</media:title>
		</media:content>

		<media:content url="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BHTH%7D.gif" medium="image">
			<media:title type="html">n_{HTH}</media:title>
		</media:content>

		<media:content url="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BTTH%7D.gif" medium="image">
			<media:title type="html">n_{TTH}</media:title>
		</media:content>

		<media:content url="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BHTH%7D%20%3D%20n_%7BTTH%7D.gif" medium="image">
			<media:title type="html">n_{HTH} = n_{TTH}</media:title>
		</media:content>

		<media:content url="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BHTH%7D%20%5Cgt%20n_%7BTTH%7D.gif" medium="image">
			<media:title type="html">n_{HTH} \gt n_{TTH}</media:title>
		</media:content>

		<media:content url="http://www.texify.com/img/%5Cnormalsize%5C%21n_%7BHTH%7D%20%5Clt%20n_%7BTTH%7D.gif" medium="image">
			<media:title type="html">n_{HTH} \lt n_{TTH}</media:title>
		</media:content>

		<media:content url="http://www.texify.com/img/%5Cnormalsize%5C%21P%28HTH%29%3DP%28TTH%29%3D%5Cfrac%7B1%7D%7B8%7D.gif" medium="image">
			<media:title type="html">P(HTH)=P(TTH)=\frac{1}{8}</media:title>
		</media:content>

		<media:content url="http://www.texify.com/img/%5Cnormalsize%5C%21%3Cn%3E%3D%5Csum%7BP%28n%29n%7D.gif" medium="image">
			<media:title type="html">&amp;lt;n&amp;gt;=\sum{P(n)n}</media:title>
		</media:content>

		<media:content url="http://www.texify.com/img/%5Cnormalsize%5C%21%5Cfrac%7B1%7D%7B2%5En%7D.gif" medium="image">
			<media:title type="html">\frac{1}{2^n}</media:title>
		</media:content>

		<media:content url="http://www.texify.com/img/%5Cnormalsize%5C%21P%28n%29%3DN%28n-1%29%5Cfrac%7B1%7D%7B2%5E%7Bn-1%7D%7D%28%5Cfrac%7B1%7D%7B2%7D%29%3D%5Cfrac%7BN%28n-1%29%7D%7B2%5E%7Bn%7D%7D.gif" medium="image">
			<media:title type="html">P(n)=N(n-1)\frac{1}{2^{n-1}}(\frac{1}{2})=\frac{N(n-1)}{2^{n}}</media:title>
		</media:content>

		<media:content url="http://drskippy.net/img/TTTH_hist_20080403.png" medium="image">
			<media:title type="html">TTTH Distribution (20000 Trials)</media:title>
		</media:content>
	</item>
	</channel>
</rss>
