Statistika For Life: 25 new messages in 7 topics

sci.stat.math
http://groups.google.com/group/sci.stat.math?hl=en

sci.stat.math@googlegroups.com

Today's topics:

* Calculation of critical p-, z-, t- and F-values - 6 messages, 3 authors
http://groups.google.com/group/sci.stat.math/browse_thread/thread/5ef9c70814f32927?hl=en
* Combinatorial probability problem - 10 messages, 6 authors
http://groups.google.com/group/sci.stat.math/browse_thread/thread/d26220a7e3943029?hl=en
* Poisson distribution test - 4 messages, 4 authors
http://groups.google.com/group/sci.stat.math/browse_thread/thread/6dabdbdd305700fc?hl=en
* Turn a uniform number to normal random numbers - 2 messages, 2 authors
http://groups.google.com/group/sci.stat.math/browse_thread/thread/054d911605199f7c?hl=en
* Conditional Probability. - 1 message, 1 author
http://groups.google.com/group/sci.stat.math/browse_thread/thread/e22ead0b091bc3bf?hl=en
* wrong R-Squared value?? - 1 message, 1 author
http://groups.google.com/group/sci.stat.math/browse_thread/thread/259e11ac412a3219?hl=en
* Call for Papers: The 2008 International Conference of Computational
Statistics and Data Engineering ICCSDE 2008 - 1 message, 1 author
http://groups.google.com/group/sci.stat.math/browse_thread/thread/d977ac14ca3059d0?hl=en

==============================================================================
TOPIC: Calculation of critical p-, z-, t- and F-values
http://groups.google.com/group/sci.stat.math/browse_thread/thread/5ef9c70814f32927?hl=en
==============================================================================

== 1 of 6 ==
Date: Wed, Nov 14 2007 1:34 am
From: "Nasser Abbasi"

Hello David;

I thought I would try the Mathematica equivalents of your R commands.

I just started to learn R as well. But I find that with Mathematica I can
also do analytical analysis, not just numerical, which is very useful to me
and is not so easily done in R. R does seem to be quite good at statistics,
though, and has many more statistical functions than Mathematica (as
can be expected :)

Below I show the Mathematica commands under your R commands. I was able to
find a Mathematica equivalent for each of the R commands.

I am using version 6 (the new version of Mathematica), to which many
new statistics functions have been added. I use the student version of
Mathematica (about $130 US).

"David Winsemius" <doe_snot@comcast.n0T> wrote in message
news:Xns99E7E5D6B97A7dwtttttt@216.196.97.136...

> All one needs to do in R is execute very simple procedures. If you cannot
> remember all the distribution functions, all you need to do is type:
> help.search("distribution")
>
> Typing qnorm(0.975) will produce the desired quantile of the inverse
> Normal.

Quantile[NormalDistribution[0, 1], 975/1000]
Sqrt[2]*InverseErf[19/20]

N[%] <----- This says to convert the above output to a number
1.9599639845400538

>The OP desired a method for "critical values for z,
> p, t and F for a significance level (e. g. 99 %)". I don't know what he
> meant by "p" but the corresponding values to the t from qt(<quantile>,
> <degrees of freedom>) and for the F distribution would be obtained by
> typing qf(<quantile>,<df1>,<df2> ).
>
>> qt(0.975,60)
> [1] 2.000298

Quantile[StudentTDistribution[60], 975/1000]
2*Sqrt[15*(-1 + 1/InverseBetaRegularized[1, -(19/20), 30, 1/2])]

N[%]
2.0002978220142507

>> qt(0.975,120)
>
> [1] 1.979930

Quantile[StudentTDistribution[120], 975/1000]
2*Sqrt[30*(-1 + 1/InverseBetaRegularized[1, -(19/20), 60, 1/2])]

N[%]
1.9799304050824413

> # if you wanted the critical values for 10 through 60 df in steps of 10
> this would work:
>
>> qt(0.975,seq(10,60,10))
> [1] 2.228139 2.085963 2.042272 2.021075 2.008559 2.000298
>

Table[Quantile[StudentTDistribution[k], 0.975], {k, 10, 60, 10}]
{2.2281388519862735, 2.0859634472658626, 2.042272456301236,
2.0210753903062715, 2.008559112100758, 2.0002978220142507}

>> qt(0.975,Inf)
> [1] 1.959964

With Infinity for the degrees of freedom, Mathematica does not evaluate the
expression, but large finite values work fine.
(There might be some limiting issue involved. I do not know how R does it.)

Quantile[StudentTDistribution[100000000], 0.975]
1.959964074411019

>
> You would find that qf(2*p - 1, 1, df) was identical to qt(p, df)^2.
>

The above is valid when assuming 1-2p<0 :

r1 = Quantile[StudentTDistribution[df], p];
r2 = Quantile[FRatioDistribution[1, df], 2*p - 1];
Assuming[1 - 2*p < 0, FullSimplify[ r1^2 - r2] ]

Out[154]= 0
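
The reason the identity needs 1 - 2p < 0 can be written out in a couple of lines. This is only a sketch of the standard argument; the symbols T, \nu (for df) and t_p (for qt(p, df)) are notation introduced here, not taken from the posts.

```latex
\begin{align*}
T \sim t_{\nu} &\;\Longrightarrow\; T^{2} \sim F_{1,\nu}, \\
\text{for } p > \tfrac12,\ t_p > 0:\quad
P\!\left(T^{2} \le t_p^{2}\right)
  &= P\!\left(-t_p \le T \le t_p\right)
   = p - (1 - p) = 2p - 1 .
\end{align*}
```

So t_p^2 is the (2p - 1) quantile of F(1, df). For p < 1/2 the same steps give 1 - 2p instead, which is why the assumption is required.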

> Not very steep if you ask me. Installation is simple enough for an old doc
> to do it without much pain ever since version 1.8.
>
> --
> David Winsemius, MD, MPH
>

Nasser
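
For readers with neither R nor Mathematica at hand, the same critical values can be approximated with the Python standard library alone. This is a rough sketch, not how R or Mathematica compute them: the normal quantile comes from `statistics.NormalDist`, while the helpers `t_pdf`, `t_cdf` and `t_quantile` are hypothetical names written for this example and use plain Simpson integration plus bisection. It is only meant for moderate degrees of freedom and p > 0.5.

```python
import math
from statistics import NormalDist


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus composite Simpson integration over [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df, tol=1e-10):
    """Quantile for p > 0.5, found by bisection on t_cdf."""
    lo, hi = 0.0, 100.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z975 = NormalDist().inv_cdf(0.975)   # counterpart of qnorm(0.975)
t60 = t_quantile(0.975, 60)          # counterpart of qt(0.975, 60)
t10 = t_quantile(0.975, 10)          # counterpart of qt(0.975, 10)
print(z975, t60, t10)
```

The printed values should agree with the R output quoted above to about six decimals. For very large df the `lgamma` difference loses accuracy, so this sketch is not a substitute for a proper library routine.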

== 2 of 6 ==
Date: Wed, Nov 14 2007 3:04 am
From: iandjmsmith@aol.com

I have never used Mathematica, so I do not understand all of your
notes on its use.

I gather when you ask for

Quantile[StudentTDistribution[120], 975/1000]

it returns 2*Sqrt[30*(-1 + 1/InverseBetaRegularized[1, -(19/20), 60,
1/2])] as the symbolic equivalent of it and then evaluates this
expression and outputs it as 1.9799304050824413. You can control the
number of digits in the output value.

Can you control the accuracy of the calculation? I am just curious
because none of the values returned are particularly accurate and if
Quantile[StudentTDistribution[100000000], 0.975] returns
1.959964074411019 then it is surprisingly inaccurate (relative error
of about 3e-8).

Ian Smith

== 3 of 6 ==
Date: Wed, Nov 14 2007 3:16 am
From: "Nasser Abbasi"

<iandjmsmith@aol.com> wrote in message
news:1195038272.573775.184990@19g2000hsx.googlegroups.com...

>
> I have never used Mathematica, so I do not understand all of your
> notes on its use.
>
> I gather when you ask for
>
> Quantile[StudentTDistribution[120], 975/1000]
>
> it returns 2*Sqrt[30*(-1 + 1/InverseBetaRegularized[1, -(19/20), 60,
> 1/2])] as the symbolic equivalent of it and then evaluates this
> expression and outputs it as 1.9799304050824413.

Mathematica does everything in symbolic form when possible (numbers are kept
in rational form, etc.) unless there is a numeric value in the expression
(i.e. a decimal). So, yes, all the above calculations are done analytically,
and at the end the user can ask for the numerical value using the function
N[].

>
> Can you control the accuracy of the calculation?

Yes.

>I am just curious
> because none of the values returned are particularly accurate and if
> Quantile[StudentTDistribution[100000000], 0.975] returns
> 1.959964074411019 then it is surprisingly inaccurate (relative error
> of about 3e-8).
>

This is the actual Quantile expression:

Quantile[StudentTDistribution[100000000], 975/1000]

10000*Sqrt[-1 + 1/InverseBetaRegularized[1, -(19/20), 50000000, 1/2]]

I can now ask Mathematica to give me the numerical value of the above to 50
decimals, for example:

N[%, 50]
1.9599640082627668207600863127853028493573117315681873087708854414654`50.

In my last message I was using the default settings. Is the above answer
accurate enough now?

Nasser

== 4 of 6 ==
Date: Wed, Nov 14 2007 3:40 am
From: Karl Ove Hufthammer

Nasser Abbasi:

>>I am just curious
>> because none of the values returned are particularly accurate and if
>> Quantile[StudentTDistribution[100000000], 0.975] returns
>> 1.959964074411019 then it is surprisingly inaccurate (relative error
>> of about 3e-8).
>
> I can now ask Mathematica to give me the numerical value of the above to
> 50 decimals for example
>
> N[%, 50]
> 1.9599640082627668207600863127853028493573117315681873087708854414654`50.
>
> In my last message I was using the default settings. Is the above answer
> accurate enough now?

This looks much more accurate, at least. But it's worrying that Mathematica
by default prints many more digits than its calculations are accurate for.

--
Karl Ove Hufthammer

== 5 of 6 ==
Date: Wed, Nov 14 2007 4:30 am
From: iandjmsmith@aol.com

I've got it. The problem was you asked for
Quantile[StudentTDistribution[100000000], 0.975] and the 0.975 causes
it to work to machine precision. The method of calculation is very
poor and hence only delivers a result with a relative error of 3e-8.

You can ask for exact calculations with
Quantile[StudentTDistribution[100000000], 975/1000]

According to http://reference.wolfram.com/mathematica/ref/N.html?q=N&lang=en
N[%, n] attempts to give a result with n-digit precision.

It is not clear what is stopping it giving rather than attempting to
give a result with n-digit precision. The + operation is going to lose
about 8 figures so InverseBetaRegularized[1, -(19/20), 50000000, 1/2]
must be calculated to about 58 digits to give 50 digit accuracy. I am
still lost as to why it has printed out 68 digits. Maybe that is how
many figures it did the calculations to.

Ian Smith
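
The loss of figures in the "-1 + 1/x" step can be seen directly in double precision. The sketch below does not reproduce Mathematica's internals: the value `x` is simply backed out of the formula 10000*Sqrt[-1 + 1/x] using the 50-digit result quoted in the thread, and the variable names are made up for the example.

```python
import math

t_ref = 1.9599640082627668   # 50-digit value from the thread, rounded to a double
df = 100_000_000

# A double close to the InverseBetaRegularized[...] value, obtained by
# inverting t = 10000*sqrt(-1 + 1/x). Illustration only.
x = 1.0 / (1.0 + t_ref * t_ref / df)

inv = 1.0 / x          # slightly above 1, so its spacing is 2**-52
d = inv - 1.0          # the cancelling step; d is about 3.84e-8
units = d * 2.0**52    # d counted in units of 2**-52

t_back = 10000 * math.sqrt(d)
rel_err = abs(t_back - t_ref) / t_ref

print(units, 1 / units, rel_err)
```

Because `inv` lies between 1 and 2, `d` can only be a whole multiple of 2**-52, and here that multiple is a number of order 10^8. The difference therefore carries only about eight significant decimal digits, however accurately `x` itself was computed, which matches the estimate that the operation loses about 8 figures.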

== 6 of 6 ==
Date: Wed, Nov 14 2007 5:58 am
From: "Nasser Abbasi"

<iandjmsmith@aol.com> wrote in message
news:1195043423.480673.177500@o3g2000hsb.googlegroups.com...

> I've got it. The problem was you asked for
> Quantile[StudentTDistribution[100000000], 0.975] and the 0.975 causes
> it to work to machine precision. The method of calculation is very
> poor and hence only delivers a result with a relative error of 3e-8.

That is correct. As I said, when there is a 'numeric' (decimal) value in the
expression, the computation is done numerically rather than symbolically.

>
> You can ask for exact calculations with
> Quantile[StudentTDistribution[100000000], 975/1000]
>

Yes.

> According to
> http://reference.wolfram.com/mathematica/ref/N.html?q=N&lang=en
> N[%, n] attempts to give a result with n-digit precision.
>

Yes.

> It is not clear what is stopping it giving rather than attempting to
> give a result with n-digit precision. The + operation is going to lose
> about 8 figures so InverseBetaRegularized[1, -(19/20), 50000000, 1/2]
> must be calculated to about 58 digits to give 50 digit accuracy. I am
> still lost as to why it has printed out 68 digits. Maybe that is how
> many figures it did the calculations to.
>
> Ian Smith
>

Well, Mathematica's floating point model is a little hard for me to comprehend
without spending more time on it, and I am no expert on this aspect of
Mathematica; maybe a Mathematica expert can comment on this. I know it
sometimes uses "significance arithmetic", which is implemented in software.

I copied a reply I wrote sometime ago in another newsgroup which has a link
to a detailed paper about Mathematica floating point model if anyone is
interested in some reading over their coffee break :)

"There is this paper that goes into all the details you ever wanted to know
about Mathematica handling of floating point arithmetic (LONG URL)

http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6W8D-4FFGJ35-1&_coverDate=07%2F31%2F2005&_alid=458876682&_rdoc=1&_fmt=&_orig=search&_qd=1&_cdi=6652&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=d64b2c0097f2daf6c879b7f57ad45ab2

"M. Sofroniou and G. Spaletta. Precise numerical computation. The Journal of
Logic and Algebraic Programming 64:113-134. 2005."

There is a 1999 version of the paper that anyone can download for free (I
am not sure what the difference is):
http://citeseer.ist.psu.edu/sofroniou99precise.html

Anyway, I am trying to understand it. But regarding the use of significance
arithmetic in Mathematica, the paper says that it is not done all the time;
here is the quote:

"Indeed Mathematica uses fixed-precision arithmetic instead of significance
arithmetic in its large scale numerical routines, such as in linear algebra
and the numerical solution of differential equations; the error bounds
provided by these numerical methods are well studied and provide much
tighter bounds than those based on assumptions of independent error
accumulation."

There is also a standard way to tell Mathematica to use hardware floating
point arithmetic if you want, by setting some global options at the
start of the session.

Nasser "

Nasser

==============================================================================
TOPIC: Combinatorial probability problem
http://groups.google.com/group/sci.stat.math/browse_thread/thread/d26220a7e3943029?hl=en
==============================================================================

== 1 of 10 ==
Date: Wed, Nov 14 2007 4:08 am
From: John Uebersax

Would someone kindly tell me what is the *formula* to answer this
question:

An urn contains red, blue, and green balls, in equal proportions.
Drawing four balls (with replacement), what is the probability that at
least one color will not be represented among the four.

Thanks in advance. (No, this is not a homework problem ;) )
--
John Uebersax PhD
http://ourworld.compuserve.com/homepages/jsuebersax/agree.htm

== 2 of 10 ==
Date: Wed, Nov 14 2007 5:06 am
From: iandjmsmith@aol.com

On 14 Nov, 12:08, John Uebersax <jsueber...@gmail.com> wrote:
> Would someone kindly tell me what is the *formula* to answer this
> question:
>
> An urn contains red, blue, and green balls, in equal proportions.
> Drawing four balls (with replacement), what is the probability that at
> least one color will not be represented among the four.
>
> Thanks in advance. (No, this is not a homework problem ;) )
> --
> John Uebersax PhD
> http://ourworld.compuserve.com/homepages/jsuebersax/agree.htm

The numbers of red, green and blue balls will have a multinomial
distribution.

The only way you will have all 3 colours represented is if there are 2
reds, 1 green and one blue ball or 1 red, 2 greens and one blue ball
or 1 red, 1 green and two blue balls.

The probability is 3*4!/(2!*1!*1!)*(1/3)^3

The probability of at least one colour not being selected is therefore
5/9.

Ian Smith

== 3 of 10 ==
Date: Wed, Nov 14 2007 5:35 am
From: "Nasser Abbasi"

"John Uebersax" <jsuebersax@gmail.com> wrote in message
news:1195042122.174589.94000@o80g2000hse.googlegroups.com...
> Would someone kindly tell me what is the *formula* to answer this
> question:
>
> An urn contains red, blue, and green balls, in equal proportions.
> Drawing four balls (with replacement), what is the probability that at
> least one color will not be represented among the four.
>
> Thanks in advance. (No, this is not a homework problem ;) )
> --
> John Uebersax PhD
> http://ourworld.compuserve.com/homepages/jsuebersax/agree.htm
>

These are the hard problems I am talking about when it comes to probability.

This is my attempt at it.

At least one color not selected means P(exactly one color not
selected) or P(exactly 2 colors not selected).

There are 3^4 ways to draw 4 balls (with replacement).

There are 2^4 ways to draw 4 balls with one color missing. But there are 3
colors, hence there are 3*(2^4) ways to draw it.
There is 1 way to draw 4 balls with 2 colors missing. But there are 3
colors, hence there are 3 ways to draw it.
Hence the chance of at least one color missing is [3*(2^4)+3]/3^4 or 51/81 =
0.62963

More than 60% that at least one color is missing? Too high.

Did I make a mistake? I have a feeling I did :)

Nasser

== 4 of 10 ==
Date: Wed, Nov 14 2007 5:55 am
From: Jussi Piitulainen

John Uebersax writes:

> Would someone kindly tell me what is the *formula* to answer this
> question:
>
> An urn contains red, blue, and green balls, in equal proportions.
> Drawing four balls (with replacement), what is the probability that
> at least one color will not be represented among the four.

Here's an elementary derivation.

Let Rk, Gk, Bk be the statements that there are k Red, Green, Blue
balls among the four drawn. On your information I, there are four
draws, each drawn ball being Red or Green or Blue, each colour equally
likely each time.

Using A+B for disjunction, A,B for conjunction:

P(R0 + G0 + B0|I)

= P(R0|I) + P(G0|I) + P(B0|I) ; repeated use of the sum rule
- P(G0, B0|I) - P(R0, G0|I) - P(R0, B0|I)
+ P(R0, G0, B0|I)

= P(R0|I) + P(G0|I) + P(B0|I) ; use the information I
- P(R4|I) - P(B4|I) - P(G4|I)
+ 0

= 3(P(R0|I) - P(R4|I)) ; colours have the same probabilities

= 3((2/3)^4 - (1/3)^4) ; simple combinatorics

= 5/9.
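
The derivation above can be checked by brute force. The sketch below enumerates all 3^n equally likely ordered draws and compares the count with the inclusion-exclusion formula, with the 4 replaced by a general n (an extension assumed here, not stated in the post). The function names are invented for the example.

```python
from fractions import Fraction
from itertools import product


def p_missing_enumerated(n, colours="RGB"):
    """P(at least one colour absent), counting all ordered draws of n balls."""
    outcomes = list(product(colours, repeat=n))
    missing = sum(1 for o in outcomes if len(set(o)) < len(colours))
    return Fraction(missing, len(outcomes))


def p_missing_formula(n):
    """Inclusion-exclusion for three equally likely colours: 3(2/3)^n - 3(1/3)^n."""
    return 3 * Fraction(2, 3) ** n - 3 * Fraction(1, 3) ** n


p4 = p_missing_enumerated(4)
p5 = p_missing_enumerated(5)
print(p4, p_missing_formula(4))   # both 5/9
print(p5, p_missing_formula(5))   # both 31/81
```

In counting terms, 3*2^4 counts every single-colour draw twice, so the three single-colour draws have to be subtracted rather than added: 3*2^4 - 3 = 45 of the 81 ordered draws miss at least one colour.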

== 5 of 10 ==
Date: Wed, Nov 14 2007 6:14 am
From: John Uebersax

Thank you Ian.

Two questions:

1. I seem, perhaps mistakenly, to enumerate 15 possible combinations, of
which only 3 include all colors, or P = 3/15. Am I overlooking something
obvious?

R B G
-----
4 0 0
3 1 0
3 0 1
2 2 0
2 0 2
2 1 1
1 3 0
1 2 1
1 1 2
1 0 3
0 4 0
0 3 1
0 2 2
0 1 3
0 0 4

2. What is the corresponding probability/formula for all colors being
represented given 5 balls drawn instead of 4?

Thanks,

John Uebersax

== 6 of 10 ==
Date: Wed, Nov 14 2007 6:20 am
From: John Uebersax

Thanks Ian.

I, perhaps mistakenly, enumerate 15 possible combinations, with 3
meeting the criterion, for P = 3/ 15. Am I overlooking something very
obvious?

R B G
-----
4 0 0
3 1 0
3 0 1
2 2 0
2 0 2
2 1 1
1 3 0
1 2 1
1 1 2
1 0 3
0 4 0
0 3 1
0 2 2
0 1 3
0 0 4

Also, can you tell me how your formula would generalize given five
balls drawn instead of 4?

--
John Uebersax PhD

== 7 of 10 ==
Date: Wed, Nov 14 2007 6:22 am
From: Joe Blow

Yet another way:

"Each color is drawn" is equivalent to "If the first two balls are the same color, the next two balls must differ from it and from each other; if the first two balls are different colors, at least one of the next two must be the remaining third color."

So,

Prob(each color is drawn) = 1/3*(2/3*1/3) + 2/3*(1 - (2/3)^2) = 2/27 + 10/27 = 4/9

1-Prob(each color is drawn) = 5/9.

== 8 of 10 ==
Date: Wed, Nov 14 2007 6:48 am
From: iandjmsmith@aol.com

The probabilities are

R B G
-----
4 0 0 1/81
3 1 0 4/81
3 0 1 4/81
2 2 0 6/81
2 0 2 6/81
2 1 1 12/81
1 3 0 4/81
1 2 1 12/81
1 1 2 12/81
1 0 3 4/81
0 4 0 1/81
0 3 1 4/81
0 2 2 6/81
0 1 3 4/81
0 0 4 1/81

and the sum of the 3 which include all is 36/81 or 4/9.

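With similar logic, the probability that all three colours appear among
five balls drawn is 3*(prob of selecting 3,1,1 + prob of selecting 2,2,1)
= 3*(20 + 30)/243 = 50/81, so the probability that at least one color will
not be represented among the five balls drawn is 31/81.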

Ian Smith
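
The table of probabilities can be regenerated from the multinomial coefficient n!/(r! b! g!) divided by 3^n. The following is a small sketch with a made-up helper name, `composition_probs`, assuming three equally likely colours as in the original question.

```python
from fractions import Fraction
from math import factorial


def composition_probs(n):
    """Map each (red, blue, green) count triple to its multinomial probability."""
    probs = {}
    for r in range(n + 1):
        for b in range(n - r + 1):
            g = n - r - b
            ways = factorial(n) // (factorial(r) * factorial(b) * factorial(g))
            probs[(r, b, g)] = Fraction(ways, 3 ** n)
    return probs


probs4 = composition_probs(4)
all_colours4 = sum(p for c, p in probs4.items() if min(c) >= 1)
all_colours5 = sum(p for c, p in composition_probs(5).items() if min(c) >= 1)

print(len(probs4), sum(probs4.values()))   # 15 compositions, total probability 1
print(all_colours4, all_colours5)          # 4/9 and 50/81
```

The 15 rows are the same ones enumerated in the question, but their weights range from 1/81 to 12/81, which is why 3/15 is not the right answer.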

== 9 of 10 ==
Date: Wed, Nov 14 2007 6:51 am
From: Karl Ove Hufthammer

iandjmsmith@aol.com:

> The probability is 3*4!/(2!*1!*1!)*(1/3)^3

The last factor should be (1/3)^4 (which you have correctly used in the
final answer).

> The probability of at least one colour not being selected is therefore
> 5/9.

--
Karl Ove Hufthammer

== 10 of 10 ==
Date: Wed, Nov 14 2007 7:03 am
From: Karl Ove Hufthammer

John Uebersax:

> I seem, perhaps mistakenly, to enumerate 15 possible combinations, of
> which only 3 include all
> colors, or P = 3/15. Am I overlooking something obvious:
>
> R B G
> -----
> 4 0 0
> 3 1 0
> 3 0 1
> 2 2 0

They don't all have the same probability. For instance, you can get
400 in only one way, by picking RRRR (probability: (1/3)^4), but you
can get 220 in several (6) ways: RRBB RBRB BRBR BBRR BRRB RBBR
(Probability: 6 × (1/3)^4.)

The shortcut 'favourable outcomes divided by possible outcomes' only works
when each outcome has the same probability (i.e., uniform distribution).
Here 220 is six times as likely as 400 to occur, so you can't use this
shortcut.

--
Karl Ove Hufthammer

==============================================================================
TOPIC: Poisson distribution test
http://groups.google.com/group/sci.stat.math/browse_thread/thread/6dabdbdd305700fc?hl=en
==============================================================================

== 1 of 4 ==
Date: Wed, Nov 14 2007 6:09 am
From: eq

Does anyone know of references (sites or literature) on different tests
of whether a sample comes from a Poisson distribution?

I performed the KS test in SPSS several times on different random samples of
cases from the total sample.
The probability of rejection is about 0.5, so I need a stronger test.

Thanks in advance.

== 2 of 4 ==
Date: Wed, Nov 14 2007 6:35 am
From: "David Jones"

eq wrote:
> Does anyone know of references (sites or literature) on different tests
> of whether a sample comes from a Poisson distribution?
>
> I performed the KS test in SPSS several times on different random samples of
> cases from the total sample.
> The probability of rejection is about 0.5, so I need a stronger test.
>
> Thanks in advance.

You say "I need a stronger test": to get the "best test" you need to think about which departures from the Poisson distribution matter most for your problem. There are various possibilities: (i) more (or fewer) zero values than a Poisson distribution would suggest; (ii) longer (or shorter) tails.

One test that is simple and easily interpretable is based on the coefficient of dispersion (ratio of variance to mean), but this may not have good power against the alternatives you are interested in.

You may find the idea of testing against a negative binomial alternative useful. Or you could look for other families of discrete distributions that have the Poisson as a special case.

There are even graphical approaches, such as looking at the estimated ratios (n+1)*Pr(N=n+1)/Pr(N=n), which should turn out to be roughly constant in n (equal to the mean) under a Poisson distribution, and for which particular patterns may suggest certain alternatives.

David Jones
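
As a rough illustration of the dispersion-based test mentioned above: under a Poisson null hypothesis the statistic (n - 1) * s^2 / mean is approximately chi-square with n - 1 degrees of freedom. The sketch below only computes the statistic; the function name and the sample counts are made up for the example, and the p-value would have to come from a chi-square table or a statistics package.

```python
from statistics import fmean, variance


def dispersion_test_statistic(counts):
    """Return (variance/mean ratio, (n-1)*ratio, degrees of freedom)."""
    n = len(counts)
    mean = fmean(counts)
    s2 = variance(counts)   # sample variance, divisor n - 1
    index = s2 / mean
    return index, (n - 1) * index, n - 1


# Hypothetical counts, for illustration only.
counts = [2, 3, 1, 4, 2, 0, 3, 2]
index, stat, dof = dispersion_test_statistic(counts)
print(index, stat, dof)
```

A ratio well above 1 points towards overdispersion (for example a negative binomial alternative), and a ratio well below 1 towards underdispersion.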

== 3 of 4 ==
Date: Wed, Nov 14 2007 6:54 am
From: Russell

On Nov 14, 9:09 am, eq <extraeq...@gmail.com> wrote:
> Does anyone know of references (sites or literature) on different tests
> of whether a sample comes from a Poisson distribution?
>
> I performed the KS test in SPSS several times on different random samples of
> cases from the total sample.
> The probability of rejection is about 0.5, so I need a stronger test.
>
> Thanks in advance.

You might try this. No guarantees it will solve
your problem.

Martin, R. L., "A Statistic Useful for Characterizing
Probability Distributions, with Application to Rain
Rate Data", J. Appl. Meteor., 28, 354 (1989)

Cheers,
Russell

== 4 of 4 ==
Date: Wed, Nov 14 2007 7:05 am
From: duncan smith

If you keep trying different tests you will most likely eventually find
one that results in a 'significant' p-value, regardless of whether the
null hypothesis is true or false. Also, if you keep testing different
subsamples you'll probably eventually find a subsample that leads to
rejection. Can you see why doing this is wrong?

Duncan

==============================================================================
TOPIC: Turn a uniform number to normal random numbers
http://groups.google.com/group/sci.stat.math/browse_thread/thread/054d911605199f7c?hl=en
==============================================================================

== 1 of 2 ==
Date: Wed, Nov 14 2007 9:16 am
From: Yves

Hi,

I read from Mark Joshi's Concept of Mathematical Finance pg 178 "..there is a simple method which gives reasonable, but not great, approximation is to simply add together 12 uniform variables and subtract 6. The results has correct mean, variance and third moment."

Could someone explain this idea? How can I find out more about this quick method?

Thanks.

== 2 of 2 ==
Date: Wed, Nov 14 2007 9:47 am
From: Gordon Sande

On 2007-11-14 08:16:02 -0400, Yves <sunder_1600@yahoo.com> said:

> Hi,
>
> I read from Mark Joshi's Concept of Mathematical Finance pg 178
> "..there is a simple method which gives reasonable, but not great,
> approximation is to simply add together 12 uniform variables and
> subtract 6. The results has correct mean, variance and third moment."
>
> Could someone explain this idea? How can I find out about quick method?
>
> Thanks.

The uniform on 0-1 has a variance of 1/12 and a mean of 1/2.
So the sum of 12 of them has a variance of 1 and a mean of 6.
Subtract 6 for a variance of 1 and a mean of 0 as wanted for
a standard normal. The result is symmetric so the third moment
(and all the odd moments as well!) will be zero.

A quick and dirty plausible approximation as promised. Notice that
there will be no values below -6 or above +6 at all. The approximation
is too short tailed for many other purposes. Remember quick and dirty
and not great were the words in the description.
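
The recipe is short enough to try out directly. This is a minimal sketch with an arbitrary seed and sample size; `approx_normal` is a name chosen for the example.

```python
import random
import statistics


def approx_normal(rng):
    """Sum of 12 uniform(0, 1) draws minus 6: mean 0, variance 1, range [-6, 6]."""
    return sum(rng.random() for _ in range(12)) - 6.0


rng = random.Random(12345)          # fixed seed so the run is repeatable
sample = [approx_normal(rng) for _ in range(50_000)]

mean = statistics.fmean(sample)
var = statistics.pvariance(sample)
print(mean, var, min(sample), max(sample))
```

The sample mean and variance should land close to 0 and 1, and no value can fall outside [-6, 6]. The fourth moment is where the approximation shows: the sum has excess kurtosis -0.1, whereas the normal distribution has 0, which is the short-tailedness described above.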

==============================================================================
TOPIC: Conditional Probability.
http://groups.google.com/group/sci.stat.math/browse_thread/thread/e22ead0b091bc3bf?hl=en
==============================================================================

== 1 of 1 ==
Date: Wed, Nov 14 2007 9:56 am
From: probability@farfara.org

What do you think of this?

http://www.farfara.org/

==============================================================================
TOPIC: wrong R-Squared value??
http://groups.google.com/group/sci.stat.math/browse_thread/thread/259e11ac412a3219?hl=en
==============================================================================

== 1 of 1 ==
Date: Wed, Nov 14 2007 10:07 am
From: jantunes

Hi all,

I'm fitting a linear regression to produce a trendline that can predict (more or less) some future data. The data are highly correlated (something like R=0.98).

This is what I do:
1) get 200 data points (x is a time series; y is CPU usage)
2) do linear regression based on those 200 points, resulting in some y'=a + bx
3) get R-squared (R^2=0.96) for the y'

Then, I want to validate that trendline/prediction by comparing it with more real data:
4) get more data points, past the 200 points (eg 10000)
5) get R-squared for the y' (this time against the new data)

The problem is that this new R-squared takes very strange values (depending on the formula used): either <0 (SSE/SST>1), >1 (SSR>SST), or near 0.99 (when in fact the trendline is not accurate).
As I said, I have already tried different ways of calculating the R-squared. They all give the same value in 3), but strange values in 5).

Am I making a wrong assumption here? I'm pretty sure the calculations are correct... How can I validate my trendlines (linear regression models)?

Thanks in advance!
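
The procedure in steps 1) to 5) can be sketched as below. This is only
an illustration under stated assumptions: the data are synthetic (a
made-up CPU-usage curve that rises linearly and then saturates at 100),
and the helper names fit_line, r_squared_two_ways and cpu are
hypothetical, not taken from the post. It computes R-squared both as
1 - SSE/SST and as SSR/SST, first on the 200 fitting points and then on
10000 later points.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y' = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared_two_ways(xs, ys, a, b):
    """Return (1 - SSE/SST, SSR/SST) for the line a + b*x on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    preds = [a + b * x for x in xs]
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ssr = sum((p - mean_y) ** 2 for p in preds)
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst, ssr / sst


rng = random.Random(0)  # arbitrary seed


def cpu(x):
    """Synthetic CPU usage (assumption): linear growth capped at 100."""
    return min(100.0, 10.0 + 0.1 * x) + rng.gauss(0.0, 1.0)


# Steps 1-3: fit on the first 200 points and score on the same points.
train_x = list(range(200))
train_y = [cpu(x) for x in train_x]
a, b = fit_line(train_x, train_y)
train_r2 = r_squared_two_ways(train_x, train_y, a, b)

# Steps 4-5: score the same line against 10000 later points.
new_x = list(range(200, 10200))
new_y = [cpu(x) for x in new_x]
new_r2 = r_squared_two_ways(new_x, new_y, a, b)

print("fit data:", train_r2)
print("new data:", new_r2)
```

On the fitting data the two formulas agree, because SST = SSE + SSR
holds for a least-squares line with an intercept evaluated on the
points it was fitted to. On new data that identity no longer holds, so
1 - SSE/SST can drop below 0 and SSR/SST can exceed 1. A squared
correlation between y and y' can also stay high when the line is
biased, since correlation ignores offset and scale. An error measure
on the new data, such as the root mean squared error, may be easier to
interpret for validation.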

==============================================================================
TOPIC: Call for Papers: The 2008 International Conference of Computational
Statistics and Data Engineering ICCSDE 2008
http://groups.google.com/group/sci.stat.math/browse_thread/thread/d977ac14ca3059d0?hl=en
==============================================================================

== 1 of 1 ==
Date: Wed, Nov 14 2007 10:55 am
From: wcecs_2008@iaeng.org

Call for Papers: The 2008 International Conference of Computational
Statistics and Data Engineering ICCSDE 2008
From: IAENG - International Association of Engineers
http://www.iaeng.org/WCE2008/ICCSDE2008.html

Important Dates:
Draft Paper Submission Deadline (extended): 6 March, 2008
Camera-Ready papers & Pre-registration Due: 31 March, 2008
WCE 2008: 2-4 July, 2008

ICCSDE'08 is held as part of the World Congress on
Engineering 2008 (WCE 2008). WCE 2008 is organized by the International
Association of Engineers (IAENG), a non-profit international
association for engineers and computer scientists. Our
congress committees comprise over two hundred and eighty
members, mainly research center heads, faculty deans,
department heads, professors, and research scientists from
universities such as Cambridge, MIT and Oxford.

The conference proceedings will be published by IAENG (ISBN:
978-988-98671-9-5) in hardcopy. The full-text congress proceedings will
be indexed in major database indexes so that they can be accessed
easily. The Technology Research Databases (TRD) of CSA (Cambridge
Scientific Abstracts), DBLP and Computer Science Bibliographies have
promised to index the print proceedings in advance of publication.
After publication, print copies will also be
sent to databases such as IEE INSPEC, Engineering Index (EI) and ISI
Thomson Scientific for indexing. The accepted papers will also be
considered for publication in special issues of the journal
Engineering Letters. Some participants may also be invited to submit
extended versions of their conference papers for consideration as book
chapters (soon after the conference).

The topics include, but are not limited to, the following:

Robust and Nonparametric Methods
Optimization
Applications in Economics and Finance
Computational Methods and Algorithms
Statistical Learning Methods
Machine Learning
Data Mining
Signal Engineering
Structural Equations
Mixture Models
Nonlinear Time Series
Financial Econometrics
Software and Tools for Statistical Computing
Matrix Computations
Structured Data Engineering
Statistical Analysis for Functional Data
Genomics

Partial Least Squares
Recursive Partitioning
Information Retrieval
Decision Support Systems
Text Mining
Decision Trees
Association Rules
Dimensional Modeling
Statistical Algorithm for Data Engineering
Data Warehousing
Pattern Matching
Rule-Based Algorithms
Clustering
Web Mining
Spatial Data Engineering

=========
Submission:

WCE 2008 is now accepting manuscript submissions. Prospective authors
are invited to submit a full draft paper (in any appropriate
style) to WCE{at}iaeng.org by 6 March, 2008. The submitted file can be
in MS Word, PS, or PDF format.

The first page of the draft paper should include:
(1) Title of the paper;
(2) Name, affiliation and e-mail address for each author;
(3) A maximum of 5 keywords of the paper.

Also, the name of the conference that the paper is being submitted to
should be stated in the email.

We aim to complete the review and notify authors of the result
for each submitted manuscript within one
month of its submission. The review process is intended to ensure the
quality of the papers accepted for the WCE congress. The conferences
have enjoyed a high reputation among many research colleagues (for
example, see http://cs.conference-ranking.net/ or
http://www.conference-ranking.com/).

=============
ICCSDE 2008 Conference Committee

Prof. Josu Arteche
Associate Professor, Department of Economia Aplicada III (Econometrics
and Statistics), Universidad del Pais Vasco Euskal Herriko
Unibertsitatea, Spain

Dr. Jitka Bartosova
Senior Lecturer of Statistics and Applied Mathematics, Department of
Information Management, University of Economics, Prague, Faculty of
Management, Czech Republic

Prof. Paula R. Bouzas
Associate Professor, Facultad de Farmacia, University of Granada,
Spain

Prof. Alexandre Carvalho
Head of the Spatial Studies Department, Institute of Applied Economics
Research (IPEA), Brazil

Prof. Ioannis C. Demetriou (co-chair; Ph.D. University of Cambridge)
Professor of mathematics and informatics and
Director of the Computing Lab., Department of Economics, University of
Athens, Greece

Prof. Adelaide Figueiredo
Assistant Professor, Porto School of Economics, Portugal

Prof. Giannoula Florou
Associate Professor, Accounting Department, Kavala Institute of
Technology, Greece

Dr. Kostas Giannopoulos
Head of Finance and Banking, The British University in Dubai, United
Arab Emirates

Prof. Rosa Eva Pruneda Gonzalez
Associate Professor, Mathematics Department, Civil Engineering
School, University of Castilla-La Mancha, Spain

Dr. Peter Hingley
Principal Administrator, Controlling Office, Directorate of Strategic
& Operational Controlling, European Patent Office, Germany

Prof. Salih Turan Katircioglu
Assistant Professor of Economics, Vice Chair,
Department of Banking and Finance, Eastern Mediterranean University,
North Cyprus
Editor: International Journal of Economic Perspectives

Dr. Hasan Al-Madfai
Senior Lecturer in Statistics, Mathematics for Application Field
Leader, Division of Mathematics and Statistics, University of
Glamorgan, UK

Prof. Stefano Mainardi
Associate Professor, Department of Informatics and Econometrics, Card.
S. Wyszynski University, Poland

Prof. S.G. Meintanis
Professor, Division of Statistics and Econometrics, Department of
Economics, National and Kapodistrian University of Athens, Greece

Prof. Annie Morin
Associate Professor, Institut de Recherche en Informatique et Systemes
Aleatoires, University of Rennes 1, France

Dr. Mourad Oussalah
Lecturer, Department of Electronic, Electrical and Computer
Engineering, The University of Birmingham, UK

Dr. Mori Nejad Vaezi-Nejad
Senior Lecturer in Communications Technology, Department of CCTM at
North Campus, London Metropolitan University, UK

Prof. Toshio Sakata
Professor, School of Design, Kyushu University, Japan

Dr. Germana Scepi
University of Naples, Italy

Dr. Agusti Solanas
Research Scientist, CRISES Research Group, Department of Computer
Science and Maths, Rovira i Virgili University, Spain

=============
WCE Congress Co-chairs

Prof. Alexander M. Korsunsky
Professor of Engineering Science
Dean, Trinity College
Department of Engineering Science, University of Oxford, UK

Prof. Andrew Hunter
Professor & Head of Department
Head of Vision and AI Research Group,
Dean of Research
Department of Computing and Informatics,
Lincoln University, UK

Prof. David WL Hukins, CPhys, FInstP, FIPEM, FRSE
Professor of Bio-medical Engineering
Head of Department of Mechanical & Manufacturing Engineering,
University of Birmingham, UK

Prof. Leonid Gelman (honorary co-chair)
Professor and Chair in Vibro-Acoustic Monitoring,
Chairman of COMADIT, British Institute of NDT,
Director, Centre of Vibro-Acoustics and Fatigue,
Department of Process and Systems Engineering, School of Engineering
Cranfield University, UK

Dr. Christopher John Hogger (honorary co-chair)
Senior Lecturer
Department of Computing
Imperial College London, UK

Prof. Darek J. Ceglarek (ICMEEM honorary co-chair)
Professor, International Manufacturing Centre, University of Warwick,
UK
Professor, Department of Industrial and Systems Engineering
The University of Wisconsin-Madison, USA
Fellow of CIRP; Associate Editor, IEEE Transactions on Automation
Science and Engineering, and ASME Transactions on Manufacturing
Science and Engineering

Dr. Stephen Payne (ICSBB honorary co-chair)
University Lecturer in BioMedical Engineering
Dean of Degrees Keble College, Head of Physiological Understanding
through Modelling, Monitoring and Analysis Group,
Department of Engineering Science, University of Oxford, UK

More details about the WCE 2008 can be found at:
http://www.iaeng.org/WCE2008/index.html
http://www.iaeng.net/WCE2008/index.html
http://www.iaeng.com/WCE2008/index.html

More details about the International Association of Engineers, and the
IAENG International Journal of Computer Science, and the IAENG
International Journal of Applied Mathematics can be found at:
http://www.iaeng.org/about_IAENG.html
http://www.iaeng.org/IJCS/index.html
http://www.iaeng.org/IJAM/index.html
The official journal web site of Engineering Letters at:
http://www.engineeringletters.com
Other Engineering Letters web sites at:
http://www.engineeringletters.net
http://www.engineeringletter.com

********
It will be highly appreciated if you can circulate these calls for
papers to your colleagues.


Statistika For Life

Wednesday, 14 November 2007

25 new messages in 7 topics - digest
