Statistika For Life: 6 new messages in 4 topics

sci.stat.math
http://groups.google.com/group/sci.stat.math?hl=en

sci.stat.math@googlegroups.com

Today's topics:

* linear regression and multicollinearity - 2 messages, 1 author
http://groups.google.com/group/sci.stat.math/browse_thread/thread/e22deec37b268ddf?hl=en
* slope test in linear regression with known intercept and known error
variance - 1 message, 1 author
http://groups.google.com/group/sci.stat.math/browse_thread/thread/74ab6da6d7a0f55e?hl=en
* Weibull Parameter Comparison and Test Power - 1 message, 1 author
http://groups.google.com/group/sci.stat.math/browse_thread/thread/503436b4d74920a8?hl=en
* Interesting (but difficult) question - calculating 'implied' probabilities
of a wager - 2 messages, 2 authors
http://groups.google.com/group/sci.stat.math/browse_thread/thread/245aac10e44fcef8?hl=en

==============================================================================
TOPIC: linear regression and multicollinearity
http://groups.google.com/group/sci.stat.math/browse_thread/thread/e22deec37b268ddf?hl=en
==============================================================================

== 1 of 2 ==
Date: Thurs, Nov 15 2007 1:23 pm
From: hberig@gmail.com

On Nov 11, 7:17 pm, David Winsemius <doe_s...@comcast.n0T> wrote:
> hbe...@gmail.com wrote in news:1194807316.448951.136200@d55g2000hsg.googlegroups.com:
>
> > I agree with you and R; in my second message I tried to explain
> > that R shows me the right results.
>
> > My difficulty is understanding how multicollinearity affects the
> > regression analysis, and how it relates both to computational problems
> > (such as an ill-posed X'X matrix) and to statistical problems (such as
> > a "small" change in the predictor data leading to very different
> > results in the model).
> > I know the definitions of eigenvalue, singular matrix and condition
> > number, but I'm trying to understand their implications for
> > statistics.
>
> Short answer following Myers, "Classical and Modern Regression with
> Applications". The variance of predictions is proportional to
> x_i'*inv(X'X)*x_i. Some of the diagonal elements of inv(X'X) will be
> large when multicollinearity exists. "Variance inflation factors" (VIF)
> are the diagonal elements of inv(X'X) when the predictors are centered
> and scaled, i.e. of the inverse of their correlation matrix.
>
> Perhaps reading one of these items (found with a search on VIF and
> "condition number") will help:
> <http://www.nd.edu/~rwilliam/stats2/l11.pdf>
> <http://www.masil.org/documents/multicollinearity.pdf>
>
> Adding CRAN to that search strategy to get r-specific hits produced:
> <http://www.sci.usq.edu.au/courses/STA3301/StudyBook.pdf>
> ..see pages 3.26-3.33
>
> And:
> <http://www-personal.umich.edu/~jwbowers/CLASSES/PS532f07/HANDOUTS/han...>
> ..see pages 2 and 14 and any pages in between that catch your eye.
>
> The condition number is obtained in R with:
> kappa(<matrix or model object>)
>
> --
> David Winsemius

Thanks, David!!!
I'll read your links; that's what I was looking for...
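
The VIF and condition-number ideas in David's reply can be tried out on a
toy case. What follows is only a minimal sketch in Python, assuming just two
predictors, so that the inverse of the 2x2 correlation matrix has a closed
form. The function names and the data are hypothetical, and the condition
number computed here is that of the correlation matrix, which is not
necessarily the value R's kappa() would report for the same data.

```python
import math

def pearson_r(x, y):
    """Sample correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Diagonal element of inv(R) for R = [[1, r], [r, 1]], i.e. 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

def condition_number_two_predictors(x1, x2):
    """Ratio of the eigenvalues 1 + |r| and 1 - |r| of the correlation matrix."""
    r = abs(pearson_r(x1, x2))
    return (1.0 + r) / (1.0 - r)

# Hypothetical toy data (not from the thread).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2_orthogonal = [2.0, -1.0, -2.0, -1.0, 2.0]  # uncorrelated with x1
x2_collinear = [-2.1, -0.9, 0.1, 0.9, 2.0]    # nearly a copy of x1

for label, x2 in [("orthogonal", x2_orthogonal), ("collinear", x2_collinear)]:
    print(label,
          "VIF:", vif_two_predictors(x1, x2),
          "condition number:", condition_number_two_predictors(x1, x2))
```

With orthogonal predictors both quantities equal 1; as the correlation
approaches 1 both grow without bound, which is the inflation of the diagonal
of inv(X'X) described above.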

== 2 of 2 ==
Date: Thurs, Nov 15 2007 1:28 pm
From: hberig@gmail.com

On Nov 12, 12:47 am, Richard Ulrich <Rich.Ulr...@comcast.net> wrote:
> On Sun, 11 Nov 2007 11:08:10 -0800, hbe...@gmail.com wrote:
> > On 7 nov, 14:54, Jack Tomsky <jtom...@ix.netcom.com> wrote:
> [snip, previous]
>
> > Thanks Jack!
> > I know about some ways of getting around the multicollinearity
> > problem (like eliminating variables, or taking principal components of
> > the predictor variables and using the rotated basis). I'm trying to
> > understand how variations in the data change the quality of the results.
> > With (1) orthogonal variables we have no multicollinearity problem,
>
> right.
>
> > and with (2) a (nearly perfect) linear dependence among the predictor
> > variables (like X4 approx= 2X1 + 3X2 - X3) we have a strong
> > multicollinearity problem, and a small change in the predictor data may
> > cause big changes in the model (is this the worst problem with
> > multicollinear predictors?);
>
> Is change of coefficients seen as a problem by you?

That's a good question!

> If two models give (almost) exactly the same predictions,
> then it is fair, by *most* standards, to say that they are the
> same model. Or, "The same model can be described in
> more than one way."
>
> The "problem" when two models exist with different coefficients
> depends on some other criterion. Does one replicate
> or cross-validate better than the other? - either of two solutions
> may work as well if the non-independence is mechanical (using B,
> C, and B/C). Having suppressor variables that are incidental seems
> to be a good clue that one particular equation will not be robust.
>
> "Sense" is another sort of criterion. If you have highly correlated
> variables, it can be plain silly to pretend that you have coefficients
> that are worth "interpreting" for their nominal values. I like to
> combine the variables where it makes sense to combine them,
> leaving pretty good independence, so that I *can* make sense
> of coefficients. But you cannot start out by "making sense" when
> you read a set of correlated partial regression coefficients, unless
> you are already aware of which outcomes are effectively the same.
>
> > between the extreme cases (1) and (2) I'm trying to
> > visualize in some way how the predictors data affects the model.
> > Again, Thanks for answer!
>
> --
> Rich Ulrich, wpi...@pitt.edu http://www.pitt.edu/~wpilib/index.html

I agree, some decisions about removing or keeping variables depend on
sense and the problem domain...

Thanks Rich!!!
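
Both points in this exchange (a small change in nearly collinear predictors
can move the coefficients a lot, while the predictions barely move) can be
seen in a toy fit. The following is a hedged sketch in Python, assuming a
two-predictor model without intercept so that the normal equations can be
solved in closed form. The data and function names are hypothetical and are
not taken from the thread.

```python
def fit_two_predictors(x1, x2, y):
    """Least squares for y = b1*x1 + b2*x2 (no intercept) via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # close to 0 when x1 and x2 are nearly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def fitted(coef, x1, x2):
    """Fitted values b1*x1 + b2*x2 for each observation."""
    return [coef[0] * a + coef[1] * b for a, b in zip(x1, x2)]

# Hypothetical data: x2 is nearly a copy of x1, and y is roughly 2*x1.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2_a = [-2.1, -0.9, 0.1, 0.9, 2.0]
x2_b = [-2.0, -1.1, 0.1, 1.0, 2.0]  # the same predictor, slightly perturbed
y = [-4.0, -2.1, 0.2, 1.9, 4.1]

coef_a = fit_two_predictors(x1, x2_a, y)
coef_b = fit_two_predictors(x1, x2_b, y)
fit_a = fitted(coef_a, x1, x2_a)
fit_b = fitted(coef_b, x1, x2_b)

max_coef_shift = max(abs(p - q) for p, q in zip(coef_a, coef_b))
max_fit_shift = max(abs(p - q) for p, q in zip(fit_a, fit_b))

print("coefficients A:", coef_a)
print("coefficients B:", coef_b)
print("largest coefficient change:", max_coef_shift)
print("largest fitted-value change:", max_fit_shift)
```

In this toy data the individual coefficients change far more than the fitted
values do, while the sum b1 + b2 stays close to 2. That is one way to read
Rich's remark that the same model can be described in more than one way.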

==============================================================================
TOPIC: slope test in linear regression with known intercept and known error
variance
http://groups.google.com/group/sci.stat.math/browse_thread/thread/74ab6da6d7a0f55e?hl=en
==============================================================================

== 1 of 1 ==
Date: Thurs, Nov 15 2007 1:26 pm
From: Jack Tomsky

> On Nov 15, 3:38 pm, Jack Tomsky
> <jtom...@ix.netcom.com> wrote:
> > > On Nov 15, 2:27 pm, Jack Tomsky
> > > <jtom...@ix.netcom.com> wrote:
> > > > > Dear Forum,
> >
> > > > > Suppose the following model
> > > > > Y_ij=1+beta*X_i+eps_ij
> > > > > with j=1,2,...n_i and i=1,2,3,4.
> > > > > the eps_ij are iid standard normal rvs
> >
> > > > > The goal is to test
> > > > > Ho:beta =<0
> > > > > vs. H1:beta >0
> >
> > > > > Question 1:
> > > > > Consider the first following procedure:
> > > > > Run a two-sample test based only on Y_1j, j=1,...,100
> > > > > (sample 1) and Y_4j, j=1,...,100 (sample 2). Here, n1=n4=100,
> > > > > n2=n3=0.
> > > > > Find the 0.05-level test of this procedure.
> >
> > > > > I am thinking that I can use a t test. The test would
> > > > > reject if
> >
> > > > > (betaols-0)/s(betaols)>t_{200-1}(0.95)
> >
> > > > > where betaols=SSY/SSX is the OLS estimate of beta and
> > > > > s(betaols) is computed as usual with the MSE. Should I be using
> > > > > a t test? It is given that the eps_ij are standard normal, so I
> > > > > could use a normal test, right? In this case, the test would be
> >
> > > > > (betaols-0)*sqrt{SSX}/1>z(0.95)
> >
> > > > > Question 2:
> > > > > Take n replications at each x_i (n1=n2=n3=n4=n).
> > > > > Obtain the MLE and base the test on that estimator.
> > > > > Find the 0.05-level test of this procedure.
> >
> > > > > I am thinking that the distribution of betamle is
> > > > > Normal(beta, sigma^2/SSX). So I would run the following normal
> > > > > test:
> > > > > (betamle-0)*sqrt(SSX)/1>z(0.95)
> >
> > > > > where betamle is SSY/SSX, but this time the SSX and
> > > > > SSY are different from those of question 1.
> >
> > > > > I greatly appreciate your help. In a third question,
> > > > > I need to compare the power of each of the tests, so I will
> > > > > need to express SSY(question2) as a function of SSY(question1),
> > > > > and SSX(question2) as a function of SSX(question1). Can anybody
> > > > > help me do that?
> >
> > > > You can use all the data. Let Zij = Yij-1.
> >
> > > > Then the LS estimate of beta is
> >
> > > > betahat = Sum[(Xi)Sum(Zij)]/Sum(Ni*Xi^2)
> >
> > > > where the sums go from i = 1, ..., 4 and j = 1, ..., Ni.
> >
> > > > The variance of betahat is
> >
> > > > Var(betahat) = 1/Sum(Ni*Xi^2)
> >
> > > > Then, when beta = 0,
> >
> > > > betahat/Sqrt(Var(betahat)) ~ N(0,1).
> >
> > > > Jack
> >
> > > Many thanks for your reply. Is there a book or a paper that
> > > would be a good reference for this problem?
> >
> > > Many thanks for your help.
> >
> > All I did was to put it into the form of a general linear model,
> > Z = X*beta + eps, where Z is a column vector of Z_11, ..., Z_4,N4,
> > X is a column vector of X1, ..., X1, ..., X4, ..., X4 and beta is a
> > scalar.
> >
> > Then the LS estimate of beta is
> >
> > betahat = X'Z/X'X.
> >
> > Algebraic manipulations reduce it to the form I gave.
> >
> > Since the covariance matrix of Z is given as the identity matrix,
> >
> > Var(betahat) = X'X/(X'X)^2 = 1/(X'X).
> >
> > Hope this helps.
> >
> > Jack
>
> I understand, many thanks!
>
> Question 2 asks for a test based on the MLE. I can say that the MLE
> is just the same as the OLS, right?
>
> I greatly appreciate your help.

Yes, under normality, the MLE and OLS of the means are the same.

Jack
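
The estimator and z test given earlier in this thread, together with the
power comparison raised as a third question, can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the design
points x = 1, 2, 3, 4 and the choice n = 50 per point for the balanced design
are hypothetical (the thread gives neither), and the function names are made
up for this sketch. The intercept is taken as known (1) and the error
variance as known (1), as in the original question.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def slope_estimate(x, y_groups, intercept=1.0):
    """betahat = Sum_i[X_i * Sum_j Z_ij] / Sum_i(N_i * X_i^2), with Z_ij = Y_ij - intercept.

    x holds the design points; y_groups[i] holds the responses observed at x[i]
    (an empty list means no observations were taken at that point).
    Returns betahat and Sum_i(N_i * X_i^2), which is 1 / Var(betahat).
    """
    num = sum(xi * sum(y - intercept for y in ys) for xi, ys in zip(x, y_groups))
    info = sum(len(ys) * xi ** 2 for xi, ys in zip(x, y_groups))
    return num / info, info

def z_test(x, y_groups, alpha=0.05, intercept=1.0):
    """One-sided test of H0: beta <= 0 against H1: beta > 0 with known unit variance."""
    betahat, info = slope_estimate(x, y_groups, intercept)
    z = betahat * sqrt(info)  # betahat / sqrt(Var(betahat))
    return z, z > STD_NORMAL.inv_cdf(1 - alpha)

def power(beta, x, n, alpha=0.05):
    """P(reject H0) at a given beta; n[i] is the number of observations at x[i]."""
    info = sum(ni * xi ** 2 for xi, ni in zip(x, n))
    return 1 - STD_NORMAL.cdf(STD_NORMAL.inv_cdf(1 - alpha) - beta * sqrt(info))

# Hypothetical design points; the thread does not state the x values.
x = [1.0, 2.0, 3.0, 4.0]
design_1 = [100, 0, 0, 100]  # question 1: n1 = n4 = 100, n2 = n3 = 0
design_2 = [50, 50, 50, 50]  # question 2 with n = 50 (assumed, same total of 200)

for beta in (0.0, 0.05, 0.1):
    print(beta, power(beta, x, design_1), power(beta, x, design_2))
```

Under these assumptions the two tests differ only through Sum(Ni*Xi^2), so
comparing their power reduces to comparing that one quantity for the two
designs.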

==============================================================================
TOPIC: Weibull Parameter Comparison and Test Power
http://groups.google.com/group/sci.stat.math/browse_thread/thread/503436b4d74920a8?hl=en
==============================================================================

== 1 of 1 ==
Date: Thurs, Nov 15 2007 5:07 pm
From: David Winsemius

info@goodstats.biz wrote in
news:1195069250.638597.220260@v2g2000hsf.googlegroups.com:

> Does anyone have information or expertise on the comparison of
> Weibull location parameters from possibly different underlying Weibull
> distributions (2 parameter)?

Do a search on "weibull regression" + "two parameter model"?

> Also, any resources or information on calculation of the sample size
> for such a comparison needed to achieve a given test power?

Do a search on "weibull regression" + (power OR "sample size")?

You might say "but most of these results are behind publishers' barriers to
access". True, but you told us nothing about your situation (academic
library privileges or no) or your reasons for posing these questions.

<http://www.catb.org/~esr/faqs/smart-questions.html>

==============================================================================
TOPIC: Interesting (but difficult) question - calculating 'implied'
probabilities of a wager
http://groups.google.com/group/sci.stat.math/browse_thread/thread/245aac10e44fcef8?hl=en
==============================================================================

== 1 of 2 ==
Date: Thurs, Nov 15 2007 6:14 pm
From: "Pavel314"

"Pavel314" <Pavel314@NOSPAM.comcast.net> wrote in message
news:SoCdnbzFjODOpaHanZ2dnUVZ_sOrnZ2d@comcast.com...
>
> "Anonymous" <no.reply@here.com> wrote in message
> news:Ou6dnXpmPox49KTanZ2dnUVZ8s-qnZ2d@bt.com...
>> Here is a hypothetical scenario.
>>
>> A friend and I decide to visit the local county fair. There is a
>> competition to see who can throw a heavy ball the highest. I bet my
>> friend that I can throw the heavy metal ball more than X metres high.
>>
>> He in turn, says "I'll pay you a dollar for every Y centimeters that you
>> can throw the ball above X meters - BUT to make it worth my while, you
>> have to PAY ME Z dollars for me to take on the bet".
>>
>> From the above, my friend has calculated (implicitly from the wager he
>> has made), the probability of me being able to throw the ball above X
>> metres. How may I calculate the probability, so I can work out the
>> (implied) odds of my success?
>>
>> What methodology/logic/technique may I use to calculate the probability
>> of me throwing the ball above X metres (based on the wager given above)?

I worked on this at lunchtime today. It seems the missing link is the
confidence level you and your friend are placing on your bets. Hopefully,
someone more skilled in statistical reasoning will come to our aid.

Paul

== 2 of 2 ==
Date: Thurs, Nov 15 2007 6:59 pm
From: Anonymous

Pavel314 wrote:

> "Pavel314" <Pavel314@NOSPAM.comcast.net> wrote in message
> news:SoCdnbzFjODOpaHanZ2dnUVZ_sOrnZ2d@comcast.com...
>
>>"Anonymous" <no.reply@here.com> wrote in message
>>news:Ou6dnXpmPox49KTanZ2dnUVZ8s-qnZ2d@bt.com...
>>
>>>Here is a hypothetical scenario.
>>>
>>>A friend and I decide to visit the local county fair. There is a
>>>competition to see who can throw a heavy ball the highest. I bet my
>>>friend that I can throw the heavy metal ball more than X metres high.
>>>
>>>He in turn, says "I'll pay you a dollar for every Y centimeters that you
>>>can throw the ball above X meters - BUT to make it worth my while, you
>>>have to PAY ME Z dollars for me to take on the bet".
>>>
>>>From the above, my friend has calculated (implicitly from the wager he
>>>has made), the probability of me being able to throw the ball above X
>>>metres. How may I calculate the probability, so I can work out the
>>>(implied) odds of my success?
>>>
>>>What methodology/logic/technique may I use to calculate the probability
>>>of me throwing the ball above X metres (based on the wager given above)?
>
>
>
> I worked on this at lunchtime today. It seems the missing link is the
> confidence level you and your friend are placing on your bets. Hopefully,
> someone more skilled in statistical reasoning will come to our aid.
>
> Paul
>
>

Hi Paul, thanks for your feedback. Incidentally, I agree with you - I am
thinking along the same lines - it may not be possible to calculate the
odds without knowing the distribution of the height of the throws.
I'll do some more thinking ...
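
The suspicion above, that the wager alone does not determine the probability,
can be checked numerically. The sketch below is in Python and rests on
assumptions that are not in the thread: the throw height H is taken to be
normally distributed, the bet is priced as "fair" when Z equals the expected
payout E[max(H - X, 0)] * 100 / Y (H and X in metres, Y in centimetres per
dollar), and the numbers X = 5, Y = 10, Z = 3 are hypothetical. It finds two
different normal distributions that make the same wager fair yet give
different values of P(H > X).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def expected_excess(mu, sigma, x):
    """E[max(H - x, 0)] in metres for H ~ Normal(mu, sigma)."""
    d = (mu - x) / sigma
    return (mu - x) * STD_NORMAL.cdf(d) + sigma * STD_NORMAL.pdf(d)

def fair_premium(mu, sigma, x, y_cm):
    """Expected payout in dollars: one dollar per y_cm centimetres above x metres."""
    return expected_excess(mu, sigma, x) * 100.0 / y_cm

def prob_above(mu, sigma, x):
    """P(H > x) for H ~ Normal(mu, sigma)."""
    return 1.0 - NormalDist(mu, sigma).cdf(x)

def mean_matching_premium(z, sigma, x, y_cm, lo=-50.0, hi=50.0):
    """Bisection for the mean mu at which the fair premium equals z.

    The fair premium increases with mu, so bisection on [lo, hi] is enough.
    """
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if fair_premium(mid, sigma, x, y_cm) < z:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical wager terms (not from the thread).
X_METRES, Y_CM, Z_DOLLARS = 5.0, 10.0, 3.0

SIGMA_A, SIGMA_B = 0.5, 1.5  # two assumed spreads of the throw height
mu_a = mean_matching_premium(Z_DOLLARS, SIGMA_A, X_METRES, Y_CM)
mu_b = mean_matching_premium(Z_DOLLARS, SIGMA_B, X_METRES, Y_CM)
p_a = prob_above(mu_a, SIGMA_A, X_METRES)
p_b = prob_above(mu_b, SIGMA_B, X_METRES)

print("distribution A: mean", mu_a, "sd", SIGMA_A, "P(H > X) =", p_a)
print("distribution B: mean", mu_b, "sd", SIGMA_B, "P(H > X) =", p_b)
```

Both distributions price the wager at the same Z, so under this reading the
terms X, Y and Z fix only the expected payout. Some further assumption about
the shape or spread of the height distribution is needed before a single
implied probability comes out.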

==============================================================================

Friday, 16 November 2007
