Quests about Random Variables

1. Distance to the Hogwarts Express

You are tracking the distance to the Hogwarts Express. A magical instrument reports it's 100 leagues away. Before the reading, your belief about the distance $D$ was a Gaussian, $D \sim N(\mu = 98, \sigma^2 = 16)$. The instrument's reading is the true distance plus Gaussian noise, $N(0, 4)$.

a. What is the PDF of your prior belief of the train's true distance?

The probability density function (PDF) of the prior belief about the train's true distance is the Gaussian density with $\mu = 98$ and $\sigma^2 = 16$, that is $f(x) = \frac{1}{4\sqrt{2\pi}}\, e^{-\frac{(x-98)^2}{32}}$.

b. What is the probability density of seeing a reading of 100 leagues, given the true distance is t?

Let $R$ be the random variable for the instrument's reading $r$ in leagues, in this case $r = 100$. In short, we have to compute the conditional density $P(R = 100 \mid D = t)$, where $R$, as stated in the text, is the true distance plus Gaussian noise $N(0, 4)$. Given $D = t$, the reading is therefore distributed as $t + N(0, 4) = N(t, 4)$. Then:

$$P(R = 100 \mid D = t) = \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{(100-t)^2}{8}}$$

c. What is the PDF of your posterior belief (after the reading) of the train's true distance? (You can leave a constant and don't need to simplify.)

In short, we have to compute $P(D = t \mid R = 100)$. We can calculate this using Bayes' theorem:

$$P(D = t \mid R = 100) = \frac{P(R = 100 \mid D = t) \times P(D = t)}{P(R = 100)}$$

Since the question allows us to leave a constant:

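$$P(D = t \mid R = 100) = C \times P(R = 100 \mid D = t) \times P(D = t)$$

We have already found $P(D = t)$ in part (a) and $P(R = 100 \mid D = t)$ in part (b), so we only need to substitute:

$$P(D = t \mid R = 100) = C \times \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{(100-t)^2}{8}} \times \frac{1}{4\sqrt{2\pi}}\, e^{-\frac{(t-98)^2}{32}}$$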

We can combine the constant terms into a single proportionality constant:

$$P(D = t \mid R = 100) = C \times e^{-\frac{(100-t)^2}{8}} \times e^{-\frac{(t-98)^2}{32}} = C \times e^{-\frac{(100-t)^2}{8} - \frac{(t-98)^2}{32}}$$
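As a numerical sanity check of part (c), the sketch below multiplies the likelihood from part (b) by the prior from part (a) on a grid and normalizes the result. The grid bounds (60 to 140 leagues), the number of steps, and all function and variable names are illustrative assumptions, not part of the exercise.

```python
import math

def prior_pdf(t):
    """Prior density of D ~ N(98, 16), from part (a)."""
    return math.exp(-(t - 98) ** 2 / 32) / (4 * math.sqrt(2 * math.pi))

def likelihood(t, reading=100.0):
    """Density of the reading given true distance t, R | D=t ~ N(t, 4), from part (b)."""
    return math.exp(-(reading - t) ** 2 / 8) / (2 * math.sqrt(2 * math.pi))

# Assumed integration grid: 60..140 leagues holds essentially all the mass.
LOW, HIGH, STEPS = 60.0, 140.0, 8000
step = (HIGH - LOW) / STEPS
grid = [LOW + i * step for i in range(STEPS + 1)]

# Unnormalized posterior: likelihood times prior at each grid point.
unnormalized = [likelihood(t) * prior_pdf(t) for t in grid]

# P(R=100) is the integral of the unnormalized posterior (Riemann sum here),
# so the constant C in the text corresponds to 1 / evidence.
evidence = sum(unnormalized) * step
posterior = [u / evidence for u in unnormalized]

posterior_mean = sum(t * p for t, p in zip(grid, posterior)) * step
posterior_var = sum((t - posterior_mean) ** 2 * p
                    for t, p in zip(grid, posterior)) * step

print(f"C = 1 / P(R=100) = {1 / evidence:.4f}")
print(f"posterior mean = {posterior_mean:.3f}, variance = {posterior_var:.3f}")
```

Completing the square in the exponent gives a Gaussian posterior with mean 99.6 and variance 3.2, so the grid estimates should land very close to those values.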

2. Owl arrivals at the Owlery

On average, 5.5 owls arrive at the Owlery per minute. What is the probability that:

a. More than 7 owls will arrive in the next minute?

First, we decide to model the owl arrivals using a Poisson distribution, since we're looking at the number of events (owl arrivals) occurring in a fixed interval of time, given a known average rate of occurrence, and assuming the events happen independently.
The Probability Mass Function (PMF) for a Poisson distribution is:

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

In this case, we want to compute $P(X > 7) = 1 - P(X \le 7)$, where $\lambda = 5.5$.
Let's do that:

$$P(X > 7) = 1 - P(X \le 7) = 1 - \sum_{k=0}^{7} \frac{e^{-5.5}\,(5.5)^k}{k!} \approx 0.19051$$

b. More than 13 owls will arrive in the next 2 minutes?

The idea is the same as before, but note that the rate is 5.5 owls per minute; therefore, over 2 minutes, $\lambda = 5.5 \times 2 = 11$. As before:

$$P(X > 13) = 1 - P(X \le 13) = 1 - \sum_{k=0}^{13} \frac{e^{-11}\,(11)^k}{k!} \approx 0.21870$$

c. More than 15 owls will arrive in the next 3 minutes?

The idea is the same as before, but note that the rate is 5.5 owls per minute; therefore, over 3 minutes, $\lambda = 5.5 \times 3 = 16.5$. As before:

$$P(X > 15) = 1 - P(X \le 15) = 1 - \sum_{k=0}^{15} \frac{e^{-16.5}\,(16.5)^k}{k!} \approx 0.58198$$
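The three tail probabilities above can be reproduced with a few lines of standard-library Python. This is a minimal sketch; the helper names `poisson_pmf` and `poisson_tail` are illustrative choices, not something the exercise prescribes.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def poisson_tail(threshold, lam):
    """P(X > threshold) = 1 - sum_{k=0}^{threshold} P(X = k)."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(threshold + 1))

RATE_PER_MINUTE = 5.5

# The Poisson parameter scales with the length of the interval.
p_a = poisson_tail(7, RATE_PER_MINUTE * 1)    # more than 7 owls in 1 minute
p_b = poisson_tail(13, RATE_PER_MINUTE * 2)   # more than 13 owls in 2 minutes
p_c = poisson_tail(15, RATE_PER_MINUTE * 3)   # more than 15 owls in 3 minutes

print(f"(a) {p_a:.5f}  (b) {p_b:.5f}  (c) {p_c:.5f}")
```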

3. Finding the median of a random variable

The median of a continuous random variable (like the height of a gnome) having cumulative distribution function F is the value m such that F(m)=0.5. Find the median of X (in terms of distribution parameters) if:

a. $X \sim \mathrm{Uni}(a, b)$ (Uniform distribution, like the spread of Floo powder).

For a uniform distribution, the CDF is given by:

$$F(x) = \frac{x - a}{b - a} \quad \text{for } a \le x \le b$$

Setting $F(m) = 0.5$:

$$\frac{m - a}{b - a} = 0.5 \implies m - a = 0.5\,(b - a)$$

so

$$m = a + 0.5\,(b - a) = \frac{a + b}{2}$$

b. $X \sim N(\mu, \sigma^2)$ (Normal distribution, like scores on the O.W.L.s).

A normal distribution is symmetric about its mean, thus the median is equal to the mean:

$$m = \mu$$
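Both medians can be checked numerically by solving $F(m) = 0.5$ with bisection. The sketch below is only an illustration: the parameter values ($a = 2$, $b = 10$, $\mu = 3$, $\sigma = 1.5$) and the helper names are assumptions chosen for the example.

```python
import math

def median_by_bisection(cdf, low, high, tol=1e-10):
    """Find m with cdf(m) = 0.5, assuming cdf is continuous and
    nondecreasing on [low, high] with cdf(low) <= 0.5 <= cdf(high)."""
    while high - low > tol:
        mid = (low + high) / 2
        if cdf(mid) < 0.5:
            low = mid
        else:
            high = mid
    return (low + high) / 2

def uniform_cdf(x, a, b):
    """CDF of Uni(a, b), clamped to [0, 1] outside the support."""
    return min(max((x - a) / (b - a), 0.0), 1.0)

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Example parameters (assumed for illustration only).
a, b = 2.0, 10.0
mu, sigma = 3.0, 1.5

uniform_median = median_by_bisection(lambda x: uniform_cdf(x, a, b), a, b)
normal_median = median_by_bisection(
    lambda x: normal_cdf(x, mu, sigma), mu - 10 * sigma, mu + 10 * sigma
)

print(uniform_median, (a + b) / 2)  # numerical vs. closed form
print(normal_median, mu)
```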

4. A visit to the Hogwarts library

Let $X_i$ be the number of students visiting the Hogwarts library in week $i$, where $X_i \sim N(2200, 52900)$. Assume the weekly visits $X_i$ are independent.

a. What is the probability that the total number of visitors in the next two weeks exceeds 5000?

First, define the total number of visitors over two weeks as $S = X_1 + X_2$. Since the sum of independent normal variables is normal, $S \sim N(2200 + 2200,\ 52900 + 52900) = N(4400, 105800)$.

It remains to compute $P(S > 5000)$, either with Python or by integrating the density. Standardizing, $P(S > 5000) = 1 - \Phi\!\left(\frac{5000 - 4400}{\sqrt{105800}}\right) \approx 1 - \Phi(1.845) \approx 0.0325$.
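The Python route can be sketched with `statistics.NormalDist` from the standard library, which supports adding two independent normal variables. The variable names below are illustrative.

```python
import math
from statistics import NormalDist

# Weekly visits: N(2200, 52900), i.e. standard deviation sqrt(52900) = 230.
week_1 = NormalDist(mu=2200, sigma=math.sqrt(52900))
week_2 = NormalDist(mu=2200, sigma=math.sqrt(52900))

# Adding two NormalDist objects treats them as independent variables:
# the means add and the variances add.
two_weeks = week_1 + week_2

p_exceeds_5000 = 1 - two_weeks.cdf(5000)

print(f"S ~ N({two_weeks.mean:.0f}, {two_weeks.variance:.0f})")
print(f"P(S > 5000) = {p_exceeds_5000:.4f}")
```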

b. What is the probability that the weekly number of visitors exceeds 2000 in at least 2 of the next 3 weeks?

Let's first compute the probability that the number of visitors exceeds 2000 for any given week:

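$$P(X_i > 2000) = 1 - \int_{-\infty}^{2000} \frac{1}{\sqrt{2\pi \cdot 52900}} \exp\!\left(-\frac{(x - 2200)^2}{2 \cdot 52900}\right) dx \approx 0.8077$$

Now let $Y$ be the number of weeks (out of 3) with more than 2000 visitors.
Because the weeks are independent and each exceeds 2000 with the same probability, $Y$ follows a binomial distribution with 3 trials and $p \approx 0.8077$.

Let's compute it:

$$P(Y \ge 2) = P(Y = 2) + P(Y = 3) = \binom{3}{2}(0.8077)^2(1 - 0.8077) + \binom{3}{3}(0.8077)^3 = 3 \cdot (0.8077)^2 \cdot 0.1923 + (0.8077)^3 \approx 0.9033$$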
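A short standard-library sketch of this two-step calculation, first the per-week probability and then the binomial tail. The helper `binomial_pmf` and the variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

# Per-week probability of more than 2000 visitors, X_i ~ N(2200, 230^2).
p_week = 1 - NormalDist(mu=2200, sigma=230).cdf(2000)

def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ Bin(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# At least 2 of the next 3 weeks exceed 2000 visitors.
p_at_least_two = sum(binomial_pmf(k, 3, p_week) for k in (2, 3))

print(f"P(X_i > 2000) = {p_week:.4f}")
print(f"P(Y >= 2)     = {p_at_least_two:.4f}")
```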

5. Distribution of magical power levels of three Hogwarts students

Let $X$, $Y$, and $Z$ be independent random variables representing the magical power levels of three Hogwarts students, where $X \sim N(\mu_1, \sigma_1^2)$ (Gryffindor), $Y \sim N(\mu_2, \sigma_2^2)$ (Hufflepuff), and $Z \sim N(\mu_3, \sigma_3^2)$ (Ravenclaw).

a. Let A=X+Y. What is the distribution of the combined power A?

For any two independent normal random variables $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, the sum is again normal: $A = X + Y \sim N(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2)$.

b. Let $B = 5X + 2$. What is the distribution of $B$ (perhaps after a power-enhancing charm)?

If $X$ is normal with $X \sim N(\mu_1, \sigma_1^2)$ and $B$ is a linear transform of $X$, $B = aX + b$, then $B$ is also normal, with $B \sim N(a\mu_1 + b,\ a^2\sigma_1^2)$. Here $a = 5$ and $b = 2$, so $B \sim N(5\mu_1 + 2,\ 25\sigma_1^2)$.

c. Let $C = aX - bY + c^2 Z$, where $a$, $b$, and $c$ are real-valued constants representing spell modifiers. What is the distribution (and parameters) for $C$? Show how you derived your answer.

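As before, $C$ is normal, because it is a linear combination of independent normal variables. Let's start with its mean, which follows from linearity of expectation: $E[C] = a\mu_1 - b\mu_2 + c^2\mu_3$.

Let's now calculate the variance. Remember that variance is not linear like expectation: $\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$, so the minus sign in front of $bY$ disappears. Since $X$, $Y$, and $Z$ are independent, the variances of the three terms add.
Then $\mathrm{Var}(C) = a^2\sigma_1^2 + b^2\sigma_2^2 + (c^2)^2\sigma_3^2 = a^2\sigma_1^2 + b^2\sigma_2^2 + c^4\sigma_3^2$.

Putting it all together: $C \sim N(a\mu_1 - b\mu_2 + c^2\mu_3,\ a^2\sigma_1^2 + b^2\sigma_2^2 + c^4\sigma_3^2)$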
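The formula for $C$ can be checked against `statistics.NormalDist`, whose arithmetic operators implement exactly these rules for independent normal variables (scaling by a constant, adding and subtracting independent normals). The numeric values of the means, standard deviations, and modifiers below are arbitrary assumptions picked for the example.

```python
from statistics import NormalDist

# Assumed example parameters (not from the exercise).
mu1, sigma1 = 1.0, 2.0    # X ~ N(mu1, sigma1^2)
mu2, sigma2 = -3.0, 0.5   # Y ~ N(mu2, sigma2^2)
mu3, sigma3 = 4.0, 1.5    # Z ~ N(mu3, sigma3^2)
a, b, c = 2.0, 3.0, 0.5   # spell modifiers

X = NormalDist(mu1, sigma1)
Y = NormalDist(mu2, sigma2)
Z = NormalDist(mu3, sigma3)

# NormalDist treats the operands as independent: C = aX - bY + c^2 Z.
C = X * a - Y * b + Z * (c ** 2)

# Closed-form parameters derived in part (c).
expected_mean = a * mu1 - b * mu2 + c ** 2 * mu3
expected_var = a ** 2 * sigma1 ** 2 + b ** 2 * sigma2 ** 2 + c ** 4 * sigma3 ** 2

print(C.mean, expected_mean)
print(C.variance, expected_var)
```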

6. A strange probability density function

The joint probability density function of continuous random variables $X$ (skill in Potions) and $Y$ (skill in Charms) is given by $f_{X,Y}(x, y) = c\,\frac{y}{x}$ where $0 < y < x < 1$.

a. What is the value of c for this to be a valid probability density function?

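To be a valid PDF, the total integral over the region must equal 1:

$$\int_{x=0}^{1} \int_{y=0}^{x} c \cdot \frac{y}{x}\, dy\, dx = 1$$

Let's compute the inner integral first:

$$\int_{y=0}^{x} \frac{y}{x}\, dy = \frac{1}{x} \int_{0}^{x} y\, dy = \frac{1}{x} \cdot \left[\frac{y^2}{2}\right]_{0}^{x} = \frac{1}{x} \cdot \frac{x^2}{2} = \frac{x}{2}$$

Now substitute into the outer integral:

$$\int_{x=0}^{1} c \cdot \frac{x}{2}\, dx = c \cdot \int_{0}^{1} \frac{x}{2}\, dx = c \cdot \left[\frac{x^2}{4}\right]_{0}^{1} = c \cdot \frac{1}{4}$$

Set this equal to 1:

$$c \cdot \frac{1}{4} = 1 \implies c = 4$$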
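The value of $c$ can be confirmed numerically by integrating $y/x$ over the triangle $0 < y < x < 1$ with a midpoint rule. This is a rough sketch; the function name `integrate_over_triangle` and the grid size are illustrative assumptions.

```python
def integrate_over_triangle(f, n=200):
    """Midpoint-rule estimate of the integral of f(x, y) over 0 < y < x < 1."""
    total = 0.0
    dx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * dx          # midpoint of the i-th x-slice
        dy = x / n                  # y runs from 0 to x within this slice
        inner = sum(f(x, (j + 0.5) * dy) for j in range(n)) * dy
        total += inner * dx
    return total

# Integral of the density without the constant; c must be its reciprocal.
unscaled_mass = integrate_over_triangle(lambda x, y: y / x)
c = 1.0 / unscaled_mass

print(f"integral of y/x = {unscaled_mass:.6f}, so c = {c:.6f}")
```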

b. Are Potions skill (X) and Charms skill (Y) independent? Explain.

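Two variables $X$ and $Y$ are independent if, for all $x$ and $y$,

$$f_{X,Y}(x, y) = f_X(x) \cdot f_Y(y)$$

Let's first compute the marginal density function of $X$, for $0 < x < 1$:

$$f_X(x) = \int_{y=0}^{x} f_{X,Y}(x, y)\, dy = \int_{0}^{x} 4 \cdot \frac{y}{x}\, dy = \frac{4}{x} \cdot \int_{0}^{x} y\, dy = \frac{4}{x} \cdot \frac{x^2}{2} = 2x$$

Now the marginal density function of $Y$, for $0 < y < 1$:

$$f_Y(y) = \int_{x=y}^{1} f_{X,Y}(x, y)\, dx = \int_{y}^{1} 4 \cdot \frac{y}{x}\, dx = 4y \cdot \int_{y}^{1} \frac{1}{x}\, dx = 4y \cdot [\ln x]_{y}^{1} = 4y \cdot (\ln 1 - \ln y) = -4y \ln y$$

The product of the marginals is $f_X(x) \cdot f_Y(y) = 2x \cdot (-4y \ln y) = -8xy \ln y$. In part (a) we found that $f_{X,Y}(x, y) = \frac{4y}{x}$, thus we can conclude that:

$$\frac{4y}{x} \ne -8 \cdot x \cdot y \cdot \ln(y)$$

The answer is no, $X$ and $Y$ are not independent.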
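A quick numeric spot check makes the mismatch concrete. The sketch below evaluates the joint density and the product of the marginals at two sample points, one inside the support and one outside it; the points and function names are chosen purely for illustration.

```python
import math

def joint_pdf(x, y):
    """f_{X,Y}(x, y) = 4y/x on 0 < y < x < 1, zero elsewhere."""
    return 4 * y / x if 0 < y < x < 1 else 0.0

def marginal_x(x):
    """f_X(x) = 2x on 0 < x < 1."""
    return 2 * x if 0 < x < 1 else 0.0

def marginal_y(y):
    """f_Y(y) = -4y ln(y) on 0 < y < 1."""
    return -4 * y * math.log(y) if 0 < y < 1 else 0.0

# (0.5, 0.25) lies inside the support; (0.25, 0.5) has y > x, so it lies outside.
for x, y in [(0.5, 0.25), (0.25, 0.5)]:
    product = marginal_x(x) * marginal_y(y)
    print(f"x={x}, y={y}: joint={joint_pdf(x, y):.4f}, product={product:.4f}")
```

At the second point the joint density is zero while the product of the marginals is positive, which by itself rules out independence.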

c. What is the marginal density function of X?

We have computed it in the previous answer:

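$$f_X(x) = \int_{y=0}^{x} f_{X,Y}(x, y)\, dy = \int_{0}^{x} 4 \cdot \frac{y}{x}\, dy = \frac{4}{x} \cdot \int_{0}^{x} y\, dy = \frac{4}{x} \cdot \frac{x^2}{2} = 2x, \quad 0 < x < 1$$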

d. What is the marginal density function of Y?

We have computed it in part (b):

$$f_Y(y) = \int_{x=y}^{1} f_{X,Y}(x, y)\, dx = \int_{y}^{1} 4 \cdot \frac{y}{x}\, dx = 4y \cdot \int_{y}^{1} \frac{1}{x}\, dx = 4y \cdot [\ln x]_{y}^{1} = 4y \cdot (\ln 1 - \ln y) = -4y \ln y, \quad 0 < y < 1$$

7. Choosing random house points

Choose a number X at random from the set of house points {1,2,3,4,5,6} awarded by Professor McGonagall. Now choose a number Y at random from the subset of points no larger than X, {1,...,X}.

a. Determine the joint probability mass function of X (initial points) and Y (second random selection).

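Consider first the random variable $X$.
$X$ is uniformly distributed over $\{1, 2, 3, 4, 5, 6\}$, thus:

$$P(X = x) = \frac{1}{6} \quad \text{for } x \in \{1, 2, \ldots, 6\}$$

Now the random variable $Y$.
Given $X = x$, $Y$ is uniformly distributed over $\{1, 2, \ldots, x\}$, thus:

$$P(Y = y \mid X = x) = \frac{1}{x} \quad \text{for } 1 \le y \le x$$

Putting it all together:

$$P(X = x, Y = y) = \begin{cases} \frac{1}{6} \cdot \frac{1}{x} & \text{if } 1 \le y \le x \le 6, \\ 0 & \text{otherwise} \end{cases}$$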

b. Determine the conditional mass function P(X=j|Y=i) as a function of i and j.

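Using Bayes' rule:

$$P(X = j \mid Y = i) = \frac{P(Y = i \mid X = j) \cdot P(X = j)}{P(Y = i)}$$

From part (a), we know that $P(Y = i \mid X = j) = \frac{1}{j}$ for $i \le j$, while $P(X = j) = \frac{1}{6}$.

We need to compute the denominator $P(Y = i)$, that is:

$$P(Y = i) = \sum_{j=i}^{6} P(X = j, Y = i) = \sum_{j=i}^{6} \frac{1}{6j}$$

So putting it all together:

$$P(X = j \mid Y = i) = \frac{\frac{1}{6j}}{\sum_{k=i}^{6} \frac{1}{6k}} = \frac{\frac{1}{j}}{\sum_{k=i}^{6} \frac{1}{k}} \quad \text{for } i \le j$$

and $P(X = j \mid Y = i) = 0$ for $j < i$.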
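Because the sample space is tiny, the joint and conditional mass functions can be enumerated exactly with `fractions.Fraction`. The sketch below is one way to do that; the function names are illustrative.

```python
from fractions import Fraction

VALUES = range(1, 7)  # possible house points 1..6

def joint_pmf(x, y):
    """P(X = x, Y = y) = 1/(6x) if 1 <= y <= x <= 6, else 0."""
    return Fraction(1, 6 * x) if 1 <= y <= x <= 6 else Fraction(0)

def marginal_y(i):
    """P(Y = i), summing the joint PMF over all values of X."""
    return sum(joint_pmf(j, i) for j in VALUES)

def conditional_x_given_y(j, i):
    """P(X = j | Y = i) via the definition of conditional probability."""
    return joint_pmf(j, i) / marginal_y(i)

# The joint PMF must sum to 1 over all (x, y) pairs.
total_mass = sum(joint_pmf(x, y) for x in VALUES for y in VALUES)
print("total mass:", total_mass)

# Compare the enumeration with the closed form (1/j) / sum_{k=i}^{6} 1/k.
i, j = 1, 2
closed_form = Fraction(1, j) / sum(Fraction(1, k) for k in range(i, 7))
print(conditional_x_given_y(j, i), closed_form)
```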

c. Are X and Y independent? Explain.

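No, the value of $Y$ is constrained by $X$ ($Y \le X$), so knowledge of $Y$ affects the possible values of $X$. More formally, if $X$ and $Y$ were independent, we would need:

$$P(X = j, Y = i) = P(X = j) \cdot P(Y = i) \quad \text{for all } i, j$$

From part (a) we have:

$$P(X = j, Y = i) = \frac{1}{6j} \quad \text{for } 1 \le i \le j \le 6$$

Finally:

$$P(X = j) \cdot P(Y = i) = \frac{1}{6} \cdot \sum_{k=i}^{6} \frac{1}{6k} \ne \frac{1}{6j}$$

For a concrete counterexample, take $j = 1$ and $i = 2$: the joint probability $P(X = 1, Y = 2)$ is 0 because $Y$ cannot exceed $X$, while $P(X = 1) \cdot P(Y = 2) > 0$.

This confirms that $X$ and $Y$ are not independent.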
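The failure of independence can also be verified by brute force, comparing the joint PMF with the product of the marginals for every pair. This is a minimal sketch using exact fractions; the names are illustrative.

```python
from fractions import Fraction

VALUES = range(1, 7)

def joint_pmf(x, y):
    """P(X = x, Y = y) = 1/(6x) if 1 <= y <= x <= 6, else 0."""
    return Fraction(1, 6 * x) if 1 <= y <= x <= 6 else Fraction(0)

# Marginals obtained by summing the joint PMF over the other variable.
p_x = {j: sum(joint_pmf(j, i) for i in VALUES) for j in VALUES}
p_y = {i: sum(joint_pmf(j, i) for j in VALUES) for i in VALUES}

# Independence requires joint == product for EVERY pair (j, i).
violations = [
    (j, i) for j in VALUES for i in VALUES
    if joint_pmf(j, i) != p_x[j] * p_y[i]
]

print("pairs violating the product rule:", len(violations))
print("P(X=1, Y=2) =", joint_pmf(1, 2), "but P(X=1) * P(Y=2) =", p_x[1] * p_y[2])
```

A single violating pair is enough to rule out independence.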