# Essays/Binary Probability

## Overview

The binary probability approach represents random events as binary arrays whose items correspond to the elements of a finite, uniformly distributed sample space. This makes it convenient to calculate the frequency of an event as the average of its array and to combine events with boolean operations.

### Trials and Events

As an illustration we will consider rolls of two dice.

When we have a single trial, multiple events can occur. For example, tossing one die, event A: die is odd; event B: die is prime.

In multiple trials, their combined outcomes may constitute a single event. For example, tossing two dice, event A: both dice are 3; event B: first die is greater than the second.

## Single Trial

Viewed separately, each roll of a die can be described as a single trial.

### Sample Space

A roll of a die is an outcome ω corresponding to the number of dots, and its sample space is Ω = {1, 2, 3, 4, 5, 6}; each ω is from Ω. The probability of each ω is P(ω) = 1/|Ω|, one divided by the number of elements in Ω.

To illustrate in J we will designate the sample space of a single roll as

       D=. '123456'

An event is any subset of the sample space. For example, event "roll is odd" is subset '135' of D.
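
As a small check of the definitions above, the following sketch counts the sample space, takes the reciprocal to get P(ω), and confirms that '135' is a subset of D. It uses only the name `D` from the text, defined globally here so the snippet is self-contained.

```j
   D=: '123456'          NB. sample space of one roll, as in the text
   #D                    NB. number of elements in the sample space
6
   x: % #D               NB. P(ω) = 1/|Ω|
1r6
   *./ '135' e. D        NB. every element of '135' is in D, so it is an event
1
```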

### Discrete Probability

To calculate the probability of an event, we find its frequency in the sample space,
i.e. divide the number of elements in the event by the total number of elements in the
sample space. To count elements, we add up the `1`s that mark the elements belonging to the event.

       '1'=D                     NB. die shows 1
    1 0 0 0 0 0
       x:(+/ % #) '1'=D          NB. probability 1/6
    1r6
       Pr=: +/ % #               NB. discrete probability of event "filter"
       '12' =/ D                 NB. die shows either 1 or 2
    1 0 0 0 0 0
    0 1 0 0 0 0
       x:Pr '12' +./ . (=/) D
    1r3

An alternative to `OR`ing elementary filters is to ask which
elements of the sample space belong to the event.

       D e. '135'                NB. which are odd?
    1 0 1 0 1 0
       Pr D e. '135'             NB. probability of odd roll
    0.5

### Random Variable

It is more convenient and more general to ask a question about oddness numerically. To do so we need to designate a random variable, a function which converts the elements of the sample space to numeric equivalents.

In general, it is not a one-to-one correspondence but a many-to-one functional relation (not necessarily an injection), i.e. many elements may correspond to the same number. However, we need to keep one value in the range of the random variable for each original element (including duplicates) in order to preserve the distribution.

       ]X=: "."0 D               NB. X is range of random variable
    1 2 3 4 5 6
       2|X                       NB. odd rolls numerically
    1 0 1 0 1 0
       Pr 2|X
    0.5
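
To see why duplicates in the range matter, here is a minimal sketch with a many-to-one variable. The variable `Y` (half the roll, rounded down) is a hypothetical example that does not appear in the essay. Its range keeps one value per element of the sample space, so counting equal values gives its distribution.

```j
   D=: '123456'
   Pr=: +/ % #
   X=: "."0 D
   ]Y=: <. -: X          NB. hypothetical variable: half the roll, rounded down
0 1 1 2 2 3
   (~. ,: #/.~) Y        NB. distinct values over their counts
0 1 2 3
1 2 2 1
   x: Pr 1 = Y           NB. two of the six outcomes map to 1
1r3
```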

### Binary Random Variable

The boolean filter operation used in
calculating the discrete probability (both for events and for the variable)
is in turn a binary-valued random variable `X`, whose distribution
`Pr(X=x): x in {0,1} -> {1-p,p}` determines the probability `p`
of the sought event.

Thus the probability of an event is the average `(+/ % #)` of its
binary variable over the sample space. Hence our convenient definition
of `Pr`.
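
The two-valued distribution can be read off directly, as in this sketch. The name `b` is introduced here only for illustration, and the event "roll is 1 or 2" is chosen so that `p` and `1-p` differ.

```j
   D=: '123456'
   Pr=: +/ % #
   ]b=: D e. '12'         NB. binary variable of the event "roll is 1 or 2"
1 1 0 0 0 0
   x: Pr"1 ] 0 1 =/ b     NB. P(b=0) and P(b=1), i.e. 1-p and p
2r3 1r3
```

The average of `b` itself is the second item, which is why `Pr b` is enough to obtain `p`.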

### Complementary Event

The other value of the binary random variable, `X=0`, corresponds
to the non-occurrence of its event A. It is obtained with the unary operation "not A", which is
the complement of event A in the sample space.

For an event A with probability `p`, the complementary event has probability
`1-p`, the other value in the distribution of the binary variable.
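
A brief sketch of the complement, assuming the boolean negation verb `-.` as the "not A" operation. The event "roll is 1 or 2" is an illustrative choice.

```j
   D=: '123456'
   Pr=: +/ % #
   A=: D e. '12'          NB. event "roll is 1 or 2"
   -. A                   NB. filter of "not A"
0 0 1 1 1 1
   x: (Pr A), (Pr -. A), 1 - Pr A
1r3 2r3 2r3
```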

## Two Trials

Now we will consider two rolls of dice.

### Sample Space

The sample space `T` for two rolls is the raveled Cartesian
product of the elementary outcomes of each roll.
Each element is a pair whose first item is the first roll and
whose second item is the second roll.

       {;~D
    +--+--+--+--+--+--+
    |11|12|13|14|15|16|
    +--+--+--+--+--+--+
    |21|22|23|24|25|26|
    +--+--+--+--+--+--+
    |31|32|33|34|35|36|
    +--+--+--+--+--+--+
    |41|42|43|44|45|46|
    +--+--+--+--+--+--+
    |51|52|53|54|55|56|
    +--+--+--+--+--+--+
    |61|62|63|64|65|66|
    +--+--+--+--+--+--+
       ]T=. ,{;~D
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--
    |11|12|13|14|15|16|21|22|23|24|25|26|31|32|33|34|35|36|41|42|43 ...
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--
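
The size of this space, and hence the probability of each pair, can be checked with a short sketch. `T` is assigned globally here so the snippet stands alone.

```j
   D=: '123456'
   T=: ,{;~D
   $ {;~D                 NB. the table is 6 by 6
6 6
   #T                     NB. number of pairs in the raveled space
36
   x: % #T                NB. probability of each pair
1r36
```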

### Probability

We can now ask questions either about each roll separately or about the pair as a whole.

       x:Pr =/&>T                NB. two dice are equal
    1r6
       x:Pr '3'={.&>T            NB. first die is 3
    1r6
       x:Pr >&"./&>T             NB. first greater than second
    5r12

In the last case we used a familiar random variable for the numeric values of dice.
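
The same numeric conversion gives other random variables on pairs. As an illustration, this sketch defines the sum of the two dice. The name `S` is not part of the essay and is introduced only for this example.

```j
   D=: '123456'
   T=: ,{;~D
   Pr=: +/ % #
   S=: +&"./&> T          NB. random variable: sum of the two dice
   x: Pr 7 = S            NB. sum is 7
1r6
   x: Pr 11 <: S          NB. sum is at least 11
1r12
```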

### Union Event

Union of two events `A` and `B`, denoted `A union B` is when at
least one of the two events happens. Alternatively, we can say
either `A` or `B` (or both) happen, so it can be calculated as `OR`
of two event filters.

       x:Pr ('3'={.&>T) +. ('3'={:&>T)    NB. first roll or second roll is 3
    11r36

Alternatively, a union can be constructed semantically, if we know how to ask a question, in this case:

       x:Pr '3'&e.&>T            NB. 3 belongs to a pair of two rolls
    11r36

Let's take this union subspace and look closer.

       ]U=: T #~ '3'&e.&>T                     NB. union subspace
    +--+--+--+--+--+--+--+--+--+--+--+
    |13|23|31|32|33|34|35|36|43|53|63|
    +--+--+--+--+--+--+--+--+--+--+--+
       ('3'={.&>U) + _1*'3'={:&>U              NB. structure
    _1 _1 1 1 0 1 1 1 _1 _1 _1
       U <@:>/.~ ('3'={.&>U) + _1*'3'={:&>U    NB. grouping
    +--+--+--+
    |13|31|33|
    |23|32|  |
    |43|34|  |
    |53|35|  |
    |63|36|  |
    +--+--+--+

Its structure consists of three parts: B without A, A without B and both A and B.

As seen from above, probability (and count) of the union is
not the sum of probabilities of A and B. Because they have
a common part, it will be repeated twice in a sum, so we
need to subtract it. Thus yet another way to calculate union
probability, `P(A union B) = P(A) + P(B) - P(A and B)`.

       x: (Pr '3'={.&>T) + (Pr '3'={:&>T) - Pr '3'&(*./ .=)&>T
    11r36
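
The same identity holds for raw counts, which makes the double counting easy to see. In this sketch the names `A` and `B` stand for the two filters used above.

```j
   D=: '123456'
   T=: ,{;~D
   A=: '3'={.&>T          NB. first roll is 3
   B=: '3'={:&>T          NB. second roll is 3
   (+/A), (+/B), (+/A*.B), +/A+.B
6 6 1 11
   (+/A+.B) = (+/A) + (+/B) - +/A*.B
1
```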

### Joint Event

When both events A and B happen it is designated `A intersects B`,
or simply A AND B.
Joint probability, or probability of a joint event, is determined
with the AND operation between the event filters.

       x:Pr ('3'={.&>T) *. '3'={:&>T
    1r36

Semantically, it is the event where both dice are 3.

       x:Pr '3'&(*./ .=)&>T
    1r36

### Difference Event

A without B can be obtained by subtracting the event filter of A and B
from the event filter of A, where boolean subtraction is x AND NOT y (the verb `>` on boolean arguments). Or `P(A\B) = P(A\[A and B]) = P(A and not [A and B])`

       x:Pr ('3'={.&>T) > ('3'={.&>T) *. '3'={:&>T
    5r36

Alternatively, it is probability of A minus probability
of A and B. Or `P(A\B) = P(A) - P(A and B)`.

       x:(Pr '3'={.&>T) - Pr ('3'={.&>T) *. '3'={:&>T
    5r36
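
As a consistency check, the two differences and the joint event are disjoint and together cover the union. The sketch below assumes the names `A` and `B` for "first is 3" and "second is 3", and uses `>` on boolean filters as "x and not y".

```j
   D=: '123456'
   T=: ,{;~D
   Pr=: +/ % #
   A=: '3'={.&>T
   B=: '3'={:&>T
   x: (Pr A>B), (Pr B>A), Pr A*.B      NB. A without B, B without A, A and B
5r36 5r36 1r36
   x: Pr (A>B) +. (B>A) +. A*.B        NB. together they make up the union
11r36
```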

## Conditional Probability

### Definition

Given events (or subsets) A and B in the sample space Ω, if it is known that an element randomly drawn from Ω belongs to B, then the probability that it also belongs to A is *defined* to be the conditional probability of A, given B.

Thus, the probability of event A given the occurrence of B is the probability of their intersection within the new sample space B.

In other words event A, given B is a projection of subset A on subset B.

       _6]\ A=: 4>+&"./&>T       NB. A: sum < 4
    1 1 0 0 0 0
    1 0 0 0 0 0
    0 0 0 0 0 0
    0 0 0 0 0 0
    0 0 0 0 0 0
    0 0 0 0 0 0
       x:Pr A
    1r12
       _6]\ B=: '1'={.&>T        NB. B: first is 1
    1 1 1 1 1 1
    0 0 0 0 0 0
    0 0 0 0 0 0
    0 0 0 0 0 0
    0 0 0 0 0 0
    0 0 0 0 0 0
       x:Pr B
    1r6
       B#T                       NB. projection
    +--+--+--+--+--+--+
    |11|12|13|14|15|16|
    +--+--+--+--+--+--+
       ] AcB=: 4>+&"./&> B#T     NB. A given B
    1 1 0 0 0 0
       x:Pr AcB
    1r3

Now applying the formula `P(A|B) = P(A and B) / P(B)`

       x: pAcB=: (Pr A *. B) % Pr B
    1r3
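
Since projecting onto B amounts to selecting the items of the filter of A where B holds, the projection can be wrapped in a small dyadic verb. This is only a sketch: the name `given` is hypothetical and not part of the essay.

```j
   D=: '123456'
   T=: ,{;~D
   Pr=: +/ % #
   given=: 4 : 'Pr y # x'     NB. hypothetical helper: average of filter x within y
   A=: 4>+&"./&>T             NB. A: sum < 4
   B=: '1'={.&>T              NB. B: first is 1
   x: A given B
1r3
   x: B given A               NB. conditioning is not symmetric
2r3
```

The second result agrees with the formula: P(A and B) / P(A) is 2/36 divided by 3/36.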

### Statistical Independence

Two random events A and B are independent if and only if
`P(A and B) = P(A) P(B)`.

Two non-empty events A and B are independent if and only if the ratio of
A in Ω equals the ratio of the intersection of A and B
in B, and vice versa; or `P(A) = P(A|B)` and `P(B) = P(B|A)`.

Let's take the example from section *Union Event*.
Although events "first die is 3" and "second die is 3"
have a non-empty intersection, "both dice are 3", they are
independent. By definition,

       x:Pr *./&('3'&=)/&> T                    NB. intersection
    1r36
       x: (Pr '3'={.&> T) * Pr '3'={:&> T       NB. product
    1r36

Now in projection,

       T#~'3'={:&> T                  NB. projection by "second is 3"
    +--+--+--+--+--+--+
    |13|23|33|43|53|63|
    +--+--+--+--+--+--+
       x:Pr '3'={.&> T#~'3'={:&> T    NB. first given second
    1r6
       x:Pr '3'={.&> T                NB. first alone
    1r6
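
For contrast, the events "sum < 4" and "first is 1" from the *Definition* section are not independent. The sketch below tests the product rule with a hypothetical helper verb `indep` (not part of the essay); it relies on J's tolerant comparison of floating-point results.

```j
   D=: '123456'
   T=: ,{;~D
   Pr=: +/ % #
   indep=: 4 : '(Pr x *. y) = (Pr x) * Pr y'    NB. hypothetical helper: product rule
   ('3'={.&>T) indep '3'={:&>T                  NB. first is 3, second is 3
1
   (4>+&"./&>T) indep '1'={.&>T                 NB. sum < 4, first is 1
0
   x: (Pr 4>+&"./&>T) , Pr 4>+&"./&> T#~'1'={.&>T    NB. P(A) and P(A|B) differ
1r12 1r3
```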