Hidden Markov Chain
📂Probability TheoryHidden Markov Chain
Buildup

Imagine a machine that produces certain items at fixed intervals as shown in the figure above. If green represents the normal quality goods 1 and red represents defective goods that need to be discarded 0, then the record so far would be (1,0,1). The actual visible results like this are referred to as Signal.
However, the probability of producing a defective item varies depending on whether the machine is normal or broken. When normal +, the defect rate is only 1, but when broken −, the defect rate skyrockets to 4. If we express this as a conditional probability, it will be
p(1∣+)=0.99p(1∣−)=0.96p(0∣+)=0.01p(0∣−)=0.04
The diagram would look as follows.

Let’s assume that the probability of the machine breaking down each time it produces an item is as the diagram above 10%. This is a Markov chain, and the state space of the stochastic process {Xj}j=0n is {+,−}, and our interest is the probability that the machine is normal after the jth production p(Xj=+). Of course, we’ll assume that the machine is normal initially, represented as p(X0=+)=1p(X0=−)=0.
If the machine is left broken, it will continue to produce many defective items. The question is how we can know this. Of course, a manager can quickly find out whether the machine is broken or not by inspecting it once, but if there are an enormous number of machines in a large factory, even such inspections can be a very burdensome task. Realistically, we have no choice but to guess which machine has a problem based on the items produced.
Definition
When the Markov chain (cause) is hidden and only the resulting signal (effect) is known, it is called Hidden Markov Chain {Xj}j=0n.
Example
(1)
Now let’s calculate the probability p(Xn=x∣S(n)=s(n)) given the signal s(n)=(s1,s2,⋯,sn).
If we say
==p(Xn=x∣S(n)=s(n))p(S(n)=s(n))p(Xn=x,S(n)=s(n))∑yFn(y)Fn(x)
that is, if we know Fn(x), we can know p(Xn=x∣S(n)=s(n)) without considering the conditional probability. (1) sn is independent of s(n−1), and (2) {Xj}j=1n is a Markov chain, so
Fn(x)=========p(Xn=x,S(n)=s(n))p(Xn=x,S(n−1)=s(n−1),Sn=sn)y∑p(Xn=x,Xn−1=y,S(n−1)=s(n−1),Sn=sn)y∑p(Xn−1=y,S(n−1)=s(n−1))p(Xn−1=y,S(n−1)=s(n−1))p(Xn=x,Xn−1=y,S(n−1)=s(n−1),Sn=sn)y∑Fn−1(y)p(Xn=x,Sn=sn∣Xn−1=y,S(n−1)=s(n−1))=(1)y∑Fn−1(y)p(Xn−1=y)p(Sn=sn,Xn=x,Xn−1=y)y∑Fn−1(y)p(Xn−1=y)p(Xn=x,Xn−1=y)p(Xn=x,Xn−1=y)p(Sn=sn,Xn−1=y,Xn=x)y∑Fn−1(y)p(Xn=x∣Xn−1=y)p(Sn=sn∣Xn=x,Xn−1=y)=(2)y∑Fn−1(y)pyxp(sn∣x)y∑Fn−1(y)p(Xn=x,Sn=sn∣Xn−1=y)y∑Fn−1(y)p(Xn=x∣Xn−1=y)p(Sn=sn∣Xn=x)
To summarize, we get the following.
Fn(x)=p(sn∣x)y∑Fn−1(y)pyx
(2)
Then, as an example in this post, let’s calculate p(X3=+∣S(3)=(1,0,1)) under the assumptions mentioned. In other words, compute the probability that the machine is normal in that situation.
In the Markov chain, the transition probabilities of states are
p++=0.9p−+=0p+−=0.1p–=1
For each state,
p(1∣+)=0.99p(1∣−)=0.96p(0∣+)=0.01p(0∣−)=0.04
Initially, assuming that the machine is normal,
F1(+)=p(1∣+)=0.99F1(−)=p(0∣+)=0.01
since s2=0,
F2(+)=p(0∣+)[F1(+)p+++F1(−)p−+]=0.01×[0.99×0.9+0]=0.00891F2(−)=p(0∣−)[F1(+)p+−+F1(−)p–]=0.04×[0.99×0.1+0.01×1]=0.00436
since s3=1,
F3(+)=p(1∣+)[F2(+)p+++F2(−)p−+]=0.99×[.00891×0.9+0])=0.00793881F3(−)=p(1∣−)[F2(+)p+−+F2(−)p–]=0.94×[.00891×0.1+.00436×1]=0.00493594
Therefore,
p(X3=+∣S(3)=(1,0,1))=F3(+)+F3(−)F3(+)=0.012874750.00793881=0.6166⋯
Logically, if the defect rate is not that high but one out of three is defective, we can guess that the probability of the machine being normal is significantly decreased. The calculation results show that the probability of the machine being normal barely exceeds 60%.