1.

In cluster analysis of microarray data– If Xi is the log odds value for gene X at time i, then for two genes X and Y and N observations, a similarity score is calculated. S(X,Y) is also known as the Pearson correlation coefficent. Xoffset and Yoffset can be the mean of the observations on X or Y, respectively, in which case is the standard deviation, or else Xoffset and Yoffset can be set to zero when a reference state is used. Which of the following best represents it?(a) S(X,Y) = \(\frac{1}{N-2}\) ∑i=1,N . (Xi – Xoffset) (Yi + Yoffset)/ϕxQY(b) S(X,Y) = \(\frac{1}{N}\) ∑i=1,N . (Xi – Xoffset) (Yi – Yoffset)/ϕxQY(c) S(X,Y) = \(\frac{1}{N-1}\) ∑i=1,N . (Xi + Xoffset) (Yi + Yoffset)/ϕxQY(d) S(X,Y) = \(\frac{1}{N}\) ∑i=1,N+2 . (Xi + Xoffset) (Yi – Yoffset)/ϕxQYThe question was asked in an internship interview.This intriguing question comes from Global Gene Regulation in section Genome Analysis of Bioinformatics

Answer»

The correct option is (b) S(X,Y) = \(\frac{1}{N}\) ∑i=1,N . (Xi – Xoffset) (Yi – YOFFSET)/ϕxQY

The explanation: After values of S(X,Y) have been calculated for all gene combinations, the most closely related pairs are identified in an above-diagonal scoring matrix. The object of CLUSTERING is to identify GENES that RESPOND the same way to the environmental treatment. Each gene is compared to every other gene and a gene similarity score (metric) is produced.



Discussion

No Comment Found

Related InterviewSolutions