1.

Which of the given statements is incorrect?
(a) Two tandem copies of a gene are produced when proteins with new functions arise by gene duplication
(b) Proteins with new functions are produced by a gene duplication event
(c) Assortment and reassortment of protein domains takes place in individual genomes
(d) In no case do the two duplicated genes both undergo change
Topic: Comparative Genomics (Collecting & Storing Sequences in Laboratory)

Answer»

Right option is (d) In no case do the two duplicated genes both undergo change

Explanation: In one possible scenario, both duplicated genes undergo change, but interactions between the proteins stabilize the original function and support the evolution of new ones. Through mutation and natural selection, one of the copies can develop a new function, leaving the other copy to cover the original function. However, because most mutations are deleterious to function, often one of the copies becomes a pseudogene. Not all gene duplications are thought to have these effects.

2.

Which of the given statements is untrue?
(a) Sequence and other data files that contain non-ASCII characters may not be transferred correctly from one machine to another and may cause unpredictable behavior of the communications software
(b) The ASCII mode is useful for transferring text files, and the binary mode is useful for transferring compressed data files, which also contain non-ASCII characters
(c) ASCII and binary modes cannot be set by the user
(d) Most sequence analysis programs require not only that a DNA or protein sequence file be a standard ASCII file, but also that the file be in a particular format such as the FASTA format
Topic: Sequence Formats & Computer Storage of Sequences (Collecting & Storing Sequences in Laboratory)

Answer»

The correct choice is (c) ASCII and binary modes cannot be set by the user

Explanation: The file transfer program (FTP) has ASCII and binary modes, which may be set by the user, so statement (c) is the untrue one. Some communications software can be set to ignore non-ASCII control characters. The use of windows on a computer has simplified such problems, since one merely has to copy a sequence from one window, for example, a window that is running a Web browser on the ENTREZ Web site, and paste it into another, for example, that of a translation program.
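As a rough illustration of why non-ASCII characters matter, the sketch below checks whether the raw bytes of a sequence file are plain ASCII before the file is transferred in ASCII (text) mode. The function names and the sample data are hypothetical and are not part of any particular transfer program.

```python
def non_ascii_positions(data: bytes) -> list[int]:
    """Return the byte offsets of every non-ASCII byte (value > 127) in data."""
    return [i for i, byte in enumerate(data) if byte > 127]


def is_plain_ascii(data: bytes) -> bool:
    """True when data contains only 7-bit ASCII bytes."""
    return not non_ascii_positions(data)


if __name__ == "__main__":
    clean = b">seq1\nACGTACGT\n"               # hypothetical FASTA-style record
    dirty = ">seq\u00e9\nACGT".encode("utf-8")  # header contains an accented letter
    print(is_plain_ascii(clean))
    print(non_ascii_positions(dirty))
```

A file that fails this check is a candidate for binary-mode transfer, or for cleaning before it is given to a sequence analysis program.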

3.

cDNA libraries have been prepared that have the same sequences as the mRNA molecules produced by organisms, or else cDNA copies are sequenced directly by RT-PCR (copying of mRNA by reverse transcriptase followed by sequencing of the cDNA copy by the polymerase chain reaction).
(a) True
(b) False
Topic: Sequencing cDNA Libraries of Expressed Genes, Submission of Sequences to the Databases (Collecting & Storing Sequences in Laboratory)

Answer»

Correct choice is (a) True

Explanation: There has been a great deal of progress in developing computational methods for analyzing genomic sequences and finding these protein-encoding regions. However, these methods are not completely reliable and, furthermore, such genomic sequences are often not available.

4.

The combined mixture of all labeled DNA fragments is electrophoresed to _____ the fragments by _____, and the ladder of fragments is scanned for the presence of each of the four labels.
(a) separate, size
(b) separate, pH
(c) assimilate, pH
(d) assimilate, size
Topic: DNA & Genomic Sequencing (Collecting & Storing Sequences in Laboratory)

Answer»

The correct choice is (a) separate, size

Explanation: A computer program then determines the probable order of the bands and predicts the sequence. Depending on the actual procedure being used, one run may generate a reliable sequence of as many as 500 nucleotides.

5.

_____ of the human genome comprises one particular family of the SINE element, designated Alu (1.2 million copies).
(a) 10%
(b) 20%
(c) 60%
(d) 40%
Topic: Genome Anatomy (Collecting & Storing Sequences in Laboratory)

Answer» Correct choice is (a) 10%

Explanation: Ten percent of the human genome comprises one particular family of the SINE element, Alu. A further 14.6% comprises one particular LINE family, designated LINE1 (593,000 copies).

6.

Vertebrate chromosomes have long (>300 kb) regions of distinct GC richness, repeat content, and gene density, designated isochores in a model of genome organization proposing that genomes are made up of distinct segments of unique composition.
(a) True
(b) False
Topic: Genome Anatomy (Collecting & Storing Sequences in Laboratory)

Answer»

The correct answer is (a) True

Explanation: Human and mouse chromosomal regions that have a low density of genes are AT-rich and have more LINE1 elements than Alu or B1/B2 (SINE) elements, whereas the reverse is true for regions that have a high gene density, and those regions are more GC-rich.

7.

More than _____ of the human genome consists of interspersed repetitive sequences derived from TEs (transposable elements).
(a) one-third
(b) one-eighth
(c) one-fifth
(d) half
Topic: Genome Anatomy (Collecting & Storing Sequences in Laboratory)

Answer» Correct answer is (a) one-third

Explanation: The presence of these elements may be demonstrated using programs for detection of low-complexity regions in sequences. For example, about 15% of the genome of the fruit fly Drosophila is made up of transposable elements.

8.

Minisatellites are made up of repeat units of up to ____ and microsatellites are composed of repeat units of ____ or less.
(a) 25 bp, 10 bp
(b) 70 bp, 6 bp
(c) 80 bp, 9 bp
(d) 25 bp, 4 bp
Topic: Genome Anatomy (Collecting & Storing Sequences in Laboratory)

Answer»

Right choice is (d) 25 bp, 4 bp

Explanation: Both kinds of repeat are found in eukaryotic genomes. Tandem repeats are also found at the ends of eukaryotic chromosomes at the telomeres, which in humans comprise hundreds of copies of the 6-bp repeat TTAGGG.

9.

Annotation involves identifying open reading frames in the genome sequence.
(a) True
(b) False
Topic: Genome Anatomy (Collecting & Storing Sequences in Laboratory)

Answer» Correct choice is (a) True

Explanation: Annotation also involves using each predicted protein as a query sequence in a database similarity search and adding significant matches to the genome sequence entry in the sequence database.
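Identifying open reading frames can be sketched very simply. The example below scans only the three forward reading frames for an ATG followed in frame by a stop codon; it assumes the standard stop codons and ignores the reverse strand, introns, and alternative start codons, so it is a toy illustration rather than a gene finder. The function name and test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, orf) for each ATG...stop ORF in the three forward frames.

    start is 0-based, end is exclusive and includes the stop codon;
    min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None                   # close it and keep scanning
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGGG"))
```

Each reported ORF could then be translated and used as the query in the similarity search described above.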
10.

Which of the following is incorrect about ENTREZ?
(a) There is no simple way to find the correct sequence without manually checking the information provided in each sequence, but this usually takes a longer time
(b) Before leaving ENTREZ, it is often useful to check for sequence database entries that are similar to the one of interest, called “neighbors” by ENTREZ
(c) The expanded query searches other database entries of interest, such as the same protein in another organism, a large chromosomal sequence that includes the gene, or members of the same gene family
(d) While visiting the site, note that ENTREZ has been adapted to search through a number of other biological databases, and also through Medline, and these searches are available from the initial ENTREZ Web page
Topic: Using the Database Access Program ENTREZ (Collecting & Storing Sequences in Laboratory)

Answer»

Right option is (a) There is no simple way to find the correct sequence without manually checking the information provided in each sequence, but this usually takes a longer time

Explanation: Contrary to option (a), checking the entries manually usually takes only a short time. It is important to look through the sequences to locate the one intended. There may be several different copies of the sequence because it may have been sequenced from more than one organism, or the sequence may be a mutant sequence, a particular clone, or a fragment.

11.

Knowing ________ should be enough to find the required entry quickly.
(a) publication date, protein name, journal name
(b) accession number, protein name, or name of gene
(c) publication date, protein name, or volume
(d) properties, protein name, or title word
Topic: Using the Database Access Program ENTREZ (Collecting & Storing Sequences in Laboratory)

Answer»

Correct choice is (b) accession number, protein name, or name of gene

Explanation: If the same protein has been sequenced in several organisms, providing an organism name is also helpful. Once the search terms and fields have been chosen and submitted, a database comprising all of the currently available sequences (called the nonredundant or NR database) is searched. Other database selections can also be made.

12.

Data files that have multiple sequences, such as those required for multiple sequence alignment and phylogenetic analysis using parsimony (PAUP), are not converted in READSEQ.
(a) True
(b) False
Topic: Multiple Sequence Formats & Storage of Information in a Sequence Database (Collecting & Storing Sequences in Laboratory)

Answer»

Right option is (b) False

Explanation: Data files with multiple sequences, such as those mentioned, are also converted in READSEQ, so the statement is false. Options to reverse-complement and to remove gaps from sequences are included. SEQIO is another sequence conversion program for a UNIX machine.

13.

Which of the following is wrong about Genetics Computer Group Sequence Format?
(a) Earlier versions of the Genetics Computer Group (GCG) programs require a unique sequence format and include programs that convert other sequence formats into GCG format
(b) Information about the sequence in the GenBank entry is not included but the line information is carried over
(c) If one or more sequence characters become changed through error, a program reading the sequence will be able to determine that the change has occurred because the checksum value in the sequence entry will no longer be correct
(d) Lines of information are terminated by two periods, which mark the end of information and the start of the sequence on the next line
Topic: Sequence Formats & Computer Storage of Sequences (Collecting & Storing Sequences in Laboratory)

Answer» Right choice is (b) Information about the sequence in the GenBank entry is not included but the line information is carried over

Explanation: Information about the sequence in the GenBank entry is first included, followed by a line of information about the sequence and a checksum value. This value (not shown) is provided as a check on the accuracy of the sequence and is computed by adding the ASCII values of the sequence characters. If the sequence has not been changed, this value should stay the same.
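To make the checksum idea concrete, here is a deliberately simplified sketch that adds the ASCII values of the sequence characters, as the explanation describes. The function name and the modulus are assumptions for illustration; the exact algorithm used by the GCG programs is not given in the text and may differ.

```python
def ascii_sum_checksum(seq: str, modulus: int = 10000) -> int:
    """Sum the ASCII codes of the upper-cased sequence characters, reduced by modulus.

    Simplified illustration only; not claimed to reproduce the GCG checksum.
    """
    return sum(ord(ch) for ch in seq.upper()) % modulus


if __name__ == "__main__":
    original = "ACGT"
    corrupted = "ACGA"  # last base changed through error
    print(ascii_sum_checksum(original), ascii_sum_checksum(corrupted))
```

A plain sum detects a substituted character but not two characters swapping places, which is one reason practical checksums usually weight each position.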
14.

Which of the following is wrong about National Biomedical Research Foundation/Protein Information Resource Sequence Format?
(a) Sequences retrieved from the PIR database are not in this compact format, but in an expanded format with much more information about the sequence
(b) The NBRF format is similar to the FASTA sequence format but with significant differences
(c) This is different from the PIR format
(d) The first line includes an initial “>” character followed by a two-letter code such as P for complete sequence or F for fragment, followed by a 1 or 2 to indicate type of sequence, then a semicolon, then a four- to six-character unique name for the entry
Topic: Sequence Formats & Computer Storage of Sequences (Collecting & Storing Sequences in Laboratory)

Answer»

Right choice is (c) This is different from the PIR format

Explanation: This sequence format is sometimes also called the PIR format. It has been used by the National Biomedical Research Foundation/Protein Information Resource (NBRF) and also by other sequence analysis programs.
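The first-line layout described in option (d) can be parsed with a short regular expression. This sketch accepts only the codes named in the question (P or F, then 1 or 2); the function name and the entry name `CYC1` are hypothetical, and real NBRF/PIR files may use additional codes.

```python
import re

# ">" + completeness letter (P or F) + type digit (1 or 2) + ";" + entry name
HEADER_RE = re.compile(r"^>([PF])([12]);(\S+)$")


def parse_nbrf_header(line: str) -> dict[str, str]:
    """Split an NBRF-style first line into its three fields."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an NBRF-style header: {line!r}")
    return {
        "completeness": match.group(1),  # P = complete, F = fragment
        "type": match.group(2),          # 1 or 2, the type of sequence
        "name": match.group(3),          # unique entry name
    }


if __name__ == "__main__":
    print(parse_nbrf_header(">P1;CYC1"))
```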

15.

Which of the following is untrue about Shotgun Sequencing?
(a) When DNA fragments derived from different chromosomal regions have repeats of the same sequence, they will appear to overlap
(b) When DNA fragments derived from different chromosomal regions have repeats of the same sequence, they will appear to scrutinize
(c) In a new whole shotgun approach, Celera Genomics is sequencing both ends of DNA fragments of short (2 kb), medium (10 kb), and long (BAC or >100 kb) lengths
(d) A large number of reads are then assembled by computer
Topic: DNA & Genomic Sequencing (Collecting & Storing Sequences in Laboratory)

Answer»

Correct answer is (b) When DNA fragments derived from different chromosomal regions have repeats of the same sequence, they will appear to scrutinize

Explanation: A controversy has arisen as to whether or not the above shotgun sequencing strategy can be applied to genomes with repetitive sequences such as those likely to be encountered in sequencing the human genome. This method has been used to assemble the genome of the fruit fly Drosophila melanogaster after removal of the most highly repetitive regions and also to assemble a significant proportion of the human genome.
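The repeat problem can be seen in a toy suffix-prefix overlap check, the basic comparison behind read assembly. This is a minimal sketch with hypothetical reads and a hypothetical function name; real assemblers tolerate sequencing errors and use far more efficient indexing.

```python
def overlap_length(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of left that equals a prefix of right.

    Returns 0 when no overlap of at least min_len bases exists.
    """
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


if __name__ == "__main__":
    # Two hypothetical reads from different regions that share the repeat GATTACA:
    read_region_1 = "TTTTGATTACA"
    read_region_2 = "GATTACACCCC"
    print(overlap_length(read_region_1, read_region_2))
```

The two reads appear to overlap by the full length of the shared repeat even though they come from different places, which is the ambiguity statement (a) describes.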

16.

Which of the given statements is incorrect about Horizontal Gene Transfer?
(a) The genomes of most organisms are derived by vertical transmission, the inheritance of chromosomes from parents to offspring from one generation to the next
(b) It is the acquisition of genetic material from a different organism
(c) The transferred material becomes a temporary addition to the recipient genome
(d) An extreme example is the proposed endosymbiont origin of mitochondria in eukaryotic cells and chloroplasts in plants
Topic: Comparative Genomics (Collecting & Storing Sequences in Laboratory)

Answer»

Correct option is (c) The transferred material becomes a temporary addition to the recipient genome

Explanation: The transferred material becomes a permanent addition to the recipient genome. Although these exchanges do not occur very often on a generation-to-generation basis, a significant number can occur over a period of hundreds of millions of years.

17.

Which of the given statements is incorrect about Clusters of orthologous groups?
(a) Using the protein from one of the organisms to search the proteome of the other for high-scoring matches should identify the ortholog as the highest-scoring match, or best hit
(b) When entire proteomes of the two organisms are available, orthologs may be identified
(c) A pair of orthologous genes in two organisms share so much sequence similarity that they may be assumed to have arisen from a common ancestor gene
(d) Each of the orthologs belongs to a family composed of paralogous sequences but irrelevant or not related to each other
Topic: Comparative Genomics (Collecting & Storing Sequences in Laboratory)

Answer»
18.

Which of the given statements is incorrect about Grouping Sequences?
(a) The problem of deciding which sequences to include in the same group or cluster and which to separate into different groups or clusters is a recurring one
(b) Divergence is necessary, but the sequences chosen should be clearly related based on inspection of each pair-wise alignment and a statistical analysis
(c) The conservative approach is to group distinct sequences
(d) The adventurous approach is to choose a set of marginally alignable sequences to pursue the difficult task of making a multiple sequence alignment and then to make profile models that may recognize divergence but will also give false predictions
Topic: Comparative Genomics (Collecting & Storing Sequences in Laboratory)

Answer»

Correct choice is (c) The conservative approach is to group distinct sequences

Explanation: The conservative approach is to group only very similar sequences together. However, in making a conservative multiple sequence alignment with only very similar sequences, it is not possible to analyze the evolutionary divergence that may have occurred in a family of proteins. Furthermore, if a matrix or profile model is made from this alignment, that model will not be useful for identifying more divergent members of a family.

19.

Which of the given statements is incorrect about Orthologs?
(a) In comparing two proteomes, a common standard is to require that for each pair of orthologs, the first of the pair is the best hit when the second is used to query the proteome of the first
(b) To identify orthologs, each protein in the proteome of an organism is used as a query in a similarity search of a database comprising the proteomes of only one different organism
(c) The best hit in each proteome is likely to be with an ortholog of the query gene
(d) Orthologs are genes that are so highly conserved by sequence in different genomes that the proteins they encode are strongly predicted to have the same structure and function and to have arisen from a common ancestor through speciation
Topic: Sequence Assembly and Gene Identification (Collecting & Storing Sequences in Laboratory)

Answer» Right answer is (b) To identify orthologs, each protein in the proteome of an organism is used as a query in a similarity search of a database comprising the proteomes of only one different organism

Explanation: To identify orthologs, each protein in the proteome of an organism is used as a query in a similarity search of a database comprising the proteomes of one or more different organisms.
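The best-hit standard in option (a) is often applied in both directions, giving reciprocal best hits. The sketch below assumes the similarity scores have already been computed and stored in nested dictionaries; the protein names and scores are invented for illustration, and the function name is hypothetical.

```python
def reciprocal_best_hits(
    scores_ab: dict[str, dict[str, float]],
    scores_ba: dict[str, dict[str, float]],
) -> list[tuple[str, str]]:
    """Pairs (a, b) where b is a's best hit in proteome B and a is b's best hit in A.

    scores_ab[a][b] is the score of query a against subject b (higher is better);
    scores_ba holds the searches run in the opposite direction.
    """
    best_ab = {a: max(hits, key=hits.get) for a, hits in scores_ab.items() if hits}
    best_ba = {b: max(hits, key=hits.get) for b, hits in scores_ba.items() if hits}
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Hypothetical scores for two tiny proteomes.
    a_vs_b = {"a1": {"b1": 200, "b2": 50}, "a2": {"b1": 120, "b2": 40}}
    b_vs_a = {"b1": {"a1": 190, "a2": 110}, "b2": {"a1": 45, "a2": 60}}
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here `a2` also hits `b1` best, but `b1` prefers `a1`, so only the `a1`/`b1` pair is reported as a candidate ortholog pair.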
20.

Which of the given statements is incorrect about Functional Genomics?
(a) Functional genomics involves the preparation of mutant or transgenic organisms with a mutant form of a particular gene usually designed to prevent expression of the gene
(b) Abnormal properties of the mutant organism do not reveal the gene function
(c) When two or more members of a gene family are found, rather than a single match to a known gene, the biological activity of these members may be analyzed by functional genomics to look for diversification of function in the family
(d) A more detailed analysis of the relative amount of sequence variability in a chromosomal region within populations of closely related species can reveal the presence of genes that are under selection
Topic: Sequence Assembly and Gene Identification (Collecting & Storing Sequences in Laboratory)

Answer»

The correct answer is (b) Abnormal properties of the mutant organism do not reveal the gene function

Explanation: The gene function is revealed by any abnormal properties of the mutant organism. This methodology provides a way to test a gene function that is predicted by sequence similarity to be the same as that of a gene of known function in another organism. If the other organism is very different biologically (comparing a predicted plant or animal gene to a known yeast gene), then functional genomics can also shed light on any newly acquired biological role.

21.

In the Stanford University/Intelligenetics Sequence Format, at the end of the sequence a 1 is placed if the sequence is linear, and a 2 if the sequence is circular.
(a) True
(b) False
Topic: Sequence Formats & Computer Storage of Sequences (Collecting & Storing Sequences in Laboratory)

Answer»

Correct option is (a) True

Explanation: The format was started by a molecular genetics group at Stanford University and subsequently continued by a company, Intelligenetics. The IG format is similar to the PIR format, except that a semicolon is usually placed before the comment line. The identifier on the second line is also present.

22.

Which of the given statements is incorrect?
(a) The predicted set of proteins for the genome is referred to as the proteome
(b) The amino acid sequence of proteins encoded by the predicted genes is used as a query of the protein sequence databases in a database similarity search
(c) A match of a predicted protein sequence to one or more database sequences serves only to identify the gene function but it doesn’t validate the gene prediction
(d) The genome sequence is annotated with the information on gene content and predicted structure, gene location, and functional predictions
Topic: Sequence Assembly and Gene Identification (Collecting & Storing Sequences in Laboratory)

Answer»

Right choice is (c) A match of a predicted protein sequence to one or more database sequences serves only to identify the gene function but it doesn’t validate the gene prediction

Explanation: A match of a predicted protein sequence to one or more database sequences not only serves to identify the gene function, but also validates the gene prediction. Pseudogenes, gene copies that have lost function, may also be found in this analysis.

23.

Processed pseudogenes are also derived from a functional gene and they contain introns and a promoter.
(a) True
(b) False
Topic: Sequence Assembly and Gene Identification (Collecting & Storing Sequences in Laboratory)

Answer»

Correct choice is (b) False

Explanation: Processed pseudogenes are also derived from a functional gene, but they do not contain introns and lack a promoter; hence, they are not expressed. The origin of these pseudogenes is probably due to reverse transcription of the mRNA of the functional gene and insertion of the cDNA copy into a new chromosomal location by a LINE1 reverse transcriptase.

24.

TEs (Transposable Elements) can at most comprise one-fourth of the genome sequence.
(a) True
(b) False
Topic: Sequence Assembly and Gene Identification (Collecting & Storing Sequences in Laboratory)

Answer»

The correct choice is (b) False

Explanation: TEs (Transposable Elements) can comprise one-half or more of the genome sequence. Eukaryotic genomes comprise classes of repeated elements, including tandem repeats present in centromeres and telomeres, dispersed tandem repeats (minisatellites and microsatellites), and interspersed TEs.

25.

In the case of genome sequence assembly, which of the given statements is incorrect?
(a) Full chromosomal sequences are assembled from the overlaps in a highly redundant set of fragments by an automatic computational method or from the fragment order on a physical map
(b) Chromosome cloning is carried out in bacterial artificial chromosomes (BACs)
(c) Chromosomes of a target organism are purified, fragmented, and subcloned in fragments of size hundreds of bp
(d) Genome sequences are assembled from DNA sequence fragments of approximate length 500 bp obtained using DNA sequencing machines
Topic: Sequence Assembly and Gene Identification (Collecting & Storing Sequences in Laboratory)

Answer»

The correct choice is (c) Chromosomes of a target organism are purified, fragmented, and subcloned in fragments of size hundreds of bp

Explanation: Chromosomes of a target organism are purified, fragmented, and subcloned in fragments of size hundreds of kbp, not bp. The BAC fragments are then further subcloned as smaller fragments into plasmid vectors for DNA sequencing.

26.

The retroposons include short _________ interspersed nuclear elements (SINES).
(a) 90–4000 bp long
(b) 80–500 Mbp long
(c) 80–300 bp long
(d) 100–3000 bp long
Topic: Genome Anatomy (Collecting & Storing Sequences in Laboratory)

Answer»

Correct option is (c) 80–300 bp long

Explanation: There are also long (6–8 kbp) interspersed nuclear elements (LINES). Different types of transposable elements are present in high copy numbers in mammalian genomes, in varying proportions.

27.

Analysis of the ribosomal RNA molecules of prokaryotes and eukaryotes has led to the prediction of three main branches in the tree of life.
(a) True
(b) False
Topic: Genome Anatomy (Collecting & Storing Sequences in Laboratory)

Answer»

The correct option is (a) True

Explanation: The three branches are represented by the Archaea, the Bacteria, and the Eukarya. For genome sequencing projects, organisms have been sampled from throughout the tree, including some that are in deeper branches of the tree and that have growth properties reminiscent of an ancient environment.

28.

Which of the following is incorrect about Retrieving a Specific Sequence?
(a) It can be difficult to retrieve the sequence of a specific gene or protein simply because of the sheer number of sequences in the GenBank database and the complex problem of indexing them
(b) Other projects may benefit from the availability of better curated and annotated protein sequence databases, but not PIR and SwissProt
(c) For projects that require the most currently available sequences, the NR databases should be searched
(d) The genomic databases can also provide the sequence of a particular gene or protein. Protein sequences in the Genpro database are generated by automatic translation of DNA sequences
Topic: Using the Database Access Program ENTREZ (Collecting & Storing Sequences in Laboratory)

Answer»

Right answer is (b) Other projects may benefit from the availability of better curated and annotated protein sequence databases, but not PIR and SwissProt

Explanation: Curated and annotated protein sequence databases include PIR and SwissProt. When read from cDNA copies of mRNA sequences, protein sequences are reliable, apart from a certain amount of uncertainty as to the translational start site. Many protein sequences are now predicted by translation of genomic sequences, requiring a prediction of exons, a somewhat error-prone step.

29.

Investigators are encouraged to submit their newly obtained sequences directly to a member of the International Nucleotide Sequence Database Collaboration, such as the NCBI, DDBJ, and EMBL.
(a) True
(b) False
Topic: Sequencing cDNA Libraries of Expressed Genes, Submission of Sequences to the Databases (Collecting & Storing Sequences in Laboratory)

Answer»

Correct answer is (a) True

Explanation: NCBI stands for National Center for Biotechnology Information. It manages GenBank. DDBJ and EMBL stand for DNA Data Bank of Japan and European Molecular Biology Laboratory, respectively.

30.

Two common goals in sequence analysis are to identify sequences that encode proteins, which determine all cellular metabolism, and to discover sequences that regulate the expression of genes or other cellular processes.
(a) True
(b) False
Topic: Sequencing cDNA Libraries of Expressed Genes, Submission of Sequences to the Databases (Collecting & Storing Sequences in Laboratory)

Answer»

Correct option is (a) True

Explanation: Genomic sequencing meets both goals. However, only a small percentage of the genomic sequence of many organisms actually encodes proteins because of the presence of introns within coding regions and other noncoding regions in the genome.

31.

To sequence larger molecules, individual chromosomes are purified and broken into _____ or larger random fragments, which are cloned into vectors designed for large molecules.
(a) 100-Mb
(b) 100-kb
(c) 5000-kb
(d) 600-kb
Topic: DNA & Genomic Sequencing (Collecting & Storing Sequences in Laboratory)

Answer»

Right choice is (b) 100-kb

Explanation: To sequence larger molecules, such as human chromosomes, individual chromosomes are purified and broken into 100-kb or larger random fragments, which are cloned into vectors designed for large molecules, such as yeast (YAC) or bacterial (BAC) artificial chromosomes. In a laborious procedure, the resulting library is screened for fragments called contigs, which have overlapping or common sequences, to produce an integrated map of the chromosome.

32.

When the process is fully automated, a number of priming sites may be used to obtain sequencing results that give optimal separation of bands in each region of the sequence.
(a) True
(b) False
Topic: DNA & Genomic Sequencing (Collecting & Storing Sequences in Laboratory)

Answer»

Correct choice is (a) True

Explanation: By repeating this procedure, both strands of a DNA fragment several kilobases in length can be sequenced. Sequential sequencing of a DNA molecule using oligonucleotide primers is done later.

33.

Which of the given statements is incorrect?
(a) Proteins may be clustered into families on the basis of either sequence or structural similarity
(b) Proteins often comprise separate domains
(c) The number of protein sequences that are available is insufficient to determine that domain shuffling occurs in evolution
(d) Proteins are modular
Topic: Comparative Genomics (Collecting & Storing Sequences in Laboratory)

Answer»

Correct answer is (c) The number of protein sequences that are available is insufficient to determine that domain shuffling occurs in evolution

Explanation: Contrary to option (c), the number of available protein sequences is sufficient. Comparisons of the proteomes of different organisms can identify the types of domain changes and also provide an indication as to what biological role they may have in a particular organism.

34.

Which of the following kinds of information do sequence comparisons not provide?
(a) Gene relationships
(b) Function history
(c) Evolutionary history
(d) Gene locations

This question is from the Comparative Genomics topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct answer is (d) Gene locations

Easiest explanation: Map locations of orthologous genes may also be compared, but they are additional data rather than something the sequence comparison itself provides. If a set of genes is grouped together at a particular chromosomal location, and if a set of similar genes is also grouped together in the genome of another organism, these groups share an evolutionary history.

35.

Which of the given statements is incorrect?
(a) As in an all-by-all protein comparison within a proteome, a matrix of alignment scores with E values is made, and the most closely related sequences in the two organisms are identified
(b) To perform a between-proteome analysis, proteome databases are made for the known and predicted genes of two or more genomes
(c) Each protein of one proteome is selected in turn as a query of the proteome of another organism or the combined proteome of a group of organisms
(d) Each protein of one proteome is selected in turn as a query of the proteome of another single organism only

This question is from the Sequence Assembly and Gene Identification topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct choice is (d) Each protein of one proteome is selected in turn as a query of the proteome of another single organism only

Easiest explanation: The query can be run against the proteome of another organism or against the combined proteome of a group of organisms, so "single organism only" is incorrect. This analysis can predict orthologs, that is, proteins with an identical function attributable to descent of the respective genes from a common ancestor.
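The between-proteome comparison described above can be sketched in a few lines. This is a minimal illustration, not the procedure the source prescribes: it assumes the E values have already been computed by a search program, it uses the common "reciprocal best hit" rule as a stand-in for identifying the most closely related sequences, and every protein name and E value below is made up.

```python
from typing import Dict, Tuple

# Hypothetical E values from an all-by-all search between two proteomes.
# Keys are (query protein, subject protein); a lower E value is a better match.
Hits = Dict[Tuple[str, str], float]


def best_hits(hits: Hits) -> Dict[str, str]:
    """Map each query protein to the subject with the lowest E value."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), e_value in hits.items():
        if query not in best or e_value < best[query][1]:
            best[query] = (subject, e_value)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> Dict[str, str]:
    """Keep only pairs that are each other's best hit in both directions."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return {a: b for a, b in forward.items() if backward.get(b) == a}


# Made-up scores: proteome A queried against B, then B against A.
a_vs_b = {("A1", "B1"): 1e-80, ("A1", "B2"): 1e-20,
          ("A2", "B2"): 1e-45, ("A3", "B2"): 1e-10}
b_vs_a = {("B1", "A1"): 1e-78, ("B2", "A2"): 1e-44, ("B2", "A3"): 1e-9}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

A3 also matches B2, but B2's best match is A2, so A3 is not reported as a candidate ortholog.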

36.

Prokaryotic genomes commonly have tandem repeats of sequences and include introns in protein-coding genes.
(a) True
(b) False

This question is from the Genome Anatomy topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

The correct answer is (b) False

To explain I would say: It is eukaryotic, not prokaryotic, genomes that commonly have tandem repeats of sequences and introns in protein-coding genes. In addition, eukaryotic genomes have linear chromosomes within a nucleus, and differ from prokaryotic genomes in this respect as well.

37.

To assist in finding suitable terms, for each field, ENTREZ provides a list of index entries.
(a) True
(b) False

This question is from the Using the Database Access Program ENTREZ topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Right choice is (a) True

To explain I would say: When searching for terms in a particular field, some knowledge of the terms that are in the database can be helpful. The “Limits” link on the ENTREZ form page is used to limit the GenBank field to be searched, and various logical combinations of search terms may be designed by this method. These fields refer to the GenBank fields.

38.

Computational resources can facilitate the analysis of bacterial genomes.
(a) True
(b) False

This question is from the Genome Anatomy topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct option is (a) True

To elaborate: GeneQuiz is an example of such a resource. There are Web sites that provide a complete annotation of the prokaryotic genomes that have been sequenced.

39.

Which of the given statements is incorrect about the Block multiple sequence alignment format?
(a) The ID line contains a short identifier for the group of sequences from which the block was made, which is often the original Prosite group ID
(b) The identifier is terminated by a comma, and “BLOCK” indicates the entry type
(c) AC contains the block number, a seven-character group number for sequences from which the block was made, followed by a letter (A–Z) indicating the order of the block in the sequences
(d) The block number is a 5-digit number preceded by BL (BLOCKS database) or PR (PRINTS database)

This question is from the Multiple Sequence Formats & Storage of Information in a Sequence Database topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct option is (b) The identifier is terminated by a comma, and “BLOCK” indicates the entry type

The best explanation: The identifier is terminated by a semicolon, not a comma, and “BLOCK” indicates the entry type. Min and max are the minimum and maximum number of amino acids from the previous block or from the start of the sequence. DE describes the sequences from which the block was made.
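To make the header rules above concrete, here is a small parsing sketch. It relies only on what the question and explanation state (a semicolon-terminated identifier followed by “BLOCK”, and an AC value made of BL or PR, five digits, and an optional order letter). The sample record, the exact spacing, and the text after the AC semicolon are assumptions made for illustration, and `parse_block_header` is a hypothetical helper name.

```python
import re
from typing import Dict, Optional

# Illustrative header lines; the values are made up for this sketch.
SAMPLE = """ID   EXAMPLE_FAMILY; BLOCK
AC   BL00001A; distance from previous block=(1,64)
DE   Example protein family used for illustration."""

# BL (BLOCKS) or PR (PRINTS), a 5-digit number, then an optional order letter.
AC_PATTERN = re.compile(r"^(BL|PR)(\d{5})([A-Z]?)$")


def parse_block_header(text: str) -> Dict[str, Optional[str]]:
    """Pull the ID, AC and DE fields out of a Blocks-style header."""
    record: Dict[str, Optional[str]] = {}
    for line in text.splitlines():
        tag, _, rest = line.partition(" ")
        rest = rest.strip()
        if tag == "ID":
            # The identifier ends at the semicolon; the entry type follows it.
            identifier, _, entry_type = rest.partition(";")
            record["id"] = identifier.strip()
            record["entry_type"] = entry_type.strip()
        elif tag == "AC":
            accession = rest.partition(";")[0].strip()
            match = AC_PATTERN.match(accession)
            record["accession"] = accession
            record["database"] = match.group(1) if match else None
            record["order"] = (match.group(3) or None) if match else None
        elif tag == "DE":
            record["description"] = rest
    return record


header = parse_block_header(SAMPLE)
print(header)
```

Splitting the ID line on a comma instead of a semicolon would leave the entry type attached to the identifier, which is the mistake option (b) describes.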

40.

Which of the given statements is incorrect about searching for orthologs to a protein family in an EST database?
(a) Searches of EST databases for matches to a query sequence routinely produce minimal amounts of output that must be searched manually for significant hits
(b) ESTs with a high percent identity with the query sequence, a long alignment with the query sequence, and a very low E value of the alignment score represent groups of paralogous and orthologous genes
(c) To identify orthologs as the most closely related sequence, ESTs were aligned using the amino acid alignment as a guide
(d) To identify orthologs as the most closely related sequence, a phylogenetic tree was produced by the maximum likelihood method

This question is from the Comparative Genomics topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct choice is (a) Searches of EST databases for matches to a query sequence routinely produce minimal amounts of output that must be searched manually for significant hits

The explanation: Searches of EST databases for matches to a query sequence routinely produce large, not minimal, amounts of output that must be searched manually for significant hits. An automatic method was described in 1999 utilizing a computer script, FAST-PAN, that scans EST databases with multiple queries from a protein family, sorts the alignment scores, and produces charts and alignments of the matches found.

41.

A third category of TEs has features of both class I and class II TEs. These miniature, inverted repeat TEs (MITEs) are ________ in length.
(a) 400 bp
(b) 500 Mbp
(c) 300 kbp
(d) 600 kbp

This question is from the Genome Anatomy topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct answer is (a) 400 bp

The best I can explain: MITEs were discovered in diverse flowering plants, where they are frequently associated with regulatory regions of genes. Hence, they could be exerting an influence on the regulation of gene expression.

42.

Which of the given statements is incorrect about cluster analysis?
(a) Clustering organizes the proteins into groups by some objective criterion
(b) One criterion for a matching protein pair is the statistical significance of their alignment score
(c) The P or E value from BLAST searches cannot be the criterion for a matching protein pair
(d) A criterion for clustering proteins is the distance between each pair of sequences in a multiple sequence alignment

This question is from the Comparative Genomics topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct choice is (c) The P or E value from BLAST searches cannot be the criterion for a matching protein pair

The explanation is: Options (b) and (c) contradict each other. The P or E value from a BLAST search measures the statistical significance of an alignment score, so it can serve as the criterion for a matching protein pair. The lower this value, the better the alignment. There will be a cutoff P or E value at which the matches in the BLAST search are no longer considered significant. A value of P or E = 0.01–0.05 is usually the point at which the alignment score is no longer considered to be significant; a lower cutoff can be chosen in order to focus on a more closely related group of proteins.

43.

Sequencing of genomes depends on the assembly of a large number of DNA reads into a linear, contiguous DNA sequence.
(a) True
(b) False

This question is from the Sequence Assembly and Gene Identification topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct option is (a) True

Easy explanation: The cost and efficiency of this process have been greatly improved by automatic methods of sequence assembly, first used for the sequencing of the bacterium Haemophilus influenzae. This same method of assembly was also used, in part, to complete the sequencing of the Drosophila and human genomes in a timely manner.
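The idea of assembling overlapping reads into one contiguous sequence can be shown with a toy example. The sketch below is a simple greedy overlap merger, assuming error-free reads from a single strand; it is not the assembly software used in the genome projects mentioned above, and the reads are made up.

```python
from typing import List


def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap(left, right, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up reads that overlap by four bases each.
reads = ["ATGGCGTAC", "GTACGTTAG", "TTAGCCA"]
assembled = greedy_assemble(reads)
print(assembled)
```

Real assemblers must also handle sequencing errors, both strands, and repeated sequences, which this sketch ignores.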

44.

The intron structure of genes in a particular eukaryote is used for predicting the location of genes in genome sequences.
(a) True
(b) False

This question is from the Genome Anatomy topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

The correct answer is (a) True

The explanation is: Other features of eukaryotic genes in a particular organism that are useful for gene prediction include the consensus sequences at exon–intron and intron–exon splice junctions, base composition, codon usage, and preference for neighboring codons. Computational methods incorporate this information into a gene model that may be used to predict the presence of genes in a genome sequence.

45.

While sequencing the first bacterial genome, a large number of random overlapping fragments were sequenced, and then a consensus sequence of the entire ______ chromosome of Haemophilus was assembled by computer.
(a) 8.6 x 10^9 bp
(b) 1.8 x 10^6 bp
(c) 6.9 x 10^5 bp
(d) 1.8 x 10^4 bp

This question is from the Genome Anatomy topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Right option is (b) 1.8 x 10^6 bp

Explanation: The assembly was done by computer except for several regions that had to be assembled manually. Once the sequence was available, open reading frames were identified, and these were compared to the existing proteins by a database similarity search.
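Identifying open reading frames, as mentioned in the explanation, can be illustrated with a short scan. This is a simplified sketch under stated assumptions: it reads only the forward strand, treats ATG as the only start codon, and uses a made-up sequence and a hypothetical `min_codons` threshold. Genome annotation pipelines are considerably more elaborate.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) positions of ATG...stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


sequence = "CCATGAAACCCGGGTAGTT"  # made-up example
orfs = find_orfs(sequence)
print(orfs)
print([sequence[s:e] for s, e in orfs])
```

Each reported ORF could then be translated and compared to known proteins with a database similarity search, which is the next step the explanation describes.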

46.

Each DNA or protein sequence database entry has much information, including all of the following except ______
(a) an assigned accession number(s)
(b) source organism
(c) name of locus
(d) reference number type(s)

This question is from the Multiple Sequence Formats & Storage of Information in a Sequence Database topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct answer is (d) reference number type(s)

Easiest explanation: In addition to the items in options (a) to (c), an entry gives keywords that apply to the sequence; features in the sequence such as coding regions, intron splice sites, and mutations; and finally the sequence itself. This information is organized into a tabular form very much like that found in a relational database.

47.

Which of the given statements is incorrect about database types?
(a) Relational databases are more useful in the development of biological databases
(b) The tables in a relational database are carefully indexed and cross-referenced with each other, sometimes using additional tables, so that each item in the database has a unique set of identifying features
(c) The relational database orders data in tables made up of rows giving specific items in the database, and columns giving the features as attributes of those items
(d) The two principal types of databases are the relational and object-oriented databases

This question is from the Multiple Sequence Formats & Storage of Information in a Sequence Database topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Right choice is (a) Relational databases are more useful in the development of biological databases

To explain: The object-oriented database structure has been useful in the development of biological databases. The objects, such as genetic maps, genes, or proteins, each have an associated set of utilities for analysis and display of the object and a set of attributes such as identifying name or references.

48.

Which of the following is wrong about the Abstract Syntax Notation sequence format?
(a) The information is much more difficult to read by eye than a GenBank formatted sequence
(b) Abstract Syntax Notation (ASN.1) is a formal data description language that has been developed by the computer industry
(c) All the information found in other forms of sequence storage, e.g., the GenBank format, is present. For example, sequences can be retrieved in this format by ENTREZ
(d) Taxonomic information and bibliographic information cannot be encoded with this format

This question is from the Sequence Formats & Computer Storage of Sequences topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Correct choice is (d) Taxonomic information and bibliographic information cannot be encoded with this format

For explanation I would say: ASN.1 has been adopted by the National Center for Biotechnology Information (NCBI) to encode data such as sequences, maps, taxonomic information, molecular structures, and bibliographic information. These data sets may then be easily connected and accessed by computers. The ASN.1 sequence format is a highly structured and detailed format especially designed for computer access to the data.

49.

In the organization of the GenBank database and the search procedure used by ENTREZ, each row is another sequence entry and each column another GenBank field.
(a) True
(b) False

This question is from the Sequence Formats & Computer Storage of Sequences topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Right option is (a) True

To explain I would say: When one sequence entry is retrieved, all of these fields will be displayed. For example, a search for the terms “SOS regulon and coli” in all fields will find two matching sequences. Finding these sequences is simple because indexes have been made listing all of the sequences that have any given term, one index for each field. Similarly, a search for transcriptional regulator will find three sequences.
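The "one index for each field" idea can be sketched as a set of small inverted indexes. This is only an illustration of the principle, not how ENTREZ is implemented; the three entries, their accession numbers, and the function names are all hypothetical, so the hit counts differ from the example in the explanation.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical entries: each dict is a row, each key a field (column).
ENTRIES: List[Dict[str, str]] = [
    {"accession": "X00001", "definition": "E. coli lexA gene",
     "keywords": "SOS regulon; transcriptional regulator"},
    {"accession": "X00002", "definition": "E. coli recA gene",
     "keywords": "SOS regulon"},
    {"accession": "X00003", "definition": "B. subtilis regulator gene",
     "keywords": "transcriptional regulator"},
]


def build_indexes(entries: List[Dict[str, str]]) -> Dict[str, Dict[str, Set[str]]]:
    """Build one word -> accessions index per field."""
    indexes: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for entry in entries:
        for field, value in entry.items():
            if field == "accession":
                continue
            for word in re.findall(r"[a-z0-9]+", value.lower()):
                indexes[field][word].add(entry["accession"])
    return indexes


def search_all_fields(indexes: Dict[str, Dict[str, Set[str]]],
                      words: List[str]) -> Set[str]:
    """Accessions whose fields, taken together, contain every word (logical AND)."""
    result = None
    for word in words:
        matches: Set[str] = set()
        for field_index in indexes.values():
            matches |= field_index.get(word.lower(), set())
        result = matches if result is None else result & matches
    return result or set()


indexes = build_indexes(ENTRIES)
sos_hits = search_all_fields(indexes, ["SOS", "regulon", "coli"])
regulator_hits = search_all_fields(indexes, ["transcriptional", "regulator"])
print(sorted(sos_hits), sorted(regulator_hits))
```

Because each lookup is a dictionary access followed by set intersection, no entry has to be scanned at query time, which is why the explanation calls finding the sequences simple.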

50.

For computer analysis of proteins, it is more convenient to use single-letter than three-letter amino acid codes.
(a) True
(b) False

This question is from the Sequence Formats & Computer Storage of Sequences topic in the section Collecting & Storing Sequences in Laboratory of Bioinformatics.

Answer»

Right option is (a) True

To elaborate: For example, GenBank DNA sequence entries contain a translated sequence in single-letter code. The standard, single-letter amino acid code was established by a joint international committee.
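As a small illustration of why the single-letter code is convenient for programs, the sketch below converts a hyphen-separated three-letter peptide into a compact string. The mapping covers the 20 standard amino acids; the function name and the input format are assumptions made for this example.

```python
# Standard three-letter to single-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_single_letter(peptide: str, sep: str = "-") -> str:
    """Convert e.g. 'Met-Ala-Gly' to 'MAG'; raises KeyError on unknown residues."""
    return "".join(THREE_TO_ONE[residue.capitalize()]
                   for residue in peptide.split(sep))


short_form = to_single_letter("Met-Ala-Gly-Trp-Lys")
print(short_form)  # MAGWK
```

With one character per residue, a position in the string is also a position in the protein, which keeps alignment and searching code simple.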