Solution Found!

Sequence alignment. When a new gene is discovered, a standard approach to understanding

Chapter 6, Problem 6.26

(choose chapter or problem)

Get Unlimited Answers
QUESTION:

Sequence alignment. When a new gene is discovered, a standard approach to understanding itsfunction is to look through a database of known genes and find close matches. The closenessof two genes is measured by the extent to which they are aligned. To formalize this, think ofa gene as being a long string over an alphabet = {A, C, G, T}. Consider two genes (strings)x = ATGCC and y = TACGCA. An alignment of x and y is a way of matching up these twostrings by writing them in columns, for instance: A T G C CT A C G C AHere the indicates a gap. The characters of each string must appear in order, and eachcolumn must contain a character from at least one of the strings. The score of an alignment isspecified by a scoring matrix of size (|| + 1) (|| + 1), where the extra row and column are toaccommodate gaps. For instance the preceding alignment has the following score:(, T) + (A, A) + (T, ) + (, C) + (G, G) + (C, C) + (C, A).Give a dynamic programming algorithm that takes as input two strings x[1 . . . n] and y[1 . . . m]and a scoring matrix , and returns the highest-scoring alignment. The running time should beO(mn).

Questions & Answers

QUESTION:

Sequence alignment. When a new gene is discovered, a standard approach to understanding itsfunction is to look through a database of known genes and find close matches. The closenessof two genes is measured by the extent to which they are aligned. To formalize this, think ofa gene as being a long string over an alphabet = {A, C, G, T}. Consider two genes (strings)x = ATGCC and y = TACGCA. An alignment of x and y is a way of matching up these twostrings by writing them in columns, for instance: A T G C CT A C G C AHere the indicates a gap. The characters of each string must appear in order, and eachcolumn must contain a character from at least one of the strings. The score of an alignment isspecified by a scoring matrix of size (|| + 1) (|| + 1), where the extra row and column are toaccommodate gaps. For instance the preceding alignment has the following score:(, T) + (A, A) + (T, ) + (, C) + (G, G) + (C, C) + (C, A).Give a dynamic programming algorithm that takes as input two strings x[1 . . . n] and y[1 . . . m]and a scoring matrix , and returns the highest-scoring alignment. The running time should beO(mn).

ANSWER:

Step 1 of 2

The two sequences given are and with sizes  and  respectively. We are defining a matrix that stores the score of aligning the sequence. Each value  defines the score of aligning the sequence  with the sequence. We can define the recursive function to find the score of alignment as follows:

The case defines that the alignment of the sequence is copied as it is till the second last element and align the characters and. Theis determined based on the match or miss of aligning the sequence.

The case is to align with a gap.

The case is to align with a gap.

Add to cart


Study Tools You Might Need

Not The Solution You Need? Search for Your Answer Here:

×

Login

Login or Sign up for access to all of our study tools and educational content!

Forgot password?
Register Now

×

Register

Sign up for access to all content on our site!

Or login if you already have an account

×

Reset password

If you have an active account we’ll send you an e-mail for password recovery

Or login if you have your password back