Week 2 - bio-computing
Week 2 - bio-computing CSE 6613
These 9 pages of class notes were uploaded by Marina Notetaker on Wednesday, August 24, 2016. The notes belong to CSE 6613 at Mississippi State University, taught by Andy Perkins in Fall 2016.
Date Created: 08/24/16
August 22, 2016 - Bio-computing Class

1) Testing an assignment or conditional statement
"if" is used to test the thing you just assigned, such as a = 4. When using "if", the value you want to check goes after a double equals sign (==). So "==" is the true-or-false test, while a single "=" is an assignment.
"else" covers the case where the "if" test is false. Because "else" and "if" belong to the same test (on a = 4), they must have the same tab distance. Each tab distance marks one block. These are called indentation levels.
In this example we are testing if a is equal to 5. If the == test is true, it prints the first message (the test was true); otherwise it prints that it is false.

2) Multiple indentation levels
This example has two indentation levels. The first tests whether a > 3 is true and the second tests whether a < 10. Remember that when a second indentation level is inserted, the print statement must move one tab further in.
If the first condition is false, the second indentation level is never reached. So if a = 1, then a > 3 is false and the second test (a < 10) is not performed, because it is nested inside the first one. It is only tested if the first one is true.
Statements at the same indentation level must have the same tab distance. A tab is a single character, which means it is not the same as typing 4 spaces. Mixing tabs and spaces in the same block can cause an error, and some editors insert spaces when you press Tab, so pick one and use it consistently.

3) OR: testing more than one statement
In this case the first statement (a > 3) can be true or false, and the second statement (a < 1) can also be true or false. The whole OR test is true if at least one statement is true. Writing the two statements as P and Q, the possible results are:

P      Q      P or Q
True   True   True
True   False  True
False  True   True
False  False  False

It will only be false if both statements are false. That's the only way that OR can be false. If the whole test is false (and there is no "else"), Python won't print anything!

4) AND: testing more than one statement, second option.
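A minimal sketch of the "or" test from section 3 next to the "and" test, assuming a = 4. The value and the messages are illustrative, not taken from the class example:

```python
a = 4  # assumed value, chosen for illustration

# "or": true when at least one side is true
or_result = a > 3 or a < 1     # True or False

# "and": true only when both sides are true
and_result = a > 3 and a < 1   # True and False

if or_result:
    print("or test passed")    # this line prints
if and_result:
    print("and test passed")   # nothing prints, and there is no else
```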
Using "and" means that it will print only if BOTH statements are true. For example, it didn't print anything because a < 1 is false, even though a > 3 is true. You can chain as many "or"/"and" tests as you want: a > 2 or a < 1 or a > 1, etc.

5) Assignment statements using strings
An assignment statement can be either numerical (a = 5) or a string (a = 'ATT'). A string in Python is not just letters: it is an object, and it gives us methods to work on that string.
If you want to type the initial value in from the keyboard instead, use input('..'). In this case Python will ask you to enter the sequence and then it will answer. Remember that if you use upper case in your code, upper case must also be used in what you type, and vice versa.
The example tests whether the input is a start codon or a stop codon. If it is neither a start codon nor a stop codon (for example, GGT), nothing is printed. However, if the input sequence is true for any of the conditions, the matching message is printed.
"elif" is the same as an "if" statement inside an "else" branch. So the same test can be written with elif, and you can go on testing each sequence you have. It will tell you if it's a start codon or a stop codon, and if it is neither (such as AAA) it won't say anything.
Another, but wrong, way to do it is to put conditionals inside conditionals: in that version, printing "it's not a stop or start codon" depends on the outcome of the first "if" and its "else". So instead of using a conditional inside a conditional, use elif. You can have as many "if"/"elif" tests as you want, but only one "else", because else covers every other case.

6) Assignment statements written directly into the Python program: upper, lower and title case
Whatever your statement is, it should be written as name = value. For a string, don't forget the quotes ('..') or the =. Example of an assignment statement: a = 5 or, if it's a string, a = 'ATT'.
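A sketch that ties the codon test from section 5 to string assignment. The helper name classify_codon is hypothetical, the codon lists (ATG for start; TAA, TAG, TGA for stop) are the standard DNA codons rather than something listed in the notes, and unlike the class example it returns a label for the "neither" case instead of printing nothing:

```python
def classify_codon(codon):
    """Return a label for a 3-letter DNA codon (hypothetical helper)."""
    codon = codon.upper()  # normalize so 'atg' and 'ATG' both match
    if codon == 'ATG':
        return 'start codon'
    elif codon == 'TAA' or codon == 'TAG' or codon == 'TGA':
        return 'stop codon'
    else:
        return 'neither'   # else covers every other case

a = 'ATT'  # string assignment: the value goes in quotes
print(classify_codon('atg'))  # start codon
print(classify_codon('TAA'))  # stop codon
print(classify_codon(a))      # neither
```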
Notice that the string is given in quotes ('..'). When you type mystr (the name of the variable), Python gives you what it is equal to.
Turn everything to upper case: .upper()
Lower case: .lower()
Title case: .title()
If you have already turned your string into title case and ask a second time, nothing happens, because it is already in title case. These methods return a new string; the original value is only gone if you assign the result back to the same name (mystr = mystr.title()).

7) Turning keyboard input (a string) into a number
Put int(..) around the input at the beginning of your statement, or put int(..) around the string at the point where you test it. That is, you can keep the input as a string and convert it only where you compare it with a number (the first test example).
However, int turns the input into a whole number. You cannot enter a number like 5.5 (a negative whole number such as -5 is accepted). If you want to be able to enter a decimal number, use float instead of int. When using float, you don't need int anywhere, because the value is already numerical. If you input 5 using float and multiply it by 5, it answers 25.0, because it becomes a decimal number.
Other ways to write the same thing:
*= means that whatever the number was before, it is multiplied by 5 (or whatever value you give), and the variable then holds the result.
The != operator: the "!" means "not", so != tests whether something is NOT equal. So it mainly says "if number is not equal to 15, print it". Another way to do it: if not number == 15, which is the same thing as using !=.

8) Notes for the test:
a. Code segment: a piece of code that has a particular purpose. For example, the last picture is a code segment that inputs a floating point number, multiplies it by 5 and checks if it is not equal to 15. So a code segment is the minimum amount of code that is necessary to do what is being asked.
b. When answering a quiz or test, don't do things that were not asked. Example: write a code segment that multiplies a number by 5 and checks if it is not equal to 15.
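A sketch of what such a minimal code segment could look like, with the number assigned directly rather than read from the keyboard. The starting value 4.0 is an assumption:

```python
number = 4.0       # assumed starting value, assigned directly
number *= 5        # same as number = number * 5
if number != 15:   # same test as: if not number == 15
    print(number)  # prints 20.0
```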
In this case the keyboard-input version in the last picture is too much, because you are not being asked to get the number from the keyboard.

9) Loops, looping and conditional statements
"while" is the way you do looping. Loops and conditional statements form most of the structure of the programs you write (they are the base of the structure of a program), and any loop can be expressed as a "while" statement.
The while statement checks a condition (for example, whether count is less than 10) and repeats its block for as long as the condition is true. In the same way, if the condition stays true and you don't tell it where it should stop, it will keep counting forever. The first statement sets the starting number. Counting down is done with a ">" test (for example, while count > 0).
This code says: start counting from 10 and stop before 1, printing the number outside the loop.
This other code starts at 1 and each time multiplies the number by 2 until 1,000 is passed. At the end it prints the excluded last number (outside the loop), because it was the first result larger than 1,000 (1024). It counts by powers of 2. Counting up is done with a "<" test (for example, while number < 1000).
An infinite loop is when the while condition is always true, so it never stops counting. Such code will never stop running. To stop it, close your window or press Ctrl+C.
To run a loop with the start number typed in from the keyboard: this code counts down from the number you typed until 0. "count -= 1" means the count goes down by one each time. If you write "+=" instead, it counts up, but it runs forever, because any positive number you type will always stay greater than 0.

August 24, 2016

1) Identifying part of a string
First identify your string. Remember, a space in your string counts as a character. For example, "hello there" has 11 characters.
a. Identifying the letter at a specific location in a string: string_name[location]. Remember the first letter is position zero! Negative numbers count from the last letter back to the first.
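A short sketch of indexing, using the string "hello there" from above under the name greeting. The positions chosen are illustrative:

```python
greeting = "hello there"   # the space counts as a character

print(len(greeting))   # 11
print(greeting[0])     # h  (first letter is position zero)
print(greeting[4])     # o
print(greeting[-1])    # e  (last letter)
print(greeting[-11])   # h  (counting back reaches the first letter)
```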
So in this case, the example asks which letter is located at position [x] in my string named greeting. Negative numbers give locations starting from the last letter (e) back to the first letter (h).
b. Pulling out the first codon, or any run of elements inside a string: string_name[start:end]
So to get a codon of a gene: gene[1:4] or gene[x:x+3]
Getting from a specific position until the end of a gene: gene[x:]
Getting the gene from the beginning until a specific position: gene[:x]
Zero is the position of the first letter. So if the first codon starts at the first letter, it is gene[:3]. Then the second codon is gene[3:6], and so on. If you write gene[0:2], it gives you the 1st and 2nd letters, but not the 3rd. So it is string[start:end], in which the end position is excluded. The numbers between [ .. ] are called the INDEX.
c. String case methods: string.lower(), string.upper()
d. string.find(): searches for a string inside another string. It tells you the position where the string starts (or -1 if it is not found).
- It can be used to find a particular motif in a protein.
e. Other string tests: isalpha(), isdecimal(), isdigit(). If someone else typed in the input, you can use these to find out what kind of characters it contains. You can do all kinds of things with strings: convert them to numbers, search them, etc.

2) Parts of a loop program
a. Initialization: where you set the initial value
b. Test: whatever you want to know is true
c. Increment: comes after the test, the "extra" part (count += 1)
This program counts up, always adding 1 for each next number, so 0, 1, 2, ... and not including 10.

3) FOR ... IN RANGE does the same thing as "while" does. It gives the loop, but you state the range. Instead of count < 10, you would say "in range(0, 10)". Remember, the end number is excluded, so 10 is excluded.
To print the label "count is" as well, write print('count is', count) and not only print(count).
Functions and methods work on things like strings and lists of numbers.
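A sketch comparing the two loop forms described so far: a while loop with its initialization, test and increment, and the equivalent for ... in range loop. The list names are hypothetical and are only there to collect the values each loop visits:

```python
# while version: initialization, test, increment
count = 0                      # initialization
while_values = []
while count < 10:              # test
    while_values.append(count)
    count += 1                 # increment

# for version: range(0, 10) supplies 0 through 9, end excluded
for_values = []
for count in range(0, 10):
    for_values.append(count)
    print('count is', count)

print(while_values == for_values)  # True
```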
range(0, 3) is a sequence of numbers: 0, 1 and 2. "for count in range(0, 10)" counts numbers starting from 0 and ending on 9.
You can make the step something other than +1 by telling the for loop so in range: "in range(start, end, how much you want it to increase)". So "in range(0, 10, 2)" starts at 0 and counts up by 2 until it reaches the largest number before 10, which is 8. The same can be done the "while" way, just by using "count += 2". If you need decimal steps, use while, for example "count += 0.1", because range only accepts whole numbers.
"for index in range(start, end)" is used for getting letters (by index) from a string, starting and ending at the indicated locations. So you can enter a string of any size and ask for the letters from 0 to 6, excluding 6.

4) Counting the number of times each letter appears in a string
a. First way to do it: faster for the computer to process, but longer to write down.
In this case you want to count the number of A, C, T and G in a string from location 0 to 6. The bad count is whatever appears in the string and is not A, C, T or G. The first time it reports 6 errors, because I typed the string in upper case and the program was looking for lower case, so every letter was something other than a, c, t and g. The second time, it counts just fine.
Don't forget that all the "if" statements are inside the "for index" loop, so they need one more tab. You need to set each counter to zero first (Xcount = 0), otherwise it doesn't work.
b. The second way to count letters is easier to write, but can take longer to process depending on the size of your string. It can take longer because it runs through the string 4 times to give the answer (once for a, then for c, etc.), while the first way counts A, G, T and C together in one run.

5) Ratio of string content
For GC content, it would be (G + C) / (all letters, including G and C). GC content is the percentage of Gs and Cs in an organism's DNA. In the example above it is 53.3%. GC content is an important thing to know.
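A sketch of the one-pass count and the GC content calculation. The sequence is an assumption (picked so that 8 of its 15 letters are G or C), and the counter names are hypothetical:

```python
dna = "ATGCGCGTAGGCTAA"  # assumed example sequence, 15 letters

# one pass over the string, one counter per letter, all starting at 0
a_count = c_count = g_count = t_count = bad_count = 0
for index in range(0, len(dna)):
    letter = dna[index]
    if letter == 'A':
        a_count += 1
    elif letter == 'C':
        c_count += 1
    elif letter == 'G':
        g_count += 1
    elif letter == 'T':
        t_count += 1
    else:
        bad_count += 1   # anything that is not A, C, G or T

# GC content: Gs plus Cs over all letters, as a percentage
gc_content = (g_count + c_count) / len(dna) * 100
print(a_count, c_count, g_count, t_count, bad_count)  # 4 3 5 3 0
print(round(gc_content, 1))                           # 53.3
```

If the string were typed in lower case, every letter would land in bad_count, which is the mistake described in section 4.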
Some organisms have a different GC content than others. The same happens in different regions: introns and exons have different GC content. So it's used to tell us what type of genome feature we might be looking at.

6) Length of a string = len()
This code only asks to print(len(string)). See above.

7) Giving codons with and without overlapping
BLAST works by looking for a sequence (the query sequence) in a database. BLAST does a fuzzy search. One way it works is by breaking your sequence into words, generally 3-letter words, which can overlap. Then it searches for matching pieces in the database. Once it finds a match, it starts expanding.
This code breaks the sequence into 3-letter words. When you say "for index in range(0, len(dna)):" it means that you will work along the whole length of the sequence. "word = dna[index:index+3]" gives you the word starting at index, 3 letters at a time (at index 0 it includes letters 0, 1 and 2). However, the end gets weird, because the last slices ask for 3 characters even though there are not enough letters left, so the last two words come out short (for example "TA" and then "A").
To end early, so it won't produce these last 2 words, use "for index in range(0, len(dna)-2):"
To NOT overlap: just add 3 each time. So, "for index in range(0, len(dna)-2, 3)". THAT'S HOW TO EXTRACT CODONS!
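A sketch of both loops described above, overlapping words and non-overlapping codons. The sequence and the list names are assumptions for illustration:

```python
dna = "ATGGCTTAA"  # assumed example sequence, 9 letters

# overlapping 3-letter words: stop 2 early so every word is full length
words = []
for index in range(0, len(dna) - 2):
    words.append(dna[index:index + 3])

# non-overlapping codons: same loop, stepping by 3
codons = []
for index in range(0, len(dna) - 2, 3):
    codons.append(dna[index:index + 3])

print(words)   # ['ATG', 'TGG', 'GGC', 'GCT', 'CTT', 'TTA', 'TAA']
print(codons)  # ['ATG', 'GCT', 'TAA']
```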