New User Special Price Expires in

Let's log you in.

Sign in with Facebook


Don't have a StudySoup account? Create one here!


Create a StudySoup account

Be part of our community, it's free to join!

Sign up with Facebook


Create your account
By creating an account you agree to StudySoup's terms and conditions and privacy policy

Already have a StudySoup account? Login here

Networking III

by: Janiya Renner MD

Networking III CSCI 553

Janiya Renner MD

GPA 3.62

Derek Harter

Almost Ready


These notes were just uploaded, and will be ready to view shortly.

Purchase these notes here, or revisit this page.

Either way, we'll remind you when they're ready :)

Preview These Notes for FREE

Get a free preview of these Notes, just enter your email below.

Unlock Preview
Unlock Preview

Preview these materials now for free

Why put in your email? Get access to more of this material and other relevant free materials for your school

View Preview

About this Document

Derek Harter
Class Notes
25 ?




Popular in Course

Popular in ComputerScienence

This 244 page Class Notes was uploaded by Janiya Renner MD on Friday October 30, 2015. The Class Notes belongs to CSCI 553 at Texas A&M University - Commerce taught by Derek Harter in Fall. Since its upload, it has received 30 views. For similar materials see /class/232408/csci-553-texas-a-m-university-commerce in ComputerScienence at Texas A&M University - Commerce.

Popular in ComputerScienence


Reviews for Networking III


Report this Material


What is Karma?


Karma is the currency of StudySoup.

You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!

Date Created: 10/30/15
CSCI 414 553 Networking III Unix Network Programming pring 2008 Compiling Linking amp Libraries Today39s Agenda I First a bit more about version control I Check subversion task from Thursday I Compilers Linking amp Libraries I Debuggers I Discuss Lab 2 g Today39s Agenda I Before we begin or after class to try on your own cd mkdir compilers cd compilers tar xvfz homecsci553classfilescompilerstgz Introduction I C is a powerful systems level programming language I It is the fundamental language of Unix I It is a source for many of the paradigms of programming that have found there way into modern languages I So a basic familiarity with C concepts is fundamental to being a good programmer I Linking and Libraries are parts of good modular programming Execution Cycle I C is an example of a sturdy language SWC I The full cycle 1 Edit 2 Compile 3 Link 4 Debug Gotol Preprocessing Assembly C Compiler I Free implementation of c complier GNU c gcc is ubiquitously available I cc command in most Linux distributes is simply a symbolic link to gcc I gcc can do all 4 steps in 1 invocation preprocessing compiling assembling linking I Or can be stopped at any stage Compiling a C Program I SingleModule Selfcontained program 39 Single source file vi reversec editing session cc reversec Is total 12 rwxrwxrx 1 dharter dharter 5047 20080124 1514 aout rwrwr 1 dharter dharter 504 20080124 1513 reversec Running a C Program r any executable program for that matter aout bash aout command not found aout reverse quotcatquot tac reverse quotnoonquot noon I Solution 1 Add or directory where executable is to PATH 2 give a correct relative or absolute path name to executable Running a C Program esides finding the executable the executable scriptprogram needs to be well executable chmod ugo x aout ls total 12 rwrwr 1 dharter dharter 5047 20080124 1514 aout rwrwr 1 dharter dharter 504 200801 24 1513 reversec aout bash aout Permission denied Options for cc 397 c on many systems now adays is a symbolic link to the gnu gcc compiler I man gcc for options I Override the default executable name gcc reversec 0 reverse ls l total 20 rwxrwxrX 1 dharter dharter 5047 20080124 1524 reverse rwrwr 1 dharter dharter 504 200801 24 1513 reversec reverse reverse quotcatquot tac reverse quotnoonquot noon g Multimodule Programs 7 eal world software needs to be modular I Break into functions modules packages etc I Breaking related functions functionality into separate files enhances reusability I In vanilla c world link into standard system libraries I Similar concept to packages in Java Python Compiling Separate Files Modules I e ault gcc behavior given a bunch of source files I attempt to compile them all individually and link together gcc reversec mainc ls total 20 rwxrwxrx 1 dharter dharter 5130 20080124 1535 aout rwrwr 1 dharter dharter 265 20080124 1534 mainc rwrwr 1 dharter dharter 241 20080124 1532 reversec rwrwr 1 dharter dharter 114 20080124 1533 reverseh aout reverse quotcatquot tac reverse quotnoonquot noon Compiling and Linking g Separately 39 gcc options allow you to stop the process at any step preprocessing assembling the object files linking etc I compile source files and stop at object compilation gcc c mainc gcc c reversec ls l o rwrwr 1 dharter dharter 1060 20080124 1551 maino rwrwr 1 dharter dharter 783 200801 24 1551 reverseo Use gcc to Link object files I gcc can take object files and link them into an executable for you gcc reverseo maino 0 reverse reverse reverse quotcatquot tac reverse quotnoonquot noon The StandAlone g LinkerLoader I When gcc is used to link object files it transparently invokes Id I Often never need to directly invoke ld I But it is wise to know a little bit about it I Especially if you want to create linkable libraries The StandAlone Linker Loader Id ehframehdr m efi386 hashstyegnu dynamicIinker ibdinuxso2 usribcrt1o usribcrtio usribgcci386redhatIinux412crtbegino Lusribgcci386redhatIinux412 Lusrib reverseo maino gcc as needed gccs no asneeded c usribgcci386redhatIinux412crtendo usribcrtno 0 reverse Creating Stand Alone Libraries Archives I Archiver tool ar create stand along libraries I static libraries a I dynamic link shared object libraries 50 Creating a Static Library 39 Compile object files gcc c palindromec reverse c 39 Use ar to create library libreversea ar crv libpalindromea reverse 0 palindromeo a reverse 0 a palindromeo I Link an executable using a library I L directory where to find library I directory where to find include file header gcc c mainc gcc o palindrome maino L 1palindrome palindrome cat palindrome O cattac palindrome 1 g Library Naming Conventions I Libraries are named with prefix lib I The following command gcc srcfilec lm lpthread I usrliblibma I usrliblibpthreada I Standard system library locations I lib I usrlib Using an ArchiveLibrary I Many standard libraries are already available I usrliblibmatha Standard c math functions gcc c mathc gcc 0 math matho 1m gcc and ar options ommand Option Purpose gcc c Stop at compilation phase eg do not link executable 0 file Place output in file Default is fileo if only compling object file aout if linking executable I dir Search dir for include header files L dir Search dir for libraryarchive files to link against lname Link executable with library named usrliblibnamea g Produce debugging information in the object file The second ggdb form produces extra information useful for gdb debugger ar c Create a new archive r Insert files into archive with replacement v Be verbose give information about files being placed in archive Summary I c language concepts pervade modern programming systems and languages I plain c is modular I UnixLinux use libraryarchive concepts to share modular code libraries The Transport Layer UniX Network Programming Ch 2 I Today39s Agenda A Word about Projects Lab 7 Basic Sockets amp Transport Layer Protocols Stevens Ch 2 Intro to the Transport Layer protocols TCP amp UDP I Reminder Lab 6 due Thursday Fixing the Unfairness of TCP Congestion Control httpblogszdnetcomOup1078 OSI Model A common way to describe layers in a network International Organization for Standardization ISO open systems interconnection OSI model for computer communcations application presentatiOn session transport network datalink physical OSI model application TCP P IPv4 IPv6 device driver and hardware Internet Protocol suite application details user process sockets XTI kernel communications details I Transport Layer protocols in the TCPIP suite Goal is to provide enough detail from a network programming perspective to understand how to use the protocols effectively 3 transport layer protocols we will discuss TCP Transmission Control Protocol UDP User Datagram Protocol SCTP Stream Control Transmission Protocol I This lecture provides an overview of the I Transport Layer Protocols UDP A simple unreliable datagram protocol Can only send a package datagram of data over an I established link No guarantee that package will reach its intended des na on TCP reliable bytestream protocol send and receive a stream of bytes over an established link TCP handles breaking down stream into packets sending then reassembling them TCP reliable makes sure all packets are successfully received makes sure received and put back together in order I Transport Layer Protocols is a newer transport layer protocol developed for telephony applications and with IPv6 in mind similar to TOP as it is a reliable transport protocol provides message boundaries in TCP application level needs to agree on message boundaries of the stream and other improvements performance improvements multihoming I SCTP I Motivation understood make it easier for us to write robust clients and servers When we understand these features it becomes easier to debug our CS using common tools like netstat I There are features of TCPUDP that when The Big Picture Although the protocol suite is called TCPIP there are more members of this family than just TCP and IP IPv4 applications IPV6 applications AFNET AFNET6 sockaddrin sockaddrin6 tcp trace trace dump routed ping route tappl appl appl appl appl appl route ping l V Aim ICMP TCP SCTP UDP IGMP IPv4 IPv6 39C39g39PV 32bit ad It RARP addresses SEE datalink The Big Picture Although the protocol suite is called TCPIP there are more members of this family than just TCP and IP lPV4 applications IPv6 applications AFINET AFINET6 A sockaddrin A sockaddriin6 tcp m trace 39 trace sockets 7 7 7 7 iffieieiervr r 7 API I 1 TCP 5cm ICMP 32bit 128bit ICMP IGMP addresses addresses ARP RARP Bl F data DLPI ll1 Lllt Figure 21 Overview of TCPP protocols Internet Protocol Suite IPv4 Internet Protocol version 4 IPv4 often denoted simply IP has been the workhorse protocol of the IP suite since the early 1980s It uses 32bit addresses IPv6 Internet Protocol version 6 IPv6 was designed in the mid1990s as a replacement for IPv4 The major change is a larger address comprising 128 bits to deal with the explosive growth of the internet in the 1990s TCP Transmission Control Protocol TCP is a connection oriented protocol that provides a reliable fulldublex byte stream to its users TCP sockets are an example of stream sockets TCP takes care of details such as acknowledgements timeouts retransmissions and the like Internet Protocol Suite UDP User Datagram Protocol UDP is a connectionless protocol and UDP sockets are an example of datagram sockets There is no guarantee that UDP datagrams ever reach their intended destination SCTP Stream Control Transmission Protocol SCTP is a connectionoriented protocol that provides a reliable full duplex association The word association is used when referring to a connection in SCTP because SCTP is multihomed involving a set of IP addresses and a single port for each side of an association SCTP provides a message service which maintains record boundaries Internet Protocol Suite ICMP Internet Control Message Protocol ICMP handles error and control information between routers and hosts These messages are normally generated by and processed by the TCPIP networking software itself not user processes IGMP Internet Group Management Protocol IGMP is used with multicasting ARP Address Resolution Protocol ARP maps an IPv4 address into a hardware address such as an Ethernet address MAC ARP is normally used on broadcast networks such as Ethernet token ring and FDDI RARP Reverse Address Resolution Protocol RARP maps a hardware address into an IPv4 address It is sometimes used when a diskless node is booting I Internet Protocol Suite version 6 ICMPV6 combines the functionality of ICMPV4 IGMP and ARP BPF BSD packet lter This interface provides access to the datalink layer It is normally found on Berkeleyderived kernels DLPI Dataink provider interface This interface also provides access to the datalink layer It is normally provided with SVR4 derived kernels I lCMPv6 Internet Control Message Protocol User Datagram Protocol UDP Simple transportlayer protocol Application writes a message to a UDP socket which is then encapsulated in a UDP datagram which is then sent to destination there is no guarantee that a UDP datagram will ever reach its nal destination that order of datagrams will be presened or that datagrams arrive only once Each UDP datagram has a length length is passed to the receiving application along with data datagram has message boundaries length included in datagram connectione33 service Transmission Control Protocol TCP TCP provides connections between clients and servers TCP provides reliability When TCP sends data to the other end it requires an acknowledgment in return If acknowledgment is not received TCP automatically retransmits the data and waits a longer amount of time After some number of retransmissions TCP will give up provides reliable data delivery or reliable noti cation of failure I TCP algorithms to estimate the roundtrip time RTT I dynamically estimates how long to wait for acknowledgements sequences data by associating a sequence number with every byte that it sends if segments arrive out of order receiving TCP will reorder the segments based on the sequence numbers if TCP receives duplicates it can detect because of duplicate segment numbers and discard duplicates TCP TCP provides flow control TCP tells its peer exactly how many bytes of data it is willing to accept advertised window prevents overflowing the receiver application before it can process data TCP connection is fullduplex application can send and receive data in both directions on a given connection at any time this means that TCP must keep track of state information sequence numbers and window sizes for each direction of data flow Stream Control Transmission Protocol SCTP Like TCP provides applications with reliability sequencing flow control and fullduplex data transfer Provides associations between clients and servers connection implies communication between only two IP addresses an association refers to a communication between any two systems which may involve more than two addresses due to multihoming Unlike TCP SCTP is messageoriented it provides a sequenced delivery of individual records like UDP the length of a record written by the sender is passed to the receiving application SCTP SCTP can provide multiple streams between connection endpoints each with its own reliable sequenced delivery of messages A lost message in one of these streams does not block delivery of messages in any other stream in contrast to TOP where a lost message blocks delivery of all future data on the connection until the loss is repaired SCTP provides multihoming allows single SCTP endpoint to support multiple IP addresses increased robustness against network failure TCP Connection Establishment and Termination In order to help understand connect accept and close functions of sockets debug TCP applications using the netstat program We must understand how TCP connections are established and terminated and the TCP39s state transition diagram TCP connect The following scenario occurs when a TCP connection is established ThreeWay Handshake Server must be prepared to accept an incoming connection by calling socket bind and listen Client issues an active open by calling connect Causes TCP to send a synchronize SYN segment which tells the server the client39s initial sequence number for the data that the client will send on the connection Server must acknowledge ACK the client39s SYN and the server must also send its own SYN containing the initial sequence number for the data that the server will send on the connection Client must acknowledge the servers SYN TCP ThreeWay Handshake client server socket socket bind connect listen blocks active open connect returns k CK J k k passive open accept blocks accept returns I TCP ThreeWay Handshake Server39s initial sequence number is K The acknowledgment number in an ACK is the next expected sequence for the end sending the ACK I Client39s initial sequence number is J I TCP Options I Each SYN can contain TCP options Commonly used options include MSS option Maximum Segment Size maximum amount of data willing to accept in each TCP segment TCPMAXSEG socket option Window scale option setting the window for flow control Timestamp option I TCP Connection Termination connection it takes four to terminate a connection One application calls close first and we say that this end performs the active close This end s TCP sends a FIN segment which means it is finished sending data The other end that receives the FIN performs the passive close The received FIN is acknowledged by TOP The receipt of FIN is also passed to the application as an endof file Sometime later the application that received the endof file will close its socket This causes its TCP to send a FIN The TCP on the system that receives this final FIN the end that did the active close acknowledges the FIN I While it takes three segments to establish a TCP Connection close client Close active Close k KM MN k server passive Close read application returns 0 Close I TCP Connection Close active close either end the client or server I Although we show the client performing the can perform the active close I TCP State Transition Diagram establishment and connection termination 11 different states defined for a TCP connection to establishterminate rules dictate transitions from one state to another based on the current state and the segment received in that state Further state needed for sendingreceiving data I Only shows states with regards to connection TCP State Transition Diagram starting point CLOSED I I appl passive open I send ltnothinggt I I passive open recv SYN appl close SYN SENT send SYN ACK f or timeout simultaneous open Cilve Open 960 8 q 10 v 006 ltf Q 504 r i n s I ESTABLISHED data transfer state CLOSEWAIT I l I I I 1 I appl I close I send FIN I J I I l I LAsLACK JETLAEE send ltnoth1nggt I i i i n i i i i T i n Simultaneous close recv FLN I gt send ACK CLOSENG recv ACK send ltnothjnggt I I I I I I I I passive ClOSE recv ACK send ltnothinggt HI recv FIN IT FLNWA 2 send ACK TIMEWA1T L E v T E 7 7 e J active close ZMSL timeout I I I I I I I I I I I I gt indicate normal transitionsfbr client gt indicate normal transitions or server app indicate state transitions ta 39en when application issues operation recv indicate state transitions taken when segment received send indicate what is sent for this transition Figure 24 TCP state transition diagram client socket c onne c t blocks active open SYNSENT ESTABLISHED conne c t returns ltclientf0rms requestgt write read blocks re a d returns close active close FINWAIT1 FINWAIT2 TIMEWAIT SYN Mss 536 SYN K ACK II M55 1460 ACK 1 data requeSt data reply AcK of request Of IEPIy W W W server socket bind listen LISTEN passive open accept blocks SYNRCVD ESTABLISHED accept returns read blocks read returns ltse739ver processes requestgt write read blocks CLOSEWAIT passive close read returns 0 Close LASTACIlt CLOSED Figure 25 Packet exchange for TCP connection client socket connect blocks active open SYNSENT ESTABLISHED conne c t returns ltClient forms reqiiestgt write read blocks read returns close active close FINWAIT1 FINWAIT2 TIMEWAIT SYN K ACK ll MSS 1460 ACK data request data re le AcK 0 request ACK Of reply W W W SEI39VEI39 socket bind listen LISTEN passive open accept blocks SYNRCVD ESIYXBIJSIIEI accept returns read blocks re ad returns ltserver processes requestgt write read blocks CLOSEWAII passive close read returns 0 close LASTACK CLOSED Figure 25 Packet exchange for TCP connection I I I applzlclose sendFIN I I 7 7 7 7 7 39 7 7 7 7 7 7 7 I recx FIN I recv ACK EVEK send ACK IsEndzqiot h i g L e e r e r e o starting point CLOSED appl passive open I send ltnothinggt I recv SYN send SYN ACK simultaneous open appl close or timeout send ACK I CLOSEWAIT l l I I I I I I I passive Close recv ACK send ltnothinggt ZMSL timeout active close gt indicate normal transitionsfor client gt indicate normal transitions or server appl indicate state transitions to en when application issues operation recv indicate state transitions taken when segment received send indicate what is sentfor this transition Figure 244 TCP state transition diagrami TIMEWAT State One of most misunderstood aspects of TCP with I regtard to network programming is its TIMEWAIT s a e Wet can see end that performs active close goes through this s a e Duration that this endpoint remains in this state is twice the maximum segment lifetime MSL Every implementation of TCP must choose a value for the MSL typically 2 minutes Berkeleyderived implementations use 30 seconds this means that duration in TIMEWAIT is between 1 and 4 minutes The MSL is supposed to represent the maximum amount of time that any given lP datagram can live in a network I TIMEWAT STATE state 1 To implement TCP s fullduplex connection termination reliably 2 To allow old duplicate segments to expire in the network I There are 2 reasons for the TIMEWAIT SCTP Association I Establishment and Termination SCTP Fourway handshake for association establishment 1 The server must be prepared to accept an incoming association using socket bind and listen passive open 2 The client issues an active open by calling connect or by sending a message which implicitly opens the association This causes the client SCTP to send an lNlT message which stands for initialization to tell the server the client39s list of IP addresses initial sequence number initiation tag number of outbound streams 3 The server acknowledges the client39s lNlT message with an lNlTACK message which contains the server39s list of IP addresses initial sequence number initiation tag number of outbound streams number of inbound streams and a state cookie The state cookie contains all of the state that the server needs to ensure that the association is valid and is digitally signed to ensure its validity 4 The client echos the server39s state cookie with a COOKIEECHO message This message may also contain user data bundled within the same packet 5 The server acknowledges that the cookie was correct and that the association was established with a COOKIEACK message This message may also contain user data bundled within the same packet STCP FourWay Handshake client socket connect blocks active open connect returns Ta CONEACK NIT Tami K TzK00 K e C TzCOOKiE ECHO C server socket bind listen passive open accept blocks accept returns read blocks SCTP FourWay Handshake Similar in many ways to TCP39s threeway handshake except for the cookie generation which is an integral part INIT carries a verification tag Ta and an initial sequence number J Initial sequence number J is used as the starting sequence number for DATA The verification tag Ta must be present in every packet sent by the peer for the life of the association Likewise the other end sends an lNlTACK and with it sends it own verification tag T2 and initial sequence number K Close active Close client DOWNACK SH UTDOWNCOMPLETE SCTP Association Termination server passive Close read application returns 0 Close I SCTP State Transition Diagram Shows only state transition of establishment I and termination of SCTP connections I SCTP Watching the Packets I Example of SCTP packet transfer Port Numbers At any given time multiple processes can be using any given transport UDP SCTP TOP All three transport layers use 16bit integer port numbers to differentiate between these processes Servers request a port well known services use wellknown ports Clients use ephemeral ports shortlived ports assigned automatically by the transport protocol to the client unique on the client39s host Port Numbers Port numbers are divided into three ranges 1 wellknown ports 0 through 1023 assigned by IANA 2registered ports 102449151 not controlled by IANA but registers and lists the uses of these pons 3 dynamic or private ports 49152 through 65535 I Socket Pair connection is the fourtuple that defines the two endpoints of a connection client IP client port server IP server port For SCTP an association is identified by a set of local IP addresses a local port a set of foreign IP addresses and a foreign port The two values that identify each endpoint an IP address and a port number are often called a socket I Terminology A socket pair for a TCP TCP Port Numbers and Concurrent Sewers Concurrent server main server loop spawns a child to handle each new connection what happens if the child continues to use the wellknown port number while servicing a request TCP Port Numbers and Concurrent Sewers 1o 19 o 115 1 192168 0 1 server 21 gt listening socket TCP server ftp with a passive open on port 21 10 33137 2 C nnection request 10 19 O 115 192168 0 1 O client 1033137149 1019011521 1 1 0190115 port 2 52 1 server 21 gtistening socket Connection request from client to server 3 TCP Port Numbers and Concurrent Sewers 10 33137 client K 1033137149152 10190115121 10 19 O 115 192168 0 1 server 321 3 gt listening fork socket server child 1019o11521 gt connected 11 f quot 40711f i4E 1 IQI I JIULI Concurrent server has child handle client socket TCP Port Numbers and Concurrent Sewers 10 33137 client 10331371491 1019011521 client 10331371491 1019011521 K K 52 10 19 O 115 192168 0 1 10 9011521 server 21 server child listening socket 033137149 b2 server child 1019011521 V 033137149153 connected socket connected socket Second client connection with same server I Buffer Sizes and Limitations can be dictated by hardware for example Ethernet MTU is 1500 bytes when an IP datagram is to be sent on an interface if the size of the datagram exceeds the link MTU fragmentation is performed I MTU Maximum Transmission Unit CSci 553 Artificial Intelligence Fall 2007 Lecture 5 Local Search and CSPs 10022007 Derek Harter Texas AampM University Commerce Many slides over the course adapted from Srini Narayanan Dan klein Stuart Russell and Andrew Moore Announcements 39 HW1 mini assignments 39 HW2 Thursday Local Search Methods 39 Queuebased algorithms keep fallback options backtracking 39 Local search improve what you have until you can t make it better 39 Generally much more efficient but incomplete Example NQueens What are the states What is the start What is the goal What are the actions What should the costs be Types of Problems 39 Planning problems 39 We want a path to a solution examples 39 Usually want an optimal path 39 Incremental formulations 39 Identi cation problems 39 We actually just want to know what the goal is examples 39 Usually want an optimal goal 39 Completestate formulations 39 Iterative improvement algorithms Example 4Queens States 4 queens in 4 columns 44 256 states Operators move queen in column Goal test no attacks Evaluation hn number of attacks Example NQueens h0 39 Start wherever move queens to reduce conflicts 39 Almost always solves large nqueens nearly Instantly Hill Climbing 39 Simple general idea 39 Start wherever 39 Always choose the best neighbor 39 If no neighbors have better scores than current quit 39 Why can this be a terrible idea 39 Complete 39 Optimal 39 What s good about it Hill Climbing Diagram objecti efunction oba maximum shoulder local maximum quotflatquot local maximum quot space curren state 39 Random restarts 39 Random sideways steps m msmbm 9a m mmmlt 30563 5 ea in 5 La 3 95 SEQ S Imam u m o msmu m v n O U 0 3 u A U I 00 0UDt 39 D U 039 39 n The Shape of a Yet Harder Problem Remedies to drawbacks of hill climbing 39 Random restart 39 Problem reformulation 39 In the end Some problem spaces are great for hill climbing and others are terrible Monte Carlo Descent 1 S initial state 2 Repeatktimes 3 b C d If GOALS then return 8 S 6 successor of S picked at random if hS S hS then S 6 8 else Ah hS39h with probability exp AhT where T is called the temperature 5 6 539 Metropolis criterion 3 Return failure Simulated annealing lowers T over the k iterations It starts with a large T and slowly decreases T Simulated Annealing 39 Idea Escape local maxima by allowing downhill moves 39 But make them rarer as time goes on function SIMI LATED ANNEALINGproblem schedule returns a solution state inputs problem a problem schedule a mapping from time to temperature local variables current a node next a node T a temperature controlling prob of downward steps current e llAKE NODEINITIAL STATEproblem for tlt 1 to 00 d0 TH scheduldt if T 0 then return current nerte a randomly selected successor of current AER VALUEne17t VALUEurrerzt if AE gt 0 then currentlt nert else curreutg n e rt only with probability 6A ET Simulated Annealin 39 Theoretical guarantee Em 39 Stationary distribution 1923 OC e W 39 If T decreased slowly enough will converge to optimal state 39 Is this an interesting guarantee 39 Sounds like magic but reality is reality 39 The more downhill steps you need to escape the less likely you are to every make them all In a row 39 People think hard about ridge operators which let you jump around the space In better ways Beam Search Like greedy search but keep K states at all times if o i o 30 o o o Greedy Search Beam Search Variables beam size encourage diversity The best choice in MANY practical settings Complete Optimal Why do we still need optimal methods Genetic Algorithms 32752411 32748552 327482 24748552 24752411 24752411 20 26 32752411 32752124 322124 11 14 24415124 24415411 6 244154 F ness Semc on Pans CrossOver 24748552 Genetic algorithms use a natural selection metaphor 39 Like beam search selection but also have painvise crossover operators with optional mutation Probably the most misunderstood misapplied and even maligned technique around Example NQueens Why does crossover make sense here When wouldn t it make sense What would mutation be What would a good fitness function be The Basic Genetic Algorithm 1 Generate random population of chromosomes 2 Until the end condition is met create a new population by repeating followmg steps 1 Evaluate the fitness of each chromosome Select two parentchromosomes from a population weighed by their fitness 2 3 With probability pC cross over the parents to form a new offspring 4 With probability pm mutate new offspring at each position on the chromosome 5 Place new offspring in the new population 3 Return the best solution in current population Search problems Blind search Heuristic search bestfirst and A Construction of heuristics Variants of A Local search Continuous Problems 39 Placing airports in Romania 39 States x1y1x2y2x3y3 39 Cost sum of squared distances to closest city Timisoara X 212 I Hirsova I Urziceni Bucharest Dobreta l l Eforie Grint Mths 39 How to deal with continous therefore infinite state spaces 39 Discretization bucket ranges of values 39 Eg force integral coordinates 39 Continuous optimization 39 Eg gradient ascent W 8f 8f 8f 8f 8f 8f 8178y178278y278378y3 a e I Iozfa 39 More later in the course l l Image from viasorg Constraint Satisfaction Problems Standard search problems 39 State is a black box any old data structure 39 Goal test any function over states 39 Successors any map from states to sets of states Constraint satisfaction problems CSPs 39 State is defined by variables Xi with values from a domain D sometimes D depends on i 39 Goal test is a set of constraints specifying allowable combinations of values for subsets of variables Simple example of a formal representation language Allows useful generalpurpose al orithms with more power than standard searc algorithms Examle Nueens 39 Formulation 1 39 Variables Xm 39 Domains 071 39 Constraints Wayk XijaXilc E 0707071170 Vij k Xijij E 07070717170 vm k X23 Xikjk E 0 O 0 1 1 0 vm k X23 Xikjk E 0 O 0 1 1 0 Z Xij N m Examle NQueens 39 Formulation 2 39 Variables Qk 39 Domains 111213 21 NN 39 Constraints Vij nonthreateningQ Qj there s an even better way What is it Exml MCln ng Variables WA NT NSW V SA T Domain D 2 red green blue Constraints adjacent regions must have different colors WAy NT WA NT E red green red blue green red Solutions are assignments satisfying a constraints eg WA 2 red NT 2 green Q 2 red NSW 2 green V 2 red SA 2 blue T 2 green Example The Waltz Algorithm The Waltz al solid polyhe An early example of a computation posed as a CSP orithm is for interpreting line drawings of ra Look at all intersections Adjacent intersections impose constraints on each other Waltz on Simple Scenes 39 Assume all objects 39 Have no shadows or cracks 39 Threefaced vertices 39 General position nojunctions change with small movements of eye Then each line on image is one ofthe followmg 39 Boundary line edge of an object a with right hand of arrow enoting SOIIdH and left hand 39 Interior convex edge 39 Interior concave edge LegalJunc ons Ornlylcelrltain jungltions are V V V p ysica y pOSSI e How can we formulate a CSP to V v V label an image Variables vertices Y Domains junction labels Constraints both ends ofa line 6 TV6 le should have the same label Example MapColoring 39 Solutions are complete and consistent assignments eg WA red NT greenQ redNSW greenV redSA blueT green Constraint Graphs 39 Binary CSP each constraint relates at most two variables C Constraint graph nodes are variables arcs show constraints Generalpurpose CSP algorithms use the graph structure to speed up search Eg Tasmania is an independent subproblem I mtg Examle Crytarithmetic 39 Variables FT U W R0 X1 X2 X3 39 Domains 0 12345 678 9 39 Constraints alldifFF T U W R 0 0 0 R 10 X1 Varieties of CSPs 39 Discrete Variables 39 Finite domains 39 Size d means 0d complete assignments 39 Eg Boolean CSPs including Boolean satisfiability NPcomplete 39 Infinite domains integers strings etc 39 Eg job scheduling variables are startend times for each job 39 Need a constraint language eg StartJob1 5 lt StartJob3 39 Linear constraints solvable nonlinear undecidable 39 Continuous variables 39 Eg startend times for Hubble Telescope observations 39 Linear constraints solvable in polynomial time by LP methods see cs170 for a bit of this theory Varieties f Cnstraints 39 Varieties of Constraints 39 Unary constraints involve a single variable equiv to shrinking domains SA green 39 Binary constraints involve pairs of variables 3147va 39 Higherorder constraints involve 3 or more variables eg cryptarithmetic column constraints 39 Preferences soft constraints 39 Eg red is better than green 39 Often representable by a cost for each variable assignment 39 Gives constrained optimization problems 39 We ll ignore these until we get to Bayes nets RealWorld CSPs Assignment problems eg who teaches what class Timetabling problems eg which class is offered when and where Hardware configuration Spreadsheets Transportation scheduling Factory scheduling Floorplanning Many realworld problems involve realvalued variables Standard Search Formulation 39 Standard search formulation of CSPs incremental 39 Let39s start with the straightfonNard dumb approach then fix it 39 States are defined by the values assigned so far 39 Initial state the empty assignment 39 Successor function assign a value to an unassigned variable 39 fail if no legal assignment 39 Goal test the current assignment is complete and satisfies all constraints Search Methods What does DFS do w 39 What s the obvious problem here 39 What s the slightly Iess obvious problem CSP formulation as search This is the same for all CSPs Every solution appears at depth n with n variables 9 use depthfirst search Path is irrelevant so can also use completestate formulation b n d at depth l hence n dn leaves Backtracking Search Idea 1 Only consider a single variable at each point 39 Variable assignments are commutative 39 le WA red then NT green same as NT green then WA red 39 Only need to consider assignments to a single variable at each step 39 How many leaves are there Idea 2 Only allow legal assignments at each point 39 le consider only values which do not conflict previous assignments 39 Might have to do some computation to figure out whether a value is ok Depthfirst search for CSPs with these two improvements is called backtracking search Backtracking search is the basic uninformed algorithm for CSPs Can solve nqueens for n z 25 Backtracking Search function BACKTRACKING SEARCHesp returns solutionfailure return RECURSIVE BACKTRACKING esp function RECURSIVErBACKTRACKING assignment esp returns soInfailure if assignment is complete then return assignment WITH SELECT UNASSIGNED VARIABLEVARIABLESesp assignment esp for each value in ORDER DOMAIN VALUESum assignment esp do if value is consistent with assignment given CONSTRAINTScsp then add ZHM value to assignment result e RECURSIVE BACKTRACKING assignment esp if result failure then return result remove var value from assignment return failure 39 What are the choice points Backtracking Example Improving Backtracking 39 Generalpurpose ideas can give huge gains in speed 39 Which variable should be assigned next 39 In what order should its values be tried 39 Can we detect inevitable failure early 39 Can we take advantage of problem structure Minimum Remaining Values 39 Minimum remaining values MRV 39 Choose the variable with the fewest legal values 4 39 Why min ratherthan max 39 Called most constrained variable 39 Failfast ordering Degree Heuristic 39 Tiebreaker among MRV variables 39 Degree heuristic 39 Choose the variable with the most constraints on remaining variables g 39 Why most rather than fewest constraints Least Constraining Value 39 Given a choice of variable 39 Choose the least constraining value 39 The one that rules out the fewest values in the remaining variables 39 Note that it may take some computation to determine this 39 Why least rather than most 39 39 Combining these heuristics makes 1000 queens feasible Forward Checking 39 Idea Keep track of remaining legal values for unassigned variables 39 Idea Terminate when any variable has no legal values C WA NT Q N SW V SA T Constraint Propagation FonNard checkin ropagates information from assigned to 1EmIaSSIgned varIa es but doesn39t prowde early detection for all aI ures WA NT Q NSW V SA T IIIIIIIIIIIIIIlllllllllllllll IIIIIIIIIIIIIII IIIIIII I I ll IIIIII llllll NT and SA cannot both be blue Why didn t we detect this yet Constraint propagation repeatedly enforces constraints locally Arc Consistency Simplest form of propagation makes each arc consistent 39 X e Y is consistent iff for every value x there is some allowed y de esan WA NT Q NSW V SA T I IIII IIIII IIIl W If X loses a value neighbors of X need to be rechecked Arc consistency detects failure earlier than fonNard checking What s the downside of arc consistency Can be run as a preprocessor or after each assignment Arc Consistency function AC S esp returns the CSP possibly with reduced domains inputs 65 a binary CSP with variables X1 X2 Xn local variables queue a queue of arcs initially all the arcs in esp While queue is not empty do X X lt REMOVE FIRSTqueue if REMOVE INCONSISTENT VALUESOQ Xj then for each Xk in NEIGHBORSPQ do add Xi to queue function REMOVE N ONSISTENT VALUESXi Xj returns true iff succeeds removed e false for each 1 in DOMAINlXi do if no value y in DOMAINlXj allows 1 y to satisfy the constraint Xi H X then delete J7 from DOMAINX1 removede true return removed 39 Runtime On2d3 can be reduced to On2d2 39 but detecting all possible future problems is NPhard why Problem Structure Tasmania and mainland are Independent subproblems Identifiable as connected compone of constraint graph Suppose each subproblem has c variables out of n total Worstcase solution cost is Oncdc linear in n 39 Eg n 80 cl 2 c20 39 280 4 billion years at 10 million nodessec 39 48220 04 seconds at 10 million no essec wC I gag TreeStructured CSPs 0 G 90 G G 39 Theorem if the constraint graph has no loops the CSP can be solved in On d2 time nex slide 39 Compare to general CSPs where worstcase time is Od 39 This property also applies to logical and probabilistic reasoning an important example of the relation between syntactic restrictions and the compleXIty of reasoning TreeStructured CSPs Choose a variable as root order a 6 variables from root to leaves such that every node39s parent precedes 9 0 it in the ordering a a Fori n 2 apply RemovelnconsistentParentXiXi Fori 1 n assign gt consistently with ParentXi Runtime On d2 Nearly TreeStructured CSPs 39 Conditioning instantiate a variable prune its neighbors39 domains 39 Cutset conditioning instantiate in all ways a set of variables such that the remaining constraint graph is a tree 39 Cutset size c gives runtime 0 dc nc d2 very fast for small c Iterative Algorithms for CSPs Greedyand local methods typically work with complete states Ie all variables aSSIgned To apply to CSPs 39 Allow states with unsatisfied constraints 39 Operators reassign variable values Variable selection randomly select any conflicted variable Value selection by minconflicts heuristic 39 Choose value that violates the fewest constraints 39 e hill climb with hn total number of violated constraints Example 4Queens States 4 queens in 4 columns 44 256 states Operators move queen in column Goal test no attacks Evaluation hn number of attacks erfrmnce f MinCnflicts Given random initial state can solve nqueens in almost constant time for arbitrary n with high probability eg n 10000000 The same appears to be true for any randomlygenerated CSP except in a narrow range of the ratio number of constraints number of variables CPu time i JL I quot39 R critical ratio Summary CSPs are a special kind of search problem 39 States defined by values of a fixed set of variables 39 Goal test defined by constraints on variable values Backtracking depthfirst search with one legal variable assigned per node Variable ordering and value selection heuristics help significantly FonNard checking prevents assignments that guarantee later failure Constraint propagation e arc consistency does additional work to constrain values and de ec InconsIs encnes The constraint graph representation allows analysis of problem structure Treestructured CSPs can be solved in linear time Iterative minconflicts is usually effective in practice CSCI 553 Networking III Unix Network Programming Spring 2007 I Object Oriented Programming More on Objects Introduction i Previous lecture introduced basics of object oriented programming I This one goes a little further I As before the ideas are almost universal but the form varies from language to language def initself number3 class Recentobject selfnumber number selfitems def strself I We39ve already met some special return strse1f items methods like init and str def WW item I Usually not called directly self39ltems39appe d em selfitems selfitems selfnumber I Instead Python automatically def leneem invokes them at specific times I lem itimm object creation and string 39 for era in 39Permian39 39Trassic39 39Jurassic39 I 39Cretaceous39 39Tertiary39 I There are lots of other specnal historyaddma print 1enhistory history I For example if obj has a len J Permian method Python calls it whenever 2 39Permanx 39rrassicv 3 Permian Trassic Jurassic 3 Trassic Jurassic Cretaceous 3 Jurassic Cretaceous Tertiary Overloading Operators I The expression quota bquot is just a shorthand for adda b I Or if a is an object for aaddb I But since people might actually want to use the name add Python spells this method add I Defining it is an example of operator overloading I If a class defines a method add it is called whenever something is 39d to the object 1e x y calls XaddY class Recentobject def add self item selfitemsappenditem selfitems selfitems selfnumber return self if name 39main39 history Recent for era in 39Permian39 39Trassic39 39Jurassic39 39Cretaceous39 39Tertiary39 history history era print lenhistory history 1 Permian 2 Permian Trassic 3 Permian Trassic Jurassic 3 Trassic Jurassic Cretaceous 3 Jurassic Cretaceous Tertiary Commutativity I 2 X and X 2 don39t always do the same thing I Matrix multiplication I Classes can define righthand versions of operators eg radd instead of add I If the object on the left has an add method call that I Otherwise if the object on the right has an radd method call that I Otherwise try Python39s builtins Other Special Methods i Almost every aspect of an object39s behavior can be overridden by defining the right methods Method Purpose iltiself other Less than comparison ilei E and others are used for less than or equal not equal etc icalliself args Called for obj 3 lithium ileniself Object length igetitemiself key Called for obj 314 isetitemiself key value Called for obj 314 217 icontainsi Called for lithium in obj iadd Called for obj value use imuli for obj value etc iinti Called for intobj use ifloati and others to convert to other types Example Sparse Vector A I vector is sparse if most of its entries are zero I Use a dictionary to record nonzero values and their indices I No point padding eleven actual values with nine million zeroes I Overload operators to make the object look like a real vector I Addition create a new vector with a nonzero value wherever either operand had a nonzero value I Dot product add up products of matching nonzero values I Length return one more than the index of the largest non zero value I One more to be consistent with Python39s lists How Long is a Sparse Vector I What is the length of v after the following operations v SparseVector v27 v43 v43 10 10 00 all values initialized to 00 length is now 28 length is now 44 is the length still 44 or 28 I This isn39t really a programming question Largest current indexquot and largest index ever seenquot can both be implemented The latter is easy so we39ll use that Vector Behavior class SparseVectorobject 39 Construction creates an empty sparse vector I Define en getitem ancl setitem to make it behave like a list I Exercise implement del sparseindex quot39Implement a sparse vector explicitly def def quot39The length of a vector is one more than the largest indexquot39 def def setitemself If a value has not been set its value is zero39 initself quot39Construct a sparse vector with all zero entriesquot selfdata lenself if selfdata return 1 maxselfdatakeys return 0 key quot39Return an explicit value or 00 if none has been setquot39 getitemself if key in selfdata return selfdatakey return 00 key value quot39Assign a new value to a vector entryquot39 if typekey is not int 39non integer index to sparse vector39 raise KeyError selfdatakey value Dot Product The other object on the right side of quotquot is usually called other I No reason to insist that it be a sparse vector 39 Could equally well be a list of values So loop over our indices and multiply by corresponding values in other object I Any index not encountered in this loop doesn39t matter since it corresponds to something that39s zero And make rmul mul do the same thing as rmul def mulself other quot39Calculate dot product of a sparse vector with something elsequot39 result 00 for k in selfdata result selfdatak otherllt return result def rmulself other return self mulother Addition I Trickier than multiplication result is nonzero wherever either argument is nonzero I Don39t want to loop over all the zeroes of either argument I Solution if the other object is a sparse vector cheat I Ie reach inside it and rely on details of its implementation addself other quot39Add something to a sparse vectorquot Initialize result with all non zero values from this vector result SparseVector resultdataupdateselfdata If the other object is also a sparse vector add non zero values if isinstanceother SparseVector for k in otherdata resultk resultk otherk Otherwise use brute force else for i in rangelenother resulti resulti otheri return result Right hand add does the same thing as left hand add radd add Testing The class isn39t written until the tests are finished I Exercise replace the print statements with assertions if name 39main39 X SparseVector Xl 10 X3 30 X5 50 print 391enx39 1enx for i in range1enx print 3939 i Xi y SparseVector yl 100 y2 200 y3 300 print 39Xy39 Xy print 39yX39yX print 39Xy39 Xy print 39yX39yX Z 00 01 02 0 print 39Xz39 Xz print 39Xz39 XZ print 39ZX39 ZX 3 04 0 00 110 200 330 00 50 00 110 200 330 0 ZenX 6 0 00 1 10 2 00 3 30 4 00 550 X y y X y 1000 y X 1000 X z 00 11 X z 35 2 X 00 11 5 02 33 04 02 33 04 0 50 55 55 Static Data Members I Sometimes want to share data between all instances of a class I Constants a count of the number of class instances created etc Any data members defined inside the class block belong to the class as a whole class Counterobject num 0 Number of Counter objects created def initse1f name Counternum 1 selfname name if name 39main39 print 39initial count39 Counternum first Counter39first39 print 39after creating first object39 Counternum second Counter39second39 print 39after creating second object39 Counternum initial count 0 after creating first object 1 after creating second object 2 class Experimentobject alreadydone staticmethod I def getresu1tsname params if name in Experimenta1readydone return Experimenta1readydonename exp Experimentname params exprun Can also create static methods Experiment a1 readydonename exp return exp I Just like a function but put inside the class definition for clarity I I def initse1f name params Define the method without the self 5e1fname name parameter self params params I Since it isn39t tied to any particular instance of the class def runse1f Put staticmethod in front of it 39 A decorator if name 39main39 first Experimentgetresu1ts39anti gravity39 I PowerfUI bUt beyond the scope 0f thls second Experimentgetresu1ts 39 time travel 39 COUI SE th39 d ir Experimentgetresu1ts39anti gravity39 print 39first 39 idfirst print 39second39 idsecond print 39third 39 idthird first 5120204 second 5120396 third 5120204 Design Patterns tyle guidelines describe what code should look like line by line Design patterns are how we describe larger patterns I A standard solution to a commonlyoccurring problem I That isn39t specific enough to be captured once and for all in a library routine or framework Idea developed from the 19605 on by the building architect Christopher Alexander I For example it39s hard to define what a porch is but the basic idea comes up everywhere the climate is warm Introduced to programmers in Gamma et al 1995 I Still a bestseller but not particularly approachable The Singleton Pattern I roblem want to ensure that there39s only ever one instance of a particular class I Eg the controller for a radio telescope antenna I Considerations I There must be exactly one instance of the class I All objects that use the class must have access to that instance I Solution I Create objects by calling a function instead of the class39s constructor I Have the function store a reference to the first object it creates I Have it return that same object on every subsequent call Singleton Implementation class AntennaClassobject quot39Singleton that controls a radio telescopequot39 The unique instance of the class instance None The constructor fails if an instance already exists def initself maxrotation assert AntennaClassinstance is None 39Trying to create a second instance selfmaxrotation maxrotation AntennaClassinstance self Make the creation function look like a class constructor def Antennamaxrotation quot39Create and store an AntennaClass instance or return the one that has already been createdquot39 if AntennaClassinstance return AntennaClassinstance return AntennaClassmaxrotation Singleton Pattern g Demonstration first Antenna23 5 pr1nt 39f1rst 1nstance39 1df1rst second Antenna47 25 pr1nt 39second 1nstance39 1dsecond first instance 10685200 second instance 10685200 The Visitor Pattern I roblem want an easy way to walk around a complex structure I Eg visit each value in a list of lists of lists exactly once I Considerations I Many different operations may need to be performed I Structure is complex enough that visiting elements is errorprone I The types of objects in the structure and the ways they are connected are fixed I Solution I Create a class that knows how to get to each value in turn I Give it an empty method that is called once for each value I Users derive from this class and fill in the method I An allinone version of the framework shown earlier Visitor Implementation class NestedListVisitorobject quot39Visit each element in a list of nested listsquot39 def initself data def quot39Construct but do not runquot39 assert typedata is list 39Only works on lists selfdata data runself quot39Iterate over all valuesquot39 selfrecurseselfdata recurseself current quot39Loop over a particular list or sub list not meant to be called by usersquot39 if typecurrent is list for v in current selfrecursev else selfvisitcurrent visitself value quot39Users should fill this method inquot39 pass Visitor Demonstration class MaXOfNNestedListVisitor def initself data NestedListVisitorinitself data selfmax None selfcount 0 def visitself value selfcount 1 if selfmax is None selfmax value else selfmax maxselfmax value testdata 39gold39 39lead39 39zinc39 39silver39 39iron39 39mercury39 test MaXOfNtestdata testrun print 39max39 testmax print 39count39 testcount max zinc count 6 The Abstract Factory Pattern roptern apphcauon doesn Know the speonc types oromem wt WanB to create un t rurm me Ifme onrornamgrapn 5 an RC 100 create a oonUoHer and an Rm 100 oon gurauon pane ts a Subatm 4c create a Subatm 4c controuer and oom guauon pane Conswderauo s n RC 100 EDT100 Suhalta 4C controller JeCB can be grouped by category and fanny New categones or fammes may appear amr swam con guranun panel Geate a dass matknows now to bqu an mstanoe ofeach category for a parum ar fam y Geate another dass matstores rstances of these bquer K oasses and cans men methods wnen sked to 2quot Addmg a new fam y 5 easy but addmg a new category reqmres onanges to every bquer Abstract Factory Builder Class AbstraCtFamilyobject 39 7 Builders for particular families derive from this definitself family selffamily family def getname self 39 return sel fname def makecontrollerself39 raise NotImplementedError makecontroller missing def makeconfigurationpanel self 39 raise NotImplementedError makeconfigurationpanel missing Abstract Factory Manager class Fac toryManagerobjec t Manage builders by family def def init self currentfamilyNone selfbuilders selffamily family setfamilyself family assert family Empty family selffamily family addself builder name bui lder ge tname selfbuildersname builder makecontrollerself selfcheckstate return sel f buildersself family makecontrol ler makeconfigurationpanel sel f selfcheckstate return sel f buildersself family makeconfigurationpanel checkstateself assert self family No family specified assert self family in selfbuilders Unknown familyquot self family Factory Demonstration factory FactoryManager factoryaddRCT100Factory factory addubalta4CFactory factorysetfam7 ly Subalta4C controller factorymakecontroller configurationpanel factorymakeconfigurationpanel The Command Pattern I roblem want to be able to control the operation of a complex object I Turn on the robot arm move it lower it move it again etc I Considerations I Do not want to have to write an entirely new program for each sequence of operations I Want to be able to add new operations I Would like to be able to undo operations I Solution I Create one class for each distinct operation I Give the class do undo and redo methods I Create instances of these classes to represent particular commands I Create lists of these instances to control the robot arm Base Command Class Class AbstraCtCommandobject Base Class for commands 777 def isundoableself return False by default can t undoredo operations def doself robot raise NotImpZementedErrorquotDon t know how to do squot selfname def undoself robot pass def redoser robot pass A Particular Command c I ass MoveCommandAbs trac tCommand Move the robot arm def 7 n7 tself X y 2 selfx x selfy y selfz z 7 sundoableself return True doself robot robottranslateselfx selfy selfz undoself robot robot translate selfx sery ser z redo self robot selfdorobot Command Demonstration robot Robot commands MoveCommand50 20 23 RotateCommand900 00 00 MoveCommandJ0 20 20 CZoseHandCommand for C in commands Cdorobot A Few Other Patterns ache store temporary copies of objects locally to improve performance State record state of program as object so that it can be re started Null Object use an object that does nothing in place of null 39 Saves testing that object isn39t null before doing operations Adapter wrap one object in another to give the first a different interface I Usually used to give a new library an interface that39s compatible with an old one Proxy use one object as an interface to another Typically the proxy is local and the real object is on another machine Summary I Overloading design patterns and other advanced concepts serve two purposes I Communication a concise way for designers to communicate with each other I Education gives them a way to communicate what they know to newcomers I Don39t expect to connect them all to your own experience the first time I But keep them in mind as you look at new problems CSCI 553 Networking III Unix Network Programming Spring 2007 I Web Server Programming Introduction I Most of the web39s power comes from the fact that browsers can interact with programs I More accurately browsers can ask web servers to run programs on their behalf I This lecture looks at what to do if you receive an H39I39I39P request I Very important that you go through the lecture on security before putting your programs on the web The Pluggable Web I Users want to make the web do different things I How to let them write programs that handle H ITP requests I Option 1 Require them to write socketlevel code I Complicated and errorprone I Can only have one program listening to a socket at a time I Option 2 have the web server accept the H39I39I39P request and then run the user39s code I Recompiling the web server every time someone wants to add functionality would be a pain I So define a protocol that lets web servers run other programs The CGI Protocol I The Common Gateway Interface CGI protocol specifies 39 How a web server passes information to a program How that program passes information back to the web server I CGI does not specify A particular language You can use Fortran the shell C Java Perl Python 39 How the web server figures out what program to run Each web server has its own rules We39ll briefly talk about Apache39s From Server to CGI I Web server runs the CGI by creating a new process Client 39 Server l I 1 I I Python l appW I 1 HTTP request Irka I swam a I I Web Server I p I so HTT response rut 4 20 Figure 141 CGI Data Processing Cycle From Server to CGI I eb server passes some information to the CGI process through environment variables Name Purpose Example REQUESTMETHOD What kind of HTTP request is being GET or POST handled SCRIPTNAME The path to the script that39s executing cgibinpostphotopy QUERYSTRING The query parameters following in the URL 23gemyd g391pgampexp39re5n The type of any extra data being sent with the request How much extra data is being sent with 17290 the request in bytes I The web server may also send CONTENTLENGTH bytes to the CGI on standard input Eg when a file is being uploaded CONTENTTYPE imglpeg CONTENTLENGTH From CGI to Server I The CGI program sends data back to the web server by printing it to standard output I The web server then fonNards this directly to the client I Which means that the CGI program is responsible for creating headers I Note none of this works unless the web server has been configured to run the CGI I By default modern servers won39t do this unless they39re told they can MIME Types I Clients and servers need a way to specify data types to each other I Remember bytes are just bytes the browser doesn39t magically know how to interpret them I Multipurpose Internet Mail Extensions standard specifies how to do this I Organizes data types into families and provides a twopart name for each type I Use the quotContentTypequot header to specify the MIME type of the data being sent Family Text Image Audio Video Applicationspecific data Specific Type Describes texthtml web pages imagemeg JPEGformat image audiOXmP3 MP3 audio file videoquicktime Apple Quicktime video format Adobe PDF document applicationpdf Hello CGI I Simplest possible CGI pays no attention to query parameters or extra data 39 Just prints HTML to standard output to be relayed to the client 39ngmmaQWHmehammoHHMdmmmemmH Wm I and a blank line to separate the headers from the data lusrbinenv python Headers and an extra blank line print 39Content type texthtml39 print Body print 39lthtmlgtltbodygtltpgtHellO CGI ltpgtltbodygtlthtmlgt39 g Invoking a CGI I Invoke it by going to httpwwwyourservercomcgi binheocgipy 39 By convention CGI programs are put in a cgibin directory I Browser displays the simple HTML page generated by the progra Hello CGIi Dane Figure 142 Basic CGI Output Generating Dynamic Content 39 But the whole point of C61 is to generate content dynamically I Eg show a list of environment variables and their values lusrbinenv python import os cgi Headers and an extra blank line print 39Content type texthtml39 print Body print 39lthtmlgtltbodygt39 keys osenvironkeys keyssort for k in keys print 39ltpgts sltpgt39 cgiescapek cgiescapeosenvironk print 39ltbodygtlthtmlgt39 Generating Dynamic Content 39 You39ll use this frequently when debugging ianilla Fill39nx E E2 Edi 2w Ga Yaais Help u mzpiiizvuuiauaui v 0 Ga HQ REQUESTMETHOD GET 7 REQUESTiU39RI kgrbmshowienv py SCRIPTiFELENAMIE F ApacheApacheZcgrbmshowienv py 7 SCRIPTiNAMIE kgrbmshowienv p397 Finge 143 Environment Variable Output Forms I Next step is to allow users to enter data I Without manually editing URLs to append parameters I HTML forms allow users to enter text choose items from lists etc I Not nearly as sophisticated as desktop interfaces I Although programmers are doing more every day particularly using Javascript Creating Forms I Create a form using a ltformgtltformgt element I action attribute specifies the URL to send data to I method attribute specifies the type of H l l39P request to send I Usually quotPOSTquot for HTML forms I Inside the form can have ltseectgt elements to let users choose values from a list I List items specified using ltoptiongt elements ltinputgt elements for other kind of data I If type is quottextquot get a oneline text entry box I If type is quotcheckboxquot get an onoff checkbox I quotsubmitquot and quotresetquot create buttons to submit the form or reset the data to initial values A Simple Form 39 Mozilla Firefax Eile Edit Eiew e eekmarks eels Help 139 EC g filelI39I39fFII39herneI395wII39tan 0 Ge Sequence GATFACA Searehtype Similaritymeteh V Programs FROG version 11 D FROG 20 beta BayesHart Dene Figure 144 A Simple Form Parameter Names I Each ltinputgt element has a name attribute I These become the names of the parameters that the client sends to the server The input elements39 values are the parameters39 values I Submitting the form shown above with default values produces osenviron39REQUESTMETHOD39 quotPOSTquot osenviron39SCRIPTNAME39 quotcgibinsimpleformpyquot osenviron39CONTENTTYPE39 quotapplicationXwwwform udencodedquot osenviron39REQUESTLENGTH39 quot80quot Standard input sequenceGA ITACAampsearchtypeSimilarity matchampprogramFROG11ampprogramBayesHart Handling Forms I Could handle form data directly I Read and parse environment variables I Read extra data from standard input I But the mechanics are the same each time so use Python39s cgi module instead I Defines a dictionarylike object called FieldStorage I Keys are parameter names I Values are either strings if there39s a single value assocatied with the parameter or lists if there are many I When a FieldStorage object is created it reads and stores information contained in the URL and environment I Which means that a C61 program should only ever create one I Program can read extra data from sysstdin Form Handling Example I Example show the parameters send to a script lusrbinenv python import cgi print 39Content type texthtml39 print print 39ltntmlgtltbodygt39 form cgiFieldStorage for key in formkeys value formgetvaluekey if isinstancevalue list value 3939 39 39joinvalue 3939 print 39ltpgts sltpgt39 cgiescapekey cgiescapevalue print 39ltbodygtltntmlgt39 URL Value of Value of a b httpwwwthirdbitcomswcshowparamspya0 quot0quot None httpwwwthirdbitcomswcshowparamspy quot0 quothelloquot Raggtvavgntgirdbit comswcshow params py a0ampbhelloampa22 39 039 22 quothemquot Development Tips I During development add import cgitb cgitbenabe to the top of the program I cgitb is the C61 traceback module I When enabled it will create a web page showing a stack trace when something goes wrong in your script I Testing whether a FieIdStorage value is a string or a list is tedious I In almost all cases you39ll know whether to expect one value or many I Use FieldStoragegetfirstname to get the unique value I Returns the first if there are many I FieldStoragegetlistname always returns a list of values I Empty list if there39s no data associated with name I If there39s only one value get a singleitem list Maintaining State 39 Often want to change the data a server is managing as well as read it pda escrlption 0 an experiment d1ange your preferred email address etc 39 The industrialstrength solution is to use a threetier architecture Logical view Controller Made 01gt Bmwser Web Server Database Miriam Physical Figure 145 Three Tier Architecture 39 CGI program s 39 Runs the ueri 39 Translates rsults into HTML to send back to the client tuffs parameters from H39I39I39P requesls into SQL queris es Maintaining State in Files I Simple programs can often get away with using files I The CGI program re reads the file each time it processes a request I And rewrites it if there have been any updates I Example append messages to a web page I Old messages are saved in a file one per line Hi is anyone reading this site I was wondering the same thing I wasn39t sure if we were supposed to post here Good point Is there way to delete messages Maintaining State in Files I Script checks the incoming parameters to decide what to do I If newmessage is there append it and display results If newmessage isn t there someone39s visiting the page rather than submitting the form Get existing messages infile open39messagestxt39 39r39 lines xrstrip for x in infilereadlines infileclose Add more data form cgiFieldStorage if formhaskey39newmessage39 linesappendformgetfirst39newmessage39 outfile open39messagestxt39 39w39 for line in lines print gtgt outfile line outfileclose HTML Generation I Note that there is no static HTML page in this example I WMHMuwmwsswwwgmamwbyamwmm Display print 39Content Type texthtml39 print print 39lthtmlgtltbodygt39 for line in lines print 39ltpgt39 line 39ltpgt39 print quot39 ltform actionquothttpwwwthird bitcomswcmessageformpyquot methodquotpostquotgt ltpgtYour message ltinput namequotnewmessagequot typequottextquotgt ltpgt ltpgt ltinput typequotsubmitquotgt ltinput typequotresetquotgt ltpgt ltformgt print 39ltbodygtlthtmlgt39 HTML Templating I A lot of this program is devoted to copying values into an HTML template I There are lots of good systems out there in many languages for doing this I Kid WebWare in Python I Java Server Pages JSPs in Java I Please do not write one of your own What About Concurrency I What happens if two users try to save messages at the same time 10 is typically slower than processing So most web servers try to overlap operations I Race condition First instance of messageformpy opens messagestxt reads lines closes file Second instance opens messagestxt reads the same lines closes file First instance reopens file writes out original data plus one new line Second instance reopens file writes out original plus a different new line First instance39s message has been lost File Locking I Solution is to lock the file As the name implies gives one process exclusive rights to the le After the first process acquires the lock any other process that tries to read or write the file is suspended until the first releases it I Mechanics are different on different operating systems 39 But the Python Cookbook includes a generic file locking function that works on both Unix and Windows Implementing Locking Get existing messages msgfile open39messagestxt39 39r39 fcntlflockmsgfilefileno fcntlLOCKEX lines Xrstrip for x in msgfilereadlines Add more data form cgiFieldStorage if formhaskey39newmessage39 linesappendformgetfirst39newmessage39 msgfileseek0 for line in lines print gtgt msgfile line Unlock and close fcntlflockmsgfilefileno fcntlLOCKUN msgfileclose Who Are You I How to maintain state on the client I Need to know which shopping cart to display for a particular user I H39ITP is a stateless protocol I If a client makes a second or third or fourth request server has no reliable way of connecting it to the first one I Can guess based on client address elapsed time etc I But it39s just a guess Cookies I Solution is for the server to create a cookie I A string that is sent to the client in an H39I39I39P response header I Client saves it either in memory or on disk Client 1 Server wwwthagwartsgdu Daabasu Emwser urrpmqmi when My Wm magic g conkle L H1772 i Wm saw 9 wwwlIogwansedu D1772 x Wm msnonse 01172 Harry l Daahase b l I applicauon 9 9 Browser HTTP request W cunkie01772 l Web Server HYTP i spnnse wwwhn9wansedu 01772 a Figure 146 Cookies I The next time the client sends a request to the site it sends the cookie back to the server I Like giving someone a claim check for their luggage Creating Cookies I Represent cookies in Python using CookieSimpIeCookie I Do not use SmartCookie it is potentially insecure I When creating add values to a cookie as if it were a dictionary I Convert it to a strin eg by printing it to create the required H39ITP hea er I When the cookie comes back I Get the value associated of the environment variable quotH39I39I39PCOOKIEquot I Create a SimpIeCookie I Pass the quotH39TPCOOKIEquot value to the cookie39s load method Cookie Example I Example count the number of times a user has visited a web site I If there39s no cookie create one with a count of 1 I Otherwise increment the count I Create a new cookie to send back to the user I Display the count Cookie Example Get old count count 0 if osenvironhaskey39HTTPCOOKIE39 cookie CookieSimpleCookie cookieloadosenviron39HTTPCOOKIE39 if cookiehaskey39count39 count intcookie39count39value Create new count count 1 cookie CookieSimpleCookie cookie39count39 count Display print 39Content Type texthtml39 print cookie print print 39lthtmlgtltbodygt39 print 39ltpgtVisits dltpgt39 count print 39ltbodygtlthtmlgt39 Cookie Tip I Can control how long a cookie is valid by setting an expiry value I Either the number of milliseconds I Or the time it should expire in UTC I Use timeasctimetimegmtime to create the value I Do not put sensitive information in cookies I Browsers store them in files on disk I Villains can watch network traffic and steal data I Cookies should instead be random values that act as keys into serverside information I Talk about this more in the next lecture Summary I CGI is example of eventdriven programming I The framework invokes your code at specific times and passes it specific information I What happens the rest of the time isn39t your concern I At least until something goes wrong and you have to debug it I Simple CGI programs can accomplish a lot I The entire first generation of web applications were built this way I But they can easily become very complicated I Discuss alternatives in the final lecture Fall 2007 CSci 553 Pipe Practice Practice Using Pipes Here is a list of small tasks that can be done by chaining together 2 or more uniX standard commands into a pipeline to get the results First one to email me a correct and complete solution for a particular task will receive fame and glory for their brillant solution and maybe a few bonus points 1 N L 4 V39 The ps command can be used to get a list of processes running on the system Create a pipeline to nd all of the processes that you are running on the system The tail command can be used to continuously monitor the output being generated to a le Write a lter to watch a le continuously and monitor it for when certain events or output is generated For example the varlogmessages le is where lots of uniX services dump log messages Monitor this le and look for when new kernel messages are generated du can be used to determine the amount of disk usage being used by a directory Determine which directory is using the most disk space hint du and sort would be useful here Print out only the name and modi cation time of the oldest le in the current directory Do the same but for the 3 oldest les Find is a powerful command that can be used to nd les based on not only the name of the le matching a pattern but by modi cation time owner etc Find all of the directories under the current directory but lter out all of the svn subversion directories from the list of directories CSCI 553 Networking III Unix Network Programming Spring 2008 Shell Basics Today39s Agenda I Sign NISL account forms Determine assignment 01 completion I Demo of ssh amp x windows access I Windows ssh access if possible I ssh and X access from a Linux I Ubuntu virtual machine in Computer Lab I Command Line Editing Handout I Shell I More Shell I Special Characters CT RL C CT RL D CT RL SQ CT RL Z I Editors I vi Handout I kate Assignment 0 1 I Sign NISL account forms I Determine assignment 01 completion SSH amp X Window Access I Demo of SSH and X Window access I MS Windows ssh access if possible 39 ssh and X access from a Linux 39 Ubuntu virtual machine in Computer Lab new nis account access handout Command Line Editing I Command Line Editing Handout An Aside Command Line g Editing I uparrow previous command I downarrow next command I ctrIa go to beginning of line I ctrIe go to end of line I tab complete current command or file name Shell Basics Introduction to the Unix Command Line An Aside A Note on Linux and Unix Philosophy The Big Picture I Most of the tools used in this course are freely available under various open source licenses I Makes Unix Linux development attractive for this reason along I However Unix philosophy of software development goes even beyond this and is embodied in many of the tools we will look at I We will learn more about this as the course progresses and you get more familiar with UnixLinux environment An Aside A Note on Linux and Unix Philosophy The Big Picture UPU Ch 1 Basics of Unix Philosophy Rule of Modularity Write simple parts connected by clean interfaces Rule of Clarity Clarity is better than cleverness Rule of Composition Design programs to be connected to other programs Rule of Separation Separate policy from mechanism separate interfaces from engines Rule of Simplicity Design for simplicity add complexity only where you must Rule of Parsimony Write a big program only when it is clear by demonstration that nothing else will do Rule of Transparency Design for visibility to make inspection and debugging easier Rule of Robustness Robustness is the child of transparency and simplicity Rule of Representation Fold knowledge into data so program logic can be stupid and robust Rule of Least Surprise In interface design always do the least surprising thing Rule of Silence When a program has nothing surprising to say it should say nothing Rule of Repair When you must fail fail noisily and as soon as possible Rule of Economy Programmer time is expensive conserve it in preference to machine time Rule of Generation Avoid handhacking write programs to write programs when you can Rule of Optimization Prototype before polishing Get it working before you optimize it Rule of Diversity Distrust all claims for one true way Rule of Extensibility Design for the future because it will be here sooner than you think Introduction I Most modern tools have a graphical user interface GUI They39re easier to use A picture is worth a thousand words I But commandline user interfaces CLUIs still have their place Easier to build a simple CLUI than a simple GUI Higher actiontokeystroke ratio 39 Once you39re over the learning curve Easier to see and understand what the computer is doing on your behaW Which is part of what this course is about Most important it39s easier to combine CLUI tools than GUI tools 39 Small tools combined in many ways can be very powerful Introduction I You are encouraged to follow along and try out things as they are presented I In this and future lectures I Please switch drivers occasionally if you are sharing a computer I KMenugtSystemgtTerminal Program Konsole The She w l11511ercur 2 I The most important commandline tool is the command shel 39 Usually just called the shell I Looks and igession Edit View Bookmarks Settings Help I i gishell No 2 Shell dhartercurvus 2007rspr1ng 05 05515551 Lilia LeiLuivugt LgtL155315 Ll 551 12K drwxrrxrrx 5 dharter dharter 40K Jan 10 1520 w 5x 5 dherter dherter 40K Jan 10 1520 5 dherter dherter 40K Jan 15 1420 web dhartercurvus 050155315 5 weblectures dhartercurvus lecturesl d drwxrrxrrx 2 dherter dherter 40K Jan 15 1504 drwxrrxrrx 5 dherter dherter 40K Jan 15 1420 rrwxrrrrrr 1 dharter dharter 109K Jan 15 1454 lorlntrodu y r wxrrrrrr 1 erte dherter 61K Jan 15 1504 llrswcrlntruppt 5r xrrrrrr 1 dherter dherter 51K Jan 15 1505 125hellrbaslcsppt dhartercurvus lecture51 g1mp5 a 14970 dhartercurvus lecturesl ts Orlntraduttlc ppt llrswCrl trappt dhartercurvus lecturesl 15 1 551 240 hellrbaslcsppt rrwxrrrrrr 1 dherter dherter 111104 Jan 15 1454 lDrlntruductlun ppt wxrrrrrr 1 dherter dherter 52454 Jan 15 1504 1 wcrlntru r dharter 51712 155 15 1505 1 1511 has rrwxrnrw 1 dharte anartsrecurws lectur351 I t c icmi i works like an interactive terminal circa 1980 Fig L21 A Shell in Action The Shell I Manages a user39s interactions with the operating system by I Reading commands from the keyboard I Executing those commands I or running another program I Displaying the output I The shell is just one program among many I Many different shells have been written I The Bourne shell called sh is an ancestor of many of them I It39s still a lowest common denominator that you can always rely on I We39ll use bash the Bourne Again Shell in this course The Shell is Not the Operating System The operating system is notjust another program 39 Automatically loaded when the computer boom up Runs everything else including shells The OS manages the computer39s hardware Provides a common interface to different chips disks network cards e c So that userrlevel applications can run anywhere The OS also keeps track ofwhat mgrams are running what privileges they have etc I Which makes it crucial to security browser disk network 2le Figure 22 Operating System i The File System I The le system is the set of files and directories the computer can home access 5 i ii I Everything that doesn39t go away when you reboot Fiji Data is stored in les thesis I By convention files have two part names like notestxt or homehtml I Most Oferating systems allow you to associate a lename extension with an app ication I Eg txt is associated with an editor and html with a web browser I But this is all just convention you can call files almost anything you wan I Files are stored in directories often called folders I Directories can contain other directories too I Results in the familiar directory tree I Everything in a particular directory must have a unique name I Otherwise how would you identify it I But items in different directories can have the same name I On Unix the le system has a unique root directorycalled addresseshtml r E I Every other directory is a child of it or a child of a child etc PNEHS IEY j I On Windows every drive has its own root directory 41 I So Chomengilsonnotestxt is different from m l 39 Jhomengilsonnotestxt I Notefile and directory names are casesensitive on Unix but caseInsenSItlve on Windows I So don39t ever rely on case differences when naming things W i 7 Figure L23 A Directory Tree Paths A path is a description of how to find something in a file system I An absolute path describes a location from the root directory down I Equivalent to a street address I Always starts with quotquot I homehpotter is Harry Potter39s home directory I coursesswcweblecshelhtml is this file I A relative path describes how to nd something from some other location I Equivalent to saying Four blocks north and seven east I From coursesswc the relative path to this file is weblecshellhtml Every program including the shell has a current working directory I Where am I I Relative paths are deciphered relative to this location I It can change while a program is running Two special directory names I pronounced dot is the current directory I pronounced dot dot is the directory one level up I Also called the parent directory I In coursesswcdata is coursesswc In 39 L is I 1 home r q hpo er r j thesrs 39 a L L r 39 Wk 2000 Figure L24 Parent Directories util To see what39s in the data directory type ls data ls data bio elements haikutxt morsetxt pdb solarsystemtxt Or I Type cd data to go into data I Ie change the current working directory to data I Type Is on its own I Type cd to go back to where you started cd data Pwd homehpotterswcdata ls bio elements haikutxt morsetxt pdb solarsystemtxt cd pwd homecsci553hpotterswc g Execution Cycle when you type a command like Is the 05 a b M Reads characters from the ke oard M Passes them to the shell because It39s the currently actlve wlndow The shell 39 Breaks the ilne Of teXt It reCelVeS Into words Looks for a program wlth the same name as the rst word See In a moment now the shell knows where to look uns that program That program39s output goes back to the shell which gives it to the OS which dis lays it on the screen 231 Actually the 05 hands It to the wlndow manager whlch takes care of the dlspl All welledesigned software systems work this way Break the task down Into pleces Wlte a tool that solves each subrproblem Kr Hoo 39em up 20 Allows you to Develop and test components Independently I crementally o emmg yslem Figure L25 Running a Program eplace or reuse components n Add new components as you go along util LICENSEtxt conf data docs indexswc icenseswc printcss swccss tests Providing Options I man is your FRIEND I The basic set of Unix commands use man pages to document options I Options are a well supported paradigm in Unix we will look again at option parsers in c amp python Creating Files and Directories I Rrather than messing with the course files let39s create a temporary directory and play around in t ere mkdir tmp I Note no output but v verbose would tell mkdir to print a con rmation message I Go into that directory no files there yet cd tmp Is I Ilse the editor of Iour choice to create a file called earthtxt with the following contents Name Earth Period 36526 days Inclination 000 Eccentricity 002 I Notepad on Windows runs in a window of its own I Pico on Unix takes over the shell window temporarily I I would suggest for now learning basic vi and using kate when you get to larger programming projects I We39ll look at more advanced programming editors later I Easiest way to create a similar file venustxt is to copy earthtxt and edit it cp earthtxt venustxt vi venustxt ls t venustxt earthtxt I t tells Is to list by modification time instead of alphabetically Looking at Files I Check the contents of the file using cat short for concatenate I Prints the contents of a file to the screen cat venustxt Name Venus Period 22470 days Inclination 339 Eccentricity 001 I Compare the sizes of the two files using I ls l quot meaning long formquot ls total 2 rwxrxrx 1 ngilson None 73 Jan 4 1558 earthtxt rwxrxrx 1 ngilson None 73 Jan 4 1558 venustxt I Fifth column is size in bytes I wc for word count wc earthtxt venustxt 4 9 73 earthtxt 4 9 73 venustxt 8 18 146 total I Columns show lines words and characters Basic Tools Especially these man Documentation and help for commans cat Concatenate and display text files cd Change working directory clear Clear the screen cp Copy files and directories date Display the current date and time diff Show differences between two text files echo Print arguments head Display the first few lines of a le ls List files and directories mkdir Make directories more less Page through a text le mv Move rename files and directories od Display the bytes in a le octal dump passwd Change your password pwd Print current working directory rm Remove files rmdir Remove directories sort Sort lines tail Display the last few lines of a file uniq Remove adjacent duplicate lines we Count lines words and characters in file Summary I Commandline tools will be with us for a long time 39 Easiest way to do many simple tasks Easiest way to see what the computer is actually doing I Often the only thing you can rely on having on a new machine A handful of basic commands will get you a long way Today39s Agenda I File Permissions I Pipes example I Version Control Introduction I Lab Assignment 1 apr 10151020 CSCI 414 553 Networking III Unix Network Programming pring 2008 Version Control Introduction I Five things distinguish professional programmers from amateurs Using a version control system Automating repetitive tasks Systematic testing Using debugging aids rather than print statements Advanced familiarity with a Programming Editor I This lecture introduces the first of these Even if it39s the only thing you take away from this course you39ll be more productive than you are now Problem 1 Collaboration I What if two or more people want to edit the same file at the same time I Option 1 make them take turns I But then only one person can be working at any time I And how do you enforce the rule I Option 2 patch up differences afterwards I Requires a lot of reworking I Stuff always gets lost Solution Version Control The right solution is to use a version con 0 sys m Keep the master copy of the file in a central repositvry Each author ed its a working copy When they39re ready to share their changes they committhem to the repository Other people can then do an update to get those changes Raposnory Hermione for one person to manage files on mu tlple mac Ines e p one working copy on your personal laptop the lab machine and the departmental seryer o more mailing yourself files or carrying around a USB drive and forgetting to copy things onto it 2am Fig L41 Managing MultiAuthor Collaboration Problem 2 Undoing Changes I Often want to undo changes to a file Start work realize it39s the wrong approach want to get back to starting point Like undo in an editor but keep the whole history of every file forever I Also want to be able to see who changed what when 39 The best way to find out how something works is often to ask the person who wrote it Solution Version Control g Again I Have the version r g Rep 5it ry l i control system keep 39 old revisions of files I And have it record who made the change and when I Authors can then roll back to a particular revision or time Fig L42 Version Control as a Time Machine Which Version Control System I Many systems are available commercially I If you have a large group or a budget Perforce is excellent I CVS and Subversion are I Open source I Reliable I Well documented I CVS has been around since the 19805 I Very popular but showing its age I Flaw 1 it keeps track of each file separately I So there39s no reliable way to ask Which files were changed together I Flaw 2 you can create new directories but can39t delete old ones I Subversion developed from 2000 onward as a workalike replacement I Feels the same but eliminates CVS39s major weaknesses I Many projects have already switched Basic Use I Ron and Hermione each has a working copy of the solarsystem project repository I See below how they got it I Ron wants to add some information about Jupiter39s moons I Runs svn update to synchronize his working copy with the repository 39GoesLntotbeJutheLd1LectonLandCLeates moonstxt Name OrbitalRadius OrbitalPeriod Mass Radius 10 4216 1769138 8932 18216 Europa 6709 3551181 4800 15608 Ganymede 10704 7154553 14819 26312 Callisto 18827 16689018 10759 24103 I Runs svn add moonstxt to bring it to Subversion39s notice I Runs svn commit to save his changes in the repository Repository is now at revision 121 I That afternoon Hermione runs svn update on her working copy I Subversion sends her Ron39s changes Basic Use Rn I Re P05 0 W H e rm i on e muonslxt moonsth moonsixt a 39 a f 39 a i E S F 39 Equot l 73quot I a 1 7 a 2 g 1 3 a 39 i g 3 moansixt 239 comm 1 7 3 moonstxt Fig L43 The Basic EditUpdate Cycle Q How To Do It 39 One way to use Subversion is to type commands in a shell I A lowest common denominator that will work almost everywhere I RapidSVN is a GUI that runs on Windows Linux and Mac 39 Well maybe walks is a better description Version 09 isn39t articular fast Query mum Extras HElP me new Ree Authm Status i tee hanged index SW 473 386 qywilsan E3EI2EIEIS 9 19 34 AM We SW 473 454 qywilsan g 12l2uu5 a 3n 2 my links SW 473 449 qywilsan madified 982EIEIS 7 m 36 PM listaffiquves SW 473 333 qy ilsari Eliamus 4 HQ 55 PM li ftahl 473 333 qywilsan EMS2 US 4 El 55 M make SW 473 411 qywilsan 912EIEIS in HQ 37 AM neyex SW 473 398 qywilsan E3EI2EIEIS I 31 36 PM aw SW 473 335 qywilsan Ell72EIEI5 4 SD 37 PM Pym SW 473 441 qy ilsari 972EIEIS 4 44 U6 PM PyEIZ SW 473 445 qywilsan 9 ZED 24 M PyE3 SW 473 412 qywilsan 912EIEIS 2 59 28 PM PyE4 SW 473 4u7 qywilsan E312EIEI5 I 39 16 PM re SW 473 411 qywilsan 912EIEIS in HQ 37 AM readstyle SW 473 427 qywilsan 972EIEIS 2 33 53 PM reflect SW 473 462 qywilsan 9112EIEI5 3 47 DE PM security SW 473 289 qywilsan ECl2EIEIS 12 U4 56 PM 5 v 5 473 32a qywilsan E132 US in EIEI 42 AM shell SW 473 462 qywilsan madified 9112EIEI5 3 47 DE PM Sill SW 473 345 qywilsan E212EIEI5 2 49 47 PM summavy SW 473 411 qywilsan sixmus in us 37 PM l i i gt Fiq L44 RapidSVN How To Do It 39 TortoiseSVN is a Windows shell extension separately Fig L45 TortoiseSVN Resolving Conflicts I Back to the problem of con icting edits or more simply conflicts I Option 1 only allow one person to have a writeable copy at any time I This is called pessimistic concurrency I Used in Microsoft Visual SourceSafe I Option 2 let people edit and resolve conflicts afterward by merging files I Called optimistic concurrency I It39s easier to get forgiveness than permission I Most modern systems including Subversion do this g Example of Resolving I Ron and Hermione are both synchronized with version 151 of the repository I Dnn radiic mnnnc vi and rnmmifc hie rhnn UV U u u UV I IU U I WWW 5 ges to create Iers n 152 Name OrbitalRadiusOrbitalPeriodMass Radius 10 4216 1769138 8932 18216 Europa 6709 3551181 4800 15608 Ganymede 10704 7154553 14819 26312 Callisto 18827 16689018 10759 24103 Amalthea 1814 0498179 0075 131x73x67 Himalia 11460 2505662 0095 85 Elara 11740 2596528 0008 40 g Example of Resolving I Simultaneously Hermione edits her copy of moonsbltt Name OrbitalRadiusOrbitalPeriodMass Radius 103 days 1020kg km 10 4216 1769138 8932 18216 Europa 6709 3551181 4800 15608 Ganymede 10704 7154553 14819 26312 Callisto 18827 16689018 10759 24103 Amalthea 1814 0498179 0075 131 Himalia 11460 2505662 0095 85 Elara 11740 2596528 0008 40 Pasiphae 23620 7436 0003 18 Sinope 23940 7589 00008 14 Lysithea 11720 25922 00008 12 I Example of Resolving 39 When she tries to commit Subversion tells her there39s a con ict Ron Repos39mry Hermione w moonstxt 3 moonsle moonslx m 3 I A race condition two or more wouldbe writers racing to get their changes in first Fig L46 Merging Con icls Example of Resolving con nued I Subversion puts Hermione39s changes and Ron39s in moonstxt Add rnn irf marinara tn hnml Alhnrn fhm I Name OrbitalRadius Orbitangriod Mass Radius 103 km days 1020kg km 10 4216 1769138 8932 18216 Europa 6709 3551181 4800 15608 Ganymede 10704 7154553 14819 26312 Callisto 18827 16689018 10759 24103 ltltltltltltlt mine Amalthea 1814 0498179 0075 131 Hima1ia 11460 2505662 0095 85 E1ara 11740 2596528 0008 40 Pasiphae 23620 7436 0003 18 Sinope 23940 7589 00008 14 Lysithea 11720 25922 00008 12 Ama1thea 1814 0498179 0075 131x73x67 Hima1ia 11460 2505662 0095 85 Elara 11740 2596528 0008 40 gtgtgtgtgtgtgt I152 ahuvva the start uf the section f u the flat le I dWMsedbm I gtgtgtgtgtgtgt shows the end of the section from the second file I Subversion also creates I m nstxtmine contains Hermione39s changes I moonstxt151 the file before either set of changes I moonstxt152 the most recent version of the le in the repository Example of Resolving continued I At this point Hermione can Run svn revert moonstxt to throw away her changes 39 Copy one of the three temporary files on top of moonstxt 39 Edit moonstxt to remove the conflict markers I Once she39s done she runs svn resolved moonstxt to let Subversion know she39s done svn commit to commit her changes creating version 153 of the repository Starvation I But what happens if Ginny commits another set of changes while Hermione is resolving I And then Harry commits yet another set I Starvation Hermione never gets a turn because someone else always gets there first I This is a management problem not a technical one I Break the files up into smaller pieces I Give people clearer responsibilities I The version control system is trying to tell you that people on your team are working at cross purposes Binary Files I Subversion can only merge conflicts in text files I Source code HTML basically anything you can edit with Notepad Vi or Emacs I But images video clips Microsoft Word and other formats aren39t plain text I When there39s a conflict Subversion saves your copy and the master copy side by side in your working directory I Up to you to resolve the differences I It39s not Subversion39s fault I Most creators of non text formats don39t provide a way to find or merge differences between files Reverting I After doing some more work Ron decides he39s on the wrong path I svn diff shows him which files he has changed and what those changes are I He hasn39t committed anything yet so he uses svn revert to discard his work I Ie throw away any differences between his working copy an the master as it was when he started I Synchronizes with where he was notwith any changes other people have made since then I If you find yourself reverting repeatedly you should probably go and do something else for a while Rolling Back Now Ron decides that he doesn39t like the changes Harry just made to moonsDltt Wanm to do the equlvalent of undo svn log shows recent history 39 rent nevlslon IS 15 e wants to reve moonsDltt will ment to the 7r flag Speclfles volved m to keep some of nts to Revlslon 157 is still in the repository Repository Harry on mounstxl 9 minim m mam moans txt moons txl b cu quotm A7 2006 Fig L47 Rolling Back Creating and Checking Out To create a repository I Decide where to put it eg svnrotor I Go into the containing directory cd svn I svnadmin create rotor I Can then check out a working copy I Directly through the file system svn checkout filesvnrotor I Through a web server svn checkout httpwwwhogwartsedusvnrotor I Note requires your system administrator to configure the web server properly I Only use svn checkout once to initialize your working copy I After that use svn update in that directory I If you only want part of the repository use svn co httpwwwhogwartsedusvnrotorenginedynamics Subversion Command Reference Name Purpose Example svn add Add les andor directories to version control svn add new lec newdir svn checkout Get a fresh working copy of a repository svn checkout httpsyourhostnamerotorepo rotorproject co svn commit Send changes from working copyt to reporisoty inverse of update svn commit m Comment on the changes ci svn delete Delete les andor directories from version control svn delete odfiec del rm remove svn help Get help in general or for a particular command svn help update svn merge Merge two different versions of a file into one svn merge r 1816 spinc svn rename Rename a file or directory keeping track of history svn rename temptxt releasenotestxt mv move svn resolve Let subversion know that a con ict has been resolved svn resolve panetstxt svn revert Undo changes to working copy ie resynchronize with repository svn revert spinh svn status Show the status of les and directories in the working copy svn status svn update Bring changes from repository into working copy inverse of commit svn update Reading Subversion Output I svn status compares your working copy with the repository I Prints one line for each le that39s worth talking about svn status jupitermoonstxt readmetxt jupitermoonstxt has been modified readmebltt has conflicts I svn update prints one line for each file or directory it does something to svn update Asaturnmoonstxt U marsmarstxt I saturnmoonstxt has been added I marsmarsbltt has been updated ie someone else modified it lloz U Summary I Version control is one of the things that distinguishes professionals from amateurs I And successful projects from failures I Everything that a human being had to create should be under version control I You39ll see the benefits almost immediately Today39s Agenda I Version Control Introduction I Lab Assignment 1 1020


Buy Material

Are you sure you want to buy this material for

25 Karma

Buy Material

BOOM! Enjoy Your Free Notes!

We've added these Notes to your profile, click here to view them now.


You're already Subscribed!

Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'

Why people love StudySoup

Bentley McCaw University of Florida

"I was shooting for a perfect 4.0 GPA this semester. Having StudySoup as a study aid was critical to helping me achieve my goal...and I nailed it!"

Allison Fischer University of Alabama

"I signed up to be an Elite Notetaker with 2 of my sorority sisters this semester. We just posted our notes weekly and were each making over $600 per month. I LOVE StudySoup!"

Jim McGreen Ohio University

"Knowing I can count on the Elite Notetaker in my class allows me to focus on what the professor is saying instead of just scribbling notes the whole time and falling behind."


"Their 'Elite Notetakers' are making over $1,200/month in sales by creating high quality content that helps their classmates in a time of need."

Become an Elite Notetaker and start selling your notes online!

Refund Policy


All subscriptions to StudySoup are paid in full at the time of subscribing. To change your credit card information or to cancel your subscription, go to "Edit Settings". All credit card information will be available there. If you should decide to cancel your subscription, it will continue to be valid until the next payment period, as all payments for the current period were made in advance. For special circumstances, please email


StudySoup has more than 1 million course-specific study resources to help students study smarter. If you’re having trouble finding what you’re looking for, our customer support team can help you find what you need! Feel free to contact them here:

Recurring Subscriptions: If you have canceled your recurring subscription on the day of renewal and have not downloaded any documents, you may request a refund by submitting an email to

Satisfaction Guarantee: If you’re not satisfied with your subscription, you can contact us for further help. Contact must be made within 3 business days of your subscription purchase and your refund request will be subject for review.

Please Note: Refunds can never be provided more than 30 days after the initial purchase date regardless of your activity on the site.