New User Special Price Expires in

Let's log you in.

Sign in with Facebook


Don't have a StudySoup account? Create one here!


Create a StudySoup account

Be part of our community, it's free to join!

Sign up with Facebook


Create your account
By creating an account you agree to StudySoup's terms and conditions and privacy policy

Already have a StudySoup account? Login here


by: Vito Quigley


Vito Quigley
GPA 3.55

Feifei Li

Almost Ready


These notes were just uploaded, and will be ready to view shortly.

Purchase these notes here, or revisit this page.

Either way, we'll remind you when they're ready :)

Preview These Notes for FREE

Get a free preview of these Notes, just enter your email below.

Unlock Preview
Unlock Preview

Preview these materials now for free

Why put in your email? Get access to more of this material and other relevant free materials for your school

View Preview

About this Document

Feifei Li
Class Notes
25 ?




Popular in Course

Popular in Computer Programming

This 25 page Class Notes was uploaded by Vito Quigley on Thursday September 17, 2015. The Class Notes belongs to COP 4710 at Florida State University taught by Feifei Li in Fall. Since its upload, it has received 80 views. For similar materials see /class/205475/cop-4710-florida-state-university in Computer Programming at Florida State University.

Similar to COP 4710 at FSU

Popular in Computer Programming


Reviews for DATABASES


Report this Material


What is Karma?


Karma is the currency of StudySoup.

You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!

Date Created: 09/17/15
TreeStructured Indexes Range Searches o Find all students with gpa gt 3 039 39 If data is in sorted file do binary search to find first such student then scan to find others Cost of binary search can be quite high 0 Simple idea Create an index39 file Level of indirection again w i Page1 Pagez Page3 l lPageN Data File E Can do binary search on smaller index file index entry P K P Tree l 1391 I K2P K 2 000 m Index file may still be quite large But we can apply the idea repeatedly Nonleaf Pages CM dull Overflow gt El l page Primary pages E Leafpages contain data entries B Tree The Most Widely Used Index 0 Insert delete at log F N cost keep tree height balanced F fanout N leaf pages Minimum 50 occupancy except for root Each node contains d lt m lt 2d entries The parameter d is called the orderof the tree 0 Supports equality and rangesearches efficiently d Index Entries Direct search Data Entries quotSequence setquot Example B Tree 0 Search begins at root and key comparisons direct it to a leaf 0 Search for 5 15 all data entries gt 24 Root 13 17 i A 221 I 24I 27129I I E Based on the search for 15 3 we know it is not in the tree A A i I2I3I5I7I I14 I16I I II19120 B Trees in Practice Typical order 100 Typical fillfactor 67 average fanout 133 Typical capacities Height 4 1334 312900700 records Height 3 1333 2352637 records 0 Can often hold top levels in buffer pool Level 1 1 page 8 Kbytes Level 2 133 pages 1 Mbyte Level 3 17689 pages 133 MBytes Index Classification Clustered vs unclustered If order of data records is the same as or close to order of index data entries then called clustered index A file can be clustered on at most one search key Cost of retrieving data records through index varies greatly based on whether index is clustered or not Clustered vs Unclustered Index 0 Suppose that the data records are stored in a Heap file To build clustered index first sort the Heap file with some free space on each block for future inserts Overflow blocks may be needed for inserts Thus order of data recs is close to but not identical to the sort order Data entries Data entries CLUSTERED Index File Data file Data Records Data Records Index entries direct search for data entries Unclustered vs Clustered Indexes What are the tradeoffs Clustered Pros Efficient for range searches May be able to do some types of compression Possible locality benefits related data Clustered Cons Expensive to maintain on the fly or sloppy with reorganization Inserting a Data Entry into a 3 Tree 0 Find correct leaf L 0 Put data entry onto L If L has enough space done Else must grit L into L and a new node L2 Redistribute entries evenly copy up middle key 0 Insert index entry pointing to LZinto parent of L o This can happen recursively To split index node redistribute entries evenly but push up middle key Contrast with leaf splits Splits grow tree root split increases height Tree growth gets Wider or one eve teler at ten Example B Tree Inserting 8 Root 13 17 Example B Tree Inserting 8 A A 19f201221 241271291 33 34 38 39 A A 2I3l I 578 I1416 33 Notice that root was split leading to increase in height 33 In this example we can avoid split by redistributing entries however this is usually not done in practice Inserting 8 into Example B Tree Entry to be inserted in parent node I Note that 5 is copied up and continues to appear in the leaf A 578 lo 0 0 Observe how minimum occupancy is guaranteed in both leaf and index pg splits Note difference between copy up and pushup 23 Entry to be inserted in parent node Note that 17 is pushed up and only appears once in the index Contrast be sure you this with a leaf split understand the reasons for this Deleting a Data Entry from a 8 Tree Start at root find leaf L where entry belongs Remove the entry If L is at least halffull done If L has only d1 entries 0 Try to redistribute borrowing from sibling ac acent node with same parent as L o If redistribution fails merge L and sibling If merge occurred must delete entry pointing to L or sibling from parent of L Merge could propagate to root decreasing height Example Tree including 8 Delete 19 and 20 IHIEIIIII A Deleting 19 is easy Example Tree including 8 Delete 19 and 20 Deleting 19 is easy Deleting 20 is done with redistribution Notice how middle key is copied up And Then Deleting 24 39 St e e39 u 30 II 0 Observe toss of index entry on right A A and pull down Of I22 I 27 I29 I I I33 I34 I38 I39 I index entry below RA 5 13 17 30 ra jrk II14I16 23 578 jraL erL I II22I27I29I II33I34I38I39I Example of Nonleaf Redistribution Tree is shown below during deletion of 24 What could be a possible initial tree 0 In contrast to previous example can redistribute entry from left child of root to right child m m 7 23 578 1416 quot1181 2 121l 221271291 331341381391 After Redistribution Intuitively entries are redistributed by pushing through the splitting entry in the parent node 0 It suffices to redistribute index entry with key 20 we ve redistributed 17 as well for illustration Root A m m m m m 23 578 1416 quot1181 2 21 221271291 33343839 Bulk Loading of a 3 Tree If we have a large collection of records and we want to create a 3 tree on some field doing so by repeatedly inserting records is very slow Also leads to minimal leaf utilization why 0 Bulk Loading can be done much more efficiently Initialization Sort all data entries insert pointer to first leaf page in a new root page Rok Sorted pages of data entries not yet in B tree 34 9l 1 1213 2 22 2331 3536 3841 44 Bulk Loading Contd Root Index entries for leaf pages always entered into rightmost index MM W page just above leaf level When this fills up it splits Split W Em Ea Em E may go up rightmost path to the root Much faster than repeated inserts especially when one considers locking Data entry pages not yet in B tree Data entry pages not yet in B tree Summary of Bulk Loading 0 Option 1 multiple inserts Slow Does not give sequential storage of leaves 0 Option 2 Bulk Loading Has advantages for concurrency control Fewer IOs during build Leaves will be stored sequentially and linked of course Can control fill factorquot on pages A Note on Order 0rderd concept replaced by physical space criterion in practice at least halffull Index pages can typically hold many more entries than leaf pages Summary Treestructured indexes are ideal for range searches also good for equality searches ISAM is a static structure Only leaf pages modified overflow pages needed Overflow chains can degrade performance unless size of data set and data distribution stay constant B tree is a dynamic structure Insertsdeletes leave tree heightbalanced log F N cost High fanout F means depth rarely more than 3 or 4 Almost always better than maintaining a sorted file Summary Contd Typically 67 occupancy on average Bulk loading can be much faster than repeated inserts for creating a 3 tree on a large data set 0 Most widely used index in database management systems because of its versatility One of the most optimized components of a DBMS


Buy Material

Are you sure you want to buy this material for

25 Karma

Buy Material

BOOM! Enjoy Your Free Notes!

We've added these Notes to your profile, click here to view them now.


You're already Subscribed!

Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'

Why people love StudySoup

Jim McGreen Ohio University

"Knowing I can count on the Elite Notetaker in my class allows me to focus on what the professor is saying instead of just scribbling notes the whole time and falling behind."

Amaris Trozzo George Washington University

"I made $350 in just two days after posting my first study guide."

Steve Martinelli UC Los Angeles

"There's no way I would have passed my Organic Chemistry class this semester without the notes and study guides I got from StudySoup."


"Their 'Elite Notetakers' are making over $1,200/month in sales by creating high quality content that helps their classmates in a time of need."

Become an Elite Notetaker and start selling your notes online!

Refund Policy


All subscriptions to StudySoup are paid in full at the time of subscribing. To change your credit card information or to cancel your subscription, go to "Edit Settings". All credit card information will be available there. If you should decide to cancel your subscription, it will continue to be valid until the next payment period, as all payments for the current period were made in advance. For special circumstances, please email


StudySoup has more than 1 million course-specific study resources to help students study smarter. If you’re having trouble finding what you’re looking for, our customer support team can help you find what you need! Feel free to contact them here:

Recurring Subscriptions: If you have canceled your recurring subscription on the day of renewal and have not downloaded any documents, you may request a refund by submitting an email to

Satisfaction Guarantee: If you’re not satisfied with your subscription, you can contact us for further help. Contact must be made within 3 business days of your subscription purchase and your refund request will be subject for review.

Please Note: Refunds can never be provided more than 30 days after the initial purchase date regardless of your activity on the site.