A programmer team in your organization is developing a web-based application tha
ID: 3672313 • Letter: A
Question
A programmer team in your organization is developing a web-based application that will be accessible to the customers. In order to make it efficient they need estimate the cost of different operations the system needs to do before answering a user request. One of the services to users involves joining two relatively large relations (RA and RB), so they asked you as a system administrator to provide an estimate of the cost for this join operation. You know that the DMBS supports only nested loops join algorithms. The cost metric is the number of page I/Os unless otherwise noted, and the cost of writing out the result should be uniformly ignored.
Given the following information about the relations to be joined:
Relation RA contains 10,000 tuples and has 10 tuples per page.
Relation RB contains 2,000 tuples and also has 10 tuples per page.
Both relations are stored as simple heap files.
Neither relation has any indexes built on it.
52 buffer pages are available.
What is the cost of joining RA and RB using a page-oriented simple nested loops join? What is the minimum number of buffer pages required for this cost to remain unchanged?
What is the cost of joining RA and RB using a block nested loops join? What is the minimum number of buffer pages required for this cost to remain unchanged?
Explanation / Answer
Answer:
Let M =1000 be the number of pages in R, N = 200 be the number of pages in S,and B = 52 be the number of buffer pages available.
Question 1:
Based on given Question , we can write as Basic idea is to read each page of the outer relation, and for each page scan the inner relation for matching tuples. Total cost would be
#pagesinouter + (#pagesinouter . #pagesininner)
which is minimized by having the smaller relation be the outer relation.
TotalCost = N + (N . M) = 200200
The minimum number of buffer pages for this cost is 3.
Question 2
From the Question 2, So we can define as this time read the outer relation in blocks, and for each block scan the inner relation for matching tuples. So the outer relation is still read once, but the inner relation is scanned only once for each outer block, of which there are
ceiling ( #pages in outer / (B.2) ) = ceiling (200/50) = 4.
TotalCost = N + M . ceiling( N / (B.2) ) = 4200
If the number of buffer pages is less than 52, the number of scans of the inner would be more than 4 since ceiling(200/49) = 5.
The minimum number of buffer pages for this cost is therefore 52.
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.