High-Performance Parallel Database Processing and Grid Databases- P7


Document information:

High-Performance Parallel Database Processing and Grid Databases- P7: Parallel databases are database systems that are implemented on parallel computing platforms. Therefore, high-performance query processing focuses on query processing, including database queries and transactions, that makes use of parallelism techniques applied to an underlying parallel computing platform in order to achieve high performance.
Content extracted from the document:
Chapter 9 Parallel Query Scheduling and Optimization

9.6 Dynamic Cluster Query Optimization

The constant γ can be used as a design parameter to determine the operations that will be corrected. When γ takes a large value, the operations with large potential estimation errors will be involved in the plan correction. A small value of γ implies that the plan correction is limited to the operations whose result sizes can be estimated more accurately. In fact, when γ = 0, the APC method becomes the PPC method, while for sufficiently large γ the APC method becomes the OPC method.

9.6.2 Migration

Subquery migration is based on up-to-date load information available at the time when the query plan is corrected. The migration process is activated by a high-load processing node when it finds at least one low-load processing node in the load table. The process interacts with selected low-load processing nodes and, if successful, some ready-to-run subqueries are migrated. Two decisions need to be made: which node(s) should be probed and which subquery(s) should be reallocated. Alternatives range from simple random selection to biased selection in terms of certain benefit/penalty measures. A biased migration strategy is used that attempts to minimize the additional cost of the migration.

In the migration process described in Figure 9.14, each subquery in the ready queue is checked in turn to find the current low-load processing node to which migration incurs the smallest cost. If that cost is not below a constant threshold α, the subquery is marked as non-migratable and will not be considered further. The other subqueries are attempted for migration one at a time, in ascending order of their additional costs. The process stops when either the node is no longer at a high load level or no low-load node is found.

The threshold α determines which subqueries are migratable in terms of the additional data transfer required along with migration. Such data transfer imposes a workload on the original subquery cluster that initiates the migration and thus reduces or even negates the performance gain for the cluster. Therefore, the migratable condition for a subquery q is defined as follows: Given an original subquery processing node $S_i$ and a probed migration node $S_j$, let $C(q, S_i)$ be the cost of processing q at $S_i$ and let $D(q, S_i, S_j)$ be the data transmission cost for $S_i$ migrating q to $S_j$. Then q is said to be migratable from $S_i$ to $S_j$ if

$$\Delta C_{i,j} = \frac{D(q, S_i, S_j)}{C(q, S_i)} < \alpha.$$

It can be seen from the definition that whether or not a subquery is migratable is determined by three main factors: the system configuration, which determines the ratio of data transmission cost to local processing cost; the subquery operation(s), which determine the total local processing cost; and the data availability at the probed migration processing node. If the operand relation of the subquery is available at the migration processing node, no data transfer is needed and the additional cost $\Delta C_{i,j}$ is zero.

The performance of the migration algorithm is insensitive to the value of the threshold α. This is because the algorithm always chooses the subqueries with minimum additional cost for migration. Moreover, subquery migration takes place only when a query plan correction has already been made. In fact, frequent changes in subquery allocation are not desirable because the processing nodes' workloads change from time to time. A node that has a light load at the time of plan correction may become heavily loaded shortly afterward because of the arrival of new queries and reallocated queries. Thrashing, that is, the case where some subqueries are constantly reallocated without actually being executed, must be avoided.

Algorithm: Migration Algorithm
1. The process is activated by any high-load processing node when there exists a low-load processing node.
2. For each subquery $Q_i$ in the ready queue, do
     For each low-load processing node j, do
       Calculate the cost increase $\Delta C_{i,j}$ for migrating $Q_i$ to j
     Find the node $s_{i,min}$ with the minimum cost increase $\Delta C_{i,min}$
     If $\Delta C_{i,min} < \alpha$, mark $Q_i$ as migratable, otherwise it is non-migratable
3. Find the migratable subquery $Q_i$ with the minimum cost increase
4. Send a migration request message to processing node $s_{i,min}$
5. If an accepted message is received, $Q_i$ is migrated to node $s_{i,min}$
   Else $Q_i$ is marked as non-migratable
6. If the processing node load level is still high and there is a migratable subquery, go to step 3, otherwise go to Subquery Partition.
Figure 9.14 Migration algorithm

9.6.3 Partition

The partition process is invoked by a medium load processing node when there is at least one low load processing node but no high load process ...
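
Returning to the migration algorithm of Figure 9.14, the steps can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the `Subquery` class, the precomputed `transfer_cost` table, and the `still_high_load` and `accepts` callbacks are all hypothetical names introduced here. It also assumes that the set of low-load nodes does not change while the loop runs, which the text does not guarantee.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Subquery:
    name: str
    local_cost: float                 # C(q, S_i): cost of processing q at the origin node
    transfer_cost: Dict[str, float]   # D(q, S_i, S_j), keyed by candidate node name (assumed known)


def cost_increase(q: Subquery, node: str) -> float:
    """Relative additional cost Delta C = D(q, S_i, S_j) / C(q, S_i)."""
    return q.transfer_cost[node] / q.local_cost


def best_target(q: Subquery, low_load_nodes: List[str]) -> Tuple[str, float]:
    """Step 2: the low-load node with the smallest cost increase for q."""
    pairs = ((n, cost_increase(q, n)) for n in low_load_nodes)
    return min(pairs, key=lambda p: p[1])


def migrate(ready_queue: List[Subquery],
            low_load_nodes: List[str],
            alpha: float,
            still_high_load: Callable[[int], bool],
            accepts: Callable[[str, Subquery], bool]) -> List[Tuple[str, str]]:
    """Return (subquery name, target node) pairs for the subqueries migrated."""
    migrated: List[Tuple[str, str]] = []
    if not low_load_nodes:            # step 1: nothing to do without a low-load node
        return migrated

    # Step 2: keep only subqueries whose best cost increase is below alpha.
    candidates = []
    for q in ready_queue:
        node, delta = best_target(q, low_load_nodes)
        if delta < alpha:
            candidates.append((delta, q, node))

    # Step 3: try candidates in ascending order of cost increase.
    candidates.sort(key=lambda c: c[0])
    for _delta, q, node in candidates:
        remaining = len(ready_queue) - len(migrated)
        if not still_high_load(remaining):   # step 6: stop once load is no longer high
            break
        if accepts(node, q):                 # steps 4-5: request, then migrate on acceptance
            migrated.append((q.name, node))
        # A rejected subquery is simply skipped, i.e. treated as non-migratable.
    return migrated
```

Sorting the candidates once stands in for the repeated "go to step 3" loop in the figure. Both visit the migratable subqueries in ascending order of cost increase, provided the cost estimates stay fixed during the process.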