Distributed Database
The data of the database is stored at more than 1 site, and all sites work cooperatively to fulfill the user requests (queries, transactions) made to the Distributed Database.
Local Transactions : Transactions which access data only at the site on which they started.
Distributed Transactions : Transactions which access data from more than 1 site.
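The two definitions above can be sketched as a small check. This is only an illustration: the function name and the site names are hypothetical, and it assumes that the site where the transaction started always counts as one of the sites involved.

```python
def classify_transaction(origin_site: str, accessed_sites: set[str]) -> str:
    """Classify a transaction by the sites it touches.

    Assumption (not stated in the notes): the origin site always counts
    as involved, so touching any other site makes the transaction distributed.
    """
    involved = accessed_sites | {origin_site}
    return "local" if len(involved) == 1 else "distributed"


# A transaction started at S1 that only reads S1's data is local.
kind_a = classify_transaction("S1", {"S1"})
# The same transaction touching S2 as well is distributed.
kind_b = classify_transaction("S1", {"S1", "S2"})
```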
2-PHASE COMMIT PROTOCOL (Short name : 2PC)
Purpose : to ensure the atomicity of Distributed Transactions.
System Structure : Each site has a TC (Transaction Coordinator) and a TM (Transaction Manager).
1)
- The TC of the site at which transaction T starts distributes the parts of T to the various sites that hold the data required to complete it.
- This TC (the TC of the site at which T started) is called the co-ordinator.
- It is the responsibility of the co-ordinator to maintain the atomicity of T.
2)
When all participating sites inform the co-ordinator that their part of T has completed, the co-ordinator starts 2PC.
SHORT NAMES used : T = transaction, C = co-ordinator.
3)
PHASE 1 :
- C writes a <prepare T> entry to its log and sends a prepare T message to all sites.
- On receiving this message, each site decides whether it can commit its portion of T.
- If Ans = no, the site adds <no T> to its log and sends an <abort T> message to C.
- If Ans = yes, the site adds a <ready T> entry to its log file and forces the log file to stable storage.
- It then replies with a <ready T> message to C.
PHASE 2 :
- If all sites reply with a <ready T> message, C adds a <commit T> entry to its log, saves it to stable storage, and then commits T.
- Otherwise, C adds <abort T> to its log, saves it to stable storage, and then rolls back T.
- Depending on the above outcome, C sends an <abort T> or <commit T> message to all sites.
- Each site, on receiving the above message, adds either an <abort T> or a <commit T> record to its log and saves it to stable storage.
- Once the above log record is saved on disk, the site either commits or rolls back T accordingly.
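The two phases above can be sketched as a small in-memory simulation. This is a minimal sketch, not a real implementation: the Site class, the can_commit flag and the two_phase_commit function are hypothetical names, messages are modeled as plain method calls, and "forcing to stable storage" is reduced to appending to a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    can_commit: bool = True  # hypothetical: the site's local answer (Ans)
    log: list[str] = field(default_factory=list)

    def on_prepare(self, t: str) -> str:
        # Phase 1: log the vote first, then reply to C.
        if self.can_commit:
            self.log.append(f"<ready {t}>")  # stands in for a forced log write
            return "ready"
        self.log.append(f"<no {t}>")
        return "abort"

    def on_decision(self, t: str, decision: str) -> None:
        # Phase 2: record C's decision; the real commit/rollback would follow.
        self.log.append(f"<{decision} {t}>")


def two_phase_commit(t: str, c_log: list[str], sites: list[Site]) -> str:
    # Phase 1: C logs <prepare T> and collects one reply per site.
    c_log.append(f"<prepare {t}>")
    replies = [site.on_prepare(t) for site in sites]

    # Phase 2: commit only if every site replied ready, otherwise abort.
    decision = "commit" if all(r == "ready" for r in replies) else "abort"
    c_log.append(f"<{decision} {t}>")
    for site in sites:
        site.on_decision(t, decision)
    return decision


# Usage: one site answering "no" is enough to abort T everywhere.
demo_sites = [Site("S1"), Site("S2", can_commit=False)]
demo_c_log: list[str] = []
demo_outcome = two_phase_commit("T", demo_c_log, demo_sites)
```

In the usage above, S2 votes no, so the outcome is abort and every log ends with <abort T>. The sketch ignores timeouts and failures, which the following part of the notes covers.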
[B] IN CASE OF FAILURES :
1) Failure of participating site :
- When the site recovers from failure, it reads its log.
- If the log contains <commit T>, the site executes redo(T).
- If the log contains <abort T>, the site executes undo(T).
- If the log contains only <ready T>, the site consults C to find the outcome of T.
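The recovery rules above amount to a lookup on the site's log. The sketch below uses a hypothetical recovery_action function and plain strings for log records. The final branch (no control record for T at all) is not covered in these notes; it is an assumption based on the fact that a site that never logged <ready T> cannot have let C commit.

```python
def recovery_action(log: list[str], t: str) -> str:
    """Return what a recovering participant should do for transaction t."""
    if f"<commit {t}>" in log:
        return "redo"
    if f"<abort {t}>" in log:
        return "undo"
    if f"<ready {t}>" in log:
        # The site voted ready but never learned the outcome.
        return "consult coordinator"
    # Assumption beyond the notes: the site failed before voting ready,
    # so C cannot have committed T and the site can safely undo.
    return "undo"


# Usage: a site that crashed after voting ready must ask C.
action = recovery_action(["<ready T>"], "T")
```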
2) Failure of coordinator site :
- If C fails in the midst of executing 2PC, the participating sites must determine the fate of T.
- If an active site contains <commit T> in its log, then T must be committed.
- If an active site contains <abort T> in its log, then T must be aborted.
BLOCKING PROBLEM IN 2PC : Drawback of 2PC
- If all active sites contain a <ready T> record but no additional control records, then it is impossible to decide the fate of T.
- Thus all active sites have to wait for C to recover.
- Meanwhile, T continues to hold locks and system resources at each site.
- Due to these locks, other transactions might have to wait for T to complete.
- The above problem is known as the Blocking Problem in 2PC.
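The rules for a failed co-ordinator, including the blocking case, can be sketched as one decision function over the logs of the active sites. The function name decide_fate and the dictionary layout are hypothetical. The last branch (some active site has no <ready T> record) is not spelled out in these notes and is an assumption: without that site's ready vote, C cannot have decided to commit.

```python
def decide_fate(active_logs: dict[str, list[str]], t: str) -> str:
    """Decide the fate of t from the logs of the sites that are still up."""
    if not active_logs:
        raise ValueError("no active sites to consult")
    logs = list(active_logs.values())

    if any(f"<commit {t}>" in log for log in logs):
        return "commit"
    if any(f"<abort {t}>" in log for log in logs):
        return "abort"
    if all(f"<ready {t}>" in log for log in logs):
        # Every active site voted ready and knows nothing more:
        # the sites must wait for C to recover (blocking problem).
        return "blocked"
    # Assumption beyond the notes: a site without <ready T> never voted,
    # so C cannot have committed and T can be aborted.
    return "abort"


# Usage: both surviving sites are ready but uninformed, so T is blocked.
fate = decide_fate({"S1": ["<ready T>"], "S2": ["<ready T>"]}, "T")
```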
3 PHASE COMMIT
Assumptions for 3PC
i) No network partition occurs.
ii) Not more than k sites fail, where k = a predetermined number.
- It is an extension of 2PC.
- It introduces a third phase in which more than 1 site, instead of only 1 site, is involved in the decision to commit.
- Before committing T, C ensures that at least k other sites know the decision to commit T.
- When the co-ordinator fails, the remaining sites select a new co-ordinator.
- The new co-ordinator finds the fate of T by sending a <querystatus T> message to the other sites.
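The reason for informing at least k other sites can be illustrated with a little set arithmetic: if C and k other sites know the decision and at most k sites fail, at least one informed site is still active for the new co-ordinator to query. The sketch below is only an illustration under the two assumptions listed above; the function names and site names are hypothetical.

```python
def may_commit(informed_others: set[str], k: int) -> bool:
    """C may commit T only once at least k other sites know the decision."""
    return len(informed_others) >= k


def some_survivor_knows(coordinator: str, informed_others: set[str],
                        failed: set[str]) -> bool:
    """True if at least one site that knows the decision is still active."""
    informed = informed_others | {coordinator}
    return bool(informed - failed)


# Usage with k = 2: C informed S1 and S2, then C and S1 both fail.
k = 2
informed = {"S1", "S2"}
ok_to_commit = may_commit(informed, k)
still_known = some_survivor_knows("C", informed, failed={"C", "S1"})
```

Here S2 survives and still knows the decision, so the new co-ordinator can learn it. If more than k sites failed, all informed sites could be lost.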
Advantage :
- It avoids the blocking problem of 2PC, unless k sites fail.
Dis-advantage :
- If a network partition occurs, it will appear the same as more than k sites failing, which leads to the problem of BLOCKING.
- When a network partition occurs, it can lead to situations where T is committed in 1 partition and aborted in the other.
- It has more overhead than 2PC.