
ID: 3780988 • Letter: T

Question

The following scenario has been addressed on Chegg previously; however, the answers were somewhat incomplete and lacked references. Can you please answer these questions in more detail, covering all the bases of the scenario? Thank you.

Scenario:

"TAL Distributors is interested in open source distributed database management systems (DDBMS). Use the
Internet to research open source DDBMS software. Use Date’s 12 rules for distributed databases to evaluate
the software. Are there any open source DDBMS software programs that follow all 12 rules? Which open source
DDBMS would you recommend TAL Distributors use? Justify your recommendation and be sure to cite your
references."

Explanation / Answer

C. J. Date's 12 rules for distributed DBMSs are as follows:

RULE 1-LOCAL AUTONOMY

Definition: The sites in a distributed system should be autonomous, that is, independent of each other.

Comments: A DBMS at each site in a distributed system should provide its own security, locking, logging, integrity, and recovery. Local operations use and affect only local resources and do not depend on other sites.

RULE 2-NO RELIANCE ON CENTRAL SITE

Definition: A distributed database system should not rely on a central site, because a single central site may become a single point of failure, affecting the entire system. Also, a central site may become a bottleneck affecting the distributed system's performance and throughput.

Comments: Each site of a distributed database system provides its own security, locking, logging, integrity, and recovery, and handles its own data dictionary. No single central site should have to be involved in every distributed transaction.

RULE 3-CONTINUOUS OPERATION

Definition: A distributed database system should never require downtime.

Comments: A distributed database system should provide on-line backup and recovery, and a full and incremental archiving facility. The backup and recovery should be fast enough to be performed on-line without a noticeable detrimental effect on overall system performance.

RULE 4-LOCATION TRANSPARENCY AND LOCATION INDEPENDENCE

Definition: Users and applications should not need to know, or even be aware of, where the data is physically stored; instead, they should be able to behave as if all data were stored locally.

Comments: Location transparency can be supported by extended synonyms and extensive use of the data dictionary. Location independence allows applications to be ported easily from one site in a distributed database system to another without modifications.
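A minimal Python sketch of the idea, assuming a hypothetical in-memory catalog in place of a real data dictionary: callers name only the table, and the catalog decides which site answers. The names SITES, CATALOG, read_table, and move_table are illustrative and do not come from any particular DDBMS.

```python
# Hypothetical sites; each maps a table name to its rows.
SITES = {
    "site_a": {"customer": [{"id": 1, "name": "Ada"}]},
    "site_b": {"orders": [{"id": 10, "customer_id": 1}]},
}

# Hypothetical global data dictionary: table name -> site that stores it.
CATALOG = {"customer": "site_a", "orders": "site_b"}


def read_table(table):
    """Return the rows of a table without the caller naming a site."""
    site = CATALOG[table]
    return SITES[site][table]


def move_table(table, new_site):
    """Relocate a table; callers of read_table need no changes."""
    old_site = CATALOG[table]
    SITES[new_site][table] = SITES[old_site].pop(table)
    CATALOG[table] = new_site
```

Moving "orders" from site_b to site_a changes only the catalog entry, so code that calls read_table("orders") keeps working unmodified, which is the location independence the rule asks for.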

RULE 5-FRAGMENTATION INDEPENDENCE

Definition: Relational tables in a distributed database system can be divided into fragments and stored at different sites, transparently to users and applications.

Comments: Similar to the location transparency rule, users and applications should not be aware of the fact that some data may be stored in a fragment of a table at a site different from the site where the table itself is stored.
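As a rough illustration, the sketch below assumes one logical customer table split horizontally by region across two hypothetical sites. The function select_customers is an invented name; it hides the union over fragments so the caller sees a single table.

```python
# Hypothetical horizontal fragments of one logical "customer" table,
# each held at a different site and split by region.
FRAGMENTS = {
    "site_east": [{"id": 1, "region": "east"}, {"id": 2, "region": "east"}],
    "site_west": [{"id": 3, "region": "west"}],
}


def select_customers(predicate=lambda row: True):
    """Query the logical table; the union over fragments stays hidden."""
    return [
        row
        for rows in FRAGMENTS.values()
        for row in rows
        if predicate(row)
    ]
```

A real system would also prune fragments that cannot match the predicate instead of scanning all of them; that optimization is left out here to keep the sketch short.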

RULE 6-REPLICATION INDEPENDENCE

Definition: Data can be transparently replicated on multiple computer systems across a network.

Comments: Similar to the data location and fragmentation independence rules, replication independence is designed to free users of the concerns of where the data is stored. In the case of replication, users and applications should not be aware that replicas of the data are maintained and synchronized automatically by the distributed database management system.
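The sketch below shows one simple way this could look, assuming synchronous fan-out of every write to every replica. The class ReplicatedTable and its site names are hypothetical; production systems typically use quorum or asynchronous schemes instead.

```python
class ReplicatedTable:
    """A key-value table whose replicas are hidden from the caller."""

    def __init__(self, replica_names):
        # One dict per replica, keyed by a hypothetical site name.
        self.replicas = {name: {} for name in replica_names}

    def write(self, key, value):
        # Synchronous fan-out: every replica receives the update.
        for store in self.replicas.values():
            store[key] = value

    def read(self, key, prefer=None):
        # Any replica can serve the read; the caller need not pick one.
        name = prefer if prefer in self.replicas else next(iter(self.replicas))
        return self.replicas[name][key]
```

The caller writes once and reads once; how many copies exist and which one answers is a detail of the table, not of the application.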

RULE 7-DISTRIBUTED QUERY PROCESSING

Definition: The performance of a given query should be independent of the site at which the query is submitted.

Comments: Since a relational database management system provides non-navigational access to data (via SQL), such a system should support an optimizer that can select not only the best access path within a given node, but also can optimize a distributed query performance in regard to the data location, CPU and I/O utilization, and network traffic throughput.
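To make the optimizer's job concrete, here is a deliberately simplified Python sketch that chooses where to run a join by minimizing the bytes shipped over the network. The cost model (network transfer only) and the function name cheapest_join_site are assumptions for illustration; a real optimizer would also weigh CPU and I/O.

```python
def cheapest_join_site(table_bytes, table_site):
    """Pick the site that minimizes bytes shipped to co-locate all tables.

    table_bytes: table name -> size in bytes
    table_site:  table name -> site currently holding that table
    """
    costs = {}
    for site in set(table_site.values()):
        # Every table not already at this site would have to be shipped to it.
        costs[site] = sum(
            size for table, size in table_bytes.items()
            if table_site[table] != site
        )
    # Break ties by site name so the choice is deterministic.
    best = min(costs, key=lambda s: (costs[s], s))
    return best, costs[best]
```

With a large orders table at one site and a small customer table at another, this model ships the small table to the large one, regardless of which site received the query.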

RULE 8-DISTRIBUTED TRANSACTION MANAGEMENT

Definition: A distributed system should be able to support atomic transactions.

Comments: The transaction properties of atomicity, consistency, isolation, and durability (ACID), along with serializability, should be supported not only for local transactions but also for distributed transactions that span multiple systems. One example of a distributed transaction management issue is coordinating a transaction across sites with the two-phase commit protocol.
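A minimal sketch of two-phase commit in Python may help here. It assumes an in-process coordinator and participants that never crash or time out, so it omits the logging and recovery a real implementation needs; the class and function names are hypothetical.

```python
class Participant:
    """One site taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only on a unanimous yes vote.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "aborted"
```

A single "no" vote aborts the transaction at every site, which is what makes the outcome atomic across systems.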

RULE 9-HARDWARE INDEPENDENCE

Definition: A distributed database system should be able to operate and access data spread across a wide variety of hardware platforms.

Comments: Any truly distributed DBMS system should not rely on a particular hardware feature, nor should it be limited to a certain hardware architecture or vendor.

RULE 10-OPERATING SYSTEM INDEPENDENCE

Definition: A distributed database system should be able to run on different operating systems.

Comments: Similar to Rule 9, a truly distributed database system should support distribution of functions and data across different operating systems, including any combination of such operating systems as DOS, UNIX, Windows NT, MVS/VM, VSE, and VAX.

RULE 11-NETWORK INDEPENDENCE

Definition: A distributed database system should be designed to run regardless of the communication protocols and network topology used to interconnect various system nodes.

Comments: Similar to Rules 9 and 10, a truly distributed database system should support distribution of functions and data across different operating systems irrespective of the particular communication method used to interconnect all participating systems, including local and wide area networks. In fact, networks and communication protocols can be mixed to satisfy certain business, economic, geographical, and other requirements.

RULE 12-DBMS INDEPENDENCE

Definition: An ideal distributed database management system must be able to support interoperability between DBMS systems running on different nodes, even if these DBMS systems are unlike (heterogeneous).

Comments: All participants in distributed database management systems should use common standard interfaces (APIs) in order to interoperate with each other and to participate in distributed database processing.

According to the CAP theorem, no distributed database can provide all three of the following guarantees at the same time: consistency, availability, and partition tolerance.

It is theoretically impossible to meet all three requirements, so a combination of two must be chosen, and that choice is usually the deciding factor in which technology is used.

When it comes to distributed databases, the two choices are really AP (Availability and Partition Tolerance) or CP (Consistency and Partition Tolerance) since if it's not partition tolerant, it's not really a reliable distributed database.

So, the choice is simpler:

You should choose CP mode when:

The database should stop responding rather than return anything other than the absolute latest copy of the data.

The most popular open source DBMS solutions for this approach are MongoDB and Apache HBase.

These comply with all of C. J. Date’s rules.

You should choose AP mode when:

A network split happens, and you want the database to keep answering, possibly with stale data.

The best open source DBMS solutions for this approach are Apache Cassandra and Apache CouchDB.

These comply with all of C. J. Date’s rules except Rule 8, Distributed Transaction Management.
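The difference between the CP and AP choices can be sketched in a few lines of Python. This is a toy model under stated assumptions: a single replica object with a flag standing in for a network partition, and the names Replica and Unavailable are hypothetical rather than taken from any of the products mentioned.

```python
class Unavailable(Exception):
    """Raised when a CP replica cannot confirm it holds the latest value."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = None         # last value this replica has seen
        self.partitioned = False  # True when cut off from the other replicas

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: refuse to answer rather than risk returning stale data.
            raise Unavailable("cannot confirm the latest value")
        # AP (or no partition): answer with whatever is held locally.
        return self.value
```

During a partition, the CP replica raises Unavailable while the AP replica returns its local copy, which may lag behind writes accepted on the other side of the split.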
