PARALLEL DATABASE :-
A parallel database system seeks to improve performance by parallelizing operations such as loading data, building indexes and evaluating queries. Although data may be stored in a distributed fashion, the distribution is governed solely by performance considerations. Parallel databases improve processing and input/output speeds by using multiple CPUs and disks in parallel. Centralized and client-server database systems are not powerful enough for applications that must query very large databases or process a very large number of transactions per second. In parallel processing, many operations are performed simultaneously, as opposed to serial processing, in which the computational steps are performed sequentially.
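
The difference between serial and parallel evaluation can be sketched in Python. This is only an illustration, not how any particular database does it: the function names and the `workers` parameter are made up for the example, and threads are used just to show the split, compute and combine structure (in CPython they will not give a real speed-up for CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor

def serial_sum(rows):
    """Evaluate SUM(amount) one row after another."""
    total = 0
    for amount in rows:
        total += amount
    return total

def parallel_sum(rows, workers=4):
    """Evaluate SUM(amount) by giving each worker its own chunk of rows."""
    # Ceiling division so every row lands in exactly one chunk.
    chunk = max(1, (len(rows) + workers - 1) // workers)
    parts = [rows[i:i + chunk] for i in range(0, len(rows), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker computes a partial sum over its chunk.
        partials = list(pool.map(serial_sum, parts))
    # The partial results are combined in a final serial step.
    return sum(partials)

rows = list(range(1, 1001))
assert parallel_sum(rows) == serial_sum(rows)
```

The same pattern (partition the input, work on the parts simultaneously, merge the partial results) applies to loading data and building indexes as well.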
Parallel databases can be roughly divided into three categories:
- Shared memory architecture, where multiple processors share the main memory space, as well as mass storage (e.g. hard disk drives).
- Shared disk architecture, where each node has its own main memory, but all nodes share mass storage, usually a storage area network. In practice, each node usually also has multiple processors.
- Shared nothing architecture, where each node has its own mass storage as well as main memory.
Parallel Database Architectures
Shared Memory
In a shared-memory architecture, the processors and disks have access to a common memory, typically via a bus or through an interconnection network. The benefit of shared memory is extremely efficient communication between processors: data in shared memory can be accessed by any processor without being moved by software. A processor can send messages to other processors much faster by using memory writes than by sending messages through a communication mechanism.
Shared memory has a downside as well. The architecture is not scalable beyond 32 or 64 processors, because the bus or the interconnection network becomes a bottleneck. Adding more processors beyond that point does not help, since the processors spend most of their time waiting for their turn on the bus to access memory. Shared-memory architectures usually have large memory caches at each processor, so that referencing the shared memory is avoided whenever possible. Moreover, the caches need to be coherent; that is, if a processor performs a write to a memory location, the data in that memory location should be either updated at or removed from any processor where the data is cached.
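
The "removed from any processor where the data is cached" option can be sketched as a toy write-invalidate scheme. The classes `SharedMemory` and `CachedProcessor` are hypothetical names for this example, and real hardware protocols are far more involved; the sketch only shows the rule stated above.

```python
class SharedMemory:
    """The common memory plus a list of the processors attached to it."""
    def __init__(self):
        self.cells = {}
        self.processors = []

class CachedProcessor:
    """A processor with a private cache in front of the shared memory."""
    def __init__(self, name, memory):
        self.name = name
        self.memory = memory
        self.cache = {}
        memory.processors.append(self)

    def read(self, addr):
        # Go to shared memory only on a cache miss.
        if addr not in self.cache:
            self.cache[addr] = self.memory.cells.get(addr)
        return self.cache[addr]

    def write(self, addr, value):
        # Write through to shared memory and keep a local copy.
        self.memory.cells[addr] = value
        self.cache[addr] = value
        # Invalidate the copy held by every other processor.
        for other in self.memory.processors:
            if other is not self:
                other.cache.pop(addr, None)

memory = SharedMemory()
p0 = CachedProcessor("p0", memory)
p1 = CachedProcessor("p1", memory)
p0.write(100, "old")
p1.read(100)           # p1 now caches the value
p0.write(100, "new")   # p1's cached copy is dropped
assert p1.read(100) == "new"
```

Without the invalidation loop, `p1` would keep returning the stale value from its own cache.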
Shared Disk
In the shared-disk model, all processors can access all disks directly via an interconnection network, but the processors have private memories. Shared disk has two advantages over shared memory. First, since each processor has its own memory, the memory bus is not a bottleneck. Second, it offers a cheap way to provide a degree of fault tolerance.
Fault tolerance: if a processor (or its memory) fails, the other processors can take over its tasks, since the database is resident on disks that are accessible from all processors.
The main problem with shared-disk systems is again scalability. Although the memory bus is no longer a bottleneck, the interconnection to the disk subsystem is now a bottleneck. Compared to shared-memory systems, shared-disk systems can scale to a somewhat larger number of processors, but communication across processors is slower, since it has to go through a communication network.
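
The fault-tolerance argument can be sketched as follows. Everything here (`SharedDisk`, `Node`, `fail_over`, the page ids) is invented for illustration; the point is only that a surviving node can pick up a failed node's tasks because the data lives on disks every node can reach.

```python
class SharedDisk:
    """Mass storage reachable from every node."""
    def __init__(self, pages):
        self.pages = dict(pages)

class Node:
    """A processor with private memory (buffer) and a list of assigned pages."""
    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        self.buffer = {}   # private memory, lost if the node fails
        self.tasks = []    # page ids this node is responsible for
        self.alive = True

    def run(self):
        """Sum the values on every assigned page, reading from disk on a miss."""
        results = {}
        for page_id in self.tasks:
            if page_id not in self.buffer:
                self.buffer[page_id] = self.disk.pages[page_id]
            results[page_id] = sum(self.buffer[page_id])
        return results

def fail_over(failed, survivors):
    """Spread the failed node's tasks over the surviving nodes, round robin."""
    failed.alive = False
    failed.buffer.clear()
    for i, task in enumerate(failed.tasks):
        survivors[i % len(survivors)].tasks.append(task)
    failed.tasks = []

disk = SharedDisk({"p1": [1, 2], "p2": [3, 4], "p3": [5, 6]})
a, b = Node("a", disk), Node("b", disk)
a.tasks, b.tasks = ["p1"], ["p2", "p3"]
fail_over(b, [a])   # node b fails; a can still read p2 and p3 from the shared disk
assert a.run() == {"p1": 3, "p2": 7, "p3": 11}
```

Note that every page read in `run` goes through the one shared disk object, which mirrors the bottleneck described above.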
Shared Nothing
In a shared-nothing system, each node of the machine consists of a processor, memory and one or more disks. The processors at one node may communicate with a processor at another node over a high-speed interconnection network. A node functions as the server for the data on its own disk or disks. Moreover, the interconnection networks for shared-nothing systems are usually designed to be scalable, so that their transmission capacity increases as more nodes are added. Consequently, shared-nothing architectures are more scalable and can easily support a large number of processors.
The main drawback of shared-nothing systems is the cost of communication and of nonlocal disk access, which is higher than in a shared-memory or shared-disk architecture since sending data involves software interaction at both ends. The Teradata database and the Grace and Gamma research prototypes are shared-nothing architectures.
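
A minimal sketch of the shared-nothing idea, assuming rows are spread over nodes by a simple modulo on an integer key. The names `SNNode`, `node_for`, `load` and `count_where` are hypothetical, and this is not how Teradata, Grace or Gamma are actually implemented.

```python
class SNNode:
    """A node that owns its rows outright: private memory and private disk."""
    def __init__(self):
        self.local_rows = []

    def local_count(self, predicate):
        # Each node scans only the data it stores itself.
        return sum(1 for row in self.local_rows if predicate(row))

def node_for(key, n_nodes):
    """Pick the owning node for a row (assumes integer keys)."""
    return key % n_nodes

def load(rows, nodes):
    """Send every row to the single node that owns its key."""
    for row in rows:
        nodes[node_for(row["id"], len(nodes))].local_rows.append(row)

def count_where(nodes, predicate):
    """Ask every node for a local count, then add up the small results."""
    return sum(node.local_count(predicate) for node in nodes)

nodes = [SNNode() for _ in range(4)]
load([{"id": i, "qty": i % 5} for i in range(20)], nodes)
assert count_where(nodes, lambda row: row["qty"] == 0) == 4
```

Only the per-node counts cross the network here, not the rows. A query that needs rows from another node would have to ship them over the interconnect, which is the communication cost mentioned above.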
Hierarchical
The hierarchical architecture combines the characteristics of shared-memory, shared-disk and shared-nothing architectures. At the top level, the system consists of nodes that are connected by an interconnection network and do not share disks or memory with one another. Thus, the top level is a shared-nothing architecture. Each node, in turn, may itself be a shared-memory or shared-disk system. Attempts to reduce the complexity of programming such systems have yielded distributed virtual-memory architectures, where logically there is a single shared memory but physically the memories are disjoint; the memory-mapping hardware, coupled with system software, allows each processor to view the disjoint memories as a single virtual memory. Since access speed depends on whether the data is held locally, such architectures are also referred to as nonuniform memory architectures (NUMA).
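
The "single virtual memory over disjoint memories" idea can be sketched as a simple address mapping. The class `DistributedVirtualMemory` and its fixed-size layout are assumptions made for the example; real systems do this mapping in hardware and system software at page granularity.

```python
class DistributedVirtualMemory:
    """One logical address space laid out over several per-node memories."""
    def __init__(self, n_nodes, words_per_node):
        self.words_per_node = words_per_node
        self.memories = [[0] * words_per_node for _ in range(n_nodes)]
        self.remote_accesses = 0   # accesses that had to leave the local node

    def locate(self, addr):
        """Map a global address to (owning node, offset within that node)."""
        return divmod(addr, self.words_per_node)

    def write(self, from_node, addr, value):
        node, offset = self.locate(addr)
        if node != from_node:
            self.remote_accesses += 1   # slower path over the interconnect
        self.memories[node][offset] = value

    def read(self, from_node, addr):
        node, offset = self.locate(addr)
        if node != from_node:
            self.remote_accesses += 1
        return self.memories[node][offset]

dvm = DistributedVirtualMemory(n_nodes=2, words_per_node=4)
dvm.write(from_node=0, addr=5, value=42)   # address 5 lives on node 1: remote
assert dvm.read(from_node=1, addr=5) == 42 # local for node 1
assert dvm.remote_accesses == 1
```

The caller uses one flat address range, but the `remote_accesses` counter shows why access time is nonuniform: the same address is cheap from one node and expensive from another.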