DISTRIBUTED AND HIGH-PERFORMANCE COMPUTING

40 slides
0.12 MB
674 views

Similar Presentations

Presentation Transcript

1

DISTRIBUTED AND HIGH-PERFORMANCE COMPUTINGChapter 9 : Distributed Computing

2

Outline for Distributed ComputingWill discuss some practical aspects of designing and implementing distributed systems. Address distributed and concurrent programming on loosely coupled systems. Main concept is distributed services – these can be anything from standard services (e.g. file serving, mail, printing) to specialised HPC applications. Focus on client/server model of computation; distributed file stores; performance aspects of distributed computing. Software architecture; OS support. Redundancy and fault tolerance.

3

Outline for Distributed ComputingThen a more detailed look at a few interesting issues in distributed computing, including issues of distributed time and communications protocols. Distributed applications on local area networks (LAN) are common; medium/metropolitan area networks (MAN) becoming fairly common; wide area networks (WAN) (e.g. between cities or international) still a research area but becoming more common.

4

Distributed Computing EnvironmentPlatformsDCE Applications

5

Reasons for Distributed Computing SystemsInherently distributed applications Distributed DB, worldwide airline reservation, banking system Information sharing among distributed users CSCW or groupware Resource sharing Sharing DB/expensive hardware and controlling remote lab. devices Better cost-performance ratio / Performance Emergence of Gbit network and high-speed/cheap MPUs Effective for coarse-grained or embarrassingly parallel applications Reliability Non-stopping (availability) and voting features. Scalability Loosely coupled connection and hot plug-in Flexibility Reconfigure the system to meet users’ requirements

6

Loosely Coupled System CharacteristicsDistributed computing systems: loosely coupled distributed computers, connected across local or wide area networks possibly heterogenous machines and OS relatively low bandwidth and high latency comms networks or clusters of workstations (NOW/COW) or (possibly HPC) servers communication using message passing combine different services on different servers provide extra processing power, remotely access shared services, redundancy and fault tolerance

7

Loosely Coupled System CharacteristicsParallel computing systems: tightly coupled processors, in same box (or same room) homogeneous processors and OS high bandwidth inter-processor communications shared memory or message passing provide extra memory and processing power

8

Distributed Systems Advantages & ChallengesDistributed computing can provide: remote access to services distributed tasks among different machines to improve throughput redundancy in processing services to share load redundancy in data storage services to improve I/O throughput and reduce download times (e.g. mirroring at multiple sites) fault tolerance if some components of the system fail

9

Distributed Systems Advantages & ChallengesNOW easily upgraded - just buy more machines. Modern OS like Unix or NT allows fairly easy construction of a basic NOW. Distributed computing not so simple for legacy applications. Transparent distribution still a research area. Various issues still problematic – reliability, fault tolerance, scheduling of processes, process monitoring, security and authentication. Achieving performance non-trivial – high communications overhead; load balancing often hard.

10

Platform Milestones in Distributed Systems

11

Historical perspectiveWork in distributed systems started with workstations Cheap microprocessors and workstations/PCs, modern OS (Unix and NT), and rise of LANs (e.g. Ethernet) led to replacement of mainframe model by distributed network of workstations Ideas from Xerox Palo Alto in early 1980’s Idle cycles on workstations could be exploited via shared file system – any process can run on any available system Significant complications for the file system – concurrency control problems XDFS first implemented on Xerox D Series W/S

12

Historical perspectiveUnique network wide file identifiers (integers) allow retrieval of files from anywhere on network Mapping from human readable name to FID by directory server (itself a distributed application) System was transparent, but slow and fault intolerant Sun took core functionality of XDFS into NFS (1987) NFS evolved into de facto standard, fairly robust, efficient and transparent Very wide area transparent file store still a research area (WebFS, Globus, DWorFS, etc)

13

Unix File SystemFile reference by name, or Use i-node - an element of the i-list, and represented by an index into the i-list i-node contains information about the file - ownership, timestamps, array of pointers to the data blocks of the file Directory used to maintain the name to i-node translation easy to use: int fd; char buffer[8192]; int count; fd = open(‘‘myfile’’, O_RDONLY); count = read(fd, buffer, 8192);

14

Unix File Systemopen system call does name to i-node translation Users manipulate file descriptors which refer to i-nodes For transparency, above code must also work for NFS mounted file Need to preserve the Unix filesystem semantics (same paradigm) (an inode is a data structure on a traditional Unix-style file system such as UFS. An inode stores basic information about a regular file, directory, or other file system object. )

15

NFS LayersFor NFS implementation, new representation – v-nodes v-node triple: (computer-ID, FS number, i-node) Also allows support for foreign (i.e. non Unix) file systems Client/server model used to communicate across network Allows NFS to look like normal Unix file system to applications NFS is stateless - no server retains information about clients

16

NFS LayersIf client crashes, no effect on server If server crashes, client blocks until server returns Client computations delayed but not damaged (blocking communications) Stateless costs some performance NFS optimized for transferring lots of small blocks, so not optimal for bulk data transfer (e.g. in HPC applications), some research work is addressing this problem

17

NFS ProblemsConsequence of statelessness is lack of file locking Normal Unix allows files to be locked for reading/writing In normal Unix kernel, file blocks are cached in kernel NFS speeds up operations by caching at both client and server ends This creates problems - clients may have inconsistent view of data Client reads from its own cache Some other client may have modified that part of the file already, i.e. file has been changed on server

18

NFS ProblemsNFS approach is to request new copy if data is more than some number of seconds old (3sec) This is costly and not very effective No locking means writes must be synchronous, so write operations complete only when server has written data Writes are therefore much slower than reads (which can be from cache ) NFS addresses problem as best as possible given the constraints - so we still use it.

19

Architecture ModelsThree basic architectural models for distributed systems: workstations/servers model; processor pool (thin client) model; integrated model.

20

Workstation ModelEach user has a workstation Application programs run on the workstation Specialised servers perform designated services (e.g. file, directory, authentication, news, printing, gateway, mail, specialist processing) Workstations integrated by sharing common set of resources and common interface Usually user ID is unique across whole network of W/S and any user may use any W/S System wide filestore is mandatory

21

Workstation ModelSome W/S may in addition have private filestores – must be exported to allow transparent access from other machines User can also run application programs remotely on other workstations Cluster Management Systems (CMS) allow user to submit jobs transparently to the NOW, rather than have to manually choose a specific machine to run on CMS handles resource allocation, scheduling, queueing

22

Workstation ModelProcess migration Users first log on his/her personal workstation. If there are idle remote workstations, a heavy job may migrate to one of them. Problems: How to find am idle workstation How to migrate a job What if a user log on the remote machine

23

Processor Pool ModelCollection of terminals for access to the system Pool of processors which run user tasks – can be distributed memory cluster or shared memory SMP server Other servers (e.g. fileservers) Model more common now with advent of X-Terminals, SunRays, thin clients, network interface computers (NICs) Few pure pool systems around – usually a hybrid with workstation approach – but becoming more popular

24

Processor Pool ModelAdvantages: Easier to manage centralised hardware and software Easy to add or remove pool processors without affecting users Old pool processors never die, just gradually out live their usefulness Software licensing costs are reduced Good processor utilisation (unlike workstation model where usually less than 10% of cycles are used) Migration to new hardware platforms is easier, cheaper and can be evolutionary/continuous

25

Processor-Pool ModelClients: They log in one of terminals (diskless workstations or X terminals) All services are dispatched to servers. Servers: Necessary number of processors are allocated to each user from the pool. Better utilization but less interactivity

26

The Integrated ModelIn the totally integrated model all platforms are on single network-wide distributed OS completely seamless completely transparent completely mythical Some small local area networked system might appear close, one day, perhaps...

Browse More Presentations

Last Updated: 8th March 2018

Recommended PPTs