Distributed System Overview


Contents

  1. Introduction


Introduction

A distributed system (DS) is a system whose components are located on different networked computers and communicate and coordinate their actions by passing messages to one another. The components interact in order to achieve a common goal. Throughout this document, a distributed system is referred to as DS. Reasons why large technology companies prefer a DS over a monolithic system:

  • Parallelism
  • Fault Tolerance
    - Availability
    - Recoverability
    - Non-volatile Storage
    - Replication
  • Geographical location
  • Security and Isolation

Basic challenges of DS:

  • Concurrency and Consistency
    - Strong/Weak Consistency
  • Partial failure
  • Performance
    - Scalability

DS infrastructure used by applications:

  • Storage
  • Communication
  • Computation (e.g., MapReduce)

MapReduce

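MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. The user supplies a map function that turns each input record into intermediate key/value pairs, and a reduce function that merges all intermediate values sharing the same key.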
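
The programming model can be illustrated with a minimal, single-process sketch of a word count. It is not a distributed implementation: a real MapReduce system runs the map and reduce tasks on many machines and handles failures. The names run_mapreduce, map_word_count and reduce_word_count are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from typing import Callable, Iterable


def map_word_count(document: str) -> list[tuple[str, int]]:
    """Map step: emit an intermediate (word, 1) pair for every word."""
    return [(word.lower(), 1) for word in document.split()]


def reduce_word_count(word: str, counts: list[int]) -> int:
    """Reduce step: merge all counts emitted for the same word."""
    return sum(counts)


def run_mapreduce(
    inputs: Iterable[str],
    map_fn: Callable[[str], list[tuple[str, int]]],
    reduce_fn: Callable[[str, list[int]], int],
) -> dict[str, int]:
    """Run map, group-by-key (shuffle), and reduce sequentially in one process."""
    # Map phase plus shuffle: group intermediate values by key.
    intermediate: dict[str, list[int]] = defaultdict(list)
    for item in inputs:
        for key, value in map_fn(item):
            intermediate[key].append(value)

    # Reduce phase: one reduce call per distinct key.
    return {key: reduce_fn(key, values) for key, values in intermediate.items()}


documents = ["the quick brown fox", "the lazy dog", "the fox"]
word_counts = run_mapreduce(documents, map_word_count, reduce_word_count)
print(word_counts)
```

In this sketch each call to map_fn depends only on its own input and each call to reduce_fn depends only on one key, which is what would allow a cluster to run those calls in parallel on different machines.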

RPC and Thread

Thread: A thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system.[1] The implementation of threads and processes differs between operating systems, but in most cases a thread is a component of a process. Multiple threads can exist within one process, executing concurrently and sharing resources such as memory, while different processes do not share these resources.
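
As a small illustration of threads sharing memory within one process, the sketch below starts several threads that update the same variable. It uses Python's standard threading module. The counter, the increment function, and the thread and iteration counts are assumptions made for this example. Because the threads share the variable, the update is guarded with a lock so that concurrent increments are not lost.

```python
import threading

counter = 0                      # shared by every thread in this process
counter_lock = threading.Lock()  # serializes updates to the shared counter


def increment(times: int) -> None:
    """Add 1 to the shared counter `times` times, holding the lock per update."""
    global counter
    for _ in range(times):
        with counter_lock:
            counter += 1


# Four threads run concurrently inside the same process.
threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait until every thread has finished

print(counter)  # 40000: all threads updated the same variable
```

Separate processes would each have their own copy of the counter and would have to exchange messages or use explicitly shared memory to combine their results.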


References