Инфоурок Другое ПрезентацииAdvanced Distributed

Advanced Distributed

Скачать материал
Скачать материал "Advanced Distributed"

Получите профессию

Менеджер по туризму

за 6 месяцев

Пройти курс

Рабочие листы
к вашим урокам

Скачать

Методические разработки к Вашему уроку:

Получите новую специальность за 3 месяца

Семейный психолог

Описание презентации по отдельным слайдам:

  • 1CS 525 Advanced Distributed SystemsSpring 2011Indranil Gupta (Indy)

Membe...

    1 слайд

    1
    CS 525
    Advanced Distributed Systems
    Spring 2011
    Indranil Gupta (Indy)

    Membership Protocols (and Failure Detectors)
    March 31, 2011
    All Slides © IG

  • 2Target SettingsProcess ‘group’-based systems
Clouds/Datacenters 
Replicated...

    2 слайд

    2
    Target Settings
    Process ‘group’-based systems
    Clouds/Datacenters
    Replicated servers
    Distributed databases


    Crash-stop/Fail-stop process failures

  • 3Group Membership ServiceApplication Queries
     e.g., gossip, overlays, 	DH...

    3 слайд

    3
    Group Membership Service
    Application Queries
    e.g., gossip, overlays, DHT’s, etc.
    Membership
    Protocol
    Group
    Membership List
    joins, leaves, failures
    of members
    Unreliable
    Communication
    Application Process pi
    Membership List

  • 4Two sub-protocolsDisseminationFailure DetectorApplication Process  pi...

    4 слайд

    4
    Two sub-protocols
    Dissemination
    Failure Detector
    Application Process pi
    pj
    Group
    Membership List
    Unreliable
    Communication
    Almost-Complete list (focus of this talk)
    Gossip-style, SWIM, Virtual synchrony, …
    Or Partial-random list (other papers)
    SCAMP, T-MAN, Cyclon,…

  • 5Large Group: Scalability A Goalthis is us (pi)Unreliable Communication
Netwo...

    5 слайд

    5
    Large Group: Scalability A Goal
    this is us (pi)
    Unreliable Communication
    Network
    1000’s of processes
    Process Group
    “Members”

  • 6 pj I pj crashed Group Membership ProtocolUnreliable Communication
Networkpi...

    6 слайд

    6
    pj
    I pj crashed
    Group Membership Protocol
    Unreliable Communication
    Network
    pi
    Some process
    finds out quickly
    Failure Detector
    II
    Dissemination
    III
    Crash-stop Failures only

  • 7I. pj crashes Nothing we can do about it! 
A frequent occurrence
Common case...

    7 слайд

    7
    I. pj crashes
    Nothing we can do about it!
    A frequent occurrence
    Common case rather than exception
    Frequency goes up at least linearly with size of datacenter

  • 8II. Distributed Failure Detectors: Desirable PropertiesCompleteness = each f...

    8 слайд

    8
    II. Distributed Failure Detectors: Desirable Properties
    Completeness = each failure is detected
    Accuracy = there is no mistaken detection
    Speed
    Time to first detection of a failure
    Scale
    Equal Load on each member
    Network Message Load

  • 9Distributed Failure Detectors: PropertiesCompleteness
Accuracy

Speed
Time t...

    9 слайд

    9
    Distributed Failure Detectors: Properties
    Completeness
    Accuracy

    Speed
    Time to first detection of a failure
    Scale
    Equal Load on each member
    Network Message Load

    Impossible together in
    lossy networks [Chandra
    and Toueg]

    If possible, then can
    solve consensus!

  • 10What Real Failure Detectors PreferCompleteness
Accuracy
Speed
Time to first...

    10 слайд

    10
    What Real Failure Detectors Prefer
    Completeness
    Accuracy
    Speed
    Time to first detection of a failure
    Scale
    Equal Load on each member
    Network Message Load
    Guaranteed
    Partial/Probabilistic
    guarantee

  • 11Failure Detector PropertiesCompleteness
Accuracy
Speed
Time to first detect...

    11 слайд

    11
    Failure Detector Properties
    Completeness
    Accuracy
    Speed
    Time to first detection of a failure
    Scale
    Equal Load on each member
    Network Message Load

    Time until some
    process detects the failure
    Guaranteed
    Partial/Probabilistic
    guarantee

  • 12Failure Detector PropertiesCompleteness
Accuracy
Speed
Time to first detect...

    12 слайд

    12
    Failure Detector Properties
    Completeness
    Accuracy
    Speed
    Time to first detection of a failure
    Scale
    Equal Load on each member
    Network Message Load

    Time until some
    process detects the failure
    Guaranteed
    Partial/Probabilistic
    guarantee
    No bottlenecks/single
    failure point

  • 13Failure Detector PropertiesCompleteness
Accuracy
Speed
Time to first detect...

    13 слайд

    13
    Failure Detector Properties
    Completeness
    Accuracy
    Speed
    Time to first detection of a failure
    Scale
    Equal Load on each member
    Network Message Load

    In spite of
    arbitrary simultaneous
    process failures

  • 14Centralized Heartbeating…pi, Heartbeat Seq. l++ pi HotspotpjHeartbeats sen...

    14 слайд

    14
    Centralized Heartbeating

    pi, Heartbeat Seq. l++
    pi
     Hotspot
    pj
    Heartbeats sent periodically
    If heartbeat not received from pi within
    timeout, mark pi as failed

  • 15Ring Heartbeatingpi, Heartbeat Seq. l++ Unpredictable on
simultaneous mult...

    15 слайд

    15
    Ring Heartbeating
    pi, Heartbeat Seq. l++
     Unpredictable on
    simultaneous multiple failures
    pi


    pj

  • 16All-to-All Heartbeatingpi, Heartbeat Seq. l++… Equal load per member
pipj

    16 слайд

    16
    All-to-All Heartbeating
    pi, Heartbeat Seq. l++

     Equal load per member

    pi
    pj

  • 17Gossip-style HeartbeatingArray of 
Heartbeat Seq. l
for member subset Good...

    17 слайд

    17
    Gossip-style Heartbeating
    Array of
    Heartbeat Seq. l
    for member subset
     Good accuracy properties
    pi

  • 18Gossip-Style Failure Detection1243Protocol: 
Nodes periodically gossip thei...

    18 слайд

    18
    Gossip-Style Failure Detection
    1
    2
    4
    3
    Protocol:
    Nodes periodically gossip their membership list
    On receipt, the local membership list is updated
    Current time : 70 at node 2
    (asynchronous clocks)
    Address
    Heartbeat Counter
    Time (local)
    Fig and animation by: Dongyun Jin and Thuy Ngyuen

  • 19Gossip-Style Failure DetectionIf the heartbeat has not increased for more t...

    19 слайд

    19
    Gossip-Style Failure Detection
    If the heartbeat has not increased for more than Tfail seconds,
    the member is considered failed
    And after Tcleanup seconds, it will delete the member from the list
    Why two different timeouts?

  • 20Gossip-Style Failure DetectionWhat if an entry pointing to a failed node is...

    20 слайд

    20
    Gossip-Style Failure Detection
    What if an entry pointing to a failed node is deleted right after Tfail seconds?







    Fix: remember for another Tfail
    1
    2
    4
    3
    Current time : 75 at node 2

  • 21Multi-level GossipingNetwork topology is hierarchical
Random gossip target...

    21 слайд

    21
    Multi-level Gossiping
    Network topology is hierarchical
    Random gossip target selection => core routers face O(N) load (Why?)

    Fix: Select gossip target in subnet I, which contains ni nodes, with probability 1/ni
    Router load=O(1)
    Dissemination time=O(log(N))
    Why?
    What about latency for multi-level topologies?
    [Gupta et al, TPDS 06]


    Router
    N/2 nodes in a subnet
    N/2 nodes in a subnet

  • 22Analysis/DiscussionWhat happens if gossip period Tgossip is decreased? 
A s...

    22 слайд

    22
    Analysis/Discussion
    What happens if gossip period Tgossip is decreased?
    A single heartbeat takes O(log(N)) time to propagate. So: N heartbeats take:
    O(log(N)) time to propagate, if bandwidth allowed per node is allowed to be O(N)
    O(N.log(N)) time to propagate, if bandwidth allowed per node is only O(1)
    What about O(k) bandwidth?
    What happens to Pmistake (false positive rate) as Tfail ,Tcleanup is increased?
    Tradeoff: False positive rate vs. detection time

  • 23SimulationsAs # members increases, the detection time increasesAs requireme...

    23 слайд

    23
    Simulations
    As # members increases, the detection time increases
    As requirement is loosened, the detection time decreases
    As # failed members increases, the detection time increases significantly
    The algorithm is resilient to message loss

  • 24Failure Detector Properties …Completeness
Accuracy
Speed
Time to first dete...

    24 слайд

    24
    Failure Detector Properties …
    Completeness
    Accuracy
    Speed
    Time to first detection of a failure
    Scale
    Equal Load on each member
    Network Message Load

  • 25…Are application-defined RequirementsCompleteness
Accuracy
Speed
Time to fi...

    25 слайд

    25
    …Are application-defined Requirements
    Completeness
    Accuracy
    Speed
    Time to first detection of a failure
    Scale
    Equal Load on each member
    Network Message Load

    Guarantee always
    Probability PM(T)
    T time units

  • 26…Are application-defined RequirementsCompleteness
Accuracy
Speed
Time to fi...

    26 слайд

    26
    …Are application-defined Requirements
    Completeness
    Accuracy
    Speed
    Time to first detection of a failure
    Scale
    Equal Load on each member
    Network Message Load

    Guarantee always
    Probability PM(T)
    T time units
    N*L: Compare this across protocols

  • 27All-to-All Heartbeatingpi, Heartbeat Seq. l++…piEvery T unitsL=N/T

    27 слайд

    27
    All-to-All Heartbeating
    pi, Heartbeat Seq. l++

    pi
    Every T units
    L=N/T

  • 28Gossip-style HeartbeatingArray of 
Heartbeat Seq. l
for member subsetpiEver...

    28 слайд

    28
    Gossip-style Heartbeating
    Array of
    Heartbeat Seq. l
    for member subset
    pi
    Every tg units
    =gossip period,
    send O(N) gossip
    message
    T=logN * tg
    L=N/tg=N*logN/T

  • 29
Worst case load L* 
as a function of T, PM(T), N
Independent Message Loss...

    29 слайд

    29

    Worst case load L*
    as a function of T, PM(T), N
    Independent Message Loss probability pml

    (proof in PODC 01 paper)


    What’s the Best/Optimal we can do?

  • 30HeartbeatingOptimal L is independent of N (!)
All-to-all and gossip-based:...

    30 слайд

    30
    Heartbeating
    Optimal L is independent of N (!)
    All-to-all and gossip-based: sub-optimal
    L=O(N/T)
    try to achieve simultaneous detection at all processes
    fail to distinguish Failure Detection and Dissemination components

    Key:
    Separate the two components
    Use a non heartbeat-based Failure Detection Component

  • 31SWIM Failure Detector ProtocolProtocol period
= T’ time unitsXK random
proc...

    31 слайд

    31
    SWIM Failure Detector Protocol
    Protocol period
    = T’ time units
    X
    K random
    processes
    pi
    ping
    ack
    ping-req
    ack
    random pj
    X
    ack
    ping
    random K
    pj

  • 32SWIM versus HeartbeatingProcess LoadFirst Detection
TimeConstantConstantO(N...

    32 слайд

    32
    SWIM versus Heartbeating
    Process Load
    First Detection
    Time
    Constant
    Constant
    O(N)
    O(N)
    SWIM
    For Fixed :
    False Positive Rate
    Message Loss Rate
    Heartbeating
    Heartbeating

  • 33SWIM Failure Detector

    33 слайд

    33
    SWIM Failure Detector

  • 34Accuracy, Load
PM(T) is exponential in -K. Also depends on pml (and pf )
Se...

    34 слайд

    34
    Accuracy, Load

    PM(T) is exponential in -K. Also depends on pml (and pf )
    See paper


    for up to 15 % loss rates


  • 35Prob. of being pinged in T’=

E[T ] = 

Completeness: Any alive member dete...

    35 слайд

    35
    Prob. of being pinged in T’=

    E[T ] =

    Completeness: Any alive member detects failure
    Eventually
    By using a trick: within worst case O(N) protocol periods


    Detection Time

  • 36   pj crashed III. DisseminationUnreliable Communication
NetworkpiDissemina...

    36 слайд

    36
    pj crashed
    III. Dissemination
    Unreliable Communication
    Network
    pi
    Dissemination
    HOW ?
    Failure Detector
    Some process
    finds out quickly

  • 37Dissemination OptionsMulticast (Hardware / IP)
unreliable 
multiple simulta...

    37 слайд

    37
    Dissemination Options
    Multicast (Hardware / IP)
    unreliable
    multiple simultaneous multicasts
    Point-to-point (TCP / UDP)
    expensive
    Zero extra messages: Piggyback on Failure Detector messages
    Infection-style Dissemination

  • 38Infection-style DisseminationProtocol period
= T time unitsXpipingackping-r...

    38 слайд

    38
    Infection-style Dissemination
    Protocol period
    = T time units
    X
    pi
    ping
    ack
    ping-req
    ack
    random pj
    X
    ack
    ping
    random K
    pj
    Piggybacked
    membership
    information
    K random
    processes

  • 39Infection-style DisseminationEpidemic style dissemination
After   		protoco...

    39 слайд

    39
    Infection-style Dissemination
    Epidemic style dissemination
    After protocol periods, processes would not have heard about an update
    Maintain a buffer of recently joined/evicted processes
    Piggyback from this buffer
    Prefer recent updates
    Buffer elements are garbage collected after a while
    After protocol periods; this defines weak consistency

  • 40Suspicion MechanismFalse detections, due to
Perturbed processes
Packet loss...

    40 слайд

    40
    Suspicion Mechanism
    False detections, due to
    Perturbed processes
    Packet losses, e.g., from congestion
    Indirect pinging may not solve the problem
    e.g., correlated message losses near pinged host
    Key: suspect a process before declaring it as failed in the group

  • 41Suspicion MechanismAliveSuspectedFailedDissmn  (Suspect pj)Dissmn  (Alive p...

    41 слайд

    41
    Suspicion Mechanism
    Alive
    Suspected
    Failed
    Dissmn (Suspect pj)
    Dissmn (Alive pj)
    Dissmn (Failed pj)
    pi :: State Machine for pj view element
    FD:: pi ping failed
    Dissmn::(Suspect pj)
    Time out
    FD::pi ping success
    Dissmn::(Alive pj)
    Dissmn
    FD
    pi

  • 42Suspicion MechanismDistinguish multiple suspicions of a process
 Per-proces...

    42 слайд

    42
    Suspicion Mechanism
    Distinguish multiple suspicions of a process
    Per-process incarnation number
    Inc # for pi can be incremented only by pi
    e.g., when it receives a (Suspect, pi) message
    Somewhat similar to DSDV
    Higher inc# notifications over-ride lower inc#’s
    Within an inc#: (Suspect inc #) > (Alive, inc #)
    Nothing overrides a (Failed, inc #)
    See paper

  • 43Time-bounded CompletenessKey: select each membership element once as a ping...

    43 слайд

    43
    Time-bounded Completeness
    Key: select each membership element once as a ping target in a traversal
    Round-robin pinging
    Random permutation of list after each traversal
    Each failure is detected in worst case 2N-1 (local) protocol periods
    Preserves FD properties

  • 44Results from an ImplementationCurrent implementation
Win2K, uses Winsock 2...

    44 слайд

    44
    Results from an Implementation
    Current implementation
    Win2K, uses Winsock 2
    Uses only UDP messaging
    900 semicolons of code (including testing)
    Experimental platform
    Galaxy cluster: diverse collection of commodity PCs
    100 Mbps Ethernet
    Default protocol settings
    Protocol period=2 s; K=1; G.C. and Suspicion timeouts=3*ceil[log(N+1)]
    No partial membership lists observed in experiments

  • 45Per-process Send and Receive Loads 
are independent of group size

    45 слайд

    45
    Per-process Send and Receive Loads
    are independent of group size

  • 46Time to First Detection of a process failure 
T1T1+T2+T3

    46 слайд

    46
    Time to First Detection of a process failure

    T1
    T1+T2+T3

  • 47T1Time to First Detection of a process failure 
apparently uncorrelated to...

    47 слайд

    47
    T1
    Time to First Detection of a process failure
    apparently uncorrelated to group size
    T1+T2+T3

  • 48Membership Update Dissemination Time 
is low at high group sizesT2+T1+T2+T3

    48 слайд

    48
    Membership Update Dissemination Time
    is low at high group sizes
    T2
    +
    T1+T2+T3

  • 49Excess time taken by 
Suspicion MechanismT3+T1+T2+T3

    49 слайд

    49
    Excess time taken by
    Suspicion Mechanism
    T3
    +
    T1+T2+T3

  • 50Benefit of Suspicion Mechanism:
Per-process 10% synthetic packet loss

    50 слайд

    50
    Benefit of Suspicion Mechanism:
    Per-process 10% synthetic packet loss

  • 51More discussion pointsIt turns out that with a partial membership list that...

    51 слайд

    51
    More discussion points
    It turns out that with a partial membership list that is uniformly random, gossiping retains same properties as with complete membership lists
    Why? (Think of the equation)
    Partial membership protocols
    SCAMP, Cyclon, TMAN, …
    Gossip-style failure detection underlies
    Astrolabe
    Amazon EC2/S3 (rumored!)
    SWIM used in
    CoralCDN/Oasis anycast service: http://oasis.coralcdn.org
    Mike Freedman used suspicion mechanism to blackmark frequently-failing nodes

  • Reminder – Due this Sunday April 3rd at 11.59 PMProject Midterm Report due, 1...

    52 слайд

    Reminder – Due this Sunday April 3rd at 11.59 PM
    Project Midterm Report due, 11.59 pm [12pt font, single-sided, 8 + 1 page Business Plan max]
    Wiki Term Paper - Second Draft Due (Individual)

    Reviews – you only have to submit reviews for 15 sessions (any 15 sessions) from 2/10 to 4/28. Keep track of your count! Take a breather!

  • 53Questions

    53 слайд

    53
    Questions

Получите профессию

Бухгалтер

за 6 месяцев

Пройти курс

Рабочие листы
к вашим урокам

Скачать

Скачать материал

Найдите материал к любому уроку, указав свой предмет (категорию), класс, учебник и тему:

6 666 268 материалов в базе

Скачать материал

Вам будут интересны эти курсы:

Оставьте свой комментарий

Авторизуйтесь, чтобы задавать вопросы.

  • Скачать материал
    • 16.11.2020 216
    • PPTX 1 мбайт
    • Оцените материал:
  • Настоящий материал опубликован пользователем Кузнецова Ирина Николаевна. Инфоурок является информационным посредником и предоставляет пользователям возможность размещать на сайте методические материалы. Всю ответственность за опубликованные материалы, содержащиеся в них сведения, а также за соблюдение авторских прав несут пользователи, загрузившие материал на сайт

    Если Вы считаете, что материал нарушает авторские права либо по каким-то другим причинам должен быть удален с сайта, Вы можете оставить жалобу на материал.

    Удалить материал
  • Автор материала

    Кузнецова Ирина Николаевна
    Кузнецова Ирина Николаевна
    • На сайте: 3 года и 4 месяца
    • Подписчики: 0
    • Всего просмотров: 112981
    • Всего материалов: 235

Ваша скидка на курсы

40%
Скидка для нового слушателя. Войдите на сайт, чтобы применить скидку к любому курсу
Курсы со скидкой

Курс профессиональной переподготовки

Технолог-калькулятор общественного питания

Технолог-калькулятор общественного питания

500/1000 ч.

Подать заявку О курсе

Курс профессиональной переподготовки

Руководство электронной службой архивов, библиотек и информационно-библиотечных центров

Начальник отдела (заведующий отделом) архива

600 ч.

9840 руб. 5600 руб.
Подать заявку О курсе
  • Этот курс уже прошли 25 человек

Курс профессиональной переподготовки

Организация деятельности библиотекаря в профессиональном образовании

Библиотекарь

300/600 ч.

от 7900 руб. от 3650 руб.
Подать заявку О курсе
  • Сейчас обучается 285 человек из 66 регионов
  • Этот курс уже прошли 850 человек

Курс повышения квалификации

Специалист в области охраны труда

72/180 ч.

от 1750 руб. от 1050 руб.
Подать заявку О курсе
  • Сейчас обучается 35 человек из 21 региона
  • Этот курс уже прошли 155 человек

Мини-курс

Эффективное управление запасами

4 ч.

780 руб. 390 руб.
Подать заявку О курсе

Мини-курс

Организация и планирование воспитательной работы в СПО

6 ч.

780 руб. 390 руб.
Подать заявку О курсе

Мини-курс

Финансовый анализ

5 ч.

780 руб. 390 руб.
Подать заявку О курсе
  • Сейчас обучается 20 человек из 12 регионов