3.1.7. Nbor list dead-lock¶
A deadlock could occur if a task A depends on another task B being updated,
because B is in the nbor list of A and the logical flag needed
is true
for the B entry in A’s nbor list, while at the same time A is not in the nbor
list of B, and hence B will not include A when doing check_nbors()
. This could
cause a deadlock because the mechanism that puts task A in the ready queue is
that a thread working on B, updating the time past the one that task A is
waiting for, then calls check_ready()
on the A-task.
But if the reason that the A task is not on the nbor list of B is just a time
delay (e.g. because of latency in MPI communication), then one just neeeds to
make sure that the check_ready()
call actually happens when A – after the
delay – gets put on the nbor list of B. This will be ensured if the adding
of a task to an nbor list always is accompanied by a check_ready()
call.
One call too many does not hurt.
So, at guarantee against nbor-list caused deadlocking is to add a
check_nbors()
call into the init_nbors()
call. This will work, as long
as init_nbors()
is the method used to update the nbor lists from
task_mesg_t%unpack()
. If the method is changed the new method needs to
do something similar.