Internals |
Understanding the task, host, and agent life cycle can be crucial to effectively writing applications. Most of these details are very technical and are not needed to successfully write an application. The information in this section can prove useful in helping to understand why the system behaves in certain ways.
In this section the following is explained:
The coordinator internally has a task queue. When a task is submitted to the coordinator, the task gets put in the queue. The task's order in the queue is based on the job's priority (JobPriority). The coordinator takes the first task from the queue whose preconditions (JobTaskPreconditions) are satisfied.
Once a task is taken from the queue, the coordinator assigns the task to an available agent based on the job's specified scheduling algorithm (JobAgentSelectionPreference). The default agent selection algorithm first sort agents based on priority, then it cycles through each agent until the agent's maximum task capacity is reached.
Tasks are run in hosts, which are isolated processes. If a host process crashes, it will not affect the other host processes or the agent. For the .NET host process, a separate AppDomain is created to run the task's code. The assemblies needed to run the task's code are streamed to the host and resolved in the task's AppDomain when needed.
When a host starts up, the host calls TaskEnvironmentSetup on the task environment it receives. Then the host loops for two states:
The chart below shows this state diagram:
Most of the workflow associated with the agent centers around comparing task environment identifications and checking whether a host has already loaded the environment. As a refresher, this topic explains how task environment identifications are compared.
When an agent is assigned a task, its workflow is as follows. First, the agent checks the task environment identification of the task. If an idle host process with that task environment loaded already exists, the agent sends the task to that host. If no hosts have the specified task environment loaded, the agent checks if it can start another host. If the agent cannot start another host because an empty host slot is not available, the agent will cause one of the idle host processes to exit and start a new host process in its place. This procedure is called recycling a host. The new host is then given the new task environment.
Note |
---|
Here are a few scenarios and a description of what happens internally in each situation. For simplicity, the task environment identification GUIDS have been replaced with letters. To learn more about the task environment identification comparison mechanism, read this.
Situation: One idle host (task environment id "A"), One empty host slot. Task is received with environment id "A".
Situation: One idle host (task environment id "A"), One empty host slot. Task is received with environment id "B".
Situation: Two idle hosts (task environment id "A" and "B" respectively). Task is received with environment id "C".
Situation: Task completes on a host (host completed 1 task so far). Task Environment's HostRecycleSettingsFixedNumberOfTasks option is set to 32.
Situation: Task completes on a host (host completed 32 tasks so far). Task Environment's HostRecycleSettingsFixedNumberOfTasks option is set to 32.
|
After a host completes a task, the agent will check whether the host has met any of the requirements of its task environment's HostRecycleSettings. Various recycling strategies are set through the TaskEnvironmentRecycleSettings's properties. Recycling strategies fall into three categories: time, memory, and job based.
Time Based
Memory Based
Job Based
namespace CodeSamples { using System; using AGI.Parallel.Client; using AGI.Parallel.Infrastructure; public class Program { public static void Main(string[] args) { using (IJobScheduler scheduler = new ClusterJobScheduler("localhost")) { scheduler.Connect(); // Create HostRecycleSetting to use in TaskEnvironment HostRecycleSettings settingsTaskEnvironmentNoLongerReferenced = new HostRecycleSettings { TaskEnvironmentNoLongerReferenced = true }; HostRecycleSettings settingsJobCompletion = new HostRecycleSettings { JobCompletion = true }; // Create jobs and set TaskEnvironments Job job1 = scheduler.CreateJob(); Job job2 = scheduler.CreateJob(); Job job3 = scheduler.CreateJob(); Job job4 = scheduler.CreateJob(); Job job5 = scheduler.CreateJob(); Job job6 = scheduler.CreateJob(); job1.TaskEnvironment.RecycleSettings = settingsTaskEnvironmentNoLongerReferenced; job2.TaskEnvironment.RecycleSettings = settingsTaskEnvironmentNoLongerReferenced; job3.TaskEnvironment.RecycleSettings = settingsTaskEnvironmentNoLongerReferenced; job4.TaskEnvironment.RecycleSettings = settingsTaskEnvironmentNoLongerReferenced; job5.TaskEnvironment.RecycleSettings = settingsJobCompletion; job6.TaskEnvironment.RecycleSettings = settingsJobCompletion; // Jobs 1 and 2 have identical IDs job1.TaskEnvironment.Id = new Guid("A457B1FA-BE37-49b4-8B95-A53A16E2AA5F"); job2.TaskEnvironment.Id = new Guid("A457B1FA-BE37-49b4-8B95-A53A16E2AA5F"); // Jobs 3 and 4 have different IDs job3.TaskEnvironment.Id = new Guid("B457B1FA-BE37-49b4-8B95-A53A16E2AA5F"); job4.TaskEnvironment.Id = new Guid("C457B1FA-BE37-49b4-8B95-A53A16E2AA5F"); // Jobs 5 and 6 have identical IDs job5.TaskEnvironment.Id = new Guid("D457B1FA-BE37-49b4-8B95-A53A16E2AA5F"); job6.TaskEnvironment.Id = new Guid("D457B1FA-BE37-49b4-8B95-A53A16E2AA5F"); for (int i = 0; i < 100; i++) { job1.AddTask(new BlankTask()); job2.AddTask(new BlankTask()); job3.AddTask(new BlankTask()); job4.AddTask(new BlankTask()); job5.AddTask(new BlankTask()); job6.AddTask(new BlankTask()); } /* * Hosts will be recycled after both jobs complete * because the task environments are identical. */ // Submit Jobs with identical environments (Recycling Strategy: TaskEnvironmentNoLongerReferenced) job1.Submit(); job2.Submit(); job1.WaitUntilDone(); job2.WaitUntilDone(); /* * Hosts will be recycled after each individual job completes * because the task environments are different. */ // Submit Jobs with different Environment (Recycling Strategy: TaskEnvironmentNoLongerReferenced) job3.Submit(); job4.Submit(); job3.WaitUntilDone(); job4.WaitUntilDone(); /* * Hosts will be recycled after each individual job completes * even though their task environments are identical. */ // Submit Jobs with identical Environment (JobCompletion) job5.Submit(); job6.Submit(); job5.WaitUntilDone(); job6.WaitUntilDone(); } } } // Create Blank Task [Serializable] public class BlankTask : Task { public override void Execute() { } } }
Changes to hosts will be displayed in the AgentService.exe window. Shown below is the output after jobs 1 and 2 finish running (each job has 100 tasks).
STK Parallel Computing Server 2.9 API for .NET