Key concepts

Job, Task, and Task Environment are the three key components exposed for extension.

This section explains the following:

Job
Task
Task Environment

Job

A job allows a group of tasks to be managed as a unit. A job consists of one or more tasks and, optionally, a task environment. Operations performed on a job affect all the tasks associated with the job. Examples include setting the priority and requiring certain conditions to be met before a task can be executed.

A job must be submitted before its work can run on a job scheduler. Once a job is submitted, the client can wait until all the tasks of the job are completed. None of the job options are mandatory.

Job Options

A job has many options, and all of them are optional. The table below lists each option and its default value.

Option	Description	Default value
Name	Name of the job displayed in the monitoring applications.	Job #
Description	Description of the job displayed in the monitoring applications.	None
Priority	Priority of the job's tasks. Higher priority tasks are selected to be executed before lower priority tasks.	JobPriority.Normal
TaskPreconditions	List of conditions that must be met by an agent machine before a task is assigned to it.	None
AgentSelectionPreference	The scheduling algorithm to use when selecting which agent should execute a task.	AgentSelectionType.Default
CancelOnClientDisconnection	Whether to cancel all tasks if the client disconnects. Appropriate for interactive jobs.	false
CancelOnTaskFailure	Whether to cancel all tasks if any task in the job fails. Appropriate if the result of the job depends on all the tasks completing successfully.	false
ExclusiveExecution	If true, the job's tasks have exclusive access to the agent they are running on. In other words, each task is the only task running on the machine.	false
TaskExecutionTimeout	The number of milliseconds that each of the job's tasks is allowed to run before it is considered to have timed out.	Timeout.Infinite
MaxTaskInterruptedRetryAttempts	Number of times the job's tasks will be retried if the task gets interrupted.	5
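
Because TaskExecutionTimeout is expressed in milliseconds, with Timeout.Infinite as the default, it can be convenient to derive the value from a TimeSpan. The helper below is a minimal sketch: TimeoutHelper and ToTimeoutMilliseconds are hypothetical names that are not part of the API, and the sketch assumes the option accepts an int.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of the product API.
public static class TimeoutHelper
{
    // Converts an optional time limit into a millisecond count.
    // A null limit maps to Timeout.Infinite (-1), the documented default.
    public static int ToTimeoutMilliseconds(TimeSpan? limit)
    {
        if (limit == null)
        {
            return Timeout.Infinite;
        }

        double milliseconds = limit.Value.TotalMilliseconds;
        if (milliseconds < 0 || milliseconds > int.MaxValue)
        {
            throw new ArgumentOutOfRangeException("limit");
        }

        return (int)milliseconds;
    }
}
```

Under that assumption, a five minute limit could be set with job.TaskExecutionTimeout = TimeoutHelper.ToTimeoutMilliseconds(TimeSpan.FromMinutes(5)).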

Tips and Best Practices

There is a performance overhead when submitting jobs. When jobs are submitted to the job scheduler, the jobs are sent with their .NET assemblies. If possible, packing more tasks per job is a way to decrease this overhead.

Example

Here is an annotated example:

using (IJobScheduler scheduler = new ClusterJobScheduler("localhost"))
{
    scheduler.Connect();

    // Create job using IJobScheduler.CreateJob
    Job job = scheduler.CreateJob();

    // Add one or more tasks
    job.AddTask(new SimpleTask(1, 1));

    // Set the task environment
    job.TaskEnvironment = new MyEnvironment();

    // Set the various job options
    job.Name = "My job name";
    job.Description = "My job description";
    job.MaxTaskInterruptedRetryAttempts = 0;
    job.Priority = JobPriority.Low;
    job.TaskPreconditions.Add(CommonResources.AvailableCores, Operator.GreaterThan, 2);

    // Submit the job to the job scheduler
    job.Submit();

    // Wait for the results
    job.WaitUntilDone();
}

Task

A task is a basic building block. The application executes programs contained within tasks. The most important components of a task are the Execute method and the Result property. In the API, developers must extend the Task abstract class.

Every task has an Id and a task number assigned to it. The task Id is created when the task is instantiated and the coordinator assigns the task number. Tasks can also be given names for easier identification.

Note
Do not confuse this Task object with System.Threading.Tasks.Task from the .NET Task Parallel Library. Although they share the same name, they are two completely different classes and one cannot be used with the other.

Serialization

Tasks are serialized as .NET objects when submitted to a job scheduler. When a task is executed, a copy of the .NET object is deserialized and its Execute method is called.

For the task to be sent to the host, the object must be serializable. Currently, only the BinaryFormatter is supported but additional formatters will be supported in the future. The developer must ensure the object can be serialized by the .NET Binary Serializer. For more information on .NET Binary Serialization, see the MSDN documentation on binary serialization.

Note
When the task is deserialized by .NET, the task's constructor is not called.
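
One consequence of the note above is that any state built in the constructor and excluded from serialization is missing on the deserialized copy. The following is a minimal sketch of one way to handle that, using the standard .NET OnDeserialized callback. It assumes the .NET Framework, where BinaryFormatter is available. CachedSumTask and RoundTripDemo are hypothetical stand-ins that do not derive from the product's Task class, so the sketch only illustrates the serializer's behavior.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical stand-in for a task; it does not derive from the product's Task class.
[Serializable]
public class CachedSumTask
{
    // Static fields are never serialized; this counts constructor calls in this process.
    public static int ConstructorCalls;

    private int a;
    private int b;

    // Excluded from serialization, so it is null on a freshly deserialized copy.
    [NonSerialized]
    private string label;

    public CachedSumTask(int first, int second)
    {
        ConstructorCalls++;
        a = first;
        b = second;
        label = "set by constructor";
    }

    // Called by the serializer after the fields are restored; the constructor is skipped.
    [OnDeserialized]
    private void RestoreTransientState(StreamingContext context)
    {
        label = "restored after deserialization";
    }

    public int Sum { get { return a + b; } }
    public string Label { get { return label; } }
}

public static class RoundTripDemo
{
    // Serializes the object to memory and returns the deserialized copy.
    public static CachedSumTask RoundTrip(CachedSumTask original)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (CachedSumTask)formatter.Deserialize(stream);
        }
    }
}
```

After RoundTripDemo.RoundTrip(new CachedSumTask(1, 2)), the copy still reports a Sum of 3 because the private fields were serialized, its Label comes from the callback rather than the constructor, and ConstructorCalls has increased only once.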

Task Properties

Properties that can be set either before a task is submitted or when it is running:

Property	Description	Default value
Name	Name of the task displayed in the monitoring applications.	Task's type name
Result	Result of the task after the task executes. Result objects must be serializable.	null
Properties	List of properties returned after the task executes. The items in this collection can only be strings and primitive types such as int and float.	No properties.

Read-only properties set by the infrastructure:

Property	Description
StandardOutput	The output of the process while it was executing the task.
StandardError	The error stream of the process while it was executing the task.
Id	Random GUID used to identify the task.
IsCancelling	While the task is executing, this will be true if the task is being canceled.
TaskStatus	The current status of the task. This will be updated when its state changes in the job scheduler.
TaskCancellationMessage	If the task is canceled, this property gives a human readable reason why.
TaskCancellationReason	If the task is canceled, a TaskCancellationReason that explains why the task was canceled.

Task States

A task can transition through several different states during its lifetime. Task states are represented by the TaskStatus enum.

All tasks start at the NotSubmitted state. After the task is submitted, the state changes to the Submitted state while the task sits in the coordinator queue. When a task is assigned to an agent for execution, its status changes to Assigned and then to Running once the Agent starts running the task. If no errors occur during execution of the task, the task status changes to Completed. If an uncaught exception is encountered, it goes into the Failed state. Below is a state transition diagram and a full table of task statuses.

Element	Description	Transition
NotSubmitted	Task is not submitted yet.	Always the initial state
Submitted	Task is submitted but not assigned yet.	Always the state after task is submitted
Assigned	Task is assigned but not run yet.	Transition state. Next state is Running
Running	Task is currently running.	Expected
Canceling	Task is in the process of canceling. IsCancelling flag is set to true.	Transition
Interrupted	Task encountered a system exception. Examples of system exceptions include if the agent disconnects or the host process exits unexpectedly.	End state
Canceled	Task is canceled.	End state
EnvironmentError	Task failed to run because of an uncaught exception in Setup.	End state
Completed	Task completed successfully.	End state
TimedOut	Task timed out because it ran longer than the specified TaskExecutionTimeout value.	End state
Failed	Task failed because of an uncaught exception while running the task.	End state
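
Client code often needs to tell whether a task has reached an end state and whether it succeeded. The sketch below encodes the table above in a self-contained form. SketchTaskStatus is a hypothetical stand-in for the real TaskStatus enum, reusing only the status names listed in the table, and SketchTaskStatusRules is an illustrative helper that is not part of the API.

```csharp
// Hypothetical stand-in mirroring the status names in the table above.
public enum SketchTaskStatus
{
    NotSubmitted,
    Submitted,
    Assigned,
    Running,
    Canceling,
    Interrupted,
    Canceled,
    EnvironmentError,
    Completed,
    TimedOut,
    Failed
}

// Illustrative helper, not part of the product API.
public static class SketchTaskStatusRules
{
    // True for every status the table marks as "End state".
    public static bool IsEndState(SketchTaskStatus status)
    {
        switch (status)
        {
            case SketchTaskStatus.Interrupted:
            case SketchTaskStatus.Canceled:
            case SketchTaskStatus.EnvironmentError:
            case SketchTaskStatus.Completed:
            case SketchTaskStatus.TimedOut:
            case SketchTaskStatus.Failed:
                return true;
            default:
                return false;
        }
    }

    // True only when the task ran to completion without an error.
    public static bool IsSuccess(SketchTaskStatus status)
    {
        return status == SketchTaskStatus.Completed;
    }
}
```

Of the six end states, only Completed indicates success, so code that reads a task's Result would typically check for that status first.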

Tips and Best Practices

How should tasks be split? Splitting the workload into more units can yield better performance. A common problem in parallel applications is that the workload is split more coarsely than is optimal. In some cases, the time to complete the whole job becomes equal to the time it takes for the slowest task to complete. As with any performance tip, measure, measure, measure.
Design tasks so they are decoupled from the application code as much as possible. Try not to store application logic in the Task class. This will prove helpful when refactoring and testing code.
If certain fields on the task instance are not used, remove them to reduce the size of the object being sent. If a field cannot be removed, mark it with the NonSerialized attribute (NonSerializedAttribute) to exclude it from serialization.
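
To make the first tip concrete, the sketch below divides a range of work items into contiguous chunks of nearly equal size, so that each chunk could become one task. WorkSplitter is a hypothetical helper that is not part of the API, and the right chunk count for a given workload is something to measure rather than assume.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the product API.
public static class WorkSplitter
{
    // Splits the range [0, totalItems) into at most chunkCount contiguous chunks
    // whose sizes differ by at most one. Each pair holds (start index, item count).
    public static List<KeyValuePair<int, int>> Split(int totalItems, int chunkCount)
    {
        if (totalItems < 0)
        {
            throw new ArgumentOutOfRangeException("totalItems");
        }
        if (chunkCount <= 0)
        {
            throw new ArgumentOutOfRangeException("chunkCount");
        }

        var chunks = new List<KeyValuePair<int, int>>();
        int baseSize = totalItems / chunkCount;
        int remainder = totalItems % chunkCount;
        int start = 0;

        for (int i = 0; i < chunkCount; i++)
        {
            // The first 'remainder' chunks take one extra item.
            int size = baseSize + (i < remainder ? 1 : 0);
            if (size == 0)
            {
                break; // Fewer items than chunks: stop instead of emitting empty chunks.
            }
            chunks.Add(new KeyValuePair<int, int>(start, size));
            start += size;
        }

        return chunks;
    }
}
```

For example, WorkSplitter.Split(10, 4) yields the chunks (0, 3), (3, 3), (6, 2), and (8, 2). Increasing chunkCount produces smaller tasks, which reduces the chance that a single slow task dominates the job, at the cost of more per-task overhead.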

Example

Here is an annotated example:

// You must always make the Task class serializable
[Serializable]
public class SimpleTask : Task
{
    // Instance fields are serialized automatically along with the task
    private int a;
    private int b;

    public SimpleTask(int first, int second)
    {
        a = first;
        b = second;
    }

    public override void Execute()
    {
        // Put your application logic in the Execute method
        Result = a + b;
    }
}

Task Environment

A task environment provides a single set of functions that each host performs once before executing any of the tasks associated with the task environment and once before the host is recycled.

Although not strictly required, using a task environment is crucial to optimizing many cases where it takes time to do a common operation. A very common scenario for using the task environment is to perform some sort of expensive operation before a task executes. Examples include loading required data, setting up and running an application, or simply starting the host environment such as .NET itself.

Note
When a task environment is not specified, a default task environment, common for all tasks in the job, is used.

Task Environment Lifetime

A task environment is set up only once per host process. After a process sets up a task environment, the process remains idle until a task specifying the task environment is assigned to the process for execution. In other words, a host runs one task environment and multiple tasks. After each task is executed, the host checks whether any of the task environment's RecycleSettings conditions have been satisfied. If any one of the conditions has been met, the process is recycled. The Agent Tray Application exposes a user interface for configuring an agent's host recycle settings. The values set in the UI are overridden if the RecycleSettings of a submitted job's TaskEnvironment have been set programmatically. A task environment can also be recycled if the agent needs to free up a process to run an incoming task that has a different task environment. For more information, read Host Assignment and Recycling.

Identification

To identify whether two task environment references are equal, a few properties of the task environment are checked. A task environment is uniquely identified via a combination of its Id and the AdditionalId properties. Two task environment identifications are equal if and only if both Id and AdditionalId properties are equal.

Here are a few examples to demonstrate. Below is code for a number of task environments:

[Serializable]
public class TaskEnvironmentA : TaskEnvironment
{
    public TaskEnvironmentA()
    {
        this.Id = new Guid("94CFE45C-5EA9-4592-BB64-B83A5E72DB77");
    }

    public override void Setup() { }
    public override void Teardown() { }
}

[Serializable]
public class TaskEnvironmentB : TaskEnvironment
{
    public TaskEnvironmentB()
    {
        this.Id = new Guid("94CFE45C-5EA9-4592-BB64-B83A5E72DB77");
    }

    public override void Setup() { }
    public override void Teardown() { }
}

[Serializable]
public class TaskEnvironmentC : TaskEnvironment
{
    public TaskEnvironmentC()
    {
        this.Id = new Guid("94CFE45C-5EA9-4592-BB64-B83A5E72DB77");
        this.AdditionalId = "ham";
    }

    public override void Setup() { }
    public override void Teardown() { }
}

[Serializable]
public class TaskEnvironmentD : TaskEnvironment
{
    public TaskEnvironmentD()
    {
        this.AdditionalId = "ham";
    }

    public override void Setup() { }
    public override void Teardown() { }
}

[Serializable]
public class TaskEnvironmentE : TaskEnvironment
{
    public TaskEnvironmentE()
    {
        this.AdditionalId = "ham";
    }

    public override void Setup() { }
    public override void Teardown() { }
}

[Serializable]
public class TaskEnvironmentF : TaskEnvironment
{
    public TaskEnvironmentF()
    {
        this.Id = new Guid("B766B6B0-232A-4878-A7CF-82F354E117D4");
    }

    public override void Setup() { }
    public override void Teardown() { }
}

Suppose there is a hypothetical method called TaskEnvironmentIsEqual, which compares whether two task environments have the same identification. The results are below, with an explanation of each result in the comments.

// IsEqual: true
// Reason:  Both Id and AdditionalId properties are the same.
TaskEnvironmentIsEqual(new TaskEnvironmentA(), new TaskEnvironmentB());

// IsEqual: false
// Reason:  Id properties are not the same.
TaskEnvironmentIsEqual(new TaskEnvironmentA(), new TaskEnvironmentF());

// IsEqual: false
// Reason:  AdditionalId properties are not the same, even though Id properties are the same.
TaskEnvironmentIsEqual(new TaskEnvironmentA(), new TaskEnvironmentC());

// IsEqual: false
// Reason:  AdditionalId properties are the same. But the Id is random by default, so they are not equal.
TaskEnvironmentIsEqual(new TaskEnvironmentD(), new TaskEnvironmentE());
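
The comparison rule can also be written out as a self-contained sketch. EnvironmentIdentity is a hypothetical stand-in that carries only the two identifying properties, and EnvironmentIdentityRules is an illustrative helper. How the library compares these values internally is not documented here, so the ordinal, case-sensitive comparison of AdditionalId is an assumption.

```csharp
using System;

// Hypothetical stand-in for the identifying part of a task environment.
public sealed class EnvironmentIdentity
{
    public readonly Guid Id;
    public readonly string AdditionalId;

    // Mirrors the documented defaults: a new random Id and a null AdditionalId.
    public EnvironmentIdentity(Guid? id = null, string additionalId = null)
    {
        Id = id ?? Guid.NewGuid();
        AdditionalId = additionalId;
    }
}

// Illustrative helper, not part of the product API.
public static class EnvironmentIdentityRules
{
    // Equal if and only if both Id and AdditionalId match.
    // Assumption: AdditionalId is compared ordinally (case-sensitive).
    public static bool TaskEnvironmentIsEqual(EnvironmentIdentity x, EnvironmentIdentity y)
    {
        return x.Id == y.Id
            && string.Equals(x.AdditionalId, y.AdditionalId, StringComparison.Ordinal);
    }
}
```

Two identities built from the same Guid and no AdditionalId compare equal, while two identities that share only AdditionalId do not, because each receives its own random Id.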

Serialization

As with tasks, the .NET binary serializer must be able to serialize user-defined task environments. See the Serialization section under Task above.

Properties

Property	Description	Default value
Id	One of the ways to determine if two task environment references are equal. See the Identification section above.	Unique GUID generated from Guid.NewGuid
AdditionalId	One of the ways to determine if two task environment references are equal. See the Identification section above.	null
HostArchitecture	Whether to host the task in a 32-bit or 64-bit process. CpuArchitecture.Any indicates that the task can execute in either a 32-bit or a 64-bit process.	CpuArchitecture.Any if the client application is 64-bit. Otherwise, CpuArchitecture.x86 if the client application is 32-bit.
HostGcMode	The type of garbage collection (Server or Workstation) the host should use.	GarbageCollectionType.Server
HostRecycleSettings	The strategies to use for determining when host processes get recycled (shut down before another is started).	If not specified, the default HostRecycleSettings constructor sets an infinite idle timeout and an unspecified fixed number of tasks.
ThreadApartmentState	The apartment state of the execution thread.	ApartmentState.STA
WorkingDirectory	The working directory of the host process when it executes. For .NET this is also the value for AppDomain.BaseDirectory.	If not specified, the directory where the Host.exe is located is used. For example: C:\Program Files\AGI\STK Parallel Computing Server 2.10\Agent\bin

Example

Here is an annotated example:

// You must always make the TaskEnvironment class serializable
[Serializable]
public class MyEnvironment : TaskEnvironment
{
    public override void Setup()
    {
        // Put your common setup logic in the Setup method
    }

    public override void Teardown()
    {
        // Put your common teardown logic in the Teardown method
    }
}

Reference

Job

Task

TaskEnvironment

Other Resources

Internals

MSDN documentation on binary serialization