Task Management before version 6.5

From Sense/Net Wiki
  • 6.3.1
  • Enterprise
  • Community
  • Planned

Overview

Task Management
Sense/Net ECM is a web application, running in the context of one IIS process and designed to serve many users at the same time. There are tasks that are not well-suited for this environment - for example long-running tasks that would keep the web server busy and make the user wait for the response, or resource-consuming jobs that are even able to crash the executing process. For these kinds of scenarios Sense/Net offers the Task Management module that lets you execute background operations in a scalable way.


This page contains documentation for Task Management before Sense/Net ECM version 6.5. For the new solution please take a look at the main article: Task Management.


Components

Task

A task is an atomic, independent job that consumes more resources than a few processor cycles: e.g. a job that loads one or more content items and converts them, or generates preview images using an external tool. Compressing documents is another good example.

These tasks are independent enough that they can be moved to another worker agent, even to a different machine. A task names the plugin (e.g. the preview generator) that should execute it. Tasks also contain all the metadata these worker agents need to start their work - e.g. a document id and the start and end of the page interval that preview images should be generated for.

Task database

Tasks are stored in the task database. This is a central storage where Sense/Net web applications put tasks and from where the agents will get them for execution through the Task Manager.

A task database is independent from the Sense/Net Content Repository. It can be in the same database or a different database for load balancing reasons - see the Configuration section below for details.

Task Manager

The Task Manager is a module that runs on every web server hosting Sense/Net in the context of the web application itself (there is no separate installation needed). It is responsible for creating new tasks and setting or modifying their priority.

When a new task is created, this module notifies all agents that there is a job to do (a push model that avoids putting too much load on the task database). See the Communication section below for more details about how the components talk to each other. The Task Manager provides an API for creating tasks and setting task priority (see an example later in this article).

Distributed Task Manager

For larger projects it is advisable to use the distributed task manager that is available only in the Enterprise Edition of Sense/Net. This allows you to put your worker agents on a different machine than your web server to save resources. There is a Windows Service for this edition that will keep the configured number of agents alive.

Local Task Manager

In the Community Edition of Sense/Net we offer the Local Task Manager that lets you install Sense/Net on a web server and have a configured number of agents on the same server, locally. The portal will keep these agent processes alive. This is sufficient for smaller projects when the web server doesn't need to serve too many users. It is also possible to use the Local Task Manager with multiple Sense/Net web servers in a load balanced environment.

For developers: agents continuously send SignalR requests to the server. The large number of requests may make debugging the application cumbersome. To make development easier, agents are not started during the boot sequence of the portal if there are no executable tasks in the database; they are started later, when the first task is registered. This delayed start is only relevant for the Local Task Manager mode.

Task Executor Agent

Agents are the components (separate processes) that will start the task tools (e.g. a preview generator exe). One or more agents are hosted on worker nodes (usually virtual machines) and they run constantly. Agent processes are kept alive by a Windows Service (in a distributed environment) or the portal itself (in a local environment). If an agent process crashes, the service will start a new one in place of it. The number of agents per machine is configurable.

Agents are notified by the Task Manager when a new task comes in (see the communication chapter for more details). Then all available (idle) agents will request and lock a new task for themselves for execution. A task can be assigned to a single agent only.

When an agent finishes a task, it requests a new task through the Task Manager, but only once. If there is no new task to execute, it sleeps and waits for the push notification of the Task Manager.

Agents do not know the specifics of tasks; they just start the appropriate task process (e.g. a preview generator) described by the task object.

Task Executor

A Task Executor is a command line tool that actually does the job - e.g. generates preview images. 3rd party developers will create task executors for their custom tasks.

An executor may connect to Sense/Net or other 3rd party applications for more information beyond task management. In this case the additional task-specific configuration is the responsibility of the executor itself – and the operator.

Task execution workflow

In the following example we go through the steps of the first built-in use case for the Task Management framework: generating preview images for the uploaded document.

  1. A user uploads a document.
  2. Sense/Net ECM saves and indexes the document.
  3. Users can now access (download) the document.
  4. Task Manager queues a task for preview generation.
  5. Task Manager notifies all agents that there is a new task.
    Notifying the agents
  6. Idle agents compete for the new task (through the task manager).
  7. Task Manager locks a task in the task database.
  8. Only one agent wins the task.
  9. Task Manager returns the task to the caller agent.
    Get new task
  10. The winning agent starts the appropriate executor tool (in this case the preview generator), which downloads the document from the Content Repository and starts working.
  11. The executor uploads the generated images (previews + thumbnails) to the Content Repository, then exits.
  12. The agent rounds off the finished task and requests a new one.
  13. The task manager finalizes the task in the database and provides a new one for the agent (if necessary).
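
The competition in steps 6 to 9 can be illustrated with a small sketch. It is not the Sense/Net implementation: the real system locks tasks in the task database, while this hypothetical `InMemoryTaskStore` uses an in-process lock as a stand-in. The point it shows is that "pick a task and mark it as taken" has to be a single atomic step, so that only one agent wins.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the task database (illustration only).
public sealed class InMemoryTaskStore
{
    private readonly object _sync = new object();
    private readonly Queue<int> _pending = new Queue<int>();
    private readonly Dictionary<int, string> _lockedBy = new Dictionary<int, string>();
    private int _lastId;

    // Step 4: a new task is queued; returns the id of the task.
    public int Enqueue()
    {
        lock (_sync)
        {
            var id = ++_lastId;
            _pending.Enqueue(id);
            return id;
        }
    }

    // Steps 6-9: an idle agent asks for work. Dequeuing and recording the owner
    // happen under one lock, so a task is handed to at most one agent.
    public int? TryClaim(string agentName)
    {
        lock (_sync)
        {
            if (_pending.Count == 0)
                return null; // nothing to do: the agent goes back to waiting

            var id = _pending.Dequeue();
            _lockedBy[id] = agentName;
            return id;
        }
    }

    // Steps 12-13: only the agent that holds the lock can finalize the task.
    public bool Complete(int taskId, string agentName)
    {
        lock (_sync)
        {
            string owner;
            return _lockedBy.TryGetValue(taskId, out owner)
                && owner == agentName
                && _lockedBy.Remove(taskId);
        }
    }
}
```

With one queued task and two agents calling `TryClaim`, the first caller receives the task id and the second receives `null`.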

Communication

The components above need to communicate with each other. For example the Task Manager has to notify the agents when a new task comes in (note that this is a push model) and sometimes the agents need to notify the Task Manager or ask for information. The general communication is performed using the ASP.NET SignalR technology. It allows a two-way communication and has a robust, fail-safe architecture that ensures that the system functions stably.

Sense/Net does not register or manage the agents connected to the system – this is maintained by SignalR. Connection/disconnection events are raised automatically. Other health information (processor/memory load) is sent by the agents (see the monitoring section below).

All other communication related to specific tasks (e.g. downloading a document or getting information from the Sense/Net Content Repository) will be performed by the task executor tools.

Authentication

Agents need a portal user registered in Sense/Net to be able to contact the task manager through SignalR. The task executors use the same user, so it has to be an administrator with permissions for all the content that the executors need to access.

From version 6.3.1 Patch 4 both Windows (NTLM) and username/password (basic) authentication work. In previous versions only basic authentication is possible. See the Task agent configuration section below for details.

Task priority

When there are many tasks to be executed at the same time (e.g. there are plenty of new documents to create preview images for), there has to be a way to mark some tasks as more important than others.

The Task Manager API provides entry points for setting task priority without knowing much about the logic behind the task. Developers may choose from the following task priority levels:

  • Immediately
  • Important
  • Normal
  • Unimportant

See the details later in this article.
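
As a rough illustration of what priority levels mean for a queue, the sketch below picks the next task by priority first and by registration order second. It is an assumption about how such ordering could work, not a description of the Sense/Net scheduler. The `TaskPriority` enum here is a local stand-in with the four names listed above; its numeric values, `PendingTask` and `TaskOrdering` are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Local stand-in with the four level names from this article; the numeric values are assumed.
public enum TaskPriority { Immediately = 0, Important = 1, Normal = 2, Unimportant = 3 }

// Hypothetical description of a queued task.
public sealed class PendingTask
{
    public PendingTask(string name, TaskPriority priority, long sequence)
    {
        Name = name;
        Priority = priority;
        Sequence = sequence;
    }

    public string Name { get; private set; }
    public TaskPriority Priority { get; private set; }
    public long Sequence { get; private set; } // registration order
}

public static class TaskOrdering
{
    // Most urgent priority first; among equal priorities the oldest task wins.
    // Returns null when there is nothing to execute.
    public static PendingTask Next(IEnumerable<PendingTask> pending)
    {
        return pending
            .OrderBy(t => t.Priority)
            .ThenBy(t => t.Sequence)
            .FirstOrDefault();
    }
}
```

Breaking ties by registration order keeps tasks of the same level in first-in, first-out order, which avoids starving older tasks within a level.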

Task error and timeout

If an error occurs during task execution related to the task itself, it is the responsibility of the executor to handle the case: for example if the document could not be downloaded, the preview generator will set the preview status of the document to Error and will close the task gracefully.

It is possible that the task executor itself crashes during execution. In this case the agent sends all the available information to the server and closes the task itself.

On the other hand, it is also possible that the executor process hangs. The agent waits for a timeout interval for the task to show a sign of life or finish; after that it closes the process and releases the task so that another agent can claim it.

It is the responsibility of the developer of the task executor tool to raise a status message regularly so that the agent knows that the task is running correctly and it should not close it forcefully.
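
One way an executor could meet this requirement during a long, silent operation is to write a status line from a timer while the work runs. The following is a minimal sketch under that assumption; `ProgressHeartbeat` is a hypothetical helper, not part of the framework. The period should be comfortably shorter than the agent's ExecutorTimeoutInSeconds value (see the Task Agent configuration below).

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper for task executors (illustration only).
public static class ProgressHeartbeat
{
    // Runs 'work' and, while it is running, writes "Progress: <message>"
    // to 'output' once per 'period'.
    public static void Run(Action work, TimeSpan period, TextWriter output, string message)
    {
        var gate = new object();
        var timer = new Timer(_ =>
        {
            lock (gate)
            {
                output.WriteLine("Progress: " + message);
            }
        }, null, period, period);

        try
        {
            work();
        }
        finally
        {
            // Dispose(WaitHandle) signals the handle once no callback is running,
            // so nothing is written after Run returns.
            using (var stopped = new ManualResetEvent(false))
            {
                timer.Dispose(stopped);
                stopped.WaitOne();
            }
        }
    }
}
```

In an executor this could be called as `ProgressHeartbeat.Run(() => ConvertDocument(), TimeSpan.FromSeconds(10), Console.Out, "Converting document");`, where `ConvertDocument` stands for your own long-running method.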

Monitoring

If you are using the Enterprise Edition, you get a page accessible from the Root console (click on the Root node in Content Explorer) where you can monitor the Task Management system's status. On the Task Monitor page you'll see the list of agent machines and all the connected agents. The page updates periodically with real-time information about the status of the individual executors.

Task Monitor

The page displays the CPU usage and the available memory (RAM) on each agent machine. You can also get a more detailed log if you click on the individual agents.

Configuration and maintenance

Web.config

Common config values

You can modify the behavior of the Task Management framework with the following web.config keys:

  • TaskManager: on each web server you need to configure one of the following classes to choose between the distributed and the local behavior:
    • SenseNet.BackgroundOperations.DistributedTaskManager
    • SenseNet.BackgroundOperations.LocalTaskManager (this is the default)
  • SignalRSqlEnabled: if you are using multiple web servers (NLBS), you need to let SignalR use the database for its communication traffic. The value can be true or false.
  • TaskAgentCount: if you are using LocalTaskManager, you may configure the number of running agents on the web server using this key.
  • AgentPath: the folder where the agent executable can be found (from version 6.3.1 Patch 3 this is obsolete and omitted and the agent is started from the [webfolder]\TaskManagement folder).

Connection strings

  • TaskDatabase: tasks are saved into the database and removed once they are completed. If you want to store the tasks in a different database outside of the Content Repository, you can do so by providing a different connection string here.
  • SignalRDatabase: when you use multiple web servers, SignalR needs to store its messages in the database. By default this is the Content Repository, but you can store these records in a completely different database.

Windows Service

If you are using DistributedTaskManager (available only in the Enterprise Edition), you need to copy the TaskManagement folder from the web folder to each worker machine. You can place this folder anywhere you want on the agent machine. This folder contains the service, the agent and all the available task executors.

In the configuration file of the service (SenseNetTaskAgentService.exe.config) you can define the following values:

  • TaskAgentCount: the number of running agents.
  • AgentPath: the folder where the agent executable can be found (from version 6.3.1 Patch 3 this is obsolete and the agent is started from the directory of the service).

You have to register the service on every agent machine. Open a command prompt as administrator and install the service:

  • %.NET FRAMEWORK DIRECTORY%\installutil SenseNetTaskAgentService.exe

For example:

"C:\Windows\Microsoft.NET\Framework\v.......\installutil" SenseNetTaskAgentService.exe

If you start the service (e.g. in Windows Task Manager) you will see the agent processes among the running processes (by default 3 agents).

Task Agent

The agent has its own configuration file (SenseNetTaskAgent.exe.config) containing the following options:

  • RepositoryUrl: the url by which the agent will access the portal (e.g. https://example.com).
  • Username: user name (in the form of domain\username) for accessing the portal.
    • basic authentication: provide the username and password in the configuration. It is advisable to use https to access the portal.
    • Windows authentication (available only from version 6.3.1 Patch 4): leave the Username empty in the configuration. In this case we will connect to the portal using the user that started the agent (which is the user of the service in case of distributed mode and the user of the application pool in case of local mode).
  • Password: password for the user above (in case of basic authentication).
  • TaskExecutorDirectory: directory of the task executor tools. Relative to the agent exe (default: ./TaskExecutors).
  • UpdateLockPeriodInSeconds: time interval for refreshing the lock on the executing task (default: 15).
  • ExecutorTimeoutInSeconds: maximum executor inactivity (default: 30).
  • HeartbeatPeriodInSeconds: time interval for sending heartbeat messages to the portal so it can monitor the status of the agent (default: 30).

Updating agents

If you are using the framework in distributed mode (available in the Enterprise Edition), starting with version 6.3.1.6877 updating agents is automatic. See the following article for details:

Built-in tasks

Currently the following Sense/Net features use the Task Management framework:

Creating a custom task executor

It is possible to create your own task executor and use the Task Management framework to execute it. To do so, create a new Console application in Visual Studio. Your tool has to handle the communication with the agent:

  • parse the incoming parameters at the beginning of your Main method
  • regularly write the status of the execution to the Console for the agent to know that the tool is still alive

Incoming parameters

The agent starts the task executor process and provides the necessary parameters as command line arguments in the following format:

MyTaskExecutorTool.exe <ParamName1>:<Value1> <ParamName2>:<Value2>

The parameters are the following:

  • REPO: url for accessing the portal.
  • USERNAME: username for authentication.
  • PASSWORD: password for the user above.
  • DATA: all custom data that was provided when the task was registered (e.g. a content id). This is usually a JSON formatted string that can be deserialized easily - see details in the Starting a new task section below.
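
The `Name:Value` format can be read into a dictionary with a few lines of code. The sketch below is one possible approach, and `ExecutorArguments` is a hypothetical helper name. It splits each argument at the first colon only, because values such as the REPO url contain colons themselves, and it strips surrounding quote characters from the value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for task executors (illustration only).
public static class ExecutorArguments
{
    // Turns "REPO:https://example.com" style arguments into a
    // case-insensitive name/value dictionary.
    public static Dictionary<string, string> Parse(string[] args)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        foreach (var arg in args)
        {
            // Split at the FIRST colon: the value may contain further colons.
            var separator = arg.IndexOf(':');
            if (separator <= 0)
                continue; // not in Name:Value form, ignore

            var name = arg.Substring(0, separator);
            var value = arg.Substring(separator + 1).Trim('\'', '"');
            result[name] = value;
        }

        return result;
    }
}
```

Note that trimming quote characters is only safe for values that do not legitimately start or end with a quote; a JSON object passed in DATA starts and ends with braces, so it is not affected.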

Writing the status

The agent monitors the messages written to the Console by the task executor. You can write a text message or provide a progress percent value - both will be sent to the Task Monitor GUI; the latter will be interpreted as a number and will be displayed on the progress chart. The format is simple:

  • Progress: [message]
  • Progress: [percent value]%

Simple text:

Console.WriteLine("Progress: The document was downloaded successfully.");

Progress percent:

Console.WriteLine("Progress: 65%");

The percent value depends entirely on the nature of the task. For example the preview generator tool writes a new percent value after generating every preview and thumbnail image. This is only for providing an approximate status of the task for the Task Monitor UI.
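
A small helper can keep the two message formats consistent across an executor. The sketch below is an illustration only; `TaskProgress` is a hypothetical name. It derives the percent value from the number of completed work items, in the spirit of the preview generator reporting after each generated image.

```csharp
using System;
using System.Globalization;

// Hypothetical helper that builds status lines in the two formats described above.
public static class TaskProgress
{
    // "Progress: [message]"
    public static string Message(string text)
    {
        return "Progress: " + text;
    }

    // "Progress: [percent value]%", computed from completed/total work items.
    public static string Percent(int completed, int total)
    {
        if (total <= 0)
            throw new ArgumentOutOfRangeException("total");

        // Keep the value between 0 and 100 even if the counters are off.
        var clamped = Math.Max(0, Math.Min(completed, total));
        var percent = (int)((long)clamped * 100 / total);

        return "Progress: " + percent.ToString(CultureInfo.InvariantCulture) + "%";
    }
}
```

For example, `Console.WriteLine(TaskProgress.Percent(13, 20));` writes `Progress: 65%`.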

Deploying the task executor

To let agents use your task executor, create a separate folder for it under the TaskExecutors folder on every agent machine. The folder name must be the name of your tool without the .exe extension (MyTaskExecutor in the example below). Copy the tool and all the libraries and config files it needs into this folder. The agents will automatically discover and execute your tool when a new task arrives for it (a service restart may be needed if you are working in distributed mode).

Example

The following code snippet shows how to create a custom task executor tool as a console application.

using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
 
namespace MyTaskExecutor
{
    class Program
    {
        static void Main(string[] args)
        {
            if (!ParseParameters(args))
            {
                //Logger.WriteWarning("Task process arguments are not correct.");
                return;
            }
 
            try
            {
                ExecuteTask();
            }
            catch (Exception ex)
            {
                //Logger.WriteError(ex);
 
                // If the logger writes standard error information to the console,
                // the task finalizer receives an SnTaskError instance with the
                // type, message and stack trace of the caught exception.
                // The simplest way to write the exception is the following line:
                //Console.WriteLine("ERROR:" + SnTaskError.Create(ex).ToString());
            }
        }
 
        private static void ExecuteTask()
        {
            Console.WriteLine("Progress: Execution started.");
 
            //TODO: implement task logic
 
            for (var i = 0; i < 10; i++)
            {
                //do stuff...

                // report the share of work completed so far (10%, 20%, ... 100%)
                Console.WriteLine("Progress: {0}%", (i + 1) * 10);
            }
 
            Console.WriteLine("Progress: Execution ended.");
        }
 
        private static bool ParseParameters(string[] args)
        {
            foreach (var arg in args)
            {
                if (arg.StartsWith("REPO:", StringComparison.OrdinalIgnoreCase))
                {
                    //TODO: store Url
                }
                else if (arg.StartsWith("USERNAME:", StringComparison.OrdinalIgnoreCase))
                {
                    //TODO: store Username
                }
                else if (arg.StartsWith("PASSWORD:", StringComparison.OrdinalIgnoreCase))
                {
                    //TODO: store Password
                }
                else if (arg.StartsWith("DATA:", StringComparison.OrdinalIgnoreCase))
                {
                    var data = GetParameterValue(arg).Replace("\"\"", "\"");
 
                    var settings = new JsonSerializerSettings { DateFormatHandling = DateFormatHandling.IsoDateFormat };
                    var serializer = JsonSerializer.Create(settings);
                    var jreader = new JsonTextReader(new StringReader(data));
                    var taskData = serializer.Deserialize(jreader) as JObject;
 
                    //TODO: get values from taskData
                }
            }
 
            return true;
        }
 
        private static string GetParameterValue(string arg)
        {
            return arg.Substring(arg.IndexOf(":", StringComparison.Ordinal) + 1).TrimStart(new[] { '\'', '"' }).TrimEnd(new[] { '\'', '"' });
        }
    }
}

Starting a new task

You can start your custom task (or one of the built-in tasks of Sense/Net ECM) from your code using the Task Management API. In the core product, for example, when a user uploads a document, the preview provider NodeObserver registers a new task to generate the first couple of preview images.

TaskManager.RegisterTask("TaskName", priority, taskData);

The parameters are the following:

  • Task name: this should be the name of your custom task executor tool, without the .exe extension.
  • Priority: one of the TaskPriority values (see above).
  • Task data: free text (string) containing all the information that the executor needs. The executor tool will get this data without change, and you are responsible for parsing the data (our recommendation is to use a JSON formatted string).
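
Following the JSON recommendation, the task data string could be built with Json.NET, the same library the executor example uses for parsing. This is only a sketch: `PageRangeTaskData` and the `StartPage` and `EndPage` property names are hypothetical, while `Id` matches the property that the finalizer example later in this article reads.

```csharp
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical helper that builds the task data for a page-range job.
public static class PageRangeTaskData
{
    public static string Create(int contentId, int startPage, int endPage)
    {
        var data = new JObject
        {
            { "Id", contentId },
            { "StartPage", startPage }, // assumed property name
            { "EndPage", endPage }      // assumed property name
        };

        // Compact, single-line JSON is easier to pass on as a command line argument.
        return data.ToString(Formatting.None);
    }
}

// Registering the task with the API call shown above could then look like this:
// TaskManager.RegisterTask("MyTaskExecutor", TaskPriority.Normal,
//     PageRangeTaskData.Create(42, 1, 10));
```

The executor receives exactly this string in its DATA parameter, so both sides only need to agree on the property names.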

Task finalizers

It is possible to execute custom code when a task finishes, without writing a custom executor. Developers may inject their custom code even into the finalization process of built-in tasks. The finalizer executes on the server and receives all the information about the task and the execution result. To write a finalizer, create a class that implements the ITaskFinalizer interface. It contains a method that is called when a task finishes. The finalizer method receives an object of type SnTaskResult that contains the following fields:

  • AgentName: unique name of the agent that executed the task.
  • Task: task object that was registered. It contains the full task data provided by the code that registered the task.
  • ResultCode: result code of the executor tool.
  • ResultData: additional information written to the Console by the task executor with the prefix ResultData:.
  • Error: error object containing any error that occurred during task execution and was written to the Console by the task executor with the prefix ERROR:.

A task finalizer must also define the task(s) it was created for. This is necessary because the system does not call every task finalizer when a task finishes, only those that were 'registered' for that task type. This is done using the following method in the finalizer:

  • GetSupportedTaskNames: a list of task names that this finalizer supports.

Example

The following example shows how you can execute custom code (e.g. logging or registering new tasks) when a task finishes.

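using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
// plus the Sense/Net namespaces that contain ITaskFinalizer, SnTaskResult and Logger

public class MyCustomTaskFinalizer : ITaskFinalizer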
{
    public void Finalize(SnTaskResult result)
    {
        // not enough information
        if (result.Task == null || string.IsNullOrEmpty(result.Task.TaskData))
            return;
 
        // not a task that interests us, or the task was executed successfully without an error message
        // (the name checked here must be one of the names returned by GetSupportedTaskNames)
        if (string.Compare(result.Task.Type, "MyTaskName", StringComparison.InvariantCulture) != 0 || (result.Successful && result.Error == null))
            return;
 
        try
        {
            if (result.Error != null)
            {
                // log the error message and details for admins
            }
 
            // deserialize the same task data that was created when the task was registered
            var settings = new JsonSerializerSettings { DateFormatHandling = DateFormatHandling.IsoDateFormat };
            var serializer = JsonSerializer.Create(settings);
 
            using (var jreader = new JsonTextReader(new System.IO.StringReader(result.Task.TaskData)))
            {
                var taskData = serializer.Deserialize(jreader) as JObject;
                var contentId = taskData["Id"].Value<int>();
 
                // do something
            }
        }
        catch (Exception ex)
        {
            Logger.WriteException(ex);
        }
    }
 
    public string[] GetSupportedTaskNames()
    {
        return new string[] { "MyTaskName" };
    }
}
