Task

ClusterShell Task module.

Simple example of use:

>>> from ClusterShell.Task import task_self
>>> from ClusterShell.NodeSet import NodeSet
>>>
>>> # get task associated with calling thread
... task = task_self()
>>>
>>> # add a command to execute on distant nodes
... task.shell("/bin/uname -r", nodes="tiger[1-30,35]")
<ClusterShell.Worker.Ssh.WorkerSsh object at 0x7f41da71b890>
>>>
>>> # run task in calling thread
... task.resume()
>>>
>>> # get results
... for buf, nodelist in task.iter_buffers():
...     print("%s %s" % (NodeSet.fromlist(nodelist), buf))
...
class ClusterShell.Task.Task(thread=None, defaults=None)

The Task class defines an essential ClusterShell object which aims to execute commands in parallel and easily get their results.

More precisely, a Task object manages a coordinated (i.e. with respect to its current parameters) collection of independent parallel Worker objects. See ClusterShell.Worker.Worker for further details on ClusterShell Workers.

Always bound to a specific thread, a Task object acts like a “thread singleton”. So most of the time, and especially for single-threaded applications, you can get the current task object with the following top-level Task module function:

>>> task = task_self()

However, if you want to create a task in a new thread, use:

>>> task = Task()

To create or get the instance of the task associated with the thread object thr (threading.Thread):

>>> task = Task(thread=thr)

To submit a command to execute locally within task, use:

>>> task.shell("/bin/hostname")

To submit a command to execute to some distant nodes in parallel, use:

>>> task.shell("/bin/hostname", nodes="tiger[1-20]")

The previous examples submit commands to execute but do not allow result interaction during their execution. For your program to interact during command execution, it has to define event handlers that will listen for local or remote events. These handlers are based on the EventHandler class, defined in ClusterShell.Event. The following example shows how to submit a command on a cluster with a registered event handler:

>>> task.shell("uname -r", nodes="node[1-9]", handler=MyEventHandler())

Run task in its associated thread (will block only if the calling thread is the task associated thread):

>>> task.resume()

or:

>>> task.run()

You can also pass arguments to task.run() to schedule a command exactly like in task.shell(), and run it:

>>> task.run("hostname", nodes="tiger[1-20]", handler=MyEventHandler())

A common need is to set a maximum delay for command execution, especially when the command completion time is not known. Doing this with a ClusterShell Task is very straightforward. To limit the execution time on each node, use the timeout parameter of the shell() or run() methods to set a delay in seconds, like:

>>> task.run("check_network.sh", nodes="tiger[1-20]", timeout=30)

You can then either use Task’s iter_keys_timeout() method after execution to see on what nodes the command has timed out, or listen for ev_timeout() events in your event handler.
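
For example, after the run above, timed-out nodes can be inspected directly (a sketch using the method just described):

>>> for key in task.iter_keys_timeout():
...     print("command timed out on %s" % key)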

To get command results, you can use Task’s iter_buffers() method for standard output and iter_errors() for standard error after command execution (common output contents are automatically gathered), or you can listen for ev_read() and ev_error() events in your event handler to get live command output.

To get command return codes, you can either use Task’s iter_retcodes(), node_retcode() and max_retcode() methods after command execution, or listen for ev_hup() events in your event handler.
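
For instance, return codes can be grouped by node set after execution (a sketch based on iter_retcodes() as described above; NodeSet is imported from ClusterShell.NodeSet):

>>> for rc, keys in task.iter_retcodes():
...     print("rc=%d on %s" % (rc, NodeSet.fromlist(keys)))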

__init__(thread=None, defaults=None)

Initialize a Task, creating a new non-daemonic thread if needed.

static __new__(thread=None, defaults=None)

For a task bound to a specific thread, this class acts like a “thread singleton”: a new-style class is used and new objects are only instantiated if needed.

__weakref__

list of weak references to the object (if defined)

abort(kill=False)

Abort a task. Aborting a task removes (and stops when needed) all workers. If optional parameter kill is True, the task object is unbound from the current thread, so calling task_self() creates a new Task object.

copy(source, dest, nodes, **kwargs)

Copy local file to distant nodes.
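
A minimal usage sketch (the file path and node set are illustrative):

>>> task = task_self()
>>> task.copy("/etc/hosts.local", "/etc/hosts.local", nodes="node[1-3]")
>>> task.resume()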

default(default_key, def_val=None)

Return per-task value for key from the “default” dictionary. See set_default() for a list of reserved task default_keys.

default_excepthook(exc_type, exc_value, tb)

Default excepthook for a newly created Task. When an exception is raised and uncaught in a Task thread, excepthook is called; it is default_excepthook by default. Once excepthook is overridden, you can still call default_excepthook if needed.

flush_buffers()

Flush all task messages (from all task workers).

flush_errors()

Flush all task error messages (from all task workers).

info(info_key, def_val=None)

Return per-task information. See set_info() for a list of reserved task info_keys.

iter_buffers(match_keys=None)

Iterate over buffers, returning (buffer, keys) tuples. For remote workers (Ssh), keys are lists of nodes. In that case, you should use NodeSet.fromlist(keys) to get a NodeSet instance (which is more convenient and efficient).

The optional parameter match_keys adds filtering on these keys.

Usage example:

>>> for buffer, nodelist in task.iter_buffers():
...     print(NodeSet.fromlist(nodelist))
...     print(buffer)

iter_errors(match_keys=None)

Iterate over error buffers, returning (buffer, keys) tuples.

See iter_buffers().

iter_keys_timeout()

Iterate over timed out keys (i.e. nodes).

iter_retcodes(match_keys=None)

Iterate over return codes, returning (rc, keys) tuples.

The optional parameter match_keys adds filtering on these keys.

If the process exits normally, the return code is its exit status. If the process is terminated by a signal, the return code is 128 + the signal number.

join()

Suspend execution of the calling thread until the target task terminates, unless the target task has already terminated.

key_buffer(key)

Get buffer for a specific key. When the key is associated with multiple workers, the resulting buffer will contain the content of all those workers, which may overlap. This method returns an empty buffer if the key is not found in any worker.

key_error(key)

Get error buffer for a specific key. When the key is associated with multiple workers, the resulting buffer will contain the content of all those workers, which may overlap. This method returns an empty error buffer if the key is not found in any worker.

key_retcode(key)

Return the return code for a specific key. When the key is associated with multiple workers, return the max return code from these workers. Raises a KeyError if the key is not found in any finished worker.

load_topology(topology_file)

Load propagation topology from provided file.

On success, task.topology is set to a corresponding TopologyTree instance.

On failure, task.topology is left untouched and a TopologyError exception is raised.
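
A usage sketch (the file path is illustrative; TopologyError is assumed to be importable from ClusterShell.Topology):

>>> from ClusterShell.Topology import TopologyError
>>> try:
...     task_self().load_topology("/etc/clustershell/topology.conf")
... except TopologyError:
...     print("invalid topology file")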

max_retcode()

Get the max return code encountered during the last run, or None in the following cases:
  • all commands timed out,
  • no command was executed.

If the process exits normally, the return code is its exit status. If the process is terminated by a signal, the return code is 128 + the signal number.

node_buffer(key)

Get buffer for a specific key. When the key is associated with multiple workers, the resulting buffer will contain the content of all those workers, which may overlap. This method returns an empty buffer if the key is not found in any worker.

node_error(key)

Get error buffer for a specific key. When the key is associated with multiple workers, the resulting buffer will contain the content of all those workers, which may overlap. This method returns an empty error buffer if the key is not found in any worker.

node_retcode(key)

Return the return code for a specific key. When the key is associated with multiple workers, return the max return code from these workers. Raises a KeyError if the key is not found in any finished worker.

num_timeout()

Return the number of timed out “keys” (i.e. nodes).

port(handler=None, autoclose=False)

Create a new task port. A task port is an abstraction object to deliver messages reliably between tasks.

Basic rules:
  • A task can send messages to another task port (thread safe).
  • A task can receive messages from an acquired port either by setting up a notification mechanism or using a polling mechanism that may block the task waiting for a message sent on the port.
  • A port can be acquired by one task only.

If handler is set to a valid EventHandler object, the port is a send-once port, i.e. a message sent to this port generates an ev_msg event notification issued by the port’s task. If handler is not set, the task can only receive messages on the port by calling port.msg_recv().
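
A minimal inter-task messaging sketch (MsgHandler is illustrative; the ev_msg() callback signature and msg_send(), the sending counterpart of the msg_recv() method mentioned above, are assumptions to be checked against the port API):

>>> from ClusterShell.Event import EventHandler
>>> class MsgHandler(EventHandler):
...     def ev_msg(self, port, msg):
...         print("received: %s" % msg)
>>> port = task.port(handler=MsgHandler())
>>> # from another task (thread-safe), deliver a message to this port:
>>> port.msg_send("hello")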

rcopy(source, dest, nodes, **kwargs)

Copy distant file or directory to local node.

resume(timeout=None)

Resume task. If the task is task_self(), workers are executed in the calling thread, so this method will block until all (non-autoclosing) workers have finished. This is always the case for a single-threaded application (e.g. one that doesn’t create any Task() instance other than task_self()). Otherwise, the current thread doesn’t block; in that case, you may then want to call task_wait() to wait for completion.

Warning: the timeout parameter can be used to set a hard limit on task execution time (in seconds). In that case, a TimeoutError exception is raised if this delay is reached. Its value is 0 by default, which means no task time limit (TimeoutError is never raised). In order to set a maximum delay for individual command execution, use Task.shell()’s timeout parameter instead.
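
For example, a hard task time limit can be handled as follows (a sketch assuming TimeoutError is importable from ClusterShell.Task):

>>> from ClusterShell.Task import TimeoutError
>>> try:
...     task.resume(timeout=60)
... except TimeoutError:
...     print("task did not complete within 60 seconds")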

run(command=None, **kwargs)

With arguments, it schedules a command exactly as Task.shell() would, then runs it. This is the easiest way to simply run a command.

>>> task.run("hostname", nodes="foo")

Without argument, it starts all outstanding actions. It behaves like Task.resume().

>>> task.shell("hostname", nodes="foo")
>>> task.shell("hostname", nodes="bar")
>>> task.run()

When used with a command, you can set a maximum delay for individual command execution with the help of the timeout parameter (see Task.shell()’s parameters). You can then listen for ev_timeout() events in your Worker event handlers, or use num_timeout() or iter_keys_timeout() afterwards. But when used as an alias to Task.resume(), the timeout parameter sets a hard limit on task execution time; in that case, a TimeoutError exception is raised if this delay is reached.
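
The two timeout semantics can be contrasted as follows (a sketch based on the behavior described above; the command and node set are illustrative):

>>> # per-command timeout: timed-out nodes reported via iter_keys_timeout()
>>> task.run("uptime", nodes="foo[1-5]", timeout=10)
>>>
>>> # hard task time limit: may raise TimeoutError
>>> task.shell("uptime", nodes="foo[1-5]")
>>> task.run(timeout=10)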

running()

Return True if the task is running.

schedule(*args, **kwargs)

Schedule a worker for execution, i.e. add the worker to the task running loop. The worker will start processing immediately if the task is running (e.g. when called from an event handler), or as soon as the task is started otherwise. Only useful for manually instantiated workers, for example:

>>> from ClusterShell.Worker.Ssh import WorkerSsh
>>> task = task_self()
>>> worker = WorkerSsh("node[2-3]", None, 10, command="/bin/ls")
>>> task.schedule(worker)
>>> task.resume()

set_default(default_key, value)

Set task value for specified key in the dictionary “default”. Users may store their own task-specific key, value pairs using this method and retrieve them with default().

Task default_keys are:
  • “stderr”: Boolean value indicating whether to enable stdout/stderr separation when using task.shell(), if not specified explicitly (default: False).
  • “stdout_msgtree”: Whether to instantiate standard output MsgTree for automatic internal gathering of result messages coming from Workers (default: True).
  • “stderr_msgtree”: Same for stderr (default: True).
  • “engine”: Used to specify an underlying Engine explicitly (default: “auto”).
  • “port_qlimit”: Size of port messages queue (default: 32).
  • “worker”: Worker-based class used when spawning workers through shell()/run().

Unlike set_info(), set_default() immediately updates the underlying dictionary in a thread-safe manner, whether it is called from the task’s thread or not. This method doesn’t wake up the engine when called.
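
For example, storing and retrieving a user-defined default (a round-trip sketch using the methods described here; the key name is illustrative):

>>> task.set_default("myapp_mode", "fast")
>>> task.default("myapp_mode")
'fast'
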
set_info(*args, **kwargs)

Set a per-task value for a specific information key. Key, value pairs can be passed to the engine and/or workers. Users may store their own task-specific info key, value pairs using this method and retrieve them with info().

The following example changes the fanout value to 128:
>>> task.set_info('fanout', 128)
The following example enables debug messages:
>>> task.set_info('debug', True)
Task info_keys are:
  • “debug”: Boolean value indicating whether to enable library debugging messages (default: False).
  • “print_debug”: Debug messages processing function. This function takes 2 arguments: the task instance and the message string (default: an internal function doing standard print).
  • “fanout”: Max number of registered clients in Engine at a time (default: 64).
  • “grooming_delay”: Message maximum end-to-end delay requirement used for traffic grooming, in seconds as float (default: 0.5).
  • “connect_timeout”: Time in seconds to wait for connecting to remote host before aborting (default: 10).
  • “command_timeout”: Time in seconds to wait for a command to complete before aborting (default: 0, which means unlimited).
Unlike set_default(), the underlying info dictionary is only modified from the task’s thread. So calling set_info() from another thread queues the request for later application (at run time) using the task dispatch port. When received, the request wakes up the engine if the task is running, and the info dictionary is then updated.

shell(command, **kwargs)

Schedule a shell command for local or distant parallel execution. This essential method creates a local or remote Worker (depending on the presence of the nodes parameter) and immediately schedules it for execution in the task’s runloop. So, if the task is already running (i.e. called from an event handler), the command is started immediately, provided that current execution constraints are met (e.g. the fanout value). If the task is not running, the command is not started but scheduled for later execution. See resume() to start the task runloop.

The following optional parameters are passed to the underlying local or remote Worker constructor:

  • handler: EventHandler instance to notify (on event) – default is no handler (None)
  • timeout: command timeout delay expressed in second using a floating point value – default is unlimited (None)
  • autoclose: if set to True, the underlying Worker is automatically aborted as soon as all other non-autoclosing task objects (workers, ports, timers) have finished – default is False
  • stderr: separate stdout/stderr if set to True – default is False.

Local usage:

  task.shell(command [, key=key] [, handler=handler] [, timeout=secs]
             [, autoclose=enable_autoclose] [, stderr=enable_stderr])

Distant usage:

  task.shell(command, nodes=nodeset [, handler=handler] [, timeout=secs]
             [, autoclose=enable_autoclose] [, tree=None|False|True]
             [, remote=False|True] [, stderr=enable_stderr])

Example:

>>> task = task_self()
>>> task.shell("/bin/date", nodes="node[1-2345]")
>>> task.resume()

suspend()

Suspend task execution. This method may be called from another task (it is thread-safe). Returns False if the task cannot be suspended (e.g. if it’s not running), or True if the task has been successfully suspended. To resume a suspended task, use task.resume().

class tasksyncmethod

Class encapsulating a function that checks whether the calling task is running or is the current task, allowing it to be used as a decorator that makes the wrapped task method thread-safe.

__weakref__

list of weak references to the object (if defined)

Task.timer(fire, handler, interval=-1.0, autoclose=False)

Create a timer bound to this task that fires at a preset time in the future by invoking the ev_timer() method of handler (a provided EventHandler object). Timers can fire either only once or repeatedly at fixed time intervals. Repeating timers can also have their next firing time manually adjusted.

The mandatory parameter fire sets the firing delay in seconds.

The optional parameter interval sets the firing interval of the timer. If not specified, the timer fires once and is then automatically invalidated.

Time values are expressed in seconds using floating point values. Precision is implementation (and system) dependent.

The optional parameter autoclose, if set to True, creates an “autoclosing” timer: it will be automatically invalidated as soon as all other non-autoclosing task objects (workers, ports, timers) have finished. The default value is False, which means the timer will keep the task’s runloop running until it is invalidated.

Return a new EngineTimer instance.

See ClusterShell.Engine.Engine.EngineTimer for more details.
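
A repeating-timer sketch (Ticker is illustrative; the ev_timer(self, timer) callback signature is an assumption based on the EventHandler interface):

>>> from ClusterShell.Event import EventHandler
>>> class Ticker(EventHandler):
...     def ev_timer(self, timer):
...         # fires 1 second after the task starts, then every 5 seconds
...         print("tick")
>>> timer = task_self().timer(1.0, handler=Ticker(), interval=5.0)
>>> task_self().resume()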

classmethod Task.wait(from_thread)

Class method that blocks the calling thread until all tasks have finished (from a ClusterShell point of view, i.e. their task.resume() calls have returned). It doesn’t necessarily mean that the associated threads have finished.

ClusterShell.Task.task_self(defaults=None)

Return the current Task object, corresponding to the caller’s thread of control (a Task object is always bound to a specific thread). This function, provided as a convenience, is available in the top-level ClusterShell.Task package namespace.

ClusterShell.Task.task_wait()

Suspend execution of the calling thread until all tasks terminate, unless all tasks have already terminated. This function is provided as a convenience and is available in the top-level ClusterShell.Task package namespace.
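
A sketch of the multi-task pattern this supports (the command and node set are illustrative):

>>> from ClusterShell.Task import Task, task_wait
>>> task1 = Task()                   # task bound to a new thread
>>> task1.shell("uptime", nodes="foo[1-5]")
>>> task1.resume()                   # doesn't block the calling thread
>>> task_wait()                      # blocks until all tasks terminate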

ClusterShell.Task.task_terminate()

Destroy the Task instance bound to the current thread. A subsequent call to task_self() will create a new Task object. Not to be called from a signal handler. This function, provided as a convenience, is available in the top-level ClusterShell.Task package namespace.

ClusterShell.Task.task_cleanup()

Cleanup routine that destroys all created tasks. This function, provided as a convenience, is available in the top-level ClusterShell.Task package namespace. It is mainly used for testing purposes and should be avoided otherwise. task_cleanup() may be called from any thread but not from a signal handler.