A Task stores a function, arguments, output, and metadata.
Details
A Task object is a storage class: a container that holds an
R function and any arguments to be passed to the function. It can also hold
any output returned by the function, anything printed to stdout or stderr
when the function is called, and various other metadata such as the process
id of the worker that executed the function, timestamps, and so on.
The methods for Task objects fall into two groups, roughly speaking. The
get_*() methods are used to return information about the Task, and the
register_*() methods are used to register information related to events
relevant to the Task status.
The retrieve() method is special, and returns a tibble containing all
information stored about the task. Objects further up the hierarchy use this
method to return nicely organised output that summarises the results from
many tasks.
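The split between get_*() and register_*() methods can be illustrated with a minimal base R sketch. This is not the package's implementation: make_task_sketch(), get_worker_id(), and get_times() are hypothetical names, and timestamps are keyed by state name purely for brevity.

```r
# Hypothetical stand-in for a Task: a closure that keeps private state.
make_task_sketch <- function(fun, args = list()) {
  state <- "created"
  worker_id <- NA_integer_
  times <- list(created = Sys.time())

  # Shared logic for register_*() style methods:
  # update the state, record a timestamp, optionally store a worker id.
  register <- function(new_state, id = NULL) {
    state <<- new_state
    times[[new_state]] <<- Sys.time()
    if (!is.null(id)) worker_id <<- as.integer(id)
    invisible(NULL)
  }

  list(
    # get_*() style methods only read the stored information
    get_task_state = function() state,
    get_worker_id = function() worker_id,
    get_times = function() times,
    # register_*() style methods record an event
    register_task_waiting = function() register("waiting"),
    register_task_assigned = function(id) register("assigned", id)
  )
}

task <- make_task_sketch(function(x) x + 1, list(x = 1))
task$register_task_waiting()
task$register_task_assigned(id = 1234L)
task$get_task_state()  # "assigned"
```

Keeping all mutation inside the register_*() functions means the getters never change the object, which mirrors the division of labour described above.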
Methods
Method new()
Create a new task. Conceptually, a Task is viewed as a
function that will be executed by the Worker to which it is assigned,
and it is generally expected that any resources the function requires
are passed through the arguments since the execution context will be a
different R session to the one in which the function is defined.
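The reason for passing resources through the arguments can be sketched in base R. The helper run_isolated() below is hypothetical: it only mimics a separate session by cutting the function off from the global environment, and it is not how a Worker actually executes a Task.

```r
scale_by <- 10  # exists only in the session where the function is defined

uses_global <- function(x) x * scale_by          # relies on the global
uses_args <- function(x, scale_by) x * scale_by  # everything passed explicitly

# Hypothetical helper: evaluate the function with only base R visible,
# returning the error message as a string if the call fails.
run_isolated <- function(fun, args = list()) {
  environment(fun) <- new.env(parent = baseenv())
  tryCatch(do.call(fun, args), error = function(e) conditionMessage(e))
}

isolated_fail <- run_isolated(uses_global, list(x = 2))              # an error message
isolated_ok <- run_isolated(uses_args, list(x = 2, scale_by = 10))   # 20
```

The first call fails because scale_by cannot be found outside the defining session, while the second succeeds because the value travels with the arguments.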
Method retrieve()
Retrieve a tidy summary of the task state.
Returns
A tibble containing a single row, and the following columns:
task_id: A character string specifying the task identifier
worker_id: An integer specifying the worker process id (pid)
state: A character string indicating the task status ("created", "waiting", "assigned", "running", or "done")
result: A list containing the function output, or NULL
runtime: Completion time for the task (NA if the task is not done)
fun: A list containing the function
args: A list containing the arguments
created: The time at which the task was created
queued: The time at which the task was added to a Queue
assigned: The time at which the task was assigned to a Worker
started: The time at which the Worker called the function
finished: The time at which the Worker output was returned
code: The status code returned by the callr R session (integer)
message: The message returned by the callr R session (character)
stdout: List containing the contents of stdout during function execution
stderr: List containing the contents of stderr during function execution
error: List containing NULL
Note: at present there is one field from the callr rsession::read() method
that isn't captured here, and that's the error field. I'll add that after
I've finished wrapping my head around what that actually does. The error
column, at present, is included only as a placeholder.
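Because several columns are lists, their contents have to be extracted with [[ ]] rather than read off directly. The sketch below uses a hand-built base R data frame as a stand-in for the tibble; the values are made up for illustration and only a few of the columns are shown.

```r
# Hypothetical stand-in for a single-row retrieve() result.
out <- data.frame(task_id = "task_1", state = "done", stringsAsFactors = FALSE)
out$result <- list(c(1, 4, 9))     # list column: one element per row
out$stdout <- list(character(0))   # nothing was printed in this example

nrow(out)        # 1
out$result[[1]]  # the function output itself: 1 4 9
```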
Method get_task_state()
Retrieve the task state.
Returns
A string specifying the current state of the task. Possible values are "created" (task exists), "waiting" (task exists and is waiting in a queue), "assigned" (task has been assigned to a worker but has not yet started), "running" (task is running on a worker), or "done" (task is completed and results have been assigned back to the task object).
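If the states are assumed to progress in the order listed above, a returned state string can be compared against a target state. The helper has_reached() is hypothetical and not part of the package.

```r
# Assumed ordering of the task states, earliest first.
task_states <- c("created", "waiting", "assigned", "running", "done")

# Hypothetical helper: has a task reached (or passed) the target state?
has_reached <- function(current, target) {
  match(current, task_states) >= match(target, task_states)
}

has_reached("running", "assigned")  # TRUE
has_reached("waiting", "running")   # FALSE
```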
Method register_task_created()
Register the task creation by updating internal storage.
When this method is called, the state of the Task is set to "created"
and a timestamp is recorded, registering the creation time for the task.
This method is intended to be called by Worker objects. Users should
not need to call it.
Method register_task_waiting()
Register the addition of the task to a queue by updating
internal storage. When this method is called, the state of the Task
is set to "waiting" and a timestamp is recorded, registering the time
at which the task was added to a queue. This method is intended to be
called by Worker objects. Users should not need to call it.
Method register_task_assigned()
Register the assignment of a task to a worker by updating
internal storage. When this method is called, the state of the Task
is set to "assigned" and a timestamp is recorded, registering the time
at which the task was assigned to a Worker. In addition, the
worker_id of the worker object (which is also its pid) is registered
with the task. This method is intended to be called by Worker objects.
Users should not need to call it.
Method register_task_running()
Register the start of a task on a worker by updating
internal storage. When this method is called, the state of the Task is
set to "running" and a timestamp is recorded, registering the time at
which the Worker called the task function. In addition, the worker_id
is recorded, albeit somewhat unnecessarily since this information is
likely already stored when register_task_assigned() is called. This
method is intended to be called by Worker objects. Users should not
need to call it.
Method register_task_done()
Register the completion of a task on a worker by updating
internal storage. When this method is called, the state of the Task is
set to "done" and a timestamp is recorded, registering the time at which
the Worker returned results to the Task. The results object is
read from the R session, and is stored locally by the Task at this time.
This method is intended to be called by Worker objects. Users should
not need to call it.
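The kind of information stored when a task finishes can be sketched in base R. This is only an illustration under simplifying assumptions: run_and_capture() is a hypothetical helper that runs the function in the current session and uses capture.output(), whereas the package reads the results back from a separate R session.

```r
# Hypothetical helper: run a function, keeping its return value,
# anything it printed to stdout, and a completion timestamp.
run_and_capture <- function(fun, args = list()) {
  result <- NULL
  stdout <- capture.output(result <- do.call(fun, args))
  list(result = result, stdout = stdout, finished = Sys.time())
}

done <- run_and_capture(
  function(x) {
    cat("hello\n")  # printed output, captured rather than shown
    x^2
  },
  list(x = 3)
)

done$result  # 9
done$stdout  # "hello"
```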