disco.util — Helper functions

This module provides utility functions that are mostly used by Disco internally.

The external() function below comes in handy if you use the Disco external interface.

class disco.util.DefaultDict
Like a defaultdict, but calls the default_factory with the key argument.
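
The difference from collections.defaultdict can be illustrated with a small stand-in class. This is a minimal sketch, not Disco's implementation: the name KeyedDefaultDict is hypothetical, and storing the computed value in the dictionary is an assumption of the sketch.

```python
class KeyedDefaultDict(dict):
    """Hypothetical stand-in for illustration; not Disco's actual class."""

    def __init__(self, default_factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # collections.defaultdict would call default_factory() with no
        # arguments; here the missing key is passed to the factory.
        value = self.default_factory(key)
        # Assumption of this sketch: keep the computed value for later lookups.
        self[key] = value
        return value


# The factory is called as len("disco") on the first lookup.
lengths = KeyedDefaultDict(len)
first_lookup = lengths["disco"]
```

Passing the key to the factory is useful when the default value depends on what was looked up, for example when lazily opening one resource per key.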
disco.util.data_err(message, url)

Raises a data error with the reason message. This signals the master to re-run the task on another node. If the same task raises a data error on several different nodes, the master terminates the job. A data error should therefore be raised only when the failure is likely to be temporary.

Typically this function is used by map readers to signal a temporary failure in accessing an input file.
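
The pattern described above might look roughly like the following. This is a sketch under stated assumptions: DataError and the local data_err are stand-ins that only mirror the documented signature data_err(message, url) so that the example runs without Disco installed, and read_input is a hypothetical helper, not part of the Disco API.

```python
class DataError(Exception):
    """Hypothetical stand-in for the error raised by disco.util.data_err."""


def data_err(message, url):
    # Stand-in with the same signature as disco.util.data_err(message, url).
    raise DataError("%s (input: %s)" % (message, url))


def read_input(path, url):
    """Hypothetical reader: return the lines of a local input file."""
    try:
        with open(path) as handle:
            return handle.read().splitlines()
    except IOError as exc:
        # The input may still be readable from another node, so report a
        # temporary failure rather than a fatal one.
        data_err("Could not read input: %s" % exc, url)
```

In a real job the stand-in would be replaced by disco.util.data_err, and a failure that cannot succeed on any node would be better reported with err().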

disco.util.err(message)
Raises an exception with the reason message. This terminates the job.
disco.util.external(files)

Packages an external program, together with other files it depends on, to be used either as a map or reduce function.

Parameter: files – a list of file paths, the first of which must point to the actual executable.

This example shows how to use an external program, cmap, which needs a configuration file cmap.conf, as the map function:

disco.new_job(input=["disco://localhost/myjob/file1"],
              fun_map=disco.util.external(["/home/john/bin/cmap",
                                           "/home/john/cmap.conf"]))

All files listed in files are copied to the same directory, so any directory hierarchy among them is lost. For more information, see Disco External Interface.

disco.util.jobname(address)

Extracts the job name from the given address.

This function is particularly useful for using the methods in disco.core.Disco given only results of a job.

disco.util.msg(message)
Sends the string message to the master for logging. The message is shown on the web interface. To prevent a rogue job from overwhelming the master, the maximum message size is 255 characters and a job is allowed to send at most 10 messages per second.
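
A job that logs heavily may want to stay within these limits on its own side. The following is a minimal sketch, assuming only the two numbers quoted above; MessageBudget is a hypothetical helper, and how the master actually enforces the limits is not described here.

```python
import time
from collections import deque

MAX_MESSAGE_CHARS = 255        # size limit quoted above
MAX_MESSAGES_PER_SECOND = 10   # rate limit quoted above


class MessageBudget(object):
    """Hypothetical client-side guard around a send function."""

    def __init__(self, send, clock=time.time):
        self.send = send        # for example disco.util.msg in a real job
        self.clock = clock
        self.sent_at = deque()  # timestamps of recently sent messages

    def log(self, message):
        now = self.clock()
        # Forget messages that were sent a second or more ago.
        while self.sent_at and now - self.sent_at[0] >= 1.0:
            self.sent_at.popleft()
        if len(self.sent_at) >= MAX_MESSAGES_PER_SECOND:
            return False        # drop the message instead of exceeding the rate
        # Truncate so the message fits the size limit.
        self.send(message[:MAX_MESSAGE_CHARS])
        self.sent_at.append(now)
        return True
```

Dropping excess messages is a design choice of the sketch; a job could equally buffer them and send a summary later.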
disco.util.parse_dir(dir_url, partid=None)

Translates a directory URL to a list of normal URLs.

This function might be useful for other programs that need to parse results returned by disco.core.Disco.wait(), for instance.

Parameter: dir_url – a directory URL, such as dir://nx02/test_simple@12243344