Utils

Command Line Utilities

Utility class and functions to help run commands and access the command line.

class pyhdtoolkit.utils.cmdline.CommandLine[source]

Added in version 0.2.0.

A high-level object that encapsulates the different methods for interacting with the command line.

static check_pid_exists(pid: int) → bool[source]

Added in version 0.2.0.

Check whether the given PID exists in the current process table.

Parameters: pid (int) -- The process ID whose existence should be checked.
Returns: bool -- True if the PID exists in the current process table, False otherwise.

Example

CommandLine.check_pid_exists(os.getpid())
# True
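
As a rough illustration of what such a check can look like on POSIX systems, the sketch below probes a PID by sending it signal 0 through os.kill. This is an assumption made for illustration and not necessarily how CommandLine.check_pid_exists is implemented; the helper name pid_exists is hypothetical.

```python
import os


def pid_exists(pid: int) -> bool:
    """Hypothetical helper: probe a PID on POSIX by sending it signal 0."""
    if pid <= 0:
        # 0 and negative values address process groups, not a single process
        return False
    try:
        os.kill(pid, 0)  # signal 0 only performs error checking, nothing is delivered
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


print(pid_exists(os.getpid()))
```

Treating PermissionError as "exists" is a design choice of this sketch: the operating system only reports a permission problem for a process that is actually there.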

static run(command: str, shell: bool = True, env: Mapping | None = None, timeout: float | None = None) → tuple[int | None, bytes][source]

Added in version 0.2.0.

Runs command through subprocess.Popen and returns the tuple of (returncode, stdout).

Note

stderr is redirected to stdout. The shell argument is identical to the same parameter of subprocess.Popen.

Parameters:

command (str) -- The command to run.
shell (bool) -- Same parameter as subprocess.Popen. If True, the command will be run through an intermediate shell, and variables, glob patterns, and other special shell features in the command string are processed before the command is run. Defaults to True.
env (Mapping, optional) -- A mapping that defines the environment variables for the new process.
timeout (float, optional) -- Same as the subprocess.Popen.communicate argument, the number of seconds to wait for a response before raising a TimeoutExpired exception.

Returns:

tuple[int | None, bytes] -- The tuple of (returncode, stdout). Beware that stdout is returned as bytes (e.g. b"some returned text") and must be decoded before you do anything with it, especially if you intend to log it to a file. The encoding will most likely be “utf-8”, but it can vary from system to system, which is why the output is returned undecoded.

Raises:

TimeoutExpired -- If a value was provided for timeout and the process does not terminate within timeout seconds.

Examples

CommandLine.run("echo hello")
# returns (0, b"hello\r\n") on Windows, (0, b"hello\n") on POSIX systems

import os

modified_env = os.environ.copy()
modified_env["ENV_VAR"] = "new_value"
CommandLine.run("echo $ENV_VAR", env=modified_env)
# returns (0, b"new_value\n") in a POSIX shell
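
Since the returned output is bytes, a decoding step is usually needed before logging or parsing it. The sketch below reproduces the documented behaviour (stderr merged into stdout, a timeout on communicate) with plain subprocess calls and then decodes the result. It is an illustration under assumptions, not the CommandLine.run source code, and "utf-8" is only a guess at the encoding.

```python
import subprocess
import sys

# Assumption: this mirrors the documented behaviour with plain subprocess calls;
# it is not the CommandLine.run source code.
process = subprocess.Popen(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # stderr is merged into stdout
)
raw_output, _ = process.communicate(timeout=30)  # raises TimeoutExpired if exceeded
returncode = process.returncode

# Decode explicitly before logging or parsing. "utf-8" is a guess, and
# errors="replace" keeps undecodable bytes from raising an exception.
text = raw_output.decode("utf-8", errors="replace").strip()
print(returncode, text)
```

The same decode call can be applied to the second element of the tuple returned by CommandLine.run.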

static terminate(pid: int) → bool[source]

Added in version 0.2.0.

Terminates the process corresponding to the given PID, using os.kill with signal.SIGTERM.

Parameters: pid (int) -- The ID of the process to kill.
Returns: bool -- A boolean stating the success of the operation.

Example

CommandLine.terminate(500_000)  # the default max PID is 32768 on Linux and 99999 on macOS
# returns False
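
To make the os.kill and signal.SIGTERM mechanism concrete, the following sketch starts a throwaway child process and asks it to terminate. It assumes a POSIX system and uses only the standard library; it illustrates the mechanism named in the description rather than the body of CommandLine.terminate.

```python
import os
import signal
import subprocess
import sys

# Start a throwaway child process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Ask the child to terminate by sending it SIGTERM.
os.kill(child.pid, signal.SIGTERM)
exit_code = child.wait(timeout=10)  # reap the child so it does not linger as a zombie

# On POSIX, a negative return code -N means the process was killed by signal N.
print(exit_code == -signal.SIGTERM)
```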

Decorator Utilities

Provides useful decorators.

pyhdtoolkit.utils.decorators.deprecated(message: str = '') → Callable[source]

Decorator to mark a function as deprecated. It will result in an informative DeprecationWarning being issued with the provided message when the function is used for the first time.

Parameters: message (str, optional) -- Extra information to be displayed after the deprecation notice, when the function is used. Defaults to an empty string (no extra information).
Returns: Callable -- The decorated function.

Example

@deprecated("Use 'new_alternative' instead.")
def old_function():
    return "I am old!"
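
For readers curious about how a decorator of this kind can be put together, here is a minimal sketch built on the standard warnings module. The name deprecated_sketch and the exact wording of the warning are hypothetical; the real decorator may differ in both.

```python
import functools
import warnings
from typing import Callable


def deprecated_sketch(message: str = "") -> Callable:
    """Hypothetical stand-in for a deprecation decorator, for illustration only."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)  # keep the name and docstring of the wrapped function
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated. {message}".strip(),
                category=DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated_sketch("Use 'new_alternative' instead.")
def old_function():
    return "I am old!"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = old_function()
```

Whether a repeated call warns again depends on the active warning filters, which is why the sketch forces the "always" filter while recording.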

pyhdtoolkit.utils.decorators.maybe_jit(func: Callable, **kwargs) → Callable[source]

Added in version 1.7.0.

A numba.jit decorator that does nothing if numba is not installed.

Parameters:

func (Callable) -- The function to be decorated.
**kwargs -- Additional keyword arguments are passed to the numba.jit decorator.

Returns:

Callable -- The JIT-decorated function if numba is installed, the original function otherwise.

Example

@maybe_jit
def calculations(x, y):
    return (x + y) / (x - y)
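
The "does nothing if numba is not installed" behaviour follows a common optional-dependency pattern, sketched below. The helper name optional_jit is hypothetical and the code is an assumption about how such a decorator can be written, not the maybe_jit source.

```python
from typing import Callable

try:
    import numba  # optional dependency
except ImportError:
    numba = None


def optional_jit(func: Callable, **kwargs) -> Callable:
    """Hypothetical helper: JIT-compile func if numba is available, else return it unchanged."""
    if numba is None:
        return func
    # numba.jit called with keyword arguments returns a decorator to apply to func
    return numba.jit(**kwargs)(func)


@optional_jit
def calculations(x, y):
    return (x + y) / (x - y)


print(calculations(3.0, 1.0))
```

Either way the decorated function returns the same result, so calling code does not need to know whether numba is present.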

Context Utilities

Provides useful contexts to use functions in.

pyhdtoolkit.utils.contexts.timeit(function: Callable) → NoneType[source]

Added in version 0.4.0.

Measures the time elapsed while executing the code in the context and passes it to function. The original code is from Jaime Coello de Portugal.

Parameters: function (Callable) -- Function to be executed with the elapsed time as argument. This was conceived with a lambda in mind; see the example below.
Returns: Iterator[None] -- The elapsed time as an argument for the provided function.

Example

with timeit(lambda x: logger.debug(f"Took {x} seconds")):
    some_stuff()
    some_other_stuff()
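
A context manager with this behaviour can be sketched with contextlib.contextmanager, which also explains the Iterator[None] return annotation: the generator yields once, with no value, while the block runs. The name timeit_sketch is hypothetical and the code is an illustration, not the original implementation.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def timeit_sketch(function: Callable[[float], None]) -> Iterator[None]:
    """Hypothetical stand-in: time the block and hand the duration to function."""
    start = time.perf_counter()
    try:
        yield  # the body of the with-block runs here
    finally:
        # Runs even if the block raises, so the callback always gets a duration
        function(time.perf_counter() - start)


durations = []
with timeit_sketch(durations.append):
    sum(range(100_000))
```

Here the callback simply collects the elapsed seconds in a list; a logging lambda as in the example above works the same way.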

HTCondor Monitoring

A module with utilities to query the HTCondor queue, process the returned data, and display it nicely.

Note

This module is meant to be called as a script, but some of its individual functionality is public API, and one should be able to build a different monitor script from the functions in here.

pyhdtoolkit.utils.htc_monitor.query_condor_q() → str[source]

Added in version 0.9.0.

Returns a decoded string with the result of the condor_q command, to get the status of the caller’s jobs.

Returns: str -- The utf-8 decoded string returned by the condor_q command.

pyhdtoolkit.utils.htc_monitor.read_condor_q(report: str) → tuple[list[HTCTaskSummary], ClusterSummary][source]

Added in version 0.9.0.

Splits information from different parts of the condor_q command’s output into clean, validated data structures.

Parameters: report (str) -- The utf-8 decoded string returned by the condor_q command, as returned by query_condor_q.
Returns: tuple[list[HTCTaskSummary], ClusterSummary] -- A tuple with two elements. The first element is a list of the task summaries given by condor_q, each as a validated HTCTaskSummary. The second element is a validated ClusterSummary object with the scheduler identification and summaries of the user’s as well as all users’ statistics on this scheduler cluster.

Example

condor_q_output = get_the_string_as_you_wish(...)
tasks, cluster = read_condor_q(condor_q_output)

Logging Utilities

Added in version 1.0.0.

The loguru package is used for logging throughout pyhdtoolkit, and this module provides utilities to easily set up a logger configuration. Different pre-defined formats are provided to choose from:

FORMAT1: will display the full time of the log message, its level, the calling line and the message itself. This is the default format.
FORMAT2: similar to FORMAT1, but the caller information is displayed at the end of the log line.
SIMPLE_FORMAT: minimal, displays the local time, the level and the message.

pyhdtoolkit.utils.logging.config_logger(level: str | int = 'INFO', fmt: str = '<green>{time:YYYY-MM-DD HH:mm:ss}</green> | <level>{level: <8}</level> | <cyan>{name}</cyan>:<cyan>{line}</cyan> - <level>{message}</level>', **kwargs) → None[source]

Added in version 0.8.2.

Resets the logger object from loguru, with sys.stdout as a sink and the provided format.

Parameters:

level (str | int) -- The logging level to set. Case insensitive if a string is given. The value can be any of the loguru levels or their integer equivalents. Defaults to INFO.
fmt (str) -- The format to use for the logger to display messages. Defaults to a pre-defined format in this module.
**kwargs -- Any keyword argument is transmitted to the add call.

Examples

Using the defaults and setting the logging level:

config_logger(level="DEBUG")

Specifying a custom format and setting the logging level:

from pyhdtoolkit.utils.logging import config_logger, SIMPLE_FORMAT

config_logger(level="DEBUG", fmt=SIMPLE_FORMAT)