ocrd.processor.builtin.dummy_processor module

class ocrd.processor.builtin.dummy_processor.DummyProcessor(*args, **kwargs)

Bases: Processor

Bare-bones processor that creates PAGE-XML and optionally copies files from the input group to the output group.

Instantiate, but do not process. Unless list_resources, show_resource, show_help, show_version, dump_json or dump_module_dir is set, set up for processing (parse and validate parameters, enter the workspace directory).

Parameters:

workspace (Workspace) – The workspace to process. Can be None even for processing (esp. on multiple workspaces), but then needs to be set before running.

Keyword Arguments:
  • ocrd_tool (string) – JSON of the ocrd-tool description for that processor. Can be None for processing, but needs to be set before running.

  • parameter (string) – JSON of the runtime choices for ocrd-tool parameters. Can be None even for processing, but then needs to be set before running.

  • input_file_grp (string) – comma-separated list of METS fileGrps used for input.

  • output_file_grp (string) – comma-separated list of METS fileGrps used for output.

  • page_id (string) – comma-separated list of METS physical page IDs to process (or empty for all pages).

  • show_resource (string) – If not None, then instead of processing, resolve given resource by name and print its contents to stdout.

  • list_resources (boolean) – If true, then instead of processing, find all installed resource files in the search paths and print their path names.

  • show_help (boolean) – If true, then instead of processing, print a usage description including the standard CLI and all of this processor’s ocrd-tool parameters and docstrings.

  • subcommand (string) – ‘worker’ or ‘server’; only used here to produce the right --help output.

  • show_version (boolean) – If true, then instead of processing, print information on this processor’s version and OCR-D version. Exit afterwards.

  • dump_json (boolean) – If true, then instead of processing, print ocrd_tool on stdout.

  • dump_module_dir (boolean) – If true, then instead of processing, print moduledir on stdout.
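
The keyword arguments above can be read as two groups: string-valued inputs (JSON or comma-separated lists) and options that replace processing with some other action. The following is a minimal sketch of that reading using only the standard library. It does not use or reproduce the ocrd implementation; the helper names split_list, needs_setup and interpret, the option key example_option, and the file group names are all hypothetical.

```python
import json

# Hypothetical helpers for illustration only; none of these exist in ocrd.

# Options documented as "instead of processing" (boolean-valued).
NON_PROCESSING_FLAGS = (
    "list_resources",
    "show_help",
    "show_version",
    "dump_json",
    "dump_module_dir",
)


def split_list(value):
    """Split a comma-separated string; None or an empty string gives []."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def needs_setup(**kwargs):
    """Return True unless one of the 'instead of processing' options is set."""
    # show_resource is a string, so it counts as set whenever it is not None.
    if kwargs.get("show_resource") is not None:
        return False
    return not any(kwargs.get(flag) for flag in NON_PROCESSING_FLAGS)


def interpret(**kwargs):
    """Turn the documented string arguments into Python values."""
    return {
        # JSON of the runtime parameter choices; None is treated as "{}" here.
        "parameter": json.loads(kwargs.get("parameter") or "{}"),
        "input_file_grps": split_list(kwargs.get("input_file_grp")),
        "output_file_grps": split_list(kwargs.get("output_file_grp")),
        # An empty list stands for "all pages".
        "page_ids": split_list(kwargs.get("page_id")),
        "setup": needs_setup(**kwargs),
    }
```

For example, interpret(input_file_grp="GRP-A,GRP-B", page_id="") yields two input groups, an empty page list (all pages), and setup set to True, while needs_setup(show_version=True) returns False.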

process()

Process the workspace from the given input_file_grp to the given output_file_grp for the given page_id under the given parameter.

(This contains the main functionality and needs to be overridden by subclasses.)
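
The override pattern described above can be sketched without the library. In this sketch, BaseProcessor is a hypothetical stand-in and not the ocrd Processor class, the workspace is modelled as a plain dict mapping file group names to pages, and CopyingProcessor only imitates the "copy from input group to output group" idea. The real class works on METS files and PAGE-XML, which this example does not attempt to reproduce.

```python
class BaseProcessor:
    """Hypothetical stand-in for a processor base class (not the ocrd API)."""

    def __init__(self, input_file_grp, output_file_grp, page_id="", parameter=None):
        self.input_file_grp = input_file_grp
        self.output_file_grp = output_file_grp
        self.page_id = page_id
        self.parameter = parameter or {}

    def process(self):
        # The main functionality lives in subclasses.
        raise NotImplementedError("subclasses must override process()")


class CopyingProcessor(BaseProcessor):
    """Copies pages from the input group to the output group."""

    def __init__(self, workspace, **kwargs):
        super().__init__(**kwargs)
        # Assumed toy workspace: {file group name: {page ID: content}}
        self.workspace = workspace

    def process(self):
        # An empty page_id selects all pages.
        wanted = [p for p in self.page_id.split(",") if p]
        source = self.workspace.get(self.input_file_grp, {})
        target = self.workspace.setdefault(self.output_file_grp, {})
        for page, content in source.items():
            if not wanted or page in wanted:
                target[page] = content
```

Calling process() on the stand-in base class raises NotImplementedError, which mirrors the note that subclasses need to supply the main functionality.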