Resulting crate
Once the application has finished, a new sub-folder named COMPSs_RO-Crate_[uuid]/ will be created under the
application’s Working Directory. This sub-folder is also known as the crate. It contains all the elements
needed to record a COMPSs application execution (that is, the application together with the datasets used
for the run), which are:
Application Source Files: as detailed by the user in the YAML configuration file, with the term sources. The main source file and all auxiliary files that the application needs (e.g. .py, .java, .class or .jar source code files, and also any installation, configuration, compilation or submission scripts, readme files, etc.) are included by the user. All application files are added to a sub-folder in the crate named application_sources/, where the sources directory locations are included with their same folder tree structure, while the individual files included are added to the root of the application_sources/ sub-folder in the crate.
Application Datasets: when data_persistence is set to True in the YAML configuration file, both the input and output datasets of the workflow are included in the crate. The input dataset is formed by the files that the workflow needs in order to run. The output dataset is formed by all the resulting files generated by the execution of the COMPSs application. A dataset/ sub-folder containing copies of all related files will be created, and the sub-directory structure will be respected. If more than a single root path is detected, a set of folders will be provided inside the dataset/ folder.
complete_graph.svg: the diagram of the workflow generated by the COMPSs runtime, as generated with the runcompss -g or --graph options.
App_Profile.json (or custom name): a set of task statistics of the application run recorded by the COMPSs runtime, as if the runcompss --output_profile=<path> option was enabled. It includes, for each resource and method executed: number of executions of the specific method, as well as maximum, average and minimum run time for the tasks. The name of the file can be customized using the --output_profile=<path> option. See also Section Schedulers.
compss_submission_command_line.txt: stores the exact command line that was used to submit the application (i.e. runcompss or enqueue_compss), including all the flags and parameters passed. This is especially important for reproducing a COMPSs application: since the workflow is created dynamically by the COMPSs runtime at run time, different input parameters could potentially change the resulting workflow.
ro-crate-info.yaml (or custom name): the YAML workflow provenance configuration file.
compss-[job_id].out: only when the execution is on a cluster. The standard output log of the job execution.
compss-[job_id].err: only when the execution is on a cluster. The standard error log of the job execution.
ro-crate-metadata.json: the RO-Crate JSON main file describing the contents of this directory (crate) in the RO-Crate specification format. You can find examples at Section Metadata examples.
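As a quick way to see what a crate declares, the metadata file can be read with any JSON parser. The following is a minimal sketch, not part of COMPSs: it assumes only the general RO-Crate layout, in which ro-crate-metadata.json holds a top-level "@graph" list of entities, each with an "@id" and an "@type". The function name list_crate_entities is hypothetical.

```python
import json
from pathlib import Path


def list_crate_entities(crate_dir):
    """Return (id, type) pairs for every entity described in the crate.

    Hypothetical helper: relies only on the RO-Crate convention that the
    metadata file contains a top-level "@graph" list of entities.
    """
    metadata_path = Path(crate_dir) / "ro-crate-metadata.json"
    with metadata_path.open(encoding="utf-8") as fh:
        metadata = json.load(fh)

    entities = []
    for entity in metadata.get("@graph", []):
        # "@type" may be a single string or a list of strings; keep it as found.
        entities.append((entity.get("@id"), entity.get("@type")))
    return entities


if __name__ == "__main__":
    import sys

    # Usage: python list_entities.py COMPSs_RO-Crate_[uuid]/
    for entity_id, entity_type in list_crate_entities(sys.argv[1]):
        print(f"{entity_type}: {entity_id}")
```

Which entities appear depends on the run, for instance whether data_persistence was enabled or whether the execution took place on a cluster.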
Tip
Since version 3.3.4, the PyCOMPSs CLI can inspect RO-Crates with the
pycompss inspect [crate_folder/ | crate.zip] command. Check the Inspect Workflow Provenance
Section for more details.
Tip
For the basic set of files always included for every application (i.e. complete_graph.svg, App_Profile.json,
compss_submission_command_line.txt, ro-crate-info.yaml, compss-[job_id].out, compss-[job_id].err),
the runtime generates a file checksum using the sha256 algorithm, as specified inside the metadata file
ro-crate-metadata.json. This checksum can be used to verify the file integrity with the sha256sum command.
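The same verification can be scripted when several files need checking at once. The sketch below is an illustration under stated assumptions, not the tool's own procedure: the property name that holds the checksum inside ro-crate-metadata.json is assumed to be "sha256" (exposed as the checksum_key parameter, so adjust it to whatever your metadata file actually uses), and each entity's "@id" is assumed to be a path relative to the crate root. The function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_crate_checksums(crate_dir, checksum_key="sha256"):
    """Map each entity that records a checksum to True (match) or False.

    Assumption: the checksum is stored under `checksum_key` in the entity,
    and "@id" is a file path relative to the crate folder.
    """
    crate_dir = Path(crate_dir)
    with (crate_dir / "ro-crate-metadata.json").open(encoding="utf-8") as fh:
        graph = json.load(fh).get("@graph", [])

    results = {}
    for entity in graph:
        expected = entity.get(checksum_key)
        entity_id = entity.get("@id")
        if not isinstance(expected, str) or not isinstance(entity_id, str):
            continue  # entity carries no checksum, nothing to verify
        file_path = crate_dir / entity_id
        # A missing file counts as a failed verification.
        results[entity_id] = (
            file_path.is_file() and sha256_of(file_path) == expected.lower()
        )
    return results
```

Any entry reported as False points to a file that is missing or whose contents differ from what the metadata records.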
Warning
The file names listed above (complete_graph.svg, App_Profile.json and compss_submission_command_line.txt)
are used automatically for the files generated when the -p or --provenance option is enabled.
Do not use these names for your own files, to prevent them from being overwritten.
You can change the name of the resulting App_Profile.json by using
the --output_profile=/path_to/file flag.
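A simple pre-flight check can catch such clashes before submitting a run. This is a minimal sketch using only the three file names given in the warning; the helper name find_reserved_name_clashes is hypothetical, and which directory to inspect is left to the caller (it is an assumption that you know where the generated files will be written).

```python
from pathlib import Path

# File names taken from the warning above; they are generated automatically
# when provenance recording is enabled.
RESERVED_NAMES = {
    "complete_graph.svg",
    "App_Profile.json",
    "compss_submission_command_line.txt",
}


def find_reserved_name_clashes(directory):
    """Return the sorted names of files in `directory` that would be overwritten."""
    directory = Path(directory)
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and entry.name in RESERVED_NAMES
    )


if __name__ == "__main__":
    clashes = find_reserved_name_clashes(".")
    if clashes:
        print("Rename or move these files before running:", ", ".join(clashes))
```

If you pass a custom name through --output_profile, add that name to the set instead of App_Profile.json.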
Warning
The complete_graph.svg workflow diagram will not be generated automatically if the workflow of your application
has more than 6500 edges, to avoid long generation times. If you want to generate the diagram anyway, you can
trigger the diagram generation manually with compss_gengraph or pycompss gengraph.