Package Curations
Curations correct invalid or missing package metadata and set the concluded license for packages.
You can use the curations.yml example as the base configuration file for your scans.
When to Use Curations
Curations can be used to:
- correct invalid or missing package metadata such as:
  - package source code repository.
  - tag or revision (SHA1) for a specific package version.
  - binary or source artifacts.
  - declared license.
  - package description or URL to its homepage.
- set the concluded license for a package:
  - concluded license is the license applicable to a package dependency defined as an SPDX license expression.
- set the is_metadata_only flag:
  - metadata-only packages, such as Maven BOM files, do not have any source code. Thus, when the flag is set, the downloader just skips the download and the scanner skips the scan. Also, any evaluator rule may optionally skip its execution.
- set the is_modified flag:
  - it indicates whether files of this package have been modified compared to the original files, e.g., in case of a fork of an upstream Open Source project, or a copy of the code in this project's repository.
- set the declared_license_mapping property:
  - Packages may have declared license string values which cannot be parsed to SpdxExpressions. In some cases, this can be fixed by mapping these strings to a valid license. If multiple curations declare license mappings, they get combined into a single mapping. Thus, multiple curations can contribute to the declared license mapping for the package. The effect of its application can be seen in the declared_license_processed property of the respective curated package.
- set the source_code_origins property:
  - Override the source code origins priority configured in the downloader configuration by the given one. Possible values: VCS, ARTIFACT.
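As a rough illustration of how mappings from several curations can combine, the sketch below shows two curation entries that both apply to the same package version, each contributing one mapping. The package id and the declared license strings are made up for illustration; only the declared_license_mapping key and the general curation layout are taken from this page.

```yaml
# Hypothetical package id and license strings, for illustration only.
- id: "PyPI::example-package"          # no version: applies to all versions
  curations:
    comment: "Maps a free-text license string to an SPDX identifier."
    declared_license_mapping:
      "BSD License": "BSD-3-Clause"
- id: "PyPI::example-package:1.0.0"    # applies to version 1.0.0 only
  curations:
    comment: "Maps an additional string that only appears in version 1.0.0."
    declared_license_mapping:
      "Apache Software License 2": "Apache-2.0"
```

Under these assumptions, version 1.0.0 would be processed with both mappings, and the result should be visible in the declared_license_processed property of the curated package.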
The sections below explain how to create curations in the curations.yml file which, if passed to the analyzer, is applied to all package metadata found in the analysis.
If a license detected in the source code of a package needs to be corrected, add a license finding curation in the .ort.yml file for the project.
Curations Basics
To discover the source code of the dependencies of a package, ORT relies on the package metadata. Often the metadata contains information on how to locate the source code, but not always. In many cases, the package metadata provides no VCS information, points to outdated repositories, or refers to repositories that are not correctly tagged. Because it is not always possible to fix this information in upstream packages, ORT offers a curation mechanism for metadata.
These curations can be configured in a YAML file passed to the analyzer. The data from the curations file amends the metadata provided by the packages themselves. This way, it is possible to fix broken VCS URLs or provide the location of source artifacts.
Hint:
If the concluded_license and the authors are curated, the package will be skipped during the scan step, as no more information from the scanner is required. This requires the skipConcluded scanner option to be enabled in the config.yml.
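A minimal sketch of what enabling this option might look like in $ORT_CONFIG_DIR/config.yml follows. The nesting under ort and scanner is an assumption based on the layout of ORT's reference configuration file, so verify it against the version you are running.

```yaml
# Sketch only: the key nesting is assumed, check your ORT version's reference configuration.
ort:
  scanner:
    # Skip scanning packages whose concluded license and authors are already curated.
    skipConcluded: true
```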
A curation file consists of one or more id entries:
- id: "Maven:com.example.app:example:0.0.1"
  curations:
    comment: "An explanation why the curation is needed or the reasoning for a license conclusion"
    purl: "pkg:Maven/com.example.app/example@0.0.1?arch=arm64-v8a#src/main"
    authors:
    - "Name of one author"
    - "Name of another author"
    cpe: "cpe:2.3:a:example-org:example-package:0.0.1:*:*:*:*:*:*:*"
    concluded_license: "Valid SPDX license expression to override the license findings."
    declared_license_mapping:
      "license a": "Apache-2.0"
    description: "Curated description."
    homepage_url: "http://example.com"
    binary_artifact:
      url: "http://example.com/binary.zip"
      hash:
        value: "ddce269a1e3d054cae349621c198dd52"
        algorithm: "MD5"
    source_artifact:
      url: "http://example.com/sources.zip"
      hash:
        value: "ddce269a1e3d054cae349621c198dd52"
        algorithm: "MD5"
    vcs:
      type: "Git"
      url: "http://example.com/repo.git"
      revision: "1234abc"
      path: "subdirectory"
    is_metadata_only: true
    is_modified: true
    source_code_origins: [ARTIFACT, VCS]
The list of available options for curations is defined in PackageCurationData.kt.
Command Line
To make ORT use the curations.yml file, place it at the default location of $ORT_CONFIG_DIR/curations.yml and then run the analyzer:
cli/build/install/ort/bin/ort analyze \
  -i [source-code-of-project-dir] \
  -o [analyzer-output-dir]
As an alternative to a single file, curations may also be split across multiple files below a directory, by default $ORT_CONFIG_DIR/curations.
File and directory package curation providers may also be configured as FilePackageCurationProviders in $ORT_CONFIG_DIR/config.yml.
Similarly, ORT can use ClearlyDefined and SW360 as sources for curated metadata.
See the reference configuration file for examples.
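For orientation, a provider configuration in $ORT_CONFIG_DIR/config.yml might look roughly like the sketch below. The key names (packageCurationProviders, type, id, options, path, mustExist) are assumptions modeled on the reference configuration and have changed between ORT versions; the provider id and the path are hypothetical. Treat the reference configuration file of your ORT version as authoritative.

```yaml
# Sketch only: key names are assumed and may differ in your ORT version.
ort:
  packageCurationProviders:
    # Read curations from an explicitly configured file.
    - type: File
      id: "ProjectCurations"               # hypothetical provider id
      options:
        path: "/path/to/curations.yml"     # hypothetical path
        mustExist: true
    # Additionally query ClearlyDefined for curated metadata.
    - type: ClearlyDefined
```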
To override curations, e.g. for testing them locally, you can also pass a curations.yml file or a curations directory via the --package-curations-file / --package-curations-dir options of the evaluator:
cli/build/install/ort/bin/ort evaluate \
  -i [scanner-output-dir]/scan-result.yml \
  -o [evaluator-output-dir] \
  --license-classifications-file $ORT_CONFIG_DIR/license-classifications.yml \
  --package-curations-file $ORT_CONFIG_DIR/curations.yml \
  --package-curations-dir $ORT_CONFIG_DIR/curations \
  --rules-file $ORT_CONFIG_DIR/evaluator.rules.kts
Example
---
# Example for a complete curation object:
#- id: "Maven:org.hamcrest:hamcrest-core:1.3"
#  curations:
#    comment: "An explanation why the curation is needed or the reasoning for a license conclusion."
#    concluded_license: "Apache-2.0 OR BSD-3-Clause" # Valid SPDX license expression to override the license findings.
#    declared_license_mapping:
#      "Copyright (C) 2013, Martin Journois": "NONE"
#      "BSD": "BSD-3-Clause"
#    description: "Curated description."
#    homepage_url: "http://example.com"
#    binary_artifact:
#      url: "http://example.com/binary.zip"
#      hash:
#        value: "ddce269a1e3d054cae349621c198dd52"
#        algorithm: "MD5"
#    source_artifact:
#      url: "http://example.com/sources.zip"
#      hash:
#        value: "ddce269a1e3d054cae349621c198dd52"
#        algorithm: "MD5"
#    vcs:
#      type: "Git"
#      url: "http://example.com/repo.git"
#      revision: "1234abc"
#      path: "subdirectory"
#    is_metadata_only: true # Whether the package is metadata only.
#    is_modified: true # Whether the package is modified compared to the original source.
- id: "Maven:asm:asm" # No version means the curation will be applied to all versions of the package.
  curations:
    comment: "Repository moved to https://gitlab.ow2.org."
    vcs:
      type: "Git"
      url: "https://gitlab.ow2.org/asm/asm.git"
- id: "NPM::ast-traverse:0.1.0"
  curations:
    comment: "Revision found by comparing the NPM package with the sources from https://github.com/olov/ast-traverse."
    vcs:
      revision: "f864d24ba07cde4b79f16999b1c99bfb240a441e"
- id: "NPM::ast-traverse:0.1.1"
  curations:
    comment: "Revision found by comparing the NPM package with the sources from https://github.com/olov/ast-traverse."
    vcs:
      revision: "73f2b3c319af82fd8e490d40dd89a15951069b0d"
- id: "NPM::ramda:[0.21.0,0.25.0]" # Ivy-style version matchers are supported.
  curations:
    comment: >-
      The package is licensed under MIT per `LICENSE` and `dist/ramda.js`. The project logo is CC-BY-NC-SA-3.0 but it is
      not part of the distributed .tar.gz package, see the `README.md` which says:
      "Ramda logo artwork © 2014 J. C. Phillipps. Licensed Creative Commons CC BY-NC-SA 3.0."
    concluded_license: "MIT"
- id: "Maven:org.jetbrains.kotlin:kotlin-bom"
  curations:
    comment: "The package is a Maven BOM file and thus is metadata only."
    is_metadata_only: true
- id: "PyPI::pyramid-workflow:1.0.0"
  curations:
    comment: "The package has an unmappable declared license entry."
    declared_license_mapping:
      "BSD-derived (http://www.repoze.org/LICENSE.txt)": "LicenseRef-scancode-repoze"
- id: "PyPI::branca"
  curations:
    comment: "A copyright statement was used to declare the license."
    declared_license_mapping:
      "Copyright (C) 2013, Martin Journois": "NONE"
- id: "Maven:androidx.collection:collection:"
  curations:
    comment: "Scan the source artifact, because the VCS revision and path are hard to figure out."
    source_code_origins: [ARTIFACT]
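The ramda entry above uses a closed Ivy range. As a hedged sketch, the entries below show two other forms from standard Ivy range notation, a half-open range and an open-ended lower bound. The package id is hypothetical, and whether every Ivy matcher form is accepted by your ORT version is an assumption worth testing locally.

```yaml
# Hypothetical package id; the range syntax follows standard Ivy notation.
- id: "NPM::example-lib:[1.0.0,2.0.0["   # 1.0.0 inclusive up to 2.0.0 exclusive
  curations:
    comment: "Applies to the whole 1.x line."
    is_modified: true
- id: "NPM::example-lib:[2.0.0,)"        # 2.0.0 and any later version
  curations:
    comment: "Applies to 2.0.0 and later."
    source_code_origins: [VCS]
```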