Understand Endor scores

Understand how packages are scored in Endor Labs

Endor Labs collects and analyzes a large amount of information about open source and customer packages and uses it to compute scores. Every open source and private package is scored across several dimensions that capture both the security and the operational aspects of risk.

Types of scores

Scores provide a high-level, easy-to-understand metric of how well a package does based on factors such as security, activity, popularity, and code quality.

Endor Labs scores are categorized into:

  • Security: Indicates how many security-related issues a package may have, based on known vulnerabilities, adherence to security best practices during development, and the results of static code analysis. Packages with lower security scores can be expected to have more security-related issues than packages with higher scores. See the factors affecting the security score for more details.
  • Activity: Indicates the level of development activity for a package as observed through the source code management system. Packages with higher activity scores are more active and presumably better maintained than packages with lower activity scores. See the factors affecting the activity score for more details.
  • Popularity: Indicates how widely a package is used in its ecosystem, based on source code management system metrics (for example, the number of stars in GitHub) and on how many other packages import it. A high popularity score indicates that the package is widely used. See the factors affecting the popularity score for more details.
  • Code Quality: Indicates how well the package complies with best practices for code development and includes the results of static code analysis of that package’s source code. A package with a higher quality score has fewer code issues. See the factors affecting the code quality score for more details.

Input data for score calculation

Endor Labs performs score computation for various entities such as:

  • Repository score - A repository has a score that captures overall repository activity and properties that span multiple versions of the code.
  • Repository version score - A repository version has a score that captures details that are specific to a version of the code.
  • Package version score - A package version has a score that captures the details that are specific to a package version.

The scoring algorithm considers the following input parameters while calculating the scores:

  • Data from a version control system such as Git that provides information about files, versions and their contents.
  • Data from a source code management system such as GitHub that provides information about the development activities on a project like issues, pull requests, and more.
  • Data from package managers that provide information about the properties of a package, for example, license, releases, and metadata like the number of stars.
  • Data from vulnerability databases that provide information about known security issues in a package.
  • Data from static code analysis tools that provide information about specific issues in the source code of the package.

Score range

The scores for each category range between 0 and 10. A score of 5 indicates an inconclusive analysis, meaning the package is neutral. A score higher than 5 indicates that the package mostly has positive factors, while a score lower than 5 indicates that it mostly has negative factors. A score of 10 indicates that the package meets all the positive conditions, while a score of 0 indicates that it meets all the negative conditions.
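
The interpretation of the score range can be sketched as a small helper. This is only an illustration: the function name `interpret_score` and the returned labels are hypothetical and are not part of Endor Labs.

```python
def interpret_score(score: float) -> str:
    """Map a 0-10 category score to the interpretation described in this section."""
    if not 0 <= score <= 10:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    if score == 10:
        return "all positive conditions met"
    if score == 0:
        return "all negative conditions met"
    if score > 5:
        return "mostly positive factors"
    if score < 5:
        return "mostly negative factors"
    # Exactly 5: the analysis is inconclusive and the package is neutral.
    return "neutral or inconclusive"


for value in (0, 3.2, 5, 7.5, 10):
    print(value, "->", interpret_score(value))
```

The boundary values 0, 5, and 10 are checked explicitly because the text assigns each of them a distinct meaning.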

1 - Activity score factors

Activity scores indicate the level of activity associated with a repository. Activity information is based on metadata gathered from a code hosting and version control system such as GitHub. Higher levels of activity can mean that the repository is well maintained and will continue to be in the future.

The following factors have a positive contribution to the activity score:

  • Activity from corporate-affiliated accounts indicates that the project may have reliable backing and support.
  • Activity from reputable accounts indicates that the repository is well-maintained. An account is considered reputable if it participates in multiple open-source projects and has a high rating on a source control system such as GitHub.
  • Consistent and continuous commit activity over longer periods of time indicates that the project is active.
  • Frequent releases indicate a commitment to maintaining and supporting the codebase.
  • Activity in the form of comments on issues shows that there is engagement in the project.
  • A high ratio of issues opened by external contributors indicates that the project is active.
  • More issues being closed than opened indicates that the project is active.
  • The repository keeps releasing updates to earlier release trains, which is a sign of a commitment to maintaining and supporting the users of the project.
  • When a repository belongs to an organization, there is a lower risk of it getting abandoned in the future.
  • Recent issue and commit activity means the project is active.
  • Configuring topics is an indication that the repository is well-maintained.

The following factors have a negative contribution to the activity score:

  • Archived repositories are not active and have a low score.
  • A high ratio of rejected pull requests indicates that the project may not be actively developed.
  • A lack of recent issue activity may indicate that the project is not actively used.
  • Significantly more pull requests being submitted than merged indicates that the project may not be maintained.
  • The repository does not have any recent releases, which could mean that it is not actively maintained.
  • When a repository is personal, there is a higher risk of it getting abandoned in the future.
  • The majority of the repository’s activity comes from a very small number of accounts. The project could be at risk if these accounts cannot continue their contributions.
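
The last factor, activity concentrated in a very small number of accounts, can be illustrated with a simple share calculation. This is a minimal sketch under stated assumptions: the function name `top_contributor_share`, the choice of the top 2 accounts, and the sample commit history are all hypothetical, and the source does not describe how Endor Labs actually measures concentration.

```python
from collections import Counter


def top_contributor_share(commit_authors: list[str], top_n: int = 2) -> float:
    """Return the fraction of commits made by the top_n most active accounts."""
    if not commit_authors:
        return 0.0
    counts = Counter(commit_authors)
    top_total = sum(count for _, count in counts.most_common(top_n))
    return top_total / len(commit_authors)


# Hypothetical commit history: one list entry per commit, holding the author account.
authors = ["alice"] * 6 + ["bob"] * 3 + ["carol"]
share = top_contributor_share(authors)
print(f"Top 2 accounts authored {share:.0%} of commits")
```

A share close to 1.0 would suggest that the project depends heavily on a few accounts; any threshold for what counts as too concentrated is a separate judgment call.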

2 - Popularity score factors

Popularity scores indicate how popular the repository is. Popularity information is based on metadata gathered from a code hosting and version control system such as GitHub. Popular repositories are more likely to be maintained.

The following factors have a positive contribution to the popularity score:

  • A large number of reputable contributors affiliated with the project indicates that the project is reliable. An account is considered reputable if it participates in multiple open-source projects and has a high rating on GitHub.
  • The project has many stars, indicating interest in the project.
  • Having subscribers indicates interest in the project.
  • Includes many subscribers.
  • Includes a high number of dependent projects.
  • Includes many forks.
  • Some of the released artifacts of the repository are downloaded many times, indicating that the project is popular.

The following factors have a negative contribution to the popularity score:

  • Very few forks may indicate a lack of interest in the project.
  • Few subscribers may mean a lack of interest in the project.
  • Few stars may mean a lack of interest in the project.

3 - Security score factors

Security scores indicate the level of compliance with security best practices and reflect vulnerability information for the repository, including both open and fixed vulnerabilities. Vulnerability information is based on OSV.dev data and Endor Labs’ vulnerability database.

The following factors have a positive contribution to the security score:

  • Critical and high severity vulnerabilities were discovered in the past in the repository but have now been fixed. This indicates that the code base is properly maintained.
  • A SECURITY.md file highlighting security-related information is a sign of repository maturity.
  • A high volume of commits related to vulnerabilities may indicate that the project has a large number of security issues but also that they are actively being addressed. A commit is considered vulnerability-related if it mentions a CVE in its commit message.
  • This package does not access any environment information such as environment variables, user, or host names. This reduces the risk of exposing security-sensitive information, such as environment variables with API keys.
  • No vulnerabilities have ever been discovered in the repository, which indicates that there are no known security issues in this codebase.
  • This package does not read data from the file system and does not have write access to it.
  • This package does not start operating system processes. This reduces the risk of having command or parameter injection vulnerabilities.
  • This package does not use any dynamic programming techniques such as introspection, reflection or dynamic code execution through eval() type of functions or script engines.
  • This package does not open any network connections or listen for incoming connection requests. This reduces the risk of data leakage or loading of data from untrusted sources.
  • Recently fixed vulnerabilities indicate that the repository has lower security risk and is well maintained.
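
The rule that a commit is vulnerability-related if its message mentions a CVE can be approximated with a regular expression. This is a minimal sketch, not Endor Labs' implementation: the names `CVE_PATTERN` and `is_vulnerability_related` and the sample commit messages are hypothetical.

```python
import re

# CVE identifiers have the form CVE-<year>-<sequence>,
# where the sequence number has at least four digits.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)


def is_vulnerability_related(commit_message: str) -> bool:
    """Return True if the commit message mentions a CVE identifier."""
    return CVE_PATTERN.search(commit_message) is not None


# Hypothetical commit messages used only for illustration.
messages = [
    "Fix CVE-2021-44228 by disabling JNDI lookups",
    "Update README",
    "Bump dependency (addresses cve-2022-12345)",
]
related = [m for m in messages if is_vulnerability_related(m)]
print(f"{len(related)} of {len(messages)} commits mention a CVE")
```

Matching is case-insensitive here as a design choice, since commit messages do not always capitalize the identifier.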

The following factors have a negative contribution to the security score:

  • High activity from invalid accounts is suspicious.
  • This package accesses environment information like environment variables, user, and host names. Some of this information may be security sensitive, e.g., environment variables with API keys.
  • This package reads data from the file system. This can be dangerous in combination with user-provided input, e.g., lead to path traversal vulnerabilities.
  • This package starts operating system processes. This can be dangerous in combination with user-provided input, as it can lead to command or parameter injection vulnerabilities.
  • This package has a large number of instances of suspicious code that has been known to be used by malware. While this is not a guarantee that this package is malicious, a review of the related code is recommended.
  • This package opens network connections or listens for incoming connection requests. This can be dangerous in combination with user-provided input, e.g., lead to data leakage or the load of data from untrusted sources.
  • This package writes data to the file system, creates or deletes directories, or changes the ownership or permissions of files. This can be dangerous in combination with user-provided input, e.g., lead to path traversal vulnerabilities.
  • A high fraction of critical vulnerabilities among the discovered vulnerabilities indicates an elevated security risk and potentially systematic security issues with this codebase.
  • A high fraction of high fix priority vulnerabilities among the discovered vulnerabilities indicates an elevated security risk and that the repository needs immediate maintenance. A vulnerability is classified as high fix priority based on Endor Labs’ analysis.
  • A high fraction of high severity or critical vulnerabilities among the discovered vulnerabilities indicates an elevated security risk and potentially systematic security issues with this codebase.
  • This package uses dynamic programming techniques like introspection, reflection or dynamic code execution through eval() type of functions or script engines.
  • Taking more time to fix critical vulnerabilities discovered in a repository indicates a lack of regular maintenance. Analysis only considers vulnerabilities associated with this repository and not its dependencies.
  • The package accesses environment information like environment variables, user and host names. Some of this information may be security sensitive, such as environment variables with API keys.
  • A high fraction of releases with high severity vulnerabilities indicates an elevated security risk and potentially systematic security issues with this codebase.
  • A large number of unmerged vulnerability-related pull requests means that the project is not actively maintained and may have security issues.
  • The repository includes recently discovered vulnerabilities indicating that the repository’s security risk is increasing.
  • This package has a large number of instances of suspicious code that has been known to be used by malware.
  • A high number of critical or unfixed vulnerabilities discovered in a repository indicates an elevated security risk and potentially systematic security issues with this codebase.
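
As an illustration of how static analysis might flag dynamic code execution, the sketch below walks a Python syntax tree and reports direct calls to `eval`, `exec`, `compile`, and `__import__`. This is not Endor Labs' analysis: the function name `find_dynamic_calls` and the set of flagged names are assumptions chosen for the example, and the check only catches direct calls by name.

```python
import ast

# Builtins treated as dynamic code execution in this example (an assumption).
DYNAMIC_CALLS = {"eval", "exec", "compile", "__import__"}


def find_dynamic_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each direct call to a flagged builtin."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DYNAMIC_CALLS:
                findings.append((node.lineno, node.func.id))
    return sorted(findings)


# The sample is only parsed, never executed.
sample = "x = eval(user_input)\nprint(len('abc'))\nexec('y = 1')\n"
for line, name in find_dynamic_calls(sample):
    print(f"line {line}: call to {name}()")
```

A real analyzer would also need to handle aliased names, attribute access such as `builtins.eval`, and other languages, all of which this sketch ignores.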

4 - Code quality score factors

Code quality scores provide a view of code quality and adherence to best practices in a repository. Code quality information is based on metadata gathered from a code hosting and version control system such as GitHub and from the source code in the repository.

The following factors have a positive contribution to the code quality score:

  • Activity from bot accounts shows that the project is using automation for some development tasks.
  • The repository has reached 1.0 release status, the first major release milestone, which is a sign of maturity.
  • The project includes test code.
  • Attaching labels to issues allows for better tracking of issue activity in the project.
  • The repository has multiple files that cover basic operational aspects of the project, which shows a strong emphasis on best practices.
  • A large fraction of the commits in this repository are verified; this shows that security best practices are followed.
  • Pull requests from dependency management bot accounts indicate that the project is using automation to keep its dependencies up to date.
  • Attaching labels to pull requests helps organize the development activity in the project.
  • Pull requests from bot accounts indicate that the project is using automation for development tasks.
  • A large fraction of the commits in this repository are associated with a pull request; this shows that development best practices are followed.
  • The repository has released signed artifacts, which is a sign of mature security operations.
  • The use of continuous integration is a sign of good developer practices.
  • Using GitHub templates to manage issues shows that the development work in the repository is well-organized.
  • The repository includes badges.
  • Displaying the Code Coverage badge means that the repository is using code coverage tools in its development process.
  • Displaying the CII (Core Infrastructure Initiative) Best Practices badge means that the repository has met a number of best practices requirements.
  • The repository includes documentation, making it easier to understand and use.
  • The repository has files that cover basic operational aspects of the project, which shows an emphasis on best practices.
  • The repository uses CI and a high fraction of commits pass the CI checks, which is a sign of good code quality.
  • Displaying the OpenSSF Scorecard badge means that the repository strives to meet the OpenSSF Scorecard checks.

The following factors have a negative contribution to the code quality score:

  • This package has a large number of instances of likely incorrect code that is associated with coding issues and potential bugs.
  • This package has a large number of instances of questionable code warnings that are associated with coding issues and potential bugs.
  • This project has a high number of indirect dependencies compared to the number of direct dependencies; this additional code increases the cost of building the project and its supply chain risk.
  • The repository has many major releases in a short amount of time, which is a sign of high churn and potential instability.
  • Packages where the package manager license information does not agree with the license information found in the code require additional review.
  • Packages with multiple licenses require extra effort to determine their exact license status.
  • Multiple unpinned dependencies can significantly increase the risk of a codebase since packages can be updated at any moment.
  • Many unreachable direct dependencies unnecessarily increase the size of the codebase and the cost of building it.
  • The project does not have an automated build system.
  • The repository does not have any of the files that typically explain the basic operational aspects of the project, which may be an indication that the project is not well-maintained.
  • Packages or source code without license information or with a restrictive license can create operational risk.
  • This release is old and has been superseded by multiple newer releases; it should not be used.
  • The repository has releases that do not follow the SemVer standard, which goes against best practices.
  • When a repository contains binary files, it is harder to analyze and assess its functionality and risks.
  • Lack of access to the source code of the project dramatically limits visibility into its quality and adherence to best practices.
  • The repository has an unusually fast first release.
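
The SemVer factor above can be illustrated with a check over release tags. This is a simplified sketch under stated assumptions: the pattern covers MAJOR.MINOR.PATCH with optional pre-release and build metadata but does not enforce every rule of the specification (for example, it allows leading zeros in numeric pre-release identifiers), and stripping a leading "v" from tags is a convenience chosen for this example. The names `SEMVER_PATTERN` and `non_semver_releases` and the sample tags are hypothetical.

```python
import re

# Simplified SemVer pattern: MAJOR.MINOR.PATCH without leading zeros,
# followed by optional "-prerelease" and "+build" parts.
SEMVER_PATTERN = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*)?"
    r"(?:\+[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*)?"
)


def non_semver_releases(tags: list[str]) -> list[str]:
    """Return the release tags that do not match the simplified SemVer pattern."""
    result = []
    for tag in tags:
        # Assumption: tolerate the common "v" tag prefix, as in "v1.2.3".
        version = tag[1:] if tag.startswith("v") else tag
        if SEMVER_PATTERN.fullmatch(version) is None:
            result.append(tag)
    return result


# Hypothetical release tags used only for illustration.
tags = ["1.0.0", "v2.1.3", "1.2", "2.0.0-rc.1", "release-3"]
print(non_semver_releases(tags))
```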