Python

Learn how to implement Endor Labs in repositories with Python packages.

Python is a high-level, interpreted programming language widely used by developers. Endor Labs supports the scanning and monitoring of Python projects.

Using Endor Labs, developers can:

  • Test their software for potential issues and violations of organizational policy
  • Prioritize vulnerabilities in the context of their applications
  • Understand the relationships between software components in their applications

Scan Python projects

To successfully scan your repositories for Python:

  1. Install software prerequisites
  2. Build Python projects
  3. Run a scan
  4. Understand the scan process
  5. Troubleshoot errors

Software prerequisites

Ensure that the following prerequisites are complete:

  • Install Python 3.6 or higher. See the Python documentation on how to install Python.
  • Ensure that the package manager pip, Poetry, or PDM is used by your projects to build your software packages. From Python 3.12, install setuptools if you use pip.
  • Set up any build, code generation, or other dependencies that are required to install your project’s packages.
  • Organize the project as one or more packages using setup.py, setup.cfg, pyproject.toml, or requirements.txt package manifest files.
  • Make sure your repository includes one or more files with the .py extension, or pass one of requirements.txt, setup.py, setup.cfg, or pyproject.toml using the --include flag.

Hardware Prerequisites

Make sure that you meet the following minimum system specifications.

Repository size   Processor   Memory
Small             8-core      32 GB
Medium            16-core     64 GB
Large             32-core     128 GB

Build Python projects

You must create a virtual environment and build your Python projects before running the endorctl scan. Additionally, ensure that the packages are downloaded into the local package caches and that the build artifacts are present in the standard locations.

  1. Configure any private repositories
    • If you use dependencies from a PyPI-compatible repository other than pypi.org, configure it in the Integrations section of the Endor Labs web application. See Configure private Python repositories for more details.
  2. Clone the repository and create a virtual environment inside it
    1. Clone the repository using git clone or an equivalent workflow.

    2. Enter the working copy root directory that’s created.

    3. Create a virtual environment based on your package manager:

      For pip or setuptools

      • Use python3 -m venv venv. Set up the virtual environment in the root folder that you want to scan and name it venv or .venv.
      • Install your project’s dependencies using venv/bin/python -m pip install -r requirements.txt or venv/bin/python -m pip install . (the trailing dot tells pip to install the package in the current directory).
      • If the virtual environment is created outside the project, use one of the ways defined in Virtual environment support to specify the path of the Python virtual environment to Endor Labs.

      For Poetry projects

      • Install your project’s dependencies using poetry install.

      For PDM projects

      • Install your project’s dependencies using pdm sync.
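The build steps above can be summarized as a small lookup. The following is a minimal sketch, not part of endorctl: build_steps and INSTALL_COMMANDS are hypothetical names, and the pip entry assumes a project that installs from requirements.txt into a virtual environment named venv in the repository root.

```python
from typing import Dict, List

# Install commands taken from the steps above; the table itself is illustrative.
INSTALL_COMMANDS: Dict[str, List[str]] = {
    "pip": ["venv/bin/python", "-m", "pip", "install", "-r", "requirements.txt"],
    "poetry": ["poetry", "install"],
    "pdm": ["pdm", "sync"],
}


def build_steps(package_manager: str) -> List[List[str]]:
    """Return the commands to run, in order, from the repository root."""
    if package_manager not in INSTALL_COMMANDS:
        raise ValueError(f"unsupported package manager: {package_manager}")
    steps: List[List[str]] = []
    if package_manager == "pip":
        # Only the pip workflow above creates the virtual environment by hand.
        steps.append(["python3", "-m", "venv", "venv"])
    steps.append(INSTALL_COMMANDS[package_manager])
    return steps


for command in build_steps("pip"):
    print(" ".join(command))
```

Each returned command could be passed to subprocess.run(command, check=True) from the repository root so that a failed install stops the build before the scan starts.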

Virtual environment support

For Poetry and PDM, endorctl automatically picks up the virtual environments.

For pip, you need to use one of the following ways to specify the virtual environment details of your Python projects for both quick and deep scans.

  • Set up the virtual environment in the root folder that you want to scan and name it venv or .venv. The Endor Labs application then picks it up automatically.
export PYTHONPATH=/usr/tmp/venv:/usr/tmp/another-venv
  • Set the environment variable ENDOR_SCAN_PYTHON_VIRTUAL_ENV to the path of the virtual environment of your Python project.
export ENDOR_SCAN_PYTHON_VIRTUAL_ENV=/usr/tmp/venv
  • Set the environment variable ENDOR_SCAN_PYTHON_GLOBAL_SITE_PACKAGES to true to indicate that a virtual environment is not present and Endor Labs can use the system-wide Python installation packages and modules.
export ENDOR_SCAN_PYTHON_GLOBAL_SITE_PACKAGES=true
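To see which of these options applies to a pip project before scanning, a small pre-flight check can help. The sketch below is an assumption-laden illustration: describe_venv_option is a hypothetical helper, and the order in which it checks the options is a guess, because the text above does not state which option takes precedence when several are set.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def describe_venv_option(scan_root: Path, env: Optional[Mapping[str, str]] = None) -> str:
    """Report which virtual environment option from the list above is in effect."""
    env = os.environ if env is None else env

    # Assumed order: explicit path first, then venv/.venv in the scan root,
    # then global site-packages. The real precedence is not documented here.
    explicit = env.get("ENDOR_SCAN_PYTHON_VIRTUAL_ENV")
    if explicit:
        return f"explicit virtual environment: {explicit}"
    for name in ("venv", ".venv"):
        if (scan_root / name).is_dir():
            return f"virtual environment in scan root: {scan_root / name}"
    if env.get("ENDOR_SCAN_PYTHON_GLOBAL_SITE_PACKAGES", "").lower() == "true":
        return "global site-packages"
    return "none configured"


print(describe_venv_option(Path.cwd()))
```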

If you do not set up the virtual environment, Endor Labs attempts to set it up with all the code dependencies. However, we recommend that you install all dependencies in a virtual environment for the most accurate results.

If you are using custom scripts without manifest files to assemble your dependencies, make sure to set up the virtual environment and install the dependencies.

Configure private Python repositories

In addition to scanning public Python projects, Endor Labs provides support to fetch and scan private Python repositories. Endor Labs will fetch the resources from the authenticated endpoints and perform the scan, and you can view the dependencies and findings.

  1. Sign in to Endor Labs and select Integrations under Manage from the left sidebar.
  2. From Package Managers, select PyPI and click Manage.
  3. Click Add Package Manager.
  4. Enter a package manager URL.
  5. To enable Endor Labs to authenticate to your registry, select Authenticate to this registry and enter the username and password of your private package manager repository.
  6. Click Add Package Manager to save your configuration.

Run a scan

Use the following options to scan your repositories. Perform the endorctl scan after building the projects.

Option 1 - Quick scan

Perform a quick scan to get quick visibility into your software composition and perform dependency resolution. It discovers dependencies that the package has explicitly declared. If the package’s build file is incomplete then the dependency list will also be incomplete. This scan won’t perform reachability analysis to help you prioritize vulnerabilities.

endorctl scan --quick-scan

You can perform the scan from within the root directory of the Git project repository, and save the local results to a results.json file. The results and related analysis information are available on the Endor Labs user interface.

endorctl scan --quick-scan -o json | tee /path/to/results.json

You can sign in to the Endor Labs user interface, click Projects in the left sidebar, and find your project to review its results.

Option 2 - Deep scan

Use the deep scan to perform dependency resolution, reachability analysis, and generate call graphs. You can do this after you complete the quick scan successfully. The deep scan performs the following operations for the Python projects.

  • Discovers explicitly declared dependencies.
  • Discovers the OSS packages that the project depends on in the venv/global and scope/python.
  • Performs reachability analysis and generates call graphs.
  • Detects phantom dependencies, that is, dependencies that are used in source code but not declared in the package’s manifest files.
endorctl scan

Use the following flags to save the local results to a results.json file. The results and related analysis information are available on the Endor Labs user interface.

endorctl scan -o json | tee /path/to/results.json
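For automation, the quick and deep scan commands shown in this section can be assembled programmatically. This is a minimal sketch that uses only the flags that appear above (--quick-scan and -o json); build_scan_command and run_scan are hypothetical helper names, and run_scan assumes endorctl is on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def build_scan_command(quick: bool = False, json_output: bool = False) -> List[str]:
    """Assemble an endorctl scan command from the options described above."""
    command = ["endorctl", "scan"]
    if quick:
        command.append("--quick-scan")
    if json_output:
        command.extend(["-o", "json"])
    return command


def run_scan(results_path: Path, quick: bool = False) -> None:
    """Run the scan and keep a copy of its JSON output, similar to piping through tee."""
    completed = subprocess.run(
        build_scan_command(quick=quick, json_output=True),
        capture_output=True,
        text=True,
        check=True,  # raise if endorctl exits with a non-zero status
    )
    results_path.write_text(completed.stdout)
    print(completed.stdout)


print(" ".join(build_scan_command(quick=True, json_output=True)))
```

Calling run_scan(Path("results.json"), quick=True) from the repository root would mirror the quick scan command with tee, assuming the build steps have already been completed.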

When a deep scan is performed, all private software dependencies are completely analyzed by default if they have not been previously scanned. This is a one-time operation that slows down initial scans but does not affect subsequent scans.

Organizations might not own some parts of their software internally, so the related findings are not actionable for them. They can disable this analysis using the disable-private-package-analysis flag. Disabling private package analysis improves scan performance, but teams may lose insight into how applications interact with first-party libraries.

You can sign in to the Endor Labs user interface, click Projects in the left sidebar, and find your project to review its results.

Understand the scan process

Endor Labs uses the following two methods to analyze your Python code.

  • Dependency resolution using manifest files
  • Dependency resolution using static analysis

Endor Labs uses the results from both these methods to perform superior dependency resolution, identify security issues, detect open-source vulnerabilities, and generate call graphs.

Dependency resolution using manifest files

In this method, Endor Labs analyzes the manifest files present in a project to detect and resolve dependencies. The manifest files are analyzed in the following priority.

Package Manager   Priority   Build Solution
Poetry            1          poetry.lock, pyproject.toml
PDM               2          pdm.lock, pyproject.toml
pip               3          setup.py, setup.cfg, pyproject.toml, requirements.txt

For Poetry and PDM, if both the lock and toml files are available, both are analyzed to detect and resolve dependencies.

For pip, the dependency resolution is as follows, where the first available file in the priority list is analyzed to detect and resolve dependencies, and others are ignored.

Build solution     Priority
setup.py           1
setup.cfg          2
pyproject.toml     3
requirements.txt   4
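The pip rule above (use the first available file in priority order and ignore the rest) can be illustrated with a short sketch. The function name select_pip_manifest is hypothetical, and the code only models the rule as described here, not the actual Endor Labs implementation.

```python
from typing import Iterable, Optional

# Priority order copied from the table above.
PIP_MANIFEST_PRIORITY = ("setup.py", "setup.cfg", "pyproject.toml", "requirements.txt")


def select_pip_manifest(filenames: Iterable[str]) -> Optional[str]:
    """Return the highest-priority manifest that is present, or None if there is none."""
    present = set(filenames)
    for manifest in PIP_MANIFEST_PRIORITY:
        if manifest in present:
            return manifest
    return None


# setup.cfg outranks requirements.txt, so requirements.txt is ignored here.
print(select_pip_manifest(["requirements.txt", "setup.cfg", "README.md"]))
```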

On initialization of a scan, Endor Labs identifies the package manager by inspecting files such as pyproject.toml, poetry.lock, pdm.lock, setup.py, and requirements.txt. When poetry.lock or pyproject.toml files are discovered, Endor Labs uses the Poetry package manager to build the project. When pdm.lock or pyproject.toml files are discovered, Endor Labs uses the PDM package manager. Otherwise, it uses pip3.
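The selection logic can be approximated as follows. This is a sketch under stated assumptions, not the Endor Labs implementation: detect_package_manager is a hypothetical name, and because pyproject.toml is shared by Poetry and PDM, the sketch assumes that a [tool.poetry] or [tool.pdm] table in that file is used to tell them apart. The text above does not say how that case is actually resolved.

```python
from typing import Iterable


def detect_package_manager(filenames: Iterable[str], pyproject_text: str = "") -> str:
    """Guess the package manager, checking Poetry, then PDM, then falling back to pip."""
    present = set(filenames)
    # Assumption: a lock file or a tool table in pyproject.toml identifies the manager.
    if "poetry.lock" in present or "[tool.poetry" in pyproject_text:
        return "poetry"
    if "pdm.lock" in present or "[tool.pdm" in pyproject_text:
        return "pdm"
    return "pip"


print(detect_package_manager(["pyproject.toml", "pdm.lock"]))
```

A PDM project that has neither pdm.lock nor a [tool.pdm] table would fall through to pip in this sketch, which is one reason to commit the lock file.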

Example

This example demonstrates scanning a Python repository from GitHub on your local system using endorctl scan. It assumes that you are running the scan on a Linux or macOS system and that you have the following Endor Labs API key and secret stored in environment variables. See endorctl flags and variables.

  • ENDOR_API_CREDENTIALS_KEY set to the API key
  • ENDOR_API_CREDENTIALS_SECRET set to the API secret
  • ENDOR_NAMESPACE set to your namespace (you can find this when logged into Endor Labs by looking at your URL: https://app.endorlabs.com/t/NAMESPACE/...; it is typically a form of your organization’s name)
pip
git clone https://github.com/HybirdCorp/creme_crm.git
cd creme_crm
python3 -m venv venv
source venv/bin/activate
venv/bin/python3 -m pip install .
endorctl scan
Poetry
git clone https://github.com/HybirdCorp/creme_crm.git
cd creme_crm
poetry lock
endorctl scan
PDM
git clone https://github.com/HybirdCorp/creme_crm.git
cd creme_crm
pdm sync
endorctl scan

The scan for this repository is expected to complete in a few minutes, depending on the size of the project. You can now visit app.endorlabs.com, navigate to Projects, and choose the HybirdCorp/creme_crm project to see your scan results.
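Before running the commands in this example, it can be useful to confirm that the three environment variables listed earlier are set. The following is a minimal sketch; missing_variables is a hypothetical helper and is not part of endorctl.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the example's prerequisites.
REQUIRED_VARIABLES = (
    "ENDOR_API_CREDENTIALS_KEY",
    "ENDOR_API_CREDENTIALS_SECRET",
    "ENDOR_NAMESPACE",
)


def missing_variables(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


missing = missing_variables()
if missing:
    print("Set these variables before scanning: " + ", ".join(missing))
else:
    print("All required variables are set.")
```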

Dependency resolution using static analysis

Not all Python projects include manifest files. A project can be a series of install statements that are assembled by custom scripts. Even when manifest files are present, the dependency information and version declared in the manifest file may be drastically different from what is used in a project.

To solve this problem, Endor Labs has developed a unique method for dependency resolution by performing a static analysis on the code, giving you complete visibility of what is used in your code.

  • Endor Labs enumerates all Python packages and recognizes the import statements within the project. An import statement is a Python code statement that is used to bring external modules or libraries into your Python script.
  • It performs a static analysis of the code to match the import statements with the pre-installed packages and recursively traverses all files to create a dependency tree with the actual versions that are installed in the virtual environment.
  • It detects the dependencies at the system level to identify which ones are resolved and retrieves the precise name and version information from the library currently in use.
  • Also, it gives you accurate visibility into your project components and helps you understand how the components depend on one another.

Through this approach, Endor Labs conducts comprehensive dependency management, assesses reachability, and generates integrated call graphs.
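The general idea of matching import statements against declared dependencies can be shown with Python's standard ast module. This is only a simplified sketch of the technique, not Endor Labs' analysis: the function names are hypothetical, it inspects a single source string, and it compares import names directly with declared names, although real import names can differ from distribution names (for example, the yaml module is provided by the PyYAML distribution).

```python
import ast
from typing import Iterable, Set


def imported_top_level_modules(source: str) -> Set[str]:
    """Collect the top-level module names imported by a piece of Python source."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import of the project's own code.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


def phantom_candidates(source: str, declared: Iterable[str], ignored: Iterable[str] = ()) -> Set[str]:
    """Return imported modules that are neither declared nor explicitly ignored."""
    return imported_top_level_modules(source) - set(declared) - set(ignored)


example = "import os\nimport requests\nfrom flask.views import View\nfrom . import helpers\n"
# flask is declared and os is standard library, so only requests remains.
print(phantom_candidates(example, declared={"flask"}, ignored={"os"}))
```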

Known Limitations

  • Endor Labs specifically looks for the requirements.txt file for a Python project using pip. If you use a different file name, it won’t be automatically discovered.
  • Python versions older than 3.7 are not supported but may work as expected.
  • If a virtual environment is not provided, Python version constraints are not assumed based on the runtime environment of CI. Dependencies are shown for all possible versions of Python at runtime. If a virtual environment is provided, Endor Labs respects what is installed in the virtual environment.
  • Symbolic links into manifest files may result in the same package being duplicated in the project.
  • If a dependency is not available in the PyPI repository or in a configured private package repository, Endor Labs cannot build the software, and scans may fail unless the package is first built successfully in the local environment.

Call Graph Limitations

  • Function calls made through dispatch tables might not be included in the call graph.
  • Function calls using unresolved variables might not be included in the call graph.
  • Dynamically modified or extended function calls used to declare methods or attributes at run time might not be included in the call graph.
  • Functions that are called indirectly through a function pointer rather than by their direct name might not be included in the call graph.
  • Type stubs that provide hints or type annotations for functions, methods, and variables in your Python modules or libraries have to be installed manually before performing a scan.
  • If your project has a pyproject.toml file that includes a tool.pyright section, it overrides the Endor Labs settings for Pyright and may produce incorrect call graph results. Remove the tool.pyright section from the pyproject.toml file before scanning.
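The following illustrative snippet shows the kinds of calls that the first few limitations describe. All names in it are made up for the example. Whether a particular call is resolved depends on the analysis, so treat this as a picture of the patterns rather than a statement of what Endor Labs does or does not detect.

```python
def export_csv(rows):
    return ",".join(rows)


def export_tsv(rows):
    return "\t".join(rows)


# Dispatch table: the callee is chosen by a dictionary lookup at run time.
EXPORTERS = {"csv": export_csv, "tsv": export_tsv}


def export(rows, fmt):
    return EXPORTERS[fmt](rows)


class Report:
    pass


# A method attached at run time rather than declared in the class body.
setattr(Report, "render", lambda self, rows: export(rows, "csv"))


def call_by_name(obj, method_name, *args):
    # Indirect call: the method name is only known at run time.
    return getattr(obj, method_name)(*args)


print(call_by_name(Report(), "render", ["a", "b"]))
```

Calling the target function directly by name, where practical, makes the edge visible in the source.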

Troubleshoot errors

Here are a few error scenarios that you can check for and attempt to resolve.

  • Virtual environment errors: You can identify errors that occur during virtual environment installation by looking for one of the following messages in the error logs: failed to create virtual environment or failed to install dependencies.
  • Missing environment dependency: If your code depends on packages such as psycopg2, environment dependencies such as PostgreSQL are also required. The endorctl scan may fail if the environment where it is running does not have PostgreSQL installed.
  • Incompatible Python version: The default Python version in the environment where the endorctl scan is running is incompatible with one or more of the dependencies that are needed by the code.
  • Incompatible architecture: One or more dependencies are not compatible with the operating system or architecture of the local system on which you are running the endorctl scan. For example, projects that depend on PyObjC can be run on Mac-based systems, but not on Linux systems. A few Python libraries are incompatible with 32-bit architectures and can only be run on 64-bit architectures.
  • Resolved dependency errors: A version of a dependency does not exist, or it cannot be found. It may have been removed from the repository.
  • Call graph errors: These errors occur if pip or Poetry is unable to build the project because a required dependency cannot be located.
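When investigating an incompatible Python version or architecture, it helps to record what the scanning environment actually provides. The sketch below uses only the Python standard library; environment_summary is a hypothetical helper name, and the output is meant for comparison with your dependencies' stated requirements.

```python
import platform
import struct


def environment_summary():
    """Collect the interpreter and platform details relevant to the errors above."""
    return {
        "python_version": platform.python_version(),
        "operating_system": platform.system(),
        "machine": platform.machine(),
        # Pointer size distinguishes a 32-bit from a 64-bit interpreter.
        "pointer_bits": struct.calcsize("P") * 8,
    }


for key, value in environment_summary().items():
    print(f"{key}: {value}")
```

Run it with the same interpreter that the scan uses, for example venv/bin/python, so that the reported version matches the virtual environment.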