Hi there! Thank you for even being interested in contributing to LangChain-Google. As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.
To contribute to this project, please follow the "fork and pull request" workflow. Please do not try to push directly to this repo unless you are a maintainer.
Please follow the checked-in pull request template when opening pull requests. Note related issues and tag relevant maintainers.
Pull requests cannot land without passing the formatting, linting, and testing checks first. See the Testing section and the Formatting and Linting section for how to run these checks locally.
It's essential that we maintain great documentation and testing. If you:
- Fix a bug
  - Add a relevant unit or integration test when possible. These live in tests/unit_tests and tests/integration_tests.
- Make an improvement
  - Update unit and integration tests when relevant.
- Add a feature
  - Add unit and integration tests.
We are a small, progress-oriented team. If there's something you'd like to add or change, opening a pull request is the best way to get our attention.
This project utilizes Poetry v1.7.1+ as a dependency manager.
Note: Before installing Poetry, if you use Conda, create and activate a new Conda env (e.g. conda create -n langchain python=3.9).
Install Poetry: see the Poetry documentation on how to install it.
Note: If you use Conda or Pyenv as your environment/package manager, after installing Poetry, tell Poetry to use the virtualenv python environment (poetry config virtualenvs.prefer-active-python true).
This repository contains three packages with Google integrations with LangChain:
- langchain-google-genai implements integrations of Google Generative AI models.
- langchain-google-vertexai implements integrations of Google Cloud Generative AI on Vertex AI.
- langchain-google-community implements integrations for Google products that are not part of the langchain-google-vertexai or langchain-google-genai packages.
Each of these has its own development environment. Docs are run from the top-level makefile, but development is split across separate test & release flows.
If you plan on contributing to LangChain-Google code or documentation, it can be useful to understand the high level structure of the repository.
LangChain-Google is organized as a monorepo that contains multiple packages.
Here's the structure visualized as a tree:
.
├── libs
│   ├── community
│   │   ├── tests/unit_tests  # Unit tests (present in each package; not shown for brevity)
│   │   └── tests/integration_tests  # Integration tests (present in each package; not shown for brevity)
│   ├── genai
│   └── vertexai
The root directory also contains the following files:
- pyproject.toml: Dependencies for building and linting the docs and cookbook.
- Makefile: A file that contains shortcuts for building and linting the docs and cookbook.
There are other files in the root directory level, but their presence should be self-explanatory. Feel free to browse around!
Install development requirements (for running langchain, running examples, linting, formatting, tests, and coverage):
poetry install --with lint,typing,test,test_integration
Then verify dependency installation:
make test
If during installation you receive a WheelFileValidationError for debugpy, please make sure you are running Poetry v1.6.1+. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases.
If you are still seeing this bug on v1.6.1+, you may also try disabling "modern installation" (poetry config installer.modern-installation false) and re-installing requirements.
See this debugpy issue for more details.
Formatting for this project is done via ruff.
To run formatting for a library, run make format from the relevant library directory:
cd libs/{LIBRARY}
make format
Additionally, you can run the formatter only on the files that have been modified in your current branch as compared to the master branch using the format_diff command:
make format_diff
This is especially useful when you have made changes to a subset of the project and want to ensure your changes are properly formatted without affecting the rest of the codebase.
Linting for this project is done via a combination of ruff and mypy.
To run linting for docs, cookbook and templates:
make lint
To run linting for a library, run the same command from the relevant library directory:
cd libs/{LIBRARY}
make lint
In addition, you can run the linter only on the files that have been modified in your current branch as compared to the master branch using the lint_diff command:
make lint_diff
This can be very helpful when you've made changes to only certain parts of the project and want to ensure your changes meet the linting standards without having to check the entire codebase.
We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
Spellchecking for this project is done via codespell.
Note that codespell finds common typos, so it can produce false positives (correctly spelled but rarely used words) and false negatives (misspelled words that it does not catch).
To check spelling for this project:
make spell_check
To fix spelling in place:
make spell_fix
If codespell is incorrectly flagging a word, you can skip spellcheck for that word by adding it to the codespell config in the pyproject.toml file.
[tool.codespell]
...
# Add here:
ignore-words-list =...
community, genai, and vertexai rely on optional dependencies to keep these packages lightweight.
You'll notice that pyproject.toml and poetry.lock are not touched when you add optional dependencies below.
If you're adding a new dependency to LangChain-Google, assume that it will be an optional dependency, and that most users won't have it installed.
Users who do not have the dependency installed should be able to import your code without any side effects (no warnings, no errors, no exceptions).
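One common way to meet this requirement is to defer the import until the dependency is actually used. The following is a minimal sketch of that pattern, not this project's own helper: the function name `import_optional` and its `install_hint` parameter are hypothetical and only illustrate the idea.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency at call time, not at module import time.

    `import_optional` and `install_hint` are hypothetical names used for
    illustration; they are not part of any LangChain-Google package.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Raise only when the feature is actually used, with a clear hint.
        raise ImportError(
            f"Could not import {module_name!r}. "
            f"Install it with `pip install {install_hint}`."
        ) from exc


def dump_payload(payload: dict) -> str:
    # The import happens inside the function, so importing this module
    # has no side effects for users who lack the dependency.
    json_module = import_optional("json", "json")  # stdlib stand-in
    return json_module.dumps(payload)
```

Because the import lives inside the function body, a user without the dependency can still import the surrounding module and only sees an error when calling the feature that needs it.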
To introduce the dependency to a library, please do the following:
- Open extended_testing_deps.txt and add the dependency.
- Add a unit test that at the very least attempts to import the new code. Ideally, the unit test makes use of lightweight fixtures to test the logic of the code.
- Please use the @pytest.mark.requires(package_name) decorator for any unit tests that require the dependency.
All of our packages have unit tests and integration tests, and we favor unit tests over integration tests.
Unit tests run on every pull request, so they should be fast and reliable.
Integration tests run once a day, and they require more setup, so they should be reserved for confirming interface points with external services.
Unit tests cover modular logic that does not require calls to outside APIs. If you add new logic, please add a unit test. In unit tests we check pre/post processing and mock all external dependencies.
To install dependencies for unit tests:
poetry install --with test
To run unit tests:
make test
To run unit tests in Docker:
make docker_tests
To run a specific test:
TEST_FILE=tests/unit_tests/test_imports.py make test
Integration tests cover logic that requires making calls to outside APIs (often integration with other services). If you add support for a new external API, please add a new integration test.
Warning: Almost no tests should be integration tests.
Tests that require making network connections make it difficult for other developers to test the code.
Instead, favor relying on the responses library and/or mock.patch to mock requests using small fixtures.
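A minimal sketch of the mock.patch approach is shown below. It uses only the standard library so that it is self-contained; `fetch_model_names`, the URL, and the payload shape are hypothetical stand-ins for real integration code.

```python
import json
import urllib.request
from typing import List
from unittest import mock


def fetch_model_names(url: str) -> List[str]:
    """Hypothetical function under test: fetch JSON and return model names."""
    with urllib.request.urlopen(url) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return [model["name"] for model in payload["models"]]


def test_fetch_model_names_without_network() -> None:
    # Small in-memory fixture that replaces the real HTTP response.
    fake_body = json.dumps(
        {"models": [{"name": "model-a"}, {"name": "model-b"}]}
    ).encode("utf-8")
    fake_response = mock.MagicMock()
    fake_response.read.return_value = fake_body
    # urlopen is used as a context manager, so __enter__ returns the fake.
    fake_response.__enter__.return_value = fake_response

    with mock.patch(
        "urllib.request.urlopen", return_value=fake_response
    ) as fake_urlopen:
        names = fetch_model_names("https://example.invalid/models")

    assert names == ["model-a", "model-b"]
    fake_urlopen.assert_called_once_with("https://example.invalid/models")
```

The test never opens a network connection, so it can run as a unit test on every pull request.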
To install dependencies for integration tests:
poetry install --with test,test_integration
To run integration tests:
make integration_tests
We annotate integration tests to separate out those that rely heavily on GCP infrastructure. To run those tests, we have created a separate GCP project with all the necessary infrastructure provisioned. To run the extended integration tests locally, you will need to provision a GCP project and pass its configuration via environment variables.
Test annotations:
- Tests without annotations will be executed on every run of the integration tests pipeline.
- Tests with release annotation ( @pytest.mark.release ) will be run with the release pipeline.
- Tests with extended annotation ( @pytest.mark.extended ) will be run on each PR.
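The annotations above are ordinary pytest markers. A minimal sketch of how they might appear in a test module follows; the test names and bodies are placeholders, not tests from this repository.

```python
import pytest


def test_basic_call() -> None:
    # No annotation: part of every integration test run.
    assert 1 + 1 == 2


@pytest.mark.release
def test_release_only() -> None:
    # Marked for the release pipeline.
    assert "release".upper() == "RELEASE"


@pytest.mark.extended
def test_needs_gcp_infrastructure() -> None:
    # Marked as extended: relies on provisioned GCP infrastructure.
    assert isinstance("PROJECT_ID", str)
```

With plain pytest, a marker can be selected using the `-m` option, for example `pytest -m release`. Whether the project's make targets wrap this option is not stated here, so treat that as an assumption to check against the Makefile.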
The integration tests use several search engines and databases. The tests aim to verify the correct behavior of the engines and databases according to their specifications and requirements.
To run some integration tests, you will need a configured GCP project. The configuration of the GCP project required for integration testing is stored in the terraform folder within each library.
- copy tests/integration_tests/.env.example to tests/integration_tests/.env
- set variables in the tests/integration_tests/.env file, e.g. OPENAI_API_KEY
Additionally, it's important to note that some integration tests may require certain environment variables to be set, such as PROJECT_ID. Be sure to set any required environment variables before running the tests to ensure they run correctly.
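A quick way to catch a missing variable before a long test run is a small pre-flight check. The helper below is a hypothetical sketch (the name `missing_env_vars` is not part of this project); it only reports which of the names you pass in are unset or empty.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(
    required: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; passing a mapping makes
    the helper easy to test without touching os.environ.
    """
    source = os.environ if env is None else env
    return [name for name in required if not source.get(name)]


if __name__ == "__main__":
    # PROJECT_ID is the example variable named in the text above.
    missing = missing_env_vars(["PROJECT_ID"])
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```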
pytest tests/integration_tests/.py --cov=langchain --cov-report=html
start "" htmlcov/index.html || open htmlcov/index.html
Code coverage (i.e. the amount of code that is covered by unit tests) helps identify areas of the code that are potentially more or less brittle.
Coverage requires the dependencies for integration tests:
poetry install --with test_integration
To get a report of current coverage, run the following:
make coverage
For detailed information on how to contribute, see the LangChain contribution guide.