Python Client Library for Affinda Document Parser API

This is a Python client for the Affinda document parsing API. It wraps all available endpoints and handles authentication and signing. For additional details, refer to the full API documentation.

Installation

pip install affinda

API Version Compatibility

The Affinda API is currently on v3. Breaking changes in the API have required new releases of the client library, so check the table below to see which client versions are compatible with which API version.

Affinda API version	`affinda-python` versions
v2	0.1.0 - 3.x.x
v3	>= 4.x.x
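
As a rough illustration of the table above, the following sketch maps an installed client version to the API version it targets. The helper names `compatible_api_version` and `installed_api_version` are hypothetical and not part of the library; the sketch only assumes that the package is installed under the distribution name `affinda` and that its version string starts with a numeric major version.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def compatible_api_version(client_version: str) -> str:
    """Map a client version string to an API version, per the table above.

    Hypothetical helper: only the leading major version number is inspected.
    """
    major = int(client_version.split(".")[0])
    # 0.1.0 - 3.x.x target API v2; 4.x.x and later target API v3.
    return "v3" if major >= 4 else "v2"


def installed_api_version(package: str = "affinda") -> Optional[str]:
    """Return the API version for the installed client, or None if it is not installed."""
    try:
        return compatible_api_version(version(package))
    except PackageNotFoundError:
        return None


print(installed_api_version())
```

A check like this can be useful at application start-up to fail early when the installed client does not match the API version an account uses.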

Quickstart

If you don't have an API token, obtain one from affinda.com.

from pathlib import Path
from pprint import pprint

from affinda import AffindaAPI, TokenCredential
from affinda.models import WorkspaceCreate, CollectionCreate

token = "REPLACE_API_TOKEN"
file_pth = Path("PATH_TO_DOCUMENT.pdf")

credential = TokenCredential(token=token)
client = AffindaAPI(credential=credential)

# First, get your organisation. By default, your first one will have free credits.
my_organisation = client.get_all_organizations()[0]

# And within that organisation, create a workspace, for example for Recruitment:
workspace_body = WorkspaceCreate(
    organization=my_organisation.identifier,
    name="My Workspace",
)
recruitment_workspace = client.create_workspace(body=workspace_body)

# Finally, create a collection that will contain our uploaded documents, for example resumes, by selecting the
# appropriate extractor
collection_body = CollectionCreate(
    name="Resumes", workspace=recruitment_workspace.identifier, extractor="resume"
)
resume_collection = client.create_collection(body=collection_body)

# Now we can upload a resume for parsing
with open(file_pth, "rb") as f:
    resume = client.create_document(file=f, file_name=file_pth.name, collection=resume_collection.identifier)

pprint(resume.as_dict())
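
The quickstart above places the token directly in the script as a placeholder. One common alternative is to read it from an environment variable so that it stays out of source control. The sketch below is a minimal example of that approach; the variable name `AFFINDA_API_TOKEN` and the helper `load_api_token` are assumptions made for illustration and are not defined by the library.

```python
import os


def load_api_token(env_var: str = "AFFINDA_API_TOKEN") -> str:
    """Read the API token from an environment variable.

    The variable name is an assumption for this sketch, not a library convention.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"No API token found: set the {env_var} environment variable first."
        )
    return token
```

The returned string can then be passed to `TokenCredential(token=...)` in place of the hard-coded placeholder used in the quickstart.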