Split a core (python) API for the Archive out of the Ingester #3

@cnweaver

As we move toward implementing a REST API for the archive, I think it would be advantageous to have three clearly delineated components:

  • A core/common Python API for manipulating the archive. This would be intended to be called only from trusted code (which holds all necessary AWS/DB/etc. credentials) and would not perform any authentication or authorization checks. It would presumably consist mostly of the code currently in archiver/src/*_api.py, which is responsible for writing data to and fetching data from the data store while maintaining its internal invariants. I think this should be a Python module which would then be imported by the other two components.
  • An ingester service, which is built into a Docker image and deployed to continuously populate the archive. This would potentially be pretty small, as it would consist of roughly archiver/src/archive_ingest.py and supporting bits. It would delegate to the hop-client (as it already does) for accessing Kafka, and to the archive-core module for inserting data into the archive.
  • A REST API service, likewise packaged as a Docker image, which would provide a user-facing means to work with the archive (users would access it with tools like the hop-client). It would delegate to the scimma-admin API for AuthN/AuthZ information, and to the archive-core module for access to the archived data.
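
To make the division of responsibilities concrete, here is a minimal sketch of how the three components might fit together. Everything in it is hypothetical and for illustration only: `ArchiveCore`, `store`, `fetch`, `ingest`, `handle_fetch`, and the `is_authorized` callback are invented names, the in-memory dict stands in for the real data store, and none of this reflects the existing code in archiver/src.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional, Tuple


@dataclass
class ArchiveCore:
    """Hypothetical stand-in for the archive-core module.

    It performs no authentication or authorization; callers are trusted.
    """
    # In-memory dict used here in place of the real data store.
    _messages: Dict[str, bytes] = field(default_factory=dict)

    def store(self, payload: bytes) -> str:
        """Insert a message and return its newly assigned ID."""
        message_id = str(uuid.uuid4())
        self._messages[message_id] = payload
        return message_id

    def fetch(self, message_id: str) -> Optional[bytes]:
        """Return the stored payload, or None if the ID is unknown."""
        return self._messages.get(message_id)


def ingest(core: ArchiveCore, stream: Iterable[bytes]) -> List[str]:
    """Ingester sketch: trusted code that writes straight through the core.

    In the real service the stream would come from Kafka via hop-client.
    """
    return [core.store(payload) for payload in stream]


def handle_fetch(
    core: ArchiveCore,
    is_authorized: Callable[[str, str], bool],
    user: str,
    message_id: str,
) -> Tuple[int, Optional[bytes]]:
    """REST layer sketch: check authorization first, then delegate to core.

    `is_authorized` stands in for a lookup against the scimma-admin API.
    Returns an (HTTP status code, payload) pair.
    """
    if not is_authorized(user, message_id):
        return 403, None
    payload = core.fetch(message_id)
    if payload is None:
        return 404, None
    return 200, payload
```

The point of the sketch is only that authorization lives entirely in the REST layer, while the ingester and the REST service share one implementation of the storage logic.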

At a later date we could then build further services, like a graphical web frontend built on top of the REST API.

We could develop all of these archive-related components in this single repository, but personally I am in favor of the approach of placing each in a distinct repository, as we do with our other service code bases (e.g. hop-creds-sync separate from scimma-admin, hopbeat and gcn2hop separate from hop-client). In that approach, I would suggest moving the archive core code out of this repository, into a new repository. I think there would be no need to distribute it via PyPI or Conda, since it would be used only by us internally, but pip installing from a git tag seems like a simple way for code we write in other repositories to utilize it.
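
As a rough illustration of the git-tag approach, a dependent repository could declare the core module as a direct-reference requirement in the `name @ git+URL@tag` form that pip accepts. The sketch below only builds that string. The helper name `git_tag_requirement`, the package name `archive-core`, the repository URL, and the tag are all placeholders, since no such repository exists yet.

```python
def git_tag_requirement(package: str, repo_url: str, tag: str) -> str:
    """Build a PEP 508 direct-reference requirement pinned to a git tag.

    The result can be placed in requirements.txt or in a project's
    dependency list; pip will clone the repository at the given tag.
    """
    return f"{package} @ git+{repo_url}@{tag}"


# Placeholder values for illustration; the real repository and tag are undecided.
requirement = git_tag_requirement(
    "archive-core",
    "https://example.org/placeholder/archive-core.git",
    "v0.1.0",
)
print(requirement)
```

Pinning to a tag rather than a branch means the ingester and REST service each pick up changes to the core module only when their own requirement is bumped.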

Labels: enhancement (New feature or request)
