Releases: s7clarke10/tap-rest-api-msdk

Patching OAuth Authentication to align with SDK changes

04 Sep 22:32

A change to the Meltano SDK affected the OAuth authentication.

This release updates the tap and auth components to adjust to the changes in the Meltano SDK.

Updating sdk and dependencies

03 Sep 22:49
1d2450a

What's Changed
updated SDK and deps by @jlloyd-widen in Widen#56
Full Changelog: Widen/tap-rest-api-msdk@1.3.10...1.3.11

Use correct argument for page_size #52

02 Sep 05:55

This resolves a small bug with the simple_offset_paginator where the page_size positional argument was incorrectly set.
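The release note does not show the affected call, so the following is only a sketch of how this class of bug tends to arise. The class `OffsetPaginatorSketch` and its argument order are hypothetical and are not taken from the tap or the SDK.

```python
class OffsetPaginatorSketch:
    """Hypothetical offset paginator; not the tap's real class."""

    def __init__(self, start_value: int, page_size: int) -> None:
        self.start_value = start_value
        self.page_size = page_size

    def next_offset(self, current: int) -> int:
        # Advance by one page worth of records.
        return current + self.page_size


# Positional arguments bind by order, so swapping them goes unnoticed:
# here 25 becomes the start value and 0 becomes the page size.
wrong = OffsetPaginatorSketch(25, 0)

# Keyword arguments make the intent explicit and order-independent.
right = OffsetPaginatorSketch(page_size=25, start_value=0)
```

Passing `page_size` by keyword is one way to keep a later signature change from silently rebinding the value.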

1.3.9 - SimpleOffsetPaginator and Drop Python 3.7 add 3.12

24 Jun 00:43
f4eeb54

Two features have been added from the upstream fork.

  • add SimpleOffsetPaginator (Widen#48)

  • Support Python 3.12 and drop support for EOL Python 3.7

New Features from upstream

15 Jun 04:00

Adding the following features

  • Pagination offset starting counter is configurable
  • If JSON Path next token is set, this is the default paginator
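The two features above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tap's implementation: the function names and the `single_page_paginator` fallback are hypothetical, while `pagination_request_style` and `next_page_token_path` are the setting names used elsewhere in these notes.

```python
from typing import Any, Dict, List


def choose_request_style(config: Dict[str, Any]) -> str:
    """Pick a paginator style, defaulting to JSONPath when a token path is set."""
    style = config.get("pagination_request_style")
    if style:
        return style
    if config.get("next_page_token_path"):
        return "jsonpath_paginator"
    # Hypothetical fallback for this sketch only.
    return "single_page_paginator"


def page_offsets(start: int, page_size: int, total: int) -> List[int]:
    """List the offsets to request when the counter starts at `start`."""
    return list(range(start, start + total, page_size))
```

For example, some APIs count offsets from 0 and others from 1; `page_offsets(1, 5, 10)` yields the offsets for the second kind.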

Add Rate Limit Logic and Cache Authenticator

06 Nov 21:51

This PR contains three features to deliver required functionality for a new API.

  1. It contains enhanced logic in the tap discovery to cache credentials to avoid having to re-authenticate for each stream. The API was erroring due to too many OAuth requests in quick succession.
  2. Optional backoff logic has been built in, allowing tap-rest-api-msdk to respond to HTTP Retry-After messages. This is configurable to use either the header or message responses. There is also a new setting which adds some additional time, because sometimes the requested wait time is not long enough. This feature was built in to meet the API's backoff requirements.
  3. Optionally provides the ability to store the whole raw message. The feature is enabled by setting store_raw_json_message to true. This is useful if you wish to offload the flattening functionality to the likes of dbt. Where I have used this feature I tend to select only the primary key, the replication key, and the _sdc_raw_json field.
    The use case for this was a dynamic schema with optional fields/columns which were not available in every record. In this situation the schema discovery did not pick up every field, leading to missing data, so it was decided that storing the raw JSON record was important to ensure all data is preserved.
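The backoff behaviour in point 2 can be sketched as below. This is a simplified illustration, not the tap's code: the function name, the `source` switch, the `retry_after` body field and the default wait are all assumptions made for the example.

```python
from typing import Any, Dict


def wait_seconds(
    headers: Dict[str, str],
    body: Dict[str, Any],
    source: str = "header",
    extra_seconds: float = 0.0,
    default: float = 1.0,
) -> float:
    """Return how long to sleep before retrying a rate-limited request."""
    if source == "header":
        raw = headers.get("Retry-After")
    else:
        # Hypothetical field name for APIs that report the wait in the body.
        raw = body.get("retry_after")
    try:
        base = float(raw)
    except (TypeError, ValueError):
        # Missing value, or an HTTP-date form this sketch does not parse.
        base = default
    # Pad the wait in case the requested time proves too short.
    return base + extra_seconds
```

Note that a Retry-After header may also carry an HTTP date rather than a number of seconds; the sketch falls back to the default wait in that case.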

Points 2 and 3 are optional, so the tap's default behaviour does not change. For Point 1, caching credentials will speed up discovery and ingestion.
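The credential caching in point 1 can be pictured with a small sketch. The `TokenCache` class and the `fetch_token` callable are hypothetical names for this example; the tap's actual caching logic may differ.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuse one token across streams until it expires (illustrative only)."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token  # returns (token, lifetime in seconds)
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Only call the auth endpoint when there is no valid cached token.
        if self._token is None or self._clock() >= self._expires_at:
            token, lifetime = self._fetch_token()
            self._token = token
            self._expires_at = self._clock() + lifetime
        return self._token
```

Sharing one such cache across all streams means discovery triggers a single authentication request instead of one per stream.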

Resolving pagination bug and dependabot securities

04 Aug 21:10
b73fe3e

This release resolves a bug affecting offset pagination. It also fully addresses one of two recent Dependabot security vulnerabilities and partially fixes the other.

Bug Fix:

  • Removing the defaulting of pagination_page_size to 0. This is an optional parameter.
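The release note does not describe the failure mode, so the following only illustrates the general idea of treating the page size as truly optional. The function name and the `offset` and `limit` parameter names are assumptions for this sketch.

```python
from typing import Dict, Optional


def build_params(
    offset: int,
    page_size: Optional[int] = None,
    limit_param: str = "limit",
) -> Dict[str, int]:
    """Build query parameters, sending a limit only when one is configured."""
    params = {"offset": offset}
    if page_size is not None:
        # An unset page size is omitted rather than sent as 0.
        params[limit_param] = page_size
    return params
```

With `None` as the default, the API's own page size applies unless the user sets one.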

Security Fixes:

  • pyca/cryptography's wheels include vulnerable OpenSSL
  • ReDoS in py library when used with subversion

The second security issue has been partially resolved by a pytest version upgrade; however, it also needs an upgrade of tox to be fully resolved. Currently a Meltano SDK dependency prevents tox from being bumped to a higher version. When the SDK is updated, a bump of tox will fully resolve the "ReDoS in py library when used with subversion" issue.

Syncing tap-rest-api-msdk with upstream repo

27 Jul 04:55
8b9e3dc

This release brings the tap in line with the upstream repository now that the PR to merge our changes into the main repo has been accepted.

Release includes

  • library dependencies
  • linting of the code
  • resolution to Dependabot issues

Adding SDK support for Pagination and Authentication.

25 Jul 01:59
adb41b5

This PR introduces new Authenticators and Paginators to tap-rest-api-msdk (it is a refactored approach to previous PRs). With this feature there is greater support for a range of APIs, making this tap the Swiss Army knife for accessing APIs.

Summary

  • Support for most Meltano SDK Authenticators.
  • Support for all Meltano Paginators.
  • Flexibility to support many new APIs through new settings that adjust request parameter names. See README.md for more details on settings.
  • Ability to send parameters in the request body rather than request parameters (if required).
  • Moving from deprecated get_next_page_token to support get_new_paginator. This removes the warnings in the logs.
  • Enhanced incremental replication (include support for API query templates).
  • New modules auth and pagination keeping a clean design.
  • New auth method aws, to support ingestion from AWS REST endpoints, e.g. OpenSearch.
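The option to send parameters in the request body rather than as request parameters can be pictured as follows. This is a hedged sketch: `request_kwargs` and `send_in_body` are hypothetical names, and the returned keys simply mirror the `params` and `json` keyword arguments accepted by the `requests` library.

```python
from typing import Any, Dict


def request_kwargs(params: Dict[str, Any], send_in_body: bool = False) -> Dict[str, Any]:
    """Place the same parameters in the query string or in a JSON body."""
    if send_in_body:
        # Serialised as a JSON payload by the HTTP client.
        return {"json": dict(params)}
    # Encoded into the URL query string by the HTTP client.
    return {"params": dict(params)}
```

The result could be unpacked into a call such as `requests.request(method, url, **request_kwargs(...))`.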

Paginators

Each REST API is different. This PR builds on the concept of picking an appropriate request and response style for the API. Select an appropriate pagination_request_style to pick the paginator you require. In most cases this needs to be coupled with an appropriate paginator_response_style to process the response and pick the next page location in the body or headers.

Supported Paginators as part of this PR include:

  • jsonpath_paginator or default - This style obtains the token for the next page from a specific location in the response body via JSONPath notation. In many situations the jsonpath_paginator is a more appropriate paginator than the hateoas_paginator.
    • next_page_token_path - The jsonpath to next page token. Example: "$['@odata.nextLink']", this locates the token returned via the Microsoft Graph API. Default '$.next_page' for the jsonpath_paginator paginator only otherwise None.
  • offset_paginator or style1 - This style uses URL parameters named offset and limit
    • offset is calculated from the previous response, or not set if there is no previous response
    • pagination_page_size - Sets a limit to number of records per page / response. Default 25 records.
    • pagination_limit_per_page_param - the name of the API parameter to limit number of records per page. Default parameter name limit.
    • pagination_total_limit_param - The name of the param that indicates the total limit e.g. total, count. Defaults to total
    • next_page_token_path - Used to locate an appropriate link in the response. Default None - but looks in the pagination section of the JSON response by default. Example, jsonpath to get the offset from the NOAA API '$.metadata.resultset'.
  • simple_header_paginator - This style uses links in the response headers to locate the next page. For example, the x-next-page link used by the GitLab API.
  • header_link_paginator - This style uses the default header link paginator from the Meltano SDK.
  • restapi_header_link_paginator - This style is a variant on the header_link_paginator. It supports the ability to read from GitHub API.
    • pagination_page_size - Sets a limit to number of records per page / response. Default 25 records.
    • pagination_limit_per_page_param - the name of the API parameter to limit number of records per page. Default parameter name per_page.
    • pagination_results_limit - Restricts the total number of records returned from the API. Default None i.e. no limit.
  • hateoas_paginator - This style parses the next_token response for the parameters to pass. It is used by APIs utilising the HATEOAS REST style, including FHIR APIs.
    • pagination_page_size - Sets a limit to number of records per page / response. Default None.
    • pagination_limit_per_page_param - the name of the API parameter to limit number of records per page e.g. _count for FHIR APIs. Default None.
  • single_page_paginator - A paginator that works with single-page endpoints.
  • page_number_paginator - Paginator class for APIs that use page number. Looks at the response link to determine more pages.
    • next_page_token_path - Used to locate an appropriate link in the response. Default "hasMore".
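The token-following styles in the list above share one loop: read a page, extract the next-page token from the response, and stop when none is returned. The sketch below illustrates that loop only. It uses a plain dictionary key lookup as a stand-in for JSONPath, and `iterate_pages`, `fetch_page` and the `records` key are hypothetical names, not part of the tap.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iterate_pages(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    next_key: str = "next_page",
) -> Iterator[Any]:
    """Yield records page by page, following the next-page token."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("records", [])
        # A missing or empty token means the last page was reached.
        token = page.get(next_key)
        if not token:
            break


# Fake two-page API used only to exercise the loop.
fake_pages = {
    None: {"records": [1, 2], "next_page": "p2"},
    "p2": {"records": [3]},
}
all_records = list(iterate_pages(fake_pages.get))
```

A real implementation would evaluate a JSONPath expression such as the configured next_page_token_path instead of the simple key lookup used here.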

Authentication

This PR introduces many additional forms of authentication that weren't possible with just headers in the request (for example OAuth).

The Meltano SDK introduced a number of authentication methods, which have been supported with this feature. The feature utilizes the available SDK Authenticators https://github.com/meltano/sdk/blob/main/singer_sdk/authenticators.py.

While new auth methods are supported, you can still pass authentication via headers by default for legacy support, so there are no breaking changes as a result. New supported authenticators:

  • oauth: for OAuth2 authentication
  • basic: Basic Header authentication - base64-encoded username + password config items
  • api_key: for API Keys in the header e.g. X-API-KEY.
  • bearer_token: for Bearer token authentication.
  • aws: for AWS authentication. Works with the aws_credentials parameter.

Please note that support for OAuthJWTAuthentication has not been developed.
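For the simpler header-based methods listed above, the resulting request headers look roughly like the sketch below. It is an illustration using only the standard library, not the SDK's authenticator classes; the function name and its keyword arguments are hypothetical.

```python
import base64
from typing import Dict


def auth_headers(
    method: str,
    *,
    username: str = "",
    password: str = "",
    key: str = "X-API-KEY",
    value: str = "",
    token: str = "",
) -> Dict[str, str]:
    """Build the HTTP header for a few header-based auth methods."""
    if method == "basic":
        # Basic auth is the base64 encoding of "username:password".
        encoded = base64.b64encode(f"{username}:{password}".encode()).decode()
        return {"Authorization": f"Basic {encoded}"}
    if method == "api_key":
        return {key: value}
    if method == "bearer_token":
        return {"Authorization": f"Bearer {token}"}
    raise ValueError(f"unsupported auth method in this sketch: {method}")
```

The oauth and aws methods involve token exchange and request signing respectively, so they are left out of this sketch.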

Other Changes:

  • Fixes to the meltano.yml kind / data types.
  • Updated meltano.yml with all the available parameters.
  • Adds a config.json.sample file for illustrating how to construct a config.json file when using the tap stand-alone for development purposes.
  • Documentation for new settings and examples of use against a number of APIs.

Note: I am aware that there are no supported API tests, as they are time consuming to build and test. I have, however, with my limited time tested against a variety of APIs available to me. Perhaps the faker Python package could help simulate tests for a variety of APIs and responses. This appears to be used by tap-dbt https://github.com/MeltanoLabs/tap-dbt/blob/main/tests/test_core.py