Ragie can call HTTP endpoints hosted on your servers as events occur in Ragie. You can configure one or more endpoints to be called. Webhook calls are retried for up to a day; further details can be found below. Each webhook call includes a signature of the request body in the X-Signature HTTP header so that the authenticity of the call can be validated. The signature is generated using a shared signing secret, which is available in the Ragie UI.

Setting up Webhook Endpoints

Webhook endpoints are managed in the Ragie app. In the app, click “Webhooks” in the main navigation, then click the “Add Endpoint” button. A form will be presented asking for a name for the endpoint and the URL to call when events occur. Fill out these values and click the create button. You may create multiple endpoints, and all active endpoints will be called when events occur.

Once endpoints exist they’ll be listed with their signing secret on the webhooks page. The signing secret can be used to validate the authenticity of the webhook call. More details on how to perform this validation are below.

Each webhook will also have actions to:

  • Test an endpoint by simulating an event
  • Delete the endpoint
  • Activate or deactivate the endpoint

Development tips

When developing the webhook handler, it can be helpful to expose your local development server to the internet. A tool like ngrok or the port forwarding feature in VS Code can be a convenient way to do this.

Retries

Ragie will make up to 18 attempts to call each endpoint for a given event. Attempts back off exponentially up to a maximum interval of 4 hours, and the last attempt is made roughly 24 hours after the initial attempt. A webhook call is considered successfully received if the called endpoint returns an HTTP status code >=200 and <300. Webhook handling should be idempotent; a nonce is provided to facilitate this. Ragie webhooks guarantee at-least-once delivery up to the retry limit. If all attempts are exhausted for a given webhook endpoint, the endpoint will be disabled and no delivery attempts will be made for future events unless it is re-enabled.

Ragie Webhook Events

All events include a nonce. The nonce can be used to enforce idempotency in your system and to protect against replay attacks. If you’ve already processed an event with a given nonce, any further events you receive with that nonce should be ignored.
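The nonce check described above can be sketched as follows. This is a minimal illustration, not Ragie's reference implementation: it assumes the parsed event is a dict with a top-level "nonce" key, and the names NonceStore and process_once are hypothetical. The in-memory set is only suitable for a single process; a real deployment would more likely use a durable store such as a database table with a unique constraint.

```python
import threading
from typing import Callable


class NonceStore:
    """In-memory record of nonces that have been claimed for processing."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self._lock = threading.Lock()

    def claim(self, nonce: str) -> bool:
        """Return True the first time a nonce is claimed, False on repeats."""
        with self._lock:
            if nonce in self._seen:
                return False
            self._seen.add(nonce)
            return True

    def release(self, nonce: str) -> None:
        """Forget a nonce so that a later retry can be processed."""
        with self._lock:
            self._seen.discard(nonce)


def process_once(
    event: dict, store: NonceStore, handler: Callable[[dict], None]
) -> bool:
    """Run handler(event) unless the event's nonce was already processed.

    Assumes the nonce is available as event["nonce"].
    """
    nonce = event["nonce"]
    if not store.claim(nonce):
        return False  # duplicate delivery or replay: ignore it
    try:
        handler(event)
    except Exception:
        # Processing failed, so allow a retried delivery to try again.
        store.release(nonce)
        raise
    return True
```

Releasing the nonce when the handler raises is a design choice: it lets a retried delivery be processed after a failure, at the cost of requiring the handler itself to tolerate partial work.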

document_status_updated

This event is dispatched when a document has finished processing and is in either the ready or failed state.

Payload fields

  • document_id
  • external_id
  • status
  • sync_id
  • partition
  • metadata

document_deleted

This event is dispatched when a document is deleted.

Payload fields

  • document_id
  • external_id
  • status
  • sync_id
  • partition
  • metadata

entity_extracted

This event is dispatched when entities are extracted from documents.

Payload fields

  • entity_id
  • document_id
  • instruction_id
  • document_metadata
  • document_external_id
  • partition
  • sync_id
  • data

connection_sync_started

Payload fields

  • connection_id
  • sync_id
  • connection_metadata
  • create_count
  • update_content_count
  • update_metadata_count
  • delete_count

connection_sync_progress

Payload fields

  • connection_id
  • sync_id
  • partition
  • connection_metadata
  • create_count
  • created_count
  • update_content_count
  • updated_content_count
  • update_metadata_count
  • updated_metadata_count
  • delete_count
  • deleted_count
  • errored_count
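As an illustration of how these counts might be combined, the sketch below estimates how far along a sync is. It rests on an assumption that the field names suggest but this page does not state: that create_count, update_content_count, update_metadata_count, and delete_count are the totals planned for the sync, while the created_count, updated_content_count, updated_metadata_count, deleted_count, and errored_count fields are the numbers handled so far. The function name sync_progress is hypothetical.

```python
def sync_progress(payload: dict) -> float:
    """Estimate the fraction of a sync that has been handled (0.0 to 1.0).

    Assumption: present-tense counts are planned totals and past-tense
    counts (plus errored_count) are items handled so far.
    """
    planned = (
        payload["create_count"]
        + payload["update_content_count"]
        + payload["update_metadata_count"]
        + payload["delete_count"]
    )
    handled = (
        payload["created_count"]
        + payload["updated_content_count"]
        + payload["updated_metadata_count"]
        + payload["deleted_count"]
        + payload["errored_count"]
    )
    if planned == 0:
        return 1.0  # nothing to do counts as complete
    return min(handled / planned, 1.0)
```

If the counts turn out to have different semantics, only the two sums need to change.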

connection_sync_finished

Payload fields

  • connection_id
  • sync_id
  • partition
  • connection_metadata
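To route the events listed above to separate handlers, a small dispatch table can help. The sketch below assumes the request body is JSON with a "type" key holding the event name and a "payload" key holding the fields listed for that event; this envelope shape is an assumption, so check it against a real delivery (for example, one sent with the endpoint's test action). The names HANDLERS, on, and dispatch are hypothetical.

```python
import json
from typing import Callable

Handler = Callable[[dict], None]

# Maps an event name to the function that handles its payload.
HANDLERS: dict[str, Handler] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one event type."""
    def register(func: Handler) -> Handler:
        HANDLERS[event_type] = func
        return func
    return register


@on("document_status_updated")
def handle_document_status(payload: dict) -> None:
    print("document", payload.get("document_id"), "is", payload.get("status"))


@on("connection_sync_finished")
def handle_sync_finished(payload: dict) -> None:
    print("sync", payload.get("sync_id"), "finished")


def dispatch(raw_body: bytes) -> bool:
    """Parse the body and call the matching handler.

    Assumes an envelope of the form {"type": ..., "payload": {...}}.
    Returns False for event types with no registered handler.
    """
    event = json.loads(raw_body)
    handler = HANDLERS.get(event.get("type"))
    if handler is None:
        return False
    handler(event.get("payload", {}))
    return True
```

Returning False for unknown types, rather than raising, lets the endpoint still acknowledge events it does not care about. Parse the body only after the signature has been validated against the raw bytes.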

If there is a Ragie event you’d like to subscribe to via webhooks that’s not listed here, join our Discord server and post in feature requests.

Validating signature

Ragie generates a signature using the signing secret for the webhook endpoint. This GitHub repository demonstrates how to validate the signature in Python: https://github.com/ragieai/python-webhook-example

If you’re using another language, the steps to implement signature validation yourself are described below.

Instructions for Validating a Webhook Signature

  • Retrieve the Signature from the Request Header
    • Action: Extract the signature from the X-Signature header of the incoming HTTP request.
    • Error Handling: If the X-Signature header is missing, reject the request (e.g., respond with HTTP 400 Bad Request).
  • Obtain the Raw Request Body
    • Action: Read the raw bytes of the request body exactly as received.
    • Note: Do not modify or parse the payload before validation.
  • Compute the Expected Signature
    • Action: Generate an HMAC SHA-256 signature using the shared secret key and the raw request body.
    • Compute the HMAC SHA-256 digest with:
      • Key: The encoded secret key.
      • Message: The raw request body.
    • Convert the digest to a hexadecimal string to get the expected signature.
  • Compare the Signatures Securely
    • Action: Use a constant-time comparison function to compare the expected signature with the received signature.
    • Reason: Prevents timing attacks by ensuring the comparison takes the same amount of time regardless of where the first difference occurs.
  • Validate or Reject the Request
    • If Signatures Match: Proceed to process the webhook payload.
    • If Signatures Do Not Match: Reject the request (e.g., respond with HTTP 401 Unauthorized).
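The steps above can be sketched using only the Python standard library. This is a minimal illustration rather than a copy of the linked example repository; the function names expected_signature and is_valid_signature are hypothetical, and the secret is assumed to be UTF-8 encoded before use as the HMAC key.

```python
import hashlib
import hmac
from typing import Optional


def expected_signature(secret: str, raw_body: bytes) -> str:
    """Return the hex-encoded HMAC SHA-256 of the raw body.

    Assumes the signing secret is encoded as UTF-8 to form the key.
    """
    key = secret.encode("utf-8")
    return hmac.new(key, raw_body, hashlib.sha256).hexdigest()


def is_valid_signature(
    secret: str, raw_body: bytes, received: Optional[str]
) -> bool:
    """Check the X-Signature header value against the raw request body."""
    if not received:
        return False  # missing header: the caller should reject the request
    expected = expected_signature(secret, raw_body)
    # compare_digest runs in constant time; comparing bytes avoids a
    # TypeError if the header contains non-ASCII characters.
    return hmac.compare_digest(
        expected.encode("utf-8"), received.encode("utf-8")
    )
```

Pass the body bytes exactly as read from the request. Re-serializing parsed JSON can change whitespace or key order and produce a different digest.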

Additional Notes:

  • Secret Signing Key Security:
    • Store the shared secret key securely (e.g., in environment variables or a secure vault).
    • Do not expose the secret key in logs or error messages.
  • Character Encoding:
    • Ensure consistent encoding (UTF-8) for the secret key when generating the HMAC.
  • Avoid Logging Sensitive Data:
    • Do not log the raw request body or signatures.
  • Use Trusted Libraries:
    • Utilize standard or well-maintained cryptographic libraries for HMAC computation and constant-time comparison.
  • Error Handling:
    • Provide generic error messages to avoid revealing sensitive information to potential attackers.
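The notes above can be combined into a framework-agnostic request check, sketched below. The environment variable name RAGIE_WEBHOOK_SECRET and the function handle_request are hypothetical, and the headers mapping is assumed to expose the header under the exact name X-Signature (many frameworks provide a case-insensitive mapping). The function returns only a status code and a generic message, and logs neither the body nor the signatures.

```python
import hashlib
import hmac
import os
from typing import Mapping, Tuple

# Hypothetical variable name; the secret is read from the environment
# rather than being hard-coded.
SECRET_ENV_VAR = "RAGIE_WEBHOOK_SECRET"


def handle_request(
    headers: Mapping[str, str], raw_body: bytes
) -> Tuple[int, str]:
    """Validate a webhook request and return (status code, generic message)."""
    secret = os.environ.get(SECRET_ENV_VAR)
    if not secret:
        # Misconfiguration on our side; reveal nothing specific.
        return 500, "Internal Server Error"

    received = headers.get("X-Signature")
    if not received:
        return 400, "Bad Request"

    expected = hmac.new(
        secret.encode("utf-8"), raw_body, hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(
        expected.encode("utf-8"), received.encode("utf-8")
    ):
        return 401, "Unauthorized"

    # Signature is valid: parse and process raw_body here.
    return 200, "OK"
```

A web framework route would read the raw body and headers, call this function, and send back the returned status code and message unchanged.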