Revert "Implement webhook system for AI events notifications"

dawnsystem 2025-11-14 16:40:31 +01:00 committed by GitHub
parent f00424f7c4
commit 04ced421b8
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
8 changed files with 0 additions and 1312 deletions


@@ -1,443 +0,0 @@
# AI Webhooks System - IntelliDocs
## Overview
The AI Webhooks system provides real-time notifications for AI events in IntelliDocs. This allows external systems to be notified when the AI performs important actions, enabling integration with workflow automation tools, monitoring systems, and custom applications.
## Features
- **Event Tracking**: Comprehensive logging of all webhook events
- **Retry Logic**: Exponential backoff for failed webhook deliveries
- **Configurable**: Multiple webhook endpoints with different configurations
- **Secure**: Optional HMAC signature validation
- **Robust**: Graceful degradation if webhook delivery fails
## Supported Events
### 1. Deletion Request Created (`deletion_request_created`)
Triggered when the AI creates a deletion request that requires user approval.
**Payload Example:**
```json
{
"event_type": "deletion_request_created",
"timestamp": "2025-11-14T15:00:00Z",
"source": "intellidocs-ai",
"deletion_request": {
"id": 123,
"status": "pending",
"ai_reason": "Duplicate document detected...",
"document_count": 3,
"documents": [
{
"id": 456,
"title": "Invoice 2023-001",
"created": "2023-01-15T10:30:00Z",
"correspondent": "Acme Corp",
"document_type": "Invoice"
}
],
"impact_summary": {
"document_count": 3,
"affected_tags": ["invoices", "2023"],
"affected_correspondents": ["Acme Corp"],
"date_range": {
"earliest": "2023-01-15",
"latest": "2023-03-20"
}
},
"created_at": "2025-11-14T15:00:00Z"
},
"user": {
"id": 1,
"username": "admin"
}
}
```
### 2. Suggestion Auto Applied (`suggestion_auto_applied`)
Triggered when the AI automatically applies suggestions with high confidence (≥80%).
**Payload Example:**
```json
{
"event_type": "suggestion_auto_applied",
"timestamp": "2025-11-14T15:00:00Z",
"source": "intellidocs-ai",
"document": {
"id": 789,
"title": "Contract 2025-A",
"created": "2025-11-14T14:30:00Z",
"correspondent": "TechCorp",
"document_type": "Contract",
"tags": ["contracts", "2025", "legal"]
},
"applied_suggestions": {
"tags": [
{"id": 10, "name": "contracts"},
{"id": 25, "name": "legal"}
],
"correspondent": {
"id": 5,
"name": "TechCorp"
},
"document_type": {
"id": 3,
"name": "Contract"
}
},
"auto_applied": true
}
```
### 3. AI Scan Completed (`scan_completed`)
Triggered when an AI scan of a document is completed.
**Payload Example:**
```json
{
"event_type": "scan_completed",
"timestamp": "2025-11-14T15:00:00Z",
"source": "intellidocs-ai",
"document": {
"id": 999,
"title": "Report Q4 2025",
"created": "2025-11-14T14:45:00Z",
"correspondent": "Finance Dept",
"document_type": "Report"
},
"scan_summary": {
"auto_applied_count": 3,
"suggestions_count": 2,
"has_tags_suggestions": true,
"has_correspondent_suggestion": true,
"has_type_suggestion": true,
"has_storage_path_suggestion": false,
"has_custom_fields": true,
"has_workflow_suggestions": false
},
"scan_completed_at": "2025-11-14T15:00:00Z"
}
```
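All three payload examples above share the `event_type`, `timestamp`, and `source` fields. A receiver may want to validate that envelope before dispatching on the event type. The following is a minimal sketch of one way to do that; `Envelope`, `parse_envelope`, and `KNOWN_EVENTS` are illustrative names and are not part of IntelliDocs.

```python
from dataclasses import dataclass
from datetime import datetime

# Event types listed under "Supported Events"
KNOWN_EVENTS = {
    "deletion_request_created",
    "suggestion_auto_applied",
    "scan_completed",
}


@dataclass(frozen=True)
class Envelope:
    event_type: str
    timestamp: datetime
    source: str
    known: bool  # False for event types this receiver does not recognize


def parse_envelope(payload: dict) -> Envelope:
    """Extract the fields common to every webhook payload."""
    event_type = payload["event_type"]
    # Accept both a trailing "Z" and an explicit "+00:00" offset;
    # datetime.fromisoformat() only understands "Z" from Python 3.11 on.
    raw_ts = payload["timestamp"].replace("Z", "+00:00")
    return Envelope(
        event_type=event_type,
        timestamp=datetime.fromisoformat(raw_ts),
        source=payload["source"],
        known=event_type in KNOWN_EVENTS,
    )
```

Unknown event types are flagged rather than rejected here, since the generic sender accepts arbitrary event names.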
## Configuration
### Environment Variables
Add these settings to your environment or `paperless.conf`:
```bash
# Enable AI webhooks (disabled by default)
PAPERLESS_AI_WEBHOOKS_ENABLED=true
# Maximum retry attempts for failed webhooks (default: 3)
PAPERLESS_AI_WEBHOOKS_MAX_RETRIES=3
# Initial retry delay in seconds (default: 60)
# Increases exponentially: 60s, 120s, 240s...
PAPERLESS_AI_WEBHOOKS_RETRY_DELAY=60
# Request timeout in seconds (default: 10)
PAPERLESS_AI_WEBHOOKS_TIMEOUT=10
```
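To make the defaults concrete, here is a small sketch that reads the four variables and falls back to the documented defaults. It is only an illustration: `read_webhook_settings` is a hypothetical helper, and the set of strings treated as "true" is an assumption, not how the IntelliDocs settings module is confirmed to parse them.

```python
import os


def read_webhook_settings(env=None):
    """Read the four documented variables, using the documented defaults."""
    env = os.environ if env is None else env
    # Assumption: "1", "true" and "yes" (any case) enable the feature
    enabled = env.get("PAPERLESS_AI_WEBHOOKS_ENABLED", "false").strip().lower()
    return {
        "enabled": enabled in ("1", "true", "yes"),
        "max_retries": int(env.get("PAPERLESS_AI_WEBHOOKS_MAX_RETRIES", "3")),
        "retry_delay": int(env.get("PAPERLESS_AI_WEBHOOKS_RETRY_DELAY", "60")),
        "timeout": int(env.get("PAPERLESS_AI_WEBHOOKS_TIMEOUT", "10")),
    }
```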
### Django Admin Configuration
1. Navigate to **Admin** → **AI webhook configurations**
2. Click **Add AI webhook configuration**
3. Fill in the form:
   - **Name**: Friendly name (e.g., "Slack Notifications")
   - **Enabled**: Check to activate
   - **URL**: Webhook endpoint URL
   - **Events**: List of event types (leave empty for all events)
   - **Headers**: Optional custom headers (JSON format)
   - **Secret**: Optional secret key for HMAC signing
   - **Max retries**: Number of retry attempts (default: 3)
   - **Retry delay**: Initial delay in seconds (default: 60)
   - **Timeout**: Request timeout in seconds (default: 10)
**Example Configuration:**
```json
{
"name": "Slack AI Notifications",
"enabled": true,
"url": "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
"events": ["deletion_request_created", "suggestion_auto_applied"],
"headers": {
"Content-Type": "application/json"
},
"secret": "your-secret-key-here",
"max_retries": 3,
"retry_delay": 60,
"timeout": 10
}
```
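The `enabled` and `events` fields together decide whether a configuration receives a given event: an empty list means "all events". The rule can be sketched as a plain function, mirroring the `should_send_event` method of the configuration model; the standalone signature below is only for illustration.

```python
def should_send_event(enabled: bool, events: list, event_type: str) -> bool:
    """Return True if a webhook configuration should receive event_type."""
    # A disabled configuration never receives anything; an empty
    # events list acts as a wildcard for every event type.
    return enabled and (not events or event_type in events)


# The example configuration subscribes to two of the three event types
events = ["deletion_request_created", "suggestion_auto_applied"]
print(should_send_event(True, events, "scan_completed"))
print(should_send_event(True, [], "scan_completed"))
```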
## Security
### URL Validation
Webhooks use the same security validation as the existing workflow webhook system:
- Only allowed URL schemes (http, https by default)
- Port restrictions if configured
- Optional internal request blocking
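A minimal sketch of the scheme and port checks, assuming the allowed schemes and ports are passed in directly rather than read from Django settings. `is_allowed_webhook_url` is an illustrative name, and the sketch does not attempt internal request blocking.

```python
from urllib.parse import urlparse


def is_allowed_webhook_url(url, allowed_schemes=("http", "https"), allowed_ports=()):
    """Check scheme, hostname and (optionally) port of a webhook URL."""
    try:
        parsed = urlparse(url)
        port = parsed.port  # raises ValueError for malformed or out-of-range ports
    except ValueError:
        return False
    scheme = parsed.scheme.lower()
    if scheme not in allowed_schemes or not parsed.hostname:
        return False
    if port is None:
        # No explicit port: fall back to the scheme's default
        port = 443 if scheme == "https" else 80
    # An empty allowed_ports means "no port restriction configured"
    if allowed_ports and port not in allowed_ports:
        return False
    return True
```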
### HMAC Signature Verification
If a secret is configured, webhooks include an HMAC signature in the `X-IntelliDocs-Signature` header.
**Verification Example (Python):**
```python
import hashlib
import hmac
import json


def verify_webhook(payload, signature, secret):
    """Verify the HMAC-SHA256 signature of a webhook payload."""
    if not signature:
        # Header missing: treat as unverified instead of raising TypeError
        return False
    # Must match the sender's serialization (sorted keys, default separators)
    payload_str = json.dumps(payload, sort_keys=True)
    expected = hmac.new(
        secret.encode('utf-8'),
        payload_str.encode('utf-8'),
        hashlib.sha256,
    ).hexdigest()
    # Signature format: "sha256={hash}"
    expected_sig = f"sha256={expected}"
    return hmac.compare_digest(expected_sig, signature)


# Usage (inside a Flask-style request handler)
secret = "your-secret-key"
signature = request.headers.get('X-IntelliDocs-Signature')
payload = request.json
if verify_webhook(payload, signature, secret):
    print("Webhook verified!")
else:
    print("Invalid signature!")
```
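For testing a receiver locally it can help to produce a signature the same way the sender does. The sketch below follows the signing scheme described here (HMAC-SHA256 over the payload serialized with sorted keys, prefixed with `sha256=`); the function names are illustrative.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, secret: str) -> str:
    """Build the value expected in the X-IntelliDocs-Signature header."""
    # sort_keys=True makes the signature independent of key order
    body = json.dumps(payload, sort_keys=True)
    digest = hmac.new(
        secret.encode("utf-8"),
        body.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    return f"sha256={digest}"


def signatures_match(payload: dict, header_value: str, secret: str) -> bool:
    """Constant-time comparison of the received and recomputed signatures."""
    return hmac.compare_digest(sign_payload(payload, secret), header_value)
```

Because the signature covers a re-serialized form of the payload rather than the raw request body, the receiver has to parse the JSON and serialize it again with the same options before comparing.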
## Retry Logic
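Failed webhooks are automatically retried with exponential backoff. Each delay is measured from the previous failed attempt:
1. **Attempt 1**: Immediate
2. **Attempt 2**: After `retry_delay` seconds (default: 60s)
3. **Attempt 3**: After `retry_delay * 2` seconds (default: 120s)
4. **Attempt 4**: After `retry_delay * 4` seconds (default: 240s); only reached when `max_retries` is 4 or higher
The delivery task schedules another attempt only while the attempt number is below `max_retries`, so the default of 3 yields three attempts in total. After the final attempt, the webhook is marked as failed and logged.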
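The schedule can be reproduced with a few lines of arithmetic. This sketch assumes the `attempt < max_retries` check and the `retry_delay * 2 ** (attempt - 1)` formula used by the delivery task; `retry_schedule` itself is an illustrative helper, not part of the module.

```python
def retry_schedule(retry_delay: int = 60, max_retries: int = 3) -> list:
    """Return the wait in seconds before each delivery attempt."""
    delays = [0]  # attempt 1 is sent immediately
    attempt = 1
    # A retry is scheduled only while the attempt number is below max_retries
    while attempt < max_retries:
        delays.append(retry_delay * 2 ** (attempt - 1))
        attempt += 1
    return delays


print(retry_schedule(60, 4))  # waits before attempts 1 to 4
```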
## Monitoring
### Admin Interface
View webhook delivery status in **Admin** → **AI webhook events**:
- **Event Type**: Type of AI event
- **Status**: pending, success, failed, retrying
- **Attempts**: Number of delivery attempts
- **Response**: HTTP status code and response body
- **Error Message**: Details if delivery failed
### Logging
All webhook activity is logged to `paperless.ai_webhooks`:
```python
import logging
logger = logging.getLogger("paperless.ai_webhooks")
```
**Log Levels:**
- `INFO`: Successful deliveries
- `WARNING`: Failed deliveries being retried
- `ERROR`: Permanent failures after max retries
- `DEBUG`: Detailed webhook activity
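To see the `DEBUG` messages while troubleshooting, the logger's level can be raised explicitly. The following is a standalone sketch using the standard library's `dictConfig`; how IntelliDocs wires this logger into its own logging settings is not shown here, so the handler setup is an assumption.

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        # Logger name taken from the section above
        "paperless.ai_webhooks": {
            "handlers": ["console"],
            "level": "DEBUG",
            "propagate": False,
        },
    },
}

logging.config.dictConfig(LOGGING)
logging.getLogger("paperless.ai_webhooks").debug("webhook debug logging enabled")
```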
## Integration Examples
### Slack
Create a Slack app with incoming webhooks and use the webhook URL:
```json
{
"name": "Slack Notifications",
"url": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXX",
"events": ["deletion_request_created"]
}
```
### Discord
Use Discord's webhook feature:
```json
{
"name": "Discord Notifications",
"url": "https://discord.com/api/webhooks/123456789/abcdefg",
"events": ["suggestion_auto_applied", "scan_completed"]
}
```
### Custom HTTP Endpoint
Create your own webhook receiver:
```python
from flask import Flask, request, jsonify

app = Flask(__name__)


@app.route('/webhook', methods=['POST'])
def handle_webhook():
    event = request.json
    event_type = event.get('event_type')

    if event_type == 'deletion_request_created':
        # Handle deletion request
        deletion_request = event['deletion_request']
        print(f"Deletion request {deletion_request['id']} created")
    elif event_type == 'suggestion_auto_applied':
        # Handle auto-applied suggestion
        document = event['document']
        print(f"Suggestions applied to document {document['id']}")
    elif event_type == 'scan_completed':
        # Handle scan completion
        scan_summary = event['scan_summary']
        print(f"Scan completed: {scan_summary}")

    return jsonify({'status': 'success'}), 200


if __name__ == '__main__':
    app.run(port=5000)
```
## Troubleshooting
### Webhooks Not Being Sent
1. Check `PAPERLESS_AI_WEBHOOKS_ENABLED=true` in settings
2. Verify webhook configuration is enabled in admin
3. Check that events list includes the event type (or is empty for all events)
4. Review logs for errors: `grep "ai_webhooks" /path/to/paperless.log`
### Failed Deliveries
1. Check webhook event status in admin
2. Review error message and response code
3. Verify endpoint URL is accessible
4. Check firewall/network settings
5. Verify HMAC signature if using secrets
### High Retry Count
1. Increase `PAPERLESS_AI_WEBHOOKS_TIMEOUT` if endpoint is slow
2. Increase `PAPERLESS_AI_WEBHOOKS_MAX_RETRIES` for unreliable networks
3. Check endpoint logs for errors
4. Consider using a message queue for reliability
## Database Models
### AIWebhookEvent
Tracks individual webhook delivery attempts.
**Fields:**
- `event_type`: Type of event
- `webhook_url`: Destination URL
- `payload`: Event data (JSON)
- `status`: pending/success/failed/retrying
- `attempts`: Number of delivery attempts
- `response_status_code`: HTTP response code
- `error_message`: Error details if failed
### AIWebhookConfig
Stores webhook endpoint configurations.
**Fields:**
- `name`: Configuration name
- `enabled`: Active status
- `url`: Webhook URL
- `events`: Filtered event types (empty = all)
- `headers`: Custom HTTP headers
- `secret`: HMAC signing key
- `max_retries`: Retry limit
- `retry_delay`: Initial retry delay
- `timeout`: Request timeout
## Performance Considerations
- Webhook delivery is **asynchronous** via Celery tasks
- Failed webhooks don't block document processing
- Event records are kept for auditing (consider periodic cleanup)
- Network failures are handled gracefully
## Best Practices
1. **Use HTTPS**: Always use HTTPS webhooks in production
2. **Validate Signatures**: Use HMAC signatures to verify authenticity
3. **Filter Events**: Only subscribe to needed events
4. **Monitor Failures**: Regularly check failed webhooks in admin
5. **Set Appropriate Timeouts**: Balance reliability vs. performance
6. **Test Endpoints**: Verify webhook receivers work before enabling
7. **Log Everything**: Keep comprehensive logs for debugging
## Migration
The webhook system requires database migration:
```bash
python manage.py migrate documents
```
This creates the `AIWebhookEvent` and `AIWebhookConfig` tables.
## API Reference
### Python API
```python
from documents.webhooks import (
    send_ai_webhook,
    send_deletion_request_webhook,
    send_suggestion_applied_webhook,
    send_scan_completed_webhook,
)

# Send generic webhook
send_ai_webhook('custom_event', {'data': 'value'})

# Send specific event webhooks (called automatically by AI scanner)
send_deletion_request_webhook(deletion_request)
send_suggestion_applied_webhook(document, suggestions, applied_fields)
send_scan_completed_webhook(document, scan_results, auto_count, suggest_count)
```
## Related Documentation
- [AI Scanner Implementation](./AI_SCANNER_IMPLEMENTATION.md)
- [AI Scanner Improvement Plan](./AI_SCANNER_IMPROVEMENT_PLAN.md)
- [API REST Endpoints](./GITHUB_ISSUES_TEMPLATE.md)
## Support
For issues or questions:
- GitHub Issues: [dawnsystem/IntelliDocs-ngx](https://github.com/dawnsystem/IntelliDocs-ngx/issues)
- Check logs: `paperless.ai_webhooks` logger
- Review admin interface for webhook event details
---
**Version**: 1.0
**Last Updated**: 2025-11-14
**Status**: Production Ready


@@ -16,7 +16,6 @@ from documents.models import ShareLink
 from documents.models import StoragePath
 from documents.models import Tag
 from documents.tasks import update_document_parent_tags
-from documents.webhooks import AIWebhookEvent, AIWebhookConfig
 
 if settings.AUDIT_LOG_ENABLED:
     from auditlog.admin import LogEntryAdmin
@@ -220,57 +219,6 @@ admin.site.register(ShareLink, ShareLinksAdmin)
 admin.site.register(CustomField, CustomFieldsAdmin)
 admin.site.register(CustomFieldInstance, CustomFieldInstancesAdmin)
 
-
-class AIWebhookEventAdmin(admin.ModelAdmin):
-    list_display = ("event_type", "webhook_url", "status", "attempts", "created_at", "completed_at")
-    list_filter = ("event_type", "status", "created_at")
-    search_fields = ("webhook_url", "error_message")
-    readonly_fields = ("event_type", "webhook_url", "payload", "created_at", "last_attempt_at",
-                       "response_status_code", "response_body", "error_message", "completed_at", "attempts")
-    ordering = ("-created_at",)
-
-    def has_add_permission(self, request):
-        # Webhook events are created automatically, not manually
-        return False
-
-    def has_change_permission(self, request, obj=None):
-        # Events are read-only
-        return False
-
-
-class AIWebhookConfigAdmin(admin.ModelAdmin):
-    list_display = ("name", "enabled", "url", "max_retries", "created_at")
-    list_filter = ("enabled", "created_at")
-    search_fields = ("name", "url")
-    readonly_fields = ("created_at", "updated_at")
-
-    fieldsets = (
-        ("Basic Information", {
-            "fields": ("name", "enabled", "url")
-        }),
-        ("Event Configuration", {
-            "fields": ("events",)
-        }),
-        ("Request Configuration", {
-            "fields": ("headers", "secret", "timeout")
-        }),
-        ("Retry Configuration", {
-            "fields": ("max_retries", "retry_delay")
-        }),
-        ("Metadata", {
-            "fields": ("created_by", "created_at", "updated_at"),
-            "classes": ("collapse",)
-        }),
-    )
-
-    def save_model(self, request, obj, form, change):
-        if not change:  # Only set created_by when creating
-            obj.created_by = request.user
-        super().save_model(request, obj, form, change)
-
-
-admin.site.register(AIWebhookEvent, AIWebhookEventAdmin)
-admin.site.register(AIWebhookConfig, AIWebhookConfigAdmin)
 
 if settings.AUDIT_LOG_ENABLED:
     class LogEntryAUDIT(LogEntryAdmin):


@@ -72,17 +72,6 @@ class AIDeletionManager:
             f"requiring approval from user {user.username}",
         )
 
-        # Send webhook notification about deletion request
-        try:
-            from documents.webhooks import send_deletion_request_webhook
-            send_deletion_request_webhook(request)
-        except Exception as webhook_error:
-            logger.warning(
-                f"Failed to send deletion request webhook: {webhook_error}",
-                exc_info=True,
-            )
-
-        # TODO: Send in-app notification to user about pending deletion request
         # TODO: Send notification to user about pending deletion request
         # This could be via email, in-app notification, or both


@@ -768,8 +768,6 @@ class AIDocumentScanner:
             "custom_fields": {},
         }
 
-        applied_fields = []  # Track which fields were auto-applied for webhook
-
         try:
             with transaction.atomic():
                 # Apply tags
@@ -778,7 +776,6 @@ class AIDocumentScanner:
                         tag = Tag.objects.get(pk=tag_id)
                         document.add_nested_tags([tag])
                         applied["tags"].append({"id": tag_id, "name": tag.name})
-                        applied_fields.append("tags")
                         logger.info(f"Auto-applied tag: {tag.name}")
                     elif confidence >= self.suggest_threshold:
                         tag = Tag.objects.get(pk=tag_id)
@@ -800,7 +797,6 @@ class AIDocumentScanner:
                             "id": corr_id,
                             "name": correspondent.name,
                         }
-                        applied_fields.append("correspondent")
                         logger.info(f"Auto-applied correspondent: {correspondent.name}")
                     elif confidence >= self.suggest_threshold:
                         correspondent = Correspondent.objects.get(pk=corr_id)
@@ -820,7 +816,6 @@ class AIDocumentScanner:
                             "id": type_id,
                             "name": doc_type.name,
                         }
-                        applied_fields.append("document_type")
                         logger.info(f"Auto-applied document type: {doc_type.name}")
                     elif confidence >= self.suggest_threshold:
                         doc_type = DocumentType.objects.get(pk=type_id)
@@ -840,7 +835,6 @@ class AIDocumentScanner:
                             "id": path_id,
                             "name": storage_path.name,
                         }
-                        applied_fields.append("storage_path")
                         logger.info(f"Auto-applied storage path: {storage_path.name}")
                     elif confidence >= self.suggest_threshold:
                         storage_path = StoragePath.objects.get(pk=path_id)
@@ -853,43 +847,6 @@ class AIDocumentScanner:
                 # Save document with changes
                 document.save()
 
-                # Send webhooks for auto-applied suggestions
-                if applied_fields:
-                    try:
-                        from documents.webhooks import send_suggestion_applied_webhook
-                        send_suggestion_applied_webhook(
-                            document,
-                            scan_result.to_dict(),
-                            applied_fields,
-                        )
-                    except Exception as webhook_error:
-                        logger.warning(
-                            f"Failed to send suggestion applied webhook: {webhook_error}",
-                            exc_info=True,
-                        )
-
-                # Send webhook for scan completion
-                try:
-                    from documents.webhooks import send_scan_completed_webhook
-                    auto_applied_count = len(applied_fields)
-                    suggestions_count = sum([
-                        len(suggestions.get("tags", [])),
-                        1 if suggestions.get("correspondent") else 0,
-                        1 if suggestions.get("document_type") else 0,
-                        1 if suggestions.get("storage_path") else 0,
-                    ])
-                    send_scan_completed_webhook(
-                        document,
-                        scan_result.to_dict(),
-                        auto_applied_count,
-                        suggestions_count,
-                    )
-                except Exception as webhook_error:
-                    logger.warning(
-                        f"Failed to send scan completed webhook: {webhook_error}",
-                        exc_info=True,
-                    )
-
         except Exception as e:
             logger.exception(f"Failed to apply scan results: {e}")


@@ -1,135 +0,0 @@
# Generated migration for AI Webhooks
from django.conf import settings
from django.db import migrations, models
import django.db.models.deletion
class Migration(migrations.Migration):
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
('documents', '1075_add_performance_indexes'),
]
operations = [
migrations.CreateModel(
name='AIWebhookEvent',
fields=[
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
('event_type', models.CharField(
choices=[
('deletion_request_created', 'Deletion Request Created'),
('suggestion_auto_applied', 'Suggestion Auto Applied'),
('scan_completed', 'AI Scan Completed')
],
help_text='Type of AI event that triggered this webhook',
max_length=50
)),
('created_at', models.DateTimeField(auto_now_add=True)),
('webhook_url', models.CharField(
help_text='URL where the webhook was sent',
max_length=512
)),
('payload', models.JSONField(help_text='Data sent in the webhook')),
('status', models.CharField(
choices=[
('pending', 'Pending'),
('success', 'Success'),
('failed', 'Failed'),
('retrying', 'Retrying')
],
default='pending',
max_length=20
)),
('attempts', models.PositiveIntegerField(
default=0,
help_text='Number of delivery attempts'
)),
('last_attempt_at', models.DateTimeField(blank=True, null=True)),
('response_status_code', models.PositiveIntegerField(blank=True, null=True)),
('response_body', models.TextField(blank=True)),
('error_message', models.TextField(
blank=True,
help_text='Error message if delivery failed'
)),
('completed_at', models.DateTimeField(blank=True, null=True)),
],
options={
'verbose_name': 'AI webhook event',
'verbose_name_plural': 'AI webhook events',
'ordering': ['-created_at'],
},
),
migrations.CreateModel(
name='AIWebhookConfig',
fields=[
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
('name', models.CharField(
help_text='Friendly name for this webhook configuration',
max_length=128,
unique=True
)),
('enabled', models.BooleanField(
default=True,
help_text='Whether this webhook is active'
)),
('url', models.CharField(
help_text='URL to send webhook notifications',
max_length=512
)),
('events', models.JSONField(
default=list,
help_text='List of event types this webhook should receive'
)),
('headers', models.JSONField(
blank=True,
default=dict,
help_text='Custom HTTP headers to include in webhook requests'
)),
('secret', models.CharField(
blank=True,
help_text='Secret key for signing webhook payloads (optional)',
max_length=256
)),
('max_retries', models.PositiveIntegerField(
default=3,
help_text='Maximum number of retry attempts'
)),
('retry_delay', models.PositiveIntegerField(
default=60,
help_text='Initial retry delay in seconds (will increase exponentially)'
)),
('timeout', models.PositiveIntegerField(
default=10,
help_text='Request timeout in seconds'
)),
('created_at', models.DateTimeField(auto_now_add=True)),
('updated_at', models.DateTimeField(auto_now=True)),
('created_by', models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name='ai_webhook_configs',
to=settings.AUTH_USER_MODEL
)),
],
options={
'verbose_name': 'AI webhook configuration',
'verbose_name_plural': 'AI webhook configurations',
'ordering': ['name'],
},
),
migrations.AddIndex(
model_name='aiwebhookevent',
index=models.Index(fields=['event_type', 'status'], name='documents_a_event_t_8de562_idx'),
),
migrations.AddIndex(
model_name='aiwebhookevent',
index=models.Index(fields=['created_at'], name='documents_a_created_a29f8c_idx'),
),
migrations.AddIndex(
model_name='aiwebhookevent',
index=models.Index(fields=['status'], name='documents_a_status_9b9c6f_idx'),
),
]


@@ -1847,7 +1847,3 @@ class AISuggestionFeedback(models.Model):
 
     def __str__(self):
         return f"{self.suggestion_type} suggestion for document {self.document_id} - {self.status}"
-
-
-# Import webhook models so Django recognizes them
-from documents.webhooks import AIWebhookEvent, AIWebhookConfig  # noqa: E402, F401


@@ -1,599 +0,0 @@
"""
AI Webhooks Module for IntelliDocs-ngx
This module provides a webhook system for notifying external systems about AI events.
It includes:
- Webhook configuration models
- Event tracking and logging
- Retry logic with exponential backoff
- Support for multiple webhook events
According to issue requirements:
- Webhook when AI creates deletion request
- Webhook when AI applies suggestion automatically
- Webhook when AI scan completes
- Configurable via settings
- Robust retry logic with exponential backoff
- Comprehensive logging
"""
from __future__ import annotations
import hashlib
import logging
from typing import TYPE_CHECKING, Any, Dict, Optional
from urllib.parse import urlparse
import httpx
from celery import shared_task
from django.conf import settings
from django.contrib.auth.models import User
from django.db import models
from django.utils import timezone
from django.utils.translation import gettext_lazy as _
if TYPE_CHECKING:
from documents.models import Document, DeletionRequest
logger = logging.getLogger("paperless.ai_webhooks")
class AIWebhookEvent(models.Model):
"""
Model to track AI webhook events and their delivery status.
Provides comprehensive logging of all webhook attempts for auditing
and troubleshooting purposes.
"""
# Event types
EVENT_DELETION_REQUEST_CREATED = 'deletion_request_created'
EVENT_SUGGESTION_AUTO_APPLIED = 'suggestion_auto_applied'
EVENT_SCAN_COMPLETED = 'scan_completed'
EVENT_TYPE_CHOICES = [
(EVENT_DELETION_REQUEST_CREATED, _('Deletion Request Created')),
(EVENT_SUGGESTION_AUTO_APPLIED, _('Suggestion Auto Applied')),
(EVENT_SCAN_COMPLETED, _('AI Scan Completed')),
]
# Event metadata
event_type = models.CharField(
max_length=50,
choices=EVENT_TYPE_CHOICES,
help_text=_("Type of AI event that triggered this webhook"),
)
created_at = models.DateTimeField(auto_now_add=True)
# Configuration used
webhook_url = models.CharField(
max_length=512,
help_text=_("URL where the webhook was sent"),
)
# Payload information
payload = models.JSONField(
help_text=_("Data sent in the webhook"),
)
# Delivery tracking
STATUS_PENDING = 'pending'
STATUS_SUCCESS = 'success'
STATUS_FAILED = 'failed'
STATUS_RETRYING = 'retrying'
STATUS_CHOICES = [
(STATUS_PENDING, _('Pending')),
(STATUS_SUCCESS, _('Success')),
(STATUS_FAILED, _('Failed')),
(STATUS_RETRYING, _('Retrying')),
]
status = models.CharField(
max_length=20,
choices=STATUS_CHOICES,
default=STATUS_PENDING,
)
attempts = models.PositiveIntegerField(
default=0,
help_text=_("Number of delivery attempts"),
)
last_attempt_at = models.DateTimeField(null=True, blank=True)
response_status_code = models.PositiveIntegerField(null=True, blank=True)
response_body = models.TextField(blank=True)
error_message = models.TextField(
blank=True,
help_text=_("Error message if delivery failed"),
)
completed_at = models.DateTimeField(null=True, blank=True)
class Meta:
ordering = ['-created_at']
verbose_name = _("AI webhook event")
verbose_name_plural = _("AI webhook events")
indexes = [
models.Index(fields=['event_type', 'status']),
models.Index(fields=['created_at']),
models.Index(fields=['status']),
]
def __str__(self):
return f"AI Webhook {self.event_type} - {self.status} ({self.attempts} attempts)"
class AIWebhookConfig(models.Model):
"""
Configuration model for AI webhooks.
Allows multiple webhook endpoints with different configurations
per event type.
"""
name = models.CharField(
max_length=128,
unique=True,
help_text=_("Friendly name for this webhook configuration"),
)
enabled = models.BooleanField(
default=True,
help_text=_("Whether this webhook is active"),
)
# Webhook destination
url = models.CharField(
max_length=512,
help_text=_("URL to send webhook notifications"),
)
# Event filters
events = models.JSONField(
default=list,
help_text=_("List of event types this webhook should receive"),
)
# Request configuration
headers = models.JSONField(
default=dict,
blank=True,
help_text=_("Custom HTTP headers to include in webhook requests"),
)
secret = models.CharField(
max_length=256,
blank=True,
help_text=_("Secret key for signing webhook payloads (optional)"),
)
# Retry configuration
max_retries = models.PositiveIntegerField(
default=3,
help_text=_("Maximum number of retry attempts"),
)
retry_delay = models.PositiveIntegerField(
default=60,
help_text=_("Initial retry delay in seconds (will increase exponentially)"),
)
timeout = models.PositiveIntegerField(
default=10,
help_text=_("Request timeout in seconds"),
)
# Metadata
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
created_by = models.ForeignKey(
User,
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name='ai_webhook_configs',
)
class Meta:
ordering = ['name']
verbose_name = _("AI webhook configuration")
verbose_name_plural = _("AI webhook configurations")
def __str__(self):
return f"{self.name} ({'enabled' if self.enabled else 'disabled'})"
def should_send_event(self, event_type: str) -> bool:
"""Check if this webhook should receive the given event type."""
return self.enabled and (not self.events or event_type in self.events)
def _validate_webhook_url(url: str) -> bool:
"""
Validate webhook URL for security.
Uses similar validation as existing webhook system in handlers.py
"""
try:
p = urlparse(url)
# Check scheme
allowed_schemes = getattr(settings, 'WEBHOOKS_ALLOWED_SCHEMES', ['http', 'https'])
if p.scheme.lower() not in allowed_schemes or not p.hostname:
logger.warning(f"AI Webhook blocked: invalid scheme/hostname for {url}")
return False
# Check port if configured
port = p.port or (443 if p.scheme == "https" else 80)
allowed_ports = getattr(settings, 'WEBHOOKS_ALLOWED_PORTS', [])
if allowed_ports and port not in allowed_ports:
logger.warning(f"AI Webhook blocked: port {port} not permitted for {url}")
return False
return True
except Exception as e:
logger.error(f"Error validating webhook URL {url}: {e}")
return False
def _sign_payload(payload: Dict[str, Any], secret: str) -> str:
"""
Create HMAC signature for webhook payload.
This allows receivers to verify the webhook came from our system.
"""
import hmac
import json
payload_str = json.dumps(payload, sort_keys=True)
signature = hmac.new(
secret.encode('utf-8'),
payload_str.encode('utf-8'),
hashlib.sha256
).hexdigest()
return f"sha256={signature}"
@shared_task(
bind=True,
max_retries=None, # We handle retries manually
autoretry_for=None,
)
def send_ai_webhook_task(
self,
webhook_event_id: int,
attempt: int = 1,
):
"""
Celery task to send AI webhook with retry logic.
Implements exponential backoff for retries.
"""
try:
event = AIWebhookEvent.objects.get(pk=webhook_event_id)
except AIWebhookEvent.DoesNotExist:
logger.error(f"AI Webhook event {webhook_event_id} not found")
return
# Get configuration
try:
config = AIWebhookConfig.objects.get(url=event.webhook_url, enabled=True)
except AIWebhookConfig.DoesNotExist:
# Use default settings if no config exists
max_retries = getattr(settings, 'PAPERLESS_AI_WEBHOOKS_MAX_RETRIES', 3)
retry_delay = getattr(settings, 'PAPERLESS_AI_WEBHOOKS_RETRY_DELAY', 60)
timeout = getattr(settings, 'PAPERLESS_AI_WEBHOOKS_TIMEOUT', 10)
headers = {}
secret = None
else:
max_retries = config.max_retries
retry_delay = config.retry_delay
timeout = config.timeout
headers = config.headers or {}
secret = config.secret
# Update attempt tracking
event.attempts = attempt
event.last_attempt_at = timezone.now()
event.status = AIWebhookEvent.STATUS_RETRYING if attempt > 1 else AIWebhookEvent.STATUS_PENDING
event.save()
# Prepare headers
request_headers = headers.copy()
    request_headers['Content-Type'] = 'application/json'
    request_headers['User-Agent'] = 'IntelliDocs-AI-Webhook/1.0'

    # Add signature if secret is configured
    if secret:
        signature = _sign_payload(event.payload, secret)
        request_headers['X-IntelliDocs-Signature'] = signature

    try:
        # Send webhook
        response = httpx.post(
            event.webhook_url,
            json=event.payload,
            headers=request_headers,
            timeout=timeout,
            follow_redirects=False,
        )

        # Update event with response
        event.response_status_code = response.status_code
        event.response_body = response.text[:1000]  # Limit stored response size

        # Check if successful (2xx status code)
        if 200 <= response.status_code < 300:
            event.status = AIWebhookEvent.STATUS_SUCCESS
            event.completed_at = timezone.now()
            event.save()
            logger.info(
                f"AI Webhook sent successfully to {event.webhook_url} "
                f"for {event.event_type} (attempt {attempt})"
            )
            return

        # Non-2xx response
        error_msg = f"HTTP {response.status_code}: {response.text[:200]}"
        event.error_message = error_msg

        # Retry if we haven't exceeded max attempts
        if attempt < max_retries:
            event.save()
            # Calculate exponential backoff delay
            delay = retry_delay * (2 ** (attempt - 1))
            logger.warning(
                f"AI Webhook to {event.webhook_url} failed with status {response.status_code}, "
                f"retrying in {delay}s (attempt {attempt}/{max_retries})"
            )
            # Schedule retry
            send_ai_webhook_task.apply_async(
                args=[webhook_event_id, attempt + 1],
                countdown=delay,
            )
        else:
            event.status = AIWebhookEvent.STATUS_FAILED
            event.completed_at = timezone.now()
            event.save()
            logger.error(
                f"AI Webhook to {event.webhook_url} failed after {max_retries} attempts: {error_msg}"
            )

    except Exception as e:
        error_msg = str(e)
        event.error_message = error_msg

        # Retry if we haven't exceeded max attempts
        if attempt < max_retries:
            event.save()
            # Calculate exponential backoff delay
            delay = retry_delay * (2 ** (attempt - 1))
            logger.warning(
                f"AI Webhook to {event.webhook_url} failed with error: {error_msg}, "
                f"retrying in {delay}s (attempt {attempt}/{max_retries})"
            )
            # Schedule retry
            send_ai_webhook_task.apply_async(
                args=[webhook_event_id, attempt + 1],
                countdown=delay,
            )
        else:
            event.status = AIWebhookEvent.STATUS_FAILED
            event.completed_at = timezone.now()
            event.save()
            logger.error(
                f"AI Webhook to {event.webhook_url} failed after {max_retries} attempts: {error_msg}"
            )
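
The retry branches above compute the delay as `retry_delay * (2 ** (attempt - 1))` and only reschedule while `attempt < max_retries`. As a rough illustration of the resulting schedule, here is a minimal sketch; `backoff_schedule` is a hypothetical helper that is not part of the reverted code, and it assumes the first delivery runs as attempt 1.

```python
def backoff_schedule(retry_delay: int, max_retries: int) -> list[int]:
    """Delays (in seconds) scheduled after each failed attempt.

    Mirrors `retry_delay * (2 ** (attempt - 1))` for every attempt that is
    still below `max_retries`; the final attempt schedules no further retry.
    Hypothetical helper for illustration only.
    """
    return [retry_delay * (2 ** (attempt - 1)) for attempt in range(1, max_retries)]


if __name__ == "__main__":
    # A 60 s initial delay with 3 attempts in total gives two waits.
    print(backoff_schedule(60, 3))  # [60, 120]
```

Under these assumptions `max_retries` acts as the total number of attempts rather than the number of retries after the first try, so a value of 3 produces at most two rescheduled deliveries.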


def send_ai_webhook(
    event_type: str,
    payload: Dict[str, Any],
    webhook_urls: Optional[list] = None,
) -> list:
    """
    Send AI webhook notification.

    Args:
        event_type: Type of event (e.g., 'deletion_request_created')
        payload: Data to send in webhook
        webhook_urls: Optional list of URLs to send to (uses config if not provided)

    Returns:
        List of created AIWebhookEvent instances
    """
    # Check if webhooks are enabled
    if not getattr(settings, 'PAPERLESS_AI_WEBHOOKS_ENABLED', False):
        logger.debug("AI webhooks are disabled in settings")
        return []

    # Add metadata to payload
    payload['event_type'] = event_type
    payload['timestamp'] = timezone.now().isoformat()
    payload['source'] = 'intellidocs-ai'

    events = []

    # Get webhook URLs from config or parameter
    if webhook_urls:
        urls = webhook_urls
    else:
        # Get all enabled configs for this event type
        configs = AIWebhookConfig.objects.filter(enabled=True)
        urls = [
            config.url
            for config in configs
            if config.should_send_event(event_type)
        ]

    if not urls:
        logger.debug(f"No webhook URLs configured for event type: {event_type}")
        return []

    # Create webhook events and queue tasks
    for url in urls:
        # Validate URL
        if not _validate_webhook_url(url):
            logger.warning(f"Skipping invalid webhook URL: {url}")
            continue

        # Create event record
        event = AIWebhookEvent.objects.create(
            event_type=event_type,
            webhook_url=url,
            payload=payload,
            status=AIWebhookEvent.STATUS_PENDING,
        )
        events.append(event)

        # Queue async task
        send_ai_webhook_task.delay(event.id)
        logger.debug(f"Queued AI webhook {event_type} to {url}")

    return events
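
`send_ai_webhook` writes `event_type`, `timestamp`, and `source` directly into the dict passed by the caller, so the caller's object is modified as a side effect. If that is not desired, one possible variant builds a new dict instead. The sketch below is an illustration only: `build_envelope` is a hypothetical name, and it uses the standard library's `datetime` in place of Django's `timezone.now()` so that it runs on its own.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def build_envelope(event_type: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with the metadata keys added; `payload` is left untouched.

    Hypothetical helper. The copy is shallow, so nested dicts are still shared
    with the caller's payload.
    """
    return {
        **payload,
        "event_type": event_type,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": "intellidocs-ai",
    }


if __name__ == "__main__":
    original = {"document": {"id": 1}}
    envelope = build_envelope("scan_completed", original)
    print(sorted(envelope))  # metadata keys plus 'document'
    print(original)          # caller's dict has no metadata keys added
```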


# Helper functions for specific webhook events

def send_deletion_request_webhook(deletion_request: DeletionRequest) -> list:
    """
    Send webhook when AI creates a deletion request.

    Args:
        deletion_request: The DeletionRequest instance

    Returns:
        List of created webhook events
    """
    from documents.models import Document

    # Build payload
    documents_data = []
    for doc in deletion_request.documents.all():
        documents_data.append({
            'id': doc.id,
            'title': doc.title,
            'created': doc.created.isoformat() if doc.created else None,
            'correspondent': doc.correspondent.name if doc.correspondent else None,
            'document_type': doc.document_type.name if doc.document_type else None,
        })

    payload = {
        'deletion_request': {
            'id': deletion_request.id,
            'status': deletion_request.status,
            'ai_reason': deletion_request.ai_reason,
            'document_count': deletion_request.documents.count(),
            'documents': documents_data,
            'impact_summary': deletion_request.impact_summary,
            'created_at': deletion_request.created_at.isoformat(),
        },
        'user': {
            'id': deletion_request.user.id,
            'username': deletion_request.user.username,
        }
    }

    return send_ai_webhook(
        AIWebhookEvent.EVENT_DELETION_REQUEST_CREATED,
        payload,
    )


def send_suggestion_applied_webhook(
    document: Document,
    suggestions: Dict[str, Any],
    applied_fields: list,
) -> list:
    """
    Send webhook when AI automatically applies suggestions.

    Args:
        document: The Document that was updated
        suggestions: Dictionary of all AI suggestions
        applied_fields: List of fields that were auto-applied

    Returns:
        List of created webhook events
    """
    payload = {
        'document': {
            'id': document.id,
            'title': document.title,
            'created': document.created.isoformat() if document.created else None,
            'correspondent': document.correspondent.name if document.correspondent else None,
            'document_type': document.document_type.name if document.document_type else None,
            'tags': [tag.name for tag in document.tags.all()],
        },
        'applied_suggestions': {
            field: suggestions.get(field)
            for field in applied_fields
        },
        'auto_applied': True,
    }

    return send_ai_webhook(
        AIWebhookEvent.EVENT_SUGGESTION_AUTO_APPLIED,
        payload,
    )


def send_scan_completed_webhook(
    document: Document,
    scan_results: Dict[str, Any],
    auto_applied_count: int = 0,
    suggestions_count: int = 0,
) -> list:
    """
    Send webhook when AI scan completes.

    Args:
        document: The Document that was scanned
        scan_results: Dictionary of scan results
        auto_applied_count: Number of suggestions that were auto-applied
        suggestions_count: Number of suggestions pending review

    Returns:
        List of created webhook events
    """
    payload = {
        'document': {
            'id': document.id,
            'title': document.title,
            'created': document.created.isoformat() if document.created else None,
            'correspondent': document.correspondent.name if document.correspondent else None,
            'document_type': document.document_type.name if document.document_type else None,
        },
        'scan_summary': {
            'auto_applied_count': auto_applied_count,
            'suggestions_count': suggestions_count,
            'has_tags_suggestions': 'tags' in scan_results,
            'has_correspondent_suggestion': 'correspondent' in scan_results,
            'has_type_suggestion': 'document_type' in scan_results,
            'has_storage_path_suggestion': 'storage_path' in scan_results,
            # Coerce to bool so the summary carries flags, not the suggestion data itself
            'has_custom_fields': bool(scan_results.get('custom_fields')),
            'has_workflow_suggestions': bool(scan_results.get('workflows')),
        },
        'scan_completed_at': timezone.now().isoformat(),
    }

    return send_ai_webhook(
        AIWebhookEvent.EVENT_SCAN_COMPLETED,
        payload,
    )
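
The delivery task earlier in this file attaches an `X-IntelliDocs-Signature` header when a secret is configured, but the body of `_sign_payload` is not part of this excerpt. The sketch below shows one common way such a signature could be produced and checked on the receiving side. Everything about the scheme is an assumption: HMAC-SHA256, hex encoding, and JSON serialized with sorted keys are illustrative choices, and `sign_payload` and `verify_signature` are hypothetical names, not the reverted implementation.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def sign_payload(payload: Dict[str, Any], secret: str) -> str:
    """Hex HMAC-SHA256 over a canonical JSON form of the payload.

    ASSUMPTION: sorted-key JSON is used as the canonical form; the real
    `_sign_payload` may serialize differently.
    """
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(payload: Dict[str, Any], secret: str, received: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(payload, secret)
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


if __name__ == "__main__":
    secret = "example-secret"  # placeholder value for the demo
    signature = sign_payload({"event_type": "scan_completed", "id": 1}, secret)
    # Key order does not matter because the canonical form sorts keys.
    print(verify_signature({"id": 1, "event_type": "scan_completed"}, secret, signature))
```

A receiver can only verify a signature if it serializes the payload exactly as the sender did, which is why a scheme like this needs a canonical form or a signature over the raw request body.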

View file

@ -1195,31 +1195,6 @@ PAPERLESS_ML_MODEL_CACHE: Final[Path | None] = __get_optional_path(
    "PAPERLESS_ML_MODEL_CACHE",
)

# AI Webhooks Configuration
# Enable webhooks for AI events (deletion requests, auto-applied suggestions, scan completion)
PAPERLESS_AI_WEBHOOKS_ENABLED: Final[bool] = __get_boolean(
    "PAPERLESS_AI_WEBHOOKS_ENABLED",
    "false",  # Disabled by default, users must explicitly enable
)

# Maximum number of retry attempts for failed webhooks
PAPERLESS_AI_WEBHOOKS_MAX_RETRIES: Final[int] = __get_int(
    "PAPERLESS_AI_WEBHOOKS_MAX_RETRIES",
    3,
)

# Initial retry delay in seconds (will increase exponentially)
PAPERLESS_AI_WEBHOOKS_RETRY_DELAY: Final[int] = __get_int(
    "PAPERLESS_AI_WEBHOOKS_RETRY_DELAY",
    60,
)

# Webhook request timeout in seconds
PAPERLESS_AI_WEBHOOKS_TIMEOUT: Final[int] = __get_int(
    "PAPERLESS_AI_WEBHOOKS_TIMEOUT",
    10,
)

OCR_COLOR_CONVERSION_STRATEGY = os.getenv(
    "PAPERLESS_OCR_COLOR_CONVERSION_STRATEGY",
    "RGB",