<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **Improvements**
  - Increased the default concurrency for background tasks, enhancing processing efficiency.
  - Improved handling of empty or unsupported documents to ensure consistent processing.
  - Optimized document filtering to exclude certain documents from processing, improving performance.
- **Bug Fixes**
  - Enhanced detection of empty document summaries, reducing errors during processing.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
fix AI-10
fix AI-109
fix PD-2484
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Added a method to check if a document requires embedding, improving embedding efficiency.
  - Enhanced document embeddings with enriched metadata, including title, summary, creation/update dates, and author information.
  - Introduced a new type for document fragments with extended metadata fields.
- **Improvements**
  - Embedding logic now conditionally processes only documents needing updates.
  - Embedding content now includes document metadata for more informative context.
  - Expanded and improved test coverage for embedding scenarios and workspace behaviors.
  - Workspace embedding update events are now emitted when a client version mismatch is detected.
  - Job queueing now supports prioritization and explicit job IDs for better management.
  - Job queue calls now pass priority and context identifiers in a structured format.
- **Bug Fixes**
  - Improved handling of ignored documents in embedding matches.
  - Fixed incorrect document ID assignment in embedding job queueing.
- **Tests**
  - Added and updated snapshot and behavioral tests for embedding and workspace document handling.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
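The metadata-enriched embedding content and the "requires embedding" check described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the `DocFragment` shape, the `buildEmbeddingInput` and `needsEmbedding` names, and the rule "embed when there is no embedding yet or the document changed after the last one" are all assumptions made for illustration.

```typescript
// Hypothetical fragment type; field names are illustrative, not taken from the codebase.
interface DocFragment {
  docId: string;
  title: string;
  summary: string;
  createdAt: Date;
  updatedAt: Date;
  createdBy: string;
  updatedBy: string;
  body: string;
}

// Prefix the body with a metadata header so the embedded text carries its own context.
function buildEmbeddingInput(doc: DocFragment): string {
  const header = [
    `Title: ${doc.title}`,
    `Summary: ${doc.summary}`,
    `Created: ${doc.createdAt.toISOString()} by ${doc.createdBy}`,
    `Updated: ${doc.updatedAt.toISOString()} by ${doc.updatedBy}`,
  ].join('\n');
  return `${header}\n\n${doc.body}`;
}

// Assumed rule: embed when no embedding exists yet, or when the document
// was updated after the last embedding was produced.
function needsEmbedding(doc: DocFragment, lastEmbeddedAt: Date | null): boolean {
  return (
    lastEmbeddedAt === null ||
    doc.updatedAt.getTime() > lastEmbeddedAt.getTime()
  );
}
```

A caller would check `needsEmbedding(doc, lastEmbeddedAt)` first and only then pass `buildEmbeddingInput(doc)` to the embedding model, so unchanged documents are skipped.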
fix AI-131
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **Refactor**
  - Updated database schema to consolidate unique constraints into composite primary keys for embedding-related data, improving consistency.
  - Changed the relation in the Snapshot model to allow multiple embeddings.
  - Improved filtering logic for documents and snapshots based on embedding existence.
  - Reformatted SQL queries and schema attributes for improved readability; no changes to functionality.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
fix AI-127
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Added automated event handling for workspace updates and document embedding, streamlining document embedding workflows.
  - Introduced detection and queuing of documents needing embedding, excluding ignored documents.
- **Improvements**
  - Enhanced performance of embedding-related searches by filtering results at the database level.
  - Increased concurrency for embedding job processing to improve throughput.
- **Bug Fixes**
  - Improved error handling and fallback for missing document titles during embedding.
  - Added safeguards to skip invalid embedding jobs based on document identifiers.
- **Tests**
  - Expanded test coverage for document embedding and ignored document filtering.
  - Updated end-to-end tests to use dynamic content for improved reliability.
  - Added synchronization waits in document creation utilities to improve test stability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
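To illustrate the queuing behavior summarized above (detect documents that need embedding, skip ignored ones, skip jobs with invalid document identifiers), here is a small in-memory sketch. It also borrows the idea of priorities and explicit job IDs mentioned in an earlier summary. The `EmbeddingQueue` class, the `queueDocsNeedingEmbedding` helper, the job ID format, and the "lower number runs first" convention are hypothetical; the real job queue is not shown in these notes.

```typescript
// Hypothetical job shape; names and the priority convention are assumptions.
interface EmbeddingJob {
  jobId: string;
  workspaceId: string;
  docId: string;
  priority: number; // assumption: lower value runs first
}

class EmbeddingQueue {
  private jobs = new Map<string, EmbeddingJob>();

  // An explicit job ID lets the queue reject duplicates for the same document.
  add(job: EmbeddingJob): boolean {
    if (this.jobs.has(job.jobId)) return false;
    this.jobs.set(job.jobId, job);
    return true;
  }

  // Remove and return the pending job with the lowest priority value.
  next(): EmbeddingJob | undefined {
    const pending = Array.from(this.jobs.values());
    if (pending.length === 0) return undefined;
    const best = pending.reduce((a, b) => (b.priority < a.priority ? b : a));
    this.jobs.delete(best.jobId);
    return best;
  }

  get size(): number {
    return this.jobs.size;
  }
}

// Queue every candidate document except ignored ones and blank (invalid) IDs.
function queueDocsNeedingEmbedding(
  queue: EmbeddingQueue,
  workspaceId: string,
  docIds: string[],
  ignored: Set<string>,
  priority = 1
): number {
  let queued = 0;
  for (const docId of docIds) {
    if (docId.trim() === '' || ignored.has(docId)) continue;
    const jobId = `embedding:${workspaceId}:${docId}`;
    if (queue.add({ jobId, workspaceId, docId, priority })) queued++;
  }
  return queued;
}
```

Deriving the job ID from the workspace and document IDs is one simple way to make repeated update events for the same document collapse into a single pending job.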
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **Bug Fixes**
  - Improved the accuracy of document matching by excluding ignored documents from search results.
- **Chores**
  - Updated internal handling of ignored document lists for better consistency and reliability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
fix AI-20
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Enhanced file metadata with MIME type, blob ID, and file name across context and workspace, now visible in UI and API.
  - Added workspace-level matching for files and documents with configurable thresholds and workspace scoping in search queries.
  - Introduced a new error type and user-friendly messaging for global workspace context matching failures.
- **Bug Fixes**
  - Made handling of file MIME types and nullable context IDs consistent so that metadata is accurate.
- **Documentation**
  - Updated GraphQL schema, queries, and mutations to include new metadata fields, optional parameters, and error types.
- **Style**
  - Added new localization strings for global context matching error messages.
- **Tests**
  - Extended test coverage with new and updated snapshot tests for metadata and matching logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
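As a rough picture of workspace-scoped matching with a configurable threshold, the sketch below filters candidate chunks in memory. In the actual change the scoping is described as part of the search queries, so treat this only as an illustration of the logic. The `ChunkMatch` and `MatchOptions` types, the `matchInWorkspace` name, the default values, and the "smaller distance means closer" convention are assumptions.

```typescript
// Hypothetical match record; the metadata fields mirror those named in the summary.
interface ChunkMatch {
  workspaceId: string;
  fileId: string;
  blobId: string;
  name: string;
  mimeType: string;
  content: string;
  distance: number; // assumption: smaller means more similar
}

interface MatchOptions {
  workspaceId: string;
  threshold?: number; // maximum accepted distance
  topK?: number; // maximum number of results
}

// Keep only chunks from the requested workspace that fall within the threshold,
// ordered from closest to farthest. The input array is not mutated.
function matchInWorkspace(
  candidates: ChunkMatch[],
  { workspaceId, threshold = 0.5, topK = 5 }: MatchOptions
): ChunkMatch[] {
  return candidates
    .filter(c => c.workspaceId === workspaceId && c.distance <= threshold)
    .sort((a, b) => a.distance - b.distance)
    .slice(0, topK);
}
```

Raising `threshold` trades precision for recall, which is presumably why the summary calls it configurable.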
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Ignored documents in workspace embedding now display additional metadata, including document title, creation and update timestamps, and the names and avatars of users who created or updated the document.
- **Enhancements**
  - The list of ignored documents provides richer information for easier identification and management within the workspace.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Expanded file chunk matching to include both context and workspace file embeddings, providing broader and more relevant search results.
- **Improvements**
  - Enhanced result ranking by introducing a re-ranking step for combined embedding matches, improving the relevance of returned file chunks.
  - Adjusted file count reporting to reflect the total number of workspace files instead of ignored documents for more accurate workspace file statistics.
  - Renamed and streamlined workspace file management methods for clearer and more consistent API usage.
- **Bug Fixes**
  - Prevented embedding similarity queries when embedding is disabled for a workspace, improving system behavior consistency.
- **Tests**
  - Added comprehensive tests to verify workspace embedding management, including enabling, matching, and disabling embedding functionality.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
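The combined matching flow described above (context matches plus workspace matches, then a re-ranking step, with the workspace query skipped when embedding is disabled) can be sketched as follows. This is an illustrative outline only: `FileChunk`, `Reranker`, and `matchFileChunks` are hypothetical names, the de-duplication rule is an assumption, and the re-ranker is passed in as a plain function rather than the model-backed step the PR presumably uses.

```typescript
// Hypothetical chunk record; distance convention (smaller is closer) is an assumption.
interface FileChunk {
  fileId: string;
  chunk: number;
  content: string;
  distance: number;
}

// A re-ranker returns one relevance score per chunk; higher means more relevant.
type Reranker = (query: string, chunks: FileChunk[]) => number[];

function matchFileChunks(
  query: string,
  contextMatches: FileChunk[],
  queryWorkspace: () => FileChunk[],
  embeddingEnabled: boolean,
  rerank: Reranker,
  topK = 5
): FileChunk[] {
  // Skip the workspace similarity query entirely when embedding is disabled.
  const workspaceMatches = embeddingEnabled ? queryWorkspace() : [];

  // Merge both sources, keeping the closest copy of any duplicated chunk.
  const byKey = new Map<string, FileChunk>();
  for (const match of contextMatches.concat(workspaceMatches)) {
    const key = `${match.fileId}:${match.chunk}`;
    const seen = byKey.get(key);
    if (seen === undefined || match.distance < seen.distance) {
      byKey.set(key, match);
    }
  }

  // Re-rank the merged set and return the highest-scoring chunks.
  const unique = Array.from(byKey.values());
  const scores = rerank(query, unique);
  return unique
    .map((chunk, i) => ({ chunk, score: scores[i] ?? 0 }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(entry => entry.chunk);
}
```

Passing the workspace lookup as a callback makes the disabled case explicit: when `embeddingEnabled` is false the callback is never invoked, so no similarity query runs.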