Commit Graph

13 Commits

Author SHA1 Message Date
darkskygit
f4e7595f4b feat(server): add copilot embedding feature (#12590)
fix AI-154
2025-05-28 04:36:37 +00:00
darkskygit
9220b973c7 feat(server): increase embedding jobs concurrency & handle empty content after trim (#12574)
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **Improvements**
  - Increased the default concurrency for background tasks, enhancing processing efficiency.
  - Improved handling of empty or unsupported documents to ensure consistent processing.
  - Optimized document filtering to exclude certain documents from processing, improving performance.

- **Bug Fixes**
  - Enhanced detection of empty document summaries, reducing errors during processing.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-27 14:28:34 +00:00
darkskygit
2a80fbb993 feat(server): workspace embedding improve (#12022)
fix AI-10
fix AI-109
fix PD-2484

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **New Features**
  - Added a method to check if a document requires embedding, improving embedding efficiency.
  - Enhanced document embeddings with enriched metadata, including title, summary, creation/update dates, and author information.
  - Introduced a new type for document fragments with extended metadata fields.

- **Improvements**
  - Embedding logic now conditionally processes only documents needing updates.
  - Embedding content now includes document metadata for more informative context.
  - Expanded and improved test coverage for embedding scenarios and workspace behaviors.
  - Event emission added for workspace embedding updates on client version mismatch.
  - Job queueing enhanced with prioritization and explicit job IDs for better management.
  - Job queue calls updated to include priority and context identifiers in a structured format.

- **Bug Fixes**
  - Improved handling of ignored documents in embedding matches.
  - Fixed incorrect document ID assignment in embedding job queueing.

- **Tests**
  - Added and updated snapshot and behavioral tests for embedding and workspace document handling.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-23 10:16:15 +00:00
darkskygit
84667b3440 feat(server): workspace embedding status count with files (#12420)
fix AI-32
fix AI-132
2025-05-21 10:51:35 +00:00
darkskygit
7fd3ee957f fix(server): embedding chunks primary key (#12416)
fix AI-131

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **Refactor**
  - Updated database schema to consolidate unique constraints into composite primary keys for embedding-related data, improving consistency.
  - Changed the relation in the Snapshot model to allow multiple embeddings.
  - Improved filtering logic for documents and snapshots based on embedding existence.
  - Reformatted SQL queries and schema attributes for improved readability; no changes to functionality.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-21 10:51:35 +00:00
darkskygit
6f9361caee feat(server): trigger workspace embedding (#12328)
fix AI-127

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **New Features**
  - Added automated event handling for workspace updates and document embedding, streamlining document embedding workflows.
  - Introduced detection and queuing of documents needing embedding, excluding ignored documents.
- **Improvements**
  - Enhanced performance of embedding-related searches by filtering results at the database level.
  - Increased concurrency for embedding job processing to improve throughput.
- **Bug Fixes**
  - Improved error handling and fallback for missing document titles during embedding.
  - Added safeguards to skip invalid embedding jobs based on document identifiers.
- **Tests**
  - Expanded test coverage for document embedding and ignored document filtering.
  - Updated end-to-end tests to use dynamic content for improved reliability.
  - Added synchronization waits in document creation utilities to improve test stability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-20 05:16:45 +00:00
darkskygit
6224344a4f chore(server): improve ignored docs list & match (#12307)
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **Bug Fixes**
	- Improved the accuracy of document matching by excluding ignored documents from search results.
- **Chores**
	- Updated internal handling of ignored document lists for better consistency and reliability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-15 09:36:28 +00:00
darkskygit
cecf545590 feat(server): improve context metadata & matching (#12064)
fix AI-20

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **New Features**
  - Enhanced file metadata with MIME type, blob ID, and file name across context and workspace, now visible in UI and API.
  - Added workspace-level matching for files and documents with configurable thresholds and workspace scoping in search queries.
  - Introduced a new error type and user-friendly messaging for global workspace context matching failures.

- **Bug Fixes**
  - Improved consistent handling of file MIME types and nullable context IDs for accurate metadata.

- **Documentation**
  - Updated GraphQL schema, queries, and mutations to include new metadata fields, optional parameters, and error types.

- **Style**
  - Added new localization strings for global context matching error messages.

- **Tests**
  - Extended test coverage with new and updated snapshot tests for metadata and matching logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-14 06:32:29 +00:00
darkskygit
21dc550b9d feat(server): add doc meta for ignored docs (#12021)
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **New Features**
  - Ignored documents in workspace embedding now display additional metadata, including document title, creation and update timestamps, and the names and avatars of users who created or updated the document.
- **Enhancements**
  - The list of ignored documents provides richer information for easier identification and management within the workspace.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-28 11:56:09 +00:00
darkskygit
49c57ca649 fix(server): query workspace embed files (#11982)
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **New Features**
	- Expanded file chunk matching to include both context and workspace file embeddings, providing broader and more relevant search results.
- **Improvements**
	- Enhanced result ranking by introducing a re-ranking step for combined embedding matches, improving the relevance of returned file chunks.
	- Adjusted file count reporting to reflect the total number of workspace files instead of ignored documents for more accurate workspace file statistics.
	- Renamed and streamlined workspace file management methods for clearer and more consistent API usage.
- **Bug Fixes**
	- Prevented embedding similarity queries when embedding is disabled for a workspace, improving system behavior consistency.
- **Tests**
	- Added comprehensive tests to verify workspace embedding management, including enabling, matching, and disabling embedding functionality.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 08:32:32 +00:00
darkskygit
ddb739fa13 feat: add pagination support for workspace config (#11859)
fix AI-78
2025-04-23 11:25:41 +00:00
darkskygit
5397fba897 feat(server): global embedding gql endpoint (#11809)
fix AI-30
fix AI-31
fix PD-2487
2025-04-23 11:25:41 +00:00
darkskygit
f2adb9f72c feat(server): workspace file embedding & ignored docs model impl (#11804)
fix AI-30
fix AI-31
2025-04-21 05:34:10 +00:00