What this feature does
When you connect a storage and pick a folder, RecordsKeeper.AI can selectively sync files based on:
File Categories & Types (e.g., PDFs, DOCX, spreadsheets, images)
File Size (minimum and/or maximum)
Name-Based (Regex) Filters for precise filename matching
Only files that meet your filters are synced. Everything else is skipped.
Before you start
You have a storage account you can connect (e.g., Google Drive, OneDrive, or S3).
You know which folder you want to sync.
Optional: a rough idea of file sizes, and any naming patterns you want to target.
Step-by-step
1) Storage
From Add Cloud Storage, choose your storage provider and select Connect.
2) Connect
Authorize access to your storage. You’ll see who connected it and when (for audit clarity).
3) Directory
Browse your connected storage and select the folder you want to sync. Subfolders are included automatically.
4) File Type (Filters & Review)
You’ll land on a screen with a blue help banner - this is where you tell RecordsKeeper.AI what to bring in.
A) File Categories To Be Added
Select categories you want (e.g., Documents and Communications, Spreadsheets and Data, Media and Graphics).
Expand a category to see and toggle individual file types (e.g., .pdf, .docx, .xlsx, .png, .svg).
Items marked Not Supported are disabled for now.
Result: Only selected categories/types will be considered; others are skipped.
Tip: Start with the categories you actually use day-to-day (e.g., PDFs + spreadsheets). You can widen later.
B) File Size Filtering
Set File Size Greater Than to define a minimum; smaller files are skipped.
Set File Size Less Than to define a maximum; files ≥ this size are skipped.
Leave either field blank to have no limit on that side.
Choose a unit (KB/MB/GB); the system handles conversion.
Size is based on the source-reported file size at sync time (e.g., a ZIP counts as the ZIP size).
Examples
Only typical office docs: Min 20 KB, Max 25 MB
Exclude giant media: Max 100 MB, leave Min blank
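The size rules above can be sketched in a few lines of Python. This is an illustration only, not RecordsKeeper.AI's implementation: the function names are hypothetical, the 1024-based unit conversion is an assumption, and so is the treatment of a file that is exactly the minimum size (kept here, since only smaller files are described as skipped).

```python
# Hypothetical helpers; the 1024-based units are an assumption.
UNIT_BYTES = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}


def to_bytes(value, unit):
    """Convert a size entered as (value, unit) into bytes."""
    return int(value * UNIT_BYTES[unit])


def passes_size_filter(size_bytes, greater_than=None, less_than=None):
    """Each bound is a (value, unit) tuple, or None for no limit on that side."""
    # Minimum: files smaller than the bound are skipped.
    if greater_than is not None and size_bytes < to_bytes(*greater_than):
        return False
    # Maximum: files at or above the bound are skipped.
    if less_than is not None and size_bytes >= to_bytes(*less_than):
        return False
    return True


# "Only typical office docs": Min 20 KB, Max 25 MB
print(passes_size_filter(300 * 1024, ("20", "KB")[0:0] or (20, "KB"), (25, "MB")))
# "Exclude giant media": Max 100 MB, Min left blank
print(passes_size_filter(2 * 1024**3, None, (100, "MB")))
```

The first call describes a 300 KB file inside the office-docs range; the second describes a 2 GB file that the 100 MB cap excludes.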
C) Advance Filters → Name-Based Filtering (Regex)
Use Regular Expressions to match filenames precisely. Matching is done against the full filename (with extension), not the file path or content.
Common patterns
(?i)\.pdf$ → all PDFs, case-insensitive
^INV → names starting with INV
^Invoice_2025.*\.(pdf|docx)$ → 2025 invoices in PDF or DOCX
Good to know
Leave this blank to include all names.
An invalid or overly strict pattern can match zero files, so nothing is synced.
Use ^ and $ to anchor, | for OR, (?i) for case-insensitive.
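One way to try the common patterns before a sync is to run them over a handful of sample filenames. The sketch below uses Python's re module as a stand-in; the regex engine RecordsKeeper.AI uses is not stated here, so treat dialect details as an assumption. The sample names and the matching_names helper are made up for illustration.

```python
import re

# The three patterns from "Common patterns".
patterns = {
    "all PDFs, case-insensitive": r"(?i)\.pdf$",
    "names starting with INV": r"^INV",
    "2025 invoices in PDF or DOCX": r"^Invoice_2025.*\.(pdf|docx)$",
}

# Made-up filenames (with extension, no path) to test against.
sample_names = ["Invoice_2025_03.pdf", "INV-0042.DOCX", "report.PDF", "notes.txt"]


def matching_names(pattern, names):
    """Return the names in which the pattern is found."""
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]


for label, pattern in patterns.items():
    print(label, "->", matching_names(pattern, sample_names))
# all PDFs, case-insensitive -> ['Invoice_2025_03.pdf', 'report.PDF']
# names starting with INV -> ['INV-0042.DOCX']
# 2025 invoices in PDF or DOCX -> ['Invoice_2025_03.pdf']
```

Note that ^INV does not match Invoice_2025_03.pdf, because matching is case-sensitive unless (?i) is added.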
D) Review & Confirm
Your final inclusion rule is:
Selected Category/Type AND within Size Range AND (Regex match, if provided)
Click Add to RecordKeeper.AI to start the sync. You can Cancel to back out safely.
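To make the inclusion rule concrete, here is a minimal Python sketch of how the three conditions combine. SyncFilter and its fields are hypothetical names, sizes are given directly in bytes, and the code illustrates the rule rather than the product's actual logic.

```python
import os
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyncFilter:
    """Hypothetical model of the inclusion rule; not the product's API."""
    extensions: set                    # selected types, lowercase, e.g. {".pdf"}
    min_bytes: Optional[int] = None    # None means no minimum
    max_bytes: Optional[int] = None    # None means no maximum
    name_regex: Optional[str] = None   # None means all names are included

    def includes(self, filename: str, size_bytes: int) -> bool:
        ext = os.path.splitext(filename)[1].lower()
        type_ok = ext in self.extensions
        size_ok = (
            (self.min_bytes is None or size_bytes >= self.min_bytes)
            and (self.max_bytes is None or size_bytes < self.max_bytes)
        )
        name_ok = (
            self.name_regex is None
            or re.search(self.name_regex, filename) is not None
        )
        # All three conditions must hold for the file to be synced.
        return type_ok and size_ok and name_ok


rule = SyncFilter(
    extensions={".pdf", ".docx"},
    min_bytes=20 * 1024,
    max_bytes=25 * 1024**2,
    name_regex=r"^Invoice_2025.*\.(pdf|docx)$",
)
print(rule.includes("Invoice_2025_03.pdf", 100_000))  # True
print(rule.includes("Invoice_2024_11.pdf", 100_000))  # False: name does not match
```

Because the conditions are joined with AND, failing any single filter is enough for a file to be skipped.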
How filters impact credits & speed
Storage Credit Consumed: based on files actually ingested. Tighter filters → less storage used.
File Processing Credits: used for AI tasks (OCR, classification, extraction). Fewer/lighter files → fewer credits.
Skipped files don’t consume processing credits.
Practical tips
Pilot first: Start with one folder + narrow filters, review results, then broaden.
Exclude clutter: Set a minimum size (e.g., 10–20 KB) to avoid syncing tiny temp or signature files.
Be intentional with media: Large images/videos can be costly - either cap size or leave Media off unless needed.
Regex safely: Test simple patterns first; add complexity gradually.
Troubleshooting
“Zero files were synced.”
Check you didn’t over-filter: lower the minimum size, raise/remove the maximum, or clear the regex.
Ensure you selected at least one category/type that exists in the folder.
“Too many files came in.”
Add/adjust a maximum size or enable a regex pattern to narrow by name.
Deselect unneeded categories/types.
“Regex error” or “Pattern invalid.”
Remove special characters you didn’t intend; try a simpler pattern.
Use anchors: ^pattern (start), pattern$ (end). Add (?i) for case-insensitive.
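If a pattern keeps being rejected, compiling it locally can show where the problem is. The sketch below uses Python's re module, which may differ in small ways from the engine RecordsKeeper.AI uses (an assumption); check_pattern is a hypothetical helper.

```python
import re


def check_pattern(pattern):
    """Return None if the pattern compiles, otherwise a short error description."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return f"{exc.msg} at position {exc.pos}"
    return None


for candidate in [r"^INV", r"Invoice_(2025", r"*.pdf"]:
    problem = check_pattern(candidate)
    print(repr(candidate), "->", problem or "OK")
```

The last candidate, *.pdf, is a wildcard (glob) habit rather than a regex: a leading * has nothing to repeat, so it fails to compile. The regex equivalent is \.pdf$.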
“Media not syncing.”
Confirm the Media and Graphics category is selected and the relevant types (e.g., .jpg, .png) are checked.
Check for size caps blocking large files.
“Audio/Video disabled.”
Types marked Not Supported can’t be synced yet.
FAQs
Can I change filters later?
Yes. Update filters from Widgets & Integration → your storage → Edit, then run a new sync.
Do filters apply to subfolders?
Yes. Filters apply to all files under the selected folder.
Are duplicates handled?
Duplicates may be skipped based on policy. Check batch details for Skipped reasons.
Does name filtering look into folders or file contents?
No - filename only (with extension). Paths and contents aren’t matched.
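A small Python sketch of what filename-only matching means in practice. It assumes forward-slash paths for illustration, and name_matches is a hypothetical helper rather than part of the product.

```python
import re
from pathlib import PurePosixPath


def name_matches(pattern, path):
    """Apply the pattern to the final path component only (the filename)."""
    filename = PurePosixPath(path).name
    return re.search(pattern, filename) is not None


# The folder is named INV, but the filename is report.pdf, so ^INV does not match.
print(name_matches(r"^INV", "INV/2025/report.pdf"))   # False
# Here the filename itself starts with INV.
print(name_matches(r"^INV", "archive/INV-0042.pdf"))  # True
```

To target files by folder, select that folder in the Directory step instead of relying on the regex.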
Quick checklist before you click “Add to RecordKeeper.AI”
Correct folder selected
Only needed categories/types are checked
Size range makes sense (or fields left blank intentionally)
Regex tested or left empty
You’re comfortable with expected credit usage
You’re set - sync only what matters and keep your workspace clean and fast.


