Knowledge Synchronization
Automate the synchronization of knowledge from your systems of record into Sharely.ai, enabling AI agents to deliver unified insights across all your organizational content.
What is Knowledge Synchronization?
Knowledge Synchronization is the practice of programmatically keeping Sharely.ai's knowledge base in sync with your source systems through automated workflows using the Sharely.ai API.
Key capabilities:
- Multi-source aggregation - Unify knowledge from CMSs, file storage, databases, and external systems
- Automated reconciliation - Add, update, and remove knowledge based on your source of truth
- Idempotent operations - Run sync workflows repeatedly without side effects
- Temporal-powered reliability - Built on Temporal workflows for resilient, restartable processes
- YAML-driven configuration - Declarative source of truth that's version-controlled and auditable
- Role-based synchronization - Automatically apply RBAC rules during sync
When to Use Knowledge Sync
Use Knowledge Synchronization when you want to:
- ✅ Aggregate knowledge from multiple siloed systems into a unified AI-accessible knowledge base
- ✅ Automate content updates from your CMS, file storage, or databases
- ✅ Maintain Sharely.ai knowledge in sync with your system of record
- ✅ Apply consistent metadata and role-based access control at scale
- ✅ Version control your knowledge configuration with Git
- ✅ Eliminate manual upload workflows and human error
Common use cases:
- Professional associations - Sync research libraries, videos, podcasts, and member resources from multiple platforms
- Enterprise documentation - Keep internal knowledge base synchronized with Confluence, SharePoint, or Notion
- Content publishers - Automatically sync CMS content (WordPress, Contentful, Strapi) to power AI assistants
- Educational institutions - Aggregate course materials, recordings, and resources from learning management systems
- Healthcare organizations - Synchronize medical literature, training videos, and clinical guidelines from diverse sources
How It Works
Knowledge Synchronization follows a simple reconciliation pattern:
1. Define Source of Truth
Create a YAML configuration file listing all knowledge that should exist in Sharely.ai:
# knowledge-config.yaml
workspace_id: "your-workspace-uuid"
organization_id: "your-org-id"
knowledge:
  - source_path: "azure://research/diabetes-guidelines-2024.pdf"
    title: "Diabetes Treatment Guidelines 2024"
    type: "FILE"
    language: "en"
    roles: ["medical-professionals"]
  - source_path: "wordpress://blog/ai-in-healthcare"
    title: "AI Applications in Modern Healthcare"
    type: "LINK"
    url: "https://nmea.org/blog/ai-in-healthcare"
    roles: ["all-members"]
  - source_path: "vimeo://videos/cme-cardiology-101"
    title: "CME: Cardiology Fundamentals"
    type: "LINK"
    url: "https://vimeo.com/nmea/cardiology-101"
    roles: ["medical-professionals", "students"]
2. Fetch Current State
Query Sharely.ai to get all existing knowledge in your workspace:
const response = await fetch(
`https://api.sharely.ai/v1/workspaces/${workspaceId}/knowledge?limit=100`,
{
headers: {
'Authorization': `Bearer ${apiToken}`,
'Content-Type': 'application/json'
}
}
);
const existingKnowledge = await response.json();
3. Reconcile Differences
Compare your YAML configuration (desired state) with existing knowledge (current state):
// Build index of existing knowledge by source_path
const existingMap = {};
existingKnowledge.items.forEach(item => {
if (item.metadata?.source_path) {
existingMap[item.metadata.source_path] = item;
}
});
// Determine what to add, update, or delete
const toAdd = [];
const toDelete = [];
yamlConfig.knowledge.forEach(item => {
if (!existingMap[item.source_path]) {
toAdd.push(item); // Not in Sharely, needs to be added
}
});
Object.keys(existingMap).forEach(sourcePath => {
if (!yamlConfig.knowledge.find(k => k.source_path === sourcePath)) {
toDelete.push(existingMap[sourcePath]); // In Sharely but not in YAML
}
});
4. Apply Changes
Create new knowledge items and remove orphaned ones:
// Add missing knowledge
for (const item of toAdd) {
await createKnowledge(item);
}
// Remove orphaned knowledge
for (const item of toDelete) {
await deleteKnowledge(item.knowledgeId);
}
5. Run Idempotently
The sync script can be run multiple times safely:
- Existing items aren't duplicated (matched by source_path)
- Deletes only happen for items truly missing from YAML
- Temporal workflows ensure operations can be restarted without corruption
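The add/delete logic above can be factored into a pure helper, which makes the idempotence claim easy to check: feeding the function the state it just produced yields an empty delta. A minimal sketch (computeDelta is an illustrative helper, not part of the Sharely.ai API):

```javascript
// Compute the delta between desired state (YAML) and current state (Sharely.ai).
// Items are matched solely by source_path, so repeated runs are idempotent.
function computeDelta(desired, current) {
  const currentPaths = new Set(
    current
      .filter(item => item.metadata?.source_path)
      .map(item => item.metadata.source_path)
  );
  const desiredPaths = new Set(desired.map(item => item.source_path));

  return {
    toAdd: desired.filter(item => !currentPaths.has(item.source_path)),
    toDelete: current.filter(
      item =>
        item.metadata?.source_path &&
        !desiredPaths.has(item.metadata.source_path)
    )
  };
}

// Example: one item to add, one orphan to delete.
const desired = [{ source_path: 'wordpress://post-1', title: 'A' }];
const current = [{ knowledgeId: 'k2', metadata: { source_path: 'azure://old.pdf' } }];
const delta = computeDelta(desired, current);
// delta.toAdd holds the WordPress post; delta.toDelete holds the orphaned Azure item
```

Because the function has no side effects, it is easy to unit-test before wiring it to real API calls.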
Real-World Example: National Medical Education Association
The Challenge
The National Medical Education Association (NMEA) serves 50,000+ healthcare professionals with educational content spread across multiple systems:
- Azure Blob Storage - 500GB of medical research PDFs, clinical guidelines, and studies
- WordPress - News, blog posts, and member announcements
- Vimeo - 1,000+ hours of continuing medical education (CME) videos
- Podcast Platform - Audio interviews with medical experts and case studies
- External Links - PubMed articles, journal references, clinical trial databases
Problem: Members couldn't find relevant information across these siloed systems. Search was fragmented, and AI assistants couldn't provide unified insights.
The Solution
NMEA implemented Knowledge Synchronization to create a unified knowledge base in Sharely.ai:
- Created YAML source of truth defining all content across systems
- Built sync script running hourly via cron job
- Applied role-based access - students see basic content, professionals see everything
- Enabled AI agents to deliver insights from the entire knowledge corpus
Implementation
knowledge-config.yaml:
workspace_id: "nmea-workspace-uuid"
organization_id: "nmea-org-id"
knowledge:
  # Azure Blob - Research PDFs
  - source_path: "azure://research/diabetes-guidelines-2024.pdf"
    title: "Diabetes Treatment Guidelines 2024"
    type: "FILE"
    azure_blob_url: "https://nmeastorage.blob.core.windows.net/research/diabetes-2024.pdf"
    language: "en"
    roles: ["medical-professionals"]
  - source_path: "azure://research/cardiology-best-practices.pdf"
    title: "Cardiology Best Practices Compendium"
    type: "FILE"
    azure_blob_url: "https://nmeastorage.blob.core.windows.net/research/cardiology.pdf"
    language: "en"
    roles: ["medical-professionals", "cardiologists"]
  # WordPress - Blog content
  - source_path: "wordpress://blog/ai-healthcare-2024"
    title: "The Future of AI in Healthcare"
    type: "LINK"
    url: "https://nmea.org/blog/ai-healthcare-2024"
    roles: ["all-members"]
  # Vimeo - CME Videos
  - source_path: "vimeo://cme/cardiology-fundamentals"
    title: "CME: Cardiology Fundamentals (12 credits)"
    type: "LINK"
    url: "https://vimeo.com/nmea/cardiology-fundamentals"
    roles: ["medical-professionals", "students"]
  # Podcast - Audio content
  - source_path: "podcast://expert-interviews/ep-42-immunology"
    title: "Expert Interview: Advances in Immunology"
    type: "LINK"
    url: "https://nmea-podcasts.com/episodes/42"
    roles: ["all-members"]
  # External links - Journal articles
  - source_path: "pubmed://article-12345678"
    title: "Novel Approaches to Cancer Immunotherapy"
    type: "LINK"
    url: "https://pubmed.ncbi.nlm.nih.gov/12345678/"
    roles: ["medical-professionals", "researchers"]
sync-script.js:
const yaml = require('js-yaml');
const fs = require('fs');
const fetch = require('node-fetch');
const WORKSPACE_ID = process.env.SHARELY_WORKSPACE_ID;
const ORGANIZATION_ID = process.env.SHARELY_ORGANIZATION_ID;
const API_KEY = process.env.SHARELY_API_KEY;
async function syncKnowledge() {
console.log('Starting knowledge synchronization...');
// 1. Load YAML configuration
const config = yaml.load(fs.readFileSync('./knowledge-config.yaml', 'utf8'));
// 2. Get API token
const apiToken = await generateAPIToken();
// 3. Fetch existing knowledge
const existingKnowledge = await fetchAllKnowledge(apiToken);
// 4. Build index by source_path
const existingMap = {};
existingKnowledge.forEach(item => {
if (item.metadata?.source_path) {
existingMap[item.metadata.source_path] = item;
}
});
// 5. Reconcile: determine what to add and delete
const toAdd = [];
const toDelete = [];
config.knowledge.forEach(item => {
if (!existingMap[item.source_path]) {
toAdd.push(item);
}
});
Object.keys(existingMap).forEach(sourcePath => {
const inConfig = config.knowledge.find(k => k.source_path === sourcePath);
if (!inConfig) {
toDelete.push(existingMap[sourcePath]);
}
});
// 6. Apply changes
console.log(`Adding ${toAdd.length} new knowledge items...`);
for (const item of toAdd) {
await createKnowledgeItem(apiToken, item);
}
console.log(`Removing ${toDelete.length} orphaned knowledge items...`);
for (const item of toDelete) {
await deleteKnowledgeItem(apiToken, item.knowledgeId);
}
console.log('Synchronization complete!');
}
async function generateAPIToken() {
const response = await fetch(
`https://api.sharely.ai/workspaces/${WORKSPACE_ID}/generate-access-key-token`,
{
method: 'POST',
headers: {
'x-api-key': API_KEY,
'Content-Type': 'application/json'
}
}
);
const data = await response.json();
return data.token;
}
async function fetchAllKnowledge(apiToken) {
const allKnowledge = [];
let offset = 0;
const limit = 100;
while (true) {
const response = await fetch(
`https://api.sharely.ai/v1/workspaces/${WORKSPACE_ID}/knowledge?limit=${limit}&offset=${offset}`,
{
headers: {
'Authorization': `Bearer ${apiToken}`,
'Content-Type': 'application/json'
}
}
);
const data = await response.json();
allKnowledge.push(...data.items);
if (data.items.length < limit) break;
offset += limit;
}
return allKnowledge;
}
async function createKnowledgeItem(apiToken, item) {
const payload = {
type: item.type,
title: item.title,
language: item.language,
metadata: {
source_path: item.source_path
}
};
// Add URL for LINK types
if (item.type === 'LINK' && item.url) {
payload.url = item.url;
}
// Add file URL for FILE types (if syncing from Azure Blob or similar)
if (item.type === 'FILE' && item.azure_blob_url) {
payload.url = item.azure_blob_url;
}
const response = await fetch(
`https://api.sharely.ai/v1/workspaces/${WORKSPACE_ID}/knowledge`,
{
method: 'POST',
headers: {
'Authorization': `Bearer ${apiToken}`,
'organizationId': ORGANIZATION_ID,
'Content-Type': 'application/json'
},
body: JSON.stringify(payload)
}
);
const result = await response.json();
console.log(`Created: ${item.title} (${result.knowledgeId})`);
// Apply roles if specified
if (item.roles && item.roles.length > 0) {
await assignRoles(apiToken, result.knowledgeId, item.roles);
}
return result;
}
async function assignRoles(apiToken, knowledgeId, roleNames) {
// Note: This assumes roles already exist in workspace
// In production, you'd resolve role names to role IDs
const roleIds = await resolveRoleIds(apiToken, roleNames);
await fetch(
`https://api.sharely.ai/v1/workspaces/${WORKSPACE_ID}/knowledge/${knowledgeId}/role`,
{
method: 'POST',
headers: {
'Authorization': `Bearer ${apiToken}`,
'Content-Type': 'application/json'
},
body: JSON.stringify({ roleIds })
}
);
}
async function resolveRoleIds(apiToken, roleNames) {
// Simplified: In production, query roles API to map names to IDs
return roleNames; // Placeholder
}
async function deleteKnowledgeItem(apiToken, knowledgeId) {
await fetch(
`https://api.sharely.ai/v1/workspaces/${WORKSPACE_ID}/knowledge/${knowledgeId}`,
{
method: 'DELETE',
headers: {
'Authorization': `Bearer ${apiToken}`
}
}
);
console.log(`Deleted: ${knowledgeId}`);
}
// Run sync
syncKnowledge().catch(console.error);
Result:
- ✅ All 5 knowledge sources unified in Sharely.ai
- ✅ AI agents now provide insights across entire corpus
- ✅ Members find information instantly, regardless of original source
- ✅ Role-based access ensures appropriate content visibility
- ✅ Hourly sync keeps knowledge current
- ✅ Idempotent operations mean restarts are safe
Key Concepts
Temporal Workflows & Idempotency
Why it matters: Knowledge synchronization involves multiple API calls, each of which could fail due to network issues, rate limits, or service disruptions.
Temporal workflows power many Sharely.ai API operations (especially role assignment and file processing), providing:
- Automatic retries - Failed operations retry automatically without manual intervention
- Durable execution - Workflows survive service restarts and crashes
- Idempotent operations - Running the same operation multiple times has no side effects
- Eventual consistency - Operations complete reliably, even if they take time
For sync scripts, this means:
- ✅ You can restart your sync script at any time without corrupting data
- ✅ Duplicate knowledge items won't be created (matched by source_path in metadata)
- ✅ Role assignments eventually succeed even if they're queued
- ✅ Your sync job is bulletproof against transient failures
Best practice: Always include a unique identifier (like source_path) in knowledge metadata to enable idempotent reconciliation.
Source of Truth Pattern
The source of truth is a single, authoritative configuration that defines what should exist in Sharely.ai.
Why YAML?
- Declarative - Describes desired state, not imperative steps
- Version controlled - Track changes in Git, enable rollbacks
- Human-readable - Easy to audit and review
- Diff-friendly - See exactly what changed between versions
Pattern:
# This file IS the truth
# Sharely.ai should match this exactly
knowledge:
  - source_path: "system-a/doc-1"
    title: "Document 1"
  - source_path: "system-b/doc-2"
    title: "Document 2"
Sync script responsibility:
- Read the YAML (source of truth)
- Read Sharely.ai (current state)
- Make Sharely.ai match the YAML
Benefits:
- ✅ Single source of truth for all knowledge
- ✅ Audit trail via Git history
- ✅ Rollback capability (revert YAML, re-sync)
- ✅ Clear separation: YAML = what, script = how
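Because a bad source-of-truth file can trigger mass deletions on the next sync, it's worth validating the parsed config before reconciling. A minimal sketch over a plain parsed object (the checks mirror the fields used in this guide; adjust to your schema):

```javascript
// Validate a parsed knowledge-config object before reconciling.
// Returns a list of human-readable problems; an empty list means valid.
function validateConfig(config) {
  const problems = [];
  if (!config.workspace_id) problems.push('missing workspace_id');
  if (!Array.isArray(config.knowledge)) {
    problems.push('knowledge must be a list');
    return problems;
  }
  const seen = new Set();
  config.knowledge.forEach((item, i) => {
    if (!item.source_path) {
      problems.push(`item ${i}: missing source_path`);
    } else if (seen.has(item.source_path)) {
      problems.push(`item ${i}: duplicate source_path ${item.source_path}`);
    } else {
      seen.add(item.source_path);
    }
    if (!item.title) problems.push(`item ${i}: missing title`);
    if (item.type === 'LINK' && !item.url) problems.push(`item ${i}: LINK requires url`);
  });
  return problems;
}
```

A sync script can abort (and alert) when the list is non-empty instead of reconciling against a broken desired state.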
Reconciliation Cycle
Reconciliation is the process of making the current state (Sharely.ai) match the desired state (YAML configuration). The process compares what should exist according to your source of truth against what actually exists in Sharely.ai, then applies the necessary changes to bring them into alignment.
Three-step reconciliation:
- Add missing - Items in YAML but not in Sharely.ai → CREATE
- Remove orphaned - Items in Sharely.ai but not in YAML → DELETE
- Update changed - Items in both but with different metadata → UPDATE (optional)
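The third step is optional because it requires deciding what counts as "changed". One hedged approach is a field-by-field comparison over the attributes you sync (the helper and field list below are illustrative):

```javascript
// Fields that, when different, mean the Sharely.ai item should be updated.
const COMPARED_FIELDS = ['title', 'url', 'language'];

// Return items present in both states whose compared fields differ.
// desiredByPath / currentByPath map source_path -> item.
function findChanged(desiredByPath, currentByPath) {
  const changed = [];
  for (const path of Object.keys(desiredByPath)) {
    const current = currentByPath[path];
    if (!current) continue; // missing items are handled by the ADD step
    const differs = COMPARED_FIELDS.some(
      field =>
        desiredByPath[path][field] !== undefined &&
        desiredByPath[path][field] !== current[field]
    );
    if (differs) changed.push({ path, desired: desiredByPath[path] });
  }
  return changed;
}
```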
Implementation:
// Build maps
const yamlMap = {};
config.knowledge.forEach(item => {
yamlMap[item.source_path] = item;
});
const sharelyMap = {};
existingKnowledge.forEach(item => {
if (item.metadata?.source_path) {
sharelyMap[item.metadata.source_path] = item;
}
});
// Reconcile
for (const sourcePath in yamlMap) {
if (!sharelyMap[sourcePath]) {
// ADD: In YAML but not Sharely
await createKnowledge(yamlMap[sourcePath]);
}
}
for (const sourcePath in sharelyMap) {
if (!yamlMap[sourcePath]) {
// DELETE: In Sharely but not YAML
await deleteKnowledge(sharelyMap[sourcePath].knowledgeId);
}
}
CMS Integration Patterns
WordPress
Sync blog posts, pages, and media from WordPress:
knowledge:
  - source_path: "wordpress://post-123"
    title: "10 Healthcare Trends to Watch in 2024"
    type: "LINK"
    url: "https://nmea.org/blog/healthcare-trends-2024"
    wordpress_post_id: 123
    roles: ["all-members"]
WordPress API integration:
// Fetch posts from WordPress REST API
const wpPosts = await fetch('https://nmea.org/wp-json/wp/v2/posts').then(r => r.json());
// Transform to YAML format
const knowledgeItems = wpPosts.map(post => ({
source_path: `wordpress://post-${post.id}`,
title: post.title.rendered,
type: 'LINK',
url: post.link,
wordpress_post_id: post.id,
roles: ['all-members']
}));
Contentful
Sync structured content from Contentful headless CMS:
knowledge:
  - source_path: "contentful://entry-abc123"
    title: "Understanding Chronic Kidney Disease"
    type: "LINK"
    url: "https://nmea.org/conditions/chronic-kidney-disease"
    contentful_entry_id: "abc123"
    contentful_content_type: "medical-article"
    roles: ["medical-professionals"]
Contentful API integration:
const contentful = require('contentful');
const client = contentful.createClient({
space: process.env.CONTENTFUL_SPACE_ID,
accessToken: process.env.CONTENTFUL_ACCESS_TOKEN
});
// Fetch entries of specific content type
const entries = await client.getEntries({
content_type: 'medical-article'
});
// Transform to YAML format
const knowledgeItems = entries.items.map(entry => ({
source_path: `contentful://entry-${entry.sys.id}`,
title: entry.fields.title,
type: 'LINK',
url: `https://nmea.org/articles/${entry.fields.slug}`,
contentful_entry_id: entry.sys.id,
contentful_content_type: 'medical-article',
roles: ['medical-professionals']
}));
Strapi
Sync content from Strapi open-source CMS:
knowledge:
  - source_path: "strapi://research-papers/42"
    title: "Advances in Cardiac Surgery Techniques"
    type: "FILE"
    url: "https://api.nmea.org/uploads/cardiac-surgery-2024.pdf"
    strapi_content_type: "research-papers"
    strapi_id: 42
    roles: ["medical-professionals", "surgeons"]
Strapi API integration:
// Fetch content from Strapi REST API
const strapiData = await fetch(
'https://api.nmea.org/api/research-papers?populate=*',
{
headers: {
'Authorization': `Bearer ${process.env.STRAPI_API_TOKEN}`
}
}
).then(r => r.json());
// Transform to YAML format
const knowledgeItems = strapiData.data.map(item => ({
source_path: `strapi://research-papers/${item.id}`,
title: item.attributes.title,
type: 'FILE',
url: `https://api.nmea.org${item.attributes.pdf.data.attributes.url}`,
strapi_content_type: 'research-papers',
strapi_id: item.id,
roles: item.attributes.roles || ['medical-professionals']
}));
API Reference
Knowledge Management APIs
All APIs use the /v1/ prefix and require authentication via Bearer token.
Create Knowledge
POST /v1/workspaces/{workspaceId}/knowledge
Create a new knowledge item (file, link, or text).
Headers:
Authorization: Bearer {token}
organizationId: {organizationId}
Content-Type: application/json
Request body:
{
"type": "LINK",
"title": "Medical Research Article",
"url": "https://example.com/article",
"language": "en",
"metadata": {
"source_path": "wordpress://post-123",
"custom_field": "custom_value"
}
}
Response:
{
"knowledgeId": "uuid-here",
"status": "BACKGROUND_START"
}
Note: Large files process asynchronously. Check status via the Knowledge API if needed.
Search/List Knowledge
GET /v1/workspaces/{workspaceId}/knowledge
List or search knowledge items with pagination.
Headers:
Authorization: Bearer {token}
Content-Type: application/json
Query parameters:
- limit - Number of items per page (default: 20, max: 100)
- offset - Pagination offset (default: 0)
- q - Semantic search query (optional)
- title - Title search (optional)
Response:
{
"items": [
{
"knowledgeId": "uuid-1",
"title": "Document Title",
"type": "LINK",
"metadata": {
"source_path": "wordpress://post-123"
}
}
],
"total": 150,
"limit": 100,
"offset": 0
}
Delete Knowledge
DELETE /v1/workspaces/{workspaceId}/knowledge/{knowledgeId}
Delete a knowledge item.
Headers:
Authorization: Bearer {token}
Response: 204 No Content
Role Management APIs
Assign Roles to Knowledge
POST /v1/workspaces/{workspaceId}/knowledge/{knowledgeId}/role
Assign roles to a knowledge item for RBAC.
Headers:
Authorization: Bearer {token}
Content-Type: application/json
Request body:
{
"roleIds": ["role-uuid-1", "role-uuid-2"]
}
Note: This operation uses Temporal workflows and is eventually consistent.
List Roles on Knowledge
GET /v1/workspaces/{workspaceId}/knowledge/{knowledgeId}/role
Get all roles assigned to a knowledge item.
Headers:
Authorization: Bearer {token}
Response:
{
"roles": [
{
"roleId": "role-uuid-1",
"name": "medical-professionals"
}
]
}
Remove Roles from Knowledge
DELETE /v1/workspaces/{workspaceId}/knowledge/{knowledgeId}/role
Remove role assignments from a knowledge item.
Headers:
Authorization: Bearer {token}
Content-Type: application/json
Request body:
{
"roleIds": ["role-uuid-1"]
}
Authentication
All API calls require a Bearer token generated via:
POST /workspaces/{workspaceId}/generate-access-key-token
Headers:
x-api-key: sk-sharely-your-api-key
Content-Type: application/json
Response:
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"expiresIn": 86400
}
Use this token in subsequent API calls as Authorization: Bearer {token}.
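Because the token expires (expiresIn is in seconds), long-running jobs should cache it and regenerate shortly before expiry instead of hitting the endpoint on every request. A sketch of an expiry-aware wrapper (the helper and safety margin are illustrative):

```javascript
// Wrap any token generator with expiry-aware caching.
// generate() must resolve to { token, expiresIn } (seconds), matching
// the shape returned by the generate-access-key-token endpoint.
function makeTokenCache(generate, marginSeconds = 60) {
  let cached = null;
  let expiresAt = 0; // unix seconds
  return async function getToken() {
    const now = Date.now() / 1000;
    if (!cached || now >= expiresAt - marginSeconds) {
      const { token, expiresIn } = await generate();
      cached = token;
      expiresAt = now + expiresIn;
    }
    return cached;
  };
}

// Usage: const getToken = makeTokenCache(generateAPIToken);
// then `await getToken()` wherever a Bearer token is needed.
```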
Best Practices
1. Use Pagination for Large Workspaces
Always paginate when fetching knowledge to avoid timeouts:
async function fetchAllKnowledge(apiToken) {
const allItems = [];
let offset = 0;
const limit = 100;
while (true) {
const response = await fetch(
`https://api.sharely.ai/v1/workspaces/${WORKSPACE_ID}/knowledge?limit=${limit}&offset=${offset}`,
{ headers: { 'Authorization': `Bearer ${apiToken}` } }
);
const data = await response.json();
allItems.push(...data.items);
if (data.items.length < limit) break; // No more items
offset += limit;
}
return allItems;
}
2. Store Source Identifiers in Metadata
Always include a unique identifier in metadata to enable idempotent reconciliation:
{
"type": "LINK",
"title": "Article Title",
"metadata": {
"source_path": "wordpress://post-123", // Unique identifier
"source_system": "wordpress",
"last_synced": "2024-01-15T10:30:00Z"
}
}
This prevents duplicate creation and enables safe deletion of orphaned items.
3. Handle Async Operations
Some operations (file uploads, role assignments) use Temporal workflows and complete asynchronously:
async function createKnowledge(item) {
const response = await fetch(/* ... */);
const result = await response.json();
if (result.status === 'BACKGROUND_START') {
console.log(`Processing started for ${item.title}, continuing...`);
// Don't wait - Temporal ensures eventual completion
}
return result;
}
Key point: You don't need to poll for completion. Temporal workflows ensure the operation completes eventually, even if your script exits.
4. Version Control Your YAML Configuration
Store your knowledge configuration in Git:
git add knowledge-config.yaml
git commit -m "Add new research papers to knowledge sync"
git push
Benefits:
- Track changes over time
- Collaborate with team on knowledge structure
- Rollback if needed
- Audit trail of all modifications
5. Run Sync on Schedule
Use cron or a scheduler to keep knowledge current:
# Run every hour
0 * * * * cd /path/to/sync && node sync-script.js >> sync.log 2>&1
Or use a workflow orchestration tool like Temporal, Airflow, or GitHub Actions.
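For teams already on GitHub, a scheduled Actions workflow is a common alternative to cron. The sketch below assumes the script and its package.json live in the repository root and that credentials are stored as repository secrets (file name and secret names are illustrative):

```yaml
# .github/workflows/knowledge-sync.yml (illustrative)
name: knowledge-sync
on:
  schedule:
    - cron: '0 * * * *'   # hourly, mirroring the cron example above
  workflow_dispatch:       # also allow manual runs
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: node sync-script.js
        env:
          SHARELY_API_KEY: ${{ secrets.SHARELY_API_KEY }}
          SHARELY_WORKSPACE_ID: ${{ secrets.SHARELY_WORKSPACE_ID }}
          SHARELY_ORGANIZATION_ID: ${{ secrets.SHARELY_ORGANIZATION_ID }}
```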
6. Implement Error Handling
Always handle API errors gracefully:
async function createKnowledgeItem(apiToken, item) {
try {
const response = await fetch(/* ... */);
if (!response.ok) {
const error = await response.json();
console.error(`Failed to create ${item.title}:`, error);
return null; // Continue with other items
}
return await response.json();
} catch (err) {
console.error(`Network error creating ${item.title}:`, err);
return null;
}
}
7. Apply Roles Consistently
If using RBAC, ensure roles are applied during sync:
knowledge:
  - source_path: "azure://sensitive-data.pdf"
    title: "Confidential Research"
    roles: ["senior-researchers", "administrators"]
if (item.roles && item.roles.length > 0) {
await assignRoles(apiToken, knowledgeId, item.roles);
}
Scaling to a Synchronization Service
While the examples above demonstrate sync scripts suitable for cron jobs or scheduled tasks, production environments often require more sophisticated synchronization services that can handle updates at scale efficiently.
From Script to Service
Evolution path:
- Basic Script - Cron-based full reconciliation (good for < 10,000 items, hourly sync)
- Incremental Sync - Track last sync timestamp, only process changes (good for < 100,000 items)
- Event-Driven Service - Webhook-triggered updates, queue-based processing (production scale)
Architecture Considerations
When building a production-ready synchronization service, consider these architectural patterns:
1. Change Detection at Source
Instead of fetching all knowledge on every sync, detect what changed:
// Track last sync timestamp
const lastSync = await getLastSyncTimestamp();
// Query only changed items from source
const changedItems = await sourceSystem.getUpdatedSince(lastSync);
// Reconcile only the delta
await reconcileChanges(changedItems);
Benefits:
- Reduces API calls to both source and Sharely.ai
- Faster sync cycles
- Lower infrastructure costs
Implementation approaches:
- Source system provides updated_at or modified_since filtering
- Maintain a sync-state database with per-item timestamps
- Use ETags or version numbers for change detection
2. Event-Driven Updates
Replace polling with webhooks from source systems:
// Webhook receiver
app.post('/webhooks/wordpress', async (req, res) => {
const { post_id, action } = req.body; // created, updated, deleted
// Queue job for processing
await queue.enqueue({
type: 'sync_item',
source: 'wordpress',
item_id: post_id,
action: action
});
res.status(200).send('Queued');
});
Benefits:
- Real-time synchronization (seconds instead of hours)
- No unnecessary polling
- Lower latency for end users
Implementation requirements:
- Webhook endpoints for each source system
- Authentication and verification of webhook sources
- Queue for buffering high-volume updates
- Retry logic for failed webhook processing
3. Queue-Based Processing
Use message queues to handle large volumes:
// Producer: Add jobs to queue
await queue.enqueue({
type: 'sync_knowledge',
source_path: 'azure://research/doc.pdf',
action: 'create'
});
// Consumer: Worker processes jobs
queue.process('sync_knowledge', async (job) => {
const { source_path, action } = job.data;
if (action === 'create') {
await createKnowledgeItem(source_path);
} else if (action === 'delete') {
await deleteKnowledgeItem(source_path);
}
});
Queue technologies:
- AWS SQS - Managed, serverless
- RabbitMQ - Self-hosted, feature-rich
- Apache Kafka - High-throughput, event streaming
- Redis Queue (Bull/BullMQ) - Simple, Node.js-friendly
Benefits:
- Parallel processing with multiple workers
- Automatic retries on failure
- Rate limiting and backpressure handling
- Visibility into processing status
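The queue object in these snippets is hypothetical. For local development before adopting one of the technologies above, a minimal in-memory queue with bounded retries and a dead-letter list can stand in:

```javascript
// Tiny in-memory job queue with bounded retries (development only;
// swap in SQS, RabbitMQ, or BullMQ for production).
class InMemoryQueue {
  constructor(maxRetries = 3) {
    this.jobs = [];
    this.maxRetries = maxRetries;
  }
  enqueue(job) {
    this.jobs.push({ data: job, attempts: 0 });
  }
  // Drain the queue, retrying failed jobs up to maxRetries times.
  // Returns jobs that exhausted their retries (the "dead letters").
  async process(handler) {
    const failed = [];
    while (this.jobs.length > 0) {
      const job = this.jobs.shift();
      try {
        await handler(job.data);
      } catch (err) {
        job.attempts += 1;
        if (job.attempts <= this.maxRetries) {
          this.jobs.push(job); // retry at the back of the queue
        } else {
          failed.push({ job: job.data, error: err.message });
        }
      }
    }
    return failed;
  }
}
```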
4. Parallel Processing with Workers
Scale horizontally by running multiple sync workers:
// Main orchestrator
const chunks = chunkArray(allKnowledge, 100); // Process 100 items per worker
await Promise.all(
chunks.map(chunk =>
processChunk(chunk, workerId)
)
);
// Worker function
async function processChunk(items, workerId) {
console.log(`Worker ${workerId} processing ${items.length} items`);
for (const item of items) {
await syncKnowledgeItem(item);
}
}
Scaling strategies:
- Partition knowledge by source system (WordPress worker, Azure worker, etc.)
- Partition by content type (videos, PDFs, links)
- Use worker pools with configurable concurrency
- Deploy workers as separate containers/pods for horizontal scaling
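The orchestrator sketch above calls chunkArray without defining it; here is one way to implement it, along with a bounded-concurrency mapper for cases where running an entire chunk through Promise.all at once is too bursty (both helpers are illustrative):

```javascript
// Split an array into fixed-size chunks.
function chunkArray(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Run an async task over items with at most `concurrency` in flight,
// preserving input order in the results array.
async function mapWithConcurrency(items, concurrency, task) {
  const results = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++; // claim the next index (single-threaded, so safe)
      results[i] = await task(items[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(concurrency, items.length) },
    worker
  );
  await Promise.all(workers);
  return results;
}
```

A bounded pool also makes it easy to respect API rate limits: concurrency becomes your tuning knob.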
5. State Management
Track sync progress and health in a database:
CREATE TABLE sync_state (
id SERIAL PRIMARY KEY,
source_path VARCHAR(500) UNIQUE,
source_system VARCHAR(100),
last_synced_at TIMESTAMP,
sharely_knowledge_id UUID,
sync_status VARCHAR(50), -- 'synced', 'pending', 'failed'
error_message TEXT,
retry_count INTEGER DEFAULT 0
);
Use cases:
- Resume failed syncs from where they stopped
- Monitor which items are out of sync
- Detect items that consistently fail
- Report on sync lag and health
Queries:
- Find items not synced in last 24 hours
- Identify items with repeated failures
- Calculate sync coverage percentage
6. Monitoring and Observability
Instrument your sync service for production:
// Metrics
metrics.increment('knowledge.synced', { source: 'wordpress' });
metrics.timing('sync.duration', duration, { source: 'wordpress' });
metrics.gauge('sync.lag_seconds', lagInSeconds);
// Logging
logger.info('Sync started', { source: 'wordpress', item_count: 150 });
logger.error('Sync failed', { source_path, error: err.message });
// Alerts
if (failureRate > 0.1) {
alerting.trigger('high_sync_failure_rate', { rate: failureRate });
}
Key metrics to track:
- Sync lag (time between source update and Sharely.ai update)
- Success/failure rates
- Processing throughput (items/second)
- Queue depth
- API error rates
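The failureRate in the alerting snippet has to come from somewhere; a sliding-window counter is one simple way to derive it (class name and window size are illustrative):

```javascript
// Track sync outcomes over a sliding window and expose the failure rate.
class FailureRateTracker {
  constructor(windowSize = 100) {
    this.windowSize = windowSize;
    this.outcomes = []; // true = success, false = failure
  }
  record(success) {
    this.outcomes.push(success);
    if (this.outcomes.length > this.windowSize) this.outcomes.shift();
  }
  failureRate() {
    if (this.outcomes.length === 0) return 0;
    const failures = this.outcomes.filter(ok => !ok).length;
    return failures / this.outcomes.length;
  }
}

// Usage: tracker.record(response.ok) after each API call,
// then alert when tracker.failureRate() > 0.1.
```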
Implementation Patterns
Pattern 1: Temporal Workflow Orchestration
Leverage Temporal for durable, long-running sync workflows:
// Temporal workflow for full reconciliation
async function fullSyncWorkflow(workspaceId) {
// Fetch all source data
const sourceItems = await activities.fetchAllSourceData();
// Fetch all Sharely knowledge
const sharelyItems = await activities.fetchSharelyKnowledge(workspaceId);
// Reconcile (durable, survives restarts)
const toAdd = await activities.calculateDelta(sourceItems, sharelyItems);
// Process in batches (parallel activities)
for (const batch of chunk(toAdd, 50)) {
await activities.syncBatch(batch);
}
return { synced: toAdd.length };
}
Benefits:
- Workflow state persisted automatically
- Survives service restarts
- Built-in retries and error handling
- Activity versioning for safe deployments
Pattern 2: Incremental Sync with Timestamps
Only sync what changed since last run:
async function incrementalSync() {
const lastRun = await db.getLastSyncTimestamp('wordpress');
// Fetch only items modified since last sync
const updatedPosts = await wordpress.getPostsModifiedSince(lastRun);
const deletedPosts = await wordpress.getDeletedPostsSince(lastRun);
// Sync updates
for (const post of updatedPosts) {
await syncWordPressPost(post);
}
// Remove deleted items
for (const post of deletedPosts) {
await removeKnowledgeBySourcePath(`wordpress://post-${post.id}`);
}
// Update last sync timestamp
await db.setLastSyncTimestamp('wordpress', Date.now());
}
Pattern 3: Webhook + Queue Hybrid
Combine webhooks for real-time updates with scheduled full reconciliation:
// Webhook for real-time updates
app.post('/webhook/contentful', async (req, res) => {
await queue.add('sync-item', {
source: 'contentful',
entry_id: req.body.sys.id,
action: req.body.sys.type // created, updated, deleted
});
res.sendStatus(200);
});
// Daily full reconciliation (catch any missed webhooks)
cron.schedule('0 2 * * *', async () => {
await fullReconciliation('contentful');
});
Best of both worlds:
- Real-time updates via webhooks (seconds latency)
- Full reconciliation catches missed events (eventual consistency)
- Resilient to webhook delivery failures
Coming Soon: Sample Sync Scripts
We're developing official sync scripts for popular platforms:
- WordPress - Sync posts, pages, and media
- Contentful - Sync structured content from headless CMS
- Strapi - Sync content from open-source CMS
- SharePoint - Sync documents and lists
- Notion - Sync pages and databases
Early access: Contact support@sharely.ai to join our beta program.
Related Documentation
APIs
- Knowledge API - Create, search, and manage knowledge
- Roles API - Create and manage RBAC roles
- Knowledge Roles API - Assign knowledge to roles
- Authentication - API authentication guide
Integration & Distribution
- Web Control - Embed AI agents in your applications
- Bring Your Own Agent - Deploy custom agents
- Platform Overview - Understand Sharely.ai architecture
Concepts
- Knowledge - How knowledge management works
- Roles - Understanding RBAC in Sharely.ai
- Workspace - Workspace management
Support
- Email: support@sharely.ai
- Documentation: https://docs.sharely.ai
- API Reference: https://docs.sharely.ai/api-reference
Next Steps
- Define your source systems - Identify where your knowledge currently lives
- Create YAML configuration - Define your desired knowledge state
- Build sync script - Use the examples above as templates
- Test reconciliation - Run in test workspace first
- Schedule regular sync - Keep knowledge current automatically
- Monitor and maintain - Track sync logs and adjust as needed