
METHODOLOGY

The engineering blueprint for capturing organisational intelligence. Five techniques. One unified knowledge graph. Built for insurance underwriting.

Master System Architecture


graph TB
  subgraph "CAPTURE LAYER"
    SS["Shadow Sessions — Active, Deep"]
    DA["Decision Archaeology — Active, Historical"]
    SE["Scenario Elicitation — Active, Structured"]
    NK["Negative Knowledge Audit — Active, Org-Wide"]
    BA["Behavioral Analytics — Passive, Continuous"]
  end
  subgraph "PROCESSING LAYER"
    TP["Transcription Pipeline"]
    AP["Analysis Pipeline"]
    PE["Pattern Engine"]
  end
  subgraph "KNOWLEDGE LAYER"
    KG["Knowledge Graph — Neo4j"]
    IS["Intent Specification — YAML Store"]
  end
  subgraph "DELIVERY LAYER"
    IE["Intent Engine"]
    AM["Alignment Monitor"]
  end
  SS --> TP
  DA --> AP
  SE --> TP
  NK --> AP
  BA --> PE
  TP --> KG
  AP --> KG
  PE --> KG
  KG --> IE
  IS --> IE
  IE --> AM
  AM -->|"drift/freshness signals"| KG

Five capture techniques feed three processing pipelines, which populate a unified knowledge graph. The Intent Engine enriches AI calls with expert knowledge. The Alignment Monitor detects drift and feeds corrections back.

01

Shadow Sessions


Parameter Value
Type Active capture (practitioner-led)
Duration 3–5 days per expert, 2–3 experts per function
ICP Context Sit with senior commercial underwriters during live submission evaluation in Guidewire / Duck Creek
Output Heuristic candidates, decision rationale transcripts, annotation-enriched event logs
Unique Value Captures the WHY — deep reasoning, emotional context, interpersonal judgment

Data Flow

graph LR
  subgraph "INPUT"
    E["Expert works on platform"]
    P["Practitioner observes + asks why"]
    R["Audio recording — with consent"]
  end
  subgraph "PROCESSING"
    FW["Faster Whisper — transcription"]
    SP["spaCy NLP — entity extraction"]
    HE["Heuristic Extractor — custom pipeline"]
    OE["Observer Events — session assembly"]
  end
  subgraph "OUTPUT"
    HC["Heuristic Candidates"]
    ST["Annotated Transcript"]
    EL["Timestamped Event Log"]
  end
  E --> OE
  R --> FW
  FW --> SP
  SP --> HE
  P --> HE
  OE --> HE
  HE --> HC
  FW --> ST
  OE --> EL

Pipeline

Stage 1 — Audio Capture & Transcription

Input: WAV/MP3 audio from practitioner's device. 2–8 hours per session. Written consent signed.
Processing: Faster Whisper (large-v3 model). Beam size 5, word timestamps enabled, VAD filter active.
Output: JSON with word-level timestamps, speaker diarisation, per-segment confidence scores.

Stage 2 — Entity Extraction

Input: Transcription JSON segments
Processing: spaCy (en_core_web_trf) + custom NER labels: RISK_TYPE, THRESHOLD, DATA_SOURCE, HEURISTIC_SIGNAL, EXCEPTION_TRIGGER, RATIONALE. Sentence-transformers for embeddings. KeyBERT for key phrases.
Output: Annotated transcript with entities, key phrases, embeddings, heuristic_type tags.

Stage 3 — Heuristic Extraction

Input: Annotated transcript + practitioner notes + Observer event log
Processing: DBSCAN clustering on embeddings → heuristic pattern detection via regex + dependency parsing → cross-reference with Observer data → LLM structured output (Claude/GPT-4o)
Output: Structured Heuristic Candidate objects (JSON) with conditions, actions, rationale, confidence scores

Heuristic Output Schema

// Example: Shadow Session heuristic candidate
{
  "heuristic_id": "HEU-SS-001",
  "source": "shadow_session",
  "heuristic": {
    "description": "Buildings > 30yr in commercial property: uplift risk score 8–15 points above model",
    "type": "threshold_override",
    "conditions": [
      { "field": "building_age", "operator": "gt", "value": 30 },
      { "field": "submission_type", "operator": "eq", "value": "commercial_property" },
      { "field": "sum_insured", "operator": "gt", "value": 500000 }
    ],
    "action": {
      "type": "score_override",
      "direction": "increase",
      "magnitude": "8–15 points"
    },
    "rationale": "Older buildings have higher structural risk. Model underweights building age.",
    "exceptions": ["Full renovation in last 5yr → standard scoring"]
  },
  "evidence": {
    "observer_correlation": { "sessions_observed": 47, "pattern_match_rate": 0.89 },
    "cross_expert_agreement": ["expert_james_p"]
  },
  "confidence": 0.78,
  "status": "pending_validation"
}
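A minimal sketch of how a downstream engine might evaluate a candidate's conditions array against a submission. The field names mirror the schema above; the operator set (gt, eq, lt) and the evaluator itself are illustrative assumptions, not the shipped implementation.

```python
# Hypothetical evaluator for Heuristic Candidate condition sets.
# Operators beyond gt/eq/lt would be added as the schema grows.
OPERATORS = {
    "gt": lambda actual, value: actual > value,
    "lt": lambda actual, value: actual < value,
    "eq": lambda actual, value: actual == value,
}

def conditions_match(submission: dict, conditions: list) -> bool:
    """True when every condition holds; a missing field never matches."""
    for cond in conditions:
        actual = submission.get(cond["field"])
        if actual is None or not OPERATORS[cond["operator"]](actual, cond["value"]):
            return False
    return True

conditions = [
    {"field": "building_age", "operator": "gt", "value": 30},
    {"field": "submission_type", "operator": "eq", "value": "commercial_property"},
    {"field": "sum_insured", "operator": "gt", "value": 500000},
]
```

If all three conditions hold, the engine would apply the score_override action; any miss, or an absent field, leaves the model score untouched.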
Insurance implementation: Target commercial lines UWs with 15+ years' experience who are approaching retirement. Days 1–2: observe submission triage (how they decide what to quote vs. decline). Days 3–5: deep-dive on risk score overrides, pricing adjustments, broker negotiations, exception approvals.
02

Decision Archaeology


Parameter Value
Type Active capture (data-driven historical analysis)
Duration 1–2 weeks of data analysis
ICP Context Analyse 50–100 recent underwriting decisions and their claims outcomes
Output Outcome-correlated patterns, invisible variable identification, decision quality scores
Unique Value Captures patterns experts can't articulate — they emerge only from aggregate data

Data Flow

graph LR
  subgraph "INPUT"
    PD["Policy Data — Guidewire export"]
    CD["Claims Data — 3-5yr loss runs"]
    UD["Underwriter Decision Logs"]
  end
  subgraph "PROCESSING"
    DC["Data Cleaning — pandas"]
    FE["Feature Engineering — domain-specific"]
    PM["Pattern Mining — PrefixSpan + stats"]
    IV["Invisible Variable Detection"]
  end
  subgraph "OUTPUT"
    DP["Decision Patterns — ranked by outcome"]
    IVR["Invisible Variables — report"]
    HC2["Heuristic Candidates — data-derived"]
  end
  PD --> DC
  CD --> DC
  UD --> DC
  DC --> FE
  FE --> PM
  FE --> IV
  PM --> DP
  IV --> IVR
  DP --> HC2
  IVR --> HC2

Pipeline

Stage 1 — Data Extraction

Input: PolicyCenter API/CSV: 50–100+ decisions, 12–36 months. Fields: submission_id, underwriter_id, decision, risk_score, risk_score_final, premium, sum_insured, risk_class, time_to_decision.
Processing: pandas + numpy. Deduplicate, normalise currency (GBP), join policy → claims, calculate: override_flag, override_magnitude, loss_ratio, decision_time_percentile.
Output: Feature-enriched DataFrame (~30–50 columns). One row per decision.
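The production pipeline does this join and derivation in pandas; a dependency-free sketch of the three derived fields named above (override_flag, override_magnitude, loss_ratio) for a single policy/claims-joined row, with claims_paid assumed to come from the loss-run join:

```python
def enrich_decision(row: dict) -> dict:
    """Derive Stage 1 features for one decision row."""
    out = dict(row)
    # An override occurred if the final score differs from the model score.
    out["override_flag"] = row["risk_score_final"] != row["risk_score"]
    out["override_magnitude"] = row["risk_score_final"] - row["risk_score"]
    # Loss ratio: claims paid over premium earned (guard against zero premium).
    out["loss_ratio"] = row["claims_paid"] / row["premium"] if row["premium"] else None
    return out

row = {"submission_id": "S-001", "risk_score": 62, "risk_score_final": 74,
       "premium": 50_000, "claims_paid": 18_000}
```

In the real DataFrame these become vectorised column operations; the logic per row is identical.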

Stage 2 — Pattern Mining

Analysis Tool Target Example Finding
Override pattern analysis sklearn DecisionTreeClassifier (depth 5) When do experts override the model? "UWs override upward 78% when building_age > 30 AND sum_insured > £500K"
Outcome correlation sklearn RandomForestClassifier Which expert decisions predict fewer claims? "broker_relationship_strength ranks #3 — not in the model"
Invisible variable detection Feature ablation method Find fields not in the model that predict outcomes "time_of_day improves loss prediction 3.2% — decisions after 4pm have 18% higher loss ratios"
Expert clustering sklearn KMeans Group underwriters by decision style "'Cautious' cluster has 12% lower loss ratio than 'model-trusting' cluster"
Key insight: Decision Archaeology surfaces patterns that are invisible to the experts themselves. Nobody knows that decisions after 4pm produce worse outcomes — until you look at the data. These become process constraints in the Intent Engine.
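The override-pattern finding in the table reduces to a subgroup rate comparison, sketched here over plain dicts with an illustrative three-row sample (a real run fits the DecisionTreeClassifier on the full feature DataFrame):

```python
def override_rate(rows, predicate):
    """Fraction of decisions with an upward override among rows
    matching `predicate`; returns None for an empty subgroup."""
    matched = [r for r in rows if predicate(r)]
    if not matched:
        return None
    ups = sum(1 for r in matched if r["risk_score_final"] > r["risk_score"])
    return ups / len(matched)

decisions = [
    {"building_age": 45, "sum_insured": 800_000, "risk_score": 60, "risk_score_final": 72},
    {"building_age": 35, "sum_insured": 600_000, "risk_score": 55, "risk_score_final": 55},
    {"building_age": 10, "sum_insured": 900_000, "risk_score": 50, "risk_score_final": 50},
]
# The subgroup from the example finding: old buildings, large sums insured.
old_large = lambda r: r["building_age"] > 30 and r["sum_insured"] > 500_000
```

Comparing the subgroup rate against the overall override rate is what surfaces a candidate rule worth validating with the experts.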
03

Scenario Elicitation


Parameter Value
Type Active capture (structured interview)
Duration 2–3 hours per expert, 3–5 experts
ICP Context Present senior underwriters with edge-case submissions, borderline risks, novel scenarios
Output Exception-handling heuristics, decision boundary maps, conditional logic trees
Unique Value Captures exception handling and decision boundaries — the hardest knowledge to extract

Data Flow

graph LR
  subgraph "INPUT"
    SC["Scenario Library — curated edge cases"]
    DA3["Decision Archaeology findings"]
  end
  subgraph "CAPTURE"
    VI["Interview Recording"]
    SR["Structured Response Form"]
  end
  subgraph "PROCESSING"
    FW2["Faster Whisper"]
    DB["Decision Boundary Mapper"]
    CL["Conditional Logic Extractor — LLM"]
  end
  subgraph "OUTPUT"
    EH["Exception Heuristics"]
    DBM["Decision Boundary Maps"]
    CLT["Conditional Logic Trees"]
  end
  SC --> VI
  DA3 --> SC
  VI --> FW2
  SR --> DB
  FW2 --> CL
  CL --> EH
  DB --> DBM
  CL --> CLT

Scenario Library (Insurance UW)

Category Purpose Example
Borderline Risk Submissions on the approve/decline boundary Restaurant chain: clean history but one flood-zone location, no fire suppression upgrade, sum insured £2.1M, model score 68 (borderline)
Novel Risk Risks the expert hasn't seen before Indoor vertical farming facility, £5M, no comparable loss data, converted 1970s warehouse
Exception Handling When standard process doesn't apply 15-year client, £3M annual premium, major claim from contractor negligence, model recommends 35% increase
What-If Variations Map decision boundaries by varying one factor "Same submission, but building is 50 years old." Record exactly when the decision flips.

Decision Boundary Mapping

For each what-if series: vary ONE parameter, record expert's decision at each threshold, plot the inflection point where the decision changes, and capture the stated reason for the boundary.
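The what-if series can be run programmatically: sweep one parameter and record where the decision flips. In a live session the decide callable is the expert answering each variation; the stub below is a hypothetical stand-in for illustration only.

```python
def find_flip_point(decide, base: dict, field: str, values):
    """Sweep `field` over `values`; return the first value at which the
    decision differs from the decision at the first value, else None."""
    first = decide({**base, field: values[0]})
    for v in values[1:]:
        if decide({**base, field: v}) != first:
            return v
    return None

# Hypothetical stub: declines commercial property once the building passes 40 years.
stub_expert = lambda s: "decline" if s["building_age"] > 40 else "approve"
base = {"submission_type": "commercial_property", "building_age": 20}
```

The returned inflection point, together with the expert's stated reason, is what the Decision Boundary Mapper records.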

// Example: Scenario Elicitation heuristic
{
  "heuristic_id": "HEU-SE-001",
  "type": "exception_override",
  "description": "Long-standing clients (>10yr, >£1M premium) with single claim from third-party negligence: cap increase at 15%",
  "decision_boundary": {
    "tenure_threshold": "10 years — below this, follow model",
    "claim_count_threshold": "2+ claims in 5yr — follow model regardless",
    "premium_threshold": "< £500K annual — follow model, lower relationship value"
  },
  "cross_expert_agreement": "2 of 3 experts agreed",
  "confidence": 0.72
}
04

Negative Knowledge Audit


Parameter Value
Type Active capture (org-wide survey + interviews)
Duration 1–2 weeks
ICP Context Uncover failed AI initiatives, broken processes, known workarounds in underwriting
Output Anti-pattern library, workaround registry, institutional memory map
Unique Value Prevents AI from repeating known mistakes. Captures knowledge that is actively AVOIDED.

Data Flow

graph LR
  subgraph "INPUT"
    SV["Anonymous Survey — all UW staff"]
    FI["Follow-Up Interviews — 30 min each"]
    PM2["Post-Mortem Reports"]
  end
  subgraph "PROCESSING"
    SA["Survey Analysis — pandas"]
    TA["Thematic Analysis — LLM-assisted"]
    WD["Workaround Detection"]
  end
  subgraph "OUTPUT"
    AP["Anti-Pattern Library"]
    WR["Workaround Registry"]
    IM["Institutional Memory Map"]
  end
  SV --> SA
  FI --> TA
  PM2 --> TA
  SA --> AP
  TA --> AP
  WD --> WR
  TA --> IM

Survey Domains

Domain Questions
Failed Initiatives What AI/automation has been tried and didn't work? What guidelines look good on paper but fail in practice?
Workarounds What process steps do you regularly skip? Do you use personal spreadsheets alongside the official system?
Knowledge Gaps What questions from juniors are hardest to answer? What does the org understand least?
Assumptions What "truths" are outdated? What risks is the org systematically mispricing?
// Example: Anti-pattern from failed automation
{
  "anti_pattern_id": "AP-001",
  "type": "failed_automation",
  "description": "Auto-triage model (hard score cutoff at 50) abandoned after 3 months — was declining profitable risks with unusual profiles",
  "lesson": "Never use model scores alone for decline decisions. Expert review required for scores 30–70.",
  "ai_mitigation": "Intent Engine constraint: no automated decline without expert review for scores 30–70"
}
05

Expert Behavior Analytics


Passive capture — automated, continuous, zero-intrusion. The UEBA-inspired core of the Tacit engine.

Parameter Value
Type Passive capture (automated, continuous)
Duration Continuous after deployment
ICP Context Observer Agent on Guidewire/Duck Creek captures how underwriters actually assess risk
Output Behavioral baselines, decision patterns, heuristic candidates, concentration risk alerts
Unique Value Scales to every expert. Continuous enrichment. Patterns emerge from aggregate observation.

Data Flow

graph TB
  subgraph "CAPTURE — Browser Extension"
    CE["Content Script — rrweb + custom events"]
    AN["Anonymisation Layer"]
    BW["Background Worker — event batching"]
  end
  subgraph "INGESTION"
    OT["OpenTelemetry Collector"]
    RP["Redpanda — event stream"]
  end
  subgraph "PATTERN ENGINE"
    SA2["Session Assembler"]
    BM["Baseline Modeler — River"]
    SM["Sequence Miner — PrefixSpan"]
    AD["Anomaly Detector — PyOD"]
    HX["Heuristic Extractor"]
  end
  subgraph "KNOWLEDGE GRAPH"
    KG2["Neo4j"]
  end
  CE --> AN
  AN --> BW
  BW --> OT
  OT --> RP
  RP --> SA2
  SA2 --> BM
  SA2 --> SM
  BM --> AD
  SM --> HX
  AD --> HX
  HX --> KG2

Observer Agent Spec

Component Technology Detail
Capture engine rrweb DOM event serialisation. mousemove: off, scroll: 500ms throttle, input: last value only
Custom events DOM observers model_score_view, model_override, data_source_access, decision_action
Anonymisation In-browser Expert ID: SHA-256 hash. Sum insured: bucketed into tiers. NER scan removes PERSON/ORG/LOC
Transport WebSocket (wss://) AES-256-GCM encrypted. 50-event buffer, 5s flush interval. IndexedDB offline queue
Expert UI Extension popup Status indicator, pause/resume, "Why?" annotation button (what + why + confidence slider)
The "Why?" button: When an expert makes an unusual decision, the extension prompts: "That's an interesting choice — want to note why?" Annotations become the highest-value nodes in the knowledge graph. They bridge passive capture with active elicitation.
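The transport row above specifies a 50-event buffer with a 5-second flush. A sketch of that batching logic, with the clock injected for testability (the shipped worker is TypeScript and flushes over WSS; the class shape here is an assumption):

```python
class EventBuffer:
    """Batches events; flushes when the buffer reaches max_events or
    max_age_s has elapsed since the first buffered event."""
    def __init__(self, send, max_events=50, max_age_s=5.0):
        self.send, self.max_events, self.max_age_s = send, max_events, max_age_s
        self.events, self.first_ts = [], None

    def add(self, event, now: float):
        if self.first_ts is None:
            self.first_ts = now
        self.events.append(event)
        if len(self.events) >= self.max_events or now - self.first_ts >= self.max_age_s:
            self.flush()

    def flush(self):
        if self.events:
            self.send(list(self.events))       # hand the batch to transport
        self.events, self.first_ts = [], None  # reset for the next batch

batches = []
buf = EventBuffer(batches.append, max_events=3, max_age_s=5.0)
```

In the extension, the same two triggers drive the service worker's flush to the WSS endpoint, with the IndexedDB queue absorbing offline periods.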

Pattern Engine Detail

Stage Tool Function
Session Assembly Custom Python (asyncio) Group events by expert + time proximity (gap > 5 min = new session). Calculate: duration, screen sequence, time per screen, overrides, annotations.
Baseline Modeling River (CluStream + ADWIN) Incremental learning per expert. Features: session duration, screen count, override count, data source set. Alerts on behavioral change points.
Sequence Mining PrefixSpan-py Find recurring screen sequences. min_support: 0.3, min_length: 3. Output: pattern, support, which experts, outcome correlation.
Anomaly Detection PyOD (Isolation Forest) Contamination: 0.05. Flags: novel patterns, threshold shifts, process skips, extended deliberation.
Heuristic Extraction LLM + sentence-transformers Convert validated patterns → structured heuristics. Match annotations to patterns via semantic similarity. Cross-expert validation.
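The Session Assembly stage's gap rule can be sketched as follows, with events assumed to be pre-sorted (timestamp_s, screen) pairs for a single expert:

```python
GAP_S = 5 * 60  # a gap longer than 5 minutes starts a new session

def assemble_sessions(events):
    """Split one expert's event stream into sessions and summarise
    each with its duration and ordered screen sequence."""
    sessions, current = [], []
    for ts, screen in events:
        if current and ts - current[-1][0] > GAP_S:
            sessions.append(current)
            current = []
        current.append((ts, screen))
    if current:
        sessions.append(current)
    return [
        {"duration_s": s[-1][0] - s[0][0],
         "screens": [screen for _, screen in s]}
        for s in sessions
    ]

events = [(0, "triage"), (60, "risk_score"), (120, "pricing"),
          (1000, "triage"), (1060, "decision")]
```

The real assembler also attaches override counts and annotations per session before handing off to the Baseline Modeler.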
06

Knowledge Graph Ingestion


Unified Pipeline

graph LR
  subgraph "ALL SOURCES"
    S1["Shadow Session Heuristics"]
    S2["Archaeology Heuristics"]
    S3["Elicitation Heuristics"]
    S4["Anti-Patterns"]
    S5["Behavioral Heuristics"]
  end
  subgraph "INGESTION"
    VL["Validation Layer"]
    DD["Deduplication — embedding similarity"]
    CR["Conflict Resolution"]
    CS["Confidence Scoring"]
  end
  subgraph "STORE"
    KG3["Knowledge Graph — Neo4j"]
  end
  S1 --> VL
  S2 --> VL
  S3 --> VL
  S4 --> VL
  S5 --> VL
  VL --> DD
  DD --> CR
  CR --> CS
  CS --> KG3
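The deduplication step compares description embeddings across sources. A stdlib sketch, with tiny 3-dimensional vectors standing in for sentence-transformer embeddings and an illustrative similarity threshold:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def dedupe(candidates, threshold=0.92):
    """Keep the first candidate of each near-duplicate embedding cluster."""
    kept = []
    for cand in candidates:
        if all(cosine(cand["embedding"], k["embedding"]) < threshold for k in kept):
            kept.append(cand)
    return kept

candidates = [
    {"id": "HEU-SS-001", "embedding": [1.0, 0.0, 0.2]},
    {"id": "HEU-BA-017", "embedding": [0.98, 0.02, 0.21]},  # near-duplicate of the first
    {"id": "HEU-SE-004", "embedding": [0.0, 1.0, 0.0]},
]
```

A production pass would merge the duplicates' evidence (boosting confidence) rather than discard them; keeping only the first is a simplification.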

Graph Schema

graph TD
  E["Expert"] -->|contributes| H["Heuristic"]
  H -->|applies_to| D["Decision Domain"]
  H -->|has_condition| C["Condition Set"]
  H -->|produces| A["Action"]
  H -->|conflicts_with| H2["Other Heuristic"]
  D -->|regulated_by| R["Regulation"]
  I["Intent Spec"] -->|constrains| D

Node Types

Node Key Properties Example
Expert id, role, seniority, tenure, retirement_date, concentration_score Sarah M., Senior UW, 22yr, retires Sep 2027
Heuristic id, description, type, confidence, freshness, source, status "Uplift score 8–15pts for buildings > 30yr"
Domain type, sub_type, value_range, regulation_scope Commercial property, £500K+, Solvency II
Condition field, operator, value, unit building_age > 30 years
Intent Spec objectives, constraints, preferences, triggers "Combined ratio < 95%, profitability weight 0.65"

Confidence Scoring

# Confidence formula for all heuristic candidates
confidence = (
    0.20 * source_reliability      # shadow=0.9, scenario=0.85, archaeology=0.8, behavioral=0.7
  + 0.25 * cross_expert_agreement  # experts_agreeing / total_experts
  + 0.25 * outcome_correlation     # does this predict good outcomes?
  + 0.15 * observation_frequency   # sessions_observed / 100 (saturates)
  + 0.15 * annotation_support      # 1.0 if expert explained it, 0.3 if not
)
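As a runnable version of the formula, with the stated weights and per-source reliability values; the function signature and rounding are illustrative choices:

```python
WEIGHTS = {
    "source_reliability": 0.20,
    "cross_expert_agreement": 0.25,
    "outcome_correlation": 0.25,
    "observation_frequency": 0.15,
    "annotation_support": 0.15,
}
SOURCE_RELIABILITY = {"shadow_session": 0.9, "scenario": 0.85,
                      "archaeology": 0.8, "behavioral": 0.7}

def confidence(source, agreement, outcome, sessions_observed, annotated):
    """Weighted confidence score for a heuristic candidate, per the formula."""
    scores = {
        "source_reliability": SOURCE_RELIABILITY[source],
        "cross_expert_agreement": agreement,
        "outcome_correlation": outcome,
        "observation_frequency": min(sessions_observed / 100, 1.0),  # saturates at 100
        "annotation_support": 1.0 if annotated else 0.3,
    }
    return round(sum(WEIGHTS[k] * scores[k] for k in WEIGHTS), 2)
```

A fully corroborated shadow-session heuristic tops out just below 1.0 because no single source is perfectly reliable.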
07

Technology Stack


Layer Tool Role
Observer — capture rrweb (17K+ ⭐) DOM event serialisation, session recording
Observer — pipeline OpenTelemetry Event collection, processing, routing
Event streaming Redpanda (10K+ ⭐) Kafka-compatible, single binary, no JVM
Pattern — baselines River (5K+ ⭐) Streaming ML, incremental learning, ADWIN drift
Pattern — sequences PrefixSpan-py Sequential pattern mining
Pattern — anomalies PyOD (8K+ ⭐) 50+ anomaly detection algorithms
Knowledge Graph Neo4j Community Graph storage, Cypher queries
Intent — KG bridge LlamaIndex (38K+ ⭐) Neo4j → prompt context enrichment
Intent — guardrails NeMo Guardrails (4K+ ⭐) NVIDIA constraint enforcement
Intent — models LiteLLM (16K+ ⭐) Universal LLM API proxy
Monitor — audit Langfuse (7K+ ⭐) LLM decision tracing
Monitor — drift Evidently AI (5K+ ⭐) Statistical drift detection
Monitor — dashboards Grafana (65K+ ⭐) Client-facing reporting
Transcription Faster Whisper (13K+ ⭐) 4× faster Whisper, shadow session audio
NLP spaCy Entity extraction, dependency parsing
Build vs. buy split: ~30% custom code, ~70% open-source infrastructure. Custom IP: anonymisation layer, event taxonomy, heuristic extraction pipeline, KG schema, "Why?" button UX, intent spec format, freshness scoring, concentration risk detection, and the glue logic tying all layers together.
08

Delivery Timeline


12-week engagement per client. Active and passive capture run in parallel.

Week Deliverable Techniques Active
1–2 Client onboarding. Observer Agent deployed. Data access established. Behavioral Analytics begins. Negative Knowledge Audit survey distributed.
3–4 Shadow sessions with 2–3 senior experts. Decision Archaeology data pull. Shadow Sessions + Decision Archaeology + Observer collecting
5–6 Scenario elicitation interviews. Pattern Engine processing begins. Scenario Elicitation + Pattern Engine + Transcription pipeline
7–8 Knowledge Graph v1 populated. Heuristic validation sessions with experts. All heuristics → KG ingestion pipeline → expert review
9–10 Intent Specification drafted. Intent Engine configured. First AI enrichment test. Intent Engine + Knowledge Engine API live
11–12 Alignment Monitor deployed. Decision audit logs. Client dashboard live. Full system operational. Handover + training.
Ongoing Observer continues passive capture. Monthly KG reviews. Quarterly intent reviews. Behavioral Analytics + Alignment Monitor (autonomous)
09

Observer SDK — Behavioral Capture Engine


The Observer SDK is a zero-dependency TypeScript library that runs silently in the expert's browser, capturing the implicit behavioural signals that reveal tacit knowledge — without requiring the expert to do anything differently. It records not what they type, but how they think.

Parameter Value
Type Passive capture (continuous, zero-effort)
Footprint <20KB gzipped, <1% CPU
Privacy SHA-256 hashing + PII redaction before data leaves the browser
Output Behavioural events, session journeys, expert baselines, anomaly alerts
Unique Value Captures what experts cannot articulate — the patterns embedded in their actions

Architecture

graph TB
  subgraph "Browser — Expert's Application"
    DW["DOM Watcher"]
    CA["Cursor Analyzer"]
    HA["Hesitation Analyzer"]
    JT["Judgment Tracker"]
    JM["Journey Mapper"]
  end
  subgraph "Processing"
    PE["Privacy Engine"]
    SM["Session Manager"]
    BB["Baseline Builder"]
    AD["Anomaly Detector"]
  end
  subgraph "Output"
    EB["Event Buffer"]
    OQ["Offline Queue — IndexedDB"]
    TP["Transport — WebSocket"]
  end
  DW --> PE
  CA --> PE
  HA --> PE
  JT --> SM
  JM --> SM
  PE --> SM
  SM --> BB
  BB --> AD
  SM --> EB
  AD --> EB
  EB --> OQ
  EB --> TP

Cursor & Mouse Behavioural Analytics

The SDK includes a dedicated Cursor Analyzer with eight real-time detectors, grounded in Meidenbauer et al. (PMC10084322) research linking mouse movements to personality traits and cognitive states:

Detector Signal What It Reveals
Hover Linger Cursor stationary >1.5s over element without click Uncertainty, curiosity, or searching for information not immediately visible
Dead Zones 50×50px grid cells with zero mouse activity Page areas that are invisible or unimportant to the expert — potential UI clutter
Slow Deliberate Speed <50px/s sustained >1s High cognitive load — expert is reading intently, analysing, or struggling with complexity
Erratic Movement Speed >800px/s with rapid direction changes (>90°) Frustration, impatience, or disorientation — the expert may be lost in the interface
Idle Cursor <5px/s for >2s Reading, thinking, or watching content — deep internal processing
Two-Step Targeting Fast move (>600px/s) → slow corrective (<100px/s) Goal-oriented decisive action — the expert knows exactly what button they want
Text Selection Highlighting text on screen Cognitive anchoring on complex or non-structured content to maintain reading focus
Fixation Micro-movements within 25px for >250ms Deep attention and focused reading (per Meidenbauer et al. methodology)
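A simplified sketch of the speed-based detectors, expressed in Python for consistency with the other examples (the shipped analyzer is TypeScript; the production erratic detector also requires rapid >90° direction changes, and hover linger additionally checks click context). Thresholds are those in the table:

```python
def classify_segment(p1, p2):
    """Classify one cursor segment by speed. Points are (t_seconds, x, y)."""
    t1, x1, y1 = p1
    t2, x2, y2 = p2
    dt = t2 - t1
    if dt <= 0:
        return "invalid"
    speed = ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5 / dt  # px/s
    if speed < 5:
        return "idle"             # reading, thinking, watching
    if speed < 50:
        return "slow_deliberate"  # high cognitive load
    if speed > 800:
        return "erratic"          # candidate only; full detector checks >90° turns
    return "normal"
```

Runs of consecutive same-class segments are what the detectors actually emit (e.g. slow deliberate requires the <50 px/s class to be sustained for more than 1 s).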

Hesitation & Deliberation Patterns

Pattern Detection Significance
Pause Before Commit >10s inactivity before a field change or submission The expert is weighing a difficult decision — high deliberation moment
Undo/Redo A → B → A value pattern in a field Second-guessing, testing alternatives, or reconsidering initial judgment
Extended Idle >30s of no DOM events Expert has stepped away, is thinking deeply, or has switched context
Rapid Scan 3+ field focuses in <2s intervals Quick triage — expert is scanning for a specific signal rather than reading sequentially
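The undo/redo pattern reduces to spotting A → B → A triples in a field's value history. A sketch (Python stand-in for the TypeScript tracker):

```python
def undo_redo_events(history):
    """Return indices where a field value returned to its previous
    state (A -> B -> A), signalling second-guessing."""
    hits = []
    for i in range(2, len(history)):
        a, b, c = history[i - 2], history[i - 1], history[i]
        if a == c and a != b:
            hits.append(i)
    return hits
```

Each hit is emitted as a hesitation event with the field name and timestamps, so the Session Manager can correlate it with the decision that followed.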

The Expert Experience

The SDK is designed so that the expert — typically a non-technical domain professional — has an effortless experience:

Step 1: Log in as usual. Nothing to install.
Step 2: Accept the one-time consent banner.
Step 3: Work normally. SDK runs invisibly in the background.
Optional: Click the floating "Why?" button to explain a tricky decision.
Privacy guarantee: Expert identity is immediately SHA-256 hashed. Sensitive fields (SSN, DOB, etc.) are redacted to [REDACTED] before data leaves the browser. No screenshots, no keylogging, no tracking outside the work application.
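A sketch of the in-browser privacy step, written in Python for consistency with the other examples (the shipped layer is TypeScript; the sensitive-field list, salt handling, and SSN regex are illustrative assumptions):

```python
import hashlib
import re

SENSITIVE_FIELDS = {"ssn", "dob"}  # assumed field list; configured per client
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def hash_expert_id(expert_id: str, salt: str = "tacit") -> str:
    """Stable pseudonym: salted SHA-256, hex-encoded."""
    return hashlib.sha256((salt + expert_id).encode()).hexdigest()

def redact_event(event: dict) -> dict:
    """Blank sensitive fields and scrub SSN-shaped strings before transport."""
    clean = {}
    for key, value in event.items():
        if key.lower() in SENSITIVE_FIELDS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, str):
            clean[key] = SSN_PATTERN.sub("[REDACTED]", value)
        else:
            clean[key] = value
    return clean
```

Because the hash is deterministic, the Pattern Engine can still build per-expert baselines without ever holding a real identity.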
10

Socially Interactive Agent (SIA)


The SIA is a digitally embodied colleague — warm, competent, and contextually aware — that conducts structured reflection dialogues with domain experts. It captures what people know but cannot articulate by engaging them in natural, trust-building conversations that surface deep experiential knowledge.

Parameter Value
Type Active capture (AI-guided conversational reflection)
Duration 15–25 minutes per session, yielding 2–4 codified tacit rules
Trigger Scheduled, anomaly-detected, or expert-requested
Output Codified heuristics, decision boundary maps, training scenarios
Unique Value Captures the WHY behind expert behaviour — the reasoning that the Observer SDK alone cannot surface

Dual-Channel Knowledge Capture

The SIA and Observer SDK form a complementary system:

Observer SDK (Passive) SIA (Active)
Captures What experts do What experts think
Method Behavioural signals (cursor, hesitation, overrides) Structured CoT-guided dialogue
Expert Effort Zero — runs in background 15–25 min sessions
Knowledge Type Procedural patterns, decision heuristics (implicit) Reasoning, mental models, edge case rules (explicit)

Multimodal Perception

The SIA perceives the expert across four simultaneous channels — mirroring how humans read each other:

Channel Input What It Captures Technology
Verbal Speech → text Explicit statements, domain terminology, reasoning chains Whisper v3 / Conformer ASR
Paraverbal Speech → prosody Hesitation markers (um, uh), pace changes, pitch shifts indicating uncertainty Prosody analysis (F0, jitter, shimmer)
Nonverbal Webcam → expression Micro-expressions (doubt, confidence), gaze direction, head nods/shakes MediaPipe Face Mesh + AU detection
Behavioural Observer SDK → signals Cursor fixations, hover linger, override patterns, hesitation events Tacit Observer SDK
Certainty Index: The SIA fuses all four channels in real-time to compute a continuous score (0–1) estimating how confident the expert is in what they're currently saying. When the index drops, the SIA knows to probe deeper.
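A sketch of the four-channel fusion behind the Certainty Index; the channel weights and probe threshold are illustrative assumptions, not the shipped calibration:

```python
# Assumed weights; each channel supplies a 0-1 confidence estimate.
CHANNEL_WEIGHTS = {"verbal": 0.30, "paraverbal": 0.25,
                   "nonverbal": 0.25, "behavioural": 0.20}
PROBE_THRESHOLD = 0.5  # below this, the SIA probes deeper

def certainty_index(channels: dict) -> float:
    """Weighted fusion of per-channel confidence into a 0-1 score.
    Missing channels are skipped and the weights renormalised."""
    present = {k: v for k, v in channels.items() if k in CHANNEL_WEIGHTS}
    total_w = sum(CHANNEL_WEIGHTS[k] for k in present)
    return sum(CHANNEL_WEIGHTS[k] * v for k, v in present.items()) / total_w

def should_probe(channels: dict) -> bool:
    return certainty_index(channels) < PROBE_THRESHOLD
```

Renormalising over present channels lets the index degrade gracefully when, say, the webcam is off and only verbal and paraverbal signals arrive.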

Chain-of-Thought Dialogue Strategy

The SIA does not passively respond to prompts. It drives the conversation through five phases designed to excavate tacit knowledge layer by layer:

Phase Purpose Example
1. Rapport & Anchoring Establish trust via warmth and recognition of expertise "I noticed you've been handling coastal property risks for about 8 years — that's deep experience. How do you think about flood proximity differently from what the models show?"
2. Situated Recall Trigger episodic memory with specific case probing "Can you walk me through a specific case where you overrode the model's flood risk score? What did you see that the model missed?"
3. Cognitive Excavation Decompose intuition into teachable steps "You mentioned the satellite view. If a junior underwriter were sitting next to you, what would you point at on the screen?"
4. Contrastive Elicitation Map decision boundaries via counterfactuals "What if the property were 200 meters further east, past the ridge? Would your assessment change?"
5. Validation & Codification Mirror the extracted heuristic back for confirmation "Your rule seems to be: 'For coastal properties, check satellite for elevation features like ridges not captured in flood zone maps.' Does that sound right?"
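The five phases can be modelled as a simple state machine; holding a phase for deeper probing when certainty drops is hypothetical control logic, not the shipped CoT engine:

```python
PHASES = ["rapport", "situated_recall", "cognitive_excavation",
          "contrastive_elicitation", "validation"]

class DialogueSession:
    """Tracks which excavation phase the SIA is in. The SIA holds a
    phase for extra probing while the expert's certainty index is low."""
    def __init__(self):
        self.index = 0

    @property
    def phase(self):
        return PHASES[self.index]

    def advance(self, certainty: float, threshold: float = 0.5) -> str:
        # Low certainty: stay in the current phase and probe deeper.
        if certainty >= threshold and self.index < len(PHASES) - 1:
            self.index += 1
        return self.phase
```

In practice the prompt template for each phase is selected from the current state, with the expert's responses and certainty index deciding when to move on.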

Trust & Persona Engine

Trust is not a feature — it is the prerequisite for tacit knowledge sharing. The SIA dynamically calibrates its persona along two research-backed dimensions:

Dimension How the SIA Expresses It Adaptation Over Time
Warmth Active listening cues (nodding, "I see"), empathetic phrasing, remembering previous conversations High in early sessions → balanced as trust establishes
Competence Correct domain jargon, referencing regulations, informed counterfactuals, citing expert's own past decisions Moderate initially → increases as SIA proves itself knowledgeable

RAG-Powered Organisational Memory

The RAG engine gives the SIA institutional context via four real-time data sources:

Document Corpus: Policies, guidelines, regulatory texts, training manuals
Knowledge Graph: Codified tacit rules from previous SIA sessions
Behavioural Baselines: Per-expert fingerprints from Observer SDK
Case History: Past decisions, outcomes, and expert annotations

SIA Embodiment

Research shows people disclose 40% more information to embodied agents vs text interfaces (Krämer et al., 2018). The SIA uses:

Modality Implementation Purpose
Visual Realistic avatar with real-time lip sync, micro-expressions, gaze direction Social presence triggers natural storytelling and "thinking aloud"
Vocal Neural TTS with prosodic variation, paced pauses, matched cadence Strategic pauses before important questions signal that the question matters

Knowledge Externalisation Output

// Example: Codified tacit rule from a SIA session
{
  "rule_id": "TK-2026-0847",
  "source_expert": "af901a8...",
  "confidence": 0.92,
  "domain": "coastal_property_underwriting",
  "rule": "When model flags high flood risk for coastal properties, check satellite imagery for geological ridges. Properties above the flood plain on ridges are systematically over-scored.",
  "boundary_conditions": [
    "Only applies within 2km of coastline",
    "Does not apply if in designated SFHA zone",
    "Ridge must be visible in satellite/topographic data"
  ],
  "validation": {
    "expert_confirmed": true,
    "corroborated_by": 2,
    "outcome_validated": true,
    "historical_accuracy": 0.87
  }
}

Technology Stack

Layer Technology Rationale
Foundation LLM GPT-4o / Claude 3.5 / Gemini 2.0 Conversational intelligence with long-context for session continuity
CoT Engine Custom prompt chains + state machine Structured dialogue strategy adapted to expert responses
RAG LlamaIndex + pgvector / Pinecone Hybrid retrieval: vector similarity + knowledge graph traversal
Knowledge Graph Neo4j Relationships between experts, rules, cases, and domains
ASR Whisper v3 / Deepgram Nova-2 Real-time speech recognition with domain vocabulary
TTS ElevenLabs / Azure Neural TTS Expressive speech with prosodic control
Avatar Ready Player Me + NVIDIA Audio2Face Real-time lip sync and expression mapping
Emotion Detection MediaPipe + custom classifier Micro-expression and prosody analysis for Certainty Index
The feedback loop: Observer SDK detects an anomaly → SIA is triggered to probe the expert's reasoning → New rule is codified and added to the Knowledge Graph → RAG makes it available to other experts → Observer SDK baseline updates → Fewer false anomalies for that pattern.
11

Expert Reward & Collaboration Framework


To ensure experts actively contribute rather than merely tolerate observation, the system treats expert judgment as a distinct asset — "Proof of Expertise" — and provides structured incentives across three tiers.

Tier 1 — Intrinsic & Utility Rewards

Things that make the expert's daily job easier right now.

Mechanism How It Works Value to Expert
Personalised Copilot SDK learns from the expert to build a personalised AI assistant that pre-flags cases matching their patterns Less grunt work, more time for complex cases
Compliance Automation "Why?" button annotations automatically formatted into required audit documentation Eliminated duplicate administrative writing
Skill Analytics Dashboard Private metrics showing decision speed, accuracy, and unique strengths vs. baseline Self-improvement and professional validation

Tier 2 — Extrinsic Recognition

Recognising expertise publicly within the professional community.

Mechanism How It Works Value to Expert
"Golden Rule" Leaderboard Quality-based gamification: experts gain standing when their overrides prove correct over time Peer recognition and competitive motivation
Named Patterns Unique heuristics are codified and named after the expert (e.g., "The Chen Technique") Professional legacy and institutional credit
Expert Review Boards Top contributors are elevated to a paid review board that resolves low-confidence AI disputes Authority and paid advisory role

Tier 3 — Financial & Career Compensation

Direct material benefits for contributing IP to the firm.

Mechanism How It Works Value to Expert
Knowledge Royalty Pool A percentage of efficiency gains from AI models trained on expert data is distributed as a bonus pool Direct financial reward tied to knowledge impact
AI Training Time Codes "Why?" button and SIA session time officially coded as "Strategic AI Training" (billable hours) Knowledge sharing is recognised as productive work, not overhead
Promotion Pillar Tacit knowledge contribution becomes a formal KPI for reaching Senior/Principal levels Career advancement tied to institutional contribution
Anti-gaming safeguard: Rewards are tied to outcomes (override correctness validated over time) and novelty (reasoning is different from what the baseline AI already knows), preventing performative annotation spam.
12

IT Deployment Architecture


The Observer SDK is deployed to expert workstations through one of four channels, chosen based on the client's architecture and security constraints.

Deployment Options

Option Method Best For Integration Effort
A NPM Package (npm install @tacit/observer-sdk) Modern SPAs (React, Vue, Angular, Next.js) ~2 hours engineering
B CDN Script Tag (<script src="cdn.tacit.ai/...">) Legacy server-rendered portals (Java, .NET, PHP) ~1 hour — copy/paste
C Tag Manager (GTM / Tealium) When marketing/analytics team controls third-party scripts ~30 minutes — custom HTML tag
D ★ Enterprise Browser Extension (Chrome / Edge) Zero-touch deployment, 3rd-party SaaS (Salesforce, Guidewire, Pega) IT pushes via Group Policy — no code changes

Option D — Browser Extension (Recommended)

The browser extension model is the fastest, lowest-friction deployment path. IT pushes a private Chrome/Edge extension to expert laptops via corporate MDM (Intune, Jamf). No source code modifications to the host application are required.

1. IT Pushes ExtensionSilent install via Group Policy / MDM to expert laptops.
2. Allowlist ConfiguredExtension policy defines allowed domains (e.g., *.guidewire.client.com).
3. Auto-InjectionWhen expert navigates to an allowed URL, SDK injects as content script.
4. Background CaptureSDK runs silently. Events batched and streamed via service worker.

Security & Network Configuration

Concern Configuration
Content Security Policy (CSP) script-src: cdn.tacit.ai & connect-src: wss://ingest.tacit.ai
Transport Protocol Secure WebSockets (WSS) over port 443, auto-fallback to HTTPS POST
Bandwidth <2KB per minute. Events batched locally (IndexedDB) and flushed every 5s or 50 events
Data Residency Regional ingest URLs (e.g., wss://eu-west-2.ingest.tacit.ai) — data never leaves jurisdiction
PII All hashing and redaction occurs in the browser before data reaches the network
Key advantage of Option D: Works with any web-based enterprise tool — Salesforce, Guidewire, Duck Creek, Pega, custom portals — without requiring vendor cooperation or source code access. Deployment takes days, not months.

Tacit Labs — Organisational Intelligence Infrastructure
