Section 4: Data & Knowledge

SharePoint Intelligence Framework

Knowledge management strategy, metadata architecture, and AI indexing recommendations to maximize retrieval quality from SharePoint.

Strategic Objective

Transform SharePoint from a document repository into an AI-queryable knowledge system. The goal: any team member (or AI agent) can find the right information within seconds, not hours.

Current State Assessment

Most tax and finance SharePoint sites suffer from:

  • Inconsistent naming conventions
  • Missing or incorrect metadata
  • Deeply nested folder structures that AI cannot traverse effectively
  • Duplicate documents across multiple sites
  • No content lifecycle management (outdated documents sit alongside current ones)
  • Poor search relevance due to lack of structure

Target Architecture

Recommended Site Structure

SharePoint Site | Purpose | Key Libraries
Tax Knowledge Hub | Central tax knowledge management | Policies, Procedures, Advisory Archive, Research, Templates
IDF Program | E-invoicing project documentation | Country folders, Vendor docs, Technical specs, Meeting minutes
Finance Shared Services | Operational documentation | Process docs, Training materials, Compliance calendars
Tax Technology | Systems and tools documentation | Integration specs, Configuration guides, Vendor contracts

Metadata Strategy

Core Metadata Columns (apply across ALL tax documents):

Column | Type | Values | AI Purpose
Document Type | Choice | Policy, Procedure, Advisory Memo, Research Note, Decision Record, Template, Correspondence | Enables filtered AI search
Jurisdiction | Choice (multi) | EU-Wide, France, Germany, Netherlands, UK, Belgium, Poland, UAE, Other | Geographic scoping
Tax Type | Choice | VAT, Corporate Tax, Transfer Pricing, Customs, Withholding Tax, Other | Topic filtering
Status | Choice | Draft, Under Review, Current, Superseded, Archived | Only surface current docs to AI
Confidentiality | Choice | Internal, Restricted, Public | Access control for AI responses
Valid From | Date | n/a | Temporal relevance
Valid Until | Date | n/a | Auto-archive trigger
Entity | Choice (multi) | [List of Canon entities] | Entity-specific search
Author/Owner | Person | n/a | Accountability
Review Date | Date | n/a | Content freshness trigger
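
The choice columns above can be expressed as data and checked before a document is uploaded. The following is a minimal sketch in Python, assuming the metadata arrives as a plain dictionary. The allowed values are copied from the table; the names `CHOICES`, `MULTI_VALUE`, and `validate_metadata` are hypothetical and are not part of any SharePoint API.

```python
from typing import Dict, List, Union

# Allowed values copied from the Core Metadata Columns table (subset for brevity).
CHOICES: Dict[str, List[str]] = {
    "Document Type": ["Policy", "Procedure", "Advisory Memo", "Research Note",
                      "Decision Record", "Template", "Correspondence"],
    "Tax Type": ["VAT", "Corporate Tax", "Transfer Pricing", "Customs",
                 "Withholding Tax", "Other"],
    "Status": ["Draft", "Under Review", "Current", "Superseded", "Archived"],
    "Jurisdiction": ["EU-Wide", "France", "Germany", "Netherlands", "UK",
                     "Belgium", "Poland", "UAE", "Other"],
}

# Columns that accept more than one value ("Choice (multi)" in the table).
MULTI_VALUE = {"Jurisdiction"}


def validate_metadata(metadata: Dict[str, Union[str, List[str]]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems: List[str] = []
    for column, allowed in CHOICES.items():
        value = metadata.get(column)
        if not value:
            problems.append(f"{column}: missing")
            continue
        values = value if isinstance(value, list) else [value]
        if len(values) > 1 and column not in MULTI_VALUE:
            problems.append(f"{column}: only one value allowed")
        for v in values:
            if v not in allowed:
                problems.append(f"{column}: '{v}' is not an allowed value")
    return problems
```

Returning a list of problems, instead of raising on the first one, lets an upload form or batch job report every gap in a single pass.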

IDF-Specific Metadata:

Column | Type | Values | Purpose
Country Implementation | Choice | France, Poland, Belgium, UAE, Future | Filter by country
Project Phase | Choice | Discovery, Design, Build, Test, Go-Live, Hypercare | Phase context
Document Category | Choice | Requirement, Design, Test Plan, Migration, Training, Change | Document purpose
Vendor | Choice | [Vendor list] | Vendor-specific retrieval
Workstream | Choice | Technical, Business Process, Data, Change Management, Governance | Workstream filtering

Folder Structure Recommendation

Principle: Flat is better than deep for AI retrieval.

Maximum 2 levels of folders. Use metadata for filtering instead of folder nesting.

Tax Knowledge Hub/
├── Policies & Procedures/
│   └── [Use metadata for VAT/CT/TP filtering, not subfolders]
├── Advisory Archive/
│   └── [Use metadata for year, jurisdiction, topic filtering]
├── Research & Analysis/
│   └── [Use metadata for topic, jurisdiction filtering]
├── Decision Records/
│   └── [Use metadata for entity, topic filtering]
├── Templates/
│   └── [Use metadata for document type filtering]
└── External Guidance/
    └── [Use metadata for source, jurisdiction filtering]
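
The two-level limit can be audited from an exported list of file paths. The sketch below is illustrative only: it assumes paths are given relative to the site root with `/` separators and end in a file name, and it counts every folder segment toward the limit. The function names and the constant `MAX_FOLDER_DEPTH` are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

MAX_FOLDER_DEPTH = 2  # limit stated in the folder structure principle


def folder_depth(relative_path: str) -> int:
    """Count the folder segments in a path that ends with a file name."""
    return len(PurePosixPath(relative_path).parent.parts)


def too_deep(paths: Iterable[str], limit: int = MAX_FOLDER_DEPTH) -> List[str]:
    """Return the paths whose folder nesting exceeds the limit."""
    return [p for p in paths if folder_depth(p) > limit]
```

Running `too_deep` over an inventory export during the audit phase gives a concrete list of documents to flatten during migration.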

Why flat structures matter for AI:

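  • Microsoft 365 Copilot and SharePoint search work from the search index, which stores each document with its library column values; a folder contributes only a segment of the document's path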
  • Deep folder nesting reduces discoverability
  • Metadata-driven filtering is more flexible than folder-based organization
  • AI can filter by multiple metadata attributes simultaneously (not possible with folders, because a document sits in only one folder path)
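
To make the last point concrete, here is a small in-memory sketch of filtering on several metadata attributes at once. It assumes documents are represented as dictionaries keyed by column name; the `matches` helper and the sample file names are hypothetical, and real retrieval would go through SharePoint search rather than a Python list.

```python
from typing import Dict, List, Union

Value = Union[str, List[str]]


def matches(doc: Dict[str, Value], filters: Dict[str, str]) -> bool:
    """True when the document satisfies every filter (logical AND)."""
    for column, wanted in filters.items():
        value = doc.get(column, [])
        # Multi-value columns are lists; single-value columns are wrapped.
        values = value if isinstance(value, list) else [value]
        if wanted not in values:
            return False
    return True


# Hypothetical sample records for illustration.
docs = [
    {"Name": "fr-vat-memo.docx", "Tax Type": "VAT",
     "Jurisdiction": ["France"], "Status": "Current"},
    {"Name": "eu-vat-policy.docx", "Tax Type": "VAT",
     "Jurisdiction": ["EU-Wide", "France"], "Status": "Superseded"},
    {"Name": "de-tp-note.docx", "Tax Type": "Transfer Pricing",
     "Jurisdiction": ["Germany"], "Status": "Current"},
]

query = {"Tax Type": "VAT", "Jurisdiction": "France", "Status": "Current"}
hits = [d["Name"] for d in docs if matches(d, query)]
print(hits)  # ['fr-vat-memo.docx']
```

A folder tree could encode only one of these three dimensions in any given path; the metadata version combines all three without moving a file.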

AI Indexing Recommendations

Microsoft 365 Copilot Integration

Configuration | Setting | Rationale
Semantic Index | Enabled for all tax libraries | Allows natural language queries
Restricted Content | Sensitivity labels applied | Prevents leakage in AI responses
Content freshness | Prioritize last 24 months | Reduce noise from outdated content
File types indexed | DOCX, PDF, XLSX, PPTX, MSG | Cover all knowledge containers
OCR enabled | Yes | Index scanned documents

Retrieval Optimization Strategies

1. Document Summaries (AI-Generated)

Add a "Summary" metadata column to key documents. Populate it using Copilot with a prompt such as:

"Summarize this document in 2-3 sentences focusing on: what decision was made, what jurisdiction it applies to, and any conditions or limitations."

This summary becomes searchable metadata: a short, consistent description that queries can match against, which should improve retrieval relevance.

2. Knowledge Articles

For complex topics, create standalone knowledge articles (short, structured documents) that serve as entry points:

# VAT Treatment of Software Licensing — Canon Position
Jurisdiction: EU-wide
Last Updated: [Date]
Status: Current

## Summary Position
[2-3 sentence summary]

## Detailed Analysis
[Link to full advisory memo]

## Key References
- EU VAT Directive Article [X]
- ECJ Case [reference]
- Internal memo [reference with link]

## Conditions & Limitations
[When this position applies and when it doesn't]

3. Tagging for AI Discovery

Add a "Keywords" multi-line text field populated with:

  • Alternative terms for the same concept
  • Related topics AI might associate
  • Common queries this document answers
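
The three keyword sources in strategy 3 can be merged into a single field value with a small helper. This is a sketch under stated assumptions: one entry per line in the "Keywords" field, and case-insensitive de-duplication. The function name `build_keywords` is hypothetical.

```python
from typing import Iterable, List


def build_keywords(alternative_terms: Iterable[str],
                   related_topics: Iterable[str],
                   common_queries: Iterable[str]) -> str:
    """Merge the three keyword sources into one multi-line field value.

    Entries are trimmed, blanks are dropped, and duplicates are removed
    case-insensitively while the first spelling and order are kept.
    """
    seen = set()
    lines: List[str] = []
    for entry in (*alternative_terms, *related_topics, *common_queries):
        cleaned = entry.strip()
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            lines.append(cleaned)
    return "\n".join(lines)
```

Keeping the first spelling means the preferred term can be listed first and will survive de-duplication.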

Content Lifecycle Management

Document Age | Action | Automation
Created today | Full index, high priority | Automatic
6 months | Review date triggered | Power Automate notification
12 months | Freshness review required | Workflow to owner
24 months | Mark for archival review | Status → Under Review
36+ months | Archive or confirm still current | Status → Archived (excluded from AI)
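
The lifecycle table translates naturally into a threshold lookup. The sketch below assumes that age is measured in whole months from the creation date and that each row applies from its stated age until the next row is reached; the table does not specify either point, so both are assumptions. The names `LIFECYCLE`, `age_in_months`, and `lifecycle_action` are hypothetical, and an actual rollout would implement this logic in Power Automate.

```python
from datetime import date

# (minimum age in months, action) pairs taken from the table, oldest first.
LIFECYCLE = [
    (36, "Archive or confirm still current"),
    (24, "Mark for archival review"),
    (12, "Freshness review required"),
    (6, "Review date triggered"),
    (0, "Full index, high priority"),
]


def age_in_months(created: date, today: date) -> int:
    """Whole months elapsed between two dates (never negative)."""
    months = (today.year - created.year) * 12 + (today.month - created.month)
    if today.day < created.day:
        months -= 1  # the current month is not yet complete
    return max(months, 0)


def lifecycle_action(created: date, today: date) -> str:
    """Return the action for the highest threshold the document has reached."""
    age = age_in_months(created, today)
    for threshold, action in LIFECYCLE:
        if age >= threshold:
            return action
    return LIFECYCLE[-1][1]
```

Ordering the thresholds from oldest to newest lets the first match win, so adding a new row requires no change to the lookup logic.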

Implementation Roadmap

Phase | Timeline | Activities | Success Metric
1. Audit | Weeks 1-2 | Inventory all tax/finance SharePoint content. Identify duplicate, outdated, and missing documents. | Complete inventory with gap analysis
2. Structure | Weeks 3-4 | Implement recommended site/library structure. Create metadata columns. | Structure deployed, metadata schema active
3. Migration | Weeks 5-8 | Move documents to new structure. Apply metadata (batch where possible, manual for complex docs). | 80% of current docs tagged and migrated
4. Quality | Weeks 9-10 | Generate AI summaries for top 100 documents. Verify metadata accuracy. Test search relevance. | Search relevance testing: 80% of test queries return the correct document in the top 3 results
5. Operationalize | Weeks 11-12 | Training for team. Document upload procedures. Governance rules. | Team trained, procedures documented
6. Optimize | Ongoing | Monitor search analytics. Tune metadata. Fill content gaps. | Monthly search quality reviews
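
The Phase 4 success metric can be computed from a small set of test queries. The following sketch assumes each test query has exactly one expected document and that search results are available as ranked lists of document names; the function name `top_k_hit_rate` and the data layout are hypothetical.

```python
from typing import Dict, List


def top_k_hit_rate(results: Dict[str, List[str]],
                   expected: Dict[str, str],
                   k: int = 3) -> float:
    """Fraction of test queries whose expected document is in the first k results."""
    if not expected:
        raise ValueError("at least one test query is required")
    hits = sum(
        1 for query, doc in expected.items()
        if doc in results.get(query, [])[:k]
    )
    return hits / len(expected)
```

A value of 0.8 or higher meets the 80% target; queries that miss are natural candidates for the monthly search quality review.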

Governance Rules

  1. No document uploaded without metadata — Enforce via required columns
  2. Owner assigned to every document — Accountability for freshness
  3. Review dates mandatory — Content must be confirmed current or archived
  4. Templates for consistency — Standard document templates with pre-set metadata
  5. Monthly quality check — Review search analytics, identify poor-performing queries
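
Rules 2 and 3 lend themselves to an automated check. The sketch below assumes each document record is a dictionary holding the "Author/Owner" and "Review Date" columns from the core metadata table; the function name `governance_issues` is hypothetical, and in SharePoint itself rule 1 would be enforced through required columns.

```python
from datetime import date
from typing import Dict, List, Optional, Union

Field = Optional[Union[str, date]]


def governance_issues(doc: Dict[str, Field], today: date) -> List[str]:
    """Check one document record against governance rules 2 and 3."""
    issues: List[str] = []
    if not doc.get("Author/Owner"):
        issues.append("no owner assigned")
    review = doc.get("Review Date")
    if not isinstance(review, date):
        issues.append("no review date set")
    elif review < today:
        issues.append("review date has passed")
    return issues
```

Run across a library export, the non-empty results give the owner follow-up list for the monthly quality check.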