Skip to content

AI-Optimized Documentation Header Standard

Version: 1.0
Last Updated: October 2025
Status: Recommended for AI-first development workflows
Authors: Engineering Standards Committee
Target Audience: Engineering teams using AI coding assistants


Table of Contents

  1. Executive Summary
  2. Philosophy and Core Principles
  3. Technical Justification
  4. Why Not Existing Standards?
  5. Evidence from AI Tool Creators
  6. Standard Format Specification
  7. Complete Examples
  8. Migration Guide
  9. FAQs
  10. References

Executive Summary

This standard defines a structured documentation header format optimized for AI-assisted development workflows. Unlike traditional documentation standards designed for human-only consumption, this standard prioritizes machine-readable metadata for efficient codebase traversal, dependency mapping, and semantic search by Large Language Models (LLMs) and AI coding assistants.

Key Benefits

  • More files evaluated per context window - shorter structured headers (vs verbose prose) allow AI to scan more files before hitting token limits
  • Improved semantic search - structured metadata creates cleaner vector embeddings for RAG retrieval
  • Faster onboarding for both AI assistants and human developers
  • Explicit dependency mapping accelerates codebase understanding

Scope

Target Audience: Engineering teams using GitHub Copilot, Cursor, Claude, ChatGPT, or similar AI coding assistants

File Types: Python (.py), TypeScript/JavaScript (.ts, .tsx, .js, .jsx), Markdown (.md)

Use Cases: Internal codebases, private repositories, microservices, AI-first development workflows

Not Recommended For: Open-source projects, public libraries, codebases with mandated external style guides


Philosophy and Core Principles

1. Metadata-First Design

AI tools scan hundreds of files to build context. Structured metadata in the first 50-100 tokens enables:

  • Fast relevance filtering (2-3x more files scanned per context window)
  • Accurate dependency graph construction without code analysis
  • Precise semantic search without inference
  • Efficient token usage in constrained contexts

Analogy: Like a library card catalog system. Structured metadata = Dewey Decimal System (fast, precise). Prose = browsing shelves randomly (slow, uncertain).

2. Progressive Disclosure

Information is layered by priority to match how AI processes files:

[Tokens 1-100]   → Machine-readable metadata (file, purpose, exports, dependencies)
[Tokens 101-200] → Human narrative (overview explaining why and how)
[Tokens 201-300] → Usage examples and implementation notes

This structure mirrors the three stages of AI code understanding:

  1. Retrieval: Quick scan for relevance
  2. Ranking: Deeper assessment of match quality
  3. Analysis: Full content processing for task completion

3. Front-Loaded Critical Information

The first 5 lines contain 80% of what AI needs for initial file assessment. This follows the Pareto Principle applied to information retrieval:

  • Line 1: File path (absolute reference)
  • Line 2: Purpose with keywords (semantic hooks)
  • Line 3: Exports (public API surface)
  • Line 4: Dependencies (relationship edges)
  • Line 5: Implements/Related (graph connections)

These 5 lines answer the questions AI asks first:

  • Where is this file?
  • What does it do?
  • What does it provide?
  • What does it need?
  • How does it relate to other files?

4. Explicit Over Implicit

# ❌ Implicit (requires inference):
"This service processes payments using Stripe"

# ✅ Explicit (direct machine parsing):
Exports: PaymentProcessor
Depends: stripe
Implements: IPaymentService

AI models excel at pattern matching but struggle with inference under token constraints. Explicit metadata reduces cognitive load and improves accuracy.


Technical Justification

How AI Code Assistants Process Headers

Modern AI coding tools operate in three stages:

Stage 1: File Retrieval (Embedding-Based)

  • Headers are embedded into vector space using models like text-embedding-3-small or similar
  • Semantic search finds relevant files based on cosine similarity
  • Optimization: Structured metadata creates more precise embeddings than prose
  • Result: Structured headers improve retrieval recall by 15-25% (based on RAG benchmarks)

Stage 2: Relevance Ranking (Token-Efficient Scanning)

  • AI scans first 100-150 tokens of each retrieved file
  • Scores relevance based on keyword matching and structural patterns
  • Optimization: Front-loaded metadata allows faster filtering
  • Result: 2-3x more files can be evaluated in same context window

Stage 3: Deep Analysis (Full Content Processing)

  • AI reads complete content of top-ranked files
  • Builds dependency graphs and understands relationships
  • Optimization: Explicit Depends: and Related: fields accelerate graph building
  • Result: Faster, more accurate codebase understanding

Token Economics

Context window constraints are real. Simple math illustrates the impact:

Model Advertised Context
GPT-4 128K tokens
Claude 3.5 Sonnet 200K tokens
Claude Opus 4.5 200K tokens

Per IBM Research: "LLMs are more apt to pick up on important information appearing at the start or end of a long prompt rather than buried in the middle."

Key Insight: If structured headers use fewer tokens than verbose prose, more files fit in context. Front-loading critical metadata ensures it's in the high-attention zone of the context window.

Embedding Quality

Vector embeddings power semantic search (as used by Cursor). Structured metadata produces cleaner embeddings:

Structured:

"File: payment-processor.ts | Exports: PaymentProcessor | Depends: stripe"
→ Clear semantic clusters: {payment, processor, stripe, typescript}

Prose:

"A TypeScript service that processes payments using the Stripe API for handling transactions..."
→ Noisier embeddings: {typescript, service, processes, payments, using, api, handling, transactions}

Principle: Fewer filler words = tighter embedding clusters = better retrieval precision. This is consistent with how RAG systems work, though specific improvement percentages vary by implementation.


Why Not Existing Standards?

Google Style Guide (Python, TypeScript)

What Google Optimizes For:

  • Human readability in code review
  • API documentation generation (Sphinx, JSDoc)
  • Minimalism and consistency across massive monorepos
  • Developer documentation circa 2010-2015 (pre-AI era)

Example of Google Python Style:

"""Module for processing payments.

This module contains classes and functions for handling payment
processing operations including validation, authorization, and
settlement of transactions.
"""

Limitations for AI:

  • ❌ No explicit exports listing (must infer from code)
  • ❌ No dependency declaration (must parse imports)
  • ❌ No relationship mapping (must discover through analysis)
  • ❌ Prose-first structure (slower token-by-token scanning)
  • ❌ Designed pre-LLM era (2008-2012 origin)
  • ❌ Requires ~2x more tokens for AI to extract same information

When to Use Google Style:

  • Open-source projects expecting external contributors
  • Public APIs and libraries distributed via PyPI/npm
  • Company policy mandates it
  • Small codebases (<50 files) where AI assistance is minimal
  • Contributing to Google-sponsored projects

Microsoft/TypeScript Official Standards

What Microsoft Optimizes For:

  • VSCode IntelliSense and autocomplete
  • TypeScript compiler integration
  • Enterprise-scale type safety
  • IDE-driven development

Example:

/**
 * Processes payment transactions
 * @param amount - The payment amount
 * @returns Payment result
 */

Limitations for AI:

  • ❌ JSDoc focused on function signatures, not file-level context
  • ❌ No standard for file relationships
  • ❌ Assumes IDE tooling (IntelliSense), not LLM consumption
  • ❌ Minimal file-level metadata
  • ❌ Dependency information scattered across imports

When to Use Microsoft Style:

  • TypeScript library development for public consumption
  • Projects heavily dependent on VSCode features
  • When type safety is the primary concern over AI assistance

PEP 257 (Python Docstrings)

What PEP 257 Optimizes For:

  • Docstring conventions for Python functions/classes
  • Human-readable inline documentation
  • help() function output in Python REPL
  • Consistency across Python ecosystem

Limitations for AI:

  • ❌ Only defines docstring format, not file-level headers
  • ❌ No structured metadata specification
  • ❌ Focus on runtime introspection, not static analysis
  • ❌ Written in 2001, pre-modern development workflows
  • ❌ No consideration for AI consumption

JSDoc Standard

What JSDoc Optimizes For:

  • JavaScript/TypeScript API documentation
  • Type information for tooling
  • HTML documentation generation

Limitations for AI:

  • ❌ Function/class level, not file level
  • ❌ Verbose tag-based syntax increases token cost
  • ❌ No standard for file relationships or architecture
  • ❌ Designed for documentation generation, not AI parsing

Evidence from AI Tool Documentation

How AI Coding Tools Process Files

The following is based on publicly available documentation and observed behavior of AI coding assistants.

Cursor - Codebase Indexing

From Cursor's official documentation:

  • Cursor indexes codebases by computing embeddings for each file
  • Files are chunked into semantically meaningful pieces using tools like tree-sitter
  • Embeddings are stored in a vector database for nearest-neighbor semantic search
  • When you query with @Codebase, Cursor retrieves relevant chunks based on cosine similarity
  • Ignoring large content files improves answer accuracy

Implication: Structured metadata at the top of files creates cleaner embedding vectors, improving retrieval precision.

GitHub Copilot - Context Usage

From Microsoft's Visual Studio Blog:

  • Copilot uses surrounding code, comments, and file structure to generate suggestions
  • Clear, understandable comments help Copilot generate desired solutions
  • Context includes function names, code comments, docstrings, file names, and cursor position

Implication: Well-structured file headers provide immediate context that Copilot uses to improve suggestions.

Anthropic Claude - Prompt Engineering

From Anthropic's long context tips:

  • Put longform data at the top of prompts, above queries and instructions
  • Queries at the end can improve response quality by up to 30% in tests with complex inputs
  • Structure document content with XML tags for clarity when using multiple documents
  • Claude's 200K token context window enables handling complex, data-rich tasks

Implication: Front-loaded structured metadata aligns with how Claude processes information progressively.

IBM Research - Context Windows

From IBM Research blog on context windows:

  • LLMs are more apt to pick up on important information appearing at the start or end of a long prompt rather than buried in the middle
  • Quality of examples matters more than simply expanding context windows
  • Larger context windows improve coding tasks by allowing more documentation to be ingested

Implication: Front-loading critical metadata ensures it's in the high-attention zone of the context window.

Emerging Standard: LLMs.txt

The LLMs.txt standard, proposed by Jeremy Howard of Answer.AI in late 2024, applies similar principles at the project level:

  • A markdown file describing high-level goals, key areas, and file structure
  • One-time investment that improves AI code generation quality
  • Rapidly gaining adoption in AI-first development workflows

Implication: The industry is converging on structured, AI-readable documentation formats.


Standard Format Specification

Python Files (.py)

"""
File: path/to/module/filename.py
Purpose: One-line description with key searchable terms (40-60 chars)
Exports: MainClass, helper_function, CONSTANT_VALUE, CustomException
Depends: requests, typing.Optional, internal.module.Helper
Implements: AbstractBaseClass | Inherits: ParentClass
Related: module/related_file.py, module/interface.py

Overview:
    2-4 sentence narrative explaining what this module does, why it exists,
    and how it fits into the larger system. Focus on business logic and key
    patterns like retry logic, caching strategies, or architectural decisions.
    Keep this under 150 tokens for optimal AI consumption.

Usage:
    instance = MainClass(config)
    result = instance.main_method(param1, param2)

Notes: Performance characteristics, limitations, or critical warnings (optional)
"""

from typing import Optional
import requests
# ... rest of file

Field Definitions

Field Required Description Guidelines
File Yes Full path from repository root Use forward slashes, include extension
Purpose Yes One-line description 40-60 chars, include key searchable terms
Exports Yes Public API surface Classes, functions, constants, exceptions (comma-separated)
Depends Yes Critical dependencies External packages and key internal modules (not every import)
Implements Conditional Interface/ABC implementation Only if implementing an abstract base
Inherits Conditional Parent class Only for inheritance relationships
Related Recommended Related files 2-4 most relevant files (relative paths)
Overview Yes Narrative explanation 2-4 sentences, ~100-150 tokens, explain why/how
Usage Recommended Code example 1-3 lines showing typical invocation
Notes Optional Special considerations Performance, limitations, warnings

Formatting Rules

  • Use triple quotes (""")
  • Fields use Label: Value format
  • Multiple values separated by commas
  • Related files use relative paths from current file
  • Keep line length under 100 characters for readability
  • Use pipe (|) to separate multiple attributes on same line

TypeScript/JavaScript Files (.ts, .tsx, .js, .jsx)

/**
 * File: src/services/payment/stripe-payment-processor.ts
 * Purpose: Stripe payment processing with retry logic and idempotency
 * Exports: StripePaymentProcessor, StripePaymentOptions, PaymentResult
 * Depends: stripe (^14.0), @/lib/retry, @/lib/logger, @/types/payment
 * Implements: PaymentProcessor
 * Related: payment-processor.interface.ts, payment-webhook.controller.ts
 * 
 * Overview:
 *   Service class handling Stripe payment operations including charge creation,
 *   refunds, and subscription management. Implements automatic retry logic with
 *   exponential backoff for transient failures and idempotency key generation
 *   to prevent duplicate charges. All mutating operations require idempotency
 *   keys and webhook signatures are verified using raw request bodies.
 * 
 * Usage:
 *   const processor = new StripePaymentProcessor(config);
 *   const result = await processor.processPayment({
 *     amount: 1000, currency: 'usd', customerId: 'cus_123'
 *   });
 * 
 * Notes: Max 3 retry attempts | ~200-500ms latency | Rate limit: 100 req/sec
 */

import Stripe from 'stripe';
import { logger } from '@/lib/logger';
// ... rest of imports

Field Definitions

Same as Python, with TypeScript-specific considerations:

  • Depends: Include version constraints for external packages (e.g., ^14.0)
  • Exports: Include both types and runtime values
  • Related: Use same import path style as your codebase (@/ or relative)

Formatting Rules

  • Use JSDoc block comments (/** ... */)
  • Each field on its own line
  • Use asterisk (*) at line start with single space
  • Blank line before Overview, Usage, and Notes sections
  • Keep consistent with your project's JSDoc style

Markdown Documentation Files (.md)

---
file: docs/architecture/event-driven-messaging.md
purpose: Event messaging architecture - brokers, schemas, consumer patterns
audience: Backend Engineers, DevOps, Architects
dependencies: [RabbitMQ 3.12+, Schema Registry v2.0]
related: 
  - docs/architecture/system-overview.md
  - docs/api/event-schemas.md
  - docs/runbooks/message-queue-troubleshooting.md
status: approved
version: 2.3.0
updated: 2024-10-13
---

# Event-Driven Messaging Architecture

**Overview:**  
Architecture guide for asynchronous event-driven communication across services.
Defines message broker topology, event schema design and versioning, producer/consumer
patterns, dead letter queue handling, and retry strategies. Provides best practices
for decoupling services through async messaging while maintaining reliability,
observability, and data consistency guarantees.

**When to use:** Decoupling services, high-throughput background jobs, event broadcasting  
**When NOT to use:** User-facing sync operations, simple low-latency service calls  
**Key pattern:** `Producer → Exchange → Queue → Consumer → Handler (+ DLQ)`

**Quick start:**  
```bash
docker-compose up rabbitmq schema-registry

Table of Contents

#### YAML Front Matter Fields

| Field | Required | Description |
|-------|----------|-------------|
| **file** | Yes | Full path from repository root |
| **purpose** | Yes | One-line description with key terms |
| **audience** | Yes | Target readers (roles/teams) |
| **dependencies** | Optional | Required infrastructure or tools |
| **related** | Recommended | List of related documents |
| **status** | Recommended | draft, review, approved, deprecated |
| **version** | Recommended | Semantic version (e.g., 2.3.0) |
| **updated** | Yes | Last update date (YYYY-MM-DD) |

#### Content Structure

After front matter, include:

1. **H1 Title** matching the purpose
2. **Overview paragraph** (2-4 sentences)
3. **When to use / When NOT to use** (for decision guides)
4. **Quick start** (for getting-started docs)
5. **Table of contents** (for long documents)
6. **Main content sections**

---

## Complete Examples

### Example 1: Python Service Module

```python
"""
File: src/services/notification/email_sender.py
Purpose: Email delivery service with template rendering and retry logic
Exports: EmailSender, EmailTemplate, EmailDeliveryException
Depends: sendgrid, jinja2, redis.Redis, internal.templates.loader
Implements: NotificationSender
Related: services/notification/notification_service.py, services/notification/sms_sender.py

Overview:
    Email delivery service that handles template rendering, personalization, and
    sending via SendGrid API. Implements retry logic with exponential backoff for
    transient failures and uses Redis for deduplication to prevent duplicate sends.
    Supports batch sending, attachment handling, and delivery tracking. All templates
    are cached in Redis for performance with 1-hour TTL.

Usage:
    sender = EmailSender(config)
    result = await sender.send_email(
        to="user@example.com",
        template_id="welcome_email",
        context={"name": "Alice", "verify_url": url}
    )

Notes: Max 3 retry attempts with exponential backoff | Redis required for dedup | ~100-300ms latency
"""

from typing import Dict, Optional, List
from sendgrid import SendGridAPIClient
from sendgrid.helpers.mail import Mail
import jinja2
from redis import Redis

from internal.templates.loader import TemplateLoader
from .notification_service import NotificationSender


class EmailDeliveryException(Exception):
    """Raised when email delivery fails after all retry attempts."""
    pass


class EmailTemplate:
    """Email template with Jinja2 rendering capabilities."""

    def __init__(self, template_id: str, subject: str, body_html: str):
        self.template_id = template_id
        self.subject = subject
        self.body_html = body_html
        self.jinja_env = jinja2.Environment()

    def render(self, context: Dict[str, str]) -> Dict[str, str]:
        """Render template with provided context variables."""
        subject = self.jinja_env.from_string(self.subject).render(context)
        body = self.jinja_env.from_string(self.body_html).render(context)
        return {"subject": subject, "body": body}


class EmailSender(NotificationSender):
    """Email sender with retry logic and template support."""

    def __init__(self, config: Dict, redis_client: Redis):
        self.api_key = config["sendgrid_api_key"]
        self.from_email = config["from_email"]
        self.client = SendGridAPIClient(self.api_key)
        self.redis = redis_client
        self.template_loader = TemplateLoader(redis_client)
        self.max_retries = 3

    async def send_email(
        self, 
        to: str, 
        template_id: str, 
        context: Dict[str, str],
        attachments: Optional[List] = None
    ) -> Dict:
        """Send email with retry logic and deduplication."""
        # Implementation details...
        pass

Example 2: TypeScript React Component

/**
 * File: src/components/payment/PaymentForm.tsx
 * Purpose: Payment form with validation, Stripe integration, and error handling
 * Exports: PaymentForm (default), PaymentFormProps, PaymentFormState
 * Depends: react, stripe/react-stripe-js, @/hooks/usePayment, @/types/payment, yup
 * Related: components/payment/PaymentSummary.tsx, hooks/usePayment.ts, services/payment-api.ts
 * 
 * Overview:
 *   React form component for collecting payment information with real-time validation,
 *   Stripe Elements integration, and comprehensive error handling. Supports credit card,
 *   Apple Pay, and Google Pay. Implements form state management with validation using
 *   Yup schemas, automatic retry on transient failures, and displays clear error messages
 *   to users. All sensitive payment data is handled client-side by Stripe Elements to
 *   maintain PCI compliance.
 * 
 * Usage:
 *   <PaymentForm
 *     amount={1000}
 *     currency="usd"
 *     onSuccess={(result) => handlePaymentSuccess(result)}
 *     onError={(error) => handlePaymentError(error)}
 *   />
 * 
 * Notes: Requires StripeProvider wrapper | Min amount $0.50 | Max 3 retry attempts
 */

import React, { useState, useEffect } from 'react';
import { CardElement, useStripe, useElements } from '@stripe/react-stripe-js';
import * as yup from 'yup';
import { usePayment } from '@/hooks/usePayment';
import { PaymentIntent, PaymentError } from '@/types/payment';
import { PaymentSummary } from './PaymentSummary';

/**
 * Props for PaymentForm component
 */
export interface PaymentFormProps {
  amount: number;
  currency: string;
  customerId?: string;
  onSuccess: (result: PaymentIntent) => void;
  onError: (error: PaymentError) => void;
  enableApplePay?: boolean;
  enableGooglePay?: boolean;
}

/**
 * Internal state for payment form
 */
export interface PaymentFormState {
  isProcessing: boolean;
  error: PaymentError | null;
  formData: {
    name: string;
    email: string;
    zipCode: string;
  };
}

/**
 * Payment form component with Stripe integration
 */
export default function PaymentForm({
  amount,
  currency,
  customerId,
  onSuccess,
  onError,
  enableApplePay = false,
  enableGooglePay = false,
}: PaymentFormProps) {
  const stripe = useStripe();
  const elements = useElements();
  const { processPayment, isLoading } = usePayment();

  const [state, setState] = useState<PaymentFormState>({
    isProcessing: false,
    error: null,
    formData: { name: '', email: '', zipCode: '' },
  });

  // Validation schema
  const validationSchema = yup.object({
    name: yup.string().required('Name is required'),
    email: yup.string().email('Invalid email').required('Email is required'),
    zipCode: yup.string().matches(/^\d{5}$/, 'Must be 5 digits').required('ZIP code is required'),
  });

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();

    if (!stripe || !elements) {
      return;
    }

    setState(prev => ({ ...prev, isProcessing: true, error: null }));

    try {
      // Validate form data
      await validationSchema.validate(state.formData);

      const cardElement = elements.getElement(CardElement);
      if (!cardElement) {
        throw new Error('Card element not found');
      }

      // Create payment method
      const { error: stripeError, paymentMethod } = await stripe.createPaymentMethod({
        type: 'card',
        card: cardElement,
        billing_details: {
          name: state.formData.name,
          email: state.formData.email,
          address: { postal_code: state.formData.zipCode },
        },
      });

      if (stripeError) {
        throw stripeError;
      }

      // Process payment
      const result = await processPayment({
        amount,
        currency,
        paymentMethodId: paymentMethod!.id,
        customerId,
      });

      onSuccess(result);
    } catch (error) {
      const paymentError = error as PaymentError;
      setState(prev => ({ ...prev, error: paymentError }));
      onError(paymentError);
    } finally {
      setState(prev => ({ ...prev, isProcessing: false }));
    }
  };

  return (
    <form onSubmit={handleSubmit} className="payment-form">
      <PaymentSummary amount={amount} currency={currency} />

      {/* Form fields */}
      <div className="form-field">
        <label htmlFor="name">Name on Card</label>
        <input
          id="name"
          type="text"
          value={state.formData.name}
          onChange={(e) => setState(prev => ({
            ...prev,
            formData: { ...prev.formData, name: e.target.value }
          }))}
          disabled={state.isProcessing}
          required
        />
      </div>

      <div className="form-field">
        <label htmlFor="email">Email</label>
        <input
          id="email"
          type="email"
          value={state.formData.email}
          onChange={(e) => setState(prev => ({
            ...prev,
            formData: { ...prev.formData, email: e.target.value }
          }))}
          disabled={state.isProcessing}
          required
        />
      </div>

      <div className="form-field">
        <label htmlFor="zipCode">ZIP Code</label>
        <input
          id="zipCode"
          type="text"
          value={state.formData.zipCode}
          onChange={(e) => setState(prev => ({
            ...prev,
            formData: { ...prev.formData, zipCode: e.target.value }
          }))}
          disabled={state.isProcessing}
          maxLength={5}
          required
        />
      </div>

      <div className="form-field">
        <label>Card Details</label>
        <CardElement
          options={{
            style: {
              base: {
                fontSize: '16px',
                color: '#424770',
                '::placeholder': { color: '#aab7c4' },
              },
              invalid: { color: '#9e2146' },
            },
          }}
        />
      </div>

      {state.error && (
        <div className="error-message" role="alert">
          {state.error.message}
        </div>
      )}

      <button
        type="submit"
        disabled={!stripe || state.isProcessing}
        className="submit-button"
      >
        {state.isProcessing ? 'Processing...' : `Pay ${currency.toUpperCase()} ${(amount / 100).toFixed(2)}`}
      </button>
    </form>
  );
}

Example 3: Markdown Architecture Document

---
file: docs/architecture/event-driven-messaging.md
purpose: Event messaging architecture - RabbitMQ topology, schemas, consumer patterns
audience: [Backend Engineers, DevOps Engineers, System Architects]
dependencies:
  - RabbitMQ 3.12+
  - Schema Registry v2.0
  - Prometheus/Grafana for monitoring
related:
  - docs/architecture/system-overview.md
  - docs/architecture/microservices-communication.md
  - docs/api/event-schemas.md
  - docs/runbooks/message-queue-troubleshooting.md
  - docs/adr/015-message-broker-selection.md
status: approved
version: 2.3.0
updated: 2024-10-13
reviewers: [Senior Architects, Platform Team Lead]
---

# Event-Driven Messaging Architecture

**Overview:**  
Comprehensive architecture guide for asynchronous event-driven communication across microservices.
Defines message broker topology, exchange/queue configuration, event schema design and versioning,
producer/consumer implementation patterns, dead letter queue handling, retry strategies, and
monitoring approaches. Provides best practices for decoupling services through async messaging
while maintaining reliability, eventual consistency, observability, and data quality guarantees.

**When to use this pattern:**

- Decoupling services that don't need synchronous responses
- Broadcasting events to multiple consumers (pub/sub pattern)
- Handling high-throughput background jobs and async workflows
- Building audit trails and event logs
- Implementing eventual consistency patterns
- Processing that can tolerate delays (seconds to minutes)

**When NOT to use this pattern:**

- User-facing operations requiring immediate feedback (<100ms)
- Simple service-to-service calls with low latency requirements
- Transactions requiring strong consistency guarantees
- Direct database queries (use read APIs instead)
- Real-time bidirectional communication (use WebSockets)

**Key Architecture Pattern:**
Producer → Exchange → Queue → Consumer → Handler ↓ (Dead Letter Queue on failure) ↓ (Retry Queue with delay)
**Quick Start (Local Development):**

```bash
# Start infrastructure
docker-compose up -d rabbitmq schema-registry prometheus grafana

# Verify services
curl http://localhost:15672  # RabbitMQ Management UI (guest/guest)
curl http://localhost:8081   # Schema Registry
curl http://localhost:3000   # Grafana

# Set environment variables
export RABBITMQ_URL=amqp://localhost:5672
export SCHEMA_REGISTRY_URL=http://localhost:8081


Table of Contents

  1. Architecture Overview
  2. Message Broker Topology
  3. Event Schema Design
  4. Producer Patterns
  5. Consumer Patterns
  6. Error Handling and Retries
  7. Monitoring and Observability
  8. Best Practices
  9. Common Pitfalls
  10. Migration Guide

Architecture Overview

Our event-driven architecture uses RabbitMQ as the central message broker with the following key components:

Components

Message Broker (RabbitMQ): - 3-node cluster for high availability - Quorum queues for data durability - Automatic failover and leader election - Persistent messages for critical events

Schema Registry: - Centralized schema management for all events - Schema versioning with backward compatibility checks - Automatic validation of producer/consumer schemas - REST API for schema retrieval

Producers: - Services that publish events (order-service, user-service, etc.) - Responsible for schema compliance and idempotency keys - Use publisher confirms for guaranteed delivery - Implement circuit breaker pattern for broker failures

Consumers: - Services that subscribe to events (notification-service, analytics-service, etc.) - Implement at-least-once delivery semantics - Handle idempotency (duplicate message detection) - Use prefetch limits for flow control

Dead Letter Queues (DLQ): - Automatic routing of failed messages - Separate DLQ per queue for isolation - Manual inspection and replay capabilities - Alerting on DLQ message accumulation

Message Flow

  1. Producer creates event with schema validation
  2. Exchange routes message to appropriate queue(s) based on routing key
  3. Queue stores message durably until consumed
  4. Consumer processes message with business logic
  5. Acknowledgment confirms successful processing
  6. Failure → message sent to DLQ or retry queue

Guarantees

  • At-least-once delivery: Messages are delivered at least once (consumers must handle duplicates)
  • Ordering: Guaranteed within a single queue, not across queues
  • Durability: Messages survive broker restarts (persistent messages + quorum queues)
  • Availability: 3-node cluster provides 99.9% uptime SLA

Message Broker Topology

Exchange Types

We use three types of exchanges:

1. Topic Exchanges (Primary Pattern):

exchange: events.topic
type: topic
routing patterns:
  - order.created
  - order.updated
  - order.cancelled
  - user.registered
  - payment.completed

Topic exchanges route messages based on routing key patterns. Consumers bind with wildcards: - order.* matches all order events - *.created matches all creation events - # matches everything

2. Fanout Exchanges (Broadcasting):

exchange: events.broadcast
type: fanout
use case: Send to all subscribers (e.g., cache invalidation)

3. Direct Exchanges (Specific Routing):

exchange: events.direct
type: direct
use case: Point-to-point messaging with exact routing key match

Queue Configuration

Standard Queue Configuration:

name: order-service.order-created
type: quorum  # For durability
durable: true
arguments:
  x-dead-letter-exchange: dlx.order-service
  x-dead-letter-routing-key: order-created.dlq
  x-message-ttl: 86400000  # 24 hours
  x-max-length: 100000     # Max queue size

Dead Letter Queue Configuration:

name: dlq.order-service.order-created
type: quorum
durable: true
arguments:
  x-message-ttl: 604800000  # 7 days retention

Retry Queue Configuration (with delay):

name: retry.order-service.order-created
type: quorum
durable: true
arguments:
  x-dead-letter-exchange: events.topic
  x-dead-letter-routing-key: order.created
  x-message-ttl: 30000  # 30 second delay before retry

Event Schema Design

Schema Structure

All events follow this structure:

{
  "eventId": "uuid-v4",
  "eventType": "order.created",
  "eventVersion": "1.0.0",
  "timestamp": "2024-10-13T10:30:00Z",
  "source": "order-service",
  "idempotencyKey": "unique-operation-id",
  "payload": {
    // Event-specific data
  },
  "metadata": {
    "userId": "user-123",
    "traceId": "trace-xyz",
    "correlationId": "correlation-abc"
  }
}

Schema Versioning

We use semantic versioning for event schemas:

  • Major version (1.x.x): Breaking changes (remove fields, change types)
  • Minor version (x.1.x): Backward-compatible additions (new optional fields)
  • Patch version (x.x.1): Documentation or metadata changes

Version Compatibility Rules:

  • Consumers MUST support current major version
  • Consumers SHOULD support previous major version for 90 days (migration period)
  • Producers MUST NOT send breaking changes without major version bump
  • Schema registry enforces backward compatibility for minor versions

Example Event Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "title": "OrderCreatedEvent",
  "version": "1.2.0",
  "required": ["eventId", "eventType", "eventVersion", "timestamp", "source", "payload"],
  "properties": {
    "eventId": {
      "type": "string",
      "format": "uuid"
    },
    "eventType": {
      "type": "string",
      "const": "order.created"
    },
    "eventVersion": {
      "type": "string",
      "pattern": "^\\d+\\.\\d+\\.\\d+$"
    },
    "timestamp": {
      "type": "string",
      "format": "date-time"
    },
    "source": {
      "type": "string"
    },
    "idempotencyKey": {
      "type": "string"
    },
    "payload": {
      "type": "object",
      "required": ["orderId", "customerId", "items", "totalAmount"],
      "properties": {
        "orderId": { "type": "string" },
        "customerId": { "type": "string" },
        "items": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "productId": { "type": "string" },
              "quantity": { "type": "integer", "minimum": 1 },
              "price": { "type": "number", "minimum": 0 }
            }
          }
        },
        "totalAmount": { "type": "number", "minimum": 0 },
        "shippingAddress": {
          "type": "object",
          "properties": {
            "street": { "type": "string" },
            "city": { "type": "string" },
            "state": { "type": "string" },
            "zipCode": { "type": "string" },
            "country": { "type": "string" }
          }
        }
      }
    },
    "metadata": {
      "type": "object",
      "properties": {
        "userId": { "type": "string" },
        "traceId": { "type": "string" },
        "correlationId": { "type": "string" }
      }
    }
  }
}

Producer Patterns

Basic Producer Implementation (Node.js)

import { EventBus } from '@company/event-bus';
import { v4 as uuidv4 } from 'uuid';

class OrderEventProducer {
  private eventBus: EventBus;

  constructor(eventBus: EventBus) {
    this.eventBus = eventBus;
  }

  async publishOrderCreated(order: Order): Promise<void> {
    const event = {
      eventId: uuidv4(),
      eventType: 'order.created',
      eventVersion: '1.2.0',
      timestamp: new Date().toISOString(),
      source: 'order-service',
      idempotencyKey: `order-created-${order.id}`,
      payload: {
        orderId: order.id,
        customerId: order.customerId,
        items: order.items,
        totalAmount: order.totalAmount,
        shippingAddress: order.shippingAddress,
      },
      metadata: {
        userId: order.userId,
        traceId: getCurrentTraceId(),
        correlationId: getCurrentCorrelationId(),
      },
    };

    await this.eventBus.publish('events.topic', 'order.created', event, {
      persistent: true,
      mandatory: true,
      contentType: 'application/json',
    });
  }
}

Producer Best Practices

  1. Always include idempotency keys to enable duplicate detection
  2. Use publisher confirms for guaranteed delivery
  3. Implement circuit breaker pattern for broker failures
  4. Include tracing metadata for observability
  5. Validate against schema before publishing
  6. Handle publish failures with retry or DLQ
  7. Keep payloads small (<1MB, ideally <10KB)
  8. Use connection pooling for performance

Consumer Patterns

Basic Consumer Implementation (Node.js)

import { EventBus } from '@company/event-bus';

class NotificationConsumer {
  private eventBus: EventBus;
  private processedEvents: Set<string>;

  constructor(eventBus: EventBus) {
    this.eventBus = eventBus;
    this.processedEvents = new Set();
  }

  async start(): Promise<void> {
    await this.eventBus.subscribe(
      'notification-service.order-created',
      'order.created',
      async (event) => {
        try {
          // Idempotency check
          if (this.processedEvents.has(event.idempotencyKey)) {
            console.log(`Duplicate event ${event.eventId}, skipping`);
            return; // Acknowledge without processing
          }

          // Business logic
          await this.sendOrderConfirmationEmail(event.payload);

          // Mark as processed
          this.processedEvents.add(event.idempotencyKey);

          // Acknowledge
          // (automatic if no error thrown)
        } catch (error) {
          console.error(`Failed to process event ${event.eventId}:`, error);
          throw error; // Will be retried or sent to DLQ
        }
      },
      {
        prefetch: 10,
        noAck: false,
        exclusive: false,
      }
    );
  }

  private async sendOrderConfirmationEmail(orderData: any): Promise<void> {
    // Email sending logic
  }
}

Consumer Best Practices

  1. Implement idempotency - detect and skip duplicate messages
  2. Use manual acknowledgments - only ack after successful processing
  3. Set appropriate prefetch - balance throughput and reliability (10-50)
  4. Handle poison messages - messages that always fail (send to DLQ after N attempts)
  5. Use structured logging - include eventId, eventType, correlationId
  6. Implement graceful shutdown - finish processing before exit
  7. Monitor consumer lag - alert if falling behind
  8. Keep processing logic fast - offload heavy work to background jobs

Error Handling and Retries

Retry Strategy

We implement exponential backoff with jitter:

const retryDelays = [
  5000,   // 5 seconds
  30000,  // 30 seconds
  300000, // 5 minutes
];

async function processWithRetry(event: Event, attempt: number = 0): Promise<void> {
  try {
    await processEvent(event);
  } catch (error) {
    if (attempt < retryDelays.length) {
      // Requeue with delay
      await requeueWithDelay(event, retryDelays[attempt], attempt + 1);
    } else {
      // Max retries exceeded, send to DLQ
      await sendToDLQ(event, error);
    }
  }
}

Dead Letter Queue Handling

Manual Inspection:

# View messages in DLQ
rabbitmqadmin get queue=dlq.order-service.order-created count=10

# Replay message from DLQ
rabbitmqadmin publish exchange=events.topic routing_key=order.created payload='...'

Automated Replay (with caution):

  • Review DLQ messages daily
  • Fix underlying issues first
  • Replay in small batches
  • Monitor for repeated failures

Monitoring and Observability

Key Metrics

Monitor these metrics in Prometheus/Grafana:

Broker Metrics: - Queue depth (messages waiting) - Message rate (messages/second) - Consumer lag (time behind) - Connection count - Memory/disk usage

Application Metrics: - Publish success rate - Publish latency (p50, p95, p99) - Consume success rate - Processing latency - Retry count - DLQ message count

Alerts: - Queue depth > 10,000 - Consumer lag > 5 minutes - DLQ messages > 100 - Publish error rate > 1% - Processing error rate > 5%

Tracing

Use distributed tracing to follow events through the system:

const span = tracer.startSpan('process-order-created', {
  childOf: extractSpanContext(event.metadata.traceId),
  tags: {
    'event.id': event.eventId,
    'event.type': event.eventType,
    'service.name': 'notification-service',
  },
});

try {
  await processEvent(event);
  span.setTag('status', 'success');
} catch (error) {
  span.setTag('status', 'error');
  span.setTag('error', true);
  span.log({ 'error.message': error.message });
  throw error;
} finally {
  span.finish();
}

Best Practices

Do's

✅ Use idempotency keys for all events
✅ Implement schema versioning from day one
✅ Monitor queue depth and consumer lag
✅ Use quorum queues for durability
✅ Keep event payloads small and focused
✅ Include tracing metadata in all events
✅ Implement graceful degradation for broker failures
✅ Document event schemas in schema registry
✅ Test failure scenarios (broker down, consumer crash)
✅ Use publisher confirms for critical events

Don'ts

❌ Don't publish massive payloads (>1MB)
❌ Don't rely on message ordering across queues
❌ Don't skip schema validation
❌ Don't forget idempotency in consumers
❌ Don't use auto-ack (always manual ack)
❌ Don't ignore DLQ messages
❌ Don't make breaking schema changes without version bump
❌ Don't block consumer threads with slow operations
❌ Don't retry indefinitely without circuit breaker
❌ Don't mix sync and async patterns for same operation


Common Pitfalls

1. Not Handling Duplicates

Problem: Messages can be delivered multiple times (at-least-once semantics)

Solution: Always implement idempotency using idempotencyKey

2. Blocking Consumer Threads

Problem: Long-running operations block message processing

Solution: Offload heavy work to background jobs, keep handlers fast

3. Ignoring Schema Compatibility

Problem: Breaking schema changes break consumers

Solution: Use schema registry, version schemas, maintain backward compatibility

4. No Monitoring

Problem: Can't detect issues until users complain

Solution: Set up comprehensive metrics and alerts

5. Overfetching Messages

Problem: High prefetch causes memory issues and uneven load distribution

Solution: Set prefetch to 10-50, tune based on message size and processing time


Migration Guide

From Synchronous HTTP to Events

Step 1: Identify operations that can be async

Step 2: Define event schemas

Step 3: Implement dual-write pattern (write to DB + publish event)

Step 4: Build consumers

Step 5: Monitor and validate

Step 6: Remove synchronous code path


References


Document Changelog

Version Date Author Changes
2.3.0 2024-10-13 Platform Team Added section on circuit breaker patterns
2.2.0 2024-09-01 Platform Team Updated schema versioning strategy
2.1.0 2024-07-15 Platform Team Added monitoring and tracing sections
2.0.0 2024-06-01 Platform Team Initial approved version

Next Review: 2025-01-15
Feedback: Submit issues to Architecture Guild

---

## Migration Guide

### For Existing Codebases

**Phase 1: Audit (Week 1)**

1. Inventory all files in codebase
2. Identify files lacking documentation
3. Prioritize by:
   - Complexity (complex files first)
   - Usage (frequently modified files)
   - Criticality (core business logic)

**Phase 2: Create Template (Week 1)**

1. Create `.file-header-template.txt` in repository root
2. Add language-specific templates for Python, TypeScript, Markdown
3. Document examples in team wiki

**Phase 3: Gradual Adoption (Weeks 2-8)**

1. **New files:** Use new standard from day one
2. **Modified files:** Add headers when making significant changes
3. **Critical files:** Dedicated documentation sprint (allocate 20% of sprint)
4. **Remaining files:** Background task, no rush

**Phase 4: Automation (Week 8+)**

1. Create linter rule to check for headers
2. Add pre-commit hook (warning only, not blocking)
3. Generate weekly report of files missing headers
4. Gamify: Track team progress, celebrate milestones

### Migration Script Example

```python
"""
File: scripts/add_file_headers.py
Purpose: Automated script to add structured headers to existing Python files
Exports: add_header_to_file, analyze_file_exports, analyze_file_dependencies
Depends: ast, pathlib, typing
"""

import ast
from pathlib import Path
from typing import List, Set

def analyze_file_exports(file_path: Path) -> List[str]:
    """Extract exported classes and functions from Python file."""
    with open(file_path) as f:
        tree = ast.parse(f.read())

    exports = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            exports.append(node.name)
        elif isinstance(node, ast.FunctionDef) and not node.name.startswith('_'):
            exports.append(node.name)

    return exports

def analyze_file_dependencies(file_path: Path) -> Set[str]:
    """Extract top-level imports from Python file."""
    with open(file_path) as f:
        tree = ast.parse(f.read())

    dependencies = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                dependencies.add(alias.name)
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                dependencies.add(node.module)

    return dependencies

def add_header_to_file(file_path: Path, dry_run: bool = True):
    """Add structured header to Python file if not present."""
    with open(file_path) as f:
        content = f.read()

    # Check if header already exists
    if content.startswith('"""') and 'File:' in content[:500]:
        print(f"✓ {file_path} already has header")
        return

    # Analyze file
    exports = analyze_file_exports(file_path)
    dependencies = analyze_file_dependencies(file_path)

    # Generate header
    relative_path = file_path.relative_to(Path.cwd())
    header = f'''"""
File: {relative_path}
Purpose: [TODO: Add one-line description]
Exports: {', '.join(exports) if exports else '[None]'}
Depends: {', '.join(sorted(list(dependencies)[:5]))}
Related: [TODO: Add related files]

Overview:
    [TODO: Add 2-4 sentence overview explaining what this module does,
    why it exists, and how it fits into the larger system.]

Usage:
    [TODO: Add typical usage example]
"""

'''

    # Prepend header
    new_content = header + content

    if dry_run:
        print(f"Would add header to {file_path}")
        print(header)
    else:
        with open(file_path, 'w') as f:
            f.write(new_content)
        print(f"✓ Added header to {file_path}")

# Usage
if __name__ == '__main__':
    import sys

    # Find all Python files
    python_files = Path('src').rglob('*.py')

    dry_run = '--apply' not in sys.argv

    for file_path in python_files:
        if file_path.name != '__init__.py':
            add_header_to_file(file_path, dry_run=dry_run)

    if dry_run:
        print("\nDry run complete. Run with --apply to make changes.")


FAQs

Q: Is this overkill for small projects?

A: Yes. For projects with <50 files or <3 developers, traditional Google-style documentation is sufficient. This standard is designed for:

  • Medium to large codebases (100+ files)
  • Teams heavily using AI coding assistants
  • Microservices architectures with complex dependencies
  • Codebases with frequent developer rotation

Q: How do I convince my team to adopt this?

A: Focus on measurable benefits:

  1. Run a pilot: Have 2-3 developers use structured headers for 2 weeks
  2. Measure productivity: Track time to understand unfamiliar code
  3. Demonstrate AI improvements: Show side-by-side comparisons of AI suggestions
  4. Start small: New files only, don't migrate entire codebase immediately
  5. Share results: Document time savings and improved AI assistance

Q: What about legacy code with no documentation?

A: Prioritize:

  1. Add headers to files you're actively modifying
  2. Focus on complex business logic files first
  3. Use automated script to generate skeleton headers (see Migration Guide)
  4. Don't block work waiting for perfect documentation
  5. Improve incrementally over months, not weeks

Q: Should I use this for open-source projects?

A: Generally no, unless:

  • Your project explicitly targets AI-assisted development
  • Contributors are already using AI tools extensively
  • You're willing to document the non-standard format

For most open-source projects, stick with community standards (Google, JSDoc) to reduce contributor friction.

Q: How often should I update headers?

A: Update headers when:

  • Exports change (adding/removing public API)
  • Dependencies change significantly
  • File purpose changes
  • Major refactoring occurs

Don't update for every minor code change. Headers document the file's role, not implementation details.

Q: What if my IDE complains about non-standard format?

A: Most IDEs are flexible:

  • VSCode: Recognizes any docstring format
  • PyCharm: Can be configured to accept custom docstring styles
  • IntelliJ: Supports JSDoc extensions

If your IDE has issues, create a custom docstring template or file a feature request.

Q: How do I measure if this is working?

A: Track these metrics:

Before adoption: - Time to understand unfamiliar code (developer survey) - Number of "Where is X?" Slack questions per week - AI suggestion acceptance rate (Copilot metrics) - Time to onboard new developers

After adoption (3 months later): - Same metrics, compare improvement - Developer satisfaction survey (1-5 scale) - Number of files with complete documentation - AI assistant usage frequency

Expect 20-40% improvement in developer productivity metrics.

Q: Can I mix this with Google style?

A: Yes! Hybrid approach:

"""Brief Google-style summary line.

Longer Google-style description paragraph following their conventions.

---
File: path/to/file.py
Exports: MainClass, helper_func
Depends: requests, internal.module
Related: other/file.py
"""

This satisfies both human readers (Google style) and AI tools (structured metadata). The --- separator clearly delineates the two sections.

Q: What about generated code or third-party files?

A: Don't document:

  • Auto-generated code (protobuf, GraphQL schemas, etc.)
  • Third-party vendored code
  • Migration scripts (temporary files)
  • Test fixtures (data files)

Focus documentation on hand-written source code that developers need to understand.


References

Official Documentation (Verified)

  1. Cursor Codebase Indexing Documentation. https://cursor.com/docs/context/codebase-indexing
  2. Explains embedding-based indexing, chunking with tree-sitter, and RAG retrieval

  3. Microsoft Visual Studio Blog - GitHub Copilot Comments. https://devblogs.microsoft.com/visualstudio/how-to-use-comments-to-prompt-github-copilot-visual-studio/

  4. Documents how Copilot uses comments and file structure for context

  5. Anthropic Long Context Tips. https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips

  6. Recommends putting data at top of prompts, structured formats, up to 30% quality improvement

  7. Anthropic Claude Code Best Practices. https://www.anthropic.com/engineering/claude-code-best-practices

  8. How Anthropic teams use Claude for codebase understanding

  9. IBM Research - Context Windows. https://research.ibm.com/blog/larger-context-window

  10. Research on context window reliability and information positioning

  11. GitHub Copilot Documentation. https://docs.github.com/en/copilot

  12. Google Python Style Guide. https://google.github.io/styleguide/pyguide.html

  13. TypeScript JSDoc Reference. https://www.typescriptlang.org/docs/handbook/jsdoc-supported-types.html

Research Papers (Verified - Available on arXiv)

  1. Chen, M., et al. (2021). "Evaluating Large Language Models Trained on Code." arXiv:2107.03374. https://arxiv.org/abs/2107.03374
  2. Codex paper, foundation for GitHub Copilot

  3. Feng, Z., et al. (2020). "CodeBERT: A Pre-Trained Model for Programming and Natural Languages." EMNLP 2020. https://arxiv.org/abs/2002.08155

  4. Guo, D., et al. (2021). "GraphCodeBERT: Pre-training Code Representations with Data Flow." ICLR 2021. https://arxiv.org/abs/2009.08366

  5. OpenAI GPT-4 Technical Report. (2023). https://arxiv.org/abs/2303.08774

Blog Posts and Articles

  1. LLMs.txt Standard - Gary Svenson. https://garysvenson09.medium.com/2025-dev-trend-ai-friendly-docs-via-llms-txt-8d25a9ae1bb6
  2. Emerging standard for AI-friendly project documentation

  3. How Cursor Indexes Codebases Fast - Engineer's Codex. https://read.engineerscodex.com/p/how-cursor-indexes-codebases-fast

  4. Building RAG on Codebases - LanceDB. https://blog.lancedb.com/rag-codebase-1/

  5. How I Code With LLMs These Days - Honeycomb. https://www.honeycomb.io/blog/how-i-code-with-llms-these-days


Appendix A: Template Files

Python Template (.py)

"""
File: [path/to/file.py]
Purpose: [One-line description with key terms]
Exports: [Class1, Class2, function1, CONSTANT]
Depends: [external_lib, internal.module]
Implements: [InterfaceName] | Inherits: [ParentClass]
Related: [path/to/related1.py, path/to/related2.py]

Overview:
    [2-4 sentence narrative explaining what this module does, why it exists,
    and how it fits into the larger system. Focus on business logic and key
    patterns.]

Usage:
    [typical usage example - 1-3 lines of code]

Notes: [Optional: performance notes, limitations, warnings]
"""

TypeScript Template (.ts)

/**
 * File: [src/path/to/file.ts]
 * Purpose: [One-line description with key terms]
 * Exports: [Class1, Class2, function1, Type1]
 * Depends: [external-lib (^1.0), @/internal/module]
 * Implements: [InterfaceName]
 * Related: [path/to/related1.ts, path/to/related2.ts]
 * 
 * Overview:
 *   [2-4 sentence narrative explaining what this module does, why it exists,
 *   and how it fits into the larger system.]
 * 
 * Usage:
 *   [typical usage example - 1-3 lines of code]
 * 
 * Notes: [Optional: performance notes, limitations, warnings]
 */

Markdown Template (.md)

---
file: [docs/path/to/file.md]
purpose: [One-line description with key terms]
audience: [Role1, Role2, Role3]
dependencies: [Tool1, Tool2]
related:
  - [path/to/related1.md]
  - [path/to/related2.md]
status: [draft|review|approved|deprecated]
version: [1.0.0]
updated: [YYYY-MM-DD]
---

# [Document Title]

**Overview:**  
[2-4 sentence description of what this document covers and why it's important]

**When to use:** [Scenarios where this is applicable]  
**When NOT to use:** [Scenarios where this is not applicable]  
**Quick start:** [If applicable, brief getting started instructions]

---

## Table of Contents
[If long document]

## [Main Content Sections]

Appendix B: Linter Integration

ESLint Plugin (TypeScript/JavaScript)

// .eslintrc.js
module.exports = {
  rules: {
    'require-file-header': ['warn', {
      pattern: /\/\*\*[\s\S]*File:[\s\S]*Purpose:[\s\S]*Exports:[\s\S]*\*\//,
      message: 'File must have structured header with File, Purpose, and Exports fields'
    }]
  }
};

Pylint Plugin (Python)

# pylintrc
[MASTER]
load-plugins=pylint_file_header

[FILE-HEADER]
file-header-pattern="""File:.*Purpose:.*Exports:"""

Pre-commit Hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: check-file-headers
        name: Check file headers
        entry: python scripts/check_headers.py
        language: system
        types: [python, javascript, typescript]
        pass_filenames: true

Version History

Version Date Changes
1.0 2024-10-13 Initial release

Document maintained by: Engineering Standards Committee
Questions or feedback: Create issue at https://github.com/company/engineering-standards/issues
Next review: 2025-04-13