Investigations API

Overview

The Investigations module provides sophisticated anomaly detection capabilities for government data analysis. It leverages a multi-agent system to identify irregularities, suspicious patterns, and potential compliance violations in public sector data.

Technical Architecture

Endpoints

Start Investigation

Initiates a new investigation with configurable parameters for anomaly detection.

POST /api/v1/investigations/start
Content-Type: application/json
Authorization: Bearer {token}

Request Schema

interface InvestigationRequest {
  // Core parameters
  query: string;                    // Natural language query
  data_source: DataSource;          // Target data source
  
  // Filtering options
  filters?: {
    date_range?: {
      start: string;              // ISO 8601 date
      end: string;                // ISO 8601 date
    };
    organizations?: string[];     // Organization IDs
    contract_types?: string[];    // Contract type filters
    min_value?: number;          // Minimum transaction value
    max_value?: number;          // Maximum transaction value
    regions?: string[];          // Geographic regions
  };
  
  // Detection configuration
  anomaly_types?: AnomalyType[];  // Types to detect
  sensitivity?: number;           // 0.0-1.0 (default: 0.7)
  confidence_threshold?: number;  // 0.0-1.0 (default: 0.8)
  
  // Processing options
  include_explanations?: boolean; // XAI explanations
  stream_results?: boolean;       // Real-time streaming
  max_results?: number;          // Result limit
  
  // Advanced options
  agent_configuration?: {
    max_iterations?: number;      // Agent iterations
    reflection_enabled?: boolean; // Self-reflection
    parallel_agents?: number;     // Concurrent agents
  };
}

enum DataSource {
  PORTAL_TRANSPARENCIA = "portal_transparencia",
  TCU = "tcu",
  CGU = "cgu",
  COMPRAS_GOV = "compras_gov"
}

enum AnomalyType {
  PRICE_ANOMALY = "price_anomaly",
  PATTERN_ANOMALY = "pattern_anomaly",
  TEMPORAL_ANOMALY = "temporal_anomaly",
  RELATIONSHIP_ANOMALY = "relationship_anomaly",
  COMPLIANCE_VIOLATION = "compliance_violation"
}

Response

{
  "investigation_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "estimated_completion": "2025-01-30T10:20:30Z",
  "stream_url": "/api/v1/investigations/stream/550e8400-e29b-41d4-a716-446655440000"
}

Example: Complex Investigation

curl -X POST https://api.cidadao.ai/api/v1/investigations/start \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Identify suspicious patterns in IT contracts above 1M in 2024",
    "data_source": "portal_transparencia",
    "filters": {
      "date_range": {
        "start": "2024-01-01",
        "end": "2024-12-31"
      },
      "contract_types": ["IT_SERVICES", "SOFTWARE_LICENSE"],
      "min_value": 1000000
    },
    "anomaly_types": ["price_anomaly", "pattern_anomaly"],
    "sensitivity": 0.8,
    "include_explanations": true,
    "agent_configuration": {
      "reflection_enabled": true,
      "parallel_agents": 4
    }
  }'

Stream Results

Real-time streaming of investigation results via Server-Sent Events (SSE).

GET /api/v1/investigations/stream/{investigation_id}
Authorization: Bearer {token}
Accept: text/event-stream

Event Types

// Progress Update
event: progress
data: {
  "percentage": 45,
  "current_phase": "analyzing_patterns",
  "agents_active": 3
}

// Anomaly Detected
event: anomaly
data: {
  "id": "anomaly_123",
  "type": "price_anomaly",
  "severity": "high",
  "confidence": 0.92,
  "description": "Contract value 340% above market average",
  "affected_record": {
    "contract_id": "CT-2024-0001",
    "supplier": "ACME Corp",
    "value": 4500000
  }
}

// Agent Insight
event: insight
data: {
  "agent": "CorruptionDetector",
  "message": "Identified unusual payment patterns",
  "confidence": 0.85
}

// Completion
event: complete
data: {
  "total_anomalies": 12,
  "processing_time": 4523,
  "agents_used": 5
}

Client Example

const eventSource = new EventSource(
  `${API_URL}/investigations/stream/${investigationId}`,
  { headers: { 'Authorization': `Bearer ${token}` } }
);

eventSource.addEventListener('anomaly', (event) => {
  const anomaly = JSON.parse(event.data);
  console.log(`Found ${anomaly.severity} anomaly: ${anomaly.description}`);
});

eventSource.addEventListener('complete', (event) => {
  const summary = JSON.parse(event.data);
  console.log(`Investigation complete: ${summary.total_anomalies} anomalies found`);
  eventSource.close();
});

Get Investigation Status

Retrieve current status and metadata of an investigation.

GET /api/v1/investigations/{investigation_id}/status
Authorization: Bearer {token}

Response Schema

interface InvestigationStatus {
  investigation_id: string;
  status: InvestigationState;
  progress: {
    percentage: number;
    current_phase: ProcessingPhase;
    phases_completed: ProcessingPhase[];
  };
  statistics: {
    records_processed: number;
    anomalies_found: number;
    agents_invoked: number;
    processing_time_ms: number;
  };
  created_at: string;
  started_at?: string;
  completed_at?: string;
  error?: {
    code: string;
    message: string;
    recoverable: boolean;
  };
}

enum InvestigationState {
  PENDING = "pending",
  PROCESSING = "processing",
  COMPLETED = "completed",
  FAILED = "failed",
  CANCELLED = "cancelled"
}

enum ProcessingPhase {
  INITIALIZATION = "initialization",
  DATA_COLLECTION = "data_collection",
  PREPROCESSING = "preprocessing",
  AGENT_ANALYSIS = "agent_analysis",
  PATTERN_DETECTION = "pattern_detection",
  RESULT_AGGREGATION = "result_aggregation",
  REPORT_GENERATION = "report_generation"
}

Get Investigation Results

Retrieve complete results of a finished investigation.

GET /api/v1/investigations/{investigation_id}/results
Authorization: Bearer {token}

Response Schema

interface InvestigationResponse {
  investigation_id: string;
  metadata: {
    query: string;
    data_source: string;
    execution_time_ms: number;
    total_records_analyzed: number;
  };
  
  anomalies: AnomalyResult[];
  
  summary: {
    total_anomalies: number;
    by_type: Record<AnomalyType, number>;
    by_severity: {
      critical: number;
      high: number;
      medium: number;
      low: number;
    };
    risk_score: number;  // 0-100
  };
  
  insights: AgentInsight[];
  
  recommendations: string[];
  
  explanations?: {
    methodology: string;
    confidence_intervals: Record<string, [number, number]>;
    feature_importance: Record<string, number>;
  };
}

interface AnomalyResult {
  id: string;
  type: AnomalyType;
  severity: Severity;
  confidence: number;
  
  description: string;
  technical_details: Record<string, any>;
  
  affected_entities: {
    primary: EntityReference;
    related: EntityReference[];
  };
  
  evidence: {
    data_points: DataPoint[];
    patterns: Pattern[];
    statistical_analysis?: StatisticalEvidence;
  };
  
  context: {
    historical_comparison?: HistoricalContext;
    peer_comparison?: PeerContext;
    regulatory_context?: RegulatoryContext;
  };
  
  explanation?: {
    human_readable: string;
    technical: string;
    contributing_factors: Factor[];
  };
}

Example Response

{
  "investigation_id": "550e8400-e29b-41d4-a716-446655440000",
  "metadata": {
    "query": "Suspicious patterns in IT contracts",
    "data_source": "portal_transparencia",
    "execution_time_ms": 4523,
    "total_records_analyzed": 15420
  },
  "anomalies": [
    {
      "id": "anom_001",
      "type": "price_anomaly",
      "severity": "high",
      "confidence": 0.92,
      "description": "Contract CT-2024-0001 shows pricing 340% above market average",
      "technical_details": {
        "contract_value": 4500000,
        "market_average": 1323529,
        "standard_deviation": 425000,
        "z_score": 7.47
      },
      "affected_entities": {
        "primary": {
          "type": "contract",
          "id": "CT-2024-0001",
          "name": "Software Development Services"
        },
        "related": [
          {
            "type": "supplier",
            "id": "SUP-123",
            "name": "ACME Corp"
          }
        ]
      },
      "evidence": {
        "data_points": [
          {
            "metric": "contract_value",
            "value": 4500000,
            "timestamp": "2024-03-15T10:00:00Z"
          }
        ],
        "statistical_analysis": {
          "method": "isolation_forest",
          "anomaly_score": 0.92,
          "contamination": 0.1
        }
      },
      "explanation": {
        "human_readable": "This contract's value is exceptionally high compared to similar IT service contracts in 2024",
        "contributing_factors": [
          {
            "factor": "price_inflation",
            "impact": 0.75,
            "description": "Unusual price increase pattern"
          }
        ]
      }
    }
  ],
  "summary": {
    "total_anomalies": 12,
    "by_type": {
      "price_anomaly": 8,
      "pattern_anomaly": 4
    },
    "by_severity": {
      "critical": 2,
      "high": 5,
      "medium": 4,
      "low": 1
    },
    "risk_score": 78
  },
  "insights": [
    {
      "agent": "CorruptionDetector",
      "insight": "Detected systematic overpricing in contracts from 3 specific suppliers",
      "confidence": 0.85,
      "supporting_evidence": ["anom_001", "anom_002", "anom_003"]
    }
  ],
  "recommendations": [
    "Initiate detailed audit of contracts CT-2024-0001, CT-2024-0045, CT-2024-0089",
    "Review procurement process for IT services category",
    "Implement automated price benchmarking for contracts above 1M"
  ]
}

List Investigations

Retrieve paginated list of investigations for the current user.

GET /api/v1/investigations
Authorization: Bearer {token}

Query Parameters

Parameter	Type	Description	Example
`status`	string	Filter by status	`completed`
`data_source`	string	Filter by data source	`portal_transparencia`
`date_from`	string	Start date (ISO 8601)	`2024-01-01`
`date_to`	string	End date (ISO 8601)	`2024-12-31`
`min_anomalies`	integer	Minimum anomalies found	`5`
`sort`	string	Sort field and order	`-created_at`
`page`	integer	Page number	`1`
`limit`	integer	Results per page (max 100)	`20`

Response

{
  "data": [
    {
      "investigation_id": "550e8400-e29b-41d4-a716-446655440000",
      "query": "Suspicious IT contracts",
      "status": "completed",
      "anomalies_found": 12,
      "risk_score": 78,
      "data_source": "portal_transparencia",
      "created_at": "2025-01-30T09:00:00Z",
      "completed_at": "2025-01-30T09:04:32Z"
    }
  ],
  "meta": {
    "total": 156,
    "page": 1,
    "limit": 20,
    "pages": 8
  }
}

Cancel Investigation

Cancel an ongoing investigation.

DELETE /api/v1/investigations/{investigation_id}
Authorization: Bearer {token}

Response

{
  "message": "Investigation cancelled successfully",
  "investigation_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "cancelled",
  "partial_results_available": true
}

Error Handling

Investigation-Specific Errors

Error Code	Description	Recovery Action
`INVESTIGATION_NOT_FOUND`	Investigation ID doesn't exist	Verify ID
`INVESTIGATION_IN_PROGRESS`	Results not yet available	Wait or stream
`INSUFFICIENT_DATA`	Not enough data for analysis	Adjust filters
`RATE_LIMIT_EXCEEDED`	Too many concurrent investigations	Retry later
`INVALID_DATA_SOURCE`	Data source not available	Check availability

Example Error Response

{
  "error": {
    "code": "INSUFFICIENT_DATA",
    "message": "Not enough data points for meaningful analysis",
    "details": {
      "records_found": 3,
      "minimum_required": 10,
      "suggestion": "Try broadening your date range or filters"
    },
    "request_id": "req_123"
  }
}

Best Practices

Optimize Queries

# Good: Specific, focused query
investigation = client.investigations.create(
    query="IT service contracts above 1M with single-bid",
    filters={
        "contract_types": ["IT_SERVICES"],
        "min_value": 1000000,
        "bid_count": 1
    }
)

# Avoid: Too broad
investigation = client.investigations.create(
    query="Find all suspicious contracts"  # Too vague
)

Handle Streaming Properly

// Implement reconnection logic
let retries = 0;
const maxRetries = 3;

function connectStream(investigationId) {
  const eventSource = new EventSource(streamUrl);
  
  eventSource.onerror = (error) => {
    if (retries < maxRetries) {
      retries++;
      setTimeout(() => connectStream(investigationId), 1000 * retries);
    }
  };
  
  eventSource.onopen = () => {
    retries = 0;  // Reset on successful connection
  };
}

Process Results Efficiently

# Process anomalies by severity
results = client.investigations.get_results(investigation_id)

critical_anomalies = [
    a for a in results.anomalies 
    if a.severity == "critical"
]

# Generate alerts for critical findings
for anomaly in critical_anomalies:
    alert_system.create_alert(
        title=f"Critical anomaly: {anomaly.type}",
        description=anomaly.description,
        evidence=anomaly.evidence
    )

Performance Considerations

Resource Usage

Investigation Type	Avg Duration	CPU Usage	Memory Usage
Simple query	5-10s	Low	<100MB
Complex filters	30-60s	Medium	<500MB
Historical analysis	2-5min	High	<2GB
Full dataset scan	10-30min	Very High	<5GB

Optimization Tips

Use specific filters to reduce dataset size
Enable streaming for long-running investigations
Set appropriate sensitivity to balance accuracy vs performance
Limit concurrent investigations per user
Cache investigation results for repeated access

Overview​

Technical Architecture​

Endpoints​

Start Investigation​

Request Schema​

Response​

Example: Complex Investigation​

Stream Results​

Event Types​

Client Example​

Get Investigation Status​

Response Schema​

Get Investigation Results​

Response Schema​

Example Response​

List Investigations​

Query Parameters​

Response​

Cancel Investigation​

Response​

Error Handling​

Investigation-Specific Errors​

Example Error Response​

Best Practices​

Optimize Queries​

Handle Streaming Properly​

Process Results Efficiently​

Performance Considerations​

Resource Usage​

Optimization Tips​

Overview

Technical Architecture

Endpoints

Start Investigation

Request Schema

Response

Example: Complex Investigation

Stream Results

Event Types

Client Example

Get Investigation Status

Response Schema

Get Investigation Results

Response Schema

Example Response

List Investigations

Query Parameters

Response

Cancel Investigation

Response

Error Handling

Investigation-Specific Errors

Example Error Response

Best Practices

Optimize Queries

Handle Streaming Properly

Process Results Efficiently

Performance Considerations

Resource Usage

Optimization Tips