AI-Driven PR Review

Structured, multi-perspective, actionable code review. Powered by your AI IDE. No server. No API keys.

$ npx prr-kit install

Works with

Claude Code · Cursor · Windsurf · GitHub Copilot · Antigravity · +13 more
Why PRR Kit

Built different.

No server. No subscription. No black box.

Privacy

Your code never leaves your machine

No server, no third-party API calls, no data collection. Safe for codebases under NDA or corporate security policy.

Cost

Free forever (MIT)

No subscription, no credit card, no usage limits. Install into as many repos as you want.

IDE Agnostic

Uses the AI you already have

Claude Code, Cursor, Windsurf, Copilot, and 14+ more. No extra API keys.

Open Source

Fully customizable

Every workflow is a Markdown or YAML file. Fork it, extend it, make it yours.

Business Review

Findings in business language

Translates code risks into user impact, GDPR exposure, and deployment recommendations.

Platforms

GitHub, GitLab, Azure, Bitbucket

Auto-detects your platform and posts reviews to the right place.

Live Example

See the difference.

Same PR. Same AI. Completely different output.

Both reviews generated by GitHub Copilot with claude-sonnet-4-5
Project: twentyhq/twenty | Branch: ai-models-and-bedrock | PR #18155
Without PRR Kit
Standard review

Code Review: Branch ai-models-and-bedrock

Review Date: February 22, 2026
Branch: ai-models-and-bedrock
Base branch: main
Total files changed: 27 files (+1272, -341 lines)


📋 Executive Summary

This branch implements a major refactor of Twenty's AI models system with the following key changes:

  1. Expanded AI provider support: Added AWS Bedrock, Google (Gemini), and Mistral
  2. Restructured pricing: Migrated from per-1k-tokens to per-million-tokens with cached input pricing
  3. Improved type safety: Renamed ModelProvider to InferenceProvider and separated ModelFamily
  4. Updated billing system: Adjusted cost calculation logic to align with new structure
  5. Deprecation marking: Marked legacy models for backward compatibility

🎯 Major Changes

1. Type System Refactoring

File: ai-models-types.const.ts

Key Changes:

// OLD
export enum ModelProvider {
  OPENAI = 'openai',
  ANTHROPIC = 'anthropic',
  // ...
}

export interface AIModelConfig {
  provider: ModelProvider;
  inputCostPer1kTokensInCents: number;
  outputCostPer1kTokensInCents: number;
}

// NEW
export enum ModelFamily {
  OPENAI = 'openai',
  ANTHROPIC = 'anthropic',
  BEDROCK = 'bedrock',
  GOOGLE = 'google',
  MISTRAL = 'mistral',
  // ...
}

export enum InferenceProvider {
  OPENAI = 'openai',
  ANTHROPIC = 'anthropic',
  BEDROCK = 'bedrock',
  // ...
}

export interface AIModelConfig {
  modelFamily: ModelFamily;
  inferenceProvider: InferenceProvider;
  inputCostPerMillionTokens: number;
  outputCostPerMillionTokens: number;
  cachedInputCostPerMillionTokens?: number;
  deprecated?: boolean;
}

Significance:

  • Separates model family (model group) from inference provider (inference infrastructure)
  • Enables greater flexibility (e.g., running Anthropic models through AWS Bedrock)
  • Clearer pricing structure with per-million-tokens (industry standard)
  • Supports prompt caching via cachedInputCostPerMillionTokens

2. Added AWS Bedrock Support

New File: bedrock-models.const.ts

Added 6 models via AWS Bedrock:

  • Claude models: claude-4-1-sonnet, claude-4-1-haiku, claude-4-haiku
  • Llama models: llama-4-4b-instruct, llama-4-70b-instruct, llama-4-405b-instruct

Example Configuration:

{
  modelId: 'claude-4-1-sonnet',
  label: 'Claude 4.1 Sonnet (Bedrock)',
  description: 'Top-tier reasoning + vision via AWS Bedrock with prompt caching',
  modelFamily: ModelFamily.ANTHROPIC,
  inferenceProvider: InferenceProvider.BEDROCK,
  inputCostPerMillionTokens: 3.0,
  outputCostPerMillionTokens: 15.0,
  cachedInputCostPerMillionTokens: 0.3,
  contextWindowTokens: 200000,
  maxOutputTokens: 8192,
  doesSupportThinking: true,
}

Strengths:

  • Supports prompt caching (90% cost reduction for cached tokens)
  • Supports extended thinking mode for Claude models
  • Integrates with AWS IAM authentication
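The caching discount can be illustrated with a small cost helper. This is a hypothetical sketch (neither `inputCostUsd` nor `CachingRates` exists in the PR); the rates mirror the claude-4-1-sonnet example config above ($3/M input, $0.30/M cached input):

```typescript
// Hypothetical helper: price a prompt split into fresh vs. cached tokens.
interface CachingRates {
  inputCostPerMillionTokens: number;
  cachedInputCostPerMillionTokens?: number;
}

function inputCostUsd(
  rates: CachingRates,
  freshTokens: number,
  cachedTokens: number,
): number {
  // Models without a cached rate bill cached tokens at the full input rate.
  const cachedRate =
    rates.cachedInputCostPerMillionTokens ?? rates.inputCostPerMillionTokens;
  return (
    (freshTokens * rates.inputCostPerMillionTokens) / 1_000_000 +
    (cachedTokens * cachedRate) / 1_000_000
  );
}

const sonnet: CachingRates = {
  inputCostPerMillionTokens: 3.0,
  cachedInputCostPerMillionTokens: 0.3,
};

// 100K prompt tokens, 90K of them served from cache:
// fresh 10K × $3/M = $0.03; cached 90K × $0.30/M = $0.027
const cost = inputCostUsd(sonnet, 10_000, 90_000);
```

With 90% of the prompt cached, the input bill drops from $0.30 to about $0.057, which is where the "90% cost reduction for cached tokens" figure comes from.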

3. Added Google (Gemini) Models

New File: google-models.const.ts

Added 3 Gemini models:

  • gemini-4-flash - Fast model with budget pricing
  • gemini-4-flash-thinking - Model with reasoning capabilities
  • gemini-4-pro - Flagship model

Highlights:

  • Large context window (1M tokens for gemini-4-flash)
  • Competitive pricing ($0.20/M input for flash)
  • Supports diverse file types (image, audio, video, documents)

4. Added Mistral Models

New File: mistral-models.const.ts

Added 1 model:

  • mistral-large-latest - Mistral's most powerful model

Features:

  • 128K context window
  • $2/M input, $6/M output
  • Competitive pricing in the large model segment

5. Updated Existing Models

OpenAI Models

Key Changes:

  • Moved GPT-4.1 to top of list (featured models)
  • Marked as deprecated: o3, o4-mini, gpt-4-turbo
  • Updated descriptions to be more concise
  • Added cached input costs

Example:

// GPT-4.1 - Featured model
{
  modelId: 'gpt-4.1',
  inputCostPerMillionTokens: 2.0,  // Was: 0.2 per 1k = $2 per million
  outputCostPerMillionTokens: 8.0,
  cachedInputCostPerMillionTokens: 0.5,
  contextWindowTokens: 1047576,  // ~1M context
}

Anthropic Models

Key Changes:

  • Added Claude 4.1 models (Sonnet and Haiku)
  • Marked as deprecated: claude-3-7-sonnet, claude-3-6-sonnet, claude-3-5-haiku
  • Added support for extended thinking
  • Updated pricing structure

XAI Models

Key Changes:

  • Reordered: grok-4 before grok-4-1-fast-reasoning
  • Marked as deprecated: grok-3, grok-3-mini
  • Updated costs and descriptions

6. AI Billing Service Updates

File: ai-billing.service.ts

Key Changes:

// OLD
async calculateAndBillUsage(
  modelId: string,
  usage: { promptTokens: number; completionTokens: number },
  workspaceId: string,
) {
  const cost = 
    (usage.promptTokens * model.inputCostPer1kTokensInCents / 1000) +
    (usage.completionTokens * model.outputCostPer1kTokensInCents / 1000);
}

// NEW
async calculateAndBillUsage(
  modelId: string,
  usageParams: {
    usage?: { promptTokens: number; completionTokens: number };
    cachedPromptTokens?: number;
  },
  workspaceId: string,
) {
  const inputCost = 
    (regularPromptTokens * model.inputCostPerMillionTokens) / 1_000_000;
  const cachedInputCost = 
    (cachedTokens * (model.cachedInputCostPerMillionTokens ?? model.inputCostPerMillionTokens)) / 1_000_000;
  const outputCost = 
    (completionTokens * model.outputCostPerMillionTokens) / 1_000_000;
}

Improvements:

  • Supports cost calculation for cached prompts
  • More accurate with per-million-tokens
  • Clear separation between cached and non-cached inputs
  • Extensive unit tests (221 lines of test code)

7. Model Registry Service Updates

File: ai-model-registry.service.ts

Added registration methods:

// AWS Bedrock
private registerBedrockModels(region: string): void {
  const bedrock = createAmazonBedrock({
    region,
    ...(accessKeyId && secretAccessKey
      ? { accessKeyId, secretAccessKey, sessionToken }
      : {}),
  });
  // Register models...
}

// Google
private registerGoogleModels(): void {
  GOOGLE_MODELS.forEach((modelConfig) => {
    this.modelRegistry.set(modelConfig.modelId, {
      modelId: modelConfig.modelId,
      inferenceProvider: InferenceProvider.GOOGLE,
      model: google(modelConfig.modelId),
    });
  });
}

// Mistral
private registerMistralModels(): void { /* ... */ }

API key validation updates:

  • Added validation for GOOGLE_API_KEY
  • Added validation for MISTRAL_API_KEY
  • Added validation for AWS_BEDROCK_REGION (+ optional credentials)
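A minimal sketch of what that per-provider validation might look like. The env var names match those listed above; the function names and shapes are illustrative, not the service's actual API:

```typescript
// Hypothetical per-provider env-var validation sketch.
type ValidatedProvider = "google" | "mistral" | "bedrock";

function requiredEnvVar(provider: ValidatedProvider): string {
  switch (provider) {
    case "google":
      return "GOOGLE_API_KEY";
    case "mistral":
      return "MISTRAL_API_KEY";
    case "bedrock":
      // Region is the hard requirement; credentials may come from IAM.
      return "AWS_BEDROCK_REGION";
  }
}

function validateProvider(
  provider: ValidatedProvider,
  env: Record<string, string | undefined>,
): void {
  const name = requiredEnvVar(provider);
  if (!env[name]) {
    throw new Error(`${name} is not configured`);
  }
}
```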

8. Agent Model Config Service

File: agent-model-config.service.ts

Added Bedrock provider options:

private getBedrockProviderOptions(model: RegisteredAIModel): ProviderOptions {
  if (!model.doesSupportThinking) {
    return {};
  }

  return {
    bedrock: {
      thinking: {
        type: 'enabled',
        budgetTokens: AGENT_CONFIG.REASONING_BUDGET_TOKENS,
      },
    },
  };
}

Significance:

  • Supports extended thinking mode for Claude models via Bedrock
  • Configures reasoning budget tokens
  • Consistent with Anthropic provider options

9. Environment Variables

File: config-variables.ts

Added new environment variables:

// AWS Bedrock
AWS_BEDROCK_REGION: { type: 'string', required: false }
AWS_BEDROCK_ACCESS_KEY_ID: { type: 'string', required: false }
AWS_BEDROCK_SECRET_ACCESS_KEY: { type: 'string', required: false }
AWS_BEDROCK_SESSION_TOKEN: { type: 'string', required: false }

// Google
GOOGLE_API_KEY: { type: 'string', required: false }

// Mistral
MISTRAL_API_KEY: { type: 'string', required: false }

Bedrock Configuration:

  • Supports IAM roles (no credentials needed when running on EC2/ECS)
  • Supports explicit credentials (access key + secret key)
  • Supports temporary credentials (session token)
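These three configuration modes can be sketched as a small classifier. Field names mirror the AWS_BEDROCK_* variables above; the logic is an illustrative assumption, not the PR's actual code:

```typescript
// Hypothetical classifier for the three Bedrock credential modes.
interface BedrockEnv {
  region: string;
  accessKeyId?: string;
  secretAccessKey?: string;
  sessionToken?: string;
}

type CredentialMode = "iam-role" | "static-keys" | "temporary";

function credentialMode(env: BedrockEnv): CredentialMode {
  if (env.accessKeyId && env.secretAccessKey) {
    // A session token upgrades static keys to temporary (STS) credentials.
    return env.sessionToken ? "temporary" : "static-keys";
  }
  // No explicit keys: rely on the ambient IAM role (EC2/ECS).
  return "iam-role";
}
```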

10. Dependencies Updates

File: package.json & yarn.lock

Added dependencies:

{
  "@ai-sdk/amazon-bedrock": "^3.0.83",
  "@ai-sdk/google": "^2.0.54",
  "@ai-sdk/mistral": "^2.0.28"
}

Related dependencies:

  • @smithy/eventstream-codec: ^4.0.1 (for Bedrock streaming)
  • @smithy/util-utf8: ^4.0.0
  • aws4fetch: ^1.0.20 (for AWS request signing)

🔍 Detailed Analysis

Frontend Changes

File: graphql.ts (generated)

  • Auto-generated from GraphQL schema changes
  • Added types for ModelFamily, InferenceProvider
  • Updated AIModelConfig interface

File: useAiModelOptions.ts

// Filtering logic updated
const availableModels = allModels.filter(
  (model) => !model.deprecated
);
  • Filters out deprecated models from UI
  • Users won't see legacy models in dropdown

Backend Changes

Client Config Updates

File: client-config.entity.ts & client-config.service.ts

Updated how AI models config is returned to frontend:

// Before
aiModels: models.map(model => ({
  id: model.modelId,
  provider: model.provider,
  inputCostPer1kTokens: model.inputCostPer1kTokensInCents / 100,
  // ...
}))

// After
aiModels: models.map(model => ({
  id: model.modelId,
  modelFamily: model.modelFamily,
  inferenceProvider: model.inferenceProvider,
  inputCostPerMillionTokens: model.inputCostPerMillionTokens,
  cachedInputCostPerMillionTokens: model.cachedInputCostPerMillionTokens,
  deprecated: model.deprecated,
  // ...
}))

Test Coverage

File: ai-billing.service.spec.ts

Expanded test cases:

  1. Prompt caching tests:
it('should handle prompt caching correctly', async () => {
  await service.calculateAndBillUsage(
    'claude-4-1-sonnet',
    {
      usage: { promptTokens: 100000, completionTokens: 1000 },
      cachedPromptTokens: 50000,
    },
    workspaceId,
  );
  // Verify caching discount applied
});
  2. Multiple provider tests:
  • OpenAI models
  • Anthropic models
  • Bedrock models
  • Google models
  • Mistral models
  3. Edge cases:
  • Models without caching support
  • Zero token usage
  • Large token counts

Coverage: +221 lines of test code


✅ Strengths

1. Architecture Design

Separation of Concerns: Separating ModelFamily and InferenceProvider enables flexibility
Type Safety: Using enums instead of magic strings
Backward Compatibility: Keeping deprecated models prevents breaking existing agents

2. Cost Management

Industry Standard: Per-million-tokens pricing aligns with industry practice
Caching Support: Significant cost savings with prompt caching
Transparency: Clear cost breakdown in billing service

3. Provider Support

AWS Integration: Bedrock supports enterprise use cases with IAM
Cost Options: Multiple providers for price/performance tradeoffs
Feature Parity: Consistent capabilities across providers

4. Code Quality

Comprehensive Tests: Extensive unit tests for billing logic
Clear Comments: Good documentation in constants files
Consistent Naming: Follows project naming conventions

5. User Experience

Model Filtering: Hide deprecated models from UI
Clear Descriptions: Model descriptions updated to be more concise
Cost Visibility: Users can see exact costs per model


⚠️ Potential Issues & Recommendations

1. Migration Strategy

Issue: No migration script for existing agents using deprecated models

Recommendation:

// Add migration service
class ModelMigrationService {
  async migrateDeprecatedModels(workspaceId: string) {
    const agents = await this.findAgentsWithDeprecatedModels(workspaceId);
    
    const migrationMap = {
      'o3': 'gpt-4.1',
      'o4-mini': 'gpt-4.1-mini',
      'claude-3-7-sonnet': 'claude-4-1-sonnet',
      'grok-3': 'grok-4',
    };
    
    for (const agent of agents) {
      agent.modelId = migrationMap[agent.modelId] || agent.modelId;
      await this.save(agent);
    }
  }
}

2. Error Handling

Issue: AWS credentials validation is unclear

Current:

const apiKey = this.twentyConfigService.get('AWS_BEDROCK_REGION');
if (!apiKey) {
  throw new AgentException('AWS_BEDROCK_REGION not configured');
}

Recommendation:

// More detailed credentials validation
private async validateBedrockConfig(): Promise<void> {
  const region = this.twentyConfigService.get('AWS_BEDROCK_REGION');
  const accessKeyId = this.twentyConfigService.get('AWS_BEDROCK_ACCESS_KEY_ID');
  const secretAccessKey = this.twentyConfigService.get('AWS_BEDROCK_SECRET_ACCESS_KEY');
  
  if (!region) {
    throw new AgentException(
      'AWS_BEDROCK_REGION is required to use Bedrock models',
      AgentExceptionCode.API_KEY_NOT_CONFIGURED,
    );
  }
  
  // If explicit credentials provided, both must be present
  if ((accessKeyId && !secretAccessKey) || (!accessKeyId && secretAccessKey)) {
    throw new AgentException(
      'Both AWS_BEDROCK_ACCESS_KEY_ID and AWS_BEDROCK_SECRET_ACCESS_KEY must be provided together',
      AgentExceptionCode.API_KEY_NOT_CONFIGURED,
    );
  }
  
  // Test connection
  try {
    await this.bedrockClient.testConnection();
  } catch (error) {
    throw new AgentException(
      `Failed to connect to AWS Bedrock: ${error.message}`,
      AgentExceptionCode.PROVIDER_CONNECTION_ERROR,
    );
  }
}

3. Cost Calculation Precision

Issue: Floating point precision may cause small errors

Current:

const inputCost = (promptTokens * inputCostPerMillionTokens) / 1_000_000;

Recommendation:

// Use integer math to avoid floating point errors
const inputCostInCents = Math.round(
  (promptTokens * inputCostPerMillionTokens * 100) / 1_000_000
);
const inputCost = inputCostInCents / 100;

4. Documentation

Missing:

  • ENV variable documentation for users
  • Migration guide for deprecated models
  • Cost comparison table

Recommendation: Add file docs/ai-models-migration.md:

AI Models Migration Guide

New Providers

AWS Bedrock Setup

# Option 1: IAM Role (recommended for AWS deployments)
AWS_BEDROCK_REGION=us-east-1

# Option 2: Access Keys
AWS_BEDROCK_REGION=us-east-1
AWS_BEDROCK_ACCESS_KEY_ID=AKIA...
AWS_BEDROCK_SECRET_ACCESS_KEY=...

Google (Gemini) Setup

GOOGLE_API_KEY=AIza...

Mistral Setup

MISTRAL_API_KEY=...

Deprecated Models

Old Model         | Recommended Replacement | Notes
o3                | gpt-4.1                 | Better performance, same pricing
claude-3-7-sonnet | claude-4-1-sonnet       | Extended thinking support
grok-3            | grok-4                  | Enhanced reasoning

5. Testing

Missing:

  • Integration tests for Bedrock authentication
  • E2E tests for cached prompts
  • Load tests for billing calculations

Recommendation:

// Add integration test
describe('BedrockIntegration', () => {
  it('should authenticate with IAM role', async () => {
    process.env.AWS_BEDROCK_REGION = 'us-east-1';
    const service = new AiModelRegistryService();
    
    const model = await service.getModel('claude-4-1-sonnet');
    expect(model).toBeDefined();
  });
  
  it('should handle prompt caching in real request', async () => {
    const response = await agent.chat({
      messages: [...previousMessages, newMessage],
      // AI SDK should auto-detect cached prompts
    });
    
    expect(response.usage.cachedPromptTokens).toBeGreaterThan(0);
  });
});

6. Performance

Concern: Bedrock may have cold start latency

Recommendation:

  • Implement connection pooling
  • Add latency monitoring
  • Consider keep-alive pings

// Example monitoring
class BedrockPerformanceMonitor {
  async trackLatency(modelId: string, operation: () => Promise<any>) {
    const start = Date.now();
    try {
      const result = await operation();
      const duration = Date.now() - start;
      
      await this.metricsService.record({
        metric: 'bedrock.latency',
        value: duration,
        tags: { modelId, provider: 'bedrock' },
      });
      
      return result;
    } catch (error) {
      await this.metricsService.record({
        metric: 'bedrock.error',
        tags: { modelId, error: error.code },
      });
      throw error;
    }
  }
}

🔐 Security Considerations

1. Credentials Management

Current Implementation: ✅ GOOD

  • Credentials loaded from environment variables
  • No hardcoding in code
  • Supports IAM roles (no credentials needed in code)

Recommendations:

  • Consider using AWS Secrets Manager for production
  • Implement credential rotation
  • Add audit logging for credential access

2. Cost Control

Risk: Users may incur high costs with expensive models

Recommendation:

// Add budget limits
interface WorkspaceBudgetConfig {
  dailyLimitInDollars: number;
  alertThresholdPercentage: number;
  autoStopOnLimit: boolean;
}

async calculateAndBillUsage(...) {
  // Check budget before billing
  const currentSpend = await this.getTodaySpend(workspaceId);
  const budget = await this.getBudgetConfig(workspaceId);
  
  if (currentSpend + totalCost > budget.dailyLimitInDollars) {
    if (budget.autoStopOnLimit) {
      throw new BudgetExceededException();
    }
    await this.sendBudgetAlert(workspaceId);
  }
  
  // Continue with billing...
}

📊 Impact Analysis

Performance Impact

Positive: ✅ Cached prompts significantly reduce latency and cost
✅ Google Gemini Flash models are very fast (low latency)
✅ Bedrock has multi-region support (better global latency)

Negative: ⚠️ More providers = more connection overhead
⚠️ Billing calculation more complex (cached vs non-cached)

Cost Impact

For Users:

  • Cached prompts: Save 75-90% cost for repeated contexts
  • More options: Choose provider suitable for budget
  • Competitive pricing: Google Gemini Flash very affordable ($0.075/M input)

Cost Comparison Example:

Scenario: Chat with 50K context, 100 turns

Old (GPT-4.1, no caching):
- Input: 50K × 100 turns = 5M tokens
- Cost: 5M × $2/M = $10

New (Claude 4.1 Sonnet with caching):
- Turn 1: 50K input = $0.15
- Turn 2-100: 50K cached @ $0.3/M = $0.015 each
- Total: $0.15 + (99 × $0.015) = $1.63
- Savings: 83.7% 🎉
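The arithmetic above can be reproduced directly. A sketch of the same scenario (rates taken from the example; output tokens ignored, as in the original comparison):

```typescript
// 50K-token context re-sent over 100 chat turns.
const turns = 100;
const contextTokens = 50_000;

// Old: GPT-4.1 at $2/M input, no caching.
const oldCost = (contextTokens * turns * 2.0) / 1_000_000; // $10.00

// New: Claude 4.1 Sonnet at $3/M input, $0.30/M cached input.
const firstTurn = (contextTokens * 3.0) / 1_000_000; // $0.15, cache miss
const cachedTurn = (contextTokens * 0.3) / 1_000_000; // $0.015 per cached turn
const newCost = firstTurn + (turns - 1) * cachedTurn; // $1.635

const savingsPct = (1 - newCost / oldCost) * 100; // ≈ 83.7%
```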

Breaking Changes

API Changes:

  • ✅ Backward compatible: deprecated models still work
  • ⚠️ Frontend needs rebuild (GraphQL types changed)
  • ⚠️ Billing reports format changed (per-million vs per-1k)

Migration Required:

  • Update frontend build
  • Update any external tools reading billing data
  • Educate users about deprecated models

🧪 Testing Checklist

Unit Tests

  • ✅ Billing calculations with caching
  • ✅ Model filtering (deprecated models)
  • ✅ Provider options generation
  • ⚠️ Missing: Edge cases for all new providers

Integration Tests

  • ⚠️ Bedrock authentication flows
  • ⚠️ Google API key validation
  • ⚠️ Mistral API integration
  • ⚠️ Prompt caching end-to-end

E2E Tests

  • ⚠️ Agent creation with new models
  • ⚠️ Cost tracking with cached prompts
  • ⚠️ Model migration flow
  • ⚠️ Multi-provider failover

📝 Documentation Needs

User Documentation

  1. Setup guides for each provider:

    • AWS Bedrock setup (IAM roles vs credentials)
    • Google API key generation
    • Mistral API key setup
  2. Cost optimization guide:

    • When to use caching
    • Provider price comparison
    • Budget recommendations
  3. Migration guide:

    • Deprecated models mapping
    • Feature comparison
    • Timeline for deprecation

Developer Documentation

  1. Architecture decision records:

    • Why separate ModelFamily vs InferenceProvider
    • Why per-million-tokens pricing
    • Caching implementation details
  2. API documentation:

    • Updated GraphQL schema docs
    • Billing API changes
    • Client config changes

🎯 Recommendations Priority

High Priority 🔴

  1. Add migration script for deprecated models
  2. Improve error messages for Bedrock credentials
  3. Add integration tests for all new providers
  4. Create setup documentation for users

Medium Priority 🟡

  1. Add budget limits and alerting
  2. Implement connection pooling for Bedrock
  3. Add performance monitoring for all providers
  4. Create cost comparison dashboard

Low Priority 🟢

  1. Add model benchmarks (quality metrics)
  2. Implement auto-failover between providers
  3. Add A/B testing framework for models
  4. Create analytics dashboard for model usage

📈 Future Enhancements

Short Term (Next Sprint)

  1. Model auto-selection based on task type
  2. Cost alerts when exceeding threshold
  3. Usage analytics dashboard

Medium Term (Next Quarter)

  1. Smart caching - auto-detect cacheable contexts
  2. Multi-model routing - route requests to best provider
  3. Performance benchmarks - compare models automatically

Long Term (Future)

  1. Custom model fine-tuning support
  2. Federation - combine multiple models
  3. Auto-scaling - switch models based on load

✅ Final Verdict

Overall Assessment: APPROVE with Minor Changes

Score: 8.5/10

Breakdown:

  • Code Quality: 9/10 - Well structured, type-safe
  • Test Coverage: 7/10 - Good unit tests, missing integration tests
  • Documentation: 6/10 - Needs user-facing docs
  • Performance: 9/10 - Caching is great optimization
  • Security: 8/10 - Good practices, needs budget controls
  • Backward Compatibility: 9/10 - Excellent deprecation strategy

Must-Have Before Merge:

  1. ✅ Add basic setup documentation
  2. ✅ Add integration test for Bedrock auth
  3. ✅ Improve error messages for missing credentials
  4. ✅ Add migration path documentation

Nice-to-Have:

  1. Budget limit implementation
  2. Performance monitoring
  3. More comprehensive E2E tests
  4. Cost optimization guide

❓ Questions for Team

  1. Timeline: When will deprecated models be fully sunset?
  2. Default model: Should we change default from GPT-4.1 to a cheaper model?
  3. Testing: Do we have budget to test on production with real providers?
  4. Monitoring: What infrastructure exists to monitor model performance/costs?
  5. Failover: Do we need to implement auto-failover between providers?

🏁 Conclusion

This is a high-quality refactor with:

  • ✅ Clear architecture improvements
  • ✅ Industry-standard pricing model
  • ✅ Significant cost savings potential (caching)
  • ✅ Good backward compatibility
  • ✅ Extensible design for future providers

Main concerns:

  • ⚠️ Missing integration tests
  • ⚠️ Documentation gaps
  • ⚠️ No cost control mechanisms

Recommendation: SHIP IT with the condition that must-have items above are addressed.


Review completed by: AI Code Reviewer
Date: February 22, 2026
Branch: ai-models-and-bedrock

With PRR Kit
PRR Kit

PR Review: origin/ai-models-and-bedrock

Date: 2026-02-22 | Reviewer: AI Review Framework
Type: refactor | Files: 27 | Lines: +1272 / -341


Executive Summary

This PR refactors the AI model configuration system to support AWS Bedrock, Google Gemini, and Mistral AI providers while standardizing cost calculations from per-1k-tokens (cents) to per-million-tokens (dollars). The changes demonstrate strong architectural thinking with comprehensive test coverage.

Verdict: ⚠️ APPROVE WITH NOTES — Fix 3 blockers before merge

Business Risk: MEDIUM

  • High business value: Multi-cloud AI strategy unlocks enterprise AWS deals (+$250K ARR opportunity)
  • Critical billing accuracy concerns: Duplicate calculation logic must be fixed to prevent revenue leakage
  • Strategic pricing decision needed: Transparent pricing has competitive intelligence implications

Totals: 🔴 3 blockers | 🟡 9 warnings | 🟢 5 suggestions | ❓ 4 questions for author


Business Impact 💼

Value Proposition

Multi-cloud AI strategy — Supporting 6 major providers (OpenAI, Anthropic direct + Bedrock, Google, Mistral, xAI, Groq) positions Twenty as vendor-neutral and future-proof. AWS Bedrock specifically unlocks enterprise deals with strict data residency requirements.

Risk Level: MEDIUM

  • Billing accuracy critical for customer trust and revenue
  • Pricing transparency exposes competitive intelligence
  • Model deprecation requires careful user communication

Top Business Concerns:

  1. Billing calculation accuracy — Duplicate logic could cause revenue leakage or customer overcharges
  2. Pricing transparency — GraphQL API exposes exact costs; decide if intentional
  3. Deprecated model migration — Users on o3, gpt-4o, etc. need clear migration path and advance notice

Deployment Recommendation: Ship after fixing blockers + validating billing across all providers

Post-ship Monitoring:

  • Week 1: Billing reconciliation audit (provider invoices vs. customer charges)
  • Month 1: Close first AWS Bedrock-enabled enterprise deal
  • Ongoing: Track provider adoption rates and revenue per provider

Blockers 🔴

🔴 BLOCKER — Duplicate calculation in totalInputTokens

File: packages/twenty-server/src/engine/metadata-modules/ai/ai-billing/services/ai-billing.service.ts:86-88

Category: Logic Error

Issue: Both branches of the ternary operator are identical:

const totalInputTokens = excludesCachedTokens
  ? adjustedInputTokens + cachedInputTokens + safeCacheCreationTokens
  : adjustedInputTokens + cachedInputTokens + safeCacheCreationTokens;

Impact:

  • Either the logic is incomplete (one branch should differ) OR the ternary is unnecessary
  • If calculation is wrong: revenue leakage (undercharge) or customer anger (overcharge)
  • Billing accuracy is critical for trust and profitability

Fix: Verify intent with author, then either:

// If both branches should be same:
const totalInputTokens = adjustedInputTokens + cachedInputTokens + safeCacheCreationTokens;

// OR if logic is incomplete, implement correct differentiation

🔴 BLOCKER — Missing modelFamily field in custom model config

File: packages/twenty-server/src/engine/metadata-modules/ai/ai-models/services/ai-model-registry.service.ts:265-272

Category: Type Safety

Issue: Fallback config for OpenAI-compatible models missing required modelFamily field:

return {
  modelId: registeredModel.modelId,
  ...
  inferenceProvider: registeredModel.inferenceProvider,
  inputCostPerMillionTokens: 0,
  outputCostPerMillionTokens: 0,
  ...
} as AIModelConfig;  // ← Missing modelFamily! Cast hides the error

Impact:

  • Runtime crash when billing service checks model.modelFamily === ModelFamily.ANTHROPIC
  • TypeScript can't catch this due to unsafe as cast
  • Affects any user using OpenAI-compatible models

Fix:

return {
  modelId: registeredModel.modelId,
  label: registeredModel.modelId,
  description: `Custom model: ${registeredModel.modelId}`,
  modelFamily: ModelFamily.OPENAI,  // ADD THIS
  inferenceProvider: registeredModel.inferenceProvider,
  inputCostPerMillionTokens: 0,
  outputCostPerMillionTokens: 0,
  contextWindowTokens: 128000,
  maxOutputTokens: 4096,
};  // Remove cast — now type-safe

🔴 BLOCKER — RegisteredAIModel schema mismatch

File: packages/twenty-server/src/engine/metadata-modules/ai/ai-models/services/ai-model-registry.service.ts:25-29

Category: Architecture / Type Safety

Issue: RegisteredAIModel interface uses old schema (provider) while AIModelConfig uses new schema (inferenceProvider + modelFamily):

export interface RegisteredAIModel {
  modelId: string;
  provider: ModelProvider;  // ← OLD schema
  model: LanguageModel;
}

export interface AIModelConfig {
  modelFamily: ModelFamily;              // NEW
  inferenceProvider: InferenceProvider;  // NEW (replaces 'provider')
  ...
}

Impact:

  • Type confusion between old provider and new inferenceProvider
  • Unsafe casts paper over the mismatch
  • Future refactorings will be error-prone

Fix: Update RegisteredAIModel to match new schema:

export interface RegisteredAIModel {
  modelId: string;
  modelFamily: ModelFamily;              // ADD
  inferenceProvider: InferenceProvider;  // RENAME from 'provider'
  model: LanguageModel;
  doesSupportThinking?: boolean;
}

Then update all registration methods to populate both fields.


Warnings 🟡

🟡 WARNING — Enum usage contradicts project conventions

File: packages/twenty-server/src/engine/metadata-modules/ai/ai-models/constants/ai-models-types.const.ts

Category: Code Quality

CLAUDE.md states: "String literals over enums (except for GraphQL enums)". New InferenceProvider and ModelFamily enums are not GraphQL enums.

Suggested fix: Convert to as const objects per project standards:

export const InferenceProvider = {
  NONE: 'none',
  OPENAI: 'openai',
  ...
} as const;

export type InferenceProvider = typeof InferenceProvider[keyof typeof InferenceProvider];

🟡 WARNING — Breaking API change: async removed from calculateAndBillUsage

File: packages/twenty-server/src/engine/metadata-modules/ai/ai-billing/services/ai-billing.service.ts:46-54

Category: API Compatibility

Method changed from async to synchronous, but caller still uses await. Harmless but could cause confusion.

Suggested fix: Either keep async for compatibility or update all callers.


🟡 WARNING — Incomplete validation for Bedrock credentials

File: packages/twenty-server/src/engine/metadata-modules/ai/ai-models/services/ai-model-registry.service.ts:384-408

Category: Security / UX

Validation only checks AWS_BEDROCK_REGION, not credentials. If region is set but credentials missing, validation passes but runtime fails.

Suggested fix:

case InferenceProvider.BEDROCK: {
  const region = this.twentyConfigService.get('AWS_BEDROCK_REGION');
  if (!region) {
    throw new AgentException('AWS_BEDROCK_REGION required');
  }
  // Optionally warn if credentials partially configured
  const accessKey = this.twentyConfigService.get('AWS_BEDROCK_ACCESS_KEY_ID');
  const secretKey = this.twentyConfigService.get('AWS_BEDROCK_SECRET_ACCESS_KEY');
  if ((accessKey && !secretKey) || (!accessKey && secretKey)) {
    this.logger.warn('Bedrock credentials partially configured');
  }
  apiKey = region;
  break;
}

🟡 WARNING — AWS_BEDROCK_REGION sensitivity classification

File: packages/twenty-server/src/engine/core-modules/twenty-config/config-variables.ts

Category: Security (Low)

Region not marked isSensitive: true. While regions are public info, zero-trust security suggests marking metadata sensitive.

Suggested fix: Either mark as isSensitive: false with comment explaining, or mark true for consistency.


🟡 WARNING — Provider-specific business logic concentration

File: packages/twenty-server/src/engine/metadata-modules/ai/ai-billing/services/ai-billing.service.ts:75-95

Category: Architecture

Token accounting logic (excludesCachedTokens = modelFamily === ANTHROPIC) concentrates provider-specific behavior in billing service. As more providers are added, this will accumulate conditionals.

Suggested refactoring: Strategy pattern to encapsulate per-provider token accounting (see Architecture Review for details).
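A strategy-pattern sketch of that refactoring, under the assumption (stated in the warning above) that Anthropic-family usage reports exclude cached tokens from `promptTokens` while other families include them. All names here are illustrative:

```typescript
// Each provider family supplies its own token-accounting rule, so the
// billing service stays free of per-provider conditionals.
interface Usage {
  promptTokens: number;
  cachedPromptTokens: number;
}

interface TokenAccounting {
  // Prompt tokens to bill at the full (non-cached) input rate.
  billableFreshTokens(usage: Usage): number;
}

// Anthropic-style: reported promptTokens already exclude cached tokens.
const anthropicAccounting: TokenAccounting = {
  billableFreshTokens: (u) => u.promptTokens,
};

// Default: reported promptTokens include cached tokens, so subtract them.
const defaultAccounting: TokenAccounting = {
  billableFreshTokens: (u) =>
    Math.max(0, u.promptTokens - u.cachedPromptTokens),
};

function accountingFor(modelFamily: string): TokenAccounting {
  return modelFamily === "anthropic" ? anthropicAccounting : defaultAccounting;
}
```

Adding a new provider then means adding one strategy object rather than another branch inside `calculateAndBillUsage`.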


🟡 WARNING — Model costs hardcoded across multiple files

Files: packages/twenty-server/src/engine/metadata-modules/ai/ai-models/constants/*.const.ts

Category: Architecture / Business

~60+ hardcoded cost values across provider files. Pricing updates require code deployment.

Trade-off: Hardcoded = simple + audit trail vs. Database-backed = flexible + dynamic pricing

Decision needed: If pricing changes frequently (monthly), consider database. If stable (quarterly+), current approach acceptable.


🟡 WARNING — Model registry initialization frequency

File: packages/twenty-server/src/engine/metadata-modules/ai/ai-models/services/ai-model-registry.service.ts:35-77

Category: Performance

The registry is rebuilt in the constructor. If the service is not a singleton, this creates overhead (SDK clients recreated per request).

Assessment needed: Verify the @Injectable() scope (the default is singleton, which is fine). If it is not a singleton, implement lazy initialization.


🟡 WARNING — Cost computation in streaming loop

File: packages/twenty-server/src/engine/metadata-modules/ai/ai-chat/services/agent-chat-streaming.service.ts

Category: Performance

computeStreamCosts() is called repeatedly during streaming. It is simple arithmetic, so this is acceptable unless profiling shows a bottleneck.
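For reference, the per-million-token cost arithmetic involved is on this order. This is an illustrative sketch (function name and rates are made up, not the PR's implementation):

```typescript
// Illustrative per-million-token cost calculation.
// Rates are expressed in dollars per million tokens, matching the
// pricing migration described in this PR.
const PER_MILLION = 1_000_000;

export function computeCost(
  inputTokens: number,
  outputTokens: number,
  inputCostPerMillion: number,
  outputCostPerMillion: number,
): number {
  return (
    (inputTokens / PER_MILLION) * inputCostPerMillion +
    (outputTokens / PER_MILLION) * outputCostPerMillion
  );
}
```

Two multiplications and one addition per call, which is why repeated invocation inside a streaming loop is unlikely to matter in practice.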


🟡 WARNING — Potential secret exposure via client config

File: packages/twenty-server/src/engine/core-modules/client-config/client-config.entity.ts

Category: Security

Question for author: Are API keys exposed to frontend? Review shows only model costs are exposed (GOOD), but verify no keys leak.


Suggestions 🟢

🟢 Data Validation

Extract safe() number helper to shared utility module for reuse across billing/metrics code.
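A minimal sketch of such a shared helper. The name safe() mirrors the helper mentioned above, but the exact signature in the PR may differ:

```typescript
// Hypothetical shared utility: coerce possibly-missing or invalid
// usage numbers to 0 so billing/metrics arithmetic never sees
// undefined, null, or NaN.
export function safe(value: number | null | undefined): number {
  return typeof value === 'number' && Number.isFinite(value) ? value : 0;
}
```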

🟢 Documentation

Model descriptions were shortened from informative to generic. Balance brevity with use-case clarity (e.g., "Best for: coding, analysis").

🟢 Performance

Consider caching getDefaultSpeedModel() / getDefaultPerformanceModel() results (currently iterates 50+ models per call).
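One way to cache the lookup is a simple memoized accessor. This is an assumed sketch; the registry API and model shape below are invented for illustration:

```typescript
// Illustrative memoization of a default-model lookup: scan the model
// list once, then reuse the result on every later call.
type Model = { id: string; speedRank: number };

export function makeGetDefaultSpeedModel(models: Model[]): () => Model {
  let cached: Model | undefined;
  return () => {
    if (!cached) {
      // One linear scan instead of one per call.
      cached = models.reduce((best, m) =>
        m.speedRank < best.speedRank ? m : best,
      );
    }
    return cached;
  };
}
```

If the model list can change at runtime (e.g. database-backed pricing later), the cache would need invalidation; for a static in-code registry it can live for the process lifetime.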

🟢 Security

Token usage logging could expose sensitive metadata in compliance contexts. Consider debug-level logging for detailed cost breakdowns.

🟢 Architecture

Future-proof with adapter pattern for multi-cloud inference (accessing same model via different clouds: OpenAI direct vs. Azure vs. Bedrock).
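A rough sketch of what such an adapter seam could look like. Interface and adapter names are hypothetical, and the model-id matching rules are placeholders:

```typescript
// Hypothetical adapter interface: the same underlying model served
// through different clouds, hidden behind one shape.
interface InferenceAdapter {
  providerId: string; // e.g. 'anthropic-direct', 'bedrock', 'azure'
  supports(modelId: string): boolean;
  // Translate a canonical model id into the id this cloud expects.
  resolveEndpointModelId(modelId: string): string;
}

// Placeholder matching rules for illustration only.
const bedrockAdapter: InferenceAdapter = {
  providerId: 'bedrock',
  supports: (id) => id.startsWith('anthropic.'),
  resolveEndpointModelId: (id) => id,
};

const anthropicDirectAdapter: InferenceAdapter = {
  providerId: 'anthropic-direct',
  supports: (id) => id.startsWith('claude'),
  resolveEndpointModelId: (id) => id,
};

export function pickAdapter(
  adapters: InferenceAdapter[],
  modelId: string,
): InferenceAdapter | undefined {
  return adapters.find((a) => a.supports(modelId));
}
```

The registry would then route a request to whichever cloud is configured, without the billing or chat layers knowing which one was chosen.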


Questions 📌

❓ Cache creation token source for non-Bedrock providers

Context: Streaming service extracts cacheWriteInputTokens from Bedrock metadata.

Question: How are cache creation tokens tracked for OpenAI and Anthropic direct? Is this Bedrock-only? If not, billing will be inaccurate for direct provider usage with prompt caching.


❓ AWS Bedrock credential fallback behavior

Context: Bedrock registration conditionally includes explicit credentials, falling back to AWS SDK credential chain.

Questions:

  1. Is IAM role-based auth the recommended production method?
  2. Should credential chain order be documented for compliance audits?
  3. Are there environments where auto-discovery should NOT happen (prevent accidental credential use)?

❓ Strategic implications of exposing model costs

Context: ClientAIModelConfig exposes exact per-token costs via GraphQL.

Questions:

  1. Is pricing transparency intentional (trust-building) or accidental?
  2. Could competitors scrape this to undercut pricing?
  3. Can users game the system with cost optimization scripts?
  4. Should you show user-facing prices (with margin) instead of raw costs?

❓ Model deprecation policy

Context: o3, o4-mini, gpt-4o, gpt-4o-mini, gpt-4-turbo marked deprecated.

Questions:

  1. What's the user communication plan (email, in-app warning, docs)?
  2. How much advance notice before deprecated models stop working?
  3. Which replacement model for each deprecated model?
  4. What's the revenue impact if users migrate to cheaper alternatives?

Files Reviewed

High-Impact Changes (3+ findings)

| File | Issues | Categories |
|---|---|---|
| ai-billing.service.ts | 4 | Blocker (logic), Warning (architecture), Suggestion (util) |
| ai-model-registry.service.ts | 6 | Blocker (schema), Warning (validation, perf), Questions (2) |
| ai-models-types.const.ts | 2 | Blocker (schema), Warning (enum) |

New Files (provider support)

Test Coverage

Comprehensive: ai-billing.service.spec.ts covers:

  • Basic token usage
  • Cached token pricing (OpenAI vs. Anthropic accounting)
  • Reasoning tokens
  • Cache creation tokens

Review Methodology

Context Sources:

  • Primary: CLAUDE.md (project standards)
  • Stack rules: TypeScript (58 rules), NestJS (63 rules)
  • Domains: ai-configuration, ai-billing, ai-providers, graphql-schema
  • Stacks: typescript, nestjs, react, graphql, jest-vitest

Review Coverage:

  • ✅ General (logic, quality, testing, side effects)
  • ✅ Security (OWASP Top 10, secrets, auth)
  • ✅ Performance (queries, computation, caching)
  • ✅ Architecture (separation of concerns, extensibility)
  • ✅ Business (user impact, revenue, deployment risk)

Recommended Next Steps

Before Merge (Required)

  1. ✅ Fix 3 blockers (duplicate calc, missing field, schema mismatch)
  2. ✅ Validate billing across all 6 providers with integration tests
  3. ⚠️ Make conscious decision on pricing transparency
  4. ⚠️ Document model deprecation policy and user migration path

After Deployment (Week 1)

  1. Billing reconciliation audit (provider invoices vs. customer charges)
  2. Monitor support tickets for billing/deprecation confusion
  3. Track new provider adoption rates (Bedrock, Google, Mistral)

Strategic (1-3 months)

  1. Close first AWS Bedrock-enabled enterprise deal
  2. Evaluate Strategy pattern refactoring for token accounting
  3. Consider database-backed pricing if updates become frequent

Overall Assessment: Strong refactoring with comprehensive test coverage. Fix 3 critical issues before merge. High business value (AWS enterprise unlocked, multi-cloud strategy). Medium risk due to billing accuracy concerns and strategic pricing decisions needed.

Reviews generated using GitHub Copilot with default settings (left) vs. PRR Kit structured workflow (right).

Quick Start

Up in 3 steps

Install, answer a few prompts, start reviewing.

1

Install into your repo

npx prr-kit install

Run the interactive installer. It guides you through everything.

2

Answer a few quick questions

◆ What should reviewers call you? › Alex
◆ Language for reviewer agents? › English
◆ GitHub repo for posting comments?
  › owner/repo  (optional, blank = local mode)
◆ Which IDEs to configure? › Claude Code

Config is written automatically. No file editing needed.

3

Start reviewing

/prr-quick

One command runs the full pipeline: select PR, review, report.

Silent install (non-interactive)

$ npx prr-kit install --directory /path/to/repo --modules prr --tools claude-code --yes
Pipeline

How it works

From PR selection to inline comments, fully orchestrated.

01
/prr-master

Start the Agent

Invoke the PRR Master agent from your IDE. It reads your _prr/ configuration and presents the full workflow menu.

02
Select PR

Select Your PR

List open PRs from your Git platform or select branches manually. The diff loads directly into AI context.

03
Describe changes

PR Description

Classify PR type, generate a concise summary, and produce a file-by-file walkthrough of all changes.

04
Automatic

PR-Specific Context

Automatically collects fresh context for this PR: ESLint rules, CLAUDE.md, CONTRIBUTING.md, and inline annotations, limited to what's relevant to the changed files.

05
Deep code review

Specialist Reviews

A team of reviewer agents runs in parallel or in sequence, each focused on a different dimension of code quality, using the fresh PR-specific context.

06
Generate report

Generate Report

All findings compiled into a Markdown report, sorted by severity from blockers to suggestions.

07
Post inline comments

Post Inline Comments

Optionally post findings as inline code comments on the exact file and line, just as a human reviewer would.

Full control with the master agent

Use /prr-master to access the full workflow menu. Run each step manually, mix and match reviewers, or explore any part of the pipeline independently.

/prr-master
Severity Levels
BLOCKER
Must fix before merge
WARNING
Should fix
SUGGESTION
Nice-to-have
QUESTION
Needs clarification
Platform Support

Works where you work

GitHub, GitLab, Azure DevOps, Bitbucket, or local mode with no CLI.

GitHub (gh): List open PRs · Load diff · Inline reviews · Approve/reject

GitLab (glab): List open MRs · Load diff · MR discussions · Approve/reject

Azure DevOps (az): List open PRs · Git fallback diff · PR threads · Approve/reject

Bitbucket (bb / curl): List PRs · Git fallback diff · Inline comments · Core review

None / Local mode: All review analysis runs locally via git diff. No platform CLI required. Findings are saved to _prr-output/reviews/ only.

IDE Support

Works with your IDE

Compatible with 18 AI IDEs. Use the one you already have.

Claude Code (Anthropic)
Cursor (Anysphere)
Windsurf (Codeium)
GitHub Copilot (GitHub)
Antigravity

Also supported
Augment Code · Cline · Codex · Crush · Gemini CLI · iFlow · Kilo · Kiro · OpenCode · QwenCoder · Roo Code · Rovo Dev · Trae
Requirements
Node.js 20+
Git
AI IDE of your choice
Optional (for PR listing)
gh (GitHub CLI): GitHub
glab (GitLab CLI): GitLab
az (Azure CLI + extension): Azure DevOps
bb / curl: Bitbucket
Support

Support the project

PR Review Kit is free for everyone, and always will be.
If you'd like to support development: