Instrument your service, pull AWS cost data, add LLM token spend, and get the true total cost per request — so you know exactly what each AI call actually costs your business.
Copy one of the snippets below into your service. The SDK tracks token usage, latency, and request metadata, and sends them to Argovaa for cost blending.
```python
import argovaa
from openai import OpenAI

# Initialise with your Argovaa API key
argovaa.init(
    api_key="arg_live_xxxxxxxxxxxx",
    service_name="my-ai-service",
    environment="production"
)

# Wrap your existing LLM client
client = argovaa.wrap(OpenAI())

# Use exactly as before; tracking is automatic
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
    # Argovaa captures: tokens, latency, cost, model
)

# Optionally tag requests for cost attribution
with argovaa.trace(
    user_id=user_id,
    feature="document-summary",
    tenant=tenant_id  # for multi-tenant billing
):
    result = client.chat.completions.create(...)
```
```javascript
import { Argovaa } from '@argovaa/sdk';
import OpenAI from 'openai';

// Initialise Argovaa
const argovaa = new Argovaa({
  apiKey: 'arg_live_xxxxxxxxxxxx',
  serviceName: 'my-ai-service',
  environment: 'production',
});

// Wrap your OpenAI client
const openai = argovaa.wrap(new OpenAI());

// Tracking is fully automatic from here
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: prompt }],
});

// Tag for multi-tenant cost attribution
await argovaa.trace({
  userId: userId,
  feature: 'document-summary',
  tenant: tenantId,
  customCost: 0.002 // optional fixed overhead
}, async () => {
  return openai.chat.completions.create({...});
});
```
| Metric | Source |
|---|---|
| Input tokens | LLM API |
| Output tokens | LLM API |
| Model name | LLM API |
| Token cost ($) | Calculated |
| Latency (ms) | SDK |
| Error rate | SDK |
| User / tenant ID | SDK |
| Feature / endpoint | SDK |
| AWS infra cost | CUR upload |
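The "Token cost ($)" row is listed as calculated rather than reported by the LLM API. A minimal sketch of how such a figure could be derived from the input and output token counts is shown here. The `token_cost` function, the `example-model` name, and the prices are hypothetical placeholders, not Argovaa's pricing table or any provider's real rates.

```python
from decimal import Decimal

# Hypothetical prices in USD per one million tokens.
# Replace with your provider's current published rates.
PRICES_PER_MILLION = {
    "example-model": {"input": Decimal("2.00"), "output": Decimal("8.00")},
}


def token_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the token cost in USD for a single request."""
    price = PRICES_PER_MILLION[model]
    weighted = price["input"] * input_tokens + price["output"] * output_tokens
    return weighted / Decimal(1_000_000)


# Example: 1,200 input tokens and 300 output tokens on the placeholder model.
print(token_cost("example-model", 1_200, 300))
```

`Decimal` is used instead of `float` so that very small per-request amounts do not pick up binary rounding noise when they are summed over many requests.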
Use this key to initialise the SDK. Keep it secret and never commit it to version control. If it is compromised, rotate it from Settings.
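One common way to keep the key out of version control is to read it from an environment variable at startup. The sketch below assumes a variable named `ARGOVAA_API_KEY`; that name and the `load_api_key` helper are illustrative choices, not something the SDK is known to require.

```python
import os


def load_api_key(env_var: str = "ARGOVAA_API_KEY") -> str:
    """Read the API key from the environment and fail fast if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before starting the service")
    return key


# The result can then replace the literal key in the init call, for example:
# argovaa.init(api_key=load_api_key(), service_name="my-ai-service", environment="production")
```

Failing at startup when the variable is unset tends to be easier to diagnose than a rejected request later on.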
Upload your AWS Cost and Usage Report (CUR) CSV file. We extract EC2, ECS, Lambda, and networking costs and blend them with your LLM token usage.
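To give a rough idea of what this extraction involves, here is a small sketch that sums unblended cost per service from a CUR-style CSV. It assumes the legacy CUR column names `lineItem/ProductCode` and `lineItem/UnblendedCost`; the product-code mapping and the `sum_infra_costs` helper are assumptions for illustration, and this is not Argovaa's actual parser. In real reports, networking charges can also appear under other product codes.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Assumed mapping from CUR product codes to the cost categories named above.
TRACKED_PRODUCTS = {
    "AmazonEC2": "EC2",
    "AmazonECS": "ECS",
    "AWSLambda": "Lambda",
    "AWSDataTransfer": "Networking",
}


def sum_infra_costs(cur_csv) -> dict:
    """Sum unblended cost per tracked category from an open CUR CSV file object."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(cur_csv):
        category = TRACKED_PRODUCTS.get(row.get("lineItem/ProductCode", ""))
        if category:
            # Treat an empty cost cell as zero.
            totals[category] += Decimal(row.get("lineItem/UnblendedCost") or "0")
    return dict(totals)


# Tiny in-memory sample with made-up amounts; the S3 row is ignored as untracked.
sample = io.StringIO(
    "lineItem/ProductCode,lineItem/UnblendedCost\n"
    "AmazonEC2,12.50\n"
    "AWSLambda,0.75\n"
    "AmazonS3,3.10\n"
    "AmazonEC2,7.25\n"
)
costs = sum_infra_costs(sample)
print(costs)
```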
Enter token usage from your LLM provider dashboard. Once your service is instrumented, the SDK populates this automatically.
All three cost layers (infrastructure, LLM tokens, and overhead) are blended into a single TCO view, broken down per request.
| Cost layer | Monthly ($) | Per request ($) | % of TCO | Status |
|---|---|---|---|---|
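The numeric columns in the table above can be derived from monthly totals and a request count. The sketch below shows one plausible way to do that; the `blend` helper and the figures are made up for illustration, and the Status column is left out because its rules are not described here.

```python
def blend(monthly_costs: dict[str, float], requests: int) -> list[dict]:
    """Return one row per cost layer: monthly cost, per-request cost, share of TCO."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    total = sum(monthly_costs.values())
    rows = []
    for layer, monthly in monthly_costs.items():
        rows.append({
            "layer": layer,
            "monthly": monthly,
            "per_request": monthly / requests,
            # Guard against an all-zero month to avoid dividing by zero.
            "pct_of_tco": 100 * monthly / total if total else 0.0,
        })
    return rows


# Made-up monthly figures and request volume, for illustration only.
rows = blend(
    {"Infrastructure": 600.0, "LLM tokens": 300.0, "Overhead": 100.0},
    requests=100_000,
)
for row in rows:
    print(f"{row['layer']}: ${row['per_request']:.4f} per request, {row['pct_of_tco']:.1f}% of TCO")
```

Summing the `per_request` values across layers gives the blended total cost per request.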