
Request Tracking and Costs

SipPulse AI provides a complete request tracking system that allows you to monitor costs, performance, and usage of each API call individually. This is especially useful for:

  • Cost control by customer or project
  • Detailed usage reports
  • Cost pass-through to end customers

What is x-request-id?

Every consumption request (TTS, ASR, LLM, etc.) returns an HTTP header called x-request-id. This unique identifier allows you to track the costs and metrics specific to that request.

ID Format

The ID follows the pattern: req_ + 32 alphanumeric characters

Example: req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6
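
The format above can be checked before an ID is stored or used in a lookup. The sketch below is a minimal validator. It assumes that "alphanumeric" means ASCII letters in either case plus digits, which the format description does not spell out, and `is_valid_request_id` is a hypothetical helper name.

```python
import re

# Assumption: "alphanumeric" covers ASCII letters (either case) and digits.
REQUEST_ID_PATTERN = re.compile(r"req_[A-Za-z0-9]{32}")


def is_valid_request_id(value: str) -> bool:
    """Return True if value is 'req_' followed by exactly 32 alphanumeric characters."""
    return REQUEST_ID_PATTERN.fullmatch(value) is not None


print(is_valid_request_id("req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6"))
print(is_valid_request_id("req_too_short"))
```

Using `fullmatch` rather than `match` rejects IDs with trailing characters, such as a stray newline copied from a log.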


Routes that Return x-request-id

| Route | Method | Response Type | Description |
| --- | --- | --- | --- |
| /audio/speech | POST | Streaming | Text-to-Speech (OpenAI compatible) |
| /tts/generate | POST | JSON | Text-to-Speech |
| /asr/transcribe | POST | JSON | Speech-to-Text |
| /audio/transcriptions | POST | JSON | Speech-to-Text (OpenAI compatible) |
| /llms/completion | POST | Streaming | LLM text generation |
| /chat/completions | POST | Streaming | LLM Chat (OpenAI compatible) |
| /text-inteligence | POST | JSON | Text analysis |
| /anonymize | POST | JSON | Text anonymization |
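
For the streaming routes, HTTP response headers arrive before the body, so the ID can be captured before the stream is consumed. The sketch below uses only the Python standard library. The request payload is copied from the /tts/generate example and is an assumption for /audio/speech, and `request_id_from_headers` and `stream_speech` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Mapping, Optional


def request_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Look up x-request-id case-insensitively (HTTP header names ignore case)."""
    for name, value in headers.items():
        if name.lower() == "x-request-id":
            return value
    return None


def stream_speech(token: str, text: str, out_path: str) -> Optional[str]:
    """Stream audio to a file and return the request ID. Defined here, not called."""
    request = urllib.request.Request(
        "https://api.sippulse.ai/audio/speech",
        data=json.dumps(
            # Assumed payload, copied from the /tts/generate example.
            {"model": "openai-tts-1", "input": text, "voice": "alloy"}
        ).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # Headers are available as soon as the response starts.
        request_id = request_id_from_headers(dict(response.headers.items()))
        with open(out_path, "wb") as audio_file:
            # Read the body in chunks instead of loading it all into memory.
            while chunk := response.read(8192):
                audio_file.write(chunk)
    return request_id
```

Capturing the ID first means it is still recorded if the stream is interrupted partway through.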

How to Extract x-request-id

javascript
const response = await fetch('https://api.sippulse.ai/tts/generate', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer your_token',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'openai-tts-1',
    input: 'Hello, how can I help you?',
    voice: 'alloy'
  })
});

// Extract the request ID from the header
const requestId = response.headers.get('x-request-id');
console.log('Request ID:', requestId);

const data = await response.json();

python
import requests

response = requests.post(
    'https://api.sippulse.ai/tts/generate',
    headers={
        'Authorization': 'Bearer your_token',
        'Content-Type': 'application/json'
    },
    json={
        'model': 'openai-tts-1',
        'input': 'Hello, how can I help you?',
        'voice': 'alloy'
    }
)

# Extract the request ID from the header
request_id = response.headers.get('x-request-id')
print(f'Request ID: {request_id}')

data = response.json()

bash
curl -i -X POST https://api.sippulse.ai/tts/generate \
  -H "Authorization: Bearer your_token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai-tts-1",
    "input": "Hello, how can I help you?",
    "voice": "alloy"
  }'

# The -i flag prints the response headers, including x-request-id

Querying Usage Details

With the request_id in hand, you can query the complete cost and performance details:

GET https://api.sippulse.ai/usage-requests/{request_id}

Asynchronous Processing

Cost calculation and billing are processed asynchronously. Immediately after you receive the x-request-id, the usage details may not be available yet; the delay is typically a few milliseconds.

For programmatic access, consider implementing a small delay or retry logic to ensure you retrieve the data correctly.

Retry Implementation Example

When querying usage details programmatically, it's recommended to implement a retry mechanism:

javascript
async function getUsageWithRetry(requestId, maxRetries = 3, delayMs = 100) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(
        `https://api.sippulse.ai/usage-requests/${requestId}`,
        { headers: { 'Authorization': 'Bearer your_token' } }
      );

      if (response.ok) {
        const data = await response.json();
        // Check if cost data is available
        if (data.total_price_local !== undefined) {
          return data;
        }
      }

    } catch (error) {
      // Re-throw network errors only on the final attempt
      if (attempt === maxRetries) throw error;
    }

    // Wait before retrying (linear backoff: delayMs, 2 * delayMs, ...)
    if (attempt < maxRetries) {
      await new Promise(resolve => setTimeout(resolve, delayMs * attempt));
    }
  }

  throw new Error(`Failed to get usage after ${maxRetries} attempts`);
}

// Usage
const usage = await getUsageWithRetry('req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6');
console.log('Cost:', usage.total_price_local);

python
import time
import requests

def get_usage_with_retry(request_id, max_retries=3, delay_ms=100):
    for attempt in range(1, max_retries + 1):
        try:
            response = requests.get(
                f'https://api.sippulse.ai/usage-requests/{request_id}',
                headers={'Authorization': 'Bearer your_token'}
            )

            if response.ok:
                data = response.json()
                # Check if cost data is available
                if data.get('total_price_local') is not None:
                    return data

        except Exception:
            # Re-raise errors only on the final attempt
            if attempt == max_retries:
                raise

        # Wait before retrying (linear backoff: delay_ms, 2 * delay_ms, ...)
        if attempt < max_retries:
            time.sleep((delay_ms * attempt) / 1000)

    raise Exception(f'Failed to get usage after {max_retries} attempts')

# Usage
usage = get_usage_with_retry('req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6')
print(f"Cost: {usage['total_price_local']}")

Request Example

bash
curl -X GET https://api.sippulse.ai/usage-requests/req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6 \
  -H "Authorization: Bearer your_token"

Response Example

json
{
  "id": "req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",
  "organization_id": "org_xxxx",
  "user_id": "usr_yyyy",
  "agent_id": "agt_zzzz",
  "thread_id": "thr_wwww",
  "project_id": null,
  "auth_mode": "jwt",
  "created_at": "2026-01-07T10:30:45.000Z",
  "event_timestamp": "2026-01-07T10:30:45.000Z",
  "updated_at": "2026-01-07T10:30:47.000Z",
  "execution_time_ms": 2350,
  "processing_time_ms": 50,
  "total_price_local": 0.0025,
  "speed": 10.21,
  "items": [
    {
      "type": "model",
      "subtype": "text-to-speech",
      "identifier": "openai-tts-1",
      "pricing_rule": "tts_char",
      "amount": 24,
      "unit_price": 0.0001,
      "total_price": 0.0024,
      "performance": {
        "execution_time_ms": 2350,
        "throughput": 10.21,
        "is_cached": false
      }
    }
  ]
}

Response Structure

Main Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique request ID |
| organization_id | string | Organization ID |
| user_id | string | ID of the user who made the request |
| agent_id | string | Agent ID (if applicable) |
| thread_id | string | Conversation thread ID (if applicable) |
| auth_mode | string | Authentication type: jwt, api_key, guest_token |
| execution_time_ms | number | Execution time in milliseconds |
| processing_time_ms | number | Processing/setup time in milliseconds |
| total_price_local | number | Total cost in local currency |
| speed | number | Throughput (tokens/s, characters/s, etc.) |

Item Fields

Each item represents a cost component of the request:

| Field | Type | Description |
| --- | --- | --- |
| type | string | Resource type: model, feature, etc. |
| subtype | string | Subtype: text-to-speech, speech-to-text, llm, etc. |
| identifier | string | Identifier of the model/resource used |
| pricing_rule | string | Applied pricing rule |
| amount | number | Amount consumed |
| unit_price | number | Price per unit |
| total_price | number | Total item price |
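
To make the field descriptions concrete, the sketch below loads a trimmed copy of the response example into typed records and recomputes each item's cost as amount × unit_price. The class and function names are hypothetical, and the relation total_price = amount × unit_price is inferred from the example values rather than stated in the field descriptions.

```python
import math
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class UsageItem:
    type: str
    subtype: str
    identifier: str
    pricing_rule: str
    amount: float
    unit_price: float
    total_price: float


@dataclass
class UsageRecord:
    id: str
    total_price_local: float
    execution_time_ms: int
    items: List[UsageItem]


def parse_usage(payload: Dict[str, Any]) -> UsageRecord:
    """Build a UsageRecord from the JSON body of GET /usage-requests/{request_id}."""
    items = [
        UsageItem(
            type=raw["type"],
            subtype=raw["subtype"],
            identifier=raw["identifier"],
            pricing_rule=raw["pricing_rule"],
            amount=raw["amount"],
            unit_price=raw["unit_price"],
            total_price=raw["total_price"],
        )
        for raw in payload.get("items", [])
    ]
    return UsageRecord(
        id=payload["id"],
        total_price_local=payload["total_price_local"],
        execution_time_ms=payload["execution_time_ms"],
        items=items,
    )


# Trimmed copy of the response example.
sample = {
    "id": "req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",
    "execution_time_ms": 2350,
    "total_price_local": 0.0025,
    "items": [
        {
            "type": "model",
            "subtype": "text-to-speech",
            "identifier": "openai-tts-1",
            "pricing_rule": "tts_char",
            "amount": 24,
            "unit_price": 0.0001,
            "total_price": 0.0024,
        }
    ],
}

record = parse_usage(sample)
for item in record.items:
    # Compare with a tolerance, since these are floating-point amounts.
    recomputed = item.amount * item.unit_price
    print(item.identifier, math.isclose(recomputed, item.total_price, rel_tol=1e-9))
```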

Use Cases

1. Cost Pass-Through to Customers

Store the request_id along with your customer's identifier:

javascript
async function processRequest(clientId, text) {
  // ttsGenerate is your own wrapper around the /tts/generate call
  const response = await ttsGenerate(text);
  const requestId = response.headers.get('x-request-id');

  // Save to database
  await db.usageLog.create({
    client_id: clientId,
    request_id: requestId,
    timestamp: new Date()
  });

  return response;
}

// Later, to generate cost report:
async function generateClientReport(clientId) {
  const logs = await db.usageLog.findByClientId(clientId);

  let totalCost = 0;
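  for (const log of logs) {
    // Query the usage endpoint with the full URL and auth header
    const response = await fetch(
      `https://api.sippulse.ai/usage-requests/${log.request_id}`,
      { headers: { 'Authorization': 'Bearer your_token' } }
    );
    // fetch resolves to a Response; parse the JSON body before reading fields
    const usage = await response.json();
    totalCost += usage.total_price_local;
  }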

  return totalCost;
}
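
Once usage details have been fetched, summing them per customer is a small aggregation step. The sketch below assumes the stored logs have already been joined with their usage responses into (client_id, usage) pairs. It uses `Decimal` to avoid accumulating floating-point error on small amounts, and `total_cost_by_client` is a hypothetical helper, not part of the SipPulse API.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Any, Dict, Iterable, Tuple


def total_cost_by_client(
    rows: Iterable[Tuple[str, Dict[str, Any]]]
) -> Dict[str, Decimal]:
    """Sum total_price_local per client, skipping usage that has no cost yet."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for client_id, usage in rows:
        price = usage.get("total_price_local")
        if price is None:
            # Cost data is written asynchronously and may not be ready.
            continue
        # Convert through str so the Decimal keeps the printed value.
        totals[client_id] += Decimal(str(price))
    return dict(totals)


rows = [
    ("client_a", {"total_price_local": 0.0025}),
    ("client_a", {"total_price_local": 0.0024}),
    ("client_b", {"total_price_local": None}),
]
print(total_cost_by_client(rows))
```

Clients whose requests have no cost data yet are left out of the result, so a report can tell "no usage" apart from "usage still pending" by checking which request IDs were skipped.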

2. Performance Monitoring

Track performance metrics:

javascript
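// asrTranscribe is your own wrapper around the /asr/transcribe call
const response = await asrTranscribe(audioFile);
const requestId = response.headers.get('x-request-id');

// Reuse the retry helper from the retry example, since usage data is written asynchronously
const usage = await getUsageWithRetry(requestId);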
const metrics = {
  executionTime: usage.execution_time_ms,
  throughput: usage.speed,
  cost: usage.total_price_local
};

// Send to monitoring system
monitoring.track('asr_performance', metrics);
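
Individual metrics are often easier to act on once they are aggregated. The sketch below summarizes execution times across several usage responses before they are sent to a monitoring system. `summarize_performance` is a hypothetical helper, and the choice of count, mean, and maximum is an assumption about which figures are useful.

```python
from statistics import mean
from typing import Any, Dict, List


def summarize_performance(usages: List[Dict[str, Any]]) -> Dict[str, float]:
    """Aggregate execution_time_ms over usage responses that include it."""
    times = [u["execution_time_ms"] for u in usages if "execution_time_ms" in u]
    if not times:
        return {"count": 0}
    return {"count": len(times), "mean_ms": mean(times), "max_ms": max(times)}


summary = summarize_performance(
    [{"execution_time_ms": 2350}, {"execution_time_ms": 1650}, {}]
)
print(summary)
```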

Best Practices

  1. Always store the request_id for important requests you may need to audit
  2. Don't expose request_ids to end users, as they allow access to cost information
  3. Implement rate limiting based on queried costs
  4. Set up alerts for requests with higher than expected costs
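
The alerting practice in item 4 can be sketched as a simple threshold check over fetched usage responses. The threshold value and the `find_cost_alerts` name are assumptions; a real deployment would likely pick limits per route or per customer.

```python
from typing import Any, Dict, List


def find_cost_alerts(usages: List[Dict[str, Any]], threshold: float) -> List[str]:
    """Return the IDs of requests whose total cost exceeds the threshold."""
    flagged = []
    for usage in usages:
        price = usage.get("total_price_local")
        # Skip entries whose cost has not been calculated yet.
        if price is not None and price > threshold:
            flagged.append(usage["id"])
    return flagged


usages = [
    {"id": "req_1", "total_price_local": 0.0025},
    {"id": "req_2", "total_price_local": 0.75},
    {"id": "req_3", "total_price_local": None},
]
print(find_cost_alerts(usages, threshold=0.10))
```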
