Request Tracking and Costs

SipPulse AI provides a complete request tracking system that allows you to monitor costs, performance, and usage of each API call individually. This is especially useful for:

Cost control by customer or project
Detailed usage reports
Cost pass-through to end customers

What is x-request-id?

Every consumption request (TTS, ASR, LLM, etc.) returns an HTTP header called x-request-id. This unique identifier allows you to track the costs and metrics specific to that request.

ID Format

The ID follows the pattern: req_ + 32 alphanumeric characters

Example: req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6
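
The format above can be checked before an ID is stored or used in a query. This is a minimal sketch, assuming "alphanumeric" means ASCII letters and digits; the helper name is_valid_request_id is illustrative and not part of the SipPulse AI API.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
REQUEST_ID_PATTERN = re.compile(r"req_[A-Za-z0-9]{32}")


def is_valid_request_id(value: str) -> bool:
    """Return True if value looks like an x-request-id (req_ + 32 characters)."""
    return REQUEST_ID_PATTERN.fullmatch(value) is not None


print(is_valid_request_id("req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6"))  # True
print(is_valid_request_id("req_too_short"))  # False
```

A check like this only confirms the shape of the value; it does not tell you whether the request exists.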

Routes that Return x-request-id

Route	Method	Response Type	Description
`/audio/speech`	POST	Streaming	Text-to-Speech (OpenAI compatible)
`/tts/generate`	POST	JSON	Text-to-Speech
`/asr/transcribe`	POST	JSON	Speech-to-Text
`/audio/transcriptions`	POST	JSON	Speech-to-Text (OpenAI compatible)
`/llms/completion`	POST	Streaming	LLM text generation
`/chat/completions`	POST	Streaming	LLM Chat (OpenAI compatible)
`/text-inteligence`	POST	JSON	Text analysis
`/anonymize`	POST	JSON	Text anonymization

How to Extract x-request-id


javascript

const response = await fetch('https://api.sippulse.ai/tts/generate', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer your_token',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'openai-tts-1',
    input: 'Hello, how can I help you?',
    voice: 'alloy'
  })
});

// Extract the request ID from the header
const requestId = response.headers.get('x-request-id');
console.log('Request ID:', requestId);

const data = await response.json();

python

import requests

response = requests.post(
    'https://api.sippulse.ai/tts/generate',
    headers={
        'Authorization': 'Bearer your_token',
        'Content-Type': 'application/json'
    },
    json={
        'model': 'openai-tts-1',
        'input': 'Hello, how can I help you?',
        'voice': 'alloy'
    }
)

# Extract the request ID from the header
request_id = response.headers.get('x-request-id')
print(f'Request ID: {request_id}')

data = response.json()

bash

curl -i -X POST https://api.sippulse.ai/tts/generate \
  -H "Authorization: Bearer your_token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai-tts-1",
    "input": "Hello, how can I help you?",
    "voice": "alloy"
  }'

# The -i flag prints the response headers, including x-request-id

Querying Usage Details

With the request_id in hand, you can query the complete cost and performance details:

GET https://api.sippulse.ai/usage-requests/{request_id}

Asynchronous Processing

Cost calculation and billing are processed asynchronously, so the usage details may not be available immediately after you receive the x-request-id (the delay is typically a few milliseconds).

For programmatic access, consider implementing a small delay or retry logic to ensure you retrieve the data correctly.

Retry Implementation Example

When querying usage details programmatically, it's recommended to implement a retry mechanism:


javascript

async function getUsageWithRetry(requestId, maxRetries = 3, delayMs = 100) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(
        `https://api.sippulse.ai/usage-requests/${requestId}`,
        { headers: { 'Authorization': 'Bearer your_token' } }
      );

      if (response.ok) {
        const data = await response.json();
        // Cost data is present once asynchronous billing has finished
        if (data.total_price_local != null) {
          return data;
        }
      }
    } catch (error) {
      // Network error: rethrow only on the final attempt
      if (attempt === maxRetries) throw error;
    }

    // Wait before retrying; the delay grows linearly (delayMs, 2 * delayMs, ...)
    if (attempt < maxRetries) {
      await new Promise(resolve => setTimeout(resolve, delayMs * attempt));
    }
  }

  throw new Error(`Failed to get usage after ${maxRetries} attempts`);
}

// Usage
const usage = await getUsageWithRetry('req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6');
console.log('Cost:', usage.total_price_local);

python

import time
import requests

def get_usage_with_retry(request_id, max_retries=3, delay_ms=100):
    for attempt in range(1, max_retries + 1):
        try:
            response = requests.get(
                f'https://api.sippulse.ai/usage-requests/{request_id}',
                headers={'Authorization': 'Bearer your_token'}
            )

            if response.ok:
                data = response.json()
                # Cost data is present once asynchronous billing has finished
                if data.get('total_price_local') is not None:
                    return data

        except Exception:
            # Request failed: re-raise only on the final attempt
            if attempt == max_retries:
                raise

        # Wait before retrying; the delay grows linearly (delay_ms, 2 * delay_ms, ...)
        if attempt < max_retries:
            time.sleep((delay_ms * attempt) / 1000)

    raise RuntimeError(f'Failed to get usage after {max_retries} attempts')

# Usage
usage = get_usage_with_retry('req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6')
print(f"Cost: {usage['total_price_local']}")

Request Example

bash

curl -X GET https://api.sippulse.ai/usage-requests/req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6 \
  -H "Authorization: Bearer your_token"

Response Example

json

{
  "id": "req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",
  "organization_id": "org_xxxx",
  "user_id": "usr_yyyy",
  "agent_id": "agt_zzzz",
  "thread_id": "thr_wwww",
  "project_id": null,
  "auth_mode": "jwt",
  "created_at": "2026-01-07T10:30:45.000Z",
  "event_timestamp": "2026-01-07T10:30:45.000Z",
  "updated_at": "2026-01-07T10:30:47.000Z",
  "execution_time_ms": 2350,
  "processing_time_ms": 50,
  "total_price_local": 0.0025,
  "speed": 10.21,
  "items": [
    {
      "type": "model",
      "subtype": "text-to-speech",
      "identifier": "openai-tts-1",
      "pricing_rule": "tts_char",
      "amount": 24,
      "unit_price": 0.0001,
      "total_price": 0.0024,
      "performance": {
        "execution_time_ms": 2350,
        "throughput": 10.21,
        "is_cached": false
      }
    }
  ]
}

Response Structure

Main Fields

Field	Type	Description
`id`	string	Unique request ID
`organization_id`	string	Organization ID
`user_id`	string	ID of the user who made the request
`agent_id`	string	Agent ID (if applicable)
`thread_id`	string	Conversation thread ID (if applicable)
`auth_mode`	string	Authentication type: `jwt`, `api_key`, `guest_token`
`execution_time_ms`	number	Execution time in milliseconds
`processing_time_ms`	number	Processing/setup time in milliseconds
`total_price_local`	number	Total cost in local currency
`speed`	number	Throughput (tokens/s, characters/s, etc.)

Item Fields

Each item represents a cost component of the request:

Field	Type	Description
`type`	string	Resource type: `model`, `feature`, etc.
`subtype`	string	Subtype: `text-to-speech`, `speech-to-text`, `llm`, etc.
`identifier`	string	Identifier of the model/resource used
`pricing_rule`	string	Applied pricing rule
`amount`	number	Amount consumed
`unit_price`	number	Price per unit
`total_price`	number	Total item price
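
To illustrate how the item fields fit together, here is a minimal sketch that groups the cost of a usage record by subtype. The record is a trimmed copy of the response example above; the function name cost_by_subtype is hypothetical.

```python
from collections import defaultdict

# Trimmed copy of the response example above.
usage = {
    "id": "req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",
    "total_price_local": 0.0025,
    "items": [
        {
            "type": "model",
            "subtype": "text-to-speech",
            "identifier": "openai-tts-1",
            "amount": 24,
            "unit_price": 0.0001,
            "total_price": 0.0024,
        }
    ],
}


def cost_by_subtype(record: dict) -> dict:
    """Sum the total_price of each item, grouped by its subtype."""
    totals = defaultdict(float)
    for item in record.get("items", []):
        totals[item["subtype"]] += item["total_price"]
    return dict(totals)


print(cost_by_subtype(usage))  # {'text-to-speech': 0.0024}
```

Grouping by subtype is one possible breakdown; the same loop works with identifier or pricing_rule as the key.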

Use Cases

1. Cost Pass-Through to Customers

Store the request_id along with your customer's identifier:

javascript

async function processRequest(clientId, text) {
  // ttsGenerate is your own wrapper around the TTS call; it returns the fetch Response
  const response = await ttsGenerate(text);
  const requestId = response.headers.get('x-request-id');

  // Save to database
  await db.usageLog.create({
    client_id: clientId,
    request_id: requestId,
    timestamp: new Date()
  });

  return response;
}

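// Later, to generate a cost report:
async function generateClientReport(clientId) {
  const logs = await db.usageLog.findByClientId(clientId);

  let totalCost = 0;
  for (const log of logs) {
    const response = await fetch(
      `https://api.sippulse.ai/usage-requests/${log.request_id}`,
      { headers: { 'Authorization': 'Bearer your_token' } }
    );
    // Parse the JSON body before reading the cost field
    const usage = await response.json();
    totalCost += usage.total_price_local;
  }

  return totalCost;
}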
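
The same report can be sketched in Python with the API call injected as a function, which keeps the aggregation logic testable without network access. The names client_total_cost and fetch_usage are hypothetical, and the sample data is made up for illustration.

```python
from typing import Callable, Iterable


def client_total_cost(
    request_ids: Iterable[str],
    fetch_usage: Callable[[str], dict],
) -> float:
    """Sum total_price_local over the stored request IDs of one customer.

    fetch_usage is a hypothetical callable that returns the parsed
    /usage-requests/{request_id} response for a given ID.
    """
    total = 0.0
    for request_id in request_ids:
        usage = fetch_usage(request_id)
        # Records that are still being billed may have no price yet; count them as 0.
        total += usage.get("total_price_local") or 0.0
    return total


# Stand-in data instead of real API calls (values are made up for illustration).
fake_usage = {
    "req_1": {"total_price_local": 0.0025},
    "req_2": {"total_price_local": 0.0100},
    "req_3": {},  # cost not calculated yet
}

print(client_total_cost(fake_usage.keys(), fake_usage.__getitem__))
```

Counting unpriced records as zero is a simplification. For billing purposes you may prefer to collect those IDs and query them again later.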

2. Performance Monitoring

Track performance metrics:

javascript

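// asrTranscribe is your own wrapper around the /asr/transcribe call
const response = await asrTranscribe(audioFile);
const requestId = response.headers.get('x-request-id');

// Fetch usage with retries, since cost data is written asynchronously
const usage = await getUsageWithRetry(requestId);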
const metrics = {
  executionTime: usage.execution_time_ms,
  throughput: usage.speed,
  cost: usage.total_price_local
};

// Send to monitoring system
monitoring.track('asr_performance', metrics);

Best Practices

Always store the request_id for important requests you may need to audit
Don't expose request_ids to end users, as they allow access to cost information
Implement rate limiting based on queried costs
Set up alerts for requests with higher than expected costs
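
The last practice above can be sketched as a simple threshold check over usage records you have already fetched. The function name, the threshold, and the sample records are assumptions for illustration, not part of the API.

```python
def find_costly_requests(records, threshold):
    """Return the IDs of usage records whose total_price_local exceeds threshold."""
    return [
        record["id"]
        for record in records
        if (record.get("total_price_local") or 0.0) > threshold
    ]


# Made-up records in the shape of the usage response (only the fields used here).
records = [
    {"id": "req_a", "total_price_local": 0.0025},
    {"id": "req_b", "total_price_local": 0.75},
    {"id": "req_c", "total_price_local": None},
]

print(find_costly_requests(records, threshold=0.5))  # ['req_b']
```

The returned IDs can then be forwarded to whatever alerting channel you already use.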

Next Steps

REST API - Complete API documentation
Usage Dashboard - Aggregated cost visualization
Credits and Billing - Credit management
