Request Tracking and Costs
SipPulse AI provides a complete request tracking system that allows you to monitor costs, performance, and usage of each API call individually. This is especially useful for:
- Cost control by customer or project
- Detailed usage reports
- Cost pass-through to end customers
What is x-request-id?
Every consumption request (TTS, ASR, LLM, etc.) returns an HTTP header called x-request-id. This unique identifier allows you to track the costs and metrics specific to that request.
ID Format
The ID follows the pattern: req_ + 32 alphanumeric characters
Example: req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6
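Before storing an ID or using it in a query, a quick format check can catch truncated or malformed values. The sketch below assumes the pattern stated above (the `req_` prefix followed by exactly 32 ASCII letters or digits). The helper name `is_valid_request_id` is illustrative and not part of the SipPulse AI API.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits, as in the example ID.
REQUEST_ID_PATTERN = re.compile(r"req_[A-Za-z0-9]{32}")


def is_valid_request_id(value):
    """Return True if value matches the documented request ID pattern (hypothetical helper)."""
    return isinstance(value, str) and REQUEST_ID_PATTERN.fullmatch(value) is not None


print(is_valid_request_id("req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6"))
print(is_valid_request_id("req_too_short"))
```

A check like this is only a sanity filter on the format; it does not confirm that the request exists.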
Routes that Return x-request-id
| Route | Method | Response Type | Description |
|---|---|---|---|
| /audio/speech | POST | Streaming | Text-to-Speech (OpenAI compatible) |
| /tts/generate | POST | JSON | Text-to-Speech |
| /asr/transcribe | POST | JSON | Speech-to-Text |
| /audio/transcriptions | POST | JSON | Speech-to-Text (OpenAI compatible) |
| /llms/completion | POST | Streaming | LLM text generation |
| /chat/completions | POST | Streaming | LLM Chat (OpenAI compatible) |
| /text-inteligence | POST | JSON | Text analysis |
| /anonymize | POST | JSON | Text anonymization |
How to Extract x-request-id
const response = await fetch('https://api.sippulse.ai/tts/generate', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer your_token',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'openai-tts-1',
    input: 'Hello, how can I help you?',
    voice: 'alloy'
  })
});

// Extract the request ID from the header
const requestId = response.headers.get('x-request-id');
console.log('Request ID:', requestId);

const data = await response.json();

import requests
response = requests.post(
    'https://api.sippulse.ai/tts/generate',
    headers={
        'Authorization': 'Bearer your_token',
        'Content-Type': 'application/json'
    },
    json={
        'model': 'openai-tts-1',
        'input': 'Hello, how can I help you?',
        'voice': 'alloy'
    }
)

# Extract the request ID from the header
request_id = response.headers.get('x-request-id')
print(f'Request ID: {request_id}')

data = response.json()

curl -i -X POST https://api.sippulse.ai/tts/generate \
  -H "Authorization: Bearer your_token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai-tts-1",
    "input": "Hello, how can I help you?",
    "voice": "alloy"
  }'

# The x-request-id header will appear in the response

Querying Usage Details
With the request_id in hand, you can query the complete cost and performance details:
GET https://api.sippulse.ai/usage-requests/{request_id}

Asynchronous Processing
Cost calculation and billing are processed asynchronously, so the usage details may not be available immediately after you receive the x-request-id (the delay is typically a few milliseconds).
For programmatic access, add a short delay or retry logic before treating missing data as an error.
Retry Implementation Example
When querying usage details programmatically, it's recommended to implement a retry mechanism:
async function getUsageWithRetry(requestId, maxRetries = 3, delayMs = 100) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(
        `https://api.sippulse.ai/usage-requests/${requestId}`,
        { headers: { 'Authorization': 'Bearer your_token' } }
      );
      if (response.ok) {
        const data = await response.json();
        // Check if cost data is available (neither undefined nor null)
        if (data.total_price_local != null) {
          return data;
        }
      }
    } catch (error) {
      // Network or parsing error: rethrow only on the final attempt
      if (attempt === maxRetries) throw error;
    }
    // Wait before retrying (linear backoff: delayMs, 2 * delayMs, ...)
    if (attempt < maxRetries) {
      await new Promise(resolve => setTimeout(resolve, delayMs * attempt));
    }
  }
  throw new Error(`Failed to get usage after ${maxRetries} attempts`);
}

// Usage
const usage = await getUsageWithRetry('req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6');
console.log('Cost:', usage.total_price_local);

import time
import requests

def get_usage_with_retry(request_id, max_retries=3, delay_ms=100):
    for attempt in range(1, max_retries + 1):
        try:
            response = requests.get(
                f'https://api.sippulse.ai/usage-requests/{request_id}',
                headers={'Authorization': 'Bearer your_token'}
            )
            if response.ok:
                data = response.json()
                # Check if cost data is available
                if data.get('total_price_local') is not None:
                    return data
        except requests.RequestException:
            # Network or decoding error: re-raise only on the final attempt
            if attempt == max_retries:
                raise
        # Wait before retrying (linear backoff: delay_ms, 2 * delay_ms, ...)
        if attempt < max_retries:
            time.sleep((delay_ms * attempt) / 1000)
    raise RuntimeError(f'Failed to get usage after {max_retries} attempts')

# Usage
usage = get_usage_with_retry('req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6')
print(f"Cost: {usage['total_price_local']}")

Request Example
curl -X GET https://api.sippulse.ai/usage-requests/req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6 \
  -H "Authorization: Bearer your_token"

Response Example
{
  "id": "req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",
  "organization_id": "org_xxxx",
  "user_id": "usr_yyyy",
  "agent_id": "agt_zzzz",
  "thread_id": "thr_wwww",
  "project_id": null,
  "auth_mode": "jwt",
  "created_at": "2026-01-07T10:30:45.000Z",
  "event_timestamp": "2026-01-07T10:30:45.000Z",
  "updated_at": "2026-01-07T10:30:47.000Z",
  "execution_time_ms": 2350,
  "processing_time_ms": 50,
  "total_price_local": 0.0025,
  "speed": 10.21,
  "items": [
    {
      "type": "model",
      "subtype": "text-to-speech",
      "identifier": "openai-tts-1",
      "pricing_rule": "tts_char",
      "amount": 24,
      "unit_price": 0.0001,
      "total_price": 0.0024,
      "performance": {
        "execution_time_ms": 2350,
        "throughput": 10.21,
        "is_cached": false
      }
    }
  ]
}

Response Structure
Main Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique request ID |
| organization_id | string | Organization ID |
| user_id | string | ID of the user who made the request |
| agent_id | string | Agent ID (if applicable) |
| thread_id | string | Conversation thread ID (if applicable) |
| auth_mode | string | Authentication type: jwt, api_key, guest_token |
| execution_time_ms | number | Execution time in milliseconds |
| processing_time_ms | number | Processing/setup time in milliseconds |
| total_price_local | number | Total cost in local currency |
| speed | number | Throughput (tokens/s, characters/s, etc.) |
Item Fields
Each item represents a cost component of the request:
| Field | Type | Description |
|---|---|---|
| type | string | Resource type: model, feature, etc. |
| subtype | string | Subtype: text-to-speech, speech-to-text, llm, etc. |
| identifier | string | Identifier of the model/resource used |
| pricing_rule | string | Applied pricing rule |
| amount | number | Amount consumed |
| unit_price | number | Price per unit |
| total_price | number | Total item price |
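To illustrate how the item fields fit together, the following sketch groups item costs by `subtype` and flags items whose `total_price` differs from `amount * unit_price`. That relationship holds in the sample response, but it is an assumption here and may not apply to every pricing rule. The function name `summarize_items` and the trimmed `sample` payload are illustrative only.

```python
import math
from collections import defaultdict


def summarize_items(usage):
    """Sum item costs per subtype and list identifiers whose totals look inconsistent."""
    totals = defaultdict(float)
    mismatches = []
    for item in usage.get("items", []):
        totals[item["subtype"]] += item["total_price"]
        # Assumption: total_price == amount * unit_price (true for the sample response).
        expected = item["amount"] * item["unit_price"]
        # Prices are floats, so compare with a tolerance instead of exact equality.
        if not math.isclose(expected, item["total_price"], rel_tol=1e-9, abs_tol=1e-12):
            mismatches.append(item["identifier"])
    return dict(totals), mismatches


# Trimmed copy of the response example, keeping only the fields used here.
sample = {
    "id": "req_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",
    "items": [
        {
            "subtype": "text-to-speech",
            "identifier": "openai-tts-1",
            "amount": 24,
            "unit_price": 0.0001,
            "total_price": 0.0024,
        }
    ],
}

totals, mismatches = summarize_items(sample)
print(totals, mismatches)
```

A per-subtype breakdown like this can feed a customer invoice directly, while a non-empty mismatch list is a prompt to inspect the `pricing_rule` of the flagged items.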
Use Cases
1. Cost Pass-Through to Customers
Store the request_id along with your customer's identifier:
async function processRequest(clientId, text) {
  const response = await ttsGenerate(text);
  const requestId = response.headers.get('x-request-id');

  // Save to database
  await db.usageLog.create({
    client_id: clientId,
    request_id: requestId,
    timestamp: new Date()
  });

  return response;
}

// Later, to generate a cost report:
async function generateClientReport(clientId) {
  const logs = await db.usageLog.findByClientId(clientId);
  let totalCost = 0;

  for (const log of logs) {
    // fetch() resolves to a Response, so parse the JSON body before reading fields
    const res = await fetch(
      `https://api.sippulse.ai/usage-requests/${log.request_id}`,
      { headers: { 'Authorization': 'Bearer your_token' } }
    );
    const usage = await res.json();
    totalCost += usage.total_price_local;
  }

  return totalCost;
}

2. Performance Monitoring
Track performance metrics:
const response = await asrTranscribe(audioFile);
const requestId = response.headers.get('x-request-id');
const usage = await getUsageRequest(requestId);

const metrics = {
  executionTime: usage.execution_time_ms,
  throughput: usage.speed,
  cost: usage.total_price_local
};

// Send to monitoring system
monitoring.track('asr_performance', metrics);

Best Practices
- Always store the request_id for important requests you may need to audit
- Don't expose request_ids to end users, as they allow access to cost information
- Implement rate limiting based on the costs you query, for example by throttling a customer whose spending exceeds a budget
- Set up alerts for requests with higher than expected costs
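As one possible way to act on the last practice, the sketch below scans usage records that were already fetched from `/usage-requests/{request_id}` and returns those whose `total_price_local` exceeds a threshold. The function name `find_costly_requests`, the threshold value, and the sample records are assumptions made for illustration; how the alert is delivered is left to your monitoring system.

```python
def find_costly_requests(usages, threshold):
    """Return (request_id, cost) pairs whose total_price_local exceeds threshold."""
    flagged = []
    for usage in usages:
        cost = usage.get("total_price_local")
        # Skip records whose cost is not calculated yet (billing is asynchronous).
        if cost is not None and cost > threshold:
            flagged.append((usage["id"], cost))
    return flagged


# Hypothetical records in the shape of the response example.
records = [
    {"id": "req_a", "total_price_local": 0.0025},
    {"id": "req_b", "total_price_local": 0.05},
    {"id": "req_c", "total_price_local": None},
]

for request_id, cost in find_costly_requests(records, threshold=0.01):
    print(f"High-cost request {request_id}: {cost}")
```

Records without a cost are skipped rather than treated as zero, so a request that is still being billed can be checked again on a later pass.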
Next Steps
- REST API - Complete API documentation
- Usage Dashboard - Aggregated cost visualization
- Credits and Billing - Credit management
