Instances

The Instances section lets you create, view, and manage custom instances that run your models in dedicated environments. It suits users with high-volume workloads who would rather not pay per unit of consumption (tokens, minutes, or characters) and prefer a dedicated instance running a local model. In this setup, you pay only for instance usage, billed at an hourly rate.
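
To make the trade-off concrete, the following is a minimal sketch of a break-even calculation between per-token billing and an hourly instance. The function name and both prices are hypothetical and chosen only for illustration; they are not SipPulse AI prices.

```python
def break_even_tokens_per_hour(instance_cost_per_hour: float,
                               price_per_million_tokens: float) -> float:
    """Tokens per hour at which an hourly instance costs the same as per-token billing."""
    if instance_cost_per_hour < 0 or price_per_million_tokens <= 0:
        raise ValueError("costs must be non-negative and the token price must be positive")
    return instance_cost_per_hour / price_per_million_tokens * 1_000_000


# Hypothetical prices, for illustration only.
tokens = break_even_tokens_per_hour(instance_cost_per_hour=2.0,
                                    price_per_million_tokens=0.5)
print(f"Break-even: {tokens:,.0f} tokens per hour")
```

Under these assumed prices, a workload that stays above the break-even volume is cheaper on a dedicated instance, and one that stays below it is cheaper on per-token billing.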

Key Features

  • Dedicated Environment: Run your models in a dedicated environment with various GPU options offering up to 12 cores.
  • Model Availability: Only open-source models and SipPulse AI models are available for instances. Proprietary models like GPT and Claude are not supported.
  • Instance Management: Start, stop, edit, and delete instances as needed.
  • Privacy: Instances are particularly useful for handling sensitive or private information that you do not want to share with third-party companies like OpenAI.

Instances Page Overview

On the Instances page, you'll find a table listing all your instances. Each entry in the table includes:

  • Name: The name given to the instance.
  • Instance: The type of GPU allocated to the instance.
  • Model: The model running on the instance.
  • Status: The current status of the instance (e.g., running, stopped).
  • Last Usage: The last time the instance was used.
  • Cost: The hourly cost of running the instance.

Instance Actions

  • Playground: Access the playground to test and interact with your instance.
  • Start/Stop: Start or stop the instance as needed.
  • Edit: Modify the instance configuration (GPU and number of cores) when the instance is stopped.
  • Delete: Permanently delete the instance when it is stopped.
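
The rules above can be sketched as a small guard function, which may help if you mirror the page's behavior in your own tooling. This is an illustration only: the function name and the status strings are hypothetical, and the rules for start and stop are assumptions, since the page states explicitly only that edit and delete require a stopped instance.

```python
# Actions the page allows only while the instance is stopped (from the list above).
STOPPED_ONLY = {"edit", "delete"}


def can_perform(action: str, status: str) -> bool:
    """Return True if `action` is allowed for an instance in `status`."""
    if action in STOPPED_ONLY:
        return status == "stopped"
    if action == "start":
        return status == "stopped"  # assumption: only a stopped instance can be started
    if action == "stop":
        return status == "running"  # assumption: only a running instance can be stopped
    raise ValueError(f"unknown action: {action!r}")


# Show which actions would be allowed for a running instance.
for action in ("edit", "delete", "start", "stop"):
    print(action, can_perform(action, "running"))
```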

Creating an Instance

To create an instance, follow these steps:

  1. Select Model: Choose the model you want to use (options include text-generation, text-to-speech, and speech-to-text).
  2. Select GPU: Choose the GPU you want to use. The interface will recommend a GPU and indicate the minimum required GPU. GPUs that do not meet the model's requirements will be disabled.
  3. Set Number of Cores: Adjust the number of cores (up to the maximum supported by the selected GPU).
  4. Configuration Options:
    • Start on Creation: Automatically start the instance as soon as it is created.
    • Inactivity Shutdown: Configure the instance to stop after a specified period of inactivity to save costs.
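
The choices made in these steps can be sketched as a small configuration object with the same checks the interface applies: the GPU must meet the model's minimum, and the core count must not exceed the GPU's maximum. Everything named here is hypothetical, including the GPU names, tiers, core limits, and field names; none of it reflects actual SipPulse AI identifiers.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical GPU catalogue: name -> (tier, max_cores). Not real SipPulse AI values.
GPU_CATALOGUE = {
    "gpu-small": (1, 4),
    "gpu-medium": (2, 8),
    "gpu-large": (3, 12),
}


@dataclass(frozen=True)
class InstanceConfig:
    model: str
    gpu: str
    cores: int
    start_on_creation: bool = True
    inactivity_shutdown_minutes: Optional[int] = None  # None means never shut down


def validate(config: InstanceConfig, min_gpu: str) -> List[str]:
    """Return a list of problems; an empty list means the configuration is acceptable."""
    if config.gpu not in GPU_CATALOGUE:
        return [f"unknown GPU: {config.gpu}"]
    errors: List[str] = []
    tier, max_cores = GPU_CATALOGUE[config.gpu]
    min_tier = GPU_CATALOGUE[min_gpu][0]
    if tier < min_tier:
        errors.append(f"{config.gpu} is below the model's minimum GPU ({min_gpu})")
    if not 1 <= config.cores <= max_cores:
        errors.append(f"cores must be between 1 and {max_cores} for {config.gpu}")
    minutes = config.inactivity_shutdown_minutes
    if minutes is not None and minutes <= 0:
        errors.append("inactivity shutdown must be a positive number of minutes")
    return errors


config = InstanceConfig(model="example-model", gpu="gpu-medium", cores=8,
                        inactivity_shutdown_minutes=30)
print(validate(config, min_gpu="gpu-small"))
```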

Important Notes

  • Editing Instances: You can edit an instance after creating it, but you cannot change its model. The GPU and the number of cores can be changed only while the instance is stopped.
  • Instance Status: After you create an instance, it begins loading, which may take a while. Check the status in the instances table. Once the status is "in service," the instance is available for API integration and for testing in the playground.
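
If you automate the wait for the "in service" status, a polling loop along the following lines is one option. It is a sketch under stated assumptions: `wait_until_in_service` is a hypothetical helper, and `get_status` stands in for whatever call your integration uses to read the instance status, since this page does not describe a status endpoint.

```python
import time
from typing import Callable


def wait_until_in_service(get_status: Callable[[], str],
                          timeout_s: float = 600.0,
                          poll_interval_s: float = 5.0,
                          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `get_status` until it returns "in service" or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "in service":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_s)


# Simulated status source for demonstration; replace with a real status lookup.
statuses = iter(["loading", "loading", "in service"])
ready = wait_until_in_service(lambda: next(statuses), poll_interval_s=0)
print("ready:", ready)
```

Passing the status lookup in as a function keeps the loop independent of any particular client library and makes it easy to test with simulated statuses.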

Example Workflow

  1. Create an Instance: Select a model, choose an appropriate GPU, and configure the instance.
  2. Monitor Status: Track the loading status in the instances table.
  3. Start/Stop: Manage the instance's runtime status as needed.
  4. Integrate and Test: Test the instance in the playground, then integrate it into your application through the API.

With this setup, your models run on dedicated resources sized to your performance needs, and sensitive data is not shared with third-party providers.