Colossyan API

Welcome

Colossyan's API offers a programmable video creation service. It allows you to generate videos at scale. To facilitate this, we provide a REST API using standard JSON as request and response payload format. This way you can integrate our service into any technology stack.

Features

💬 Over 600 voices
🌐 Over 70 languages
👄 Use any actor of your choice
👤 Create your own avatars through the API, and use them at your will
Generate template videos at a massive scale, easily
📹 Everything that you can generate in the Colossyan Studio you can also generate via our API
- 🖼️ Embedded images
- Embedded videos
- Embedded audios
- Embedded texts
Get notified immediately via a callback mechanism
Thanks to our web-based templating and payload exporting options, it's easy to get up and running.

Jump right in

Getting Started

Quickstart

Follow the next steps to generate your first video using Colossyan's API.

Grab your token as described in the Authentication page.
Send your first request

const token = "<paste-your-token-here>";
const api = "https://app.colossyan.com/api/v1";

const job = {
  videoCreative: {
    settings: {
      name: "My first video",
      videoSize: {
        width: 1920,
        height: 1080
      }
    },
    scenes: [
      {
        tracks: [
          {
            type: "actor",
            position: { x: 420, y: 0 },
            size: { width: 1080, height: 1080 },
            actor: "karen",
            text: "This is my first video generated with Colossyan! Amazing!",
            speakerId: "aquXcfLbkxpW4BBI5qKm"
          }
        ]
      }
    ]
  }
};

const response = await fetch(`${api}/video-generation-jobs`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify(job),
});

const result = await response.json();
console.log(result); // { id, videoId }

In the snippet above, we post a new video generation job. Let's walk through what you should expect the generated video to look like.

The video's name will be My first video.
It will have a width of 1920px and a height of 1080px.
It will consist of a single scene.
We added an actor track to it, with the actor being karen.
- We configured the actor to say This is my first video generated with Colossyan! Amazing! using the voice aquXcfLbkxpW4BBI5qKm.
- The actor is set to be 1080px * 1080px in size, and its top-left corner will be at the (420, 0) coordinates, counting from the top-left of the video. This places the actor in the horizontal middle of the video.

If you successfully sent the request, you should have received two things in return.
1. An id that you can use to delete or request status updates on the job
2. A videoId that you'll be able to query the generated video with once it's ready.
Navigate to the workspace where you created your API key. You should see the video being generated. Once it's ready, it should match the description above.

Congratulations! 🎉 You just generated your first video using our API!

Generating the payloads for complex videos (multiple scenes, images, videos, etc.) can get very complicated, very quickly. To get ahead of this, we provide a way to export the request body from the Colossyan video editor.

This way you can use our UI to craft your video templates easily, and then customize them at the last mile.

Extracting request bodies from the web-editor

Constructing request bodies for complex video generation jobs can become very complicated very quickly. To get ahead of this we provide a way to use our web-based video-generation tool (Colossyan Studio) to export the payload for the video you see.

To do this, open any draft you'd like to generate via the API.
Click on the "Generate" button in the top right corner of your editor, then click the "Export to API" button in the dialog that appears.
1. Note that you can only see this button if you have API access.
In the modal that appears, you should see the request body that you can send to generate the exact video that you see in the editor.
Feel free to customise this payload.

Basics

Authentication

You need to have a Business or Enterprise plan to be able to use our API.

Colossyan's API uses bearer token authentication. This means that every request has to be authenticated by sending the token in the Authorization header, prefixed with the text Bearer and a space. The Quickstart request shows an example.
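
As a minimal sketch of the header format described above, the helper below builds the headers for an authenticated JSON request. The function name authHeaders is an illustrative assumption and not part of the API; the header layout mirrors the Quickstart request.

```javascript
// Hypothetical helper: builds the headers for an authenticated JSON request.
function authHeaders(token) {
  if (typeof token !== "string" || token.length === 0) {
    throw new Error("A non-empty API token is required");
  }
  return {
    Authorization: `Bearer ${token}`, // "Bearer", a space, then the token
    "Content-Type": "application/json",
  };
}

const headers = authHeaders("<paste-your-token-here>");
console.log(headers.Authorization); // Bearer <paste-your-token-here>
```

The resulting object can be passed as the headers option of fetch, as in the Quickstart.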

You can create or find your existing tokens at the bottom of the Workspace details tab.

Each token belongs to a specific workspace. This is where the generated videos that you create through the API will be listed. To keep things tidy, we recommend creating a dedicated workspace for each API use-case.

Endpoints

Stable features

https://app.colossyan.com/api/v1

Experimental features

https://app.colossyan.com/api

Assets

Using the API, you can list the following assets:

List avatars
List voices

In each case, you'll get the assets available to the workspace that the token was generated for.

To create a new "instant" avatar please see the following page:

Create avatar

List avatars

The endpoint will return the available avatars for your workspace.

The types of avatars the endpoint will return:

Studio: Avatars provided by Colossyan.
Scenario: Also provided by Colossyan, but shown with a specific scenario background.
Instant: Custom-made avatars. To learn how to create one using the API, see Create avatar.
ssa_lite/ssa_studio: These are also custom-made avatars. To learn more, visit this link.

Important: Scenario and Instant avatars are not included by default. To access them, please contact support.

List voices

Voices

Video Generation

There are multiple ways to generate a video.

You can generate a video based on a template (that you create). In this case you only send us a reference to the template and a list of dynamic variables that you'd like us to replace in the template. See Generating using a template.
You can generate a video by sending a whole video-generation-job descriptor JSON to our API. See Generating a video manually.

Video generation through the API does not support interactive videos, only regular ones.

Once a video generation has been triggered, you have multiple options to get notified when it's ready. See Receiving a generated video.

Generating using a template

To generate a video based on a template:

Head over to Colossyan and create your desired video draft using our web-based studio
Once you are done, click on the "Generate" button in the top right corner of your editor, then click "Export to API" in the dialog that appears.

Note that this button is only visible if you have a plan which has access to the API. For more information, head over to our pricing page.

In the top right corner of the dialog that appears, click "Save job as a template" and make a note of the ID shown. This is the ID of your template job; send it with your request to start generating the video.

Keep in mind: the template job ID refers to the state of the draft when you saved it. If later you edit this draft, you'll need to create another template id to represent the changes.

Now you have everything to send us a video generation request using the API endpoint below.

Generating a video manually

You can generate a video by sending us a video-generation-job descriptor JSON. This is essentially a recipe for a video. Constructing such a recipe manually for complex videos can become tedious very quickly. Because of this, we strongly recommend that you either:

Extract the video-generation-job payload from our web-based UI editor. See Extracting request bodies from the web-editor.
Or generate based on a template. See Generating using a template.

Important concepts

To better understand the API, it's recommended to get an understanding of the following building blocks first:

Scenes
- A scene is a logical consolidation of related content.
- A video can consist of multiple scenes. These scenes are concatenated together. Think of them as slides on a presentation. Each scene is meant to aggregate and encapsulate related content.
- Scenes offer a lot of convenience in timing their content, however it is completely up to the individual to decide how to divide the content into scenes.
Tracks
- Tracks are the building blocks of scenes. You define the content of the video in tracks.
- There are different types of tracks, for different purposes. To differentiate between them we use the type property.

Here you can see the details of the endpoint below.

For advanced use-cases and customisations (such as advanced timings or animations), please reach out to us so that a solution engineer can help you achieve what you are after.

Receiving a generated video

There are three ways to get a video that was generated via our APIs.

You can get notified via our callback once a video is ready (recommended in production)
You can poll it using other API endpoints
You can download the video from your Colossyan Account after navigating to the workspace where the API key was created.

Callback

Upon posting a new video generation job, you have the opportunity to add a callback and some callbackPayload to the job. When the job is successful our service will issue a POST request to the url you provided in the callback field.
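
A minimal sketch of attaching these fields to a job. The text above names the callback and callbackPayload fields; placing them at the top level of the job next to videoCreative, the helper name withCallback, the example URL, and the shape of the payload are all assumptions made for illustration.

```javascript
// Hypothetical helper: returns a copy of the job with callback fields added.
// Assumption: the callback fields sit at the top level of the job object.
function withCallback(job, callbackUrl, callbackPayload) {
  new URL(callbackUrl); // throws a TypeError if the URL is malformed
  return { ...job, callback: callbackUrl, callbackPayload };
}

const job = withCallback(
  { videoCreative: { /* settings and scenes as in the Quickstart */ } },
  "https://example.com/hooks/video-ready", // placeholder URL
  { orderId: "1234" } // custom data to associate with the job (illustrative)
);
console.log(Object.keys(job));
```

The original job object is left untouched, so the same template job can be reused with different callbacks.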

In the body of this post message, we add the following fields:

Key

Description

Polling

Because video generation is asynchronous, you first need to query the video-generation-job itself to see its status. To do this, use the API below.

Continue to poll the status of the job until it returns either finished or failed. If it finished successfully, use the API below to get the generated video.

You can get the videoId either when queueing the job or when fetching its status.

An example of a script to poll a job can be found below:
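
A minimal polling sketch under stated assumptions: the job is assumed to be retrievable with GET {api}/video-generation-jobs/{id} and to expose a status field, neither of which is confirmed on this page. The terminal statuses finished and failed come from the text above. The status lookup is passed in as a function so the loop can be exercised without network access.

```javascript
// Polls until the job reaches a terminal status ("finished" or "failed").
// getJob: async function returning the job object (must contain `status`).
async function pollJob(getJob, { intervalMs = 5000, maxAttempts = 120 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await getJob();
    if (job.status === "finished" || job.status === "failed") {
      return job;
    }
    if (attempt < maxAttempts) {
      // Wait before asking again to avoid hammering the API.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job did not finish after ${maxAttempts} attempts`);
}

// Assumed endpoint shape; adjust to the actual job retrieval endpoint.
function makeJobFetcher(api, token, jobId) {
  return async () => {
    const response = await fetch(`${api}/video-generation-jobs/${jobId}`, {
      headers: { Authorization: `Bearer ${token}` },
    });
    if (!response.ok) {
      throw new Error(`Status request failed with HTTP ${response.status}`);
    }
    return response.json();
  };
}
```

With the Quickstart variables in scope, a call could look like pollJob(makeJobFetcher(api, token, result.id)), awaited inside an async function.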

Getting the video from the Colossyan App

Open the Colossyan App.
Navigate to the workspace that the API key used to generate the video belongs to.
1. The video should be listed there.
2. You should also see it if it's currently being generated. In this case, it'll show the status of the job.

Generated videos

Through the API it is possible to retrieve or delete a video that was already generated.

Retrieve a video

Delete a video

Video generation job

A video generation job holds data about the video generation.

For example, by retrieving the generation job you can tell which status the video generation is currently in (e.g. generating, finished, failed).

Retrieve video generation job
Delete video generation job

Retrieve video generation job

Delete video generation job

Deleting a video generation job will stop the video generation processing.

Avatar creation

Create avatar

You can create an "instant" avatar by sending an image or video link to the API. The avatar will be generated based on the provided media.

The API will return the avatar's name, which can then be used for generating video with the avatar.

Advanced

Advanced use-cases

In general, we always recommend generating videos either by:

Extracting the video-generation-job payload from our web-based UI editor.
Or by generating based on a template.

If for some reason neither of these options is enough to achieve what you are after, the following pages cover some advanced concepts for working with our API.

Using template variables

Dynamic template variables are placeholders within your video script or scene text element that can be replaced with different values each time a video is generated.

They allow you to customize content dynamically without modifying the base template.

Learn more on the following pages on how to use them:

Script template variables

You can create dynamic template variables in the script on any scene, by simply typing a variable name between curly braces at any point of the script.

For example in the following script...

Dear {name}, I have seen that you haven't logged into your account for the past two months. Should we catch up quickly? Since then we released several cool stuff, such as {cool_stuff_1} and {cool_stuff_2}. Let me know if you have a free slot on your calendar next week!

...you have created three dynamic variables:

name
cool_stuff_1
cool_stuff_2

Providing values for the variables

You can replace these template variables with their values, by sending their values in the request body when generating a video. The API will replace the placeholders in the script with the corresponding values you provide.

For example:

{
    "dynamicVariables": {
        "name": "John",
        "cool_stuff_1": "AI-powered task automation",
        "cool_stuff_2": "Slack & Google Calendar integration"
    },
    // other request parameters...
}
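
As a hedged illustration, the sketch below lists the placeholders found in a script and reports which ones have no value yet, which can be handy before sending a request. The helper names and the assumption that variable names consist of letters, digits, and underscores are illustrative and not part of the API.

```javascript
// Hypothetical helper: returns the unique variable names found in a script.
// Assumption: names are made of letters, digits, and underscores.
function extractVariables(script) {
  const names = new Set();
  for (const match of script.matchAll(/\{(\w+)\}/g)) {
    names.add(match[1]);
  }
  return [...names];
}

// Returns the variables used in the script that have no entry in `values`.
function missingVariables(script, values) {
  return extractVariables(script).filter(
    (name) => !Object.prototype.hasOwnProperty.call(values, name)
  );
}

const script = "Dear {name}, we released {cool_stuff_1} and {cool_stuff_2}.";
console.log(extractVariables(script)); // lists name, cool_stuff_1, cool_stuff_2
console.log(missingVariables(script, { name: "John" })); // lists the two cool_stuff variables
```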

Generating a video

For more information on how to generate a video, see Video Generation.

Text template variables

You can use dynamic template variables in any text element, by simply typing a variable name between curly braces.

Take the following example:

Hey {name}, we’ve been improving!

Our newest feature, {feature_name}, is live!

In this text we created two template variables: name and feature_name.

Providing values for the variables

You can replace these template variables with their values, by sending their values in the request body when generating a video. The API will replace the placeholders in the text with the corresponding values you provide.

For example:
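
A sketch of such a request body, assuming text variables are supplied through the same dynamicVariables field shown for script variables. The value "Smart Scheduling" and the renderText preview helper are illustrative; the actual substitution happens on the API side.

```javascript
const requestBody = {
  dynamicVariables: {
    name: "John",
    feature_name: "Smart Scheduling", // illustrative value
  },
  // other request parameters...
};

// Hypothetical local preview of what the substituted text would look like.
// Placeholders without a provided value are left untouched.
function renderText(template, variables) {
  return template.replace(/\{(\w+)\}/g, (placeholder, name) =>
    Object.prototype.hasOwnProperty.call(variables, name)
      ? String(variables[name])
      : placeholder
  );
}

console.log(
  renderText("Our newest feature, {feature_name}, is live!", requestBody.dynamicVariables)
);
// Our newest feature, Smart Scheduling, is live!
```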

Generating a video

For more information on how to generate a video, see Video Generation.

Timing

Each track and scene has to have a defined or inferable duration. We provide multiple ways of defining these timing constraints. It is up to you to decide which way you would like to use.

Scene duration

You can set the duration of each scene in three different ways. In every case, no matter how long the tracks within the scene are, anything that runs longer than the scene itself will be cut off.

Explicit scene duration

You can explicitly define the scene's duration in milliseconds by setting the duration property on it.

const job = {
  videoCreative: {
    settings: {
      name: "Simple video with a single scene",
      videoSize: { height: 1080, width: 1920 },
    },
    scenes: [
      {
        name: "Scene with explicit duration",
        duration: 2000, // ms
        tracks: [
          {
            type: "actor",
            position: { x: 200, y: 200 }, // From the top left in pixels
            size: { height: 680, width: 1520 }, // In pixels
            text: "This scene is exactly two seconds long. It does not matter how long any of the tracks are.",
            speakerId: "en-GB-RyanNeural",
            actor: "ryan",
          },
        ],
      },
    ],
  },
};

Referring to a track

If you would like your scene to be exactly as long as one of the tracks in it, you can do so by referring to the track on the scene. In this case Colossyan will try to figure out the length of the track in isolation first, and if it can do so, it'll assign the same duration to the scene itself.

In this case the track duration can only be inferred if it has a default or an explicitly defined duration.

This is very handy when you would like to make sure that the scene will not end before an actor finishes speaking, or before the end of a video.

Use the intrinsicDurationTrackReference property in the scene to refer to the referenceId of the track you would like the scene to have an equal duration with.

const job = {
  videoCreative: {
    settings: {
      name: "Simple video with a single scene",
      videoSize: { height: 1080, width: 1920 },
    },
    scenes: [
      {
        name: "This scene lasts until the actor stops speaking",
        intrinsicDurationTrackReference: "speech",
        tracks: [
          {
            type: "actor",
            referenceId: "speech",
            position: { x: 200, y: 200 }, // From the top left in pixels
            size: { height: 680, width: 1520 }, // In pixels
            text: "It does not matter how long I am talking. The scene will last as long as necessary.",
            speakerId: "en-GB-RyanNeural",
            actor: "ryan",
          },
        ],
      },
    ],
  },
};

Combining this technique with timing the tracks by referring to the scene's start and end time is very powerful: it allows you to achieve intricate videos with the flexibility of handling varying track lengths.
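
A sketch of this combination, using only properties introduced on this page: the scene takes its length from the actor track via intrinsicDurationTrackReference, while an image track is pinned to the scene with startTimeGap and endTimeGap. The image URL, the gap values, and the side-by-side layout are placeholders.

```javascript
const job = {
  videoCreative: {
    settings: {
      name: "Scene sized by speech, image pinned to the scene",
      videoSize: { height: 1080, width: 1920 },
    },
    scenes: [
      {
        name: "Speech-driven scene",
        // The scene lasts as long as the track with referenceId "speech".
        intrinsicDurationTrackReference: "speech",
        tracks: [
          {
            type: "actor",
            referenceId: "speech",
            position: { x: 960, y: 0 }, // right half of the frame
            size: { height: 1080, width: 960 },
            text: "However long this sentence is, the image stays in sync.",
            speakerId: "en-GB-RyanNeural",
            actor: "ryan",
          },
          {
            type: "image",
            startTimeGap: 500, // ms after the scene starts
            endTimeGap: 500, // ms before the scene ends
            position: { x: 0, y: 0 }, // left half of the frame
            size: { height: 1080, width: 960 },
            imageUrl: "https://your.cdn.com/public/images/slide.jpg",
          },
        ],
      },
    ],
  },
};
```

Because the image is timed relative to the scene rather than given a fixed duration, changing the spoken text changes the image's on-screen time automatically.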

Default scene duration

If you do not explicitly set the duration, nor refer to a track within the scene, Colossyan will try to infer the best guess for the duration of the scene by looking at the tracks, and trying to find a single one with either an intrinsic or explicitly set duration.

This is only possible if there is only one track in the scene with a default or explicitly set duration. Otherwise Colossyan will throw a schema validation error, when queueing the job.

const job = {
  videoCreative: {
    settings: {
      name: "Simple video with a single scene",
      videoSize: { height: 1080, width: 1920 },
    },
    scenes: [
      {
        name: "This scene defaults to last until the end of the video",
        tracks: [
          {
            type: "video",
            position: { x: 200, y: 200 }, // From the top left in pixels
            size: { height: 680, width: 1520 }, // In pixels
            videoUrl: "https://your.cdn.com/public/videos/some_video.mp4",
          },
        ],
      },
    ],
  },
};
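
The inference described above can be approximated with a client-side pre-check before queueing a job. This is a sketch of the documented rule, not Colossyan's actual validator: it assumes a track has a usable duration when it sets duration or is of type video or actor (the only track types with intrinsic durations), and the function name is illustrative.

```javascript
const INTRINSIC_TYPES = new Set(["video", "actor"]);

// Hypothetical pre-check: describes where a scene's duration would come from.
function sceneDurationSource(scene) {
  if (typeof scene.duration === "number") return "explicit";
  if (scene.intrinsicDurationTrackReference) return "track-reference";

  // Tracks that could supply a duration on their own.
  const candidates = (scene.tracks || []).filter(
    (track) =>
      typeof track.duration === "number" || INTRINSIC_TYPES.has(track.type)
  );
  if (candidates.length === 1) return "default";
  throw new Error(
    `Cannot infer scene duration: found ${candidates.length} tracks with a duration, expected exactly 1`
  );
}

console.log(sceneDurationSource({ tracks: [{ type: "video" }] })); // default
```

A scene with both an actor track and a video track, and no explicit duration or track reference, would be rejected by this check, mirroring the schema validation error mentioned above.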

Track duration

Track duration can be set in three different ways. No matter which way you choose, if the scene is shorter, the track will be cut to fit it.

If however the track you provided does not last long enough to reach the duration defined by you (for example a video ends earlier, or the actor finishes speaking), the track will disappear earlier than you defined.

Explicit track duration

Similarly to setting a scene's explicit duration, you can set the duration of a track by setting the duration property on it. The duration should be set in milliseconds.

const job = {
  videoCreative: {
    settings: {
      name: "Simple video with a single scene",
      videoSize: { height: 1080, width: 1920 },
    },
    scenes: [
      {
        name: "Scene with a single track with an explicitly set duration",
        tracks: [
          {
            type: "image",
            duration: 3000, // ms (example value)
            position: { x: 200, y: 200 }, // From the top left in pixels
            size: { height: 680, width: 1520 }, // In pixels
            imageUrl: "https://your.cdn.com/public/videos/image.jpg",
          },
        ],
      },
    ],
  },
};

Referring to the scene's start and end time

You can set the track to start after a certain amount of time has passed since the beginning of the scene, and to last until a certain amount of time before the end of the scene. By setting both of these variables, you are essentially defining the duration of the track as well.

You can define these gaps from the start and end of the scene by setting the startTimeGap and the endTimeGap properties on the track.

You can only time the track this way if the scene itself does not rely on the track's duration.

const job = {
  videoCreative: {
    settings: {
      name: "Simple video with a single scene",
      videoSize: { height: 1080, width: 1920 },
    },
    scenes: [
      {
        name: "Scene with a single track with a duration set by startTimeGap and endTimeGap",
        duration: 5000,
        tracks: [
          {
            type: "image",
            startTimeGap: 1000, // ms
            endTimeGap: 2000, // ms
            position: { x: 200, y: 200 }, // From the top left in pixels
            size: { height: 680, width: 1520 }, // In pixels
            imageUrl: "https://your.cdn.com/public/videos/image.jpg",
          },
        ],
      },
    ],
  },
};
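
The arithmetic behind this kind of timing can be sketched as follows for a 5000 ms scene with a startTimeGap of 1000 ms and an endTimeGap of 2000 ms. The helper name trackWindow is illustrative and not part of the API.

```javascript
// Hypothetical helper: computes when a track is visible within its scene.
// All values are in milliseconds.
function trackWindow(sceneDuration, startTimeGap = 0, endTimeGap = 0) {
  const start = startTimeGap;
  const end = sceneDuration - endTimeGap;
  if (start < 0 || endTimeGap < 0 || end <= start) {
    throw new RangeError("The gaps leave no time for the track");
  }
  return { start, end, duration: end - start };
}

console.log(trackWindow(5000, 1000, 2000)); // start 1000, end 3000, duration 2000
```

In this example the image appears one second into the scene and stays visible for two seconds.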

Default track duration

If you do not define the duration of the track in any way, Colossyan will try to fall back to the intrinsic duration of the track if there is any.

Only video and actor tracks have intrinsic durations.

In case of a video track the intrinsic duration equals the duration of the video itself.
In case of an actor track the intrinsic duration equals the duration of the lip-synced video of the actor speaking.

const job = {
  videoCreative: {
    settings: {
      name: "Simple video with a single scene",
      videoSize: { height: 1080, width: 1920 },
    },
    scenes: [
      {
        name: "Scene with a single track with an intrinsic duration",
        tracks: [
          {
            type: "video",
            position: { x: 200, y: 200 }, // From the top left in pixels
            size: { height: 680, width: 1520 }, // In pixels
            videoUrl: "https://your.cdn.com/public/videos/some_video.mp4",
          },
        ],
      },
    ],
  },
};

Timing tracks within a scene

To time when the track should start or end within the scene, you can use the startTimeGap or the endTimeGap properties on the track.

If you find yourself in a situation where these options are not suitable for your use-case, double check if there is a way to subdivide your content to different scenes, to achieve the result you are looking for.

Good practices

Use startTimeGap and endTimeGap to make videos that work well for dynamic content.
Refer to the track's duration in scenes to deal with dynamic track lengths.
Do not over-constrain the timing of your scenes and tracks. This will result in a schema validation error.

Experimental

Knowledge to draft

Experimental endpoints are not yet finalised. They can change at any point, without notice. To reflect this, these endpoints are not available on the /v1/ endpoint (see Endpoints).
