Marswave OpenAPI User Documentation (1.0.0)

Marswave Team: dev@marswave.ai
License: Apache 2.0

This document is the Marswave OpenAPI reference for user-facing endpoints.

Full integration guide: https://blog.listenhub.ai/openapi-docs-en

Authentication

The API uses API key authentication. Send the key in the Authorization header, in the format: Authorization: Bearer <your API key>

Retrieve your API key: Visit the API Keys settings page
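
The header format above can be sketched in Python as follows. This is only an illustration: the helper name auth_headers and the placeholder key are assumptions, and the Content-Type entry is added because the request bodies in this reference are application/json.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers that carry the API key as a Bearer token."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key must not be empty")
    return {
        "Authorization": f"Bearer {key}",
        # Request bodies in this reference are JSON.
        "Content-Type": "application/json",
    }


# "YOUR_API_KEY" is a placeholder, not a real key.
headers = auth_headers("YOUR_API_KEY")
```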

user

User-related interfaces

Get user subscription details

Retrieve the current user's subscription status and credit usage.

Authorizations:
ApiKeyAuth

Responses

Response samples

Content type
application/json
{
  "code": 0,
  "message": "",
  "data": { }
}

speakers

Speaker-related interfaces

Get speaker list (including private voices)

Retrieve available voices. When an API Key is provided, the response includes the user's accessible private voices; otherwise only public voices are returned.

query Parameters
language
string
Example: language=en/zh/ja

Filters voices by language (en, zh, or ja)

Responses

Response samples

Content type
application/json
{
  "code": 0,
  "message": "",
  "data": { }
}

podcast

Podcast-related interfaces

Create Podcast Episode

Create a new podcast episode from the provided text and other settings.

Authorizations:
ApiKeyAuth
Request Body schema: application/json
required
query
string

Text to be synthesized

sources
Array of objects

Optional, other sources

speakers
required
Array of objects [ 1 .. 2 ] items

Required, voice type

language
string
Enum: "en" "zh" "ja"

Optional, en/zh/ja, language type. Default: en (English)

mode
string
Enum: "deep" "quick" "debate"

Generation mode: deep (deep mode), quick (quick mode), debate (debate mode). Default: quick (quick mode)

Responses

Request samples

Content type
application/json
{
  "query": "string",
  "sources": [ ],
  "speakers": [ ],
  "language": "en",
  "mode": "deep"
}
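
The constraints in the schema above (1 to 2 speakers, the language and mode enums, and their defaults) can be checked on the client before sending. The following is a minimal sketch under stated assumptions: build_podcast_request is a hypothetical helper, and because this reference does not show the shape of a speaker object, the speakerId key in the usage line is an assumption used only for illustration.

```python
LANGUAGES = {"en", "zh", "ja"}
MODES = {"deep", "quick", "debate"}


def build_podcast_request(speakers, query=None, sources=None,
                          language="en", mode="quick"):
    """Validate the documented constraints and return the JSON body as a dict."""
    if not 1 <= len(speakers) <= 2:
        raise ValueError("speakers must contain 1 or 2 items")
    if language not in LANGUAGES:
        raise ValueError(f"language must be one of {sorted(LANGUAGES)}")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")

    body = {"speakers": list(speakers), "language": language, "mode": mode}
    if query is not None:
        body["query"] = query
    if sources:
        body["sources"] = list(sources)
    return body


# The speaker object shape is an assumption; only the item count is documented.
body = build_podcast_request([{"speakerId": "speaker_001"}],
                             query="Text to be synthesized")
```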

Response samples

Content type
application/json
{
  "code": 0,
  "message": "",
  "data": { }
}

Query Podcast episode information

Query detailed information about the specified podcast episode, including blog text, audio content, etc.

Authorizations:
ApiKeyAuth
path Parameters
episodeId
required
string

Podcast episode unique identifier

Responses

Response samples

Content type
application/json
Example
{
  "code": 0,
  "message": "",
  "data": { }
}

Get Podcast episode text stream information

Get the outline or script text stream of the specified podcast episode, returned in Server-Sent Events (SSE) format.

Authorizations:
ApiKeyAuth
path Parameters
episodeId
required
string

Podcast episode unique identifier

query Parameters
event
required
string
Enum: "script" "outline"

Query event type (script or outline)

Responses

Create Podcast Episode (Content Only)

Two-stage generation - Stage 1: Generate only podcast content (scripts, outline, etc.), without audio.

After generation completes, you can:

  1. Call the query endpoint to retrieve the generated scripts
  2. Modify the scripts (optional)
  3. Call /v1/podcast/episodes/{episodeId}/audio to generate audio
Authorizations:
ApiKeyAuth
Request Body schema: application/json
required
query
string

Text to be synthesized

sources
Array of objects

Optional, other sources

speakers
required
Array of objects [ 1 .. 2 ] items

Required, voice type

language
required
string
Enum: "en" "zh" "ja"

Required, en/zh/ja, language type. Speaker language must match this parameter

mode
string
Enum: "deep" "quick" "debate"

Generation mode: deep (deep mode), quick (quick mode), debate (debate mode). Default: quick (quick mode)

Responses

Request samples

Content type
application/json
{
  "query": "string",
  "sources": [ ],
  "speakers": [ ],
  "language": "en",
  "mode": "deep"
}

Response samples

Content type
application/json
{
  "code": 0,
  "message": "",
  "data": { }
}

Generate Podcast Audio

Two-stage generation - Stage 2: Generate audio based on existing content.

Prerequisites:

  • Must first call /v1/podcast/episodes/text-content to generate content
  • Text generation status (contentStatus) must be text-success

Typical workflow:

  1. First call /v1/podcast/episodes/text-content to generate content
  2. Query and retrieve the generated scripts
  3. Modify the scripts (optional)
  4. Call this endpoint, either with modified scripts or using the original scripts
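
The four steps above can be sketched as a single function. This is an illustration under several assumptions that this reference does not confirm: the HTTP methods, the path of the query endpoint (/v1/podcast/episodes/{episodeId}), and the response field names data.episodeId and data.scripts. Only the two paths quoted in this section and the contentStatus value text-success come from the text. The call parameter is a hypothetical transport function supplied by the caller, so the sketch makes no network requests itself.

```python
import time
from typing import Callable, Optional

# call(method, path, json_body) -> decoded JSON response (hypothetical transport)
Call = Callable[[str, str, Optional[dict]], dict]


def _data(response: dict) -> dict:
    """Return the data object, or raise if the code field signals an error."""
    if response.get("code") != 0:
        raise RuntimeError(f"API error {response.get('code')}: {response.get('message')}")
    return response.get("data") or {}


def generate_podcast_in_two_stages(call: Call, request_body: dict,
                                   edit_scripts=None,
                                   poll_seconds: float = 5.0,
                                   max_polls: int = 60) -> dict:
    # Step 1: generate content only (no audio).
    created = _data(call("POST", "/v1/podcast/episodes/text-content", request_body))
    episode_id = created["episodeId"]  # assumed field name

    # Step 2: poll the query endpoint (assumed path) until the text is ready.
    for _ in range(max_polls):
        info = _data(call("GET", f"/v1/podcast/episodes/{episode_id}", None))
        if info.get("failCode"):
            raise RuntimeError(f"content generation failed: {info['failCode']}")
        if info.get("contentStatus") == "text-success":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("text content was not ready in time")

    # Step 3 (optional): modify the scripts; omit them to reuse the originals.
    audio_body = {}
    if edit_scripts is not None:
        audio_body["scripts"] = edit_scripts(info.get("scripts", []))  # assumed field

    # Step 4: generate audio from the existing or modified scripts.
    return _data(call("POST", f"/v1/podcast/episodes/{episode_id}/audio", audio_body))
```

Passing the transport in as a parameter keeps the workflow independent of any particular HTTP client and makes it easy to exercise with a stub.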
Authorizations:
ApiKeyAuth
path Parameters
episodeId
required
string

Podcast episode unique identifier

Request Body schema: application/json
optional
scripts
Array of objects

Optional, custom scripts array (uses existing scripts if not provided). Podcast scripts must contain 1-2 different speakers.

Responses

Request samples

Content type
application/json
{
  "scripts": [ ]
}

Response samples

Content type
application/json
{
  "code": 0,
  "message": "",
  "data": { }
}

flowspeech

Flowspeech-related interfaces

Create Flowspeech Episode

Create a new Flowspeech episode from the provided text and other settings.

Authorizations:
ApiKeyAuth
Request Body schema: application/json
required
sources
required
Array of objects = 1 items

Required, source information

speakers
required
Array of objects [ 1 .. 2 ] items

Required, voice type

language
string
Enum: "en" "zh" "ja"

Optional, en/zh/ja, language type

mode
string
Enum: "smart" "direct"

Generation mode: smart (AI-enhanced mode; fixes grammar, typos, etc.), direct (pass-through mode; no modifications, converts directly to speech)

Responses

Request samples

Content type
application/json
{
  "sources": [ ],
  "speakers": [ ],
  "language": "en",
  "mode": "smart"
}

Response samples

Content type
application/json
{
  "code": 0,
  "message": "",
  "data": { }
}

Query Flowspeech episode information

Query detailed information about the specified Flowspeech episode, including text, audio content, etc.

Authorizations:
ApiKeyAuth
path Parameters
episodeId
required
string

Flowspeech episode unique identifier

Responses

Response samples

Content type
application/json
Example
{}

Get Flowspeech episode text stream information

Get the outline or script text stream of the specified Flowspeech episode, returned in Server-Sent Events (SSE) format.

Authorizations:
ApiKeyAuth
path Parameters
episodeId
required
string

Flowspeech episode unique identifier

query Parameters
event
required
string
Enum: "script" "outline"

Query event type (script or outline)

Responses

Response samples

Content type
text/event-stream
Example
id: 689ef06042a332af99cd5781
event: script
data: {"code":0, "message":"", "data": {"chunk":"Suddenly turned into a giant watermark frame."}}

id: 689ef06042a332af99cd5781
event: script
data: {"code":0, "message":"", "data": {"chunk":"Every window of this century-old building"}}

id: 689ef06042a332af99cd5781
event: script
data: {"code":0, "message":"", "data": {"chunk":"[END]"}}
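
A stream like the example above can be consumed by splitting it into events and joining the chunk values until the [END] marker arrives. The sketch below parses the sample text directly rather than a live connection. The helper names parse_sse and collect_chunks are hypothetical, and treating [END] as the terminator is inferred from the example, not stated elsewhere in this reference.

```python
import json
import re


def parse_sse(text: str) -> list[dict[str, str]]:
    """Split an SSE payload into events, each a dict of field name to value."""
    events = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            name, sep, value = line.partition(":")
            if sep:
                fields[name.strip()] = value.strip()
        if fields:
            events.append(fields)
    return events


def collect_chunks(text: str, event_type: str = "script") -> list[str]:
    """Return the text chunks of one event type, stopping at the [END] marker."""
    chunks = []
    for event in parse_sse(text):
        if event.get("event") != event_type:
            continue
        payload = json.loads(event["data"])
        if payload["code"] != 0:
            raise RuntimeError(f"stream error {payload['code']}: {payload['message']}")
        chunk = payload["data"]["chunk"]
        if chunk == "[END]":
            break
        chunks.append(chunk)
    return chunks


# The sample stream from this section, with a blank line between events.
SAMPLE = "\n".join([
    'id: 689ef06042a332af99cd5781',
    'event: script',
    'data: {"code":0, "message":"", "data": {"chunk":"Suddenly turned into a giant watermark frame."}}',
    '',
    'id: 689ef06042a332af99cd5781',
    'event: script',
    'data: {"code":0, "message":"", "data": {"chunk":"Every window of this century-old building"}}',
    '',
    'id: 689ef06042a332af99cd5781',
    'event: script',
    'data: {"code":0, "message":"", "data": {"chunk":"[END]"}}',
])

chunks = collect_chunks(SAMPLE)
```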

text-to-speech

Text-to-Speech interfaces

Direct Speech Engine Call (Synchronous)

Generate audio directly from scripts without creating an Episode, supporting multi-speaker dialogue.

Features:

  • Synchronous response: Immediately returns audio URL and details
  • No Episode creation: Does not create any Episode record
  • Direct conversion: No AI modifications to text content, directly converts to speech
  • Multi-speaker support: Supports multiple different speakers
  • Credit deduction: Deducts credits based on actual generated audio length

Use Cases:

  • Have complete script content and need quick conversion to speech
  • Need precise control over speaker and content for each sentence
  • Don't need to save Episode records, only need audio files

Credit Calculation:

  • Checks if user has sufficient credits before generation
  • Deducts credits based on actual audioUnits generated
  • Returns error code 26004 if insufficient credits
Authorizations:
ApiKeyAuth
Request Body schema: application/json
required
scripts
required
Array of objects non-empty

Required, script array

Responses

Request samples

Content type
application/json
{
  "scripts": [ ]
}

Response samples

Content type
application/json
Example
{}

single-speaker-tts

Single speaker TTS generation interfaces

Single Speaker TTS Generation Interface (Streaming Binary Output)

Text-to-speech interface that returns a binary audio stream directly.

Features:

  • Streaming Output: Returns binary audio stream directly without waiting for full generation
  • Single Speaker: Supports single speaker text-to-speech only
  • Real-time Response: Audio is returned in real-time as a stream

Use Cases:

  • Simple single-speaker text-to-speech needs
  • Scenarios requiring streaming audio output

Differences from /v1/speech:

  • /v1/tts: Streaming binary output, single speaker, real-time return
  • /v1/speech: JSON response, multi-speaker support, synchronous URL return

Credit Calculation:

  • Credits are deducted based on actual generated audio length
  • Returns error code 26004 if insufficient credits
Authorizations:
ApiKeyAuth
Request Body schema: application/json
required
input
required
string

Text content to be converted to speech

voice
required
string

Speaker ID (speakerId)

model
string
Default: "flowtts"

Model name (optional, default: flowtts). Note: this parameter is currently ignored and is accepted for compatibility only

Responses

Request samples

Content type
application/json
{
  "input": "The weather is beautiful today, perfect for a walk outside.",
  "voice": "speaker_001",
  "model": "flowtts"
}
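
Because /v1/tts returns a binary stream rather than JSON, a client would typically write the response to a file in chunks instead of decoding it. The following sketch uses only the Python standard library and sends nothing over the network. The base URL is a placeholder (this reference does not state the API host), the POST method is an assumption based on the JSON request body, and build_tts_request and save_stream are hypothetical helper names.

```python
import io
import json
import urllib.request


def build_tts_request(base_url: str, api_key: str, text: str, voice: str,
                      model: str = "flowtts") -> urllib.request.Request:
    """Prepare (but do not send) a request for the streaming TTS endpoint."""
    body = json.dumps({"input": text, "voice": voice, "model": model}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/tts",
        data=body,
        method="POST",  # assumed; the HTTP method is not shown in this reference
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def save_stream(response, out, chunk_size: int = 8192) -> int:
    """Copy a binary stream to a writable file object; return bytes written."""
    total = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        out.write(chunk)
        total += len(chunk)
    return total


# Placeholder host and key; replace both with real values.
req = build_tts_request("https://api.example.invalid", "YOUR_API_KEY",
                        "The weather is beautiful today, perfect for a walk outside.",
                        "speaker_001")

# An in-memory stand-in for the HTTP response body, to show the copy loop.
fake_response = io.BytesIO(b"\x00" * 20000)
sink = io.BytesIO()
written = save_stream(fake_response, sink)
```

With a real host, the prepared request can be passed to urllib.request.urlopen and the resulting response object handed to save_stream together with a file opened in binary write mode. The audio container format is not stated in this reference, so the output file extension is left to the caller.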

Response samples

Content type
No sample

📘 Error Codes

System Error Codes

Code  | Description            | Suggested Action
21007 | Invalid API key        | Verify that the API key is configured correctly
25002 | Resource not found     | Check whether the requested resource ID exists
25008 | Invalid episode status | Verify that text content generation is complete and contentStatus is text-success
26004 | Insufficient credits   | Review account credit balance, upgrade the plan, or contact support
29003 | Invalid parameters     | Validate request parameter formats and required fields
29998 | Too many requests      | Implement exponential backoff retries; wait 20-30 seconds between attempts

Content Generation Error Codes

Code  | Description                  | Applicable APIs
91001 | Content too short            | Podcast, FlowSpeech
91002 | Content violates policy      | Podcast, FlowSpeech
91003 | Search failed                | Podcast, FlowSpeech
91004 | Unable to retrieve content   | Podcast, FlowSpeech
91005 | Unable to access URL content | Podcast, FlowSpeech
91006 | Processing failed            | Podcast, FlowSpeech
91007 | File size too large          | Podcast, FlowSpeech

Error Response Format

All errors are returned in HTTP 200 responses and differentiated by the code field:

{
  "code": 21007,
  "message": "Invalid user APIKey",
  "data": {}
}

Error Handling Examples

System Errors

When the API returns a system error code, you usually need to fix the request parameters or inspect the account status.
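
Since errors arrive in HTTP 200 responses, a client has to inspect the code field of every decoded body. The sketch below raises on any non-zero code and retries only on 29998 (Too many requests). The names ApiError, unwrap, and call_with_retry are hypothetical, and the delay schedule (20 seconds, doubling each attempt) is just one reading of the suggested action in the system error table.

```python
import time


class ApiError(Exception):
    """Raised when a response body carries a non-zero code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


TOO_MANY_REQUESTS = 29998


def unwrap(payload: dict) -> dict:
    """Return the data object of a successful response, else raise ApiError."""
    code = payload.get("code")
    if code == 0:
        return payload.get("data") or {}
    raise ApiError(code, payload.get("message") or "Unknown error")


def call_with_retry(send, max_attempts: int = 4, base_delay: float = 20.0,
                    sleep=time.sleep) -> dict:
    """Call send() and retry with exponential backoff on code 29998 only."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return unwrap(send())
        except ApiError as err:
            last_attempt = attempt == max_attempts - 1
            if err.code != TOO_MANY_REQUESTS or last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)  # 20s, 40s, 80s, ...
```

Here send is any zero-argument callable that performs the request and returns the decoded JSON body. Other system codes, such as 21007 or 29003, are raised immediately because retrying would not change the outcome.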

Content Generation Errors

When a polling endpoint returns success (code: 0) but data.failCode is present, the content generation has failed:

{
  "code": 0,
  "message": "",
  "data": {
    "episodeId": "xxx",
    "processStatus": "failed",
    "failCode": 91001,
    "message": "Content is too short"
  }
}
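
A polling client therefore needs a second check after the top-level code field. The sketch below returns the failure code and message when data.failCode is present. The function name generation_failure is hypothetical, and the fallback descriptions are copied from the Content Generation Error Codes table.

```python
from typing import Optional

CONTENT_ERRORS = {
    91001: "Content too short",
    91002: "Content violates policy",
    91003: "Search failed",
    91004: "Unable to retrieve content",
    91005: "Unable to access URL content",
    91006: "Processing failed",
    91007: "File size too large",
}


def generation_failure(payload: dict) -> Optional[tuple[int, str]]:
    """Return (failCode, message) if content generation failed, else None."""
    if payload.get("code") != 0:
        return None  # a system error; handle it via the top-level code instead
    data = payload.get("data") or {}
    fail_code = data.get("failCode")
    if fail_code is None:
        return None
    message = data.get("message") or CONTENT_ERRORS.get(
        fail_code, "Unknown content generation error")
    return fail_code, message


# The polling response shown above.
sample = {
    "code": 0,
    "message": "",
    "data": {
        "episodeId": "xxx",
        "processStatus": "failed",
        "failCode": 91001,
        "message": "Content is too short",
    },
}
failure = generation_failure(sample)
```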