PhoneMyBot expects to receive an audio file for transcription using the following:
Supported audio file formats are '.flac', '.mp3', '.ogg' and '.wav', with encodings:
The transcription request is submitted using the 'multipart/form-data' method.
The following parameters should be included in the 'form data':
Name | Type | Description |
---|---|---|
fileName | string | Name of the audio file. |
queryConfig | JSON Object | JSON Object defining the configuration parameters used for the request |
file | multipart | The actual file to be uploaded in chunks |
Parameters that must be included in the 'queryConfig' JSON Object:
Name | Type | Description |
---|---|---|
subscription | string | The subscriptionId provided in the setup of the API configuration |
requestid | string | Unique identifier of this request: an arbitrary value chosen by the sender to help correlate requests with responses (the value provided in the request is returned as-is in the response) |
The HTTP header of the message must include basic authentication in the form Authorization: Basic {security-token}, where the 'security token' is provided in the setup of the API configuration.
The following is an example of implementation using curl:
curl -i -X POST \
-H "Content-Type: multipart/form-data" \
-H "Authorization: Basic {security-token}" \
-F "fileName={fileName}" \
-F "queryConfig={ \
\"subscription\": \"{subscription-id}\", \
\"requestid\": \"{request-id}\" \
}" \
-F "fileContent=@\"./{fileName}\";type=audio/wav;filename={fileName}" \
'{baseURL}/cloudengine/rest/interaction/v1/transcribe/{interactionId}'
Where {subscription-id} is the identifier of the subscription and {request-id} is the arbitrary identifier chosen for the request.
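For callers that prefer Python to curl, the same request could be assembled roughly as follows. This is a minimal sketch using only the standard library, not an official client: the function name `build_transcribe_request` is hypothetical, the file part is named `fileContent` as in the curl example, and the MIME types other than `audio/wav` are assumptions that do not come from this document.

```python
import json
import os
import urllib.request
import uuid

# Assumed MIME types; only audio/wav appears in the curl example.
AUDIO_TYPES = {
    ".flac": "audio/flac",
    ".mp3": "audio/mpeg",
    ".ogg": "audio/ogg",
    ".wav": "audio/wav",
}


def build_transcribe_request(base_url, interaction_id, security_token,
                             subscription_id, request_id,
                             file_name, audio_bytes):
    """Build (but do not send) a multipart/form-data transcription request."""
    ext = os.path.splitext(file_name)[1].lower()
    if ext not in AUDIO_TYPES:
        raise ValueError(f"unsupported audio format: {ext!r}")

    query_config = json.dumps(
        {"subscription": subscription_id, "requestid": request_id}
    )
    boundary = uuid.uuid4().hex

    def text_part(name, value):
        # One plain-text form field.
        return (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8")

    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="fileContent"; '
        f'filename="{file_name}"\r\n'
        f"Content-Type: {AUDIO_TYPES[ext]}\r\n\r\n"
    ).encode("utf-8")

    body = (
        text_part("fileName", file_name)
        + text_part("queryConfig", query_config)
        + file_header + audio_bytes + b"\r\n"
        + f"--{boundary}--\r\n".encode("utf-8")
    )

    url = (f"{base_url}/cloudengine/rest/interaction/v1/"
           f"transcribe/{interaction_id}")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Authorization": f"Basic {security_token}",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen` to send it. Building the request separately from sending it makes it possible to inspect the headers and body without contacting the server.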
Possible responses:
{
"subscription": "{subscription-id}",
"requestid": "{request-id}",
"requesttype": "transcribe",
"result": {
"success": true,
"code": "CLD_200",
"description": "Request completed"
},
"data": {
"transcription": "this is a test transcription",
"final": true
}
}
Where the "transcription" field contains the transcription of the voice file that was sent in the POST.
{
"subscription": "{subscription-id}",
"requestid": "{request-id}",
"requesttype": "transcribe",
"result": {
"success": false,
"code": "OFFLINE_002",
"description": "Mandatory field subscription is missing"
}
}
{
"subscription": "{subscription-id}",
"requestid": "{request-id}",
"requesttype": "transcribe",
"result": {
"success": false,
"code": "OFFLINE_004",
"description": "Subscription not found"
}
}
Example:
{
"subscription": "{subscription-id}",
"requestid": "{request-id}",
"requesttype": "transcribe",
"result": {
"success": false,
"code": "OFFLINE_003",
"description": "Subscription disabled"
}
}
{
"subscription": "{subscription-id}",
"requestid": "{request-id}",
"requesttype": "transcribe",
"result": {
"success": false,
"code": "OFFLINE_008",
"description": "Application error: Malformed url o Missing File!"
}
}
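A client typically needs to tell the successful response apart from the error responses listed above. The sketch below shows one way to do that, assuming the response body has already been read as text; `TranscriptionError` and `extract_transcription` are hypothetical names, not part of the PhoneMyBot API.

```python
import json


class TranscriptionError(Exception):
    """Raised when the response reports "success": false."""

    def __init__(self, code, description):
        super().__init__(f"{code}: {description}")
        self.code = code
        self.description = description


def extract_transcription(response_text):
    """Return the transcription string, or raise TranscriptionError."""
    payload = json.loads(response_text)
    result = payload.get("result", {})
    if not result.get("success"):
        raise TranscriptionError(
            result.get("code", "UNKNOWN"),
            result.get("description", ""),
        )
    return payload["data"]["transcription"]
```

Carrying the `code` field on the exception lets the caller react differently to, say, `OFFLINE_003` (subscription disabled) and `OFFLINE_008` (malformed request).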
The transcription quality can be improved if the context of the audio file is known and is communicated as a hint to PhoneMyBot.
For instance, if the audio file has been recorded after the user has been asked for a telephone number, it will likely contain a telephone number. If PhoneMyBot is made aware of the context, it can optimize the transcription of the phone number for better quality and precision.
The context should be provided in the transcription request in a 'hints' JSON object inside the 'queryConfig' configuration object.
The typical structure of the 'hints' object is the following:
"hints": {
"contexts": [{
"name": "{{task name a}}",
"type": "builtin"
},{
"name": "{{task name b}}",
"type": "builtin"
}]
}
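To show where the hints sit relative to the mandatory fields, the following sketch builds a complete 'queryConfig' string with an optional 'hints' object. The helper name `build_query_config` is hypothetical, and the context name used in the usage note is a placeholder rather than a confirmed PhoneMyBot task name.

```python
import json


def build_query_config(subscription_id, request_id, context_names=()):
    """Serialize queryConfig, adding "hints" only when contexts are given."""
    config = {"subscription": subscription_id, "requestid": request_id}
    if context_names:
        config["hints"] = {
            "contexts": [
                {"name": name, "type": "builtin"} for name in context_names
            ]
        }
    return json.dumps(config)
```

For example, `build_query_config("sub-1", "req-1", ["phone_number"])` returns a JSON string whose "hints" object contains one built-in context; the result would be sent as the value of the 'queryConfig' form field.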
The set of contexts (recognition tasks) that PhoneMyBot handles for each language keeps growing. See this page for the latest list.