Transcribe Transcription Job

A Transcribe Transcription Job is an AWS resource that represents a transcription request submitted to Amazon Transcribe. It provides details about an audio-to-text transcription job, including its status, metadata, and the location of the generated transcript. Use this resource to track progress, retrieve results, and manage transcription workflows for audio or video files.

aws.transcribe_transcription_job

Fields

ID	Type	Data Type	Description
_key	core	string
account_id	core	string
completion_time	core	timestamp	The date and time the specified transcription job finished processing. Timestamps are in the format YYYY-MM-DD'T'HH:MM:SS.SSSSSS-UTC. For example, 2022-05-04T12:33:13.922000-07:00 represents a transcription job that finished processing at 12:33 PM UTC-7 on May 4, 2022.
content_redaction	core	json	Indicates whether redaction was enabled in your transcript.
creation_time	core	timestamp	The date and time the specified transcription job request was made. Timestamps are in the format YYYY-MM-DD'T'HH:MM:SS.SSSSSS-UTC. For example, 2022-05-04T12:32:58.761000-07:00 represents a transcription job request that was made at 12:32 PM UTC-7 on May 4, 2022.
failure_reason	core	string	If TranscriptionJobStatus is FAILED, FailureReason contains information about why the transcription job request failed. The FailureReason field contains one of the following values: "Unsupported media format": the media format specified in MediaFormat isn't valid; refer to the MediaFormat parameter for a list of supported formats. "The media format provided does not match the detected media format": the media format specified in MediaFormat doesn't match the format of the input file; check the media format of your media file and correct the specified value. "Invalid sample rate for audio file": the sample rate specified in MediaSampleRateHertz isn't valid; the sample rate must be between 8,000 and 48,000 hertz. "The sample rate provided does not match the detected sample rate": the sample rate specified in MediaSampleRateHertz doesn't match the sample rate detected in your input media file; check the sample rate of your media file and correct the specified value. "Invalid file size: file size too large": the size of your media file is larger than what Amazon Transcribe can process; for more information, refer to Service quotas. "Invalid number of channels: number of channels too large": your audio contains more channels than Amazon Transcribe is able to process; for more information, refer to Service quotas.
identified_language_score	core	float64	The confidence score associated with the language identified in your media file. Confidence scores are values between 0 and 1; a larger value indicates a higher probability that the identified language correctly matches the language spoken in your media.
identify_language	core	bool	Indicates whether automatic language identification was enabled (TRUE) for the specified transcription job.
identify_multiple_languages	core	bool	Indicates whether automatic multi-language identification was enabled (TRUE) for the specified transcription job.
job_execution_settings	core	json	Provides information about how your transcription job was processed. This parameter shows if your request was queued and what data access role was used.
language_code	core	string	The language code used to create your transcription job. This parameter is used with single-language identification. For multi-language identification requests, refer to the plural version of this parameter, LanguageCodes.
language_codes	core	json	The language codes used to create your transcription job. This parameter is used with multi-language identification. For single-language identification requests, refer to the singular version of this parameter, LanguageCode.
language_id_settings	core	string	Provides the name and language of all custom language models, custom vocabularies, and custom vocabulary filters that you included in your request.
language_options	core	array<string>	Provides the language codes you specified in your request.
media	core	json	Provides the Amazon S3 location of the media file you used in your request.
media_format	core	string	The format of the input media file.
media_sample_rate_hertz	core	int64	The sample rate, in hertz, of the audio track in your input media file.
model_settings	core	json	Provides information on the custom language model you included in your request.
settings	core	json	Provides information on any additional settings that were included in your request. Additional settings include channel identification, alternative transcriptions, speaker partitioning, custom vocabularies, and custom vocabulary filters.
start_time	core	timestamp	The date and time the specified transcription job began processing. Timestamps are in the format YYYY-MM-DD'T'HH:MM:SS.SSSSSS-UTC. For example, 2022-05-04T12:32:58.789000-07:00 represents a transcription job that started processing at 12:32 PM UTC-7 on May 4, 2022.
subtitles	core	json	Indicates whether subtitles were generated with your transcription.
tags	core	hstore_csv
toxicity_detection	core	json	Provides information about the toxicity detection settings applied to your transcription.
transcript	core	json	Provides you with the Amazon S3 URI you can use to access your transcript.
transcription_job_name	core	string	The name of the transcription job. Job names are case sensitive and must be unique within an Amazon Web Services account.
transcription_job_status	core	string	Provides the status of the specified transcription job. If the status is COMPLETED, the job is finished and you can find the results at the location specified in TranscriptFileUri (or RedactedTranscriptFileUri, if you requested transcript redaction). If the status is FAILED, FailureReason provides details on why your transcription job failed.
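
As an illustration of how the fields above fit together, the following query sketches one way to list recently failed transcription jobs along with the reason reported for each. It uses only the table and column names documented on this page. It assumes that the DDSQL dialect in your environment accepts standard SELECT, WHERE, ORDER BY, and LIMIT clauses; the limit of 20 rows is an arbitrary choice for the example.

```sql
-- Sketch: list failed transcription jobs, most recent request first.
-- Assumes standard SQL clauses are available; LIMIT 20 is arbitrary.
SELECT
  transcription_job_name,
  account_id,
  creation_time,
  failure_reason
FROM aws.transcribe_transcription_job
WHERE transcription_job_status = 'FAILED'
ORDER BY creation_time DESC
LIMIT 20;
```

Changing the filter to `transcription_job_status = 'COMPLETED'` and selecting `completion_time` and `transcript` instead would, under the same assumptions, return finished jobs and the location of their transcripts.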
