Transcribe Transcription Job

A Transcribe Transcription Job is an AWS resource that represents an Amazon Transcribe transcription request and its result. It provides details about an audio-to-text transcription job, including its status, metadata, and the location of the generated transcript. Use this resource to track progress, retrieve results, and manage transcription workflows for audio or video files.

aws.transcribe_transcription_job

Fields

| Title | ID | Type | Data Type | Description |
| --- | --- | --- | --- | --- |
|  | _key | core | string |  |
|  | account_id | core | string |  |
|  | completion_time | core | timestamp | The date and time the specified transcription job finished processing. Timestamps are in the format YYYY-MM-DD'T'HH:MM:SS.SSSSSS-UTC. For example, 2022-05-04T12:33:13.922000-07:00 represents a transcription job that finished processing at 12:33 PM UTC-7 on May 4, 2022. |
|  | content_redaction | core | json | Indicates whether redaction was enabled in your transcript. |
|  | creation_time | core | timestamp | The date and time the specified transcription job request was made. Timestamps are in the format YYYY-MM-DD'T'HH:MM:SS.SSSSSS-UTC. For example, 2022-05-04T12:32:58.761000-07:00 represents a transcription job request that was made at 12:32 PM UTC-7 on May 4, 2022. |
|  | failure_reason | core | string | If TranscriptionJobStatus is FAILED, FailureReason contains information about why the transcription job request failed. The FailureReason field contains one of the following values. "Unsupported media format": The media format specified in MediaFormat isn't valid. Refer to the MediaFormat parameter for a list of supported formats. "The media format provided does not match the detected media format": The media format specified in MediaFormat doesn't match the format of the input file. Check the media format of your media file and correct the specified value. "Invalid sample rate for audio file": The sample rate specified in MediaSampleRateHertz isn't valid. The sample rate must be between 8,000 and 48,000 hertz. "The sample rate provided does not match the detected sample rate": The sample rate specified in MediaSampleRateHertz doesn't match the sample rate detected in your input media file. Check the sample rate of your media file and correct the specified value. "Invalid file size: file size too large": The size of your media file is larger than what Amazon Transcribe can process. For more information, refer to Service quotas. "Invalid number of channels: number of channels too large": Your audio contains more channels than Amazon Transcribe is able to process. For more information, refer to Service quotas. |
|  | identified_language_score | core | float64 | The confidence score associated with the language identified in your media file. Confidence scores are values between 0 and 1; a larger value indicates a higher probability that the identified language correctly matches the language spoken in your media. |
|  | identify_language | core | bool | Indicates whether automatic language identification was enabled (TRUE) for the specified transcription job. |
|  | identify_multiple_languages | core | bool | Indicates whether automatic multi-language identification was enabled (TRUE) for the specified transcription job. |
|  | job_execution_settings | core | json | Provides information about how your transcription job was processed. This parameter shows if your request was queued and what data access role was used. |
|  | language_code | core | string | The language code used to create your transcription job. This parameter is used with single-language identification. For multi-language identification requests, refer to the plural version of this parameter, LanguageCodes. |
|  | language_codes | core | json | The language codes used to create your transcription job. This parameter is used with multi-language identification. For single-language identification requests, refer to the singular version of this parameter, LanguageCode. |
|  | language_id_settings | core | string | Provides the name and language of all custom language models, custom vocabularies, and custom vocabulary filters that you included in your request. |
|  | language_options | core | array<string> | Provides the language codes you specified in your request. |
|  | media | core | json | Provides the Amazon S3 location of the media file you used in your request. |
|  | media_format | core | string | The format of the input media file. |
|  | media_sample_rate_hertz | core | int64 | The sample rate, in hertz, of the audio track in your input media file. |
|  | model_settings | core | json | Provides information on the custom language model you included in your request. |
|  | settings | core | json | Provides information on any additional settings that were included in your request. Additional settings include channel identification, alternative transcriptions, speaker partitioning, custom vocabularies, and custom vocabulary filters. |
|  | start_time | core | timestamp | The date and time the specified transcription job began processing. Timestamps are in the format YYYY-MM-DD'T'HH:MM:SS.SSSSSS-UTC. For example, 2022-05-04T12:32:58.789000-07:00 represents a transcription job that started processing at 12:32 PM UTC-7 on May 4, 2022. |
|  | subtitles | core | json | Indicates whether subtitles were generated with your transcription. |
|  | tags | core | hstore |  |
|  | toxicity_detection | core | json | Provides information about the toxicity detection settings applied to your transcription. |
|  | transcript | core | json | Provides you with the Amazon S3 URI you can use to access your transcript. |
|  | transcription_job_name | core | string | The name of the transcription job. Job names are case sensitive and must be unique within an Amazon Web Services account. |
|  | transcription_job_status | core | string | Provides the status of the specified transcription job. If the status is COMPLETED, the job is finished and you can find the results at the location specified in TranscriptFileUri (or RedactedTranscriptFileUri, if you requested transcript redaction). If the status is FAILED, FailureReason provides details on why your transcription job failed. |
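
The relationship between the status, failure, transcript, and timestamp fields can be illustrated with a minimal Python sketch. It is not an official client: it assumes a job record has already been loaded as a dict keyed by the field IDs listed above, that the transcript JSON uses the keys TranscriptFileUri and RedactedTranscriptFileUri named in the status description, and that timestamps arrive as ISO 8601 strings like the examples. The helper names summarize_job and processing_seconds are hypothetical.

```python
from datetime import datetime
from typing import Optional


def summarize_job(job: dict) -> str:
    """Return a one-line summary of a transcription job record (assumed dict shape)."""
    name = job.get("transcription_job_name", "<unnamed>")
    status = job.get("transcription_job_status")

    if status == "COMPLETED":
        # Assumption: the transcript JSON carries these two URI keys.
        transcript = job.get("transcript") or {}
        # Prefer the redacted transcript when one was produced.
        uri = transcript.get("RedactedTranscriptFileUri") or transcript.get("TranscriptFileUri")
        return f"{name}: completed, transcript at {uri}"

    if status == "FAILED":
        reason = job.get("failure_reason") or "no reason recorded"
        return f"{name}: failed ({reason})"

    # Any other status is reported as-is.
    return f"{name}: {status}"


def processing_seconds(job: dict) -> Optional[float]:
    """Seconds between start_time and completion_time, or None if either is missing."""
    start, end = job.get("start_time"), job.get("completion_time")
    if not start or not end:
        return None
    # fromisoformat accepts strings such as 2022-05-04T12:32:58.789000-07:00.
    return (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds()
```

For a record whose start_time and completion_time match the examples in the table, processing_seconds returns the elapsed time between the two timestamps, while summarize_job reports either the transcript location or the failure reason depending on transcription_job_status.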