Speech-to-Text

Speech-to-Text is a Google Cloud service that converts spoken language into written text using machine learning models. It supports real-time and batch audio transcription, multiple languages, and domain-specific tuning. The service can process audio from various sources such as files or streaming input, making it useful for voice commands, call analytics, and caption generation.

gcp.speech_recognizer

Fields

| Title | ID | Type | Data Type | Description |
| --- | --- | --- | --- | --- |
| | `_key` | core | `string` | |
| | `ancestors` | core | `array<string>` | |
| | `annotations` | core | `hstore` | Allows users to store small amounts of arbitrary data. Both the key and the value must be 63 characters or less each. At most 100 annotations. |
| | `create_time` | core | `timestamp` | Output only. Creation time. |
| | `datadog_display_name` | core | `string` | |
| | `default_recognition_config` | core | `json` | Default configuration to use for requests with this Recognizer. This can be overwritten by inline configuration in the RecognizeRequest.config field. |
| | `delete_time` | core | `timestamp` | Output only. The time at which this Recognizer was requested for deletion. |
| | `etag` | core | `string` | Output only. This checksum is computed by the server based on the value of other fields. This may be sent on update, undelete, and delete requests to ensure the client has an up-to-date value before proceeding. |
| | `expire_time` | core | `timestamp` | Output only. The time at which this Recognizer will be purged. |
| | `gcp_display_name` | core | `string` | User-settable, human-readable name for the Recognizer. Must be 63 characters or less. |
| | `kms_key_name` | core | `string` | Output only. The [KMS key name](https://cloud.google.com/kms/docs/resource-hierarchy#keys) with which the Recognizer is encrypted. The expected format is `projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{crypto_key}`. |
| | `kms_key_version_name` | core | `string` | Output only. The [KMS key version name](https://cloud.google.com/kms/docs/resource-hierarchy#key_versions) with which the Recognizer is encrypted. The expected format is `projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{crypto_key}/cryptoKeyVersions/{crypto_key_version}`. |
| | `labels` | core | `array<string>` | |
| | `language_codes` | core | `array<string>` | Optional. This field is now deprecated. Prefer the `language_codes` field in the `RecognitionConfig` message. The language of the supplied audio as a [BCP-47](https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language tag. Supported languages for each model are listed in the [Table of Supported Models](https://cloud.google.com/speech-to-text/v2/docs/speech-to-text-supported-languages). If additional languages are provided, recognition result will contain recognition in the most likely language detected. The recognition result will include the language tag of the language detected in the audio. When you create or update a Recognizer, these values are stored in normalized BCP-47 form. For example, "en-us" is stored as "en-US". |
| | `model` | core | `string` | Optional. This field is now deprecated. Prefer the `model` field in the `RecognitionConfig` message. Which model to use for recognition requests. Select the model best suited to your domain to get best results. Guidance for choosing which model to use can be found in the [Transcription Models Documentation](https://cloud.google.com/speech-to-text/v2/docs/transcription-model) and the models supported in each region can be found in the [Table Of Supported Models](https://cloud.google.com/speech-to-text/v2/docs/speech-to-text-supported-languages). |
| | `name` | core | `string` | Output only. Identifier. The resource name of the Recognizer. Format: `projects/{project}/locations/{location}/recognizers/{recognizer}`. |
| | `organization_id` | core | `string` | |
| | `parent` | core | `string` | |
| | `project_id` | core | `string` | |
| | `project_number` | core | `string` | |
| | `reconciling` | core | `bool` | Output only. Whether or not this Recognizer is in the process of being updated. |
| | `region_id` | core | `string` | |
| | `resource_name` | core | `string` | |
| | `state` | core | `string` | Output only. The Recognizer lifecycle state. |
| | `tags` | core | `hstore_csv` | |
| | `uid` | core | `string` | Output only. System-assigned unique identifier for the Recognizer. |
| | `update_time` | core | `timestamp` | Output only. The most recent time this Recognizer was modified. |
| | `zone_id` | core | `string` | |
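
Two of the field descriptions above state constraints that are easy to check client-side: the `name` format `projects/{project}/locations/{location}/recognizers/{recognizer}`, and the `annotations` limits (keys and values of 63 characters or less, at most 100 entries). The following is a minimal Python sketch of both checks. The helper names `parse_recognizer_name` and `annotation_problems` are hypothetical and not part of any Google Cloud or Datadog library, and the sketch assumes that path segments contain no `/` and that "characters" can be approximated by Python's `len()`.

```python
import re
from typing import Dict, List

# Pattern for the documented `name` format:
#   projects/{project}/locations/{location}/recognizers/{recognizer}
# Assumption: each segment is a non-empty string without "/".
_NAME_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/recognizers/(?P<recognizer>[^/]+)$"
)


def parse_recognizer_name(name: str) -> Dict[str, str]:
    """Split a Recognizer resource name into its three components."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a Recognizer resource name: {name!r}")
    return match.groupdict()


def annotation_problems(annotations: Dict[str, str]) -> List[str]:
    """Return human-readable violations of the documented annotation limits."""
    problems: List[str] = []
    if len(annotations) > 100:
        problems.append(f"{len(annotations)} annotations (at most 100 allowed)")
    for key, value in annotations.items():
        # Assumption: len() (Unicode code points) stands in for "characters".
        if len(key) > 63:
            problems.append(f"key {key[:10]!r}... is longer than 63 characters")
        if len(value) > 63:
            problems.append(f"value for key {key[:10]!r} is longer than 63 characters")
    return problems
```

For example, `parse_recognizer_name("projects/p1/locations/global/recognizers/r1")` yields the project, location, and recognizer segments as a dictionary, and `annotation_problems` returns an empty list when every limit is respected. The project, location, and recognizer values here are placeholders.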