gcp_speech_recognizer

`ancestors`

Type: UNORDERED_LIST_STRING

`annotations`

Type: MAP_STRING_STRING
Provider name: annotations
Description: Allows users to store small amounts of arbitrary data. Both the key and the value must be 63 characters or less each. At most 100 annotations.

`create_time`

Type: TIMESTAMP
Provider name: createTime
Description: Output only. Creation time.

`default_recognition_config`

Type: STRUCT
Provider name: defaultRecognitionConfig
Description: Default configuration to use for requests with this Recognizer. This can be overwritten by inline configuration in the RecognizeRequest.config field.

adaptation
Type: STRUCT
Provider name: adaptation
Description: Speech adaptation context that weights recognizer predictions for specific words and phrases.
- custom_classes
  Type: UNORDERED_LIST_STRUCT
  Provider name: customClasses
  Description: A list of inline CustomClasses. Existing CustomClass resources can be referenced directly in a PhraseSet.
  - annotations
    Type: MAP_STRING_STRING
    Provider name: annotations
    Description: Optional. Allows users to store small amounts of arbitrary data. Both the key and the value must be 63 characters or less each. At most 100 annotations.
  - create_time
    Type: TIMESTAMP
    Provider name: createTime
    Description: Output only. Creation time.
  - delete_time
    Type: TIMESTAMP
    Provider name: deleteTime
    Description: Output only. The time at which this resource was requested for deletion.
  - etag
    Type: STRING
    Provider name: etag
    Description: Output only. This checksum is computed by the server based on the value of other fields. This may be sent on update, undelete, and delete requests to ensure the client has an up-to-date value before proceeding.
  - expire_time
    Type: TIMESTAMP
    Provider name: expireTime
    Description: Output only. The time at which this resource will be purged.
  - gcp_display_name
    Type: STRING
    Provider name: displayName
    Description: Optional. User-settable, human-readable name for the CustomClass. Must be 63 characters or less.
  - items
    Type: UNORDERED_LIST_STRUCT
    Provider name: items
    Description: A collection of class items.
    - value
      Type: STRING
      Provider name: value
      Description: The class item’s value.
  - kms_key_name
    Type: STRING
    Provider name: kmsKeyName
    Description: Output only. The KMS key name with which the CustomClass is encrypted. The expected format is projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{crypto_key}.
  - kms_key_version_name
    Type: STRING
    Provider name: kmsKeyVersionName
    Description: Output only. The KMS key version name with which the CustomClass is encrypted. The expected format is projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{crypto_key}/cryptoKeyVersions/{crypto_key_version}.
  - name
    Type: STRING
    Provider name: name
    Description: Output only. Identifier. The resource name of the CustomClass. Format: projects/{project}/locations/{location}/customClasses/{custom_class}.
  - reconciling
    Type: BOOLEAN
    Provider name: reconciling
    Description: Output only. Whether or not this CustomClass is in the process of being updated.
  - state
    Type: STRING
    Provider name: state
    Description: Output only. The CustomClass lifecycle state.
    Possible values:
    - STATE_UNSPECIFIED - Unspecified state. This is only used/useful for distinguishing unset values.
    - ACTIVE - The normal and active state.
    - DELETED - This CustomClass has been deleted.
  - uid
    Type: STRING
    Provider name: uid
    Description: Output only. System-assigned unique identifier for the CustomClass.
  - update_time
    Type: TIMESTAMP
    Provider name: updateTime
    Description: Output only. The most recent time this resource was modified.
- phrase_sets
  Type: UNORDERED_LIST_STRUCT
  Provider name: phraseSets
  Description: A list of inline or referenced PhraseSets.
  - inline_phrase_set
    Type: STRUCT
    Provider name: inlinePhraseSet
    Description: An inline defined PhraseSet.
    - annotations
      Type: MAP_STRING_STRING
      Provider name: annotations
      Description: Allows users to store small amounts of arbitrary data. Both the key and the value must be 63 characters or less each. At most 100 annotations.
    - boost
      Type: FLOAT
      Provider name: boost
      Description: Hint Boost. Positive value will increase the probability that a specific phrase will be recognized over other similar sounding phrases. The higher the boost, the higher the chance of false positive recognition as well. Valid boost values are between 0 (exclusive) and 20. We recommend using a binary search approach to finding the optimal value for your use case as well as adding phrases both with and without boost to your requests.
    - create_time
      Type: TIMESTAMP
      Provider name: createTime
      Description: Output only. Creation time.
    - delete_time
      Type: TIMESTAMP
      Provider name: deleteTime
      Description: Output only. The time at which this resource was requested for deletion.
    - etag
      Type: STRING
      Provider name: etag
      Description: Output only. This checksum is computed by the server based on the value of other fields. This may be sent on update, undelete, and delete requests to ensure the client has an up-to-date value before proceeding.
    - expire_time
      Type: TIMESTAMP
      Provider name: expireTime
      Description: Output only. The time at which this resource will be purged.
    - gcp_display_name
      Type: STRING
      Provider name: displayName
      Description: User-settable, human-readable name for the PhraseSet. Must be 63 characters or less.
    - kms_key_name
      Type: STRING
      Provider name: kmsKeyName
      Description: Output only. The KMS key name with which the PhraseSet is encrypted. The expected format is projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{crypto_key}.
    - kms_key_version_name
      Type: STRING
      Provider name: kmsKeyVersionName
      Description: Output only. The KMS key version name with which the PhraseSet is encrypted. The expected format is projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{crypto_key}/cryptoKeyVersions/{crypto_key_version}.
    - name
      Type: STRING
      Provider name: name
      Description: Output only. Identifier. The resource name of the PhraseSet. Format: projects/{project}/locations/{location}/phraseSets/{phrase_set}.
    - phrases
      Type: UNORDERED_LIST_STRUCT
      Provider name: phrases
      Description: A list of words and phrases.
      - boost
        Type: FLOAT
        Provider name: boost
        Description: Hint Boost. Overrides the boost set at the phrase set level. Positive value will increase the probability that a specific phrase will be recognized over other similar sounding phrases. The higher the boost, the higher the chance of false positive recognition as well. Negative boost values would correspond to anti-biasing. Anti-biasing is not enabled, so negative boost values will return an error. Boost values must be between 0 and 20. Any values outside that range will return an error. We recommend using a binary search approach to finding the optimal value for your use case as well as adding phrases both with and without boost to your requests.
      - value
        Type: STRING
        Provider name: value
        Description: The phrase itself.
    - reconciling
      Type: BOOLEAN
      Provider name: reconciling
      Description: Output only. Whether or not this PhraseSet is in the process of being updated.
    - state
      Type: STRING
      Provider name: state
      Description: Output only. The PhraseSet lifecycle state.
      Possible values:
      - STATE_UNSPECIFIED - Unspecified state. This is only used/useful for distinguishing unset values.
      - ACTIVE - The normal and active state.
      - DELETED - This PhraseSet has been deleted.
    - uid
      Type: STRING
      Provider name: uid
      Description: Output only. System-assigned unique identifier for the PhraseSet.
    - update_time
      Type: TIMESTAMP
      Provider name: updateTime
      Description: Output only. The most recent time this resource was modified.
  - phrase_set
    Type: STRING
    Provider name: phraseSet
    Description: The name of an existing PhraseSet resource. The user must have read access to the resource and it must not be deleted.
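
As a rough illustration of how the adaptation fields above fit together, the following Python sketch builds an adaptation block with one inline PhraseSet as a plain dictionary keyed by the provider names, and checks the documented boost range. The helper name `build_inline_phrase_set` and the sample phrases are hypothetical, and the sketch assumes the exclusive lower bound stated for the phrase-set boost also applies to per-phrase boosts. It is not a client library call.

```python
def build_inline_phrase_set(phrases, default_boost=None):
    """Return a dict shaped like an adaptation block with one inline PhraseSet.

    phrases: list of (value, boost) tuples; boost may be None.
    """
    def check_boost(boost):
        # Documented range: greater than 0 and at most 20 (assumed for both levels).
        if boost is not None and not (0 < boost <= 20):
            raise ValueError(f"boost must be in (0, 20], got {boost}")
        return boost

    inline = {"phrases": []}
    if check_boost(default_boost) is not None:
        inline["boost"] = default_boost
    for value, boost in phrases:
        entry = {"value": value}
        if check_boost(boost) is not None:
            # A per-phrase boost overrides the boost set at the phrase set level.
            entry["boost"] = boost
        inline["phrases"].append(entry)
    return {"phraseSets": [{"inlinePhraseSet": inline}]}


adaptation = build_inline_phrase_set(
    [("Datadog", 10.0), ("recognizer", None)], default_boost=5.0
)
```

Here the phrase "recognizer" carries no boost of its own, so under the description above it would fall back to the phrase-set boost of 5.0.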
auto_decoding_config
Type: STRUCT
Provider name: autoDecodingConfig
Description: Automatically detect decoding parameters. Preferred for supported formats.
denoiser_config
Type: STRUCT
Provider name: denoiserConfig
Description: Optional. Denoiser configuration. May not be supported for all models and may have no effect.
- denoise_audio
  Type: BOOLEAN
  Provider name: denoiseAudio
  Description: Denoise audio before sending to the transcription model.
- snr_threshold
  Type: FLOAT
  Provider name: snrThreshold
  Description: Signal-to-Noise Ratio (SNR) threshold for the denoiser. Here SNR means the loudness of the speech signal. Audio with an SNR below this threshold, meaning the speech is too quiet, will be prevented from being sent to the transcription model. If snr_threshold=0, no filtering will be applied.
explicit_decoding_config
Type: STRUCT
Provider name: explicitDecodingConfig
Description: Explicitly specified decoding parameters. Required if using headerless PCM audio (linear16, mulaw, alaw).
- audio_channel_count
  Type: INT32
  Provider name: audioChannelCount
  Description: Optional. Number of channels present in the audio data sent for recognition. Note that this field is marked as OPTIONAL for backward compatibility reasons. It is (and has always been) effectively REQUIRED. The maximum allowed value is 8.
- encoding
  Type: STRING
  Provider name: encoding
  Description: Required. Encoding of the audio data sent for recognition.
  Possible values:
  - AUDIO_ENCODING_UNSPECIFIED - Default value. This value is unused.
  - LINEAR16 - Headerless 16-bit signed little-endian PCM samples.
  - MULAW - Headerless 8-bit companded mulaw samples.
  - ALAW - Headerless 8-bit companded alaw samples.
  - AMR - AMR frames with an rfc4867.5 header.
  - AMR_WB - AMR-WB frames with an rfc4867.5 header.
  - FLAC - FLAC frames in the ’native FLAC’ container format.
  - MP3 - MPEG audio frames with optional (ignored) ID3 metadata.
  - OGG_OPUS - Opus audio frames in an Ogg container.
  - WEBM_OPUS - Opus audio frames in a WebM container.
  - MP4_AAC - AAC audio frames in an MP4 container.
  - M4A_AAC - AAC audio frames in an M4A container.
  - MOV_AAC - AAC audio frames in an MOV container.
- sample_rate_hertz
  Type: INT32
  Provider name: sampleRateHertz
  Description: Optional. Sample rate in Hertz of the audio data sent for recognition. Valid values are: 8000-48000, and 16000 is optimal. For best results, set the sampling rate of the audio source to 16000 Hz. If that’s not possible, use the native sample rate of the audio source (instead of resampling). Note that this field is marked as OPTIONAL for backward compatibility reasons. It is (and has always been) effectively REQUIRED.
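
The constraints listed for explicit_decoding_config can be checked before a configuration is submitted. The sketch below is a minimal, hypothetical validator (`explicit_decoding_config` is not a library function) that returns a dictionary keyed by the provider names. The lower bound of 1 for the channel count is an assumption; the description above only states the maximum of 8.

```python
# Encoding values listed above, excluding the unused AUDIO_ENCODING_UNSPECIFIED.
SUPPORTED_ENCODINGS = {
    "LINEAR16", "MULAW", "ALAW", "AMR", "AMR_WB", "FLAC", "MP3",
    "OGG_OPUS", "WEBM_OPUS", "MP4_AAC", "M4A_AAC", "MOV_AAC",
}


def explicit_decoding_config(encoding, sample_rate_hertz, audio_channel_count):
    """Validate the documented ranges and return a config-shaped dict."""
    if encoding not in SUPPORTED_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding}")
    # Documented valid sample rates: 8000-48000 Hz (16000 is optimal).
    if not 8000 <= sample_rate_hertz <= 48000:
        raise ValueError(f"sample rate out of range: {sample_rate_hertz}")
    # Documented maximum is 8 channels; the minimum of 1 is assumed here.
    if not 1 <= audio_channel_count <= 8:
        raise ValueError(f"channel count out of range: {audio_channel_count}")
    return {
        "encoding": encoding,
        "sampleRateHertz": sample_rate_hertz,
        "audioChannelCount": audio_channel_count,
    }


config = explicit_decoding_config("LINEAR16", 16000, 1)
```

All three values are passed explicitly because, as noted above, the sample rate and channel count are marked optional but are effectively required.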
features
Type: STRUCT
Provider name: features
Description: Speech recognition features to enable.
- diarization_config
  Type: STRUCT
  Provider name: diarizationConfig
  Description: Configuration to enable speaker diarization. To enable diarization, set this field to an empty SpeakerDiarizationConfig message.
  - max_speaker_count
    Type: INT32
    Provider name: maxSpeakerCount
    Description: Optional. The system automatically determines the number of speakers. This value is not currently used.
  - min_speaker_count
    Type: INT32
    Provider name: minSpeakerCount
    Description: Optional. The system automatically determines the number of speakers. This value is not currently used.
- enable_automatic_punctuation
  Type: BOOLEAN
  Provider name: enableAutomaticPunctuation
  Description: If true, adds punctuation to recognition result hypotheses. This feature is only available in select languages. The default false value does not add punctuation to result hypotheses.
- enable_spoken_emojis
  Type: BOOLEAN
  Provider name: enableSpokenEmojis
  Description: The spoken emoji behavior for the call. If true, adds spoken emoji formatting for the request. This will replace spoken emojis with the corresponding Unicode symbols in the final transcript. If false, spoken emojis are not replaced.
- enable_spoken_punctuation
  Type: BOOLEAN
  Provider name: enableSpokenPunctuation
  Description: The spoken punctuation behavior for the call. If true, replaces spoken punctuation with the corresponding symbols in the request. For example, “how are you question mark” becomes “how are you?”. See https://cloud.google.com/speech-to-text/docs/spoken-punctuation for support. If false, spoken punctuation is not replaced.
- enable_word_confidence
  Type: BOOLEAN
  Provider name: enableWordConfidence
  Description: If true, the top result includes a list of words and the confidence for those words. If false, no word-level confidence information is returned. The default is false.
- enable_word_time_offsets
  Type: BOOLEAN
  Provider name: enableWordTimeOffsets
  Description: If true, the top result includes a list of words and the start and end time offsets (timestamps) for those words. If false, no word-level time offset information is returned. The default is false.
- max_alternatives
  Type: INT32
  Provider name: maxAlternatives
  Description: Maximum number of recognition hypotheses to be returned. The server may return fewer than max_alternatives. Valid values are 0-30. A value of 0 or 1 will return a maximum of one. If omitted, will return a maximum of one.
- multi_channel_mode
  Type: STRING
  Provider name: multiChannelMode
  Description: Mode for recognizing multi-channel audio.
  Possible values:
  - MULTI_CHANNEL_MODE_UNSPECIFIED - Default value for the multi-channel mode. If the audio contains multiple channels, only the first channel will be transcribed; other channels will be ignored.
  - SEPARATE_RECOGNITION_PER_CHANNEL - If selected, each channel in the provided audio is transcribed independently. This cannot be selected if the selected model is latest_short.
- profanity_filter
  Type: BOOLEAN
  Provider name: profanityFilter
  Description: If set to true, the server will attempt to filter out profanities, replacing all but the initial character in each filtered word with asterisks, for instance, “f***”. If set to false or omitted, profanities won’t be filtered out.
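
The max_alternatives rule above (valid values 0-30, with 0, 1, or an omitted value all yielding at most one hypothesis) can be expressed as a small function. This is only a sketch of the documented behavior; the name `effective_max_alternatives` is hypothetical, and the server may still return fewer alternatives than the computed maximum.

```python
def effective_max_alternatives(max_alternatives=None):
    """Return the maximum number of hypotheses implied by max_alternatives."""
    if max_alternatives is None:
        # Omitted: at most one hypothesis is returned.
        return 1
    if not 0 <= max_alternatives <= 30:
        raise ValueError(f"max_alternatives must be 0-30, got {max_alternatives}")
    # 0 and 1 both mean "at most one".
    return max(1, max_alternatives)
```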
language_codes
Type: UNORDERED_LIST_STRING
Provider name: languageCodes
Description: Optional. The language of the supplied audio as a BCP-47 language tag. Language tags are normalized to BCP-47 before they are used; for example, “en-us” becomes “en-US”. Supported languages for each model are listed in the Table of Supported Models. If additional languages are provided, the recognition result will contain recognition in the most likely language detected. The recognition result will include the language tag of the language detected in the audio.
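
To make the normalization mentioned above concrete, the sketch below applies the conventional BCP-47 casing (lowercase language, title-case script, uppercase region) to a tag. It is a simplified, hypothetical helper that only adjusts casing; it does not validate subtags and is not necessarily the exact procedure the service uses.

```python
def normalize_language_tag(tag):
    """Apply conventional BCP-47 casing to a language tag such as 'en-us'."""
    parts = tag.split("-")
    normalized = [parts[0].lower()]  # primary language subtag
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            normalized.append(part.title())  # script subtag, e.g. Hans
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            normalized.append(part.upper())  # region subtag, e.g. US or 419
        else:
            normalized.append(part.lower())  # any other subtag
    return "-".join(normalized)
```

For instance, `normalize_language_tag("en-us")` returns `"en-US"`, matching the example in the description.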
model
Type: STRING
Provider name: model
Description: Optional. Which model to use for recognition requests. Select the model best suited to your domain to get best results. Guidance for choosing which model to use can be found in the Transcription Models Documentation and the models supported in each region can be found in the Table Of Supported Models.
transcript_normalization
Type: STRUCT
Provider name: transcriptNormalization
Description: Optional. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
- entries
  Type: UNORDERED_LIST_STRUCT
  Provider name: entries
  Description: A list of replacement entries. We will perform replacement with one entry at a time. For example, the second entry in [“cat” => “dog”, “mountain cat” => “mountain dog”] will never be applied because we will always process the first entry before it. At most 100 entries.
  - case_sensitive
    Type: BOOLEAN
    Provider name: caseSensitive
    Description: Whether the search is case sensitive.
  - replace
    Type: STRING
    Provider name: replace
    Description: What to replace with. Max length is 100 characters.
  - search
    Type: STRING
    Provider name: search
    Description: What to replace. Max length is 100 characters.
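
The one-entry-at-a-time behavior described for entries can be simulated locally. The sketch below applies each entry in order and reports how many replacements each one made. The function name `apply_normalization` is hypothetical, and plain substring matching with a case-insensitive default are assumptions; the service's actual matching rules may differ.

```python
import re


def apply_normalization(transcript, entries):
    """Apply replacement entries in order; return (text, per-entry match counts)."""
    counts = []
    for entry in entries:
        flags = 0 if entry.get("caseSensitive", False) else re.IGNORECASE
        # A function replacement keeps backslashes in "replace" literal.
        transcript, count = re.subn(
            re.escape(entry["search"]),
            lambda _match, text=entry["replace"]: text,
            transcript,
            flags=flags,
        )
        counts.append(count)
    return transcript, counts


entries = [
    {"search": "cat", "replace": "dog"},
    {"search": "mountain cat", "replace": "mountain dog"},
]
result = apply_normalization("I saw a mountain cat", entries)
# result == ("I saw a mountain dog", [1, 0])
```

The second count is 0 because the first entry has already rewritten "cat" by the time "mountain cat" is searched for, which mirrors the example in the entries description. Placing longer, more specific entries first avoids this.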
translation_config
Type: STRUCT
Provider name: translationConfig
Description: Optional. Configuration used to automatically translate the given audio into the desired language, for supported models.
- target_language
  Type: STRING
  Provider name: targetLanguage
  Description: Required. The language code to translate to.

`delete_time`

Type: TIMESTAMP
Provider name: deleteTime
Description: Output only. The time at which this Recognizer was requested for deletion.

`etag`

Type: STRING
Provider name: etag
Description: Output only. This checksum is computed by the server based on the value of other fields. This may be sent on update, undelete, and delete requests to ensure the client has an up-to-date value before proceeding.

`expire_time`

Type: TIMESTAMP
Provider name: expireTime
Description: Output only. The time at which this Recognizer will be purged.

`gcp_display_name`

Type: STRING
Provider name: displayName
Description: User-settable, human-readable name for the Recognizer. Must be 63 characters or less.

`kms_key_name`

Type: STRING
Provider name: kmsKeyName
Description: Output only. The KMS key name with which the Recognizer is encrypted. The expected format is projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{crypto_key}.

`kms_key_version_name`

Type: STRING
Provider name: kmsKeyVersionName
Description: Output only. The KMS key version name with which the Recognizer is encrypted. The expected format is projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{crypto_key}/cryptoKeyVersions/{crypto_key_version}.

`labels`

Type: UNORDERED_LIST_STRING

`language_codes`

Type: UNORDERED_LIST_STRING
Provider name: languageCodes
Description: Optional. This field is now deprecated. Prefer the language_codes field in the RecognitionConfig message. The language of the supplied audio as a BCP-47 language tag. Supported languages for each model are listed in the Table of Supported Models. If additional languages are provided, recognition result will contain recognition in the most likely language detected. The recognition result will include the language tag of the language detected in the audio. When you create or update a Recognizer, these values are stored in normalized BCP-47 form. For example, “en-us” is stored as “en-US”.

`model`

Type: STRING
Provider name: model
Description: Optional. This field is now deprecated. Prefer the model field in the RecognitionConfig message. Which model to use for recognition requests. Select the model best suited to your domain to get best results. Guidance for choosing which model to use can be found in the Transcription Models Documentation and the models supported in each region can be found in the Table Of Supported Models.

`name`

Type: STRING
Provider name: name
Description: Output only. Identifier. The resource name of the Recognizer. Format: projects/{project}/locations/{location}/recognizers/{recognizer}.
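
The resource name format above can be split into its components with a regular expression. The following is a minimal sketch; `parse_recognizer_name` and the sample name are hypothetical, and the pattern only assumes that each component is a non-empty segment without slashes.

```python
import re

RECOGNIZER_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/recognizers/(?P<recognizer>[^/]+)$"
)


def parse_recognizer_name(name):
    """Split a Recognizer resource name into project, location, and recognizer."""
    match = RECOGNIZER_NAME.match(name)
    if match is None:
        raise ValueError(f"not a Recognizer resource name: {name}")
    return match.groupdict()


parts = parse_recognizer_name(
    "projects/my-project/locations/global/recognizers/my-recognizer"
)
# parts == {"project": "my-project", "location": "global", "recognizer": "my-recognizer"}
```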

`organization_id`

Type: STRING

`parent`

Type: STRING

`project_id`

Type: STRING

`project_number`

Type: STRING

`reconciling`

Type: BOOLEAN
Provider name: reconciling
Description: Output only. Whether or not this Recognizer is in the process of being updated.

`region_id`

Type: STRING

`resource_name`

Type: STRING

`state`

Type: STRING
Provider name: state
Description: Output only. The Recognizer lifecycle state.
Possible values:
- STATE_UNSPECIFIED - The default value. This value is used if the state is omitted.
- ACTIVE - The Recognizer is active and ready for use.
- DELETED - This Recognizer has been deleted.

`tags`

Type: UNORDERED_LIST_STRING

`uid`

Type: STRING
Provider name: uid
Description: Output only. System-assigned unique identifier for the Recognizer.

`update_time`

Type: TIMESTAMP
Provider name: updateTime
Description: Output only. The most recent time this Recognizer was modified.

`zone_id`

Type: STRING
