Input Maximum text length is 2500 characters. | C2C Community

  • 19 April 2023
  • 6 replies

When attempting online prediction with my model, which is deployed on a Vertex AI endpoint for a text categorization use case, I receive the following error:

503 Service Unavailable / 502 Bad Gateway

As far as I can tell, there is a 2,500-character limit on the input text; please correct me if I'm mistaken. Inputs shorter than 2,500 characters return predictions normally, but anything longer produces the error above. Thank you.
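
If the limit really is 2,500 characters, one workaround is to split long inputs into chunks under the limit and send one prediction request per chunk. Below is a minimal plain-Python sketch of the chunking step only (the 2,500 figure comes from this thread, and whether per-chunk results can sensibly be combined depends on your model):

```python
def chunk_text(text: str, limit: int = 2500) -> list[str]:
    """Split text into pieces no longer than `limit` characters,
    preferring to break on whitespace so words stay intact."""
    chunks = []
    while len(text) > limit:
        # Break at the last whitespace before the limit, if any.
        cut = text.rfind(" ", 0, limit)
        if cut <= 0:
            cut = limit  # no whitespace found: hard cut mid-word
        chunks.append(text[:cut].rstrip())
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks
```

You would then call the endpoint once per chunk and aggregate the predicted categories (e.g. by majority vote), which again depends on your model's semantics.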

6 replies


Hi @MarkusK and @malamin 

Can you help @newkanyemerch with his issue?


@pbarapatre would you perhaps have a clue? :)


@newkanyemerch: Here is the link to the quotas/limits page on Vertex AI:


Hello @newkanyemerch ,

Thank you for your question. If requests under 2,500 characters work, the problem might be a conflict with the endpoint type; you can also check the Vertex AI logs to find the actual cause. I don't know which endpoint type you're using, but here are the private endpoint limitations:

Note the following limitations for private endpoints:

  • Private endpoints currently do not support traffic splitting. As a workaround, you can create traffic splitting manually by deploying your model to multiple private endpoints, and splitting traffic among the resulting prediction URLs for each private endpoint.
  • Private endpoints don't support SSL/TLS.
  • To enable access logging on a private endpoint, contact
  • You can use only one network for all private endpoints in a Google Cloud project. If you want to change to another network, contact
  • Client-side retries on recoverable errors are highly recommended. These can include the following errors:
    • Empty response (HTTP error code 0), possibly due to a transient broken connection.
    • HTTP error codes 5xx that indicate the service might be temporarily unavailable.
  • For HTTP error code 429, which indicates the system is currently overloaded, consider slowing down your traffic to mitigate the issue instead of retrying.
  • Prediction requests from PredictionServiceClient in the Vertex AI Python client library are not supported.
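
The traffic-splitting workaround above can be sketched as a weighted random choice among the prediction URLs of several private endpoints. The URLs and weights below are placeholders, not real endpoints:

```python
import random

# Hypothetical manual traffic split: 80% / 20% across two private
# endpoints, each serving the same model. Placeholder URLs.
ENDPOINT_WEIGHTS = {
    "https://10.0.0.1/v1/models/demo:predict": 0.8,
    "https://10.0.0.2/v1/models/demo:predict": 0.2,
}

def pick_endpoint(rng=random) -> str:
    """Choose a prediction URL at random, weighted by the split above."""
    urls = list(ENDPOINT_WEIGHTS)
    weights = list(ENDPOINT_WEIGHTS.values())
    return rng.choices(urls, weights=weights, k=1)[0]
```

Each request then goes to `pick_endpoint()`; over many requests the traffic converges to the configured split.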
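The retry guidance in the list above (retry 5xx and empty responses, but not 429) can be sketched like this. `send` is a stand-in for your own HTTP call returning `(status_code, body)`; the function name and backoff values are assumptions, not part of any Vertex AI API:

```python
import time

# Recoverable errors per the limitations above:
# code 0 stands for an empty response (transient broken connection).
RETRYABLE = {0, 500, 502, 503, 504}

def call_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `send()` and retry recoverable errors with exponential
    backoff. 429 (overloaded) is NOT retried: slow traffic instead."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status == 429:
            raise RuntimeError("429: reduce request rate instead of retrying")
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise RuntimeError(f"prediction failed with HTTP {status}")
        sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

This is only a sketch; a production client would also cap total elapsed time and add jitter to the backoff.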

You can use the metrics dashboard to inspect the availability and latency of the traffic sent to a private endpoint.

To customize monitoring, query the following two metrics in Cloud Monitoring:

  • The number of prediction responses. You can filter this metric by deployed_model_id or HTTP response code.
  • The latency of the prediction request, in milliseconds. You can filter this metric by deployed_model_id (successful requests only).

Also, don’t forget to check the URL:

Here is another answer based on the same use case that you can check:

I hope this helps you debug your case further:


Thank you so much, @ilias, for mentioning me here.


Thank you so much for your replies @MarkusK and @malamin !


@newkanyemerch, were MarkusK’s and malamin’s answers perhaps helpful to you? :)