When attempting to do online prediction using my model that has been deployed on a Vertex endpoint for a Text categorization use case, I am receiving the following error:
`Service Unavailable: 503, Bad Gateway: 502`
From what I can tell, there appears to be a restriction of 2,500 characters on the input text. Please correct me if I'm mistaken about this. Inputs under 2,500 characters return output normally, but anything longer produces the error above. Thank you.
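If such a limit does apply, one client-side workaround is to split long inputs into pieces under the limit and send each piece as its own prediction instance. Here is a minimal sketch in plain Python; the 2,500-character figure and the idea of sending one chunk per instance are assumptions from the observed behavior, not confirmed Vertex AI limits:

```python
def chunk_text(text: str, limit: int = 2500) -> list[str]:
    """Split text into pieces no longer than `limit` characters,
    preferring to break on whitespace so words stay intact.
    The 2500-character default is an assumption, not a documented limit."""
    chunks = []
    while len(text) > limit:
        # Prefer the last whitespace before the limit as a break point.
        cut = text.rfind(" ", 0, limit)
        if cut <= 0:
            cut = limit  # no whitespace found: hard cut at the limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks

# Each chunk could then be sent as a separate instance to endpoint.predict().
```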
@MarkusK and @malamin, can you help @newkanyemerch with his issue?
Thank you for your question. If your requests succeed for inputs under 2,500 characters but fail above that, the cause might be a conflict with the endpoint type; you can also check the Vertex AI log information to find the actual cause. I don't know which endpoint type you're using, so here are the private endpoint limitations:
Note the following limitations for private endpoints:
- Retry a request that fails with status code (0), possibly due to a transient broken connection.
- Retry on 5xx errors that indicate the service might be temporarily unavailable.
- For a 429 error that indicates the system is currently overloaded, consider slowing down traffic to mitigate this issue instead of retrying.
- Calls through `PredictionServiceClient` in the Vertex AI Python client library are not supported.
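The retry guidance above can be sketched as a small wrapper. This is just an illustration, assuming a `send_request` callable that raises an exception carrying an HTTP-style status code; the `PredictionError` class and its `code` attribute are hypothetical, not part of any Google library:

```python
import time


class PredictionError(Exception):
    """Hypothetical error type carrying an HTTP-style status code."""

    def __init__(self, code: int):
        super().__init__(f"status {code}")
        self.code = code


def call_with_retries(send_request, max_attempts: int = 3, base_delay: float = 1.0):
    """Retry on code 0 (transient broken connection) and 5xx (service
    temporarily unavailable). Do NOT retry on 429: the system is overloaded,
    so slowing down traffic beats sending more requests."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send_request()
        except PredictionError as err:
            retryable = err.code == 0 or 500 <= err.code < 600
            if err.code == 429 or not retryable or attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```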
You can use the metrics dashboard to inspect the availability and latency of the traffic sent to a private endpoint.
To customize monitoring, query the following two metrics in Cloud Monitoring:
- The number of prediction responses. You can filter this metric by `deployed_model_id` or HTTP response code.
- The latency of the prediction request in milliseconds. You can filter this metric by `deployed_model_id`, only for successful requests.
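To query those metrics programmatically you need a Cloud Monitoring filter string. A minimal sketch of building one; note that the `aiplatform.googleapis.com/prediction/online/...` metric type names are my assumption from the Cloud Monitoring metrics list, so verify them in Metrics Explorer before relying on them:

```python
def prediction_metric_filter(metric: str, deployed_model_id: str = "") -> str:
    """Build a Cloud Monitoring filter for a Vertex AI prediction metric.
    NOTE: the metric type prefix below is an assumption, not taken from
    this thread -- check it in Metrics Explorer first."""
    parts = [f'metric.type = "aiplatform.googleapis.com/{metric}"']
    if deployed_model_id:
        parts.append(f'metric.labels.deployed_model_id = "{deployed_model_id}"')
    return " AND ".join(parts)


# Example: filter response counts for one deployed model.
f = prediction_metric_filter("prediction/online/response_count", "1234567890")
```

The resulting string can be passed as the `filter` of a Cloud Monitoring time-series query.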
Also, don't forget to check this page:
Online prediction logging
Here is another answer based on the same use case that you can check:
I hope it helps you debug your case further.
Thank you so much @ilias for mentioning me here.
Thank you so much for your replies, @MarkusK and @malamin!