We've come across a challenge using an internal HTTP load balancer (L7) and could use some advice.
We're working on a project in Canada that requires GPUs (which GCP does not offer in Canada). Our Cloud Run project needs to stay in Canada due to speed/latency and data laws; however, this forces us to use VMs (GCE) in the US to get access to GPUs.
We want to use internal HTTP load balancing on our VPC, with Cloud Run as the client in the Canada region and GCE+GPUs as the backend in a US region, but this doesn't seem possible. It appears the only way to load balance across regions is to use a TCP load balancer, which doesn't work as well because it doesn't let us scale on metrics like number of requests or requests/second.
We've considered setting up an Nginx proxy and other types of proxies that would allow us to cross regions, but it would be so much easier to use a GCP-native solution that autoscales.