
The OpenAI-compatible API is missing important parameters; for example, I don't think there is a way to disable Flash 2 thinking with it.

Vertex AI is for gRPC, service auth, and region control (amongst other things): ensuring data remains in a specific region, allowing you to auth with the instance service account, and slightly better latency and TTFT (time to first token).
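To make the "region control" point concrete: Vertex AI exposes per-region endpoints of the form `{region}-aiplatform.googleapis.com`, so both the request and the data processing stay in that region. A minimal sketch (the helper name and model id are illustrative, not from any SDK):

```python
# Illustrative helper: build the regional Vertex AI generateContent URL.
# Requests to "{region}-aiplatform.googleapis.com" are served in that region,
# unlike the global Gemini API endpoint.
def vertex_endpoint(project: str, region: str, model: str) -> str:
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{region}/publishers/google/models/{model}:generateContent"
    )

url = vertex_endpoint("my-project", "europe-west4", "gemini-2.0-flash")
```

Picking a region close to your service is where the latency/TTFT benefit comes from.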

I find Google's service auth SO hard to figure out. I've been meaning to solve deploying to Cloud Run via service account for several years now, but it just doesn't fit in my brain well enough for me to make the switch.

simonw, 'Google's service auth SO hard to figure out' - absolutely hear you. We're taking this feedback on auth complexity seriously. We have a new Vertex express mode in Preview (https://cloud.google.com/vertex-ai/generative-ai/docs/start/... , not ready for primetime yet!) that you can sign up for to get a free tier and an API key right away. We are improving the experience; again, if you would like to give feedback, please DM me at @chrischo_pm on X.

If you're on Cloud Run it should just work automatically.

For deploying, on GitHub I just use a special service account for CI/CD and put the JSON payload in an environment secret, like an API key. The only extra thing is that you need to copy it to the filesystem for some things to work, usually a file named google_application_credentials.json.

If you use Cloud Build you shouldn't need to do anything.


You should consider setting up Workload Identity Federation and authenticating to Google Cloud using your GitHub runner's OIDC token. Google Cloud will "trust" the token and allow you to impersonate service accounts. No static keys!

Does not work for many Google services, including Firebase.

Yes it does. We deploy Firebase and a bunch of other GCP things from GitHub Actions, and there are zero API keys or JSON credentials anywhere.

Everything is service accounts and workload identity federation, with restrictions such as only letting the main branch in a specific repo use it (so no problem with unreviewed PRs getting production access).

Edit: if you have a specific error or issue where this doesn't work for you, and can share the code, I can have a look.


No thank you, there is zero benefit to migrating and no risk in using credentials the way I do.

How do you sign a Firebase custom auth token with workload identity federation? How about a pre-signed storage URL? Off the top of my head, I think those were two things that didn't work.


First, regarding "zero benefit" and "no risk". I disagree. The risk and benefit might be low, and not worth the change for you. But it is absolutely not zero.

You have a JSON key file, and you can't know how many people have it. The person who created the key, downloaded it and then stored it as a GitHub secret - did they download it to /dev/shm? Did some npm/brew install script steal it from their Downloads folder? Any of the GitHub repo owners can get hold of it. Depending on whether you use GitHub environments/deployments and have set them up properly, so can anyone with write access to the repo. Do you pin all your dependencies, reusable workflows etc., or can a compromise of someone else's repo steal your secrets?

With the workload identity auth, there is no key. Each access obtains a short lived token. Only workflows on main branch can get it. Every run will have audit logs, and so will every action taken by that token. Risk of compromise is much lower, but even more importantly, if compromised I'll be able to know exactly when and how, and what malicious actions were taken.

Maybe this is paranoid to you and not worth it. That's fine. But it's not "no risk", and it is worth it to me to protect the personal data of our users.

---

As for your question, first step is just to run https://github.com/google-github-actions/auth with identity provider configured in your GCP project, restricted to your github repo or org.

This will create application default credentials that most GCP tools and libraries will just work with, as if you were running things locally after "gcloud auth login".

For firebase token you can just run a python script as subsequent step in the github job doing something like https://firebase.google.com/docs/auth/admin/create-custom-to.... For signed storage url this can be done with the gcloud tool: https://cloud.google.com/storage/docs/access-control/signing...

In both cases after running the "google-github-actions/auth" step it will just work with the short-lived credentials that step generated.


You could post on Reddit asking for help and someone is likely to provide answers, an explanation, probably even some code or bash commands to illustrate.

And even if you don't ask, there are many examples. But I feel ya. The right example to fit your need is hard to find.


GCP auth is terrible in general. This is something AWS did well.

I don't get that. How?

- There are principals (users, service accounts).

- Each one needs to authenticate in some way. There are options here: SAML or OIDC or Google Sign-In for users; other options for service accounts.

- Permissions guard the things you can do in Google Cloud.

- There are built-in roles that wrap up sets of permissions.

- You can create your own custom roles.

- Attach roles to principals to give them parcels of permissions.


yeah bro just one more principal bro authenticate each one with SAML or OIDC or Google Signin bro set the permissions for each one make sure your service account has permissions aiplatform.models.get and aiplatform.models.list bro or make a custom role and attach the role to the principal to parcel the permission

It's not complicated in the context of huge enterprise applications, but for most people trying to use Google's LLMs, it's much more confusing than using an API key. The parent commenter is probably used to an AWS secret key.

And FWIW this is basically what google encourages you to do with firebase (with the admin service account credential as a secret key).


GCP auth is actually one of the things it does way better than AWS. It's just that the entire industry has been trained on AWS's bad practices...

From the linked docs:

> If you want to disable thinking, you can set the reasoning effort to "none".

For other APIs, you can set the thinking tokens to 0 and that also works.
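Sketches of both options mentioned in the thread, as raw request bodies (the model ids are illustrative; field names follow the public docs but verify them against the current reference):

```python
# Two ways to disable Gemini Flash "thinking", per the thread above.

def openai_compat_body(prompt: str) -> dict:
    """OpenAI-compatible endpoint: set reasoning_effort to "none"."""
    return {
        "model": "google/gemini-2.0-flash",  # illustrative model id
        "reasoning_effort": "none",
        "messages": [{"role": "user", "content": prompt}],
    }

def genai_body(prompt: str) -> dict:
    """Native generateContent API: set the thinking token budget to 0."""
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {"thinkingConfig": {"thinkingBudget": 0}},
    }
```

Either body is then POSTed to the corresponding endpoint; only the shape of the request differs.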


Wow thanks I did not know

We built the OpenAI-compatible API (https://cloud.google.com/vertex-ai/generative-ai/docs/multim...) layer to help customers that are already using the OAI library test out Gemini easily with basic inference, but not as a replacement for the genai SDK (https://github.com/googleapis/python-genai). We recommend using the genai SDK for working with Gemini.

So, to be clear, Google only supports Python as a language for accessing your models? Nothing else?

We have Python/Go in GA.

Java and JS are in preview (not ready for production) and will be GA soon!


What about providing an actual API people can call without needing to rely on Google SDKs?

When I used the OpenAI-compatible stuff, my API calls just didn't work at all. I switched back to direct HTTP calls, which seems to be the only thing that works…

We support reasoning_effort = none. That will let you disable Flash 2 thinking. We will document it better.

yeah, 2 days to get the Google OAuth flow integrated into a background app/script, 1 day coding for the actual app ...

Is this Vertex AI related or in general? I find Google's OAuth flow to be extremely well documented and easy to set up…

I got Claude to write me an auth layer using only Python http.client and cryptography. One shot, no problem; now I can get a token from the service key any time, I just have to track expiration. Annoying that they don't follow the industry standard, though.
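For anyone curious what that flow looks like: Google's service-account auth builds an RS256-signed JWT assertion from the key file's `client_email` and exchanges it at the token endpoint for a short-lived access token. A sketch of the assertion-building half (the signer is injected so this stays self-contained; in practice it would be RS256 via the `cryptography` package using the key file's `private_key`):

```python
# Sketch of Google's JWT-bearer service-account flow: build the assertion,
# sign it (RS256), POST it to the token endpoint for a ~1h access token.
import base64, json, time

def b64url(data: bytes) -> bytes:
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def build_assertion(key: dict, scope: str, sign) -> str:
    """key: parsed service-account JSON; sign: callable bytes -> RS256 signature."""
    now = int(time.time())
    header = b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
    claims = b64url(json.dumps({
        "iss": key["client_email"],
        "scope": scope,
        "aud": "https://oauth2.googleapis.com/token",
        "iat": now,
        "exp": now + 3600,  # this is the expiration the commenter has to track
    }).encode())
    signing_input = header + b"." + claims
    return (signing_input + b"." + b64url(sign(signing_input))).decode()

# The assertion is then POSTed form-encoded to the aud URL with
# grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer.
```

The token endpoint returns `{"access_token": ..., "expires_in": 3600}`, which is why expiration tracking falls on the caller.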

should have used ai to write the integrations...

That's with AI.

There are so many variations out there that the AI gets majorly confused. As a matter of fact, the Google OAuth part is the one thing Gemini 2.5 Pro can't code.

It should be its own benchmark.


Maybe you should just read the docs and use the examples there. I have used all kinds of GCP services for many years, and auth is not remotely complicated IMO.

JSON Schema support on Google's OpenAI-compatible API is very lackluster and limiting. My biggest gripe, really.

Yeah, we are looking into it.

Thank you! Adding support for `additionalProperties`[0] (and perhaps `patternProperties` too) would be particularly great!

Happy to provide test cases as well if helpful.

0: https://datatracker.ietf.org/doc/html/draft-fge-json-schema-...



