Google Vertex AI Setup
LlamaCloud supports Google Vertex AI for enterprise-grade access to Googleâs AI models through Google Cloud Platform. Vertex AI provides advanced features like private endpoints, enhanced security, and deep GCP integration compared to the direct Gemini API.
Prerequisites
Section titled âPrerequisitesâ- A valid Google Cloud Platform account
- Google Cloud project with Vertex AI API enabled
- Service account with appropriate IAM permissions
- Access and quota for supported models:
- Gemini 1.5 Pro
- Gemini 1.5 Flash
- Gemini 2.0 Flash
Environment Variables
Section titled âEnvironment VariablesâThe Google Vertex AI integration uses the following environment variables:
GOOGLE_VERTEX_AI_ENABLED
- Set to âtrueâ to enable Google Vertex AI (required)GOOGLE_VERTEX_AI_PROJECT_ID
- Google Cloud project ID (required)GOOGLE_VERTEX_AI_LOCATION
- Google Cloud location/region (optional, defaults tous-central1
)GOOGLE_VERTEX_AI_CREDENTIALS_JSON
- Service account credentials JSON string (required)
Configuration
Section titled âConfigurationâFollow these steps to configure Google Vertex AI integration:
Step 1: Setup Google Cloud Project
Section titled âStep 1: Setup Google Cloud Projectâ- Create or use an existing Google Cloud project
- Enable the Vertex AI API:
Terminal window gcloud services enable aiplatform.googleapis.com - Ensure billing is enabled for the project
Step 2: Create Service Account
Section titled âStep 2: Create Service AccountâCreate a service account with the required IAM permissions:
Required IAM Permissions
Section titled âRequired IAM PermissionsâYour service account needs the following permissions:
aiplatform.endpoints.predict
aiplatform.endpoints.explain
ml.models.predict
(for legacy model versions)
Option 1: Use Predefined Role
Section titled âOption 1: Use Predefined Roleâgcloud iam service-accounts create llamacloud-vertex \ --description="Service account for LlamaCloud Vertex AI" \ --display-name="LlamaCloud Vertex AI"
gcloud projects add-iam-policy-binding PROJECT_ID \ --member="serviceAccount:llamacloud-vertex@PROJECT_ID.iam.gserviceaccount.com" \ --role="roles/aiplatform.user"
Option 2: Custom Role
Section titled âOption 2: Custom RoleâCreate a custom role with minimal permissions:
{ "title": "LlamaCloud Vertex AI User", "description": "Custom role for LlamaCloud Vertex AI access", "stage": "GA", "includedPermissions": [ "aiplatform.endpoints.predict", "aiplatform.endpoints.explain", "ml.models.predict" ]}
Step 3: Generate Service Account Key
Section titled âStep 3: Generate Service Account KeyâGenerate a JSON key for the service account:
gcloud iam service-accounts keys create llamacloud-vertex-key.json \ --iam-account=llamacloud-vertex@PROJECT_ID.iam.gserviceaccount.com
Step 4: Create Kubernetes Secret
Section titled âStep 4: Create Kubernetes SecretâCreate a secret with your Vertex AI credentials:
apiVersion: v1kind: Secretmetadata: name: vertex-ai-credentialstype: OpaquestringData: GOOGLE_VERTEX_AI_ENABLED: "true" GOOGLE_VERTEX_AI_PROJECT_ID: "your-project-id" GOOGLE_VERTEX_AI_LOCATION: "us-central1" GOOGLE_VERTEX_AI_CREDENTIALS_JSON: | { "type": "service_account", "project_id": "your-project-id", "private_key_id": "key-id", "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n", "client_email": "llamacloud-vertex@your-project-id.iam.gserviceaccount.com", "client_id": "client-id", "auth_uri": "https://accounts.google.com/o/oauth2/auth", "token_uri": "https://oauth2.googleapis.com/token", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/llamacloud-vertex%40your-project-id.iam.gserviceaccount.com" }
Apply the secret:
kubectl apply -f vertex-ai-secret.yaml
Step 5: Configure Helm Values
Section titled âStep 5: Configure Helm ValuesâReference the secret in your Helm configuration:
# External Secret (recommended)llamaParse: config: googleVertexAi: enabled: true existingSecret: "vertex-ai-credentials"
# or direct configuration (not recommended for production)llamaParse: config: googleVertexAi: enabled: true projectId: "your-project-id" location: "us-central1" credentialsJson: | { "type": "service_account", ... }
Verification
Section titled âVerificationâAfter configuration, verify your Google Vertex AI integration:
-
Verify in Admin UI: Check available Google models in LlamaCloud admin interface
-
Test functionality: Upload a document to confirm Vertex AI models are working
Troubleshooting
Section titled âTroubleshootingâCommon Issues
Section titled âCommon IssuesâService Account Permissions
Section titled âService Account PermissionsâError: Permission denied
Solution:
- Verify your service account has the required IAM permissions
- Ensure
aiplatform.endpoints.predict
permission is granted - Check that the service account is active and not disabled
Project Configuration
Section titled âProject ConfigurationâError: Project not found or Vertex AI API not enabled
Solution:
- Ensure Vertex AI API is enabled in your Google Cloud project
- Verify the project ID is correct in your configuration
- Check that billing is enabled for the project
Credentials Issues
Section titled âCredentials IssuesâError: Could not load credentials
Solution:
- Verify the service account JSON is properly formatted
- Ensure all required fields are present in the credentials JSON
- Check that the private key is properly escaped in the secret
- Verify the service account key hasnât been deleted or expired
Region Restrictions
Section titled âRegion RestrictionsâError: Model not available in region
Solution:
- Check model availability in your specified region
- Try a different region (e.g.,
us-central1
,us-east1
) - Some models may not be available in all regions
Debug Steps
Section titled âDebug Stepsâ-
Test Vertex AI authentication:
Terminal window gcloud auth activate-service-account --key-file=vertex-key.jsongcloud ai models list --region=us-central1 -
Verify secret mounting:
Terminal window kubectl get secret vertex-ai-credentials -o yamlkubectl describe pod <pod-name> | grep -A 20 Environment -
Check network connectivity: Ensure your cluster can reach
aiplatform.googleapis.com
-
Test API call:
Terminal window curl -X POST \-H "Authorization: Bearer $(gcloud auth print-access-token)" \-H "Content-Type: application/json" \"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:predict" \-d '{"instances": [{"content": "Hello"}]}'