Andy Simon's Blog

asimon@blog:~/2025-12-31-ga4-data-api-integration/$ _

Server-Side Analytics with GA4 Data API

by asimon
analytics · aws · infrastructure · nextjs

How to fetch Google Analytics data at build time for static sites, with a hybrid Lambda refresh pattern


title: "Server-Side Analytics with GA4 Data API" summary: "How to fetch Google Analytics data at build time for static sites, with a hybrid Lambda refresh pattern" date: "2025-12-31" tags: ["analytics", "aws", "infrastructure", "nextjs"] topics: ["analytics", "infrastructure", "static-sites"] prerequisites: ["2025-12-28-architecture-of-a-modern-static-blog"] related: ["2025-01-01-github-actions-cicd"] author: "asimon" published: true

Server-Side Analytics with GA4 Data API

Most analytics implementations are client-side: drop a script tag, watch pageviews roll in. But what if you want to display those view counts on your site? For a static blog, that creates an interesting problem.

This post covers how I integrated Google Analytics 4's Data API to fetch view counts at build time, with a Lambda function that keeps the data fresh between deployments.

Why Server-Side Analytics?

Client-side analytics work great for collecting data. But displaying that data requires an API call, which means:

  1. Loading state - Users see a spinner or placeholder while counts load
  2. Layout shift - Numbers pop in after the page renders
  3. Rate limits - Every visitor triggers an API call
  4. Credential exposure - You're either exposing API credentials or proxying every request through a backend

For a static site, there's a better approach: fetch the data once at build time and embed it directly in the HTML.

💡

The Static Advantage

Build-time data means zero API calls per pageview. View counts are just numbers in the HTML - no loading states, no rate limits, no client-side JavaScript required.

The GA4 Data API

Google Analytics 4 provides a server-side API for querying your analytics data. Unlike the older Universal Analytics API, GA4's Data API is designed for programmatic access with a clean, modern interface.

Service Account Setup

First, you need a Google Cloud service account with access to your GA4 property:

  1. Create a service account in Google Cloud Console
  2. Download the JSON key file
  3. Add the service account email as a viewer on your GA4 property
  4. Base64 encode the JSON key for secure storage
# Encode the service account key
base64 -i service-account.json | tr -d '\n'

Store this encoded string in your CI/CD secrets or AWS Parameter Store.
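
If you go the Parameter Store route, a minimal sketch of pushing the encoded key there with the AWS SDK might look like this (the parameter name mirrors the one the Lambda reads later in this post; the region and naming are assumptions):

// One-time setup script (sketch): store the base64-encoded key as a SecureString
import { readFileSync } from "node:fs";
import { SSMClient, PutParameterCommand } from "@aws-sdk/client-ssm";

const ssm = new SSMClient({ region: "us-east-2" });

async function storeServiceAccountKey(keyPath: string) {
  // Equivalent to the base64 shell one-liner above
  const encoded = readFileSync(keyPath).toString("base64");

  await ssm.send(new PutParameterCommand({
    Name: "/asimon-blog/prod/ga4-service-account",
    Value: encoded,
    Type: "SecureString", // encrypted at rest
    Overwrite: true,
  }));
}

storeServiceAccountKey("service-account.json").catch(console.error);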

Querying View Counts

The actual query is straightforward. We want pageviews grouped by path:

import { google } from "googleapis";
import { GoogleAuth } from "google-auth-library";

async function getAnalyticsClient() {
  const credentials = JSON.parse(
    Buffer.from(process.env.GA4_SERVICE_ACCOUNT!, "base64").toString()
  );

  const auth = new GoogleAuth({
    credentials,
    scopes: ["https://www.googleapis.com/auth/analytics.readonly"],
  });

  return google.analyticsdata({ version: "v1beta", auth });
}

async function fetchViewCounts(postSlugs: string[]) {
  const analytics = await getAnalyticsClient();
  const propertyId = process.env.GA4_PROPERTY_ID;

  const response = await analytics.properties.runReport({
    property: `properties/${propertyId}`,
    requestBody: {
      dateRanges: [{ startDate: "2020-01-01", endDate: "today" }],
      dimensions: [{ name: "pagePath" }],
      metrics: [{ name: "screenPageViews" }],
      dimensionFilter: {
        andGroup: {
          expressions: [{
            filter: {
              fieldName: "pagePath",
              inListFilter: {
                values: postSlugs.map(slug => `/${slug}`),
              },
            },
          }],
        },
      },
    },
  });

  // Parse response into { slug: viewCount } map
  const viewCounts: Record<string, number> = {};
  response.data.rows?.forEach(row => {
    const path = row.dimensionValues?.[0]?.value;
    const views = parseInt(row.metricValues?.[0]?.value || "0", 10);
    if (path?.startsWith("/")) {
      viewCounts[path.substring(1)] = views;
    }
  });

  return viewCounts;
}

Key points:

  • inListFilter lets us query multiple pages in a single request
  • screenPageViews is the GA4 metric (not pageviews from Universal Analytics)
  • We strip the leading slash from paths to match our post slugs
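
For reference, a hedged usage sketch (the slugs mirror posts referenced in this blog; the numbers are purely illustrative):

// Example call showing the shape of the result
const counts = await fetchViewCounts([
  "2025-12-28-architecture-of-a-modern-static-blog",
  "2025-01-01-github-actions-cicd",
]);
// counts => { "2025-12-28-architecture-of-a-modern-static-blog": 1432, ... } (illustrative)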

Build-Time Integration

The build script runs before Next.js generates static pages:

// scripts/generate-view-counts.mjs
import fs from "fs";
import path from "path";
// GA4 helper shown earlier in this post; the module path here is illustrative
import { fetchViewCountsFromGA } from "./lib/ga4.mjs";

async function main() {
  // Read post slugs from content directory
  const contentDir = path.join(process.cwd(), "src/content");
  const postSlugs = fs.readdirSync(contentDir)
    .filter(file => file.endsWith(".mdx"))
    .map(file => file.replace(/\.mdx$/, ""));

  // Fetch from GA4
  const viewCounts = await fetchViewCountsFromGA(postSlugs);

  // Write to JSON file
  const outputPath = path.join(process.cwd(), "src/data/view-counts.json");
  fs.mkdirSync(path.dirname(outputPath), { recursive: true });
  fs.writeFileSync(outputPath, JSON.stringify({
    viewCounts,
    generated: new Date().toISOString(),
    source: "Google Analytics 4 Data API",
  }, null, 2));
}

main().catch(err => {
  console.error("Failed to generate view counts:", err);
  process.exit(1);
});

The package.json runs this before the build:

{
  "scripts": {
    "prebuild": "node ./scripts/generate-view-counts.mjs",
    "build": "next build"
  }
}

Now src/data/view-counts.json contains fresh data every time we deploy.
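
Components can then read that file at build time. A minimal sketch of a helper for doing so (the module path and the `@/` alias are assumptions, not this blog's actual code):

// src/lib/view-counts.ts (hypothetical helper)
import viewCountData from "@/data/view-counts.json";

const counts = viewCountData.viewCounts as Record<string, number>;

export function getBuildTimeViews(slug: string): number {
  // Posts with no recorded views yet fall back to zero
  return counts[slug] ?? 0;
}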

The Staleness Problem

Build-time data has one obvious limitation: it's only as fresh as your last deployment. If you deploy weekly, view counts could be a week old.

Options to address this:

| Approach | Freshness | Complexity | Cost |
|----------|-----------|------------|------|
| Deploy more often | Hours | Low | CI minutes |
| Client-side fetch | Real-time | Medium | API calls |
| Lambda refresh | Minutes | Medium | ~$0.50/month |

I chose the Lambda approach because it keeps the static site benefits while providing near-real-time data.

Hybrid Architecture: Lambda + S3

The solution is a small scheduled pipeline:

  1. EventBridge invokes a Lambda function every 5 minutes
  2. The Lambda fetches current view counts from GA4 and writes one JSON file per post to S3
  3. CloudFront serves those files with a 5-minute TTL
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ EventBridge │────▢│   Lambda    │────▢│     S3      β”‚
β”‚  (5 min)    β”‚     β”‚  ga4-sync   β”‚     β”‚  /api/views β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                               β”‚
                                               β–Ό
                                        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                                        β”‚ CloudFront  β”‚
                                        β”‚  (5m TTL)   β”‚
                                        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

The Lambda Function

// src/functions/ga4-sync.ts
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { SSMClient, GetParameterCommand } from "@aws-sdk/client-ssm";

const s3 = new S3Client({ region: "us-east-2" });
const ssm = new SSMClient({ region: "us-east-2" });

export async function handler() {
  const bucket = process.env.VIEW_COUNT_BUCKET;
  const postSlugs = await resolvePostSlugs();

  // Load GA4 credentials from Parameter Store
  const ga4PropertyId = await getParameter("/asimon-blog/prod/ga4-property-id");
  const ga4ServiceAccount = await getParameter(
    "/asimon-blog/prod/ga4-service-account",
    true // decrypt
  );

  // Fetch current counts
  const counts = await fetchViewCountsFromGA(postSlugs, {
    env: { GA4_PROPERTY_ID: ga4PropertyId, GA4_SERVICE_ACCOUNT: ga4ServiceAccount }
  });

  // Write individual JSON files
  const timestamp = new Date().toISOString();
  await Promise.all(
    Object.entries(counts).map(([slug, views]) =>
      s3.send(new PutObjectCommand({
        Bucket: bucket,
        Key: `api/views/${slug}.json`,
        Body: JSON.stringify({ views, updated: timestamp }),
        ContentType: "application/json",
        CacheControl: "public, max-age=300",
      }))
    )
  );
}
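
The handler leans on two helpers the post doesn't show: resolvePostSlugs (which just lists the deployed post slugs) and getParameter. A minimal sketch of the latter, continuing in the same file and reusing the ssm client declared above:

// Thin wrapper over SSM; `decrypt` is needed for SecureString values
async function getParameter(name: string, decrypt = false): Promise<string> {
  const result = await ssm.send(new GetParameterCommand({
    Name: name,
    WithDecryption: decrypt,
  }));
  const value = result.Parameter?.Value;
  if (!value) throw new Error(`Missing SSM parameter: ${name}`);
  return value;
}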

Client-Side Hook

The React hook fetches from S3 (via CloudFront) and falls back to build-time data:

// src/hooks/useViewCount.ts
import { useState, useEffect } from "react";

export function useViewCount(slug: string, buildTimeCount: number) {
  const [count, setCount] = useState(buildTimeCount);

  useEffect(() => {
    fetch(`/api/views/${slug}.json`)
      .then(res => res.ok ? res.json() : null)
      .then(data => {
        if (data?.views !== undefined) {
          setCount(data.views);
        }
      })
      .catch(() => {
        // Fall back to build-time count (already set)
      });
  }, [slug]);

  return count;
}
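
Using the hook in a component is then a one-liner; a quick sketch with made-up component and prop names:

// Hypothetical post header component
import { useViewCount } from "@/hooks/useViewCount";

export function ViewCounter({ slug, buildTimeCount }: { slug: string; buildTimeCount: number }) {
  const views = useViewCount(slug, buildTimeCount);
  return <span>{views.toLocaleString()} views</span>;
}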

Cost Analysis

Let's break down the monthly cost. A 5-minute schedule works out to 12 runs/hour × 24 hours × 30 days ≈ 8,640 invocations per month:

| Component | Calculation | Monthly Cost |
|-----------|-------------|--------------|
| Lambda invocations | 8,640 (every 5 min) | $0.00 (free tier) |
| Lambda duration | ~500ms × 8,640 | $0.01 |
| S3 writes | 8,640 × 8 posts ≈ 69,000 PUTs | ~$0.35 |
| S3 storage | ~1KB × 8 files | $0.00 |
| CloudFront | Included in existing distribution | $0.00 |
| GA4 API | Free tier | $0.00 |

Total: roughly $0.35 to $0.40/month, still comfortably under the ~$0.50/month estimate above.

Compare this to alternatives:

  • DynamoDB + API Gateway: $5-10/month minimum
  • Client-side API calls: Risk of rate limiting, no caching

Graceful Degradation

The implementation handles failures at every level:

  1. GA4 API fails β†’ Lambda logs error, keeps previous S3 files
  2. Lambda fails β†’ CloudFront serves cached files (5-min TTL)
  3. S3 fetch fails β†’ Client uses build-time data
  4. Build-time fetch fails β†’ Configurable: fail build or use mock data
// Fail-safe mock data for development
import { createHash } from "node:crypto";

export function generateMockViewCounts(slugs: string[]) {
  return Object.fromEntries(
    slugs.map(slug => {
      // Deterministic hash so counts don't jump around between builds
      const hash = createHash("sha256").update(slug).digest("hex");
      const views = 100 + (parseInt(hash.slice(0, 8), 16) % 5000);
      return [slug, views];
    })
  );
}

Security Considerations

A few security notes:

  • Service account key is stored encrypted in AWS Parameter Store
  • GA4 property ID is not sensitive but still kept in Parameter Store for consistency
  • S3 bucket is not public; CloudFront uses Origin Access Control
  • Lambda role has least-privilege access to S3 and SSM

Never commit the service account JSON file to version control. The base64-encoded version should only exist in your CI/CD secrets or cloud parameter store.

Summary

This hybrid approach gives you the best of both worlds:

  • Static-first: Build-time data embedded in HTML
  • Fresh data: Lambda updates every 5 minutes
  • Graceful fallback: Multiple layers of degradation
  • Low cost: Under $0.50/month for the entire stack

The pattern works for any data you'd normally fetch from an API: social counts, star counts, weather data, stock prices. Build-time for the baseline, Lambda for freshness.

Next up: Automated Deployment with GitHub Actions covers the CI/CD pipeline that makes this all work together.