Technical SEO for a Next.js Blog (App Router)

How I built automatic technical SEO into a Next.js App Router blog: generated metadata, JSON-LD structured data, a scalable sitemap, robots.txt as code, RSS discovery, OG image fallbacks, and canonical URLs.

Introduction

Most "SEO for developers" articles stop at "add a title tag and a meta description." That advice is true and almost useless, because it's the part everyone already does. The interesting work is everything underneath: structured data that makes pages eligible for rich results, a sitemap that doesn't fall apart at a hundred posts, canonical URLs that stop duplicate URLs from splitting ranking signals, OpenGraph images that don't render blank when a post has no cover, and doing all of it automatically so you never hand-write SEO metadata again.

This post is a walkthrough of how I built technical SEO into this blog — a Next.js App Router site backed by a database. Everything here is running in production on the page you're reading right now. The throughline is one principle: generate SEO from content, don't author it by hand. Let me show you what that looks like in practice.

The Core Idea: Generate SEO, Don't Author It

Early on I made a decision that shaped everything else: there are no SEO fields in my CMS. No "meta title," no "meta description," no "social share image" inputs for me to forget to fill in. Instead, every piece of SEO metadata is derived from the content a post already has — its title, description, cover image, category, and slug.

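import type { Metadata } from 'next'

// One generator turns a post into a complete metadata object.
// Post, generateCanonicalUrl, and generateKeywords are this site's own type and helpers.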
export function generatePostMetadata(post: Post): Metadata {
  const canonical = generateCanonicalUrl(`/blog/${post.slug}`)
  const ogImage = post.coverImageUrl ?? '/images/avatar.png'

  return {
    title: post.title,
    description: post.description,
    keywords: generateKeywords(post.category?.name),
    alternates: { canonical },
    openGraph: {
      title: post.title,
      description: post.description,
      url: canonical,
      type: 'article',
      images: [{ url: ogImage, width: 1200, height: 630 }],
    },
    twitter: {
      card: 'summary_large_image',
      title: post.title,
      description: post.description,
      images: [ogImage],
    },
  }
}

The payoff: SEO can never drift out of sync with content, because it is the content. Publish a post and its metadata, OG tags, canonical URL, and structured data all exist correctly the moment it goes live — zero manual steps, zero chance of a forgotten field.

Metadata with the App Router's `generateMetadata`

The App Router gives you an async generateMetadata export per route. Because it's async, you can fetch the post and build its metadata server-side, before the page renders:

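// On Next.js 15 and later, `params` is a Promise: type it as Promise<{ slug: string }> and await it.
export async function generateMetadata({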
  params,
}: {
  params: { slug: string }
}): Promise<Metadata> {
  const post = await fetchPostBySlug(params.slug)
  if (!post) return generateFallbackMetadata() // never return nothing
  return generatePostMetadata(post)
}

Two foundations make the per-page metadata correct:

metadataBase — set once in the root layout. It turns every relative URL (OG images, canonicals) into an absolute one, which social crawlers and search engines require:

export const metadata: Metadata = {
  metadataBase: new URL('https://tienng21.com'),
  // ...site defaults
}

Canonical URLs — every page declares its canonical address via alternates.canonical. This is the single most underrated technical-SEO control: it tells search engines which URL is authoritative, collapsing duplicates (trailing slashes, query params, pagination) into one ranking signal instead of splitting it.
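The `generateCanonicalUrl` helper used throughout this post isn't shown. A minimal sketch of what such a helper might do, assuming a hard-coded `SITE_URL` constant and an assumed policy of dropping query strings, fragments, and trailing slashes (the real helper on this site may behave differently):

```typescript
// Hypothetical sketch: the real helper on this site may differ.
const SITE_URL = 'https://tienng21.com' // assumed to match metadataBase

function generateCanonicalUrl(path: string): string {
  // Resolve the path against the site origin so the result is always absolute.
  const url = new URL(path, SITE_URL)

  // Assumed policy: query strings and fragments never change which page this is.
  url.search = ''
  url.hash = ''

  // Assumed policy: no trailing slash, except on the root path.
  if (url.pathname.length > 1) {
    url.pathname = url.pathname.replace(/\/+$/, '')
  }

  return url.toString()
}
```

Stripping every query parameter is a deliberate simplification: if `?page=2` should be indexed as its own page, an allow-list of meaningful parameters would be the safer policy.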

OpenGraph Images (and the Fallback That Saves You)

When someone shares your link on social media, the OG image is the post as far as the feed is concerned. A blank preview kills click-through. The trap: not every post has a cover image, and a missing og:image renders an empty card.

The fix is a one-line fallback that guarantees there's always a valid image:

const ogImage = post.coverImageUrl ?? '/images/avatar.png'

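Cover image when it exists, author avatar when it doesn't, so every share has an image to render. One caveat: the metadata declares 1200×630 for whichever image is used, so the fallback file should actually be exported at that size, or the declared dimensions will not match what crawlers download. Small detail, outsized effect on how your links look in the wild.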
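One more edge case worth sketching: `??` only falls back on `null` or `undefined`, so an empty string stored for the cover image would still produce a blank card. A slightly more defensive version, assuming `coverImageUrl` can arrive empty or whitespace-only (an assumption about the data, not something this site's database is known to do), with `resolveOgImage` and `FALLBACK_OG_IMAGE` as hypothetical names:

```typescript
// Hypothetical helper; FALLBACK_OG_IMAGE mirrors the avatar path used above.
const FALLBACK_OG_IMAGE = '/images/avatar.png'

function resolveOgImage(coverImageUrl?: string | null): string {
  // Treat null, undefined, '' and whitespace-only values as "no cover image".
  const trimmed = coverImageUrl?.trim()
  return trimmed ? trimmed : FALLBACK_OG_IMAGE
}
```

If the database guarantees `null` for missing covers, the original one-liner is enough and this helper is unnecessary.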

Structured Data with JSON-LD

Metadata gets you indexed; structured data gets you rich results — the breadcrumb trails, article cards, and author info that make your listing stand out in search. It's JSON-LD following schema.org vocabulary, injected as a script tag.

I generate four schemas. A BlogPosting for each article:

export function generateArticleSchema(post: Post) {
  return {
    '@context': 'https://schema.org',
    '@type': 'BlogPosting',
    headline: post.title,
    description: post.description,
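    // metadataBase does not touch JSON-LD, so the fallback must already be absolute.
    image: post.coverImageUrl ?? new URL('/images/avatar.png', 'https://tienng21.com').toString(),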
    datePublished: post.publishedAt.toISOString(),
    dateModified: post.updatedAt.toISOString(),
    author: { '@type': 'Person', name: 'Tien Nguyen' },
    mainEntityOfPage: generateCanonicalUrl(`/blog/${post.slug}`),
  }
}

A BreadcrumbList so search shows the navigation path; a WebSite schema with site-wide info; and a Person schema for the author profile. Each is rendered the same way — a script tag in the page:

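<script
  type="application/ld+json"
  /* Escape "<" so text inside the schema cannot close the script tag early */
  dangerouslySetInnerHTML={{
    __html: JSON.stringify(articleSchema).replace(/</g, '\\u003c'),
  }}
/>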
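To illustrate one of the other three schemas, here is a sketch of a BreadcrumbList generator. The `Crumb` type, the function name, and the example post URL are all hypothetical; only the schema.org property names are fixed.

```typescript
// Hypothetical sketch of a BreadcrumbList generator; names are illustrative.
interface Crumb {
  name: string
  url: string // absolute URL of the page this crumb links to
}

function generateBreadcrumbSchema(crumbs: Crumb[]) {
  return {
    '@context': 'https://schema.org',
    '@type': 'BreadcrumbList',
    itemListElement: crumbs.map((crumb, index) => ({
      '@type': 'ListItem',
      position: index + 1, // positions are 1-based
      name: crumb.name,
      item: crumb.url,
    })),
  }
}

// Example trail for a post page (the post slug is made up).
const breadcrumbSchema = generateBreadcrumbSchema([
  { name: 'Home', url: 'https://tienng21.com/' },
  { name: 'Blog', url: 'https://tienng21.com/blog' },
  { name: 'Example Post', url: 'https://tienng21.com/blog/example-post' },
])
```

The resulting object goes through the same script tag as the article schema.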

Crucially, the structured data is generated from the same source as the visible content: same title, same dates, same image. Google's structured data guidelines require markup to describe what is actually on the page, and markup that disagrees with it can lose rich-result eligibility, so deriving both from one source isn't just DRY, it's a correctness guarantee.

A Dynamic Sitemap That Scales Past 100 Posts

Next.js supports a sitemap.ts file convention: export a default function that returns your URLs, typed as MetadataRoute.Sitemap. The naive version fetches "the posts" and maps them, which usually means only the first page of API results. The version that survives growth fetches all posts, paginating through the API so post #101 isn't silently dropped:

import type { MetadataRoute } from 'next'

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const posts = await fetchAllPostsPaginated() // walks every page, not just page 1

  const postUrls = posts.map(post => ({
    url: generateCanonicalUrl(`/blog/${post.slug}`),
    lastModified: post.updatedAt,
    changeFrequency: 'weekly' as const,
    priority: 0.7,
  }))

  const staticUrls = ['', '/blog', '/about', '/project'].map(path => ({
    url: generateCanonicalUrl(path),
    lastModified: new Date(),
    priority: path === '' ? 1 : 0.8,
  }))

  return [...staticUrls, ...postUrls]
}
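`fetchAllPostsPaginated` isn't shown here. A sketch of the idea, assuming a page-based API that returns `items` plus a `totalPages` count. Both the response shape and the injected `fetchPage` parameter are assumptions for illustration; passing the fetcher in simply makes the loop easy to exercise without a network.

```typescript
// Hypothetical response shape; the real API behind this blog may differ.
interface PostSummary {
  slug: string
  updatedAt: string
}

interface PostPage {
  items: PostSummary[]
  totalPages: number
}

type FetchPage = (page: number) => Promise<PostPage>

// Walks every page in order and concatenates the results.
async function fetchAllPostsPaginated(
  fetchPage: FetchPage,
  maxPages = 1000, // safety cap so a misbehaving API cannot loop forever
): Promise<PostSummary[]> {
  const all: PostSummary[] = []
  let page = 1
  let totalPages = 1

  while (page <= totalPages && page <= maxPages) {
    const result = await fetchPage(page)
    all.push(...result.items)
    totalPages = result.totalPages // trust the most recent count
    page += 1
  }

  return all
}
```

A cursor-based API would need a different loop condition (continue while a next cursor is returned), but the principle is the same: keep fetching until the API says there is nothing left.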

And wrap the fetch in error handling. A sitemap that throws during build breaks your deployment; one that catches the error and returns the static URLs degrades gracefully. SEO routes should never be able to take down a build.
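That graceful fallback might be sketched as follows. `SitemapEntry` is a minimal local stand-in for one entry of `MetadataRoute.Sitemap`, and `buildSitemapWithFallback` is a hypothetical name, not part of Next.js.

```typescript
// Minimal local stand-in for one entry of Next.js's MetadataRoute.Sitemap.
interface SitemapEntry {
  url: string
  lastModified?: string | Date
}

// Hypothetical wrapper: returns static entries plus post entries, or only the
// static entries if loading the posts throws.
async function buildSitemapWithFallback(
  staticEntries: SitemapEntry[],
  loadPostEntries: () => Promise<SitemapEntry[]>,
): Promise<SitemapEntry[]> {
  try {
    const postEntries = await loadPostEntries()
    return [...staticEntries, ...postEntries]
  } catch (error) {
    // Log loudly so the failure is visible, but keep the build alive.
    console.error('sitemap: failed to load posts, serving static URLs only', error)
    return staticEntries
  }
}
```

The trade-off is that a truncated sitemap can go unnoticed, so the logged error is worth wiring into whatever alerting the deployment already has.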

robots.txt as Code

robots.ts generates robots.txt the same way — as code, not a static file you forget to update:

export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: ['/api/', '/admin/'],
    },
    sitemap: 'https://tienng21.com/sitemap.xml',
  }
}

The key line is the last one: pointing crawlers at your sitemap from robots.txt is how you make sure they find every URL, not just the ones they happen to link-crawl into.

An RSS Feed for Discovery

RSS isn't dead — it's how readers, aggregators, and newsletter tools subscribe to you. I expose a hand-rolled RSS 2.0 feed at /feed.xml built from every published post, and then make it discoverable with a link tag in the document head:

<link rel="alternate" type="application/rss+xml" title="RSS Feed" href="/feed.xml" />

That tag is what lets a reader paste your homepage URL into their RSS client and have it auto-detect the feed. Without it, the feed exists but nobody can find it.
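The feed builder itself isn't shown above. A minimal sketch of a hand-rolled RSS 2.0 generator, assuming a simple `FeedItem` shape (the field and function names here are hypothetical). The part that matters most is escaping, since one unescaped `&` in a title makes the whole feed invalid XML.

```typescript
// Hypothetical item shape; field names are assumptions for illustration.
interface FeedItem {
  title: string
  description: string
  url: string // absolute URL of the post
  publishedAt: Date
}

// Escapes the five characters that are unsafe in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;')
}

function buildRssFeed(
  site: { title: string; description: string; url: string },
  items: FeedItem[],
): string {
  const itemXml = items
    .map(
      (item) => `    <item>
      <title>${escapeXml(item.title)}</title>
      <link>${escapeXml(item.url)}</link>
      <guid>${escapeXml(item.url)}</guid>
      <description>${escapeXml(item.description)}</description>
      <pubDate>${item.publishedAt.toUTCString()}</pubDate>
    </item>`,
    )
    .join('\n')

  return `<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>${escapeXml(site.title)}</title>
    <link>${escapeXml(site.url)}</link>
    <description>${escapeXml(site.description)}</description>
${itemXml}
  </channel>
</rss>`
}
```

In the App Router, a route handler could return this string with an `application/rss+xml` content type, and the metadata object's `alternates.types` field is one way to emit the discovery link tag without writing it by hand.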

Caching: SEO Routes Need Fresh Data Too

Here's a subtlety that bites people: your sitemap, RSS feed, and metadata all fetch data, and in the App Router those routes and fetches can be cached (the defaults have changed between Next.js versions, so check the one you're on). Set the revalidation deliberately so a new post shows up in the sitemap and feed within a sensible window:

export const revalidate = 3600 // regenerate sitemap/feed hourly

Too long and search engines crawl a stale sitemap that's missing your newest posts; too short and you re-render constantly for no benefit. An hour is a reasonable default for a blog — fresh enough that new content is discoverable quickly, cheap enough that you're not regenerating on every request.

Common Mistakes

A few traps I see (and hit) repeatedly:

Returning no metadata on a 404 or fetch failure. Always return a fallback Metadata object — a blank <head> is worse than a generic one.
Relative OG image URLs without metadataBase. Social crawlers can't resolve them; the preview comes up empty.
Structured data that disagrees with the visible page. Generate both from one source; mismatched markup can be ignored or lose rich-result eligibility.
A sitemap capped at the first page of results. Paginate through everything, or every post beyond the first page is missing from the sitemap.
Forgetting the canonical tag on paginated or filtered routes, splitting your ranking signal across near-duplicate URLs.

Conclusion

Good technical SEO isn't a checklist you run once before launch — it's an architecture. When metadata, structured data, sitemaps, and feeds are all generated from your content rather than authored alongside it, SEO stops being a chore you forget and becomes an invariant that's simply always correct.

That's the whole philosophy: write the post, and let the system produce the canonical URL, the OpenGraph card, the BlogPosting schema, the sitemap entry, and the RSS item — automatically, consistently, every time. You get to think about writing, and search engines get a site that's effortless to understand. Everything in this post is doing exactly that on the page you're reading right now.