Astro content collections: the deep-dive I needed before shipping 46 sites

Schemas, slugs, getStaticPaths and the four mistakes that cost me a weekend. The deep-dive I needed before shipping content collections across 46 sites.

Mahesh Waghmare
10 min read

Content collections are the best thing Astro has shipped. They make every MDX file a typed, validated, queryable record. After moving 46 sites onto them, I have strong opinions about the patterns that work — and the four mistakes that bit me.

This is the deep-dive I wish existed when I started.

What collections actually solve

Before collections, every Astro site I touched had a different ad-hoc way of reading frontmatter. Some used Astro.glob(), some loaded a JSON manifest, some hand-rolled a TypeScript types file. All of them rotted. Collections kill the entire category.

The build-time validation is the unsung win. A typo in a frontmatter date field fails npm run build, not your visitors.

The Zod schema is half the system

In src/content.config.ts:

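import { defineCollection, z } from 'astro:content';

const blog = defineCollection({
  // File-based collection of Markdown/MDX entries.
  type: 'content',
  schema: z.object({
    title: z.string().min(5).max(120),
    description: z.string().min(20).max(300),
    // Kept as a YYYY-MM-DD string on purpose; parsed explicitly at render time.
    publishedAt: z.string().regex(/^\d{4}-\d{2}-\d{2}$/),
    // Closed set: a typoed category fails the build.
    category: z.enum(['wordpress', 'astro', 'cloudflare', 'indie-hacking']),
    tags: z.array(z.string()).max(8),
    // Optional for authors; missing values become false.
    featured: z.boolean().default(false),
  }),
});

export const collections = { blog };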

Three details that aren’t obvious from the docs:

  1. Use literal enums for categories, not free strings. The build will refuse a typoed category, and your sidebar filters can index the enum directly.
  2. Set .default() everywhere you can. Authors will forget featured: false. Defaults turn forgotten fields into sensible values instead of validation failures.
  3. Date fields as z.string().regex(...), not z.date(). YAML date parsing is a footgun (some parsers coerce 2026-05-13 to a Date object, some leave it a string); enforce the wire format and parse explicitly in your render code. If your parser does coerce unquoted dates, authors have to quote the value (publishedAt: "2026-05-13") for the string schema to pass.
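
Point 3 leaves the parsing to your render code. Here is a minimal sketch of that step, assuming a hypothetical helper named parsePublishedAt (my name for it, not something Astro or Zod provides). It reads the validated string as a UTC date and rejects values that match the regex but are not real calendar dates.

```typescript
// Hypothetical helper (not part of Astro or Zod): turns a "YYYY-MM-DD"
// string into a Date at midnight UTC.
function parsePublishedAt(value: string): Date {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) {
    throw new Error(`Expected YYYY-MM-DD, got "${value}"`);
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);

  // Date.UTC keeps the result independent of the build machine's timezone.
  const date = new Date(Date.UTC(year, month - 1, day));

  // The regex accepts strings like 2026-02-31; Date would silently roll that
  // over into March, so compare the parts and reject on mismatch.
  if (
    date.getUTCFullYear() !== year ||
    date.getUTCMonth() !== month - 1 ||
    date.getUTCDate() !== day
  ) {
    throw new Error(`Not a real calendar date: "${value}"`);
  }
  return date;
}
```

Parsing in UTC is a design choice for this sketch: it stops a post dated 2026-05-13 from rendering as the 12th when the build runs in a timezone behind UTC.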

getCollection vs getEntry — picking the right call

Collections have two read APIs:

getCollection('blog') returns every entry as an array. Use it for index pages, sorting, filtering, computing aggregates.

getEntry('blog', slug) returns one entry by slug. Use it when you need a single known entry and have not already loaded the collection.

For a dynamic route, getEntry is overkill: you’ve already used getCollection inside getStaticPaths to enumerate every slug. Pass the entry as a prop:

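// Frontmatter of the dynamic route file, e.g. src/pages/blog/[slug].astro
import { getCollection } from 'astro:content';

export async function getStaticPaths() {
  const posts = await getCollection('blog');
  return posts.map((post) => ({
    // Strip the extension so the URL is /blog/hello-world, not /blog/hello-world.mdx.
    params: { slug: post.id.replace(/\.mdx?$/, '') },
    props: { entry: post },  // <- pass through
  }));
}

// The entry arrives as a prop, so no second lookup is needed.
const { entry } = Astro.props;
const { Content } = await entry.render();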

That single-fetch pattern saves the build-time cost of re-fetching every entry by ID and keeps the dynamic page lean.

The slug-vs-id mistake everyone makes

Astro’s content collection id is the file path relative to the collection root, with the extension. The slug is the URL-friendly identifier you derive from it.

For src/content/blog/hello-world.mdx:

  • entry.id is "hello-world.mdx"
  • entry.slug is "hello-world"

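This trips people up because the two fields look interchangeable, but on collections declared with type: 'content' the id still carries the extension. (Astro 5’s glob() loader behaves differently: it generates an extension-free id and drops the slug field, so check which API your collection uses.) If your getStaticPaths does params: { slug: post.id }, you’ll get URLs like /blog/hello-world.mdx/ and spend an hour wondering why nothing routes.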

// Helper that mw.com uses everywhere.
export function entrySlug(id: string): string {
  return id.replace(/\.(mdx|md)$/, '');
}

Fold it into a shared util and never think about it again.

Glob loaders for nested directories

For /how-to/<domain>/<tool>/<slug> content where the slug is multi-segment, the route file is src/pages/how-to/[...slug].astro (a rest parameter, which matches any number of path segments). The collection picks up the nested files automatically:

src/content/howto/
  database/mysql/not-recognized-command.mdx
  backend/php/array-functions.mdx

entry.id becomes "database/mysql/not-recognized-command.mdx". Your getStaticPaths strips the extension and passes the multi-segment string to the spread slug. URL: /how-to/database/mysql/not-recognized-command.
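
A minimal sketch of that mapping, runnable outside Astro. EntryLike and toSpreadPaths are hypothetical names for illustration; in a real route the entries would come from getCollection('howto') and the returned array would be what getStaticPaths returns.

```typescript
// Minimal stand-in for a collection entry; real entries carry more fields.
interface EntryLike {
  id: string;
}

// Same extension-stripping rule as the entrySlug helper above.
function entrySlug(id: string): string {
  return id.replace(/\.(mdx|md)$/, '');
}

// Hypothetical helper: builds the params/props pairs for [...slug].astro.
// A rest parameter takes the whole multi-segment string, slashes included.
function toSpreadPaths<T extends EntryLike>(entries: T[]) {
  return entries.map((entry) => ({
    params: { slug: entrySlug(entry.id) },
    props: { entry },
  }));
}

const howtoPaths = toSpreadPaths([
  { id: 'database/mysql/not-recognized-command.mdx' },
  { id: 'backend/php/array-functions.mdx' },
]);

for (const path of howtoPaths) {
  console.log(`/how-to/${path.params.slug}`);
}
```

Only the trailing extension is removed, so the directory segments survive intact and become the URL segments.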

The four mistakes I made

Worth writing down so you don’t repeat them.

The patterns that scaled across 46 sites

The reusable bits, lifted into @empire/content-shells:

  • sharedFrontmatter — title, description, publishedAt, hero (object, not string), schemaType.
  • Per-pattern schemas: deepDiveSchema, tutorialSchema, howToSchema, snippetSchema, noteSchema. Each enforces a type literal.
  • A discriminated union (anyContentSchema) — for sites that want one collection to hold multiple content types, the union enforces the right field set per type.
  • Helper functions: byPublishedDesc, groupByCategory, uniqueTags, entrySlug, formatBlogDate. Boring, reused everywhere.
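
To give a feel for how small these helpers are, here is a sketch of three of them. This is not the actual @empire/content-shells code: the PostLike shape and the function bodies are assumptions based on the blog schema earlier in this post.

```typescript
// Assumed entry shape, mirroring the blog schema fields used below.
interface PostLike {
  id: string;
  data: { publishedAt: string; category: string; tags: string[] };
}

// Comparator for Array.prototype.sort: newest first.
// YYYY-MM-DD strings order chronologically under plain string comparison.
function byPublishedDesc(a: PostLike, b: PostLike): number {
  if (a.data.publishedAt === b.data.publishedAt) return 0;
  return a.data.publishedAt < b.data.publishedAt ? 1 : -1;
}

// Buckets posts by category, preserving input order inside each bucket.
function groupByCategory<T extends PostLike>(posts: T[]): Map<string, T[]> {
  const groups = new Map<string, T[]>();
  for (const post of posts) {
    const bucket = groups.get(post.data.category) ?? [];
    bucket.push(post);
    groups.set(post.data.category, bucket);
  }
  return groups;
}

// Every tag used across the posts, de-duplicated and sorted alphabetically.
function uniqueTags(posts: PostLike[]): string[] {
  const seen = new Set<string>();
  for (const post of posts) {
    for (const tag of post.data.tags) seen.add(tag);
  }
  return Array.from(seen).sort();
}

const samplePosts: PostLike[] = [
  { id: 'a.mdx', data: { publishedAt: '2026-01-02', category: 'astro', tags: ['zod', 'astro'] } },
  { id: 'b.mdx', data: { publishedAt: '2026-05-13', category: 'cloudflare', tags: ['astro'] } },
  { id: 'c.mdx', data: { publishedAt: '2025-12-31', category: 'astro', tags: [] } },
];

// slice() first so the original array is not reordered in place.
const newestFirst = samplePosts.slice().sort(byPublishedDesc);
console.log(newestFirst.map((post) => post.id));
```

Sorting on the raw string only works because the schema pins publishedAt to YYYY-MM-DD; with free-form dates the comparator would need to parse first.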

The schema is the API contract between you-the-author and you-the-templater. Get it boringly precise and the rest of the system stays calm.

— me, after the third site

Where collections are still rough

The honest list of things that aren’t great:

  • No incremental rebuild on schema change. Update the schema, every page rebuilds. For a 200-post blog that’s tens of seconds.
  • getCollection is eager. It loads every entry, then filters. Fine at hundreds of entries, painful at thousands.
  • Live data isn’t a collection. Anything from an API has to live outside the collection abstraction. There’s a loader API for this but it’s young.
  • The error messages on invalid frontmatter are accurate but cryptic. Plan to read them three times before they make sense.

None of these is a deal-breaker. All of them get better every release.

What to do tomorrow

If you’re starting a new Astro site this week:

  1. Stand up src/content.config.ts with one collection.
  2. Define the schema with explicit Zod validation. Use .default() aggressively.
  3. Build the [...slug].astro route using the prop-passing pattern above.
  4. Add a byPublishedDesc helper to src/lib/. You’ll use it five times in a week.
  5. When you stand up a second Astro site, extract the shared schema bits into a workspace package before the third one comes online.

That’s the whole loop. Content collections will be the most enjoyable part of your Astro stack — once you stop fighting the shape.
