Content collections are the best thing Astro has shipped. They make every MDX file a typed, validated, queryable record. After moving 46 sites onto them, I have strong opinions about the patterns that work — and the four mistakes that bit me.
This is the deep-dive I wish existed when I started.
What collections actually solve
Before collections, every Astro site I touched had a different ad-hoc way of reading frontmatter. Some used Astro.glob(), some loaded a JSON manifest, some hand-rolled a TypeScript types file. All of them rotted. Collections kill the entire category.
The build-time validation is the unsung win. A typo in a frontmatter date field fails npm run build, not your visitors.
The Zod schema is half the system
In src/content.config.ts:
import { defineCollection, z } from 'astro:content';

const blog = defineCollection({
  type: 'content',
  schema: z.object({
    title: z.string().min(5).max(120),
    description: z.string().min(20).max(300),
    publishedAt: z.string().regex(/^\d{4}-\d{2}-\d{2}$/),
    category: z.enum(['wordpress', 'astro', 'cloudflare', 'indie-hacking']),
    tags: z.array(z.string()).max(8),
    featured: z.boolean().default(false),
  }),
});

export const collections = { blog };
Three details that aren’t obvious from the docs:
- Use literal enums for categories, not free strings. The build will refuse a typoed category, and your sidebar filters can index the enum directly.
- Set .default() everywhere you can. Authors will forget featured: false. Defaults turn forgotten fields into sensible values instead of validation failures.
- Date fields as z.string().regex(...), not z.date(). YAML date parsing is a footgun (some parsers coerce 2026-05-13 to a Date object, some leave it a string); enforce the wire format and parse explicitly in your render code.
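Here is roughly what "parse explicitly in your render code" can look like. This is a minimal sketch, not a helper from the original codebase: parsePublishedAt is a hypothetical name, and it assumes you want the date pinned to UTC midnight so it does not shift with the build machine's timezone.

```typescript
// Hypothetical helper: turns the "YYYY-MM-DD" wire format into a UTC Date.
export function parsePublishedAt(value: string): Date {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) {
    throw new Error(`publishedAt must be YYYY-MM-DD, got "${value}"`);
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);

  // Date.UTC avoids local-timezone drift; its month argument is zero-based.
  const date = new Date(Date.UTC(year, month - 1, day));

  // The regex accepts impossible dates such as 2026-02-30, which Date would
  // silently roll over into March. Reject them instead.
  if (date.getUTCMonth() !== month - 1 || date.getUTCDate() !== day) {
    throw new Error(`publishedAt is not a real calendar date: "${value}"`);
  }
  return date;
}
```

Call it once where you format the date for display, so the string stays the single source of truth in frontmatter.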
getCollection vs getEntry — picking the right call
Collections have two read APIs:
getCollection('blog') returns every entry as an array. Use it for index pages, sorting, filtering, computing aggregates.
getEntry('blog', slug) returns one entry by slug. Reach for it only when a page needs a single known entry and has not already loaded the whole collection.
For a single dynamic route, getEntry is overkill: you’ve already used getCollection inside getStaticPaths to enumerate every slug. Pass the entry as a prop:
import { getCollection } from 'astro:content';

export async function getStaticPaths() {
  const posts = await getCollection('blog');
  return posts.map((post) => ({
    params: { slug: post.id.replace(/\.mdx?$/, '') },
    props: { entry: post }, // <- pass through
  }));
}

// Each generated page receives its entry as a prop, so there is no second fetch.
const { entry } = Astro.props;
const { Content } = await entry.render();
That single-fetch pattern saves the build-time cost of re-fetching every entry by ID and keeps the dynamic page lean.
The slug-vs-id mistake everyone makes
Astro’s content collection id is the file path relative to the collection root, with the extension. The slug is the URL-friendly identifier you derive from it.
For src/content/blog/hello-world.mdx:
- entry.id is "hello-world.mdx"
- entry.slug is "hello-world"
This trips people up because in Astro 5 they unified slug access — but the id field still has the extension. If your getStaticPaths does params: { slug: post.id }, you’ll get URLs like /blog/hello-world.mdx/ and spend an hour wondering why nothing routes.
// Helper that mw.com uses everywhere.
export function entrySlug(id: string): string {
  return id.replace(/\.(mdx|md)$/, '');
}
Fold it into a shared util and never think about it again.
Glob loaders for nested directories
For /how-to/<domain>/<tool>/<slug> content where the slug is multi-segment, the route file is /how-to/[...slug].astro (spread slug). The collection picks it up automatically:
src/content/howto/
  database/mysql/not-recognized-command.mdx
  backend/php/array-functions.mdx
entry.id becomes "database/mysql/not-recognized-command.mdx". Your getStaticPaths strips the extension and passes the multi-segment string to the spread slug. URL: /how-to/database/mysql/not-recognized-command.
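The id-to-URL mapping described above can be sketched in plain TypeScript. EntryLike and toSpreadPaths are hypothetical names used only for illustration, and the sketch assumes entry ids still carry their file extension, as described earlier in this post. A spread route takes the whole multi-segment string as a single slug param.

```typescript
// Hypothetical shape: only the field this sketch needs from a collection entry.
interface EntryLike {
  id: string;
}

// Strip a trailing .md/.mdx extension, keeping any directory segments.
export function entrySlug(id: string): string {
  return id.replace(/\.(mdx|md)$/, '');
}

// Build getStaticPaths-style records for a [...slug].astro route.
// The slug keeps its slashes; the spread param matches all segments at once.
export function toSpreadPaths<T extends EntryLike>(entries: T[]) {
  return entries.map((entry) => ({
    params: { slug: entrySlug(entry.id) },
    props: { entry },
  }));
}
```

Inside the route file, getStaticPaths would return toSpreadPaths(await getCollection('howto')), assuming the collection is named howto as in the directory listing.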
The four mistakes I made
Worth writing down so you don’t repeat them.
The patterns that scaled across 46 sites
The reusable bits, lifted into @empire/content-shells:
- sharedFrontmatter: title, description, publishedAt, hero (object, not string), schemaType.
- Per-pattern schemas: deepDiveSchema, tutorialSchema, howToSchema, snippetSchema, noteSchema. Each enforces a type literal.
- A discriminated union (anyContentSchema): for sites that want one collection to hold multiple content types, the union enforces the right field set per type.
- Helper functions: byPublishedDesc, groupByCategory, uniqueTags, entrySlug, formatBlogDate. Boring, reused everywhere.
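To give a feel for how small these helpers are, here is a sketch of three of them. The bodies are my guess at a reasonable implementation, not the code from @empire/content-shells, and PostLike is a hypothetical stand-in for a collection entry that follows the blog schema shown earlier.

```typescript
// Hypothetical minimal entry shape, mirroring the blog schema fields used here.
interface PostLike {
  id: string;
  data: { publishedAt: string; category: string; tags: string[] };
}

// Comparator for Array.prototype.sort: newest first.
// "YYYY-MM-DD" strings sort chronologically as plain text.
export function byPublishedDesc(a: PostLike, b: PostLike): number {
  if (a.data.publishedAt === b.data.publishedAt) return 0;
  return a.data.publishedAt < b.data.publishedAt ? 1 : -1;
}

// Bucket posts by category, preserving input order within each bucket.
export function groupByCategory<T extends PostLike>(posts: T[]): Map<string, T[]> {
  const groups = new Map<string, T[]>();
  for (const post of posts) {
    const bucket = groups.get(post.data.category) ?? [];
    bucket.push(post);
    groups.set(post.data.category, bucket);
  }
  return groups;
}

// Every tag used across the given posts, de-duplicated and sorted.
export function uniqueTags(posts: PostLike[]): string[] {
  const seen = new Set<string>();
  for (const post of posts) {
    for (const tag of post.data.tags) seen.add(tag);
  }
  return Array.from(seen).sort();
}
```

On an index page that reads as (await getCollection('blog')).sort(byPublishedDesc).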
The schema is the API contract between you-the-author and you-the-templater. Get it boringly precise and the rest of the system stays calm.
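If the discriminated union idea is new to you, here is its type-level shape in plain TypeScript. The field sets (steps, language) are invented for illustration, since the real per-pattern schemas are not shown in this post. In a collection schema, Zod's z.discriminatedUnion plays the same role at validation time.

```typescript
// Hypothetical field sets; each variant is tagged by a literal `type`.
interface Tutorial {
  type: 'tutorial';
  title: string;
  steps: string[];
}
interface Snippet {
  type: 'snippet';
  title: string;
  language: string;
}
interface Note {
  type: 'note';
  title: string;
}

type AnyContent = Tutorial | Snippet | Note;

// Switching on `type` narrows the entry, so each branch only sees
// the fields that belong to that content type.
export function summarize(entry: AnyContent): string {
  switch (entry.type) {
    case 'tutorial':
      return `${entry.title} (${entry.steps.length} steps)`;
    case 'snippet':
      return `${entry.title} [${entry.language}]`;
    case 'note':
      return entry.title;
  }
}
```

A template that branches on entry.type gets the same narrowing, which is what keeps one mixed collection manageable.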
Where collections are still rough
The honest list of things that aren’t great:
- No incremental rebuild on schema change. Update the schema, every page rebuilds. For a 200-post blog that’s tens of seconds.
- getCollection is eager. It loads every entry, then filters. Fine at hundreds of entries, painful at thousands.
- Live data isn’t a collection. Anything from an API has to live outside the collection abstraction. There’s a loader API for this but it’s young.
- The error messages on invalid frontmatter are accurate but cryptic. Plan to read them three times before they make sense.
None of these is a deal-breaker. All of them get better every release.
What to do tomorrow
If you’re starting a new Astro site this week:
- Stand up src/content.config.ts with one collection.
- Define the schema with explicit Zod validation. Use .default() aggressively.
- Build the [...slug].astro route using the prop-passing pattern above.
- Add a byPublishedDesc helper to src/lib/. You’ll use it five times in a week.
- When you stand up a second Astro site, extract the shared schema bits into a workspace package before the third one comes online.
That’s the whole loop. Content collections will be the most enjoyable part of your Astro stack — once you stop fighting the shape.