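Duplicate content isn't just word-for-word copies. Google treats pages as duplicates when they are identical or substantially similar. Google doesn't publish a similarity threshold, so figures such as 80% text overlap are best read as a rule of thumb, not a documented cutoff. Common cases include: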
- Identical pages at different URLs (e.g., /services and /services/)
- Near-identical pages with only title or metadata changes
- Printer-friendly versions of the same article
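- Paginated content whose pages repeat the same text (Google stated in 2019 that it no longer uses rel=next/prev tags as an indexing signal, so they won't resolve this on their own)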
- Session parameters or tracking variables that create new URLs for the same content
- Auto-generated category pages with similar product descriptions
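To get a rough feel for text overlap between two pages, one common approach is to compare their sets of word shingles with Jaccard similarity. The sketch below is an illustration only: it is not how Google measures similarity, and the function names, the 5-word shingle size, and any threshold you compare the score against (such as the 80% figure above) are assumptions.

```python
from __future__ import annotations

import re


def shingles(text: str, k: int = 5) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 5) -> float:
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    page_a = "We offer plumbing repair and installation services in Austin"
    page_b = "We offer plumbing repair and installation services in Dallas"
    print(f"similarity: {jaccard(page_a, page_b, k=3):.2f}")
```

Run this on extracted body text, not raw HTML, so shared navigation and footer markup doesn't inflate the score.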
The key distinction: duplicate content rarely triggers a manual penalty. In most cases Google doesn't penalize you for having duplicates. Instead, it picks one version as canonical and filters the rest, and the cost to you is that Googlebot still spends time crawling the non-canonical versions. That leaves less crawl budget for new or important pages.
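To see how many crawlable URLs collapse to the same page, you can group a URL list (from a crawl export or server logs, for example) by a normalized form. This is a minimal sketch under stated assumptions: the `TRACKING_PARAMS` set is a hypothetical list you would adapt to your own site, and treating `/services` and `/services/` as the same page is only valid if your server actually returns the same content for both.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that don't change page content; adjust per site.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sid"}


def normalize(url: str) -> str:
    """Collapse common URL variants into one comparable form."""
    parts = urlsplit(url)
    # Assumption: a trailing slash does not change the content served.
    path = parts.path.rstrip("/") or "/"
    # Drop tracking/session parameters and sort the rest for a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    )
    # The fragment is discarded because it is never sent to the server.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def group_duplicates(urls: list) -> dict:
    """Map each normalized URL to its variants, keeping only real duplicates."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


if __name__ == "__main__":
    crawled = [
        "https://example.com/services",
        "https://example.com/services/",
        "https://example.com/services?utm_source=newsletter",
        "https://example.com/about",
    ]
    for canonical, variants in group_duplicates(crawled).items():
        print(canonical, "<-", variants)
```

Each group with more than one variant is a candidate for a redirect or a rel=canonical tag pointing at the version you want indexed.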
For more detail on how to identify duplicates on your own site, see our audit guide for duplicate content.