Build a web application that scrapes documentation sites, processes the content, and exports it in multiple formats (PDF, Markdown, EPUB), with configurable filtering and formatting options.
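As a rough illustration of the data this description implies, the three export formats and a scrape request could be modelled as below. This is only a sketch: the names `ExportFormat`, `ScrapeRequest`, `includePatterns`, `isExportFormat`, `parseFormats`, and `exportEndpoint` are assumptions made for the example, not part of the plan. The endpoint paths follow the `app/api/export/<format>/route.ts` files in the project layout.

```typescript
// Hypothetical shared types (e.g. for types/index.ts). All names are assumptions.
type ExportFormat = "pdf" | "markdown" | "epub";

interface ScrapeRequest {
  rootUrl: string;           // entry point of the documentation site
  includePatterns: string[]; // assumed filter: path prefixes to keep
  formats: ExportFormat[];   // exports to generate once scraping finishes
}

const EXPORT_FORMATS: readonly ExportFormat[] = ["pdf", "markdown", "epub"];

// Type guard for untrusted input, such as a parsed JSON request body.
function isExportFormat(value: unknown): value is ExportFormat {
  return EXPORT_FORMATS.some((format) => format === value);
}

// Keeps only the recognised formats from an arbitrary list of values.
function parseFormats(values: unknown[]): ExportFormat[] {
  return values.filter(isExportFormat);
}

// Maps a format to the route implied by app/api/export/<format>/route.ts.
function exportEndpoint(format: ExportFormat): string {
  return `/api/export/${format}`;
}
```

A form component could call `parseFormats` on user input before posting to `exportEndpoint(format)`, so that unsupported formats are dropped on the client rather than rejected by the server.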
doc-scraper/
├── app/
│   ├── api/
│   │   ├── scrape/
│   │   │   ├── start/route.ts
│   │   │   ├── status/[jobId]/route.ts
│   │   │   └── cancel/[jobId]/route.ts
│   │   ├── export/
│   │   │   ├── pdf/route.ts
│   │   │   ├── markdown/route.ts
│   │   │   └── epub/route.ts
│   │   └── webhook/
│   │       └── complete/route.ts
│   ├── dashboard/
│   │   ├── page.tsx
│   │   ├── jobs/[id]/page.tsx
│   │   └── settings/page.tsx
│   ├── layout.tsx
│   └── page.tsx
├── components/
│   ├── scraper/
│   │   ├── ScrapeForm.tsx
│   │   ├── JobProgress.tsx
│   │   ├── FilterSettings.tsx
│   │   └── ExportOptions.tsx
│   ├── ui/ (shadcn components)
│   └── layouts/
├── lib/
│   ├── scraper/
│   │   ├── browser-manager.ts
│   │   ├── content-extractor.ts
│   │   ├── url-validator.ts
│   │   └── rate-limiter.ts
│   ├── export/
│   │   ├── pdf-generator.ts
│   │   ├── markdown-converter.ts
│   │   └── epub-builder.ts
│   ├── queue/
│   │   └── scrape-queue.ts
│   └── db/
│       ├── prisma.ts
│       └── schema.prisma
├── workers/
│   └── scrape-worker.ts
└── types/
    └── index.ts
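To make one piece of this layout concrete, here is a minimal sketch of what `lib/scraper/url-validator.ts` might contain. The plan does not specify its behaviour, so the function name `normalizeDocUrl` and the rules it applies (http(s) only, same origin, same path prefix, fragments stripped) are assumptions. It relies only on the standard `URL` class.

```typescript
// Hypothetical sketch for lib/scraper/url-validator.ts; the name and rules are assumptions.
// Returns a normalized absolute URL if `candidate` is an http(s) link under the
// same origin and path prefix as `root`; otherwise returns null.
// Assumes `root` ends with a trailing slash, e.g. "https://example.com/docs/".
function normalizeDocUrl(candidate: string, root: string): string | null {
  let rootUrl: URL;
  let url: URL;
  try {
    rootUrl = new URL(root);
    url = new URL(candidate, rootUrl); // resolves relative links against root
  } catch {
    return null; // unparseable input
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return null;
  if (url.origin !== rootUrl.origin) return null; // stay on the same site
  if (!url.pathname.startsWith(rootUrl.pathname)) return null; // stay under the docs path
  url.hash = ""; // fragments point into the same page, so drop them
  return url.toString();
}
```

A crawler could pass every discovered link through this function and enqueue only non-null results, which also deduplicates links that differ only by fragment.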
In Cursor, create a new Next.js project with TypeScript:
npx create-next-app@latest doc-scraper --typescript --tailwind --app