Portfolio Case Study

AI-Powered SEO Keyword Research Workflow System

A repeatable workflow that turns manual keyword research into cleaner planning data for content decisions.

This system reduces the manual cleanup work behind SEO research by collecting, deduplicating, normalizing, and structuring keyword data into a reusable planning dataset. The technical stack matters, but the business value is simpler: faster research cycles, cleaner inputs, and a stronger foundation for content strategy.

n8n | OpenAI | DataForSEO | Google Sheets | SEO Automation

Business Snapshot

Manual keyword cleanup became a repeatable planning workflow

Problem

Keyword research required repeated collection, cleanup, deduplication, and spreadsheet preparation before content planning could begin.

Outcome

Built a reusable workflow that saves an estimated 3 to 6 hours per active content-planning week.

System

n8n, OpenAI, DataForSEO, and Google Sheets workflow for collection, normalization, deduplication, and export.

Reusable Value

The architecture can support future clustering, briefs, prioritization, and SEO planning layers.

ROI Snapshot

What this workflow saves before the technical details matter

Conservative estimate: 3 to 6 hours saved per active content-planning week by replacing manual keyword collection, cleanup, deduplication, and spreadsheet setup with a reusable workflow.

Time Saved

Estimated 3 to 6 hours saved per active content-planning week, especially when building topic lists, content briefs, or keyword batches.

Value Gained

Produces cleaner keyword data earlier in the process, so content decisions can start from structured inputs instead of cleanup work.

Reusable Asset

Creates a workflow foundation that can support clustering, prioritization, briefs, and future SEO planning layers.

1. The Problem

The research work was useful, but the process was fragmented

Manual keyword research tends to break down in predictable ways. Data comes from too many places, formatting drifts, and by the time you clean it up, you have already burned time you were trying to save.

It also gets harder to scale once the process depends too much on memory and cleanup habits. I wanted a workflow I could run repeatedly and trust, with structured output that was clean enough to use later for strategy, content planning, or GPT-assisted analysis.

2. Project Goals

Build a collection system that stays useful as the work expands

Modular workflow system

Use smaller modules with clear jobs instead of one oversized workflow that becomes difficult to maintain.

Reusable architecture

Build the flow once in a way that can support future research systems and content workflows.

Scalable keyword collection

Handle both direct seed keywords and wider niche exploration without rebuilding the system each time.

GPT-ready output

Produce structured data that is easy to work with later, not a stack of half-clean exports.

Support future SEO workflows

Leave room for clustering, prioritization, content planning, and brief generation later on.

Separate collection from strategy

Keep collection separate from strategy so the system stays flexible and easier to debug.

3. System Architecture

An orchestrator pattern with modular responsibility boundaries

The system uses an orchestrator pattern. One main workflow coordinates the run, and each major step is handed off to a dedicated module. That keeps the responsibilities clear, makes failures easier to isolate, and lets me improve one part without disturbing the rest of the workflow.

Figure: Architecture summary showing the orchestrator pattern, tool stack, and strategic boundary between collection and analysis.

The real value here is separation of responsibilities. Input handling, seed generation, data retrieval, cleaning, export, and run summaries all live in their own layers. That makes the system easier to reuse and a lot easier to reason about when something changes.
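The orchestrator idea can be sketched in Python. The real system is built as n8n workflows, so this is only an illustration under stated assumptions: `run_pipeline` and the three stub modules are hypothetical names, and the failing stub exists only to show how a failure gets pinned to one module.

```python
from typing import Callable

# Each module takes the run context (a dict) and returns an updated copy.
Step = Callable[[dict], dict]


def run_pipeline(context: dict, steps: list[tuple[str, Step]]) -> dict:
    """Run modules in order and record which one failed, if any."""
    context = {**context, "completed": [], "failed_step": None}
    for name, step in steps:
        try:
            context = step(context)
        except Exception as exc:  # isolate the failure to a single module
            context["failed_step"] = name
            context["error"] = str(exc)
            break
        context["completed"].append(name)
    return context


# Hypothetical stand-ins for the real modules.
def stub_normalize(ctx: dict) -> dict:
    return {**ctx, "query": ctx["raw"].strip().lower()}


def stub_retrieve(ctx: dict) -> dict:
    raise RuntimeError("metrics API unavailable")


def stub_export(ctx: dict) -> dict:
    return {**ctx, "exported": True}


result = run_pipeline(
    {"raw": "  Trail Shoes "},
    [
        ("normalize_input", stub_normalize),
        ("retrieve_metrics", stub_retrieve),
        ("export_rows", stub_export),
    ],
)
```

In this sketch the run stops at the failing module, and `result["failed_step"]` names it, which is the debugging benefit the modular layout is meant to provide.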

4. Workflow Breakdown

Each module does one job and hands off cleanly to the next

Figure: The main orchestrator workflow inside n8n, showing how the modules connect in sequence.

Normalize Input

Every run starts by cleaning and standardizing the input. That includes limits, location, language, and input type, so the rest of the system is working with predictable values.
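A minimal Python sketch of this kind of input normalization follows. The field names, the default location and language, and the limit bounds are all assumptions for illustration; the source does not state the real values.

```python
from dataclasses import dataclass

# Assumed defaults and bounds, not taken from the actual workflow.
DEFAULT_LIMIT = 100
MAX_LIMIT = 1000


@dataclass(frozen=True)
class RunInput:
    query: str
    input_type: str  # "niche" or "seed"
    location: str
    language: str
    limit: int


def normalize_input(raw: dict) -> RunInput:
    """Turn loosely formatted input into predictable values."""
    query = " ".join(str(raw.get("query", "")).split())
    if not query:
        raise ValueError("query is required")

    input_type = str(raw.get("input_type", "seed")).strip().lower()
    if input_type not in {"niche", "seed"}:
        raise ValueError(f"unknown input_type: {input_type!r}")

    try:
        limit = int(raw.get("limit", DEFAULT_LIMIT))
    except (TypeError, ValueError):
        limit = DEFAULT_LIMIT
    limit = max(1, min(limit, MAX_LIMIT))  # clamp to the allowed range

    location = str(raw.get("location", "United States")).strip()
    language = str(raw.get("language", "en")).strip().lower()
    return RunInput(query, input_type, location, language, limit)
```

Rejecting an unknown input type up front, rather than guessing, keeps the later modules from having to defend against malformed runs.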

Generate Seed Keywords

If the input is a niche, OpenAI generates seed terms. If the input is already a seed keyword, the workflow uses it directly. That split keeps the system flexible without complicating the later stages.
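The niche-versus-seed split might look like the following in Python. This is a sketch, not the actual module: `resolve_seeds` is a hypothetical name, the OpenAI call is replaced by an injected `generate` function, and lowercasing plus deduplicating the seeds is an assumption rather than documented behavior.

```python
from typing import Callable


def resolve_seeds(
    query: str,
    input_type: str,
    generate: Callable[[str], list[str]],
) -> list[str]:
    """Generate seeds for a niche; pass a seed keyword straight through."""
    if input_type == "niche":
        candidates = generate(query)
    else:
        candidates = [query]

    seen: set[str] = set()
    seeds: list[str] = []
    for term in candidates:
        cleaned = " ".join(term.lower().split())
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            seeds.append(cleaned)
    return seeds


def fake_generator(niche: str) -> list[str]:
    # Placeholder for the language-model call; returns canned terms.
    return [f"{niche} tips", f"best {niche}", f"{niche} Tips "]
```

Because both branches return the same shape (a list of cleaned seed strings), the retrieval step does not need to know which path produced them.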

DataForSEO Retrieval

The workflow sends each seed to DataForSEO to pull real keyword metrics such as search volume, CPC, and competition. This is the point where a list of ideas becomes a measurable dataset.

Cleaning and Deduplication

Returned rows are normalized, cast to consistent types, and deduplicated on the normalized keyword. Clean data is the difference between a workflow that scales and one that quietly creates more cleanup work for you later.
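A small Python sketch of the cleaning step is shown here. The field names (`keyword`, `search_volume`, `cpc`) and the first-occurrence-wins rule are assumptions for illustration; the real module runs inside n8n and may handle more fields.

```python
import re
from typing import Any, Optional


def normalize_keyword(text: str) -> str:
    """Collapse whitespace, trim, and lowercase."""
    return re.sub(r"\s+", " ", text).strip().lower()


def to_int(value: Any) -> Optional[int]:
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        return None


def to_float(value: Any) -> Optional[float]:
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def clean_rows(rows: list[dict]) -> list[dict]:
    """Normalize, type, and deduplicate rows; the first occurrence wins."""
    seen: set[str] = set()
    cleaned: list[dict] = []
    for row in rows:
        keyword = normalize_keyword(str(row.get("keyword") or ""))
        if not keyword or keyword in seen:
            continue  # skip blanks and repeats of a normalized keyword
        seen.add(keyword)
        cleaned.append({
            "keyword": keyword,
            "search_volume": to_int(row.get("search_volume")),
            "cpc": to_float(row.get("cpc")),
        })
    return cleaned
```

Deduplicating on the normalized keyword, not the raw string, is what catches variants like " Trail  Shoes" and "trail shoes" that would otherwise survive as separate rows.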

Google Sheets Export

Once the dataset is clean, the workflow appends the results to a structured Google Sheet. That gives the system a stable handoff layer for later analysis, prioritization, or content planning.
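One way to keep the sheet stable is to map every record onto a fixed column order before appending. The Python sketch below shows only that mapping, not the Google Sheets call itself; the column names are assumptions based on the fields described in this case study.

```python
# Assumed column order; the real sheet may use different headers.
COLUMNS = [
    "keyword",
    "search_volume",
    "cpc",
    "competition",
    "keyword_difficulty",
    "main_intent",
]


def to_sheet_rows(records: list[dict], include_header: bool = False) -> list[list]:
    """Convert records into rows with a fixed column order.

    Missing or None values become empty strings so every row has the
    same width and columns never shift.
    """
    rows: list[list] = [list(COLUMNS)] if include_header else []
    for record in records:
        rows.append([
            "" if record.get(col) is None else record[col]
            for col in COLUMNS
        ])
    return rows
```

Fixing the column order in one place means an upstream change in field order cannot silently misalign the appended data.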

Run Summary

Each execution ends with a run summary so it is easier to review what happened. That helps with logging, QA, and spotting problems early when the workflow is run repeatedly.
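A run summary can be as simple as a few counts. The Python sketch below is illustrative only: the field names are hypothetical, and the source does not list what the real summary contains.

```python
from dataclasses import asdict, dataclass


@dataclass
class RunSummary:
    seeds_used: int
    rows_retrieved: int
    rows_exported: int
    rows_dropped: int  # duplicates and blank keywords removed in cleaning


def summarize_run(seeds: list[str], raw_rows: list[dict], cleaned_rows: list[dict]) -> dict:
    """Build a plain dict that is easy to log or write to a sheet."""
    summary = RunSummary(
        seeds_used=len(seeds),
        rows_retrieved=len(raw_rows),
        rows_exported=len(cleaned_rows),
        rows_dropped=len(raw_rows) - len(cleaned_rows),
    )
    return asdict(summary)
```

A sudden jump in `rows_dropped` or a zero in `rows_retrieved` is the kind of signal that makes a failed or degraded run easy to spot during review.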

Figure: Folder view of the orchestrator and module workflows. The folder structure reinforces the modular design. Each module can evolve without turning the orchestrator into a single point of confusion.

5. Data Output Structure

The output is designed for analysis, not just storage

The final sheet is structured so it can support later GPT analysis, human review, content planning, and prioritization. It is not just a raw dump. It is meant to be a usable working dataset.

Figure: Google Sheets output of the keyword dataset. The output layer includes keyword metrics, trend fields, language data, difficulty, and intent classification in a format that is easier to use downstream.

Keyword metrics

Search volume, CPC, competition, and top-of-page bid ranges.

Intent classification

Main intent fields make later filtering and content planning a lot easier.

Difficulty scoring

Keyword difficulty makes prioritization easier when the dataset grows.

Future GPT support

Clean structure makes later clustering, briefing, and analysis far more reliable.

6. Strategic Design Decisions

Why the modular system matters

Modularity is not just a technical preference here. It is what makes the workflow easier to scale, easier to reuse, and much easier to debug when something breaks or changes.

Separating strategic analysis from data collection was another deliberate choice. The collection workflow is responsible for producing clean, structured inputs. It does not try to pick the best keyword, choose a content strategy, or write the brief. That keeps the system more flexible and makes it easier to connect to future analysis layers.

Scalability

New modules can be added without turning the core workflow into a maintenance problem.

Easier debugging

When a module fails, the problem area is much easier to isolate.

Reusable systems approach

The same structure can support future SEO and research workflows without starting over.

7. Challenges and Lessons Learned

The main challenge was knowing when to stop automating

One of the bigger tensions in this project was balancing automation with strategic flexibility. It is very easy to keep stacking logic into a workflow because technically you can. That does not always make the workflow better.

The clearest lesson was that clean structure matters more than clever automation. If the data is inconsistent, the downstream work gets shakier. If the workflow tries to do too much, it gets harder to maintain and harder to trust.

The better path was to build a solid collection system first, keep the modules reusable, and leave room for human judgment where it still belongs.

8. Results and Impact

What the workflow now enables

Faster keyword research

The collection process is quicker and more repeatable than manual assembly, saving an estimated 3 to 6 hours per active content-planning week.

Scalable content planning

The output is structured well enough to support later filtering, planning, and clustering work.

Reusable SEO datasets

Each run produces a dataset that can be used again instead of a one-off export.

Future automation expansion

The architecture is ready for additional analysis layers without needing a rebuild later.

Portfolio-ready workflow thinking

This is a practical example of systems design, not just a prompt experiment with a good screenshot.

9. Future Expansion

Where this system could go next

  • Keyword clustering
  • Topical authority scoring
  • SERP analysis layers
  • Automated content brief generation
  • Internal linking suggestions
  • Integration with broader AI research systems

The current version is doing the right job for this stage. It collects and structures data well. The next step is using that data more intelligently without losing the clean boundaries that make the workflow useful in the first place.