n8n Learning: LLM Chain vs Single Call — Caption Generation Experiment


AI Engineer based in Nara, pivoting from a 20-year career as a hairdresser. Currently building Stylus (AI captioning for salons) and Harmony (Instagram automation). Obsessed with workflow automation using n8n, Python, and LLMs. My mission is to solve inefficiencies in the beauty industry through code. Writing about SaaS development, automation, and my journey from the salon floor to software architecture.

What I Wanted to Do

In an n8n workflow for auto-generating Instagram carousel post captions, I compared two approaches:

  1. Call Claude Haiku 4.5 three times in a chain (role-division approach)

  2. Call Claude Sonnet 4.5 once (batch generation approach)

My initial hypothesis was: "Can we achieve high-quality output while reducing costs by breaking the task down for a lower-tier model (Haiku)?" The result was clear, though: in this experiment, a single Sonnet call was superior in both consistency and quality.

Environment

  • n8n (v1.19.4+, using Advanced AI features)

  • Claude Haiku 4.5 (claude-haiku-4-5)

  • Claude Sonnet 4.5 (claude-sonnet-4-5)

  • Node.js 18+ (for validation script execution)

  • Environment variable: ANTHROPIC_API_KEY

Implementation Approaches

Approach A: Haiku 3-Chain (Role Division)

This approach breaks the task into fine-grained steps and passes each step's output to the next step as input.

  1. Step 1: Generate element-by-element descriptions from raw data

  2. Step 2: Create a structural draft based on Step 1 output

  3. Step 3: Add metadata (tags, etc.) to finalize
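The three steps above can be sketched as a small generic chain runner. This is a minimal illustration, not the workflow's actual implementation: `runChain`, `captionSteps`, and the injected `callModel` function (any function that takes a prompt string and resolves to text) are hypothetical names introduced here, and the instruction wording is only illustrative.

```javascript
// Sketch: run prompt steps in order, feeding each output into the next prompt.
// `callModel` is a hypothetical injected function: (prompt) => Promise<string>.
async function runChain(steps, callModel, initialInput) {
  const outputs = [];
  let input = initialInput;
  for (const step of steps) {
    const prompt = `${step.instruction}\n---\n${input}`;
    const output = await callModel(prompt);
    outputs.push({ name: step.name, output });
    input = output; // the next step sees only this output, not the raw data
  }
  return { outputs, final: input };
}

// Step definitions mirroring Approach A (instruction wording is illustrative).
const captionSteps = [
  { name: "describe", instruction: "Extract description text for each element only." },
  { name: "draft", instruction: "Create hook, body, and CTA." },
  { name: "tag", instruction: "Keep body unchanged, add tags at the end." },
];
```

Injecting `callModel` keeps the chain logic testable with a stub. It also makes the main property of this approach explicit: each step only ever sees the previous step's text, never the original data.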

Approach B: Sonnet Single Call (Batch Generation)

This approach trusts the higher-tier model's reasoning ability and passes all the information at once.

  • Pass all information and generate the entire final output in a single prompt

Validation Code

To simulate the n8n workflow's behavior, I compared both approaches in Node.js using identical input data.

Setup

mkdir llm-chain-vs-single
cd llm-chain-vs-single
npm init -y
npm i @anthropic-ai/sdk dotenv

export ANTHROPIC_API_KEY="YOUR_KEY"

Execution Script (index.mjs)

import "dotenv/config";
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Generic data structure example (順番 = order, 属性 = attribute, 値 = value)
const image_details = [
  { 順番: 1, 属性A: "値1", 属性B: "値2" },
  { 順番: 2, 属性A: "値3", 属性B: "値4" },
];

async function callClaude({ model, system, prompt }) {
  const res = await client.messages.create({
    model,
    max_tokens: 1000,
    system,
    messages: [{ role: "user", content: prompt }],
  });
  return res.content[0]?.text ?? "";
}

// (A) Haiku 3-Chain: Step1→2→3
async function haikuChain() {
  const model = "claude-haiku-4-5"; // Claude Haiku 4.5
  const system = "You are an editor. Be concise in Japanese.";

  // Step 1: Data extraction and summarization
  const step1 = await callClaude({
    model,
    system,
    prompt: `Generate output in the specified format from the following data.
Structure: [Extract description text for each element only]
---
${JSON.stringify({ image_details }, null, 2)}`,
  });

  // Step 2: Create body text
  const step2 = await callClaude({
    model,
    system,
    prompt: `Generate output in the specified format based on the following content.
Structure: [Create hook, body, and CTA]
---
${step1}`,
  });

  // Step 3: Add supplementary information
  const step3 = await callClaude({
    model,
    system,
    prompt: `Generate output in the specified format for the following body text.
Structure: [Keep body unchanged, add tags at the end]
---
${step2}`,
  });

  return { step1, step2, final: step3 };
}

// (B) Sonnet Single: Complete in one call (targeting contextual consistency)
async function sonnetSingle() {
  const model = "claude-sonnet-4-5"; // Claude Sonnet 4.5
  const system = "You are an editor. Be concise in Japanese.";

  return await callClaude({
    model,
    system,
    prompt: `Generate output in the specified format from the following data.
Structure: [Create hook, overview, bullet points, CTA, and tags all at once]
---
${JSON.stringify({ image_details }, null, 2)}`,
  });
}

// Execute and output
const resultA = await haikuChain();
const resultB = await sonnetSingle();

console.log("=== Haiku 3Chain (final) ===\n", resultA.final);
console.log("\n=== Sonnet Single ===\n", resultB);
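The hypothesis was about cost, which the script above prints nothing about. One possible way to compare the two approaches is to total the `usage.input_tokens` and `usage.output_tokens` fields that each Messages API response carries. The sketch below assumes you keep the raw response objects (the script's `callClaude` returns only the text, so it would need to return `res` as well). `sumUsage` and `estimateCostUsd` are hypothetical helper names, and prices are passed in as parameters rather than hard-coded.

```javascript
// Sketch: total token usage across one or more Messages API responses.
// Each response is expected to carry `usage.input_tokens` and `usage.output_tokens`.
function sumUsage(responses) {
  return responses.reduce(
    (total, res) => ({
      inputTokens: total.inputTokens + (res.usage?.input_tokens ?? 0),
      outputTokens: total.outputTokens + (res.usage?.output_tokens ?? 0),
    }),
    { inputTokens: 0, outputTokens: 0 }
  );
}

// Prices are parameters (USD per million tokens); look up the current values yourself.
function estimateCostUsd(usage, pricePerMTok) {
  return (
    (usage.inputTokens / 1e6) * pricePerMTok.input +
    (usage.outputTokens / 1e6) * pricePerMTok.output
  );
}
```

For the chain, pass all three step responses to `sumUsage`; for the single call, pass an array with one response. Note that each chained step re-sends its predecessor's output as input, so three cheap calls are not automatically cheaper than one larger call.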

n8n Preprocessing Code (Code Node)

Key points when shaping the data on the n8n side, before it reaches the LLM node:

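// Prerequisite: the previous node outputs an item whose json.transformedImages is an array
// (Code node in "Run Once for All Items" mode; the input is a list of items, so read the first one)
const transformedImages = $input.first()?.json?.transformedImages;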

if (!Array.isArray(transformedImages) || transformedImages.length === 0) {
  throw new Error("transformedImages is empty");
}

// ❌ Failed pattern: Passing only key info (ID, etc.) prevents LLM from making judgments
// ✅ Correct: Pass all columns needed for judgment (feature data) via spread
const image_details = transformedImages.map((img, index) => ({
  順番: index + 1,
  ...img, // ★ Key point: Include all attributes the LLM needs
}));

return [{ json: { image_details } }];

What Tripped Me Up

Problem: Step 1 Returns "Unknown"

In early validation, Step 1 returned "Unknown" because the data passed to the LLM did not contain enough information to work with.

Cause: The image_details passed to the LLM contained only IDs and URLs, and lacked the text information (attribute data) needed for judgment.

Fix:

// ❌ Before fix (ID only)
// const imageDetails = data.map(d => ({ id: d.id }));

// ✅ After fix (all attributes)
const imageDetails = data.map(d => ({ ...d }));

This enabled Steps 1-3 to function correctly.
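To catch this kind of problem before spending an API call, a guard along the following lines could run in the preprocessing step. It is a sketch under assumptions: `findUninformativeItems` is a hypothetical helper, the ignored key names (`id`, `url`, `順番`) are guesses about which fields carry no descriptive text, and only non-empty string values are counted as "text information".

```javascript
// Sketch: flag items that carry nothing but identifier-like fields.
// The ignored key names ("id", "url", "順番") are assumptions; adjust them to your data.
function findUninformativeItems(items, ignoredKeys = ["id", "url", "順番"]) {
  return items
    .map((item, index) => ({ item, index }))
    .filter(
      ({ item }) =>
        // Keep the item in the "uninformative" list when no descriptive string is found.
        !Object.entries(item).some(
          ([key, value]) =>
            !ignoredKeys.includes(key) &&
            typeof value === "string" &&
            value.trim() !== ""
        )
    )
    .map(({ index }) => index); // return positions so the caller can report them
}
```

If the returned array is non-empty, the workflow can throw with the offending indices instead of letting the model answer "Unknown".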

Experiment Results

Quality Comparison

| Aspect | Haiku 3-Chain (A) | Sonnet Single (B) |
| --- | --- | --- |
| Context Understanding | Context tends to fragment per step | Deep comprehension by viewing the whole at once |
| Consistency | Unnatural "stitching" between step outputs | Unified tone and structure from the start |
| Structural Ability | Good at decomposing and executing per instruction | Autonomously determines bullet points and paragraphs |
| Output Quality | Information-rich but tends toward stiff, explanatory text | Simple, engaging hook; readable |

For caption generation, overall tone, flow, and consistency are critical. Splitting generation into three calls made each step too independent of the others, which weakened the cohesion (the "groove") of the final caption.
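Tone and flow are hard to check automatically, but one narrow symptom of steps drifting apart can be tested: whether the tagging step rewrote the body it was told to keep unchanged. The sketch below is an assumption-laden illustration, not part of the original experiment. `bodyPreserved` and `normalizeWhitespace` are hypothetical helpers, and the check only confirms that the final text still begins with the drafted body once whitespace is normalized.

```javascript
// Sketch: did the tagging step keep the drafted body intact?
// Whitespace is normalized so line-wrapping differences are not reported as rewrites.
function normalizeWhitespace(text) {
  return text.replace(/\s+/g, " ").trim();
}

function bodyPreserved(draft, finalText) {
  return normalizeWhitespace(finalText).startsWith(normalizeWhitespace(draft));
}
```

With the validation script's return value, this would be called as `bodyPreserved(resultA.step2, resultA.final)`. A `false` result means Step 3 changed more than it was asked to.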

About Claude 4.5 Haiku

Claude Haiku 4.5 is positioned as delivering roughly Claude Sonnet 4-level capability at lower cost and higher speed. That makes it a good fit for latency-sensitive chatbots and routine data extraction, but for a task that depends on context and emotional resonance, as this one does, Sonnet 4.5 won decisively.

Key Insights

Conclusion

For creative tasks like caption generation, this experiment showed that a single call to a mid-to-upper tier model produces better results than chaining a lower-tier model.

Usage Guidelines

| Approach | Best For | Examples |
| --- | --- | --- |
| Lower Model × Chain | Tasks with clearly separable roles | Data extraction, summarization, classification, format conversion |
| Mid Model × Single | Tasks where consistency and context matter | Caption creation, article writing, copywriting |
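If a workflow handles several kinds of tasks, the guideline table could be encoded as a simple lookup, for example in a Code node placed ahead of the model call. This is only a sketch: the task-type labels and the `chooseApproach` function are hypothetical names introduced here, and the mapping restates the table rather than adding new findings.

```javascript
// Sketch: encode the guideline table as a lookup. Task-type strings are hypothetical labels.
const CHAIN_TASKS = new Set(["extraction", "summarization", "classification", "format-conversion"]);
const SINGLE_TASKS = new Set(["caption", "article", "copywriting"]);

function chooseApproach(taskType) {
  if (CHAIN_TASKS.has(taskType)) return { model: "claude-haiku-4-5", mode: "chain" };
  if (SINGLE_TASKS.has(taskType)) return { model: "claude-sonnet-4-5", mode: "single" };
  return null; // not covered by the table; decide by experiment
}
```

Returning `null` for unknown task types is deliberate: the table comes from one experiment, so anything outside it is better compared directly than guessed.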

Lessons Learned

  1. Single call for context-critical tasks: When consistent tone is essential, fragmentation from chaining becomes a risk.

  2. Data structure is everything: Without necessary metadata in the prompt data, no model can produce high-quality output.

  3. Haiku's sweet spot: Haiku 4.5 is fast and cheap, which suits extraction and classification, but Sonnet 4.5 excels at "emotional" writing and overall composition.
