Build a Link Preview Service with Node.js

DodaTech Updated 2026-06-21 7 min read

Build a link preview service with Node.js that scrapes Open Graph metadata from URLs, generates rich preview cards, and caches results in Redis for fast repeated lookups.

What You'll Build

You'll build an API that accepts any URL and returns a preview object containing the page title, description, image, and site name. The results are cached in Redis, so subsequent requests for the same URL are served without re-fetching the page. Social media platforms like Twitter and Facebook use the same technique when you paste a link.

Why Link Previews Matter

Every time you paste a URL into Slack, WhatsApp, or a CMS, the app fetches a preview card automatically. Building your own preview service teaches you HTTP scraping, HTML parsing, Open Graph protocol handling, and caching strategy. Security teams use similar scraping to inspect links before users click them; Doda Browser's URL safety checker uses the same pattern to analyse links before loading.

Prerequisites

Node.js 18+ installed
Basic Express.js knowledge
Redis installed locally or via Docker

Step 1: Setup

mkdir link-previewer
cd link-previewer
npm init -y
npm install express cheerio axios ioredis helmet

cheerio parses HTML like jQuery on the server. ioredis is a robust Redis client. helmet adds security headers to protect against common web vulnerabilities.

Step 2: Open Graph Scraper

// scraper.js
const axios = require('axios');
const cheerio = require('cheerio');

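// Fetch a page and extract preview metadata, preferring Open Graph tags.
async function scrape(url) {
  const response = await axios.get(url, {
    timeout: 5000,                      // give up after 5 seconds
    maxContentLength: 2 * 1024 * 1024,  // refuse response bodies larger than 2 MB
    responseType: 'text',               // keep the body as a string for cheerio
    headers: { 'User-Agent': 'LinkPreviewer/1.0' },
    validateStatus: status => status < 400
  });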

  const $ = cheerio.load(response.data);
  const meta = {};

  meta.title = $('meta[property="og:title"]').attr('content')
    || $('title').text()
    || '';

  meta.description = $('meta[property="og:description"]').attr('content')
    || $('meta[name="description"]').attr('content')
    || '';

  meta.image = $('meta[property="og:image"]').attr('content')
    || $('meta[property="og:image:url"]').attr('content')
    || '';

  meta.siteName = $('meta[property="og:site_name"]').attr('content')
    || new URL(url).hostname;

  meta.url = url;

  return meta;
}

module.exports = scrape;

We prioritize Open Graph tags because they're specifically designed for link previews, and the major social media platforms read them. If OG tags are missing, we fall back to <title> and <meta name="description">. If none of these exist, the field is returned as an empty string, and siteName falls back to the URL's hostname.
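The fallback chain can also be written as a small helper, which makes it easier to add more candidates later. The sketch below is illustrative only: `firstNonEmpty` and `resolveImageUrl` are hypothetical names that do not appear in scraper.js. The second helper covers a case the scraper does not handle, namely sites that publish a relative `og:image` path.

```javascript
// Hypothetical helper: return the first candidate that is a non-blank string.
function firstNonEmpty(...candidates) {
  for (const candidate of candidates) {
    if (typeof candidate === 'string' && candidate.trim() !== '') {
      return candidate.trim();
    }
  }
  return '';
}

// Hypothetical helper: turn a relative og:image value into an absolute URL,
// using the page URL as the base. Returns '' when nothing usable is given.
function resolveImageUrl(image, pageUrl) {
  if (!image) return '';
  try {
    return new URL(image, pageUrl).href;
  } catch {
    return '';
  }
}

// Example with hard-coded values standing in for scraped attributes
const title = firstNonEmpty(undefined, '   ', 'Fallback page title');
const image = resolveImageUrl('/img/cover.png', 'https://example.com/post/1');
console.log(title); // Fallback page title
console.log(image); // https://example.com/img/cover.png
```

Inside `scrape`, the candidates would be the same `.attr('content')` and `.text()` lookups already shown, passed in priority order.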

Step 3: Caching Layer and Server

// server.js
const express = require('express');
const helmet = require('helmet');
const Redis = require('ioredis');
const scrape = require('./scraper');

const app = express();
const redis = new Redis(); // defaults to localhost:6379

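// The demo page in public/ uses an inline script and remote images, which
// helmet's default Content-Security-Policy blocks. CSP is disabled here for
// the demo only; define a tailored policy before deploying.
app.use(helmet({ contentSecurityPolicy: false }));
app.use(express.static('public')); // serves public/index.html from Step 4

const CACHE_TTL = 3600; // 1 hour in seconds

app.get('/api/preview', async (req, res) => {
  const { url } = req.query;

  // Repeated ?url= parameters arrive as an array, so require a single string
  if (typeof url !== 'string' || url === '') {
    return res.status(400).json({ error: 'url query parameter required' });
  }

  // Validate URL format
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return res.status(400).json({ error: 'Invalid URL format' });
  }

  // Allow only web schemes; this rejects file:, ftp:, data: and anything else.
  // URL.protocol is already lowercased, so FILE: cannot slip through.
  if (!['http:', 'https:'].includes(parsed.protocol)) {
    return res.status(400).json({ error: 'URL scheme not allowed' });
  }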

  try {
    // Check cache first
    const cached = await redis.get(`preview:${url}`);
    if (cached) {
      return res.json({ ...JSON.parse(cached), cached: true });
    }

    // Scrape and cache
    const preview = await scrape(url);
    await redis.set(`preview:${url}`, JSON.stringify(preview), 'EX', CACHE_TTL);
    res.json({ ...preview, cached: false });
  } catch (err) {
    if (err.code === 'ECONNABORTED') {
      return res.status(504).json({ error: 'Request timeout' });
    }
    res.status(502).json({ error: 'Failed to fetch URL' });
  }
});

const PORT = process.env.PORT || 4000;
app.listen(PORT, () => console.log(`Link previewer running on port ${PORT}`));

The server validates the URL, rejects dangerous schemes such as file:// (which could expose local files), checks the Redis cache first, scrapes on a cache miss, and stores the result for future requests. The helmet middleware sets security-related response headers that help mitigate attacks such as clickjacking.
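One detail worth considering is the cache key. The handler uses the raw query string, so `HTTPS://Example.com` and `https://example.com/#top` would be cached separately even though they fetch the same page. A minimal sketch of a normalizing key function follows; `cacheKey` is a hypothetical helper, not part of server.js.

```javascript
// Hypothetical helper: build a canonical Redis key for a URL.
// The WHATWG URL parser lowercases the scheme and host and drops default ports.
function cacheKey(rawUrl) {
  const parsed = new URL(rawUrl);
  parsed.hash = ''; // fragments are never sent to the server
  return `preview:${parsed.href}`;
}

console.log(cacheKey('HTTPS://Example.com'));      // preview:https://example.com/
console.log(cacheKey('https://example.com/#top')); // preview:https://example.com/
```

Query strings are left untouched because they can change the page content.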

Step 4: Frontend Preview Card

<!-- public/index.html -->
<!DOCTYPE html>
<html>
<head>
  <title>Link Preview Demo</title>
  <style>
    * { margin: 0; padding: 0; box-sizing: border-box; }
    body { font-family: system-ui; max-width: 600px; margin: 0 auto; padding: 20px; }
    input, button { padding: 10px; font-size: 16px; }
    input { flex: 1; border: 1px solid #ddd; border-radius: 6px 0 0 6px; }
    button { background: #007bff; color: white; border: none; border-radius: 0 6px 6px 0; cursor: pointer; }
    .card { margin-top: 20px; border: 1px solid #ddd; border-radius: 12px; overflow: hidden; }
    .card img { width: 100%; height: 200px; object-fit: cover; }
    .card-body { padding: 16px; }
    .card-body h2 { font-size: 18px; margin-bottom: 8px; }
    .card-body p { color: #555; font-size: 14px; line-height: 1.5; }
    .card-footer { padding: 8px 16px; background: #f5f5f5; font-size: 12px; color: #888; }
  </style>
</head>
<body>
  <h1>Link Preview API</h1>
  <div style="display: flex; margin-top: 16px">
    <input type="url" id="urlInput" placeholder="Paste a URL..." value="https://example.com">
    <button onclick="preview()">Preview</button>
  </div>
  <div id="result"></div>

  <script>
    // Scraped metadata is untrusted input, so escape it before building HTML
    function escapeHtml(value) {
      return String(value ?? '').replace(/[&<>"']/g, ch => ({
        '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'
      })[ch]);
    }

    async function preview() {
      const url = document.getElementById('urlInput').value;
      const res = await fetch(`/api/preview?url=${encodeURIComponent(url)}`);
      const data = await res.json();

      if (data.error) {
        document.getElementById('result').innerHTML =
          `<div style="color: red; margin-top: 12px">${escapeHtml(data.error)}</div>`;
        return;
      }

      document.getElementById('result').innerHTML = `
        <div class="card">
          ${data.image ? `<img src="${escapeHtml(data.image)}" alt="" loading="lazy">` : ''}
          <div class="card-body">
            <h2>${escapeHtml(data.title)}</h2>
            <p>${escapeHtml(data.description)}</p>
          </div>
          <div class="card-footer">
            ${escapeHtml(data.siteName)} ${data.cached ? '• Cached' : ''}
          </div>
        </div>
      `;
    }

    preview();
  </script>
</body>
</html>

Expected output: Paste a URL like https://github.com and the page fetches the preview and displays a card with the site's title, description, and OG image. Paste the same URL again and you'll see "Cached" in the footer; the response time typically drops from around 500ms to a few milliseconds.

Architecture

sequenceDiagram
    Client->>API: GET /api/preview?url=https://example.com
    API->>Redis: GET preview:https://example.com
    alt Cache Hit
        Redis-->>API: Cached Data
        API-->>Client: Preview (cached: true)
    else Cache Miss
        API->>example.com: HTTP GET with User-Agent
        example.com-->>API: HTML Page
        API->>Cheerio: Parse Open Graph Tags
        Cheerio-->>API: Extracted Metadata
    API->>Redis: SET preview:url EX 3600
        API-->>Client: Preview (cached: false)
    end

Common Errors

1. Timeout on large pages. Some pages respond slowly or serve very large HTML documents. The scraper sets the axios timeout to 5 seconds, so a page that does not respond in time produces a 504 response instead of tying up your server. A timeout does not limit the size of the body; the axios maxContentLength option can cap that.

2. SSRF vulnerabilities. Without validation, an attacker can make your server request internal addresses such as http://localhost:3000/admin. Block private IP ranges as well as dangerous schemes. The code in Step 3 rejects file:, ftp:, and data: URLs but does not check the destination IP address, so add that check before deploying.

3. Redis connection refused. If Redis isn't running, ioredis reports ECONNREFUSED errors while it retries the connection, and cache commands eventually fail. Start Redis with redis-server or use Docker: docker run -p 6379:6379 redis:7. Handle connection errors with a fallback that skips caching.
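For the SSRF point, one possible approach is to resolve the hostname and refuse addresses in private, loopback, or link-local ranges before fetching. The sketch below uses Node's built-in `net.BlockList` and `dns.promises.lookup`. `isBlockedAddress` and `assertPublicHost` are hypothetical names, and the range list is a starting point rather than a complete policy.

```javascript
const net = require('node:net');
const dns = require('node:dns').promises;

// Ranges that a public preview service should not be able to reach.
const blocked = new net.BlockList();
blocked.addSubnet('0.0.0.0', 8);         // "this network"
blocked.addSubnet('10.0.0.0', 8);        // private
blocked.addSubnet('127.0.0.0', 8);       // loopback
blocked.addSubnet('169.254.0.0', 16);    // link-local, includes cloud metadata endpoints
blocked.addSubnet('172.16.0.0', 12);     // private
blocked.addSubnet('192.168.0.0', 16);    // private
blocked.addAddress('::1', 'ipv6');       // IPv6 loopback
blocked.addSubnet('fc00::', 7, 'ipv6');  // IPv6 unique local
blocked.addSubnet('fe80::', 10, 'ipv6'); // IPv6 link-local

function isBlockedAddress(ip) {
  // Unwrap IPv4-mapped IPv6 addresses such as ::ffff:127.0.0.1
  const mapped = /^::ffff:(\d+\.\d+\.\d+\.\d+)$/i.exec(ip);
  if (mapped) ip = mapped[1];
  const family = net.isIP(ip); // 4, 6, or 0 when the input is not an IP address
  if (family === 0) return true; // treat anything unparseable as unsafe
  return blocked.check(ip, family === 6 ? 'ipv6' : 'ipv4');
}

// Rejects when any address the hostname resolves to is in a blocked range.
async function assertPublicHost(hostname) {
  const host = hostname.replace(/^\[|\]$/g, ''); // URL.hostname wraps IPv6 in brackets
  const addresses = await dns.lookup(host, { all: true });
  if (addresses.some(({ address }) => isBlockedAddress(address))) {
    throw new Error('Host resolves to a private or loopback address');
  }
}
```

In the request handler this could run as `await assertPublicHost(new URL(url).hostname)` before scraping. It is not a complete defence: redirects can still point at internal hosts, and a hostname can resolve differently between this check and the actual request, so treat it as one layer among several.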

Practice Questions

1. Why do we prioritize Open Graph tags over regular meta tags? OG tags are explicitly designed for link previews. They contain curated content: a page's <title> might be "Home" while its og:title is "DodaTech - Security Tools for Everyone". Social media platforms and messaging apps generally read OG tags first.

2. What is the purpose of the User-Agent header in the scraper? Some servers block requests without a User-Agent or return different content for bots vs browsers. Setting a descriptive User-Agent identifies the scraper to the server. It does not guarantee access, since a site can still choose to block it.

3. How does Redis caching improve performance? Without caching, every request requires an HTTP request to the target site, which can take 500ms to 2000ms. With Redis, repeated lookups are served from memory, typically in a few milliseconds. The 1-hour TTL limits how stale a preview can become while removing most repeat fetches.

4. Challenge: rate limiting with Redis. Add a rate limiter that allows 10 requests per minute per IP. Increment a per-IP counter with redis.incr(key), call redis.expire(key, 60) when the counter is first created, and return 429 once the count exceeds 10 to prevent abuse of your preview service.
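One way to approach the rate limiting challenge is a fixed-window counter. The sketch below is written against a minimal store interface (`incr` and `expire`) so it runs without Redis; an ioredis client provides both methods and could be passed in directly. `isRateLimited` and `createMemoryStore` are hypothetical names, and the `ratelimit:` key prefix is an assumption.

```javascript
// Fixed-window limiter: at most `limit` requests per `windowSeconds` per IP.
// `store` needs async incr(key) and expire(key, seconds).
async function isRateLimited(store, ip, limit = 10, windowSeconds = 60) {
  const key = `ratelimit:${ip}`;
  const count = await store.incr(key);
  if (count === 1) {
    // First request in this window: start the countdown
    await store.expire(key, windowSeconds);
  }
  return count > limit;
}

// In-memory stand-in for Redis, only for trying the limiter locally
function createMemoryStore() {
  const counters = new Map();
  return {
    async incr(key) {
      const next = (counters.get(key) || 0) + 1;
      counters.set(key, next);
      return next;
    },
    async expire(key, seconds) {
      const timer = setTimeout(() => counters.delete(key), seconds * 1000);
      timer.unref(); // do not keep the process alive just for this timer
      return 1;
    },
  };
}
```

In an Express middleware this would look like `if (await isRateLimited(redis, req.ip)) return res.status(429).json({ error: 'Too many requests' });`. The incr and expire calls are not atomic here, so a crash between them could leave a counter without an expiry.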

FAQ

How do I handle JavaScript-rendered pages?

Pages built with React.js or Vue often don't include OG tags in the raw HTML — they're injected by JavaScript. Use a headless browser like Puppeteer to render the page before scraping, but expect much higher latency and resource usage.

Can I deploy this without Redis?

Yes. Skip caching entirely or use an in-memory Map. Redis is optional: it improves performance but isn't required. For production, check if your hosting platform offers a managed Redis instance.
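As a rough sketch of the in-memory option, a Map can store each value alongside an expiry timestamp. `MemoryCache` is a hypothetical class, not part of the tutorial code. The clock is injectable so the expiry logic can be exercised without waiting.

```javascript
// Minimal TTL cache backed by a Map. Entries are removed lazily on read.
class MemoryCache {
  constructor(now = () => Date.now()) {
    this.entries = new Map();
    this.now = now; // injectable clock, defaults to the system time
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return null;
    }
    return entry.value;
  }

  set(key, value, ttlSeconds) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}
```

Unlike Redis, this cache is lost on restart, is not shared between processes, and has no size limit, so it suits local development or a single small instance.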

How do I prevent abuse of my preview API?

Add authentication via API keys, rate limiting per IP, and a blocklist for known spam domains. The same web security techniques used in production APIs apply here.
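A minimal sketch of the API key check could look like the following. The `x-api-key` header name, the `isValidApiKey` and `requireApiKey` helpers, and the idea of passing keys as an array are all assumptions for illustration. Keys are hashed before comparison so that `crypto.timingSafeEqual` always receives buffers of equal length.

```javascript
const crypto = require('node:crypto');

function sha256(value) {
  return crypto.createHash('sha256').update(String(value)).digest();
}

// Returns true when `provided` matches one of the configured keys.
function isValidApiKey(provided, validKeys) {
  if (typeof provided !== 'string' || provided === '') return false;
  const candidate = sha256(provided);
  return validKeys.some(key => crypto.timingSafeEqual(candidate, sha256(key)));
}

// Express-style middleware factory; req.get() reads a request header.
function requireApiKey(validKeys) {
  return (req, res, next) => {
    if (!isValidApiKey(req.get('x-api-key'), validKeys)) {
      return res.status(401).json({ error: 'Invalid or missing API key' });
    }
    next();
  };
}
```

It could be mounted with `app.use('/api', requireApiKey(keys))`, where `keys` is loaded from configuration rather than hard-coded.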

Next Steps

Add Puppeteer integration for JavaScript-rendered pages
Explore API security best practices for rate limiting and authentication
Try building the API Client with Electron for a desktop tool that consumes APIs
Learn about Redis caching patterns for high-performance applications
