I’m scraping book titles and URLs from a webpage using Playwright, and the data logs correctly in the console. However, I can’t figure out how to write this scraped content into a Markdown (.md) file.
Here’s the script:
const { chromium } = require('playwright');
const fs = require('fs');
// Scrapes book titles/URLs from books.toscrape.com, logs them, then tries to save a .md file.
(async () => {
const browser = await chromium.launch({ headless: true });
const page = await browser.newPage();
await page.goto("https://books.toscrape.com/");
// Runs in the browser context: collects one { title, url } record per .product_pod card.
const listcontent = await page.evaluate(() => {
const data = [];
document.querySelectorAll(".product_pod").forEach((book) => {
const title = book.querySelector('.thumbnail').getAttribute("alt");
const url = book.querySelector('a').getAttribute("href");
data.push({ title, url });
});
return data;
});
// Working console output
for (const { title, url } of listcontent) {
console.log(`[${title}](${url})`);
}
// ❌ Problem: I can’t get this to write properly into a Markdown file
// NOTE(review): two defects here — fs.promises.writeFile returns a Promise that
// is never awaited (the script can close the browser/exit before the write
// completes), and the content written is only `---\n---\n`; the scraped list
// is never included in the string passed to writeFile.
fs.promises.writeFile(`file.md`, `---\n---\n`);
await browser.close();
})();
The console shows the correct data, but the .md file remains empty or doesn’t include the list.
How can I properly write the scraped data into a Markdown file using Playwright and Node.js?
Your file stays (almost) empty for two reasons: `fs.promises.writeFile` returns a Promise that you never `await`, so the script can close the browser and exit before the write finishes; and the content you pass it is only the `---\n---\n` front matter — the scraped list is never included in the string. The code below builds the full Markdown document and awaits a single write:
const { chromium } = require('playwright');
const fs = require('fs').promises;

// Scrapes book titles/URLs from books.toscrape.com and writes them to books.md.
(async () => {
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.goto("https://books.toscrape.com/");

    // Runs inside the page: one { title, url } record per .product_pod card.
    const listcontent = await page.evaluate(() => {
      return Array.from(document.querySelectorAll(".product_pod")).map(book => ({
        title: book.querySelector('.thumbnail').alt,
        url: book.querySelector('a').href,
      }));
    });

    // Build the whole document in memory, then write it once.
    // Awaiting the write is the key point: fs.promises.writeFile returns a
    // Promise, and without `await` the script may exit before the file lands.
    const markdown = `# 📚 Book Titles\n\n${listcontent.map(b => `- [${b.title}](${b.url})`).join('\n')}`;
    await fs.writeFile('books.md', markdown, 'utf-8');
    console.log("✅ Markdown file saved!");
  } finally {
    // Always release the browser, even if navigation or scraping throws —
    // otherwise a failed run leaks a headless browser process.
    await browser.close();
  }
})();
Best for structured output - overwrites old content each run.
Useful if you scrape multiple pages or sessions.
const fs = require('fs').promises;
// After scraping: append this run's results to books.md without touching
// whatever earlier runs already wrote there.
const lines = [];
for (const { title, url } of listcontent) {
  lines.push(`- [${title}](${url})`);
}
await fs.appendFile('books.md', `\n${lines.join('\n')}\n`, 'utf-8');
Adds new data without deleting previous runs.
Ideal for very large datasets - avoids building huge strings in memory.
const fs = require('fs');
const { once } = require('events');

// Streams one line at a time — avoids holding the whole document in memory.
const stream = fs.createWriteStream('books.md');
stream.write('# 📚 Book Titles\n\n');
for (const { title, url } of listcontent) {
  stream.write(`- [${title}](${url})\n`);
}
stream.end();
// Key fix: wait until the stream has flushed to disk before continuing
// (e.g. before browser.close() or process exit). `once` also rejects this
// await if the stream emits 'error', so write failures are no longer silent.
await once(stream, 'finish');
console.log("✅ Markdown written using stream!");