Ragie Skill
April 14, 2026

Building a Tax Document Assistant with the Ragie Skill

Bob Remeika, Co-Founder and CEO

Tax season means digging through a pile of PDFs — W-2s, 1099s, IRS publications — trying to answer questions like "What was my total freelance income?" or "Am I eligible for the home office deduction?" Instead of ctrl+F'ing through 15 documents, let's build a RAG app that answers questions from your actual tax documents.

We'll use the Ragie skill in Claude Code to build the whole thing. The skill gives the AI agent context on Ragie's SDK — document ingestion, retrieval, metadata filtering, RAG patterns — so we can go from zero to a working app without leaving the terminal.

You can find the complete source code for this tutorial at ragieai/examples/tax-assistant.

Prerequisites

- Claude Code installed
- A Ragie account and API key
- An Anthropic API key

Install the Ragie Skill

First, add the Ragie skill to Claude Code:

npx skills add ragieai/skills

This gives your agent access to Ragie's SDK patterns — ingestion, retrieval, metadata filtering, and RAG generation. From here on out, when we ask Claude Code to do something with Ragie, it knows what to do.

Scaffold the Project

We start by telling Claude Code what we want to build:

Set up a new TypeScript project for a CLI tax document assistant. Install the ragie and @anthropic-ai/sdk packages. Load API keys from environment variables.

The agent scaffolds the project, installs dependencies, and sets up the Ragie and Anthropic clients:

import { Ragie } from "ragie";
import Anthropic from "@anthropic-ai/sdk";

const ragie = new Ragie({ auth: process.env.RAGIE_API_KEY });
const anthropic = new Anthropic();

Nothing surprising here — but we didn't have to look up which packages to install or how to initialize them. The skill handled that.
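One gap worth closing: if `RAGIE_API_KEY` is unset, the client is constructed with `undefined` and the failure only shows up on the first request. Below is a minimal sketch of a fail-fast check. `requireEnv` and the `Env` type are hypothetical names for this example, not part of either SDK.

```typescript
// Hypothetical helper: fail at startup instead of on the first API call.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Possible usage with the clients above:
//   const ragie = new Ragie({ auth: requireEnv(process.env, "RAGIE_API_KEY") });
//   requireEnv(process.env, "ANTHROPIC_API_KEY"); // the Anthropic client reads this variable itself
```

Passing the environment in as an argument keeps the helper easy to test without touching real keys.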

Ingest Tax Documents

Now for the interesting part. We have a folder of tax documents:

documents/
  w2-acme-corp.pdf
  1099-nec-freelance.pdf
  1099-int-savings.pdf
  irs-pub-17-your-federal-income-tax.pdf
  irs-pub-587-home-office.pdf
  receipts-home-office.pdf

We tell Claude Code:

Write an ingestion script that uploads all PDFs from a documents/ directory to Ragie. Tag each document with metadata — type should be "w2", "1099", "irs-publication", or "receipt" based on the filename. Add taxYear: 2025 to all of them. Poll until each document is ready.

The agent writes the ingestion script, using `documents.create()` for file uploads and tagging each document with metadata:

import { readFile, readdir } from "fs/promises";

function classifyDocument(filename: string): string {
  if (filename.startsWith("w2")) return "w2";
  if (filename.startsWith("1099")) return "1099";
  if (filename.startsWith("irs-")) return "irs-publication";
  return "receipt";
}

async function ingest() {
  const files = await readdir("documents");

  for (const file of files.filter((f) => f.endsWith(".pdf"))) {
    const buffer = await readFile(`documents/${file}`);
    const doc = await ragie.documents.create({
      file: new Blob([new Uint8Array(buffer)], { type: "application/pdf" }),
      name: file,
      metadata: {
        type: classifyDocument(file),
        taxYear: 2025,
      },
    });

    console.log(`Uploaded ${file} (${doc.id}) — waiting for processing...`);
    await waitForReady(doc.id);
    console.log(`${file} ready`);
  }
}
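Note that `classifyDocument` tags anything it doesn't recognize as a receipt, so a stray `bank-statement.pdf` would be filed under `receipt`. If that matters for your folder, a stricter variant could look like the sketch below. `classifyStrict`, `RULES`, and the exact patterns are assumptions based on the sample filenames above, not something the skill generated.

```typescript
type DocType = "w2" | "1099" | "irs-publication" | "receipt";

// Ordered filename rules; the first matching pattern wins.
// Patterns are assumptions derived from the sample documents/ listing.
const RULES: ReadonlyArray<[RegExp, DocType]> = [
  [/^w2-/i, "w2"],
  [/^1099-/i, "1099"],
  [/^irs-/i, "irs-publication"],
  [/^receipts?-/i, "receipt"],
];

// Returns undefined for unrecognized names so the caller can skip or flag the file.
function classifyStrict(filename: string): DocType | undefined {
  for (const [pattern, type] of RULES) {
    if (pattern.test(filename)) return type;
  }
  return undefined;
}
```

The ingest loop could then log and skip files where `classifyStrict` returns `undefined` rather than uploading them with a misleading tag.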

The `waitForReady` helper polls the document status every three seconds until it reports `ready`. It throws if the document reports `failed` or if the timeout (two minutes by default) passes first:

async function waitForReady(docId: string, timeoutMs = 120_000) {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    const doc = await ragie.documents.get({ documentId: docId });
    if (doc.status === "ready") return;
    if (doc.status === "failed") throw new Error(`Document ${docId} failed`);
    await new Promise((r) => setTimeout(r, 3000));
  }
  throw new Error(`Document ${docId} not ready after ${timeoutMs}ms`);
}
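A fixed three-second interval is fine for six PDFs. For larger batches, one option is to back off between checks. The sketch below is a variation on `waitForReady`, not what the agent produced: `pollUntilReady`, `backoffDelay`, and the option names are hypothetical, and the status lookup is passed in as a function so the loop can be exercised without calling the API.

```typescript
interface PollOptions {
  timeoutMs?: number;
  initialDelayMs?: number;
  maxDelayMs?: number;
}

// Delay before the next check: doubles with each attempt (0-based), capped at maxDelayMs.
function backoffDelay(attempt: number, initialDelayMs: number, maxDelayMs: number): number {
  return Math.min(initialDelayMs * Math.pow(2, attempt), maxDelayMs);
}

// Resolves with the number of status checks made once the status is "ready".
async function pollUntilReady(
  fetchStatus: () => Promise<string>,
  { timeoutMs = 120_000, initialDelayMs = 1_000, maxDelayMs = 10_000 }: PollOptions = {},
): Promise<number> {
  const start = Date.now();
  for (let attempt = 0; Date.now() - start < timeoutMs; attempt++) {
    const status = await fetchStatus();
    if (status === "ready") return attempt + 1;
    if (status === "failed") throw new Error("Document processing failed");
    const delay = backoffDelay(attempt, initialDelayMs, maxDelayMs);
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  throw new Error(`Document not ready after ${timeoutMs}ms`);
}

// Possible usage with the client above:
//   await pollUntilReady(async () =>
//     (await ragie.documents.get({ documentId: docId })).status);
```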

Metadata is the key detail here. By tagging documents with `type` and `taxYear`, we can scope queries later — search only your 1099s for income questions, or only IRS publications for eligibility rules.

Retrieve and Filter

With documents ingested, we can search them. We ask Claude Code:

Add a retrieval function that searches my tax documents. It should accept a query and an optional document type filter. Use rerank for quality.

async function searchDocuments(query: string, docType?: string) {
  const results = await ragie.retrievals.retrieve({
    query,
    rerank: true,
    topK: 6,
    filter: docType ? { type: docType } : undefined,
  });

  return results.scoredChunks;
}

Now we can do things like:

// Search everything
await searchDocuments("What was my total income?");

// Search only 1099s
await searchDocuments("What was my freelance income?", "1099");

// Search only IRS publications
await searchDocuments("Am I eligible for the home office deduction?", "irs-publication");

The metadata filter narrows the search before it happens — Ragie only looks at documents matching the filter, then runs hybrid search and reranking on that subset.
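`searchDocuments` takes a single document type. To scope a search to several types at once (say, W-2s and 1099s together) or to a specific tax year, the filter object could be built separately. This is a sketch under one assumption: that Ragie's metadata filters accept `$eq`, `$in`, and `$and` operators. Check the filter documentation before relying on the exact operator names. `buildFilter` and `MetadataFilter` are hypothetical names.

```typescript
// Assumption: operator names ($eq, $in, $and) follow Ragie's metadata filter syntax.
type MetadataFilter = Record<string, unknown>;

function buildFilter(docTypes: string[] = [], taxYear?: number): MetadataFilter | undefined {
  const clauses: MetadataFilter[] = [];

  if (docTypes.length === 1) clauses.push({ type: { $eq: docTypes[0] } });
  if (docTypes.length > 1) clauses.push({ type: { $in: docTypes } });
  if (taxYear !== undefined) clauses.push({ taxYear: { $eq: taxYear } });

  if (clauses.length === 0) return undefined; // no filter: search everything
  return clauses.length === 1 ? clauses[0] : { $and: clauses };
}

// Possible usage:
//   filter: buildFilter(["w2", "1099"], 2025)
```

Returning `undefined` when nothing is selected matches how `searchDocuments` already handles the unfiltered case.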

Wire Up RAG with Claude

The final piece: take the retrieved chunks and generate an answer. We ask Claude Code:

Add a function that takes a user question, retrieves relevant tax document chunks, and passes them to Claude to generate an answer with citations. Reference source documents by name.

async function askTaxQuestion(question: string, docType?: string) {
  const chunks = await searchDocuments(question, docType);

  const context = chunks
    .map((c) => `[Source: ${c.documentName}]\n${c.text}`)
    .join("\n\n");

  const response = await anthropic.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: `You are a tax document assistant. Answer the question using only the provided context. Cite the source document name for each claim. If the context doesn't contain enough information to answer, say so.

Context:
${context}

Question: ${question}`,
      },
    ],
  });

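  // The first content block is expected to be text; fall back to an empty string otherwise
  const first = response.content[0];
  return first?.type === "text" ? first.text : "";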
}

Let's try it:

console.log(await askTaxQuestion("What was my total wage income in 2025?"));

Based on your W-2 from Acme Corp, your total wage income for 2025 was
$95,000, with $18,240 withheld for federal income tax.
(Source: w2-acme-corp.pdf)

console.log(
  await askTaxQuestion(
    "Am I eligible for the home office deduction?",
    "irs-publication"
  )
);

According to IRS Publication 587, you may be eligible for the home office
deduction if you use part of your home regularly and exclusively for
business. Since your 1099-NEC indicates freelance income, you likely
qualify under the simplified method ($5 per square foot, up to 300 sq ft).
(Source: irs-pub-587-home-office.pdf)

The citations tie answers back to specific documents — so you can verify anything the assistant tells you.
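Part of that verification can be automated by checking that every cited name actually appears among the retrieved chunks. The sketch below assumes citations look like `(Source: name)` or `[Source: name]`, as in the sample output. The model is free to format them differently, so treat the pattern as a starting point. `extractCitations` and `findUnknownCitations` are hypothetical helpers.

```typescript
// Collects distinct document names cited as "(Source: name)" or "[Source: name]".
// The citation format is an assumption based on the sample answers.
function extractCitations(answer: string): string[] {
  const pattern = /[(\[]Source:\s*([^)\]]+)[)\]]/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const name = (match[1] ?? "").trim();
    if (name && names.indexOf(name) === -1) names.push(name);
  }
  return names;
}

// Returns cited names that were not among the retrieved documents.
function findUnknownCitations(answer: string, retrievedNames: string[]): string[] {
  return extractCitations(answer).filter((name) => retrievedNames.indexOf(name) === -1);
}

// Possible usage inside askTaxQuestion:
//   const unknown = findUnknownCitations(answerText, chunks.map((c) => c.documentName));
//   if (unknown.length > 0) console.warn(`Unverified sources: ${unknown.join(", ")}`);
```

A non-empty result doesn't prove the answer is wrong, but it is a cheap signal that a citation deserves a second look.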

Put It Together

We ask Claude Code to wrap it in a CLI loop:

Add a simple CLI that takes user input in a loop, asks the tax question, and prints the answer. Let the user prefix a question with a document type filter like "1099: what was my freelance income?"

import * as readline from "readline";

const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

function prompt() {
  rl.question("\nAsk a tax question (or 'quit'): ", async (rawInput) => {
    const input = rawInput.trim();
    if (input.toLowerCase() === "quit") return rl.close();

    let query = input;
    let docType: string | undefined;

    // Parse optional filter prefix like "1099: what was my income?"
    const match = input.match(/^(w2|1099|irs-publication|receipt):\s*(.+)/i);
    if (match) {
      docType = match[1].toLowerCase();
      query = match[2];
    }

    try {
      const answer = await askTaxQuestion(query, docType);
      console.log(`\n${answer}`);
    } catch (err) {
      // Keep the loop alive if retrieval or generation fails
      console.error(`\nSomething went wrong: ${err instanceof Error ? err.message : err}`);
    }
    prompt();
  });
}

console.log("Tax Document Assistant");
console.log("Tip: prefix with a doc type to filter (e.g., '1099: freelance income?')");
prompt();
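The prefix parsing is the one piece of logic in this loop that is easy to get subtly wrong, so it can help to pull it into a pure function that can be tested on its own. This is a sketch of that refactor, not code the agent produced. `parseInput`, `ParsedInput`, and `DOC_TYPES` are hypothetical names.

```typescript
const DOC_TYPES = ["w2", "1099", "irs-publication", "receipt"] as const;
type DocType = (typeof DOC_TYPES)[number];

interface ParsedInput {
  query: string;
  docType?: DocType;
}

// Splits "1099: question" into a doc type and a query.
// Unknown prefixes are left in the query untouched.
function parseInput(input: string): ParsedInput {
  const trimmed = input.trim();
  const match = trimmed.match(/^([a-z0-9-]+):\s*(.+)$/i);
  if (match) {
    const prefix = (match[1] ?? "").toLowerCase();
    const known = DOC_TYPES.find((t) => t === prefix);
    if (known) return { query: (match[2] ?? "").trim(), docType: known };
  }
  return { query: trimmed };
}

// Possible usage inside the rl.question callback:
//   const { query, docType } = parseInput(input);
```

Keeping the allowed types in one array also means the regex doesn't need updating when a new document type is added.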

Run it:

cp .env.example .env  # add your API keys
npm install
npm run ingest        # one-time: upload your documents
npm run start         # start asking questions

What We Built

A CLI that ingests your tax documents into Ragie, tags them with metadata, and lets you ask natural language questions with cited answers — all built by telling Claude Code what we wanted.

The Ragie skill is what made this smooth. We didn't look up SDK docs, figure out which methods to call, or guess at the retrieval options. The agent had that context already and used it at each step: `documents.create()` for file uploads, metadata tagging for filtering, `rerank: true` for quality, citations for trust.

The full source code is on GitHub if you want to try it with your own documents.

Disclaimer: This is for educational purposes. Consult a qualified tax professional for actual tax advice.
