Engineering
May 14, 2025

From Files to Insight: Building a document summarizer in minutes with Ragie

Stone Werner, Engineer

One of the most common use cases when building AI applications is the need to summarize documents. In this post I will walk you through the core concepts behind how I built Summarie, an open-source reference app that allows you to upload any file (even images, audio, and video) and instantly get a detailed summary of the content.

This web app is built using Next.js 15 and the Ragie TypeScript client.

Prerequisites

Before we get started, make sure to go to https://ragie.ai, sign up for a free developer account, and create an API key.

Project setup

Let's start by creating a new Next.js project and installing the Ragie TypeScript client:

# Create Next.js App
npx create-next-app@latest file-summarizer

# Change to project directory
cd file-summarizer

# Install the Ragie SDK
npm install ragie

Test your server

To run the Next.js development server:

npm run dev

The server will start at http://localhost:3000. Open this URL in your browser to see the default Next.js app home screen.

After installation, create a .env.local file in your project root and add your Ragie API key:

RAGIE_API_KEY=<your_api_key_here>
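A missing or misspelled key in .env.local tends to surface later as a confusing authentication error. As a minimal sketch, a small guard can fail fast instead. The requireEnv helper below is hypothetical (it is not part of Next.js or the Ragie SDK) and takes the environment as a parameter so it is easy to test:

```typescript
// Hypothetical helper: not part of Next.js or the Ragie SDK.
type Env = Record<string, string | undefined>;

// Returns the trimmed value of an environment variable,
// or throws a descriptive error if it is missing or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In a route handler you could call:
//   const apiKey = requireEnv(process.env, "RAGIE_API_KEY");
```

Passing the result to the Ragie client's auth option would then surface configuration mistakes at request time with a clear message.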

Building the file upload interface

We'll create a simple file upload interface using Next.js. First, let's create the main page component:

// app/page.tsx
'use client';
import { useRouter } from 'next/navigation';

export default function Home() {
  const router = useRouter();
  
  // TODO: Implement function (placeholder no-op for now)
  const handleUpload = () => {};

  return (
    <div className="p-8">
      <h1 className="text-2xl font-bold">Summarizer</h1>
      <input
        type="file"
        onChange={handleUpload}
      />
    </div>
  );
}

Making a page.tsx file a client component with the 'use client' directive is typically advised against, but we can do it here for demonstration's sake.

Next, we'll implement the file upload handler (handleUpload) that sends the file to our API:

const handleUpload = async (e: React.ChangeEvent<HTMLInputElement>) => {
  const file = e.target.files?.[0];
  if (!file) return;

  const formData = new FormData();
  formData.append("file", file);

  const response = await fetch("/api/upload", {
    method: "POST",
    body: formData,
  });

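  // Stop here if the API route reported an error
  if (!response.ok) {
    console.error("Upload failed with status", response.status);
    return;
  }

  const data = await response.json();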
  router.push(`/documents/${data.docId}`);
};

Here we define the handleUpload function to take the file from the component and send it to our API at /api/upload. After the upload is complete, we redirect the user to the document page. For a more robust implementation, check out Summarie which includes drag-and-drop support, file size limits, and file type validation.

Implementing the upload API

Create an API route handler to process file uploads and send them to Ragie.

(Note: It is very important we use a route handler like this so we do not expose our API key to the client.)

// app/api/upload/route.ts
import { NextResponse } from "next/server";
import { Ragie } from "ragie";

export async function POST(request: Request) {
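  const formData = await request.formData();
  const file = formData.get("file");

  // Reject requests that did not include a file
  if (!(file instanceof File)) {
    return NextResponse.json({ error: "No file provided" }, { status: 400 });
  }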

  const ragie = new Ragie({
    auth: process.env.RAGIE_API_KEY,
  });

  const result = await ragie.documents.create({
    file: file,
    mode: "hi_res",
  });

  return NextResponse.json({docId: result.id});
}

That's it! Once the user submits our form, the file will automatically begin processing in Ragie. For production use, consider adding metadata to track user IDs and other relevant information like we did in Summarie.
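To make the metadata suggestion concrete, here is a minimal sketch of a helper that builds a metadata object for an upload. The helper name and the field names (userId, originalName, sizeBytes, uploadedAt) are my own illustrative choices, and it assumes the create call accepts a metadata object of simple values; check the Create Document API documentation for the exact rules before relying on it.

```typescript
// Hypothetical helper: field names are illustrative, not a Ragie convention.
type MetadataValue = string | number | boolean | string[];

function buildUploadMetadata(
  userId: string,
  fileName: string,
  fileSize: number,
): Record<string, MetadataValue> {
  return {
    userId,                                // who uploaded the file
    originalName: fileName,                // file name as sent by the browser
    sizeBytes: fileSize,                   // size in bytes
    uploadedAt: new Date().toISOString(),  // upload timestamp (ISO 8601)
  };
}

// Possible use inside the route handler (assumes a `metadata` option):
//   await ragie.documents.create({
//     file,
//     mode: "hi_res",
//     metadata: buildUploadMetadata(userId, file.name, file.size),
//   });
```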

Tracking document status

Once uploaded, files are automatically parsed, chunked, and indexed. You can get information about a document, including its status, with the Get Document API call.

Create a new page.tsx file to display information about a specific document based on the document id:

// app/documents/[id]/page.tsx
import DocumentSummary from './DocumentSummary';

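export default async function DocumentPage({ params }: { params: Promise<{ id: string }> }) {
  // Next.js 15 passes route params as a Promise, so await them before use
  const { id } = await params;
  return <DocumentSummary id={id} />;
}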

Here we have to keep the page.tsx file a server component and pass the document id to the client component as a prop.

// app/documents/[id]/DocumentSummary.tsx
'use client';

import { useEffect, useState } from 'react';

export default function DocumentSummary({ id }: { id: string }) {
  const [document, setDocument] = useState<any>(null);
  const [loading, setLoading] = useState(true);

  useEffect(() => {
    const fetchDocument = async () => {
      try {
        const response = await fetch(`/api/documents/${id}`);
        const data = await response.json();
        setDocument(data);
      } finally {
        setLoading(false);
      }
    };
    fetchDocument();
  }, [id]);

  if (loading) return <div className="p-8">Loading...</div>;
  if (!document) return <div className="p-8">Document not found</div>;

  return (
    <div className="p-8">
      <h1 className="text-2xl font-bold mb-4">Document Details</h1>
      <pre className="text-black bg-gray-100 p-4 rounded">
        {JSON.stringify(document, null, 2)}
      </pre>
    </div>
  );
}

Create an API route handler to get the document information from Ragie:

// app/api/documents/[id]/route.ts
import { NextResponse } from "next/server";
import { Ragie } from "ragie";

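export async function GET(request: Request, context: { params: Promise<{ id: string }> }) {
  // Next.js 15 passes route params as a Promise, so await them before use
  const params = await context.params;
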
  const ragie = new Ragie({
    auth: process.env.RAGIE_API_KEY,
  });
  const document = await ragie.documents.get({ documentId: params.id });
  return NextResponse.json(document);
}

We now have a document page and an API route that fetches the document by ID, including its processing status.

Before we can get the summary, we have to check if the document is ready. Documents are available for retrieval once in the "ready" state. For more information about how Ragie tracks document status, see the Create Document API documentation.
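As a rough sketch, the readiness check can be captured in one small function. It only relies on the two status values used in this post ("ready" and "failed") and treats anything else as still processing; the function and type names are hypothetical.

```typescript
// Hypothetical helper: maps a document status to what the UI should do next.
type NextStep = "show-summary" | "show-error" | "keep-polling";

function decideNextStep(status: string | undefined): NextStep {
  if (status === "ready") return "show-summary";
  if (status === "failed") return "show-error";
  // Any other value (or no value yet) is treated as still processing.
  return "keep-polling";
}
```

Keeping this decision in one place means the page and the route handler cannot disagree about when a document is done.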

Polling for status

To poll our endpoint every 5 seconds, update the DocumentSummary component like this:

// app/documents/[id]/DocumentSummary.tsx
'use client';

import { useEffect, useRef, useState } from 'react';

export default function DocumentSummary({ id }: { id: string }) {
  const [document, setDocument] = useState<any>(null);
  const [loading, setLoading] = useState(true);
  const intervalRef = useRef<NodeJS.Timeout | null>(null);

  useEffect(() => {
    const fetchDocument = async () => {
      try {
        const response = await fetch(`/api/documents/${id}`);
        const data = await response.json();
        setDocument(data);
        // If document is ready or failed, clear the interval
        if (data.status === "ready" || data.status === "failed") {
          if (intervalRef.current) {
            clearInterval(intervalRef.current);
            intervalRef.current = null;
          }
        }
      } finally {
        setLoading(false);
      }
    };
    
    fetchDocument();
    
    // Set up polling
    intervalRef.current = setInterval(fetchDocument, 5000);

    return () => {
      if (intervalRef.current) {
        clearInterval(intervalRef.current);
      }
    };
  }, [id]);

  if (loading) return <div className="p-8">Loading...</div>;
  if (!document) return <div className="p-8">Document not found</div>;

  return (
    <div className="p-8">
      <h1 className="text-2xl font-bold mb-4">Document Details</h1>
      <pre className="text-black bg-gray-100 p-4 rounded">
        {JSON.stringify(document, null, 2)}
      </pre>
    </div>
  );
}

We keep checking on the status of our document by polling the API route handler we just created. For a more robust implementation, you can also use Ragie webhooks to track document status.
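The interval above keeps running until the document reaches "ready" or "failed". One way to bound it, sketched here outside of React, is a polling loop with a maximum number of attempts. The pollUntilDone name, its options, and the injected fetchOnce function are all hypothetical; in the app, fetchOnce would wrap the fetch call to /api/documents/[id].

```typescript
// Hypothetical helper: polls until a terminal status or until attempts run out.
interface PollOptions {
  intervalMs: number;   // delay between attempts
  maxAttempts: number;  // give up after this many fetches
}

async function pollUntilDone<T extends { status?: string }>(
  fetchOnce: () => Promise<T>,
  { intervalMs, maxAttempts }: PollOptions,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchOnce();
    // "ready" and "failed" are the terminal states used in this post
    if (result.status === "ready" || result.status === "failed") {
      return result;
    }
    // Wait before the next attempt, but not after the final one
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Document still processing after ${maxAttempts} attempts`);
}

// Possible use:
//   const doc = await pollUntilDone(
//     () => fetch(`/api/documents/${id}`).then((r) => r.json()),
//     { intervalMs: 5000, maxAttempts: 60 },
//   );
```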

Document Summaries

Ragie automatically creates detailed summaries for every document as part of the processing. To get the summary, we can include it as a part of our existing document API route handler.

// If document is ready, fetch the summary
if (document.status === "ready") {
  const result = await ragie.documents.getSummary({ documentId: params.id });

  return NextResponse.json({
    ...document,
    summary: result.summary,
  });
}

Here is the full route handler with the summary included.

// app/api/documents/[id]/route.ts
import { NextResponse } from "next/server";
import { Ragie } from "ragie";

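export async function GET(request: Request, context: { params: Promise<{ id: string }> }) {
  // Next.js 15 passes route params as a Promise, so await them before use
  const params = await context.params;
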
  const ragie = new Ragie({
    auth: process.env.RAGIE_API_KEY,
  });
  const document = await ragie.documents.get({ documentId: params.id });
  // If document is ready, fetch the summary
  if (document.status === "ready") {
    const result = await ragie.documents.getSummary({ documentId: params.id });
    return NextResponse.json({
      ...document,
      summary: result.summary,
    });
  }
  return NextResponse.json(document);
}

Now just update our page to show the summary when it is available:

   {document.summary && (
     <div className="mb-8">
       <h2 className="text-xl font-semibold mb-2">Summary</h2>
       <div className="text-black bg-white p-4 rounded border">
         {document.summary}
       </div>
     </div>
   )}

Here is the full document page with polling and summary included:

// app/documents/[id]/DocumentSummary.tsx
'use client';

import { useEffect, useState, useRef } from 'react';

export default function DocumentSummary({ id }: { id: string }) {
  const [document, setDocument] = useState<any>(null);
  const [loading, setLoading] = useState(true);
  const intervalRef = useRef<NodeJS.Timeout | null>(null);

  useEffect(() => {
    const fetchDocument = async () => {
      try {
        const response = await fetch(`/api/documents/${id}`);
        const data = await response.json();
        setDocument(data);
        // If document is ready or failed, clear the interval
        if (data.status === "ready" || data.status === "failed") {
          if (intervalRef.current) {
            clearInterval(intervalRef.current);
            intervalRef.current = null;
          }
        }

      } finally {
        setLoading(false);
      }
    };
    fetchDocument();
    // Set up polling
    intervalRef.current = setInterval(fetchDocument, 5000);

    return () => {
      if (intervalRef.current) {
        clearInterval(intervalRef.current);
      }
    };
  }, [id]);

  if (loading) return <div className="p-8">Loading...</div>;
  if (!document) return <div className="p-8">Document not found</div>;

  return (
    <div className="p-8">
      <h1 className="text-2xl font-bold mb-4">Document Details</h1>
      {document.summary && (
        <div className="mb-8">
          <h2 className="text-xl font-semibold mb-2">Summary</h2>
          <div className="text-black bg-white p-4 rounded border">
            {document.summary}
          </div>
        </div>
      )}

      <h2 className="text-xl font-semibold mb-2">Document Information</h2>
      <pre className="text-black bg-gray-100 p-4 rounded">
        {JSON.stringify(document, null, 2)}
      </pre>
    </div>
  );
}

Just like that, we built a document summarizer! You saw how to set up file uploads, handle the backend API, track document progress, and retrieve summaries. Ragie makes it easy: three SDK calls (documents.create, documents.get, and documents.getSummary) cover uploading files, checking their status, and getting summaries. No need to worry about databases; Ragie handles the heavy lifting of processing and storing your data.

Below is a screenshot of what we just built.

This is a simplified web app for generating summaries from uploaded documents. Before considering this production-ready, please take the following into consideration.

Considerations

File Validation

  • Validate file types on both client and server side
  • Implement file size limits
  • Consider using a file validation library
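As a starting point, the checks above could look like the sketch below. The 10 MB limit and the list of allowed MIME types are placeholder assumptions, not Ragie limits, and validateFile is a hypothetical helper. Because it only reads name, size, and type, the same function can run in the browser and in the route handler.

```typescript
// Hypothetical helper: limits and allowed types are placeholders.
interface FileLike {
  name: string;
  size: number; // bytes
  type: string; // MIME type reported by the browser
}

const MAX_BYTES = 10 * 1024 * 1024; // assumed 10 MB cap
const ALLOWED_TYPES = ["application/pdf", "text/plain", "image/png", "image/jpeg"];

// Returns an error message, or null when the file passes all checks.
function validateFile(file: FileLike): string | null {
  if (file.size === 0) return "File is empty.";
  if (file.size > MAX_BYTES) return "File is larger than 10 MB.";
  if (!ALLOWED_TYPES.includes(file.type)) {
    return `Unsupported file type: ${file.type || "unknown"}`;
  }
  return null;
}
```

Keep in mind that the browser-reported MIME type can be spoofed, so a server-side check of the actual content is still worth adding.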

Error Handling

  • Provide clear error messages to users
  • Handle network errors gracefully
  • Implement retry logic for failed uploads
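For the retry item, a minimal sketch is a wrapper that re-runs an async operation with exponential backoff. The withRetry name and its defaults (3 attempts, 500 ms base delay) are hypothetical choices; in the app, the operation would be the fetch call to /api/upload.

```typescript
// Hypothetical helper: retries an async operation with exponential backoff.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Delay doubles each time: baseDelayMs, 2x, 4x, ...
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // All attempts failed: surface the most recent error
  throw lastError;
}
```

Note that fetch only rejects on network failures, so the operation should throw on a non-OK response if HTTP errors are meant to trigger a retry as well.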

Security

  • Never expose API keys in client-side code
  • Validate file types on the server
  • Implement rate limiting
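To illustrate the rate limiting item, here is a sketch of a fixed-window limiter kept in memory. The createRateLimiter helper is hypothetical, and because it lives in process memory it only works for a single server instance; a deployment with several instances or serverless functions would need a shared store instead.

```typescript
// Hypothetical helper: allows at most `limit` requests per key per window.
function createRateLimiter(
  limit: number,
  windowMs: number,
  now: () => number = () => Date.now(), // injectable clock for testing
): (key: string) => boolean {
  const windows = new Map<string, { start: number; count: number }>();

  return function allow(key: string): boolean {
    const current = now();
    const entry = windows.get(key);

    // Start a fresh window if none exists or the old one has expired
    if (!entry || current - entry.start >= windowMs) {
      windows.set(key, { start: current, count: 1 });
      return true;
    }
    if (entry.count < limit) {
      entry.count++;
      return true;
    }
    return false; // over the limit for this window
  };
}

// Possible use in the upload route, keyed by user ID or IP address:
//   const allowUpload = createRateLimiter(10, 60_000);
//   if (!allowUpload(userId)) { /* respond with status 429 */ }
```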

User Experience

  • Show upload progress
  • Provide feedback on success/failure
  • Allow multiple file uploads
  • Refresh UI after successful upload

Conclusion

While this example was only used for document summaries, data ingested into Ragie can power any RAG use case - from internal chatbots to personalized tutoring systems, automated meeting notes, contextual code assistants and beyond!

If you want to see the complete Summarie application, check out the open source repository on GitHub.