VuThanhThien®
LinkedIn

© 2026 Vũ Thanh Thiên. All rights reserved.

GitHubLinkedInFacebookEmail
Back to Blog
May 6, 2026
13 min read

How to Build Monthly/Weekly Reports with Holistics, S3, and api-reports

System DesignBackendReportingS3Architecture

Background

A B2B payment platform serving large enterprise clients processes millions of transactions daily across multiple services: loyalty points, gift cards, fuel payments, and more. At the end of every month (and often weekly), finance and operations teams need consolidated reports to reconcile and verify transactions.

The problem surfaces immediately: report generation is expensive. Running complex aggregation queries across tens of millions of rows directly against a production database means:

  • Holding connection pool resources and degrading the system for live users.
  • Increasing latency for customers actively making payments.
  • Risking timeouts on large datasets.

The solution: isolate reporting into a dedicated pipeline — Holistics as the query engine, S3 as durable storage, and api-reports as a thin API layer serving the frontend. The result: production stays unaffected, reports are generated on a schedule, and users download via pre-signed S3 URLs on demand.


Why Holistics + S3 + API?

Before diving into implementation, it's worth understanding the reasoning behind each architectural choice.

Holistics — A dedicated reporting query engine

Holistics is a BI tool that lets you write SQL queries, define export schedules (daily, monthly), and push results to multiple destinations. Instead of engineers maintaining custom cronjobs and complex queries, Holistics provides:

  • Visual query builder + raw SQL: The data team can update report logic without touching application code or deploying anything.
  • Built-in scheduling: Runs at the right time on the right day — no need to maintain a separate cron server.
  • SFTP export: Pushes files directly to S3 via the s3_snapshot connection, no intermediary code needed.
  • Parameterisation: Queries accept dynamic variables like date_range_start, $today, $yesterday — making daily and monthly reports flexible by default.

S3 — Immutable, low-cost storage

After Holistics exports a file, it lands in a single shared bucket (e.g. reports-export-bucket), separated by a strict path convention:

holistics/<env>/<enterprise>/snapshot/<date>/<folderName>/<fileName>.csv

Benefits:

  • Immutable: Once written, files don't change — great for auditing and compliance.
  • Low cost: CSV is the cheapest format for bulk tabular data at rest.
  • Pre-signed URLs: S3 natively supports temporary, scoped download URLs without exposing the bucket publicly.
  • Environment isolation: dev/<enterprise>/snapshot/... vs prod/<enterprise>/snapshot/... share one bucket but remain cleanly separated by path.

api-reports — A thin API layer that generates nothing

api-reports does not produce reports itself. It does exactly two things:

  1. Ingest: A cron job at 2am daily scans S3 for new files from the previous day and writes metadata rows to the database.
  2. Serve: Returns a list of available report files and pre-signed download URLs to the frontend.

This is a deliberately thin service — no heavy business logic, no complex queries at request time. The frontend calls the API, receives URLs, and the user clicks download.


System Architecture

Rendering diagram…

Detailed Flow: From Query to Download

1. Holistics generates and pushes the file to S3

Rendering diagram…

2. api-reports ingests metadata at 2am

Rendering diagram…

Critical caveat: If the snapshot config doesn't exist yet when the cron runs, the file is silently skipped. The file lives in S3 but the list API returns an empty array because there's no metadata row. Always create the config before the first Holistics export, or manually backfill the cron with --date=<date>.

3. Frontend fetches the list and the user downloads

Rendering diagram…

Data Model

api-reports tracks four entities to manage the full lifecycle of a report:

OnDemandReportConfig

export class OnDemandReportConfig {
  reportName: string;
  reportDescription: string;
  destination: 'web-dashboard' | 'web-admin';
  enterprise: string; // e.g. 'example-tenant' — short tenant slug; auto-populated from the account used to create the config
  reportMapping: Array<{
    /** report id in Holistics */
    reportId: string;
    /** filters passed to Holistics, e.g. { GLProfile: 'GIFT_CARD' } */
    prefilter: { [prop: string]: string };
  }>;
}

OnDemandReportExport

export class OnDemandReportExport {
  /** Holistics report id */
  reportId: string;
  jobId: string;
  status: 'pending' | 'exported';
  appliedFilter: { [prop: string]: string };
  /** populated when status becomes 'exported' */
  s3Url: string;
}

OnDemandReportGeneration

export class OnDemandReportGeneration {
  reportConfigId: string;
  status: 'pending' | 'generated';
  appliedFilter: { [prop: string]: string };
  /** ids of OnDemandReportExport rows this generation depends on */
  exportIds: string[];
  /** final merged report URL */
  s3Url: string;
}

OnDemandReportRequest

export class OnDemandReportRequest {
  emails: string[];
  status: 'pending' | 'sent';
  /** waiting for this generation to complete before sending email */
  generationId: string;
}

On-demand report state lifecycle:

Rendering diagram…

S3 Path Convention

One of the most important design decisions: use path prefixes, not multiple buckets.

reports-export-bucket/
└── holistics/
    ├── dev/
    │   └── example-tenant/
    │       └── snapshot/
    │           └── 2026-05-01/
    │               └── gift_card_expiry_report/
    │                   └── gift_card_expiry_report_2026-05-01.csv
    ├── staging/
    │   └── example-tenant/
    │       └── snapshot/
    │           └── ...
    └── prod/
        └── example-tenant/
            └── snapshot/
                └── 2026-05-01/
                    └── monthly_gift_card_with_positive_balance_report/
                        └── monthly_gift_card_with_positive_balance_report_2026-05-01.csv

Pattern: holistics/<env>/<enterprise>/snapshot/<date>/<folderName>/<fileName>.csv

The folderName is the single key that ties together three independent systems:

  1. The export path configured in the Holistics SFTP schedule
  2. The snapshot config registered in Admin Portal
  3. The API endpoint called by the frontend

If any of the three use a different value, the file lands in S3 but never appears in the UI.


How to Add a New Snapshot Report (End-to-End)

Step 1: Create the report in Holistics

  1. Create a new query in Holistics with the required columns and filters (e.g. date range, card program or product line).
  2. Note the report id from the URL: .../queries/588299 → id is 588299.
  3. Ensure the query accepts date parameters (date_range_start, date_range_end) if you need per-day or per-month exports.

Step 2: Register the config in Admin Portal

Go to Administrations → Snapshot reports → Create:

FieldExample
Report nameGift Card Expiry Report
Report id588299
Folder namegift_card_expiry_report (snake_case, must be unique)
Filename patterngift_card_expiry_report_$today.csv

The filename pattern must include $today, $yesterday, or $timestamp to prevent previous exports from being overwritten.

Step 3: Configure the SFTP export schedule in Holistics

In Holistics, open the report → Schedules → New SFTP Schedule:

  • SFTP Connection: select s3_snapshot
  • File Name: follow the path pattern:
    <env>/<enterprise>/snapshot/{{$today}}/<folderName>/<fileName>_{{$today}}.csv
    
    Example for dev, daily (replace example-tenant with your tenant slug):
    dev/example-tenant/snapshot/{{$today}}/gift_card_expiry_report/gift_card_expiry_report_{{$today}}.csv
    
  • Schedule: match what you set in Admin Portal (e.g. daily at 00:00, or 1st of every month).

Step 4: Implement the frontend page (web-admin)

Rendering diagram…

API call:

GET {reportsBaseUrl}/snapshot/report/configFolder/gift_card_expiry_report/list?from=2026-05-01&to=2026-05-31

Response shape:

[
  {
    "fileName": "gift_card_expiry_report_2026-05-01.csv",
    "createdAt": "2026-05-02T02:15:00Z",
    "downloadUrl": "https://s3.amazonaws.com/reports-export-bucket/holistics/prod/example-tenant/snapshot/..."
  }
]

No separate API call needed for download — just <a href={downloadUrl} target="_blank">.

No backend changes are required in web-admin for list/download; the backend is api-reports. You only add a new page and card that use the existing snapshot APIs with your new folder name.


Checklist: Adding a New Snapshot Report

Rendering diagram…

Order matters: Create the Admin Portal config before Holistics runs its first export. If done in reverse, the file will exist in S3 but the cron will have already scanned that date without a matching config — resulting in no metadata row and an empty list API. Fix: manually backfill the snapshot save cron with --date=<date>.


Troubleshooting: File Exists in S3 but UI Shows Nothing

Rendering diagram…

Most common root cause: The config was created after the Holistics export ran. The file exists in S3 but the cron already finished scanning that day without a matching config, so no metadata row was written. The list API returns [].

Also verify environment alignment: the config creation, Holistics export, S3 path, and frontend/list API must all point to the same environment (dev / staging / prod).


Quick Reference

TaskWhere
Create report (query + columns)Holistics
Register report id + folder name + filename patternAdmin Portal → Snapshot reports
Set schedule and S3 export pathHolistics → Schedules → New SFTP Schedule (s3_snapshot)
Scan S3 and store metadataapi-reports cron (2am daily)
List reports for a folderapi-reports: GET .../snapshot/report/configFolder/{folderName}/list
DownloadPre-signed S3 URL returned in list response (downloadUrl)
Show list + Download in UIweb-admin: new page following same pattern as existing snapshot pages

Real Example: Gift Card Expiry Report

StepValue / Action
Holistics reportDuplicated query id 701234 — list of gift cards expired by month, covering physical and virtual card types
Admin PortalReport name: Gift Card Expiry Report; Report id: 701234; Folder name: monthly_gift_card_with_positive_balance_report; Filename pattern: monthly_gift_card_with_positive_balance_report_$today.csv
Holistics SFTPConnection: s3_snapshot; path: dev/example-tenant/snapshot/{{$today}}/monthly_gift_card_with_positive_balance_report/monthly_gift_card_with_positive_balance_report_{{$today}}.csv; schedule: monthly on 1st
FrontendNew card on Reports landing; route reports/gift-card-expiry; list page with folderName = 'monthly_gift_card_with_positive_balance_report', columns: Report name, Created on, Download

Why This Architecture Works

ConcernHow it's handled
Production isolationHolistics reads from a data warehouse or read replica — never touches the production DB
ScalabilityLarge files only cost S3 storage, not compute at download time
AuditabilityFiles are immutable on S3 with timestamps in the filename — easy to trace
MaintainabilityData team updates query logic in Holistics without code deployments
SecurityPre-signed URLs expire after ~15 minutes; the bucket is never exposed publicly
CostCSV + S3 is the cheapest combination for bulk tabular data; Holistics runs off-peak

Constraints and Assumptions

  • Reports are CSV only (no custom styling like bold/underline).
  • A single report config can combine multiple Holistics sub-reports via reportMapping.
  • If a report requires special formatting (Excel, branded PDF), custom code must be written in api-reports to handle that case.
  • On-demand reports (user-triggered with custom filters) require an email address — the file is not available for immediate download; it is delivered via email once generated.

Summary

This pattern solves the expensive reporting problem by keeping reporting completely separate from production. The pipeline has three clearly defined layers:

  1. Holistics: Defines the query, runs on schedule, and pushes CSV files to S3 via the s3_snapshot SFTP connection.
  2. S3: Immutable object storage with a strict path convention — environments and report types are isolated by path prefix, not separate buckets.
  3. api-reports: A thin API layer — scans S3 at 2am, stores metadata, and serves pre-signed download URLs to the frontend.

The single most important design detail: folderName is the string that ties Holistics, Admin Portal, and the frontend together. All three must match exactly. A mismatch in any one of them means the file exists in S3 but never appears in the UI.

This architecture scales gracefully — adding a new report type requires a Holistics query, an Admin config, and a frontend page. No changes to the backend core are needed.

Written by Vũ Thanh Thiên