Geek & Curt
Ever thought about automating our quarterly KPI reports with a lightweight Flask microservice and Celery workers to shave hours off the prep time?
Sure, but we need a clear data source, authentication, and a test plan first. Make sure the Flask service plugs into our existing BI stack, set up a schedule for the Celery workers, and monitor for failures. If you can lay out the specs, we can prototype and cut those prep hours.
Alright, here’s a quick specs draft; rough code sketches for items 1–6 follow the list:
1. Data source: Pull raw KPI tables from the SQL‑Server BI warehouse via a read‑only user. Use an ORM layer (SQLAlchemy) so the Flask app can query with simple model objects.
2. Auth: Wrap the Flask endpoints with OAuth2, using the company’s IdP. The service will request a short‑lived token, validate it on each call, and log the user for audit.
3. API: Expose a single GET /kpi‑summary endpoint that returns JSON. It accepts optional start/end dates; defaults to the last fiscal quarter.
4. Celery workers:
– Use Redis broker.
– Schedule a daily task at 02:00 UTC to pre‑compute the KPI summary for the previous day.
– Workers write results to a cache table in the BI warehouse.
5. Monitoring:
– Push Celery heartbeat to Prometheus.
– Alert on task failures, queue length > 10, or worker latency > 5 s.
– Log every API request/response with request ID, user, and processing time.
6. Test plan:
– Unit tests for Flask routes (mock DB).
– Integration tests: spin up a test DB, run Celery task, verify cache table.
– Load test: 100 concurrent GET /kpi‑summary calls; 99.5% of responses must complete within 200 ms.
7. Deployment: Containerize with Docker, tag images with the version, push them to the registry, and run with Docker Compose (Flask, Celery worker, Redis, Prometheus exporter).
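Rough sketch for item 1, the read-only SQLAlchemy layer. The connection string, source table, and column names are placeholders until we pull the real credentials and warehouse schema:
```python
# Item 1 sketch: read-only SQLAlchemy engine + a source-table model.
# Connection string, table name, and columns are placeholders (assumptions).
from sqlalchemy import Column, Date, Integer, Numeric, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

# Driver/URL are assumptions; use whatever the BI stack already standardizes on.
engine = create_engine(
    "mssql+pyodbc://kpi_readonly:<password>@bi-warehouse/KPI"
    "?driver=ODBC+Driver+18+for+SQL+Server",
    pool_pre_ping=True,
)
Session = sessionmaker(bind=engine)
Base = declarative_base()

class RawKpi(Base):                      # hypothetical source table
    __tablename__ = "raw_kpi_values"
    kpi_id = Column(Integer, primary_key=True)
    kpi_name = Column(String(100))
    as_of_date = Column(Date, primary_key=True)
    value = Column(Numeric(18, 4))
```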
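For item 2, here’s roughly how I’d wrap the endpoints. This sketch validates the bearer token via OAuth2 token introspection (RFC 7662); the introspection URL and client credentials are placeholders for the real IdP config:
```python
# Item 2 sketch: bearer-token check via OAuth2 token introspection.
# INTROSPECTION_URL and the client credentials are placeholders (assumptions).
import functools

import requests
from flask import abort, g, request

INTROSPECTION_URL = "https://idp.example.com/oauth2/introspect"    # assumption
CLIENT_ID, CLIENT_SECRET = "kpi-service", "<from-secrets-store>"   # assumption

def require_token(view):
    @functools.wraps(view)
    def wrapper(*args, **kwargs):
        auth = request.headers.get("Authorization", "")
        if not auth.startswith("Bearer "):
            abort(401)
        resp = requests.post(
            INTROSPECTION_URL,
            data={"token": auth.removeprefix("Bearer ")},
            auth=(CLIENT_ID, CLIENT_SECRET),
            timeout=5,
        )
        claims = resp.json()
        if not claims.get("active"):
            abort(401)
        g.user = claims.get("sub")      # kept around for the audit log
        return view(*args, **kwargs)
    return wrapper
```
If the IdP hands out JWTs we could validate signatures locally instead of calling introspection on every request.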
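Item 3 could look like this; the query parameter names (start/end) and the calendar-quarter default are assumptions, and build_summary is a stand-in for the real query layer:
```python
# Item 3 sketch: GET /kpi-summary with optional start/end dates.
# Parameter names and the calendar-quarter default are assumptions.
from datetime import date, timedelta

from flask import Flask, jsonify, request

app = Flask(__name__)

def last_quarter_bounds(today=None):
    """Start/end of the previous calendar quarter (fiscal calendar TBD)."""
    today = today or date.today()
    cur_q_start = date(today.year, 3 * ((today.month - 1) // 3) + 1, 1)
    prev_q_end = cur_q_start - timedelta(days=1)
    prev_q_start = date(prev_q_end.year, 3 * ((prev_q_end.month - 1) // 3) + 1, 1)
    return prev_q_start, prev_q_end

def build_summary(start, end):
    # Stand-in for the real cache-table query (see schema discussion below).
    return []

@app.get("/kpi-summary")            # wrap with the item 2 auth decorator
def kpi_summary():
    default_start, default_end = last_quarter_bounds()
    start = date.fromisoformat(request.args.get("start", default_start.isoformat()))
    end = date.fromisoformat(request.args.get("end", default_end.isoformat()))
    return jsonify(build_summary(start, end))
```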
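Item 4 in Celery terms, with the Redis URL and the task body as placeholders:
```python
# Item 4 sketch: Celery app on a Redis broker with a 02:00 UTC beat schedule.
from celery import Celery
from celery.schedules import crontab

celery_app = Celery("kpi_reports", broker="redis://redis:6379/0")   # URL is a placeholder
celery_app.conf.timezone = "UTC"
celery_app.conf.beat_schedule = {
    "precompute-daily-kpi-summary": {
        "task": "tasks.precompute_kpi_summary",
        "schedule": crontab(hour=2, minute=0),    # daily at 02:00 UTC
    },
}

@celery_app.task(name="tasks.precompute_kpi_summary")
def precompute_kpi_summary():
    # Placeholder: query the previous day's raw KPI rows and write the
    # summary rows into the cache table in the BI warehouse.
    ...
```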
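For the API side of item 5, a sketch of per-request logging plus Prometheus metrics via prometheus_client; the Celery heartbeat export and the actual alert rules would live on the Prometheus/Alertmanager side, and the metric names here are only suggestions:
```python
# Item 5 sketch: request ID + latency logging and Prometheus metrics at /metrics.
import logging
import time
import uuid

from flask import Flask, g, request
from prometheus_client import Counter, Histogram, make_wsgi_app
from werkzeug.middleware.dispatcher import DispatcherMiddleware

app = Flask(__name__)
log = logging.getLogger("kpi-service")

REQUESTS = Counter("kpi_api_requests_total", "API requests", ["endpoint", "status"])
LATENCY = Histogram("kpi_api_latency_seconds", "API latency", ["endpoint"])

@app.before_request
def start_timer():
    g.request_id = uuid.uuid4().hex
    g.started = time.perf_counter()

@app.after_request
def log_request(response):
    elapsed = time.perf_counter() - g.started
    REQUESTS.labels(request.path, str(response.status_code)).inc()
    LATENCY.labels(request.path).observe(elapsed)
    log.info("request_id=%s user=%s path=%s status=%s elapsed_ms=%.1f",
             g.request_id, getattr(g, "user", "-"), request.path,
             response.status_code, elapsed * 1000)
    return response

# Serve the Prometheus scrape endpoint next to the app.
app.wsgi_app = DispatcherMiddleware(app.wsgi_app, {"/metrics": make_wsgi_app()})
```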
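And a first unit test for item 6, mocking out the query layer; it assumes the route module from the item 3 sketch is importable as kpi_service:
```python
# Item 6 sketch: route unit test with the DB/query layer mocked (pytest).
# Assumes the item 3 sketch lives in a module named kpi_service.
import kpi_service

def test_kpi_summary_defaults_to_last_quarter(monkeypatch):
    captured = {}

    def fake_build_summary(start, end):
        captured["start"], captured["end"] = start, end
        return [{"kpiId": 1, "value": 42.0}]

    monkeypatch.setattr(kpi_service, "build_summary", fake_build_summary)
    client = kpi_service.app.test_client()

    resp = client.get("/kpi-summary")

    assert resp.status_code == 200
    assert resp.get_json() == [{"kpiId": 1, "value": 42.0}]
    assert captured["start"] <= captured["end"]   # default quarter bounds are sane
```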
If that lines up, I can start the repo skeleton and write the first route. Just let me know the DB credentials and the IdP details.
Looks solid. I’ll pull the read‑only DB credentials and IdP config from the secrets store. Give me a brief on the expected schema for the cache table and the field mapping for the KPI summary, and I can set up the ORM models. Once the repo skeleton is up, I’ll start the Flask route and write the unit tests. Let me know if the task queue needs any special retry logic.
Cache table schema:
- id BIGINT PRIMARY KEY, auto-increment (IDENTITY on SQL Server)
- kpi_id INT NOT NULL (FK to master KPI list)
- period_start DATE NOT NULL
- period_end DATE NOT NULL
- metric_value DECIMAL(18,4) NOT NULL
- generated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
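In SQLAlchemy terms that would be something like the model below; the cache table name and the FK target (kpi_master.kpi_id) are assumptions until we confirm the real names:
```python
# Cache table sketch as a SQLAlchemy model; table and FK names are assumptions.
from sqlalchemy import (BigInteger, Column, Date, DateTime, ForeignKey,
                        Integer, Numeric, func)
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class KpiCache(Base):
    __tablename__ = "kpi_summary_cache"       # name TBD
    id = Column(BigInteger, primary_key=True, autoincrement=True)
    kpi_id = Column(Integer, ForeignKey("kpi_master.kpi_id"), nullable=False)
    period_start = Column(Date, nullable=False)
    period_end = Column(Date, nullable=False)
    metric_value = Column(Numeric(18, 4), nullable=False)
    generated_at = Column(DateTime, server_default=func.now())
```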
KPI summary fields (JSON returned by /kpi‑summary):
- kpiId (int) → kpi_id
- period (object) with startDate/endDate (dates) → period_start, period_end
- value (float) → metric_value
- source (“BI_warehouse”) and timestamp (generated_at)
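The mapping itself is a one-liner per field; a small serializer sketch, assuming the KpiCache model above:
```python
# Field mapping sketch: cache row -> /kpi-summary JSON entry.
def to_summary_json(row):
    return {
        "kpiId": row.kpi_id,
        "period": {
            "startDate": row.period_start.isoformat(),
            "endDate": row.period_end.isoformat(),
        },
        "value": float(row.metric_value),
        "source": "BI_warehouse",
        "timestamp": row.generated_at.isoformat(),
    }
```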
Retry logic: Celery should retry failed KPI jobs up to 5 times with exponential backoff (2, 4, 8, 16, 32 s). If still failing, flag in the cache table with a status column (0=ok, 1=error) and send an alert to Slack. That should keep the queue sane. Let me know if you need a more granular breakdown.
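In Celery terms, that retry policy is roughly the sketch below: autoretry with retry_backoff=2 gives the 2/4/8/16/32 s steps, and an on_failure hook fires only after retries are exhausted, flags the row, and posts to Slack. The webhook URL and the status-update helper are placeholders, and celery_app is the one from the earlier worker sketch:
```python
# Retry sketch: up to 5 retries with exponential backoff (2, 4, 8, 16, 32 s),
# then flag the cache row (status=1) and alert Slack.
import requests
from celery import Task

SLACK_WEBHOOK = "https://hooks.slack.com/services/<from-secrets-store>"   # placeholder

def mark_cache_row_failed(kpi_id, period_start, period_end):
    """Placeholder: UPDATE the cache row to status=1 for this KPI/period."""
    ...

class KpiTask(Task):
    def on_failure(self, exc, task_id, args, kwargs, einfo):
        # Called only once retries are exhausted.
        mark_cache_row_failed(*args)
        requests.post(SLACK_WEBHOOK,
                      json={"text": f"KPI task {task_id} failed after retries: {exc}"},
                      timeout=5)

@celery_app.task(                 # celery_app from the item 4 sketch
    base=KpiTask,
    autoretry_for=(Exception,),
    max_retries=5,
    retry_backoff=2,              # 2, 4, 8, 16, 32 s
    retry_backoff_max=32,
    retry_jitter=False,
)
def compute_kpi(kpi_id, period_start, period_end):
    ...  # compute and upsert the row (status=0 on success)
```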
Got it. I'll add the status column to the cache table, set up the retry policy in Celery, and hook a Slack webhook for alerts. Once the repo skeleton is ready, drop the credentials in the vault and we can spin up the container stack. Let me know if you want the route to support pagination or filtering beyond dates.