Parjanya’s Polyrepo Architecture: Building Scalable AI/LLM Products with Organizational Autonomy (Part 1)

Dec 23, 2025

After shipping Parjanya v1.0 through v1.3 (with 17+ patch releases), we’ve moved from a single monolithic repository to a deliberate five-repository polyrepo architecture aligned with team structure and technical boundaries. This post documents our research-driven thought process, the specific tech stack (Nx frontend, FastAPI microservices, PyTorch + MIT-licensed DepictQA + Smriti RAFT LLM), our 4-tier ML inference scaling strategy (Lambda → ECS Fargate → EC2 Spot GPU), and our storage architecture (S3 objects + DynamoDB metadata). Production learnings follow in Part 2.

From Monolith Pain to Polyrepo Clarity

Parjanya v1.0-1.3: The Breaking Point

Parjanya 1.0 shipped as a monorepo: frontend React app, FastAPI backend, PyTorch ML inference, Terraform infra, and database schemas all in one Git repository. This worked for our initial three-person team, enabling rapid iteration through v1.2. But v1.3’s patch releases exposed fatal flaws:

1. Release coordination hell: Frontend UI updates waited weeks for ML model retraining

2. Ownership diffusion: Database schema changes created “not my job” finger-pointing

3. Dependency bloat: PyTorch/CUDA dependencies slowed frontend developer onboarding by 40+ minutes

4. Deployment waste: Image quality assessment (IQA) hotfixes redeployed unchanged frontend assets

The final straw: a DepictQA IQA model update required coordinating frontend, backend, ML, and infra changes simultaneously. We needed a structure where teams could ship independently without chaos.

Research Findings: The Middle Path

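Industry analysis revealed that most discussions pit Google-style monorepos against fragmented polyrepos. But high-velocity teams (Spotify, Snyk, PayPal) use **intentional polyrepos**: bounded repositories aligned with Conway’s Law. For AI/LLM products with diverse cadences (frontend weekly → ML monthly → infra rarely), we needed **exactly five repositories**:

- parjanya-frontend: Nx workspace for the web app, dashboard, and SDK

- parjanya-backend: FastAPI microservices

- parjanya-ml: PyTorch, DepictQA, and Smriti RAFT

- parjanya-shared: schemas, models, and API clients

- parjanya-infra: Terraform, deployment workflows, and monitoring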

This isn’t “many repos.” It’s the **minimum viable separation** for team autonomy.

Tech Stack: Frontend → ML → Storage → Infra

1. Frontend: Nx Monorepo (Independent Apps, Shared Libs)

parjanya-frontend/
├── apps/
│   ├── web/          # React + TypeScript
│   ├── dashboard/    # Partner admin
│   └── sdk/          # Embedded JS SDK
├── libs/
│   ├── ui/           # Headless components
│   ├── api-clients/  # OpenAPI generated
│   └── features/     # Auth, upload flows
└── nx.json           # Affected builds

Nx advantages:

- `nx affected` rebuilds only changed apps (frontend CI: 2min → 30s)

- Module Federation for independent deployments

- Type-safe API clients auto-generated from FastAPI OpenAPI specs

2. Backend: FastAPI Microservices

parjanya-backend/
├── services/
│   ├── api-gateway/     # Auth, routing
│   ├── user-service/    # Profiles, billing
│   ├── upload-service/  # S3 presigned URLs
│   └── result-service/  # Query results
└── shared/              # Pydantic models

Each service deploys independently via ECS. Asynchronous SQS queues decouple the services: upload-service triggers ML inference without waiting for the result.
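The SQS hand-off can be sketched as follows. This is a minimal illustration rather than the production code: the message fields, the bucket name, the queue URL, and the helper names `build_inference_message` and `enqueue_inference` are all assumptions. The only real API involved is boto3's `send_message(QueueUrl=..., MessageBody=...)`, which is injected as a callable so the sketch runs without AWS credentials.

```python
import json
from typing import Callable


def build_inference_message(bucket: str, key: str, size_mb: float) -> str:
    """Serialize an inference request (field names are hypothetical)."""
    return json.dumps({"bucket": bucket, "key": key, "size_mb": size_mb})


def enqueue_inference(send_message: Callable[..., dict], queue_url: str,
                      bucket: str, key: str, size_mb: float) -> dict:
    """Publish the request and return at once; the ML tier consumes it later.

    In production, `send_message` would be boto3.client("sqs").send_message.
    """
    body = build_inference_message(bucket, key, size_mb)
    return send_message(QueueUrl=queue_url, MessageBody=body)


# Stand-in for the SQS client so the example runs locally.
sent = []


def fake_send_message(**kwargs) -> dict:
    sent.append(kwargs)
    return {"MessageId": "local-test"}


response = enqueue_inference(
    fake_send_message,
    "https://sqs.example.invalid/inference-queue",  # placeholder URL
    "parjanya-uploads",                              # hypothetical bucket
    "user-1/photo.jpg",
    12.5,
)
```

Because the caller only waits for the queue to accept the message, upload latency stays independent of how long the ML tier takes.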

3. ML Repository: Single Source of Truth for IQA + Smriti RAFT

Architectural Decision: All ML libraries (PyTorch, MIT-licensed DepictQA, Smriti RAFT LLM) live in one repository so that the ML team can manage development, training, and production deployment on specialized infrastructure.

Why single ML repo:

- Unified dependency management: PyTorch 2.1+, CUDA 12.1, DepictQA (MIT-licensed IQA), Smriti RAFT components

- Team ownership: ML engineers control complete stack without backend/frontend coordination

- Infra specialization: GPU Spot instances, ECS Fargate scaling separate from CPU workloads

- IQA pipeline consistency: DepictQA as **single source of truth** for image quality assessment

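4-Tier Inference Scaling (object size + compute needs):

- Tier 1: under 1 MB → Lambda, lightweight DepictQA

- Tier 2: 1 MB to under 50 MB → Lambda, full DepictQA

- Tier 3: 50 MB to under 500 MB → ECS Fargate, DepictQA + Smriti lite

- Tier 4: 500 MB and above → EC2 Spot GPU (g4dn.xlarge), full RAFT pipeline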

ML Repository Structure:

parjanya-ml/                  # Single source of truth
├── models/
│   ├── depictqa/             # MIT-licensed IQA (single source of truth)
│   │   ├── model.pt          # Pre-trained weights
│   │   ├── inference.py      # Core IQA pipeline
│   │   └── metrics/          # Quality scoring logic
│   └── smriti/               # Indian heritage RAFT LLM
│       ├── raft/             # Retrieval-Augmented Fine-Tuning
│       ├── prompts/          # Sanskrit/cultural templates
│       └── embeddings/       # Knowledge graph vectors
├── tiers/
│   ├── tier1_lambda.py       # Lightweight DepictQA
│   ├── tier2_lambda.py       # Full DepictQA
│   ├── tier3_ecs.py          # DepictQA + Smriti lite
│   └── tier4_gpu.py          # Full RAFT pipeline
├── requirements.txt          # PyTorch 2.1, DepictQA, transformers
└── Dockerfile.gpu            # NVIDIA CUDA 12.1 + all ML deps

Routing Logic (upload-service → ML):

python

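async def route_inference(image_size_mb: float, use_case: str):
    """Route an upload to an inference tier by image size (MB).

    `use_case` is accepted but not yet used by the size-based routing.
    The tier functions are assumed to be imported from the tiers/ modules.
    """
    if image_size_mb < 1:
        return await tier1_lambda_depictqa(image_size_mb)  # Quick check
    elif image_size_mb < 50:
        return await tier2_lambda_depictqa(image_size_mb)  # Full IQA
    elif image_size_mb < 500:
        return await tier3_ecs_depictqa_smriti(image_size_mb)  # IQA + Smriti lite
    else:
        return await tier4_gpu_raft(image_size_mb)  # Heritage-grade analysis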
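The thresholds in the routing logic can be separated from the async calls so that the boundaries are easy to test. The sketch below is one way to do that, not the production router: `select_tier` is a hypothetical helper, the tier names are assumed to match the files under tiers/, and a size that sits exactly on a boundary goes to the higher tier, as the `<` comparisons imply.

```python
from bisect import bisect_right

# Exclusive upper bounds in MB for tiers 1-3; anything larger goes to tier 4.
TIER_LIMITS_MB = [1, 50, 500]
TIER_NAMES = ["tier1_lambda", "tier2_lambda", "tier3_ecs", "tier4_gpu"]


def select_tier(image_size_mb: float) -> str:
    """Return the tier name for an image size, mirroring the if/elif chain."""
    if image_size_mb < 0:
        raise ValueError("image size cannot be negative")
    # bisect_right counts how many limits are <= the size,
    # which is exactly the index of the tier that should handle it.
    return TIER_NAMES[bisect_right(TIER_LIMITS_MB, image_size_mb)]


for size in (0.5, 1, 49.9, 50, 499, 500, 2048):
    print(f"{size:>7} MB -> {select_tier(size)}")
```

Keeping the limits in one list means a threshold change touches a single line instead of every branch.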

Cost breakdown (10K users/month):

Tier 1: 60% × $0.0000002  = $0.12   (DepictQA metadata)
Tier 2: 35% × $0.000002   = $1.40   (DepictQA IQA)
Tier 3:  4% × $0.10/task  = $16.00  (DepictQA + Smriti)
Tier 4:  1% × $0.50/job   = $50.00  (RAFT heritage analysis)
-------------------------------------------------------
Total ML inference: ~$67.52/month
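The total can be checked by summing the four line items. The sketch below only adds up the monthly figures as stated above. It does not re-derive them from per-request prices, because the number of requests per user is not given in this post. The dictionary keys are assumed to match the tier file names.

```python
# Monthly line items in USD, copied from the cost breakdown.
TIER_COSTS = {
    "tier1_lambda": 0.12,
    "tier2_lambda": 1.40,
    "tier3_ecs": 16.00,
    "tier4_gpu": 50.00,
}

total = round(sum(TIER_COSTS.values()), 2)
lambda_total = round(TIER_COSTS["tier1_lambda"] + TIER_COSTS["tier2_lambda"], 2)
gpu_share = TIER_COSTS["tier4_gpu"] / total

print(f"Total: ${total:.2f}")
print(f"Lambda tiers (1+2): ${lambda_total:.2f}")
print(f"Tier 4 share of spend: {gpu_share:.0%}")
```

The Lambda tiers handle 95% of traffic for $1.52, while the 1% of jobs that reach Tier 4 account for roughly three quarters of the bill.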

Future flexibility: DepictQA may later take on training, RAG, or RAFT roles, but it remains the single source of truth for IQA. Smriti RAFT components are modularised so they can scale independently.

4. Shared Libraries: The Contract Layer

parjanya-shared/
├── db/
│   ├── dynamodb/         # Metadata tables
│   │   ├── users.json
│   │   └── results.json  # DepictQA scores + Smriti outputs
│   └── rds/              # Relational schemas
├── models/
│   ├── ts/               # npm: @parjanya/models
│   └── python/           # pip: parjanya-models
└── sdk/
    ├── api-client-ts/
    └── api-client-py/

Version policy: `^1.2.0` allows patches/minors, blocks majors. Breaking changes → v2.0.0.
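The caret rule can be made concrete with a small checker. This is a sketch under stated assumptions: `satisfies_caret` is a hypothetical helper, it covers only plain MAJOR.MINOR.PATCH versions with a base major version of 1 or higher, and it ignores pre-release tags. Caret ranges are npm syntax; on the pip side the equivalent constraint would be written as `>=1.2.0,<2.0.0`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies_caret(version: str, base: str) -> bool:
    """True if `version` is inside the caret range ^base (base major >= 1)."""
    candidate, floor = parse_version(version), parse_version(base)
    if floor[0] < 1:
        raise ValueError("this sketch only covers base versions >= 1.0.0")
    # Same major version, and not older than the base.
    return candidate[0] == floor[0] and candidate >= floor


for candidate in ("1.1.9", "1.2.0", "1.2.7", "1.9.3", "2.0.0"):
    print(candidate, satisfies_caret(candidate, "1.2.0"))
```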

5. Infrastructure: ML-Optimized Deployments

parjanya-infra/
├── terraform/
│   ├── s3.tf                 # Object storage
│   ├── dynamodb.tf           # Metadata (GSI: user_id+quality_score)
│   ├── lambda.tf             # Tier 1+2 (DepictQA lightweight)
│   ├── ecs.tf                # Tier 3 Fargate (DepictQA + Smriti)
│   ├── ec2.tf                # Tier 4 Spot GPU (g4dn.xlarge RAFT)
│   └── iam.tf                # ML service roles
├── github-actions/
│   ├── deploy-ml-tier1.yml
│   ├── deploy-ml-tier4.yml   # GPU Spot orchestration
│   └── terraform-plan.yml
└── monitoring/
    ├── cloudwatch.tf         # Tier-specific DepictQA/Smriti alarms
    └── grafana.json          # IQA pipeline cost dashboard

Storage: S3 Objects + DynamoDB Metadata

Workflow:

1. Frontend → S3 presigned URL (upload-service)

2. S3 Event → Lambda → DynamoDB metadata + Tier routing

3. ML Tier → DepictQA IQA → Smriti RAFT (if needed) → S3 results

4. DynamoDB update: {quality_score, smriti_insights, tier_used}

5. Frontend queries DynamoDB → fetches S3 objects
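Step 2 of the workflow can be sketched as a Lambda handler that reads the S3 event and prepares a metadata record. The `Records[].s3.bucket.name` and `Records[].s3.object.key/size` fields follow the S3 event notification format. Everything else is an assumption for illustration: the output field names, the `PENDING` status, the bucket name, and treating 1 MB as 1024 × 1024 bytes. The sketch returns the records instead of writing them to DynamoDB so that it runs locally.

```python
from urllib.parse import unquote_plus


def handler(event: dict, context: object = None) -> list:
    """Turn S3 ObjectCreated records into metadata dicts (not yet persisted)."""
    items = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        size_mb = s3["object"]["size"] / (1024 * 1024)
        items.append({
            "bucket": s3["bucket"]["name"],
            # S3 URL-encodes object keys in event notifications.
            "key": unquote_plus(s3["object"]["key"]),
            "size_mb": round(size_mb, 3),
            "status": "PENDING",  # hypothetical status field
        })
    return items


# Trimmed-down sample event with a hypothetical bucket and key.
sample_event = {"Records": [{"s3": {
    "bucket": {"name": "parjanya-uploads"},
    "object": {"key": "user-1/temple+photo.jpg", "size": 3 * 1024 * 1024},
}}]}
items = handler(sample_event)
print(items)
```

The `size_mb` value computed here is what a size-based router would use to choose a tier.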

DynamoDB Schema:

results_table:

- PK: result_id

- SK: user_id#timestamp

- quality_score (DepictQA)

- smriti_raft_score

- GSI1: user_id + quality_score (top heritage images)
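The key layout and the GSI1 access pattern can be sketched as plain dictionaries. This is illustrative only: the sort key attribute name `sk`, the index name `GSI1`, the timestamp format, and both helper names are assumptions. The query dictionary uses the keyword arguments accepted by the boto3 DynamoDB client's `query` call but is not sent anywhere here.

```python
from datetime import datetime, timezone


def result_keys(result_id: str, user_id: str, when: datetime) -> dict:
    """Build the primary key attributes: PK=result_id, SK=user_id#timestamp."""
    timestamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {"result_id": result_id, "sk": f"{user_id}#{timestamp}"}


def top_images_query(user_id: str, limit: int = 10) -> dict:
    """Keyword arguments for a DynamoDB client `query` against GSI1."""
    return {
        "TableName": "results_table",
        "IndexName": "GSI1",
        "KeyConditionExpression": "user_id = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
        "ScanIndexForward": False,  # highest quality_score first
        "Limit": limit,
    }


keys = result_keys("r-1", "u-42", datetime(2025, 12, 23, 10, 0, tzinfo=timezone.utc))
query = top_images_query("u-42", limit=5)
print(keys)
print(query)
```

For GSI1 to work, each item also needs `user_id` and `quality_score` stored as top-level attributes, since an index can only key on attributes present on the item.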

Deployment: Independent Cadences

ML Tier 4 GPU Deployment:

yaml
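name: Deploy RAFT GPU Tier
on:
  push:
    paths: ['models/smriti/**', 'tiers/tier4_gpu.py']
jobs:
  deploy:
    # Standard GitHub-hosted ubuntu-latest runners have no GPU;
    # `--gpus all` needs a GPU-equipped (for example self-hosted) runner.
    runs-on: ubuntu-latest
    steps:
      - name: Test DepictQA + Smriti RAFT
        run: docker run --gpus all parjanya-ml pytest
      - name: Deploy Spot GPU cluster
        # gpu-ami is a placeholder for a real AMI ID
        run: |
          aws ec2 run-instances --image-id gpu-ami \
            --instance-type g4dn.xlarge --count 2 \
            --instance-market-options MarketType=spot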

Why This Architecture? The Principles

1. ML Team Autonomy: Single ML repo = PyTorch/DepictQA/Smriti ownership

2. Cost Intelligence: The 4-tier router sends 95% of requests to Lambda ($1.52/month) and only 1% to GPU ($50/month)

3. IQA Consistency: DepictQA single source of truth across all tiers

4. Indian Heritage Focus: Smriti RAFT purpose-built for cultural context

5. Infra Specialization: GPU Spot/ECS separate from CPU workloads

Risks We’re Watching

- ML dependency conflicts: PyTorch/DepictQA/Smriti version alignment

- Tier promotion logic: When Smriti RAFT needs Tier 3 vs Tier 4

- DepictQA evolution: Training/RAG/RAFT expansion while maintaining IQA truth

Part 2 will answer: Did the 4-tier design and single ML repo deliver? How did DepictQA and Smriti scale? What were the actual cost savings?

Sources

https://depictqa.github.io/

https://depictqa.github.io/depictqa-wild/

https://depictqa.github.io/deqa-score/

https://arxiv.org/abs/2403.10131

https://docs.opencv.org/4.x/d2/d96/tutorial_py_table_of_contents_imgproc.html

https://phagyul.ai/

Jagadeesh Rampam
