Backend Architecture Documentation

Last Updated: December 27, 2024
Purpose: Complete map of backend components, functions, data flows, and extension points for i18n and feature additions.


1. Directory Structure

apps/backend/
├── app/
│   ├── __init__.py                 # Version: "2.0.0"
│   ├── main.py                     # FastAPI app entry point
│   ├── config.py                   # Settings via pydantic-settings
│   ├── database.py                 # TinyDB wrapper class
│   ├── llm.py                      # LiteLLM multi-provider abstraction
│   ├── pdf.py                      # Playwright PDF rendering
│   ├── routers/
│   │   ├── __init__.py             # Router exports
│   │   ├── health.py               # GET /health, GET /status
│   │   ├── config.py               # GET/PUT /config/llm-api-key, POST /config/llm-test
│   │   ├── resumes.py              # Resume CRUD + improve + PDF
│   │   └── jobs.py                 # Job description upload
│   ├── schemas/
│   │   ├── __init__.py             # Schema exports
│   │   └── models.py               # All Pydantic request/response models
│   ├── services/
│   │   ├── __init__.py             # Service exports
│   │   ├── parser.py               # Document parsing (PDF/DOCX → Markdown → JSON)
│   │   ├── improver.py             # Resume tailoring via LLM
│   │   └── cover_letter.py         # Cover letter & outreach message generation
│   └── prompts/
│       ├── __init__.py             # Prompt exports
│       └── templates.py            # LLM prompt templates
├── data/
│   ├── database.json               # TinyDB storage file
│   └── config.json                 # Runtime LLM configuration
├── pyproject.toml                  # Dependencies & project config
└── .env                            # Environment variables

2. API Endpoints Reference

2.1 Health & Status (routers/health.py)

| Method | Endpoint | Handler | Description | LLM Call? | DB Call? |
|--------|----------|---------|-------------|-----------|----------|
| GET | /api/v1/health | health_check() | Basic health with LLM status | YES (check_llm_health()) | No |
| GET | /api/v1/status | get_status() | Full system status | YES (check_llm_health()) | YES (db.get_stats()) |

Function Flow:

health_check()
├── check_llm_health() → LLM test call (30s timeout)
└── Return HealthResponse

get_status()
├── get_llm_config() → Load from settings/config.json
├── check_llm_health(config) → LLM test call
├── db.get_stats() → Count resumes/jobs/improvements
└── Return StatusResponse
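For orientation, here is a minimal sketch of how these two handlers could be wired, following the flows above; module paths and response-model fields are assumptions rather than excerpts from the source.

```python
# Hypothetical wiring of routers/health.py based on the flows above;
# the real handlers may differ in field names and response construction.
from fastapi import APIRouter

from app.database import db                      # assumed module path
from app.llm import check_llm_health, get_llm_config
from app.schemas.models import HealthResponse, StatusResponse

router = APIRouter(tags=["health"])              # /api/v1 prefix may be applied in main.py


@router.get("/health", response_model=HealthResponse)
async def health_check():
    # One minimal LLM test call (30s timeout) reports provider health.
    llm = await check_llm_health()
    return HealthResponse(status="ok", llm=llm)


@router.get("/status", response_model=StatusResponse)
async def get_status():
    # Combine LLM health with TinyDB counters from db.get_stats().
    config = get_llm_config()
    llm = await check_llm_health(config)
    return StatusResponse(llm=llm, stats=db.get_stats())
```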

2.2 Configuration (routers/config.py)

| Method | Endpoint | Handler | Description | LLM Call? | DB Call? |
|--------|----------|---------|-------------|-----------|----------|
| GET | /api/v1/config/llm-api-key | get_llm_config_endpoint() | Get current config (key masked) | No | No |
| PUT | /api/v1/config/llm-api-key | update_llm_config() | Update LLM settings | YES (validation) | No |
| POST | /api/v1/config/llm-test | test_llm_connection() | Test LLM connectivity | YES | No |

Function Flow:

get_llm_config_endpoint()
├── _load_config() → Read data/config.json
├── _mask_api_key() → "sk-...xxxx" format
└── Return LLMConfigResponse

update_llm_config(request: LLMConfigRequest)
├── _load_config() → Read existing
├── Merge request fields
├── Build LLMConfig object
├── check_llm_health(config) → Validate via test call
├── _save_config() → Write data/config.json
└── Return LLMConfigResponse

test_llm_connection()
├── _load_config()
├── Build LLMConfig
├── check_llm_health(config)
└── Return health result dict
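The private helpers named in these flows are small; the sketch below illustrates what they plausibly do. Only the function names and the config.json location come from the flows above; the bodies are illustrative.

```python
# Illustrative bodies for the helpers referenced in routers/config.py;
# only the names are taken from the documentation.
import json

from app.config import settings


def _load_config() -> dict:
    # Read data/config.json if present; otherwise start from an empty dict.
    if settings.config_path.exists():
        return json.loads(settings.config_path.read_text())
    return {}


def _save_config(config: dict) -> None:
    # Persist runtime LLM settings next to the database file.
    settings.config_path.parent.mkdir(parents=True, exist_ok=True)
    settings.config_path.write_text(json.dumps(config, indent=2))


def _mask_api_key(key: str) -> str:
    # Never echo the full key back to the client: "sk-...xxxx" style masking.
    if not key:
        return ""
    return f"{key[:3]}...{key[-4:]}" if len(key) > 7 else "***"
```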

2.3 Resumes (routers/resumes.py)

| Method | Endpoint | Handler | Description | LLM Call? | DB Call? |
|--------|----------|---------|-------------|-----------|----------|
| POST | /api/v1/resumes/upload | upload_resume() | Upload PDF/DOCX | YES (parsing) | YES (create) |
| GET | /api/v1/resumes | get_resume() | Fetch resume by ID | No | YES (read) |
| GET | /api/v1/resumes/list | list_resumes() | List all resumes | No | YES (list) |
| POST | /api/v1/resumes/improve | improve_resume_endpoint() | Tailor resume for job | YES (2-4 calls) | YES (create) |
| PATCH | /api/v1/resumes/{id} | update_resume_endpoint() | Update resume data | No | YES (update) |
| GET | /api/v1/resumes/{id}/pdf | download_resume_pdf() | Generate PDF | No (uses frontend) | YES (read) |
| DELETE | /api/v1/resumes/{id} | delete_resume() | Delete resume | No | YES (delete) |
| PATCH | /api/v1/resumes/{id}/cover-letter | update_cover_letter() | Update cover letter | No | YES (update) |
| PATCH | /api/v1/resumes/{id}/outreach-message | update_outreach_message() | Update outreach message | No | YES (update) |
| GET | /api/v1/resumes/{id}/cover-letter/pdf | download_cover_letter_pdf() | Generate cover letter PDF | No (uses frontend) | YES (read) |

Function Flow - Upload:

upload_resume(file: UploadFile)
├── Validate file type (PDF/DOCX/DOC) and size (≤4MB)
├── parse_document(content, filename) → Markdown string
├── Check if first resume → is_master=True
├── db.create_resume(..., processing_status="processing")
├── parse_resume_to_json(markdown) → LLM call for structured JSON
│   ├── Success: db.update_resume(processed_data, status="ready")
│   └── Failure: db.update_resume(status="failed")
└── Return ResumeUploadResponse{message, request_id, resume_id}

Function Flow - Improve:

improve_resume_endpoint(request: ImproveResumeRequest)
├── db.get_resume(resume_id) → Fetch original
├── db.get_job(job_id) → Fetch job description
├── _load_feature_config() → Load cover letter/outreach toggles
├── extract_job_keywords(job_content) → LLM call #1
│   └── Returns {required_skills, preferred_skills, keywords, ...}
├── improve_resume(original, job_desc, keywords) → LLM call #2
│   └── Returns improved ResumeData JSON
├── generate_improvements(keywords) → Build suggestions (no LLM)
├── [If enable_cover_letter] generate_cover_letter() → LLM call #3
├── [If enable_outreach_message] generate_outreach_message() → LLM call #4
│   └── Uses asyncio.gather() for parallel generation
├── db.create_resume(improved, cover_letter, outreach_message, ...)
├── db.create_improvement(original_id, tailored_id, job_id, improvements)
└── Return ImproveResumeResponse{request_id, data, cover_letter, outreach_message}
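The optional cover letter and outreach steps (LLM calls #3 and #4) run concurrently. Below is a condensed sketch of that step, assuming the service functions from services/cover_letter.py; the helper name and exact call sites are illustrative.

```python
# Condensed sketch of the optional, parallel generation step inside
# improve_resume_endpoint(); _generate_extras is an illustrative helper name.
import asyncio

from app.services.cover_letter import generate_cover_letter, generate_outreach_message


async def _generate_extras(resume_data: dict, job_content: str,
                           enable_cover_letter: bool, enable_outreach_message: bool):
    # Disabled features resolve immediately to None; enabled ones are LLM calls
    # that run concurrently via asyncio.gather().
    cover_task = (generate_cover_letter(resume_data, job_content)
                  if enable_cover_letter else asyncio.sleep(0, result=None))
    outreach_task = (generate_outreach_message(resume_data, job_content)
                     if enable_outreach_message else asyncio.sleep(0, result=None))
    return await asyncio.gather(cover_task, outreach_task)
```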

Function Flow - Cover Letter PDF:

download_cover_letter_pdf(resume_id, pageSize)
├── db.get_resume(resume_id)
├── Check cover_letter field exists (404 if not)
├── Build URL: {frontend_base_url}/print/cover-letter/{id}?pageSize={pageSize}
├── render_resume_pdf(url, pageSize, selector=".cover-letter-print")
│   └── Playwright renders HTML to PDF
└── Return Response(content=pdf_bytes, media_type="application/pdf")

Note: The selector parameter in render_resume_pdf() is critical. For cover letters, it must be .cover-letter-print to match the CSS class in the print route.

Function Flow - PDF:

download_resume_pdf(resume_id, template, pageSize, margins, spacing, fontSize, ...)
├── db.get_resume(resume_id)
├── Build URL: {frontend_base_url}/print/resumes/{id}?{params}
├── render_resume_pdf(url, pageSize) → Playwright renders HTML to PDF
└── Return Response(content=pdf_bytes, media_type="application/pdf")

2.4 Jobs (routers/jobs.py)

| Method | Endpoint | Handler | Description | LLM Call? | DB Call? |
|--------|----------|---------|-------------|-----------|----------|
| POST | /api/v1/jobs/upload | upload_job_descriptions() | Store job descriptions | No | YES (create) |
| GET | /api/v1/jobs/{id} | get_job() | Fetch job by ID | No | YES (read) |

Function Flow:

upload_job_descriptions(request: JobUploadRequest)
├── For each description in request.job_descriptions:
│   ├── Validate non-empty
│   └── db.create_job(content, resume_id) → Returns {job_id}
└── Return JobUploadResponse{message, job_id[], request}

3. Database Operations (database.py)

Storage: TinyDB (JSON file at data/database.json)

3.1 Tables

| Table | Purpose | Key Fields |
|-------|---------|------------|
| resumes | All resumes (master & tailored) | resume_id, content, processed_data, is_master, parent_id |
| jobs | Job descriptions | job_id, content, resume_id |
| improvements | Tailoring records | request_id, original_resume_id, tailored_resume_id, improvements[] |

3.2 Resume Operations

# CREATE - Returns full document with generated resume_id
db.create_resume(
    content: str,                    # Markdown or JSON string
    content_type: str = "md",        # "md" | "json"
    filename: str | None = None,     # Original filename
    is_master: bool = False,         # Only one master allowed
    parent_id: str | None = None,    # For tailored resumes
    processed_data: dict | None,     # Structured ResumeData
    processing_status: str = "pending"  # pending→processing→ready|failed
) → dict

# READ
db.get_resume(resume_id: str) → dict | None
db.get_master_resume() → dict | None
db.list_resumes() → list[dict]

# UPDATE
db.update_resume(resume_id: str, updates: dict) → dict | None
db.set_master_resume(resume_id: str) → bool  # Unsets previous master

# DELETE
db.delete_resume(resume_id: str) → bool
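For orientation, the wrapper pattern behind these operations looks roughly like the sketch below; the real Database class in database.py has more methods and fields than shown.

```python
# Minimal sketch of the TinyDB wrapper pattern used by database.py
# (field names follow section 3.6; this is not the full class).
import uuid
from datetime import datetime, timezone

from tinydb import TinyDB, Query


class Database:
    def __init__(self, path: str) -> None:
        self._db = TinyDB(path)
        self._resumes = self._db.table("resumes")

    def create_resume(self, content: str, content_type: str = "md", **fields) -> dict:
        now = datetime.now(timezone.utc).isoformat()
        doc = {
            "resume_id": str(uuid.uuid4()),
            "content": content,
            "content_type": content_type,
            "created_at": now,
            "updated_at": now,
            **fields,
        }
        self._resumes.insert(doc)
        return doc

    def get_resume(self, resume_id: str) -> dict | None:
        Resume = Query()
        return self._resumes.get(Resume.resume_id == resume_id)
```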

3.3 Job Operations

db.create_job(
    content: str,                    # Raw job description text
    resume_id: str | None = None     # Optional link to resume
) → dict

db.get_job(job_id: str) → dict | None

3.4 Improvement Operations

db.create_improvement(
    original_resume_id: str,
    tailored_resume_id: str,
    job_id: str,
    improvements: list[dict]         # [{suggestion, lineNumber}, ...]
) → dict

3.5 Statistics

db.get_stats() → {
    "total_resumes": int,
    "total_jobs": int,
    "total_improvements": int,
    "has_master_resume": bool
}

3.6 Document Schemas

Resume Document:

{
  "resume_id": "uuid-string",
  "content": "markdown or json string",
  "content_type": "md | json",
  "filename": "original_file.pdf",
  "is_master": true | false,
  "parent_id": "uuid or null",
  "processed_data": {
    "personalInfo": {"name", "title", "email", "phone", "location", "website", "linkedin", "github"},
    "summary": "string",
    "workExperience": [{"id", "title", "company", "location", "years", "description": []}],
    "education": [{"id", "institution", "degree", "years", "description"}],
    "personalProjects": [{"id", "name", "role", "years", "description": []}],
    "additional": {"technicalSkills": [], "languages": [], "certificationsTraining": [], "awards": []}
  },
  "processing_status": "pending | processing | ready | failed",
  "cover_letter": "string or null",          // Generated cover letter text (tailored resumes only)
  "outreach_message": "string or null",      // Generated outreach message (tailored resumes only)
  "created_at": "ISO-8601",
  "updated_at": "ISO-8601"
}

4. LLM Operations (llm.py)

4.1 Supported Providers

| Provider | Prefix | API Key Env Var | Notes |
|----------|--------|-----------------|-------|
| openai | (none) | OPENAI_API_KEY | GPT models, supports JSON mode |
| anthropic | anthropic/ | ANTHROPIC_API_KEY | Claude models, supports JSON mode |
| openrouter | openrouter/ | OPENROUTER_API_KEY | Multi-model gateway |
| gemini | gemini/ | GEMINI_API_KEY | Google models, supports JSON mode |
| deepseek | deepseek/ | DEEPSEEK_API_KEY | DeepSeek models, supports JSON mode |
| ollama | ollama/ | (none, uses api_base) | Local models via OLLAMA_API_BASE |

4.2 Core Functions

# Configuration
get_llm_config() → LLMConfig  # Load from settings

# Health Check
check_llm_health(config: LLMConfig | None) → dict
# Makes minimal test call: "Hi" → 5 tokens max
# Passes api_key directly to litellm.acompletion() (avoids os.environ race conditions)
# Timeout: 30 seconds
# Returns: {healthy: bool, provider, model, error?}

# Text Completion
complete(
    prompt: str,
    system_prompt: str | None = None,
    config: LLMConfig | None = None,
    max_tokens: int = 4096,
    temperature: float = 0.7
) → str
# Passes api_key directly to litellm.acompletion()
# Includes try-except with server-side logging
# Returns generic error messages to callers
# Timeout: 120 seconds

# JSON Completion (with retry logic)
complete_json(
    prompt: str,
    system_prompt: str | None = None,
    config: LLMConfig | None = None,
    max_tokens: int = 4096,
    retries: int = 2
) → dict
# Passes api_key directly to litellm.acompletion()
# Uses JSON mode when supported (openai, anthropic, gemini, deepseek)
# Falls back to text extraction via _extract_json()
# Timeout: 180 seconds
# Retry: Lowers temperature (0.1 → 0.0), adds JSON instruction

Important: API keys are passed directly via the api_key parameter instead of setting os.environ. This prevents race conditions when multiple async requests use different API configurations.
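A sketch of that call pattern is shown below. The provider_prefix attribute and the _call_llm name are illustrative; model, messages, api_key, api_base, max_tokens, temperature, and timeout are standard litellm.acompletion() arguments.

```python
# Sketch of the direct-api_key call pattern described above.
import litellm


async def _call_llm(config, prompt: str, system_prompt: str | None = None) -> str:
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})

    response = await litellm.acompletion(
        model=f"{config.provider_prefix}{config.model}",  # provider prefix + model name
        messages=messages,
        api_key=config.api_key or None,    # passed per request, never via os.environ
        api_base=config.api_base,          # only set for Ollama
        max_tokens=4096,
        temperature=0.7,
        timeout=120,
    )
    return response.choices[0].message.content
```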

4.3 JSON Extraction Helper

_extract_json(content: str) → str
# Handles various LLM output formats:
# 1. ```json ... ``` code blocks
# 2. ``` ... ``` generic code blocks
# 3. Raw JSON starting with {
# 4. JSON properties without braces (wraps them)
# Raises ValueError if no JSON found
#
# Includes recursion guard: Only recurses if { is found after position 0
# This prevents infinite recursion when content starts with { but bracket matching fails
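A simplified version of this extraction strategy is shown below for illustration only; the real _extract_json also performs bracket matching and the recursion guard described above.

```python
# Simplified sketch of the extraction strategy; covers fenced blocks,
# raw JSON objects, and brace-less property lists.
import json
import re


def extract_json(content: str) -> str:
    content = content.strip()

    # 1./2. Fenced code blocks: ```json ... ``` or ``` ... ```
    fence = re.search(r"`{3}(?:json)?\s*(.*?)`{3}", content, re.DOTALL)
    if fence:
        content = fence.group(1).strip()

    # 3. Raw JSON object somewhere in the text.
    start = content.find("{")
    if start != -1:
        candidate = content[start:]
        try:
            json.loads(candidate)
            return candidate
        except json.JSONDecodeError:
            pass

    # 4. Bare properties without surrounding braces: wrap and re-check.
    wrapped = "{" + content + "}"
    try:
        json.loads(wrapped)
        return wrapped
    except json.JSONDecodeError as exc:
        raise ValueError("No JSON object found in LLM output") from exc
```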

4.4 LLM Timeout Constants

LLM_TIMEOUT_HEALTH_CHECK = 30   # seconds
LLM_TIMEOUT_COMPLETION = 120    # seconds
LLM_TIMEOUT_JSON = 180          # seconds (JSON may take longer)

5. Services Layer

5.1 Parser Service (services/parser.py)

# Document Conversion
async def parse_document(content: bytes, filename: str) -> str
# Converts PDF/DOCX to Markdown using markitdown library
# Flow:
#   1. Extract file extension from filename
#   2. Write bytes to temp file
#   3. Use MarkItDown().convert() to extract text
#   4. Return markdown string
# Libraries: markitdown, pdfminer.six, python-docx
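A sketch of that flow with markitdown is shown below; the temp-file handling is illustrative and may differ from the actual implementation.

```python
# Sketch of the markitdown-based conversion flow (step numbers match the list above).
import tempfile
from pathlib import Path

from markitdown import MarkItDown


async def parse_document(content: bytes, filename: str) -> str:
    suffix = Path(filename).suffix.lower()          # 1. keep the original extension
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(content)                          # 2. write upload to a temp file
        tmp_path = tmp.name
    try:
        result = MarkItDown().convert(tmp_path)     # 3. extract text as Markdown
        return result.text_content                  # 4. return markdown string
    finally:
        Path(tmp_path).unlink(missing_ok=True)
```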

# Resume JSON Parsing (LLM)
async def parse_resume_to_json(markdown_text: str) -> dict
# Converts markdown resume to structured JSON
# Flow:
#   1. Format PARSE_RESUME_PROMPT with schema example
#   2. Call complete_json() with system_prompt="You are a JSON extraction engine..."
#   3. Validate result against ResumeData schema
#   4. Return as dict
# LLM CALL: YES

5.2 Cover Letter Service (services/cover_letter.py)

# Cover Letter Generation (LLM)
async def generate_cover_letter(
    resume_data: dict,
    job_description: str,
) -> str
# Generates tailored cover letter based on resume and job
# Flow:
#   1. Format COVER_LETTER_PROMPT with job description and resume JSON
#   2. Call complete() with system_prompt="You are a professional career coach..."
#   3. Return plain text cover letter
# LLM CALL: YES
# Returns: Plain text cover letter (300-400 words)

# Outreach Message Generation (LLM)
async def generate_outreach_message(
    resume_data: dict,
    job_description: str,
) -> str
# Generates cold outreach message for LinkedIn/email
# Flow:
#   1. Format OUTREACH_MESSAGE_PROMPT with job description and resume JSON
#   2. Call complete() with system_prompt="You are a professional networking coach."
#   3. Return plain text message
# LLM CALL: YES
# Returns: Plain text message (100-150 words)

5.3 Improver Service (services/improver.py)

# Keyword Extraction (LLM)
async def extract_job_keywords(job_description: str) -> dict
# Extracts requirements from job description
# Flow:
#   1. Format EXTRACT_KEYWORDS_PROMPT with job text
#   2. Call complete_json() with system_prompt="You are an expert job description analyzer."
#   3. Return structured keywords
# LLM CALL: YES
# Returns: {required_skills[], preferred_skills[], experience_requirements[],
#           education_requirements[], key_responsibilities[], keywords[]}

# Resume Tailoring (LLM)
async def improve_resume(
    original_resume: str,      # Markdown
    job_description: str,
    job_keywords: dict
) -> dict
# Tailors resume content to match job
# Flow:
#   1. Format IMPROVE_RESUME_PROMPT with job, keywords, resume, schema
#   2. Call complete_json() with system_prompt="You are an expert resume editor..."
#   3. Validate against ResumeData schema
#   4. Return as dict
# LLM CALL: YES

# Improvement Suggestions (No LLM)
def generate_improvements(job_keywords: dict) -> list[dict]
# Generates human-readable suggestions based on keywords
# Flow:
#   1. Take top 3 required_skills → "Emphasized '{skill}' to match job requirements"
#   2. Take top 2 responsibilities → "Aligned experience with: {responsibility}"
#   3. If empty, add default suggestion
# LLM CALL: NO
# Returns: [{suggestion: str, lineNumber: int | None}, ...]
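Because this step is deterministic, it can be reconstructed almost directly from the description above; only the wording of the default suggestion is an assumption.

```python
# Sketch of the keyword-driven suggestion builder (no LLM involved).
def generate_improvements(job_keywords: dict) -> list[dict]:
    suggestions: list[dict] = []

    for skill in job_keywords.get("required_skills", [])[:3]:
        suggestions.append({
            "suggestion": f"Emphasized '{skill}' to match job requirements",
            "lineNumber": None,
        })

    for responsibility in job_keywords.get("key_responsibilities", [])[:2]:
        suggestions.append({
            "suggestion": f"Aligned experience with: {responsibility}",
            "lineNumber": None,
        })

    if not suggestions:
        # Default suggestion when no keywords were extracted (wording is illustrative).
        suggestions.append({
            "suggestion": "Tailored resume wording to the job description",
            "lineNumber": None,
        })
    return suggestions
```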

6. LLM Prompts (prompts/templates.py)

6.1 PARSE_RESUME_PROMPT

Purpose: Convert resume markdown to structured JSON
Variables:
  - {schema}: RESUME_SCHEMA_EXAMPLE (full JSON with example values)
  - {resume_text}: Raw resume in markdown format

Prompt Text:
"Parse this resume into JSON. Output ONLY the JSON object, no other text.

Example output format:
{schema}

Rules:
- Use "" for missing text fields, [] for missing arrays, null for optional fields
- Number IDs starting from 1
- Format years as "YYYY - YYYY" or "YYYY - Present"

Resume to parse:
{resume_text}"

6.2 EXTRACT_KEYWORDS_PROMPT

Purpose: Extract job requirements for resume tailoring
Variables:
  - {job_description}: Raw job posting text

Prompt Text:
"Extract job requirements as JSON. Output ONLY the JSON object, no other text.

Example format:
{{
  "required_skills": ["Python", "AWS"],
  "preferred_skills": ["Kubernetes"],
  "experience_requirements": ["5+ years"],
  "education_requirements": ["Bachelor's in CS"],
  "key_responsibilities": ["Lead team"],
  "keywords": ["microservices", "agile"]
}}

Job description:
{job_description}"

6.3 IMPROVE_RESUME_PROMPT

Purpose: Tailor resume content to match job requirements
Variables:
  - {job_description}: Target job posting
  - {job_keywords}: Extracted keywords (JSON string)
  - {original_resume}: Original resume (markdown)
  - {schema}: RESUME_SCHEMA_EXAMPLE

Prompt Text:
"Tailor this resume for the job. Output ONLY the JSON object, no other text.

Rules:
- Rephrase content to highlight relevant experience
- DO NOT invent new information
- Use action verbs and quantifiable achievements

Job Description:
{job_description}

Keywords to emphasize:
{job_keywords}

Original Resume:
{original_resume}

Output in this JSON format:
{schema}"

6.4 RESUME_SCHEMA_EXAMPLE

{
  "personalInfo": {
    "name": "John Doe",
    "title": "Software Engineer",
    "email": "john@example.com",
    "phone": "+1-555-0100",
    "location": "San Francisco, CA",
    "website": "https://johndoe.dev",
    "linkedin": "linkedin.com/in/johndoe",
    "github": "github.com/johndoe"
  },
  "summary": "Experienced software engineer with 5+ years...",
  "workExperience": [
    {
      "id": 1,
      "title": "Senior Software Engineer",
      "company": "Tech Corp",
      "location": "San Francisco, CA",
      "years": "2020 - Present",
      "description": [
        "Led development of microservices architecture",
        "Improved system performance by 40%"
      ]
    }
  ],
  "education": [
    {
      "id": 1,
      "institution": "University of California",
      "degree": "B.S. Computer Science",
      "years": "2014 - 2018",
      "description": "Graduated with honors"
    }
  ],
  "personalProjects": [
    {
      "id": 1,
      "name": "Open Source Tool",
      "role": "Creator & Maintainer",
      "years": "2021 - Present",
      "description": [
        "Built CLI tool with 1000+ GitHub stars",
        "Used by 50+ companies worldwide"
      ]
    }
  ],
  "additional": {
    "technicalSkills": ["Python", "JavaScript", "AWS", "Docker"],
    "languages": ["English (Native)", "Spanish (Conversational)"],
    "certificationsTraining": ["AWS Solutions Architect"],
    "awards": ["Employee of the Year 2022"]
  }
}

7. PDF Rendering (pdf.py)

7.1 Playwright Integration

# Global browser instance (initialized at startup)
_playwright = None
_browser: Browser | None = None
_init_lock = asyncio.Lock()  # Prevents race conditions during init

async def init_pdf_renderer() -> None
# Called at app startup via lifespan
# Uses asyncio.Lock to prevent race conditions when multiple
# concurrent requests try to initialize the browser simultaneously
# Pattern: Fast path check → Lock → Double-check → Initialize

async def close_pdf_renderer() -> None
# Called at app shutdown
# Closes browser and playwright instance

async def render_resume_pdf(
    url: str,
    page_size: str = "A4",
    selector: str = ".resume-print"  # CSS selector to wait for
) -> bytes
# Flow:
#   1. Create new browser page
#   2. Navigate to URL (wait_until="networkidle")
#   3. Wait for selector (default: ".resume-print", or ".cover-letter-print" for cover letters)
#   4. Wait for document.fonts.ready
#   5. Generate PDF with:
#      - format: "A4" or "Letter"
#      - print_background: True
#      - margin: {top: 0, right: 0, bottom: 0, left: 0}  # Margins in HTML
#   6. Close page
#   7. Return PDF bytes
# Raises:
#   PDFRenderError: If connection to frontend fails or rendering errors occur

# IMPORTANT: The selector parameter must match the CSS class used in the print page.
# For resumes: selector=".resume-print"
# For cover letters: selector=".cover-letter-print"
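A condensed sketch of this flow using Playwright's async API is shown below; error handling and the module-level browser lifecycle are omitted, and the function takes the already-initialized browser explicitly.

```python
# Condensed sketch of the rendering flow (step numbers match the list above).
from playwright.async_api import Browser


async def render_pdf(browser: Browser, url: str,
                     page_size: str = "A4",
                     selector: str = ".resume-print") -> bytes:
    page = await browser.new_page()                       # 1. new page per request
    try:
        await page.goto(url, wait_until="networkidle")    # 2. load the print route
        await page.wait_for_selector(selector)            # 3. wait for the print root
        await page.evaluate("() => document.fonts.ready") # 4. wait for webfonts
        return await page.pdf(                            # 5. render to bytes
            format=page_size,
            print_background=True,
            margin={"top": "0", "right": "0", "bottom": "0", "left": "0"},
        )
    finally:
        await page.close()                                # 6. always release the page
```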

7.2 Error Handling

The PDF renderer includes a custom PDFRenderError exception that provides helpful messages:

class PDFRenderError(Exception):
    """Custom exception for PDF rendering errors with helpful messages."""
    pass

Connection Errors: When Playwright cannot connect to the frontend (e.g., net::ERR_CONNECTION_REFUSED), the error message includes:

  • The attempted URL
  • Instructions to check that the frontend is running
  • Instructions to verify FRONTEND_BASE_URL in the backend .env file

Example error message:

Cannot connect to frontend for PDF generation. Attempted URL: http://localhost:3000/print/resumes/abc123?...
Please ensure: 1) The frontend is running, 2) The FRONTEND_BASE_URL environment variable
in the backend .env file matches the URL where your frontend is accessible.

The PDF endpoints (/resumes/{id}/pdf and /resumes/{id}/cover-letter/pdf) catch PDFRenderError and return HTTP 503 with the helpful error message.
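Inside the routers that translation is a thin wrapper, sketched below with an illustrative helper name.

```python
# Sketch of how the PDF endpoints map PDFRenderError to HTTP 503;
# _render_or_503 is an illustrative helper, not a function in the source.
from fastapi import HTTPException

from app.pdf import PDFRenderError, render_resume_pdf


async def _render_or_503(url: str, page_size: str, selector: str) -> bytes:
    try:
        return await render_resume_pdf(url, page_size, selector=selector)
    except PDFRenderError as exc:
        # Surface the helpful message (attempted URL, FRONTEND_BASE_URL hint) to the client.
        raise HTTPException(status_code=503, detail=str(exc)) from exc
```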

7.3 Critical: CSS Visibility Rules

WARNING: The frontend's globals.css hides all content in print mode by default:

@media print {
  body * { visibility: hidden !important; }

  .resume-print,
  .resume-print *,
  .cover-letter-print,
  .cover-letter-print * {
    visibility: visible !important;
  }
}

If a new print class is not whitelisted in CSS, Playwright will generate blank PDFs.

When adding new printable document types:

  1. Create print route with unique class (e.g., .report-print)
  2. Add class to globals.css visibility whitelist
  3. Pass correct selector to render_resume_pdf()

7.4 Print URL Format

Resume Print URL:

{frontend_base_url}/print/resumes/{resume_id}?
  template=swiss-single|swiss-two-column
  &pageSize=A4|LETTER
  &marginTop=10 (5-25mm)
  &marginBottom=10
  &marginLeft=10
  &marginRight=10
  &sectionSpacing=3 (1-5)
  &itemSpacing=2 (1-5)
  &lineHeight=3 (1-5)
  &fontSize=3 (1-5)
  &headerScale=3 (1-5)

Cover Letter Print URL:

{frontend_base_url}/print/cover-letter/{resume_id}?
  pageSize=A4|LETTER

8. Configuration (config.py)

8.1 Settings Class

class Settings(BaseSettings):
    # LLM Configuration
    llm_provider: Literal["openai", "anthropic", "openrouter", "gemini", "deepseek", "ollama"] = "openai"
    llm_model: str = "gpt-4o-mini"
    llm_api_key: str = ""
    llm_api_base: str | None = None  # For Ollama

    # Server Configuration
    host: str = "0.0.0.0"
    port: int = 8000
    frontend_base_url: str = "http://localhost:3000"

    # CORS Configuration
    cors_origins: list[str] = ["http://localhost:3000", "http://127.0.0.1:3000"]

    # Paths
    data_dir: Path = Path(__file__).parent.parent / "data"

    @property
    def db_path(self) -> Path:
        return self.data_dir / "database.json"

    @property
    def config_path(self) -> Path:
        return self.data_dir / "config.json"

8.2 Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| LLM_PROVIDER | openai | LLM provider (openai, anthropic, etc.) |
| LLM_MODEL | gpt-4o-mini | Model name |
| LLM_API_KEY | (empty) | API key for provider |
| LLM_API_BASE | (empty) | Custom API base URL (Ollama) |
| HOST | 0.0.0.0 | Server bind address |
| PORT | 8000 | Server port |
| FRONTEND_BASE_URL | http://localhost:3000 | Frontend URL for PDF rendering |
| CORS_ORIGINS | http://localhost:3000,... | Comma-separated CORS origins |

9. Extension Points for i18n

9.1 Text Locations That Need Translation

| Location | File | Text Type |
|----------|------|-----------|
| LLM Prompts | prompts/templates.py | System prompts, instructions |
| Error Messages | routers/*.py | HTTPException details |
| Response Messages | routers/resumes.py | Success messages |
| Section Labels | schemas/models.py | Field names in schema |

9.2 Implementation Steps

  1. Create i18n/ directory:

    app/i18n/
    ├── __init__.py
    ├── locales/
    │   ├── en.json
    │   ├── es.json
    │   └── ...
    └── loader.py
    
  2. Modify prompts to accept language parameter:

    # prompts/templates.py
    def get_parse_resume_prompt(lang: str = "en") -> str:
        # Load localized version
    
  3. Add language header handling:

    # routers/resumes.py
    @router.post("/upload")
    async def upload_resume(
        file: UploadFile,
        accept_language: str = Header("en")
    ):
        lang = parse_language(accept_language)
        ...
    

9.3 LLM Prompt Localization

For multi-language resume generation:

# Add language instruction to IMPROVE_RESUME_PROMPT
IMPROVE_RESUME_PROMPT_I18N = """
Tailor this resume for the job.
OUTPUT LANGUAGE: {target_language}
...
"""

10. Extension Points for New Features

10.1 Adding New LLM Providers

  1. Update llm.py:

    • Add to provider_prefixes dict
    • Add env var mapping in setup_llm_environment()
    • Add to _supports_json_mode() if applicable
  2. Update config.py:

    • Add to llm_provider Literal type
  3. Update frontend:

    • Add to PROVIDER_INFO in lib/api/config.ts
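Step 1 above might look roughly like the sketch below; "mistral" is only an example provider, and the exact internal structures in llm.py may differ.

```python
# Illustrative sketch of step 1; the dict layout and helper shown here are
# assumptions about llm.py internals, not excerpts from it.
provider_prefixes = {
    "openai": "",
    "anthropic": "anthropic/",
    "openrouter": "openrouter/",
    "gemini": "gemini/",
    "deepseek": "deepseek/",
    "ollama": "ollama/",
    "mistral": "mistral/",        # new provider
}

JSON_MODE_PROVIDERS = {"openai", "anthropic", "gemini", "deepseek", "mistral"}


def _supports_json_mode(provider: str) -> bool:
    return provider in JSON_MODE_PROVIDERS
```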

10.2 Adding New Resume Sections

  1. Update schemas/models.py:

    • Add new section model (e.g., Volunteer)
    • Add field to ResumeData
  2. Update prompts/templates.py:

    • Update RESUME_SCHEMA_EXAMPLE with new section
  3. Update frontend:

    • Add form component
    • Update resume templates
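Step 1 might look like the following sketch for a hypothetical Volunteer section; the field names mirror the existing workExperience entries.

```python
# Sketch of a new resume section model for schemas/models.py (hypothetical).
from pydantic import BaseModel, Field


class VolunteerItem(BaseModel):
    id: int
    organization: str
    role: str
    years: str
    description: list[str] = Field(default_factory=list)


# ResumeData (schemas/models.py) would then gain:
#     volunteer: list[VolunteerItem] = Field(default_factory=list)
# and RESUME_SCHEMA_EXAMPLE (prompts/templates.py) a matching example entry.
```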

10.3 Adding New Export Formats

  1. Create new router or add to resumes.py:

    @router.get("/{resume_id}/export/{format}")
    async def export_resume(resume_id: str, format: str):
        # docx, markdown, txt, etc.
        ...
    
  2. Add export service in services/:

    # services/exporter.py
    async def export_to_docx(resume_data: dict) -> bytes:
        ...
    
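Fleshing out that stub with python-docx (already a dependency, see 11.3) could look like the sketch below; the section handling is illustrative, not the project's implementation.

```python
# Sketch of services/exporter.py using python-docx; covers only a couple of
# resume sections to illustrate the shape of the function.
import io

from docx import Document


async def export_to_docx(resume_data: dict) -> bytes:
    doc = Document()
    info = resume_data.get("personalInfo", {})
    doc.add_heading(info.get("name", ""), level=0)
    doc.add_paragraph(info.get("title", ""))

    if resume_data.get("summary"):
        doc.add_heading("Summary", level=1)
        doc.add_paragraph(resume_data["summary"])

    for job in resume_data.get("workExperience", []):
        doc.add_heading(f"{job.get('title', '')} - {job.get('company', '')}", level=2)
        for line in job.get("description", []):
            doc.add_paragraph(line, style="List Bullet")

    # Serialize to bytes so the router can return it as a file response.
    buffer = io.BytesIO()
    doc.save(buffer)
    return buffer.getvalue()
```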

11. Dependencies

11.1 Core Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| fastapi | ≥0.115.0 | Web framework |
| uvicorn | ≥0.34.0 | ASGI server |
| pydantic | ≥2.11.0 | Data validation |
| pydantic-settings | ≥2.8.0 | Settings management |
| tinydb | ≥4.8.0 | JSON database |
| python-multipart | ≥0.0.20 | File upload handling |

11.2 AI/ML Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| litellm | ≥1.56.0 | Multi-provider LLM abstraction |

11.3 Document Processing

| Package | Version | Purpose |
|---------|---------|---------|
| markitdown | ≥0.1.0 | Document conversion (PDF/DOCX → Markdown) |
| pdfminer.six | ≥20231228 | PDF text extraction |
| python-docx | ≥1.1.0 | DOCX parsing |
| playwright | ≥1.50.0 | PDF generation via headless Chrome |

12. Application Lifecycle

12.1 Startup (main.py)

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    settings.data_dir.mkdir(parents=True, exist_ok=True)
    await init_pdf_renderer()  # Launch Chromium
    yield
    # Shutdown - Each cleanup wrapped in try-except to ensure all resources are released
    try:
        await close_pdf_renderer()  # Close Chromium
    except Exception as e:
        logging.error(f"Error closing PDF renderer: {e}")
    try:
        db.close()  # Close TinyDB
    except Exception as e:
        logging.error(f"Error closing database: {e}")

Note: Shutdown cleanup uses individual try-except blocks so one failure doesn't prevent cleanup of other resources.

12.2 Request Flow

Request → FastAPI Router → Handler Function
                              ├── Database Operations (TinyDB)
                              ├── LLM Operations (LiteLLM)
                              ├── Service Layer (Parser/Improver)
                              └── Response

13. Function Reference by Category

13.1 Functions That Invoke LLM

| Function | Location | Purpose | Timeout |
|----------|----------|---------|---------|
| check_llm_health() | llm.py | Health check | 30s |
| complete() | llm.py | Text completion | 120s |
| complete_json() | llm.py | JSON completion | 180s |
| parse_resume_to_json() | services/parser.py | Resume parsing | Uses complete_json |
| extract_job_keywords() | services/improver.py | Keyword extraction | Uses complete_json |
| improve_resume() | services/improver.py | Resume tailoring | Uses complete_json |

13.2 Functions That Read Database

| Function | Location | Table |
|----------|----------|-------|
| db.get_resume() | database.py | resumes |
| db.get_master_resume() | database.py | resumes |
| db.list_resumes() | database.py | resumes |
| db.get_job() | database.py | jobs |
| db.get_stats() | database.py | all tables |

13.3 Functions That Write Database

| Function | Location | Table | Operation |
|----------|----------|-------|-----------|
| db.create_resume() | database.py | resumes | INSERT |
| db.update_resume() | database.py | resumes | UPDATE |
| db.delete_resume() | database.py | resumes | DELETE |
| db.set_master_resume() | database.py | resumes | UPDATE |
| db.create_job() | database.py | jobs | INSERT |
| db.create_improvement() | database.py | improvements | INSERT |

13.4 Functions That Process Files

| Function | Location | Input | Output |
|----------|----------|-------|--------|
| parse_document() | services/parser.py | bytes + filename | Markdown string |
| render_resume_pdf() | pdf.py | URL + page_size | PDF bytes |

This document is part of the Resume Matcher technical documentation. See also: frontend-architecture.md, design-system.md, api-flow-maps.md