STATUS // RUNNING // DEPLOYED · AUTOMATION PIPELINE · ISO 14644

GTSCO Cleanroom Report Pipeline

A Python automation pipeline engineered for GTSCO that intercepts raw particle counter machine dumps, validates every reading against ISO 14644 Class 7/8 air cleanliness limits, generates Matplotlib charts, and dispatches a fully branded client-ready Word + PDF compliance report — with zero manual steps between machine and mailbox.

Python 3 · xlrd / openpyxl · python-docx · Matplotlib · lxml (OOXML) · ISO 14644
pipeline.log — GTSCO_Report_Generator v7 AUTONOMOUS
// GTSCO Air Particle Count Report Generator v7
// Multi-Room | Multi-Date | Template-Based | Dynamic Pages
root@pipeline:~$ ./generate_report.py
─────────────────────────────────────────────────────────
[STEP 1] Loading: NMDC_DataCenter_20260521.xls
Format detected: Grouped (Format B) | Readings: 144
[STEP 2] Dates selected: 21/05/2026 · 21/05/2026
[STEP 3] Rooms mapped:
LOC009 → Main Server Room
LOC011 → Network Operations Center
[STEP 4] Job: NMDC Data Center · Abu Dhabi, UAE · ISO Class 8
[STEP 5] ISO 14644 validation:
0.5µm — 144 readings — all below 3,520,000 p/m³ PASS
1.0µm — 144 readings — all below 832,000 p/m³ PASS
5.0µm — 144 readings — all below 29,300 p/m³ PASS
[STEP 6] Mastersheet updated → NMDC_mastersheet.xlsx OK
[STEP 7] Charts generated: 6 charts (3 particle sizes × 2 rooms) OK
[STEP 8] Word report built → NMDC_Data_Center_Report_20260521.docx OK
[STEP 9] PDF converted → NMDC_Data_Center_Report_20260521.pdf OK
═══════════════════════════════════════════════════════════
✓ REPORT COMPLETE — 0 manual steps · 0 human errors
Client: NMDC Data Center · Rooms: 2 · ISO Class: 8
Readings validated: 144 · Report No: NMDCDATAC-20260521
root@pipeline:~$
// THE PROBLEM

Hours of manual work per report — with zero tolerance for error

After every cleanroom service, GTSCO engineers manually opened the particle counter's XLS dump, copied readings into Excel, rebuilt charts from scratch, checked each value against ISO 14644 Class 7 and Class 8 limits by hand, wrote the cover letter, formatted the Word report using the branded template, and then converted it to PDF.

Each report took hours. With multiple rooms, multiple service dates, and multiple clients per week, documentation could consume an entire working day, and a single miscopied reading could expose the business to liability on a safety-critical compliance document.

// THE SOLUTION

Double-click. Walk away. Report in your inbox.

Built a 9-step Python pipeline wrapped in a double-click BAT file that non-technical staff can operate. Drop the machine XLS in the inputs folder, run it, answer a few prompts (client name, ISO class, room names), and the pipeline handles everything: parsing two different XLS format variants, ISO validation, chart generation, Word template population at the XML level, mastersheet update, and PDF export.

The Word report is generated by directly manipulating the OOXML structure of the branded GTSCO template — preserving all logos, headers, and formatting while injecting live data, charts, and a dynamic cover letter. No copy-pasting. No format drift. No calculation errors.

// ARCHITECTURE DECISIONS

Engineering choices that mattered.

01
// FORMAT DETECTION

Dual-format XLS parser

The particle counter machine exports in two different XLS layouts depending on firmware version — Format A (timestamp-indexed) and Format B (location-grouped). The pipeline auto-detects the format by inspecting the header row and routes to the correct parser, so staff never need to know which format they have.
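The header-based routing described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the marker strings in `FORMAT_A_MARKERS` and `FORMAT_B_MARKERS` and the function name `detect_format` are hypothetical, since the real column names in the machine dumps are not given here.

```python
from typing import Sequence

# Hypothetical header markers: the real column names in the XLS dumps
# are not documented in this case study.
FORMAT_A_MARKERS = {"timestamp", "date", "time"}   # timestamp-indexed layout
FORMAT_B_MARKERS = {"location", "loc"}             # location-grouped layout


def detect_format(header_row: Sequence[object]) -> str:
    """Return "A" or "B" based on the first non-empty header cell."""
    cells = [str(cell).strip().lower() for cell in header_row]
    cells = [cell for cell in cells if cell]
    if not cells:
        raise ValueError("header row is empty")
    first = cells[0]
    if first in FORMAT_A_MARKERS:
        return "A"
    if first in FORMAT_B_MARKERS:
        return "B"
    raise ValueError(f"unrecognised header layout: {first!r}")
```

With xlrd, the header row would typically come from something like `xlrd.open_workbook(path).sheet_by_index(0).row_values(0)`. Failing loudly on an unknown layout, rather than guessing, keeps a new firmware variant from silently producing a wrong report.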

02
// WORD GENERATION

OOXML manipulation over from-scratch generation

Rather than generating a Word file from scratch (losing GTSCO's branding, fonts, and table styles), the pipeline operates directly on the existing branded MASTER_TEMPLATE.docx at the XML level using lxml. It clones room table sections, replaces drawings with live Matplotlib charts, and injects data while preserving every pixel of the client's format.
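The table-cloning idea can be sketched like this. To keep the example dependency-free it uses the standard library's `xml.etree.ElementTree` in place of lxml, and it assumes a hypothetical `{{ROOM_NAME}}` placeholder sitting inside a single `w:t` element; in real templates Word may split text across several runs, which this sketch does not handle.

```python
import copy
import xml.etree.ElementTree as ET
from typing import Sequence

# WordprocessingML namespace used by document.xml inside a .docx file.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
PLACEHOLDER = "{{ROOM_NAME}}"  # hypothetical marker, not from the real template


def clone_room_tables(body: ET.Element, rooms: Sequence[str]) -> None:
    """Replace the placeholder table with one filled-in copy per room."""
    for position, child in enumerate(list(body)):
        is_table = child.tag == f"{{{W}}}tbl"
        if is_table and any(t.text == PLACEHOLDER for t in child.iter(f"{{{W}}}t")):
            template = child
            break
    else:
        raise ValueError("no room table placeholder found")

    body.remove(template)
    for offset, room in enumerate(rooms):
        table = copy.deepcopy(template)          # keeps styles, borders, widths
        for text_node in table.iter(f"{{{W}}}t"):
            if text_node.text == PLACEHOLDER:
                text_node.text = room
        body.insert(position + offset, table)    # preserve document order
```

Because each copy is a deep clone of the template's own XML, every table style and column width carries over unchanged, which is the point of editing the template instead of rebuilding the document.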

03
// PDF EXPORT

Two-path PDF conversion

PDF conversion tries LibreOffice in headless mode first (cross-platform), then falls back to COM automation of Microsoft Word if LibreOffice isn't installed. If neither is available, the DOCX is still complete and staff can convert it manually, so every run produces a deliverable.
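A fallback chain of this kind might look like the sketch below. The function names are hypothetical and the structure is an assumption; the LibreOffice flags and the Word `wdFormatPDF` value (17) are standard, but the pipeline's real implementation may differ.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, List, Optional, Sequence

Converter = Callable[[Path], Optional[Path]]


def libreoffice_command(docx: Path, out_dir: Path, binary: str = "soffice") -> List[str]:
    """Build the headless LibreOffice conversion command."""
    return [binary, "--headless", "--convert-to", "pdf",
            "--outdir", str(out_dir), str(docx)]


def convert_with_libreoffice(docx: Path) -> Optional[Path]:
    binary = shutil.which("soffice")
    if binary is None:                      # LibreOffice not on PATH
        return None
    subprocess.run(libreoffice_command(docx, docx.parent, binary), check=True)
    pdf = docx.with_suffix(".pdf")
    return pdf if pdf.exists() else None


def convert_with_word(docx: Path) -> Optional[Path]:
    try:
        import comtypes.client              # only usable on Windows
    except ImportError:
        return None
    word = comtypes.client.CreateObject("Word.Application")
    try:
        doc = word.Documents.Open(str(docx.resolve()))
        pdf = docx.with_suffix(".pdf")
        doc.SaveAs(str(pdf.resolve()), FileFormat=17)   # 17 = wdFormatPDF
        doc.Close()
    finally:
        word.Quit()
    return pdf


def export_pdf(docx: Path,
               converters: Sequence[Converter] = (convert_with_libreoffice,
                                                  convert_with_word)) -> Path:
    """Try each converter in order; fall back to the DOCX itself."""
    for convert in converters:
        try:
            pdf = convert(docx)
        except Exception:                   # a failed converter must not stop the run
            pdf = None
        if pdf is not None:
            return pdf
    return docx                             # staff can still convert manually
```

Returning the DOCX path when every converter fails mirrors the behaviour described above: the run always ends with a usable file.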

// STACK BREAKDOWN

Technology deployed.

DATA INGESTION
xlrd

Reads legacy .xls particle counter machine dumps. Auto-detects two format variants. Extracts readings, timestamps, and location codes from both.

MASTERSHEET
openpyxl

Generates a per-client Excel mastersheet with professional styling, column groups, alternating row colours, and merged headers. Each run appends a new sheet — never overwrites history.

CHART GENERATION
Matplotlib (Agg backend)

Generates one line chart per room per particle size (3 sizes × n rooms). Each chart overlays actual readings against ISO Class 7 and Class 8 limit lines. Rendered headlessly at 150 DPI for crisp print output.
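One such chart could be produced roughly as follows. The function name, colours, figure size, and the logarithmic y-axis are illustrative assumptions; only the Agg backend, the limit overlays, and the 150 DPI setting come from the description above.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend: no display or GUI toolkit needed
import matplotlib.pyplot as plt
from typing import Mapping, Sequence


def plot_room_chart(readings: Sequence[float], size_um: str, room: str,
                    limits: Mapping[str, float], out_path: str) -> str:
    """Draw one room/particle-size chart with both ISO limit lines."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(range(1, len(readings) + 1), readings,
            marker="o", label=f"{size_um} µm readings")
    ax.axhline(limits["class_7"], color="orange", linestyle="--",
               label="ISO Class 7 limit")
    ax.axhline(limits["class_8"], color="red", linestyle="--",
               label="ISO Class 8 limit")
    # Assumption: a log scale keeps readings far below the limits visible.
    ax.set_yscale("log")
    ax.set_xlabel("Reading number")
    ax.set_ylabel("Particles per m³")
    ax.set_title(f"{room}: {size_um} µm")
    ax.legend()
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    plt.close(fig)  # release memory when generating many charts in one run
    return out_path
```

Calling this once per room and particle size yields the 3 × n chart set; closing each figure matters when a run covers many rooms.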

REPORT GENERATION
python-docx + lxml

Operates on the GTSCO MASTER_TEMPLATE.docx at the OOXML level. Clones room table sections dynamically, replaces chart placeholder drawings with live PNGs, and fills data tables and cover letter text while preserving all branding.

PDF EXPORT
LibreOffice / comtypes (Word COM)

Dual-path PDF conversion: LibreOffice headless subprocess first, COM automation via Microsoft Word as fallback. Produces a PDF on any Windows machine that has either LibreOffice or Microsoft Word installed.

DEPLOYMENT
RUN_REPORT.bat

A double-click BAT file that auto-installs all Python dependencies on first run and then launches the pipeline. Zero technical knowledge required from GTSCO staff — drop XLS, double-click, collect report.

// ISO 14644 VALIDATION LOGIC

Hardcoded safety limits. Zero calculation drift.

iso_limits.py
# Particle limits in particles per cubic metre (p/m³), keyed by particle size in µm
ISO_LIMITS = {
    "0.5": {"class_8": 3_520_000, "class_7": 352_000},
    "1.0": {"class_8": 832_000, "class_7": 83_200},
    "5.0": {"class_8": 29_300, "class_7": 2_930},
}
# Pass = ALL readings below the required class limit
# Fail = ANY reading above the required class limit

ISO 14644 limits are immutable constants — not spreadsheet formulas that can drift. Every reading in every room for every particle size is individually validated against the selected class. One reading over the limit fails the entire room.
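The per-room rule can be sketched as below. The function names are hypothetical, and treating a reading exactly equal to the limit as a pass is an assumption, since the text only covers readings below or above the limit.

```python
from typing import Dict, Mapping, Sequence

# Limits in particles per cubic metre (p/m³), keyed by particle size in µm.
ISO_LIMITS = {
    "0.5": {"class_8": 3_520_000, "class_7": 352_000},
    "1.0": {"class_8": 832_000, "class_7": 83_200},
    "5.0": {"class_8": 29_300, "class_7": 2_930},
}


def validate_room(readings: Mapping[str, Sequence[float]],
                  iso_class: str) -> Dict[str, bool]:
    """Return a pass/fail flag per particle size for one room."""
    results = {}
    for size, values in readings.items():
        limit = ISO_LIMITS[size][iso_class]
        # Assumption: a reading equal to the limit still passes.
        results[size] = all(value <= limit for value in values)
    return results


def room_passes(readings: Mapping[str, Sequence[float]], iso_class: str) -> bool:
    """A room passes only if every particle size passes."""
    return all(validate_room(readings, iso_class).values())
```

For example, `room_passes({"5.0": [500, 30_000]}, "class_8")` is `False`, because a single reading above 29,300 p/m³ fails the whole room.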

// OUTCOMES

What shipped.

0 // MANUAL STEPS
100% // HUMAN ERROR ELIMINATED
<2 min // DUMP TO REPORT
9 // AUTOMATED STEPS

MULTI-ROOM

Handles any number of rooms in a single run. Each room gets its own 3-chart section in the report, its own Pass/Fail status, and its own rows in the mastersheet.

HISTORICAL TRACKING

A per-client Excel mastersheet accumulates every run as a new sheet. GTSCO's team can pull up any client's full service history at any time — zero data loss across runs.

NON-TECHNICAL DEPLOYMENT

One double-click BAT file. No Python knowledge needed. Auto-installs dependencies on first run. Any GTSCO staff member can generate a report on any Windows machine.
