feat: smart batch processing with skip logic

- Change --batch to accept directory instead of glob pattern
- Automatically skip already-processed scan dates
- Add --force flag to reprocess all files
- Fix date extraction regex to parse from client info line
- Display helpful tips about skipping/forcing
- Better user feedback with skip counts and suggestions

Usage:
  python dexa_extract.py --batch data/pdfs --height-in 74 --outdir data/results

This will process only new scans, skipping any dates already in the output.
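
The skip behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in dexa_extract.py: the function name `select_new_scans`, its arguments, and the date-string format are all assumptions made for the example.

```python
def select_new_scans(scan_dates, processed_dates, force=False):
    """Split PDFs into (to_process, skipped) by scan date.

    scan_dates: dict mapping PDF path -> scan date string (hypothetical shape).
    processed_dates: set of date strings already present in the output.
    force: when True, nothing is skipped (mirrors the --force flag).
    """
    to_process, skipped = [], []
    # Sort paths so the processing order is deterministic.
    for pdf in sorted(scan_dates):
        if not force and scan_dates[pdf] in processed_dates:
            skipped.append(pdf)
        else:
            to_process.append(pdf)
    return to_process, skipped
```

Returning the skipped list alongside the work list lets the caller report skip counts and suggest --force when everything was skipped.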
2025-10-06 15:33:05 -07:00 · 2025-10-06 15:33:05 -07:00 · b046af5d25
commit b046af5d25
parent d6793e2572
3 changed files with 342 additions and 38 deletions
--- a/data/results/README.md
+++ b/data/results/README.md
@@ -1,18 +0,0 @@
-# Results Directory
-
-Your extracted DEXA data will be saved here by default.
-
-## Output Files
-
-When you run the extraction script with `--outdir data/results`, you'll get:
-
-- `overall.csv` - Time-series data (one row per scan)
-- `regional.csv` - Regional body composition
-- `muscle_balance.csv` - Left/right limb comparison
-- `overall.json` - Structured JSON format
-- `summary.md` - Human-readable summary
-
-## Note
-
-⚠️ **Result files are gitignored** - They contain your personal health data and won't be committed to version control.
-