feat: smart batch processing with skip logic

- Change --batch to accept directory instead of glob pattern
- Automatically skip already-processed scan dates
- Add --force flag to reprocess all files
- Fix date extraction regex to parse from client info line
- Display helpful tips about skipping/forcing
- Better user feedback with skip counts and suggestions

Usage:
  python dexa_extract.py --batch data/pdfs --height-in 74 --outdir data/results

This will process only new scans, skipping any dates already in the output.
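
The skip/force behavior described above could be sketched roughly as follows. This is an illustration only, not the code in dexa_extract.py: the helper names `load_processed_dates` and `plan_batch`, the `scan_date` column name, and the idea that processed dates are read back from `overall.csv` are all assumptions.

```python
import csv
from pathlib import Path


def load_processed_dates(outdir, column="scan_date"):
    """Return the set of scan dates already in overall.csv (empty set if no file).

    The column name "scan_date" is an assumption made for this sketch.
    """
    overall = Path(outdir) / "overall.csv"
    if not overall.exists():
        return set()
    with overall.open(newline="") as fh:
        return {row[column] for row in csv.DictReader(fh) if row.get(column)}


def plan_batch(scan_dates, processed, force=False):
    """Split a {pdf_name: scan_date} mapping into (to_process, skipped) lists.

    With force=True nothing is skipped, mirroring the --force flag.
    """
    to_process, skipped = [], []
    for name, date in sorted(scan_dates.items()):
        if not force and date in processed:
            skipped.append(name)
        else:
            to_process.append(name)
    return to_process, skipped
```

A caller would extract a scan date from each PDF in the batch directory, pass the resulting mapping to `plan_batch`, and report `len(skipped)` together with a hint about --force.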
parent d6793e2572
commit b046af5d25
3 changed files with 342 additions and 38 deletions
@@ -1,18 +0,0 @@
# Results Directory

Your extracted DEXA data will be saved here by default.

## Output Files

When you run the extraction script with `--outdir data/results`, you'll get:

- `overall.csv` - Time-series data (one row per scan)
- `regional.csv` - Regional body composition
- `muscle_balance.csv` - Left/right limb comparison
- `overall.json` - Structured JSON format
- `summary.md` - Human-readable summary

## Note

⚠️ **Result files are gitignored** - They contain your personal health data and won't be committed to version control.