DiskForge — File Recovery Engine

Rust · Windows · No Cloud · MIT

A forensics-inspired file recovery engine built in pure Rust.
Raw disk reads. NTFS metadata recovery. SMART/NVMe health signals. Web + CLI interfaces.
Runs locally. Read-only scanning. Built for a hackathon.



The Problem

People lose important files every day — often due to:

  • accidental deletion
  • drive corruption
  • OS crashes or bad updates
  • “quick format” mistakes
  • failing storage hardware

When that happens, options tend to be split between:

| Category | Examples | Tradeoff |
|---|---|---|
| Consumer recovery tools | Recuva, Disk Drill Free | Easy to use, but many rely primarily on high-level filesystem scans and may miss recoverable data in edge cases. |
| Professional forensic suites | EnCase, FTK, R-Studio | Powerful and battle-tested, but priced and designed for labs and corporate environments. |

DiskForge targets the middle: a local, open-source recovery engine that goes deeper than surface scans by reading raw sectors and parsing filesystem structures directly.

⚠️ Reality check: SSDs with TRIM enabled can make deleted data unrecoverable after garbage collection. DiskForge detects TRIM/UNMAP support and warns early, but no software can “undo” TRIM once blocks are erased.

[Figure: The Problem vs The Solution]


What DiskForge Does (and what it doesn’t)

✅ DiskForge does

  • Reads raw disk sectors on Windows using \\.\PhysicalDriveN + IOCTLs (admin required)
  • Parses NTFS metadata (MFT records, filenames, timestamps, run lists) to find deleted files whose metadata still exists
  • Correlates Recycle Bin $I/$R pairs to restore original names/paths when possible
  • Carves file types from unallocated space using header/footer signatures + basic structural checks
  • Surfaces SMART / NVMe health signals to help users understand drive condition and risk
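
As an illustration of the Recycle Bin correlation step, here is a minimal sketch of decoding a `$I` metadata file. It assumes the commonly documented layout (8-byte version, 8-byte original size, 8-byte FILETIME deletion time, then either a fixed 520-byte UTF-16LE path for version 1 or a length-prefixed path for version 2). The names `RecycleInfo`, `parse_recycle_info`, and `filetime_to_unix_secs` are hypothetical and are not taken from the DiskForge source.

```rust
/// Metadata decoded from a Recycle Bin `$I` file (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct RecycleInfo {
    pub original_size: u64,
    /// Deletion time as a Windows FILETIME (100 ns ticks since 1601-01-01 UTC).
    pub deleted_filetime: u64,
    pub original_path: String,
}

/// Read a little-endian u64 at `off`, or None if the buffer is too short.
fn read_u64(buf: &[u8], off: usize) -> Option<u64> {
    let slice = buf.get(off..off + 8)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(slice);
    Some(u64::from_le_bytes(raw))
}

/// Convert a FILETIME to seconds since the Unix epoch (None if before 1970).
pub fn filetime_to_unix_secs(ft: u64) -> Option<u64> {
    const EPOCH_DIFF_SECS: u64 = 11_644_473_600; // 1601-01-01 to 1970-01-01
    (ft / 10_000_000).checked_sub(EPOCH_DIFF_SECS)
}

/// Parse the raw bytes of a `$I` file. Returns None on truncated or unknown input.
pub fn parse_recycle_info(buf: &[u8]) -> Option<RecycleInfo> {
    let version = read_u64(buf, 0)?;
    let original_size = read_u64(buf, 8)?;
    let deleted_filetime = read_u64(buf, 16)?;
    let path_bytes = match version {
        // Version 1: fixed 260 UTF-16 code units (520 bytes), NUL padded.
        1 => buf.get(24..24 + 520)?,
        // Version 2: u32 length in UTF-16 code units (including NUL), then the path.
        2 => {
            let len_slice = buf.get(24..28)?;
            let mut raw = [0u8; 4];
            raw.copy_from_slice(len_slice);
            let chars = u32::from_le_bytes(raw) as usize;
            let end = 28usize.checked_add(chars.checked_mul(2)?)?;
            buf.get(28..end)?
        }
        _ => return None,
    };
    // Decode UTF-16LE up to the first NUL terminator.
    let units: Vec<u16> = path_bytes
        .chunks_exact(2)
        .map(|c| u16::from_le_bytes([c[0], c[1]]))
        .take_while(|&u| u != 0)
        .collect();
    Some(RecycleInfo {
        original_size,
        deleted_filetime,
        original_path: String::from_utf16_lossy(&units),
    })
}
```

The matching `$R` file shares the same random suffix as its `$I` file, so once the `$I` record is parsed the original name and path can be attached to the `$R` contents.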

❌ DiskForge does not claim

  • Guaranteed recovery (recovery depends on overwrite/TRIM/encryption and time elapsed)
  • Full forensic case management (chain-of-custody tooling, court-ready reporting)
  • General “post-TRIM resurrection” on SSDs (DiskForge captures available NVMe telemetry/health logs for diagnostics, but SSD flash translation layer (FTL) behavior is vendor-specific)

How it works (recovery pipeline)

                    +---------------------------------+
                    |         YOUR DELETED FILE        |
                    +----------------+----------------+
                                     |
                    +----------------v----------------+
                    |     NTFS Master File Table       |
                    |   (MFT entry may still exist)    |
                    +----------------+----------------+
                                     |
              +----------+-----------+-----------+----------+
              |          |                       |          |
        +-----v----+ +--v--------+       +------v-----+ +-v----------+
        | $FILE_NAME| |  $DATA    |       | $STANDARD  | | $Bitmap    |
        | UTF-16LE  | | Run List  |       | INFO       | | Free Map   |
        | + Parent  | | (clusters)|       | Timestamps | | (per-bit)  |
        +----------+ +-----------+       +------------+ +------------+
              |          |                                     |
              v          v                                     v
        +-----------+  +-------------+              +------------------+
        | Recycle    |  | Reconstruct |              | Carve unalloc    |
        | Bin $I/$R  |  | from runs   |              | clusters/sectors |
        | correlation|  | (assembler) |              | for signatures   |
        +-----------+  +-------------+              +------------------+
                          |
                    +-----v------+
                    | Integrity  |
                    | Verify     |
                    | (hash +    |
                    | structure) |
                    +------------+

[Figure: DiskForge Recovery Pipeline]
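
The "Reconstruct from runs" box depends on decoding the `$DATA` run list. The sketch below shows the standard NTFS encoding: each run starts with a header byte whose low nibble gives the size of the length field and whose high nibble gives the size of the signed, relative cluster offset; a zero header ends the list, and a zero-size offset marks a sparse run. `DataRun` and `decode_run_list` are illustrative names, not DiskForge's actual API.

```rust
/// One extent of a non-resident attribute. `lcn` is None for a sparse run.
#[derive(Debug, PartialEq)]
pub struct DataRun {
    pub lcn: Option<i64>,
    pub length: u64, // in clusters
}

/// Decode an NTFS run list into absolute cluster extents.
/// Returns None if the list is truncated or malformed.
pub fn decode_run_list(mut bytes: &[u8]) -> Option<Vec<DataRun>> {
    let mut runs = Vec::new();
    let mut prev_lcn: i64 = 0;
    while let Some((&header, rest)) = bytes.split_first() {
        if header == 0 {
            break; // end-of-list marker
        }
        let len_size = (header & 0x0F) as usize;
        let off_size = (header >> 4) as usize;
        if len_size == 0 || len_size > 8 || off_size > 8 || rest.len() < len_size + off_size {
            return None;
        }
        let (len_bytes, tail) = rest.split_at(len_size);
        let (off_bytes, next) = tail.split_at(off_size);
        // Run length: unsigned little-endian.
        let length = len_bytes
            .iter()
            .rev()
            .fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
        // Offset: signed little-endian delta from the previous run's LCN.
        let lcn = match off_bytes.last() {
            None => None, // sparse run, no clusters on disk
            Some(&msb) => {
                let seed: i64 = if msb & 0x80 != 0 { -1 } else { 0 }; // sign-extend
                let delta = off_bytes
                    .iter()
                    .rev()
                    .fold(seed, |acc, &b| (acc << 8) | i64::from(b));
                prev_lcn = prev_lcn.checked_add(delta)?;
                Some(prev_lcn)
            }
        };
        runs.push(DataRun { lcn, length });
        bytes = next;
    }
    Some(runs)
}
```

Multiplying each `lcn` and `length` by the cluster size (taken from the NTFS boot sector) gives the byte ranges an assembler would read, in order, to rebuild the file.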

Phases (hackathon build)

| Phase | What Happens | Notes |
|---|---|---|
| 1. Drive discovery + health | Enumerate drives, detect SSD/HDD/NVMe where possible, read SMART/NVMe health logs | Health ≠ recovery guarantee; it’s a signal, not a verdict |
| 2. Partition discovery | Parse GPT/MBR from raw sectors | No filesystem driver dependency for discovery |
| 3. NTFS deep scan | Walk MFT entries; parse filenames, timestamps, run lists | Works best when metadata has not been overwritten |
| 4. Recycle Bin intelligence | Parse $I metadata to recover original path + deletion time when present | High value for “oops I deleted this” cases |
| 5. Bitmap-guided carving | Use $Bitmap to prioritize free clusters | Reduces scan time and noise |
| 6. Raw carving | Scan unallocated areas for signatures; validate with basic structure checks | Always produces some false positives; DiskForge ranks confidence |
| 7. Reconstruction + integrity | Reassemble from data runs where possible; verify with hashes and format checks | Integrity checks are “best effort” per file type |
| 8. Recovery forecast (heuristic) | A conservative estimate based on device type, TRIM support, and health indicators | Designed to prevent wasted hours on hopeless scenarios |
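
To make phase 5 concrete, the following sketch turns raw `$Bitmap` bytes into runs of free clusters that a carver could visit first. It relies on the NTFS convention of one bit per cluster, least significant bit first, where a set bit means "allocated". The function name `free_cluster_runs` and the `(start, length)` tuple output are assumptions for illustration only.

```rust
/// Collapse an NTFS `$Bitmap` into (start_cluster, cluster_count) runs of
/// free clusters. Bit n of the bitmap (LSB first within each byte) is set
/// when cluster n is allocated.
pub fn free_cluster_runs(bitmap: &[u8], total_clusters: u64) -> Vec<(u64, u64)> {
    let mut runs = Vec::new();
    let mut run_start: Option<u64> = None;
    // Never read past the bitmap even if the caller overstates the volume size.
    let limit = total_clusters.min(bitmap.len() as u64 * 8);
    for cluster in 0..limit {
        let byte = bitmap[(cluster / 8) as usize];
        let allocated = (byte >> (cluster % 8)) & 1 == 1;
        match (allocated, run_start) {
            // A free cluster opens a new run.
            (false, None) => run_start = Some(cluster),
            // An allocated cluster closes the current run.
            (true, Some(start)) => {
                runs.push((start, cluster - start));
                run_start = None;
            }
            _ => {}
        }
    }
    // Close a run that extends to the end of the volume.
    if let Some(start) = run_start {
        runs.push((start, limit - start));
    }
    runs
}
```

Working in runs rather than single clusters keeps reads large and sequential, which matters on spinning disks and also shrinks the search space the signature scanner has to cover.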


Architecture (high level)

graph TB
    subgraph "User Interfaces"
        CLI["CLI + TUI (ratatui + clap)"]
        GUI["Desktop GUI (egui/eframe)"]
        WEB["Web Dashboard (Axum + embedded SPA)"]
    end

    subgraph "DiskForge Core Engine"
        ORCH["Orchestrator scan()/reconstruct()"]

        subgraph "Filesystem Parsers"
            NTFS["NTFS parser (MFT, $Bitmap, run lists)"]
            GPT["GPT + MBR parser"]
        end

        subgraph "Scanner & Carver"
            CARVER["Carver (header/footer + checks)"]
            SIGDB["Signature DB (20+ types)"]
            SCHED["Scheduler (multi-thread scan)"]
        end

        subgraph "Reconstruction"
            ASM["Assembler (runs -> file)"]
            INTEGRITY["Integrity checks (hash + structure)"]
            DEDUP["Dedup + result scoring"]
        end
    end

    subgraph "Windows Hardware Layer"
        DISK["Raw reader (CreateFileW + ReadFile)"]
        SMART["SMART/NVMe health (IOCTLs)"]
        HPA["HPA detection (where supported)"]
        PRIV["Privilege check"]
        TELEM["NVMe telemetry capture (experimental)"]
    end

    CLI --> ORCH
    GUI --> ORCH
    WEB --> ORCH

    ORCH --> NTFS & GPT
    ORCH --> CARVER & SCHED
    ORCH --> ASM

    CARVER --> SIGDB
    ASM --> INTEGRITY & DEDUP

    NTFS & GPT --> DISK
    CARVER --> DISK
    ASM --> DISK

    DISK --> SMART & HPA & PRIV & TELEM
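
The Carver and Signature DB components can be pictured with a small header/footer example. This is a sketch for a single file type (JPEG: header `FF D8 FF`, footer `FF D9`) with one basic structural check on the byte after the header; the real signature database and confidence scoring are not shown, and `CarveHit`, `find`, and `carve_jpegs` are hypothetical names.

```rust
/// A candidate file found by signature: byte range [start, end) in the buffer.
#[derive(Debug, PartialEq)]
pub struct CarveHit {
    pub start: usize,
    pub end: usize,
}

/// Find `needle` in `haystack` at or after `from`.
fn find(haystack: &[u8], needle: &[u8], from: usize) -> Option<usize> {
    haystack
        .get(from..)?
        .windows(needle.len())
        .position(|w| w == needle)
        .map(|p| p + from)
}

/// Carve JPEG candidates from a buffer of unallocated data.
/// `max_len` caps how far past a header the footer search may run.
pub fn carve_jpegs(data: &[u8], max_len: usize) -> Vec<CarveHit> {
    const HEADER: &[u8] = &[0xFF, 0xD8, 0xFF];
    const FOOTER: &[u8] = &[0xFF, 0xD9];
    let mut hits = Vec::new();
    let mut pos = 0;
    while let Some(start) = find(data, HEADER, pos) {
        // Basic structural check: the byte after the header should be a
        // plausible JPEG marker code (for example 0xE0, 0xE1, 0xDB).
        let marker_ok = data
            .get(start + 3)
            .map_or(false, |b| (0xC0..=0xFE).contains(b));
        let window_end = data.len().min(start.saturating_add(max_len));
        let footer = if marker_ok {
            find(&data[..window_end], FOOTER, start + 4)
        } else {
            None
        };
        match footer {
            Some(f) => {
                hits.push(CarveHit { start, end: f + FOOTER.len() });
                pos = f + FOOTER.len();
            }
            // No usable footer: skip this header and keep scanning.
            None => pos = start + 1,
        }
    }
    hits
}
```

Stopping at the first footer is a simplification: an embedded thumbnail can end a candidate early, and fragmented files will not carve cleanly, which is one reason carved results need a confidence score rather than a pass/fail label.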

The Web Dashboard

DiskForge includes a local web dashboard baked into the binary using rust-embed:

  • no Node.js
  • no npm install
  • no external assets

[Figure: DiskForge Web Dashboard]

Wizard flow:

  1. Select Drive (shows type, size, TRIM support when detectable)
  2. Configure (file categories, carving toggles, thread/memory caps)
  3. Scan (phase progress + file-found count; cancel anytime)
  4. Results (filter by intact/partial/carved; confidence score; export CSV)
  5. Recover (writes output to a separate destination folder)

What makes this hackathon-worthy

  • Low-level reality: raw disk reads and filesystem parsing are hard to get right
  • Honest constraints: SSD TRIM, encryption, and overwrites are hard limits — DiskForge surfaces them up front
  • Open + local: no cloud dependencies; users keep their data on their machine
  • Multiple entry points: web dashboard for normal users + CLI/TUI for power users

The 9 Windows IOCTL calls (what they’re used for)

| # | IOCTL | Purpose |
|---|---|---|
| 1 | IOCTL_DISK_GET_DRIVE_GEOMETRY_EX | Sector size + total capacity |
| 2 | IOCTL_STORAGE_QUERY_PROPERTY (DeviceProperty) | Model, serial, bus type |
| 3 | IOCTL_STORAGE_QUERY_PROPERTY (TrimProperty) | TRIM/UNMAP support |
| 4 | IOCTL_STORAGE_QUERY_PROPERTY (SeekPenalty) | SSD vs HDD signal |
| 5 | IOCTL_ATA_PASS_THROUGH (SMART READ) | ATA SMART attribute table |
| 6 | IOCTL_ATA_PASS_THROUGH (NATIVE MAX) | HPA detection (when supported) |
| 7 | IOCTL_STORAGE_QUERY_PROPERTY (NVMe Health) | NVMe SMART / Health log |
| 8 | IOCTL_STORAGE_QUERY_PROPERTY (NVMe Error) | NVMe error log |
| 9 | IOCTL_STORAGE_QUERY_PROPERTY (NVMe Telemetry) | Telemetry capture (diagnostics / experimental) |

Note: NVMe telemetry is primarily used for diagnostics. SSD internal mapping (FTL/L2P) is vendor-specific and not guaranteed to be recoverable or interpretable.
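
The nine calls above use only three distinct control codes; the parenthetical in each row selects the property or ATA command passed in the request buffer. As a sketch, the codes can be derived in Rust with the same arithmetic as the Windows SDK `CTL_CODE` macro. The helper name `ctl_code` is an assumption here; the constant names and field values follow the SDK headers.

```rust
/// Same bit layout as the Windows SDK CTL_CODE macro:
/// device type (bits 16..31), access (14..15), function (2..13), method (0..1).
pub const fn ctl_code(device_type: u32, function: u32, method: u32, access: u32) -> u32 {
    (device_type << 16) | (access << 14) | (function << 2) | method
}

// Device type bases from the SDK headers.
pub const IOCTL_SCSI_BASE: u32 = 0x04; // FILE_DEVICE_CONTROLLER
pub const IOCTL_DISK_BASE: u32 = 0x07; // FILE_DEVICE_DISK
pub const IOCTL_STORAGE_BASE: u32 = 0x2D; // FILE_DEVICE_MASS_STORAGE

pub const METHOD_BUFFERED: u32 = 0;
pub const FILE_ANY_ACCESS: u32 = 0;
pub const FILE_READ_ACCESS: u32 = 1;
pub const FILE_WRITE_ACCESS: u32 = 2;

/// Sector size and total capacity (row 1).
pub const IOCTL_DISK_GET_DRIVE_GEOMETRY_EX: u32 =
    ctl_code(IOCTL_DISK_BASE, 0x0028, METHOD_BUFFERED, FILE_ANY_ACCESS);

/// Device, TRIM, seek-penalty, and NVMe log queries (rows 2-4 and 7-9).
pub const IOCTL_STORAGE_QUERY_PROPERTY: u32 =
    ctl_code(IOCTL_STORAGE_BASE, 0x0500, METHOD_BUFFERED, FILE_ANY_ACCESS);

/// ATA SMART and native-max commands (rows 5-6).
pub const IOCTL_ATA_PASS_THROUGH: u32 = ctl_code(
    IOCTL_SCSI_BASE,
    0x040B,
    METHOD_BUFFERED,
    FILE_READ_ACCESS | FILE_WRITE_ACCESS,
);
```

In a real build these values would normally come from a Windows bindings crate rather than being redefined; spelling them out mainly helps when reading traces or debugging a failing `DeviceIoControl` call.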


Quick Start

Prerequisites

  • Windows 10/11
  • Rust toolchain (stable)
  • Administrator privileges (required for raw \\.\PhysicalDrive access)

Build & Run

git clone https://github.com/salvation06/diskforge.git
cd diskforge

cargo build --release

# Web Dashboard (recommended)
# Run as Administrator, then open http://localhost:3000
.\target\release\diskforge-web.exe

# CLI scan
.\target\release\diskforge-cli.exe scan --drive \\.\PhysicalDrive1 --output C:\Recovery

# Drive health report
.\target\release\diskforge-cli.exe health --drive \\.\PhysicalDrive0

Safety notes (please read)

  • Never write recovered files to the same drive you’re scanning
  • DiskForge scanning is intended to be read-only
  • If a drive is failing, image it first if you can (roadmap item: guided imaging mode + hashing)
  • SSD + TRIM often means low recovery odds after time has passed

Roadmap (post-hackathon)

  • Guided imaging mode (acquire -> hash -> analyze image)
  • More NTFS artifacts (directory index parsing for more path recovery)
  • Expand signature validations and confidence scoring
  • Broader filesystem support (FAT/exFAT) where verified by tests

Built with human vision and AI velocity.
DiskForge — because your data deserves a second chance.
