File Indexer

File Indexer is a fast command-line tool written in Rust that recursively indexes file systems, extracts file metadata, and provides search and duplicate detection. It is built for developers and system administrators who need to manage and analyze large directory structures efficiently.

Features

  • Fast indexing with parallel processing across all CPU cores
  • Comprehensive metadata extraction including size, timestamps, permissions, and checksums
  • Advanced search capabilities with regex, wildcards, and boolean operators
  • Duplicate file detection using SHA-256 checksums
  • Flexible export options in JSON and CSV formats
  • Content indexing with configurable text preview extraction
  • Configurable depth limits and ignore patterns
  • Real-time progress reporting with detailed statistics
  • Cross-platform compatibility (Windows, macOS, Linux)

Performance

File Indexer relies on Rust’s zero-cost abstractions and Rayon’s work-stealing parallelism to keep indexing throughput high:

  • Multi-core utilization: Automatically scales across all available CPU cores
  • Memory efficient: Processes files in streaming fashion without loading entire contents
  • Throughput: Indexes thousands of files per second on modern hardware
  • Minimal overhead: Written in Rust for maximum performance and safety
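
The "memory efficient" point above can be illustrated with a minimal sketch of streaming a file through a fixed-size buffer. This is not File Indexer's actual code: the function name `stream_digest` is hypothetical, and the digest is 64-bit FNV-1a, used here only as a std-only stand-in for the SHA-256 checksums the tool reports (SHA-256 would need an external crate).

```rust
use std::io::{self, Read};

// Stand-in digest: 64-bit FNV-1a. NOT SHA-256; chosen so the sketch needs
// nothing outside the standard library.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Reads `reader` through a buffer of `buf_size` bytes and returns
/// (total bytes read, digest). At most `buf_size` bytes are held at once.
pub fn stream_digest<R: Read>(mut reader: R, buf_size: usize) -> io::Result<(u64, u64)> {
    let mut buf = vec![0u8; buf_size.max(1)];
    let mut total = 0u64;
    let mut hash = FNV_OFFSET;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Fold only the bytes actually read into the running digest.
        for &byte in &buf[..n] {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(FNV_PRIME);
        }
        total += n as u64;
    }
    Ok((total, hash))
}
```

Because the function accepts any `Read`, it works the same way on a `std::fs::File` as on an in-memory byte slice, and the result does not depend on the buffer size.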

Usage Examples

Index a directory with all features

file_indexer index /path/to/projects --depth 10 --content --checksums

Search with advanced options

# Regex search
file_indexer search ".*\.rs$" --regex
# Wildcard search with content
file_indexer search "config*" --content
# Case-sensitive exact match
file_indexer search "README" --case-sensitive
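
For readers curious how a wildcard pattern such as "config*" might be evaluated, here is a minimal sketch. It assumes `*` matches any run of characters, `?` matches exactly one, and the pattern is tested against the whole file name. The tool's real matching rules are not documented here, and `wildcard_match` is a hypothetical name.

```rust
/// Matches `name` against `pattern`. Assumed semantics: `*` matches any run
/// of characters (including none), `?` matches exactly one character.
pub fn wildcard_match(pattern: &str, name: &str, case_sensitive: bool) -> bool {
    // Normalise case up front when the search is case-insensitive.
    let norm = |s: &str| -> Vec<char> {
        if case_sensitive {
            s.chars().collect()
        } else {
            s.to_lowercase().chars().collect()
        }
    };
    let (p, n) = (norm(pattern), norm(name));
    let (mut pi, mut ni) = (0usize, 0usize);
    // Pattern index of the last `*` seen, and the name index it was tried at.
    let mut star: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            // Record the star and first try matching it against nothing.
            star = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == n[ni]) {
            pi += 1;
            ni += 1;
        } else if let Some((sp, sn)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            ni = sn + 1;
            star = Some((sp, sn + 1));
        } else {
            return false;
        }
    }
    // Whatever is left of the pattern must consist only of `*`.
    p[pi..].iter().all(|&c| c == '*')
}
```

Under these assumptions, `wildcard_match("config*", "config.toml", true)` holds, while `"myconfig.toml"` does not match because the pattern is anchored at the start of the name.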

Export and analysis

# Export to JSON for programmatic analysis
file_indexer export json results.json
# Export to CSV for spreadsheet analysis
file_indexer export csv results.csv
# Find duplicate files
file_indexer duplicates
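
Checksum-based duplicate detection comes down to grouping files by digest and keeping the groups with more than one member. The sketch below shows that grouping step only. It assumes each file's checksum has already been computed and is available as a hex string; `find_duplicates` is a hypothetical name, not the tool's API.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Groups paths that share a checksum. Only groups with two or more members
/// are returned, since a checksum seen once is not a duplicate.
pub fn find_duplicates(entries: &[(PathBuf, String)]) -> Vec<Vec<PathBuf>> {
    let mut by_checksum: HashMap<&str, Vec<PathBuf>> = HashMap::new();
    for (path, checksum) in entries {
        by_checksum
            .entry(checksum.as_str())
            .or_default()
            .push(path.clone());
    }

    let mut groups: Vec<Vec<PathBuf>> = by_checksum
        .into_values()
        .filter(|paths| paths.len() > 1)
        .map(|mut paths| {
            paths.sort(); // stable order inside each group
            paths
        })
        .collect();
    groups.sort(); // stable order across groups, independent of hash-map order
    groups
}
```

A common refinement, not shown here, is to group by file size first and hash only files whose sizes collide, which avoids reading files that cannot be duplicates.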

Architecture

File Indexer works in two phases:

  1. Discovery Phase: Recursively traverses directory structure collecting file paths
  2. Processing Phase: Parallel processing of files using Rayon’s thread pool for maximum throughput

The tool handles error conditions, provides detailed progress reporting, and maintains thread-safe data structures for reliable concurrent operation.
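
The two phases can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's implementation: scoped threads from `std::thread` stand in for Rayon's thread pool so the example has no external dependencies, the function names `discover` and `process` are hypothetical, and the only "metadata" collected is the file size.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Phase 1 (discovery): walk the tree and collect file paths only.
/// Symlinks are neither followed nor collected in this sketch.
pub fn discover(root: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let file_type = entry.file_type()?; // does not follow symlinks
        if file_type.is_dir() {
            discover(&entry.path(), out)?;
        } else if file_type.is_file() {
            out.push(entry.path());
        }
    }
    Ok(())
}

/// Phase 2 (processing): split the path list into chunks and handle each
/// chunk on its own scoped thread. Returns (path, size in bytes); files whose
/// metadata cannot be read are skipped rather than aborting the run.
pub fn process(paths: &[PathBuf], workers: usize) -> Vec<(PathBuf, u64)> {
    let workers = workers.max(1);
    // Ceiling division, and never zero because `chunks(0)` would panic.
    let chunk_size = ((paths.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        let handles: Vec<_> = paths
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .filter_map(|p| fs::metadata(p).ok().map(|m| (p.clone(), m.len())))
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        // Each thread returns its own Vec, so no shared mutable state is needed.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Static chunking is simpler than work stealing but balances load less well when file sizes vary widely, which is one reason a work-stealing pool such as Rayon's is attractive for this kind of workload.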

Configuration

File Indexer is configurable, with sensible defaults:

  • Ignore patterns: Skips .git, node_modules, and target directories by default
  • Include patterns: Optional whitelist for selective indexing
  • Content preview size: Configurable text extraction limits
  • Symlink handling: Optional symlink following
  • Depth limits: Cap how deep traversal descends, preventing runaway recursion
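
As a rough sketch of how the options above could fit together, the following models them as a struct with defaults and two filter functions. Every name here (`IndexConfig`, its fields, `should_descend`, `should_index`) is hypothetical. The default ignore list comes from the list above, while the other defaults (no depth limit, symlinks not followed, a 1024-byte preview) and the extension-based include filter are assumptions made for the example.

```rust
use std::path::Path;

/// Hypothetical configuration mirroring the options listed above.
pub struct IndexConfig {
    pub ignore_dirs: Vec<String>,
    /// Optional whitelist, simplified here to file extensions.
    pub include_extensions: Option<Vec<String>>,
    pub max_depth: Option<usize>,
    pub follow_symlinks: bool,
    pub preview_bytes: usize,
}

impl Default for IndexConfig {
    fn default() -> Self {
        Self {
            ignore_dirs: vec![".git".into(), "node_modules".into(), "target".into()],
            include_extensions: None,
            max_depth: None,        // assumed: unlimited unless set
            follow_symlinks: false, // assumed: opt-in
            preview_bytes: 1024,    // assumed value, for illustration only
        }
    }
}

impl IndexConfig {
    /// True if traversal should enter `dir`, found at `depth` (root = 0).
    pub fn should_descend(&self, dir: &Path, depth: usize) -> bool {
        if let Some(max) = self.max_depth {
            if depth > max {
                return false;
            }
        }
        let name = dir.file_name().and_then(|n| n.to_str()).unwrap_or("");
        !self.ignore_dirs.iter().any(|ignored| ignored.as_str() == name)
    }

    /// True if `file` passes the optional include whitelist.
    pub fn should_index(&self, file: &Path) -> bool {
        match &self.include_extensions {
            None => true,
            Some(exts) => file
                .extension()
                .and_then(|e| e.to_str())
                .map_or(false, |ext| exts.iter().any(|a| a.eq_ignore_ascii_case(ext))),
        }
    }
}
```

Individual options can be overridden with struct update syntax, for example `IndexConfig { max_depth: Some(10), ..IndexConfig::default() }`.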

File Indexer is licensed under the MIT license and actively maintained. It is well suited to DevOps workflows, system auditing, and large-scale file management tasks.

