File Indexer
File Indexer is a fast command-line tool written in Rust that recursively indexes file systems, extracts file metadata, and provides search and duplicate detection. It is built for developers and system administrators who need to manage and analyze large directory structures efficiently.
Features
- Lightning-fast indexing with parallel processing across all CPU cores
- Comprehensive metadata extraction including size, timestamps, permissions, and checksums
- Advanced search capabilities with regex, wildcards, and boolean operators
- Duplicate file detection using SHA-256 checksums
- Flexible export options in JSON and CSV formats
- Content indexing with configurable text preview extraction
- Configurable depth limits and ignore patterns
- Real-time progress reporting with detailed statistics
- Cross-platform compatibility (Windows, macOS, Linux)
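To make the duplicate-detection feature concrete, here is a minimal sketch of the grouping step only. It assumes each file's SHA-256 digest has already been computed elsewhere and is passed in as a hex string. The function name `group_duplicates` and the input shape are hypothetical and are not taken from File Indexer's source.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Groups paths by checksum and keeps only the groups with more than one
/// member. `entries` pairs each path with a digest computed elsewhere
/// (a hypothetical input shape chosen for illustration).
pub fn group_duplicates(entries: &[(PathBuf, String)]) -> Vec<Vec<PathBuf>> {
    // BTreeMap keeps the groups in a deterministic (sorted-by-checksum) order.
    let mut by_checksum: BTreeMap<&str, Vec<PathBuf>> = BTreeMap::new();
    for (path, checksum) in entries {
        by_checksum
            .entry(checksum.as_str())
            .or_default()
            .push(path.clone());
    }
    by_checksum
        .into_values()
        .filter(|paths| paths.len() > 1)
        .collect()
}
```

Files whose checksum appears only once are dropped, so the result lists only sets of files with identical digests.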
Performance
File Indexer leverages Rust’s zero-cost abstractions and Rayon’s work-stealing parallelism to achieve exceptional performance:
- Multi-core utilization: Automatically scales across all available CPU cores
- Memory efficient: Processes files in streaming fashion without loading entire contents
- Throughput: Thousands of files per second on modern hardware
- Minimal overhead: Written in Rust for maximum performance and safety
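The streaming behavior described above can be sketched with the standard library alone. This is an illustration under assumptions, not the tool's actual code: the name `scan_stream` and the 8 KiB buffer size are hypothetical. The function reads from any `Read` source in fixed-size chunks, counts the bytes, and keeps only the first `preview_limit` bytes as a preview.

```rust
use std::io::{self, Read};

/// Reads `reader` in fixed-size chunks and returns the total byte count
/// together with at most `preview_limit` leading bytes. Memory use is
/// bounded by the buffer plus the preview, regardless of input size.
pub fn scan_stream<R: Read>(mut reader: R, preview_limit: usize) -> io::Result<(u64, Vec<u8>)> {
    let mut buf = [0u8; 8192]; // assumed chunk size
    let mut total: u64 = 0;
    let mut preview = Vec::new();
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        total += n as u64;
        if preview.len() < preview_limit {
            // Copy only as many bytes as the preview still needs.
            let take = (preview_limit - preview.len()).min(n);
            preview.extend_from_slice(&buf[..take]);
        }
    }
    Ok((total, preview))
}
```

Passing a `std::fs::File` as the reader would apply the same loop to a file on disk. A checksum could be updated inside the same loop, one chunk at a time.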
Usage Examples
Index a directory with all features
file_indexer index /path/to/projects --depth 10 --content --checksums
Search with advanced options
# Regex search
file_indexer search ".*\.rs$" --regex
# Wildcard search with content
file_indexer search "config*" --content
# Case-sensitive exact match
file_indexer search "README" --case-sensitive
Export and analysis
# Export to JSON for programmatic analysis
file_indexer export json results.json
# Export to CSV for spreadsheet analysis
file_indexer export csv results.csv
# Find duplicate files
file_indexer duplicates
Architecture
File Indexer uses a sophisticated two-phase architecture:
- Discovery Phase: Recursively traverses directory structure collecting file paths
- Processing Phase: Processes the collected files in parallel using Rayon’s thread pool for maximum throughput
The tool intelligently handles error conditions, provides detailed progress reporting, and maintains thread-safe data structures for reliable concurrent operation.
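The two phases can be sketched roughly as follows. This is a simplified stand-in, not the project's implementation: it uses scoped threads from the standard library with static chunking instead of Rayon's work-stealing pool, and the function names `discover` and `process` are hypothetical. Per-file errors are returned alongside each path rather than aborting the whole run.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Phase 1: collect regular-file paths under `root`. A `max_depth` of 0
/// means only files directly inside `root` are returned.
pub fn discover(root: &Path, max_depth: usize) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    let mut stack = vec![(root.to_path_buf(), 0usize)];
    while let Some((dir, depth)) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let file_type = entry.file_type()?; // does not follow symlinks
            if file_type.is_dir() {
                if depth < max_depth {
                    stack.push((entry.path(), depth + 1));
                }
            } else if file_type.is_file() {
                files.push(entry.path());
            }
        }
    }
    Ok(files)
}

/// Phase 2: read each file's size on up to `workers` scoped threads.
/// Results come back in the same order as `paths`.
pub fn process(paths: &[PathBuf], workers: usize) -> Vec<(PathBuf, io::Result<u64>)> {
    let workers = workers.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk_size = ((paths.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = paths
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|p| (p.clone(), fs::metadata(p).map(|m| m.len())))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Separating discovery from processing means the full list of paths is known before any per-file work starts, which is what makes it straightforward to split the work across threads and to report progress against a known total.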
Configuration
Highly configurable with sensible defaults:
- Ignore patterns: Automatically skips .git, node_modules, and target directories
- Include patterns: Optional whitelist for selective indexing
- Content preview size: Configurable text extraction limits
- Symlink handling: Optional symlink following
- Depth limits: Prevent infinite recursion
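As a rough picture of how these options might fit together, the sketch below models them as a settings struct. Everything here is an assumption for illustration: the struct name `IndexConfig`, its field names, and all defaults except the ignored directory names (.git, node_modules, target), which come from the list above.

```rust
use std::path::Path;

/// Hypothetical settings struct mirroring the options listed above.
#[derive(Debug, Clone)]
pub struct IndexConfig {
    pub ignore_dirs: Vec<String>,
    pub max_depth: Option<usize>,
    pub follow_symlinks: bool,
    pub preview_bytes: usize,
}

impl Default for IndexConfig {
    fn default() -> Self {
        Self {
            // These three names are the documented default ignores.
            ignore_dirs: vec![".git".into(), "node_modules".into(), "target".into()],
            // The remaining defaults are assumptions for this sketch.
            max_depth: None,
            follow_symlinks: false,
            preview_bytes: 1024,
        }
    }
}

impl IndexConfig {
    /// Returns true if any component of `path` exactly matches an ignored
    /// name. Simplification: a regular file named like an ignored directory
    /// is treated as ignored too.
    pub fn is_ignored(&self, path: &Path) -> bool {
        path.components().any(|component| {
            component
                .as_os_str()
                .to_str()
                .map_or(false, |name| self.ignore_dirs.iter().any(|d| d.as_str() == name))
        })
    }
}
```

Matching whole path components rather than substrings keeps a file such as src/targets.rs from being skipped just because its name contains "target".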
File Indexer is licensed under the MIT license and actively maintained. It is well suited to DevOps workflows, system auditing, and large-scale file management tasks.