Preset Reference

Complete reference for m1f preset configuration and options

This is the complete reference for m1f’s preset system, which enables file-specific processing rules and customization for different file types within your bundles.

Preset File Structure

Preset files are YAML documents that define processing rules. The basic structure is:

# Group name (required)
group_name:
  # Group metadata
  description: "Description of this preset group"
  enabled: true              # Whether this group is active (default: true)
  priority: 10              # Higher priority groups are checked first
  base_path: "src"          # Optional base path for pattern matching
  
  # Global settings (optional)
  global_settings:
    # Settings that apply to all files
    
  # Presets (required)
  presets:
    # Individual preset definitions

Complete YAML Schema

Group-Level Properties

| Property | Type | Default | Description |
|---|---|---|---|
| description | string | - | Human-readable description of the preset group |
| enabled | boolean | true | Whether this preset group is active |
| priority | integer | 0 | Processing priority (higher values are processed first) |
| base_path | string | - | Base path prepended to all patterns in this group |
| global_settings | object | - | Global settings that apply to all files |
| presets | object | - | Named presets with file-specific rules |

Global Settings

All settings that can be specified at the command line can also be set in global_settings:

global_settings:
  # Encoding and formatting
  encoding: "utf-8"                    # File encoding
  separator_style: "Detailed"          # Separator style: Standard, Detailed, Markdown, None
  line_ending: "lf"                    # Line ending style: lf, crlf, cr
  
  # Include/exclude patterns
  include_patterns: ["src/**/*"]       # Glob patterns for files to include
  exclude_patterns: ["*.min.js"]       # Glob patterns for files to exclude
  include_extensions: [".py", ".js"]   # File extensions to include
  exclude_extensions: [".log"]         # File extensions to exclude
  
  # File filtering options
  include_dot_paths: false             # Include hidden files/directories
  include_binary_files: false          # Include binary files
  include_symlinks: false              # Follow symbolic links
  no_default_excludes: false           # Disable default exclude patterns
  max_file_size: "10MB"                # Maximum file size to process
  
  # Include/exclude from files
  exclude_paths_file: ".gitignore"     # File(s) containing exclude patterns
  include_paths_file: "files.txt"      # File(s) containing include patterns
  
  # Processing options
  remove_scraped_metadata: true        # Remove m1f metadata from output
  abort_on_encoding_error: false       # Abort on encoding errors
  
  # Security
  security_check: "warn"               # Security check mode: abort, skip, warn, null
  
  # Extension-specific overrides
  extensions:
    .md:
      security_check: null             # Disable security checks for markdown
      max_file_size: "5MB"             # Different size limit for markdown
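
The extension-specific overrides above can be read as a two-level lookup: a value under extensions wins for matching files, and the top-level value applies otherwise. The following Python sketch illustrates that reading only. The dictionary layout and the effective_setting helper are hypothetical and are not part of m1f's API.

```python
from pathlib import PurePosixPath

# Hypothetical in-memory form of a global_settings block (illustration only).
GLOBAL_SETTINGS = {
    "security_check": "warn",
    "max_file_size": "10MB",
    "extensions": {
        ".md": {"security_check": None, "max_file_size": "5MB"},
    },
}


def effective_setting(settings, file_path, key):
    """Return the value of `key` for `file_path`, preferring an extension override."""
    suffix = PurePosixPath(file_path).suffix.lower()
    override = settings.get("extensions", {}).get(suffix, {})
    # Use `in` rather than truthiness so an explicit null (None) override is honored.
    if key in override:
        return override[key]
    return settings.get(key)


print(effective_setting(GLOBAL_SETTINGS, "docs/guide.md", "max_file_size"))
print(effective_setting(GLOBAL_SETTINGS, "src/app.py", "max_file_size"))
```

Checking membership instead of truthiness matters here because null is a meaningful value for security_check.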

Preset Properties

Each preset in the presets section can have:

| Property | Type | Description |
|---|---|---|
| extensions | array | List of file extensions to match (e.g., [".py", ".js"]) |
| patterns | array | Glob patterns for matching files (e.g., ["src/**/*.py"]) |
| actions | array | List of processing actions to apply |
| separator_style | string | Override separator style for these files |
| include_metadata | boolean | Whether to include file metadata |
| max_lines | integer | Truncate file after N lines |
| strip_tags | array | HTML tags to remove (for strip_tags action) |
| preserve_tags | array | HTML tags to preserve when stripping |
| custom_processor | string | Name of custom processor to use |
| processor_args | object | Arguments for custom processor |
| security_check | string | Override security check for these files |
| max_file_size | string | Override size limit for these files |
| remove_scraped_metadata | boolean | Override metadata removal |
| include_dot_paths | boolean | Override hidden file inclusion |
| include_binary_files | boolean | Override binary file inclusion |

Built-in Actions

minify

Reduces file size by removing comments and unnecessary whitespace. The behavior depends on the file type:

  • HTML: Removes comments, compresses whitespace
  • CSS: Removes comments, compresses rules
  • JS: Basic minification (removes comments and newlines)

actions:
  - minify

strip_tags

Removes specified HTML tags from the content:

actions:
  - strip_tags
strip_tags: ["script", "style"]      # Tags to remove
preserve_tags: ["pre", "code"]        # Tags to preserve

strip_comments

Removes comments based on file type:

  • Python: Removes # comments (preserves docstrings)
  • JS/Java/C/C++: Removes // and /* */ comments

actions:
  - strip_comments

compress_whitespace

Normalizes whitespace:

  • Replaces multiple spaces with a single space
  • Reduces multiple newlines to a double newline

actions:
  - compress_whitespace
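
The two normalization rules can be sketched with a pair of regular expressions. This is a minimal Python illustration, not m1f's implementation. It assumes "multiple newlines" means three or more consecutive newlines, and the function name is chosen here for readability.

```python
import re


def compress_whitespace(text):
    """Collapse runs of spaces and runs of blank lines (illustrative sketch)."""
    # Two or more consecutive spaces become a single space.
    text = re.sub(r" {2,}", " ", text)
    # Three or more consecutive newlines become exactly two (one blank line).
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text


print(repr(compress_whitespace("a   b\n\n\n\nc")))
```

Note that this sketch also collapses leading indentation, which matters for whitespace-sensitive languages such as Python or YAML. The source does not say how the built-in action treats indentation.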

remove_empty_lines

Removes all empty lines from the file:

actions:
  - remove_empty_lines

custom

Apply a custom processor:

actions:
  - custom
custom_processor: "processor_name"
processor_args:
  key: value

Built-in Custom Processors

truncate

Limits content length:

actions:
  - custom
custom_processor: "truncate"
processor_args:
  max_chars: 1000                     # Maximum characters to keep

redact_secrets

Removes sensitive data based on patterns:

actions:
  - custom
custom_processor: "redact_secrets"
processor_args:
  patterns:
    # In single-quoted YAML, a literal ' is written as '' (a backslash does not escape it)
    - '(?i)api[_-]?key\s*[:=]\s*["'']?[\w-]+["'']?'
    - '(?i)password\s*[:=]\s*["'']?[^"'']+["'']?'
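
To see what these two patterns match, they can be exercised directly with Python's re module. The sketch below is illustrative only. The redact helper and the "[REDACTED]" placeholder are assumptions, since the source does not state what the processor substitutes for a match.

```python
import re

# The two example patterns, written as Python raw strings.
PATTERNS = [
    r'(?i)api[_-]?key\s*[:=]\s*["\']?[\w-]+["\']?',
    r'(?i)password\s*[:=]\s*["\']?[^"\']+["\']?',
]


def redact(text, patterns=PATTERNS, placeholder="[REDACTED]"):
    """Replace every match of every pattern with a placeholder (assumed behavior)."""
    for pattern in patterns:
        text = re.sub(pattern, placeholder, text)
    return text


sample = 'API_KEY = "abc-123"\ntimeout = 30\n'
print(redact(sample))
```

One property worth knowing when adapting the password pattern: the negated class [^"']+ also matches newlines, so an unquoted value can cause the match to run on into following lines until a quote or the end of the text.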

extract_functions

Extracts only function definitions (Python):

actions:
  - custom
custom_processor: "extract_functions"
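
For intuition, extracting function definitions from Python source can be done with the standard ast module. The following is a minimal sketch under stated assumptions, not the processor's actual code: it keeps only top-level functions, skips methods inside classes, and omits decorators.

```python
import ast


def extract_functions(source):
    """Return the source text of top-level function definitions (sketch)."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:  # top-level statements only
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # get_source_segment (Python 3.8+) returns the text spanned by the node.
            segment = ast.get_source_segment(source, node)
            if segment:
                chunks.append(segment)
    return "\n\n".join(chunks)


example = (
    "import os\n\n"
    "X = 1\n\n"
    "def add(a, b):\n"
    '    """Add two numbers."""\n'
    "    return a + b\n\n"
    "class C:\n"
    "    pass\n"
)
print(extract_functions(example))
```

Whether the built-in processor also keeps class methods, decorators, or nested functions is not stated in this reference.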

File Matching Priority

When a file is processed, presets are checked in this order:

  1. Groups by priority - Higher priority values are checked first
  2. Within each group:
    • Extension matches (exact match)
    • Pattern matches (glob patterns)
    • Default preset (if defined)
  3. First match wins - The first matching preset is used
  4. No match - Standard m1f processing applies
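
The lookup order above can be sketched in Python as follows. This is an illustration of the documented order, not m1f's code. It assumes that the default preset is one literally named "default", and it approximates glob matching with fnmatch, which does not treat ** specially. The GROUPS data and the find_preset name are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical parsed preset groups (illustration only).
GROUPS = {
    "docs": {
        "priority": 5,
        "presets": {"markdown": {"extensions": [".md"]}, "default": {}},
    },
    "web": {
        "priority": 10,
        "presets": {
            "templates": {"patterns": ["templates/**/*.html"]},
            "scripts": {"extensions": [".js"]},
        },
    },
}


def find_preset(path, groups):
    """Return (group_name, preset_name) for the first match, or None."""
    suffix = PurePosixPath(path).suffix.lower()
    # Groups with higher priority values are checked first.
    ordered = sorted(
        groups.items(), key=lambda item: item[1].get("priority", 0), reverse=True
    )
    for group_name, group in ordered:
        if not group.get("enabled", True):
            continue
        presets = group.get("presets", {})
        # 1. Exact extension matches.
        for name, preset in presets.items():
            if suffix and suffix in preset.get("extensions", []):
                return group_name, name
        # 2. Glob pattern matches.
        for name, preset in presets.items():
            if any(fnmatchcase(path, p) for p in preset.get("patterns", [])):
                return group_name, name
        # 3. Default preset, assumed here to be the one named "default".
        if "default" in presets:
            return group_name, "default"
    # 4. No match: standard processing applies.
    return None


print(find_preset("src/app.js", GROUPS))
print(find_preset("README.md", GROUPS))
```

One consequence of this order is that a default preset in a high-priority group catches every file before lower-priority groups are consulted, so defaults are usually safest in the lowest-priority group.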

Complete Examples

Web Development Preset

web_project:
  description: "Modern web development project"
  priority: 10
  
  global_settings:
    exclude_patterns: ["node_modules/**", "dist/**", "*.min.*"]
    security_check: "warn"
    
  presets:
    # JavaScript/TypeScript files
    javascript:
      extensions: [".js", ".jsx", ".ts", ".tsx"]
      actions:
        - strip_comments
        - compress_whitespace
      separator_style: "Detailed"
      
    # Styles
    styles:
      extensions: [".css", ".scss", ".sass", ".less"]
      actions:
        - minify
      max_file_size: "100KB"
      
    # HTML templates
    templates:
      extensions: [".html", ".htm"]
      patterns: ["templates/**/*.html"]
      actions:
        - minify
        - strip_tags
      strip_tags: ["script", "style"]
      preserve_tags: ["pre", "code", "textarea"]
      
    # Configuration files
    config:
      patterns: ["*.config.js", "*.json", ".env*"]
      actions:
        - custom
      custom_processor: "redact_secrets"
      security_check: "abort"
      
    # Images and assets
    assets:
      extensions: [".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico"]
      actions:
        - custom
      custom_processor: "truncate"
      processor_args:
        max_chars: 100

Documentation Project Preset

documentation:
  description: "Documentation and knowledge base"
  priority: 5
  
  global_settings:
    separator_style: "Markdown"
    security_check: null              # No security checks for docs
    
  presets:
    # Markdown files
    markdown:
      extensions: [".md", ".mdx"]
      actions:
        - remove_empty_lines
      include_metadata: true
      
    # Code examples
    examples:
      patterns: ["examples/**/*", "snippets/**/*"]
      actions:
        - strip_comments
      max_lines: 50                   # Keep examples concise
      
    # API documentation
    api_docs:
      patterns: ["api/**/*.md", "reference/**/*.md"]
      actions: []                     # No processing
      separator_style: "Detailed"
      
    # Large data files
    data:
      extensions: [".json", ".yaml", ".yml", ".csv"]
      actions:
        - custom
      custom_processor: "truncate"
      processor_args:
        max_chars: 5000

Security-Focused Preset

secure_project:
  description: "High-security project with strict controls"
  priority: 20
  
  global_settings:
    security_check: "abort"           # Strict by default
    max_file_size: "1MB"
    
    extensions:
      # Disable security for docs
      .md:
        security_check: null
      .txt:
        security_check: null
        
      # Extra strict for sensitive files
      .env:
        security_check: "abort"
        max_file_size: "10KB"
      .key:
        security_check: "abort"
        include_binary_files: false
      .pem:
        security_check: "abort"
        include_binary_files: false
        
  presets:
    # Source code
    source:
      extensions: [".py", ".js", ".php", ".rb"]
      actions:
        - strip_comments
        - custom
      custom_processor: "redact_secrets"
      
    # Configuration
    config:
      patterns: ["config/**/*", "*.config.*", ".env*"]
      actions:
        - custom
      custom_processor: "redact_secrets"
      processor_args:
        patterns:
          # In single-quoted YAML, a literal ' is written as '' (a backslash does not escape it)
          - '(?i)api[_-]?key\s*[:=]\s*["'']?[\w-]+["'']?'
          - '(?i)secret\s*[:=]\s*["'']?[\w-]+["'']?'
          - '(?i)password\s*[:=]\s*["'']?[^"'']+["'']?'
          - '(?i)token\s*[:=]\s*["'']?[\w-]+["'']?'

Command Line Usage

Basic Usage

# Use a single preset file
m1f -s ./project -o bundle.txt --preset my-presets.yml

# Use a specific group from the preset file
m1f -s ./project -o bundle.txt --preset presets.yml --preset-group frontend

# Use multiple preset files (merged in order)
m1f -s ./project -o bundle.txt --preset base.yml project.yml production.yml

Preset Options

| Option | Description |
|---|---|
| --preset FILE | Load preset file(s) |
| --preset-group NAME | Use specific group from preset file |
| --disable-presets | Disable all preset processing |

Multiple Files

When using multiple preset files, they are merged in order:

  • Later files override earlier ones
  • Groups with the same name are merged
  • Presets with the same name are replaced
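
The merge rules above can be illustrated with plain dictionaries. The Python sketch below is one possible reading, not m1f's implementation. The merge_preset_files helper is hypothetical, and it assumes that group-level keys other than presets (including global_settings) are overridden as whole values.

```python
def merge_preset_files(files):
    """Merge parsed preset files in order; later files win (illustrative sketch)."""
    merged = {}
    for groups in files:
        for group_name, group in groups.items():
            # Groups with the same name are merged into one entry.
            target = merged.setdefault(group_name, {})
            for key, value in group.items():
                if key == "presets":
                    # A preset with the same name is replaced, not deep-merged.
                    target.setdefault("presets", {}).update(value)
                else:
                    # Assumption: other group-level keys are simply overridden.
                    target[key] = value
    return merged


base = {"web": {"priority": 5, "presets": {"js": {"actions": ["minify"]},
                                           "css": {"actions": ["minify"]}}}}
production = {"web": {"priority": 10,
                      "presets": {"js": {"actions": ["strip_comments"]}}}}
print(merge_preset_files([base, production]))
```

In this example the js preset from the later file replaces the earlier one entirely, while css survives because only the earlier file defines it.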

Troubleshooting

Common Issues

Preset not applying:

  • Ensure file extensions include the dot (.py not py)
  • Check pattern syntax (use ** for recursive matching)
  • Verify the preset group is enabled
  • Use --verbose to see which presets are being checked

Wrong preset selected:

  • Check priority values (higher numbers are processed first)
  • Use more specific patterns
  • Place more specific presets before general ones
  • Use --preset-group to target a specific group

Processing errors:

  • Some actions may not work on all file types
  • Binary files skip most processing actions
  • Check processor_args syntax for custom processors

Debugging

Use verbose mode to see preset matching:

m1f -s . -o bundle.txt --preset my-presets.yml --verbose

This will show:

  • Which preset files are loaded
  • Which groups are active
  • Which preset matches each file
  • Which actions are applied
