Utilities API Reference

Reference documentation for utility functions and helper classes in LMM-Vibes.

Data Utilities

`validate_data_format`

Validate that data follows the expected format.

from lmmvibes.utils import validate_data_format

validate_data_format(
    data: List[Dict],
    required_fields: List[str] = ["question", "answer", "model_output"],
    optional_fields: List[str] = []
) -> bool

Parameters: - data: List of data dictionaries to validate - required_fields: Fields that must be present in each item - optional_fields: Fields that may be present

Returns: - True if data is valid, raises DataValidationError otherwise

`convert_data_format`

Convert data between different formats.

from lmmvibes.utils import convert_data_format

convert_data_format(
    data: List[Dict],
    target_format: str,
    field_mapping: Optional[Dict[str, str]] = None
) -> List[Dict]

Parameters: - data: Input data - target_format: Target format ("jsonl", "json", "csv") - field_mapping: Optional mapping of field names

Returns: - Data in the target format

File Utilities

`ensure_directory`

Ensure a directory exists, creating it if necessary.

from lmmvibes.utils import ensure_directory

ensure_directory(path: str) -> None

Parameters: - path: Directory path to ensure exists

`get_file_extension`

Get the file extension from a path.

from lmmvibes.utils import get_file_extension

get_file_extension(file_path: str) -> str

Parameters: - file_path: Path to the file

Returns: - File extension (e.g., ".json", ".csv")

`sanitize_filename`

Create a safe filename from a string.

from lmmvibes.utils import sanitize_filename

sanitize_filename(filename: str) -> str

Parameters: - filename: Original filename

Returns: - Sanitized filename safe for filesystem

Text Utilities

`normalize_text`

Normalize text for consistent processing.

from lmmvibes.utils import normalize_text

normalize_text(
    text: str,
    lowercase: bool = True,
    remove_punctuation: bool = False,
    remove_whitespace: bool = False
) -> str

Parameters: - text: Text to normalize - lowercase: Whether to convert to lowercase - remove_punctuation: Whether to remove punctuation - remove_whitespace: Whether to normalize whitespace

Returns: - Normalized text

`tokenize_text`

Tokenize text into words or subwords.

from lmmvibes.utils import tokenize_text

tokenize_text(
    text: str,
    method: str = "word",
    language: str = "en"
) -> List[str]

Parameters: - text: Text to tokenize - method: Tokenization method ("word", "subword", "sentence") - language: Language code for tokenization

Returns: - List of tokens

Metric Utilities

`compute_metric`

Compute a single metric on predictions and references.

from lmmvibes.utils import compute_metric

compute_metric(
    metric_name: str,
    predictions: List[str],
    references: List[str],
    **kwargs
) -> float

Parameters: - metric_name: Name of the metric to compute - predictions: List of model predictions - references: List of reference answers - **kwargs: Additional arguments for the metric

Returns: - Computed metric score

`aggregate_metrics`

Aggregate multiple metric scores.

from lmmvibes.utils import aggregate_metrics

aggregate_metrics(
    scores: List[float],
    method: str = "mean"
) -> float

Parameters: - scores: List of metric scores - method: Aggregation method ("mean", "median", "max", "min")

Returns: - Aggregated score

Configuration Utilities

`load_config_file`

Load configuration from a file.

from lmmvibes.utils import load_config_file

load_config_file(
    file_path: str,
    validate: bool = True
) -> Dict

Parameters: - file_path: Path to configuration file - validate: Whether to validate the configuration

Returns: - Configuration dictionary

`save_config_file`

Save configuration to a file.

from lmmvibes.utils import save_config_file

save_config_file(
    config: Dict,
    file_path: str,
    format: str = "yaml"
) -> None

Parameters: - config: Configuration dictionary - file_path: Path where to save the configuration - format: Output format ("yaml", "json")

Logging Utilities

`setup_logging`

Set up logging configuration.

from lmmvibes.utils import setup_logging

setup_logging(
    level: str = "INFO",
    file_path: Optional[str] = None,
    format_string: Optional[str] = None
) -> None

Parameters: - level: Logging level - file_path: Optional log file path - format_string: Optional log format string

`get_logger`

Get a logger instance.

from lmmvibes.utils import get_logger

get_logger(name: str = "lmmvibes") -> logging.Logger

Parameters: - name: Logger name

Returns: - Logger instance

Time Utilities

`Timer`

Context manager for timing operations.

from lmmvibes.utils import Timer

with Timer("operation_name"):
    # Your code here
    pass

`format_duration`

Format a duration in human-readable format.

from lmmvibes.utils import format_duration

format_duration(seconds: float) -> str

Parameters: - seconds: Duration in seconds

Returns: - Formatted duration string (e.g., "2m 30s")

Progress Utilities

`ProgressBar`

Simple progress bar for long-running operations.

from lmmvibes.utils import ProgressBar

with ProgressBar(total=100, desc="Processing") as pbar:
    for i in range(100):
        # Your processing code
        pbar.update(1)

Parameters: - total: Total number of items - desc: Description of the operation

Validation Utilities

`validate_metric_name`

Validate that a metric name is supported.

from lmmvibes.utils import validate_metric_name

validate_metric_name(metric_name: str) -> bool

Parameters: - metric_name: Name of the metric to validate

Returns: - True if metric is supported, False otherwise

`validate_file_format`

Validate that a file format is supported.

from lmmvibes.utils import validate_file_format

validate_file_format(format_name: str) -> bool

Parameters: - format_name: Name of the file format to validate

Returns: - True if format is supported, False otherwise

Error Handling Utilities

`handle_errors`

Decorator for consistent error handling.

from lmmvibes.utils import handle_errors

@handle_errors
def my_function():
    # Your function code
    pass

`retry_on_failure`

Decorator for retrying failed operations.

from lmmvibes.utils import retry_on_failure

@retry_on_failure(max_attempts=3, delay=1.0)
def my_function():
    # Your function code
    pass

Parameters: - max_attempts: Maximum number of retry attempts - delay: Delay between retries in seconds

Next Steps

Check out Core API for main functions and classes
Learn about Basic Usage for practical examples
Explore Configuration for advanced setup