
OllamaClient Enhancement Summary

Overview

The OllamaClient has been enhanced with response validation and automatic retry mechanisms while maintaining full backward compatibility.

Key Enhancements

1. Enhanced Constructor

def __init__(self, model_name: str, base_url: str = "http://localhost:11434", max_retries: int = 3):
  • Added max_retries parameter for configurable retry attempts
  • Default retry count: 3 attempts

2. Enhanced Generate Method

def generate(self, 
            prompt: str, 
            strip_think: bool = True,
            validation_schema: Optional[Dict[str, Any]] = None,
            response_type: Optional[str] = None,
            return_parsed: bool = False) -> Union[str, Dict[str, Any]]:

New Parameters:

  • validation_schema: Custom JSON schema for validation
  • response_type: Predefined response type for validation
  • return_parsed: Return parsed JSON instead of raw string

Return Type:

  • Union[str, Dict[str, Any]]: Can return either raw string or parsed JSON

3. New Convenience Methods

generate_with_validation()

def generate_with_validation(self, 
                           prompt: str, 
                           response_type: str,
                           strip_think: bool = True,
                           return_parsed: bool = True) -> Union[str, Dict[str, Any]]:
  • Uses predefined validation schemas based on response type
  • Automatically handles retries and validation
  • Returns parsed JSON by default

generate_with_schema()

def generate_with_schema(self, 
                        prompt: str, 
                        schema: Dict[str, Any],
                        strip_think: bool = True,
                        return_parsed: bool = True) -> Union[str, Dict[str, Any]]:
  • Uses custom JSON schema for validation
  • Automatically handles retries and validation
  • Returns parsed JSON by default

4. Supported Response Types

The following response types are supported for automatic validation:

  • 'entity_extraction': Entity extraction responses
  • 'entity_linkage': Entity linkage responses
  • 'regex_entity': Regex-based entity responses
  • 'business_name_extraction': Business name extraction responses
  • 'address_extraction': Address component extraction responses
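
Each response type maps to a predefined JSON schema. As an illustrative sketch only (the actual schemas ship inside the client and may differ), a `business_name_extraction` schema could look like this, matching the example output shown later in this document:

```python
# Hypothetical sketch of a predefined schema for the
# 'business_name_extraction' response type; the real schema
# lives inside the client and may differ.
BUSINESS_NAME_SCHEMA = {
    "type": "object",
    "properties": {
        "business_name": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["business_name"],
}
```

The client validates each parsed response against the selected schema (via the jsonschema library) before returning it.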

Features

1. Automatic Retry Mechanism

  • Retries failed API calls up to max_retries times
  • Retries on validation failures
  • Retries on JSON parsing failures
  • Configurable retry count per client instance
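
The retry pattern above can be pictured with a minimal sketch (illustrative only, not the client's actual implementation):

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)

def call_with_retries(fn: Callable[[], Any], max_retries: int = 3) -> Any:
    """Retry `fn` on any failure (API error, JSON parsing, validation)
    and re-raise the last error once attempts are exhausted."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            logger.warning("Attempt %d/%d failed: %s", attempt, max_retries, exc)
    raise last_error
```

The real client applies this loop around the API call, the JSON parse, and the schema validation as a single unit, so a failure at any stage triggers a fresh generation.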

2. Built-in Validation

  • JSON schema validation using jsonschema library
  • Predefined schemas for common response types
  • Custom schema support for specialized use cases
  • Detailed validation error logging

3. Automatic JSON Parsing

  • Uses LLMJsonExtractor.parse_raw_json_str() for robust JSON extraction
  • Handles malformed JSON responses gracefully
  • Returns parsed Python dictionaries when requested
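
To illustrate what "robust JSON extraction" means here, the following is a hedged sketch of the kind of cleanup a helper like LLMJsonExtractor.parse_raw_json_str() performs (the real helper's behavior may differ):

```python
import json
import re
from typing import Any, Dict, Optional

def parse_raw_json_sketch(raw: str) -> Optional[Dict[str, Any]]:
    """Illustrative sketch only, not the actual LLMJsonExtractor code:
    strip markdown fences the model may add, then parse the first
    JSON object found in the text."""
    cleaned = re.sub(r"```(?:json)?", "", raw)
    match = re.search(r"\{.*\}", cleaned, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```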

4. Backward Compatibility

  • All existing code continues to work without changes
  • Original generate() method signature preserved
  • Default behavior unchanged

Usage Examples

1. Basic Usage (Backward Compatible)

client = OllamaClient("llama2")
response = client.generate("Hello, world!")
# Returns the model's raw response string

2. With Response Type Validation

client = OllamaClient("llama2")
result = client.generate_with_validation(
    prompt="Extract business name from: 上海盒马网络科技有限公司",
    response_type='business_name_extraction',
    return_parsed=True
)
# Example return: {"business_name": "盒马", "confidence": 0.9}

3. With Custom Schema Validation

client = OllamaClient("llama2")
custom_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"}
    },
    "required": ["name", "age"]
}

result = client.generate_with_schema(
    prompt="Generate person info",
    schema=custom_schema,
    return_parsed=True
)
# Example return: {"name": "张三", "age": 30}

4. Advanced Usage with All Options

client = OllamaClient("llama2", max_retries=5)
result = client.generate(
    prompt="Complex prompt",
    strip_think=True,
    validation_schema=custom_schema,
    return_parsed=True
)

Updated Components

1. Extractors

  • BusinessNameExtractor: Now uses generate_with_validation()
  • AddressExtractor: Now uses generate_with_validation()

2. Processors

  • NerProcessor: Updated to use enhanced methods
  • NerProcessorRefactored: Updated to use enhanced methods

3. Benefits in Processors

  • Simplified code: No more manual retry loops
  • Automatic validation: no more manual JSON parsing and schema checks
  • Better error handling: Automatic fallback to regex methods
  • Cleaner code: Reduced boilerplate
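
The "automatic fallback to regex methods" can be sketched as follows (hypothetical helper and regex; the actual processor code differs):

```python
import re
from typing import Optional

def extract_business_name(client, text: str) -> Optional[str]:
    """Hypothetical sketch of the processor fallback pattern: try the
    validated LLM path first, then fall back to a regex heuristic."""
    try:
        result = client.generate_with_validation(
            prompt=f"Extract business name from: {text}",
            response_type="business_name_extraction",
            return_parsed=True,
        )
        return result.get("business_name")
    except Exception:
        # Regex fallback: take the segment before a common company suffix
        match = re.search(r"(.+?)(?:有限公司|股份有限公司|集团)", text)
        return match.group(1) if match else None
```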

Error Handling

1. API Failures

  • Automatic retry on network errors
  • Configurable retry count
  • Detailed error logging

2. Validation Failures

  • Automatic retry on schema validation failures
  • Automatic retry on JSON parsing failures
  • Graceful fallback to alternative methods

3. Exception Types

  • RequestException: API call failures after all retries
  • ValueError: Validation failures after all retries
  • Exception: Unexpected errors
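
A hedged sketch of handling these failure modes at a call site (StubClient is a hypothetical stand-in; the RequestException branch is left as a comment to keep the sketch dependency-free):

```python
class StubClient:
    """Hypothetical stand-in whose validation always fails."""
    def generate_with_validation(self, **kwargs):
        raise ValueError("validation failed after all retries")

def safe_extract(client):
    try:
        return client.generate_with_validation(
            prompt="Extract entities",
            response_type="entity_extraction",
        )
    except ValueError:
        # Validation exhausted all retries: return an empty result
        return {"entities": []}
    # requests.exceptions.RequestException (API failures after all
    # retries) would be caught the same way in real code.
```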

Testing

1. Test Coverage

  • Initialization with new parameters
  • Enhanced generate methods
  • Backward compatibility
  • Retry mechanism
  • Validation failure handling
  • Mock-based testing for reliability
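
The mock-based approach can be sketched like this: a MagicMock stands in for the raw API call, failing twice before succeeding, so the retry path can be exercised without a live Ollama server. The inner loop here is a simplified stand-in, not the client's actual internals:

```python
from unittest.mock import MagicMock

# The mock fails twice with a connection error, then returns a
# valid payload on the third attempt.
api_call = MagicMock(side_effect=[
    ConnectionError("server down"),
    ConnectionError("server down"),
    '{"entities": []}',
])

def generate_with_retries(call, max_retries=3):
    # Simplified stand-in for the client's internal retry loop.
    for _ in range(max_retries):
        try:
            return call()
        except ConnectionError:
            continue
    raise ConnectionError("all retries failed")

response = generate_with_retries(api_call)
```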

2. Run Tests

cd backend
python3 test_enhanced_ollama_client.py

Migration Guide

1. No Changes Required

Existing code continues to work without modification:

# This still works exactly the same
client = OllamaClient("llama2")
response = client.generate("prompt")

2. Optional Enhancements

To take advantage of new features:

# Old way (still works)
response = client.generate(prompt)
parsed = LLMJsonExtractor.parse_raw_json_str(response)
if LLMResponseValidator.validate_entity_extraction(parsed):
    ...  # use parsed

# New way (recommended)
parsed = client.generate_with_validation(
    prompt=prompt,
    response_type='entity_extraction',
    return_parsed=True
)
# parsed is already validated and ready to use

3. Benefits of Migration

  • Reduced Code: Eliminates manual retry loops
  • Better Reliability: Automatic retry and validation
  • Cleaner Code: Less boilerplate
  • Better Error Handling: Automatic fallbacks

Performance Impact

1. Positive Impact

  • Reduced code complexity
  • Better error recovery
  • Automatic retry reduces manual intervention

2. Minimal Overhead

  • Validation only occurs when requested
  • JSON parsing only occurs when needed
  • Retry mechanism only activates on failures

Future Enhancements

1. Potential Additions

  • Circuit breaker pattern for API failures
  • Caching for repeated requests
  • Async/await support
  • Streaming response support
  • Custom retry strategies

2. Configuration Options

  • Per-request retry configuration
  • Custom validation error handling
  • Response transformation hooks
  • Metrics and monitoring

Conclusion

The enhanced OllamaClient provides a robust, reliable, and easy-to-use interface for LLM interactions while maintaining full backward compatibility. The new validation and retry mechanisms significantly improve the reliability of LLM-based operations in the NER processing pipeline.