# OllamaClient Enhancement Summary
## Overview
The `OllamaClient` has been enhanced with validation and retry mechanisms while maintaining full backward compatibility.
## Key Enhancements
### 1. **Enhanced Constructor**
```python
def __init__(self, model_name: str, base_url: str = "http://localhost:11434", max_retries: int = 3):
```
- Added `max_retries` parameter for configurable retry attempts
- Default retry count: 3 attempts
### 2. **Enhanced Generate Method**
```python
def generate(self,
             prompt: str,
             strip_think: bool = True,
             validation_schema: Optional[Dict[str, Any]] = None,
             response_type: Optional[str] = None,
             return_parsed: bool = False) -> Union[str, Dict[str, Any]]:
```
**New Parameters:**
- `validation_schema`: Custom JSON schema for validation
- `response_type`: Predefined response type for validation
- `return_parsed`: Return parsed JSON instead of raw string
**Return Type:**
- `Union[str, Dict[str, Any]]`: returns either the raw string or a parsed JSON dictionary, depending on `return_parsed`
### 3. **New Convenience Methods**
#### `generate_with_validation()`
```python
def generate_with_validation(self,
                             prompt: str,
                             response_type: str,
                             strip_think: bool = True,
                             return_parsed: bool = True) -> Union[str, Dict[str, Any]]:
```
- Uses predefined validation schemas based on response type
- Automatically handles retries and validation
- Returns parsed JSON by default
#### `generate_with_schema()`
```python
def generate_with_schema(self,
                         prompt: str,
                         schema: Dict[str, Any],
                         strip_think: bool = True,
                         return_parsed: bool = True) -> Union[str, Dict[str, Any]]:
```
- Uses custom JSON schema for validation
- Automatically handles retries and validation
- Returns parsed JSON by default
### 4. **Supported Response Types**
The following response types are supported for automatic validation (an illustrative schema sketch follows the list):
- `'entity_extraction'`: Entity extraction responses
- `'entity_linkage'`: Entity linkage responses
- `'regex_entity'`: Regex-based entity responses
- `'business_name_extraction'`: Business name extraction responses
- `'address_extraction'`: Address component extraction responses
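For illustration, a predefined schema for one of these types might look like the sketch below. The exact schemas are defined inside the client/validation module; the field names here are inferred from the `business_name_extraction` usage example later in this document.
```python
# Illustrative sketch only -- the real predefined schemas live inside the
# validation module; field names are inferred from the usage example below.
BUSINESS_NAME_EXTRACTION_SCHEMA = {
    "type": "object",
    "properties": {
        "business_name": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["business_name"],
}
```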
## Features
### 1. **Automatic Retry Mechanism**
- Retries failed API calls up to `max_retries` times
- Retries on validation failures
- Retries on JSON parsing failures
- Configurable retry count per client instance
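Conceptually, the retry loop follows the pattern sketched below. This is a minimal standalone sketch of the call → parse → validate cycle, not the client's actual internals; the helper callables are placeholders.
```python
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger(__name__)

def generate_with_retries(call_api: Callable[[str], str],
                          parse_json: Callable[[str], Dict[str, Any]],
                          validate: Callable[[Dict[str, Any]], None],
                          prompt: str,
                          max_retries: int = 3) -> Dict[str, Any]:
    """Retry the call -> parse -> validate pipeline up to max_retries times."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            raw = call_api(prompt)     # network error -> retry
            parsed = parse_json(raw)   # malformed JSON -> retry
            validate(parsed)           # schema violation -> retry
            return parsed
        except Exception as exc:
            last_error = exc
            logger.warning("Attempt %d/%d failed: %s", attempt, max_retries, exc)
    raise ValueError(f"All {max_retries} attempts failed") from last_error
```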
### 2. **Built-in Validation**
- JSON schema validation using `jsonschema` library
- Predefined schemas for common response types
- Custom schema support for specialized use cases
- Detailed validation error logging
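The schema check itself is standard `jsonschema` usage, roughly as below (a sketch with a made-up schema, not the client's exact code):
```python
from jsonschema import ValidationError, validate

schema = {
    "type": "object",
    "properties": {"entities": {"type": "array", "items": {"type": "string"}}},
    "required": ["entities"],
}

candidate = {"entities": ["上海盒马网络科技有限公司"]}
try:
    validate(instance=candidate, schema=schema)  # raises ValidationError on mismatch
except ValidationError as exc:
    print(f"Schema validation failed: {exc.message}")  # logged, then retried by the client
```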
### 3. **Automatic JSON Parsing**
- Uses `LLMJsonExtractor.parse_raw_json_str()` for robust JSON extraction
- Handles malformed JSON responses gracefully
- Returns parsed Python dictionaries when requested
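The kind of tolerant extraction `LLMJsonExtractor.parse_raw_json_str()` performs is sketched below for orientation only; the authoritative logic lives in that helper.
```python
import json
import re
from typing import Any, Dict

def parse_llm_json(raw: str) -> Dict[str, Any]:
    """Illustrative sketch of tolerant JSON extraction from an LLM reply."""
    # Remove Markdown code fences that models often wrap JSON in
    cleaned = re.sub(r"```(?:json)?", "", raw).strip()
    # Fall back to the first {...} block if prose surrounds the JSON
    match = re.search(r"\{.*\}", cleaned, re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in LLM response")
    return json.loads(match.group(0))
```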
### 4. **Backward Compatibility**
- All existing code continues to work without changes
- Original `generate()` method signature preserved
- Default behavior unchanged
## Usage Examples
### 1. **Basic Usage (Backward Compatible)**
```python
client = OllamaClient("llama2")
response = client.generate("Hello, world!")
# Returns the model's raw text reply as a string
```
### 2. **With Response Type Validation**
```python
client = OllamaClient("llama2")
result = client.generate_with_validation(
    prompt="Extract business name from: 上海盒马网络科技有限公司",
    response_type='business_name_extraction',
    return_parsed=True
)
# Example return: {"business_name": "盒马", "confidence": 0.9}
```
### 3. **With Custom Schema Validation**
```python
client = OllamaClient("llama2")
custom_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"}
    },
    "required": ["name", "age"]
}
result = client.generate_with_schema(
    prompt="Generate person info",
    schema=custom_schema,
    return_parsed=True
)
# Example return: {"name": "张三", "age": 30}
```
### 4. **Advanced Usage with All Options**
```python
client = OllamaClient("llama2", max_retries=5)
result = client.generate(
prompt="Complex prompt",
strip_think=True,
validation_schema=custom_schema,
return_parsed=True
)
```
## Updated Components
### 1. **Extractors**
- `BusinessNameExtractor`: Now uses `generate_with_validation()`
- `AddressExtractor`: Now uses `generate_with_validation()`
### 2. **Processors**
- `NerProcessor`: Updated to use enhanced methods
- `NerProcessorRefactored`: Updated to use enhanced methods
### 3. **Benefits in Processors**
- Simplified code: No more manual retry loops
- Automatic validation: No more manual JSON parsing
- Better error handling: Automatic fallback to regex methods
- Cleaner code: Reduced boilerplate
## Error Handling
### 1. **API Failures**
- Automatic retry on network errors
- Configurable retry count
- Detailed error logging
### 2. **Validation Failures**
- Automatic retry on schema validation failures
- Automatic retry on JSON parsing failures
- Graceful fallback to alternative methods
### 3. **Exception Types**
- `RequestException`: API call failures after all retries
- `ValueError`: Validation failures after all retries
- `Exception`: Unexpected errors
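Calling code can therefore distinguish the two terminal failure modes. A hedged example is shown below; `fallback_regex_extraction` is a hypothetical stand-in for the regex fallback mentioned above, and `client`, `prompt`, and `text` are assumed to exist in the calling context.
```python
from requests.exceptions import RequestException

try:
    parsed = client.generate_with_validation(
        prompt=prompt,
        response_type='entity_extraction',
    )
except RequestException:
    # Ollama stayed unreachable even after all retries
    parsed = fallback_regex_extraction(text)  # hypothetical fallback helper
except ValueError:
    # Response never passed JSON parsing / schema validation
    parsed = fallback_regex_extraction(text)  # hypothetical fallback helper
```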
## Testing
### 1. **Test Coverage**
- Initialization with new parameters
- Enhanced generate methods
- Backward compatibility
- Retry mechanism
- Validation failure handling
- Mock-based testing for reliability
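A mock-based test along the lines below exercises the retry path without a live Ollama instance. This is a sketch only: it assumes the client issues its HTTP calls through `requests.post` and that the Ollama response body carries the generated text under a `response` key, so adjust the import path, patch target, and fixture to the real implementation.
```python
from unittest.mock import MagicMock, patch

from ollama_client import OllamaClient  # hypothetical import path; adjust to the real module

@patch("requests.post")  # patch wherever the client actually imports requests from
def test_retries_until_valid_json(mock_post):
    bad = MagicMock()
    bad.json.return_value = {"response": "not valid json"}
    good = MagicMock()
    good.json.return_value = {"response": '{"business_name": "盒马", "confidence": 0.9}'}
    mock_post.side_effect = [bad, good]

    client = OllamaClient("llama2", max_retries=3)
    result = client.generate_with_validation(
        prompt="Extract business name",
        response_type='business_name_extraction',
    )

    assert result["business_name"] == "盒马"
    assert mock_post.call_count == 2  # first attempt failed parsing, second succeeded
```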
### 2. **Run Tests**
```bash
cd backend
python3 test_enhanced_ollama_client.py
```
## Migration Guide
### 1. **No Changes Required**
Existing code continues to work without modification:
```python
# This still works exactly the same
client = OllamaClient("llama2")
response = client.generate("prompt")
```
### 2. **Optional Enhancements**
To take advantage of new features:
```python
# Old way (still works)
response = client.generate(prompt)
parsed = LLMJsonExtractor.parse_raw_json_str(response)
if LLMResponseValidator.validate_entity_extraction(parsed):
    ...  # use parsed

# New way (recommended)
parsed = client.generate_with_validation(
    prompt=prompt,
    response_type='entity_extraction',
    return_parsed=True
)
# parsed is already validated and ready to use
```
### 3. **Benefits of Migration**
- **Reduced Code**: Eliminates manual retry loops
- **Better Reliability**: Automatic retry and validation
- **Cleaner Code**: Less boilerplate
- **Better Error Handling**: Automatic fallbacks
## Performance Impact
### 1. **Positive Impact**
- Reduced code complexity
- Better error recovery
- Automatic retry reduces manual intervention
### 2. **Minimal Overhead**
- Validation only occurs when requested
- JSON parsing only occurs when needed
- Retry mechanism only activates on failures
## Future Enhancements
### 1. **Potential Additions**
- Circuit breaker pattern for API failures
- Caching for repeated requests
- Async/await support
- Streaming response support
- Custom retry strategies
### 2. **Configuration Options**
- Per-request retry configuration
- Custom validation error handling
- Response transformation hooks
- Metrics and monitoring
## Conclusion
The enhanced `OllamaClient` provides a robust, reliable, and easy-to-use interface for LLM interactions while maintaining full backward compatibility. The new validation and retry mechanisms significantly improve the reliability of LLM-based operations in the NER processing pipeline.