# OllamaClient Enhancement Summary
## Overview
The `OllamaClient` has been enhanced with validation and retry mechanisms while maintaining full backward compatibility.
## Key Enhancements
### 1. **Enhanced Constructor**
```python
def __init__(self, model_name: str, base_url: str = "http://localhost:11434", max_retries: int = 3):
```
- Added `max_retries` parameter for configurable retry attempts
- Default retry count: 3 attempts
### 2. **Enhanced Generate Method**
```python
def generate(self,
             prompt: str,
             strip_think: bool = True,
             validation_schema: Optional[Dict[str, Any]] = None,
             response_type: Optional[str] = None,
             return_parsed: bool = False) -> Union[str, Dict[str, Any]]:
```
**New Parameters:**
- `validation_schema`: Custom JSON schema for validation
- `response_type`: Predefined response type for validation
- `return_parsed`: Return parsed JSON instead of raw string
**Return Type:**
- `Union[str, Dict[str, Any]]`: returns either the raw string or a parsed JSON dictionary, depending on `return_parsed`
### 3. **New Convenience Methods**
#### `generate_with_validation()`
```python
def generate_with_validation(self,
                             prompt: str,
                             response_type: str,
                             strip_think: bool = True,
                             return_parsed: bool = True) -> Union[str, Dict[str, Any]]:
```
- Uses predefined validation schemas based on response type
- Automatically handles retries and validation
- Returns parsed JSON by default
#### `generate_with_schema()`
```python
def generate_with_schema(self,
                         prompt: str,
                         schema: Dict[str, Any],
                         strip_think: bool = True,
                         return_parsed: bool = True) -> Union[str, Dict[str, Any]]:
```
- Uses custom JSON schema for validation
- Automatically handles retries and validation
- Returns parsed JSON by default
### 4. **Supported Response Types**
The following response types are supported for automatic validation (an illustrative schema sketch follows the list):
- `'entity_extraction'`: Entity extraction responses
- `'entity_linkage'`: Entity linkage responses
- `'regex_entity'`: Regex-based entity responses
- `'business_name_extraction'`: Business name extraction responses
- `'address_extraction'`: Address component extraction responses
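For illustration, a predefined schema for one of these types might look like the sketch below. The exact schemas are defined inside the client/validation module; the field names here are inferred from the `business_name_extraction` usage example later in this document.
```python
# Illustrative sketch only -- the real predefined schemas live inside the
# validation module; field names are inferred from the usage example below.
BUSINESS_NAME_EXTRACTION_SCHEMA = {
    "type": "object",
    "properties": {
        "business_name": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["business_name"],
}
```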
## Features
### 1. **Automatic Retry Mechanism**
- Retries failed API calls up to `max_retries` times
- Retries on validation failures
- Retries on JSON parsing failures
- Configurable retry count per client instance
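Conceptually, the retry loop follows the pattern sketched below. This is a minimal standalone sketch of the call → parse → validate cycle, not the client's actual internals; the helper callables are placeholders.
```python
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger(__name__)

def generate_with_retries(call_api: Callable[[str], str],
                          parse_json: Callable[[str], Dict[str, Any]],
                          validate: Callable[[Dict[str, Any]], None],
                          prompt: str,
                          max_retries: int = 3) -> Dict[str, Any]:
    """Retry the call -> parse -> validate pipeline up to max_retries times."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            raw = call_api(prompt)     # network error -> retry
            parsed = parse_json(raw)   # malformed JSON -> retry
            validate(parsed)           # schema violation -> retry
            return parsed
        except Exception as exc:
            last_error = exc
            logger.warning("Attempt %d/%d failed: %s", attempt, max_retries, exc)
    raise ValueError(f"All {max_retries} attempts failed") from last_error
```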
### 2. **Built-in Validation**
- JSON schema validation using `jsonschema` library
- Predefined schemas for common response types
- Custom schema support for specialized use cases
- Detailed validation error logging
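The schema check itself is standard `jsonschema` usage, roughly as below (a sketch with a made-up schema, not the client's exact code):
```python
from jsonschema import ValidationError, validate

schema = {
    "type": "object",
    "properties": {"entities": {"type": "array", "items": {"type": "string"}}},
    "required": ["entities"],
}

candidate = {"entities": ["上海盒马网络科技有限公司"]}
try:
    validate(instance=candidate, schema=schema)  # raises ValidationError on mismatch
except ValidationError as exc:
    print(f"Schema validation failed: {exc.message}")  # logged, then retried by the client
```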
### 3. **Automatic JSON Parsing**
- Uses `LLMJsonExtractor.parse_raw_json_str()` for robust JSON extraction
- Handles malformed JSON responses gracefully
- Returns parsed Python dictionaries when requested
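The kind of tolerant extraction `LLMJsonExtractor.parse_raw_json_str()` performs is sketched below for orientation only; the authoritative logic lives in that helper.
```python
import json
import re
from typing import Any, Dict

def parse_llm_json(raw: str) -> Dict[str, Any]:
    """Illustrative sketch of tolerant JSON extraction from an LLM reply."""
    # Remove Markdown code fences that models often wrap JSON in
    cleaned = re.sub(r"```(?:json)?", "", raw).strip()
    # Fall back to the first {...} block if prose surrounds the JSON
    match = re.search(r"\{.*\}", cleaned, re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in LLM response")
    return json.loads(match.group(0))
```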
### 4. **Backward Compatibility**
- All existing code continues to work without changes
- Original `generate()` method signature preserved
- Default behavior unchanged
## Usage Examples
### 1. **Basic Usage (Backward Compatible)**
```python
client = OllamaClient("llama2")
response = client.generate("Hello, world!")
# Returns the model's raw text reply as a string
```
### 2. **With Response Type Validation**
```python
client = OllamaClient("llama2")
result = client.generate_with_validation(
    prompt="Extract business name from: 上海盒马网络科技有限公司",
    response_type='business_name_extraction',
    return_parsed=True
)
# Example return: {"business_name": "盒马", "confidence": 0.9}
```
### 3. **With Custom Schema Validation**
```python
client = OllamaClient("llama2")
custom_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"}
    },
    "required": ["name", "age"]
}
result = client.generate_with_schema(
    prompt="Generate person info",
    schema=custom_schema,
    return_parsed=True
)
# Example return: {"name": "张三", "age": 30}
```
### 4. **Advanced Usage with All Options**
```python
client = OllamaClient("llama2", max_retries=5)
result = client.generate(
prompt="Complex prompt",
strip_think=True,
validation_schema=custom_schema,
return_parsed=True
)
```
## Updated Components
### 1. **Extractors**
- `BusinessNameExtractor`: Now uses `generate_with_validation()`
- `AddressExtractor`: Now uses `generate_with_validation()`
### 2. **Processors**
- `NerProcessor`: Updated to use enhanced methods
- `NerProcessorRefactored`: Updated to use enhanced methods
### 3. **Benefits in Processors**
- Simplified code: No more manual retry loops
- Automatic validation: No more manual JSON parsing
- Better error handling: Automatic fallback to regex methods
- Cleaner code: Reduced boilerplate
## Error Handling
### 1. **API Failures**
- Automatic retry on network errors
- Configurable retry count
- Detailed error logging
### 2. **Validation Failures**
- Automatic retry on schema validation failures
- Automatic retry on JSON parsing failures
- Graceful fallback to alternative methods
### 3. **Exception Types**
- `RequestException`: API call failures after all retries
- `ValueError`: Validation failures after all retries
- `Exception`: Unexpected errors
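Calling code can therefore distinguish the two terminal failure modes. A hedged example is shown below; `fallback_regex_extraction` is a hypothetical stand-in for the regex fallback mentioned above, and `client`, `prompt`, and `text` are assumed to exist in the calling context.
```python
from requests.exceptions import RequestException

try:
    parsed = client.generate_with_validation(
        prompt=prompt,
        response_type='entity_extraction',
    )
except RequestException:
    # Ollama stayed unreachable even after all retries
    parsed = fallback_regex_extraction(text)  # hypothetical fallback helper
except ValueError:
    # Response never passed JSON parsing / schema validation
    parsed = fallback_regex_extraction(text)  # hypothetical fallback helper
```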
## Testing
### 1. **Test Coverage**
- Initialization with new parameters
- Enhanced generate methods
- Backward compatibility
- Retry mechanism
- Validation failure handling
- Mock-based testing for reliability
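A mock-based test along the lines below exercises the retry path without a live Ollama instance. This is a sketch only: it assumes the client issues its HTTP calls through `requests.post` and that the Ollama response body carries the generated text under a `response` key, so adjust the import path, patch target, and fixture to the real implementation.
```python
from unittest.mock import MagicMock, patch

from ollama_client import OllamaClient  # hypothetical import path; adjust to the real module

@patch("requests.post")  # patch wherever the client actually imports requests from
def test_retries_until_valid_json(mock_post):
    bad = MagicMock()
    bad.json.return_value = {"response": "not valid json"}
    good = MagicMock()
    good.json.return_value = {"response": '{"business_name": "盒马", "confidence": 0.9}'}
    mock_post.side_effect = [bad, good]

    client = OllamaClient("llama2", max_retries=3)
    result = client.generate_with_validation(
        prompt="Extract business name",
        response_type='business_name_extraction',
    )

    assert result["business_name"] == "盒马"
    assert mock_post.call_count == 2  # first attempt failed parsing, second succeeded
```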
### 2. **Run Tests**
```bash
cd backend
python3 test_enhanced_ollama_client.py
```
## Migration Guide
### 1. **No Changes Required**
Existing code continues to work without modification:
```python
# This still works exactly the same
client = OllamaClient("llama2")
response = client.generate("prompt")
```
### 2. **Optional Enhancements**
To take advantage of new features:
```python
# Old way (still works)
response = client.generate(prompt)
parsed = LLMJsonExtractor.parse_raw_json_str(response)
if LLMResponseValidator.validate_entity_extraction(parsed):
    ...  # use parsed

# New way (recommended)
parsed = client.generate_with_validation(
    prompt=prompt,
    response_type='entity_extraction',
    return_parsed=True
)
# parsed is already validated and ready to use
```
### 3. **Benefits of Migration**
- **Reduced Code**: Eliminates manual retry loops
- **Better Reliability**: Automatic retry and validation
- **Cleaner Code**: Less boilerplate
- **Better Error Handling**: Automatic fallbacks
## Performance Impact
### 1. **Positive Impact**
- Reduced code complexity
- Better error recovery
- Automatic retry reduces manual intervention
### 2. **Minimal Overhead**
- Validation only occurs when requested
- JSON parsing only occurs when needed
- Retry mechanism only activates on failures
## Future Enhancements
### 1. **Potential Additions**
- Circuit breaker pattern for API failures
- Caching for repeated requests
- Async/await support
- Streaming response support
- Custom retry strategies
### 2. **Configuration Options**
- Per-request retry configuration
- Custom validation error handling
- Response transformation hooks
- Metrics and monitoring
## Conclusion
The enhanced `OllamaClient` provides a robust, reliable, and easy-to-use interface for LLM interactions while maintaining full backward compatibility. The new validation and retry mechanisms significantly improve the reliability of LLM-based operations in the NER processing pipeline.