AI Agents

AI agents are autonomous systems that can perform tasks, make decisions, and interact with users or other systems. In the context of HITL.sh, AI agents are the systems that generate requests for human review when they encounter uncertainty or need human oversight.

What are AI Agents?

AI agents are software systems that can:
  • Process Information: Analyze text, images, videos, and structured data
  • Make Decisions: Apply machine learning models to classify, predict, or generate content
  • Take Actions: Execute tasks based on their analysis and decisions
  • Learn and Adapt: Improve performance over time through feedback and training

AI Agent Examples

  • Content moderation systems
  • Customer service chatbots
  • Financial fraud detection
  • Medical diagnosis assistants
  • Quality assurance tools

When AI Agents Need Human Oversight

Even the most advanced AI agents encounter situations where human judgment is essential: low-confidence predictions, high-stakes decisions such as fraud flags or medical assessments, and ambiguous edge cases that fall outside the training data.

Integration Patterns

Request-Response Pattern

The most common integration pattern for AI agents:
1. AI Processing: Your AI agent processes input and identifies a decision point.
2. Confidence Check: If confidence is below the threshold, create a HITL.sh request.
3. Human Review: Human reviewers examine the request and provide their decision.
4. Decision Integration: Use the human decision to complete the workflow.

Continuous Learning Pattern

AI agents can learn from human decisions to improve future performance:
[Diagram: AI agent learning from the human feedback loop]
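A minimal sketch of this loop, using a local JSON-lines file as the feedback store; the file name, batch size, and retrain_model helper are illustrative and not part of the HITL.sh API:
import json

FEEDBACK_FILE = "human_feedback.jsonl"  # illustrative local store for labeled examples
RETRAIN_BATCH_SIZE = 500                # illustrative retraining trigger

def record_human_feedback(content, ai_decision, human_decision):
    # Append a human-labeled example so the model can learn from it later
    example = {
        "content": str(content),
        "ai_prediction": ai_decision.prediction,
        "ai_confidence": ai_decision.confidence,
        "human_label": human_decision.decision,
        "human_reason": getattr(human_decision, "reason", None),
    }
    with open(FEEDBACK_FILE, "a") as f:
        f.write(json.dumps(example) + "\n")

def maybe_retrain():
    # Retrain once enough human-labeled examples have accumulated
    with open(FEEDBACK_FILE) as f:
        examples = [json.loads(line) for line in f]
    if len(examples) >= RETRAIN_BATCH_SIZE:
        retrain_model(examples)  # your own training pipeline (illustrative)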

Building AI Agents with HITL.sh

1. Decision Thresholds

Set confidence thresholds that trigger human review:
def process_with_hitl(content, confidence_threshold=0.8):
    # AI processing
    ai_decision = ai_model.predict(content)
    confidence = ai_decision.confidence
    
    if confidence < confidence_threshold:
        # Route to human review
        request = create_hitl_request(
            content=content,
            ai_decision=ai_decision,
            confidence=confidence
        )
        return wait_for_human_decision(request.id)
    else:
        # Use AI decision directly
        return ai_decision
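The wait_for_human_decision helper above is not part of the HITL.sh client; one way to sketch it is to poll the same hitl_client.get_response call used in the Response Handling example below. The interval, timeout, and the assumption that an unanswered request carries no decision yet are all illustrative:
import time

def wait_for_human_decision(request_id, poll_interval=15, timeout=3600):
    # Poll until a reviewer responds; webhooks (see below) avoid polling entirely
    deadline = time.time() + timeout
    while time.time() < deadline:
        response = hitl_client.get_response(request_id)
        # Assumption: an unanswered request returns no decision yet
        if response is not None and getattr(response, "decision", None):
            return response
        time.sleep(poll_interval)
    raise TimeoutError(f"No human decision for request {request_id} after {timeout}s")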

2. Request Creation

Structure requests to provide reviewers with necessary context:
def create_hitl_request(content, ai_decision, confidence):
    return hitl_client.create_request(
        loop_id="content_moderation",
        data={
            "content": content,
            "ai_decision": ai_decision.prediction,
            "confidence": confidence,
            "model_version": ai_decision.model_version,
            "processing_time": ai_decision.processing_time,
            "context": {
                "user_id": content.user_id,
                "content_type": content.type,
                "previous_flags": get_user_history(content.user_id)
            }
        }
    )

3. Response Handling

Process human decisions and integrate them into your workflow:
def handle_human_decision(request_id):
    response = hitl_client.get_response(request_id)
    
    if response.decision == "approved":
        # Process approved content
        publish_content(response.content_id)
        # Learn from human decision
        update_training_data(response.content_id, "approved")
        
    elif response.decision == "rejected":
        # Handle rejected content
        flag_content(response.content_id, response.reason)
        # Learn from human decision
        update_training_data(response.content_id, "rejected")
        
    elif response.decision == "needs_changes":
        # Request modifications
        request_content_changes(response.content_id, response.feedback)
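Polling with get_response works, but configuring webhooks (see Next Steps) lets HITL.sh push decisions to you instead. A minimal Flask sketch, assuming the webhook payload carries the request ID in a request_id field, which you should verify against the webhook documentation:
from flask import Flask, request

app = Flask(__name__)

@app.route("/hitl/webhook", methods=["POST"])
def hitl_webhook():
    payload = request.get_json()
    # Assumption: the payload identifies the answered request by a request_id field;
    # check the HITL.sh webhook documentation for the actual schema
    request_id = payload.get("request_id")
    if request_id:
        handle_human_decision(request_id)  # reuse the handler defined above
    return "", 204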

Best Practices for AI Agent Integration

Request Design

  • Clear Context: Provide reviewers with all information needed to make informed decisions.
  • Structured Data: Organize request data in logical, easy-to-scan formats.
  • Relevant Metadata: Include AI confidence scores, model versions, and processing details.
  • Historical Context: Reference previous decisions and user behavior patterns.
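As an illustration of these practices, a well-structured payload for the content-moderation loop used earlier might look like this (field names not shown in the examples above are illustrative):
request = hitl_client.create_request(
    loop_id="content_moderation",
    data={
        # Clear context: the item under review and the question being asked
        "content": "User-submitted product review text...",
        "question": "Does this review violate the community guidelines?",
        # Structured data: related fields grouped and consistently named
        "ai_decision": "flag",
        "confidence": 0.62,
        # Relevant metadata: model and processing details
        "model_version": "moderation-v3.2",
        "processing_time_ms": 41,
        # Historical context: prior decisions and behavior patterns
        "context": {
            "user_id": "user_123",
            "previous_flags": 2,
        },
    },
)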

Performance Optimization

Learning and Improvement

1. Collect Feedback: Gather human decisions and reasoning for training data.
2. Analyze Patterns: Identify common decision patterns and edge cases.
3. Update Models: Retrain AI models with human feedback data.
4. Measure Improvement: Track accuracy improvements and reduce human review needs.

Common Integration Scenarios

Content Moderation

AI agents flag potentially inappropriate content for human review:
def moderate_content(content):
    # AI content analysis
    moderation_result = content_moderator.analyze(content)
    
    if moderation_result.requires_review:
        request = hitl_client.create_request(
            loop_id="content_moderation",
            data={
                "content": content.text,
                "content_type": "text",
                "ai_flags": moderation_result.flags,
                "confidence": moderation_result.confidence,
                "risk_level": moderation_result.risk_level
            }
        )
        return request

Quality Assurance

AI-generated content is reviewed for accuracy and quality:
def quality_check_content(content):
    # AI quality assessment
    quality_score = quality_checker.assess(content)
    
    if quality_score < QUALITY_THRESHOLD:
        request = hitl_client.create_request(
            loop_id="quality_assurance",
            data={
                "content": content,
                "quality_score": quality_score,
                "ai_feedback": quality_checker.get_feedback(),
                "expected_standards": get_content_standards()
            }
        )
        return request

Fraud Detection

AI systems flag suspicious transactions for human investigation:
def fraud_detection(transaction):
    # AI fraud analysis
    fraud_score = fraud_detector.analyze(transaction)
    
    if fraud_score > FRAUD_THRESHOLD:
        request = hitl_client.create_request(
            loop_id="fraud_review",
            data={
                "transaction": transaction,
                "fraud_score": fraud_score,
                "risk_factors": fraud_detector.get_risk_factors(),
                "user_history": get_user_transaction_history(transaction.user_id)
            }
        )
        return request

Monitoring and Analytics

Track your AI agent’s performance and human review effectiveness:

  • AI Performance: Monitor accuracy, confidence distributions, and decision quality.
  • Human Review Metrics: Track response times, decision consistency, and reviewer performance.
  • Learning Progress: Measure improvements in AI accuracy over time.
  • Cost Optimization: Balance AI automation with human review costs.
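A small sketch of how these metrics might be computed from a log of completed requests that your integration maintains, assuming each entry records the AI prediction, the human decision, and created/answered timestamps:
from datetime import datetime

def compute_review_metrics(log_entries):
    # Summarize review rate, AI/human agreement, and reviewer response time
    total = len(log_entries)
    reviewed = [e for e in log_entries if e.get("human_decision") is not None]
    agreements = [e for e in reviewed if e["human_decision"] == e["ai_prediction"]]

    def hours_to_answer(entry):
        created = datetime.fromisoformat(entry["created_at"])
        answered = datetime.fromisoformat(entry["answered_at"])
        return (answered - created).total_seconds() / 3600

    return {
        "human_review_rate": len(reviewed) / total if total else 0.0,
        "ai_human_agreement": len(agreements) / len(reviewed) if reviewed else 0.0,
        "avg_response_hours": (
            sum(hours_to_answer(e) for e in reviewed) / len(reviewed) if reviewed else 0.0
        ),
    }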

Next Steps

Ready to integrate your AI agent with HITL.sh?

  • Create Your First Loop: Set up a human-in-the-loop workflow for your AI agent.
  • Explore Request Types: Learn about different types of requests you can create.
  • Set Up Webhooks: Configure real-time notifications for human decisions.