API 教程

CAPTCHA API 调用的断路器模式

当验证码解决 API 速度缓慢或返回错误时,继续发送请求会浪费时间和金钱。断路器模式停止调用失败的 API,等待恢复,然后自动恢复 - 防止管道中发生级联故障。


断路器的工作原理

三种状态:

  1. 关闭 – 正常操作。请求通过。失败被计数。
  2. 打开 – 失败次数过多。所有请求均会立即被拒绝,无需调用 API。
  3. 半开 – 经过一段冷却时间后,允许一个测试请求通过。如果成功,电路关闭。如果失败,电路再次打开。

Python实现

import time
import threading
import requests

SUBMIT_URL = "https://ocr.captchaai.com/in.php"
RESULT_URL = "https://ocr.captchaai.com/res.php"
API_KEY = "YOUR_API_KEY"


class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = 0
        self.state = "closed"  # closed, open, half-open
        self._lock = threading.Lock()

    def call(self, func, *args, **kwargs):
        with self._lock:
            if self.state == "open":
                if time.time() - self.last_failure_time > self.recovery_timeout:
                    self.state = "half-open"
                    print("[circuit] State: half-open — testing one request")
                else:
                    remaining = self.recovery_timeout - (
                        time.time() - self.last_failure_time
                    )
                    raise CircuitOpenError(
                        f"Circuit open — retry in {remaining:.0f}s"
                    )

        try:
            result = func(*args, **kwargs)
            with self._lock:
                self.failure_count = 0
                if self.state == "half-open":
                    print("[circuit] State: closed — API recovered")
                self.state = "closed"
            return result
        except Exception as e:
            with self._lock:
                self.failure_count += 1
                self.last_failure_time = time.time()
                if self.failure_count >= self.failure_threshold:
                    self.state = "open"
                    print(
                        f"[circuit] State: open — "
                        f"{self.failure_count} failures"
                    )
            raise


class CircuitOpenError(Exception):
    pass


def solve_captcha(sitekey, page_url):
    resp = requests.post(SUBMIT_URL, data={
        "key": API_KEY,
        "method": "userrecaptcha",
        "googlekey": sitekey,
        "pageurl": page_url,
        "json": "1",
    }, timeout=15)
    data = resp.json()
    if data["status"] != 1:
        raise Exception(f"Submit error: {data['request']}")

    task_id = data["request"]
    for _ in range(24):
        time.sleep(5)
        poll = requests.get(RESULT_URL, params={
            "key": API_KEY,
            "action": "get",
            "id": task_id,
            "json": "1",
        }, timeout=15).json()
        if poll["status"] == 1:
            return poll["request"]
        if poll["request"] != "CAPCHA_NOT_READY":
            raise Exception(f"Poll error: {poll['request']}")
    raise TimeoutError(f"Task {task_id} timed out")


# Usage
breaker = CircuitBreaker(failure_threshold=3, recovery_timeout=30)

for i in range(10):
    try:
        token = breaker.call(
            solve_captcha, "6Le-SITEKEY", "https://example.com"
        )
        print(f"[task-{i}] Solved: {token[:40]}...")
    except CircuitOpenError as e:
        print(f"[task-{i}] Skipped: {e}")
    except Exception as e:
        print(f"[task-{i}] Failed: {e}")

预期输出:

[task-0] Solved: 03AGdBq26ZfPxL...
[task-1] Solved: 03AGdBq27AbCdE...
[task-2] Failed: Submit error: ERROR_NO_SLOT_AVAILABLE
[task-3] Failed: Submit error: ERROR_NO_SLOT_AVAILABLE
[task-4] Failed: Submit error: ERROR_NO_SLOT_AVAILABLE
[circuit] State: open — 3 failures
[task-5] Skipped: Circuit open — retry in 28s
[task-6] Skipped: Circuit open — retry in 25s
...
[circuit] State: half-open — testing one request
[task-8] Solved: 03AGdBq28FgHiJ...
[circuit] State: closed — API recovered

JavaScript 实现

class CircuitBreaker {
  constructor(options = {}) {
    this.failureThreshold = options.failureThreshold || 5;
    this.recoveryTimeout = options.recoveryTimeout || 60000;
    this.failureCount = 0;
    this.lastFailureTime = 0;
    this.state = 'closed';
  }

  async call(fn, ...args) {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailureTime > this.recoveryTimeout) {
        this.state = 'half-open';
        console.log('[circuit] State: half-open');
      } else {
        const remaining = this.recoveryTimeout - (Date.now() - this.lastFailureTime);
        throw new Error(`Circuit open — retry in ${Math.ceil(remaining / 1000)}s`);
      }
    }

    try {
      const result = await fn(...args);
      this.failureCount = 0;
      if (this.state === 'half-open') {
        console.log('[circuit] State: closed — recovered');
      }
      this.state = 'closed';
      return result;
    } catch (error) {
      this.failureCount++;
      this.lastFailureTime = Date.now();
      if (this.failureCount >= this.failureThreshold) {
        this.state = 'open';
        console.log(`[circuit] State: open — ${this.failureCount} failures`);
      }
      throw error;
    }
  }
}

// Usage
const axios = require('axios');

const API_KEY = 'YOUR_API_KEY';
const breaker = new CircuitBreaker({ failureThreshold: 3, recoveryTimeout: 30000 });

async function solveCaptcha(sitekey, pageurl) {
  const submit = await axios.post('https://ocr.captchaai.com/in.php', null, {
    params: { key: API_KEY, method: 'userrecaptcha', googlekey: sitekey, pageurl, json: 1 }
  });

  if (submit.data.status !== 1) throw new Error(submit.data.request);
  const taskId = submit.data.request;

  for (let i = 0; i < 24; i++) {
    await new Promise(r => setTimeout(r, 5000));
    const poll = await axios.get('https://ocr.captchaai.com/res.php', {
      params: { key: API_KEY, action: 'get', id: taskId, json: 1 }
    });
    if (poll.data.status === 1) return poll.data.request;
    if (poll.data.request !== 'CAPCHA_NOT_READY') throw new Error(poll.data.request);
  }
  throw new Error('Timeout');
}

(async () => {
  for (let i = 0; i < 10; i++) {
    try {
      const token = await breaker.call(solveCaptcha, '6Le-SITEKEY', 'https://example.com');
      console.log(`[task-${i}] Solved: ${token.substring(0, 40)}...`);
    } catch (err) {
      console.log(`[task-${i}] ${err.message}`);
    }
  }
})();

选择阈值

范围 低流量(< 10/min) 高流量(> 100/min)
failure_threshold 3 10
recovery_timeout 30秒 60年代

将故障阈值设置得足够高,以容忍间歇性错误(单次超时不应导致电路跳闸),但又足够低,以阻止对失败的 API 进行攻击。


与重试逻辑结合

在断路器内部使用重试逻辑。断路器对最终失败进行计数(重试次数耗尽后):

def solve_with_retry(sitekey, page_url, max_retries=2):
    for attempt in range(max_retries + 1):
        try:
            return solve_captcha(sitekey, page_url)
        except Exception:
            if attempt == max_retries:
                raise
            time.sleep(2 ** attempt)

# Circuit breaker wraps the retry function
token = breaker.call(solve_with_retry, "6Le-SITEKEY", "https://example.com")

故障排除

问题 原因 处理方式
电路开路太快 门槛太低 增加failure_threshold
电路永远无法恢复 recovery_timeout 太长 缩短至 30-60 秒
多线程使用中的竞争条件 无状态锁定 使用 threading.Lock (Python) 或原子操作
部分中断期间所有请求均被阻止 适用于所有端点的单一断路器 对提交和轮询端点使用单独的断路器

常问问题

我应该使用单独的断路器来提交和轮询吗?

是的,对于大型系统。提交端点可能会失败,而轮询仍然有效(反之亦然)。单独的断路器可以为您提供更细粒度的控制。

开路时该怎么办?

将 CAPTCHA 任务排队以供稍后使用、显示后备 UI 或跳过该操作。看解决失败时的优雅降级。


使用 CaptchaAI 构建弹性验证码工作流程

获取您的 API 密钥:验证码网站


相关指南

该文章已禁用评论。