Document AI

OCR

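With the OCR tool in ComPDFKit Document AI, you can recognize text in images or scanned PDF files. The following examples show how to upload a JPG file and run OCR processing in Java, PHP, C#, Python, and Swift. The result is returned as a JSON file.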

Java
// Create a client
CPDFClient client = new CPDFClient(publicKey,secretKey);

// Create a Document AI OCR task
CPDFCreateTaskResult result = client.createTask(CPDFDocumentAIEnum.OCR);

// Get a task id
String taskId = result.getTaskId();

// File handling parameter settings
CPDFOcrParameter fileParameter = new CPDFOcrParameter();
fileParameter.setLang("auto");

// Upload files
client.uploadFile(new File("test.jpg"), taskId, fileParameter);

// Execute task
client.executeTask(taskId);

// Query TaskInfo
CPDFTaskInfoResult taskInfo = client.getTaskInfo(taskId);
PHP
// Create a client
$client = new CPDFClient('public_key', 'secret_key');

// Create a Document AI OCR task
$taskInfo = $client->createTask(CPDFDocumentAI::OCR);

// File handling parameter settings
$file = $client->addFile('test.jpg')
    ->setLang('auto');

// Upload files
$fileInfo = $file->uploadFile($taskInfo['taskId']);

// Execute task
$client->executeTask($taskInfo['taskId']);

// Query TaskInfo
$taskInfo = $client->getTaskInfo($taskInfo['taskId']);
C#
// Create a client
CPDFClient client = new CPDFClient(publicKey,secretKey);

// Create a Document AI OCR task
CPDFCreateTaskResult result = client.CreateTask(CPDFDocumentAIEnum.OCR);

// Get a task id
string taskId = result.TaskId;

// File handling parameter settings
CPDFOcrParameter fileParameter = new CPDFOcrParameter();
fileParameter.Lang = "auto";

// Upload files
client.UploadFile(new FileInfo("test.jpg"), taskId, fileParameter);

// Execute task
client.ExecuteTask(taskId);

// Query TaskInfo
CPDFTaskInfoResult taskInfo = client.GetTaskInfo(taskId);
Python
# Create a client
client = CPDFClient(public_key, secret_key)

# Create a Document AI OCR task
create_task_result = client.create_task(CPDFDocumentAIEnum.OCR)

# Get a task id
task_id = create_task_result.task_id

# File handling parameter settings
file_parameter = CPDFOcrParameter()
file_parameter.lang = "auto"

# Upload files
client.upload_file('test.jpg', task_id, file_parameter)

# Execute task
client.execute_task(task_id)

# Query TaskInfo
task_info = client.get_task_info(task_id)
Swift
// Create a client
let client: CPDFClient = CPDFClient(publicKey: public_key, secretKey: secret_key)

Task { @MainActor in
    // Create a Document AI OCR task
    let taskModel = await client.createTask(url: CPDFDocumentAI.OCR)
    
    // Get a task id
    let taskId = taskModel?.taskId ?? ""

    // Upload files
    let path = Bundle.main.path(forResource: "test", ofType: "jpg")
    let uploadFileModel = await client.uploadFile(filepath: path ?? "", params: [
        CPDFFileUploadParameterKey.lang.string():"auto"
    ], taskId: taskId)
    
    // Execute task
    let _ = await client.processFiles(taskId: taskId)
    
    // Query TaskInfo
    let taskInfoModel = await client.getTaskInfo(taskId: taskId)
}
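  • lang: supported values and their meanings
      • auto - automatic language detection
      • english - English
      • chinese - Simplified Chinese
      • chinese_tra - Traditional Chinese
      • korean - Korean
      • japanese - Japanese
      • latin - Latin
      • devanagari - Devanagari (Sanskrit)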
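
Because `lang` only accepts the values listed above, it can help to check the value on the client side before uploading a file. The following is a minimal sketch of such a check. `SUPPORTED_OCR_LANGS` and `normalize_lang` are hypothetical names and are not part of the ComPDFKit SDK. Trimming whitespace and lowercasing the input is an assumption about what callers would want, not documented API behavior.

```python
# Supported values for the `lang` parameter, copied from the list above.
SUPPORTED_OCR_LANGS = frozenset({
    "auto", "english", "chinese", "chinese_tra",
    "korean", "japanese", "latin", "devanagari",
})


def normalize_lang(lang: str) -> str:
    """Return a cleaned-up `lang` value, or raise ValueError if unsupported.

    Hypothetical helper for illustration; not part of the ComPDFKit SDK.
    """
    # Assumption: callers may pass stray whitespace or mixed case.
    cleaned = lang.strip().lower()
    if cleaned not in SUPPORTED_OCR_LANGS:
        raise ValueError(
            f"Unsupported OCR language {lang!r}; "
            f"expected one of {sorted(SUPPORTED_OCR_LANGS)}"
        )
    return cleaned
```

In the Python example above, the returned value could then be assigned to `file_parameter.lang` in place of the literal `"auto"`.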

Result:

File type     Description
.json         OCR recognition result.

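Content:

Parameter     Description
cost          Time taken by OCR recognition.
boxes         Positions of the boxes around all objects detected in the input image.
text          Text recognized by OCR.
rec_scores    OCR text recognition scores. The higher the score, the more reliable the result.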
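
The fields above can be used to discard low-confidence text after downloading the result file. The sketch below assumes that the JSON is a single object whose `text` and `rec_scores` entries are parallel lists, with one score per recognized text item. This page does not specify that layout, so adjust the code to the structure of the file you actually receive. `confident_lines` is a hypothetical helper, and the sample data is made up for illustration only.

```python
import json
from typing import Any


def confident_lines(result: dict[str, Any], min_score: float = 0.8) -> list[str]:
    """Keep only the text items whose recognition score reaches min_score.

    Assumes `text` and `rec_scores` are parallel lists (an assumption,
    not a documented guarantee).
    """
    texts = result.get("text", [])
    scores = result.get("rec_scores", [])
    return [t for t, s in zip(texts, scores) if s >= min_score]


# Made-up sample in the assumed layout; not real OCR output.
sample = json.loads("""
{
  "cost": 0.42,
  "boxes": [[[10, 10], [120, 10], [120, 40], [10, 40]],
            [[10, 60], [90, 60], [90, 88], [10, 88]]],
  "text": ["Invoice", "T0tal"],
  "rec_scores": [0.98, 0.41]
}
""")

print(confident_lines(sample))  # ['Invoice']
```

The 0.8 threshold is arbitrary. A suitable cutoff depends on image quality and on how costly a misread is for your use case.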