Intelligent Document Extraction Tool Guide
Note: Before learning how to use the individual functions, we recommend that you read the Request Workflow to understand the basic PDF processing flow. Each function accepts its own special parameters when you upload a file; the other basic steps are the same.
Intelligent Document Extraction:
{
  "keys": ["Title"],
  "tableHandles": ["Invoice Number"],
  "extractType": "0"
}
Required Parameters:
keys: Text, e.g., ["Title"].
tableHandles: Table headers, e.g., ["Invoice Number"]
extractType: Full-text extraction (0: Default full text, 1: All text, 2: All tables)
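The three parameters above are sent as a single JSON string in the parameter form field. The following is a minimal sketch of assembling that string with the Java standard library only. The class and method names (ExtractParameter, build) are hypothetical helpers, not part of the API, and the sketch writes extractType as a number, as the request examples in this guide do.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of the API): builds the "parameter" JSON string.
public class ExtractParameter {

    // Escape backslashes and double quotes so a value is safe inside a JSON string.
    // Control characters are not handled in this sketch.
    public static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Render a list of strings as a JSON array.
    public static String jsonArray(List<String> values) {
        return values.stream()
                .map(ExtractParameter::quote)
                .collect(Collectors.joining(",", "[", "]"));
    }

    // Build the value of the "parameter" form field from the three documented settings.
    public static String build(List<String> keys, List<String> tableHandles, int extractType) {
        if (extractType < 0 || extractType > 2) {
            throw new IllegalArgumentException("extractType must be 0, 1, or 2");
        }
        return "{\"keys\":" + jsonArray(keys)
                + ",\"tableHandles\":" + jsonArray(tableHandles)
                + ",\"extractType\":" + extractType + "}";
    }

    public static void main(String[] args) {
        System.out.println(build(List.of("Title"), List.of("Invoice Number"), 0));
    }
}
```

Building the string through a helper like this avoids hand-escaping quotes when a key or table header itself contains a quote character.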
Request Example:
Replace apiKey with the publicKey obtained from the console, file with the file you want to process, and language with the language type to use for interface error messages.
curl --location --request POST 'https://api-server.compdf.com/server/v2/process/idp/documentExtract' \
--header 'x-api-key: apiKey' \
--header 'Accept: */*' \
--header 'Connection: keep-alive' \
--header 'Content-Type: multipart/form-data' \
--form 'file=@"file"' \
--form 'password=""' \
--form 'parameter="{ \"keys\":[], \"tableHandles\":[],\"extractType\":2}"' \
--form 'language="1"'

import java.io.*;
import okhttp3.*;

public class Main {
    public static void main(String[] args) throws IOException {
        OkHttpClient client = new OkHttpClient().newBuilder()
                .build();
        // Multipart form body: the file plus the language, password, and parameter fields.
        RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
                .addFormDataPart("file", "{{file}}",
                        RequestBody.create(MediaType.parse("application/octet-stream"),
                                new File("<file>")))
                .addFormDataPart("language", "{{language}}")
                .addFormDataPart("password", "")
                .addFormDataPart("parameter", "{ \"ocrRecognitionLang\": \"AUTO\" , \"keys\":[], \"tableHandles\":[],\"extractType\":2}")
                .build();
        Request request = new Request.Builder()
                .url("https://api-server.compdf.com/server/v2/process/idp/documentExtract")
                .method("POST", body)
                .addHeader("x-api-key", "{{apiKey}}")
                .build();
        // Close the response when done and print the JSON body returned by the server.
        try (Response response = client.newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
Response Information:
A successful request returns an HTTP 200 OK status code and a JSON response body showing the order details.
Response type: application/json
| Response Parameter | Data Type | Description |
|---|---|---|
| code | String | HTTP request status, "200" indicates success |
| msg | String | Request message |
| data | Object | Return result |
| +taskId | String | Task ID |
| +taskFileNum | int | Number of files processed in the task |
| +taskSuccessNum | int | Number of files successfully processed in the task |
| +taskFailNum | int | Number of files failed in the task |
| +taskStatus | String | Task status |
| +assetTypeId | int | Used asset type ID |
| +taskCost | int | Task cost |
| +taskTime | int | Task duration |
| +sourceType | String | Original format |
| +targetType | String | Target format |
| +fileInfoDTOList | Array | Task file information |
| ++fileKey | String | File key |
| ++taskId | String | Task ID |
| ++fileName | String | Original file name |
| ++downFileName | String | Download file name |
| ++fileUrl | String | Original file URL |
| ++downloadUrl | String | Processed result file download URL |
| ++sourceType | String | Original format |
| ++targetType | String | Target format |
| ++fileSize | int | File size |
| ++convertSize | int | Processed result file size |
| ++convertTime | int | Processing time |
| ++status | String | File processing status |
| ++failureCode | String | File processing failure error code |
| ++failureReason | String | File processing failure description |
| ++fileParameter | String | Processing parameter |
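As an illustration of how the fields in this table might be consumed, the sketch below collects the downloadUrl of each entry in fileInfoDTOList. It assumes the response has already been parsed into Map and List objects by a JSON library of your choice, and it assumes that an empty failureCode means the file was processed without error, which is how the response example in this guide presents it. The class and method names (TaskFiles, downloadUrls) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of the API): reads entries of fileInfoDTOList.
public class TaskFiles {

    // Collect downloadUrl values for entries whose failureCode is absent or empty.
    // Assumption: an empty failureCode marks a file that was processed without error.
    public static List<String> downloadUrls(List<Map<String, Object>> fileInfoList) {
        List<String> urls = new ArrayList<>();
        for (Map<String, Object> info : fileInfoList) {
            Object failureCode = info.get("failureCode");
            boolean failed = failureCode != null && !failureCode.toString().isEmpty();
            Object url = info.get("downloadUrl");
            if (!failed && url != null) {
                urls.add(url.toString());
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Illustrative data only; the URL and failure code are made up.
        List<Map<String, Object>> files = new ArrayList<>();
        files.add(Map.<String, Object>of(
                "fileName", "a.pdf", "failureCode", "",
                "downloadUrl", "https://example.invalid/a.json"));
        files.add(Map.<String, Object>of(
                "fileName", "b.pdf", "failureCode", "E1",
                "failureReason", "hypothetical failure"));
        System.out.println(downloadUrls(files));
    }
}
```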
Response Example:
{
  "code": "200",
  "msg": "success",
  "data": {
    "taskId": "f416dbcf-0c10-4f93-ab9e-a835c1f5dba1",
    "taskFileNum": 1,
    "taskSuccessNum": 1,
    "taskFailNum": 0,
    "taskStatus": "<taskStatus>",
    "assetTypeId": 0,
    "taskCost": 1,
    "taskTime": 1,
    "sourceType": "<sourceType>",
    "targetType": "<targetType>",
    "fileInfoDTOList": [
      {
        "fileKey": "<fileKey>",
        "taskId": "<taskId>",
        "fileName": "<fileName>",
        "downFileName": "<downFileName>",
        "fileUrl": "<fileUrl>",
        "downloadUrl": "<downloadUrl>",
        "sourceType": "<sourceType>",
        "targetType": "<targetType>",
        "fileSize": 24475,
        "convertSize": 6922,
        "convertTime": 8,
        "status": "<status>",
        "failureCode": "",
        "failureReason": "",
        "fileParameter": "<fileParameter>"
      }
    ]
  }
}
Result:
| File Type | File Description |
|---|---|
| .json | JSON file with intelligent document extraction completed |
Return Data Structure Explanation:
JSON Content Explanation
| Return Parameter | Data Type | Description |
|---|---|---|
| code | String | Error code, "200" indicates success |
| msg | String | Error message |
| data | Object | Return result |
| +details | Object | Key information extraction result |
| ++Page-index | Object | Extraction result for the corresponding page number |
| +++key | String | Key information field extraction result, key:value |
| +++tables | Array | Key information table extraction result, tables:[ [table1], [table2] ] |
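The nesting in the table above (details, then one object per page, then key:value fields alongside a tables entry) can be walked as in the following sketch. It assumes the result JSON has already been parsed into nested Map objects by a JSON library of your choice, and that tables may be null when a page yields no tables. The class and method names (ExtractionResult, keyFieldsByPage) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the API): reads the "details" object of the result JSON.
public class ExtractionResult {

    // For each page entry, keep the key:value fields and skip the "tables" entry.
    // Fields with a null value are skipped as well.
    public static Map<String, Map<String, String>> keyFieldsByPage(
            Map<String, Map<String, Object>> details) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Object>> page : details.entrySet()) {
            Map<String, String> fields = new LinkedHashMap<>();
            for (Map.Entry<String, Object> field : page.getValue().entrySet()) {
                if (!"tables".equals(field.getKey()) && field.getValue() != null) {
                    fields.put(field.getKey(), field.getValue().toString());
                }
            }
            result.put(page.getKey(), fields);
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative data shaped like a single-page result.
        Map<String, Object> page1 = new LinkedHashMap<>();
        page1.put("Order Date", "xxx");
        page1.put("tables", null);
        Map<String, Map<String, Object>> details = new LinkedHashMap<>();
        details.put("Page-1", page1);
        System.out.println(keyFieldsByPage(details));
    }
}
```

LinkedHashMap is used so that pages and fields keep the order in which they were inserted.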
JSON Structure Example:
{
  "code": "200",
  "msg": "success",
  "data": {
    "details": {
      "Page-1": {
        "Order Date": "xxx",
        "Order #": "xxx",
        "Quote#": "xxx",
        "Your estimated delivery date is": "xxx",
        "tables": null
      }
    }
  }
}
Asynchronous Request
If you need to use the file asynchronous processing flow, please read the Asynchronous Request Instructions.