
Table Extraction Tool Guide

Note: Before learning how to use the individual functions, we recommend that you read the Request Workflow to understand the basic PDF processing flow. Each function accepts its own special parameters, which you set when uploading files. The other basic steps are the same for all functions.

Table Extraction:

json
{
  "lang": 8
}

Request Parameter:

lang: OCR recognition language. Supported types and definitions: 1: Simplified Chinese; 2: Traditional Chinese; 3: English; 4: Korean; 5: Japanese; 6: Latin; 7: Sanskrit; 8: Auto.
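
The numeric codes above can be kept readable in client code with a small lookup type. The following is a minimal sketch, not part of any official SDK: the names `OcrLangDemo`, `OcrLang`, and `fromCode` are hypothetical, and only the code-to-language mapping comes from the list above.

```java
import java.util.Arrays;

class OcrLangDemo {
    // Hypothetical enum; the codes mirror the "lang" values listed in this guide.
    enum OcrLang {
        SIMPLIFIED_CHINESE(1), TRADITIONAL_CHINESE(2), ENGLISH(3), KOREAN(4),
        JAPANESE(5), LATIN(6), SANSKRIT(7), AUTO(8);

        final int code;

        OcrLang(int code) {
            this.code = code;
        }

        // Looks up the language for a numeric code, rejecting unknown codes.
        static OcrLang fromCode(int code) {
            return Arrays.stream(values())
                    .filter(l -> l.code == code)
                    .findFirst()
                    .orElseThrow(() ->
                            new IllegalArgumentException("Unsupported lang code: " + code));
        }
    }

    public static void main(String[] args) {
        System.out.println(OcrLang.fromCode(8));   // prints "AUTO"
        System.out.println(OcrLang.ENGLISH.code);  // prints "3"
    }
}
```

Rejecting unknown codes early keeps an invalid value from being sent to the server, where it would only surface as a failed task.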

Request Example:

Replace apiKey with the publicKey obtained from the dashboard, file with the file you want to process, and language with the language type in which you want interface error prompts returned.

curl
curl --location --request POST 'https://api-server.compdf.com/server/v2/process/documentAI/tableRec' \
--header 'x-api-key: apiKey' \
--header 'Accept: */*' \
--header 'Connection: keep-alive' \
--header 'Content-Type: multipart/form-data' \
--form 'file=@"file"' \
--form 'password=""' \
--form 'parameter="{  \"ocrRecognitionLang\": \"AUTO\"  }"' \
--form 'language="1"'
java
import java.io.File;
import java.io.IOException;
import okhttp3.*;

public class Main {
  public static void main(String[] args) throws IOException {
    OkHttpClient client = new OkHttpClient().newBuilder()
      .build();
    // Multipart form body: the file to process plus the request parameters.
    RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
      .addFormDataPart("file", "{{file}}",
        RequestBody.create(MediaType.parse("application/octet-stream"),
                           new File("<file>")))
      .addFormDataPart("language", "{{language}}")
      .addFormDataPart("password", "")
      .addFormDataPart("parameter", "{    \"ocrRecognitionLang\": \"AUTO\"    }")
      .build();
    Request request = new Request.Builder()
      .url("https://api-server.compdf.com/server/v2/process/documentAI/tableRec")
      .method("POST", body)
      .addHeader("x-api-key", "{{apiKey}}")
      .build();
    // Close the response when done; print the JSON body for inspection.
    try (Response response = client.newCall(request).execute()) {
      System.out.println(response.body().string());
    }
  }
}

Response Information:

A successful request returns an HTTP 200 OK status code and a JSON response body containing the task details.

Response type: application/json

Response Parameter   Data Type   Description
code                 String      HTTP request status; "200" indicates success
message              String      Request message
data                 Object      Return result
+taskId              String      Task ID
+taskFileNum         int         Number of files processed in the task
+taskSuccessNum      int         Number of files successfully processed in the task
+taskFailNum         int         Number of files that failed in the task
+taskStatus          String      Task status
+assetTypeId         int         Used asset type ID
+taskCost            int         Task cost
+taskTime            int         Task duration
+sourceType          String      Original format
+targetType          String      Target format
+fileInfoDTOList     Array       Task file information
++fileKey            String      File key
++taskId             String      Task ID
++fileName           String      Original file name
++downFileName       String      Download file name
++fileUrl            String      Original file URL
++downloadUrl        String      Processed result file download URL
++sourceType         String      Original format
++targetType         String      Target format
++fileSize           int         File size
++convertSize        int         Processed result file size
++convertTime        int         Processing time
++status             String      File processing status
++failureCode        String      File processing failure error code
++failureReason      String      File processing failure description
++fileParameter      String      Processing parameter

Response Example:

json
{
  "code": "200",
  "msg": "success",
  "data": {
    "taskId": "f416dbcf-0c10-4f93-ab9e-a835c1f5dba1",
    "taskFileNum": 1,
    "taskSuccessNum": 1,
    "taskFailNum": 0,
    "taskStatus": "<taskStatus>",
    "assetTypeId": 0,
    "taskCost": 1,
    "taskTime": 1,
    "sourceType": "<sourceType>",
    "targetType": "<targetType>",
    "fileInfoDTOList": [
      {
        "fileKey": "<fileKey>",
        "taskId": "<taskId>",
        "fileName": "<fileName>",
        "downFileName": "<downFileName>",
        "fileUrl": "<fileUrl>",
        "downloadUrl": "<downloadUrl>",
        "sourceType": "<sourceType>",
        "targetType": "<targetType>",
        "fileSize": 24475,
        "convertSize": 6922,
        "convertTime": 8,
        "status": "<status>",
        "failureCode": "",
        "failureReason": "",
        "fileParameter": "<fileParameter>"
      }
    ]
  }
}

Result:

File Type   Description
.JSON       Table recognition results

Content:

Parameter                      Description
cost                           Time spent on table recognition
type                           Type of the table
angle                          Angle by which the table is rotated
width                          Width of the table
height                         Height of the table
rows                           Number of rows in the table
cols                           Number of columns in the table
position                       Position of the table's bounding rectangle
height_of_rows                 Height of each row of the table
width_of_cols                  Width of each column of the table
table_cells                    Information about all cells in the table
table_cells: start_row         Start row of the cell
table_cells: end_row           End row of the cell
table_cells: start_col         Start column of the cell
table_cells: end_col           End column of the cell
table_cells: text              Text in the cell
table_cells: position          Position of the cell's bounding rectangle
table_cells: lines             Text lines contained in the cell
table_cells: lines: text       Content of the text line
table_cells: lines: score      Recognition score of the text line
table_cells: lines: position   Position of the text line
json
{
  "cost": 7566,
  "json_items": [
    {
      "type": "table_with_line",
      "angle": 0.0,
      "width": 488,
      "height": 191,
      "rows": 4,
      "cols": 4,
      "position": [
        114,
        657,
        602,
        657,
        602,
        848,
        114,
        848
      ],
      "height_of_rows": [
        65,
        30,
        31,
        36
      ],
      "width_of_cols": [
        122,
        122,
        118,
        122
      ],
      "table_cells": [
        {
          "start_row": 1,
          "end_row": 1,
          "start_col": 1,
          "end_col": 1,
          "text": "",
          "position": [
            2,
            2,
            124,
            2,
            124,
            67,
            2,
            67
          ],
          "lines": []
        },
        {
          "start_row": 2,
          "end_row": 2,
          "start_col": 1,
          "end_col": 1,
          "text": "Absorbed",
          "position": [
            2,
            64,
            125,
            64,
            125,
            95,
            2,
            95
          ],
          "lines": [
            {
              "text": "Absorbed",
              "score": 1.0,
              "position": [
                29,
                65,
                99,
                65,
                99,
                88,
                29,
                88
              ]
            }
          ]
        }
      ]
    }
  ],
  "html_items": [
    "<table border=\"1\" width='488px' height='191px'>\n<tr><th width='122px' height='65px'></th><th width='122px' height='65px' style=\"white-space: pre-line\">Absorbed</th><th width='118px' height='65px' style=\"white-space: pre-line\">Neuter</th><th width='122px' height='65px' style=\"white-space: pre-line\">Fatigue</th></tr>\n<tr><th width='122px' height='30px' style=\"white-space: pre-line\">Absorbed</th><th width='122px' height='30px'></th><th width='118px' height='30px' style=\"white-space: pre-line\">2</th><th width='122px' height='30px'></th></tr>\n<tr><th width='122px' height='31px' style=\"white-space: pre-line\">Neuter</th><th width='122px' height='31px'></th><th width='118px' height='31px'></th><th width='122px' height='31px'></th></tr>\n<tr><th width='122px' height='36px' style=\"white-space: pre-line\">Fatigue</th><th width='122px' height='36px'></th><th width='118px' height='36px'></th><th width='122px' height='36px' style=\"white-space: pre-line\">8</th>\t</tr>\n</table>",
    "<table border=\"1\" width='489px' height='166px'>\n<tr><th width='123px' height='61px' style=\"white-space: pre-line\">Expression</th><th width='117px' height='61px' style=\"white-space: pre-line\">Image Num</th><th width='118px' height='61px' style=\"white-space: pre-line\">Correct</th><th width='125px' height='61px' style=\"white-space: pre-line\">Recognition Rate</th></tr>\n<tr><th width='123px' height='31px' style=\"white-space: pre-line\">Absorbed</th><th width='117px' height='31px' style=\"white-space: pre-line\">9</th><th width='118px' height='31px' style=\"white-space: pre-line\">7</th><th width='125px' height='31px' style=\"white-space: pre-line\">77.8%</th></tr>\n<tr><th width='123px' height='30px' style=\"white-space: pre-line\">Neuter</th><th width='117px' height='30px' style=\"white-space: pre-line\">9</th><th width='118px' height='30px'></th><th width='125px' height='30px' style=\"white-space: pre-line\">55.6%</th></tr>\n<tr><th width='123px' height='31px' style=\"white-space: pre-line\">Fatigue</th><th width='117px' height='31px' style=\"white-space: pre-line\">9</th><th width='118px' height='31px'></th><th width='125px' height='31px' style=\"white-space: pre-line\">88.9%</th></tr>\n<tr><th width='483px' height='33px' colspan=\"4\" style=\"white-space: pre-line\">Average recognition rate: 74.1%</th>\t</tr>\n</table>"
  ]
}
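
To illustrate how the fields in the result file fit together, here is a minimal sketch that places cell text into a row-by-column grid and derives a bounding box from a position array. It rests on two assumptions inferred from the sample above rather than stated in this guide: that start_row, end_row, start_col, and end_col are 1-based and inclusive, and that position holds four corner points as consecutive x, y pairs. The names `TableGridSketch`, `Cell`, `toGrid`, and `boundingBox` are hypothetical, and JSON parsing is left out so the example needs only the JDK.

```java
import java.util.Arrays;
import java.util.List;

class TableGridSketch {
    // Hypothetical holder for the table_cells fields used below.
    static final class Cell {
        final int startRow, endRow, startCol, endCol;
        final String text;

        Cell(int startRow, int endRow, int startCol, int endCol, String text) {
            this.startRow = startRow;
            this.endRow = endRow;
            this.startCol = startCol;
            this.endCol = endCol;
            this.text = text;
        }
    }

    // Fills a rows x cols grid; a cell spanning several rows or columns
    // repeats its text in every slot it covers.
    // Assumption: indices are 1-based and inclusive, as the sample suggests.
    static String[][] toGrid(int rows, int cols, List<Cell> cells) {
        String[][] grid = new String[rows][cols];
        for (String[] row : grid) {
            Arrays.fill(row, "");
        }
        for (Cell c : cells) {
            for (int r = c.startRow; r <= c.endRow; r++) {
                for (int col = c.startCol; col <= c.endCol; col++) {
                    grid[r - 1][col - 1] = c.text;
                }
            }
        }
        return grid;
    }

    // Returns {minX, minY, width, height}.
    // Assumption: position is x1, y1, x2, y2, x3, y3, x4, y4.
    static int[] boundingBox(int[] position) {
        int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE;
        int maxX = Integer.MIN_VALUE, maxY = Integer.MIN_VALUE;
        for (int i = 0; i + 1 < position.length; i += 2) {
            minX = Math.min(minX, position[i]);
            maxX = Math.max(maxX, position[i]);
            minY = Math.min(minY, position[i + 1]);
            maxY = Math.max(maxY, position[i + 1]);
        }
        return new int[] {minX, minY, maxX - minX, maxY - minY};
    }

    public static void main(String[] args) {
        // The two cells listed in the sample's table_cells array.
        List<Cell> cells = Arrays.asList(
                new Cell(1, 1, 1, 1, ""),
                new Cell(2, 2, 1, 1, "Absorbed"));
        String[][] grid = toGrid(4, 4, cells);
        System.out.println(grid[1][0]); // prints "Absorbed"

        // The table-level position array from the sample.
        int[] box = boundingBox(new int[] {114, 657, 602, 657, 602, 848, 114, 848});
        System.out.println(Arrays.toString(box)); // prints "[114, 657, 488, 191]"
    }
}
```

For the sample table, the derived width and height (488 and 191) equal the width and height fields in the same item, which is consistent with the corner-point reading of position.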

Asynchronous Request

To process files asynchronously, read the Asynchronous Request Instructions.