PDF to Excel Tool Guide

Note: Before learning how to use the individual functions, we recommend that you read the Request Workflow to understand the basic PDF processing flow. Each function accepts its own special parameters when you upload files; the other basic steps are the same.

PDF to Excel:

json
{
    "contentOptions": "2",
    "worksheetOptions": "1",
    "isContainAnnot": "1",
    "isContainImg": "1",
    "isAllowOcr": "0",
    "isContainOcrBg": "0",
    "isOnlyAiTable": "0"
}

Needed Parameters

contentOptions: Which content to extract (1: text only, 2: tables only, 3: all content). Default: 2.

worksheetOptions: How to create worksheets (1: one sheet per table, 2: one sheet per page, 3: a single sheet for the whole file). Default: 1.

isContainAnnot: Whether to include comments (1: yes, 0: no). Default: 0.

isContainImg: Whether to include pictures (1: yes, 0: no). Default: 0.

isAllowOcr: Whether to enable OCR (1: yes, 0: no). Default: 0.

isContainOcrBg: Whether to keep the background image when OCR is enabled (1: yes, 0: no). Default: 0.

isOnlyAiTable: Whether to enable AI table recognition (1: yes, 0: no). Default: 0.
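
The options above are sent as one JSON string. A minimal sketch of building that string from typed values, so that an out-of-range option is caught before upload, might look like the following. The class name PdfToExcelParameters and its methods are hypothetical helpers for illustration, not part of any ComPDFKit SDK, and every value is written as a quoted string to match the sample above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration; not part of any ComPDFKit SDK.
final class PdfToExcelParameters {

    // Builds the JSON text for the "parameter" form field from typed options.
    static String build(int contentOptions, int worksheetOptions,
                        boolean containAnnot, boolean containImg,
                        boolean allowOcr, boolean containOcrBg,
                        boolean onlyAiTable) {
        requireRange("contentOptions", contentOptions, 1, 3);
        requireRange("worksheetOptions", worksheetOptions, 1, 3);

        // LinkedHashMap keeps the keys in the order used by the sample above.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("contentOptions", Integer.toString(contentOptions));
        fields.put("worksheetOptions", Integer.toString(worksheetOptions));
        fields.put("isContainAnnot", flag(containAnnot));
        fields.put("isContainImg", flag(containImg));
        fields.put("isAllowOcr", flag(allowOcr));
        fields.put("isContainOcrBg", flag(containOcrBg));
        fields.put("isOnlyAiTable", flag(onlyAiTable));

        StringJoiner json = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // Keys and values are fixed identifiers and digits, so no JSON escaping is needed.
            json.add("\"" + field.getKey() + "\": \"" + field.getValue() + "\"");
        }
        return json.toString();
    }

    // Maps a boolean to the "1"/"0" convention used by the parameter list.
    private static String flag(boolean enabled) {
        return enabled ? "1" : "0";
    }

    private static void requireRange(String name, int value, int min, int max) {
        if (value < min || value > max) {
            throw new IllegalArgumentException(
                name + " must be between " + min + " and " + max + ", got " + value);
        }
    }

    public static void main(String[] args) {
        // Same settings as the sample: tables only, one sheet per table,
        // comments and pictures included, OCR and AI table recognition off.
        System.out.println(build(2, 1, true, true, false, false, false));
    }
}
```

The returned string can be passed as the value of the parameter form field in the Upload Files step.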

Example

  1. Authentication

    Replace publicKey and secretKey with the public key and secret key you get from the console. The accessToken is included in the values returned by this authentication request.

    curl
    curl --location --request POST 'https://api-server.compdf.com/server/v1/oauth/token' \
    --header 'Content-Type: application/json' \
    --data-raw '{
        "publicKey": "publicKey",
        "secretKey": "secretKey"
    }'
    java
    import java.io.*;
    import okhttp3.*;
    public class main {
      public static void main(String []args) throws IOException{
        OkHttpClient client = new OkHttpClient().newBuilder()
          .build();
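        // The body is JSON, so declare it as such (matches the Content-Type header in the curl example)
        MediaType mediaType = MediaType.parse("application/json");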
        RequestBody body = RequestBody.create(mediaType, "{\n    \"publicKey\": \"{{public_key}}\",\n    \"secretKey\": \"{{secret_key}}\"\n}");
        Request request = new Request.Builder()
          .url("https://api-server.compdf.com/server/v1/oauth/token")
          .method("POST", body)
          .build();
        Response response = client.newCall(request).execute();
      }
    }
  2. Create Task

    Replace accessToken with the access token obtained in the previous step, and replace language with the language in which you want error information displayed. The response data contains the taskId.

    curl
    curl --location --request GET 'https://api-server.compdf.com/server/v1/task/pdf/xlsx' \
    --header 'Authorization: Bearer accessToken'
    java
    import java.io.*;
    import okhttp3.*;
    public class main {
      public static void main(String []args) throws IOException{
        OkHttpClient client = new OkHttpClient().newBuilder()
          .build();
        // GET requests must not carry a body; OkHttp rejects one with IllegalArgumentException
        Request request = new Request.Builder()
          .url("https://api-server.compdf.com/server/v1/task/pdf/xlsx?language={{language}}")
          .get()
          .addHeader("Authorization", "Bearer {{accessToken}}")
          .build();
        Response response = client.newCall(request).execute();
      }
    }
  3. Upload Files

    Replace file with the file you want to convert, taskId with the task ID obtained in the previous step, language with the language in which you want error information displayed, and accessToken with the access token obtained in the first step.

    curl
    curl --location --request POST 'https://api-server.compdf.com/server/v1/file/upload' \
    --header 'Authorization: Bearer accessToken' \
    --form 'file=@"test.pdf"' \
    --form 'taskId="taskId"' \
    --form 'password=""' \
    --form 'parameter="{ \"contentOptions\": \"2\", \"worksheetOptions\": \"1\",\"isContainAnnot\": 1 , \"isContainImg\":1,\"isAllowOcr\":0,\"isContainOcrBg\":0,\"isOnlyAiTable\":0}"' \
    --form 'language=""'
    java
    import java.io.*;
    import okhttp3.*;
    public class main {
      public static void main(String []args) throws IOException{
        OkHttpClient client = new OkHttpClient().newBuilder()
          .build();
        MediaType mediaType = MediaType.parse("text/plain");
        RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
          .addFormDataPart("file","{{file}}",
                           RequestBody.create(MediaType.parse("application/octet-stream"),
                                              new File("<file>")))
          .addFormDataPart("taskId","{{taskId}}")
          .addFormDataPart("language","{{language}}")
          .addFormDataPart("password","")
          .addFormDataPart("parameter","{  \"contentOptions\": \"2\",  \"worksheetOptions\": \"1\"}")
          .build();
        Request request = new Request.Builder()
          .url("https://api-server.compdf.com/server/v1/file/upload")
          .method("POST", body)
          .addHeader("Authorization", "Bearer {{accessToken}}")
          .build();
        Response response = client.newCall(request).execute();
      }
    }
  4. Process Files

    Replace taskId with the task ID obtained in the Create Task step, accessToken with the access token obtained in the first step, and language with the language in which you want error information displayed.

    curl
    curl --location -g --request GET 'https://api-server.compdf.com/server/v1/execute/start?taskId=taskId' \
    --header 'Authorization: Bearer accessToken'
    java
    import java.io.*;
    import okhttp3.*;
    public class main {
     public static void main(String []args) throws IOException{
       OkHttpClient client = new OkHttpClient().newBuilder()
         .build();
       // GET requests must not carry a body; OkHttp rejects one with IllegalArgumentException
       Request request = new Request.Builder()
         .url("https://api-server.compdf.com/server/v1/execute/start?taskId={{taskId}}&language={{language}}")
         .get()
         .addHeader("Authorization", "Bearer {{accessToken}}")
         .build();
       Response response = client.newCall(request).execute();
     }
    }
  5. Get Task Information

    Replace taskId with the task ID obtained in the Create Task step, and accessToken with the access token obtained in the first step.

    curl
    curl --location -g --request GET 'https://api-server.compdf.com/server/v1/task/taskInfo?taskId=taskId' \
    --header 'Authorization: Bearer accessToken'
    java
    import java.io.*;
    import okhttp3.*;
    public class main {
      public static void main(String []args) throws IOException{
        OkHttpClient client = new OkHttpClient().newBuilder()
          .build();
        // GET requests must not carry a body; OkHttp rejects one with IllegalArgumentException
        Request request = new Request.Builder()
          .url("https://api-server.compdf.com/server/v1/task/taskInfo?taskId={{taskId}}")
          .get()
          .addHeader("Authorization", "Bearer {{accessToken}}")
          .build();
        Response response = client.newCall(request).execute();
      }
    }
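
The five steps above repeat the same base URL and substitute the same placeholders by hand. A minimal sketch that collects the endpoint URLs in one place and URL-encodes the substituted values could look like this. The class name CompdfEndpoints and its method names are hypothetical; only the paths and query parameter names are taken from the examples above, and the language value "1" used in main is a placeholder, not a documented code.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; the paths are copied from the steps above.
final class CompdfEndpoints {

    static final String BASE = "https://api-server.compdf.com/server/v1";

    // Step 1: POST with the publicKey/secretKey JSON body.
    static String token() {
        return BASE + "/oauth/token";
    }

    // Step 2: GET, creates a PDF to Excel task.
    static String createTask(String language) {
        return BASE + "/task/pdf/xlsx?language=" + encode(language);
    }

    // Step 3: POST, multipart form upload.
    static String upload() {
        return BASE + "/file/upload";
    }

    // Step 4: GET, starts processing the uploaded files.
    static String start(String taskId, String language) {
        return BASE + "/execute/start?taskId=" + encode(taskId)
            + "&language=" + encode(language);
    }

    // Step 5: GET, returns information about the task.
    static String taskInfo(String taskId) {
        return BASE + "/task/taskInfo?taskId=" + encode(taskId);
    }

    // Form-style encoding: reserved characters are percent-encoded, spaces become '+'.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String taskId = "taskId";   // placeholder for the value returned by Create Task
        String language = "1";      // placeholder; use the language value your account expects
        System.out.println(token());
        System.out.println(createTask(language));
        System.out.println(upload());
        System.out.println(start(taskId, language));
        System.out.println(taskInfo(taskId));
    }
}
```

Each returned string can be passed to the url(...) call of the matching Request.Builder in the steps above. URLEncoder.encode with a Charset argument requires Java 10 or later.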

Result

File Type    Description
.xlsx        The Excel file produced after the conversion process is completed