    Table Recognition and Extraction

    by Tian Qing 02/28/2018 04:42 PM GMT

          The goal of this task is to extract tables from the Yearbook PDFs and structure them into readable data sets (CSV) that governments, researchers, and others can use in the future.

          Initial Phase: 

          1. Download the PDFs.

          2. Use R or Python packages to extract the tables from the PDFs.

          3. Transform the extracted tables into data frames.
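
The three initial steps could be sketched in Python roughly as follows. This is a minimal illustration, not the team's actual implementation: it assumes the third-party `pdfplumber` package for table detection (the proposal names no specific package), and the helper names `download_pdf`, `extract_raw_tables`, `rows_to_records`, and `records_to_csv` are hypothetical. Records are kept as plain dictionaries so the sketch runs with the standard library alone; the same records could be loaded into a data frame afterwards.

```python
import csv
import io
import urllib.request


def download_pdf(url, dest_path):
    """Step 1: save one PDF to disk. Both arguments are supplied by the caller."""
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


def extract_raw_tables(pdf_path):
    """Step 2: yield (page_number, rows) for each table detected in the PDF.

    Assumes pdfplumber is installed; it is imported lazily so the other
    helpers work without it.
    """
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for rows in page.extract_tables():
                yield page.page_number, rows


def rows_to_records(rows):
    """Step 3: treat the first row as the header and return a list of dicts."""
    # Empty or missing header cells get a positional placeholder name.
    header = [(cell or "").strip() or f"col_{i}" for i, cell in enumerate(rows[0])]
    records = []
    for row in rows[1:]:
        cells = [(cell or "").strip() for cell in row]
        # Pad short rows and truncate long ones to match the header width.
        cells = (cells + [""] * len(header))[: len(header)]
        records.append(dict(zip(header, cells)))
    return records


def records_to_csv(records):
    """Serialize the records to CSV text, using the first record's keys as columns."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Note that duplicate header names would collapse into a single dictionary key in this sketch, so real Yearbook tables with repeated column labels would need extra handling.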

          Cleaning Phase:

          1. Remove or correct rows and columns that contain incorrect values.

          2. Identify each table by its page number and the name of its source PDF.

          3. Normalize similar tables into a common structure to avoid redundancy if the data is stored in a UN database.
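
The cleaning steps might look like the sketch below. It rests on several assumptions: "incorrect" rows are reduced here to fully empty rows, normalization is reduced to renaming header variants to one canonical name, and the function names, the alias mapping, and the example file name `yearbook_1950.pdf` are all hypothetical. The real rules would depend on the actual Yearbook tables.

```python
def drop_empty_rows(records):
    """Cleaning step 1 (simplified): discard rows whose cells are all blank."""
    return [r for r in records if any(str(v).strip() for v in r.values())]


def tag_source(records, pdf_name, page_no):
    """Cleaning step 2: attach the source PDF name and page number to every row."""
    return [{"source_pdf": pdf_name, "page": page_no, **r} for r in records]


def normalize_columns(records, column_aliases):
    """Cleaning step 3 (simplified): rename header variants to canonical names.

    column_aliases maps a lower-cased, stripped header to its canonical name;
    headers without an alias are kept unchanged.
    """
    return [
        {column_aliases.get(key.lower().strip(), key): value for key, value in r.items()}
        for r in records
    ]


# Usage with made-up data: one valid row and one blank row.
raw = [{"Country ": "Norway", "1950": "3.2"}, {"Country ": "", "1950": " "}]
aliases = {"country": "country", "nation": "country"}  # hypothetical mapping
cleaned = normalize_columns(
    tag_source(drop_empty_rows(raw), "yearbook_1950.pdf", 12), aliases
)
```

Tagging each row with its source lets tables from different PDFs share one structure while remaining traceable to the page they came from.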


          Co-authors to your solution

          Gokulakrishnan Narasimhan, Guilherme Silveira, Meijie Li, Guangyue Li

          Initial idea,#StatsHistory

            User Tasks (required for graduation)
            Task: Approval; Due Date: 06/15/2018; Status: Completed on 05/04/2018