  1. Mar 29, 2022
  2. Jan 11, 2022
  3. Jan 07, 2022
  4. Jan 04, 2022
  5. Dec 15, 2021
  6. Dec 14, 2021
  7. Oct 21, 2021
  8. Oct 19, 2021
    • use pdf user coordinates (#434) · 6cbce9b0
      Daniel Ecer authored
      * added pdfminer.six dependency
      
      * output pt_bbox and coords using pt_bbox
      
      * increased version to 0.0.2
      
      * avoid failing cv operation with float bounding box
      
      * configure pdfminer logging level (local only)
      
      * make canny filter configurable (disable by default)
      
      * make min template match score configurable
      
      * move constants to the beginning of the module
      
      * changed default min template match score to 0.6
      
      * log template matching progress
  9. Oct 18, 2021
  10. Oct 14, 2021
  11. Oct 13, 2021
  12. Oct 11, 2021
  13. Oct 07, 2021
  14. Sep 20, 2021
  15. Sep 15, 2021
  16. Sep 14, 2021
    • added bounding box pipeline (#403) · 603edca9
      Daniel Ecer authored
      * process pdf and xml file lists
      
      * allow sub directories in output path
      
      * make output_annotated_images_path relative to output
      
      * make sure that output directory is created
      
      * use write_bytes in favour of explicit makedirs (cloud ready)
      
      * make --pdf-base-path required
      
      * added pipeline
      
      * not extending ABC due to serialization errors
      
      * log pipeline options
      
      * added test to check serialization
      
      * changed super import to avoid one of the serialization errors in Dataflow
      
      * moved most functionality to separate module
      
      * added libgl1 to setup.py
      
      * added PreventFusion
      
      * minor import grouping
      
      * added TransformAndCount
      
      * reverted super __init__ call
      
      * use parse args, not ignoring unknown args
      
      * expose all of the worker arguments
  17. Sep 09, 2021
  18. Sep 08, 2021
    • improve bounding box accuracy, use second pass (#401) · 022a6549
      Daniel Ecer authored
      * specify cache key for object keypoints
      
      * made image id required for lower level functions
      
      * added logging to find bounding boxes
      
      * 2nd iteration to find bounding box without bounding box
      
      * use fixed size when calculating image similarity
  19. Sep 07, 2021
    • check bounding box; fixed image cache key (#400) · f69743c2
      Daniel Ecer authored
      * calculate structural similarity using skimage
      
      * fixed crop_image_to_bounding_box
      
      * output score in JSON
      
      * optionally output images with bounding boxes
      
      * display bbox label inside if not enough space above
      
      * sort by score, then key points
      
      * fixed cache issue by using explicit cache key prefix
      
      (otherwise ids may have been reused after memory being freed)
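The cache fix in the commit above (an explicit cache key prefix instead of relying on object ids) can be sketched roughly as follows. This is a minimal illustration, not the project's code: `KeyedCache`, `get_or_compute`, the `'keypoints'` prefix and the image ids are hypothetical names. The underlying point is that CPython's `id()` is only unique among objects that are alive at the same time, so a cache keyed by it may return stale entries once an object has been freed.

```python
from typing import Callable, Dict, Tuple


class KeyedCache:
    """Cache results under a caller-supplied key instead of id(obj).

    Hypothetical helper for illustration only.
    """

    def __init__(self) -> None:
        self._values: Dict[Tuple[str, str], object] = {}
        self.miss_count = 0

    def get_or_compute(
        self, prefix: str, image_id: str, compute: Callable[[], object]
    ) -> object:
        # The (prefix, image_id) pair stays stable for the lifetime of the
        # cache, unlike id(obj), which can be reused after garbage collection.
        key = (prefix, image_id)
        if key not in self._values:
            self.miss_count += 1
            self._values[key] = compute()
        return self._values[key]


cache = KeyedCache()
first = cache.get_or_compute('keypoints', 'page-1', lambda: [1, 2, 3])
# Same prefix and image id: the cached value is returned, compute is not called.
second = cache.get_or_compute('keypoints', 'page-1', lambda: [4, 5, 6])
# A different image id is a separate entry.
other = cache.get_or_compute('keypoints', 'page-2', lambda: [7, 8])
```

Requiring the caller to pass an image id, as the later commit #401 does for lower level functions, fits this pattern: the key is chosen explicitly rather than derived from object identity.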
  20. Sep 03, 2021
    • raise error when figure bounding box could not be found (#399) · 46e14b3c
      Daniel Ecer authored
      * raise GraphicImageNotFoundError
      
      * allow skipping errors
    • figure image bounding box annotation for single document (#389) · 0bf9e780
      Daniel Ecer authored
      * enable debug logging for tests
      
      * added cli scaffolding
      
      * extract images from pdf
      
      * fixed type hint
      
      * added bounding box to_list
      
      * converted bounding box to named tuple
      
      * added tests for validate
      
      * implemented bounding box intersection
      
      * implemented finding bounding boxes of single image
      
      * added test for smaller partial image
      
      * added libgl1 for open cv
      
      * linting: use with statement for Popen
      
      * added support for multiple image files
      
      * added support for xml files
      
      * join graphic href with xml dirname
      
      * renamed cv2 to cv
      
      * using ObjectDetectorMatcher
      
      * moved functions to image object matching module
      
      * added TestGetObjectMatch
      
      * added ImageObjectMarchResult
      
      * added test_should_match_smaller_image
      
      * added test_should_match_smaller_rotated_90_image
      
      * fixed typo ImageObjectMatchResult
      
      * moved object_detector_matcher parameter down
      
      * added get_image_list_object_match
      
      * added su...
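Several items in the commit above concern the bounding box type (named tuple, `to_list`, intersection). A minimal sketch of what such a type could look like is shown here. The field names (`x`, `y`, `width`, `height`) and the method signatures are assumptions for illustration; the actual class in the repository may differ.

```python
from typing import List, NamedTuple


class BoundingBox(NamedTuple):
    # Hypothetical field layout: top-left corner plus size.
    x: float
    y: float
    width: float
    height: float

    def to_list(self) -> List[float]:
        # Plain list form, e.g. for JSON output.
        return [self.x, self.y, self.width, self.height]

    def intersection(self, other: 'BoundingBox') -> 'BoundingBox':
        # The overlap starts at the larger of the two left/top edges
        # and ends at the smaller of the two right/bottom edges.
        x = max(self.x, other.x)
        y = max(self.y, other.y)
        right = min(self.x + self.width, other.x + other.width)
        bottom = min(self.y + self.height, other.y + other.height)
        # Clamp to zero so that non-overlapping boxes yield an empty box.
        return BoundingBox(x, y, max(0, right - x), max(0, bottom - y))


overlap = BoundingBox(0, 0, 10, 10).intersection(BoundingBox(5, 5, 10, 10))
disjoint = BoundingBox(0, 0, 1, 1).intersection(BoundingBox(5, 5, 1, 1))
```

A named tuple keeps the box immutable and comparable by value, which is convenient in tests that compare expected and actual boxes.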