Skip to content
Snippets Groups Projects
  1. Oct 14, 2021
  2. Oct 13, 2021
  3. Oct 07, 2021
  4. Sep 07, 2021
    • Daniel Ecer's avatar
      check bounding box; fixed image cache key (#400) · f69743c2
      Daniel Ecer authored
      * calculate structural similarity using skimage
      
      * fixed crop_image_to_bounding_box
      
      * output score in JSON
      
      * optionally output images with bounding boxes
      
      * display bbox label inside if not enough space above
      
      * sort by score, then key points
      
      * fixed cache issue by using explicit cache key prefix
      
      (otherwise ids may have been reused after memory being freed)
      f69743c2
  5. Sep 03, 2021
    • Daniel Ecer's avatar
      figure image bounding box annotation for single document (#389) · 0bf9e780
      Daniel Ecer authored
      * enable debug logging for tests
      
      * added cli scaffolding
      
      * extract images from pdf
      
      * fixed type hint
      
      * added bounding box to_list
      
      * converted bounding box to named tuple
      
      * added tests for validate
      
      * implemented bounding box intersection
      
      * implemented finding bounding boxes of single image
      
      * added test for smaller partial image
      
      * added libgl1 for open cv
      
      * linting: use with statement for Popen
      
      * added support for multiple image files
      
      * added support for xml files
      
      * join graphic href with xml dirname
      
      * renamed cv2 to cv
      
      * using ObjectDetectorMatcher
      
      * moved funtions to image object matching module
      
      * added TestGetObjectMatch
      
      * added ImageObjectMarchResult
      
      * added test_should_match_smaller_image
      
      * added test_should_match_smaller_rotated_90_image
      
      * fixed typo ImageObjectMatchResult
      
      * moved object_detector_matcher parameter down
      
      * added get_image_list_object_match
      
      * added su...
      0bf9e780
  6. Aug 26, 2021
  7. Aug 02, 2021
  8. Jul 19, 2021
  9. Jul 14, 2021
  10. Jul 07, 2021
  11. Jun 23, 2021
  12. Jun 14, 2021
  13. May 25, 2021
  14. May 24, 2021
  15. May 18, 2021
    • Daniel Ecer's avatar
      create vocabulary (#349) · e3ec9802
      Daniel Ecer authored
      * initial create vocabulary utility
      
      * extract vocabulary from embeddings
      
      * renamed to --output-word-count-file
      
      * added main call
      
      * extracted iter_tokenized_tokens
      
      * avoid empty tokens
      
      * using tokenizer from delft
      
      * optionally sort by count
      
      * added file list support
      
      * added support for remote files
      
      * added limit argument
      
      * added fsspec dependency
      
      * optionally use multi threading or processing
      
      * included full github link
      
      * renamed to create_vocabulary
      
      * moved to tools vocabulary
      
      * filter embeddings
      
      * renamed to embeddings
      
      * using fsspec to open embeddings file when extracting
      
      * use fsspec when filtering embeddings
      
      * document tools
      
      * added link to tools.md
      e3ec9802
  16. May 13, 2021
  17. May 12, 2021
  18. May 06, 2021
  19. Apr 22, 2021
  20. Apr 14, 2021
  21. Apr 08, 2021
  22. Apr 06, 2021
  23. Mar 22, 2021
  24. Mar 08, 2021
  25. Feb 26, 2021
  26. Feb 22, 2021