- Oct 13, 2021
Daniel Ecer authored
* replace dagger html entity
* strip extra spaces at beginning of xml document
-
- Oct 11, 2021
Daniel Ecer authored
-
- Sep 20, 2021
Daniel Ecer authored
* renamed --skip-errors to --ignore-unmatched-graphics
* implemented --skip-errors
* run test_should_annotate_using_jats_xml using beam
* fixed MapOrLog usage when skipping errors
-
- Sep 15, 2021
Daniel Ecer authored
-
Daniel Ecer authored
* moved bbox main to separate module
* renamed test module to match main module under test
-
Daniel Ecer authored
* conditionally write dummy copy of xml file
* renamed to output_json_path
* added coords attribute
* using namespace for coords attribute
* add namespace to nsmap
-
- Sep 14, 2021
Daniel Ecer authored
* process pdf and xml file lists
* allow sub directories in output path
* make output_annotated_images_path relative to output
* make sure that output directory is created
* use write_bytes in favour of explicit makedirs (cloud ready)
* make --pdf-base-path required
* added pipeline
* not extending ABC due to serialization errors
* log pipeline options
* added test to check serialization
* changed super import to avoid one of the serialization errors in Dataflow
* moved most functionality to separate module
* added libgl1 to setup.py
* added PreventFusion
* minor import grouping
* added TransformAndCount
* reverted super __init__ call
* use parse args, not ignoring unknown args
* expose all of the worker arguments
-
- Sep 07, 2021
Daniel Ecer authored
* calculate structural similarity using skimage
* fixed crop_image_to_bounding_box
* output score in JSON
* optionally output images with bounding boxes
* display bbox label inside if not enough space above
* sort by score, then key points
* fixed cache issue by using explicit cache key prefix (otherwise ids may have been reused after memory being freed)
-
- Sep 03, 2021
Daniel Ecer authored
* raise GraphicImageNotFoundError
* allow skipping errors
-
Daniel Ecer authored
* enable debug logging for tests
* added cli scaffolding
* extract images from pdf
* fixed type hint
* added bounding box to_list
* converted bounding box to named tuple
* added tests for validate
* implemented bounding box intersection
* implemented finding bounding boxes of single image
* added test for smaller partial image
* added libgl1 for open cv
* linting: use with statement for Popen
* added support for multiple image files
* added support for xml files
* join graphic href with xml dirname
* renamed cv2 to cv
* using ObjectDetectorMatcher
* moved functions to image object matching module
* added TestGetObjectMatch
* added ImageObjectMarchResult
* added test_should_match_smaller_image
* added test_should_match_smaller_rotated_90_image
* fixed typo ImageObjectMatchResult
* moved object_detector_matcher parameter down
* added get_image_list_object_match
* added su...
-