- Sep 15, 2021
Daniel Ecer authored
* moved bbox main to separate module
* renamed test module to match main module under test
-
Daniel Ecer authored
* conditionally write dummy copy of xml file
* renamed to output_json_path
* added coords attribute
* using namespace for coords attribute
* add namespace to nsmap
-
- Sep 14, 2021
Daniel Ecer authored
* process pdf and xml file lists
* allow sub directories in output path
* make output_annotated_images_path relative to output
* make sure that output directory is created
* use write_bytes in favour of explicit makedirs (cloud ready)
* make --pdf-base-path required
* added pipeline
* not extending ABC due to serialization errors
* log pipeline options
* added test to check serialization
* changed super import to avoid one of the serialization errors in Dataflow
* moved most functionality to separate module
* added libgl1 to setup.py
* added PreventFusion
* minor import grouping
* added TransformAndCount
* reverted super __init__ call
* use parse args, not ignoring unknown args
* expose all of the worker arguments
-
- Sep 07, 2021
Daniel Ecer authored
* calculate structural similarity using skimage
* fixed crop_image_to_bounding_box
* output score in JSON
* optionally output images with bounding boxes
* display bbox label inside if not enough space above
* sort by score, then key points
* fixed cache issue by using explicit cache key prefix (otherwise ids may have been reused after memory being freed)
-
- Sep 03, 2021
Daniel Ecer authored
* raise GraphicImageNotFoundError
* allow skipping errors
-
Daniel Ecer authored
* enable debug logging for tests
* added cli scaffolding
* extract images from pdf
* fixed type hint
* added bounding box to_list
* converted bounding box to named tuple
* added tests for validate
* implemented bounding box intersection
* implemented finding bounding boxes of single image
* added test for smaller partial image
* added libgl1 for open cv
* linting: use with statement for Popen
* added support for multiple image files
* added support for xml files
* join graphic href with xml dirname
* renamed cv2 to cv
* using ObjectDetectorMatcher
* moved functions to image object matching module
* added TestGetObjectMatch
* added ImageObjectMarchResult
* added test_should_match_smaller_image
* added test_should_match_smaller_rotated_90_image
* fixed typo ImageObjectMatchResult
* moved object_detector_matcher parameter down
* added get_image_list_object_match
* added su...
-
- Aug 26, 2021
Daniel Ecer authored
* added mypy dependency
* added dev-mypy
* added mypy make target
* declare EMPTY class prop
* removed incorrect tensors type hint
* added type hint to excluded_tokens
* removed unused ProcessedWrapper
* replacing backports.tempfile with builtin
* added T_ArgumentParserOrGroup
* fixed iter_tokenized_tokens return type hint
* added types-requests
* added type to DEFAULT_ANNOTATORS
* removed blank line
* ignore distutils import
* replaced T_Element with etree.ElementBase
* changed type check back to etree._Element
* replaced project_tests.sh
* removed second mypy make target dependency
-
- May 18, 2021
Daniel Ecer authored
* initial create vocabulary utility
* extract vocabulary from embeddings
* renamed to --output-word-count-file
* added main call
* extracted iter_tokenized_tokens
* avoid empty tokens
* using tokenizer from delft
* optionally sort by count
* added file list support
* added support for remote files
* added limit argument
* added fsspec dependency
* optionally use multi threading or processing
* included full github link
* renamed to create_vocabulary
* moved to tools vocabulary
* filter embeddings
* renamed to embeddings
* using fsspec to open embeddings file when extracting
* use fsspec when filtering embeddings
* document tools
* added link to tools.md
-
- Jan 17, 2020
dependabot-preview[bot] authored
* Bump pylint from 2.3.1 to 2.4.4
  Bumps [pylint](https://github.com/PyCQA/pylint) from 2.3.1 to 2.4.4.
  - [Release notes](https://github.com/PyCQA/pylint/releases)
  - [Changelog](https://github.com/PyCQA/pylint/blob/master/ChangeLog)
  - [Commits](https://github.com/PyCQA/pylint/compare/pylint-2.3.1...pylint-2.4.4)
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* removed explicit astroid dependency
* linting
* more flake8 linting
* explicit pylint dependency
Co-authored-by: Daniel Ecer <de-code@users.noreply.github.com>
-
- Jun 03, 2019
Daniel Ecer authored
-