Updated Resilipipe to be more independent of deployment specifics like S3 credentials or Spark deployment. Expanded test cases.
Changed files:
- Makefile: 2 additions, 4 deletions
- README.md: 11 additions, 39 deletions
- resilipipe/pyproject.toml: 1 addition, 0 deletions
- resilipipe/resilipipe/conf/config.py: 8 additions, 26 deletions
- resilipipe/resilipipe/conf/data_centers.yaml: 0 additions, 8 deletions
- resilipipe/resilipipe/conf/minio.yaml: 0 additions, 8 deletions
- resilipipe/resilipipe/jobs/spark.py: 95 additions, 41 deletions
- resilipipe/resilipipe/log/README.md: 34 additions, 0 deletions
- resilipipe/resilipipe/log/info_logging.py: 11 additions, 0 deletions
- resilipipe/resilipipe/log/statistics.py: 67 additions, 324 deletions
- resilipipe/resilipipe/parse/README.md: 2 additions, 0 deletions
- resilipipe/resilipipe/parse/modules/collection_indices.py: 22 additions, 19 deletions
- resilipipe/resilipipe/parse/modules/curlielabels.py: 4 additions, 1 deletion
- resilipipe/resilipipe/parse/warc_preprocessing.py: 60 additions, 54 deletions
- scripts/prepare.sh: 11 additions, 23 deletions
- scripts/run_preprocessor.sh: 1 addition, 1 deletion
- scripts/submit_spark.sh: 0 additions, 82 deletions
- tests/jobs/__init__.py: 0 additions, 0 deletions
- tests/jobs/test_spark.py: 217 additions, 0 deletions
- tests/parse/test_warc_preprocessing.py: 2 additions, 2 deletions