    # LEXIS Distributed Data Interface
    
Package implementing data and metadata management for the LEXIS Distributed Data Interface.
For more information, see the documentation.
    
    ## Contains 6 packages
    - irods_ddi
- metadataapi
    - stagingapi
    - transfer
    - unittests
    - user_sync
    
### iRODS DDI

Library of internal functions for Keycloak, OpenSearch, iRODS, data validation, data management, etc.

### Metadata API

FastAPI application for metadata management and searching.

```sh
cd src
uvicorn src.metadataapi.views:app --reload
```
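Once the app is up, FastAPI automatically publishes its schema at `/openapi.json` (and interactive docs at `/docs`). A small stdlib sketch for probing a locally running instance; port 8000 is uvicorn's default and an assumption here, and `openapi_url`/`fetch_openapi` are illustrative helpers, not part of this package:

```python
import json
import urllib.request

def openapi_url(base_url: str) -> str:
    """Build the URL of FastAPI's auto-generated OpenAPI schema."""
    return base_url.rstrip("/") + "/openapi.json"

def fetch_openapi(base_url: str = "http://localhost:8000") -> dict:
    """Fetch the schema from a running instance (requires the server to be up)."""
    with urllib.request.urlopen(openapi_url(base_url)) as resp:
        return json.load(resp)

print(openapi_url("http://localhost:8000"))  # http://localhost:8000/openapi.json
```

The same probe works for the Staging and Transfer APIs below, with the port adjusted to wherever each service is bound.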
     
### Staging API

FastAPI application for asynchronous operations such as data movement between locations, data deletion, dataset listing, etc.

```sh
cd src
uvicorn src.stagingapi.views:app --reload

celery -A stagingapi.tasks worker -E -Q queue -l INFO -n staging_api
```
    
### Transfer API

FastAPI application for data uploading and downloading.

```sh
cd src
uvicorn src.transferapi.views:app --reload

celery -A transferapi.tasks worker -E -Q queue -l INFO -n transfer_api
```
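Downloads are served in fixed-size chunks; the `download_chunk_size` value in the sample `config.yaml` below is 1048576 bytes (1 MiB). A minimal sketch of that pattern, where `iter_chunks` is a hypothetical helper and not the package's actual API:

```python
import io

CHUNK_SIZE = 1048576  # mirrors transfer.download_chunk_size in config.yaml (1 MiB)

def iter_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield successive fixed-size chunks from a binary file object."""
    while True:
        block = fileobj.read(chunk_size)
        if not block:
            break
        yield block

# Demo on an in-memory buffer: 2.5 MiB yields 1 MiB, 1 MiB, 0.5 MiB.
sizes = [len(c) for c in iter_chunks(io.BytesIO(b"x" * (2 * CHUNK_SIZE + CHUNK_SIZE // 2)))]
print(sizes)  # [1048576, 1048576, 524288]
```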
    
### Unit Tests

Contains Python unit tests for the DDI. Run them with:

```sh
cd src
pytest unittests/ -s
```
    
    
    ### User Sync
    
Package that synchronizes users between iRODS and Keycloak.
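
The sample `config.yaml` below drives the sync with `sync.interval: 120` seconds. A sketch of that loop shape; `run_sync_loop` and `sync_once` are illustrative names, not the package's actual API:

```python
import time

def run_sync_loop(sync_once, interval_s=120, max_iterations=None):
    """Call sync_once() every interval_s seconds.

    max_iterations bounds the loop (handy for testing); None runs forever,
    as a long-lived sync daemon would.
    """
    count = 0
    while max_iterations is None or count < max_iterations:
        sync_once()
        count += 1
        if max_iterations is None or count < max_iterations:
            time.sleep(interval_s)

# Demo with a zero interval so it finishes instantly.
calls = []
run_sync_loop(lambda: calls.append(1), interval_s=0, max_iterations=3)
print(len(calls))  # 3
```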
    
    ## Structure
    
Here you can see how the repository is structured and how the packages are connected:
    
    ![Repository structure](ddi-repo-structure.drawio.png)
    
    ## Deployment on Docker
    - Database initialization e.g.: `docker run -it --rm --mount type=bind,src=${PWD}/config/ddi_api2/alembic.ini,dst=/app/src/alembic.ini --mount type=bind,src=${PWD}/config/ddi_api2/config.yaml,dst=/etc/metaapi/config.yaml -w /app/src opencode.it4i.eu:5050/lexis-platform/data/api-v2:devel alembic upgrade head`
    - `alembic.ini`:
    ```conf
    # A generic, single database configuration.
    
    [alembic]
    # path to migration scripts
    script_location = alembic
    
    # template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
    # Uncomment the line below if you want the files to be prepended with date and time
    # see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
    # for all available tokens
    # file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s
    
    # sys.path path, will be prepended to sys.path if present.
    # defaults to the current working directory.
    prepend_sys_path = .
    
    # timezone to use when rendering the date within the migration file
    # as well as the filename.
    # If specified, requires the python>=3.9 or backports.zoneinfo library.
    # Any required deps can installed by adding `alembic[tz]` to the pip requirements
    # string value is passed to ZoneInfo()
    # leave blank for localtime
    # timezone =
    
    # max length of characters to apply to the
    # "slug" field
    # truncate_slug_length = 40
    
    # set to 'true' to run the environment during
    # the 'revision' command, regardless of autogenerate
    # revision_environment = false
    
    # set to 'true' to allow .pyc and .pyo files without
    # a source .py file to be detected as revisions in the
    # versions/ directory
    # sourceless = false
    
    # version location specification; This defaults
    # to alembic/versions.  When using multiple version
    # directories, initial revisions must be specified with --version-path.
    # The path separator used here should be the separator specified by "version_path_separator" below.
    # version_locations = %(here)s/bar:%(here)s/bat:alembic/versions
    
    # version path separator; As mentioned above, this is the character used to split
    # version_locations. The default within new alembic.ini files is "os", which uses os.pathsep.
    # If this key is omitted entirely, it falls back to the legacy behavior of splitting on spaces and/or commas.
    # Valid values for version_path_separator are:
    #
    # version_path_separator = :
    # version_path_separator = ;
    # version_path_separator = space
    version_path_separator = os  # Use os.pathsep. Default configuration used for new projects.
    
    # set to 'true' to search source files recursively
    # in each "version_locations" directory
    # new in Alembic version 1.10
    # recursive_version_locations = false
    
    # the output encoding used when revision files
    # are written from script.py.mako
    # output_encoding = utf-8
    
    sqlalchemy.url = postgresql://<user>:<password>@<db_url>:5432/<db_name>
    
    
    [post_write_hooks]
    # post_write_hooks defines scripts or Python functions that are run
    # on newly generated revision scripts.  See the documentation for further
    # detail and examples
    
    # format using "black" - use the console_scripts runner, against the "black" entrypoint
    # hooks = black
    # black.type = console_scripts
    # black.entrypoint = black
    # black.options = -l 79 REVISION_SCRIPT_FILENAME
    
    # lint with attempts to fix using "ruff" - use the exec runner, execute a binary
    # hooks = ruff
    # ruff.type = exec
    # ruff.executable = %(here)s/.venv/bin/ruff
    # ruff.options = --fix REVISION_SCRIPT_FILENAME
    
    # Logging configuration
    [loggers]
    keys = root,sqlalchemy,alembic
    
    [handlers]
    keys = console
    
    [formatters]
    keys = generic
    
    [logger_root]
    level = WARN
    handlers = console
    qualname =
    
    [logger_sqlalchemy]
    level = WARN
    handlers =
    qualname = sqlalchemy.engine
    
    [logger_alembic]
    level = INFO
    handlers =
    qualname = alembic
    
    [handler_console]
    class = StreamHandler
    args = (sys.stderr,)
    level = NOTSET
    formatter = generic
    
    [formatter_generic]
    format = %(levelname)-5.5s [%(name)s] %(message)s
    datefmt = %H:%M:%S
    ```
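Before running `alembic upgrade head`, it can be useful to confirm that the mounted `alembic.ini` carries the database DSN you expect. Alembic's ini file is standard `configparser` format, so a stdlib check is enough; the file content below is a trimmed stand-in with example credentials, not the real mounted file:

```python
import configparser

# Trimmed stand-in for config/ddi_api2/alembic.ini (mounted into the
# container at /app/src/alembic.ini); real deployments substitute their
# own <user>, <password>, <db_url> and <db_name>.
ini_text = """\
[alembic]
script_location = alembic
sqlalchemy.url = postgresql://user:password@db-host:5432/lexis
"""

cfg = configparser.ConfigParser()
cfg.read_string(ini_text)
url = cfg.get("alembic", "sqlalchemy.url")
print(url)  # postgresql://user:password@db-host:5432/lexis
```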
    - `config.yaml`:
    ```yaml
    local_info:
      base_path: "/staging-area"
      worker_password: "******"
      url_prefix: "/ddiapi/v2"
    
    # only for irods sync script
    irods:
      user: 'rods'
      host: 'irods.**.msad.it4i.lexis.tech'
      port: 1247
      password: '**'
      zone: 'testZone'
      project: 'TestProject10' #used in tests
    
    # sync script
    keycloak:
      admin_url: 'http://keycloak.*.msad.it4i.lexis.tech'
      client_id: '*_SYNC'
      client_secret: '**'
      realm_name: '**'
    
    userorg:
      url: "http://userorg:8080/api"
      info-endpoint: "/UserInfo"
      sync-details: "/Service/SyncDetails"
      get-extended-project-enpoint: "/Project/ExtendedProjectInfo"
      get_project_resource: "/ProjectResource"
      provider-service-key: "**"
      data-staging-info: "/DataStaging/WorkerData"
    
    # staging api
    keycloak_token:
      microservice:
        - "https://irods.*.msad.it4i.lexis.tech"
      CLIENT_ID: "ddi"
      CLIENT_SECRET: "**"
      KEYCLOAK_REALM: "https://aai.lexis.tech/auth/realms/**"
      KEYCLOAK_URL: "https://aai.lexis.tech/auth/"
      KEYCLOAK_REALM_NAME: "***"
      user: 'test_project_owner'
      password: '**'
    
    sync:
      interval: 120 #seconds
      log: DEBUG
    
    opensearch:
      HOST: 'opensearch.**.msad.it4i.lexis.tech'
      PORT: 9200
      USER: 'admin'
      PASSWORD: 'admin'
      INDEX: 'dev_datacite_index'
      WINDOW_SIZE: 500
    
    # NEW STAGING API
    
    celery:
      BROKER_URL: 'amqp://*:*@rabbitmq:5672/'
      RESULT_BACKEND: 'db+postgresql://<user>:<password>@<db_url>:5432/<db_name>'
      TASK_SERIALIZER: 'json'
      RESULT_SERIALIZER: 'json'
      SEND_TASK_SENT_EVENT: True
      ACCEPT_CONTENT: ['application/json']
    
    metadataapi:
      url: 'http://metaapi:8081'
    
    result_backend:
      connection_string: 'postgresql://<user>:<password>@<db_url>:5432/<db_name>'
    
    transfer:
      tus_location: 'http://api.*.msad.it4i.lexis.tech/ddiapi/v2/transfer/upload'
      tus_directory: '/tus_folder'
      max_size: 128849018880
      queue_name: 'transfer_queue'
      download_chunk_size: 1048576
    
    unittests:
      credentials:
        user: 'test_project_owner'
        password: '**'
      transfer:
        url: 'http://localhost:8080/'
      irods:
        zone: 'testZone'
        project: 'resourcetest'
        location: 'iRODS DEV'
        resource: 'iRODS Dev1'
      staging:
        url: 'http://localhost:8082/'
        location: 'IT4I Staging Area'
        resource: 'Staging area on IT4T'
    ```
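A misconfigured `config.yaml` typically surfaces only when a service first touches the missing section, so a quick sanity check after loading it (e.g. with `yaml.safe_load`) can save a restart cycle. A minimal sketch using a plain dict in place of the parsed YAML; the section names come from the sample file above, and `missing_sections` is an illustrative helper, not part of this package:

```python
# Top-level sections that the services documented above read.
REQUIRED_SECTIONS = {
    "local_info", "irods", "keycloak", "userorg", "keycloak_token",
    "opensearch", "celery", "metadataapi", "transfer",
}

def missing_sections(config: dict) -> set:
    """Return the required top-level sections absent from a parsed config."""
    return REQUIRED_SECTIONS - config.keys()

# A config missing "celery" and "transfer" would be flagged:
partial = {name: {} for name in REQUIRED_SECTIONS - {"celery", "transfer"}}
print(sorted(missing_sections(partial)))  # ['celery', 'transfer']
```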