Introduction

Malleable All-seeing Journal Of Research Artifacts

Majora is a Django-based wet-and-dry information management system. Majora is being rapidly developed as part of the COVID-19 Genomics UK Consortium (COG-UK) response to the outbreak of SARS-CoV-2.

Majora is a system that stores metadata on biological samples, sequencing runs, bioinformatics pipelines and files. These items are referred to collectively as "artifacts". Majora is composed of three main parts:

This documentation aims to cover every field of each artifact and process that can be added to, updated in, or retrieved from Majora. It is intended primarily for users who wish to write a program against the API and for users of the Ocarina command line tool, but it should also be useful for users of the CGPS metadata uploader. Users of the uploader will likely also want to refer to the documentation for the metadata uploader.

You may be interested to know that this API documentation page was created with Slate.

Important notes

Authentication

Biosamples

Add one or more biosamples to Majora

/artifact/biosample/add/

Attributes

{
    "biosamples": [
        {
            "adm1": "UK-ENG",
            "adm2": "Birmingham",
            "adm2_private": "B20",
            "admitted_date": null,
            "admitted_hospital_name": null,
            "admitted_hospital_trust_or_board": null,
            "admitted_with_covid_diagnosis": null,
            "anonymised_care_home_code": null,
            "biosample_source_id": "ABC12345",
            "central_sample_id": "BIRM-12345",
            "collecting_org": "Hypothetical University of Hooting",
            "collection_date": "2020-06-03",
            "collection_pillar": "2",
            "employing_hospital_name": null,
            "employing_hospital_trust_or_board": null,
            "is_care_home_resident": null,
            "is_care_home_worker": null,
            "is_hcw": null,
            "is_hospital_patient": null,
            "is_icu_patient": null,
            "is_surveillance": "Y",
            "metadata": {
                "epi": {
                    "epi_cluster": "CLUSTER8"
                },
                "investigation": {
                    "investigation_cluster": "Ward 0",
                    "investigation_name": "West Midlands HCW",
                    "investigation_site": "QEHB"
                }
            },
            "metrics": {
                "ct": {
                    "records": {
                        "ct_value": "25",
                        "test_kit": "INHOUSE",
                        "test_platform": "INHOUSE",
                        "test_target": "ORF8"
                    }
                }
            },
            "received_date": "2020-06-04",
            "root_sample_id": "PHA12345",
            "sample_type_collected": "swab",
            "sample_type_received": "primary",
            "sender_sample_id": "LAB12345",
            "source_age": "29",
            "source_sex": "F",
            "swab_site": "nose-throat"
        }
    ],
    "token": "6e06392f-e030-4cf9-911a-8dc9f2d4e714",
    "username": "majora-sam"
}
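As a rough sketch of how a program might assemble this request body in Python: the helper name `build_biosample_request` is hypothetical, and the field names are copied from the example above. How the body is sent (base URL, HTTP headers) is not described in this section, so it is deliberately left out.

```python
import json


def build_biosample_request(username, token, biosamples):
    """Wrap a list of biosample dicts in the envelope shown in the example.

    Hypothetical helper: only the three top-level keys from the example
    (username, token, biosamples) are produced.
    """
    return {
        "username": username,
        "token": token,
        "biosamples": list(biosamples),
    }


body = build_biosample_request(
    "majora-sam",
    "6e06392f-e030-4cf9-911a-8dc9f2d4e714",
    [
        {
            "adm1": "UK-ENG",
            "central_sample_id": "BIRM-12345",
            "collection_date": "2020-06-03",
            "is_surveillance": "Y",
        }
    ],
)

# Serialise with sorted keys, matching the layout of the example payload.
print(json.dumps(body, indent=4, sort_keys=True))
```

Keeping the envelope separate from the per-sample dictionaries makes it straightforward to submit several biosamples in one request, which the endpoint title ("one or more biosamples") suggests is supported.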

Minimal Ocarina command with mandatory parameters:

ocarina put biosample \
    --adm1 UK-ENG \
    --central-sample-id BIRM-12345 \
    --collection-date 2020-06-03 \
    --is-surveillance Y 

Full Ocarina command example:

ocarina put biosample \
    --adm1 UK-ENG \
    --central-sample-id BIRM-12345 \
    --collection-date 2020-06-03 \
    --is-surveillance Y \
    --received-date 2020-06-04 \
    --adm2 Birmingham \
    --source-age 29 \
    --source-sex F \
    --adm2-private B20 \
    --biosample-source-id ABC12345 \
    --collecting-org 'Hypothetical University of Hooting' \
    --collection-pillar 2 \
    --root-sample-id PHA12345 \
    --sample-type-collected swab \
    --sample-type-received primary \
    --sender-sample-id LAB12345 \
    --swab-site nose-throat 

Attributes currently unsupported by Ocarina: admitted_date, admitted_hospital_name, admitted_hospital_trust_or_board, admitted_with_covid_diagnosis, anonymised_care_home_code, employing_hospital_name, employing_hospital_trust_or_board, is_care_home_resident, is_care_home_worker, is_hcw, is_hospital_patient, is_icu_patient

Function not currently implemented in Ocarina Python API

Documentation for this function can be found on the CGPS uploader website linked below:
https://metadata.docs.cog-uk.io/bulk-upload-1/bulk-upload

There may be some differences between this specification and the uploader, particularly for providing Metrics and Metadata. See the Metadata and Metrics sections below for column names that are compatible with the API spec.

Name Description Options
adm1
string, required, enum
Code of UK home nation of the patient from which the sample was collected
  • UK-ENG
  • UK-NIR
  • UK-SCT
  • UK-WLS
central_sample_id
string, required
The centrally shared ID that you will use to refer to this sample inside the consortium.
    collection_date
    string, required
    Provide where possible. When collection_date cannot be provided, you must provide received_date instead.
      is_surveillance
      string, required, enum
      Whether this sample was collected under the COG-UK surveillance protocol.
      • N
      • Y
      received_date
      string, possibly required
      Date the sample was first received by any lab. This date should be as close as possible to collection_date. This date must be provided if collection_date is missing.
        adm2
        string, recommended
        The city or county that the patient lives in (avoid abbreviations or shorthand)
          source_age
          integer, recommended
          Ages should be whole numbers. Neonates should be entered as 0.
            source_sex
            string, recommended, enum
            • F
            • M
            • Other
            adm2_private
            string
            The outer postcode for the patient's home address (first half of the postcode only)
              admitted_date
              string
              If is_hospital_patient, the date (YYYY-MM-DD) that the patient was admitted to hospital
                admitted_hospital_name
                string
                If is_hospital_patient, provide the name of the hospital. If you do not know the name, use HOSPITAL
                  admitted_hospital_trust_or_board
                  string
                  If is_hospital_patient, provide the name of the trust or board that administers the hospital the patient was admitted to.
                    admitted_with_covid_diagnosis
                    string, enum
                    If is_hospital_patient, whether the patient was admitted with a COVID diagnosis
                    • (blank)
                    • N
                    • Y
                    anonymised_care_home_code
                    string
                    A code to represent a particular care home. The mapping of this code to the care home should be kept securely by your organisation. You must take care to select a code that cannot be linked to the identity of the care home.
                      biosample_source_id
                      string
                      A unique identifier of patient or environmental sample. If you have multiple samples from the same patient, enter the FIRST central_sample_id assigned to one of their samples here.
                        collecting_org
                        string
                        The site (e.g. hospital or surgery) that originally collected this sample.
                          collection_pillar
                          integer, enum
                          The pillar under which this sample was collected (e.g. 1, 2). This is likely 1, but leave blank if unsure.
                          • 1
                          • 2
                          • 103
                          • 34613
                          employing_hospital_name
                          string
                          If is_hcw, provide the name of the employing hospital. If you do not know the name, use HOSPITAL
                            employing_hospital_trust_or_board
                            string
                            If is_hcw, provide the name of the employing trust or board.
                              is_care_home_resident
                              string, enum
                              • (blank)
                              • N
                              • Y
                              is_care_home_worker
                              string, enum
                              • (blank)
                              • N
                              • Y
                              is_hcw
                              string, enum
                              Whether the sample was collected from a healthcare worker. This includes hospital-associated workers.
                              • (blank)
                              • N
                              • Y
                              is_hospital_patient
                              string, enum
                              • (blank)
                              • N
                              • Y
                              is_icu_patient
                              string, enum
                              • (blank)
                              • N
                              • Y
                              root_sample_id
                              string
                              Identifier assigned to this sample by one of the health agencies (e.g. PHE samples will be prefixed with H20). This is necessary for linking samples to private patient metadata later.
                                sample_type_collected
                                string, enum
                                • BAL
                                • aspirate
                                • dry swab
                                • sputum
                                • swab
                                sample_type_received
                                string, enum
                                • culture
                                • extract
                                • lysate
                                • primary
                                sender_sample_id
                                string
                                If you are permitted, provide the identifier that was sent by your laboratory to SGSS here.
                                  swab_site
                                  string, enum
                                  Required if sample_type_collected is swab
                                  • endotracheal
                                  • nose
                                  • nose-throat
                                  • rectal
                                  • throat
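The conditional rules in the table above (received_date as a fallback for collection_date, swab_site required for swabs) can be checked before submission. The following is a minimal sketch only: the function name `check_biosample` is hypothetical, it covers only a few of the attributes, and the server's own validation may differ.

```python
from datetime import datetime

# Enum values copied from the attribute table.
ADM1_CODES = {"UK-ENG", "UK-NIR", "UK-SCT", "UK-WLS"}


def check_biosample(sample):
    """Return a list of problems found in one biosample dict (empty if none)."""
    problems = []
    if sample.get("adm1") not in ADM1_CODES:
        problems.append("adm1 must be one of " + ", ".join(sorted(ADM1_CODES)))
    if not sample.get("central_sample_id"):
        problems.append("central_sample_id is required")
    if sample.get("is_surveillance") not in {"Y", "N"}:
        problems.append("is_surveillance must be Y or N")
    # collection_date is preferred; received_date is the documented fallback.
    if not sample.get("collection_date") and not sample.get("received_date"):
        problems.append("provide collection_date or received_date")
    for key in ("collection_date", "received_date"):
        value = sample.get(key)
        if value:
            try:
                datetime.strptime(value, "%Y-%m-%d")
            except ValueError:
                problems.append(f"{key} is not a YYYY-MM-DD date")
    if sample.get("sample_type_collected") == "swab" and not sample.get("swab_site"):
        problems.append("swab_site is required when sample_type_collected is swab")
    return problems


sample = {
    "adm1": "UK-ENG",
    "central_sample_id": "BIRM-12345",
    "received_date": "2020-06-04",
    "is_surveillance": "Y",
    "sample_type_collected": "swab",
}
for problem in check_biosample(sample):
    print(problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a batch of samples at once.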

                                  Metrics

                                  To provide metrics with Ocarina:

                                  ocarina put biosample \
                                      ...
                                      --metric ct.# ct_value 25 \
                                      --metric ct.# test_kit INHOUSE \
                                      --metric ct.# test_platform INHOUSE \
                                      --metric ct.# test_target ORF8 
                                  

                                  If a particular metric supports storing multiple records, you can provide them by incrementing a numerical suffix after the metric's namespace: e.g. --metric name.1 key value ... --metric name.N key value.
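To illustrate the numbered-suffix convention, here is a small sketch that expands a list of records into repeated `--metric` arguments. The helper name `metric_args` is hypothetical and the second record's values are made up for illustration; only the `--metric name.N key value` shape comes from the text above.

```python
import shlex


def metric_args(namespace, records):
    """Turn a list of record dicts into repeated --metric arguments.

    Record N (counting from 1) is addressed as "<namespace>.N".
    """
    args = []
    for index, record in enumerate(records, start=1):
        for key, value in record.items():
            args += ["--metric", f"{namespace}.{index}", key, str(value)]
    return args


ct_records = [
    {"ct_value": 25, "test_target": "ORF8"},
    {"ct_value": 27, "test_target": "N"},  # illustrative second record
]

# shlex.join quotes any argument that needs it for a POSIX shell.
print(shlex.join(metric_args("ct", ct_records)))
```

The resulting list can be appended to the other `ocarina put biosample` arguments, for example when invoking the tool through `subprocess.run`.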

                                  Some metrics can be provided via the uploader using these column names:

                                  • ct ct_value: ct_#_ct_value (limit 2)
                                  • ct test_kit: ct_#_test_kit (limit 2)
                                  • ct test_platform: ct_#_test_platform (limit 2)
                                  • ct test_target: ct_#_test_target (limit 2)

                                  Some artifacts in Majora can be annotated with additional Metric objects. Metric objects group together specific information that allows for additional description of an artifact, but does not belong in the artifact itself. Each metric has its own namespace, containing a fixed set of keys. Some or all of the keys may need a value to validate the Metric. This endpoint allows you to submit the following Metrics:

                                  Namespace Name Description Options
                                  ct ct_value Cycle threshold value. Cannot be negative. Code an inconclusive or negative test as 0.
                                    ct test_kit
                                    • (blank)
                                    • ABBOTT
                                    • ALINITY
                                    • ALTONA
                                    • AMPLIDIAG
                                    • AUSDIAGNOSTICS
                                    • BD
                                    • BOSPHORE
                                    • INHOUSE
                                    • QIASTAT
                                    • ROCHE
                                    • SEEGENE
                                    • TAQPATH_HT
                                    • VIASURE
                                    • XPERT
                                    ct test_platform
                                    • (blank)
                                    • ABBOTT_ALINITY
                                    • ABBOTT_M2000
                                    • ALTONA
                                    • ALTOSTAR_AM16
                                    • AMPLIDIAG_EASY
                                    • APPLIED_BIO_7500
                                    • AUSDIAGNOSTICS
                                    • BD_MAX
                                    • CEPHEID_XPERT
                                    • ELITE_INGENIUS
                                    • INHOUSE
                                    • PANTHER
                                    • QIAGEN_ROTORGENE
                                    • QIASTAT_DX
                                    • ROCHE_COBAS
                                    • ROCHE_FLOW
                                    • ROCHE_LIGHTCYCLER
                                    • SEEGENE_NIMBUS
                                    • THERMO_AMPLITUDE
                                    ct test_target
                                    • (blank)
                                    • E
                                    • N
                                    • ORF1AB
                                    • ORF8
                                    • RDRP
                                    • RDRP+N
                                    • S

                                    Metadata

                                    To provide metadata with Ocarina:

                                    ocarina put biosample \
                                        ...
                                        -m epi cluster CLUSTER8 \
                                        -m investigation cluster 'Ward 0' \
                                        -m investigation name 'West Midlands HCW' \
                                        -m investigation site QEHB 
                                    

                                    Some metadata can be provided via the uploader using these column names:

                                    • epi cluster: epi_cluster
                                    • investigation cluster: investigation_cluster
                                    • investigation name: investigation_name
                                    • investigation site: investigation_site

                                    Any artifact in Majora can be 'tagged' with arbitrary key-value metadata. Unlike Metrics, there is no fixed terminology or validation on the keys or their values. Like Metrics, to aid organisation, metadata keys are grouped into namespaces. This endpoint has 'reserved' metadata keys that should only be used to provide meaningful information:

                                    Namespace Name Description Options
                                    epi epi_cluster A local identifier for a known case cluster
                                      investigation investigation_cluster An optional identifier for a cluster within an investigation
                                        investigation investigation_name A named investigation (eg. a surveillance or directed case group)
                                          investigation investigation_site An optional site name or code to differentiate between sites if the investigation covers more than one site.
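Comparing the JSON example with the Ocarina example for this endpoint, the JSON names appear to be the namespace joined to the command-line key with an underscore (`epi_cluster` in JSON, `-m epi cluster` on the command line). The sketch below converts one form to the other under that assumption, which is inferred from the examples rather than stated by the documentation; `metadata_args` is a hypothetical helper name.

```python
import shlex


def metadata_args(metadata):
    """Convert a nested {namespace: {name: value}} dict into -m arguments.

    Assumption: a JSON name of the form "<namespace>_<key>" corresponds to
    "-m <namespace> <key> <value>". Names without that prefix pass through.
    """
    args = []
    for namespace, entries in metadata.items():
        prefix = namespace + "_"
        for name, value in entries.items():
            key = name[len(prefix):] if name.startswith(prefix) else name
            args += ["-m", namespace, key, str(value)]
    return args


metadata = {
    "epi": {"epi_cluster": "CLUSTER8"},
    "investigation": {
        "investigation_cluster": "Ward 0",
        "investigation_name": "West Midlands HCW",
        "investigation_site": "QEHB",
    },
}
print(shlex.join(metadata_args(metadata)))
```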

                                            Scopes

                                            Add one or more empty biosamples to Majora

                                            /artifact/biosample/addempty/

                                            Attributes

                                            {
                                                "biosamples": [
                                                    {
                                                        "central_sample_id": "BIRM-12345",
                                                        "sender_sample_id": "LAB12345"
                                                    }
                                                ],
                                                "token": "6e06392f-e030-4cf9-911a-8dc9f2d4e714",
                                                "username": "majora-sam"
                                            }
                                            

                                            Minimal Ocarina command with mandatory parameters:

                                            ocarina empty biosample \
                                                --central-sample-id BIRM-12345 
                                            

                                            Full Ocarina command example:

                                            ocarina empty biosample \
                                                --central-sample-id BIRM-12345 \
                                                --sender-sample-id LAB12345 
                                            

                                            Function not currently implemented in Ocarina Python API

                                            Function not currently implemented in CGPS Metadata Uploader

                                            Name Description Options
                                            central_sample_id
                                            string, required
                                            The centrally shared ID that you will use to refer to this sample inside the consortium.
                                              sender_sample_id
                                              string
                                              If you are permitted, provide the identifier that was sent by your laboratory to SGSS here.

                                                Scopes

                                                Library

                                                Add a sequencing library to Majora

                                                /artifact/library/add/

                                                Attributes

                                                {
                                                    "biosamples": [
                                                        {
                                                            "barcode": "02",
                                                            "central_sample_id": "BIRM-12345",
                                                            "library_primers": "ARTIC v3",
                                                            "library_protocol": "ARTIC v3 (LoCost)",
                                                            "library_selection": "PCR",
                                                            "library_source": "VIRAL_RNA",
                                                            "library_strategy": "AMPLICON",
                                                            "metadata": {
                                                                "artic": {
                                                                    "artic_primers": "3",
                                                                    "artic_protocol": "v3 (LoCost)"
                                                                }
                                                            },
                                                            "sequencing_org_received_date": "2021-01-14"
                                                        }
                                                    ],
                                                    "library_layout_config": "PAIRED",
                                                    "library_layout_insert_length": 100,
                                                    "library_layout_read_length": 300,
                                                    "library_name": "HOOT-LIBRARY-20200322",
                                                    "library_seq_kit": "Illumina MiSeq v3",
                                                    "library_seq_protocol": "MiSeq 150 Cycle",
                                                    "metadata": {},
                                                    "token": "6e06392f-e030-4cf9-911a-8dc9f2d4e714",
                                                    "username": "majora-sam"
                                                }
                                                

                                                Minimal Ocarina command with mandatory parameters:

                                                ocarina put library \
                                                    --biosample BIRM-12345 VIRAL_RNA PCR AMPLICON 'ARTIC v3 (LoCost)' 'ARTIC v3' \
                                                    --library-layout-config PAIRED \
                                                    --library-name HOOT-LIBRARY-20200322 \
                                                    --library-seq-kit 'Illumina MiSeq v3' \
                                                    --library-seq-protocol 'MiSeq 150 Cycle' 
                                                

                                                Full Ocarina command example:

                                                ocarina put library \
                                                    --biosample BIRM-12345 VIRAL_RNA PCR AMPLICON 'ARTIC v3 (LoCost)' 'ARTIC v3' \
                                                    --library-layout-config PAIRED \
                                                    --library-name HOOT-LIBRARY-20200322 \
                                                    --library-seq-kit 'Illumina MiSeq v3' \
                                                    --library-seq-protocol 'MiSeq 150 Cycle' \
                                                    --library-layout-insert-length 100 \
                                                    --library-layout-read-length 300 \
                                                    --sequencing-org-received-date 2021-01-14 
                                                

                                                Attributes merged into positional arguments by Ocarina:

                                                • biosample: central_sample_id library_source library_selection library_strategy library_protocol library_primers
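Because the order of these positional values matters, a script that generates Ocarina commands may want to derive them from a named record rather than rely on hand-ordering. A minimal sketch, assuming the order listed above; `biosample_args` is a hypothetical helper and the record is taken from the JSON example for this endpoint.

```python
import shlex

# Positional order of the values that follow --biosample.
BIOSAMPLE_ARG_ORDER = (
    "central_sample_id",
    "library_source",
    "library_selection",
    "library_strategy",
    "library_protocol",
    "library_primers",
)


def biosample_args(entry):
    """Build the --biosample argument list from one biosample record.

    Raises KeyError if any of the six values is missing from the record.
    """
    return ["--biosample"] + [str(entry[name]) for name in BIOSAMPLE_ARG_ORDER]


entry = {
    "central_sample_id": "BIRM-12345",
    "library_primers": "ARTIC v3",
    "library_protocol": "ARTIC v3 (LoCost)",
    "library_selection": "PCR",
    "library_source": "VIRAL_RNA",
    "library_strategy": "AMPLICON",
}
print(shlex.join(biosample_args(entry)))
```

Extra keys in the record, such as barcode, are simply ignored by this helper, which matches the note that Ocarina does not currently support that attribute.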

                                                Attributes currently unsupported by Ocarina: barcode

                                                Function not currently implemented in Ocarina Python API

                                                Documentation for this function can be found on the CGPS uploader website linked below:
                                                https://metadata.docs.cog-uk.io/bulk-upload-1/samples-and-sequencing

                                                There may be some differences between this specification and the uploader, particularly for providing Metrics and Metadata. See the Metadata and Metrics sections below for column names that are compatible with the API spec.

                                                Some attributes are named differently on the CGPS uploader:

                                                • library_primers: artic_primers
                                                • library_protocol: artic_protocol
                                                Name Description Options
                                                central_sample_id
                                                string, required
                                                  library_layout_config
                                                  string, required, enum
                                                  • PAIRED
                                                  • SINGLE
                                                  library_name
                                                  string, required
                                                  A unique, somewhat memorable name for your library.
                                                    library_selection
                                                    string, required, enum
                                                    • OTHER
                                                    • PCR
                                                    • RANDOM
                                                    • RANDOM_PCR
                                                    library_seq_kit
                                                    string, required
                                                      library_seq_protocol
                                                      string, required
                                                        library_source
                                                        string, required, enum
                                                        • GENOMIC
                                                        • METAGENOMIC
                                                        • METATRANSCRIPTOMIC
                                                        • OTHER
                                                        • TRANSCRIPTOMIC
                                                        • VIRAL_RNA
                                                        library_strategy
                                                        string, required, enum
                                                        • AMPLICON
                                                        • OTHER
                                                        • TARGETED_CAPTURE
                                                        • WGA
                                                        • WGS
                                                        library_primers
                                                        string, recommended
                                                          library_protocol
                                                          string, recommended
                                                            barcode
                                                            string
                                                              library_layout_insert_length
                                                              integer
                                                                library_layout_read_length
                                                                integer
                                                                  sequencing_org_received_date
                                                                  string
                                                                  Date sample was received by the organisation which sequenced it. This date is used for tracking sample turnaround time.

                                                                    Metadata

                                                                    To provide metadata with Ocarina:

                                                                    ocarina put library \
                                                                        ...
                                                                        -m artic primers 3 \
                                                                        -m artic protocol 'v3 (LoCost)' 
                                                                    

                                                                    Some metadata can be provided via the uploader using these column names:

                                                                    • artic primers: artic_primers
                                                                    • artic protocol: artic_protocol

                                                                    Any artifact in Majora can be 'tagged' with arbitrary key-value metadata. Unlike Metrics, there is no fixed terminology or validation on the keys or their values. Like Metrics, to aid organisation, metadata keys are grouped into namespaces. This endpoint has 'reserved' metadata keys that should only be used to provide meaningful information:

                                                                    Namespace Name Description Options
                                                                    artic artic_primers The version number of the ARTIC primer set (if used) to prepare this library
                                                                      artic artic_protocol The version number of the ARTIC protocol (if used) to prepare this library

                                                                        Scopes

                                                                        Sequencing

                                                                        Add a sequencing run to Majora

                                                                        /process/sequencing/add/

                                                                        Attributes

                                                                        {
                                                                            "library_name": "HOOT-LIBRARY-20200322",
                                                                            "runs": [
                                                                                {
                                                                                    "bioinfo_pipe_name": "ARTIC Pipeline (iVar)",
                                                                                    "bioinfo_pipe_version": "1.3.0",
                                                                                    "end_time": "YYYY-MM-DD HH:MM",
                                                                                    "flowcell_id": "ABCDEF",
                                                                                    "flowcell_type": "v3",
                                                                                    "instrument_make": "ILLUMINA",
                                                                                    "instrument_model": "MiSeq",
                                                                                    "run_name": "YYMMDD_AB000000_1234_ABCDEFGHI0",
                                                                                    "start_time": "YYYY-MM-DD HH:MM"
                                                                                }
                                                                            ],
                                                                            "token": "6e06392f-e030-4cf9-911a-8dc9f2d4e714",
                                                                            "username": "majora-sam"
                                                                        }
                                                                        

                                                                        Minimal Ocarina command with mandatory parameters:

                                                                        ocarina put sequencing \
                                                                            --instrument-make ILLUMINA \
                                                                            --instrument-model MiSeq \
                                                                            --library-name HOOT-LIBRARY-20200322 \
                                                                            --run-name YYMMDD_AB000000_1234_ABCDEFGHI0 
                                                                        

                                                                        Full Ocarina command example:

                                                                        ocarina put sequencing \
                                                                            --instrument-make ILLUMINA \
                                                                            --instrument-model MiSeq \
                                                                            --library-name HOOT-LIBRARY-20200322 \
                                                                            --run-name YYMMDD_AB000000_1234_ABCDEFGHI0 \
                                                                            --bioinfo-pipe-name 'ARTIC Pipeline (iVar)' \
                                                                            --bioinfo-pipe-version 1.3.0 \
                                                                            --end-time 'YYYY-MM-DD HH:MM' \
                                                                            --flowcell-id ABCDEF \
                                                                            --flowcell-type v3 \
                                                                            --start-time 'YYYY-MM-DD HH:MM' 
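
The two commands above differ only in which optional flags are present, so the argument list can be assembled from a dictionary of run attributes. The following is a minimal sketch, not part of Ocarina itself: `build_ocarina_args` and `format_time` are hypothetical helpers. It assumes each flag is the attribute name with underscores replaced by hyphens (as in the examples above) and that timestamps are zero-padded 24-hour values.

```python
import shlex
from datetime import datetime

# Parameter names taken from the Ocarina examples above.
MANDATORY = ("instrument_make", "instrument_model", "library_name", "run_name")
OPTIONAL = ("bioinfo_pipe_name", "bioinfo_pipe_version", "end_time",
            "flowcell_id", "flowcell_type", "start_time")


def format_time(moment):
    """Render a datetime in the 'YYYY-MM-DD HH:MM' layout (assumed zero-padded)."""
    return moment.strftime("%Y-%m-%d %H:%M")


def build_ocarina_args(run):
    """Return the argument list for `ocarina put sequencing` (hypothetical helper)."""
    missing = [name for name in MANDATORY if not run.get(name)]
    if missing:
        raise ValueError("missing mandatory parameters: " + ", ".join(missing))
    args = ["ocarina", "put", "sequencing"]
    for name in MANDATORY + OPTIONAL:
        value = run.get(name)
        if value is not None:
            # Assumption: flag name is the attribute name with hyphens.
            args += ["--" + name.replace("_", "-"), str(value)]
    return args


run = {
    "instrument_make": "ILLUMINA",
    "instrument_model": "MiSeq",
    "library_name": "HOOT-LIBRARY-20200322",
    "run_name": "YYMMDD_AB000000_1234_ABCDEFGHI0",
    "bioinfo_pipe_name": "ARTIC Pipeline (iVar)",
    "start_time": format_time(datetime(2020, 3, 22, 9, 5)),
}
# shlex.join quotes values containing spaces or brackets for the shell.
print(shlex.join(build_ocarina_args(run)))
```

The list returned by `build_ocarina_args` can be passed directly to `subprocess.run`, which avoids shell quoting altogether.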
                                                                        

                                                                        Function not currently implemented in Ocarina Python API
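
Until the function is available in the Ocarina Python API, the request can be assembled by hand with the standard library. This is a sketch under stated assumptions rather than a supported client: the base URL `https://majora.example.org/api` is a placeholder, the use of `POST` with a JSON body is assumed, and `build_payload` and `build_request` are hypothetical helpers. The payload layout follows the Attributes example above.

```python
import json
import urllib.request

ENDPOINT = "/process/sequencing/add/"  # path from this section


def build_payload(username, token, library_name, runs):
    """Assemble the request body in the shape shown under Attributes."""
    return {
        "library_name": library_name,
        "runs": list(runs),
        "token": token,
        "username": username,
    }


def build_request(base_url, payload):
    """Create (but do not send) the HTTP request. POST + JSON is an assumption."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_payload(
    username="majora-sam",
    token="6e06392f-e030-4cf9-911a-8dc9f2d4e714",
    library_name="HOOT-LIBRARY-20200322",
    runs=[{
        "instrument_make": "ILLUMINA",
        "instrument_model": "MiSeq",
        "run_name": "YYMMDD_AB000000_1234_ABCDEFGHI0",
    }],
)
# Placeholder base URL: replace with the address of your Majora instance.
request = build_request("https://majora.example.org/api", payload)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```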

                                                                        Documentation for this function can be found on the CGPS uploader website linked below:
                                                                        https://metadata.docs.cog-uk.io/bulk-upload-1/samples-and-sequencing

                                                                        There may be some differences between this specification and the uploader, particularly for providing Metrics and Metadata. See the Metadata and Metrics sections below for column names that are compatible with the API spec.

                                                                        Name Description Options
                                                                        instrument_make
                                                                        string, required, enum
                                                                        • ION_TORRENT
                                                                        • ILLUMINA
                                                                        • OXFORD_NANOPORE
                                                                        instrument_model
                                                                        string, required
                                                                          library_name
                                                                          string, required
                                                                          The name of the library as submitted to add_library
                                                                            run_name
                                                                            string, required
                                                                            A unique name that corresponds to your run. Ideally, use the name generated by your sequencing instrument.
                                                                              bioinfo_pipe_name
                                                                              string, recommended
                                                                              The name of the bioinformatics pipeline used for downstream analysis of this run
                                                                                bioinfo_pipe_version
                                                                                string, recommended
                                                                                The version number of the bioinformatics pipeline used for downstream analysis of this run
                                                                                  end_time
                                                                                  string
                                                                                    flowcell_id
                                                                                    string
                                                                                      flowcell_type
                                                                                      string
                                                                                        start_time
                                                                                        string
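
The required and recommended markers in the table above can be checked locally before a run is submitted. The sketch below is illustrative only: `check_run` is a hypothetical helper, it takes the attributes as one flat dictionary, and it uses the `instrument_make` values as they appear in the request examples. Majora's own server-side validation may differ.

```python
# Allowed instrument_make values, spelled as in the request examples.
INSTRUMENT_MAKES = {"ILLUMINA", "ION_TORRENT", "OXFORD_NANOPORE"}
REQUIRED = ("instrument_make", "instrument_model", "library_name", "run_name")
RECOMMENDED = ("bioinfo_pipe_name", "bioinfo_pipe_version")


def check_run(fields):
    """Return (errors, warnings) for a flat dict of sequencing attributes."""
    errors = []
    warnings = []
    for name in REQUIRED:
        if not fields.get(name):
            errors.append(f"{name} is required")
    make = fields.get("instrument_make")
    if make and make not in INSTRUMENT_MAKES:
        allowed = ", ".join(sorted(INSTRUMENT_MAKES))
        errors.append(f"instrument_make must be one of: {allowed}")
    for name in RECOMMENDED:
        if not fields.get(name):
            warnings.append(f"{name} is recommended")
    return errors, warnings


errors, warnings = check_run({
    "instrument_make": "ILLUMINA",
    "instrument_model": "MiSeq",
    "library_name": "HOOT-LIBRARY-20200322",
    "run_name": "YYMMDD_AB000000_1234_ABCDEFGHI0",
})
print(errors)    # no required field is missing
print(warnings)  # the two recommended pipeline fields are absent
```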

                                                                                          Scopes

                                                                                          Errors

                                                                                          The Majora API uses the following error codes:

                                                                                          Error Code Meaning
                                                                                          400 Bad Request -- Your request is invalid or unauthorized (Majora never sends a 401).
                                                                                          403 Forbidden -- You are not permitted to make this request.
                                                                                          404 Not Found -- Your requested Artifact or Process could not be found.
                                                                                          429 Too Many Requests -- You're requesting too many resources. Try adding a small delay between queries.
                                                                                          500 Internal Server Error -- Your action generated an error. Try again later. If the error persists, report it to an administrator.
                                                                                          503 Service Unavailable -- We're temporarily offline for maintenance. Please try again later.
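
A client can act on these codes by retrying only the ones that describe a temporary condition. The following sketch assumes a hypothetical `send` callable that performs one request and returns a `(status_code, body)` pair. Retrying only 429 and 503, and the linearly growing delay, are design choices made for this example, not behaviour prescribed by Majora.

```python
import time

# Codes from the table above that suggest waiting and trying again.
RETRYABLE = {429, 503}


def call_with_retry(send, attempts=3, delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(1, attempts + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == attempts:
            return status, body
        # Wait a little longer after each failed attempt.
        sleep(delay * attempt)


# Offline demonstration with canned responses instead of real requests.
canned = iter([(429, ""), (200, "ok")])
status, body = call_with_retry(lambda: next(canned), sleep=lambda seconds: None)
print(status, body)
```

A 400 or 403 is returned to the caller immediately, since repeating an invalid or unauthorized request will not change the outcome.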