HOPSKOTCH

SCiMMA HOPSKOTCH

Hop.SCIMMA is a production instance of Hopskotch run by SCiMMA for the community.

The Basic Hopskotch Service

Hopskotch is a service that provides a publish-subscribe (pub-sub) capability for Multi-Messenger Astronomy. The pub-sub system is based on Apache Kafka. Accounts for accessing the Hopskotch service are freely available to the community. Anyone can create an account from their institutional login here, and then get credentials, subscribe to public topics, and apply for access to private topics. SCiMMA uses SCRAM credentials, which are provided as a file you download.

A quick overview of the current live system is available here.

Table: Specifics of SCiMMA Hopskotch Deployment
Item: Specification
Kafka Endpoint URL: kafka://kafka.scimma.org/
Archive RESTful Endpoint URL: https://archive-api.hop.scimma.org
Live Dashboard URL: https://www.scimma.org/dashboard
Message Retention Policy within Kafka: 28 days, 1 hour
Message Retention Policy within Archive: Deletion by request only
Max message size for Basic Hopskotch: 1,000,020 bytes (N.B. variations of ± tens of bytes)
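The size limit in the table can be checked on the publisher side before sending. The following is a minimal sketch, not part of the SCiMMA client: the function name and the 100-byte safety margin are assumptions chosen for illustration, and the payload length alone is only an approximation of what the service counts toward the limit.

```python
# Limit taken from the deployment table; the margin is an illustrative assumption.
BASIC_MAX_MESSAGE_BYTES = 1_000_020
SAFETY_MARGIN_BYTES = 100  # hypothetical cushion for the "tens of bytes" variation


def fits_basic_hopskotch(payload: bytes, margin: int = SAFETY_MARGIN_BYTES) -> bool:
    """Return True if the payload is comfortably under the basic size limit."""
    return len(payload) <= BASIC_MAX_MESSAGE_BYTES - margin


if __name__ == "__main__":
    print(fits_basic_hopskotch(b"x" * 1_000))      # small alert-sized payload
    print(fits_basic_hopskotch(b"x" * 2_000_000))  # too large for the basic service
```

A payload that fails this check would be a candidate for the large message offload described later in this document.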

Significant Concepts

Users mint their own SCRAM credentials here. A user may possess more than one SCRAM credential. Credentials do not expire. A given user’s credential may be authorized to publish and subscribe to topics in any number of groups (described below).

Topics partition messages into streams of interest to subscribers. Publishers publish to a topic, and subscribers elect to receive messages from a topic. Subscribers to a topic receive all messages published to that topic in the Hopskotch system. Each topic is either public or private, and this applies to all messages within it. Every topic is administered by a group (described below). Topic names contain the name of the group accountable for the topic.

A Group is a unit of authority delegated to entities outside of SCiMMA, allowing them to organize the publishing and consuming of data they are interested in. A user may belong to any number of groups. Group Administrators create topics within their group and decide whether each topic is public or private. Group Administrators can assign read or write permissions for topics owned by their group to other groups. Members of a group automatically have both read and write access to that group's topics, and receive any permissions granted to their group by other groups. Groups have a liaison relationship with the SCiMMA project. SCiMMA policy on creation of groups is here.
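The access rules described above can be modeled in a few lines. This is a sketch of the rules as stated in this section, not SCiMMA's implementation: the `Topic` class, the `can_access` function, and the example topic names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str            # e.g. "groupA.alerts" (naming pattern is an assumption)
    owner: str           # group that administers the topic
    public: bool = False
    # Permissions granted to other groups: group name -> set of "read"/"write"
    grants: dict = field(default_factory=dict)


def can_access(user_groups: set, topic: Topic, action: str) -> bool:
    """Decide whether a user in `user_groups` may "read" or "write" `topic`."""
    if action not in ("read", "write"):
        raise ValueError(f"unknown action: {action}")
    # Members of the owning group get both read and write access.
    if topic.owner in user_groups:
        return True
    # Any account holder can subscribe to a public topic.
    if topic.public and action == "read":
        return True
    # Otherwise look for a grant made to one of the user's groups.
    return any(action in topic.grants.get(g, set()) for g in user_groups)


if __name__ == "__main__":
    alerts = Topic("groupA.alerts", owner="groupA", grants={"groupB": {"read"}})
    print(can_access({"groupB"}, alerts, "read"))   # granted by groupA
    print(can_access({"groupB"}, alerts, "write"))  # no write grant
```

Keeping grants at the group level, rather than per user, mirrors the text: a user's effective permissions are the union of what each of their groups owns or has been granted.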

Messages are published on topics. Hopskotch places no constraints on the format of messages, except that they must be smaller than a designated size. Metadata can be associated with messages; some metadata keywords are reserved for SCiMMA. Messages are stored within the basic system for 28 days. Subscribers can retrieve and re-retrieve messages from the basic system that are less than 28 days old.
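Whether a message should still be retrievable from the basic system follows from its publication time and the retention period listed in the deployment table. A minimal sketch, assuming timezone-aware timestamps; the function name is hypothetical, and the actual moment of deletion is decided by the Kafka brokers, not by this arithmetic.

```python
from datetime import datetime, timedelta, timezone

# Retention window from the deployment table: 28 days, 1 hour.
KAFKA_RETENTION = timedelta(days=28, hours=1)


def still_in_kafka(published_at: datetime, now: datetime = None) -> bool:
    """Return True if a message published at `published_at` is within retention."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - published_at < KAFKA_RETENTION


if __name__ == "__main__":
    now = datetime(2024, 3, 1, tzinfo=timezone.utc)
    print(still_in_kafka(now - timedelta(days=7), now))   # recent message
    print(still_in_kafka(now - timedelta(days=40), now))  # past retention
```

Messages older than the window may still be available through the archive, described in the next section.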

The Archive Extension

The basic pub-sub system is extended with an archive. Archived messages are associated with the topic on which they were published. Archived messages remain available even if the pub-sub topic used to publish them is decommissioned. Each message is identified by a UUID. The UUID is used to specify a message to retrieve from the archive. The UUID can be provided by the publisher or autogenerated upon publication. However it is generated, the UUID is available to subscribers in the Kafka message header accompanying each message.
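The "publisher-provided or autogenerated" rule for message UUIDs can be illustrated as follows. This is a sketch only: the header key `_id` and the 16-byte binary encoding are assumptions made for the example, and both helper functions are hypothetical. Consult the client documentation for the actual header layout.

```python
import uuid

ID_HEADER = "_id"  # assumed header key; check the client documentation


def with_message_id(headers):
    """Return a copy of `headers` that carries a message UUID.

    `headers` is a list of (key, bytes) pairs, a shape commonly used for
    Kafka message headers. A publisher-supplied ID is kept as-is;
    otherwise a random UUID is generated.
    """
    headers = list(headers)
    for key, value in headers:
        if key == ID_HEADER:
            uuid.UUID(bytes=value)  # raises ValueError if not 16 bytes
            return headers
    headers.append((ID_HEADER, uuid.uuid4().bytes))
    return headers


def message_id(headers):
    """Extract the message UUID from a header list, or None if absent."""
    for key, value in headers:
        if key == ID_HEADER:
            return uuid.UUID(bytes=value)
    return None


if __name__ == "__main__":
    headers = with_message_id([("sender", b"example-publisher")])
    print(message_id(headers))  # a freshly generated UUID
```

A subscriber would keep the value returned by `message_id` if it later needs to fetch or cite the same message from the archive.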

Public messages and large, offloaded messages (described below) are automatically archived. SCiMMA’s policy is to retain all archived messages indefinitely; archived messages are deleted only by request.

SCRAM credentials are not needed to access archived public messages. This uncredentialed access lets people outside the SCiMMA community reach messages, for example to follow citations or to validate claims of prior discovery. SCRAM credentials are needed to access archived private messages. Credentials authorized to read a private topic are also authorized to read all archived messages corresponding to that topic.

Archived messages are exposed through a RESTful interface. The RESTful interface endpoint is https://archive-api.hop.scimma.org/ and the API documentation is here.

The Large Message Offload Extension

Publishers may need to send messages larger than Apache Kafka supports. An example is the rapid exchange of a spectrum. SCiMMA supports such use cases with a large message offload. The design lets subscribers receive large messages through the same interface as ordinary messages. The feature transparently places a large message in the archive, using primitives of the archive extension described above, and sends only a small placeholder message containing the URL of the offloaded data over the basic Kafka topic. The SCiMMA Hopskotch Python client then reassembles the original message by downloading the oversize payload from the archive using the URL in the placeholder message. Archive authentication, authorization, and retention policies apply to large messages.

The implementation of large message offload is documented here.
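The offload pattern described above can be simulated entirely in memory. This sketch is not the SCiMMA implementation: the `FakeArchive` class, the `offloaded` header, the placeholder JSON fields, the URL scheme, and the inline threshold are all hypothetical stand-ins chosen to show the flow.

```python
import json
import uuid

MAX_INLINE_BYTES = 1_000_000  # illustrative threshold, below the basic limit
ARCHIVE_URL = "https://archive.example.invalid/msg/"  # hypothetical URL scheme


class FakeArchive:
    """In-memory stand-in for the archive service."""

    def __init__(self):
        self._store = {}

    def put(self, payload: bytes) -> str:
        key = str(uuid.uuid4())
        self._store[key] = payload
        return ARCHIVE_URL + key

    def get(self, url: str) -> bytes:
        return self._store[url.rsplit("/", 1)[1]]


def publish(payload: bytes, archive: FakeArchive, topic: list) -> None:
    """Send small payloads inline; offload large ones and send a placeholder."""
    if len(payload) <= MAX_INLINE_BYTES:
        topic.append({"headers": {}, "value": payload})
        return
    url = archive.put(payload)
    placeholder = json.dumps({"url": url, "size": len(payload)}).encode()
    topic.append({"headers": {"offloaded": b"true"}, "value": placeholder})


def consume(topic: list, archive: FakeArchive):
    """Yield original payloads, fetching offloaded ones from the archive."""
    for record in topic:
        if record["headers"].get("offloaded") == b"true":
            url = json.loads(record["value"])["url"]
            yield archive.get(url)
        else:
            yield record["value"]


if __name__ == "__main__":
    archive, topic = FakeArchive(), []
    publish(b"small alert", archive, topic)
    publish(b"s" * (MAX_INLINE_BYTES + 1), archive, topic)
    print([len(p) for p in consume(topic, archive)])
```

The point of the design is visible in `consume`: the caller iterates over payloads and never sees the difference between an inline message and an offloaded one.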

Client Software

To access the full functionality of Hopskotch, SCiMMA supplies a Python library and shell clients, available on GitHub, from PyPI, and via Conda-Forge. A quickstart tutorial is available. The full client software documentation is at readthedocs.

Hopskotch client software >= 0.11.0 is required to publish large messages. The same version is required to subscribe to large messages seamlessly, “as if” Kafka supported large messages natively. Subscribers using older clients receive a JSON message containing the information needed to retrieve the large message from the archive.
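A script that depends on large messages can check the client version up front. The sketch below compares plain release version strings numerically; the function names are hypothetical, and pre-release suffixes are simply ignored, which is a simplification.

```python
MIN_LARGE_MESSAGE_VERSION = (0, 11, 0)  # from the requirement stated above


def parse_version(text: str) -> tuple:
    """Turn "0.11.0" into (0, 11, 0); non-numeric suffixes are dropped."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that "0.11" compares equal to "0.11.0".
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_large_messages(version_text: str) -> bool:
    """Return True if this client version handles large messages seamlessly."""
    return parse_version(version_text) >= MIN_LARGE_MESSAGE_VERSION


if __name__ == "__main__":
    for v in ("0.9.0", "0.10.2", "0.11.0", "1.0.0"):
        print(v, supports_large_messages(v))
```

Comparing integer tuples avoids the string-comparison trap in which "0.9.0" sorts after "0.11.0". The installed version string could be obtained with `importlib.metadata.version("hop-client")`, assuming the client is installed under that distribution name.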

The basic system can be accessed with any Kafka client supporting SCRAM.
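For a generic Kafka client, SCRAM access comes down to a handful of connection settings. The sketch below builds a configuration dictionary using librdkafka-style property names, as accepted by clients such as confluent-kafka. The port (9092), the SCRAM mechanism (SCRAM-SHA-512), and the example group ID are assumptions, not values stated in this document; the username and password come from the downloaded credential file.

```python
def scram_client_config(username: str, password: str, group_id: str,
                        bootstrap: str = "kafka.scimma.org:9092",  # port is an assumption
                        mechanism: str = "SCRAM-SHA-512"):         # mechanism is an assumption
    """Build a librdkafka-style consumer configuration for SCRAM over TLS."""
    return {
        "bootstrap.servers": bootstrap,
        "security.protocol": "SASL_SSL",   # SASL authentication over an encrypted link
        "sasl.mechanism": mechanism,
        "sasl.username": username,
        "sasl.password": password,
        "group.id": group_id,
        "auto.offset.reset": "earliest",   # start from the oldest retained message
    }


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    config = scram_client_config("example-user", "example-password", "example-user-demo")
    for key in sorted(config):
        if key != "sasl.password":
            print(key, "=", config[key])
```

Such a dictionary would typically be passed to the consumer or producer constructor of the chosen Kafka library. Features such as the large message offload are handled by the SCiMMA client, so a generic client would see only the placeholder messages.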

Service Provisioning and Operations

The SCiMMA Hopskotch service runs on AWS (cloud) for high availability. Care is taken to keep message latency low. The provisioning philosophy is to keep the service well-resourced so that it is available during peak “discovery” moments, and to make software upgrades minimally disruptive. Changes and service disruptions are announced on the SCiMMA main page. Support can be obtained via the ticket portal or email.