The purpose of this document is to outline the directory structure and workflow for archiving content uploaded to instances of MapStory, in order to create an easily scriptable re-import method for major platform changes and data migration.
A successful implementation of this strategy will require:
- A full outline of the data model(s) for user-created content
- Use of standard formats for long-term interoperability
- Stakeholder engagement to verify the completeness and validity of the strategy
Importantly, the actual mechanism of archiving matters less than the workflow described below. Alternative formats (such as GeoPackage) and structures are acceptable provided that they are highly interoperable and stable enough to ensure longevity. We are not archiving the data model of any particular instance, but rather the inputs to that data model.
While it may seem that the built-in backup functionality of the tools we use would be sufficient to capture user-created data, the possibility of significant changes in the core technologies or of refactored data models creates the need to archive component data: the data a user would supply in the normal workflow of creating a layer and/or story, both external and internal to the platform.
For example, if we were to switch from a Postgres/PostGIS back-end to Cassandra/GeoMesa, a simple table migration might not be feasible or even possible. However, the component data of existing layers could be re-imported and assigned to the appropriate user(s), making the transition seamless from the user's perspective.
Furthermore, I am proposing that StoryLayers represent the granular unit that we archive. A layer is currently the element of our data that users interact with most.
- Content Resource Capture
- for a given instance/version of MapStory, a snapshot will capture the current content and all associated information
- This will then be stored in a directory structure of component data that has the basic elements of each resource.
- Examples of component data include:
- shapefiles in zipped format
- potentially also the location of the GeoGig repo
- Metadata for the layer itself (Title, Purpose, Data quality statement etc.)
- Metadata for Mapstory platform (User, associated Stories)
- Community elements (Comments, rating status)
- Styling (if not basic styling, the relevant SLD files that can be applied to the layer)
- Content Resource Maintenance
- at an interval determined by the management team, a script will add new resources to the existing archive, update changed resources in the existing archive, and check the health of the storage mechanism
- Content Resource Recovery
- from the archive, a translation script will parse the resources and automate their re-upload into the target instance of MapStory via the user-facing import/entry methods, so as to maintain data model integrity
- incompatibility issues (such as a new feature or an unused resource) would be addressed through default settings or omission
Note: Translation scripts should function bi-directionally: extraction, update, and entry of data should all be supported for a given instance of the MapStory platform.
This structure is subject to feedback from the development and management teams for coverage and community needs.
There are a number of structural decisions to be made (e.g. naming conventions, JSON formats, etc.) that will inform the usefulness of this strategy.
Furthermore, the initial setup of this tool will inform an assessment of current resources, the dependencies of those resources, and states of layer completeness.
Archive Directory/
|- Translation Scripts/
| |- beta.py
| |- retro.py
| |- v1.0.py
| |- etc...
|
|- Stories/
| |- Instance_Retro/
| | |- Story_Alpha_Metadata.JSON
| | |- Story_Beta_Metadata.JSON
| | |- etc...
| |- Instance_Prod/
| | |- Story_Alpha_Metadata.JSON
| | |- Story_Beta_Metadata.JSON
| | |- etc...
|
|- Layers/
| |- Layer_Alpha/
| | |- Layer_Alpha_Layer_Metadata.JSON
| | |- Layer_Alpha_Platform_Metadata.JSON
| | |- Layer_Alpha_Community_Elements.JSON
| | |- Layer_Alpha_Geometry/
| | | |- Layer_Alpha_Shapefile.zip
| | | |- GeoGig Export (?)
| | | |
| | |- Layer_Alpha_Styling/
| | | |- Style_A.SLD
| | | |- Style_B.SLD
| | | |- etc...
| | | |
| |- Layer_Beta/
| |- Layer_Gamma/
| |- etc...
|
|- Additional_Resources/ (If Deemed Necessary)
| |
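The per-layer portion of the tree above is regular enough to generate by script. The sketch below builds the empty skeleton for one layer; the helper name is hypothetical, and the directory and file names simply mirror the proposed layout.

```python
from pathlib import Path

def make_layer_skeleton(root, layer):
    """Create the empty per-layer directory layout from the proposed tree."""
    base = Path(root) / "Layers" / layer
    # Geometry and styling live in subdirectories of the layer folder.
    (base / f"{layer}_Geometry").mkdir(parents=True, exist_ok=True)
    (base / f"{layer}_Styling").mkdir(exist_ok=True)
    # Metadata and community elements are sibling JSON files.
    for part in ("Layer_Metadata", "Platform_Metadata", "Community_Elements"):
        (base / f"{layer}_{part}.JSON").touch()
    return base
```

Stories would get a parallel helper that creates one metadata file per story under the appropriate Instance_*/ directory.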
{
"Layer": {
"LayerID": {
"Name": "Name",
"OriginInstance": "Prod/Beta/Retro",
"OriginInstancePK": INT
},
"Owner": {
"User_ID": PK_INT,
"UserName": "UserName",
"UserEmail": "[email protected]"
},
"Geometry": "Link to or PK of **BASE** Geometry",
"MetaData":{
"Title": STRING,
"StartTimeAttribute": STRING,
"EndTimeAttribute": STRING_OR_NULL,
"Category": STRING,
"Summary": STRING,
"Purpose": STRING,
"DataSource": STRING,
"DataQuality": STRING,
"TAGS": ["Tag1", "Tag2", "..."],
"Thumbnail": LINK_TO_JPG/GIF
},
"Styling": {
"default": STYLE_ID,
"validStyles": ["STYLE1_ID", "STYLE2_ID", "..."]
},
"Versioning(GeoGig)": {
"GeoGigRepo": LINK_TO_GEOGIG_REPO,
"ContributorList": ["User1_ID", "User2_ID", "..."]
},
"DetailPageInfo": {
"Rating": DEC,
"Views": INT,
"FavoriteBy": ["User1_ID", "User2_ID", "..."],
"Comments": {
"Comment_1": {
"User_ID": PK_INT,
"Text": STRING
},
"Comment_2": {
"User_ID": PK_INT,
"Text": STRING
},
"Comment_N": {
"User_ID": PK_INT,
"Text": STRING
}
}
}
}
}
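A maintenance or recovery script could sanity-check archived records against this shape before attempting re-import. A minimal sketch, assuming the top-level keys shown in the record above (the validator itself is hypothetical):

```python
import json

# Top-level keys taken directly from the layer record sketched above.
REQUIRED_LAYER_KEYS = {"LayerID", "Owner", "Geometry", "MetaData",
                       "Styling", "Versioning(GeoGig)", "DetailPageInfo"}

def missing_layer_keys(text):
    """Return any required top-level keys absent from an archived layer record."""
    record = json.loads(text).get("Layer", {})
    return sorted(REQUIRED_LAYER_KEYS - record.keys())
```

An empty result means the record covers all the proposed sections; anything else flags a partial archive before it corrupts an import run.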
{
"Story": {
"StoryID": {
"Name": "Name",
"OriginInstance": "Prod/Beta/Retro",
"OriginInstancePK": INT
},
"Owner": {
"User_ID": PK_INT,
"UserName": "UserName",
"UserEmail": "[email protected]"
},
"MetaData":{
"Title": STRING,
"Category": STRING,
"Summary": STRING,
"TAGS": ["Tag1", "Tag2", "..."],
"Thumbnail": LINK_TO_JPG/GIF
},
"Chapters": {
"Chapter1": {
"ChapterMetadata":{ ... },
"Layers": {
"Layer1":{
"Layer_PK": LAYER_PK,
"StyleUsed": STYLE_PK
},
"Layer2":{},
},
"StoryBoxes":{ ... },
"Annotations/StoryPins":{ ... },
"BaseMap": "BaseMapName"
},
"Chapter2":{},
"ChapterN":{}
},
"PlaybackSettings":{
"Speed": STRING,
"Cumulative": BOOL,
"Repeat": BOOL
},
"DetailPageInfo": {
"Rating": DEC,
"Views": INT,
"FavoriteBy": ["User1_ID", "User2_ID", "..."],
"Comments": {
"Comment_1": {
"User_ID": PK_INT,
"Text": STRING
},
"Comment_2": {
"User_ID": PK_INT,
"Text": STRING
},
"Comment_N": {
"User_ID": PK_INT,
"Text": STRING
}
}
}
}
}
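Because chapters reference layers by primary key, a recovery script should restore layers before the stories that use them. A sketch of collecting those references from a story record, assuming the key names shown above (the helper itself is not an existing tool):

```python
import json

def referenced_layer_pks(story_text):
    """Collect every Layer_PK a story's chapters point at, so layers
    can be re-imported before the stories that depend on them."""
    story = json.loads(story_text)["Story"]
    pks = set()
    for chapter in story.get("Chapters", {}).values():
        for layer in chapter.get("Layers", {}).values():
            if "Layer_PK" in layer:
                pks.add(layer["Layer_PK"])
    return pks
```

A recovery run would union these sets across all archived stories, import that layer set first, then replay the stories.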