Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎁 WIP Flexible metadata for Valkyrie Objects #6830

Draft
wants to merge 43 commits into
base: main
Choose a base branch
from

Conversation

orangewolf
Copy link
Member

@orangewolf orangewolf commented Jun 7, 2024

Summary

Beginning support for dynamic, user changeable metadata profiles for Valkyrie objects.

Issue:

Note this should be rebased on main once double combo is in

Guidance for testing, such as acceptance criteria or new user interface behaviors:

Detailed Description

This is part of a broader feature to enable flexible metadata for Valkyrie objects. Essentially we are looking to be able to drive the fields in a work type (adminset, fileset, and collection) from an M3 profile. This is similar to the goals of Allinson flex. However, because this feature is often sought after, we are looking to put it in to Hyrax in some capacity.

This draft needs the following

These items are excluded from the PR to keep the PR manageable.

They are necessary for the feature to be released, but since it is behind a flag we recommend merging as we go.

@samvera/hyrax-code-reviewers

@ShanaLMoore ShanaLMoore changed the title Flexible metadata for Valkyrie Objects 🎁 Flexible metadata for Valkyrie Objects Jun 12, 2024
Base automatically changed from double_combo to main June 25, 2024 03:40
@@ -0,0 +1,141 @@
# frozen_string_literal: true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code was moved as is from simple_schema_loader

@orangewolf orangewolf marked this pull request as ready for review June 25, 2024 04:50
@@ -94,7 +94,7 @@ module Hyrax
# @see https://wiki.lyrasis.org/display/samvera/Hydra::Works+Shared+Modeling
# for a historical perspective.
class Work < Hyrax::Resource
include Hyrax::Schema(:core_metadata)
include Hyrax::Schema(:core_metadata) unless Hyrax.config.flexible?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two remarks pertaining to this (and all instances of include Hyrax::Schema(:core_metadata)) :⁠—

  1. Could this be made more configurable (i·e, able to be disabled in general, not just when Hyrax.config.flexible?)? We use M3‐derived metadata already at Project Surfliner, and we’re not likely to switch over to using this method anytime soon, but being able to still disable Hyrax::Schema(:core_metadata) would be useful for us (the fact that it defines :title is a persistent source of annoyance.

  2. date_uploaded, date_modified, and depositor are administrative metadata Hyrax needs to function. I’m not sure it is appropriate to be defining these in the same place as descriptive metadata. ① notwithstanding, I liked the separation that Hyrax enforced between core administrative metadata that the application required to work properly (core_metadata), and descriptive metadata which appears in the form and can be customized by developers (basic_metadata, etc).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. seems pretty viable to me. Flipping the flag from being "unless has flexible on" to "if autoload metadata" feels like a reasonable change in perspective and increases flexibility

  2. I'd rather stay out of the "what metadata goes in which category" debate when it comes to the context fo this PR. The goal of this PR is to enable the fully M3 enabled and dynamically changeable structure, not to debate whether title belongs in core_metadata.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment was less intended to litigate which metadata goes in which category and more to acknowledge that supporting multiple categories of metadata might be useful; i·e that maybe not every metadata attribute belongs in the same box. I don’t feel so strongly about this that I think it needs to hold up a merge, but I do think that it maybe should be discussed (with metadata folks) whether allowing/requiring metadata specialists to manage things like date_uploaded and date_modified, which do not appear in any of the web forms and cannot be deleted, is useful.

Along these lines, I notice that Hyrax::AdministrativeSet and Hyrax::FileSet are being managed with M3 in this MR. I definitely think it is good that they have that ability, but I also think it is entirely possible that people will not want to have to manage AdminSet or FileSet classes as part of their M3 models. There’s currently no capacity for people to use M3 just for objects, or just for objects and collections, or just for descriptive metadata and not administrative metadata, and I think that those are useful constraints that people might want to enable.

Again, I don’t feel so strongly about this that I think it should hold up a merge, but I do think it is a conversation worth having with metadata experts, and a bit of proactive design towards enabling that kind of flexibility might be good.

@@ -31,6 +31,7 @@ module Hyrax
# implementations).
#
class Resource < Valkyrie::Resource
include Hyrax::Flexibility if Hyrax.config.flexible?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would really prefer it if this instead followed the module builder pattern and one did, for example,

class MyWork < Hyrax::Resource
  include Hyrax::Schema(:my_work, schema_loader: Hyrax::M3SchemaLoader.new(:my_work))
  # or simply…
  include Hyrax::M3Schema(:my_work)
end

to define a new work. It really seems like this MR is leaning hard into having everything be magic/automatic, but I don’t think that’s a good thing (it prevents applications from customizing the behaviour). Hyrax should continue supporting multiple means of defining the schema, allow different resources to use different methods, etc… not lock applications into exactly one approach with an opaque include Hyrax::Flexibility. This would also potentially obviate the need for Hyrax.config.flexible? checks, as it no longer needs to be a global setting.

Additionally, right now it feels like a lot of the schema‐defining code is currently being split between Hyrax::Flexibility and Hyrax::M3SchemaLoader, and it would be nice for it to all be in one place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do feel that turning flexible metadata on should be very simple to configure and not require a lot of steps, but that doesn't prevent us from having a more direct structure. Would you agree that moving the include down a level to the children of Hyrax::Resource, maybe moving them to module builder ( the merits of that are suspect to me, but I could be convinced ) and then keeping them behind a configuration flag adds enough customizability?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial thinking was that turning flexible metadata on should be easy enough that it’s feasible to do in a generator, but maybe doesn’t need to be so easy that it happens automatically with an environment variable. Changing an include from include Hyrax::Schema(…) to include Hyrax::M3Schema(…) felt reasonable in that context. But this isn’t something I feel particularly strongly about, and I could be persuaded otherwise.

Is it right that the plan is that Hyrax.config.flexible? will also eventually be used for other features, like modifying or enabling views? If the plan is for substantial behaviour to eventually be enabled/disabled behind that flag, it makes a lot more sense to group this behaviour with it. But without knowing the eventual scope of the feature flag, it’s hard for me to evaluate whether it feels justified.

I’ll drop some notes about how we’ve handled M3 in Surfliner as another comment, for reference. We don’t have the problem of versioning or live reloading, so I understand that you have some additional needs/constraints. But hopefully it will help us form consensus on how we can best support both approaches.

@marrus-sh
Copy link
Collaborator

marrus-sh commented Jun 26, 2024

The Surfliner M3 approach is a bit different because we don’t support versioning or live updating, but we do dynamically generate the model classes on startup rather than requiring them to be already present in the application. We use Valkyrie.config.resource_class_resolver for this, which winds up looking something like this…

def resolve(class_name)
  # Get the M3 definition of the class
  resource_class = resource_class_from_name(class_name)
  # “availability” refers to M3 `available_on`…
  availability = resource_class.name
  # Returns Hyrax::Work for works, Hyrax::Collection for collections…
  base_class = valkyrie_resource_class.call(resource_class)

  # Create the Hyrax model for the M3 class…
  Class.new(base_class) do |klass|
    # Define the Valkyrie attributes…
    klass.attributes(to_struct_attributes(availability: availability))
  end
end

Fundamentally, regarding the core Hyrax models, this is what we need to not break: We need to be able to dynamically define a new class extending a minimal Hyrax::Work, Hyrax::Collection… and add attributes to it in the ordinary Valkyrie way. I don’t think this MR does anything to get in the way of that, so I have no complaints there. The major difference in approach is that we aren’t doing any modification/overriding of the base Hyrax::Work or Hyrax::Collection models. Keeping those models as simple and vanilla‐Valkyrie as possible is ideal for us, which is part of why I’m hesitant towards the overriding of .new currently happening in Hyrax::Flexibility.

We aren’t currently using M3 for FileSets or AdminSets (and have no immediate plans to), and would like to keep using the upstream Hyrax models. One concern is that, if Hyrax moves towards a different M3 approach, we might want to support it in Hyrax::FileSet and Hyrax::AdministrativeSet without increasing the complexity of Hyrax::Work or Hyrax::Collection. (I mean, probably not, unless the old way of doing things is deprecated, but maybe it will be.) This is one of the motivating factors behind me asking for a more nuanced/modular approach. But this is perhaps a distant concern.

We do continue using the module builder approach for indexers and forms, and I would strongly encourage that here as well (but it seems like you’re doing that). In general, I agree with the approach being taken so far for indexing and form behaviors (as I understand it), altho it looks like there may be more work required to fully support M3 in this respect?

Regarding class/instance methods, we just do something like…

  def included(descendant)
    super

    descendant.singleton_class.include(ClassBehaviors)
    # etc
  end

inside the module builder to extend the singleton class whenever the module is included. We need this in our forms, for example, to define some form‐specific M3‐related class methods.

laritakr and others added 2 commits June 27, 2024 14:36
Configures dassie, koppie, hyrax app, and generators for flexible
metadata indexing to occur.
Copy link
Contributor

@dlpierce dlpierce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in agreement with @marrus-sh. Enabling flexible metadata should be done selectively on a per-model basis. I want to have standard work models functioning next to flexible work models. Using a flexible work should not need flexible collections/admin sets/file sets.

Comment on lines +7 to +8
include Hyrax::Indexer(:generic_work_resource) unless Hyrax.config.flexible?
include Hyrax::Indexer('GenericWorkResource') if Hyrax.config.flexible?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would naively expect these lines to do the same thing, but clearly they do not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the flexible schema is expecting everything in one yaml, and looking by work model to find anything linked to that model. The simple schema is loading each yaml that is defined.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it’s reasonable to expect #index_rules_for to camelize its argument in the M3 loader, if the expectation is that M3 class names are camelized.

@laritakr
Copy link
Contributor

I am in agreement with @marrus-sh. Enabling flexible metadata should be done selectively on a per-model basis. I want to have standard work models functioning next to flexible work models. Using a flexible work should not need flexible collections/admin sets/file sets.

This seems like a nice feature. I guess we had anticipated adding this later if needed. If you prefer it right at the start, we can do some brainstorming re: the best way to implement it.

@dlpierce
Copy link
Contributor

I am in agreement with @marrus-sh. Enabling flexible metadata should be done selectively on a per-model basis. I want to have standard work models functioning next to flexible work models. Using a flexible work should not need flexible collections/admin sets/file sets.

This seems like a nice feature. I guess we had anticipated adding this later if needed. If you prefer it right at the start, we can do some brainstorming re: the best way to implement it.

Okay, let's get this merged, but I expect this to be addressed before this sees a release.

@laritakr laritakr changed the title 🎁 Flexible metadata for Valkyrie Objects 🎁 WIP Flexible metadata for Valkyrie Objects Jun 28, 2024
@laritakr laritakr marked this pull request as draft June 28, 2024 19:41
@@ -9,7 +9,8 @@ class PcdmCollectionIndexer < Hyrax::Indexers::ResourceIndexer
include Hyrax::VisibilityIndexer
include Hyrax::LocationIndexer
include Hyrax::ThumbnailIndexer
include Hyrax::Indexer(:core_metadata)
include Hyrax::Indexer(:core_metadata) unless Hyrax.config.flexible?
include Hyrax::Indexer('Hyrax::PcdmCollection') if Hyrax.config.flexible?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to push back lightly against the expectation that applications are using Hyrax::PcdmCollection as their collection model (and not something which inherits from Hyrax::PcdmCollection). The collection model is configurable in Hyrax, and it’s a bit strange to expect flexible metadata to be defined for Hyrax::PcdmCollection when the collection model being used is, in fact, class ::Collection < Hyrax::PcdmCollection. (This is in fact the case in Koppie.)

Copy link
Contributor

@laritakr laritakr Jun 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't prevent inheritance. We also inherit from PcdmCollection in Hyku. And we also inherit from Administrative Set rather than using it directly. :)

However, it does require that the m3 profile include PcdmCollection, so we are removing line 13. Since this model only includes core metadata, there's no need to support flexible metadata here as an equivalence to the simple schema's flexibility.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not concerned about inheritance behaviours so much as the experience for application developers / end‐users. If my collection class is named MyCollection, I would expect to be able to add metadata properties to MyCollection in the M3 and have it work. However, that isn’t the case here; only Hyrax::PcdmCollection adds collection properties, and there is no mechanism for loading collection metadata by a different name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The collection work is still not finalized. I honestly thought we had already included it in the dassie and koppie models but it appears that isn't the case. Thanks for pointing that out. You're right that it should be based on your resource class name.

@laritakr
Copy link
Contributor

laritakr commented Jun 29, 2024

This PR has been moved back to draft. We had hoped an initial PR that is behind a flag would allow some foundational behavior to get merged, allowing incremental subsequent small (more easily reviewed) PRs to build on that foundation.

While the requested changes are possibly reasonable requests, the details require more conversation to implement. We don't have the time right now to build the ideal solution and still meet our deadline, so we are reluctantly returning to the undesired "long-running branch" method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
notes-minor Release Notes: Non-breaking features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants