Add example import configuration for DSpace xoai format #3942

Open: wants to merge 3 commits into base: dev

Conversation

alexklbuckley
Contributor

Sponsored-by: Auckland University of Technology, New Zealand

@alexklbuckley
Contributor Author

We are using a custom XSL file to harvest records in the DSpace XOAI format into VuFind's Solr index.

As there is currently no XSL file for XOAI, we would like to upstream ours so that other libraries can use it.

I've removed the custom elements of the XSL, but I have left in the mapping of dc.rights to the edition Solr field. Please let us know if we should remove that from this upstream file, or if there are any other changes you would like us to make.

Member

@demiankatz demiankatz left a comment

Thanks, @alexklbuckley! See below for a few comments. I also think it would be helpful to either add the matching properties file, or else put a comment in one of the existing DSpace-related properties files raising awareness of the existence of this alternate XSLT.

It also looks to me like this XSLT probably won't work correctly with a collection of records, though maybe I'm mistaken. Are you doing all your harvesting and indexing one record at a time? If so, you might want to consider revising so that you can do batches -- the performance is tremendously improved that way.

import/xsl/dspace-xoai.xsl
</xsl:if>
</xsl:for-each>
<!-- CO AUTHOR -->
<xsl:for-each select="//*[@name='contributor']/*[@name='advisor']/*[@name='none']/*[@name='value']">
Member

Is "advisor" the only possible value here that could apply to author2?

Contributor Author

Hi @demiankatz, I'm trying to find the answer to this. On the AUT DSpace instance, 'advisor' does appear to be the only possible value here. I am checking whether that also holds for DSpace more widely.

@demiankatz demiankatz changed the title Upstream a xsl file to convert DSpace xoai to Solr compatible format Add example import configuration for DSpace xoai format Sep 16, 2024
alexklbuckley and others added 2 commits September 18, 2024 08:44
review feedback

- Removing unnecessary blank lines
- Store DSpace item rights in Solr dynamic rights_str field
- Update header comment
- Store institution variable in institution field

Sponsored-by: Auckland University of Technology, New Zealand
@alexklbuckley
Contributor Author

Thanks, @demiankatz. I have pushed a follow-up addressing all but three of your points, which I will follow up on in the next few days.

I am checking whether 'advisor' is the only possible value for author2.

I will also add a comment to one of the existing DSpace-related properties files raising awareness of the existence of this alternative XSLT.

You're right, the harvesting is being done one record at a time. Thank you for noting that recommendation to do it in batches! Is there any documentation that you could please point me towards for making that change?

@demiankatz
Member

Thanks for the progress, @alexklbuckley! Regarding your question about supporting multiple records, you might find it helpful to look at #2034 for reference. This was the pull request where we added multi-record support to all of the existing XSLT files. The key is to create a top-level template (matching "/") that can handle either a collection tag or a single record, and then to create a more carefully scoped template that transforms a single record. The most important point is to avoid global selectors like //, since they search the whole document rather than just the current record.
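
A minimal sketch of the pattern described above, not taken from #2034 or from this pull request. It assumes that a batch is wrapped in a `collection` element, that each XOAI record has a `metadata` root element in the `http://www.lyncode.com/xoai` namespace, and that advisors map to `author2` as in the reviewed snippet; adjust these to match the actual harvested files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the "collection" wrapper and the xoai namespace URI are assumptions. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xoai="http://www.lyncode.com/xoai"
    exclude-result-prefixes="xoai">
    <xsl:output method="xml" indent="yes" encoding="utf-8"/>

    <!-- Top-level template: accepts either a batch wrapper or a single record. -->
    <xsl:template match="/">
        <xsl:choose>
            <xsl:when test="collection">
                <collection>
                    <xsl:apply-templates select="collection/xoai:metadata"/>
                </collection>
            </xsl:when>
            <xsl:otherwise>
                <xsl:apply-templates select="xoai:metadata"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <!-- Record-scoped template: every path is relative to the current record. -->
    <xsl:template match="xoai:metadata">
        <add>
            <doc>
                <!-- ".//" searches only inside this record; a bare "//" would
                     pick up advisors from every record in the batch. -->
                <xsl:for-each select=".//*[@name='contributor']/*[@name='advisor']/*[@name='none']/*[@name='value']">
                    <field name="author2">
                        <xsl:value-of select="normalize-space(.)"/>
                    </field>
                </xsl:for-each>
            </doc>
        </add>
    </xsl:template>
</xsl:stylesheet>
```

The only change to the selectors is the leading `.`, which anchors each search at the record being transformed, so the same stylesheet produces one `doc` per record whether it is given a single record or a batch.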

I'm far from an XSLT expert, so I apologize if my advice is a bit unclear (or if I'm using the wrong language). But I was able to muddle through and create #2034 by imitating the patterns of better XSLT programmers than myself, and perhaps you can do the same. If you run into trouble, I'll help if I can! :-)

@alexklbuckley
Contributor Author

Thanks so much, @demiankatz!
