Machine Readable Serialization With Atom
We provide a standard way to serialize a BentoSearch::Results set of BentoSearch::ResultItems to a machine-readable serialization based on the Atom format.
The Atom response isn't just for 'atom feed reader' software, although it can be used that way. It has been enhanced with elements from other vocabularies so that it serves as a machine-readable serialization expressing nearly every part of a BentoSearch::ResultItem. The Atom-based serialization can be used to provide an API response for bento_search-powered results from your app.
You can serialize any BentoSearch::Results set to enhanced Atom using the bento_search/atom_results.atom.builder view template. Atom requires a feed name and author, so you need to provide them, or some rather dumb defaults will be used. You might typically add an atom response format to a Rails controller action that's already providing an HTML results page, like so:
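    # ...
    respond_to do |format|
      format.html # default view
      format.atom do
        # Render the Atom template shipped with bento_search, passing the
        # result set plus the feed name and author that Atom requires.
        render(
          :template => "bento_search/atom_results",
          :locals   => {
            :atom_results     => @results,
            :feed_name        => "Acme results",
            :feed_author_name => "MyCorp"
          }
        )
      end
    end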
The resulting serialization is Atom, although it has some idiosyncrasies. It has also been enhanced with some opensearch elements carrying metadata about the result set itself. Each atom:entry, representing an individual item, has been enhanced with elements from the prism, dcterms, and bibo (just one at the moment!) vocabularies/namespaces, as well as elements from atom itself. We attempt to completely express everything modeled in a BentoSearch::ResultItem.
Experience shows that even if one tried to stick to the letter of the specs of those namespaces, the consumer would still need to discover and code for application-specific choices and idiosyncrasies. With this in mind, we have sometimes been willing to violate the letter or spirit of some of the related vocabularies/specs, when complying completely would have been too expensive without clear benefit to clients, or even counter-productive for clients. So it goes.
Here is a sample serialized enhanced atom result.
Developers of consumer software are also encouraged to check the source at ./app/views/bento_search/atom_results.atom.builder and ./app/views/bento_search/_atom_item.atom.builder as the last word on implementation details.
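As a rough illustration of the consumer side, here is a minimal sketch that reads feed-level information with Ruby's REXML library. The sample feed is hand-written for this example rather than captured from bento_search, the example.org URLs and the summarize_feed helper are hypothetical, and the total count is assumed to be carried in an opensearch:totalResults element as defined by OpenSearch 1.1. Check a real response from your app before relying on any of these details.

```ruby
require "rexml/document"

# Namespace prefixes used in the XPath queries below.
ATOM_NS = {
  "atom"       => "http://www.w3.org/2005/Atom",
  "opensearch" => "http://a9.com/-/spec/opensearch/1.1/"
}.freeze

# Hypothetical, hand-written feed; NOT real bento_search output.
SAMPLE_FEED = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom"
        xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
    <title>Acme results</title>
    <id>http://example.org/search.atom?q=cancer</id>
    <opensearch:totalResults>2</opensearch:totalResults>
    <entry>
      <title>First article</title>
      <id>http://example.org/opaque_id/engine/1</id>
    </entry>
    <entry>
      <title>Second article</title>
    </entry>
  </feed>
XML

# Hypothetical helper: pulls the total hit count plus each entry's title and
# id out of an Atom document. Missing elements come back as nil, since a
# consumer should not assume every element is present.
def summarize_feed(xml)
  doc     = REXML::Document.new(xml)
  total   = REXML::XPath.first(doc, "/atom:feed/opensearch:totalResults", ATOM_NS)
  entries = REXML::XPath.match(doc, "/atom:feed/atom:entry", ATOM_NS)

  {
    total_results: total && total.text.to_i,
    titles: entries.map { |e| REXML::XPath.first(e, "atom:title", ATOM_NS)&.text },
    ids:    entries.map { |e| REXML::XPath.first(e, "atom:id", ATOM_NS)&.text }
  }
end

p summarize_feed(SAMPLE_FEED)
```

The second sample entry deliberately has no id, so the sketch shows a consumer tolerating absent elements instead of raising.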
Here are some notes to be aware of when using the Atom-based serialization as fully expressive machine-readable results:
- There is limited feed-level metadata provided via opensearch namespaced elements, principally opensearch:totalResults.
  - There is other feed-level metadata which can usefully be provided in atom/opensearch/etc, but there was no good way to do it in bento_search without knowing the details of the host app. If you'd like, you can write your own Rails view template replacing bento_search/atom_results and providing more complete feed-level metadata, while still re-using the bento_search/atom_item partial for rendering the individual item/entries. For instance, you could add atom:link elements with rel values of self, alternate, prev, next, first, last, or search (pointing to an opensearch description), or an opensearch:Query element.
  - An atom:id element is required by atom at the feed level; the built-in implementation just uses the current application request URL that is delivering the atom results.
- The updated element is required by atom at both feed and entry level, but bento_search (and its target search engines) don't really track updated_at, so it will always be set to the current timestamp, sorry.
- Unique atom:id elements, with URI values, are required by the Atom standard. bento_search will try to construct a basic opaque non-resolvable unique identifier URI by default, using engine_id, entry unique_id, and application base URL. In some cases it won't be able to do so, and will leave out <id>, violating the standard. If you'd like to take control of this to ensure an <id> is present to your specifications, including resolvability in your app, simply configure a decorator for the bento_search engine that provides a custom uri_identifier method.
- Serial publication info is present using the prism vocabulary: prism:coverDate using yyyy-mm-dd format (or just yyyy), as well as prism:volume, prism:number, prism:startingPage, prism:endingPage, prism:issn, and prism:isbn.
  - prism:doi is a bare DOI without any kind of URI encapsulation, such as 10.1109/MIC.2005.74 (the prism standard is a bit confusing on this; actually doi might technically not be part of the prism version whose namespace we're using, sorry).
- Almost every entry-level element is optional in the ResultItem model, so may or may not be present in the serialization.
- The atom:summary can be plain text or html, marked as specified in the atom spec with attribute type="text" or type="html". If html, it may include <b class="bento_search_highlight"> tags demarcating search-in-context highlighting. (Highlighting of title is not currently available in the atom response. Do you need it?)
- For some dcterms elements we provide a custom 'vocabulary' attribute specifying the vocabulary. For instance, one or more dcterms:type or dcterms:language elements may be provided, specifying values according to different vocabularies, where available. An element with no vocabulary attribute is generally an uncontrolled label suitable for presenting directly to the user (usually in English). The dcterms:type vocabulary='http://purl.org/NET/bento_search/ontology' element is bento_search's internal vocabulary.

      <dcterms:type vocabulary="http://schema.org/">http://schema.org/Article</dcterms:type>
      <dcterms:type>Journal Article</dcterms:type>
      <dcterms:type vocabulary="http://purl.org/NET/bento_search/ontology">Article</dcterms:type>
      <dcterms:language vocabulary="http://dbpedia.org/resource/ISO_639-1">es</dcterms:language>
      <dcterms:language vocabulary="http://dbpedia.org/resource/ISO_639-3">spa</dcterms:language>
      <dcterms:language>Spanish</dcterms:language>

- The dcterms:type element with no vocabulary attribute is often a label passed directly from the underlying search service.
- The only element we're currently using from bibo is bibo:oclcnum. Most of our search engines rarely provide oclcnums, but if one is present, that's where it will be serialized.
- ResultItem#link (main link, only URL in model) and ResultItem#other_links are included if present (as added/modified by your local decorators).
  - If a main #link is present, it'll be included as atom:link rel="alternate". If it's not present, there will be no atom:link rel="alternate", violating the Atom standard in some cases.
  - #other_links by default are rel=related, but can specify their own rel and their own content type, and will have title filled out with their BentoSearch::Link#label.
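
Several of the notes above affect how a consumer should read an individual entry: elements may be absent, dcterms elements may repeat with different vocabulary attributes, and links are distinguished by rel. The following is a minimal sketch of one way to handle that with REXML. The sample entry is hand-written for illustration (the example.org URLs are made up), and the helper names value_for and links_by_rel are hypothetical, not part of bento_search.

```ruby
require "rexml/document"

# Namespace prefixes used in the XPath queries below.
ENTRY_NS = {
  "atom"    => "http://www.w3.org/2005/Atom",
  "dcterms" => "http://purl.org/dc/terms/"
}.freeze

# Hypothetical, hand-written entry; NOT real bento_search output.
# It has no <id>, which the notes above say can happen.
SAMPLE_ENTRY = <<~XML
  <entry xmlns="http://www.w3.org/2005/Atom"
         xmlns:dcterms="http://purl.org/dc/terms/">
    <title>An article</title>
    <link rel="alternate" href="http://example.org/article/1"/>
    <link rel="related" title="Find full text" href="http://example.org/resolver?id=1"/>
    <dcterms:type vocabulary="http://schema.org/">http://schema.org/Article</dcterms:type>
    <dcterms:type>Journal Article</dcterms:type>
    <dcterms:language vocabulary="http://dbpedia.org/resource/ISO_639-1">es</dcterms:language>
    <dcterms:language>Spanish</dcterms:language>
  </entry>
XML

# Returns the text of the first child element matching `path` whose
# vocabulary attribute equals `vocabulary`. Pass nil to select the element
# with no vocabulary attribute (the uncontrolled, user-presentable label).
# Returns nil when nothing matches.
def value_for(entry, path, vocabulary)
  REXML::XPath.match(entry, path, ENTRY_NS)
              .find { |el| el.attributes["vocabulary"] == vocabulary }
              &.text
end

# Groups the href values of the entry's atom:link elements by their rel.
def links_by_rel(entry)
  REXML::XPath.match(entry, "atom:link", ENTRY_NS)
              .group_by { |link| link.attributes["rel"] }
              .transform_values { |links| links.map { |l| l.attributes["href"] } }
end

entry = REXML::Document.new(SAMPLE_ENTRY).root
p value_for(entry, "dcterms:type", "http://schema.org/")
p value_for(entry, "dcterms:type", nil)
p value_for(entry, "atom:id", nil)
p links_by_rel(entry)
```

Selecting by vocabulary rather than by position means the consumer keeps working if the order of the repeated elements changes, and the nil results make a missing id or missing alternate link an explicit case to handle.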