The transforms in this category access the Document Object Model (DOM) of the current document:

Transform    Description
attribute    Searches the current HTML document for an element that matches the query terms and returns the named attribute
query        Searches the current HTML document for an element that matches the query terms and returns the text of the element

attribute:selector:index:attribute-name

The following information is intended for advanced users.

Returns the value of the HTML attribute with the given attribute-name from the indexth HTML element that matches the selector. All the parameters are required.

The attribute transform is intended to allow ORA users to write templates that access attributes, such as HREF values, that ORA does not extract. The selector and index parameters have the same purpose as for the query transform.
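The lookup described above can be sketched in TypeScript. This is only an illustration, not ORA's actual code: the function name attributeTransform, the small AttrElementLike and AttrRoot interfaces, and the fake page are all hypothetical, and returning an empty string when nothing matches is an assumption.

```typescript
// Minimal structural types so the sketch runs without a browser.
interface AttrElementLike {
  getAttribute(name: string): string | null;
}
interface AttrRoot {
  querySelectorAll(selector: string): ArrayLike<AttrElementLike>;
}

// Hypothetical helper that mirrors attribute:selector:index:attribute-name.
function attributeTransform(
  root: AttrRoot,
  selector: string,
  index: number,
  attributeName: string
): string {
  // The index parameter is 1-based, so index 1 is the first match.
  const element = root.querySelectorAll(selector)[index - 1];
  // Assumption: a missing element or missing attribute yields an empty string.
  return element?.getAttribute(attributeName) ?? "";
}

// A fake page with two links, standing in for a real document.
const makeLink = (href: string): AttrElementLike => ({
  getAttribute: (name) => (name === "href" ? href : null),
});
const fakeLinksPage: AttrRoot = {
  querySelectorAll: (selector) =>
    selector === "a" ? [makeLink("/people/1"), makeLink("/people/2")] : [],
};

console.log(attributeTransform(fakeLinksPage, "a", 1, "href")); // /people/1
console.log(attributeTransform(fakeLinksPage, "a", 2, "href")); // /people/2
```

In a template, the first call corresponds to [DOM:attribute:a:1:href].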

The selector parameter must be a valid CSS selector. It is beyond the scope of this help page to explain CSS Selectors.

The attribute transform has a couple of unusual characteristics:

  • The attribute transform may only be used with a special Field named "DOM". If you attempt to use it with any other Field, the result will be an empty string.
  • The attribute transform cannot be tested on the OraSettings page. It needs the HTML of the collection page for which it is intended, and that HTML is not available from the OraSettings page. If you try it there, the transform inspects the HTML of the OraSettings page itself.

Example

To return the HREF attribute of the first "A" (link) element on the page: [DOM:attribute:a:1:href]. If the page has an A element, the result is the value of its HREF attribute.

Selectors are usually more involved than the selector used in the example above.

HREFs

To convert an HREF attribute value to a full URL, pass the attribute result to the hrefToUrl transform:

[DOM:attribute:a:1:href:hrefToUrl]
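As a rough illustration of what converting an HREF to a full URL involves, the TypeScript sketch below resolves an HREF against the address of the page it came from, using the standard URL constructor. The function name hrefToFullUrl and the example addresses are hypothetical, and this page does not document how hrefToUrl is implemented, so treat the sketch as an assumption about the general idea only.

```typescript
// Hypothetical helper: resolve an HREF value against the page's own URL.
function hrefToFullUrl(href: string, pageUrl: string): string {
  // The URL constructor resolves relative references against the base
  // and leaves already-absolute URLs unchanged.
  return new URL(href, pageUrl).href;
}

const pageUrl = "https://example.com/search/page1.html";

console.log(hrefToFullUrl("/records/42", pageUrl)); // https://example.com/records/42
console.log(hrefToFullUrl("page2.html", pageUrl));  // https://example.com/search/page2.html
```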

query:selector:index

The following information is intended for advanced users.

Returns the text of the indexth HTML element that matches the selector. The index parameter is optional and defaults to 1, so omitting it returns the text of the first matching element.

The query transform is a thin wrapper around the Document.querySelectorAll() method. It is intended to allow ORA users to write templates that access text that ORA does not extract.
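To make the wrapper idea concrete, here is a minimal TypeScript sketch of how a function built on querySelectorAll() could behave. It is not ORA's actual code: the name queryTransform, the TextElementLike and QueryRoot interfaces, and the fake page are hypothetical, and both the use of textContent and the empty-string result for a missing match are assumptions.

```typescript
// Minimal structural types so the sketch runs without a browser.
interface TextElementLike {
  textContent: string | null;
}
interface QueryRoot {
  querySelectorAll(selector: string): ArrayLike<TextElementLike>;
}

// Hypothetical helper that mirrors query:selector:index.
function queryTransform(root: QueryRoot, selector: string, index: number = 1): string {
  const matches = root.querySelectorAll(selector);
  // The index parameter is 1-based, so index 1 is the first match.
  const element = matches[index - 1];
  // Assumption: a missing match yields an empty string.
  return element?.textContent ?? "";
}

// A fake page with two H2 headings, standing in for a real document.
const fakePage: QueryRoot = {
  querySelectorAll: (selector) =>
    selector === "h2" ? [{ textContent: "Part One" }, { textContent: "Part Two" }] : [],
};

console.log(queryTransform(fakePage, "h2"));    // Part One
console.log(queryTransform(fakePage, "h2", 2)); // Part Two
```

The two calls correspond to the templates [DOM:query:h2] and [DOM:query:h2:2].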

The selector parameter must be a valid CSS selector. It is beyond the scope of this help page to explain CSS Selectors.

The query transform has a couple of unusual characteristics:

  • The query transform may only be used with a special Field named "DOM". If you attempt to use it with any other Field, the result will be an empty string.
  • The query transform cannot be tested on the OraSettings page. It needs the HTML of the collection page for which it is intended, and that HTML is not available from the OraSettings page. If you try it there, the transform inspects the HTML of the OraSettings page itself.

Examples

  1. Return the text of the first "H2" element on the page:
    [DOM:query:h2]

    If the page has an H2 element with the text "Part One", the result is "Part One".

  2. Return the text of the second "H2" element on the page:
    [DOM:query:h2:2]

    If the page has two H2 elements, "Part One" and "Part Two", the result is "Part Two".

Selectors are usually more involved than the selectors used in the examples above.