Uploaded On | Oct. 16, 2013, 6:02 p.m.
Uploaded By | matts
Status | Test
<map>
<entry>
<string>plugin_config_props</string>
<list>
<org.lockss.daemon.ConfigParamDescr>
<key>base_url</key>
<displayName>Base URL</displayName>
<description>Usually of the form http://&lt;journal-name&gt;.com/</description>
<type>3</type>
<size>40</size>
<definitional>true</definitional>
<defaultOnly>false</defaultOnly>
</org.lockss.daemon.ConfigParamDescr>
<org.lockss.daemon.ConfigParamDescr>
<key>doi</key>
<displayName>DOI</displayName>
<description>DOI, transformed to become a legal XML ID or
file name, like '10.5072__FK298700'.</description>
<type>1</type>
<size>100</size>
<definitional>true</definitional>
<defaultOnly>false</defaultOnly>
</org.lockss.daemon.ConfigParamDescr>
</list>
</entry>
<entry>
<string>au_name</string>
<string>"%s", doi</string>
</entry>
<entry>
<string>au_start_url</string>
<string>"%s/lockss-manifest.html", base_url</string>
</entry>
<entry>
<string>au_crawl_depth</string>
<int>1</int>
</entry>
<entry>
<string>au_def_pause_time</string>
<long>6000</long>
</entry>
<entry>
<string>au_def_new_content_crawl</string>
<long>86400000</long>
</entry>
<entry>
<string>plugin_notes</string>
<string>edu.purdue.purr is a specialized version of HarvestAllFilesInSubdirectory, where the 'volume_name' parameter has been replaced by the 'doi' parameter.
Note that the DOI value is transformed into a valid XML ID and Linux file name by stripping the initial 'doi:' and replacing each slash ('/'), which is illegal in file names and XML IDs, with a double underscore ('__'). For example, 'doi:10.5072/FK298700' becomes '10.5072__FK298700' under these rules. The DOI value identifies a BagIt-formatted subdirectory where all of the files for the AU are kept.
For flexibility, this plugin uses 'base_url' rather than a fixed URL wherever a URL is required. The Crawl Rules are set up to include only the main LOCKSS manifest page and the files inside the AU directory (the directory named by the 'doi' parameter).</string>
</entry>
<entry>
<string>plugin_name</string>
<string>Purdue University Research Repository Plugin</string>
</entry>
<entry>
<string>plugin_identifier</string>
<string>edu.purdue.purr.lockss.plugin</string>
</entry>
<entry>
<string>au_crawlrules</string>
<list>
<string>4,"^%s", base_url</string>
<string>1,"%s/lockss-manifest.html", base_url</string>
<string>2,"/\?"</string>
<string>1,"^%s/%s/", base_url, doi</string>
</list>
</entry>
</map>
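
The DOI transformation and crawl rules described in plugin_notes can be sketched in Python. This is an illustration only, not the LOCKSS daemon's implementation. The function names and the example base URL are hypothetical. It assumes that rules are evaluated in order with the first decisive rule winning, that action codes 1, 2 and 4 mean "include on match", "exclude on match" and "exclude on no match", that substituted parameters are regex-escaped, and that a URL matched by no rule is not collected.

```python
import re

# Assumed meanings of the action codes used in au_crawlrules.
INCLUDE, EXCLUDE, EXCLUDE_NO_MATCH = 1, 2, 4


def doi_to_au_id(doi: str) -> str:
    """Strip a leading 'doi:' and replace each '/' with '__'."""
    if doi.startswith("doi:"):
        doi = doi[len("doi:"):]
    return doi.replace("/", "__")


def start_url(base_url: str) -> str:
    """Expand the au_start_url template for a given base_url."""
    return "%s/lockss-manifest.html" % base_url


def crawl_decision(url: str, base_url: str, doi: str) -> bool:
    """Return True if the URL would be collected under the four rules."""
    # Assumption: parameters are escaped before substitution into the regex.
    b, d = re.escape(base_url), re.escape(doi)
    rules = [
        (EXCLUDE_NO_MATCH, "^%s" % b),
        (INCLUDE, "%s/lockss-manifest.html" % b),
        (EXCLUDE, r"/\?"),
        (INCLUDE, "^%s/%s/" % (b, d)),
    ]
    for action, pattern in rules:
        matched = re.search(pattern, url) is not None
        if action == EXCLUDE_NO_MATCH:
            if not matched:
                return False  # URL is outside base_url
        elif matched:
            return action == INCLUDE
    # Assumption: a URL that no rule decides is not collected.
    return False


if __name__ == "__main__":
    base = "http://example.org"  # hypothetical base_url
    au_id = doi_to_au_id("doi:10.5072/FK298700")
    print(au_id)
    print(start_url(base))
    print(crawl_decision(base + "/" + au_id + "/data/file.txt", base, au_id))
```

Under these assumptions, the manifest page and anything below the AU directory are collected, while URLs on other hosts, directory-listing query URLs containing '/?', and other paths under base_url are not.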