Acquisition Options
Info
These options configure the acquisition process for a specific host. They are set in the host's Properties field in the monitoring interface under the acquisition. prefix, and in the host's Directory editor where the remote paths are defined. The retrieval.* options (also on this page) control how individual file downloads are rate-limited and timed out.
File Discovery & Filtering
| Option | Type | Default | Description |
| --- | --- | --- | --- |
acquisition.action | String | queue | Action taken on discovered files: queue to register them for retrieval/dissemination, or delete to delete them on the remote site |
acquisition.wildcardFilter | String | none | Wildcard filter applied directly by the transfer module during listing, so that only matching files are returned by the source. Preferred over acquisition.regexPattern when the transfer module supports it |
acquisition.regexPattern | String | none | Regex filter applied at the data-mover level on listing output. Use when the transfer module does not support wildcard filtering. Placeholders $date and $namePattern are substituted before evaluation |
acquisition.fileage | String | none | Age-based file filter. Simple forms: >10d, <=5d. When the expression contains $age, it is evaluated as JavaScript with $age replaced by the file age in milliseconds (e.g. $age > (5*60*1000) && $age < (20*24*60*60*1000) selects files older than 5 minutes and younger than 20 days) |
acquisition.filesize | String | none | Size-based file filter. Simple forms: >10kb, <=5mb. When $size is detected the expression is evaluated as JavaScript with $size replaced by the file size in bytes |
acquisition.onlyValidTime | Boolean | false | When true, discard files whose timestamp cannot be read or parsed. When false, such files are still selected and are exempt from acquisition.fileage checks |
acquisition.useSymlink | Boolean | none | Whether symbolic links found in the listing should be included (true) or excluded (false) |
acquisition.removeParameters | Boolean | false | Strip HTTP query parameters (everything from ? onwards) from filenames when listing output contains URLs |
acquisition.debug | Boolean | false | Emit extra debug messages in the master logs for this host's acquisition process |
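The fragment below sketches how the filtering options above might be combined for one host. The wildcard pattern is a made-up illustration (the supported syntax depends on the transfer module), the age expression is the one quoted in the acquisition.fileage description, and the yes/no boolean spelling follows the quick-start examples on this page.

```properties
# Let the transfer module pre-filter the listing (hypothetical pattern)
acquisition.wildcardFilter = "*.grib"
# Keep files older than 5 minutes and younger than 20 days
acquisition.fileage = "$age > (5*60*1000) && $age < (20*24*60*60*1000)"
# Keep files larger than 10 kB (simple form)
acquisition.filesize = ">10kb"
# Discard files whose timestamp cannot be read or parsed
acquisition.onlyValidTime = "yes"
# Strip "?..." query strings when the listing contains URLs
acquisition.removeParameters = "yes"
```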
Date Parsing
These options control how the $date and $dirdate placeholders are resolved in acquisition.metadata and acquisition.target, and how dates in FTP server listings are parsed.
| Option | Type | Default | Description |
| --- | --- | --- | --- |
acquisition.datesource | String | none | Source string to parse the date from (e.g. a filename segment). When unset, the current date/time is used |
acquisition.dateformat | String | yyyyMMdd | Java SimpleDateFormat pattern used to format the resolved date |
acquisition.datedelta | Duration | none | Offset applied to the resolved date. Positive = forward in time, negative = backward |
acquisition.datepattern | String | none | Java SimpleDateFormat pattern used to parse the string from acquisition.datesource |
acquisition.defaultDateFormat | String | none | Default date format expected by the FTP server in listing output |
acquisition.recentDateFormat | String | none | Recent-date format expected by the FTP server (used when dates are displayed in a different format for recent files) |
acquisition.serverTimeZoneId | String | none | Time zone of the FTP server (e.g. UTC, America/New_York) |
acquisition.serverLanguageCode | String | none | Language code of the FTP server (affects month name parsing in listings) |
acquisition.shortMonthNames | String | none | Space-separated abbreviated month names if the server uses non-standard abbreviations |
acquisition.systemKey | String | none | FTP server system key for selecting the correct listing parser |
acquisition.regexFormat | String | none | Regex describing how the FTP server listing line is split into file attributes (see Apache Commons Net FTP file entry parser) |
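For an FTP server that produces Unix-style listings, the listing-parser options might look like the sketch below. All values are assumptions borrowed from common Apache Commons Net conventions (the UNIX system key, the "MMM d yyyy" and "MMM d HH:mm" date formats, a two-letter language code) and should be checked against the actual server output before use.

```properties
# Select the Unix-style listing parser (assumed key)
acquisition.systemKey = "UNIX"
# Format used for older files, which show a year instead of a time
acquisition.defaultDateFormat = "MMM d yyyy"
# Format used for recent files, which show a time instead of a year
acquisition.recentDateFormat = "MMM d HH:mm"
# Interpret listing times as UTC and month names as English
acquisition.serverTimeZoneId = "UTC"
acquisition.serverLanguageCode = "en"
```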
File Registration
When a new file is discovered it is registered as a data transfer. These options control how it is registered.
| Option | Type | Default | Description |
| --- | --- | --- | --- |
acquisition.target | String | none | Target filename for the registered transfer. Overrides the source filename. Placeholders: $destination, $name, $target, $original, $link, $dirdate, $date, $timestamp (epoch ms) |
acquisition.metadata | String | none | Metadata attached to the registered transfer. Comma-separated key=value pairs. Same placeholders as acquisition.target |
acquisition.lifetime | Duration | none | Lifetime of the registered transfer. Expired transfers are marked unavailable for dissemination or download |
acquisition.priority | Integer | none | Dissemination queue priority for the registered transfer (higher = earlier) |
acquisition.standby | Boolean | false | Register the transfer in standby mode (not processed until explicitly released) |
acquisition.noretrieval | Boolean | false | When true, the file is not downloaded to the data movers. Instead it is accessed on-demand via the source host |
acquisition.deleteoriginal | Boolean | false | Delete the file from the remote site after it has been successfully retrieved |
acquisition.version | String | none | (inherited from scheduler context) Optional version string attached to the transfer |
acquisition.groupby | String | ACQ_{destination}_{host} | Group-by label assigned to the registered transfer |
acquisition.transferGroup | String | none | Transfer group for processing the registered request |
acquisition.event | Boolean | false | Emit an event (e.g. MQTT notification) once the file has been retrieved and disseminated or made available in the Data Portal |
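As a rough illustration of the registration options above, the fragment below registers each discovered file under a dated target path with some metadata attached. The path, the metadata keys, and the priority value are hypothetical; only the option names, the placeholders, and the "P7D" duration form come from this page.

```properties
# Target name built from placeholders (hypothetical directory layout)
acquisition.target = "/incoming/$date/$name"
# Comma-separated key=value pairs (hypothetical keys)
acquisition.metadata = "origin=acquisition,day=$date"
# Mark the transfer unavailable after 7 days
acquisition.lifetime = "P7D"
# Higher values are disseminated earlier (illustrative value)
acquisition.priority = "50"
# Process immediately and leave the remote file in place
acquisition.standby = "no"
acquisition.deleteoriginal = "no"
```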
Deduplication
| Option | Type | Default | Description |
| --- | --- | --- | --- |
acquisition.uniqueByTargetOnly | Boolean | false | Use only the target name (not the source URL) as the deduplication key |
acquisition.uniqueByNameAndTime | Boolean | false | Include the file timestamp in the deduplication key when a valid timestamp is available |
acquisition.useTargetAsUniqueName | Boolean | false | When acquisition.uniqueByTargetOnly is enabled, use the target name rather than the original name for the key |
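The two fragments below sketch two alternative deduplication setups. They only combine the flags as described in the table; how the key is composed internally is not specified here, so treat the comments as an interpretation, not a specification.

```properties
# Alternative 1: key on the target name only, ignoring the source URL
acquisition.uniqueByTargetOnly = "yes"
acquisition.useTargetAsUniqueName = "yes"

# Alternative 2: also include the file timestamp in the key when it is valid
# acquisition.uniqueByNameAndTime = "yes"
```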
Requeue Behaviour
| Option | Type | Default | Description |
| --- | --- | --- | --- |
acquisition.requeueonupdate | Boolean | false | Requeue the transfer when the file is rediscovered with a newer timestamp AND a different size. Equivalent to setting acquisition.requeueon = "$time2 > $time1 && $size2 != $size1" |
acquisition.requeueonsamesize | Boolean | false | Requeue even when only the timestamp has changed (size is the same). Equivalent to acquisition.requeueon = "$time2 > $time1". Only used when acquisition.requeueonupdate is set and acquisition.requeueon is not |
acquisition.requeueOnFailure | Boolean | false | Requeue the transfer after a retrieval failure. Useful for MQTT-based acquisition where notifications are sent only once |
acquisition.requeueon | String | none | Custom JavaScript boolean expression controlling requeue behaviour. Variables: $size1 (original size), $size2 (new size), $time1 (original timestamp), $time2 (new timestamp), $destination, $target, $original. Takes precedence over requeueonupdate / requeueonsamesize |
acquisition.skipPostRetrievalSizeCheckPattern | String | none | Regex pattern matching filenames for which the post-retrieval size check (comparing discovered size vs. retrieved size) should be skipped |
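When neither built-in rule fits, a custom expression can be supplied. The sketch below uses only the variables listed for acquisition.requeueon; the specific condition (requeue only when the file is newer and has grown) is an illustrative choice, not a recommended default.

```properties
# Requeue only when the rediscovered file is newer and larger than before.
# Because requeueon is set, requeueonupdate and requeueonsamesize are ignored.
acquisition.requeueon = "$time2 > $time1 && $size2 > $size1"
# Also requeue after a failed retrieval
acquisition.requeueOnFailure = "yes"
```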
Queue & Listing Control
| Option | Type | Default | Description |
| --- | --- | --- | --- |
acquisition.listParallel | Boolean | false | Process multiple directories from the directory editor simultaneously |
acquisition.listMaxThreads | Integer | system | Maximum concurrent connections for parallel directory listing (requires acquisition.listParallel) |
acquisition.listMaxWaiting | Integer | system | Maximum queued listing jobs when parallel listing is enabled. Adding beyond this limit blocks until a slot is free |
acquisition.listSynchronous | Boolean | false | When true, wait for the full listing to complete before starting to process entries. Disable for MQTT sources where the listing never ends |
acquisition.maximumDuration | Duration | none | Maximum wall-clock time for one acquisition listing cycle |
acquisition.interruptSlow | Boolean | system | Kill the listing when acquisition.maximumDuration is exceeded. Falls back to the system-wide value when unset |
MQTT Payload
| Option | Type | Default | Description |
| --- | --- | --- | --- |
acquisition.payloadExtension | String | none | Extension appended to the inline-payload file created alongside the data file when an MQTT notification is received |
Quick-start examples
# Basic acquisition: requeue if file changes, expire after 7 days
acquisition.lifetime = "P7D"
acquisition.requeueonupdate = "yes"
acquisition.deleteoriginal = "no"
# Date-based target: parse a date from the target name, shift it back one day, and file the transfer under a month-day (MMdd) directory
acquisition.datesource = "$target[2..12]"
acquisition.datepattern = "yyyyMMddHH"
acquisition.datedelta = "-1d"
acquisition.dateformat = "MMdd"
acquisition.target = "/archive/$date/$name"
# Parallel listing of many directories
acquisition.listParallel = "yes"
acquisition.listMaxThreads = "5"
acquisition.listSynchronous = "no"
acquisition.maximumDuration = "PT2H"
acquisition.interruptSlow = "yes"
# MQTT acquisition: always requeue on failure, process payload inline
acquisition.requeueOnFailure = "yes"
acquisition.payloadExtension = ".json"
acquisition.event = "yes"
Retrieval Rate Control
These options throttle individual file downloads from the remote site during acquisition and abort downloads that are too slow or run too long.
| Option | Type | Default | Description |
| --- | --- | --- | --- |
retrieval.minimumRate | ByteSize/s | none | Abort the retrieval if the transfer rate drops below this threshold after retrieval.minimumDuration has elapsed |
retrieval.minimumDuration | Duration | none | Grace period before rate and duration checks begin |
retrieval.maximumDuration | Duration | none | Abort the retrieval if it exceeds this wall-clock time |
retrieval.rateThrottling | ByteSize/s | none | Cap the retrieval rate to this maximum |
retrieval.interruptSlow | Boolean | false | Enable the slow-transfer kill switch (requires retrieval.maximumDuration or retrieval.minimumRate) |
Quick-start example
retrieval.minimumRate = "100kB"
retrieval.minimumDuration = "PT1M"
retrieval.maximumDuration = "PT2H"
retrieval.interruptSlow = "yes"
retrieval.rateThrottling = "50MB"