# Feature Update
I have added the following features and changes in this PR based on experience with client needs.
This PR improves the stability of the utility and provides features that are better aligned with an enterprise application.
## Configuration File Alternative to Command-Line Arguments
This PR adds support for configuring the `SailPoint File Upload Utility` with a JSON file, reducing the complexity of the command-line arguments and enabling the additional features described below.
Below is an example of a configuration file:
```json
{
  "tenant": {
    "url": "https://<example>.api.identitynow.com",
    "clientId": "env",
    "clientSecret": "env"
  },
  "proxy": {
    "enabled": false,
    "host": "",
    "port": 1234,
    "user": "",
    "password": ""
  },
  "aggregations": [
    {
      "sourceId": "<example source id>",
      "disableOptimization": false,
      "objectType": "Account",
      "recursive": false,
      "simulate": false,
      "timeout": 10000,
      "extensions": [
        "csv",
        "tsv"
      ],
      "enableFileJourney": true,
      "structure": {
        "in": "/data/sailpoint/in",
        "stage": "/data/sailpoint/stage",
        "archive": "/data/sailpoint/archive",
        "error": "/data/sailpoint/error"
      }
    }
  ],
  "ks": "",
  "iv": ""
}
```
Most of the values in the configuration file map directly to the command-line arguments outlined in the [README.md](./README.md), organized more efficiently and with some additional features outlined below:
### Attributes
The following attributes and objects must be present in the configuration file. If an attribute has no value, an empty string `""` is accepted (`[]` for arrays):
|Object|Name|Type|Description|
|---|---|---|---|
||`tenant`|`Tenant`|Object to outline the details of the SailPoint tenant to connect to.|
|`Tenant`|`url`|`String`|Contains the details of the SailPoint tenant to connect to.|
|`Tenant`|`clientId`|`String`|SailPoint Client ID (PAT). If value of `env` is provided, then the value for environment variable `SAIL_CLIENT_ID` will be used.|
|`Tenant`|`clientSecret`|`String`|SailPoint Client Secret (PAT). If value of `env` is provided, then the value for environment variable `SAIL_CLIENT_SECRET` will be used. Note: A plain-text `clientSecret` is automatically encrypted on first run.|
||`aggregations`|`Aggregation (Array)`|Array of `Aggregation` objects which contain the configuration for each source to be aggregated.|
|`Aggregation`|`sourceId`|`String`|SailPoint ID for the Source to be aggregated. If File Upload Utility cannot find the specified `sourceId` in the tenant's configuration, then it will skip the aggregation and move on to the next if there is one.|
|`Aggregation`|`objectType`|`String`|`Account` or `Entitlement` Schema|
|`Aggregation`|`recursive`|`Boolean`|Recursively search directories|
|`Aggregation`|`timeout`|`Integer`|Timeout (in milliseconds). Default: 10000 (10s)|
|`Aggregation`|`disableOptimization`|`Boolean`|Disable Optimization on Account Aggregation|
|`Aggregation`|`simulate`|`Boolean`|Simulation Mode. Scans for files but does not aggregate.|
|`Aggregation`|`extensions`|`String (Array)`|List of extensions to search for within the directory structure.|
|`Aggregation`|`enableFileJourney`|`Boolean`|Flag to enable file processing, moving files through the `in`, `stage`, `archive` and `error` directories|
|`Aggregation`|`structure`|`Structure`|Directory structure for importing files|
|`Structure`|`in`|`String`|Path to file or directory for aggregation|
|`Structure`|`stage`|`String`|Path to directory for processing aggregations|
|`Structure`|`archive`|`String`|Path to directory for successful aggregations|
|`Structure`|`error`|`String`|Path to directory for failed aggregations|
||`proxy`|`Proxy`|Object to outline the details of a proxy server (if required).|
|`Proxy`|`enabled`|`Boolean`|Flag to enable Web (HTTP) proxies support.|
|`Proxy`|`host`|`String`|Hostname for proxy server (if `enabled` is set to `true`).|
|`Proxy`|`port`|`Integer`|Proxy TCP port for proxy server (if `enabled` is set to `true`).|
|`Proxy`|`user`|`String`|Username for proxy server (if `enabled` is set to `true`).|
|`Proxy`|`password`|`String`|Password for proxy server (if `enabled` is set to `true`). Note: A plain-text `password` is automatically encrypted on first run.|
||`ks`|`String`|Internally used. Soon to be removed.|
||`iv`|`String`|Internally used. Soon to be removed.|
### New Features
The following new features have been introduced as part of the configuration file:
#### One File Config
A single configuration file reduces the reliance on command-line arguments and simplifies maintenance of the application.
#### Individual Source Aggregation Configuration
Previously, the following settings could only be set globally for the application's run (regardless of the files and sources being aggregated):
- timeout
- disableOptimization
- extensions
- simulate
Now each of these options can be configured per source, enabling more granular configuration, as shown below:
```json
{
  ...
  "aggregations": [
    {
      "sourceId": "<example source id>",
      "disableOptimization": false,
      "objectType": "Account",
      "recursive": false,
      "simulate": false,
      "timeout": 10000,
      "extensions": [
        "csv",
        "tsv"
      ],
      "enableFileJourney": true,
      "structure": {
        "in": "/data/sailpoint/in",
        "stage": "/data/sailpoint/stage",
        "archive": "/data/sailpoint/archive",
        "error": "/data/sailpoint/error"
      }
    }
  ],
  ...
}
```
#### File Archival / File Journey - Built in
In many cases, files are picked up from a directory and processed, with a reliance on a separate or overarching script (`bash`/`PowerShell`, for example) that either stages the file for processing by this application or moves it to an archival directory afterwards.
This has been addressed with the `structure` object within the configuration:
```json
{
  ...
  "aggregations": [
    {
      "sourceId": "<example source id>",
      ...
      "extensions": [
        "csv",
        "tsv"
      ],
      "enableFileJourney": true,
      "structure": {
        "in": "/data/sailpoint/in",
        "stage": "/data/sailpoint/stage",
        "archive": "/data/sailpoint/archive",
        "error": "/data/sailpoint/error"
      }
    }
  ],
  ...
}
```
As can be seen above, if the `enableFileJourney` flag is set to `true`, the following happens during processing:
1) The directory specified in the `in` attribute is scanned for files with extensions which match one of the `extensions` also specified.
2) Upon finding a file, the file is renamed with a date-and-time prefix, e.g. `20250705010203_<filename>`.
3) The file `20250705010203_<filename>` is then moved to the directory specified in the `stage` attribute, clearing the `in` directory.
4) An import / aggregation of the file is then attempted.
5) If the file is successfully aggregated, it is moved into the directory specified in the `archive` attribute.
6) If the aggregation fails for any reason, the file is moved into the directory specified in the `error` attribute.
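The successful path through this journey can be sketched in shell terms. This is purely illustrative: the utility performs these moves internally, and the paths and filename below are throwaway examples:

```shell
# Illustrative sketch of the file journey for one successful aggregation.
IN=/tmp/journey-demo/in
STAGE=/tmp/journey-demo/stage
ARCHIVE=/tmp/journey-demo/archive
mkdir -p "$IN" "$STAGE" "$ARCHIVE"
touch "$IN/users.csv"

for f in "$IN"/*.csv; do
  # Steps 2-3: prefix with the date/time and move to the stage directory.
  stamped="$(date +%Y%m%d%H%M%S)_$(basename "$f")"
  mv "$f" "$STAGE/$stamped"
  # Step 4 (aggregation) would happen here; step 5: archive on success.
  mv "$STAGE/$stamped" "$ARCHIVE/"
done

ls "$ARCHIVE"   # the timestamped file now lives in the archive directory
```

On a failed aggregation, the second `mv` would instead target the `error` directory.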
If the `enableFileJourney` flag is set to `false`, the following happens during processing:
1) The directory specified in the `in` attribute is scanned for files with extensions which match one of the `extensions` also specified.
2) An import / aggregation of each matching file is then attempted.
#### On-the-fly Encryption of Credentials
As in the original codebase, the environment variables for the tenant Client ID and Secret can be used. However, if this is not viable, the Client Secret can be entered in **plain text** in the configuration file.
Upon the first run of the application, the secret will be encrypted and will no longer be human readable.
Note: This is not a foolproof encryption method, as the keys are stored in the config file; it is only a deterrent at this time. A more suitable alternative is currently being looked into.
## Logging Support with Log4j 2
This PR for `SailPoint File Upload Utility` replaces the basic logging functionality and implements `Apache Log4j 2`, providing a robust and consistent mechanism for capturing and managing application logs. This enhancement enables fine-grained control over log levels, formatting, and output destinations, making it easier for developers and administrators to monitor and troubleshoot the utility in a variety of environments.
By adopting Log4j 2, the utility inherently supports streaming log data to `Security Information and Event Management (SIEM)` solutions such as `Splunk`, `ELK Stack`, or other centralized log aggregation platforms. This is particularly valuable for enterprise deployments, where maintaining visibility, ensuring compliance, and detecting anomalies in real time are critical requirements.
Key benefits of this implementation include:
- **Flexible Configuration:** Customize log levels and appenders (e.g., console, rolling files) via simple configuration changes without modifying the application code.
- **Enterprise Integration:** Seamlessly forward logs to SIEM systems to support centralized monitoring and alerting pipelines.
- **Enhanced Debugging and Auditing:** Provides clear, structured logs to facilitate operational support and forensic analysis.
### Default Configuration
The default configuration is outlined below and logs detailed output to the console and to the file `logs/FileUpload.log`:
```properties
# Root logger option
rootLogger = INFO, STDOUT, LOGFILE
# Define console appender
appender.console.name = STDOUT
appender.console.type = Console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = [%-5level] %msg%n
# Define file appender
appender.file.type = File
appender.file.name = LOGFILE
appender.file.fileName = logs/FileUpload.log
appender.file.layout.type = PatternLayout
appender.file.layout.pattern = [%-5level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %c{1} - %msg%n
appender.file.filter.threshold.type = ThresholdFilter
appender.file.filter.threshold.level = info
```
You can override the default configuration by specifying `-Dlog4j.configurationFile=directory/file.xml` on the command line, as outlined below.
### Log4J Configuration Formats
Log4j 2 configuration files can be written in either `properties` or `xml` format (JSON and YAML are also supported).
The above configuration `properties` file can alternatively be provided as an `xml` file as below:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Console name="STDOUT" target="SYSTEM_OUT">
      <PatternLayout pattern="[%level] %msg%n"/>
    </Console>
    <File name="LOGFILE" fileName="logs/FileUpload.log">
      <PatternLayout pattern="[%level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %c{1} - %msg%n"/>
      <ThresholdFilter level="info" onMatch="ACCEPT" onMismatch="DENY"/>
    </File>
  </Appenders>
  <Loggers>
    <Root level="INFO">
      <AppenderRef ref="STDOUT"/>
      <AppenderRef ref="LOGFILE"/>
    </Root>
  </Loggers>
</Configuration>
```
Both the `properties` and the `xml` file achieve the same result:
- Output runtime messages to the console.
- Log runtime messages, with a little more detail, to the file `logs/FileUpload.log`.
- Both log at the INFO level.
### Custom Logging
As previously mentioned, you can provide your own log configuration file to override the default log configuration.
This is easily achieved by adding `-Dlog4j.configurationFile=directory/file.xml` to the command line and providing a valid Log4j 2 `properties` or `xml` file, for example:
- `java -Dlog4j.configurationFile=directory/file.xml -jar sailpoint-file-upload-utility.jar <commands>`
- `java -Dlog4j.configurationFile=directory/file.properties -jar sailpoint-file-upload-utility.jar <commands>`
### Security Information and Event Management (SIEM)
`Apache Log4j 2` is designed to be highly extensible and integrates smoothly with `SIEM` systems without requiring significant custom development. Out of the box, Log4j 2 provides multiple mechanisms to stream or forward logs to external systems like `Splunk`, `ELK (Elasticsearch, Logstash, Kibana)`, `QRadar`, and other SIEM platforms. This capability is one of the main reasons I integrated Log4j 2.
#### Basic Integration (Splunk) Example
The following example and instructions facilitate logging from the `SailPoint File Upload Utility` to an instance of `Splunk` by providing a custom `Log4j 2 Configuration File` on the command line.
This example assumes you have an instance of Splunk available that has:
1) An HTTP Event Collector enabled (see the [Splunk HEC documentation](https://dev.splunk.com/enterprise/docs/devtools/httpeventcollector/)).
2) An access token for the HTTP Event Collector.
3) Firewall rules from the File Uploader machine to the Splunk instance.
##### Log4j Configuration
The following configuration file:
- Prints the log output to the console.
- Appends the log output, with more detail, to the file `logs/FileUpload.log`.
- Pushes the log output to the `Splunk` server outlined in the appenders section.
- Uses the DEBUG log level for testing purposes.
```xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Console name="STDOUT" target="SYSTEM_OUT">
      <PatternLayout pattern="[%level] %msg%n"/>
    </Console>
    <File name="LOGFILE" fileName="logs/FileUpload.log">
      <PatternLayout pattern="[%level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %c{1} - %msg%n"/>
      <ThresholdFilter level="info" onMatch="ACCEPT" onMismatch="DENY"/>
    </File>
    <Http name="SPLUNK" url="https://<<<Splunk:Instance>>>/services/collector/raw">
      <Property name="Authorization" value="Splunk <<<Splunk:Access Token>>>"/>
      <Property name="Content-Type" value="application/json"/>
      <PatternLayout pattern="%d %m%n"/>
    </Http>
  </Appenders>
  <Loggers>
    <Root level="DEBUG">
      <AppenderRef ref="STDOUT"/>
      <AppenderRef ref="LOGFILE"/>
      <AppenderRef ref="SPLUNK"/>
    </Root>
  </Loggers>
</Configuration>
```
##### Command Line Arguments
As previously mentioned, you can override the default configuration by specifying `-Dlog4j.configurationFile=directory/file.xml` on the command line, pointing to your custom `Splunk` log configuration above:
```bash
java -Dlog4j.configurationFile=/path/to/splunk-logging.xml -jar /path/to/sailpoint-file-upload-utility-4.1.1-all.jar <commands>
```
Example:
```bash
java -Dlog4j.configurationFile=/var/uploader/splunk-logging.xml -jar /var/uploader/sailpoint-file-upload-utility-4.1.1-all.jar --config-file /var/uploader/my-config.json
```
### Further Log Configuration
For more details on configuring logging for your environment using `Log4j`, refer to the [Log4j 2 Documentation](https://logging.apache.org/log4j/2.x/manual/configuration.html) or utilize the examples outlined above.
## Added Environment Variables for Proxy Username and Password
As with the environment variables for Client ID and Secret, I have added functionality to specify the proxy username and password with the following environment variables:
- SAIL_PROXY_USER
- SAIL_PROXY_PASS
These can be used either by setting the value `env` for `--proxyUser` and/or `--proxyPassword` on the command line, or by setting the respective values to `env` in the config file:
```json
{
  ...
  "proxy": {
    "enabled": true,
    "host": "my-proxy.host",
    "port": 1234,
    "user": "env",
    "password": "env"
  },
  ...
}
```
The above configuration will retrieve the proxy username and password from the environment variables.
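For completeness, the variables can be set before a run as follows (the values are placeholders):

```shell
# Placeholder proxy credentials -- substitute your own values.
export SAIL_PROXY_USER="proxy-user"
export SAIL_PROXY_PASS="proxy-pass"

# With "user": "env" and "password": "env" in the proxy object,
# the utility reads the two variables above at startup:
# java -jar sailpoint-file-upload-utility.jar --config-file config.json
```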