JasperReports - CSV Data Source Sample
Shows how the CSV data source and the CSV query executer implementations could be used to fill reports.
Main Features in This Sample
CSV Data Source
CSV Query Executer
Secondary Features
CSV Data Source
Description / Goal
How to fill a report using data from a CSV file.
Since: 1.2.0
Other Samples
/demo/samples/datasource
CSV Data Sources
Report filling is one of the basic steps of the report generation process. After the report is compiled, significant report data are read from the report data source, or calculated from report expressions, and the generated document is filled section by section.
Data sources are very useful when data come as a set of structured records, either extracted from a relational database, or loaded from specific files. In order to become more familiar with data source objects please consult the Data Sources section.
Sometimes the data that users need to fill the report with is found in plain text files, in a certain format, such as the popular CSV (comma-separated values) format.
JasperReports provides an implementation for such a data source, by wrapping the CSV data from a text file into a JRCsvDataSource. The CSV data source usually needs to read a file from disk, or at least from an input stream. CSV is a file format with very few formatting rules: data rows are separated by a record delimiter (text sequence) and fields inside each row are separated by a field delimiter (character). Fields containing delimiter characters can be placed inside quotes. If fields contain quotes themselves, these are duplicated (for example, "John ""Doe""" will be displayed as John "Doe").
The most common delimiters for CSV files are:
- comma (,) as field delimiter
- newline (\n) as record delimiter
Users can override these default values by calling the setFieldDelimiter(char) and setRecordDelimiter(String) methods of the JRCsvDataSource class.
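For instance, here is a minimal sketch (using a hypothetical semicolon-separated resource named data/orders.csv, which is not part of this sample) of a data source configured with non-default delimiters:
import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.data.JRCsvDataSource;
import net.sf.jasperreports.engine.util.JRLoader;

public class CustomDelimitersSketch
{
    public static JRCsvDataSource createDataSource() throws JRException
    {
        // hypothetical semicolon-separated resource, not shipped with this sample
        JRCsvDataSource ds = new JRCsvDataSource(JRLoader.getLocationInputStream("data/orders.csv"));
        ds.setFieldDelimiter(';');     // override the default comma
        ds.setRecordDelimiter("\r\n"); // override the default newline
        return ds;
    }
}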
There are two categories of CSV files:
- files with a header record containing column names, all the other records containing field values;
- files without a header record; all records contain only field values. This is the default category.
For files in the first category, column names are read from the first row of the CSV file, but the engine has to be told that the input file belongs to this category. This is done by calling the setUseFirstRowAsHeader(true) method of the JRCsvDataSource.
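For example, a minimal sketch (assuming a hypothetical data/customers.csv resource whose first record holds the column names) could look like this:
import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.data.JRCsvDataSource;
import net.sf.jasperreports.engine.util.JRLoader;

public class HeaderRowSketch
{
    public static JRCsvDataSource createDataSource() throws JRException
    {
        // hypothetical resource whose first row contains the column names
        JRCsvDataSource ds = new JRCsvDataSource(JRLoader.getLocationInputStream("data/customers.csv"));
        ds.setUseFirstRowAsHeader(true); // column names are read from the first record
        return ds;
    }
}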
If files belong to the second category (the default), column names have to be provided separately, keeping in mind that the order of the column names should match the column order in a data record. If this is not possible, the default naming convention is to use the COLUMN_ prefix followed by the zero-based column index. For instance, if it is known that a particular column is the third column in the record (index=2), then one could name the corresponding field "COLUMN_2" and use the column data without problems.
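To illustrate this convention, here is a hedged sketch (a standalone lookup written only for illustration, not code from this sample) that reads the third column of each record in /data/CsvDataSource.txt through a field named COLUMN_2:
import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.data.JRCsvDataSource;
import net.sf.jasperreports.engine.design.JRDesignField;
import net.sf.jasperreports.engine.util.JRLoader;

public class ColumnNamingSketch
{
    public static void main(String[] args) throws JRException
    {
        // no column names are supplied, so the default COLUMN_<index> convention applies
        JRCsvDataSource ds = new JRCsvDataSource(JRLoader.getLocationInputStream("data/CsvDataSource.txt"));
        ds.setRecordDelimiter("\r\n");

        // field named after the zero-based index of the third column (the person name)
        JRDesignField nameField = new JRDesignField();
        nameField.setName("COLUMN_2");
        nameField.setValueClassName("java.lang.String");

        while (ds.next())
        {
            System.out.println(ds.getFieldValue(nameField));
        }
    }
}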
Another issue when working with CSV files is related to data types. Handling data types for fields in CSV data sources is special since the CSV file format does not provide such information. This is solved by trying to match each field in the data source to its corresponding report field type. For number and date/time fields, converting text values to java.lang.Number and java.util.Date values respectively requires text parsing using format objects. This is controlled by specifying the date and number format objects to be used with the JRCsvDataSource instance, by calling its setDateFormat(DateFormat) and setNumberFormat(NumberFormat) methods before passing it to the report-filling process.
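As a minimal sketch (the patterns below are illustrative assumptions, not values used by this sample), the parsing formats could be configured like this:
import java.text.DecimalFormat;
import java.text.SimpleDateFormat;

import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.data.JRCsvDataSource;
import net.sf.jasperreports.engine.util.JRLoader;

public class CsvFormatsSketch
{
    public static JRCsvDataSource createDataSource() throws JRException
    {
        // hypothetical CSV resource containing date and numeric columns
        JRCsvDataSource ds = new JRCsvDataSource(JRLoader.getLocationInputStream("data/orders.csv"));
        ds.setDateFormat(new SimpleDateFormat("yyyy-MM-dd")); // parses text into java.util.Date field values
        ds.setNumberFormat(new DecimalFormat("#,##0.##"));    // parses text into java.lang.Number field values
        return ds;
    }
}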
CSV Data Source Example
In our example, data records are stored in the /data/CsvDataSource.txt file. Although the file extension is .txt, it contains structured data according to the CSV rules:
"Dallas",47,"Janet Fuller","445 Upland Pl.","Trial"
"Lyon",38,"Andrew Heiniger","347 College Av.","Active"
"Dallas",43,"Susanne Smith","2 Upland Pl.","Active"
"Berne",22,"Bill Ott","250 - 20th Ave.","Active"
"Boston",32,"Michael Ott","339 College Av.","Trial"
"Dallas",4,"Sylvia Ringer","365 College Av.","Active"
"Boston",23,"Julia Heiniger","358 College Av.","Active"
"Chicago",39,"Mary Karsen","202 College Av.","Active"
"Dallas",40,"Susanne Miller","440 - 20th Ave.","Trial"
"Berne",9,"James Schneider","277 Seventh Av.","Active"
"Dallas",36,"John Steel","276 Upland Pl.","Suspended"
"Chicago",35,"George Karsen","412 College Av.","Suspended"
"Dallas",37,"Michael Clancy","19 Seventh Av.","Deleted"
"Lyon",2,"Anne Miller","20 Upland Pl.","Active"
"Dallas",0,"Laura Steel","429 Seventh Av.","Active"
"Lyon",28,"Susanne White","74 - 20th Ave.","Deleted"
"Paris",5,"Laura Miller","294 Seventh Av.","Active"
"Lyon",17,"Laura Ott","443 Seventh Av.","Active"
"New York",46,"Andrew May","172 Seventh Av.","Active"
"New York",44,"Sylvia Ott","361 College Av.","Active"
"Dallas",19,"Susanne Heiniger","86 - 20th Ave.","Active"
"Chicago",11,"Julia White","412 Upland Pl.","Active"
"Dallas",10,"Anne Fuller","135 Upland Pl.","Active"
"New York",41,"Bill King","546 College Av.","Deleted"
"Oslo",45,"Janet May","396 Seventh Av.","Active"
"Paris",18,"Sylvia Fuller","158 - 20th Ave.","Trial"
"San Francisco",48,"Robert White","549 Seventh Av.","Active"
"Paris",25,"Sylvia Steel","269 College Av.","Suspended"
"San Francisco",7,"James Peterson","231 Upland Pl.","Active"
"Oslo",42,"Robert Ott","503 Seventh Av.","Trial"
The file has no header row with column names. Column names are set independently, as shown in the getDataSource() method in the /src/CsvDataSourceApp.java file:
private static JRCsvDataSource getDataSource() throws JRException
{
    // column names in the same order as the columns in each CSV record
    String[] columnNames = new String[]{"city", "id", "name", "address", "state"};
    JRCsvDataSource ds = new JRCsvDataSource(JRLoader.getLocationInputStream("data/CsvDataSource.txt"));
    ds.setRecordDelimiter("\r\n");
    ds.setColumnNames(columnNames);
    return ds;
}
The five column names are: city, id, name, address and state, in this particular order. Field names are identical to the column names defined here.
The JRCsvDataSource object prepared above is passed to the engine at fill time (see again the /src/CsvDataSourceApp.java file):
public void fill() throws JRException
{
    long start = System.currentTimeMillis();
    //Preparing parameters
    Map<String, Object> parameters = new HashMap<String, Object>();
    parameters.put("ReportTitle", "Address Report");
    parameters.put("DataFile", "CsvDataSource.txt - CSV data source");
    Set<String> states = new HashSet<String>();
    states.add("Active");
    states.add("Trial");
    parameters.put("IncludedStates", states);

    JasperFillManager.fillReportToFile("build/reports/CsvDataSourceReport.jasper", parameters, getDataSource());
    System.err.println("Filling time : " + (System.currentTimeMillis() - start));
}
The IncludedStates parameter defined above is used for data filtering. Only records with the Active or Trial state will be taken into account:
<parameter name="IncludedStates" class="java.util.Set"/>
...
<filterExpression><![CDATA[$P{IncludedStates}.contains($F{state})]]></filterExpression>
Running the Sample
Running the sample requires Apache Maven. Make sure that Maven (version 3.6 or later) is already installed on your system.
In a command prompt/terminal window set the current folder to demo/samples/csvdatasource within the JasperReports source project and run the following command:
> mvn clean compile exec:exec@all
It will generate all supported document types containing the sample report in the demo/samples/csvdatasource/target/reports directory.
CSV Query Executer
Description / Goal
How to fill a report using the CSV query executer.
Since: 4.0.0
Other Samples
/demo/samples/ejbql
/demo/samples/hibernate
/demo/samples/mondrian
/demo/samples/xmldatasource
Using The CSV Query Executer
The other (and highly recommended) option available for preparing a JRCsvDataSource is to let the JasperReports built-in CSV query executer prepare one for us. When the query language is set to CSV or csv, the CSV query executer registered for this language takes the responsibility of creating and populating the needed CSV data source.
JasperReports ships with a default implementation of the CSV query executer, named JRCsvQueryExecuter. In order to create a valid CSV data source, this query executer needs:
- To access the resource containing the CSV data. Depending on its type, the resource can be located using one of the following:
  - the CSV_READER parameter, if the resource is available as a java.io.Reader object
  - the CSV_INPUT_STREAM parameter, if the resource is available as a java.io.InputStream object
  - the CSV_URL parameter, if the resource is located at a given URL
  - the CSV_FILE parameter, if the resource is available as a java.io.File object
  - the CSV_SOURCE parameter or property, a String representing a valid path to the requested resource
- To know the charset encoding of the CSV data, in order to create a data source with the appropriate encoding. The encoding can be set using the CSV_ENCODING parameter or property.
- To know the java.util.Locale associated with the CSV data, in order to create a properly localized data source. The java.util.Locale can be set using the CSV_LOCALE or CSV_LOCALE_CODE parameter.
- To know the java.util.TimeZone associated with the CSV data, in order to create a properly localized data source. The java.util.TimeZone can be set using the CSV_TIMEZONE or CSV_TIMEZONE_ID parameter.
- To know the number format associated with the CSV data, in order to create proper java.lang.Number objects from numeric fields stored in the CSV data. The number format can be set using the CSV_NUMBER_FORMAT or CSV_NUMBER_PATTERN parameter.
- To know the date format associated with the CSV data, in order to create proper java.util.Date objects from date fields stored in the CSV data. The date format can be set using the CSV_DATE_FORMAT or CSV_DATE_PATTERN parameter.
- To know the field delimiter character, in order to parse the CSV resource properly. The field delimiter can be set using the CSV_FIELD_DELIMITER parameter or property. The default field delimiter is a comma.
- To know the record delimiter sequence, in order to parse the CSV resource properly. The record delimiter can be set using the CSV_RECORD_DELIMITER parameter or property.
- To know the column order and column names, to be used when creating the mapping between fields in the report template and fields in the data source. Column names can be set using the CSV_COLUMN_NAMES parameter or property, or the CSV_COLUMN_NAMES_ARRAY parameter. In various situations the first row of the CSV resource contains no data, but column names. In that case the query executer will read the column names from this row, provided the CSV_USE_FIRST_ROW_AS_HEADER parameter/property is set to true; any column names stored in CSV_COLUMN_NAMES or CSV_COLUMN_NAMES_ARRAY will then be ignored.
Once all this input information is available, the query executer looks up the CSV resource, reads it, and extracts the data needed to populate the requested JRCsvDataSource.
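Most of these settings can be supplied either as report properties (as this sample does) or as fill-time parameters. The following hedged sketch assumes the fully qualified net.sf.jasperreports.csv.* names shown in this sample's JRXML can be used directly as parameter names in the fill parameter map:
import java.util.HashMap;
import java.util.Map;

import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.JasperFillManager;

public class CsvQueryExecuterParametersSketch
{
    public static void fill() throws JRException
    {
        Map<String, Object> parameters = new HashMap<String, Object>();
        // resource location and parsing hints for the CSV query executer
        parameters.put("net.sf.jasperreports.csv.source", "data/CsvDataSource.txt");
        parameters.put("net.sf.jasperreports.csv.record.delimiter", "\r\n");
        parameters.put("net.sf.jasperreports.csv.column.names", "city, id, name, address, state");

        // no data source argument: the query executer builds the JRCsvDataSource itself
        JasperFillManager.fillReportToFile("build/reports/CsvQueryExecuterReport.jasper", parameters);
    }
}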
The CSV Query Executer Sample
An example of using the CSV query executer is available in the reports/CsvQueryExecuterReport.jrxml template.
There are two properties set at report level:
<property name="net.sf.jasperreports.csv.column.names" value="city, id, name, address, state"/>
<property name="net.sf.jasperreports.csv.source" value="data/CsvDataSource.txt"/>
The first one stores the column names, while the second stores a relative path to the CSV resource file.
Next, there is a report parameter containing the record delimiter sequence:
<parameter name="net.sf.jasperreports.csv.record.delimiter" class="java.lang.String">
<defaultValueExpression><![CDATA["\r\n"]]></defaultValueExpression>
</parameter>
Notice also that the query language is set to csv, meaning that the CSV query executer will be used to prepare the data source for the report:
<query language="csv"/>
Field names in the report are the same as the column names set in the net.sf.jasperreports.csv.column.names property:
<field name="id" class="java.lang.Integer"/>
<field name="name" class="java.lang.String"/>
<field name="address" class="java.lang.String"/>
<field name="city" class="java.lang.String"/>
<field name="state" class="java.lang.String"/>
In the src/CsvDataSourceApp.java file one can see the absence of any data source or connection info at fill time:
// query executer filling
{
    start = System.currentTimeMillis();

    Map<String, Object> parameters = new HashMap<String, Object>();
    parameters.put("ReportTitle", "Address Report");
    parameters.put("DataFile", "CsvDataSource.txt - CSV query executer");
    Set<String> states = new HashSet<String>();
    states.add("Active");
    states.add("Trial");
    parameters.put("IncludedStates", states);

    JasperFillManager.fillReportToFile("build/reports/CsvQueryExecuterReport.jasper", parameters);
    System.err.println("Report : CsvQueryExecuterReport.jasper. Filling time : " + (System.currentTimeMillis() - start));
}
This is because the report data source will be provided by the query executer. To see the result, run the sample as shown above and take a look at the CsvQueryExecuterReport.* documents in the build/reports output folder.