Streaming lines from a file or resource using the Java Stream API
Prior to the Java Stream API introduced in Java 8, the two main ways to read lines from a file in Java were via the BufferedReader class, which provides a readLine() method, or using Files.readAllLines(). The latter has the advantage of a simpler API, at the expense of reading the entire file into memory at once. (For convenience, we will refer to a "file" in the text that follows, but the same principle and API apply to any resource for which we can construct a Path.)
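For comparison, the two older approaches might be sketched as follows. The class name PreStreamLineReading, the method names and the temporary file are illustrative assumptions, not part of any library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class PreStreamLineReading {

    // Approach 1: BufferedReader.readLine() keeps only the current line in memory.
    static long countLines(Path path) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            while (reader.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    // Approach 2: Files.readAllLines() is shorter, but loads every line into a List.
    static List<String> readAll(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data written to a temporary file.
        Path path = Files.createTempFile("lines", ".txt");
        Files.write(path, Arrays.asList("alpha", "beta", "gamma"), StandardCharsets.UTF_8);
        System.out.println(countLines(path)); // 3
        System.out.println(readAll(path));    // [alpha, beta, gamma]
        Files.delete(path);
    }
}
```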
The Files.lines() method provides a convenient way to get almost the best of both worlds. I say "almost" because, as we will see, the issues of resource closure and exception handling make the code less clean than the simple case of streaming items from a collection or other in-memory data source. For reasons that we will explain below, a typical code pattern for streaming lines from a file will look something like this:
try (Stream<String> s = Files.lines(path)) {
    s.forEach(line -> {
        ...
    });
} catch (IOException | UncheckedIOException ioex) {
    ...
}
Compared to the usual case of streaming from a list or other collection, we have (perhaps disappointingly...) more "boilerplate" code.
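To make the pattern concrete, here is a minimal runnable sketch. The class LineStreamExample, the helper countNonBlankLines() and the choice of returning -1 on failure are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class LineStreamExample {

    // Hypothetical helper: counts the non-blank lines, or returns -1 on an I/O error.
    static long countNonBlankLines(Path path) {
        try (Stream<String> lines = Files.lines(path)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        } catch (IOException | UncheckedIOException ioex) {
            // IOException: the file could not be opened.
            // UncheckedIOException: a failure while lines were being read.
            System.err.println("Could not read " + path + ": " + ioex);
            return -1;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data written to a temporary file.
        Path path = Files.createTempFile("example", ".txt");
        Files.write(path, Arrays.asList("first", "", "second"));
        System.out.println(countNonBlankLines(path)); // 2
        Files.delete(path);
    }
}
```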
But as you might imagine, a Stream that is generated from a file or other I/O resource has some slightly different properties to a Stream
generated from a collection or other stream source:
- constructing the stream may involve opening the file or resource, and hence could itself fail and generate an exception (contrast
this with a Stream over a collection, where "the collection is not touched" until we actually terminate the stream pipeline with a collector etc);
- less commonly, the file could theoretically cause an exception to be thrown upon closure;
- the fact that the Stream source file needs to be closed means that the Stream that accesses it must be closed once we have finished
reading items from it;
- the underlying methods that actually fetch items from the stream may fail due to an I/O error during streaming.
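Two of the points above can be observed directly. The sketch below is illustrative only: LineStreamLifecycle and its methods are made-up names. It shows Files.lines() failing at construction time for a missing file, and uses Stream.onClose() to record when the stream is actually closed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

class LineStreamLifecycle {

    // Returns true if merely constructing the stream fails, before any terminal operation.
    static boolean failsOnConstruction(Path path) {
        try {
            Stream<String> s = Files.lines(path); // opens the file here
            s.close();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // Returns { closed straight after the terminal operation?, closed after the try block? }.
    static boolean[] closeStates(Path path) throws IOException {
        AtomicBoolean closed = new AtomicBoolean(false);
        boolean closedAfterTerminalOp = false;
        try (Stream<String> s = Files.lines(path).onClose(() -> closed.set(true))) {
            s.forEach(line -> { });               // the terminal operation does not close the stream
            closedAfterTerminalOp = closed.get();
        }
        return new boolean[] { closedAfterTerminalOp, closed.get() };
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("lifecycle", ".txt");
        Files.write(path, "one\ntwo\n".getBytes());
        boolean[] states = closeStates(path);
        System.out.println(states[0] + " " + states[1]); // false true
        Files.delete(path);
        System.out.println(failsOnConstruction(path));   // true: the file no longer exists
    }
}
```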
These factors explain the code pattern above:
- we need to catch the IOException thrown by Files.lines() if the file cannot be opened (an error on closure would surface as an UncheckedIOException, because Stream.close() declares no checked exceptions);
- we need to close the Stream, in this case by using a try-with-resources statement (but we could theoretically have used a finally block instead);
- we are likely to want to catch unchecked exceptions (an UncheckedIOException in this case) that could be thrown as the actual lines are read or decoded from the stream
(although strictly speaking, this is optional).
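For completeness, the finally-based alternative mentioned above might look like the following sketch. LineStreamWithFinally and countLines() are hypothetical names, and here the exceptions are simply left for the caller to handle.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class LineStreamWithFinally {

    // Closes the stream in a finally block rather than with try-with-resources.
    static long countLines(Path path) throws IOException {
        Stream<String> lines = Files.lines(path); // may throw IOException
        try {
            return lines.count();                 // may throw UncheckedIOException
        } finally {
            lines.close();                        // always release the underlying file
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("finally", ".txt");
        Files.write(path, Arrays.asList("a", "b", "c"));
        System.out.println(countLines(path)); // 3
        Files.delete(path);
    }
}
```

The try-with-resources form is usually preferable because it is shorter and harder to get wrong.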
To emphasise: IOException will be thrown in the event of an I/O error while initiating the stream; UncheckedIOException will
be thrown if there is an I/O error reading an individual line or closing the stream. (Note that even if the bytes are physically read from the file, a CharacterCodingException,
which is a subclass of IOException, could occur if the bytes cannot then be interpreted in the specified character encoding; during streaming it arrives wrapped in an UncheckedIOException.)
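The decoding case can be reproduced with a file containing a byte that is not valid UTF-8. In the sketch below (MalformedInputDemo and decodeFailureCause() are names invented for illustration), the decoding failure reaches our code as an UncheckedIOException whose cause is the underlying IOException subclass.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class MalformedInputDemo {

    // Returns the IOException behind a failure during streaming, or null if all lines decoded.
    static IOException decodeFailureCause(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            lines.forEach(line -> { });
            return null;
        } catch (UncheckedIOException e) {
            return e.getCause(); // the wrapped checked exception
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("malformed", ".txt");
        // The byte 0xFF can never appear in well-formed UTF-8.
        Files.write(path, new byte[] { 'o', 'k', '\n', (byte) 0xFF, '\n' });
        IOException cause = decodeFailureCause(path);
        System.out.println(cause instanceof CharacterCodingException); // true
        Files.delete(path);
    }
}
```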
What is UncheckedIOException?
As mentioned in our section on throwing exceptions from lambda expressions, the general-purpose interfaces in the java.util.function package
(and, therefore, the Java Stream API in general) do not support checked exceptions. This is actually not an issue in many typical cases of streaming, sorting and
filtering data from collections. Where an exception does need to be handled within a lambda expression that does not allow it to be thrown, a common solution is to re-cast
the exception as a RuntimeException.
With the advent of the Stream API, methods such as Files.lines() have been added to make I/O operations easier thanks to the use of
lambda expressions. A common exception that needs to be caught and re-cast within a lambda expression is therefore the IOException and its
subclasses. While these could be re-cast to RuntimeExceptions just like any other exception, the UncheckedIOException class was introduced
for convenience: it provides a standard RuntimeException subclass that we can use specifically when we re-cast IOExceptions,
reserving plain RuntimeExceptions for other less expected cases, and allowing us to distinguish between the different cases.
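As a small sketch of this re-casting in our own code: the lambda below calls Files.size(), which throws a checked IOException, and wraps any failure in an UncheckedIOException. The class UncheckedWrapping and the helper sizesOf() are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class UncheckedWrapping {

    // Hypothetical helper: the size in bytes of each file in the list.
    static List<Long> sizesOf(List<Path> paths) {
        return paths.stream()
                .map(p -> {
                    try {
                        return Files.size(p); // declares the checked IOException
                    } catch (IOException e) {
                        // Function.apply() cannot throw checked exceptions, so wrap it.
                        throw new UncheckedIOException(e);
                    }
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("sizes", ".txt");
        Files.write(path, new byte[] { 1, 2, 3, 4, 5 });
        System.out.println(sizesOf(Collections.singletonList(path))); // [5]
        Files.delete(path);
        try {
            sizesOf(Collections.singletonList(path)); // the file no longer exists
        } catch (UncheckedIOException e) {
            // The original checked exception is still available as the cause.
            System.out.println("Wrapped: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

A caller can then catch UncheckedIOException specifically, and leave other RuntimeExceptions to propagate as genuine programming errors.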
For more details, see the section on lambdas and exception handling.
Related information: streaming data from other sources
Various methods have been added to Java to allow you to stream data from collections and other sources and filter and organise the resulting items using lambda expressions.
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.