Java 8 is finally here! After years of waiting, Java programmers will finally get support for functional programming in Java. Functional programming support helps streamline existing code while providing powerful new capabilities to the Java language. One area that will be disrupted by these new features is how programmers work with databases in Java. Functional programming support opens up exciting new possibilities for simpler yet more powerful database APIs. Java 8 will enable new ways to access databases that are competitive with those of other programming languages such as C#’s LINQ.

The Functional Way of Working With Data

Java 8 not only adds functional support to the Java language, but it also extends the Java collection classes with new functional ways of working with data. Traditionally, working with large amounts of data in Java required a lot of loops and iterators.


For example, suppose you have a collection of Customer objects:

Collection<Customer> customers;

If you were only interested in the customers from Belgium, you would have to iterate over all the customers and save the ones you wanted.

Collection<Customer> belgians = new ArrayList<>();
for (Customer c : customers) {
    if (c.getCountry().equals("Belgium"))
        belgians.add(c);
}

This takes five lines of code. It is also poorly abstracted. What happens if you have 10 million customers, and you want to speed up the code by filtering it in parallel using two threads? You would have to rewrite everything to use futures and a lot of hairy multi-threaded code.

With Java 8, you can write the same code in one line. With its support for functional programming, Java 8 lets you write a function saying which customers you are interested in (those from Belgium) and then filter collections using that function. Java 8 has a new Streams API that lets you do this.

customers.stream().filter(
    c -> c.getCountry().equals("Belgium")
);

Not only is the Java 8 version of the code shorter, but the code is easier to understand as well. There is almost no boilerplate. The code calls the method filter(), so it's clear that this code is used for filtering customers. You don't have to spend your time trying to decipher the code in a loop to understand what it is doing with its data.
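
Note that a stream pipeline only does its work once a terminal operation such as collect() is called. A minimal, self-contained sketch of the whole round trip might look like the following. It assumes a hypothetical Customer class with getName() and getCountry(), since the article does not show the real one, and the helper name belgians() is made up for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

public class StreamFilterDemo {
    // Hypothetical minimal Customer class; the article does not show its definition.
    static class Customer {
        private final String name;
        private final String country;

        Customer(String name, String country) {
            this.name = name;
            this.country = country;
        }

        String getName() { return name; }
        String getCountry() { return country; }
    }

    // Keeps only the customers whose country is Belgium and gathers them into a list.
    static List<Customer> belgians(Collection<Customer> customers) {
        return customers.stream()
                .filter(c -> c.getCountry().equals("Belgium"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Collection<Customer> customers = Arrays.asList(
                new Customer("Alice", "Belgium"),
                new Customer("Bob", "USA"),
                new Customer("Carol", "Belgium"));

        // Prints the names of the Belgian customers in their original order.
        for (Customer c : belgians(customers)) {
            System.out.println(c.getName());
        }
    }
}
```

The filter itself is still the one-line lambda from the article; the rest is scaffolding so the example can be compiled and run on its own.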

And what happens if you want to run the code in parallel? You just have to use a different type of stream.

customers.parallelStream().filter(
    c -> c.getCountry().equals("Belgium")
);
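
To check that the sequential and parallel versions agree, a small sketch such as the one below can be used. It is only an illustration: it filters plain country strings rather than Customer objects, the data is synthetic, and the method names countSequential() and countParallel() are invented for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelFilterDemo {
    // Counts the matching entries on the calling thread.
    static long countSequential(List<String> countries, String wanted) {
        return countries.stream()
                .filter(c -> c.equals(wanted))
                .count();
    }

    // Same pipeline, but the stream may split the work across several threads.
    static long countParallel(List<String> countries, String wanted) {
        return countries.parallelStream()
                .filter(c -> c.equals(wanted))
                .count();
    }

    public static void main(String[] args) {
        // Synthetic data: every third entry is "Belgium", the rest are "USA".
        List<String> countries = IntStream.range(0, 1_000_000)
                .mapToObj(i -> i % 3 == 0 ? "Belgium" : "USA")
                .collect(Collectors.toList());

        long sequential = countSequential(countries, "Belgium");
        long parallel = countParallel(countries, "Belgium");

        // Both pipelines describe the same filter, so the counts match.
        System.out.println(sequential == parallel); // prints "true"
    }
}
```

Only the call that creates the stream changes; the filter lambda stays the same. Whether the parallel version is actually faster depends on the data size and the machine, so it is worth measuring before relying on it.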

What's even more exciting is that this functional style of code works with databases as well!


The Functional Way of Working with Databases

Traditionally, programmers have needed to use special database query languages to access the data in databases. For example, below is some JDBC code for finding all the customers from Belgium:

PreparedStatement s = con.prepareStatement(
      "SELECT * "
    + "FROM Customer C "
    + "WHERE C.Country = ? ");
s.setString(1, "Belgium");
ResultSet rs = s.executeQuery();

Much of the code is in the form of a string, which the compiler can't check for errors and which can lead to security problems due to sloppy coding. There is also a lot of boilerplate code that makes writing database access code quite tedious. Tools such as jOOQ solve the problem of error-checking and security by providing a database query language that can be written using special Java libraries. Alternatively, you can use tools such as object-relational mappers to hide a lot of boring database code for common access patterns, but if you need to write non-trivial database queries, you will still need to use a special database query language.

With Java 8, it's possible to write database queries using the same functional style used when working with the Streams API. For example, Jinq is an open source project that explores how future database APIs can make use of functional programming. Here is a database query written using Jinq:

customers.where(
    c -> c.getCountry().equals("Belgium")
);

This code is almost identical to the code using the Streams API. In fact, future versions of Jinq will let you write queries directly using the Streams API. When the code is run, Jinq will automatically translate the code into a database query like the JDBC query shown before.


So without having to learn a new database query language, you can write efficient database queries. You can use the same style of code you would use for Java collections. You also don't need a special Java compiler or virtual machine. All of this code compiles and runs using the normal Java 8 JDK. If there are errors in your code, the compiler will find them and report them to you, just like normal Java code.

Jinq supports queries that can be as complicated as SQL92. Selection, projection, joins, and subqueries are all supported. The algorithm for translating Java code into database queries is also very flexible in what code it will accept and translate. For example, Jinq has no problem translating the code below into a database query, despite its complexity.

customers
    .where( c -> c.getCountry().equals("Belgium") )
    .where( c -> {
        if (c.getSalary() < 100000)
            return c.getSalary() < c.getDebt();
        else
            return c.getSalary() < 2 * c.getDebt();
        } );

As you can see, the functional programming support in Java 8 is well-suited for writing database queries. The queries are compact, and complex queries are supported.
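
Because the condition is ordinary Java, the same lambdas can be run against an in-memory list to see which rows the query is meant to select. The sketch below does that with the Streams API's filter() instead of Jinq's where(). The Customer class, its int-valued getSalary() and getDebt(), and the sample data are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ComplexPredicateDemo {
    // Hypothetical Customer class; field types are assumed, not taken from Jinq.
    static class Customer {
        private final String name;
        private final String country;
        private final int salary;
        private final int debt;

        Customer(String name, String country, int salary, int debt) {
            this.name = name;
            this.country = country;
            this.salary = salary;
            this.debt = debt;
        }

        String getName() { return name; }
        String getCountry() { return country; }
        int getSalary() { return salary; }
        int getDebt() { return debt; }
    }

    // Applies the two conditions from the article to an in-memory list.
    static List<Customer> select(List<Customer> customers) {
        return customers.stream()
                .filter(c -> c.getCountry().equals("Belgium"))
                .filter(c -> {
                    if (c.getSalary() < 100000)
                        return c.getSalary() < c.getDebt();
                    else
                        return c.getSalary() < 2 * c.getDebt();
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("Ann", "Belgium", 50000, 60000),   // kept: 50000 < 60000
                new Customer("Ben", "Belgium", 150000, 60000),  // dropped: 150000 >= 2 * 60000
                new Customer("Cid", "USA", 50000, 60000),       // dropped: not from Belgium
                new Customer("Dee", "Belgium", 150000, 90000)); // kept: 150000 < 2 * 90000

        for (Customer c : select(customers)) {
            System.out.println(c.getName());
        }
    }
}
```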


Inner Workings

But how does this all work? How can a normal Java compiler translate Java code into database queries? Is there something special about Java 8 that makes this possible?

The key to supporting these new functional-style database APIs is a type of bytecode analysis called symbolic execution. Although your code is compiled by a normal Java compiler and run in a normal Java virtual machine, Jinq is able to analyze your compiled Java code when it is run and construct database queries from it. Symbolic execution works best when analyzing small functions, which are common when using the Java 8 Streams API.

The easiest way to understand how this symbolic execution works is with an example. Let's examine how the following query is converted by Jinq into the SQL query language:

customers
    .where( c -> c.getCountry().equals("Belgium") )

Initially, the customers variable is a collection that represents this database query:

SELECT *
  FROM Customers C

 


 

Then, the where() method is called, and a function is passed to it. In this where() method, Jinq opens the .class file of the function and gets the compiled bytecode for the function to analyze. In this example, instead of using real bytecode, let's just use some simple instructions to represent the bytecode of the function:

  1. d = c.getCountry()

  2. e = "Belgium"

  3. e = d.equals(e)

  4. return e

Here, we pretend that the function has been compiled by the Java compiler into four instructions. This is what Jinq sees when the where() method is called. How can Jinq make sense of this code?

Jinq analyzes the code by executing it. Jinq doesn't run the code directly though. It runs the code 'abstractly'. Instead of using real variables and real values, Jinq uses symbols to represent all values when executing the code. This is why the analysis is called symbolic execution.

Jinq executes each instruction and keeps track of all the side-effects, that is, everything the code changes in the state of the program. Below is a diagram showing all the side-effects that Jinq finds when it executes the four lines of code using symbolic execution.

 

Symbolic execution example
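
The idea can also be sketched in a few lines of Java. The toy interpreter below is not Jinq's implementation: it works on the four pseudo-instructions above rather than real bytecode, stores symbolic values as plain strings, and every name in it (State, call, constant, ret) is invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SymbolicExecutionSketch {
    // Symbolic program state: what each variable stands for, which methods
    // were called, and the symbolic value that was returned.
    static class State {
        final Map<String, String> vars = new LinkedHashMap<>();
        final List<String> calls = new ArrayList<>();
        String returned;

        // Unknown names (such as the parameter c) stand for themselves.
        String value(String name) {
            return vars.getOrDefault(name, name);
        }
    }

    // target = receiver.method(args...), evaluated on symbols instead of real values.
    static void call(State s, String target, String receiver, String method, String... args) {
        StringBuilder expr = new StringBuilder(s.value(receiver));
        expr.append('.').append(method).append('(');
        for (int i = 0; i < args.length; i++) {
            if (i > 0) expr.append(", ");
            expr.append(s.value(args[i]));
        }
        expr.append(')');
        s.calls.add(method);                 // side-effect: a method was called
        s.vars.put(target, expr.toString()); // side-effect: a variable changed
    }

    // target = literal
    static void constant(State s, String target, String literal) {
        s.vars.put(target, literal);
    }

    // return var
    static void ret(State s, String var) {
        s.returned = s.value(var);
    }

    // Symbolically executes the four pseudo-instructions from the article.
    static State run() {
        State s = new State();
        call(s, "d", "c", "getCountry");    // 1. d = c.getCountry()
        constant(s, "e", "\"Belgium\"");    // 2. e = "Belgium"
        call(s, "e", "d", "equals", "e");   // 3. e = d.equals(e)
        ret(s, "e");                        // 4. return e
        return s;
    }

    public static void main(String[] args) {
        State s = run();
        System.out.println(s.calls);    // prints [getCountry, equals]
        System.out.println(s.returned); // prints c.getCountry().equals("Belgium")
    }
}
```

After the first call, d holds the symbolic value c.getCountry() rather than a concrete country name, which is the behaviour the diagram illustrates.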


In the diagram, you can see how after the first instruction runs, Jinq finds two side-effects: the variable d has changed and the method Customer.getCountry() has been called. With symbolic execution, the variable d is not given a real value like 'USA' or 'Denmark'. It is assigned the symbolic value of c.getCountry().

After all the instructions have been executed symbolically, Jinq prunes the side-effects. Since the variables d and e are local variables, any changes to them are discarded after the function exits, so those side-effects can be ignored. Jinq also knows that the methods Customer.getCountry() and String.equals() do not modify any variables or show any output, so those method calls can also be ignored. From this, Jinq can conclude that executing the function produces only one effect: it returns c.getCountry().equals("Belgium").

Once Jinq has understood what the function passed to it in the where() method does, it can then merge this knowledge with the database query underlying the customers collection to create a new database query.

 

Generating a database query
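
The merging step can be sketched as follows. This is a simplified illustration, not Jinq's actual translator: the expression classes, the getter-to-column mapping, and the addWhere() helper are all hypothetical. Literal values are emitted as ? placeholders and collected separately, in the style of the JDBC PreparedStatement shown earlier.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class QueryGenerationSketch {
    // Hypothetical mapping from getter names to column names.
    static final Map<String, String> COLUMNS = new HashMap<>();
    static {
        COLUMNS.put("getCountry", "Country");
    }

    // A node of the symbolic expression returned by the analyzed function.
    interface Expr {
        String toSql(List<String> params);
    }

    // c.getXxx(): a getter call on the row parameter, mapped to a column.
    static class Column implements Expr {
        final String getter;
        Column(String getter) { this.getter = getter; }
        public String toSql(List<String> params) {
            String column = COLUMNS.get(getter);
            if (column == null) {
                throw new IllegalArgumentException("No column known for " + getter);
            }
            return "C." + column;
        }
    }

    // A string constant, emitted as a placeholder so it is never spliced into the SQL text.
    static class Literal implements Expr {
        final String value;
        Literal(String value) { this.value = value; }
        public String toSql(List<String> params) {
            params.add(value);
            return "?";
        }
    }

    // left.equals(right) becomes an SQL equality test.
    static class Equals implements Expr {
        final Expr left;
        final Expr right;
        Equals(Expr left, Expr right) { this.left = left; this.right = right; }
        public String toSql(List<String> params) {
            return left.toSql(params) + " = " + right.toSql(params);
        }
    }

    // Merges the condition into the query that the collection already represents.
    static String addWhere(String baseQuery, Expr condition, List<String> params) {
        return baseQuery + " WHERE " + condition.toSql(params);
    }

    public static void main(String[] args) {
        // Symbolic result of the function: c.getCountry().equals("Belgium")
        Expr returned = new Equals(new Column("getCountry"), new Literal("Belgium"));
        List<String> params = new ArrayList<>();
        String sql = addWhere("SELECT * FROM Customers C", returned, params);
        System.out.println(sql);    // prints SELECT * FROM Customers C WHERE C.Country = ?
        System.out.println(params); // prints [Belgium]
    }
}
```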


And that's how Jinq generates database queries from your code. The use of symbolic execution means that this approach is quite robust to the different code patterns produced by different Java compilers. If Jinq ever encounters code with side-effects that can't be emulated using a database query, Jinq will leave your code untouched. Since everything is written using normal Java code, Jinq can just run that code directly instead, and your code will produce the expected results.

This simple translation example should have given you an idea of how the query translation works. You should feel confident that these algorithms can correctly generate database queries from your code.

An Exciting Future

I hope I have given you a taste of how Java 8 enables new ways of working with databases in Java. The functional programming support in Java 8 allows you to write database code in a similar way to writing code for working with Java collections. Hopefully, existing database APIs will soon be extended to support these styles of queries.


 

Source: 开源中国