On June 20, 2016, Microsoft released a preview of their Microsoft.Phoenix.Client package on NuGet.org. This package provides a .NET Framework-compatible collection of classes for interfacing with the Apache Phoenix Query Server for Apache HBase. I had been evaluating Phoenix and HBase in the prior weeks, so the release of the .NET client library was very interesting to me. It was the only .NET-compatible client I was aware of, and I immediately began experimenting with it.
As I began learning how to use the Microsoft.Phoenix.Client, I noticed the use of Google Protocol Buffers and discovered that the underlying wire protocol also relied on Protocol Buffers. As it turns out, Apache Phoenix uses Apache Calcite, and specifically Avatica, at the network layer to facilitate the Java Database Connectivity (JDBC) interface.
The Microsoft.Phoenix.Client’s project site, hosted on GitHub, has some great examples of using the PhoenixClient. I also found duoxo’s tweet-sentiment-phoenix project to have some great examples. As I worked through my first use case, I was missing the good old IDbConnection et al. paradigm implemented for so many relational database providers: SqlConnection, OracleConnection, etc. Wouldn’t it be great if there were a PhoenixConnection?
With a resounding “yes!”, I started the PhoenixConnection class, which led to the PhoenixCommand and then PhoenixDataReader classes.
Along with these classes came the familiar ConnectionString property on the PhoenixConnection class. After working through most of the interfaces, it became possible for me to open a connection to Apache Phoenix and execute queries from my code in a manner virtually every .NET developer is familiar with:
    using (IDbConnection phConn = new PhoenixConnection())
    {
        phConn.ConnectionString = cmdLine.ConnectionString;
        phConn.Open();

        using (IDbCommand cmd = phConn.CreateCommand())
        {
            cmd.CommandText = "SELECT * FROM GARUDATEST";

            using (IDataReader reader = cmd.ExecuteReader())
            {
                // Print every column of every row as "name: value".
                while (reader.Read())
                {
                    for (int i = 0; i < reader.FieldCount; i++)
                    {
                        Console.WriteLine(string.Format("{0}: {1}", reader.GetName(i), reader.GetValue(i)));
                    }
                }
            }
        }
    }
It seemed apparent that these APIs would be useful to other .NET developers who needed to interface with big data stored in Apache Phoenix/HBase. I decided to prepare a NuGet package, which I named Garuda.Data after the Garuda, the mythical bird of Hindu and Buddhist traditions, and uploaded an early “alpha” version to nuget.org in late July.
Query Apache Phoenix+Hbase from #dotnet just like #sqlserver, using Garuda.Data. https://t.co/gL2mJboGmR #Hadoop pic.twitter.com/mOqMoMwyiF
— Daniel Dittenhafer (@dwdii) July 25, 2016
The Garuda.Data package has been updated several times since the original alpha release. At the time of this writing, it is in beta (v0.5.6067.42547) and includes:
PhoenixBulkCopy class which takes advantage of the ExecuteBatchRequest mechanism for more efficient inserts/updates (UPSERTS in Phoenix SQL).
PhoenixTransaction class enables traditional transactional commit/rollback using the Phoenix Transactions functionality.
The Garuda.Data project is part of a solution, GarudaUtil, which includes a graphical user interface for connecting to and querying Apache Phoenix using the Garuda.Data library.
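The commit/rollback pattern that PhoenixTransaction enables can be sketched using only the standard System.Data interfaces. This is a minimal sketch rather than Garuda.Data's documented API: it assumes that PhoenixConnection hands back a transaction from IDbConnection.BeginTransaction(), and the TransactionSketch class and the ID and NAME columns on GARUDATEST are hypothetical names chosen for illustration.

```csharp
using System;
using System.Data;

public static class TransactionSketch
{
    // Runs a single UPSERT inside a transaction: commit on success,
    // roll back and rethrow on failure.
    // Assumption: conn is already open and supports BeginTransaction().
    public static void UpsertInTransaction(IDbConnection conn)
    {
        using (IDbTransaction tx = conn.BeginTransaction())
        {
            try
            {
                using (IDbCommand cmd = conn.CreateCommand())
                {
                    cmd.Transaction = tx;
                    // Hypothetical columns; adjust to the real table schema.
                    cmd.CommandText = "UPSERT INTO GARUDATEST (ID, NAME) VALUES (1, 'first')";
                    cmd.ExecuteNonQuery();
                }

                tx.Commit();
            }
            catch (Exception)
            {
                tx.Rollback();
                throw;
            }
        }
    }
}
```

With an open connection such as phConn from the earlier query example, this would be called as TransactionSketch.UpsertInTransaction(phConn). Because the helper depends only on IDbConnection, the same code should work against any ADO.NET provider that supports transactions.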
The GarudaUtil solution, including Garuda.Data and Garuda Query, is available on GitHub. I welcome feedback! Please use the GarudaUtil Issues section to report bugs or request enhancements.
Links:
Best,